Haystack🔍 Haystack is an open source NLP framework that leverages Transformer models. It enables developers to implement production-ready neural search, question answering, semantic document search and summarization for a wide range of applications.
Stars: ✭ 3,409 (+1388.65%)
wechselCode for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.
Stars: ✭ 39 (-82.97%)
Bert language understandingPre-training of Deep Bidirectional Transformers for Language Understanding: pre-train TextCNN
Stars: ✭ 933 (+307.42%)
Nlp chinese corpus大规模中文自然语言处理语料 Large Scale Chinese Corpus for NLP
Stars: ✭ 6,656 (+2806.55%)
text2textText2Text: Cross-lingual natural language processing and generation toolkit
Stars: ✭ 188 (-17.9%)
HugsVisionHugsVision is a easy to use huggingface wrapper for state-of-the-art computer vision
Stars: ✭ 154 (-32.75%)
TorchBlocksA PyTorch-based toolkit for natural language processing
Stars: ✭ 85 (-62.88%)
oreilly-bert-nlpThis repository contains code for the O'Reilly Live Online Training for BERT
Stars: ✭ 19 (-91.7%)
Filipino-Text-BenchmarksOpen-source benchmark datasets and pretrained transformer models in the Filipino language.
Stars: ✭ 22 (-90.39%)
text2classMulti-class text categorization using state-of-the-art pre-trained contextualized language models, e.g. BERT
Stars: ✭ 15 (-93.45%)
KashgariKashgari is a production-level NLP Transfer learning framework built on top of tf.keras for text-labeling and text-classification, includes Word2Vec, BERT, and GPT2 Language Embedding.
Stars: ✭ 2,235 (+875.98%)
SimpletransformersTransformers for Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conversational AI
Stars: ✭ 2,881 (+1158.08%)
ParsBigBirdPersian Bert For Long-Range Sequences
Stars: ✭ 58 (-74.67%)
Pytorch-NLUPytorch-NLU,一个中文文本分类、序列标注工具包,支持中文长文本、短文本的多类、多标签分类任务,支持中文命名实体识别、词性标注、分词等序列标注任务。 Ptorch NLU, a Chinese text classification and sequence annotation toolkit, supports multi class and multi label classification tasks of Chinese long text and short text, and supports sequence annotation tasks such as Chinese named entity recognition, part of speech ta…
Stars: ✭ 151 (-34.06%)
FNet-pytorchUnofficial implementation of Google's FNet: Mixing Tokens with Fourier Transforms
Stars: ✭ 204 (-10.92%)
policy-data-analyzerBuilding a model to recognize incentives for landscape restoration in environmental policies from Latin America, the US and India. Bringing NLP to the world of policy analysis through an extensible framework that includes scraping, preprocessing, active learning and text analysis pipelines.
Stars: ✭ 22 (-90.39%)
Skin Lesions Classification DCNNsTransfer Learning with DCNNs (DenseNet, Inception V3, Inception-ResNet V2, VGG16) for skin lesions classification
Stars: ✭ 47 (-79.48%)
Spark NlpState of the Art Natural Language Processing
Stars: ✭ 2,518 (+999.56%)
Tokenizers💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
Stars: ✭ 5,077 (+2117.03%)
Clue中文语言理解测评基准 Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard
Stars: ✭ 2,425 (+958.95%)
ML4K-AI-ExtensionUse machine learning in AppInventor, with easy training using text, images, or numbers through the Machine Learning for Kids website.
Stars: ✭ 18 (-92.14%)
ganbert-pytorchEnhancing the BERT training with Semi-supervised Generative Adversarial Networks in Pytorch/HuggingFace
Stars: ✭ 60 (-73.8%)
small-textActive Learning for Text Classification in Python
Stars: ✭ 241 (+5.24%)
textgoText preprocessing, representation, similarity calculation, text search and classification. Let's go and play with text!
Stars: ✭ 33 (-85.59%)
WSDM-Cup-2019[ACM-WSDM] 3rd place solution at WSDM Cup 2019, Fake News Classification on Kaggle.
Stars: ✭ 62 (-72.93%)
NSP-BERTThe code for our paper "NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original Pre-training Task —— Next Sentence Prediction"
Stars: ✭ 166 (-27.51%)
Product-Categorization-NLPMulti-Class Text Classification for products based on their description with Machine Learning algorithms and Neural Networks (MLP, CNN, Distilbert).
Stars: ✭ 30 (-86.9%)
kwxBERT, LDA, and TFIDF based keyword extraction in Python
Stars: ✭ 33 (-85.59%)
ML2017FALLMachine Learning (EE 5184) in NTU
Stars: ✭ 66 (-71.18%)
question generatorAn NLP system for generating reading comprehension questions
Stars: ✭ 188 (-17.9%)
troveWeakly supervised medical named entity classification
Stars: ✭ 55 (-75.98%)
BertweetBERTweet: A pre-trained language model for English Tweets (EMNLP-2020)
Stars: ✭ 282 (+23.14%)
Lightnlp基于Pytorch和torchtext的自然语言处理深度学习框架。
Stars: ✭ 739 (+222.71%)
Concise Ipython Notebooks For Deep LearningIpython Notebooks for solving problems like classification, segmentation, generation using latest Deep learning algorithms on different publicly available text and image data-sets.
Stars: ✭ 23 (-89.96%)
TfclassifierTensorflow based training and classification scripts for text, images, etc
Stars: ✭ 441 (+92.58%)
RmdlRMDL: Random Multimodel Deep Learning for Classification
Stars: ✭ 375 (+63.76%)
Dan Jurafsky Chris Manning NlpMy solution to the Natural Language Processing course made by Dan Jurafsky, Chris Manning in Winter 2012.
Stars: ✭ 124 (-45.85%)
Lotclass[EMNLP 2020] Text Classification Using Label Names Only: A Language Model Self-Training Approach
Stars: ✭ 160 (-30.13%)
NeuronblocksNLP DNN Toolkit - Building Your NLP DNN Models Like Playing Lego
Stars: ✭ 1,356 (+492.14%)
CatalystAccelerated deep learning R&D
Stars: ✭ 2,804 (+1124.45%)
DrFAQDrFAQ is a plug-and-play question answering NLP chatbot that can be generally applied to any organisation's text corpora.
Stars: ✭ 29 (-87.34%)
COCO-LM[NeurIPS 2021] COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining
Stars: ✭ 109 (-52.4%)
FinBERT-QAFinancial Domain Question Answering with pre-trained BERT Language Model
Stars: ✭ 70 (-69.43%)
Vaaku2VecLanguage Modeling and Text Classification in Malayalam Language using ULMFiT
Stars: ✭ 68 (-70.31%)
LibFewShotLibFewShot: A Comprehensive Library for Few-shot Learning.
Stars: ✭ 629 (+174.67%)
X-TransformerX-Transformer: Taming Pretrained Transformers for eXtreme Multi-label Text Classification
Stars: ✭ 127 (-44.54%)
cmrc2019A Sentence Cloze Dataset for Chinese Machine Reading Comprehension (CMRC 2019)
Stars: ✭ 118 (-48.47%)
Ask2TransformersA Framework for Textual Entailment based Zero Shot text classification
Stars: ✭ 102 (-55.46%)
KB-ALBERTKB국민은행에서 제공하는 경제/금융 도메인에 특화된 한국어 ALBERT 모델
Stars: ✭ 215 (-6.11%)
ulm-basenetImplementation of ULMFit algorithm for text classification via transfer learning
Stars: ✭ 94 (-58.95%)