Ai law: All kinds of baseline models for long text classification (text categorization)
Stars: ✭ 243 (+1250%)
TextUnderstandingTsetlinMachine: Using the Tsetlin Machine to learn human-interpretable rules for high-accuracy text categorization with medical applications
Stars: ✭ 48 (+166.67%)
R Text Data: List of textual data sources to be used for text mining in R
Stars: ✭ 85 (+372.22%)
HiLAP: Code for the paper "Hierarchical Text Classification with Reinforced Label Assignment" (EMNLP 2019)
Stars: ✭ 116 (+544.44%)
Nlpython: This repository contains the code related to natural language processing using the Python scripting language. All the code relates to my book entitled "Python Natural Language Processing".
Stars: ✭ 265 (+1372.22%)
textreadr: Tools to uniformly read in text data including semi-structured transcripts
Stars: ✭ 65 (+261.11%)
Gwu data mining: Materials for GWU DNSC 6279 and DNSC 6290.
Stars: ✭ 217 (+1105.56%)
ebe-dataset: Evidence-based Explanation Dataset (AACL-IJCNLP 2020)
Stars: ✭ 16 (-11.11%)
Pytorch Transformers Classification: Based on the Pytorch-Transformers library by HuggingFace. To be used as a starting point for employing Transformer models in text classification tasks. Contains code to easily train BERT, XLNet, RoBERTa, and XLM models for text classification.
Stars: ✭ 229 (+1172.22%)
text-classification-svm: The missing SVM-based text classification module implementing HanLP's interface
Stars: ✭ 46 (+155.56%)
tg crawler: Just a crawler based on tg-cli for Telegram. Now deprecated; please use telegram-export.
Stars: ✭ 71 (+294.44%)
Paddlenlp: NLP Core Library and Model Zoo based on PaddlePaddle 2.0
Stars: ✭ 212 (+1077.78%)
synaptic-simple-trainer: A ready-to-go text classification trainer based on synaptic (https://github.com/cazala/synaptic)
Stars: ✭ 19 (+5.56%)
Python nlp tutorial: This repository provides everything to get started with Python for Text Mining / Natural Language Processing (NLP)
Stars: ✭ 72 (+300%)
Interpret Text: A library that incorporates state-of-the-art explainers for text-based machine learning models and visualizes the results with a built-in dashboard.
Stars: ✭ 220 (+1122.22%)
monkeylearn-java: Official Java client for the MonkeyLearn API. Build and consume machine learning models for language processing from your Java apps.
Stars: ✭ 23 (+27.78%)
snorkeling: Extracting biomedical relationships from literature with Snorkel 🏊
Stars: ✭ 56 (+211.11%)
Band: BAND (BERT Application aNd Deployment). Simple and efficient BERT model training and deployment.
Stars: ✭ 216 (+1100%)
HiGitClass: Keyword-Driven Hierarchical Classification of GitHub Repositories (ICDM'19)
Stars: ✭ 58 (+222.22%)
Chemdataextractor: Automatically extract chemical information from scientific documents
Stars: ✭ 152 (+744.44%)
NewsMTSC: Target-dependent sentiment classification in news articles reporting on political events. Includes a high-quality data set of over 11k sentences and a state-of-the-art classification model.
Stars: ✭ 54 (+200%)
Icdar 2019 Sroie: ICDAR 2019 Robust Reading Challenge on Scanned Receipts OCR and Information Extraction
Stars: ✭ 202 (+1022.22%)
textgo: Text preprocessing, representation, similarity calculation, text search and classification. Let's go and play with text!
Stars: ✭ 33 (+83.33%)
Naive-Resume-Matching: Text similarity applied to resumes. Compares resumes with job descriptions and creates a score to rank them, similar to an ATS.
Stars: ✭ 27 (+50%)
Jfasttext: Java interface for fastText
Stars: ✭ 193 (+972.22%)
WSDM-Cup-2019: [ACM-WSDM] 3rd place solution at WSDM Cup 2019, Fake News Classification on Kaggle.
Stars: ✭ 62 (+244.44%)
Simpletransformers: Transformers for Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conversational AI
Stars: ✭ 2,881 (+15905.56%)
small-text: Active Learning for Text Classification in Python
Stars: ✭ 241 (+1238.89%)
Text2vec: Fast vectorization, topic modeling, distances and GloVe word embeddings in R.
Stars: ✭ 715 (+3872.22%)
extractnet: A Dragnet that also extracts author, headline, date, and keywords from context
Stars: ✭ 52 (+188.89%)
Few Shot Text Classification: Few-shot binary text classification with Induction Networks and Word2Vec weights initialization
Stars: ✭ 32 (+77.78%)
nlp classification: Implementing NLP papers relevant to classification with PyTorch and gluonnlp
Stars: ✭ 224 (+1144.44%)
Fastnlp: fastNLP, a modularized and extensible NLP framework. Currently still in incubation.
Stars: ✭ 2,441 (+13461.11%)
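Nlp xiaojiang: Natural language processing (NLP): Xiaojiang robot (retrieval-based chit-chat chatbot), BERT sentence embeddings and similarity (sentence similarity), XLNet sentence embeddings and similarity (text XLNet embedding), text classification, entity extraction (NER, BERT+BiLSTM+CRF), data augmentation (text augment, data enhance), synonymous sentence and synonym generation, sentence main-part extraction (mainpart), Chinese short-text similarity, text feature engineering, keras-http-service calls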
Stars: ✭ 954 (+5200%)
JoSH: [KDD 2020] Hierarchical Topic Mining via Joint Spherical Tree and Text Embedding
Stars: ✭ 55 (+205.56%)
Omnicat Bayes: Naive Bayes text classification implementation as an OmniCat classifier strategy. (#ruby #naivebayes)
Stars: ✭ 30 (+66.67%)
Textanalyzer: A text analyzer based on machine learning, statistics, and dictionaries. So far, it supports hot word extraction, text classification, part-of-speech tagging, named entity recognition, Chinese word segmentation, address extraction, synonyms, text clustering, word2vec models, edit distance, sentence similarity, word sentiment tendency, name recognition, idiom recognition, place name recognition, organization recognition, traditional Chinese recognition, and pinyin conversion.
Stars: ✭ 162 (+800%)
feedIO: A Feed Aggregator that Knows What You Want to Read.
Stars: ✭ 26 (+44.44%)
Konlpy: Python package for Korean natural language processing.
Stars: ✭ 1,098 (+6000%)
ConDigSum: Code for the EMNLP 2021 paper "Topic-Aware Contrastive Learning for Abstractive Dialogue Summarization"
Stars: ✭ 62 (+244.44%)
Aravec: AraVec is a pre-trained distributed word representation (word embedding) open-source project that aims to provide the Arabic NLP research community with free-to-use and powerful word embedding models.
Stars: ✭ 239 (+1227.78%)
Tokenizers: Fast, Consistent Tokenization of Natural Language Text
Stars: ✭ 161 (+794.44%)
Cogcomp Nlpy: CogComp's light-weight Python NLP annotators
Stars: ✭ 115 (+538.89%)
Bigartm: Fast topic modeling platform
Stars: ✭ 563 (+3027.78%)