Clue
Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard
Stars: ✭ 2,425 (-63.57%)
Lightnlp
A deep learning framework for natural language processing based on PyTorch and torchtext.
Stars: ✭ 739 (-88.9%)
backprop
Backprop makes it simple to use, finetune, and deploy state-of-the-art ML models.
Stars: ✭ 229 (-96.56%)
Filipino-Text-Benchmarks
Open-source benchmark datasets and pretrained transformer models in the Filipino language.
Stars: ✭ 22 (-99.67%)
OpenDialog
An Open-Source Package for Chinese Open-domain Conversational Chatbot (Chinese chit-chat dialogue system; one-click deployment of a WeChat chit-chat bot)
Stars: ✭ 94 (-98.59%)
Bert language understanding
Pre-training of Deep Bidirectional Transformers for Language Understanding: pre-train TextCNN
Stars: ✭ 933 (-85.98%)
Vaaku2Vec
Language Modeling and Text Classification in Malayalam Language using ULMFiT
Stars: ✭ 68 (-98.98%)
Haystack
🔍 Haystack is an open source NLP framework that leverages Transformer models. It enables developers to implement production-ready neural search, question answering, semantic document search and summarization for a wide range of applications.
Stars: ✭ 3,409 (-48.78%)
WSDM-Cup-2019
[ACM-WSDM] 3rd place solution at WSDM Cup 2019, Fake News Classification on Kaggle.
Stars: ✭ 62 (-99.07%)
Chatito
🎯🗯 Generate datasets for AI chatbots, NLP tasks, named entity recognition or text classification models using a simple DSL!
Stars: ✭ 678 (-89.81%)
LightLM
High-performance small model evaluation. Shared Tasks in NLPCC 2020. Task 1 - Light Pre-Training Chinese Language Model for NLP Task
Stars: ✭ 54 (-99.19%)
Product-Categorization-NLP
Multi-Class Text Classification for products based on their description with Machine Learning algorithms and Neural Networks (MLP, CNN, Distilbert).
Stars: ✭ 30 (-99.55%)
textgo
Text preprocessing, representation, similarity calculation, text search and classification. Let's go and play with text!
Stars: ✭ 33 (-99.5%)
NSP-BERT
The code for our paper "NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original Pre-training Task: Next Sentence Prediction"
Stars: ✭ 166 (-97.51%)
text2text
Text2Text: Cross-lingual natural language processing and generation toolkit
Stars: ✭ 188 (-97.18%)
cdQA-ui
⛔ [NOT MAINTAINED] A web interface for cdQA and other question answering systems.
Stars: ✭ 19 (-99.71%)
mcQA
🔮 Answering multiple choice questions with Language Models.
Stars: ✭ 23 (-99.65%)
ganbert-pytorch
Enhancing the BERT training with Semi-supervised Generative Adversarial Networks in Pytorch/HuggingFace
Stars: ✭ 60 (-99.1%)
Ngram2vec
Four word embedding models implemented in Python. Supporting arbitrary context features
Stars: ✭ 703 (-89.44%)
feedIO
A Feed Aggregator that Knows What You Want to Read.
Stars: ✭ 26 (-99.61%)
Medi-CoQA
Conversational Question Answering on Clinical Text
Stars: ✭ 22 (-99.67%)
TorchBlocks
A PyTorch-based toolkit for natural language processing
Stars: ✭ 85 (-98.72%)
Pytorch-NLU
Pytorch-NLU, a Chinese text classification and sequence annotation toolkit, supports multi-class and multi-label classification tasks for Chinese long and short text, and supports sequence annotation tasks such as Chinese named entity recognition, part of speech ta…
Stars: ✭ 151 (-97.73%)
text2class
Multi-class text categorization using state-of-the-art pre-trained contextualized language models, e.g. BERT
Stars: ✭ 15 (-99.77%)
Giveme5W
Extraction of the five journalistic W-questions (5W) from news articles
Stars: ✭ 16 (-99.76%)
MobileQA
Offline on-device reading comprehension application. QA for mobile, Android & iPhone
Stars: ✭ 49 (-99.26%)
kwx
BERT, LDA, and TFIDF based keyword extraction in Python
Stars: ✭ 33 (-99.5%)
wordfish-python
extract relationships from standardized terms from corpus of interest with deep learning 🐟
Stars: ✭ 19 (-99.71%)
policy-data-analyzer
Building a model to recognize incentives for landscape restoration in environmental policies from Latin America, the US and India. Bringing NLP to the world of policy analysis through an extensible framework that includes scraping, preprocessing, active learning and text analysis pipelines.
Stars: ✭ 22 (-99.67%)
squad-v1.1-pt
Portuguese translation of the SQuAD dataset
Stars: ✭ 13 (-99.8%)
ODSQA
ODSQA: OPEN-DOMAIN SPOKEN QUESTION ANSWERING DATASET
Stars: ✭ 43 (-99.35%)
FewCLUE
FewCLUE: few-shot learning evaluation benchmark, Chinese edition
Stars: ✭ 251 (-96.23%)
AskNowNQS
A question answering system for RDF knowledge graphs.
Stars: ✭ 32 (-99.52%)
Fakenewscorpus
A dataset of millions of news articles scraped from a curated list of data sources.
Stars: ✭ 255 (-96.17%)
Bertweet
BERTweet: A pre-trained language model for English Tweets (EMNLP-2020)
Stars: ✭ 282 (-95.76%)
Cluecorpus2020
Large-scale Pre-training Corpus for Chinese (100G Chinese pre-training corpus)
Stars: ✭ 278 (-95.82%)
Nlu sim
All kinds of baseline models for sentence similarity (semantic similarity models for sentence pairs)
Stars: ✭ 286 (-95.7%)
Text Cnn
CNN-based Chinese text classification with embedded Word2vec word vectors
Stars: ✭ 298 (-95.52%)
Giveme5w1h
Extraction of the journalistic five W and one H questions (5W1H) from news articles: who did what, when, where, why, and how?
Stars: ✭ 316 (-95.25%)
CLUE pytorch
CLUE baseline pytorch: the PyTorch version of the CLUE baselines
Stars: ✭ 72 (-98.92%)
Chinese Text Classification
Chinese-Text-Classification: Chinese text classification implemented with a TensorFlow CNN (convolutional neural network). QQ group: 522785813; WeChat group QR code: http://www.tensorflownews.com/
Stars: ✭ 284 (-95.73%)
Albert zh
A LITE BERT FOR SELF-SUPERVISED LEARNING OF LANGUAGE REPRESENTATIONS, large-scale Chinese pre-trained ALBERT models
Stars: ✭ 3,500 (-47.42%)
Eda nlp for chinese
An implementation of the EDA paper for Chinese corpora. An EDA data augmentation tool for Chinese text. NLP data augmentation. Paper reading notes.
Stars: ✭ 660 (-90.08%)
Bert Pytorch
Google AI 2018 BERT pytorch implementation
Stars: ✭ 4,642 (-30.26%)