Eda nlp for chineseAn implement of the paper of EDA for Chinese corpus.中文语料的EDA数据增强工具。NLP数据增强。论文阅读笔记。
Stars: ✭ 660 (-90.08%)
Reuters Full Data SetFull dataset of Reuters composed of 8,551,441 news titles, links and timestamps (Jan 2007 - Aug 2016). Generate your own up to today!
Stars: ✭ 159 (-97.61%)
ZhopenieChinese Open Information Extraction (Tree-based Triple Relation Extraction Module)
Stars: ✭ 98 (-98.53%)
Ngram2vecFour word embedding models implemented in Python. Supporting arbitrary context features
Stars: ✭ 703 (-89.44%)
Weibo terminaterFinal Weibo Crawler Scrap Anything From Weibo, comments, weibo contents, followers, anything. The Terminator
Stars: ✭ 2,295 (-65.52%)
Char Rnn ChineseMulti-layer Recurrent Neural Networks (LSTM, GRU, RNN) for character-level language models in Torch. Based on code of https://github.com/karpathy/char-rnn. Support Chinese and other things.
Stars: ✭ 192 (-97.12%)
FinBERT-QAFinancial Domain Question Answering with pre-trained BERT Language Model
Stars: ✭ 70 (-98.95%)
DrFAQDrFAQ is a plug-and-play question answering NLP chatbot that can be generally applied to any organisation's text corpora.
Stars: ✭ 29 (-99.56%)
SimpletransformersTransformers for Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conversational AI
Stars: ✭ 2,881 (-56.72%)
trafilaturaPython & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments
Stars: ✭ 711 (-89.32%)
FakenewscorpusA dataset of millions of news articles scraped from a curated list of data sources.
Stars: ✭ 255 (-96.17%)
Chinese Text ClassificationChinese-Text-Classification,Tensorflow CNN(卷积神经网络)实现的中文文本分类。QQ群:522785813,微信群二维码:http://www.tensorflownews.com/
Stars: ✭ 284 (-95.73%)
word2vec-moviesBag of words meets bags of popcorn in Python 3 中文教程
Stars: ✭ 54 (-99.19%)
wechselCode for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.
Stars: ✭ 39 (-99.41%)
KitanaQAKitanaQA: Adversarial training and data augmentation for neural question-answering models
Stars: ✭ 58 (-99.13%)
classifier multi labelmulti-label,classifier,text classification,多标签文本分类,文本分类,BERT,ALBERT,multi-label-classification
Stars: ✭ 127 (-98.09%)
TV4DialogNo description or website provided.
Stars: ✭ 33 (-99.5%)
troveWeakly supervised medical named entity classification
Stars: ✭ 55 (-99.17%)
Text Cnn嵌入Word2vec词向量的CNN中文文本分类
Stars: ✭ 298 (-95.52%)
Nlu simall kinds of baseline models for sentence similarity 句子对语义相似度模型
Stars: ✭ 286 (-95.7%)
Albert zhA LITE BERT FOR SELF-SUPERVISED LEARNING OF LANGUAGE REPRESENTATIONS, 海量中文预训练ALBERT模型
Stars: ✭ 3,500 (-47.42%)
Cluener2020CLUENER2020 中文细粒度命名实体识别 Fine Grained Named Entity Recognition
Stars: ✭ 689 (-89.65%)
ganbert-pytorchEnhancing the BERT training with Semi-supervised Generative Adversarial Networks in Pytorch/HuggingFace
Stars: ✭ 60 (-99.1%)
TorchBlocksA PyTorch-based toolkit for natural language processing
Stars: ✭ 85 (-98.72%)
textgoText preprocessing, representation, similarity calculation, text search and classification. Let's go and play with text!
Stars: ✭ 33 (-99.5%)
Giveme5WExtraction of the five journalistic W-questions (5W) from news articles
Stars: ✭ 16 (-99.76%)
feedIOA Feed Aggregator that Knows What You Want to Read.
Stars: ✭ 26 (-99.61%)
cdQA-ui⛔ [NOT MAINTAINED] A web interface for cdQA and other question answering systems.
Stars: ✭ 19 (-99.71%)
Zhparserzhparser is a PostgreSQL extension for full-text search of Chinese language
Stars: ✭ 418 (-93.72%)
kwxBERT, LDA, and TFIDF based keyword extraction in Python
Stars: ✭ 33 (-99.5%)
iamQA中文wiki百科QA阅读理解问答系统,使用了CCKS2016数据的NER模型和CMRC2018的阅读理解模型,还有W2V词向量搜索,使用torchserve部署
Stars: ✭ 46 (-99.31%)
Pytorch-NLUPytorch-NLU,一个中文文本分类、序列标注工具包,支持中文长文本、短文本的多类、多标签分类任务,支持中文命名实体识别、词性标注、分词等序列标注任务。 Ptorch NLU, a Chinese text classification and sequence annotation toolkit, supports multi class and multi label classification tasks of Chinese long text and short text, and supports sequence annotation tasks such as Chinese named entity recognition, part of speech ta…
Stars: ✭ 151 (-97.73%)
AskNowNQSA question answering system for RDF knowledge graphs.
Stars: ✭ 32 (-99.52%)
BertweetBERTweet: A pre-trained language model for English Tweets (EMNLP-2020)
Stars: ✭ 282 (-95.76%)
FewCLUEFewCLUE 小样本学习测评基准,中文版
Stars: ✭ 251 (-96.23%)
Tokenizers💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
Stars: ✭ 5,077 (-23.72%)
Cluecorpus2020Large-scale Pre-training Corpus for Chinese 100G 中文预训练语料
Stars: ✭ 278 (-95.82%)
Giveme5w1hExtraction of the journalistic five W and one H questions (5W1H) from news articles: who did what, when, where, why, and how?
Stars: ✭ 316 (-95.25%)
Nlp Projectsword2vec, sentence2vec, machine reading comprehension, dialog system, text classification, pretrained language model (i.e., XLNet, BERT, ELMo, GPT), sequence labeling, information retrieval, information extraction (i.e., entity, relation and event extraction), knowledge graph, text generation, network embedding
Stars: ✭ 360 (-94.59%)
Bert PytorchGoogle AI 2018 BERT pytorch implementation
Stars: ✭ 4,642 (-30.26%)
Dan Jurafsky Chris Manning NlpMy solution to the Natural Language Processing course made by Dan Jurafsky, Chris Manning in Winter 2012.
Stars: ✭ 124 (-98.14%)
ODSQAODSQA: OPEN-DOMAIN SPOKEN QUESTION ANSWERING DATASET
Stars: ✭ 43 (-99.35%)
CLUE pytorchCLUE baseline pytorch CLUE的pytorch版本基线
Stars: ✭ 72 (-98.92%)