2239 open source projects that are alternatives to or similar to Nlp_chinese_corpus

Eda nlp for chinese
An implementation of the EDA paper for Chinese corpora. An EDA data augmentation tool for Chinese corpora. NLP data augmentation. Paper reading notes.
Stars: ✭ 660 (-90.08%)
Mutual labels:  chinese, text-classification
Chinese Xinhua
📙 Chinese Xinhua Dictionary database. Includes xiehouyu (two-part allegorical sayings), idioms, words, and Chinese characters.
Stars: ✭ 8,705 (+30.78%)
Mutual labels:  chinese, chinese-nlp
Reuters Full Data Set
Full dataset of Reuters composed of 8,551,441 news titles, links and timestamps (Jan 2007 - Aug 2016). Generate your own up to today!
Stars: ✭ 159 (-97.61%)
Mutual labels:  news, dataset
Zhopenie
Chinese Open Information Extraction (Tree-based Triple Relation Extraction Module)
Stars: ✭ 98 (-98.53%)
Mutual labels:  chinese, chinese-nlp
Ngram2vec
Four word embedding models implemented in Python. Supports arbitrary context features.
Stars: ✭ 703 (-89.44%)
Mutual labels:  chinese, word2vec
Hotnewsanalysis
Analysis of hot news topics using text mining techniques.
Stars: ✭ 93 (-98.6%)
Mutual labels:  news, word2vec
Weibo terminater
The final Weibo crawler: scrapes anything from Weibo, including comments, Weibo post contents, followers, anything. The Terminator.
Stars: ✭ 2,295 (-65.52%)
Mutual labels:  chinese, corpus
Char Rnn Chinese
Multi-layer Recurrent Neural Networks (LSTM, GRU, RNN) for character-level language models in Torch. Based on the code of https://github.com/karpathy/char-rnn. Supports Chinese and other things.
Stars: ✭ 192 (-97.12%)
Mutual labels:  chinese, language-model
Cnn Text Classification Tf Chinese
CNN for Chinese Text Classification in Tensorflow
Stars: ✭ 237 (-96.44%)
Mutual labels:  chinese, text-classification
FinBERT-QA
Financial Domain Question Answering with pre-trained BERT Language Model
Stars: ✭ 70 (-98.95%)
Mutual labels:  question-answering, bert
DrFAQ
DrFAQ is a plug-and-play question answering NLP chatbot that can be generally applied to any organisation's text corpora.
Stars: ✭ 29 (-99.56%)
Mutual labels:  question-answering, bert
Simpletransformers
Transformers for Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conversational AI
Stars: ✭ 2,881 (-56.72%)
sqlmap-wiki-zhcn
Possibly the most complete Chinese documentation for sqlmap.
Stars: ✭ 51 (-99.23%)
Mutual labels:  wiki, chinese
trafilatura
Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments
Stars: ✭ 711 (-89.32%)
Mutual labels:  news, corpus
Fakenewscorpus
A dataset of millions of news articles scraped from a curated list of data sources.
Stars: ✭ 255 (-96.17%)
Mutual labels:  dataset, corpus
Chinese Text Classification
Chinese-Text-Classification: Chinese text classification implemented with a TensorFlow CNN (convolutional neural network). QQ group: 522785813; WeChat group QR code: http://www.tensorflownews.com/
Stars: ✭ 284 (-95.73%)
Mutual labels:  chinese, text-classification
word2vec-movies
Bag of Words Meets Bags of Popcorn in Python 3: a Chinese tutorial
Stars: ✭ 54 (-99.19%)
Mutual labels:  word2vec, chinese
wechsel
Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.
Stars: ✭ 39 (-99.41%)
Mutual labels:  language-model, bert
KitanaQA
KitanaQA: Adversarial training and data augmentation for neural question-answering models
Stars: ✭ 58 (-99.13%)
Mutual labels:  question-answering, bert
sarcasm-detection-for-sentiment-analysis
Sarcasm Detection for Sentiment Analysis
Stars: ✭ 21 (-99.68%)
Mutual labels:  text-classification, word2vec
embedding study
A study of generating character embeddings with Chinese pre-trained models; tests how BERT and ELMo perform on Chinese.
Stars: ✭ 94 (-98.59%)
Mutual labels:  chinese, bert
classifier multi label
multi-label, classifier, text classification, multi-label text classification, BERT, ALBERT, multi-label-classification
Stars: ✭ 127 (-98.09%)
Mutual labels:  text-classification, bert
TV4Dialog
No description or website provided.
Stars: ✭ 33 (-99.5%)
Mutual labels:  corpus, chinese
trove
Weakly supervised medical named entity classification
Stars: ✭ 55 (-99.17%)
Mutual labels:  text-classification, bert
classifier multi label seq2seq attention
multi-label, classifier, text classification, multi-label text classification, BERT, ALBERT, multi-label-classification, seq2seq, attention, beam search
Stars: ✭ 26 (-99.61%)
Mutual labels:  text-classification, bert
bert-movie-reviews-sentiment-classifier
Build a Movie Reviews Sentiment Classifier with Google's BERT Language Model
Stars: ✭ 12 (-99.82%)
Mutual labels:  language-model, bert
Text Cnn
CNN-based Chinese text classification with embedded Word2vec word vectors
Stars: ✭ 298 (-95.52%)
Mutual labels:  text-classification, word2vec
Nlu sim
All kinds of baseline models for sentence similarity; sentence-pair semantic similarity models
Stars: ✭ 286 (-95.7%)
Mutual labels:  question-answering, word2vec
Albert zh
A LITE BERT FOR SELF-SUPERVISED LEARNING OF LANGUAGE REPRESENTATIONS; large-scale Chinese pre-trained ALBERT models
Stars: ✭ 3,500 (-47.42%)
Mutual labels:  bert, chinese-corpus
Cluener2020
CLUENER2020: Chinese Fine-Grained Named Entity Recognition
Stars: ✭ 689 (-89.65%)
Mutual labels:  chinese, dataset
SQUAD2.Q-Augmented-Dataset
Augmented version of SQUAD 2.0 for Questions
Stars: ✭ 31 (-99.53%)
Mutual labels:  question-answering, bert
CLUEmotionAnalysis2020
CLUE Emotion Analysis Dataset: a fine-grained sentiment analysis dataset
Stars: ✭ 3 (-99.95%)
Mutual labels:  corpus, chinese
bert tokenization for java
This is a Java version of the Chinese tokenization described in BERT.
Stars: ✭ 39 (-99.41%)
Mutual labels:  chinese-nlp, bert
ganbert-pytorch
Enhancing the BERT training with Semi-supervised Generative Adversarial Networks in Pytorch/HuggingFace
Stars: ✭ 60 (-99.1%)
Mutual labels:  text-classification, bert
TorchBlocks
A PyTorch-based toolkit for natural language processing
Stars: ✭ 85 (-98.72%)
Mutual labels:  text-classification, bert
textgo
Text preprocessing, representation, similarity calculation, text search and classification. Let's go and play with text!
Stars: ✭ 33 (-99.5%)
Mutual labels:  text-classification, bert
Giveme5W
Extraction of the five journalistic W-questions (5W) from news articles
Stars: ✭ 16 (-99.76%)
Mutual labels:  news, question-answering
feedIO
A Feed Aggregator that Knows What You Want to Read.
Stars: ✭ 26 (-99.61%)
Mutual labels:  news, text-classification
cdQA-ui
⛔ [NOT MAINTAINED] A web interface for cdQA and other question answering systems.
Stars: ✭ 19 (-99.71%)
Mutual labels:  question-answering, bert
Zhparser
zhparser is a PostgreSQL extension for full-text search of the Chinese language
Stars: ✭ 418 (-93.72%)
Mutual labels:  chinese, chinese-nlp
kwx
BERT, LDA, and TFIDF based keyword extraction in Python
Stars: ✭ 33 (-99.5%)
Mutual labels:  text-classification, bert
Species-Names-Corpus
Species names corpus. Plant names and animal names.
Stars: ✭ 23 (-99.65%)
Mutual labels:  corpus, dataset
iamQA
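A Chinese Wikipedia QA reading-comprehension question answering system. It uses an NER model based on CCKS2016 data and a reading comprehension model based on CMRC2018, plus W2V word-vector search, and is deployed with torchserve.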
Stars: ✭ 46 (-99.31%)
Mutual labels:  question-answering, bert
Pytorch-NLU
Pytorch-NLU, a Chinese text classification and sequence annotation toolkit, supports multi-class and multi-label classification tasks of Chinese long text and short text, and supports sequence annotation tasks such as Chinese named entity recognition, part of speech ta…
Stars: ✭ 151 (-97.73%)
Mutual labels:  text-classification, bert
AskNowNQS
A question answering system for RDF knowledge graphs.
Stars: ✭ 32 (-99.52%)
Mutual labels:  word2vec, question-answering
Medical-Names-Corpus
Medical corpus. Corpus of medical institution names. Drug standard codes.
Stars: ✭ 26 (-99.61%)
Mutual labels:  corpus, dataset
Bertweet
BERTweet: A pre-trained language model for English Tweets (EMNLP-2020)
Stars: ✭ 282 (-95.76%)
FewCLUE
FewCLUE: a few-shot learning evaluation benchmark, Chinese version
Stars: ✭ 251 (-96.23%)
Mutual labels:  chinese, bert
Tokenizers
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
Stars: ✭ 5,077 (-23.72%)
Mutual labels:  language-model, bert
Cluecorpus2020
Large-scale Pre-training Corpus for Chinese: a 100G Chinese pre-training corpus
Stars: ✭ 278 (-95.82%)
Mutual labels:  chinese, corpus
Giveme5w1h
Extraction of the journalistic five W and one H questions (5W1H) from news articles: who did what, when, where, why, and how?
Stars: ✭ 316 (-95.25%)
Mutual labels:  news, question-answering
Chinese-Word-Segmentation-in-NLP
State of the art Chinese Word Segmentation with Bi-LSTMs
Stars: ✭ 23 (-99.65%)
Mutual labels:  chinese, language-model
Nlp Projects
word2vec, sentence2vec, machine reading comprehension, dialog system, text classification, pretrained language model (i.e., XLNet, BERT, ELMo, GPT), sequence labeling, information retrieval, information extraction (i.e., entity, relation and event extraction), knowledge graph, text generation, network embedding
Stars: ✭ 360 (-94.59%)
Mutual labels:  text-classification, word2vec
Bert Pytorch
Google AI 2018 BERT pytorch implementation
Stars: ✭ 4,642 (-30.26%)
Mutual labels:  language-model, bert
Small Chinese Corpus
Some useful Chinese corpus datasets; small-scale Chinese corpus data
Stars: ✭ 462 (-93.06%)
Mutual labels:  corpus, chinese-nlp
Dynamic Memory Networks Plus Pytorch
Implementation of Dynamic memory networks plus in Pytorch
Stars: ✭ 123 (-98.15%)
Dan Jurafsky Chris Manning Nlp
My solutions to the Natural Language Processing course taught by Dan Jurafsky and Chris Manning in Winter 2012.
Stars: ✭ 124 (-98.14%)
ODSQA
ODSQA: OPEN-DOMAIN SPOKEN QUESTION ANSWERING DATASET
Stars: ✭ 43 (-99.35%)
Mutual labels:  question-answering, chinese
CLUE pytorch
CLUE baselines in PyTorch: the PyTorch version of the CLUE baselines
Stars: ✭ 72 (-98.92%)
Mutual labels:  chinese, bert