All Projects → fendouai → Chinese Text Classification

fendouai / Chinese Text Classification

Licence: apache-2.0
Chinese-Text-Classification,Tensorflow CNN(卷积神经网络)实现的中文文本分类。QQ群:522785813,微信群二维码:http://www.tensorflownews.com/

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Chinese Text Classification

Cnn Question Classification Keras
Chinese Question Classifier (Keras Implementation) on BQuLD
Stars: ✭ 28 (-90.14%)
Mutual labels:  chinese, cnn, text-classification
Text Classification Cnn Rnn
CNN-RNN中文文本分类,基于TensorFlow
Stars: ✭ 3,613 (+1172.18%)
Mutual labels:  chinese, cnn, text-classification
Cnn Text Classification Tf Chinese
CNN for Chinese Text Classification in Tensorflow
Stars: ✭ 237 (-16.55%)
Mutual labels:  chinese, cnn, text-classification
Jiebar
Chinese text segmentation with R. R语言中文分词 (文档已更新 🎉 :https://qinwenfeng.com/jiebaR/ )
Stars: ✭ 302 (+6.34%)
Mutual labels:  chinese, jieba
Cnn Text Classification Keras
Text Classification by Convolutional Neural Network in Keras
Stars: ✭ 213 (-25%)
Mutual labels:  cnn, text-classification
Hierarchical Attention Networks Pytorch
Hierarchical Attention Networks for document classification
Stars: ✭ 239 (-15.85%)
Mutual labels:  cnn, text-classification
Char Cnn Text Classification Tensorflow
Character-level Convolutional Networks for Text Classification论文仿真实现
Stars: ✭ 72 (-74.65%)
Mutual labels:  cnn, text-classification
Eda nlp for chinese
An implement of the paper of EDA for Chinese corpus.中文语料的EDA数据增强工具。NLP数据增强。论文阅读笔记。
Stars: ✭ 660 (+132.39%)
Mutual labels:  chinese, text-classification
Cnn handwritten chinese recognition
CNN在线识别手写中文。
Stars: ✭ 365 (+28.52%)
Mutual labels:  chinese, cnn
Nlp chinese corpus
大规模中文自然语言处理语料 Large Scale Chinese Corpus for NLP
Stars: ✭ 6,656 (+2243.66%)
Mutual labels:  chinese, text-classification
Nlp xiaojiang
自然语言处理(nlp),小姜机器人(闲聊检索式chatbot),BERT句向量-相似度(Sentence Similarity),XLNET句向量-相似度(text xlnet embedding),文本分类(Text classification), 实体提取(ner,bert+bilstm+crf),数据增强(text augment, data enhance),同义句同义词生成,句子主干提取(mainpart),中文汉语短文本相似度,文本特征工程,keras-http-service调用
Stars: ✭ 954 (+235.92%)
Mutual labels:  chinese, text-classification
Gse
Go efficient multilingual NLP and text segmentation; support english, chinese, japanese and other. Go 高性能多语言 NLP 和分词
Stars: ✭ 1,695 (+496.83%)
Mutual labels:  chinese, jieba
Text Classification Demos
Neural models for Text Classification in Tensorflow, such as cnn, dpcnn, fasttext, bert ...
Stars: ✭ 144 (-49.3%)
Mutual labels:  cnn, text-classification
Classifier multi label textcnn
multi-label,classifier,text classification,多标签文本分类,文本分类,BERT,ALBERT,multi-label-classification
Stars: ✭ 116 (-59.15%)
Mutual labels:  cnn, text-classification
Cnn intent classification
CNN for intent classification task in a Chatbot
Stars: ✭ 90 (-68.31%)
Mutual labels:  cnn, text-classification
Cluepretrainedmodels
高质量中文预训练模型集合:最先进大模型、最快小模型、相似度专门模型
Stars: ✭ 493 (+73.59%)
Mutual labels:  chinese, text-classification
Sentiment analysis albert
sentiment analysis、文本分类、ALBERT、TextCNN、classification、tensorflow、BERT、CNN、text classification
Stars: ✭ 61 (-78.52%)
Mutual labels:  cnn, text-classification
Lstm Cnn classification
Stars: ✭ 64 (-77.46%)
Mutual labels:  cnn, text-classification
Lightnlp
基于Pytorch和torchtext的自然语言处理深度学习框架。
Stars: ✭ 739 (+160.21%)
Mutual labels:  chinese, text-classification
Cluedatasetsearch
搜索所有中文NLP数据集,附常用英文NLP数据集
Stars: ✭ 2,112 (+643.66%)
Mutual labels:  chinese, text-classification

用卷积神经网络基于 Tensorflow 实现的中文文本分类

这个项目是基于以下项目改写: cnn-text-classification-tf

关于 Chinese-Text-Classification 的问题欢迎来这里提问

主要的改动:

  • 兼容 tensorflow 1.2 以上
  • 增加了中文数据集
  • 增加了中文处理流程

特性:

  • 兼容最新 TensorFlow
  • 中文数据集
  • 基于 jieba 的中文处理工具
  • 模型训练,模型保存,模型评估的完整实现

训练结果

模型评估

以下为原项目的 README

This code belongs to the "Implementing a CNN for Text Classification in Tensorflow" blog post.

It is slightly simplified implementation of Kim's Convolutional Neural Networks for Sentence Classification paper in Tensorflow.

Requirements

  • Python 3
  • Tensorflow > 1.2
  • Numpy

Training

Print parameters:

./train.py --help
optional arguments:
  -h, --help            show this help message and exit
  --embedding_dim EMBEDDING_DIM
                        Dimensionality of character embedding (default: 128)
  --filter_sizes FILTER_SIZES
                        Comma-separated filter sizes (default: '3,4,5')
  --num_filters NUM_FILTERS
                        Number of filters per filter size (default: 128)
  --l2_reg_lambda L2_REG_LAMBDA
                        L2 regularizaion lambda (default: 0.0)
  --dropout_keep_prob DROPOUT_KEEP_PROB
                        Dropout keep probability (default: 0.5)
  --batch_size BATCH_SIZE
                        Batch Size (default: 64)
  --num_epochs NUM_EPOCHS
                        Number of training epochs (default: 100)
  --evaluate_every EVALUATE_EVERY
                        Evaluate model on dev set after this many steps
                        (default: 100)
  --checkpoint_every CHECKPOINT_EVERY
                        Save model after this many steps (default: 100)
  --allow_soft_placement ALLOW_SOFT_PLACEMENT
                        Allow device soft device placement
  --noallow_soft_placement
  --log_device_placement LOG_DEVICE_PLACEMENT
                        Log placement of ops on devices
  --nolog_device_placement

Train:

./train.py

Evaluating

./eval.py --eval_train --checkpoint_dir="./runs/1459637919/checkpoints/"

Replace the checkpoint dir with the output from the training. To use your own data, change the eval.py script to load your data.

References

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].