bigboNed3 / Chinese_text_cnn

A PyTorch implementation of TextCNN for Chinese text classification and sentiment analysis

TextCNN in PyTorch for Chinese text classification

Paper

Convolutional Neural Networks for Sentence Classification

References

Dependencies

  • python3.5
  • pytorch==1.0.0
  • torchtext==0.3.1
  • jieba==0.39

Word vectors

https://github.com/Embedding/Chinese-Word-Vectors
(This project uses the Word2vec word vectors trained on the Zhihu_QA corpus of Zhihu question-and-answer data.)
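
As an illustration of how such vectors could be loaded, the sketch below parses the word2vec text format (a header line with the word count and dimension, then one word and its values per line) and builds an embedding matrix aligned with a vocabulary. The function names `load_text_vectors` and `build_embedding_matrix` and the tiny in-memory sample are assumptions made for this example; they are not taken from this repository, which may load vectors differently.

```python
import io


def load_text_vectors(handle, vocab):
    """Parse word2vec text format, keeping only the words found in `vocab`."""
    header = handle.readline().split()
    dim = int(header[1])  # header is "<word count> <dimension>"
    vectors = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip malformed lines
        word = parts[0]
        if word in vocab:
            vectors[word] = [float(x) for x in parts[1:]]
    return dim, vectors


def build_embedding_matrix(vocab, vectors, dim):
    """Rows follow the vocabulary order; words without a vector get zeros."""
    return [vectors.get(word, [0.0] * dim) for word in vocab]


# Hypothetical three-word file: "I", "like", "movie".
sample = io.StringIO("3 2\n我 0.1 0.2\n喜欢 0.3 0.4\n电影 0.5 0.6\n")
vocab = ["我", "电影", "<unk>"]
dim, vectors = load_text_vectors(sample, set(vocab))
matrix = build_embedding_matrix(vocab, vectors, dim)
print(matrix)
```

For a real vector file, the same function could be given a handle opened with `open(path, encoding="utf-8")`.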

Usage

python3 main.py -h

Training

python3 main.py
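
The commands in this README pass boolean options such as `-static=true`, `-non-static=true` and `-multichannel=true`. The following is a minimal sketch of how flags of that shape could be parsed with `argparse`; it is not the repository's actual `main.py`. The `str2bool` helper and the help texts are assumptions added for illustration.

```python
import argparse


def str2bool(value):
    # argparse's type=bool would treat any non-empty string (even "false")
    # as True, so the text is parsed explicitly here.
    lowered = value.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError("expected true or false, got %r" % value)


def build_parser():
    parser = argparse.ArgumentParser(description="TextCNN classifier (sketch)")
    parser.add_argument("-static", type=str2bool, default=False,
                        help="initialize the embedding from pretrained vectors")
    parser.add_argument("-non-static", type=str2bool, default=False,
                        help="fine-tune the pretrained vectors")
    parser.add_argument("-multichannel", type=str2bool, default=False,
                        help="use a fine-tuned and a static channel together")
    return parser


# Equivalent to: python main.py -static=true -non-static=true
args = build_parser().parse_args(["-static=true", "-non-static=true"])
print(args.static, args.non_static, args.multichannel)
```

`argparse` maps `-non-static` to the attribute `args.non_static`, replacing the inner hyphen with an underscore.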

Accuracy

  • [x] CNN-rand: randomly initialized embedding
      python main.py
    
      Batch[1800] - loss: 0.009499  acc: 100.0000%(128/128)
      Evaluation - loss: 0.000026  acc: 94.0000%(6616/7000)
      early stop by 1000 steps, acc: 94.0000%
    
  • [x] CNN-static: pretrained word vectors, kept static
      python main.py -static=true
    
      Batch[1900] - loss: 0.011894  acc: 100.0000%(128/128)
      Evaluation - loss: 0.000018  acc: 95.0000%(6679/7000)
      early stop by 1000 steps, acc: 95.0000%
    
  • [x] CNN-non-static: pretrained word vectors, fine-tuned
      python main.py -static=true -non-static=true
    
      Batch[1500] - loss: 0.008823  acc: 99.0000%(127/128)
      Evaluation - loss: 0.000016  acc: 96.0000%(6729/7000)
      early stop by 1000 steps, acc: 96.0000%
    
  • [x] CNN-multichannel: a fine-tuned channel plus a static channel
      python main.py -static=true -non-static=true -multichannel=true
    
      Batch[1500] - loss: 0.023020  acc: 98.0000%(126/128)
      Evaluation - loss: 0.000016  acc: 96.0000%(6744/7000)
      early stop by 1000 steps, acc: 96.0000%
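
The four variants above differ only in how the embedding layer is initialized and whether it is trained. The sketch below shows one way to express them in a single PyTorch module, following the architecture in the paper cited above. It is not the repository's own model code: the class layout, parameter names and default sizes (kernel sizes 3, 4, 5 and 100 filters) are assumptions chosen for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TextCNN(nn.Module):
    def __init__(self, vocab_size, embed_dim, num_classes, kernel_sizes=(3, 4, 5),
                 num_filters=100, dropout=0.5, pretrained=None,
                 static=False, non_static=False, multichannel=False):
        super().__init__()
        # CNN-rand: randomly initialized, trainable embedding.
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        if static and pretrained is not None:
            # CNN-static freezes the pretrained vectors;
            # CNN-non-static (non_static=True) fine-tunes them.
            self.embedding = nn.Embedding.from_pretrained(
                pretrained.clone(), freeze=not non_static)
        self.embedding2 = None
        if multichannel and pretrained is not None:
            # CNN-multichannel: a second channel that always stays frozen.
            self.embedding2 = nn.Embedding.from_pretrained(
                pretrained.clone(), freeze=True)
        channels = 2 if self.embedding2 is not None else 1
        self.convs = nn.ModuleList(
            [nn.Conv2d(channels, num_filters, (k, embed_dim)) for k in kernel_sizes])
        self.dropout = nn.Dropout(dropout)
        self.fc = nn.Linear(num_filters * len(kernel_sizes), num_classes)

    def forward(self, tokens):
        x = self.embedding(tokens).unsqueeze(1)      # (batch, 1, seq, dim)
        if self.embedding2 is not None:
            x = torch.cat([x, self.embedding2(tokens).unsqueeze(1)], dim=1)
        pooled = []
        for conv in self.convs:
            h = F.relu(conv(x)).squeeze(3)           # (batch, filters, seq - k + 1)
            pooled.append(F.max_pool1d(h, h.size(2)).squeeze(2))  # max over time
        return self.fc(self.dropout(torch.cat(pooled, dim=1)))


# Hypothetical sizes: 50-word vocabulary, 16-dim vectors, 2 classes.
torch.manual_seed(0)
pretrained = torch.randn(50, 16)
model = TextCNN(50, 16, 2, pretrained=pretrained,
                static=True, non_static=True, multichannel=True)
logits = model(torch.randint(0, 50, (4, 20)))        # 4 sentences of 20 tokens
print(logits.shape)
```

Input sequences must be at least as long as the largest kernel size, so short sentences would need padding before they are passed to the model.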
    