
srviest / Char Cnn Text Classification Pytorch

Licence: apache-2.0
Character-level Convolutional Neural Networks for text classification in PyTorch

Programming Languages

python

Projects that are alternatives to or similar to Char Cnn Text Classification Pytorch

Googlelanguager
R client for the Google Translation API, Google Cloud Natural Language API and Google Cloud Speech API
Stars: ✭ 145 (-1.36%)
Mutual labels:  natural-language-processing, sentiment-analysis
Absa Pytorch
Aspect Based Sentiment Analysis, PyTorch Implementations.
Stars: ✭ 1,181 (+703.4%)
Mutual labels:  natural-language-processing, sentiment-analysis
Textblob Ar
Arabic support for textblob
Stars: ✭ 60 (-59.18%)
Mutual labels:  natural-language-processing, sentiment-analysis
Absapapers
Worth-reading papers and related awesome resources on aspect-based sentiment analysis (ABSA).
Stars: ✭ 142 (-3.4%)
Mutual labels:  natural-language-processing, sentiment-analysis
Pytreebank
😡😇 Stanford Sentiment Treebank loader in Python
Stars: ✭ 93 (-36.73%)
Mutual labels:  natural-language-processing, sentiment-analysis
Stocksight
Stock market analyzer and predictor using Elasticsearch, Twitter, News headlines and Python natural language processing and sentiment analysis
Stars: ✭ 1,037 (+605.44%)
Mutual labels:  natural-language-processing, sentiment-analysis
Text Analytics With Python
Learn how to process, classify, cluster, summarize, understand syntax, semantics and sentiment of text data with the power of Python! This repository contains code and datasets used in my book, "Text Analytics with Python" published by Apress/Springer.
Stars: ✭ 1,132 (+670.07%)
Mutual labels:  natural-language-processing, sentiment-analysis
Nlp.js
An NLP library for building bots, with entity extraction, sentiment analysis, automatic language identification, and more
Stars: ✭ 4,670 (+3076.87%)
Mutual labels:  natural-language-processing, sentiment-analysis
Turkish Bert Nlp Pipeline
BERT-base NLP pipeline for Turkish: NER, Sentiment Analysis, Question Answering, etc.
Stars: ✭ 85 (-42.18%)
Mutual labels:  natural-language-processing, sentiment-analysis
Dialogue Understanding
This repository contains PyTorch implementation for the baseline models from the paper Utterance-level Dialogue Understanding: An Empirical Study
Stars: ✭ 77 (-47.62%)
Mutual labels:  natural-language-processing, sentiment-analysis
Ml Classify Text Js
Machine learning based text classification in JavaScript using n-grams and cosine similarity
Stars: ✭ 38 (-74.15%)
Mutual labels:  natural-language-processing, sentiment-analysis
Nlp Papers
Papers and books to look at when starting NLP 📚
Stars: ✭ 111 (-24.49%)
Mutual labels:  natural-language-processing, sentiment-analysis
Nlp With Ruby
Curated List: Practical Natural Language Processing done in Ruby
Stars: ✭ 907 (+517.01%)
Mutual labels:  natural-language-processing, sentiment-analysis
Pattern
Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.
Stars: ✭ 8,112 (+5418.37%)
Mutual labels:  natural-language-processing, sentiment-analysis
Conv Emotion
This repo contains implementation of different architectures for emotion recognition in conversations.
Stars: ✭ 646 (+339.46%)
Mutual labels:  natural-language-processing, sentiment-analysis
Repo 2017
Python code for Machine Learning, NLP, Deep Learning and Reinforcement Learning with Keras and Theano
Stars: ✭ 1,123 (+663.95%)
Mutual labels:  natural-language-processing, sentiment-analysis
Aspect Based Sentiment Analysis
A paper list for aspect based sentiment analysis.
Stars: ✭ 311 (+111.56%)
Mutual labels:  natural-language-processing, sentiment-analysis
Text mining resources
Resources for learning about Text Mining and Natural Language Processing
Stars: ✭ 358 (+143.54%)
Mutual labels:  natural-language-processing, sentiment-analysis
Senta
Baidu's open-source Sentiment Analysis System.
Stars: ✭ 1,187 (+707.48%)
Mutual labels:  natural-language-processing, sentiment-analysis
Pynlp
A pythonic wrapper for Stanford CoreNLP.
Stars: ✭ 103 (-29.93%)
Mutual labels:  natural-language-processing, sentiment-analysis

Introduction

This is a PyTorch implementation of Zhang et al.'s paper Character-level Convolutional Networks for Text Classification, modified from Shawn1993/cnn-text-classification-pytorch.

Zhang's original implementation in Torch: https://github.com/zhangxiangxiao/Crepe
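
For orientation, the paper's small model stacks six temporal (1-D) convolutions over a one-hot character encoding, followed by three fully connected layers with dropout. Below is a minimal sketch of that architecture in PyTorch; the layer sizes follow the paper's small configuration and are meant only as an illustration, not as the exact module defined in this repository:

import torch
import torch.nn as nn

class CharCNN(nn.Module):
    """Small character-level CNN roughly following Zhang et al. (2015)."""
    def __init__(self, num_classes, alphabet_size=70, l0=1014, num_features=256):
        super().__init__()
        # Six temporal convolutions; max-pooling (size 3) after layers 1, 2 and 6.
        self.conv = nn.Sequential(
            nn.Conv1d(alphabet_size, num_features, kernel_size=7), nn.ReLU(),
            nn.MaxPool1d(3),
            nn.Conv1d(num_features, num_features, kernel_size=7), nn.ReLU(),
            nn.MaxPool1d(3),
            nn.Conv1d(num_features, num_features, kernel_size=3), nn.ReLU(),
            nn.Conv1d(num_features, num_features, kernel_size=3), nn.ReLU(),
            nn.Conv1d(num_features, num_features, kernel_size=3), nn.ReLU(),
            nn.Conv1d(num_features, num_features, kernel_size=3), nn.ReLU(),
            nn.MaxPool1d(3),
        )
        # With l0 = 1014 the convolutional stack leaves (l0 - 96) / 27 = 34 frames.
        conv_out = ((l0 - 96) // 27) * num_features
        self.fc = nn.Sequential(
            nn.Linear(conv_out, 1024), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(1024, 1024), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(1024, num_classes),
        )

    def forward(self, x):
        # x: (batch, alphabet_size, l0) one-hot encoded characters
        x = self.conv(x)
        return self.fc(x.flatten(1))

model = CharCNN(num_classes=4)            # e.g. the 4 AG News classes
logits = model(torch.zeros(2, 70, 1014))  # dummy batch of two samples
print(logits.shape)                       # torch.Size([2, 4])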

Requirements

  • Python 2 or 3
  • PyTorch >= 0.5
  • numpy
  • termcolor
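
The dependencies can usually be installed with pip (assuming the standard PyPI package names):

pip install torch numpy termcolor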

Dataset Format

Each sample looks like:

"class idx","sentence or text to be classified"  

Samples are separated by newlines.

Example:

"3","Fears for T N pension after talks, Unions representing workers at Turner   Newall say they are 'disappointed' after talks with stricken parent firm Federal Mogul."
"4","The Race is On: Second Private Team Sets Launch Date for Human Spaceflight (SPACE.com)","SPACE.com - TORONTO, Canada -- A second\team of rocketeers competing for the  #36;10 million Ansari X Prize, a contest for\privately funded suborbital space flight, has officially announced the first\launch date for its manned rocket."

Train

python train.py -h

You will get:

Character-level CNN text classifier

optional arguments:
  -h, --help            show this help message and exit
  --train_path DIR      path to training data csv
  --val_path DIR        path to validation data csv

Learning options:
  --lr LR               initial learning rate [default: 0.0001]
  --epochs EPOCHS       number of epochs for train [default: 200]
  --batch_size BATCH_SIZE
                        batch size for training [default: 64]
  --max_norm MAX_NORM   Norm cutoff to prevent explosion of gradients
  --optimizer OPTIMIZER
                        Type of optimizer. SGD|Adam|ASGD are supported
                        [default: Adam]
  --class_weight        Weights should be a 1D Tensor assigning weight to each
                        of the classes.
  --dynamic_lr          Use dynamic learning schedule.
  --milestones MILESTONES [MILESTONES ...]
                        List of epoch indices. Must be increasing.
                        Default:[5,10,15]
  --decay_factor DECAY_FACTOR
                        Decay factor for reducing learning rate [default: 0.5]

Model options:
  --alphabet_path ALPHABET_PATH
                        Contains all characters for prediction
  --l0 L0               maximum length of input sequence to CNNs [default:
                        1014]
  --shuffle             shuffle the data every epoch
  --dropout DROPOUT     the probability for dropout [default: 0.5]
  -kernel_num KERNEL_NUM
                        number of each kind of kernel
  -kernel_sizes KERNEL_SIZES
                        comma-separated kernel size to use for convolution

Device options:
  --num_workers NUM_WORKERS
                        Number of workers used in data-loading
  --cuda                enable the gpu

Experiment options:
  --verbose             Turn on progress tracking per iteration for debugging
  --continue_from CONTINUE_FROM
                        Continue from checkpoint model
  --checkpoint          Enables checkpoint saving of model
  --checkpoint_per_batch CHECKPOINT_PER_BATCH
                        Save checkpoint per batch. 0 means never save
                        [default: 10000]
  --save_folder SAVE_FOLDER
                        Location to save epoch models, training configurations
                        and results.
  --log_config          Store experiment configuration
  --log_result          Store experiment result
  --log_interval LOG_INTERVAL
                        how many steps to wait before logging training status
                        [default: 1]
  --val_interval VAL_INTERVAL
                        how many steps to wait before validation [default: 200]
  --save_interval SAVE_INTERVAL
                        how many epochs to wait before saving [default:1]
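
The --dynamic_lr, --milestones and --decay_factor options describe a step-decay learning-rate schedule. A minimal sketch of that behaviour, assuming it corresponds to PyTorch's standard MultiStepLR scheduler (the actual logic in train.py may differ):

import torch
from torch.optim.lr_scheduler import MultiStepLR

params = [torch.nn.Parameter(torch.zeros(1))]  # stand-in for the model parameters
optimizer = torch.optim.Adam(params, lr=1e-4)  # --lr 0.0001, --optimizer Adam
scheduler = MultiStepLR(optimizer, milestones=[5, 10, 15], gamma=0.5)  # --milestones, --decay_factor

for epoch in range(20):
    # ... train and validate one epoch ...
    scheduler.step()  # lr stays at 1e-4, then is halved after epochs 5, 10 and 15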
To train with the default settings:

python train.py

You will get:

Epoch[8] Batch[200] - loss: 0.237892  lr: 0.00050  acc: 93.7500%(120/128)
Evaluation - loss: 0.363364  acc: 89.1155%(6730/7552)
Label:   0      Prec:  93.2% (1636/1755)  Recall:  86.6% (1636/1890)  F-Score:  89.8%
Label:   1      Prec:  94.6% (1802/1905)  Recall:  95.6% (1802/1884)  F-Score:  95.1%
Label:   2      Prec:  85.6% (1587/1854)  Recall:  84.1% (1587/1888)  F-Score:  84.8%
Label:   3      Prec:  83.7% (1705/2038)  Recall:  90.2% (1705/1890)  F-Score:  86.8%
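
Precision, recall and F-score are reported per class; the F-score is the harmonic mean of precision and recall, e.g. for label 0: 2 × 0.932 × 0.866 / (0.932 + 0.866) ≈ 0.898, i.e. the 89.8% shown above.

To override the defaults, pass the options documented above on the command line, for example (the CSV paths and option values here are illustrative):

python train.py --train_path data/ag_news_csv/train.csv --val_path data/ag_news_csv/test.csv \
                --epochs 10 --batch_size 128 --lr 0.0005 --dynamic_lr --milestones 5 10 15 \
                --decay_factor 0.5 --shuffle --cuda --checkpoint --save_folder models_CharCNN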

Test

Once you have constructed your test set, you can run testing like this:

python test.py --test-path='data/ag_news_csv/test.csv' --model-path='models_CharCNN/CharCNN_best.pth.tar'

The --model-path option specifies where to load the trained model from.
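
The .pth.tar suffix suggests the checkpoint is a dictionary written with torch.save; under that assumption it can be inspected as follows (the actual keys depend on how train.py saves the model):

import torch

checkpoint = torch.load('models_CharCNN/CharCNN_best.pth.tar', map_location='cpu')
if isinstance(checkpoint, dict):
    print(list(checkpoint.keys()))  # e.g. model weights, epoch, optimizer state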

Reference

Xiang Zhang, Junbo Zhao, Yann LeCun. Character-level Convolutional Networks for Text Classification. Advances in Neural Information Processing Systems 28 (NIPS 2015). https://arxiv.org/abs/1509.01626