srviest / Char Cnn Text Classification Pytorch
Licence: apache-2.0
Character-level Convolutional Neural Networks for text classification in PyTorch
Stars: ✭ 147
Programming Languages
python
Projects that are alternatives to or similar to Char Cnn Text Classification Pytorch
Googlelanguager
R client for the Google Translation API, Google Cloud Natural Language API and Google Cloud Speech API
Stars: ✭ 145 (-1.36%)
Mutual labels: natural-language-processing, sentiment-analysis
Absa Pytorch
Aspect Based Sentiment Analysis, PyTorch Implementations.
Stars: ✭ 1,181 (+703.4%)
Mutual labels: natural-language-processing, sentiment-analysis
Textblob Ar
Arabic support for textblob
Stars: ✭ 60 (-59.18%)
Mutual labels: natural-language-processing, sentiment-analysis
Absapapers
Worth-reading papers and related awesome resources on aspect-based sentiment analysis (ABSA).
Stars: ✭ 142 (-3.4%)
Mutual labels: natural-language-processing, sentiment-analysis
Pytreebank
😡😇 Stanford Sentiment Treebank loader in Python
Stars: ✭ 93 (-36.73%)
Mutual labels: natural-language-processing, sentiment-analysis
Stocksight
Stock market analyzer and predictor using Elasticsearch, Twitter, News headlines and Python natural language processing and sentiment analysis
Stars: ✭ 1,037 (+605.44%)
Mutual labels: natural-language-processing, sentiment-analysis
Text Analytics With Python
Learn how to process, classify, cluster, summarize, understand syntax, semantics and sentiment of text data with the power of Python! This repository contains code and datasets used in my book, "Text Analytics with Python" published by Apress/Springer.
Stars: ✭ 1,132 (+670.07%)
Mutual labels: natural-language-processing, sentiment-analysis
Nlp.js
An NLP library for building bots, with entity extraction, sentiment analysis, automatic language identification, and more
Stars: ✭ 4,670 (+3076.87%)
Mutual labels: natural-language-processing, sentiment-analysis
Turkish Bert Nlp Pipeline
Bert-base NLP pipeline for Turkish, Ner, Sentiment Analysis, Question Answering etc.
Stars: ✭ 85 (-42.18%)
Mutual labels: natural-language-processing, sentiment-analysis
Dialogue Understanding
This repository contains PyTorch implementation for the baseline models from the paper Utterance-level Dialogue Understanding: An Empirical Study
Stars: ✭ 77 (-47.62%)
Mutual labels: natural-language-processing, sentiment-analysis
Ml Classify Text Js
Machine learning based text classification in JavaScript using n-grams and cosine similarity
Stars: ✭ 38 (-74.15%)
Mutual labels: natural-language-processing, sentiment-analysis
Nlp Papers
Papers and Book to look at when starting NLP 📚
Stars: ✭ 111 (-24.49%)
Mutual labels: natural-language-processing, sentiment-analysis
Nlp With Ruby
Curated List: Practical Natural Language Processing done in Ruby
Stars: ✭ 907 (+517.01%)
Mutual labels: natural-language-processing, sentiment-analysis
Pattern
Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.
Stars: ✭ 8,112 (+5418.37%)
Mutual labels: natural-language-processing, sentiment-analysis
Conv Emotion
This repo contains implementation of different architectures for emotion recognition in conversations.
Stars: ✭ 646 (+339.46%)
Mutual labels: natural-language-processing, sentiment-analysis
Repo 2017
Python codes in Machine Learning, NLP, Deep Learning and Reinforcement Learning with Keras and Theano
Stars: ✭ 1,123 (+663.95%)
Mutual labels: natural-language-processing, sentiment-analysis
Aspect Based Sentiment Analysis
A paper list for aspect based sentiment analysis.
Stars: ✭ 311 (+111.56%)
Mutual labels: natural-language-processing, sentiment-analysis
Text mining resources
Resources for learning about Text Mining and Natural Language Processing
Stars: ✭ 358 (+143.54%)
Mutual labels: natural-language-processing, sentiment-analysis
Senta
Baidu's open-source Sentiment Analysis System.
Stars: ✭ 1,187 (+707.48%)
Mutual labels: natural-language-processing, sentiment-analysis
Pynlp
A pythonic wrapper for Stanford CoreNLP.
Stars: ✭ 103 (-29.93%)
Mutual labels: natural-language-processing, sentiment-analysis
Introduction
This is a PyTorch implementation of the paper Character-level Convolutional Networks for Text Classification by Zhang et al., adapted from Shawn1993/cnn-text-classification-pytorch.
Zhang's original implementation in Torch: https://github.com/zhangxiangxiao/Crepe
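For orientation, the paper feeds the network a fixed-length sequence of one-hot encoded characters: each character in the alphabet gets its own index, characters outside the alphabet become all-zero vectors, and text beyond the maximum length is dropped. The Python sketch below illustrates that idea with plain lists. The `quantize` function, the short alphabet, the lowercasing step and the small `l0` are assumptions chosen for readability, not code taken from this repository.

```python
# Illustrative sketch only. The paper uses a 70-character alphabet and
# l0 = 1014; a tiny hypothetical alphabet and l0 = 16 are used here.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789 "


def quantize(text, alphabet=ALPHABET, l0=16):
    """Return an l0 x len(alphabet) one-hot matrix as nested lists."""
    index = {ch: i for i, ch in enumerate(alphabet)}
    matrix = [[0] * len(alphabet) for _ in range(l0)]
    # Characters past l0 are dropped; shorter texts leave trailing zero rows.
    # Lowercasing is an assumption that matches the lowercase-only alphabet.
    for pos, ch in enumerate(text.lower()[:l0]):
        col = index.get(ch)
        if col is not None:  # characters outside the alphabet stay all-zero
            matrix[pos][col] = 1
    return matrix


encoded = quantize("Fears for T N!")
print(len(encoded), len(encoded[0]))
```

Each row holds at most a single 1, so the matrix can be fed to a 1D convolution with one input channel per alphabet character.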
Requirements
- python 2, 3
- pytorch >= 0.5
- numpy
- termcolor
Dataset Format
Each sample looks like:
"class idx","sentence or text to be classified"
Samples are separated by newlines.
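The format described above can be read with Python's standard `csv` module, which handles quoted fields that contain commas. The following is a minimal sketch, not the loader used by this project: `read_samples` is a hypothetical helper, the sample rows are shortened, and joining any extra columns into a single text is an assumption.

```python
import csv
import io

# Shortened sample rows in the "class idx","text" format.
SAMPLE = '''"3","Fears for T N pension after talks"
"4","The Race is On","SPACE.com - TORONTO, Canada"
'''


def read_samples(handle):
    """Yield (label, text) pairs from a CSV stream in the format above."""
    for row in csv.reader(handle, quotechar='"'):
        if not row:
            continue  # skip blank lines
        label = int(row[0])
        # Assumption: extra columns (e.g. title and description) are joined.
        text = " ".join(row[1:])
        yield label, text


samples = list(read_samples(io.StringIO(SAMPLE)))
for label, text in samples:
    print(label, text)
```

To read a real file, pass an open file handle (opened with `newline=""`, as the `csv` documentation recommends) instead of the `io.StringIO` object.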
Example:
"3","Fears for T N pension after talks, Unions representing workers at Turner Newall say they are 'disappointed' after talks with stricken parent firm Federal Mogul."
"4","The Race is On: Second Private Team Sets Launch Date for Human Spaceflight (SPACE.com)","SPACE.com - TORONTO, Canada -- A second\team of rocketeers competing for the #36;10 million Ansari X Prize, a contest for\privately funded suborbital space flight, has officially announced the first\launch date for its manned rocket."
Train
python train.py -h
You will get:
Character-level CNN text classifier

optional arguments:
  -h, --help            show this help message and exit
  --train_path DIR      path to training data csv
  --val_path DIR        path to validation data csv

Learning options:
  --lr LR               initial learning rate [default: 0.0001]
  --epochs EPOCHS       number of epochs for train [default: 200]
  --batch_size BATCH_SIZE
                        batch size for training [default: 64]
  --max_norm MAX_NORM   Norm cutoff to prevent explosion of gradients
  --optimizer OPTIMIZER
                        Type of optimizer. SGD|Adam|ASGD are supported
                        [default: Adam]
  --class_weight        Weights should be a 1D Tensor assigning weight to each
                        of the classes.
  --dynamic_lr          Use dynamic learning schedule.
  --milestones MILESTONES [MILESTONES ...]
                        List of epoch indices. Must be increasing.
                        Default:[5,10,15]
  --decay_factor DECAY_FACTOR
                        Decay factor for reducing learning rate [default: 0.5]

Model options:
  --alphabet_path ALPHABET_PATH
                        Contains all characters for prediction
  --l0 L0               maximum length of input sequence to CNNs [default:
                        1014]
  --shuffle             shuffle the data every epoch
  --dropout DROPOUT     the probability for dropout [default: 0.5]
  -kernel_num KERNEL_NUM
                        number of each kind of kernel
  -kernel_sizes KERNEL_SIZES
                        comma-separated kernel size to use for convolution

Device options:
  --num_workers NUM_WORKERS
                        Number of workers used in data-loading
  --cuda                enable the gpu

Experiment options:
  --verbose             Turn on progress tracking per iteration for debugging
  --continue_from CONTINUE_FROM
                        Continue from checkpoint model
  --checkpoint          Enables checkpoint saving of model
  --checkpoint_per_batch CHECKPOINT_PER_BATCH
                        Save checkpoint per batch. 0 means never save
                        [default: 10000]
  --save_folder SAVE_FOLDER
                        Location to save epoch models, training configurations
                        and results.
  --log_config          Store experiment configuration
  --log_result          Store experiment result
  --log_interval LOG_INTERVAL
                        how many steps to wait before logging training status
                        [default: 1]
  --val_interval VAL_INTERVAL
                        how many steps to wait before vaidation [default: 200]
  --save_interval SAVE_INTERVAL
                        how many epochs to wait before saving [default:1]
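The --dynamic_lr, --milestones and --decay_factor options together describe a step-wise learning rate schedule. The sketch below shows one plausible reading, assuming the rate is multiplied by the decay factor each time a milestone epoch is reached (the rule used by PyTorch's `MultiStepLR` scheduler). Whether train.py applies exactly this rule is an assumption, and `lr_at_epoch` is a hypothetical helper written for illustration.

```python
from bisect import bisect_right


def lr_at_epoch(epoch, base_lr=0.0001, milestones=(5, 10, 15), decay_factor=0.5):
    """Learning rate once every milestone <= epoch has been applied."""
    # Number of milestones already reached at this epoch.
    passed = bisect_right(milestones, epoch)
    return base_lr * decay_factor ** passed


# Defaults are taken from the help text above.
for epoch in (0, 4, 5, 10, 15):
    print(epoch, lr_at_epoch(epoch))
```

With the default values the rate stays at its initial value until epoch 5 and is halved at each of the three milestones.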
python train.py
You will get:
Epoch[8] Batch[200] - loss: 0.237892 lr: 0.00050 acc: 93.7500%(120/128))
Evaluation - loss: 0.363364 acc: 89.1155%(6730/7552)
Label: 0 Prec: 93.2% (1636/1755) Recall: 86.6% (1636/1890) F-Score: 89.8%
Label: 1 Prec: 94.6% (1802/1905) Recall: 95.6% (1802/1884) F-Score: 95.1%
Label: 2 Prec: 85.6% (1587/1854) Recall: 84.1% (1587/1888) F-Score: 84.8%
Label: 3 Prec: 83.7% (1705/2038) Recall: 90.2% (1705/1890) F-Score: 86.8%
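The per-label figures in the evaluation log can be reproduced from the counts in parentheses: precision is correct/predicted, recall is correct/actual, and the F-score is their harmonic mean. The short Python 3 sketch below recomputes them from the numbers printed above; the `prf` helper is a hypothetical name, not a function from this repository.

```python
def prf(correct, predicted, actual):
    """Return (precision, recall, F-score) from raw counts."""
    precision = correct / predicted
    recall = correct / actual
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# (correct, predicted, actual) counts copied from the evaluation log above.
counts = {
    0: (1636, 1755, 1890),
    1: (1802, 1905, 1884),
    2: (1587, 1854, 1888),
    3: (1705, 2038, 1890),
}

for label, (correct, predicted, actual) in counts.items():
    p, r, f = prf(correct, predicted, actual)
    print("Label: %d Prec: %.1f%% Recall: %.1f%% F-Score: %.1f%%"
          % (label, 100 * p, 100 * r, 100 * f))

# Overall accuracy is the total of correct predictions over all samples.
accuracy = sum(c for c, _, _ in counts.values()) / sum(a for _, _, a in counts.values())
print("acc: %.4f%%" % (100 * accuracy))
```

The correct counts sum to 6730 and the actual counts to 7552, which matches the accuracy line in the log.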
Test
Once you have prepared your test set, run the evaluation like this:
python test.py --test-path='data/ag_news_csv/test.csv' --model-path='models_CharCNN/CharCNN_best.pth.tar'
The --model-path option specifies the checkpoint file from which the model is loaded.
Reference
- Xiang Zhang, Junbo Zhao, Yann LeCun. Character-level Convolutional Networks for Text Classification. Advances in Neural Information Processing Systems 28 (NIPS 2015)