

NLP paper implementation relevant to classification with PyTorch

The papers were implemented using Korean corpora.

Preliminary & Usage

Preliminary

pyenv virtualenv 3.7.7 nlp
pyenv activate nlp
pip install -r requirements.txt

Usage

python build_dataset.py
python build_vocab.py
python train.py # default training parameters
python evaluate.py # default evaluation parameters
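
The four commands above have to run in that order, since each step consumes the previous step's output. A minimal sketch of a wrapper that enforces the order and stops at the first failing step is shown below. The `run_pipeline` function and its injectable `runner` parameter are hypothetical helpers for illustration, not part of the repository.

```python
import subprocess
import sys

# The four scripts from the usage listing, in the order given there.
STEPS = ["build_dataset.py", "build_vocab.py", "train.py", "evaluate.py"]


def run_pipeline(steps=STEPS, runner=subprocess.run):
    """Run each script with the current interpreter; stop at the first failure.

    `runner` is injectable (a hypothetical convenience) so the sequencing can
    be exercised without launching real processes.
    """
    completed = []
    for script in steps:
        # check=True makes subprocess.run raise CalledProcessError
        # when the script exits with a non-zero status.
        runner([sys.executable, script], check=True)
        completed.append(script)
    return completed
```

Calling `run_pipeline()` from inside one of the example directories would run all four scripts with their default parameters.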

Single sentence classification (sentiment classification task)

Using the Naver sentiment movie corpus v1.0 (a.k.a. nsmc)

Configuration
- conf/model/{type}.json (e.g. type = ["sencnn", "charcnn", ...])
- conf/dataset/nsmc.json

Structure

# example: Convolutional_Neural_Networks_for_Sentence_Classification
├── build_dataset.py
├── build_vocab.py
├── conf
│   ├── dataset
│   │   └── nsmc.json
│   └── model
│       └── sencnn.json
├── evaluate.py
├── experiments
│   └── sencnn
│       └── epochs_5_batch_size_256_learning_rate_0.001
├── model
│   ├── data.py
│   ├── __init__.py
│   ├── metric.py
│   ├── net.py
│   ├── ops.py
│   ├── split.py
│   └── utils.py
├── nsmc
│   ├── ratings_test.txt
│   ├── ratings_train.txt
│   ├── test.txt
│   ├── train.txt
│   ├── validation.txt
│   └── vocab.pkl
├── train.py
└── utils.py
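
The layout above implies a naming convention: one JSON file per model type and per dataset under conf, and one experiment directory per hyperparameter combination. A small sketch of helpers that follow this convention is given below. The function names are hypothetical, and no particular keys inside the JSON files are assumed, since the listing does not show them.

```python
import json
from pathlib import Path


def config_paths(model_type, dataset="nsmc", root="conf"):
    """Return (model_config, dataset_config) paths under conf/model and conf/dataset."""
    root = Path(root)
    return root / "model" / f"{model_type}.json", root / "dataset" / f"{dataset}.json"


def load_json(path):
    """Read one JSON config file into a dict; no specific keys are assumed."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def experiment_dir(model_type, epochs, batch_size, learning_rate, root="experiments"):
    """Build a directory path in the style shown in the tree."""
    name = f"epochs_{epochs}_batch_size_{batch_size}_learning_rate_{learning_rate}"
    return Path(root) / model_type / name
```

For instance, `experiment_dir("sencnn", 5, 256, 0.001)` reproduces the experiments/sencnn path shown in the tree.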

Model \ Accuracy	Train (120,000)	Validation (30,000)	Test (50,000)	Date (YY/MM/DD)
SenCNN	91.95%	86.54%	85.84%	20/05/30
CharCNN	86.29%	81.69%	81.38%	20/05/30
ConvRec	86.23%	82.93%	82.43%	20/05/30
VDCNN	86.59%	84.29%	84.10%	20/05/30
SAN	90.71%	86.70%	86.37%	20/05/30
ETRIBERT	91.12%	89.24%	88.98%	20/05/30
SKTBERT	92.20%	89.08%	88.96%	20/05/30
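
The train and validation sizes in the table (120,000 and 30,000) are consistent with an 80/20 holdout split of 150,000 labeled examples. How the repository's split.py actually performs the split is not shown here, so the following is only a sketch of one common approach, a seeded shuffle followed by a cut. The function name, the ratio, and the seed are assumptions.

```python
import random


def holdout_split(rows, validation_ratio=0.2, seed=42):
    """Shuffle a copy of `rows` deterministically and split off a validation part."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # local RNG, global state untouched
    n_validation = round(len(shuffled) * validation_ratio)
    # The first n_validation shuffled rows become validation, the rest training.
    return shuffled[n_validation:], shuffled[:n_validation]


train, validation = holdout_split(range(150_000))
print(len(train), len(validation))  # prints "120000 30000"
```

Fixing the seed keeps the split reproducible across runs, which matters when accuracies from different models are compared on the same validation set.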

[x] Convolutional Neural Networks for Sentence Classification (as SenCNN)
- https://arxiv.org/abs/1408.5882
[x] Character-level Convolutional Networks for Text Classification (as CharCNN)
- https://arxiv.org/abs/1509.01626
[x] Efficient Character-level Document Classification by Combining Convolution and Recurrent Layers (as ConvRec)
- https://arxiv.org/abs/1602.00367
[x] Very Deep Convolutional Networks for Text Classification (as VDCNN)
- https://arxiv.org/abs/1606.01781
[x] A Structured Self-attentive Sentence Embedding (as SAN)
- https://arxiv.org/abs/1703.03130
[x] BERT_single_sentence_classification (as ETRIBERT, SKTBERT)
- https://arxiv.org/abs/1810.04805
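
The core operation behind the first paper in the list (SenCNN) is a filter slid over windows of word embeddings, followed by max-over-time pooling that keeps one value per filter. The sketch below shows that operation in plain Python for readability. It is not the repository's PyTorch code, and the function names and the choice of ReLU as the nonlinearity are assumptions made for illustration.

```python
def conv1d_max_over_time(embeddings, kernel, bias=0.0):
    """Slide one filter over a sentence and keep its strongest ReLU response.

    embeddings: list of token vectors, shape [seq_len][dim]
    kernel:     filter weights, shape [width][dim]
    """
    width = len(kernel)
    if len(embeddings) < width:
        raise ValueError("sentence is shorter than the filter width")
    responses = []
    for start in range(len(embeddings) - width + 1):
        window = embeddings[start:start + width]
        # Dot product between the filter and the current window of tokens.
        score = bias + sum(
            w * x
            for k_row, e_row in zip(kernel, window)
            for w, x in zip(k_row, e_row)
        )
        responses.append(max(0.0, score))  # ReLU
    return max(responses)  # max-over-time pooling


def sentence_features(embeddings, kernels):
    """One pooled feature per filter; a classifier layer would consume this vector."""
    return [conv1d_max_over_time(embeddings, k) for k in kernels]
```

Because pooling takes the maximum over all positions, the feature vector has a fixed length (one entry per filter) regardless of sentence length.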

Pairwise-text-classification (paraphrase detection task)

The dataset is created from https://github.com/songys/Question_pair

Configuration
- conf/model/{type}.json (e.g. type = ["siam", "san", ...])
- conf/dataset/qpair.json

Structure

# example: Siamese_recurrent_architectures_for_learning_sentence_similarity
├── build_dataset.py
├── build_vocab.py
├── conf
│   ├── dataset
│   │   └── qpair.json
│   └── model
│       └── siam.json
├── evaluate.py
├── experiments
│   └── siam
│       └── epochs_5_batch_size_64_learning_rate_0.001
├── model
│   ├── data.py
│   ├── __init__.py
│   ├── metric.py
│   ├── net.py
│   ├── ops.py
│   ├── split.py
│   └── utils.py
├── qpair
│   ├── kor_pair_test.csv
│   ├── kor_pair_train.csv
│   ├── test.txt
│   ├── train.txt
│   ├── validation.txt
│   └── vocab.pkl
├── train.py
└── utils.py

Model \ Accuracy	Train (6,136)	Validation (682)	Test (758)	Date (YY/MM/DD)
Siam	93.00%	83.13%	83.64%	20/05/30
SAN	89.47%	82.11%	81.53%	20/05/30
Stochastic	89.26%	82.69%	80.07%	20/05/30
ETRIBERT	95.07%	94.42%	94.06%	20/05/30
SKTBERT	95.43%	92.52%	93.93%	20/05/30
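
One way to read the table above is to compare each model's train accuracy with its test accuracy, since a large gap on a dataset this small hints at overfitting. The sketch below does that arithmetic using only the numbers from the table. The helper names are hypothetical.

```python
# Accuracy (%) copied from the table above: (train, validation, test).
RESULTS = {
    "Siam": (93.00, 83.13, 83.64),
    "SAN": (89.47, 82.11, 81.53),
    "Stochastic": (89.26, 82.69, 80.07),
    "ETRIBERT": (95.07, 94.42, 94.06),
    "SKTBERT": (95.43, 92.52, 93.93),
}


def train_test_gap(results):
    """Train accuracy minus test accuracy per model, in percentage points."""
    return {name: round(train - test, 2) for name, (train, _, test) in results.items()}


def best_by_test(results):
    """Name of the model with the highest test accuracy."""
    return max(results, key=lambda name: results[name][2])


gaps = train_test_gap(RESULTS)
print(best_by_test(RESULTS), gaps)
```

On these numbers the two BERT-based models show gaps of about 1 to 1.5 points, while the non-pretrained models show gaps of roughly 8 to 9 points.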

[x] A Structured Self-attentive Sentence Embedding (as SAN)
- https://arxiv.org/abs/1703.03130
[x] Siamese recurrent architectures for learning sentence similarity (as Siam)
- https://www.aaai.org/ocs/index.php/AAAI/AAAI16/paper/viewPaper/12195
[x] Stochastic Answer Networks for Natural Language Inference (as Stochastic)
- https://arxiv.org/abs/1804.07888
[x] BERT_pairwise_text_classification (as ETRIBERT, SKTBERT)
- https://arxiv.org/abs/1810.04805
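
The Siamese paper listed above scores a sentence pair by passing both sentences through the same encoder and applying exp(-L1 distance) to the two resulting vectors. The sketch below shows only that similarity function in plain Python. Whether the repository's Siam model uses exactly this scoring is not stated here, and the function name is an assumption.

```python
import math


def manhattan_similarity(h_a, h_b):
    """exp(-L1 distance) between two sentence encodings; the result lies in (0, 1]."""
    if len(h_a) != len(h_b):
        raise ValueError("encodings must have the same dimensionality")
    l1_distance = sum(abs(a - b) for a, b in zip(h_a, h_b))
    return math.exp(-l1_distance)
```

Identical encodings give a similarity of exactly 1.0, and the score decays toward 0 as the encodings move apart, so a threshold on it can turn the score into a paraphrase / non-paraphrase decision.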

