guokr / Caver

License: GPL-3.0

Caver: a toolkit for multilabel text classification.

Programming Languages

python


Projects that are alternatives of or similar to Caver

extremeText

Library for fast text representation and extreme classification.

Stars: ✭ 141 (+271.05%)

Mutual labels: text-classification, multi-label-classification

classifier multi label

multi-label, classifier, text classification, multi-label text classification, BERT, ALBERT, multi-label-classification

Stars: ✭ 127 (+234.21%)

Mutual labels: text-classification, multi-label-classification

Text Classification Pytorch

Text classification using deep learning models in Pytorch

Stars: ✭ 683 (+1697.37%)

Mutual labels: text-classification, attention-model

automatic-personality-prediction

[AAAI 2020] Modeling Personality with Attentive Networks and Contextual Embeddings

Stars: ✭ 43 (+13.16%)

Mutual labels: text-classification

kaggle-human-protein-atlas-image-classification

Kaggle 2018 @ Human Protein Atlas Image Classification

Stars: ✭ 34 (-10.53%)

Mutual labels: multi-label-classification

WSDM-Cup-2019

[ACM-WSDM] 3rd place solution at WSDM Cup 2019, Fake News Classification on Kaggle.

Stars: ✭ 62 (+63.16%)

Mutual labels: text-classification

Reuters-21578-Classification

Text classification with Reuters-21578 datasets using Gensim Word2Vec and Keras LSTM

Stars: ✭ 44 (+15.79%)

Mutual labels: text-classification

classification

Vietnamese Text Classification

Stars: ✭ 39 (+2.63%)

Mutual labels: text-classification

small-text

Active Learning for Text Classification in Python

Stars: ✭ 241 (+534.21%)

Mutual labels: text-classification

HiGRUs

Implementation of the paper "Hierarchical GRU for Utterance-level Emotion Recognition" in NAACL-2019.

Stars: ✭ 60 (+57.89%)

Mutual labels: text-classification

attention-mechanism-keras

attention mechanism in keras, like Dense and RNN...

Stars: ✭ 19 (-50%)

Mutual labels: attention-model

napkinXC

Extremely simple and fast extreme multi-class and multi-label classifiers.

Stars: ✭ 38 (+0%)

Mutual labels: multi-label-classification

10kGNAD

Ten Thousand German News Articles Dataset for Topic Classification

Stars: ✭ 63 (+65.79%)

Mutual labels: text-classification

Product-Categorization-NLP

Multi-Class Text Classification for products based on their description with Machine Learning algorithms and Neural Networks (MLP, CNN, Distilbert).

Stars: ✭ 30 (-21.05%)

Mutual labels: text-classification

Naive-Bayes-Text-Classifier-in-Java

Naive Bayes Classification used to classify movie reviews as positive or negative

Stars: ✭ 18 (-52.63%)

Mutual labels: text-classification

MetaLifelongLanguage

Repository containing code for the paper "Meta-Learning with Sparse Experience Replay for Lifelong Language Learning".

Stars: ✭ 21 (-44.74%)

Mutual labels: text-classification

nsmc-zeppelin-notebook

Movie review dataset Word2Vec & sentiment classification Zeppelin notebook

Stars: ✭ 26 (-31.58%)

Mutual labels: text-classification

nlp classification

Implementing nlp papers relevant to classification with PyTorch, gluonnlp

Stars: ✭ 224 (+489.47%)

Mutual labels: text-classification

MetaCat

Minimally Supervised Categorization of Text with Metadata (SIGIR'20)

Stars: ✭ 52 (+36.84%)

Mutual labels: text-classification

textgo

Text preprocessing, representation, similarity calculation, text search and classification. Let's go and play with text!

Stars: ✭ 33 (-13.16%)

Mutual labels: text-classification


Caver

Raise a torch in the cave to see the words on the wall: tag your short text in 3 lines. Caver is built on Facebook's PyTorch, which keeps the implementation simple.

Demo • Requirements • Install • Pre-trained models • Train • Examples • Document

Quick Demo

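from caver import CaverModel

# Load a trained model from a checkpoint directory.
model = CaverModel("./checkpoint_path")

# Inputs are Chinese short texts, pre-split into space-separated tokens.
sentence = [
    # "Is learning English by watching American TV shows reliable?"
    "看 美 剧 学 英 语 靠 谱 吗",
    # "Kobe Bryant joins Yao Ming as global ambassador for the 2019 Basketball World Cup"
    "科 比 携 手 姚 明 出 任 2019 篮 球 世 界 杯 全 球 大 使",
    # "How to survive to the end in Game of Thrones"
    "如 何 在 《 权 力 的 游 戏 》 中 苟 到 最 后",
    # "Can RNG beat team TOP in the League of Legends LPL Summer Split?"
    "英 雄 联 盟 LPL 夏 季 赛 RNG 能 否 击 败 TOP 战 队",
]

# predict() takes a list of texts; top_k sets how many labels are returned.
model.predict([sentence[0]], top_k=3)
# ['美剧', '英语', '英语学习']

model.predict([sentence[1]], top_k=5)
# ['篮球', 'NBA', '体育', 'NBA 球员', '运动']

model.predict([sentence[2]], top_k=7)
# ['权力的游戏（美剧）', '美剧', '影视评论', '电视剧', '电影', '文学', '小说']

model.predict([sentence[3]], top_k=6)
# ['英雄联盟（LoL）', '电子竞技', '英雄联盟职业联赛（LPL）', '游戏', '网络游戏', '多人联机在线竞技游戏 (MOBA)']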
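The demo inputs are already split into space-separated tokens: each Chinese character stands alone, while runs of Latin letters and digits such as "2019" or "LPL" stay together. If your raw text is not segmented this way, a small helper along the following lines may help. This is only a sketch inferred from the demo inputs; `to_char_tokens` is a hypothetical name and is not part of Caver's API.

```python
import re

# Assumption: keep runs of ASCII letters/digits together (e.g. "2019", "LPL")
# and turn every other non-space character into its own token.
_TOKEN = re.compile(r"[A-Za-z0-9]+|\S")


def to_char_tokens(text: str) -> str:
    """Return `text` as space-separated, character-level tokens."""
    return " ".join(_TOKEN.findall(text))


raw = "科比携手姚明出任2019篮球世界杯全球大使"
print(to_char_tokens(raw))
# 科 比 携 手 姚 明 出 任 2019 篮 球 世 界 杯 全 球 大 使
```

The result can then be passed to `model.predict([...], top_k=...)` as in the demo.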

Requirements

PyTorch
tqdm
torchtext
numpy
Python3

Install

$ pip install caver --user

Are pre-trained models available?

Yes, we have released two pre-trained models (a character-level CNN and a character-level LSTM), both trained on the Zhihu NLPCC 2018 open dataset.

To use a pre-trained model for text tagging, download it (along with the other files needed for inference) from the Caver releases page. Alternatively, run the following commands to download and unpack the archives into your current directory:

$ wget -O - https://github.com/guokr/Caver/releases/download/0.1/checkpoints_char_cnn.tar.gz | tar zxvf -
$ wget -O - https://github.com/guokr/Caver/releases/download/0.1/checkpoints_char_lstm.tar.gz | tar zxvf -
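If `wget` is not available (for example on Windows), the same download-and-unpack step can be sketched with Python's standard library. The URLs are the ones from the commands above; `fetch_and_extract` is a hypothetical helper written for this example, not something Caver ships.

```python
import tarfile
import urllib.request
from pathlib import Path

# Base URL copied from the wget commands above.
RELEASE_BASE = "https://github.com/guokr/Caver/releases/download/0.1"


def fetch_and_extract(url: str, dest: str = ".") -> Path:
    """Stream a .tar.gz archive from `url` and unpack it into `dest`."""
    dest_path = Path(dest)
    dest_path.mkdir(parents=True, exist_ok=True)
    # Newer Python versions offer extraction filters that reject unsafe paths;
    # use the "data" filter when this interpreter supports it.
    extra = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with urllib.request.urlopen(url) as response:
        # "r|gz" reads the gzip stream sequentially, like `wget -O - | tar zxvf -`.
        with tarfile.open(fileobj=response, mode="r|gz") as archive:
            archive.extractall(dest_path, **extra)
    return dest_path


# Example (needs network access):
# for name in ("checkpoints_char_cnn", "checkpoints_char_lstm"):
#     fetch_and_extract(f"{RELEASE_BASE}/{name}.tar.gz")
```

Streaming the archive avoids writing the `.tar.gz` file to disk first, which mirrors the piped shell command.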

How to train on your own dataset

$ python3 train.py --input_data_dir {path to your original dataset} \
                   --output_data_dir {path to store the preprocessed dataset} \
                   --train_filename train.tsv \
                   --valid_filename valid.tsv \
                   --checkpoint_dir {path to save the checkpoints} \
                   --model {fastText/CNN/LSTM} \
                   --batch_size {16; adjust as needed} \
                   --epoch {10}
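When training several models in a row, it can be convenient to build the command from a script. The sketch below only assembles the flags shown above and hands them to `train.py`; `build_train_command` and `run_training` are hypothetical helpers, and the defaults (CNN, batch size 16, 10 epochs) are simply the values from the example command.

```python
import subprocess
import sys

# Model names as listed for the --model flag above.
MODELS = ("fastText", "CNN", "LSTM")


def build_train_command(input_data_dir, output_data_dir, checkpoint_dir,
                        model="CNN", batch_size=16, epoch=10,
                        train_filename="train.tsv", valid_filename="valid.tsv"):
    """Return the train.py invocation as an argument list."""
    if model not in MODELS:
        raise ValueError(f"model must be one of {MODELS}, got {model!r}")
    return [
        sys.executable, "train.py",
        "--input_data_dir", str(input_data_dir),
        "--output_data_dir", str(output_data_dir),
        "--train_filename", train_filename,
        "--valid_filename", valid_filename,
        "--checkpoint_dir", str(checkpoint_dir),
        "--model", model,
        "--batch_size", str(batch_size),
        "--epoch", str(epoch),
    ]


def run_training(**kwargs):
    """Run train.py from the current directory; raises if it exits non-zero."""
    subprocess.run(build_train_command(**kwargs), check=True)


# Example (assumes train.py is in the working directory):
# for name in MODELS:
#     run_training(input_data_dir="data/raw", output_data_dir="data/processed",
#                  checkpoint_dir=f"checkpoints/{name}", model=name)
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.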

More Examples

This section is still being updated; for now, see the examples.
