Alibaba-NLP / Hiagm

Licence: MIT
Hierarchy-Aware Global Model for Hierarchical Text Classification

HiAGM: Hierarchy-Aware Global Model for Hierarchical Text Classification

This repository implements hierarchy-aware structure encoders that model the mutual interaction between the label space and text features. The work was accepted as a long paper, 'Hierarchy-Aware Global Model for Hierarchical Text Classification', at ACL 2020. The repository also provides the dataset splits proposed for NYTimes (New York Times) and WoS (Web of Science).

Hierarchy-Aware Global Model

The hierarchy-aware global model improves the conventional text classification model with prior knowledge of the predefined hierarchical structure. The project folder consists of the following parts:

  • config: config files (json format)
  • data: data directory with sample data; the path can be changed in the config file
  • data_modules: Dataset / DataLoader / Collator / Vocab
  • helper: Configure / Hierarchy_Statistic / Logger / Utils
  • models: StructureModel / EmbeddingLayer / TextEncoder / TextPropagation (HiAGM-TP) / Multi-Label Attention (HiAGM-LA)
  • train_modules: Criterions / EvaluationMetrics / Trainer

Hierarchy-Aware Structure Encoder

  • Bidirectional TreeLSTM: weighted_tree_lstm.py & tree.py
  • Hierarchy-GCN: graphcnn.py

Setup

  • Python >= 3.6
  • torch >= 0.4.1
  • numpy >= 1.17.4

Preprocess

data_modules.preprocess

  • Transform the data into a JSON-format file: {'token': List[str], 'label': List[str]}
  • Remove stopwords
  • RCV1-V2: for the preprocessing code, refer to the reuters_loader repository.
  • NYTimes & WoS: data.preprocess_nyt & data.preprocess_wos. Download the original datasets, then run these scripts to preprocess them for hierarchical text classification (HTC).
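
The target format above can be illustrated with a minimal sketch. It is not the repository's preprocessing code: the stopword set, the regex tokenizer, the example labels, and the one-JSON-object-per-line layout are all assumptions made for illustration.

```python
import json
import re

# Hypothetical stopword set for illustration; the repository's own list may differ.
STOPWORDS = {"a", "an", "the", "of", "and", "in", "to", "is"}


def to_sample(text, labels):
    """Lowercase and tokenize raw text, drop stopwords, and build one sample dict."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    tokens = [t for t in tokens if t not in STOPWORDS]
    return {"token": tokens, "label": list(labels)}


def dumps_json_lines(samples):
    """Serialize samples as one JSON object per line (an assumed file layout)."""
    return "\n".join(json.dumps(s) for s in samples)


# Example labels are made up; real labels come from the dataset's hierarchy.
sample = to_sample(
    "The Federal Reserve raised rates in March.",
    ["Economics", "Monetary Policy"],
)
print(dumps_json_lines([sample]))
```

Each sample keeps the two keys named in the format description, 'token' and 'label', so downstream loaders only need to parse one record at a time.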

Prior Probability

  • helper.hierarchical_statistic
  • Note: first change the Root.child list
  • Calculates the prior probability of each parent-child pair in the training dataset
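
One way to read the parent-child prior is as a conditional frequency, P(child | parent) = N(parent and child) / N(parent), counted over training samples. The sketch below follows that reading; it is an assumption, not a copy of helper.hierarchical_statistic. The function name, the `hierarchy` dict layout, the virtual "Root" label added to every sample, and the toy labels are all hypothetical.

```python
from collections import Counter


def prior_probabilities(samples, hierarchy, root="Root"):
    """Estimate P(child | parent) from label co-occurrence in training samples.

    samples:   list of dicts, each with a 'label' list.
    hierarchy: dict mapping a parent label to its list of child labels.
    root:      virtual top-level label assumed to be present in every sample.
    """
    parent_count = Counter()
    pair_count = Counter()
    for sample in samples:
        labels = set(sample["label"]) | {root}
        for parent, children in hierarchy.items():
            if parent not in labels:
                continue
            parent_count[parent] += 1
            for child in children:
                if child in labels:
                    pair_count[(parent, child)] += 1
    # Parents that never occur in the data are skipped to avoid dividing by zero.
    return {
        parent: {c: pair_count[(parent, c)] / parent_count[parent] for c in children}
        for parent, children in hierarchy.items()
        if parent_count[parent] > 0
    }


# Toy hierarchy and training labels, made up for illustration.
hierarchy = {"Root": ["CS", "Medical"], "CS": ["ML", "Databases"]}
train = [
    {"label": ["CS", "ML"]},
    {"label": ["CS", "Databases"]},
    {"label": ["CS", "ML"]},
    {"label": ["Medical"]},
]
prior = prior_probabilities(train, hierarchy)
print(prior)
```

In this toy data, three of four samples carry "CS", and two of those three carry "ML", so the estimated priors are 0.75 for Root to CS and 2/3 for CS to ML.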

Train

python train.py config/gcn-rcv1-v2.json
  • optimizer -> train.set_optimizer: default torch.optim.Adam
  • learning rate decay schedule callback -> train_modules.trainer.update_lr
  • early-stopping callback -> train.py
  • Hyper-parameters are set in config.train
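
The interplay of learning rate decay and early stopping listed above can be sketched in plain Python. This is a generic patience-based scheme, not the logic in train.py or train_modules.trainer.update_lr; the function name and the parameters `decay`, `patience`, and `early_stop` are hypothetical, and the dev scores are invented.

```python
def run_schedule(dev_scores, lr=1e-3, decay=0.5, patience=2, early_stop=4):
    """Replay per-epoch dev scores through a patience-based schedule.

    Every `patience` consecutive epochs without improvement, the learning
    rate is multiplied by `decay`; after `early_stop` such epochs, training
    stops. Returns (best_epoch, final_lr, last_epoch_run).
    """
    best_score, best_epoch, wait = float("-inf"), -1, 0
    for epoch, score in enumerate(dev_scores):
        if score > best_score:
            # Improvement: remember it and reset the patience counter.
            best_score, best_epoch, wait = score, epoch, 0
            continue
        wait += 1
        if wait % patience == 0:
            lr *= decay
        if wait >= early_stop:
            return best_epoch, lr, epoch
    return best_epoch, lr, len(dev_scores) - 1


# Invented dev-set scores; the last epoch is never reached because of early stopping.
scores = [0.60, 0.65, 0.64, 0.63, 0.66, 0.65, 0.65, 0.64, 0.64, 0.70]
best_epoch, final_lr, stopped_at = run_schedule(scores)
print(best_epoch, final_lr, stopped_at)
```

In a real run the scores would come from evaluating on the dev set after each epoch, and the decayed rate would be written back to the optimizer's parameter groups.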

Citation

Please cite our ACL 2020 paper:

@inproceedings{jie2020hierarchy,
 title={Hierarchy-Aware Global Model for Hierarchical Text Classification},
 author={Jie Zhou and Chunping Ma and Dingkun Long and Guangwei Xu and Ning Ding and Haoyu Zhang and Pengjun Xie and Gongshen Liu},
 booktitle={Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL)},
 year={2020}
}