morningmoni / HiLAP

Licence: other

Code for paper "Hierarchical Text Classification with Reinforced Label Assignment" EMNLP 2019

Labels

reinforcement-learning text-classification hierarchical-classification

Projects that are alternatives of or similar to HiLAP

HiGitClass: Keyword-Driven Hierarchical Classification of GitHub Repositories (ICDM'19)

Stars: ✭ 58 (-50%)

Mutual labels: text-classification, hierarchical-classification

[AAAI 2019] Weakly-Supervised Hierarchical Text Classification

Stars: ✭ 83 (-28.45%)

Mutual labels: text-classification, hierarchical-classification

Text and Audio classification with Bert

Text Classification in Turkish Texts with Bert

Stars: ✭ 34 (-70.69%)

Mutual labels: text-classification

BERT, LDA, and TFIDF based keyword extraction in Python

Stars: ✭ 33 (-71.55%)

Mutual labels: text-classification

Augmenty is an augmentation library based on spaCy for augmenting texts.

Stars: ✭ 101 (-12.93%)

Mutual labels: text-classification

synaptic-simple-trainer

A ready to go text classification trainer based on synaptic (https://github.com/cazala/synaptic)

Stars: ✭ 19 (-83.62%)

Mutual labels: text-classification

fake-news-detection

This repo is a collection of AWESOME things about fake news detection, including papers, code, etc.

Stars: ✭ 34 (-70.69%)

Mutual labels: text-classification

monkeylearn-java

Official Java client for the MonkeyLearn API. Build and consume machine learning models for language processing from your Java apps.

Stars: ✭ 23 (-80.17%)

Mutual labels: text-classification

Kaggle-Twitter-Sentiment-Analysis

Kaggle Twitter Sentiment Analysis Competition

Stars: ✭ 18 (-84.48%)

Mutual labels: text-classification

DaDengAndHisPython

[WeChat official account: 大邓和他的python (Da Deng and His Python)] Python syntax quick start: https://www.bilibili.com/video/av44384851; Python web scraping quick start: https://www.bilibili.com/video/av72010301; contact email: [email protected]

Stars: ✭ 59 (-49.14%)

Mutual labels: text-classification

policy-data-analyzer

Building a model to recognize incentives for landscape restoration in environmental policies from Latin America, the US and India. Bringing NLP to the world of policy analysis through an extensible framework that includes scraping, preprocessing, active learning and text analysis pipelines.

Stars: ✭ 22 (-81.03%)

Mutual labels: text-classification

text-classification-svm

The missing SVM-based text classification module implementing HanLP's interface

Stars: ✭ 46 (-60.34%)

Mutual labels: text-classification

Multi-class text categorization using state-of-the-art pre-trained contextualized language models, e.g. BERT

Stars: ✭ 15 (-87.07%)

Mutual labels: text-classification

Filipino-Text-Benchmarks

Open-source benchmark datasets and pretrained transformer models in the Filipino language.

Stars: ✭ 22 (-81.03%)

Mutual labels: text-classification

DeepClassifier aims to be a general text classification model library. It makes building any text classification task easy and user-friendly.

Stars: ✭ 25 (-78.45%)

Mutual labels: text-classification

OpenTC is a text classification engine using several algorithms in machine learning

Stars: ✭ 27 (-76.72%)

Mutual labels: text-classification

Kaggle-project-list

Summary of my projects on kaggle

Stars: ✭ 20 (-82.76%)

Mutual labels: text-classification

Binary-Text-Classification-Doc2vec-SVM

A Python implementation of a binary text classifier using Doc2Vec and SVM

Stars: ✭ 16 (-86.21%)

Mutual labels: text-classification

Evidence-based Explanation Dataset (AACL-IJCNLP 2020)

Stars: ✭ 16 (-86.21%)

Mutual labels: text-classification

Nodejs binding for fasttext representation and classification.

Stars: ✭ 39 (-66.38%)

Mutual labels: text-classification

This repo provides the code for the paper "Hierarchical Text Classification with Reinforced Label Assignment" (EMNLP 2019).

Abstract

While existing hierarchical text classification (HTC) methods attempt to capture label hierarchies for model training, they either make local decisions regarding each label or completely ignore the hierarchy information during inference. To solve the mismatch between training and inference as well as modeling label dependencies in a more principled way, we formulate HTC as a Markov decision process and propose to learn a Label Assignment Policy via deep reinforcement learning to determine where to place an object and when to stop the assignment process. The proposed method, HiLAP, explores the hierarchy during both training and inference time in a consistent manner and makes inter-dependent decisions. As a general framework, HiLAP can incorporate different neural encoders as base models for end-to-end training. Experiments on five public datasets and four base models show that HiLAP yields an average improvement of 33.4% in Macro-F1 over flat classifiers and outperforms state-of-the-art HTC methods by a large margin.
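To make the formulation in the abstract more concrete, here is a minimal sketch of label assignment as a sequential decision process: starting from the root, a policy repeatedly either places the object at a child of an already assigned label or stops. The toy hierarchy, the fixed scores, the names assign_labels and STOP, and the greedy action selection are all illustrative assumptions. The actual policy in model.py is learned with deep reinforcement learning and is not reproduced here.

```python
from typing import Callable, Dict, List

STOP = "<stop>"  # hypothetical name for the "stop assigning" action


def assign_labels(
    hierarchy: Dict[str, List[str]],
    score: Callable[[str, List[str]], float],
    root: str = "root",
    max_steps: int = 10,
) -> List[str]:
    """Greedily assign labels by walking the hierarchy top-down.

    At each step the candidate actions are the unassigned children of the
    root and of every label assigned so far, plus the STOP action.
    """
    assigned: List[str] = []
    for _ in range(max_steps):
        placed = [root] + assigned
        candidates = [
            child
            for parent in placed
            for child in hierarchy.get(parent, [])
            if child not in assigned
        ]
        # Pick the highest-scoring action; a learned policy would go here.
        best = max(candidates + [STOP], key=lambda action: score(action, assigned))
        if best == STOP:
            break
        assigned.append(best)
    return assigned


# Toy hierarchy and fixed scores, standing in for a real label tree and policy.
toy_hierarchy = {
    "root": ["sports", "politics"],
    "sports": ["soccer", "tennis"],
    "politics": ["elections"],
}
toy_scores = {
    "sports": 0.9, "soccer": 0.8, STOP: 0.5,
    "politics": 0.2, "tennis": 0.1, "elections": 0.1,
}


def toy_score(action: str, assigned: List[str]) -> float:
    return toy_scores[action]


labels = assign_labels(toy_hierarchy, toy_score)
print(labels)  # ['sports', 'soccer']
```

Because a child only becomes a candidate after its parent has been assigned, every prediction respects the hierarchy, and the stop action lets the number of labels vary per object.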

Model

model.py: The main model of HiLAP.

TextCNN.py: Our implementation of "Convolutional Neural Networks for Sentence Classification" EMNLP 2014.

OHCNN(_fast).py: Our implementation of "Effective Use of Word Order for Text Categorization with Convolutional Neural Networks" NAACL 2015.

HAN.py: Our implementation of "Hierarchical Attention Networks for Document Classification" NAACL 2016.

HMCN.py: Our implementation of "Hierarchical Multi-Label Classification Networks" ICML 2018.

Requirements

Python 3

PyTorch 0.3

Data

Due to copyright issues, we can't directly release the datasets used in our experiments. Instead, we provide the links to the five data sources (the first two may require a license):

RCV1 original release, text data (update: download the text data and convert to docs.txt with format "docid content")
NYT
Yelp (update: the latest release is different from what we used; please send an email if you need the version we used)
FunGO

Please check readData_*.py to see how to use our scripts to process and generate the datasets from the original data.
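The docs.txt conversion mentioned for RCV1 above could look roughly like the sketch below. It assumes one document per line, with the id and the content separated by a single space, which is only one reading of the "docid content" format. The helper name write_docs_txt and the sample documents are hypothetical, and extracting ids and text from the original RCV1 release is not shown.

```python
import tempfile
from pathlib import Path
from typing import Dict


def write_docs_txt(docs: Dict[str, str], path: Path) -> None:
    """Write one document per line as "<docid> <content>" (assumed layout)."""
    lines = []
    for docid, content in docs.items():
        # Collapse newlines and repeated whitespace so each document stays on one line.
        flat = " ".join(content.split())
        lines.append(f"{docid} {flat}")
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")


# Hypothetical sample documents, not taken from RCV1.
sample_docs = {"1001": "First   document\ntext.", "1002": "Second document text."}

with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = Path(tmp_dir) / "docs.txt"
    write_docs_txt(sample_docs, out_path)
    written = out_path.read_text(encoding="utf-8")
```

The readData_*.py scripts remain the reference for what the processing code actually expects.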

Run

All the parameters in conf.py have default values. Change the parameters mode, base_model, and dataset, and then run main.py to train or test on different settings. To test a model, set load_model=model_file and is_Train=False in conf.py and run main.py.
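As an illustration of the two workflows, the settings might look like the following. Only the parameter names (mode, base_model, dataset, load_model, is_Train) come from the text above. The values, and representing the settings as plain dictionaries, are assumptions, so check conf.py for the options it actually accepts.

```python
# Hypothetical settings: the names come from the README, the values are placeholders.
train_settings = {
    "mode": "hilap",          # assumed value
    "base_model": "textcnn",  # assumed value
    "dataset": "rcv1",        # assumed value
    "load_model": None,       # nothing to load when training from scratch
    "is_Train": True,
}

# Testing keeps the same setting but loads a saved model and disables training.
test_settings = {**train_settings, "load_model": "model_file", "is_Train": False}
```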

Cite

@inproceedings{mao-etal-2019-hierarchical,
    title = "Hierarchical Text Classification with Reinforced Label Assignment",
    author = "Mao, Yuning  and
      Tian, Jingjing  and
      Han, Jiawei  and
      Ren, Xiang",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/D19-1042",
    doi = "10.18653/v1/D19-1042",
    pages = "445--455",
}

Stars: ✭ 116
