luoyuanlab / text_gcn_tutorial

Licence: other
A tutorial & minimal example (8min on CPU) for Graph Convolutional Networks for Text Classification. AAAI 2019

Text GCN Tutorial

This tutorial (currently under improvement) is based on the implementation of Text GCN in our paper:

Liang Yao, Chengsheng Mao, Yuan Luo. "Graph Convolutional Networks for Text Classification." In 33rd AAAI Conference on Artificial Intelligence (AAAI-19)

Requirements

Python 2.7 or 3.6

TensorFlow >= 1.4.0

Example input data

The Ohsumed corpus is from the MEDLINE database, a bibliographic database of important medical literature maintained by the National Library of Medicine.

In this tutorial, we use a subsample of 2,762 unique disease abstracts from 3 categories:

  • C04: Neoplasms
  • C10: Nervous System Diseases
  • C14: Cardiovascular Diseases

Because we focus on single-label text classification, documents belonging to multiple categories are excluded.

The split is 1,230 training documents (10% of which are used for validation) and 1,532 test documents.
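As a quick check of the split sizes above, the arithmetic can be sketched in a few lines. The variable names are illustrative, and rounding the 10% validation share down with integer division is an assumption; the training script may round differently.

```python
# Split sizes stated in this tutorial.
train_total = 1230   # documents marked as training data
test_size = 1532     # documents marked as test data

# Assumption: the 10% validation share is rounded down.
val_size = train_total // 10
train_size = train_total - val_size

# The three parts together should cover the whole subsample.
corpus_size = train_size + val_size + test_size

print("train:", train_size, "val:", val_size, "test:", test_size)
print("total:", corpus_size)
```

Under that assumption the total matches the 2,762 abstracts in the subsample.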

  1. /data/ohsumed_3.txt lists document names, the training/test split, and document labels, one document per line.

  2. /data/corpus/ohsumed_3.txt contains the raw text of each document; each line corresponds to the same line in /data/ohsumed_3.txt.
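The line-by-line correspondence between the two files can be illustrated with a minimal sketch. It assumes each metadata line holds three tab-separated fields (name, split, label); that delimiter, the `Document` type, the `pair_documents` helper, and the sample lines are all hypothetical and not taken from the repository.

```python
from collections import Counter, namedtuple

# Hypothetical record type for one document.
Document = namedtuple("Document", ["name", "split", "label", "text"])


def pair_documents(meta_lines, corpus_lines):
    """Pair each metadata line with the raw text on the same line number."""
    meta_lines = list(meta_lines)
    corpus_lines = list(corpus_lines)
    if len(meta_lines) != len(corpus_lines):
        raise ValueError("metadata and corpus must have the same number of lines")
    docs = []
    for meta, text in zip(meta_lines, corpus_lines):
        # Assumption: fields are separated by tabs.
        name, split, label = meta.rstrip("\n").split("\t")
        docs.append(Document(name, split, label, text.strip()))
    return docs


# Made-up example lines, not real data from the corpus.
meta = ["doc_a\ttrain\tC04\n", "doc_b\ttest\tC14\n"]
corpus = ["first abstract text\n", "second abstract text\n"]

docs = pair_documents(meta, corpus)
split_counts = Counter(d.split for d in docs)
print(docs[0])
print(split_counts)
```

With the real files, the two lists would come from reading /data/ohsumed_3.txt and /data/corpus/ohsumed_3.txt line by line.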

Reproducing Results

  1. Run python remove_words.py ohsumed_3

  2. Run python build_graph.py ohsumed_3

  3. Run python train.py ohsumed_3
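The three steps above can also be chained from a small driver script. This is a sketch only: `build_commands` and `run_pipeline` are hypothetical helpers, and it assumes the scripts sit in the current working directory and take the dataset name as their only argument, as in the commands above.

```python
import subprocess
import sys

# Scripts from the steps above, in the order they must run.
STEPS = ["remove_words.py", "build_graph.py", "train.py"]


def build_commands(dataset, python=sys.executable):
    """Return one command (as an argument list) per pipeline step."""
    return [[python, script, dataset] for script in STEPS]


def run_pipeline(dataset):
    """Run each step in order, stopping at the first failure."""
    for cmd in build_commands(dataset):
        # check_call raises CalledProcessError on a non-zero exit code.
        subprocess.check_call(cmd)


# Show the commands without running them.
commands = build_commands("ohsumed_3", python="python")
for cmd in commands:
    print(" ".join(cmd))
```

Calling `run_pipeline("ohsumed_3")` from the repository root would then run the three steps in sequence.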

Example output

2019-04-04 22:58:26.244395: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
Epoch: 0001 train_loss= 1.09856 train_acc= 0.41463 val_loss= 1.08209 val_acc= 0.48780 time= 29.13731
Epoch: 0002 train_loss= 1.08044 train_acc= 0.49865 val_loss= 1.05469 val_acc= 0.47967 time= 23.00088
Epoch: 0003 train_loss= 1.05075 train_acc= 0.49865 val_loss= 1.02113 val_acc= 0.47967 time= 21.82401
Epoch: 0004 train_loss= 1.01430 train_acc= 0.49955 val_loss= 0.98582 val_acc= 0.48780 time= 21.42816
Epoch: 0005 train_loss= 0.97174 train_acc= 0.50678 val_loss= 0.95375 val_acc= 0.51220 time= 21.44958
Epoch: 0006 train_loss= 0.93406 train_acc= 0.51220 val_loss= 0.92789 val_acc= 0.55285 time= 24.01502
......
Epoch: 0074 train_loss= 0.01921 train_acc= 0.99819 val_loss= 0.09674 val_acc= 0.96748 time= 24.01229
Epoch: 0075 train_loss= 0.02093 train_acc= 0.99909 val_loss= 0.09715 val_acc= 0.96748 time= 24.08436
Early stopping...
Optimization Finished!
Test set results: cost= 0.24295 accuracy= 0.92167 time= 7.60145
10456
Test Precision, Recall and F1-Score...
             precision    recall  f1-score   support

          0     0.8882    0.8363    0.8614       342
          1     0.9438    0.9517    0.9477       600
          2     0.9174    0.9407    0.9289       590

avg / total     0.9212    0.9217    0.9212      1532
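The "avg / total" row in the report above can be reproduced from the per-class rows, assuming it is a support-weighted average, which is how this style of report is usually computed. The sketch below copies the numbers from the example output; the `weighted_average` helper is illustrative.

```python
# Per-class rows copied from the example output:
# (precision, recall, f1-score, support)
rows = {
    0: (0.8882, 0.8363, 0.8614, 342),
    1: (0.9438, 0.9517, 0.9477, 600),
    2: (0.9174, 0.9407, 0.9289, 590),
}

total_support = sum(r[3] for r in rows.values())


def weighted_average(column):
    """Average one metric column, weighting each class by its support."""
    return sum(r[column] * r[3] for r in rows.values()) / total_support


avg_precision = weighted_average(0)
avg_recall = weighted_average(1)
avg_f1 = weighted_average(2)

print("support:", total_support)
print("precision: %.4f recall: %.4f f1: %.4f" % (avg_precision, avg_recall, avg_f1))
```

The support-weighted recall equals the overall accuracy, which is why it agrees with the accuracy of 0.92167 reported on the "Test set results" line.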

Visualizing Documents

Run python tsne.py

Example Visualization