y3nk0 / Graph-Based-TC

Licence: other
Graph-based framework for text classification

This is the code for the paper "Fusing Document, Collection and Label Graph-based Representations with Word Embeddings for Text Classification", presented at the TextGraphs workshop, NAACL 2018, New Orleans, USA. Our paper got the Best Paper Award!

Datasets

Our implementation includes 6 datasets: 20newsgroups, IMDB, WebKb, Reuters, Subjectivity and Amazon. We load the 20newsgroups dataset with scikit-learn's built-in fetch_20newsgroups function. The IMDB dataset can be downloaded here. We provide all remaining datasets here, due to GitHub size limits.

Files ending with main.py implement the tf and tf-idf methods, and files ending with gow.py implement the tw, tw-idf, tw-icw and tw-icw-lw methods.

Parameters

Inside each file there are several parameters to set in order to get the result of the desired method.

parameters for main.py files

  • bag_of_words: use our own tf-idf implementation or the scikit-learn tf-idf vectorizer
  • ngrams_par: the number of ngrams
  • idf_bool: use idf or not
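
As a rough illustration of what these switches control, the sketch below computes tf and tf-idf weights over word n-grams in plain Python. It is not the repository's code: the function names ngrams and tf_idf_vectors, the smoothed idf formula (the one scikit-learn uses by default) and the reading of ngrams_par as the maximum n-gram length are all assumptions made for the example.

```python
import math
from collections import Counter


def ngrams(tokens, max_n):
    """Return all word n-grams of length 1..max_n as space-joined strings."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    ]


def tf_idf_vectors(docs, ngrams_par=1, idf_bool=True):
    """Map each document to a {term: weight} dict.

    With idf_bool=False the weight is the raw term frequency (tf).
    With idf_bool=True it is tf * idf, using a smoothed idf (an assumption).
    """
    term_lists = [ngrams(doc.lower().split(), ngrams_par) for doc in docs]
    n_docs = len(docs)

    # Document frequency: number of documents that contain each term.
    df = Counter(term for terms in term_lists for term in set(terms))

    vectors = []
    for terms in term_lists:
        tf = Counter(terms)
        if idf_bool:
            vec = {
                t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1)
                for t, c in tf.items()
            }
        else:
            vec = dict(tf)
        vectors.append(vec)
    return vectors


docs = ["graph of words", "bag of words", "graph based text classification"]
tf_only = tf_idf_vectors(docs, ngrams_par=1, idf_bool=False)
tf_idf = tf_idf_vectors(docs, ngrams_par=2, idf_bool=True)
```

With idf enabled, a bigram that occurs in a single document (such as "graph of") receives a higher weight than one shared by two documents (such as "of words").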

parameters for gow.py files

  • idf_pars: {"no", "idf", "icw", "icw-lw"}; "no" for the tw method, "idf" for tw-idf, "icw" for tw-icw, "icw-lw" for tw-icw-lw
  • sliding_window: the window size used to create edges between words; 2 connects each word only to the next word
  • centrality_par: the centrality metric which we use for term weighting (e.g. weighted_degree_centrality for weighted w2v version)
  • centrality_col_par: the centrality metric which we use for the collection graph
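
To make sliding_window and centrality_par more concrete, here is a minimal sketch of the tw idea: build a graph-of-words for a document and use each term's degree centrality as its weight in place of term frequency. The helper names build_graph_of_words and degree_term_weights are hypothetical, and the unweighted, undirected graph is a simplifying assumption; the gow.py files may build the graph differently.

```python
from collections import defaultdict


def build_graph_of_words(tokens, sliding_window=2):
    """Build an undirected graph-of-words as {term: set of neighbours}.

    Each token is linked to the following (sliding_window - 1) tokens,
    so sliding_window=2 connects a word only to the next word.
    """
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        graph[word]  # make sure every word appears as a node
        for other in tokens[i + 1:i + sliding_window]:
            if other != word:  # ignore self-loops
                graph[word].add(other)
                graph[other].add(word)
    return dict(graph)


def degree_term_weights(graph):
    """Degree centrality: number of neighbours divided by (n - 1)."""
    n = len(graph)
    if n <= 1:
        return {term: 0.0 for term in graph}
    return {term: len(neigh) / (n - 1) for term, neigh in graph.items()}


tokens = "the cat sat on the mat the end".split()
graph = build_graph_of_words(tokens, sliding_window=2)
tw = degree_term_weights(graph)

# A wider window adds edges to words further ahead in the text.
tw_wide = degree_term_weights(build_graph_of_words(tokens, sliding_window=3))
```

In this toy document "the" is adjacent to four of the five other distinct words, so its weight is 0.8, while "mat" only touches "the" and gets 0.2. Repeating a word does not raise its weight unless the repetition brings new neighbours, which is the main difference from tf.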

Example

For the WebKb dataset, go to the example/ directory:

  • for tf run: python webkb_main.py with parameter idf_bool = False
  • for tf-idf run: python webkb_main.py with parameter idf_bool = True
  • for tw with degree centrality run: python webkb_gow.py with parameter idf_par="no"
  • for tw-idf with degree centrality run: python webkb_gow.py with parameter idf_par="idf"
  • for tw-icw with degree centrality on both tw and icw run: python webkb_gow.py with parameter idf_par="icw"
  • for tf-icw with degree centrality on icw run: python webkb_gow.py with parameter idf_par="tf-icw"
  • for tw-icw-lw with degree centrality on tw, icw and lw run: python webkb_gow.py with parameter idf_par="icw-lw"

Citation

Please cite using the following BibTeX entry if you use our code (same with Google Scholar):

@inproceedings{skianis2018fusing,
    title={Fusing Document, Collection and Label Graph-based Representations with Word Embeddings for Text Classification},
    author={Skianis, Konstantinos and Malliaros, Fragkiskos and Vazirgiannis, Michalis},
    booktitle={Proceedings of the Twelfth Workshop on Graph-Based Methods for Natural Language Processing (TextGraphs-12)},
    pages={49--58},
    year={2018}
}