madrugado / Attention Based Aspect Extraction

Licence: apache-2.0
Code for unsupervised aspect extraction, using Keras and its backends

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Attention Based Aspect Extraction

NMFADMM
A sparsity aware implementation of "Alternating Direction Method of Multipliers for Non-Negative Matrix Factorization with the Beta-Divergence" (ICASSP 2014).
Stars: ✭ 39 (-48%)
Mutual labels:  topic-modeling, unsupervised-learning
kwx
BERT, LDA, and TFIDF based keyword extraction in Python
Stars: ✭ 33 (-56%)
Mutual labels:  topic-modeling, unsupervised-learning
Corex topic
Hierarchical unsupervised and semi-supervised topic models for sparse count data with CorEx
Stars: ✭ 439 (+485.33%)
Mutual labels:  unsupervised-learning, topic-modeling
Topicmodels
Topic models extension for Mallet & scikit-learn
Stars: ✭ 50 (-33.33%)
Mutual labels:  topic-modeling
Lir For Unsupervised Ir
This is an implementation for the CVPR2020 paper "Learning Invariant Representation for Unsupervised Image Restoration"
Stars: ✭ 53 (-29.33%)
Mutual labels:  unsupervised-learning
Neuralhmm
Code for the paper on unsupervised learning of Neural Hidden Markov Models
Stars: ✭ 64 (-14.67%)
Mutual labels:  unsupervised-learning
Stminsights
A Shiny Application for Inspecting Structural Topic Models
Stars: ✭ 74 (-1.33%)
Mutual labels:  topic-modeling
Lightlda
Fast sampling algorithm based on CGS
Stars: ✭ 49 (-34.67%)
Mutual labels:  topic-modeling
Self Supervised Learning Overview
📜 Self-Supervised Learning from Images: Up-to-date reading list.
Stars: ✭ 73 (-2.67%)
Mutual labels:  unsupervised-learning
Weakly Supervised 3d Object Detection
Weakly Supervised 3D Object Detection from Point Clouds (VS3D), ACM MM 2020
Stars: ✭ 61 (-18.67%)
Mutual labels:  unsupervised-learning
How To Mine Newsfeed Data And Extract Interactive Insights In Python
A practical guide to topic mining and interactive visualizations
Stars: ✭ 61 (-18.67%)
Mutual labels:  topic-modeling
Rakun
Rank-based Unsupervised Keyword Extraction via Metavertex Aggregation
Stars: ✭ 54 (-28%)
Mutual labels:  unsupervised-learning
Sine
A PyTorch Implementation of "SINE: Scalable Incomplete Network Embedding" (ICDM 2018).
Stars: ✭ 67 (-10.67%)
Mutual labels:  unsupervised-learning
Voxelmorph
Unsupervised Learning for Image Registration
Stars: ✭ 1,057 (+1309.33%)
Mutual labels:  unsupervised-learning
Udacity Natural Language Processing Nanodegree
Tutorials and my solutions to the Udacity NLP Nanodegree
Stars: ✭ 73 (-2.67%)
Mutual labels:  topic-modeling
Php Ml
PHP-ML - Machine Learning library for PHP
Stars: ✭ 7,900 (+10433.33%)
Mutual labels:  unsupervised-learning
Concrete Autoencoders
Stars: ✭ 68 (-9.33%)
Mutual labels:  unsupervised-learning
Labeled Lda Python
Implementation of the L-LDA model (Labeled Latent Dirichlet Allocation) in Python
Stars: ✭ 60 (-20%)
Mutual labels:  topic-modeling
Dgi
TensorFlow implementation of Deep Graph Infomax
Stars: ✭ 58 (-22.67%)
Mutual labels:  unsupervised-learning
Dmgi
Unsupervised Attributed Multiplex Network Embedding (AAAI 2020)
Stars: ✭ 62 (-17.33%)
Mutual labels:  unsupervised-learning

Attention-Based Aspect Extraction

This repository is a fork of the paper authors' repository with the following code improvements:

  • Python 3 compliant
  • Keras 2 compliant
  • Keras backend independent

In addition, the following functionality has been added:

  • seed words
  • no need to specify the embedding dimension when using an external embedding model (it is inferred from the embedding file; see the sketch below)
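
For illustration, here is a minimal sketch of how the embedding dimension can be inferred from a saved embedding file rather than passed explicitly. It assumes the embeddings were saved as a gensim Word2Vec model; the path is an example, and the actual loading code lives in the repository:

# Hypothetical sketch: infer the embedding dimension from a saved
# gensim Word2Vec model instead of specifying it on the command line.
from gensim.models import Word2Vec

model = Word2Vec.load("../preprocessed_data/restaurant/w2v_embedding")
emb_dim = model.wv.vector_size  # dimension is read from the model itself
print("inferred embedding dimension:", emb_dim)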

Dependencies

  • keras>=2.0
  • tensorflow-gpu>=1.4
  • numpy>=1.13

This code has also been tested with CNTK and MXNet. With MXNet there were some issues with Keras internals; we hope these will be resolved in future versions.

Data and Preprocessing

You can download the original datasets (for the Restaurant and Beer domains) via [Download].

For preprocessing, put the decompressed zip file in the main folder and run the following commands, one after the other, in code/:

python preprocess.py
python word2vec.py

The preprocessed files and the trained word embeddings for each domain will be saved in the preprocessed_data/ folder.
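
For orientation, here is a rough sketch of what the embedding-training step does, assuming gensim; the file name and hyper-parameters below are illustrative, and the actual values are defined in code/word2vec.py:

# Illustrative sketch of training word embeddings with gensim; the real
# hyper-parameters and file names are defined in code/word2vec.py.
from gensim.models import Word2Vec

# assumed format: one tokenized sentence per line, tokens space-separated
sentences = [line.split() for line in open("../preprocessed_data/restaurant/train.txt")]
model = Word2Vec(sentences, size=200, window=5, min_count=10, workers=4)  # size= becomes vector_size= in gensim>=4
model.save("../preprocessed_data/restaurant/w2v_embedding")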

You can also find the pre-processed datasets and the pre-trained word embeddings in [Download]. The zip file should be decompressed and put in the main folder.

Train

For training, run in the code/ folder:

python train.py \
--emb-name ../preprocessed_data/$domain/w2v_embedding \
--domain $domain \
--out-dir ../output

where:

  • $domain (restaurant or beer) is the corresponding domain,
  • --emb-name is the path to the pre-trained word embeddings; if only a file name is given, it is looked up in ../preprocessed_data/$domain/, otherwise it is treated as an absolute path;
  • --out-dir is the path of the output directory.

More arguments/hyper-parameters, with the default values used in our experiments, are defined in [code/train.py].
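
For orientation, here is a minimal numpy sketch of the attention mechanism described in the paper. This is an illustration with random parameters, not the repository's Keras code:

# Minimal numpy sketch of ABAE-style attention (He et al., 2017):
# a sentence embedding is a weighted average of its word embeddings,
# with weights computed against the sentence's mean word vector.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(E, M):
    # E: (n_words, dim) word embeddings of one sentence
    # M: (dim, dim) learned attention matrix
    y = E.mean(axis=0)        # y_s: average word vector
    d = E @ M @ y             # d_i = e_i^T M y_s: relevance of each word
    a = softmax(d)            # a_i: attention weights
    return a @ E, a           # z_s: attended sentence embedding

rng = np.random.default_rng(0)
E = rng.normal(size=(5, 8))   # toy sentence: 5 words, 8-dim embeddings
M = rng.normal(size=(8, 8))   # would be learned during training
z_s, weights = attend(E, M)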

After training, two output files will be saved in ../output/$domain/:

  • aspect.log contains the extracted aspects, with the top 100 words for each.
  • model_param contains the saved model weights.

Evaluation

For evaluation, run in the code/ folder:

python evaluation.py \
--domain $domain \
--out-dir ../output

Note that the argument values for evaluation should match those used for training (except --emb-name, which you do not need to specify), as the network architecture is first rebuilt and the saved model weights are then loaded.

This will output a file att_weights in ../output/$domain/ containing the attention weights for all test sentences.

To assign each test sentence a gold aspect label, you need to first manually map each inferred aspect to a gold aspect label according to its top words, and then uncomment the bottom part of evaluation.py (lines 136-144) to evaluate using F scores. A minimal sketch of this mapping step follows.
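
As an illustration, here is a hypothetical mapping and F-score computation; the cluster indices and labels below are made up, and the real mapping is at the bottom of code/evaluation.py:

# Hypothetical sketch: map inferred aspect indices to gold labels and
# score with sklearn; the real mapping is in code/evaluation.py.
from sklearn.metrics import f1_score

cluster_map = {0: "Food", 1: "Staff", 2: "Ambience"}  # chosen by inspecting aspect.log

inferred = [0, 2, 1, 0]                               # aspect index per test sentence
predicted = [cluster_map[c] for c in inferred]
gold = ["Food", "Ambience", "Food", "Food"]           # gold label per sentence
print(f1_score(gold, predicted, average="macro"))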

An example of a trained model for the restaurant domain is provided in pre_trained_model/restaurant/, and the corresponding aspect mapping is provided in code/evaluation.py (at the bottom).

Cite

If you use the code, please consider citing the original paper:

@InProceedings{he-EtAl:2017:Long2,
  author    = {He, Ruidan  and  Lee, Wee Sun  and  Ng, Hwee Tou  and  Dahlmeier, Daniel},
  title     = {An Unsupervised Neural Attention Model for Aspect Extraction},
  booktitle = {Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
  month     = {July},
  year      = {2017},
  address   = {Vancouver, Canada},
  publisher = {Association for Computational Linguistics}
}