
galsang / Abcnn

Implementation of ABCNN (Attention-Based Convolutional Neural Network) on Tensorflow


Projects that are alternatives of or similar to Abcnn

Image Caption Generator
A neural network to generate captions for an image using CNN and RNN with BEAM Search.
Stars: ✭ 126 (-52.27%)
Mutual labels:  convolutional-neural-networks, attention
Text Classification Models Pytorch
Implementation of State-of-the-art Text Classification Models in Pytorch
Stars: ✭ 379 (+43.56%)
Mutual labels:  convolutional-neural-networks, attention
AoA-pytorch
A Pytorch implementation of Attention on Attention module (both self and guided variants), for Visual Question Answering
Stars: ✭ 33 (-87.5%)
Mutual labels:  attention
Attention-Visualization
Visualization for simple attention and Google's multi-head attention.
Stars: ✭ 54 (-79.55%)
Mutual labels:  attention
dhs summit 2019 image captioning
Image captioning using attention models
Stars: ✭ 34 (-87.12%)
Mutual labels:  attention
NTUA-slp-nlp
💻Speech and Natural Language Processing (SLP & NLP) Lab Assignments for ECE NTUA
Stars: ✭ 19 (-92.8%)
Mutual labels:  attention
Diverse-Structure-Inpainting
CVPR 2021: "Generating Diverse Structure for Image Inpainting With Hierarchical VQ-VAE"
Stars: ✭ 131 (-50.38%)
Mutual labels:  attention
automatic-personality-prediction
[AAAI 2020] Modeling Personality with Attentive Networks and Contextual Embeddings
Stars: ✭ 43 (-83.71%)
Mutual labels:  attention
ai challenger 2018 sentiment analysis
Fine-grained Sentiment Analysis of User Reviews --- AI CHALLENGER 2018
Stars: ✭ 16 (-93.94%)
Mutual labels:  attention
attention-target-detection
[CVPR2020] "Detecting Attended Visual Targets in Video"
Stars: ✭ 105 (-60.23%)
Mutual labels:  attention
Attention
Code for several different attention mechanisms
Stars: ✭ 17 (-93.56%)
Mutual labels:  attention
keras cv attention models
Keras/Tensorflow attention models including beit,botnet,CMT,CoaT,CoAtNet,convnext,cotnet,davit,efficientdet,efficientnet,fbnet,gmlp,halonet,lcnet,levit,mlp-mixer,mobilevit,nfnets,regnet,resmlp,resnest,resnext,resnetd,swin,tinynet,uniformer,volo,wavemlp,yolor,yolox
Stars: ✭ 159 (-39.77%)
Mutual labels:  attention
Base-On-Relation-Method-Extract-News-DA-RNN-Model-For-Stock-Prediction--Pytorch
A two-stage attention mechanism model based on a relational news extraction method, for stock prediction
Stars: ✭ 33 (-87.5%)
Mutual labels:  attention
Semantic-Aware-Attention-Based-Deep-Object-Co-segmentation
Semantic Aware Attention Based Deep Object Co-segmentation
Stars: ✭ 61 (-76.89%)
Mutual labels:  attention
RNNSearch
An implementation of attention-based neural machine translation using Pytorch
Stars: ✭ 43 (-83.71%)
Mutual labels:  attention
mtad-gat-pytorch
PyTorch implementation of MTAD-GAT (Multivariate Time-Series Anomaly Detection via Graph Attention Networks) by Zhao et al. (2020, https://arxiv.org/abs/2009.02040).
Stars: ✭ 85 (-67.8%)
Mutual labels:  attention
ntua-slp-semeval2018
Deep-learning models of NTUA-SLP team submitted in SemEval 2018 tasks 1, 2 and 3.
Stars: ✭ 79 (-70.08%)
Mutual labels:  attention
CoVA-Web-Object-Detection
A Context-aware Visual Attention-based training pipeline for Object Detection from a Webpage screenshot!
Stars: ✭ 18 (-93.18%)
Mutual labels:  attention
SBR
⌛ Introducing Self-Attention to Target Attentive Graph Neural Networks (AISP '22)
Stars: ✭ 22 (-91.67%)
Mutual labels:  attention
Netket
Machine learning algorithms for many-body quantum systems
Stars: ✭ 256 (-3.03%)
Mutual labels:  convolutional-neural-networks

ABCNN: Attention-Based Convolutional Neural Network for Modeling Sentence Pairs

[Update]: Someone has reported that the 'nan' loss problem can be attributed to the tf.sqrt function, which outputs 'nan' when its input is very small or negative. Therefore, I recommend modifying the tf.sqrt calls accordingly if you run into this trouble.
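
For example, a common workaround (a minimal sketch, not code from this repository) is to clamp the input before taking the square root, so that neither the value nor its gradient can become NaN:

    import tensorflow as tf

    def safe_sqrt(x, eps=1e-6):
        # tf.sqrt returns NaN for negative inputs, and its gradient
        # (0.5 / sqrt(x)) blows up as x approaches 0; clamping the
        # input with tf.maximum avoids both failure modes.
        return tf.sqrt(tf.maximum(x, eps))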

[Warning]: Some people have reported bugs where the losses go to NaN with ABCNN-2 and ABCNN-3. (I don't know the exact conditions under which the bugs appear.) Unfortunately, I have no plans to revise the code in the near future. Please be careful when using the code, and please send me a pull request if your revised version works properly. Thanks.

This is an implementation of ABCNN, proposed by Wenpeng Yin et al., on Tensorflow.
It includes all four of the models below (a minimal sketch of the attention computation the ABCNN variants share follows the result tables):

  • BCNN

                                 MAP      MRR
    BCNN(1 layer)   Results     0.6660   0.6813
                    Baseline    0.6629   0.6813
    BCNN(2 layer)   Results     0.6762   0.6871
                    Baseline    0.6593   0.6738

  • ABCNN-1

                                 MAP      MRR
    ABCNN-1(1 layer)  Results   0.6652   0.6755
                      Baseline  0.6810   0.6979
    ABCNN-1(2 layer)  Results   0.6702   0.6838
                      Baseline  0.6855   0.7023

  • ABCNN-2

                                 MAP      MRR
    ABCNN-2(1 layer)  Results   0.6660   0.6813
                      Baseline  0.6885   0.7023
    ABCNN-2(2 layer)  Results   ------   ------
                      Baseline  0.6879   0.7068

  • ABCNN-3

                                 MAP      MRR
    ABCNN-3(1 layer)  Results   0.6612   0.6682
                      Baseline  0.6914   0.7127
    ABCNN-3(2 layer)  Results   0.6571   0.6722
                      Baseline  0.6921   0.7105
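
All of the ABCNN variants build their attention matrix from pairwise match scores between the feature columns of the two sentences, where the paper defines match-score(x, y) = 1 / (1 + euclidean(x, y)). The following is a minimal TensorFlow sketch of that computation (the function name and tensor shapes are my assumptions, not the repository's actual code); note that the tf.sqrt inside the Euclidean distance is exactly where the NaN issue mentioned above can arise:

    import tensorflow as tf

    def attention_matrix(x1, x2, eps=1e-6):
        # x1: [batch, d, s1], x2: [batch, d, s2]; column i holds the
        # d-dimensional feature vector of token i (assumed shapes).
        diff = tf.expand_dims(x1, -1) - tf.expand_dims(x2, 2)  # [batch, d, s1, s2]
        dist = tf.sqrt(tf.maximum(tf.reduce_sum(tf.square(diff), 1), eps))
        # match-score(x, y) = 1 / (1 + euclidean(x, y)), per the paper.
        return 1.0 / (1.0 + dist)  # [batch, s1, s2]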

Note:

  • The implementation currently focuses only on the AS (answer selection) task with the WikiQA corpus. (I originally tried to handle the PI (paraphrase identification) task with the MSRP (Microsoft Research Paraphrase) corpus, but the model does not seem to work without the external features the classifier requires.)
  • My experiments have verified that BCNN works as the authors proposed. (I even observed better results than the paper's.)
  • In the case of the ABCNNs, the results are inferior to those in the paper but still somewhat competitive. Careful hyperparameter configuration and detailed re-examination may help achieve better results.
  • I suspect there are some bugs in the ABCNNs (especially ABCNN-2, which has 2 conv layers) and will keep reviewing the code. Please be careful when using the results.

Specification

  • preprocess.py: preprocesses the (training, test) data and imports the word2vec vectors for use.
  • train.py: train a model with configs.
  • test.py: test the trained model.
  • ABCNN.py: Implementation of ABCNN models.
  • show.py: pyplot code for plotting test results.
  • utils.py: common util functions.
  • MSRP_Corpus: MSRP corpus for PI.
  • WikiQA_Corpus: WikiQA corpus for AS.
  • models: saved Tensorflow models.
  • experiments: test results on AS tasks.

Development Environment

  • OS: Windows 10 (64 bit)
  • Language: Python 3.5.3
  • CPU: Intel Xeon CPU E3-1231 v3 3.4 GHz
  • RAM: 16GB
  • GPU support: GTX 970
  • Libraries:
    • tensorflow 1.2.1
    • numpy 1.12.1
    • gensim 1.0.1
    • NLTK 3.2.2
    • scikit-learn 0.18.1
    • matplotlib 2.0.0

Requirements

This model is based on the pre-trained Word2vec vectors (GoogleNews-vectors-negative300.bin) published by T. Mikolov et al.
You should download the file and place it in the root folder.
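
For reference, the vectors can be loaded with the gensim version listed above. A minimal sketch, assuming the file sits in the root folder as described:

    from gensim.models import KeyedVectors

    # Load the 300-dimensional GoogleNews vectors (binary word2vec format).
    word2vec = KeyedVectors.load_word2vec_format(
        "GoogleNews-vectors-negative300.bin", binary=True)
    print(word2vec["apple"].shape)  # (300,)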

Execution

(training): python train.py --lr=0.08 --ws=4 --l2_reg=0.0004 --epoch=20 --batch_size=64 --model_type=BCNN --num_layers=2 --data_type=WikiQA

Parameters:
--lr: learning rate
--ws: window size
--l2_reg: L2 regularization coefficient
--epoch: number of training epochs
--batch_size: batch size
--model_type: model type
--num_layers: number of convolution layers
--data_type: MSRP or WikiQA data

(test): python test.py --ws=4 --l2_reg=0.0004 --epoch=20 --max_len=40 --model_type=BCNN --num_layers=2 --data_type=WikiQA --classifier=LR

Parameters:
--ws: window size
--l2_reg: L2 regularization coefficient
--epoch: epoch
--max_len: maximum sentence length
--model_type: model type
--num_layers: number of convolution layers
--data_type: MSRP or WikiQA data
--classifier: final classifier (model, LR, SVM)
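
For reference, MAP and MRR (the metrics reported in the tables above) can be computed as follows. This is a generic sketch, not the repository's evaluation code; rankings is a hypothetical list holding, for each question, its candidate labels (1 = correct answer) sorted by predicted score in descending order:

    def mean_reciprocal_rank(rankings):
        # MRR: average of 1 / (rank of the first correct answer).
        total = 0.0
        for labels in rankings:
            for rank, label in enumerate(labels, start=1):
                if label == 1:
                    total += 1.0 / rank
                    break
        return total / len(rankings)

    def mean_average_precision(rankings):
        # MAP: mean over questions of the average precision taken at
        # each rank where a correct answer appears.
        total = 0.0
        for labels in rankings:
            hits, precision_sum = 0, 0.0
            for rank, label in enumerate(labels, start=1):
                if label == 1:
                    hits += 1
                    precision_sum += hits / rank
            # Questions with no correct answer contribute 0.
            total += precision_sum / max(hits, 1)
        return total / len(rankings)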
