
shamanez / Self-Supervised-Embedding-Fusion-Transformer

Licence: MIT license
The code for our IEEE Access (2020) paper "Multimodal Emotion Recognition with Transformer-Based Self Supervised Feature Fusion".

Programming Languages

python
139335 projects - #7 most used programming language
Cuda
1817 projects

Projects that are alternatives of or similar to Self-Supervised-Embedding-Fusion-Transformer

MSAF
Official implementation of the paper "MSAF: Multimodal Split Attention Fusion"
Stars: ✭ 47 (-17.54%)
Mutual labels:  multimodal-sentiment-analysis, multimodal-deep-learning, multimodal-emotion-recognition
hfusion
Multimodal sentiment analysis using hierarchical fusion with context modeling
Stars: ✭ 42 (-26.32%)
Mutual labels:  emotion-recognition, multimodal-sentiment-analysis
SIGIR2021 Conure
One Person, One Model, One World: Learning Continual User Representation without Forgetting
Stars: ✭ 23 (-59.65%)
Mutual labels:  bert, self-supervised-learning
erc
Emotion recognition in conversation
Stars: ✭ 34 (-40.35%)
Mutual labels:  bert, emotion-recognition
BBFN
This repository contains the implementation of the paper -- Bi-Bimodal Modality Fusion for Correlation-Controlled Multimodal Sentiment Analysis
Stars: ✭ 42 (-26.32%)
Mutual labels:  multimodal-sentiment-analysis, multimodal-deep-learning
parsbert-ner
🤗 ParsBERT Persian NER Tasks
Stars: ✭ 15 (-73.68%)
Mutual labels:  bert
Tianchi2020ChineseMedicineQuestionGeneration
2020 Alibaba Cloud Tianchi Big Data Competition: Traditional Chinese Medicine Literature Question Generation Challenge
Stars: ✭ 20 (-64.91%)
Mutual labels:  bert
ai web RISKOUT BTS
Defense risk management platform (🏅 Minister of National Defense Award)
Stars: ✭ 18 (-68.42%)
Mutual labels:  bert
bert-movie-reviews-sentiment-classifier
Build a Movie Reviews Sentiment Classifier with Google's BERT Language Model
Stars: ✭ 12 (-78.95%)
Mutual labels:  bert
BYOL
Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning
Stars: ✭ 102 (+78.95%)
Mutual labels:  self-supervised-learning
muse-as-service
REST API for sentence tokenization and embedding using Multilingual Universal Sentence Encoder.
Stars: ✭ 45 (-21.05%)
Mutual labels:  bert
contextualSpellCheck
✔️Contextual word checker for better suggestions
Stars: ✭ 274 (+380.7%)
Mutual labels:  bert
STEP
Spatial Temporal Graph Convolutional Networks for Emotion Perception from Gaits
Stars: ✭ 39 (-31.58%)
Mutual labels:  emotion-recognition
CVPR21 PASS
PyTorch implementation of our CVPR2021 (oral) paper "Prototype Augmentation and Self-Supervision for Incremental Learning"
Stars: ✭ 55 (-3.51%)
Mutual labels:  self-supervised-learning
DeepNER
An Easy-to-use, Modular and Prolongable package of deep-learning based Named Entity Recognition Models.
Stars: ✭ 9 (-84.21%)
Mutual labels:  bert
MSF
Official code for "Mean Shift for Self-Supervised Learning"
Stars: ✭ 42 (-26.32%)
Mutual labels:  self-supervised-learning
MRC Competition Dureader
Machine reading comprehension: champion/runner-up code and Chinese pretrained MRC models
Stars: ✭ 552 (+868.42%)
Mutual labels:  bert
SentimentAnalysis
(BOW, TF-IDF, Word2Vec, BERT) Word Embeddings + (SVM, Naive Bayes, Decision Tree, Random Forest) Base Classifiers + Pre-trained BERT on Tensorflow Hub + 1-D CNN and Bi-Directional LSTM on IMDB Movie Reviews Dataset
Stars: ✭ 40 (-29.82%)
Mutual labels:  bert
Xpersona
XPersona: Evaluating Multilingual Personalized Chatbot
Stars: ✭ 54 (-5.26%)
Mutual labels:  bert
PDN
The official PyTorch implementation of "Pathfinder Discovery Networks for Neural Message Passing" (WebConf '21)
Stars: ✭ 44 (-22.81%)
Mutual labels:  bert

Multimodal Emotion Recognition with Transformer-Based Self Supervised Feature Fusion

Model Overview

Please replace Table 6 in the paper

Please replace Table 6 of the paper with this table.

Basic structure of the code

Inspiration from fairseq

  1. This code structure is built on top of the Fairseq interface.
  2. Fairseq is an open-source project by the Facebook AI team that combines different SOTA architectures for sequential data processing.
  3. It also includes SOTA optimization mechanisms such as early stopping, warm-up learning rates, and learning rate schedulers.
  4. We are developing our own architecture to be compatible with the Fairseq interface.
  5. For more background, please read the paper published about the Fairseq interface.
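
The optimization mechanisms named in item 3 can be illustrated with a small, library-free sketch. It is a generic illustration rather than Fairseq's code: the function name, the class name, and the default values (`warmup_steps=4000`, `patience=3`) are all assumptions chosen for the example.

```python
import math


def warmup_inverse_sqrt_lr(step, base_lr=1e-3, warmup_steps=4000):
    """Linear warm-up to base_lr, then decay proportional to 1/sqrt(step).

    Hypothetical helper for illustration; steps are counted from 1.
    """
    if step < 1:
        raise ValueError("step counts from 1")
    if step <= warmup_steps:
        # Warm-up phase: the rate grows linearly from near 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Decay phase: the rate shrinks with the inverse square root of the step.
    return base_lr * math.sqrt(warmup_steps / step)


class EarlyStopping:
    """Signal a stop once validation loss fails to improve `patience` times in a row."""

    def __init__(self, patience=3):
        self.patience = patience
        self.best = float("inf")
        self.bad_checks = 0

    def should_stop(self, valid_loss):
        if valid_loss < self.best:
            # New best loss: remember it and reset the counter.
            self.best = valid_loss
            self.bad_checks = 0
        else:
            self.bad_checks += 1
        return self.bad_checks >= self.patience
```

In a training loop, `warmup_inverse_sqrt_lr(step)` would be queried before each update and `should_stop(valid_loss)` after each validation pass.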

Merging of our own architecture with Fairseq interface

  1. This can be a bit tricky in the beginning. First, it is important to understand that Fairseq is built so that all architectures can be accessed through terminal commands (args).

  2. Since our architecture shares many properties with the transformer architecture, we followed a tutorial that describes how to use RoBERTa for a custom classification task.

  3. We built our architecture by adding new code to the following directories of the Fairseq interface.

    • fairseq/data
    • fairseq/models
    • fairseq/modules
    • fairseq/tasks
    • fairseq/criterions
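
The idea that every architecture is reachable from the terminal can be sketched with a toy name-to-class registry. This is a simplified stand-in written for this README, not Fairseq's implementation: the decorators, the `toy_emotion` names, and `build_model_from_cli` are all hypothetical, and only the general pattern (register a class under a name, then select it with `--arch`) is meant to carry over.

```python
import argparse

# Toy registries: model name -> class, architecture name -> (class, defaults fn).
MODEL_REGISTRY = {}
ARCH_REGISTRY = {}


def register_model(name):
    """Decorator that records a model class under a string name."""
    def wrapper(cls):
        if name in MODEL_REGISTRY:
            raise ValueError(f"duplicate model name: {name}")
        MODEL_REGISTRY[name] = cls
        return cls
    return wrapper


def register_architecture(model_name, arch_name):
    """Decorator that ties a named set of default hyperparameters to a model."""
    def wrapper(fn):
        ARCH_REGISTRY[arch_name] = (MODEL_REGISTRY[model_name], fn)
        return fn
    return wrapper


@register_model("toy_emotion")
class ToyEmotionModel:
    def __init__(self, args):
        self.encoder_layers = args.encoder_layers
        self.num_classes = args.num_classes


@register_architecture("toy_emotion", "toy_emotion_large")
def toy_emotion_large(args):
    # Fill in defaults only where the command line gave no value.
    if args.encoder_layers is None:
        args.encoder_layers = 2
    if args.num_classes is None:
        args.num_classes = 1


def build_model_from_cli(argv):
    """Parse terminal-style arguments and build the architecture they name."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--arch", required=True, choices=sorted(ARCH_REGISTRY))
    parser.add_argument("--encoder-layers", type=int, default=None)
    parser.add_argument("--num-classes", type=int, default=None)
    args = parser.parse_args(argv)
    model_cls, apply_defaults = ARCH_REGISTRY[args.arch]
    apply_defaults(args)
    return model_cls(args)


model = build_model_from_cli(["--arch", "toy_emotion_large", "--encoder-layers", "4"])
print(type(model).__name__, model.encoder_layers, model.num_classes)
```

Adding a new architecture in this pattern means adding a decorated class in the right directory. The command-line parser then picks it up by name without further wiring.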

Main scripts of the code

Our main scripts are categorized into five parts

  1. The custom dataloader for loading raw audio, face frames, and text is in fairseq/data/raw_audio_text_video_dataset.py

  2. The emotion prediction task, analogous to other tasks such as translation, is in fairseq/tasks/emotion_prediction.py

  3. The custom architecture of our model, analogous to RoBERTa and wav2vec, is in fairseq/models/mulT_emo.py

  4. To obtain inter-modal attention, we slightly modify the self-attention architecture. The changes can be found in fairseq/modules/transformer_multi_encoder.py and fairseq/modules/transformer_layer.py

  5. Finally, the custom loss function script can be found in fairseq/criterions/emotion_prediction_cri.py
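
The inter-modal attention in item 4 can be pictured with a minimal single-head sketch in plain Python. It assumes the usual cross-attention form: queries come from one modality, keys and values from another. It omits the learned projections, multiple heads, and masking. It is not the code in fairseq/modules, and the function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_modal_attention(queries, keys, values):
    """Scaled dot-product attention across two modalities.

    queries: Tq vectors of size d from the target modality (e.g. text)
    keys:    Ts vectors of size d from the source modality (e.g. audio)
    values:  Ts vectors of size dv from the same source modality
    Returns Tq vectors of size dv: each target step becomes a weighted
    mix of the source steps.
    """
    scale = 1.0 / math.sqrt(len(queries[0]))
    value_dim = len(values[0])
    outputs = []
    for q in queries:
        # Similarity of this target step to every source step.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        # Weighted average of the source values.
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(value_dim)]
        )
    return outputs


text_steps = [[1.0, 0.0], [0.0, 1.0]]                # 2 text positions
audio_keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]    # 3 audio positions
audio_vals = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
fused = cross_modal_attention(text_steps, audio_keys, audio_vals)
print(len(fused), len(fused[0]))
```

Self-attention is the special case where queries, keys, and values all come from the same sequence. Swapping the source of the keys and values is the only change needed to attend across modalities.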

Prerequisite models

Our model uses pretrained SSL methods to extract features. It is important to download those checkpoints prior to the training procedure. Please use the following links to download the pretrained SSL models.

  1. For audio features - wav2vec
  2. For facial features - Fabnet
  3. For sentence (text) features - RoBERTa
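
Since training depends on all three checkpoints being present, a small pre-flight check may save a failed run. The sketch below is only an illustration: the file names and the `./pretrained` directory are hypothetical placeholders, so substitute whatever names the downloaded checkpoints actually have.

```python
from pathlib import Path

# Hypothetical file names; replace with the real names of the downloaded files.
REQUIRED_CHECKPOINTS = {
    "audio (wav2vec)": "wav2vec_checkpoint.pt",
    "face (Fabnet)": "fabnet_checkpoint.pth",
    "text (RoBERTa)": "roberta_checkpoint.pt",
}


def missing_checkpoints(checkpoint_dir, required=REQUIRED_CHECKPOINTS):
    """Return {modality: expected path} for every checkpoint file not found."""
    root = Path(checkpoint_dir)
    return {
        modality: root / filename
        for modality, filename in required.items()
        if not (root / filename).is_file()
    }


for modality, path in missing_checkpoints("./pretrained").items():
    print(f"missing {modality} checkpoint: {path}")
```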

Training Command

python train.py --data ./T_data-old/mosei_sent --restore-file None --task emotion_prediction --reset-optimizer --reset-dataloader --reset-meters --init-token 0 --separator-token 2 --arch robertEMO_large --criterion emotion_prediction_cri --num-classes 1 --dropout 0.1 --attention-dropout 0.1 --weight-decay 0.1 --optimizer adam --adam-betas "(0.9, 0.98)" --adam-eps 1e-06 --clip-norm 0.0 --lr 1e-03 --max-epoch 32 --best-checkpoint-metric loss --encoder-layers 2 --encoder-attention-heads 4 --max-sample-size 150000 --max-tokens 150000000 --batch-size 4 --encoder-layers-cross 2 --max-positions-t 512 --max-positions-a 936 --max-positions-v 301 --no-epoch-checkpoints --update-freq 2 --find-unused-parameters --ddp-backend=no_c10d --lr-scheduler reduce_lr_on_plateau --regression-target-mos
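
For readability, the same command can be assembled from a grouped argument list. The flags and values below are copied from the command above. The grouping comments and the two helper functions are additions for illustration. The effective batch size calculation assumes that `--batch-size` is per GPU, that `--update-freq` accumulates gradients over that many batches, and that `--max-tokens` is not the binding limit.

```python
import shlex

train_args = [
    "python", "train.py",
    # Data and task
    "--data", "./T_data-old/mosei_sent",
    "--restore-file", "None",
    "--task", "emotion_prediction",
    "--reset-optimizer", "--reset-dataloader", "--reset-meters",
    "--init-token", "0", "--separator-token", "2",
    # Model
    "--arch", "robertEMO_large",
    "--criterion", "emotion_prediction_cri",
    "--num-classes", "1",
    "--dropout", "0.1", "--attention-dropout", "0.1",
    "--encoder-layers", "2", "--encoder-attention-heads", "4",
    "--encoder-layers-cross", "2",
    "--max-positions-t", "512",
    "--max-positions-a", "936",
    "--max-positions-v", "301",
    # Optimization
    "--weight-decay", "0.1", "--optimizer", "adam",
    "--adam-betas", "(0.9, 0.98)", "--adam-eps", "1e-06",
    "--clip-norm", "0.0", "--lr", "1e-03", "--max-epoch", "32",
    "--lr-scheduler", "reduce_lr_on_plateau",
    "--update-freq", "2",
    # Batching
    "--max-sample-size", "150000",
    "--max-tokens", "150000000",
    "--batch-size", "4",
    # Checkpointing and distributed training
    "--best-checkpoint-metric", "loss", "--no-epoch-checkpoints",
    "--find-unused-parameters", "--ddp-backend=no_c10d",
    "--regression-target-mos",
]


def flag_value(args, flag):
    """Return the value that follows `flag` in the argument list."""
    return args[args.index(flag) + 1]


def effective_batch_size(args, num_gpus=1):
    """Samples per optimizer step under the assumptions stated above."""
    batch_size = int(flag_value(args, "--batch-size"))
    update_freq = int(flag_value(args, "--update-freq"))
    return batch_size * update_freq * num_gpus


print(shlex.join(train_args))
print("effective batch size on one GPU:", effective_batch_size(train_args))
```

Passing `train_args` to `subprocess.run(train_args, check=True)` would launch the run from Python instead of the shell.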

Validation Command

CUDA_VISIBLE_DEVICES=1 python validate.py --data ./T_data/emocap --path './checkpoints/checkpoint_best.pt' --task emotion_prediction --valid-subset test --batch-size 4
