multimodal-speech-emotion

This repository contains the source code used in the following paper:

Multimodal Speech Emotion Recognition using Audio and Text, IEEE SLT-18, [paper]


[requirements]

tensorflow==1.4 (tested on cuda-8.0, cudnn-6.0)
python==2.7
scikit-learn==0.20.0
nltk==3.3
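
A quick sanity check of the environment (an illustrative snippet, not part of the original repository) can confirm the pinned versions from Python:

    # Verify the pinned dependency versions (illustrative check only).
    import sys
    import tensorflow as tf
    import sklearn
    import nltk

    print(sys.version)          # expected: 2.7.x
    print(tf.__version__)       # expected: 1.4.x
    print(sklearn.__version__)  # expected: 0.20.0
    print(nltk.__version__)     # expected: 3.3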

[download data corpus]

  • IEMOCAP [link] [paper]
  • Download the IEMOCAP data from its original web page (a license agreement is required)

[preprocessed-data schema (our approach)]

  • Get the preprocessed dataset [application link]

    If you want to download the preprocessed dataset, please request the license from the IEMOCAP team first.

  • For the preprocessing steps, refer to the code in "./preprocessing"

  • We cannot publish the ASR-processed transcriptions due to license restrictions (commercial API); however, it should be moderately easy to extract ASR transcripts from the audio signal yourself. We used google-cloud-speech-api (see the transcription sketch after the data-format list below).

  • Format of the data for our experiments (see the loading sketch below):

    MFCC : MFCC features of the audio signal (ex. train_audio_mfcc.npy)
    [#samples, 750, 39] - (#samples, sequence (max 7.5s), dims)

    MFCC-SEQN : valid length of the sequence of the audio signal (ex. train_seqN.npy)
    [#samples] - (#samples)

    PROSODY : prosody features of the audio signal (ex. train_audio_prosody.npy)
    [#samples, 35] - (#samples, dims)

    TRANS : sequence of the transcription (indexed) of each sample (ex. train_nlp_trans.npy)
    [#samples, 128] - (#samples, sequence (max))

    LABEL : target label of the audio signal (ex. train_label.npy)
    [#samples] - (#samples)
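
As a quick check of the schema above, the arrays can be loaded with NumPy and their shapes inspected (an illustrative sketch; the file names follow the examples listed above):

    # Load the preprocessed arrays and print their shapes (illustrative sketch).
    import numpy as np

    mfcc    = np.load("train_audio_mfcc.npy")     # (#samples, 750, 39)
    seq_len = np.load("train_seqN.npy")           # (#samples,)
    prosody = np.load("train_audio_prosody.npy")  # (#samples, 35)
    trans   = np.load("train_nlp_trans.npy")      # (#samples, 128)
    label   = np.load("train_label.npy")          # (#samples,)

    for name, arr in [("MFCC", mfcc), ("MFCC-SEQN", seq_len),
                      ("PROSODY", prosody), ("TRANS", trans), ("LABEL", label)]:
        print(name, arr.shape)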

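As mentioned above, the ASR transcripts were produced with google-cloud-speech-api. A rough sketch of transcribing one utterance with the current Google Cloud Speech-to-Text Python client follows (the file name, sample rate, and client-library version are assumptions, not taken from the original code):

    # Rough sketch: transcribe one utterance with Google Cloud Speech-to-Text.
    from google.cloud import speech

    client = speech.SpeechClient()

    with open("Ses01F_impro01_F000.wav", "rb") as f:   # placeholder file name
        audio = speech.RecognitionAudio(content=f.read())

    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,                        # placeholder sample rate
        language_code="en-US",
    )

    response = client.recognize(config=config, audio=audio)
    for result in response.results:
        print(result.alternatives[0].transcript)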
[source code]

  • The repository contains code for the following models (an illustrative fusion sketch follows the list):

    Audio Recurrent Encoder (ARE)
    Text Recurrent Encoder (TRE)
    Multimodal Dual Recurrent Encoder (MDRE)
    Multimodal Dual Recurrent Encoder with Attention (MDREA)
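
To illustrate the overall idea, a minimal TensorFlow 1.x sketch of fusing an audio encoder and a text encoder is shown below. This is not the authors' implementation: it uses GRU cells, omits the prosody features and the attention variant, and all layer sizes, vocabulary size, and class count are placeholders.

    # Minimal sketch of dual recurrent encoders with late fusion (not the original model).
    import tensorflow as tf

    N_MFCC, MAX_AUDIO_LEN = 39, 750        # from the data schema above
    MAX_TEXT_LEN = 128                     # from the data schema above
    VOCAB, EMB = 10000, 100                # placeholder vocabulary / embedding sizes
    HIDDEN, N_CLASSES = 128, 4             # placeholder hidden size / class count

    audio = tf.placeholder(tf.float32, [None, MAX_AUDIO_LEN, N_MFCC])
    audio_len = tf.placeholder(tf.int32, [None])
    text = tf.placeholder(tf.int32, [None, MAX_TEXT_LEN])

    with tf.variable_scope("audio_encoder"):           # ARE-style branch
        cell_a = tf.nn.rnn_cell.GRUCell(HIDDEN)
        _, state_a = tf.nn.dynamic_rnn(cell_a, audio,
                                       sequence_length=audio_len, dtype=tf.float32)

    with tf.variable_scope("text_encoder"):            # TRE-style branch
        emb = tf.get_variable("emb", [VOCAB, EMB])
        cell_t = tf.nn.rnn_cell.GRUCell(HIDDEN)
        _, state_t = tf.nn.dynamic_rnn(cell_t, tf.nn.embedding_lookup(emb, text),
                                       dtype=tf.float32)

    # MDRE-style fusion: concatenate the two utterance encodings and classify.
    joint = tf.concat([state_a, state_t], axis=-1)
    logits = tf.layers.dense(joint, N_CLASSES)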


[training]

  • refer to "reference_script.sh"
  • the final result will be stored in "./TEST_run_result.txt"

[cite]

  • Please cite our paper when you use our code, model, or dataset:

    @inproceedings{yoon2018multimodal,
    title={Multimodal Speech Emotion Recognition Using Audio and Text},
    author={Yoon, Seunghyun and Byun, Seokhyun and Jung, Kyomin},
    booktitle={2018 IEEE Spoken Language Technology Workshop (SLT)},
    pages={112--118},
    year={2018},
    organization={IEEE}
    }
