
mkosaka1 / Speech_Emotion_Recognition

Licence: other
Using Convolutional Neural Networks in speech emotion recognition on the RAVDESS Audio Dataset.

Programming Languages

Jupyter Notebook

Projects that are alternatives to or similar to Speech_Emotion_Recognition

SemEval2019Task3
Code for ANA at SemEval-2019 Task 3
Stars: ✭ 41 (-34.92%)
Mutual labels:  emotion
ANMP
multi-channel loopable video game music player for nerds and audiophiles
Stars: ✭ 16 (-74.6%)
Mutual labels:  audio-files
react-emotion-multi-step-form
React multi-step form library with Emotion styling
Stars: ✭ 25 (-60.32%)
Mutual labels:  emotion
VoiceNET.Library
.NET library to easily create Voice Command Control feature.
Stars: ✭ 14 (-77.78%)
Mutual labels:  cnn-model
react-awesome-reveal
React components to add reveal animations using the Intersection Observer API and CSS Animations.
Stars: ✭ 564 (+795.24%)
Mutual labels:  emotion
pytorch Highway Networks
Highway Networks implement in pytorch
Stars: ✭ 63 (+0%)
Mutual labels:  cnn-model
PSCognitiveService
Powershell module to access Microsoft Azure Machine learning RESTful API's or Microsoft cognitive services
Stars: ✭ 46 (-26.98%)
Mutual labels:  emotion
Audio-auto-tagging
Convolutional Neural Network for auto-tagging of audio clips on MagnaTagATune dataset
Stars: ✭ 37 (-41.27%)
Mutual labels:  audio-files
rescript-react-boilerplate
An opinionated app shell for ReScript & React progressive web apps
Stars: ✭ 62 (-1.59%)
Mutual labels:  emotion
leafygreen-ui
LeafyGreen UI – LeafyGreen's React UI Kit
Stars: ✭ 112 (+77.78%)
Mutual labels:  emotion
dissertation
🎓 📜 This repository holds my final year and dissertation project during my time at the University of Lincoln titled 'Deep Learning for Emotion Recognition in Cartoons'.
Stars: ✭ 22 (-65.08%)
Mutual labels:  emotion
hiddenwave
An Audio Steganography Tool, written in C++
Stars: ✭ 46 (-26.98%)
Mutual labels:  audio-files
Deep-Learning-for-Expression-Recognition-in-Image-Sequences
The project uses state of the art deep learning on collected data for automatic analysis of emotions.
Stars: ✭ 26 (-58.73%)
Mutual labels:  emotion
onurl
URL Shortener created w/ Next.js, TypeScript, Mongoose
Stars: ✭ 48 (-23.81%)
Mutual labels:  emotion
satchel
The little bag of CSS-in-JS superpowers
Stars: ✭ 14 (-77.78%)
Mutual labels:  emotion
next-express-emotion
Easy way to setup NextJS 12.0.7, React 17.0.2, Express 4.17.2, and @emotion/react 11.7.1 locally.
Stars: ✭ 31 (-50.79%)
Mutual labels:  emotion
yolo-deepsort-flask
Target detection and multi target tracking platform based on Yolo DeepSort and Flask.
Stars: ✭ 29 (-53.97%)
Mutual labels:  cnn-model
download audioset
📁 This repo makes it easy to download the raw audio files from AudioSet (32.45 GB, 632 classes).
Stars: ✭ 53 (-15.87%)
Mutual labels:  audio-files
ser-with-w2v2
Official implementation of INTERSPEECH 2021 paper 'Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings'
Stars: ✭ 40 (-36.51%)
Mutual labels:  speech-emotion-recognition
SER-datasets
A collection of datasets for the purpose of emotion recognition/detection in speech.
Stars: ✭ 74 (+17.46%)
Mutual labels:  speech-emotion-recognition

Speech Emotion Recognition System

Muriel Kosaka

Project Overview

Humans can sense the emotional state of a communication partner through all of their available senses. This emotional detection is natural for humans but is a very difficult task for computers: although they can easily understand content-based information, accessing the depth behind that content is hard, and that is what speech emotion recognition (SER) sets out to do. SER is a system through which a computer classifies audio speech files into different emotions such as happy, sad, angry, and neutral. SER can be used in areas such as the medical field or customer call centers. With this project, I hope to work towards applying this model in an app that individuals with ASD, who may have difficulty reading others' emotions, can use when speaking with others to help guide conversation and to create and maintain healthy relationships. Google Slides Presentation

Dataset

The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) dataset from Kaggle contains 1,440 audio files from 24 actors vocalizing two lexically matched statements. Emotions include angry, happy, sad, fearful, calm, neutral, disgust, and surprised. Click for dataset
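As a minimal sketch of how these files can be loaded and labeled, assuming a hypothetical local folder `ravdess` and relying on the standard RAVDESS filename convention (seven hyphen-separated fields, the third of which encodes the emotion):

```python
import glob
import os

import librosa

# Emotion codes from the published RAVDESS filename convention.
EMOTIONS = {
    "01": "neutral", "02": "calm", "03": "happy", "04": "sad",
    "05": "angry", "06": "fearful", "07": "disgust", "08": "surprised",
}

def load_ravdess(root="ravdess"):  # hypothetical local path
    """Yield (waveform, sample_rate, emotion) for every .wav file under root."""
    for path in glob.glob(os.path.join(root, "**", "*.wav"), recursive=True):
        code = os.path.basename(path).split("-")[2]  # third field = emotion
        y, sr = librosa.load(path)  # librosa resamples to 22,050 Hz by default
        yield y, sr, EMOTIONS[code]
```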

Process

  1. See Data_Preprocessing_&_Initial_Model.ipynb: loaded audio files, created visualizations, conducted feature extraction (log-mel spectrograms) resulting in a dataframe (see audio.csv), and built the initial 1D CNN model; sketches of the feature extraction and a comparable model follow this list. Obtained an accuracy score of 38%, with the model having difficulty classifying calm, surprised, angry, and disgust.
  • EDA

  • Initial Model

  2. See Data_Augmentation.ipynb: implemented data augmentation methods, including adding noise, shifting speed and pitch, and stretching, on all audio files, and used feature extraction methods to turn the audio files into images to feed into the 1D CNN model; an augmentation sketch follows this list. Obtained an accuracy score of 80%, but the model overfit the data, as seen in the graph.

  3. See Uploads for all .png and sample audio files

  4. See Transfer_Learning: currently a work in progress in the notebook. Applied the pre-trained VGG16 and Inception models for higher accuracy; a fine-tuning sketch follows this list.
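For step 1, a minimal sketch of log-mel feature extraction with librosa. Averaging each mel band over time to produce one fixed-length row per file is my assumption about how a dataframe like audio.csv could be built, not a detail confirmed by the notebook:

```python
import numpy as np
import librosa

def log_mel_features(y, sr, n_mels=128):
    """Log-scaled mel spectrogram (dB), averaged over time into one vector."""
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    log_mel = librosa.power_to_db(mel, ref=np.max)  # convert power to dB
    return log_mel.mean(axis=1)  # (n_mels,) vector: one dataframe row
```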
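The notebook's exact 1D CNN architecture is not described here, so the following Keras model is only a plausible stand-in; the layer sizes, kernel widths, and dropout rate are my assumptions:

```python
import tensorflow as tf

def build_1d_cnn(n_features=128, n_classes=8):
    """Small 1D CNN over a per-file feature vector (shapes are assumptions)."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(n_features, 1)),
        tf.keras.layers.Conv1D(64, kernel_size=5, activation="relu"),
        tf.keras.layers.MaxPooling1D(pool_size=2),
        tf.keras.layers.Conv1D(128, kernel_size=5, activation="relu"),
        tf.keras.layers.GlobalMaxPooling1D(),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dropout(0.3),
        tf.keras.layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```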
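For step 2, a sketch of the three augmentation families the notebook names (noise, pitch/speed, stretch) using librosa; the noise factor, semitone step, and stretch rate are illustrative values, not the notebook's settings:

```python
import numpy as np
import librosa

def add_noise(y, noise_factor=0.005):
    """Mix in Gaussian noise scaled by noise_factor."""
    return y + noise_factor * np.random.randn(len(y))

def shift_pitch(y, sr, n_steps=2):
    """Shift pitch by n_steps semitones without changing duration."""
    return librosa.effects.pitch_shift(y=y, sr=sr, n_steps=n_steps)

def stretch(y, rate=1.1):
    """Time-stretch by rate (>1 is faster) without changing pitch."""
    return librosa.effects.time_stretch(y=y, rate=rate)
```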
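For step 4, a hedged sketch of one common VGG16 transfer-learning setup: freeze the ImageNet convolutional base and train a small classification head on spectrogram images. The input size, head layers, and optimizer are assumptions, and since VGG16 expects 3-channel inputs, single-channel spectrograms would need to be tiled to RGB:

```python
import tensorflow as tf

def build_vgg16_classifier(input_shape=(128, 128, 3), n_classes=8):
    """VGG16 base (frozen) + small dense head for 8 emotion classes."""
    base = tf.keras.applications.VGG16(
        weights="imagenet", include_top=False, input_shape=input_shape)
    base.trainable = False  # first pass: train only the new head
    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dropout(0.5),
        tf.keras.layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

Unfreezing the last convolutional block and re-compiling with a lower learning rate would be the usual second fine-tuning pass.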

Conclusion

Using feature extraction alone did not achieve a high accuracy score with my CNN model. Adding data augmentation improved the accuracy score to 53%; however, the model was overfitting the data. This model needs to be improved before being applied to an app that detects emotion in real time. Fine-tuning the VGG-16 architecture with image augmentation improved the overall model accuracy to 81%.

Limitations

Limitations include not using feature selection to reduce the dimensionality of the augmented feature set, which may have improved learning performance. Another limitation was the small amount of data: the RAVDESS dataset has only 1,440 files, which may be why the model overfit. Additional datasets could have been utilized.

Next Steps

Next steps for this project include building a front end for user interaction, then working towards an app to detect emotion. Afterwards, I would like to build a system that can recognize emotion in real time and then estimate the degree of affection, such as love, truthfulness, and friendship, of the person you are talking to.
