AryaAftab / LIGHT-SERNET

Licence: other
Light-SERNet: A lightweight fully convolutional neural network for speech emotion recognition

Programming Languages

  • Python: 139,335 projects (#7 most used programming language)
  • Jupyter Notebook: 11,667 projects
  • Shell: 77,523 projects

Projects that are alternatives of or similar to LIGHT-SERNET

Tensorflowtts
😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)
Stars: ✭ 2,382 (+11810%)
Mutual labels:  tflite, tensorflow2
TF2DeepFloorplan
TF2 Deep FloorPlan Recognition using a Multi-task Network with Room-boundary-Guided Attention. Enable tensorboard, quantization, flask, tflite, docker, github actions and google colab.
Stars: ✭ 98 (+390%)
Mutual labels:  tflite, tensorflow2
E2E-Object-Detection-in-TFLite
This repository shows how to train a custom detection model with the TFOD API, optimize it with TFLite, and perform inference with the optimized model.
Stars: ✭ 28 (+40%)
Mutual labels:  tflite, tensorflow2
kula
Lightweight and highly extensible .NET scripting language.
Stars: ✭ 43 (+115%)
Mutual labels:  lightweight
gcnn_keras
Graph convolution with tf.keras
Stars: ✭ 47 (+135%)
Mutual labels:  tensorflow2
kcs
Scripting in C with JIT(x64)/VM.
Stars: ✭ 25 (+25%)
Mutual labels:  lightweight
ttt
A package for fine-tuning Transformers with TPUs, written in Tensorflow2.0+
Stars: ✭ 35 (+75%)
Mutual labels:  tensorflow2
E2E-tfKeras-TFLite-Android
End to end training MNIST image classifier with tf.Keras, convert to TFLite and deploy to Android
Stars: ✭ 17 (-15%)
Mutual labels:  tflite
Spectrum
Spectrum is an AI that uses machine learning to generate Rap song lyrics
Stars: ✭ 37 (+85%)
Mutual labels:  tensorflow2
ElegantRL
Scalable and Elastic Deep Reinforcement Learning Using PyTorch. Please star. 🔥
Stars: ✭ 2,074 (+10270%)
Mutual labels:  lightweight
Selfie2Anime-with-TFLite
How to create Selfie2Anime from tflite model to Android.
Stars: ✭ 70 (+250%)
Mutual labels:  tflite
beercss
Build material design interfaces in record time... without stress for devs... 🍺💛
Stars: ✭ 223 (+1015%)
Mutual labels:  lightweight
semantic_segmentation
Semantically segment the road in the given image.
Stars: ✭ 91 (+355%)
Mutual labels:  fully-convolutional-networks
deep_reinforcement_learning_gallery
Deep reinforcement learning with tensorflow2
Stars: ✭ 35 (+75%)
Mutual labels:  tensorflow2
RxSwiftMVVM
RxSwift MVVM Moya HandyJSON
Stars: ✭ 58 (+190%)
Mutual labels:  lightweight
G-SimCLR
This is the code base for paper "G-SimCLR : Self-Supervised Contrastive Learning with Guided Projection via Pseudo Labelling" by Souradip Chakraborty, Aritra Roy Gosthipaty and Sayak Paul.
Stars: ✭ 69 (+245%)
Mutual labels:  tensorflow2
NanoLimbo
The lightweight, high performance Minecraft limbo server
Stars: ✭ 94 (+370%)
Mutual labels:  lightweight
Deep-Learning
This repo provides projects on deep-learning mainly using Tensorflow 2.0
Stars: ✭ 22 (+10%)
Mutual labels:  tensorflow2
goof
Go Offer File - Easily serve files and directories over a network; a Golang implementation of `woof`.
Stars: ✭ 24 (+20%)
Mutual labels:  lightweight
text_classifier
A text classification project based on TensorFlow 2.3, supporting various classification models and related tricks.
Stars: ✭ 135 (+575%)
Mutual labels:  tensorflow2

Light-SERNet

This is the TensorFlow 2.x implementation of our paper "Light-SERNet: A lightweight fully convolutional neural network for speech emotion recognition", accepted at ICASSP 2022.

In this paper, we propose an efficient and lightweight fully convolutional neural network (FCNN) for speech emotion recognition in systems with limited hardware resources. In the proposed FCNN model, various feature maps are extracted via three parallel paths with different filter sizes. This helps the deep convolution blocks to extract high-level features while ensuring sufficient separability. The extracted features are used to classify the emotion of the input speech segment. While our model is smaller than state-of-the-art models, it achieves higher performance on the IEMOCAP and EMO-DB datasets.
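
As a rough illustration of the parallel-path idea only (the kernel shapes, channel widths, and pooling below are placeholders, not the configuration reported in the paper), such a body could be sketched in tf.keras as three convolutional paths applied to the same input feature map and concatenated before the deeper convolution blocks:

import tensorflow as tf
from tensorflow.keras import layers

def build_parallel_path_fcnn(input_shape=(300, 40, 1), num_classes=4):
    # Illustrative sketch only; not the exact Light-SERNet architecture.
    inputs = tf.keras.Input(shape=input_shape)  # e.g. (time frames, MFCCs, 1)

    # Three parallel paths with different kernel shapes extract
    # complementary feature maps from the same input.
    paths = []
    for kernel in [(11, 1), (1, 9), (3, 3)]:
        x = layers.Conv2D(32, kernel, padding="same", activation="relu")(inputs)
        x = layers.BatchNormalization()(x)
        x = layers.AveragePooling2D((2, 2))(x)
        paths.append(x)
    x = layers.Concatenate()(paths)

    # Deeper convolution blocks operate on the fused feature maps.
    for filters in (64, 96, 128):
        x = layers.Conv2D(filters, (3, 3), padding="same", activation="relu")(x)
        x = layers.BatchNormalization()(x)
        x = layers.MaxPooling2D((2, 2))(x)

    # Fully convolutional head: 1x1 convolution plus global pooling
    # instead of large dense layers.
    x = layers.Dropout(0.3)(x)
    x = layers.Conv2D(num_classes, (1, 1))(x)
    x = layers.GlobalAveragePooling2D()(x)
    outputs = layers.Softmax()(x)
    return tf.keras.Model(inputs, outputs)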

Demo

Demo on EMO-DB dataset: Open In Colab

Run

1. Clone Repository

$ git clone https://github.com/AryaAftab/LIGHT-SERNET.git
$ cd LIGHT-SERNET/

2. Requirements

  • Tensorflow >= 2.3.0
  • Numpy >= 1.19.2
  • Tqdm >= 4.50.2
  • Matplotlib >= 3.3.1
  • Scikit-learn >= 0.23.2
$ pip install -r requirements.txt

3. Data:

  • Download the EMO-DB and IEMOCAP (access permission required) datasets
  • Extract them into the data folder

Note: To use the IEMOCAP dataset, please follow issue #3.

4. Set hyperparameters and training config:

You only need to change the constants in hyperparameters.py to set the hyperparameters and the training config.
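
As a purely hypothetical illustration of the kind of constants such a file groups together (the names and default values below are invented for this example, not taken from the repository):

# Hypothetical constants, for illustration only; the real hyperparameters.py
# in this repository defines its own names and default values.
SAMPLE_RATE = 16000      # audio sampling rate (Hz)
N_MFCC = 40              # MFCC coefficients per frame
BATCH_SIZE = 32
EPOCHS = 300
LEARNING_RATE = 1e-4
DROPOUT_RATE = 0.3
K_FOLDS = 5              # cross-validation folds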

5. Start training:

Use the following command to train the model on the desired dataset, cost function, and input length (in seconds).

  • Note 1: The input is automatically cut or padded to the desired size and stored in the data folder (a minimal sketch of this step is shown after the command below).
  • Note 2: The best model is saved in the model folder.
  • Note 3: The results for the confusion matrix are saved in the result folder.
$ python train.py -dn {dataset_name} \
                  -id {input durations} \
                  -at {audio_type} \
                  -ln {cost function name} \
                  -v {verbose for training bar} \
                  -it {type of input (mfcc, spectrogram, mel_spectrogram)} \
                  -c {type of cache (disk, ram, None)} \
                  -m {fuse mfcc feature extractor in exported tflite model}
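
As mentioned in Note 1 above, each utterance is first cut or zero-padded to the chosen input duration. A minimal, repository-independent sketch of that kind of fixed-length preprocessing (the function name and defaults are illustrative):

import numpy as np

def fix_length(waveform, sample_rate=16000, duration_s=3.0):
    # Cut or zero-pad a 1-D waveform to exactly duration_s seconds.
    # Illustrative helper only; the repository performs this step itself
    # when it builds its cached dataset in the data folder.
    target_len = int(sample_rate * duration_s)
    waveform = np.asarray(waveform, dtype=np.float32)
    if len(waveform) >= target_len:
        return waveform[:target_len]                           # cut
    return np.pad(waveform, (0, target_len - len(waveform)))   # zero-pad at the end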

Example:

EMO-DB Dataset:

python train.py -dn "EMO-DB" \
                -id 3 \
                -at "all" \
                -ln "focal" \
                -v 1 \
                -it "mfcc"
                -c "disk"
                -m false

IEMOCAP Dataset:

python train.py -dn "IEMOCAP" \
                -id 7 \
                -at "impro" \
                -ln "cross_entropy" \
                -v 1 \
                -it "mfcc"
                -c "disk"
                -m false

Note: To run all the experiments, just run run.sh:

sh run.sh

Fusing MFCC Extractor (New Feature)

To let the model run independently, without needing the TensorFlow library, the MFCC feature extractor was added as a single layer at the beginning of the model. The trained model was then exported as a single file in the TensorFlow Lite format. The input of this model is raw audio in the form of a vector of shape (1, sample_rate * input_duration).
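
One way such a fused front-end can be expressed is as a Keras layer built on tf.signal. The sketch below is only an illustration; the frame, FFT, and mel-filter settings are assumptions rather than the values used in this repository.

import tensorflow as tf

class MFCCLayer(tf.keras.layers.Layer):
    # Computes MFCCs from raw audio inside the model graph, so the exported
    # TFLite file can accept a raw waveform directly. Frame sizes, mel-bin
    # count, and coefficient count below are illustrative assumptions.
    def __init__(self, sample_rate=16000, n_mfcc=40, **kwargs):
        super().__init__(**kwargs)
        self.sample_rate = sample_rate
        self.n_mfcc = n_mfcc

    def call(self, waveforms):  # waveforms: (batch, samples)
        stft = tf.signal.stft(waveforms, frame_length=400,
                              frame_step=160, fft_length=512)
        power = tf.square(tf.abs(stft))
        mel_matrix = tf.signal.linear_to_mel_weight_matrix(
            num_mel_bins=64,
            num_spectrogram_bins=power.shape[-1],
            sample_rate=self.sample_rate)
        log_mel = tf.math.log(tf.matmul(power, mel_matrix) + 1e-6)
        mfccs = tf.signal.mfccs_from_log_mel_spectrograms(log_mel)
        return mfccs[..., :self.n_mfcc, tf.newaxis]  # add a channel axis

To train with the fused feature extractor: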

python train.py -dn "EMO-DB" \
                -id 3 \
                -m True
  • Note 1: The best model is saved in the model folder.
  • Note 2: To run the TFLite model, you only need the tflite_runtime library. For this project, tflite_runtime must be built with TF op support (Flex delegate); you can learn how to build TensorFlow Lite from source with this flag here. A sketch of loading and running the exported model follows these notes.
  • Note 3: Another repository for running the TFLite model as a real-time application will be completed soon.
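
As a rough illustration of the deployment side (the model filename and input duration below are assumptions, not values fixed by this repository), inference with the exported file could look like the following. As noted above, the interpreter build must include Flex-delegate support when the fused MFCC layer relies on TF ops.

import numpy as np
from tflite_runtime.interpreter import Interpreter  # tf.lite.Interpreter also works for a quick test

SAMPLE_RATE = 16000                       # assumed export sample rate
INPUT_DURATION = 3                        # assumed input duration in seconds
MODEL_PATH = "model/light_sernet.tflite"  # hypothetical path to the exported model

interpreter = Interpreter(model_path=MODEL_PATH)
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

# Raw waveform input of shape (1, sample_rate * input_duration), as described above.
audio = np.zeros((1, SAMPLE_RATE * INPUT_DURATION), dtype=np.float32)
interpreter.set_tensor(inp["index"], audio)
interpreter.invoke()
probabilities = interpreter.get_tensor(out["index"])[0]
print("predicted emotion index:", int(np.argmax(probabilities)))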

Citation

If you find our code useful for your research, please consider citing:

@inproceedings{aftab2022light,
  title={Light-SERNet: A lightweight fully convolutional neural network for speech emotion recognition},
  author={Aftab, Arya and Morsali, Alireza and Ghaemmaghami, Shahrokh and Champagne, Benoit},
  booktitle={ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  pages={6912--6916},
  year={2022},
  organization={IEEE}
}