Tralfazz / RE-VERB

License: MIT
A speaker diarization system using an LSTM

Programming Languages

Python, Vue, JavaScript, HTML, Dockerfile, shell

About the project

RE:VERB is a speaker diarization system. It lets the user send or record audio of a conversation and receive timestamps of who spoke when.

RE:VERB is our final project in Magshimim, and consists of a web client and a server.

  • The client can record audio and display the resulting timestamps graphically.

  • The server exposes a simple REST API, so it can also be used by other clients.

Built With

Client

Server

  • PyTorch - deep learning library for Python with strong GPU support through CUDA

  • Express.js - Node.js web server framework

Getting Started

The project contains the server and the web client (a CLI client also exists for debugging purposes).

The server is located at ./server and the web client is located at ./client/website.


Server

The model, together with the scripts for downloading data and training and the weights from our own training run, is located at ./server/speech_diarization/model.

We used Docker to create a cross-platform environment to run the server on.

The server is made up of:

  • a container for the web server
  • a container for the diarization process
  • a container for a redis database that will allow the others to communicate

Docker Compose runs and manages all three containers at once.

Docker and docker-compose must be installed to build and run the server. Everything else is taken care of automatically.

Installing

cd server
docker-compose up

This will run all 3 containers and install dependencies.

If you make a change in the server, use

docker-compose up --build

to rebuild.

Usage:

Sending an HTTP POST request with an audio file to the server at http://localhost:1337/upload (the default URL and port) returns a JSON response with the timestamps in milliseconds.

{"0": [[40, 120], [3060, 3460], [3480, 3560]], "1": [[1260, 1660], [1680, 1960]]}
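As a rough illustration of calling this endpoint from Python, here is a minimal sketch using only the standard library. It makes several assumptions that the README does not confirm: that the server accepts a multipart/form-data upload, that the form field is named "file" (a hypothetical name), and that each key in the response is a speaker label mapping to a list of [start, end] pairs. The helper names upload_audio and build_timeline are illustrative and not part of the project.

```python
import json
import mimetypes
import urllib.request
import uuid
from pathlib import Path

UPLOAD_URL = "http://localhost:1337/upload"  # default URL and port given above


def upload_audio(path, url=UPLOAD_URL, field_name="file"):
    """POST an audio file as multipart/form-data and return the response text.

    ASSUMPTION: the multipart encoding and the field name "file" are guesses;
    check the server code for the name it actually expects.
    """
    path = Path(path)
    boundary = uuid.uuid4().hex
    content_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{path.name}"\r\n'.encode(),
        f"Content-Type: {content_type}\r\n\r\n".encode(),
        path.read_bytes(),
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


def build_timeline(response_text):
    """Flatten {"speaker": [[start_ms, end_ms], ...]} into a chronological list.

    ASSUMPTION: keys are speaker labels and values are [start, end] pairs in ms.
    """
    segments = json.loads(response_text)
    timeline = [
        (start, end, speaker)
        for speaker, spans in segments.items()
        for start, end in spans
    ]
    return sorted(timeline)  # tuples sort by start time first


if __name__ == "__main__":
    # Uses the sample response above, so no running server is needed.
    sample = ('{"0": [[40, 120], [3060, 3460], [3480, 3560]], '
              '"1": [[1260, 1660], [1680, 1960]]}')
    for start, end, speaker in build_timeline(sample):
        print(f"speaker {speaker}: {start} ms to {end} ms")
```

With the server running, build_timeline(upload_audio("conversation.wav")) would give the same kind of list for a real recording; the file name here is only a placeholder.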

Client

The client requires npm or yarn to be installed. More info about the client can be found here.

To install:

cd client/website
npm install

Afterwards, you can use

npm run serve

to run a development server.


Authors

License

This project is licensed under the MIT License. See the LICENSE.md file for details.

Acknowledgments

  • The diarization algorithm is an implementation of this research. We also used their implementation of the spectral clustering.

  • We took inspiration and some code from Harry Volek's implementation of a different but similar problem: speaker verification.

Future Plans

  • We had problems with training on the AMI corpus so we used the TIMIT corpus for the model provided.

  • We plan to train again on the VoxCeleb 1 and 2 datasets, which contain much more data and will hopefully improve feature extraction.

  • We want to add integration with a speech-to-text service to transcribe the created segments.
