cvqluu / simple_diarizer

License: GPL-3.0
Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code

simple_diarizer

Open In Colab

Simplified diarization pipeline using some pretrained models.

Made to be as simple as possible to go from an input audio file to diarized segments.

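import soundfile as sf
import matplotlib.pyplot as plt

from simple_diarizer.diarizer import Diarizer
from simple_diarizer.utils import combined_waveplot

WAV_FILE = "audio.wav"  # placeholder: path to your input audio file
NUM_SPEAKERS = 2        # placeholder: number of speakers in the recording

diar = Diarizer(
    embed_model='xvec',   # 'xvec' and 'ecapa' supported
    cluster_method='sc',  # 'ahc' and 'sc' supported
)

# Run the diarization pipeline on the audio file
segments = diar.diarize(WAV_FILE, num_speakers=NUM_SPEAKERS)

# Plot the waveform together with the diarized segments
signal, fs = sf.read(WAV_FILE)
combined_waveplot(signal, fs, segments)
plt.show()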
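
The format of the returned segments is not described here. As a minimal sketch, assuming each segment is a dict with `start` and `end` times in seconds and a speaker `label` (an assumption worth checking against the actual output of `diar.diarize`), total speaking time per speaker could be tallied as follows. The `speaking_time` helper and the sample data are hypothetical and not part of simple_diarizer.

```python
from collections import defaultdict


def speaking_time(segments):
    """Sum segment durations (in seconds) per speaker label."""
    totals = defaultdict(float)
    for seg in segments:
        totals[seg["label"]] += seg["end"] - seg["start"]
    return dict(totals)


# Hypothetical segments in the assumed format (not real diarizer output)
example_segments = [
    {"start": 0.0, "end": 2.5, "label": 0},
    {"start": 2.5, "end": 4.0, "label": 1},
    {"start": 4.0, "end": 7.0, "label": 0},
]

print(speaking_time(example_segments))
```

If the real segments use different key names, only the three dictionary lookups need to change.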

Install

simple_diarizer is available on PyPI:

pip install simple-diarizer

Source Video

"Some Quick Advice from Barack Obama!"

Pre-trained Models

The following pretrained models are used:

Demo

Open In Colab

The demo is available at the Colab link above, where it attempts to diarize any input YouTube URL. It also uses YouTube's autogenerated transcriptions to produce a speaker-labelled transcript.

Hopefully this is useful as a free, basic tool for producing a diarized transcript of a video or audio recording of interest.
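
One way to combine a transcript with diarized segments is to give each caption the speaker whose segment overlaps it the most. The sketch below illustrates that idea only; it is not necessarily how the demo does it. It assumes captions are dicts with `text`, `start` and `duration` (in seconds) and segments are dicts with `start`, `end` and `label`. Both formats, the `label_transcript` helper and the sample data are assumptions for illustration.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def label_transcript(captions, segments):
    """Pair each caption with the label of its most-overlapping segment.

    Captions that overlap no segment get the label None.
    """
    labelled = []
    for cap in captions:
        cap_start = cap["start"]
        cap_end = cap_start + cap["duration"]
        best_label, best_overlap = None, 0.0
        for seg in segments:
            ov = overlap(cap_start, cap_end, seg["start"], seg["end"])
            if ov > best_overlap:
                best_label, best_overlap = seg["label"], ov
        labelled.append((best_label, cap["text"]))
    return labelled


# Hypothetical inputs in the assumed formats (not real output)
captions = [
    {"text": "Hello there.", "start": 0.0, "duration": 2.0},
    {"text": "Hi, how are you?", "start": 2.0, "duration": 2.0},
    {"text": "Fine, thanks.", "start": 4.5, "duration": 1.5},
]
segments = [
    {"start": 0.0, "end": 2.5, "label": 0},
    {"start": 2.5, "end": 4.0, "label": 1},
    {"start": 4.0, "end": 7.0, "label": 0},
]

for speaker, text in label_transcript(captions, segments):
    print(f"Speaker {speaker}: {text}")
```

Maximum overlap is a simple heuristic: a caption that spans a speaker change is attributed entirely to whichever speaker covers more of it.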

Other References

Planned Features
