speech-aligner is a tool that generates phoneme-level time alignments between human speech and its transcription.

Stars: ✭ 259 (+48.85%)

Mutual labels: speech

Code Switching Papers

A curated list of research papers and resources on code-switching

Stars: ✭ 122 (-29.89%)

Mutual labels: speech

Noise2Noise-audio denoising without clean training data

Source code for the paper "Speech Denoising without Clean Training Data: a Noise2Noise Approach", accepted at the INTERSPEECH 2021 conference. The paper tackles the heavy dependence of deep-learning-based audio denoising methods on clean speech data by showing that it is possible to train deep speech denoisi…

Stars: ✭ 49 (-71.84%)

Mutual labels: speech

Syn Speech

Syn.Speech is a flexible speaker independent continuous speech recognition engine for Mono and .NET framework

Stars: ✭ 57 (-67.24%)

Mutual labels: speech

minutes

🔭 Speaker diarization via transfer learning

Stars: ✭ 25 (-85.63%)

Mutual labels: speech

Tts Papers

🐸 collection of TTS papers

Stars: ✭ 160 (-8.05%)

Mutual labels: speech

sova-asr

SOVA ASR (Automatic Speech Recognition)

Stars: ✭ 123 (-29.31%)

Mutual labels: speech

Stl

The ITU-T Software Tool Library (G.191)

Stars: ✭ 44 (-74.71%)

Mutual labels: speech

Speech256

An FPGA implementation of a classic 1980s speech synthesizer. Done for the Retro Challenge 2017/10.

Stars: ✭ 51 (-70.69%)

Mutual labels: speech

Speech And Text Unity Ios Android

Speech to text in Unity for iOS, using native Speech Recognition

Stars: ✭ 117 (-32.76%)

Mutual labels: speech

ser-with-w2v2

Official implementation of INTERSPEECH 2021 paper 'Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings'

Stars: ✭ 40 (-77.01%)

Mutual labels: speech

Dialectid e2e

End to End Dialect Identification using Convolutional Neural Network

Stars: ✭ 40 (-77.01%)

Mutual labels: speech

torch-asg

Auto Segmentation Criterion (ASG) implemented in PyTorch

Stars: ✭ 42 (-75.86%)

Mutual labels: speech

Allosaurus

Allosaurus is a pretrained universal phone recognizer for more than 2000 languages

Stars: ✭ 135 (-22.41%)

Mutual labels: speech

Fre-GAN-pytorch

Fre-GAN: Adversarial Frequency-consistent Audio Synthesis

Stars: ✭ 73 (-58.05%)

Mutual labels: speech

Wsay

Windows "say"

Stars: ✭ 36 (-79.31%)

Mutual labels: speech

SER-datasets

A collection of datasets for the purpose of emotion recognition/detection in speech.

Stars: ✭ 74 (-57.47%)

Mutual labels: speech

Holobot

HoloBot is a reusable 3D interface that allows HoloLens & VR users to interact with any bot using Mixed Reality & Speech.

Stars: ✭ 114 (-34.48%)

Mutual labels: speech

speech recognition ctc

Chinese speech recognition using CTC with Keras

Stars: ✭ 40 (-77.01%)

Mutual labels: speech

Pykaldi

A Python wrapper for Kaldi

Stars: ✭ 756 (+334.48%)

Mutual labels: speech

ttslearn

ttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)

Stars: ✭ 158 (-9.2%)

Mutual labels: speech

Pytorch Kaldi

pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by PyTorch, while feature extraction, label computation, and decoding are performed with the Kaldi toolkit.

Stars: ✭ 2,097 (+1105.17%)

Mutual labels: speech

MelNet-SpeechGeneration

Implementation of MelNet in PyTorch to generate high-fidelity audio samples

Stars: ✭ 19 (-89.08%)

Mutual labels: speech

Praat

Praat: Doing Phonetics By Computer

Stars: ✭ 675 (+287.93%)

Mutual labels: speech

HTK

The Hidden Markov Model Toolkit (HTK) from University of Cambridge, with fixed issues.

Stars: ✭ 23 (-86.78%)

Mutual labels: speech

Python Speech recognition

A simple example of using the Baidu speech recognition API with Python.

Stars: ✭ 106 (-39.08%)

Mutual labels: speech

opensnips

Open source projects related to Snips https://snips.ai/.

Stars: ✭ 50 (-71.26%)

Mutual labels: speech

Speech Emotion Analyzer

The neural network model is capable of detecting five different male/female emotions from speech audio. (Deep Learning, NLP, Python)

Stars: ✭ 633 (+263.79%)

Mutual labels: speech

nlp-class

A Natural Language Processing course taught by Professor Ghassemi

Stars: ✭ 95 (-45.4%)

Mutual labels: speech

Avpi

Open-source voice command macro software

Stars: ✭ 130 (-25.29%)

Mutual labels: speech

Voice2Mesh

CVPR 2022: Cross-Modal Perceptionist: Can Face Geometry be Gleaned from Voices?

Stars: ✭ 67 (-61.49%)

Mutual labels: speech

Nodejs Speech

Node.js client for Google Cloud Speech: Speech to text conversion powered by machine learning.

Stars: ✭ 545 (+213.22%)

Mutual labels: speech

UniSpeech

UniSpeech - Large Scale Self-Supervised Learning for Speech

Stars: ✭ 224 (+28.74%)

Mutual labels: speech

Audiomate

Python library for handling audio datasets.

Stars: ✭ 99 (-43.1%)

Mutual labels: speech

gtranscribe

Software for interview transcription

Stars: ✭ 12 (-93.1%)

Mutual labels: speech

Speech Denoising Wavenet

A neural network for end-to-end speech denoising

Stars: ✭ 516 (+196.55%)

Mutual labels: speech

linear16

Converts an audio file to a LINEAR16 file compatible with Google Speech.

Stars: ✭ 14 (-91.95%)

Mutual labels: speech

Wavenet vocoder

WaveNet vocoder

Stars: ✭ 1,926 (+1006.9%)

Mutual labels: speech

DeepSegmentor

Sequence Segmentation using Joint RNN and Structured Prediction Models (ICASSP 2017)

Stars: ✭ 17 (-90.23%)

Mutual labels: speech

Java Speech Api

The J.A.R.V.I.S. Speech API is designed to be simple and efficient, using the speech engines created by Google to provide functionality for parts of the API. Essentially, it is an API written in Java, including a recognizer, synthesizer, and a microphone capture utility. The project uses Google services for the synthesizer and recognizer. While this requires an Internet connection, it provides a complete, modern, and fully functional speech API in Java.

Stars: ✭ 490 (+181.61%)

Mutual labels: speech

datasets

🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools

Stars: ✭ 13,870 (+7871.26%)

Mutual labels: speech

Gtts

Python library and CLI tool to interface with Google Translate's text-to-speech API

Stars: ✭ 1,303 (+648.85%)

Mutual labels: speech

deepspeech.mxnet

An MXNet implementation of Baidu's DeepSpeech architecture