Source code of the model used in the TensorFlow Speech Recognition Challenge (https://www.kaggle.com/c/tensorflow-speech-recognition-challenge). The solution ranked in the top 5% on the private leaderboard.

Stars: ✭ 58 (+346.15%)

Mutual labels: speech

Multimodal-Gesture-Recognition-with-LSTMs-and-CTC

An end-to-end system that performs temporal recognition of gesture sequences using speech and skeletal input. The model combines three networks with a CTC output layer that recognises gestures from a continuous stream.

Stars: ✭ 25 (+92.31%)

Mutual labels: speech

IMS-Toucan

Text-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart. Objectives of the development are simplicity, modularity, controllability and multilinguality.

Stars: ✭ 295 (+2169.23%)

Mutual labels: speech

frog

Frog is an integration of memory-based natural language processing (NLP) modules developed for Dutch. All NLP modules are based on Timbl, the Tilburg memory-based learning software package.

Stars: ✭ 70 (+438.46%)

Mutual labels: computational-linguistics

speech-recognition-evaluation

Evaluate results from ASR/Speech-to-Text quickly

Stars: ✭ 25 (+92.31%)

Mutual labels: speech-to-text

react-native-speech-bubble

💬 A speech bubble dialog component for React Native.

Stars: ✭ 50 (+284.62%)

Mutual labels: speech

megs

A merged version of multiple open-source German speech datasets.

Stars: ✭ 21 (+61.54%)

Mutual labels: speech-to-text

audio noise clustering

An experiment with a variety of clustering (and clustering-like) techniques to reduce noise in a speech audio recording. Results: https://dodiku.github.io/audio_noise_clustering/results/

Stars: ✭ 24 (+84.62%)

Mutual labels: speech

VQMIVC

Official implementation of VQMIVC: One-shot (any-to-any) Voice Conversion @ Interspeech 2021 + Online playing demo!

Stars: ✭ 278 (+2038.46%)

Mutual labels: speech

idear

🎙️ Handsfree Audio Development Interface

Stars: ✭ 84 (+546.15%)

Mutual labels: speech

datalinguist

Stanford CoreNLP in idiomatic Clojure.

Stars: ✭ 93 (+615.38%)

Mutual labels: computational-linguistics

rustfst

Rust re-implementation of OpenFST, a library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). A Python binding is also available.

Stars: ✭ 104 (+700%)

Mutual labels: kaldi

benchmarkstt

Open-source AI toolkit for benchmarking speech-to-text services

Stars: ✭ 43 (+230.77%)

Mutual labels: speech-to-text

Naver-AI-Hackathon-Speech

2019 Clova AI Hackathon: Speech - Rank 12 / Team Kai.Lib

Stars: ✭ 26 (+100%)

Mutual labels: speech

browser-apis

🦄 Cool & Fun Browser Web APIs 🥳

Stars: ✭ 21 (+61.54%)

Mutual labels: speech

Shifter

Pitch shifter using WSOLA and resampling, implemented in Python 3

Stars: ✭ 22 (+69.23%)

Mutual labels: speech

lectures-all

Central repository for all lectures on deep learning at UPC ETSETB TelecomBCN.

Stars: ✭ 46 (+253.85%)

Mutual labels: speech

glaemscribe

Glaemscribe, the transcription engine for Tolkienian languages and writing systems.

Stars: ✭ 29 (+123.08%)

Mutual labels: transcription

esapp

An unsupervised Chinese word segmentation tool.

Stars: ✭ 13 (+0%)

Mutual labels: computational-linguistics

Voice Gender

Gender recognition by voice and speech analysis

Stars: ✭ 248 (+1807.69%)

Mutual labels: speech

embedding evaluation

Evaluate your word embeddings

Stars: ✭ 32 (+146.15%)

Mutual labels: computational-linguistics

Wavegrad

Implementation of Google Brain's WaveGrad high-fidelity vocoder (paper: https://arxiv.org/pdf/2009.00713.pdf). First implementation on GitHub.

Stars: ✭ 245 (+1784.62%)

Mutual labels: speech

Unity live caption

Use the Google Speech-to-Text API for real-time live-stream captioning in Unity! Works best when combined with your virtual character!

Stars: ✭ 26 (+100%)

Mutual labels: speech-to-text

wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

Stars: ✭ 2,384 (+18238.46%)

Mutual labels: automatic-speech-recognition

ucto

Unicode tokeniser. Ucto tokenizes text files: it separates words from punctuation and splits sentences. It also offers several other basic preprocessing steps, such as changing case, all of which you can use to make your text suitable for further processing such as indexing, part-of-speech tagging, or machine translation. Ucto comes with tokenisation rules …

Stars: ✭ 58 (+346.15%)

Mutual labels: computational-linguistics

TFGAN

TFGAN: Time and Frequency Domain Based Generative Adversarial Network for High-fidelity Speech Synthesis

Stars: ✭ 65 (+400%)

Mutual labels: speech

Tacotron pytorch

PyTorch implementation of Tacotron speech synthesis model.

Stars: ✭ 242 (+1761.54%)

Mutual labels: speech

Gcc Nmf

Real-time GCC-NMF Blind Speech Separation and Enhancement

Stars: ✭ 231 (+1676.92%)

Mutual labels: speech

wave2vec-recognize-docker

wav2vec 2.0 recognition pipeline

Stars: ✭ 30 (+130.77%)

Mutual labels: automatic-speech-recognition

React.ai

Recognizes your speech, and a trained AI bot responds (e.g. customer service, personal assistant), using machine learning APIs (DialogFlow, apiai), speech recognition, GraphQL, Next.js, React, and Redux.

Stars: ✭ 38 (+192.31%)

Mutual labels: speech-to-text

Source separation

Deep-learning-based speech source separation using PyTorch

Stars: ✭ 226 (+1638.46%)

Mutual labels: speech

Volute

Raspberry Pi + Node.js = Speech Robot

Stars: ✭ 224 (+1623.08%)

Mutual labels: speech

octopus

On-device speech-to-index engine powered by deep learning.