candlewill / Speech-Corpus-Collection

Licence: MIT license
A collection of speech corpora for ASR and TTS

Speech-Corpus-Collection

This repo is a collection of speech corpora for automatic speech recognition (ASR) and text-to-speech (TTS).

ASR Corpus

  1. VCTK
    Around 10.4 GB. An alternative host is also available.

  2. LibriSpeech
    Large-scale (1000 hours) corpus of read English speech.

  3. TEDLIUM release 2
    The TED-LIUM corpus was built from audio talks and their transcriptions available on the TED website. The authors prepared and filtered these data to train acoustic models for their participation in the International Workshop on Spoken Language Translation 2011 (the LIUM English/French SLT system ranked first in the SLT task).

TTS Corpus

  1. CMU ARCTIC Databases
    The databases consist of around 1150 utterances, including US English male (bdl) and female (slt) speakers, as well as other accented speakers.

  2. The World English Bible
    The World English Bible is a public domain update of the American Standard Version of 1901 into modern English. Its text and audio recordings are freely available here. Unfortunately, each audio file covers a chapter, not a verse, so it is too long in most cases. Kyubyong sliced them by verse manually. You can get them on his Dropbox.

  3. Nancy Corpus
    The Nancy corpus from the 2011 Blizzard Challenge. The data is freely available for research use upon signing a license.
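
The World English Bible entry above mentions that chapter-length recordings had to be cut into verses. The slicing there was done manually; the sketch below only illustrates how such a cut could be scripted once verse timings are known. It assumes the audio has already been converted to WAV, and the `boundaries` list of (start, end) times in seconds is hypothetical input that would have to come from manual annotation or a forced aligner. The function name `slice_wav` is also made up for this example.

```python
import wave


def slice_wav(src_path, boundaries, out_prefix):
    """Split one WAV file into numbered segment files.

    boundaries: list of (start_sec, end_sec) pairs (hypothetical verse timings).
    Returns the list of paths that were written.
    """
    paths = []
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        for index, (start, end) in enumerate(boundaries, 1):
            first_frame = int(start * rate)
            frame_count = int(end * rate) - first_frame
            src.setpos(first_frame)              # seek to the segment start
            frames = src.readframes(frame_count)
            out_path = f"{out_prefix}_{index:03d}.wav"
            with wave.open(out_path, "wb") as dst:
                dst.setparams(params)            # same channels, width, rate
                dst.writeframes(frames)          # header frame count is fixed up on close
            paths.append(out_path)
    return paths
```

For example, `slice_wav("chapter.wav", [(0.0, 7.5), (7.5, 15.2)], "verse")` would write `verse_001.wav` and `verse_002.wav`; the file name and timings here are placeholders.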

General

  1. The NSynth Dataset
    NSynth is an audio dataset containing 305,979 musical notes, each with a unique pitch, timbre, and envelope. For 1,006 instruments from commercial sample libraries, we generated four-second, monophonic 16kHz audio snippets, referred to as notes, by ranging over every pitch of a standard MIDI piano (21-108) as well as five different velocities (25, 50, 75, 100, 127). The note was held for the first three seconds and allowed to decay for the final second.
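
The figures in the NSynth description can be checked with a little arithmetic. The sketch below enumerates the pitch and velocity grid described above and computes the sample count of one note; the constant and function names are chosen for this example and are not part of any NSynth tooling.

```python
from itertools import product

SAMPLE_RATE = 16_000               # 16 kHz, as stated in the description
NOTE_SECONDS = 4                   # each note is four seconds long
HOLD_SECONDS = 3                   # held for three seconds, then one second of decay
PITCHES = range(21, 109)           # MIDI pitches 21-108 inclusive
VELOCITIES = (25, 50, 75, 100, 127)


def note_grid():
    """Return every (pitch, velocity) pair in the full grid."""
    return list(product(PITCHES, VELOCITIES))


grid = note_grid()
samples_per_note = SAMPLE_RATE * NOTE_SECONDS
hold_samples = SAMPLE_RATE * HOLD_SECONDS

print(len(grid))           # 440 combinations per instrument at most
print(samples_per_note)    # 64000 samples per note
print(hold_samples)        # 48000 samples before the decay starts
```

A full grid for all 1,006 instruments would give 1,006 × 440 = 442,640 notes, which is more than the 305,979 in the dataset, so not every instrument covers every pitch and velocity combination.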

Contact Me

Yunchao He
Weibo
