The J.A.R.V.I.S. Speech API is designed to be simple and efficient, using the speech engines created by Google to provide functionality for parts of the API. Essentially, it is an API written in Java, including a recognizer, synthesizer, and a microphone capture utility. The project uses Google services for the synthesizer and recognizer. While this requires an Internet connection, it provides a complete, modern, and fully functional speech API in Java.

Stars: ✭ 490 (+282.81%)

Mutual labels: speech-recognition, speech, speech-to-text

Sonus

💬 /so.nus/ STT (speech to text) for Node with offline hotword detection

Stars: ✭ 532 (+315.63%)

Mutual labels: speech-recognition, speech, speech-to-text

Neural sp

End-to-end ASR/LM implementation with PyTorch

Stars: ✭ 408 (+218.75%)

Mutual labels: speech-recognition, speech, asr

Eesen

The official repository of the Eesen project

Stars: ✭ 738 (+476.56%)

Mutual labels: speech-recognition, speech-to-text, asr

Awesome Kaldi

A list of features, scripts, blogs, and resources for making better use of Kaldi (http://kaldi-asr.org/)

Stars: ✭ 393 (+207.03%)

Mutual labels: speech-recognition, speech, speech-to-text

Annyang

💬 Speech recognition for your site

Stars: ✭ 6,216 (+4756.25%)

Mutual labels: speech-recognition, speech, speech-to-text

Mongolian Speech Recognition

Mongolian speech recognition with PyTorch

Stars: ✭ 97 (-24.22%)

Mutual labels: speech-recognition, speech-to-text, asr

Discordspeechbot

A speech-to-text bot for Discord with music commands and more, built with Node.js. Ideal for controlling your Discord server with voice commands; it can also be useful for hearing-impaired people.

Stars: ✭ 35 (-72.66%)

Mutual labels: speech-recognition, speech, speech-to-text

Audio Data Links

A list of common publicly (and privately) available audio data that you can download for ASR or other speech activities. All your WERs are belong to us. Inspired by wer are we, who stole someone else's joke.
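The WERs mentioned above are word error rates, the usual way ASR systems trained on these corpora are scored: the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. Below is a minimal sketch in Python. It is not taken from any project listed here; the function name `word_error_rate` is hypothetical, and splitting on whitespace with no text normalization is an assumption.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: words are separated by whitespace and compared exactly
    (no lowercasing or punctuation stripping).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words processed so far
    # (initially none) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat", "the bat sat"))
```

Because insertions count as errors, the value can exceed 1.0 when the hypothesis is much longer than the reference.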

1. FREE

Source	Name & Direct Link	Type	Size (Hours)
OpenSLR	LibriSpeech - Train: 100, 360, 500; Test: Clean, Other; Dev: Clean, Other	Read	960
OpenSLR	TED-LIUM Release 2	Read	118
OpenSLR	TED-LIUM Release 3	Read	452
Voxforge	Voxforge English	Read	130
Mozilla	Common Voice v1	Read	500
Mozilla	Common Voice en_1087h_2019-06-12	Read	1087
Tatoeba	Tatoeba Audio Eng	Read	~200
Valentini	Noisy Speech Database All Files, DOI	Read	TBC
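The size column mixes plain numbers with an approximate value (~200) and a placeholder (TBC), so it needs some care when filtering corpora by size. The sketch below is one way to handle that, with the rows copied from the table above. The names `FREE_ASR`, `parse_hours`, and `corpora_with_at_least` are hypothetical, and treating "~200" as 200 is an assumption.

```python
from typing import List, Optional

# Rows copied from the free ASR table: source, name, type, size in hours.
FREE_ASR = """\
OpenSLR\tLibriSpeech\tRead\t960
OpenSLR\tTED-LIUM Release 2\tRead\t118
OpenSLR\tTED-LIUM Release 3\tRead\t452
Voxforge\tVoxforge English\tRead\t130
Mozilla\tCommon Voice v1\tRead\t500
Mozilla\tCommon Voice en_1087h_2019-06-12\tRead\t1087
Tatoeba\tTatoeba Audio Eng\tRead\t~200
Valentini\tNoisy Speech Database\tRead\tTBC
"""


def parse_hours(cell: str) -> Optional[float]:
    """Return the size in hours, or None when no number is given (e.g. TBC)."""
    try:
        # Assumption: an approximate value such as "~200" is read as 200.
        return float(cell.strip().lstrip("~"))
    except ValueError:
        return None


def corpora_with_at_least(min_hours: float) -> List[str]:
    """Names of corpora whose listed size is at least min_hours."""
    names = []
    for line in FREE_ASR.splitlines():
        _source, name, _kind, hours_cell = line.split("\t")
        hours = parse_hours(hours_cell)
        if hours is not None and hours >= min_hours:
            names.append(name)
    return names


if __name__ == "__main__":
    print(corpora_with_at_least(500))
```

Summing the column would overstate the total amount of distinct audio, since successive releases of the same corpus (TED-LIUM, Common Voice) may overlap.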

2. PAID

Source	Name	Type	Size (Hours)	Code
LDC	Fisher	Conversational	2000	Speech: LDC2004S13, LDC2005S13; Transcripts: LDC2004T19, LDC2005T19
LDC	Switchboard Hub 500	Conversational	240	LDC2002S09
LDC	Switchboard Release 2	Conversational	300	LDC97S62
LDC	TIMIT	Read	5	LDC93S1
LDC	Wall Street Journal (WSJ)	Read	80	LDC93S6A or LDC93S6B

TTS

1. FREE

Source	Name & Direct Link	Type	Size (Hours)
Edinburgh CSTR	CSTR VCTK Corpus	Read	44
LJ Speech	LJ Speech	Read	24

Stars: ✭ 128
