Speechrecognizerbutton: UIButton subclass with push-to-talk recording, speech recognition and Siri-style waveform view.
Stars: ✭ 144 (+27.43%)
Annyang: 💬 Speech recognition for your site
Stars: ✭ 6,216 (+5400.88%)
Adapt: Adapt Intent Parser
Stars: ✭ 690 (+510.62%)
Recording-Bot: A bot built to record and transcribe audio fragments from Discord.
Stars: ✭ 22 (-80.53%)
Allosaurus: Allosaurus is a pretrained universal phone recognizer for more than 2000 languages
Stars: ✭ 135 (+19.47%)
download audioset: 📁 This repo makes it easy to download the raw audio files from AudioSet (32.45 GB, 632 classes).
Stars: ✭ 53 (-53.1%)
converse: Conversational text analysis using various NLP techniques
Stars: ✭ 147 (+30.09%)
leon: 🧠 Leon is your open-source personal assistant.
Stars: ✭ 8,560 (+7475.22%)
sepia-stt-server: SEPIA server to support open-source speech recognition via WebSocket connection.
Stars: ✭ 45 (-60.18%)
picovoice: The end-to-end platform for building voice products at scale
Stars: ✭ 316 (+179.65%)
speech-to-text: Mixlingual speech recognition system; hybrid (GMM+NNet) model; Kaldi + Keras
Stars: ✭ 61 (-46.02%)
Asr audio data links: A list of publicly available audio data that anyone can download for ASR or other speech activities
Stars: ✭ 128 (+13.27%)
scim: [WIP] Speech recognition toolbox written in Nim. Based on Arraymancer.
Stars: ✭ 17 (-84.96%)
kosr: Transformer-based Korean speech recognition
Stars: ✭ 25 (-77.88%)
FAST-RIR: This is the official implementation of our neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating room impulse responses (RIRs) for a given acoustic environment.
Stars: ✭ 90 (-20.35%)
CCAligner: 🔮 Word-by-word audio subtitle synchronisation tool and API. Developed under GSoC 2017 with CCExtractor.
Stars: ✭ 131 (+15.93%)
Kaldi: kaldi-asr/kaldi is the official location of the Kaldi project.
Stars: ✭ 11,151 (+9768.14%)
wav2letter: Facebook AI Research's Automatic Speech Recognition Toolkit
Stars: ✭ 6,026 (+5232.74%)
Unity live caption: Use the Google Speech-to-Text API for real-time live-stream captioning in Unity. Works best when combined with your virtual character.
Stars: ✭ 26 (-76.99%)
voce-browser: Voice Controlled Chromium Web Browser
Stars: ✭ 34 (-69.91%)
Keras Kaldi: Keras Interface for Kaldi ASR
Stars: ✭ 124 (+9.73%)
pyjsgf: JSpeech Grammar Format (JSGF) compiler, matcher and parser package for Python.
Stars: ✭ 40 (-64.6%)
kaldi ag training: Docker image and scripts for training fine-tuned or completely personal Kaldi speech models, particularly for use with kaldi-active-grammar.
Stars: ✭ 14 (-87.61%)
speech to text: How to use the Google Cloud Speech API to transcribe audio/video files.
Stars: ✭ 35 (-69.03%)
Project alias: Alias is a teachable “parasite” designed to give users more control over their smart assistants, in terms of both customisation and privacy. Through a simple app, the user can train Alias to react to a custom wake-word or sound; once trained, Alias can take control of your home assistant by activating it for you.
Stars: ✭ 1,577 (+1295.58%)
vosk-asterisk: Speech Recognition in Asterisk with Vosk Server
Stars: ✭ 52 (-53.98%)
cep: CEP is a software platform designed for users who want to learn or rapidly prototype using standard A.I. components.
Stars: ✭ 140 (+23.89%)
UniSpeech: UniSpeech - Large Scale Self-Supervised Learning for Speech
Stars: ✭ 224 (+98.23%)
Nonautoreggenprogress: Tracking the progress in non-autoregressive generation (translation, transcription, etc.)
Stars: ✭ 118 (+4.42%)
speech-to-text-code-pattern: React app using the Watson Speech to Text service to transform voice audio into written text.
Stars: ✭ 37 (-67.26%)
deepspeech.mxnet: An MXNet implementation of Baidu's DeepSpeech architecture
Stars: ✭ 82 (-27.43%)
Rnn Transducer: MXNet implementation of RNN Transducer (Graves 2012): Sequence Transduction with Recurrent Neural Networks
Stars: ✭ 114 (+0.88%)
IR-GAN: Augmenting Room Impulse Response
Stars: ✭ 21 (-81.42%)
specAugment: Tensor2tensor experiment with SpecAugment
Stars: ✭ 46 (-59.29%)
Modality-Transferable-MER: Modality-Transferable-MER, a multimodal emotion recognition model with zero-shot and few-shot abilities.
Stars: ✭ 36 (-68.14%)
Ml Road: Machine Learning Resources, Practice and Research
Stars: ✭ 1,776 (+1471.68%)
OpenVINO-EmotionRecognition: OpenVINO + NCS2/NCS + MultiModel (FaceDetection, EmotionRecognition) + MultiStick + MultiProcess + MultiThread + USB Camera/PiCamera. RaspberryPi 3 compatible. Async.
Stars: ✭ 51 (-54.87%)
Inimesed: An Android app that lets you search your contacts by voice. Internet not required. Based on Pocketsphinx. Uses Estonian acoustic models.
Stars: ✭ 65 (-42.48%)
Deepspeechrecognition: A Chinese Deep Speech Recognition System, including a deep-learning-based acoustic model and a deep-learning-based language model
Stars: ✭ 1,421 (+1157.52%)
SpeechEmoRec: Speech Emotion Recognition Using Deep Convolutional Neural Network and Discriminant Temporal Pyramid Matching
Stars: ✭ 44 (-61.06%)
hfusion: Multimodal sentiment analysis using hierarchical fusion with context modeling
Stars: ✭ 42 (-62.83%)
cobra: On-device voice activity detection (VAD) powered by deep learning.
Stars: ✭ 76 (-32.74%)
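VoiceDictation: Xunfei (迅飞) speech dictation WebAPI - converts speech (up to 60 seconds) into the corresponding text so that machines can "understand" human language, in effect giving the machine "ears" and the ability to "hear".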
Stars: ✭ 36 (-68.14%)
Wav2letter: Facebook AI Research's Automatic Speech Recognition Toolkit
Stars: ✭ 5,907 (+5127.43%)
Awesome Diarization: A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
Stars: ✭ 673 (+495.58%)