A small JavaScript library for browser-based real-time speech recognition that uses Recorderjs for audio capture and a WebSocket connection to the Kaldi GStreamer server for speech recognition.

Stars: ✭ 195 (-36.07%)

Mutual labels: speech-recognition, speech-to-text

Ctcdecode

PyTorch CTC Decoder bindings

Stars: ✭ 442 (+44.92%)

Mutual labels: beam-search, ctc

wav2letter

Facebook AI Research's Automatic Speech Recognition Toolkit

Stars: ✭ 6,026 (+1875.74%)

Mutual labels: end-to-end, speech-recognition

Automatic Speech Recognition

🎧 Automatic Speech Recognition: DeepSpeech & Seq2Seq (TensorFlow)

Stars: ✭ 192 (-37.05%)

Mutual labels: speech-recognition, speech-to-text

Image Caption Generator

A neural network that generates captions for an image using a CNN and an RNN with beam search.

Stars: ✭ 126 (-58.69%)

Mutual labels: beam-search, attention-mechanism

Seq2seq chatbot new

A TensorFlow implementation of a simple dialogue system based on the seq2seq model, with embedding, attention, and beam_search features; the dataset is Cornell Movie Dialogs

Stars: ✭ 144 (-52.79%)

Mutual labels: beam-search, attention-mechanism

Speechbrain.github.io

The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain users can easily create speech processing systems, ranging from speech recognition (both HMM/DNN and end-to-end), speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many others.

Stars: ✭ 242 (-20.66%)

Mutual labels: speech-recognition, speech-to-text

Image-Caption

Using an LSTM or Transformer to solve image captioning in PyTorch

Stars: ✭ 36 (-88.2%)

Mutual labels: beam-search, attention-mechanism

Seq2seq chatbot

A TensorFlow implementation of a simple dialogue system based on the seq2seq model, with embedding, attention, and beam_search features; the dataset is Cornell Movie Dialogs

Stars: ✭ 308 (+0.98%)

Mutual labels: beam-search, attention-mechanism

Automatic speech recognition

End-to-end Automatic Speech Recognition for Mandarin and English in TensorFlow

Stars: ✭ 2,751 (+801.97%)

Mutual labels: speech-recognition, end-to-end

Speech recognition with tensorflow

Implementation of a seq2seq model for Speech Recognition using the latest version of TensorFlow. Architecture similar to Listen, Attend and Spell.

Stars: ✭ 253 (-17.05%)

Mutual labels: speech-recognition, speech-to-text

Asr Evaluation

Python module for evaluating ASR hypotheses (e.g. word error rate, word recognition rate).

Stars: ✭ 190 (-37.7%)

Mutual labels: speech-recognition, asr

Cn2an

📦 Quickly convert between Chinese numerals and Arabic numerals (latest features: conversion of fractions, dates, temperatures, and more)

Stars: ✭ 249 (-18.36%)

Mutual labels: speech-recognition, asr

Poetry Seq2seq

Chinese Poetry Generation

Stars: ✭ 159 (-47.87%)

Mutual labels: beam-search, attention-mechanism

revai-python-sdk

Rev AI Python SDK

Stars: ✭ 35 (-88.52%)

Mutual labels: speech-recognition, speech-to-text

voce-browser

Voice Controlled Chromium Web Browser

Stars: ✭ 34 (-88.85%)

Mutual labels: speech-recognition, speech-to-text

anycontrol

Voice control for your websites and applications

Stars: ✭ 53 (-82.62%)

Mutual labels: speech-recognition, speech-to-text

opensource-voice-tools

A repo listing known open source voice tools, ordered by where they sit in the voice stack

Stars: ✭ 21 (-93.11%)

Mutual labels: speech-recognition, asr

web-voice-processor

A library for real-time voice processing in web browsers

Stars: ✭ 69 (-77.38%)

Mutual labels: speech-recognition, speech-to-text

octopus

On-device speech-to-index engine powered by deep learning.

Stars: ✭ 30 (-90.16%)

Mutual labels: speech-recognition, speech-to-text

KeenASR-Android-PoC

A proof-of-concept app using KeenASR SDK on Android. WE ARE HIRING: https://keenresearch.com/careers.html

Stars: ✭ 21 (-93.11%)

Mutual labels: speech-recognition, speech-to-text

revai-java-sdk

Rev.ai Java SDK

Stars: ✭ 16 (-94.75%)

Mutual labels: speech-recognition, speech-to-text

React.ai

Recognizes your speech, and a trained AI bot responds (e.g. customer service, personal assistant), using machine learning APIs (DialogFlow, apiai), Speech Recognition, GraphQL, Next.js, React, and Redux

Stars: ✭ 38 (-87.54%)

Mutual labels: speech-recognition, speech-to-text

deepspeech

A PyTorch implementation of DeepSpeech and DeepSpeech2.

Stars: ✭ 45 (-85.25%)

Mutual labels: speech-recognition, speech-to-text

speechmatics-python

Python library and CLI for Speechmatics

Stars: ✭ 24 (-92.13%)

Mutual labels: speech-recognition, speech-to-text

simple-obs-stt

Speech-to-text and keyboard input captions for OBS.

Stars: ✭ 89 (-70.82%)

Mutual labels: speech-recognition, speech-to-text

TS3000 TheChatBOT

A social networking chatbot trained on a Reddit dataset. It supports open bounded queries and is built on the concept of Neural Machine Translation. Beware of it being sarcastic, just like its creator 😝 By the way, it uses the PyTorch framework and Python 3.

Stars: ✭ 20 (-93.44%)

Mutual labels: beam-search, attention-mechanism

web-speech-cognitive-services

Polyfill Web Speech API with Cognitive Services Bing Speech for both speech-to-text and text-to-speech service.

Stars: ✭ 35 (-88.52%)

Mutual labels: speech-recognition, speech-to-text

AmazonSpeechTranslator

End-to-end Solution for Speech Recognition, Text Translation, and Text-to-Speech for iOS using Amazon Translate and Amazon Polly as AWS Machine Learning managed services.

Stars: ✭ 50 (-83.61%)

Mutual labels: speech-recognition, speech-to-text

musicologist

Music advice from a conversational interface powered by Algolia

Stars: ✭ 19 (-93.77%)

Mutual labels: speech-recognition, speech-to-text

DeepSpeech-API

This code lets users run Mozilla's DeepSpeech model from a web browser.

Stars: ✭ 31 (-89.84%)

Mutual labels: speech-recognition, speech-to-text

Inimesed

An Android app that lets you search your contacts by voice. Internet not required. Based on Pocketsphinx. Uses Estonian acoustic models.

Stars: ✭ 65 (-78.69%)

Mutual labels: speech-recognition, speech-to-text

rustfst

Rust re-implementation of OpenFST - library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). A Python binding is also available.

Stars: ✭ 104 (-65.9%)

Mutual labels: speech-recognition, asr

kaldi ag training

Docker image and scripts for training finetuned or completely personal Kaldi speech models. Particularly for use with kaldi-active-grammar.

Stars: ✭ 14 (-95.41%)

Mutual labels: speech-recognition, speech-to-text

Transformer-Transducer

PyTorch implementation of "Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss" (ICASSP 2020)

Stars: ✭ 61 (-80%)

Mutual labels: end-to-end, speech-recognition

speechrec

A simple speech recognition app using the Web Speech API interfaces