m-nathani / speech_to_text

License: AGPL-3.0
Shows how to use the Google Cloud Speech API to transcribe audio/video files.

Programming language: PHP

Google Cloud Speech

These samples show how to use the Google Cloud Speech API to transcribe audio files.

  1. Takes an MP4 file as its argument.
  2. Converts it to audio in FLAC (lossless) encoding and splits the audio into 10-second clips.
  3. Transcribes each 10-second clip and prints the speech-to-text result to the console.
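
The conversion and splitting steps above could be reproduced by hand with ffmpeg roughly as follows. This is a minimal sketch, not the code that speech.php runs: `split_video`, the `FFMPEG` override variable and the `clip_%03d.flac` naming are hypothetical, the 48000 Hz default is taken from the `--rate-hertz` default shown in the help output, and the downmix to mono is an assumption about what the API request expects.

```shell
# Hypothetical helper: convert a video to FLAC audio and split it into clips.
# Usage: split_video <input-video> <output-dir> [clip-seconds] [sample-rate]
split_video() {
  local input="$1" outdir="$2" seconds="${3:-10}" rate="${4:-48000}"
  mkdir -p "$outdir"
  # -vn drops the video stream, -ac 1 downmixes to mono (assumption),
  # -c:a flac selects lossless FLAC, and the segment muxer writes
  # consecutive clips of $seconds each (the last clip may be shorter).
  # FFMPEG can be set to another command (for example echo) for a dry run.
  ${FFMPEG:-ffmpeg} -hide_banner -loglevel error -i "$input" \
    -vn -ac 1 -ar "$rate" -c:a flac \
    -f segment -segment_time "$seconds" \
    "$outdir/clip_%03d.flac"
}
```

For example, `split_video video_file.mp4 clips` would write `clips/clip_000.flac`, `clips/clip_001.flac`, and so on.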

Prerequisites

  1. Set up a Google Cloud Speech project
  2. Install ffmpeg on your machine (Linux)
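
Before installing, it may help to confirm that the required tools are on the PATH. The sketch below is illustrative only: `missing_tools` is a hypothetical helper that is not part of this repository, and the tool names `php` and `composer` are assumed from the commands used elsewhere in this README.

```shell
# Hypothetical helper: print the name of each tool that is not on the PATH.
missing_tools() {
  for tool in "$@"; do
    # command -v exits non-zero when the name cannot be resolved.
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Report anything this project needs that is not installed yet.
missing="$(missing_tools ffmpeg php composer)"
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line.
  echo "Missing tools:" $missing >&2
fi
```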

Installation

Install the dependencies for this library via Composer:

$ cd /path/to/speech_to_text
$ composer install

Configure your project using Application Default Credentials:

$ export GOOGLE_APPLICATION_CREDENTIALS=/path/to/credentials.json
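
A missing or mistyped credentials path tends to surface only when the first API call fails, so a quick check up front can save time. The following is a small sketch under that assumption; `check_credentials` is a hypothetical helper, not something shipped with this project.

```shell
# Hypothetical helper: verify that GOOGLE_APPLICATION_CREDENTIALS points
# at a readable file before running the transcribe command.
check_credentials() {
  if [ -z "${GOOGLE_APPLICATION_CREDENTIALS:-}" ]; then
    echo "GOOGLE_APPLICATION_CREDENTIALS is not set" >&2
    return 1
  fi
  if [ ! -r "$GOOGLE_APPLICATION_CREDENTIALS" ]; then
    echo "Credentials file is not readable: $GOOGLE_APPLICATION_CREDENTIALS" >&2
    return 2
  fi
  echo "Using credentials at $GOOGLE_APPLICATION_CREDENTIALS"
}
```

It could be chained in front of the tool, for example `check_credentials && php speech.php transcribe video_file.mp4`.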

Usage

To run the Speech Samples:

$ php speech.php

Cloud Speech

Usage:
  command [options] [arguments]

Options:
  -h, --help            Display this help message
  -q, --quiet           Do not output any message
  -V, --version         Display this application version
      --ansi            Force ANSI output
      --no-ansi         Disable ANSI output
  -n, --no-interaction  Do not ask any interactive question
  -v|vv|vvv, --verbose  Increase the verbosity of messages: 1 for normal output, 2 for more verbose output and 3 for debug

Available commands:
  help                    Displays help for a command
  list                    Lists commands
  transcribe              Transcribe an video file using Google Cloud Speech API

Help

  $ php speech.php transcribe --help

Usage:
    transcribe [options] [--] <video-file>

Arguments:
    video-file                   The video file to transcribe

Options:
    -l, --language=LANGUAGE      The language to transcribe [default: "en-US"]
    -e, --encoding=ENCODING      The encoding of the audio file. This is required if the encoding is unable to be determined. [default: 2]
    -b, --brand-file=BRAND-FILE  The brand names for speech context to transcribe [default: "brands"]
    -r, --rate-hertz=RATE-HERTZ  The sample rate (in Hertz) of the supplied video [default: 48000]
    -h, --help                   Display this help message
    -q, --quiet                  Do not output any message
    -V, --version                Display this application version
        --ansi                   Force ANSI output
        --no-ansi                Disable ANSI output
    -n, --no-interaction         Do not ask any interactive question
    -v|vv|vvv, --verbose         Increase the verbosity of messages: 1 for normal output, 2 for more verbose output and 3 for debug

Help:
    Transcribe an video file using Google Cloud Speech API
    The transcribe command transcribes video from a file using the
    Google Cloud Speech API.
    
    php speech.php transcribe video_file.mp4

To transcribe a file, pass it to the transcribe command:

php speech.php transcribe [path to audio/video file]
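
The `--language` and `--rate-hertz` options listed in the help output can be combined with the file argument. The sketch below assumes that `--rate-hertz` should match the sample rate of the source audio (the README does not say whether the tool resamples during conversion) and reads that rate with ffprobe. `transcribe_with_probed_rate` and the `FFPROBE` and `PHP` override variables are hypothetical, introduced only for illustration.

```shell
# Hypothetical wrapper: probe the audio sample rate, then pass it to
# the transcribe command through --rate-hertz.
# Usage: transcribe_with_probed_rate <video-file> [language]
transcribe_with_probed_rate() {
  local video="$1" language="${2:-en-US}" rate
  # Ask ffprobe for the sample rate of the first audio stream only.
  rate="$(${FFPROBE:-ffprobe} -v error -select_streams a:0 \
    -show_entries stream=sample_rate \
    -of default=noprint_wrappers=1:nokey=1 "$video")" || return 1
  if [ -z "$rate" ]; then
    echo "No audio stream found in $video" >&2
    return 1
  fi
  ${PHP:-php} speech.php transcribe \
    --language="$language" --rate-hertz="$rate" "$video"
}
```

For example, `transcribe_with_probed_rate video_file.mp4 en-GB` would run the transcribe command with British English and whatever sample rate ffprobe reports.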