IBM / speech-to-text-code-pattern

License: Apache-2.0
React app using the Watson Speech to Text service to transform voice audio into written text.

Speech to Text Code Pattern

Sample React app for playing around with the Watson Speech to Text service.

Demo: https://speech-to-text-code-pattern.ng.bluemix.net/

[Architecture diagram]

Flow

  1. The user supplies audio input to the application (running locally, in IBM Cloud, or in IBM Cloud Pak for Data).
  2. The application sends the audio data to the Watson Speech to Text service over a WebSocket connection.
  3. As the data is processed, the Speech to Text service returns the extracted text and other metadata for the application to display.
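
The flow above can be sketched at the protocol level. This is a minimal illustration, not the app's own code (the app may well use an SDK for this step). It assumes the service's WebSocket `/v1/recognize` endpoint and an access token obtained separately. The helper names (`buildRecognizeUrl`, `startMessage`, `stopMessage`, `streamAudio`) are hypothetical, and the model name and content type you pass in are up to you.

```javascript
// Hypothetical helpers illustrating the WebSocket exchange described in the flow.

// Build the wss:// URL for the recognize endpoint from the service URL.
function buildRecognizeUrl(serviceUrl, accessToken, model) {
  // https:// becomes wss:// (and http:// becomes ws://); trailing slashes are dropped.
  const base = serviceUrl.replace(/^http/, 'ws').replace(/\/+$/, '');
  const url = new URL(base + '/v1/recognize');
  url.searchParams.set('access_token', accessToken);
  url.searchParams.set('model', model);
  return url.toString();
}

// Text message that opens a recognition request.
function startMessage(contentType) {
  return JSON.stringify({
    action: 'start',
    'content-type': contentType,
    interim_results: true, // ask for partial results while audio is streaming
  });
}

// Text message that tells the service no more audio will follow.
function stopMessage() {
  return JSON.stringify({ action: 'stop' });
}

// Send start, then each audio chunk, then stop.
// `socket` is any object with a send() method, such as an open WebSocket.
function streamAudio(socket, chunks, contentType) {
  socket.send(startMessage(contentType));
  for (const chunk of chunks) {
    socket.send(chunk);
  }
  socket.send(stopMessage());
}
```

In a browser you would pass an open `WebSocket` created from `buildRecognizeUrl(...)` as `socket` and read transcription results from its `message` events.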

Steps

  1. Provision Watson Speech to Text
  2. Deploy the server
  3. Use the web app

1. Provision Watson Speech to Text

The instructions depend on whether you provision the service on IBM Cloud Pak for Data or on IBM Cloud.

IBM Cloud Pak for Data

Install and provision

The service is not available by default. An administrator must install it on the IBM Cloud Pak for Data platform, and you must be given access to the service. To determine whether the service is installed, click the Services icon and check whether the service is enabled.

Gather credentials

  1. For production use, create a user to use for authentication. From the main navigation menu (☰), select Administer > Manage users and then + New user.
  2. From the main navigation menu (☰), select My instances.
  3. On the Provisioned instances tab, find your service instance, then hover over the last column and click the ellipsis icon. Choose View details.
  4. Copy the URL to use as the SPEECH_TO_TEXT_URL when you configure credentials.
  5. Optionally, copy the Bearer token for development testing only. Avoid using the bearer token outside testing and development, because that token does not expire.
  6. Use the Menu and select Users and + Add user to grant your user access to this service instance. This user provides the SPEECH_TO_TEXT_USERNAME (and SPEECH_TO_TEXT_PASSWORD) you will use when you configure credentials to allow the Node.js server to authenticate.

IBM Cloud

Create the service instance

  • If you do not have an IBM Cloud account, register for a free trial account.
  • Create a Speech to Text instance:
    • Select a region.
    • Select a pricing plan (Lite is free).
    • Set your Service name or use the generated one.
    • Click Create.
  • Gather credentials

If you need to find the service later, use the main navigation menu (☰) and select Resource list to find the service under Services. Click on the service name to get back to the Manage view (where you can collect the API Key and URL).
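
As a rough sketch of how the gathered credentials might be turned into a configuration, the function below picks an authentication mode from environment-style variables. SPEECH_TO_TEXT_URL, SPEECH_TO_TEXT_USERNAME, and SPEECH_TO_TEXT_PASSWORD come from the steps above. SPEECH_TO_TEXT_APIKEY and SPEECH_TO_TEXT_BEARER_TOKEN are assumed names for the IBM Cloud API Key and the Cloud Pak for Data bearer token. The function name `resolveCredentials` and the returned shape are hypothetical, not this project's API.

```javascript
// Hypothetical helper: choose an auth mode from environment-style variables.
// SPEECH_TO_TEXT_APIKEY and SPEECH_TO_TEXT_BEARER_TOKEN are assumed names.
function resolveCredentials(env) {
  const url = env.SPEECH_TO_TEXT_URL;
  if (!url) {
    throw new Error('SPEECH_TO_TEXT_URL is required');
  }
  // IBM Cloud: API Key copied from the service's Manage view.
  if (env.SPEECH_TO_TEXT_APIKEY) {
    return { url, authType: 'iam', apikey: env.SPEECH_TO_TEXT_APIKEY };
  }
  // IBM Cloud Pak for Data: user granted access to the service instance.
  if (env.SPEECH_TO_TEXT_USERNAME && env.SPEECH_TO_TEXT_PASSWORD) {
    return {
      url,
      authType: 'cp4d',
      username: env.SPEECH_TO_TEXT_USERNAME,
      password: env.SPEECH_TO_TEXT_PASSWORD,
    };
  }
  // Development and testing only: this token does not expire.
  if (env.SPEECH_TO_TEXT_BEARER_TOKEN) {
    return {
      url,
      authType: 'bearertoken',
      bearerToken: env.SPEECH_TO_TEXT_BEARER_TOKEN,
    };
  }
  throw new Error('No Speech to Text credentials found');
}
```

On a Node.js server you would typically call `resolveCredentials(process.env)` once at startup, so that a missing value fails early rather than on the first request.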

2. Deploy the server

Deploy the Node.js server using one of the following options:

  • Local
  • OpenShift

3. Use the web app

  • Select an input Language model (defaults to English).

  • Press the Play audio sample button to hear our example audio and watch as it is transcribed.

  • Press the Record your own button to transcribe audio from your microphone. Press the button again to stop (the button label becomes Stop recording).

  • Use the Upload file button to transcribe audio from a file.
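
To illustrate how text can appear progressively while audio is transcribed, here is a small sketch that folds result messages into a displayed transcript. It assumes each message carries a `results` array whose entries have a `final` flag and an `alternatives` list with a `transcript` string. The function names and the state shape are hypothetical and not taken from this app's source.

```javascript
// Hypothetical reducer: fold one result message into the transcript state.
// state has the shape { finals: string[], interim: string }.
function applyResults(state, message) {
  const finals = state.finals.slice();
  let interim = '';
  for (const result of message.results || []) {
    const best = result.alternatives && result.alternatives[0];
    if (!best) continue;
    const text = best.transcript.trim();
    if (result.final) {
      finals.push(text); // finalized text is kept permanently
    } else {
      interim = text; // interim text is replaced by the next message
    }
  }
  return { finals, interim };
}

// Join finalized segments and any pending interim text for display.
function renderTranscript(state) {
  const parts = state.interim ? state.finals.concat([state.interim]) : state.finals;
  return parts.join(' ');
}
```

Each incoming message produces a new state, so a React component could keep the state in `useState` and re-render the transcript as results arrive.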

Developing and testing

See DEVELOPING.md and TESTING.md for more details about developing and testing this app.

License

This code pattern is licensed under the Apache License, Version 2. Separate third-party code objects invoked within this code pattern are licensed by their respective providers pursuant to their own separate licenses. Contributions are subject to the Developer Certificate of Origin, Version 1.1 and the Apache License, Version 2.

Apache License FAQ