
IBM / watson-speech-translator

License: Apache-2.0
Use Watson Speech to Text, Language Translator, and Text to Speech in a web app with React components

Programming Languages

JavaScript
184,084 projects - #8 most used programming language
CSS
56,736 projects

Projects that are alternatives of or similar to watson-speech-translator

speech-to-text-code-pattern
React app using the Watson Speech to Text service to transform voice audio into written text.
Stars: ✭ 37 (-43.94%)
Mutual labels:  watson-speech, ibm-cloud, watson-speech-to-text, ibm-cloud-pak
Watson-Unity-ARKit
WARNING: This repository is no longer maintained ⚠️ It will not be updated and will be kept available in read-only mode.
Stars: ✭ 24 (-63.64%)
Mutual labels:  watson-speech, ibm-cloud
watson-multimedia-analyzer
WARNING: This repository is no longer maintained ⚠️ and is kept available in read-only mode. A Node app that uses Watson Visual Recognition, Speech to Text, Natural Language Understanding, and Tone Analyzer to enrich media files.
Stars: ✭ 23 (-65.15%)
Mutual labels:  watson-services, watson-speech
watson-discovery-sdu-with-assistant
Build a Node.js chatbot that uses Watson services and webhooks to query an owner's manual
Stars: ✭ 20 (-69.7%)
Mutual labels:  watson-services, ibm-cloud
extract-textual-insights-from-video
Extract Textual insights from Video
Stars: ✭ 23 (-65.15%)
Mutual labels:  ibm-cloud, watson-speech-to-text
watson-discovery-ui
Develop a fully featured Node.js web app built on the Watson Discovery Service
Stars: ✭ 63 (-4.55%)
Mutual labels:  watson-services, ibm-cloud
smores-react
🍭 Marshmallow React components
Stars: ✭ 34 (-48.48%)
Mutual labels:  react-components
react-sample-projects
The goal of this project is to provide a set of simple samples and a step-by-step guide for getting started with React.
Stars: ✭ 30 (-54.55%)
Mutual labels:  react-components
Fable.SemanticUI
React.SemanticUI to Fable bindings
Stars: ✭ 15 (-77.27%)
Mutual labels:  react-components
material-react-components
React components implementing the Material Design specification
Stars: ✭ 21 (-68.18%)
Mutual labels:  react-components
LMPHP
Multi-language management and support on the site.
Stars: ✭ 19 (-71.21%)
Mutual labels:  language-translation
netflix landing-page
The Netflix.com landing page, built with React 16 and Styled-Components. The build is deployed via Surge.sh for preview.
Stars: ✭ 39 (-40.91%)
Mutual labels:  react-components
safe-dot-android
An app that 🔔 alerts you when a third-party 🕵🏻‍♀️ application uses your device camera or microphone. Privacy Indicators for Android
Stars: ✭ 63 (-4.55%)
Mutual labels:  microphone
react-bmapgl
A React component library wrapping the Baidu Maps JavaScript GL API
Stars: ✭ 68 (+3.03%)
Mutual labels:  react-components
Deep-Viz-Website
The showcase website for the Deep-Viz component library (based on React + Dva + Ant-Design)
Stars: ✭ 12 (-81.82%)
Mutual labels:  react-components
react-carousel-3d
3D carousel component in React
Stars: ✭ 129 (+95.45%)
Mutual labels:  react-components
Command-line-translator
Command-line access to Google Translate and some other features
Stars: ✭ 26 (-60.61%)
Mutual labels:  language-translation
react-components
React components library.
Stars: ✭ 27 (-59.09%)
Mutual labels:  react-components
dash-flexbox-grid
Wrapper around react-flexbox-grid for Plotly Dash
Stars: ✭ 19 (-71.21%)
Mutual labels:  react-components
ios-permissions-service
An easy way to do permissions requests & handling automatically.
Stars: ✭ 25 (-62.12%)
Mutual labels:  microphone

Create a language translator app with voice input and output

In this code pattern, we will create a language translator web app. Built with React components and a Node.js server, the app will capture audio input and stream it to a Watson Speech to Text service. As the input speech is transcribed, it will also be sent to a Watson Language Translator service to be translated into the language you select. Both the transcribed and translated text will be displayed by the app in real time. Each completed phrase will be sent to Watson Text to Speech to be spoken in your choice of locale-specific voices.

The best way to understand the difference between real-time transcription/translation and "completed phrase" vocalization is to try it out. You'll notice that the text is updated as words and phrases are completed and become better understood in context. To avoid backtracking or overlapping audio, only completed phrases are vocalized. These are typically short sentences or utterances, where a pause indicates a break.
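The distinction between interim and completed phrases comes from the Speech to Text results themselves, which mark each hypothesis as interim or final. As a minimal sketch (the message shape follows the Watson Speech to Text API; the handler and callback names are illustrative, not from this repo), an app might vocalize only finalized phrases like this:

```javascript
// Sketch: act only on finalized phrases from Speech to Text results.
// The STT WebSocket sends result messages where `final: false` marks an
// interim hypothesis that may still change, and `final: true` marks a
// completed phrase. Handler and callback names here are illustrative.

function handleRecognizeMessage(message, { onInterim, onFinalPhrase }) {
  for (const result of message.results || []) {
    const transcript = result.alternatives[0].transcript.trim();
    if (result.final) {
      // Completed phrase: safe to translate and speak without backtracking.
      onFinalPhrase(transcript);
    } else {
      // Interim hypothesis: update the on-screen text only.
      onInterim(transcript);
    }
  }
}

// Example usage with stub callbacks:
const spoken = [];
handleRecognizeMessage(
  { results: [{ final: true, alternatives: [{ transcript: 'hello world ' }] }] },
  { onInterim: () => {}, onFinalPhrase: (t) => spoken.push(t) }
);
// spoken is now ['hello world']
```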

For the best live experience, wear headphones to listen to the translated version of what your microphone is listening to. Alternatively, you can use the toggle buttons to record and transcribe first without translating. When ready, select a language and voice and then enable translation (and speech).

When you have completed this code pattern, you will understand how to:

  • Stream audio to Speech to Text using a WebSocket
  • Use Language Translator with a REST API
  • Retrieve and play audio from Speech to Text using a REST API
  • Integrate Speech to Text, Language Translator, and Text to Speech in a web app
  • Use React components and a Node.js server
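As a rough sketch of the first bullet: the Speech to Text WebSocket interface takes an authenticated `/v1/recognize` URL, a JSON start message, binary audio frames, and a stop message. The endpoint shape follows the public Watson STT WebSocket API; the host, token, and model names below are placeholders, not values from this repo.

```javascript
// Sketch: build the pieces needed to stream audio to Speech to Text
// over a WebSocket. Host, token, and model are placeholders.

function buildRecognizeUrl(serviceUrl, accessToken, model) {
  const wsUrl = serviceUrl.replace(/^https/, 'wss');
  return `${wsUrl}/v1/recognize?access_token=${encodeURIComponent(accessToken)}&model=${encodeURIComponent(model)}`;
}

function buildStartMessage() {
  // Ask for interim results so the display updates as words are recognized.
  return JSON.stringify({
    action: 'start',
    'content-type': 'audio/l16;rate=16000',
    interim_results: true,
  });
}

// Usage (assumes an environment with a WebSocket implementation):
// const ws = new WebSocket(buildRecognizeUrl(url, token, 'en-US_BroadbandModel'));
// ws.onopen = () => ws.send(buildStartMessage());
// ws.send(audioChunk);                         // binary PCM frames while recording
// ws.send(JSON.stringify({ action: 'stop' })); // flush the final results
```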

NOTE: This code pattern includes instructions for running Watson services on IBM Cloud or with the Watson API Kit on IBM Cloud Pak for Data. Click here for more information about IBM Cloud Pak for Data.

Architecture diagram

Flow

  1. User presses the microphone button and captures the input audio.
  2. The audio is streamed to Speech to Text using a WebSocket.
  3. The transcribed text from Speech to Text is displayed and updated.
  4. The transcribed text is sent to Language Translator and the translated text is displayed and updated.
  5. Completed phrases are sent to Text to Speech and the result audio is automatically played.
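Step 4 of the flow calls the Language Translator REST API. A minimal sketch of the request, assuming the public Watson Language Translator v3 endpoint (the `version` date, credentials, and function names are illustrative placeholders):

```javascript
// Sketch: translate transcribed text with the Language Translator REST API.
// The endpoint path and `version` date follow the public Watson API docs;
// the apikey and service URL are placeholders.

function buildTranslateRequest(text, source, target) {
  return {
    path: '/v3/translate?version=2018-05-01',
    body: { text: [text], source, target },
  };
}

// Usage with fetch (Basic auth with the literal username "apikey"):
async function translate(serviceUrl, apikey, text, source, target) {
  const { path, body } = buildTranslateRequest(text, source, target);
  const res = await fetch(serviceUrl + path, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: 'Basic ' + Buffer.from('apikey:' + apikey).toString('base64'),
    },
    body: JSON.stringify(body),
  });
  const data = await res.json();
  return data.translations[0].translation;
}
```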

Steps

  1. Create the Watson services
  2. Deploy the server
  3. Use the web app

Create the Watson services

Note: You can skip this step if you will be using the Deploy to Cloud Foundry on IBM Cloud button below. That option automatically creates the services and binds them (providing their credentials) to the application.

Provision the following services:

  • Speech to Text
  • Language Translator
  • Text to Speech

The instructions will depend on whether you are provisioning services using IBM Cloud Pak for Data or on IBM Cloud.

Click to expand one:

IBM Cloud Pak for Data

Use the following instructions for each of the three services.

Install and provision service instances

The services are not available by default. An administrator must install them on the IBM Cloud Pak for Data platform, and you must be given access to the service. To determine whether a service is installed, click the Services icon and check whether the service is enabled.

Gather credentials

  1. For production use, create a user to use for authentication. From the main navigation menu (☰), select Administer > Manage users and then + New user.
  2. From the main navigation menu (☰), select My instances.
  3. On the Provisioned instances tab, find your service instance, and then hover over the last column to find and click the ellipses icon. Choose View details.
  4. Copy the URL to use as the {SERVICE_NAME}_URL when you configure credentials.
  5. Optionally, copy the Bearer token for use in development testing only. Using the bearer token outside of testing and development is not recommended because it does not expire.
  6. Use the Menu and select Users and + Add user to grant your user access to this service instance. This is the user name (and password) you will use when you configure credentials to allow the Node.js server to authenticate.
IBM Cloud

Create the service instances
  • If you do not have an IBM Cloud account, register for a free trial account here.
  • Click here to create a Speech to Text instance.
  • Click here to create a Language Translator instance.
  • Click here to create a Text to Speech instance.
Gather credentials
  1. From the main navigation menu (☰), select Resource list to find your services under Services.
  2. Click on each service to find the Manage view where you can collect the API Key and URL to use for each service when you configure credentials.
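With either option, the Node.js server reads the credentials from environment variables. A sketch of a local `.env` file, assuming the common `<SERVICE>_APIKEY` / `<SERVICE>_URL` naming convention used by the ibm-watson Node SDK (all values are placeholders):

```ini
# Placeholder credentials; variable names follow the ibm-watson SDK
# convention of <SERVICE>_APIKEY / <SERVICE>_URL.
SPEECH_TO_TEXT_APIKEY=<your-stt-apikey>
SPEECH_TO_TEXT_URL=<your-stt-url>
LANGUAGE_TRANSLATOR_APIKEY=<your-lt-apikey>
LANGUAGE_TRANSLATOR_URL=<your-lt-url>
TEXT_TO_SPEECH_APIKEY=<your-tts-apikey>
TEXT_TO_SPEECH_URL=<your-tts-url>
```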

Deploy the server

Click on one of the options below for instructions on deploying the Node.js server.

  • Local
  • OpenShift
  • Cloud Foundry

Use the web app

NOTE: The app was developed using Chrome on macOS. Browser compatibility issues are still being worked out.

Demo animation: watson-speech-translator.gif

  1. Browse to your app URL

    • Use the URL provided at the end of your selected deployment option.
  2. Select a speech recognition model

    • The drop-down will be populated with models supported by your Speech to Text service.
  3. Select an output language and voice

    • The drop-down will only include voices that are supported by your Text to Speech service. The list is also filtered to only show languages that can be translated from the source language using Language Translator.
  4. Use the Speech to Text toggle

    • Use the Speech to Text button (which becomes Stop Listening) to begin recording audio and streaming it to Speech to Text. Press the button again to stop listening/streaming.
  5. Use the Language Translation toggle

    • The Language Translation button (which becomes Stop Translating) is also a toggle. You can leave it enabled to translate while transcribing, or use it after you see the transcribed text that you'd like to translate and say.
  6. Disable Text to Speech

    • By default, the app automatically uses Text to Speech to read the translated output. The checkbox allows you to disable Text to Speech.
  7. Changing the language and voice

    • If you change the voice while language translation is enabled, any current transcribed text will be re-translated (and spoken if enabled).
  8. Resetting the transcribed text

    • The transcribed text will be cleared when you do any of the following:

      • Press Speech to Text to restart listening
      • Refresh the page
      • Change the speech recognition model
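The voice filtering described in step 3 can be sketched as a pure function: keep only the Text to Speech voices whose language appears among the Language Translator targets for the current source language. The voice-name format follows the Watson TTS convention (e.g., `es-ES_EnriqueV3Voice`); the data below is illustrative, not from the app.

```javascript
// Sketch: filter Text to Speech voices to those whose language can be
// reached from the source language via Language Translator.
// Watson voice names look like "es-ES_EnriqueV3Voice"; the leading
// two-letter code is the language.

function filterVoices(voices, translatableTargets) {
  return voices.filter((voice) => {
    const language = voice.name.split('-')[0]; // "es-ES_..." -> "es"
    return translatableTargets.includes(language);
  });
}

// Illustrative data: an English source that can be translated to
// Spanish and French, but not Japanese.
const voices = [
  { name: 'es-ES_EnriqueV3Voice' },
  { name: 'fr-FR_ReneeV3Voice' },
  { name: 'ja-JP_EmiV3Voice' },
];
const selectable = filterVoices(voices, ['es', 'fr']);
// selectable contains only the Spanish and French voices
```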

License

This code pattern is licensed under the Apache License, Version 2. Separate third-party code objects invoked within this code pattern are licensed by their respective providers pursuant to their own separate licenses. Contributions are subject to the Developer Certificate of Origin, Version 1.1 and the Apache License, Version 2.

Apache License FAQ

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].