
mobilequickie / AmazonSpeechTranslator

License: Apache-2.0
End-to-end Solution for Speech Recognition, Text Translation, and Text-to-Speech for iOS using Amazon Translate and Amazon Polly as AWS Machine Learning managed services.


Speech Recognition, Translation, and Text-to-Speech on iOS


This is a simple yet capable solution for demonstrating machine learning on mobile using managed cloud services. The app performs speech recognition with Apple's Speech framework, translates the recognized text with Amazon Translate, and uses Amazon Polly speech synthesis to read the translated text back!
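
To make the flow described above concrete, here is a minimal sketch of the three stages chained together. Every type and method name in it (`SpeechRecognizing`, `TextTranslating`, `SpeechSynthesizing`, `SpeechTranslationPipeline`, and the stubs) is hypothetical and chosen for illustration; none of it is taken from the sample app or from the AWS SDK. In the real app the stages would be backed by the Speech framework, Amazon Translate, and Amazon Polly; the stubs here only stand in for them so the control flow can be run on its own.

```swift
import Foundation

// Hypothetical stage interfaces; the sample app's actual types may differ.
protocol SpeechRecognizing {
    func transcribe(completion: @escaping (Result<String, Error>) -> Void)
}

protocol TextTranslating {
    func translate(_ text: String, from source: String, to target: String,
                   completion: @escaping (Result<String, Error>) -> Void)
}

protocol SpeechSynthesizing {
    func speak(_ text: String, languageCode: String,
               completion: @escaping (Result<Void, Error>) -> Void)
}

// Chains recognition -> translation -> synthesis and stops at the first failure.
struct SpeechTranslationPipeline {
    let recognizer: SpeechRecognizing
    let translator: TextTranslating
    let synthesizer: SpeechSynthesizing

    func run(from source: String, to target: String,
             completion: @escaping (Result<String, Error>) -> Void) {
        recognizer.transcribe { recognized in
            switch recognized {
            case .failure(let error):
                completion(.failure(error))
            case .success(let transcript):
                self.translator.translate(transcript, from: source, to: target) { translated in
                    switch translated {
                    case .failure(let error):
                        completion(.failure(error))
                    case .success(let text):
                        // Report the translated text once it has been spoken.
                        self.synthesizer.speak(text, languageCode: target) { spoken in
                            completion(spoken.map { _ in text })
                        }
                    }
                }
            }
        }
    }
}

// Stubs standing in for the Speech framework, Amazon Translate, and Amazon Polly.
struct StubRecognizer: SpeechRecognizing {
    func transcribe(completion: @escaping (Result<String, Error>) -> Void) {
        completion(.success("hello"))
    }
}

struct StubTranslator: TextTranslating {
    func translate(_ text: String, from source: String, to target: String,
                   completion: @escaping (Result<String, Error>) -> Void) {
        completion(.success("[\(source)->\(target)] \(text)"))
    }
}

struct StubSynthesizer: SpeechSynthesizing {
    func speak(_ text: String, languageCode: String,
               completion: @escaping (Result<Void, Error>) -> Void) {
        completion(.success(()))
    }
}

let pipeline = SpeechTranslationPipeline(recognizer: StubRecognizer(),
                                         translator: StubTranslator(),
                                         synthesizer: StubSynthesizer())
pipeline.run(from: "en", to: "es") { result in
    if case .success(let spokenText) = result { print(spokenText) }
}
```

Keeping each stage behind its own protocol is one possible design choice: it lets the cloud-backed stages be swapped for stubs when testing without network access or AWS credentials.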


Watch the entire app unfold
YouTube - Mobile Quickie

Getting Started

Of all the AWS services, Amazon Translate is by far the easiest to integrate into your app, and Amazon Polly is a close second. So, if you have never used AWS before and want to try adding some machine learning to your mobile app, now is the time! Backend and client configuration together take less than 5 minutes.

Architectural Diagram

There are two easy steps to building this solution:
Part 1. Configure the backend by creating an Amazon Cognito Identity Pool and IAM Role(s), and by adding permissions to those roles for accessing Amazon Translate and Amazon Polly directly from a mobile app.
Part 2. Create a mobile app to showcase natural language processing by cloning my sample app from GitHub and configuring it to use the values created in Part 1.

PART 1: Configure Backend (1 minute)

I created a CloudFormation template to automate the creation of a Cognito Identity Pool, IAM Role(s), and permissions. The other services (Translate & Polly) do not require any backend configuration and will be called directly from our mobile app. Note: Creating a CloudFormation Stack to provision the above AWS resources is FREE.
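
For orientation, the permissions such a setup needs can be sketched as an IAM policy document. The template itself is not reproduced in this README, so the following is an assumption about the minimum it would have to grant, not a copy of it: the action names `translate:TranslateText` and `polly:SynthesizeSpeech` are the IAM actions for the two calls the app makes, while the wildcard resource and the Swift type names (`PolicyStatement`, `PolicyDocument`, `renderPolicyJSON`) are illustrative choices.

```swift
import Foundation

// Illustrative model of an IAM policy document (names are hypothetical).
struct PolicyStatement: Codable {
    let effect: String
    let action: [String]
    let resource: String

    enum CodingKeys: String, CodingKey {
        case effect = "Effect"
        case action = "Action"
        case resource = "Resource"
    }
}

struct PolicyDocument: Codable {
    let version: String
    let statement: [PolicyStatement]

    enum CodingKeys: String, CodingKey {
        case version = "Version"
        case statement = "Statement"
    }
}

// Assumed minimum permissions for calling Translate and Polly from the app.
let policy = PolicyDocument(
    version: "2012-10-17",
    statement: [
        PolicyStatement(effect: "Allow",
                        action: ["translate:TranslateText", "polly:SynthesizeSpeech"],
                        resource: "*")
    ]
)

// Renders the policy as pretty-printed JSON with stable key order.
func renderPolicyJSON(_ policy: PolicyDocument) throws -> String {
    let encoder = JSONEncoder()
    encoder.outputFormatting = [.prettyPrinted, .sortedKeys]
    let data = try encoder.encode(policy)
    return String(decoding: data, as: UTF8.self)
}

if let json = try? renderPolicyJSON(policy) {
    print(json)
}
```

Comparing the printed JSON with the policies attached to the roles the stack creates is a quick way to confirm what the app is actually allowed to call.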

  1. Click on the Launch Stack button

    Launch Stack

    This will launch the AWS CloudFormation Console, pass in the template, create a new stack, and automate the creation of a Cognito Identity Pool and the associated unauthenticated & authenticated IAM Roles, along with policies for accessing Amazon Translate and Amazon Polly directly from a mobile app.

  2. Click Next on the Select Template page

  3. Click Next

  4. On the Options page, leave all the defaults and click Next

  5. On the Review page, check the box to acknowledge that CloudFormation will create IAM resources and click Create.

  6. Wait for the speechtranslator-stack stack to reach a status of CREATE_COMPLETE

  7. With the speechtranslator-stack selected, click on the Outputs tab and you should see three rows. We only need the identityPoolId for now.

  8. Copy the Value for just the identityPoolId as we’ll be pasting this value into the AWSConfiguration.json file in our Xcode project.
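
Because the Outputs tab lists three values, it is easy to copy the wrong one. The sketch below is a small sanity check for the copied string before it goes into the project. It assumes identity pool IDs follow the `region:identifier` shape, matched here with the pattern `[\w-]+:[0-9a-f-]+` and a 55-character limit, which is the constraint the Amazon Cognito API reference gives for `IdentityPoolId` as far as I can tell. The function name `isPlausibleIdentityPoolId` and the sample values are made up for illustration.

```swift
import Foundation

// Returns true when the string has the general shape of a Cognito identity pool ID,
// e.g. "us-east-1:" followed by a lowercase hexadecimal GUID.
// This checks shape only; it cannot tell whether the pool actually exists.
func isPlausibleIdentityPoolId(_ value: String) -> Bool {
    // Reject stray spaces or newlines picked up while copying.
    guard value.rangeOfCharacter(from: .whitespacesAndNewlines) == nil else { return false }
    guard !value.isEmpty, value.count <= 55 else { return false }
    let pattern = "^[\\w-]+:[0-9a-f-]+$"
    return value.range(of: pattern, options: .regularExpression) != nil
}

// Made-up sample values, not real resources.
let copiedPoolId = "us-east-1:12345678-90ab-cdef-1234-567890abcdef"
let copiedRoleArn = "arn:aws:iam::123456789012:role/ExampleRole"

print(isPlausibleIdentityPoolId(copiedPoolId))   // shaped like a pool ID
print(isPlausibleIdentityPoolId(copiedRoleArn))  // a role ARN, not a pool ID
```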

PART 2: Mobile Client Setup (3 1/2 minutes)

In this part, we'll clone the repo, install the CocoaPods dependencies, and update the AWSConfiguration.json file with your own identityPoolId generated in PART 1.

  1. Download or clone this project

    $ git clone https://github.com/mobilequickie/AmazonSpeechTranslator.git
    
    $ cd AmazonSpeechTranslator
    
  2. Install CocoaPods and the project's dependencies

    $ sudo gem install cocoapods
    
    $ pod install --repo-update
    
  3. Launch project in Xcode

    $ open SpeechRec.xcworkspace
    
  4. Update AWSConfiguration.json by pasting in your own identityPoolId from the Outputs tab of the CloudFormation stack that you created in Part 1, step #7.

  5. Build and run the app

Requirements

  • CocoaPods 1.5.0+
  • iOS 10.2+ / macOS 10.13+
  • Xcode 9.0+

Built Using

  • See THIRD-PARTY-LICENSES.txt
  • Pulsator - Used for animating the live listener
  • DropDown - Used as a dropdown for selecting the different languages

Author

Dennis Hills (Mobile Quickie) - Initial work

YouTube | Blog | Twitter
