sekwiatkowski / Awesome Ai Services

An overview of the AI-as-a-service landscape

awesome-ai-services

Table of Contents

Natural Language

Speech

Vision

Natural Language

Entity Recognition

Sample input

Amazon Comprehend

General: Overview | Sample output | UI | Pricing

JavaScript: NPM | Node

JVM: Maven | Java | Kotlin

  • What are the entities mentioned in the document?
  • What are their types?
  • How often is each of these entities mentioned?

Supported entity types: commercial items, dates, events, locations, organizations, persons, quantities, other types, titles
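The third question above (how often each entity is mentioned) can also be answered by tallying mentions on the client. A minimal sketch, assuming a hypothetical list of mentions with `text` and `type` fields; this shape is for illustration only and is not taken from any SDK response.

```javascript
// Hypothetical input: one record per entity mention found in a document.
// The field names `text` and `type` are assumptions for illustration.
const mentions = [
  { text: "Amazon", type: "ORGANIZATION" },
  { text: "Seattle", type: "LOCATION" },
  { text: "Amazon", type: "ORGANIZATION" },
];

// Tally how often each (type, text) pair occurs.
function countEntityMentions(items) {
  const counts = new Map();
  for (const { text, type } of items) {
    const key = `${type}:${text}`;
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}

const entityCounts = countEntityMentions(mentions);
console.log(Object.fromEntries(entityCounts));
```

Keying on both type and text keeps, say, a person and an organization with the same name apart.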

Google Cloud Natural Language

General: Overview | Sample output | Demo | Pricing

JavaScript: NPM | Node

JVM: Maven | Java | Kotlin

  • What are the entities mentioned in the document?
  • What are their types?
  • How salient is each of these entities in the document?
  • Where in the text are these entities mentioned?
  • What are the URLs to the corresponding Wikipedia entries?

Supported entity types: consumer good, event, location, organization, person, work of art, other types
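Salience lends itself to ranking: the entities most central to the document come first. A minimal sketch, assuming a hypothetical array of entities that each carry a numeric `salience` score; the sample values are made up.

```javascript
// Hypothetical entities with made-up salience scores (higher = more central).
const entities = [
  { name: "Paris", type: "LOCATION", salience: 0.21 },
  { name: "Eiffel Tower", type: "LOCATION", salience: 0.64 },
  { name: "Gustave Eiffel", type: "PERSON", salience: 0.15 },
];

// Return a new array sorted by descending salience; the input is not mutated.
function rankBySalience(items) {
  return [...items].sort((a, b) => b.salience - a.salience);
}

const ranked = rankBySalience(entities);
console.log(ranked.map((e) => e.name));
```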

IBM Watson Natural Language Understanding

General: Overview | Sample output | Demo | Pricing

JavaScript: NPM | Node

JVM: Maven | Java | Kotlin

  • What are the entities mentioned in the document?
  • What are their types and subtypes?

Supported entity types:

  • Types: anatomy, award, broadcaster, company, crime, drug, email address, facility, geographic feature, health condition, hashtag, ip address, job title, location, movie, music group, natural event, organization, person, print media, quantity, sport, sporting event, television show, twitter handle, vehicle
  • Subtypes

Microsoft Cognitive Services Text Analytics (Preview)

General: Overview | Sample output | Demo | Pricing

JavaScript: Node

JVM: Java | Kotlin

  • What are the entities mentioned in the document?
  • Where in the document are they mentioned?
  • What are the URLs to the corresponding Wikipedia entries?
  • What are their Wikipedia and Bing IDs?

Keyphrase Extraction

Sample input

Amazon Comprehend

General: Overview | Sample output | Demo | Pricing

JavaScript: NPM | Node

JVM: Maven | Java | Kotlin

  • Which keywords can be extracted for the given document?
  • How often does each of these keywords occur?

Google Cloud Natural Language

Not supported

IBM Watson Natural Language Understanding

General: Overview | Sample output | Demo | Pricing

JavaScript: NPM | Node

JVM: Maven | Java | Kotlin

  • Which keywords can be extracted for the given document?

Microsoft Cognitive Services Text Analytics

General: Overview | Sample output | Demo | Pricing

JavaScript: Node

JVM: Java | Kotlin

  • Which keywords can be extracted for the given document?

Machine Translation

Sample input

Amazon Translate

General: Overview | Sample output | UI | Pricing

JavaScript: NPM | Node

JVM: Maven | Java | Kotlin

Support for seven languages

Google Cloud Translation API

General: Overview | Sample output | Demo | Pricing

JavaScript: NPM | Node

JVM: Maven | Java | Kotlin

Support for 98 language pairs in neural machine translation model

IBM Watson Language Translator

General: Overview | Sample output | Demo | Pricing

JavaScript: NPM | Node

JVM: Maven | Java | Kotlin

Support for 33 language pairs

Microsoft Cognitive Services Translator Text

General: Overview | Sample output | Pricing

JavaScript: Node

JVM: Java | Kotlin

Support for 39 language pairs

Sentiment Analysis

Overview | Sample input

Amazon Comprehend

General: Overview | Sample output | Demo | Pricing

JavaScript: NPM | Node

JVM: Maven | Java | Kotlin

  • To what extent does the document express an overall positive, negative, neutral or mixed sentiment?
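One way to reduce per-label scores to a single answer is to pick the highest-scoring label. A minimal sketch, assuming a hypothetical plain object of scores between 0 and 1; the key names are illustrative and not any provider's response format.

```javascript
// Hypothetical per-label scores; the values are made up for illustration.
const sentimentScores = { positive: 0.72, negative: 0.05, neutral: 0.2, mixed: 0.03 };

// Return the label with the highest score, or null if no scores are given.
function dominantSentiment(scores) {
  let bestLabel = null;
  let bestScore = -Infinity;
  for (const [label, score] of Object.entries(scores)) {
    if (score > bestScore) {
      bestLabel = label;
      bestScore = score;
    }
  }
  return bestLabel;
}

console.log(dominantSentiment(sentimentScores));
```

The same helper works for services that report only positive, negative and neutral scores, since it does not hard-code the label set.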

Google Cloud Natural Language

General: Overview | Sample output | Demo | Pricing

JavaScript: NPM | Node

JVM: Maven | Java | Kotlin

  • To what extent does the document express an overall positive, negative, neutral or mixed sentiment?

IBM Watson Natural Language Understanding

General: Overview | Sample output | Demo | Pricing

JavaScript: NPM | Node

JVM: Maven | Java | Kotlin

  • To what extent does the document express an overall positive, negative or neutral sentiment?

Microsoft Cognitive Services Text Analytics

General: Overview | Sample output | Demo | Pricing

JavaScript: Node

JVM: Java | Kotlin

  • To what extent does the document express an overall positive, negative or neutral sentiment?

Speech

Speech to Text / Speech Recognition

Sample input

Amazon Transcribe

General: Overview | Sample output | UI | Pricing

JavaScript: NPM | Node

JVM: Maven | Java | Kotlin

Support for US English and Spanish

Google Cloud Speech-to-Text

General: Overview | Sample output | Demo | Pricing

JavaScript: NPM | Node

JVM: Maven | Java | Kotlin

Support for 119 languages/locales

IBM Speech to Text

General: Overview | Sample output | Demo | Pricing

JavaScript: NPM | Node

JVM: Maven | Java | Kotlin

Support for 9 languages

Microsoft Cognitive Services Speech to Text (Preview)

General: Overview | Sample output | Demo | Pricing

JavaScript: Node

JVM: Java | Kotlin

The REST API is limited to utterances of up to 14 seconds.

Support for 8 languages
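Because of the 14-second limit on the REST API, longer recordings have to be split before submission. A minimal sketch of computing segment boundaries, assuming only the duration is known; `splitIntoSegments` is a hypothetical helper, and cutting at fixed offsets can split words, so a real pipeline would more likely cut at pauses.

```javascript
// Limit stated above for the REST API, in seconds.
const MAX_UTTERANCE_SECONDS = 14;

// Compute [start, end) boundaries (in seconds) so that no segment
// exceeds maxSeconds. Hypothetical helper, not part of any SDK.
function splitIntoSegments(totalSeconds, maxSeconds = MAX_UTTERANCE_SECONDS) {
  const segments = [];
  for (let start = 0; start < totalSeconds; start += maxSeconds) {
    segments.push({ start, end: Math.min(start + maxSeconds, totalSeconds) });
  }
  return segments;
}

const segments = splitIntoSegments(30);
console.log(segments);
```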

Text to Speech / Speech Synthesis

Overview | Sample input

Amazon Polly

General: Overview | Sample output | UI | Pricing

JavaScript: NPM | Node

JVM: Maven | Java | Kotlin

34 voices in 25 languages

SSML extensions:

  • Breathing
  • Dynamic Range Compression
  • Speaking softly
  • Timbre
  • Whispering
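These extensions are expressed as SSML tags in the text sent for synthesis. A minimal sketch that builds such markup as a string, assuming the `amazon:effect` tag forms as I recall them from the Polly SSML documentation (`name="whispered"` and `phonation="soft"`); check the current documentation before relying on them. The helper names are hypothetical.

```javascript
// Escape characters reserved in XML so that plain text cannot break the markup.
function escapeXml(text) {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Hypothetical helpers wrapping text in the assumed Polly effect tags.
function whisper(text) {
  return `<amazon:effect name="whispered">${escapeXml(text)}</amazon:effect>`;
}

function softly(text) {
  return `<amazon:effect phonation="soft">${escapeXml(text)}</amazon:effect>`;
}

const ssml = `<speak>${softly("Good evening.")} ${whisper("Keep this a secret & tell no one.")}</speak>`;
console.log(ssml);
```

The resulting string would be passed as the synthesis text with the text type set to SSML; that call is left out here because it requires AWS credentials.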

Google Cloud Text-to-Speech (Beta)

General: Overview | Sample output | Demo | Pricing

JavaScript: NPM | Node

JVM: Maven | Java | Kotlin

28 voices in 14 languages

IBM Watson Text to Speech

General: Overview | Sample output | Demo | Pricing

JavaScript: NPM | Node

JVM: Maven | Java | Kotlin

13 voices in 7 languages

SSML extensions:

  • Good news
  • Apology
  • Uncertainty

Customization:

  • Pitch
  • Glottal tension
  • Breathiness
  • Timbre

Microsoft Cognitive Services Text to Speech (Preview)

General: Overview | Sample output | Demo | Pricing

JavaScript: Node

JVM: Java | Kotlin

80 voices in 32 languages

Customization in private preview

Vision

Face Detection

Sample input

Amazon Rekognition

General: Overview | Sample output | Demo | Pricing

JavaScript: NPM | Node

JVM: Maven | Java | Kotlin

  • Where are the faces and face parts located in the image?
  • What are the age ranges of the persons shown?
  • Are they smiling?
  • Do they wear eyeglasses or sunglasses?
  • What are their genders?
  • Do they have a beard or mustache?
  • Are their eyes or mouth open?
  • Do they express emotions of happiness, sadness, anger, confusion, disgust, surprise or calmness?
  • Given a face image, what other image shows the most similar face?
  • Are the faces in two images of the same person?

Google Cloud Vision

General: Overview | Sample output | Demo | Pricing

JavaScript: NPM | Node

JVM: Maven | Java | Kotlin

  • Where are the faces and face parts located in the image?
  • What is the pose of the faces?
  • Do the faces express the emotional states of joy, sorrow, anger or surprise?
  • Is the person wearing headwear?
  • Is the photo underexposed or blurred?

IBM Watson Visual Recognition

General: Overview | Sample output | Demo | Pricing

JavaScript: NPM | Node

JVM: Maven | Java | Kotlin

  • Where are the faces located in the image?
  • What are the age ranges of the persons shown?
  • What are their genders?

Microsoft Cognitive Services Face

General: Overview | Sample output | Demo | Pricing

JavaScript: Node

JVM: Java | Kotlin

  • Where are the faces and face parts located in the image?
  • Are parts of the faces occluded?
  • What is the pose of the heads?
  • How old are they?
  • What are their genders?
  • Does the face express the emotional states of anger, contempt, disgust, fear, happiness, sadness, surprise or a neutral state?
  • Are they smiling?
  • Is the hair visible? What is the hair color? Or is the person bald?
  • Do they have a moustache, a beard or sideburns?
  • Are they wearing make-up?
  • What kind of accessories is the person wearing, if any?
  • What kind of glasses is the person wearing, if any?
  • Is the photo blurred? What is the exposure level? What is the noise level?
  • Are the faces in two images of the same person?

Text Recognition

Sample input

Amazon Rekognition

General: Overview | Sample output | Demo | Pricing

JavaScript: NPM | Node

JVM: Maven | Java

  • Where in the image file is text located?
  • What is the text content?
  • Which boxes do individual words belong to?
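The last question implies a parent-child relation between detected lines and the words inside them. A minimal sketch of regrouping words under their lines, assuming a hypothetical flat list in which each word carries the `id` of its line as `parentId`; the field names are illustrative and simplified, not an exact response format.

```javascript
// Hypothetical flat detection list: lines and words, with words pointing
// at their containing line through `parentId`.
const detections = [
  { id: 0, type: "LINE", text: "Hello world" },
  { id: 1, type: "LINE", text: "Goodbye" },
  { id: 2, type: "WORD", text: "Hello", parentId: 0 },
  { id: 3, type: "WORD", text: "world", parentId: 0 },
  { id: 4, type: "WORD", text: "Goodbye", parentId: 1 },
];

// Build a map from line id to the line text and the words it contains.
function wordsByLine(items) {
  const lines = new Map();
  for (const d of items) {
    if (d.type === "LINE") lines.set(d.id, { text: d.text, words: [] });
  }
  for (const d of items) {
    if (d.type === "WORD" && lines.has(d.parentId)) {
      lines.get(d.parentId).words.push(d.text);
    }
  }
  return lines;
}

const grouped = wordsByLine(detections);
console.log(grouped.get(0).words);
```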

Google Cloud Vision

General: Overview | Sample output | Demo | Pricing

JavaScript: NPM | Node

JVM: Maven | Java

  • Where in the image file is text located?
  • What is the text content?
  • What is the language of the text content?

IBM Watson Visual Recognition

This feature is currently in private beta.

Microsoft Cognitive Services Computer Vision

General: Overview | Sample output | Demo | Pricing

JavaScript: Node

JVM: Java

  • Where in the image file is text located?
  • What is the text content?

License

This work is licensed under a Creative Commons Attribution 4.0 International License.
