Renovamen / Speech And Text

Speech to text (PocketSphinx, iFlytek API, Baidu API) and text to speech (pyttsx3)

Speech-and-Text

Speech to text (supports live microphone input and reading from an audio file):

  • Baidu API
  • iFlytek API
  • SpeechRecognition (CMU PocketSphinx)

Text to speech:

  • pyttsx3

 

Environment

  • Python 3.6.7
  • macOS (all setup steps below are based on macOS; other systems may differ somewhat)

 

Speech to Text

Baidu

Apply for API access at https://cloud.baidu.com/product/speech.

Documentation: http://ai.baidu.com/docs#/ASR-API

Configuration

Install:

pip install baidu-aip

Fill in APP_ID, API_KEY, and SECRET_KEY in speech_to_text_baidu():

APP_ID = ""
API_KEY = ""
SECRET_KEY = ""

(You can also call the REST API directly: Demo)

Usage

from Speech_and_Text import speech_to_text_baidu
# Read from an audio file
speech_to_text_baidu(audio_path="path_of_audio", if_microphone=False)
# Read from the microphone
speech_to_text_baidu(if_microphone=True)
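
For orientation, here is a minimal sketch of what a call through the baidu-aip SDK can look like. It is not the project's actual implementation of speech_to_text_baidu(): the helper names make_client and recognize_pcm are hypothetical, and the sketch assumes 16 kHz, 16-bit mono PCM input and dev_pid 1537 for Mandarin (check the dev_pid table in the Baidu documentation before relying on it).

```python
def make_client(app_id, api_key, secret_key):
    """Create a baidu-aip speech client (requires `pip install baidu-aip`)."""
    # Imported lazily so recognize_pcm() below has no hard dependency on the SDK.
    from aip import AipSpeech
    return AipSpeech(app_id, api_key, secret_key)


def recognize_pcm(client, pcm_bytes, rate=16000, dev_pid=1537):
    """Send raw PCM audio to the recognizer and return the first transcript.

    `client` is any object with an asr(speech, format, rate, options) method,
    which is the call shape of AipSpeech.asr in the SDK.
    """
    response = client.asr(pcm_bytes, "pcm", rate, {"dev_pid": dev_pid})
    # A successful response has err_no == 0 and a list of candidates in "result".
    if response.get("err_no") != 0:
        raise RuntimeError(
            f"Baidu ASR failed: {response.get('err_no')} {response.get('err_msg')}"
        )
    return response["result"][0]
```

With real credentials this would be used as recognize_pcm(make_client(APP_ID, API_KEY, SECRET_KEY), pcm_bytes), where pcm_bytes is the raw content of a PCM file.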

 

iFlytek

Apply for API access at https://www.xfyun.cn/services/voicedictation.

Documentation: https://doc.xfyun.cn/rest_api/index.html

Configuration

Fill in APPID and API_KEY in speech_to_text_ifly():

URL = "http://api.xfyun.cn/v1/service/v1/iat"
APPID = ""
API_KEY = ""

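The caller must also be registered in the iFlytek console (for the REST API, the IP whitelist), otherwise requests are rejected with an error.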
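
To show why both APPID and API_KEY are needed, the sketch below builds the signed headers and form body for the REST endpoint configured above. It follows the v1 REST API scheme as I understand it (X-Param is a base64-encoded JSON object, X-CheckSum is the MD5 of API_KEY + X-CurTime + X-Param); the function name build_ifly_request and the parameter values "sms16k" and "raw" are assumptions, so verify them against the iFlytek documentation.

```python
import base64
import hashlib
import json
import time
from urllib.parse import urlencode


def build_ifly_request(appid, api_key, audio_bytes, cur_time=None):
    """Return (headers, body) for a POST to the iFlytek v1 dictation endpoint."""
    # UTC timestamp in seconds, sent as a string; injectable for reproducibility.
    cur_time = str(int(time.time())) if cur_time is None else str(cur_time)

    # Engine parameters are JSON-encoded, then base64-encoded (assumed values).
    param = {"engine_type": "sms16k", "aue": "raw"}
    param_b64 = base64.b64encode(json.dumps(param).encode("utf-8")).decode("ascii")

    # The checksum signs the request with the API key.
    checksum = hashlib.md5((api_key + cur_time + param_b64).encode("utf-8")).hexdigest()

    headers = {
        "X-Appid": appid,
        "X-CurTime": cur_time,
        "X-Param": param_b64,
        "X-CheckSum": checksum,
        "Content-Type": "application/x-www-form-urlencoded; charset=utf-8",
    }
    # The audio travels as a base64 string in a URL-encoded form field.
    body = urlencode({"audio": base64.b64encode(audio_bytes).decode("ascii")})
    return headers, body
```

The pair can then be sent with any HTTP client, for example requests.post(URL, headers=headers, data=body) if the requests package is installed.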

Usage

from Speech_and_Text import speech_to_text_ifly
# Read from an audio file
speech_to_text_ifly(audio_path="path_of_audio", if_microphone=False)
# Read from the microphone
speech_to_text_ifly(if_microphone=True)

 

SpeechRecognition

This option uses SpeechRecognition, a Python speech recognition library.

Source: https://github.com/Uberi/speech_recognition

 

Configuration

SpeechRecognition

Install:

pip install SpeechRecognition

PyAudio

Required for microphone input.

Homepage: http://people.csail.mit.edu/hubert/pyaudio/

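# Installation on macOS

xcode-select --install	# install the Xcode command line tools; if they are already installed, the command says so

# Install portaudio (a library pyaudio depends on) with Homebrew first, otherwise the build fails with: 'portaudio.h' file not found
brew remove portaudio	# uninstall any existing copy
brew install portaudio	# reinstall

sudo pip install pyaudio	# install pyaudio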

Reference: https://stackoverflow.com/questions/33851379/pyaudio-installation-on-mac-python-3

PocketSphinx

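CMU Sphinx is an open-source speech recognition engine developed at Carnegie Mellon University. It works offline and supports multiple languages, including Chinese.

Source: https://github.com/cmusphinx

The PocketSphinx package installed here is the Python binding for PocketSphinx, the lightweight recognizer in the CMU Sphinx family.

Source: https://github.com/cmusphinx/pocketsphinx-python

Install:

pip install PocketSphinx

Add the Chinese language pack:

Find the install path of the SpeechRecognition package ('/path'):

python -c "import speech_recognition as sr, os.path as p; print(p.dirname(sr.__file__))"

Then download and unpack the Mandarin Chinese language pack, and put the zh-CN folder in '/path/pocketsphinx-data'.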
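
A quick way to confirm the language pack landed in the right place is to check for the folder programmatically. This is a small sketch using only the standard library; the function names speech_recognition_dir and has_language_pack are hypothetical helpers, not part of SpeechRecognition.

```python
import importlib.util
from pathlib import Path


def speech_recognition_dir():
    """Locate the installed speech_recognition package without importing it."""
    spec = importlib.util.find_spec("speech_recognition")
    if spec is None or spec.origin is None:
        return None  # package is not installed in this environment
    return Path(spec.origin).parent


def has_language_pack(package_dir, language="zh-CN"):
    """Return True if <package_dir>/pocketsphinx-data/<language> is a directory."""
    return (Path(package_dir) / "pocketsphinx-data" / language).is_dir()


if __name__ == "__main__":
    package_dir = speech_recognition_dir()
    if package_dir is None:
        print("SpeechRecognition is not installed")
    else:
        print(package_dir, "zh-CN installed:", has_language_pack(package_dir))
```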

 

Usage

from Speech_and_Text import speech_to_text_cmu
# Read from an audio file
speech_to_text_cmu(audio_path="path_of_audio", if_microphone=False)
# Read from the microphone
speech_to_text_cmu(if_microphone=True)
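
For readers who want to see the underlying library calls, the following is a sketch of how offline recognition can be done with SpeechRecognition directly. It is an illustration under assumptions, not the project's speech_to_text_cmu(): the function name transcribe_with_sphinx and the sr_module parameter (which lets a stand-in module be passed for testing) are hypothetical, while Recognizer, AudioFile, Microphone, record, listen, and recognize_sphinx are the library's own names.

```python
def transcribe_with_sphinx(audio_path=None, if_microphone=False,
                           language="zh-CN", sr_module=None):
    """Transcribe a file or live microphone input with PocketSphinx, offline."""
    if sr_module is None:
        # Requires `pip install SpeechRecognition PocketSphinx`.
        import speech_recognition as sr_module

    recognizer = sr_module.Recognizer()
    if if_microphone:
        # Microphone input additionally requires PyAudio.
        with sr_module.Microphone() as source:
            audio = recognizer.listen(source)  # records until a pause is detected
    else:
        if audio_path is None:
            raise ValueError("audio_path is required when if_microphone is False")
        # AudioFile accepts WAV, AIFF, and FLAC files.
        with sr_module.AudioFile(audio_path) as source:
            audio = recognizer.record(source)  # read the whole file

    # language must match a folder under pocketsphinx-data, e.g. zh-CN.
    return recognizer.recognize_sphinx(audio, language=language)
```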

 

Text to Speech

This feature uses pyttsx3, a Python text-to-speech library.

Source: https://github.com/nateshmbhat/pyttsx3

Documentation: https://pyttsx3.readthedocs.io

Configuration

pip install pyttsx3
pip install pyobjc # dependency on macOS

Usage

from Speech_and_Text import text_to_speech
# Example
text_to_speech(sentence = "人类的本质是复读机")
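
As a rough picture of what happens underneath, here is a sketch of speaking a sentence with pyttsx3 directly. The function name speak and its engine and rate parameters are hypothetical and not taken from the project; init, setProperty, say, and runAndWait are pyttsx3's own calls.

```python
def speak(sentence, engine=None, rate=None):
    """Speak `sentence` aloud; blocks until playback has finished."""
    if engine is None:
        # Requires `pip install pyttsx3` (plus pyobjc on macOS).
        import pyttsx3
        engine = pyttsx3.init()
    if rate is not None:
        engine.setProperty("rate", rate)  # speaking rate in words per minute
    engine.say(sentence)    # queue the utterance
    engine.runAndWait()     # process the queue and wait for it to drain
```

Passing an existing engine avoids re-initializing the speech driver when several sentences are spoken in a row.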