hadware / voxpopuli

Licence: MIT license
Python wrapper for Espeak and Mbrola, for simple local TTS

Programming Languages

python

Projects that are alternatives of or similar to voxpopuli

JSpeak
A Text to Speech Reader Front-end that Reads from the Clipboard and with Exceptionable Features
Stars: ✭ 16 (-23.81%)
Mutual labels:  voice, espeak, mbrola
vasisualy
Vasisualy it's a simple Russian voice assistant written on Python for GNU/Linux, Windows and Android.
Stars: ✭ 33 (+57.14%)
Mutual labels:  voice, tts-engines
py-espeak-ng
Some simple wrappers around eSpeak NG intended to make using this excellent TTS for waveform and IPA generation as convenient as possible.
Stars: ✭ 27 (+28.57%)
Mutual labels:  espeak, tts-engines
Mad Twinnet
The code for the MaD TwinNet. Demo page:
Stars: ✭ 99 (+371.43%)
Mutual labels:  voice, wav
Francium Voice
Record user voice and encode it as MP3 or WAV
Stars: ✭ 35 (+66.67%)
Mutual labels:  voice, wav
multilingual-g2p
Multilingual Grapheme to Phoneme
Stars: ✭ 40 (+90.48%)
Mutual labels:  espeak, phonemes
liqui
liqui.io api wrapper
Stars: ✭ 22 (+4.76%)
Mutual labels:  wrapper
ftx-api-wrapper-python3
FTX Exchange API wrapper in python3
Stars: ✭ 31 (+47.62%)
Mutual labels:  wrapper
Invoke-Terraform
A cross-platform PowerShell module for downloading and invoking terraform binaries.
Stars: ✭ 14 (-33.33%)
Mutual labels:  wrapper
google-workspace
A unofficial high level Python API wrapper for some of the productivity based Google APIs, that is focused on simplicity.
Stars: ✭ 74 (+252.38%)
Mutual labels:  wrapper
alpine-shellcheck
Docker image for Alpine Linux with latest ShellCheck, a static analysis tool for shell scripts.
Stars: ✭ 12 (-42.86%)
Mutual labels:  wrapper
Pyblox
An API wrapper for Roblox written in Python. (Receives Updates)
Stars: ✭ 30 (+42.86%)
Mutual labels:  wrapper
RxCamera2
Rx Java 2 wrapper for Camera2 google API
Stars: ✭ 27 (+28.57%)
Mutual labels:  wrapper
upx
Node.js cross-platform wrapper for UPX - the ultimate packer for eXecutables.
Stars: ✭ 27 (+28.57%)
Mutual labels:  wrapper
python-sms-activate-ru
Wrapper for automatic SMS receiving by sms-activate.ru
Stars: ✭ 35 (+66.67%)
Mutual labels:  wrapper
uplot-wrappers
React and Vue.js wrappers for uPlot that allow you to work with charts declaratively inside your favorite framework
Stars: ✭ 37 (+76.19%)
Mutual labels:  wrapper
material-yew
Yew wrapper for Material Web Components
Stars: ✭ 116 (+452.38%)
Mutual labels:  wrapper
ok-file-formats
Decoders for PNG, JPEG, WAV, and a few other file formats
Stars: ✭ 72 (+242.86%)
Mutual labels:  wav
discord.bat
🗑️ the BEST discord lib
Stars: ✭ 38 (+80.95%)
Mutual labels:  wrapper
pbwrap
Pastebin API wrapper for Python
Stars: ✭ 19 (-9.52%)
Mutual labels:  wrapper

Voxpopuli


A wrapper around Espeak and Mbrola.

This is a lightweight Python wrapper for Espeak and Mbrola, two co-dependent TTS tools. It lets you render sound by simply feeding it text and voice parameters. Phonemes (the data transmitted by Espeak to Mbrola) can also be manipulated using a minimalistic API.

This is a short introduction; for more detail, see the documentation on Read the Docs.

Install

These instructions should work on any Debian/Ubuntu derivative.

Install with pip as:

pip install voxpopuli

You have to have espeak and mbrola installed beforehand:

sudo apt install mbrola espeak
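
Before going further, it can be handy to confirm that both binaries are actually reachable. The snippet below is a minimal sketch that is not part of voxpopuli: missing_binaries is a hypothetical helper that only looks the two command names up on your PATH.

```python
import shutil


def missing_binaries(names=("espeak", "mbrola")):
    """Return the command names from `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_binaries()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("espeak and mbrola are both on PATH")
```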

You'll also need some Mbrola voices installed. You can either get them from the Mbrola project page and unpack them in /usr/share/mbrola/<lang><voiceid>/, or more simply install them from the Ubuntu repos. All the voice packages are named mbrola-<lang><voiceid>. Even more simply, you can install all the available voices by running:

sudo apt install mbrola-*

In case the voices you need aren't all in the Ubuntu repos, you can use this convenient little script, which installs voices directly from Mbrola's voice repo:

# this installs all british english and french voices for instance
sudo python3 -m voxpopuli.voice_install en fr
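
To see which voices ended up on disk, you can scan the voice directory mentioned above. This is only a sketch under assumptions: installed_voices is a hypothetical helper (not a voxpopuli function), and it assumes every voice lives in a folder named <lang><voiceid> (for instance fr1) directly under /usr/share/mbrola.

```python
import re
from pathlib import Path

# Folder names such as "fr1" or "en1": a language code followed by a voice id.
VOICE_DIR_PATTERN = re.compile(r"^([a-z]+)(\d+)$")


def installed_voices(root="/usr/share/mbrola"):
    """Map each language code to the sorted voice ids found under `root`."""
    voices = {}
    root_path = Path(root)
    if not root_path.is_dir():
        return voices
    for entry in root_path.iterdir():
        match = VOICE_DIR_PATTERN.match(entry.name)
        if entry.is_dir() and match:
            voices.setdefault(match.group(1), []).append(int(match.group(2)))
    return {lang: sorted(ids) for lang, ids in sorted(voices.items())}


if __name__ == "__main__":
    print(installed_voices())
```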

Usage

Picking a voice and making it say things

The simplest use of this lib is bare TTS, using a voice and a text. The rendered audio is returned as a bytes object containing .wav data:

from voxpopuli import Voice
voice = Voice(lang="fr")
wav = voice.to_audio("salut c'est cool")

Evaluating type(wav) would return bytes. You can then save the wav by opening a file in binary write mode ("wb"):

with open("salut.wav", "wb") as wavfile:
    wavfile.write(wav)
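
Since the result is plain .wav data, the standard library can inspect it without touching the disk. The sketch below assumes the bytes are a regular PCM WAV. wav_info and silent_wav are hypothetical helpers, and silent_wav only exists so that the example runs without espeak or mbrola installed; in practice you would pass the bytes returned by voice.to_audio().

```python
import io
import wave


def wav_info(wav_bytes):
    """Return (channels, sample_rate, duration_in_seconds) for WAV bytes."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav_file:
        channels = wav_file.getnchannels()
        rate = wav_file.getframerate()
        duration = wav_file.getnframes() / rate
    return channels, rate, duration


def silent_wav(seconds=0.5, rate=16000):
    """Build a mono 16-bit silent WAV, a stand-in for voice.to_audio()."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav_file.setframerate(rate)
        wav_file.writeframes(b"\x00\x00" * int(seconds * rate))
    return buffer.getvalue()


print(wav_info(silent_wav()))  # prints (1, 16000, 0.5)
```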

If you wish to hear how it sounds right away, make sure pyaudio is installed (via pip), and then do:

voice.say("Salut c'est cool")

You can also, say, use scipy to get the PCM audio as an ndarray:

from io import BytesIO

from scipy.io.wavfile import read, write

rate, wave_array = read(BytesIO(wav))
reversed_wave = wave_array[::-1]  # reverse the samples (name avoids shadowing the built-in reversed)
write("tulas.wav", rate, reversed_wave)

Getting different voices

You can set some parameters on the voice, such as language or pitch:

from voxpopuli import Voice
# really slow voice with high pitch
voice = Voice(lang="us", pitch=99, speed=40, voice_id=2)
voice.say("I'm high on helium")

The exhaustive list of parameters is:

  • lang, a language code among those available (us, fr, en, es, ...). You can list them using the listvoices method from a Voice instance.
  • voice_id, an integer, used to select the voice id for a language. If not specified, the first voice id found for a given language is used.
  • pitch, an integer between 0 and 99 (included)
  • speed, an integer, in words per minute. Default and regular speed is 160 wpm.
  • volume, float ratio applied to the output sample. Some languages have presets that our best specialists tested. Otherwise, defaults to 1.
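
If you build these values from user input, a small check up front can catch mistakes before anything is rendered. The sketch below is not part of voxpopuli: check_voice_params is a hypothetical helper. Only the pitch range and the 160 wpm default come from the list above; requiring a positive speed and a positive volume is an assumption made for the example.

```python
def check_voice_params(pitch, speed=160, volume=1.0):
    """Validate the values and return them as a dict, or raise ValueError."""
    if not isinstance(pitch, int) or not 0 <= pitch <= 99:
        raise ValueError("pitch must be an integer between 0 and 99 (included)")
    # Assumption: a speed in words per minute only makes sense when positive.
    if not isinstance(speed, int) or speed <= 0:
        raise ValueError("speed must be a positive integer (words per minute)")
    # Assumption: the volume ratio is expected to be strictly positive.
    if volume <= 0:
        raise ValueError("volume must be a positive ratio")
    return {"pitch": pitch, "speed": speed, "volume": float(volume)}


print(check_voice_params(pitch=99, speed=40))
```

The returned dict could then be unpacked into the constructor, for instance Voice(lang="us", **params).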

Handling the phonemic form

To render a string of text to audio, the Voice object chains espeak's output to mbrola, which then renders it to audio. Espeak only converts the text to a list of phonemes (such as those of the IPA), which are then processed by mbrola. For those who like pictures, here is a diagram of what happens when you run voice.to_audio("Hello world"):

[Diagram: phonemes]
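
For the curious, the chaining described above can be approximated by hand with subprocess. This is a rough sketch and not voxpopuli's actual implementation. The command lines are assumptions: espeak's --pho and -q options with an mb/mb-<lang><voiceid> voice, mbrola reading phonemes from - and writing WAV data to -.wav, a voice file named like its folder, and a default pitch of 50. Check them against your installed versions before relying on them.

```python
import subprocess


def espeak_command(text, lang="fr", voice_id=1, speed=160, pitch=50):
    """Build an espeak call that prints Mbrola phoneme data instead of speaking."""
    voice = f"mb/mb-{lang}{voice_id}"  # assumed naming of espeak's Mbrola voices
    return ["espeak", "-s", str(speed), "-p", str(pitch), "--pho", "-q", "-v", voice, text]


def mbrola_command(lang="fr", voice_id=1, volume=1.0):
    """Build an mbrola call that reads phonemes on stdin and writes WAV to stdout."""
    # Assumption: the voice file carries the same name as its folder.
    voice_file = f"/usr/share/mbrola/{lang}{voice_id}/{lang}{voice_id}"
    return ["mbrola", "-v", str(volume), "-e", voice_file, "-", "-.wav"]


def text_to_wav(text, lang="fr", voice_id=1):
    """Pipe espeak's phoneme output into mbrola and return mbrola's raw output."""
    espeak = subprocess.Popen(espeak_command(text, lang, voice_id), stdout=subprocess.PIPE)
    mbrola = subprocess.run(
        mbrola_command(lang, voice_id),
        stdin=espeak.stdout,
        stdout=subprocess.PIPE,
        check=True,
    )
    espeak.stdout.close()
    espeak.wait()
    return mbrola.stdout


# Only the command lines are printed here, so the example runs without the binaries.
print(" ".join(espeak_command("salut c'est cool")))
print(" ".join(mbrola_command()))
```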

Phonemes are represented sequentially by a code, a duration in milliseconds, and a list of pitch modifiers. The pitch modifiers are a list of pairs, each pair giving the percentage of the sample at which to apply the pitch modification, and the pitch itself.
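
To make that representation concrete, here is a small stand-alone model of it. SketchPhoneme is a hypothetical class written for this example and is not voxpopuli's own phoneme class. Its to_pho_line method writes the fields in the order used by Mbrola's .pho input: the code, the duration, then the position and pitch values of each pair.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SketchPhoneme:
    """A phoneme code, a duration in ms, and (percent, pitch) modifier pairs."""

    name: str
    duration: int
    pitch_mods: List[Tuple[int, int]] = field(default_factory=list)

    def to_pho_line(self):
        """Render the phoneme as one space-separated line."""
        parts = [self.name, str(self.duration)]
        for percent, pitch in self.pitch_mods:
            parts += [str(percent), str(pitch)]
        return " ".join(parts)


phonemes = [
    SketchPhoneme("_", 100),  # "_" stands for a silence
    SketchPhoneme("a", 120, [(0, 110), (80, 130)]),
]
print("\n".join(p.to_pho_line() for p in phonemes))
# prints:
# _ 100
# a 120 0 110 80 130
```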

Funny thing is, with voxpopuli, you can "intercept" that phoneme list as a simple object, modify it, and then pass it back to the voice to render it to audio. For instance, let's make a simple alteration that triples the duration of each vowel in an English text.

from voxpopuli import Voice, BritishEnglishPhonemes

voice = Voice(lang="en")
# here's how you get the phonemes list
phoneme_list = voice.to_phonemes("Now go away or I will taunt you a second time.") 
for phoneme in phoneme_list:  # the phoneme list object inherits from list
    if phoneme.name in BritishEnglishPhonemes.VOWELS:
        phoneme.duration *= 3
        
# rendering and saving the sound, then saying it out loud:
voice.to_audio(phoneme_list, "modified.wav")
voice.say(phoneme_list)

Notes:

  • For French, Spanish, German and Italian, the phoneme codes used by espeak and mbrola are available as class attributes, similar to the BritishEnglishPhonemes class used above.
  • More info on the phonemes can be found on the SAMPA page.
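
The vowel-lengthening loop above generalizes into a small reusable function. The sketch below relies only on the name and duration attributes used in that loop. stretch is a hypothetical helper, and the SimpleNamespace objects and the vowel set are illustrative stand-ins so that the example runs without espeak or mbrola; with voxpopuli you would pass the list returned by voice.to_phonemes() and BritishEnglishPhonemes.VOWELS instead.

```python
from types import SimpleNamespace


def stretch(phonemes, names, factor):
    """Multiply, in place, the duration of every phoneme whose name is in `names`."""
    for phoneme in phonemes:
        if phoneme.name in names:
            # Assumption: durations are kept as whole milliseconds.
            phoneme.duration = round(phoneme.duration * factor)
    return phonemes


# Illustrative stand-ins for the phonemes of "hello".
hello = [
    SimpleNamespace(name="h", duration=60),
    SimpleNamespace(name="@", duration=50),
    SimpleNamespace(name="l", duration=70),
    SimpleNamespace(name="@U", duration=120),
]
stretch(hello, names={"@", "@U"}, factor=3)
print([(p.name, p.duration) for p in hello])
# prints [('h', 60), ('@', 150), ('l', 70), ('@U', 360)]
```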

What's left to do

  • Moar unit tests
  • Maybe some examples