danthelion / Doc2audiobook

Licence: MIT
Convert text documents to high fidelity audio(books).

doc2audiobook.py

Extracts text from a document (with textract) and converts it into natural-sounding synthesised speech (with Google Cloud Text-to-Speech), which can use DeepMind's WaveNet models.
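
Long documents usually cannot be sent to a speech API in a single request, so the extracted text has to be split first. The following is a minimal sketch of that splitting step, not the project's actual code. The function name `split_into_chunks` and the 5000-byte default are assumptions for illustration; check the current Cloud Text-to-Speech quota documentation for the real per-request limit.

```python
from typing import List


def split_into_chunks(text: str, max_bytes: int = 5000) -> List[str]:
    """Split text into pieces whose UTF-8 encoding fits within max_bytes.

    Words are kept intact where possible. A single word longer than
    max_bytes is hard-split by characters as a fallback.
    max_bytes=5000 is an assumed per-request limit, not taken from the project.
    """
    if max_bytes < 4:
        # One UTF-8 character can take up to 4 bytes.
        raise ValueError("max_bytes must be at least 4")

    chunks: List[str] = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}" if current else word
        if len(candidate.encode("utf-8")) <= max_bytes:
            current = candidate
            continue
        # The candidate is too large: flush what we have so far.
        if current:
            chunks.append(current)
        # The word on its own may still be too large, so build it up
        # character by character and flush whenever the limit is reached.
        piece = ""
        for ch in word:
            if len((piece + ch).encode("utf-8")) > max_bytes:
                chunks.append(piece)
                piece = ch
            else:
                piece += ch
        current = piece
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk could then be synthesised separately and the resulting audio segments concatenated in order.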

Example

Input Output

Available source formats (from textract)

  • .csv
  • .doc
  • .docx
  • .eml
  • .epub
  • .gif
  • .jpg and .jpeg
  • .json
  • .html and .htm
  • .mp3
  • .msg
  • .odt
  • .ogg
  • .pdf
  • .png
  • .pptx
  • .ps
  • .rtf
  • .tiff
  • .txt
  • .wav
  • .xlsx
  • .xls
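
To skip unsupported files before starting a conversion, the list above can be turned into a simple extension check. This is a sketch only; `SUPPORTED_EXTENSIONS` and `is_supported` are hypothetical names, and the set simply mirrors the list in this README rather than querying textract.

```python
from pathlib import Path

# Extensions copied from the list above (hypothetical constant name).
SUPPORTED_EXTENSIONS = frozenset({
    ".csv", ".doc", ".docx", ".eml", ".epub", ".gif", ".jpg", ".jpeg",
    ".json", ".html", ".htm", ".mp3", ".msg", ".odt", ".ogg", ".pdf",
    ".png", ".pptx", ".ps", ".rtf", ".tiff", ".txt", ".wav", ".xlsx", ".xls",
})


def is_supported(path: str) -> bool:
    """Return True if the file's final extension is in the supported list."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

The check is case-insensitive and looks only at the last suffix, so `archive.tar.gz` is treated as `.gz`.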

Prerequisites

GCP

  1. Select or create a Google Cloud Platform project.
  2. Enable billing for your project.
  3. Enable the Cloud Text-to-Speech API.
  4. Set up authentication using a service account.
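
After step 4 you should have a JSON key file for the service account. A quick sanity check can catch a wrong or truncated download before it is mounted into the container. The sketch below assumes the usual service account key layout; the helper names `key_problems` and `check_key_file` are hypothetical and not part of this project.

```python
import json
from typing import Any, Dict, List

# Fields normally present in a service account key file.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def key_problems(data: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a parsed key file (empty if none)."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in data]
    if "type" in data and data["type"] != "service_account":
        problems.append(
            f"unexpected type: {data['type']!r} (expected 'service_account')"
        )
    return problems


def check_key_file(path: str) -> List[str]:
    """Load a key file from disk and report any problems."""
    with open(path, encoding="utf-8") as handle:
        return key_problems(json.load(handle))
```

An empty list means the file has the expected shape; it does not prove that the key is valid or that the API is enabled.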

Host Machine

  1. Docker
  2. /doc2audiobook/data/input: directory to hold all input files.
  3. /doc2audiobook/data/output: directory to store all output files.
  4. /doc2audiobook/.secrets/client_secret.json: GCP service account key file used for authentication.
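
The host layout above can be verified with a few lines of Python before building the image. This is a minimal sketch; `expected_layout` and `missing_paths` are hypothetical helpers, and the default root `/doc2audiobook` is taken from the paths listed above.

```python
from pathlib import Path
from typing import Dict, List


def expected_layout(root: str = "/doc2audiobook") -> Dict[str, Path]:
    """Return the paths this README expects to exist under root."""
    base = Path(root)
    return {
        "input": base / "data" / "input",
        "output": base / "data" / "output",
        "secret": base / ".secrets" / "client_secret.json",
    }


def missing_paths(root: str = "/doc2audiobook") -> List[str]:
    """List expected directories and files that are not present."""
    layout = expected_layout(root)
    missing = []
    for name in ("input", "output"):
        if not layout[name].is_dir():
            missing.append(str(layout[name]))
    if not layout["secret"].is_file():
        missing.append(str(layout["secret"]))
    return missing
```

Calling `missing_paths()` with no arguments checks the default location; pass another root if your checkout lives elsewhere.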

Build

git clone git@github.com:danthelion/doc2audiobook.git
cd doc2audiobook
docker build -t doc2audiobook .

Run

Before running, put your documents in the host input directory (/doc2audiobook/data/input), which sits under the folder mapped to /data.

List available voices

docker run \
-v /doc2audiobook/data:/data:rw \
-v /doc2audiobook/.secrets/client_secret.json:/.secrets/client_secret.json:ro \
doc2audiobook --list-voices

Convert all documents in the mapped input folder to audiobooks using the en-GB-Standard-C voice.

docker run \
-v /doc2audiobook/data:/data:rw \
-v /doc2audiobook/.secrets/client_secret.json:/.secrets/client_secret.json:ro \
doc2audiobook --voice en-GB-Standard-C

Convert a single document in the mapped input folder to an audiobook using the en-GB-Standard-C voice.

docker run \
-v /doc2audiobook/data:/data:rw \
-v /doc2audiobook/.secrets/client_secret.json:/.secrets/client_secret.json:ro \
doc2audiobook --voice en-GB-Standard-C --input test_input.txt
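
If you run conversions from a script, the docker commands above can be assembled programmatically. The sketch below only builds the argument list, using the volume mappings and the `--voice` and `--input` options shown in this README; the function name `build_docker_command` and its parameters are hypothetical.

```python
from typing import List, Optional


def build_docker_command(
    host_root: str = "/doc2audiobook",
    voice: Optional[str] = None,
    input_file: Optional[str] = None,
    image: str = "doc2audiobook",
) -> List[str]:
    """Build the `docker run` argument list used in the examples above."""
    cmd = [
        "docker", "run",
        # Input and output documents, read-write.
        "-v", f"{host_root}/data:/data:rw",
        # Service account key, read-only.
        "-v", f"{host_root}/.secrets/client_secret.json:/.secrets/client_secret.json:ro",
        image,
    ]
    if voice:
        cmd += ["--voice", voice]
    if input_file:
        cmd += ["--input", input_file]
    return cmd
```

The result can be passed to `subprocess.run(cmd, check=True)`. Passing a list rather than a shell string avoids quoting problems with file names that contain spaces.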