audioslides / audioslides.io

Licence: MIT license

Use Amazon Polly, Google Slides and FFmpeg to create videos that anyone can update at any time. This project is written in Elixir.

Programming Languages

elixir

2628 projects

HTML

75241 projects

CSS

56736 projects

javascript

184084 projects - #8 most used programming language

shell

77523 projects

Dockerfile

14818 projects

Projects that are alternatives of or similar to audioslides.io

AmazonSpeechTranslator

End-to-end Solution for Speech Recognition, Text Translation, and Text-to-Speech for iOS using Amazon Translate and Amazon Polly as AWS Machine Learning managed services.

Stars: ✭ 50 (+163.16%)

Mutual labels: speech-synthesis, amazon-polly

page object

Page Objects for Hound / Elixir

Stars: ✭ 19 (+0%)

Mutual labels: elixir-phoenix

ttsflow

tensorflow speech synthesis c++ inference for voicenet

Stars: ✭ 17 (-10.53%)

Mutual labels: speech-synthesis

sova-tts-engine

Tacotron2 based engine for the SOVA-TTS project

Stars: ✭ 63 (+231.58%)

Mutual labels: speech-synthesis

versionary

Plug for API versioning

Stars: ✭ 40 (+110.53%)

Mutual labels: elixir-phoenix

NanoFlow

PyTorch implementation of the paper "NanoFlow: Scalable Normalizing Flows with Sublinear Parameter Complexity." (NeurIPS 2020)

Stars: ✭ 63 (+231.58%)

Mutual labels: speech-synthesis

Wavegrad

Implementation of Google Brain's WaveGrad high-fidelity vocoder (paper: https://arxiv.org/pdf/2009.00713.pdf). First implementation on GitHub.

Stars: ✭ 245 (+1189.47%)

Mutual labels: speech-synthesis

leafblower

Play Cards Against Humanity online with friends!

Stars: ✭ 29 (+52.63%)

Mutual labels: elixir-phoenix

obplayer

📻 OBPlayer Streaming Automation Playout with CAP EAS Alerting

Stars: ✭ 93 (+389.47%)

Mutual labels: polly-voice

GlottDNN

GlottDNN vocoder and tools for training DNN excitation models

Stars: ✭ 30 (+57.89%)

Mutual labels: speech-synthesis

IMS-Toucan

Text-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart. Objectives of the development are simplicity, modularity, controllability and multilinguality.

Stars: ✭ 295 (+1452.63%)

Mutual labels: speech-synthesis

idear

🎙️ Handsfree Audio Development Interface

Stars: ✭ 84 (+342.11%)

Mutual labels: speech-synthesis

react-native-spokestack

Spokestack: give your React Native app a voice interface!

Stars: ✭ 53 (+178.95%)

Mutual labels: speech-synthesis

voder

An emulation of the Voder Speech Synthesizer.

Stars: ✭ 19 (+0%)

Mutual labels: speech-synthesis

TFGAN

TFGAN: Time and Frequency Domain Based Generative Adversarial Network for High-fidelity Speech Synthesis

Stars: ✭ 65 (+242.11%)

Mutual labels: speech-synthesis

tacotron2

Pytorch implementation of "Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions", ICASSP, 2018.

Stars: ✭ 17 (-10.53%)

Mutual labels: speech-synthesis

sam

Software Automatic Mouth - Tiny Speech Synthesizer

Stars: ✭ 316 (+1563.16%)

Mutual labels: speech-synthesis

wiki2ssml

Wiki2SSML provides the WikiVoice markup language used for fine-tuning synthesised voice.

Stars: ✭ 31 (+63.16%)

Mutual labels: speech-synthesis

Catch-A-Waveform

Official pytorch implementation of the paper: "Catch-A-Waveform: Learning to Generate Audio from a Single Short Example" (NeurIPS 2021)

Stars: ✭ 117 (+515.79%)

Mutual labels: speech-synthesis

suomi.dev

Like Hacker News, but for Finns!

Stars: ✭ 27 (+42.11%)

Mutual labels: elixir-phoenix


AudioSlides.IO

Articles

Produce Easy-to-Update Video Courses with Speech Synth by Robin Böhm

tl;dr

Generate small videos with spoken text from Google Slides.

The project uses Amazon Polly, Google Slides and FFmpeg to create videos that anyone can update at any time. It is written in Elixir.

The Prototype

For our prototype we decided to give Amazon Polly a try. It has a simple HTTP API that makes it easy to convert text to speech.

For the visual layer we used Google Slides, because it also provides a good REST API that lets you export a slide as a PNG. The same API returns the speaker notes, which can serve as the input for the Amazon Polly conversion.

The last step is to combine the generated voice output with the exported PNG image and produce a short video sequence. For this we used FFmpeg, a handy command-line tool. So the basic processing is: read the slide image and speaker notes from Google Slides, turn the notes into speech with Amazon Polly, and merge image and audio into a video with FFmpeg.
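
The last two steps can be sketched with the AWS CLI and FFmpeg. This is a minimal illustration, not the project's actual implementation (which is written in Elixir and not shown here). The function names, file names, the voice "Joanna", the FFmpeg encoding flags, and the DRY_RUN switch are all assumptions made for the example.

```shell
#!/bin/sh
# Sketch of the per-slide steps: speaker notes -> speech -> video.
# Function names, file names, and the DRY_RUN switch are hypothetical.

# Print the command instead of running it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Convert speaker notes to an MP3 with Amazon Polly via the AWS CLI.
# usage: synthesize NOTES_TEXT OUT_MP3
synthesize() {
  # The voice "Joanna" is an assumed choice; any Polly voice works.
  run aws polly synthesize-speech --output-format mp3 --voice-id Joanna --text "$1" "$2"
}

# Combine a still PNG and the MP3 into an MP4.
# -loop 1 repeats the image; -shortest stops when the audio ends.
# usage: render SLIDE_PNG SPEECH_MP3 OUT_MP4
render() {
  run ffmpeg -y -loop 1 -i "$1" -i "$2" -c:v libx264 -tune stillimage -pix_fmt yuv420p -c:a aac -shortest "$3"
}
```

Calling render with DRY_RUN=1 set prints the FFmpeg command without executing it, which is a convenient way to check the arguments before AWS credentials or FFmpeg are available. Exporting the slide PNG and reading the speaker notes from Google Slides requires an authenticated API call and is left out of this sketch.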

Example Input & Output

As shown before, we need a Google Slides presentation to start from. My input will be a short slide deck about the new release of Angular version 5.

Google Slides as Input

Generated Video as Output

How to start the project

To start your Phoenix server:

Install dependencies with mix deps.get
Create and migrate your database with mix ecto.create && mix ecto.migrate
Install Node.js dependencies with cd assets && npm install
Start Phoenix endpoint with mix s

Now you can visit localhost:4000 from your browser.

Use with docker

Build the container

docker build -t audioslides .

Run via docker compose

Init the database

docker-compose run web mix ecto.setup

Run database + project

docker compose up

How to test

Run all tests

mix t

Run all tests, including the integration tests (these use ffmpeg and write files)

mix test.integration
