audioslides / audioslides.io

Licence: MIT license
Use Amazon Polly, Google Slides, and FFmpeg to create videos that anyone can update at any time. This project is written in Elixir.

AudioSlides.IO

tl;dr

Generate small videos with spoken text from Google Slides.

AudioSlides uses Amazon Polly, Google Slides, and FFmpeg to create videos that anyone can update at any time. This project is written in Elixir.

The Prototype

For our prototype we decided to give Amazon Polly a try. It has a simple HTTP API that makes it easy to convert text to speech.
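
As a rough sketch of that text-to-speech step, the AWS CLI exposes Polly through its `synthesize-speech` command. This assumes the AWS CLI is installed and configured, which is not how the project itself calls Polly; the helper name `synthesize`, the `AWS_BIN` variable, and the default voice `Joanna` are illustrative assumptions.

```shell
# Sketch only: wrap the AWS CLI's Polly command in a small function.
# synthesize, AWS_BIN and the default voice "Joanna" are illustrative
# assumptions, not names taken from the project.
AWS_BIN="${AWS_BIN:-aws}"

# usage: synthesize TEXT OUTPUT_FILE [VOICE]
synthesize() {
  "$AWS_BIN" polly synthesize-speech \
    --output-format mp3 \
    --voice-id "${3:-Joanna}" \
    --text "$1" \
    "$2"
}
```

With valid AWS credentials, `synthesize "Welcome to Angular 5" slide1.mp3` would write the spoken text to `slide1.mp3`.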

For the visual layer we used Google Slides, because it also provides a good REST API that lets you export a PNG of a slide. The same API returns the speaker notes, which can serve as the input for the Amazon Polly conversion.
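
The slide export can be sketched with the Slides API's page thumbnail endpoint, which returns JSON containing a `contentUrl` for the rendered image. This is a minimal illustration, not the project's Elixir code: the helper names `thumbnail_url` and `fetch_thumbnail_json`, the `CURL_BIN` variable, and the `ACCESS_TOKEN` variable (assumed to hold a valid OAuth 2.0 token) are assumptions.

```shell
# Sketch only: build the Slides API URL that renders one slide as a PNG.
# thumbnail_url and fetch_thumbnail_json are hypothetical helper names.
# usage: thumbnail_url PRESENTATION_ID PAGE_OBJECT_ID
thumbnail_url() {
  printf 'https://slides.googleapis.com/v1/presentations/%s/pages/%s/thumbnail?thumbnailProperties.mimeType=PNG&thumbnailProperties.thumbnailSize=LARGE\n' "$1" "$2"
}

# Request the thumbnail metadata; the JSON response holds a contentUrl
# field pointing at the image. ACCESS_TOKEN is an assumed OAuth 2.0 token.
# usage: fetch_thumbnail_json PRESENTATION_ID PAGE_OBJECT_ID
fetch_thumbnail_json() {
  "${CURL_BIN:-curl}" -s \
    -H "Authorization: Bearer $ACCESS_TOKEN" \
    "$(thumbnail_url "$1" "$2")"
}
```

The image itself is then downloaded from the returned `contentUrl`. The speaker notes are not part of this response; they come from the presentation resource, where each slide carries a notes page.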

The last step is to combine the generated voice output with the exported PNG image and produce a short video sequence. For this we used FFmpeg, a handy command-line tool. The basic processing looks something like this:

Video Generation Process
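
The final combining step might look roughly like the following FFmpeg call, which loops the still image for as long as the audio lasts. The exact options the project passes to FFmpeg are not shown here, so the codec choices (`libx264`, `aac`), the helper name `slide_to_video`, and the `FFMPEG_BIN` variable are assumptions.

```shell
# Sketch only: turn one slide image plus one audio file into a video.
# slide_to_video and FFMPEG_BIN are hypothetical names; the codec
# options are a common choice, not necessarily the project's.
FFMPEG_BIN="${FFMPEG_BIN:-ffmpeg}"

# usage: slide_to_video IMAGE AUDIO OUTPUT
slide_to_video() {
  # -loop 1 repeats the still image; -shortest stops when the audio ends.
  "$FFMPEG_BIN" -y -loop 1 -i "$1" -i "$2" \
    -c:v libx264 -tune stillimage -pix_fmt yuv420p \
    -c:a aac -shortest "$3"
}
```

For example, `slide_to_video slide1.png slide1.mp3 slide1.mp4` would produce one clip per slide; the per-slide clips can then be joined into the final video.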

Example Input & Output

As shown above, we need a Google Slides presentation to start from. My input is a short slide deck about the release of Angular version 5.

Google Slides as Input

Angular 5 explained by AudioSlides

Generated Video as Output

Angular 5 explained by AudioSlides

How to start the project

To start your Phoenix server:

  • Install dependencies with mix deps.get
  • Create and migrate your database with mix ecto.create && mix ecto.migrate
  • Install Node.js dependencies with cd assets && npm install
  • Start Phoenix endpoint with mix s

Now you can visit localhost:4000 from your browser.

Use with docker

Build the container

docker build -t audioslides .

Run via docker compose

Init the database

docker-compose run web mix ecto.setup

Run database + project

docker compose up

How to test

Run all tests

mix t

Run all tests, including the integration tests (these call ffmpeg and write files)

mix test.integration