NVIDIA / Openseq2seq

Licence: apache-2.0
Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Openseq2seq

Lingvo
Stars: ✭ 2,361 (+71.34%)
Mutual labels:  speech-recognition, seq2seq, language-model, speech-to-text, speech-synthesis
Spokestack Python
Spokestack is a library that allows a user to easily incorporate a voice interface into any Python application.
Stars: ✭ 103 (-92.53%)
Mutual labels:  speech-recognition, speech-to-text, speech-synthesis, text-to-speech
spokestack-ios
Spokestack: give your iOS app a voice interface!
Stars: ✭ 27 (-98.04%)
Mutual labels:  text-to-speech, speech-synthesis, speech-recognition, speech-to-text
Naomi
The Naomi Project is an open source, technology agnostic platform for developing always-on, voice-controlled applications!
Stars: ✭ 171 (-87.59%)
Mutual labels:  speech-recognition, speech-to-text, speech-synthesis, text-to-speech
open-speech-corpora
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
Stars: ✭ 841 (-38.97%)
Mutual labels:  text-to-speech, speech-synthesis, speech-recognition, speech-to-text
Awesome Speech Recognition Speech Synthesis Papers
Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)
Stars: ✭ 2,085 (+51.31%)
Mutual labels:  seq2seq, language-model, speech-synthesis, speech-recognition
leon
🧠 Leon is your open-source personal assistant.
Stars: ✭ 8,560 (+521.19%)
Mutual labels:  text-to-speech, speech-synthesis, speech-recognition, speech-to-text
Awesome Ai Services
An overview of the AI-as-a-service landscape
Stars: ✭ 133 (-90.35%)
Mutual labels:  speech-recognition, speech-to-text, speech-synthesis, text-to-speech
Nemo
NeMo: a toolkit for conversational AI
Stars: ✭ 3,685 (+167.42%)
Mutual labels:  speech-recognition, text-to-speech, speech-synthesis, speech-to-text
Neural sp
End-to-end ASR/LM implementation with PyTorch
Stars: ✭ 408 (-70.39%)
Mutual labels:  speech-recognition, seq2seq, language-model, sequence-to-sequence
Speech recognition with tensorflow
Implementation of a seq2seq model for Speech Recognition using the latest version of TensorFlow. Architecture similar to Listen, Attend and Spell.
Stars: ✭ 253 (-81.64%)
Mutual labels:  speech-recognition, seq2seq, speech-to-text, sequence-to-sequence
web-speech-cognitive-services
Polyfill Web Speech API with Cognitive Services Bing Speech for both speech-to-text and text-to-speech service.
Stars: ✭ 35 (-97.46%)
Mutual labels:  text-to-speech, speech-synthesis, speech-recognition, speech-to-text
react-native-spokestack
Spokestack: give your React Native app a voice interface!
Stars: ✭ 53 (-96.15%)
Mutual labels:  text-to-speech, speech-synthesis, speech-recognition, speech-to-text
AmazonSpeechTranslator
End-to-end Solution for Speech Recognition, Text Translation, and Text-to-Speech for iOS using Amazon Translate and Amazon Polly as AWS Machine Learning managed services.
Stars: ✭ 50 (-96.37%)
Mutual labels:  text-to-speech, speech-synthesis, speech-recognition, speech-to-text
Dragonfire
the open-source virtual assistant for Ubuntu based Linux distributions
Stars: ✭ 1,120 (-18.72%)
Mutual labels:  speech-recognition, speech-to-text, text-to-speech
mongolian-nlp
Useful resources for Mongolian NLP
Stars: ✭ 119 (-91.36%)
Mutual labels:  text-to-speech, speech-recognition, language-model
dynmt-py
Neural machine translation implementation using dynet's python bindings
Stars: ✭ 17 (-98.77%)
Mutual labels:  seq2seq, neural-machine-translation, sequence-to-sequence
RNNSearch
An implementation of attention-based neural machine translation using Pytorch
Stars: ✭ 43 (-96.88%)
Mutual labels:  seq2seq, neural-machine-translation, sequence-to-sequence
ttslearn
ttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)
Stars: ✭ 158 (-88.53%)
Mutual labels:  text-to-speech, speech-synthesis, seq2seq
musicologist
Music advice from a conversational interface powered by Algolia
Stars: ✭ 19 (-98.62%)
Mutual labels:  text-to-speech, speech-recognition, speech-to-text

OpenSeq2Seq

OpenSeq2Seq: toolkit for distributed and mixed precision training of sequence-to-sequence models

OpenSeq2Seq's main goal is to let researchers explore various sequence-to-sequence models as effectively as possible. It achieves this efficiency by fully supporting distributed and mixed-precision training. OpenSeq2Seq is built on TensorFlow and provides the building blocks needed to train encoder-decoder models for neural machine translation, automatic speech recognition, speech synthesis, and language modeling.

Documentation and installation instructions

https://nvidia.github.io/OpenSeq2Seq/

Features

  1. Models for:
    1. Neural Machine Translation
    2. Automatic Speech Recognition
    3. Speech Synthesis
    4. Language Modeling
    5. NLP tasks (sentiment analysis)
  2. Data-parallel distributed training
    1. Multi-GPU
    2. Multi-node
  3. Mixed precision training for NVIDIA Volta/Turing GPUs
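
The feature list does not say how mixed precision training is carried out. One technique commonly used for it is loss scaling: gradients are multiplied by a constant before being stored in half precision so that small values do not underflow, then divided by the same constant before the weight update. The sketch below illustrates only that idea with the standard library (Python 3.6+ for the struct "e" format). The function names and the scale of 2**16 are assumptions made for illustration and are not taken from OpenSeq2Seq.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def scaled_fp16_gradient(grad: float, loss_scale: float) -> float:
    """Hypothetical helper: keep grad * loss_scale in fp16, unscale afterwards."""
    stored = to_fp16(grad * loss_scale)  # value a half-precision backward pass would hold
    return stored / loss_scale           # unscale in higher precision before the update


tiny_grad = 1e-8                 # smaller than the smallest positive fp16 value (about 6e-8)
naive = to_fp16(tiny_grad)       # stored directly in fp16, the gradient is lost
recovered = scaled_fp16_gradient(tiny_grad, loss_scale=2.0 ** 16)

print("without scaling:", naive)
print("with scaling:   ", recovered)
```

Without scaling the gradient is rounded to zero, while the scaled path recovers it to within half-precision rounding error.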

Software Requirements

  1. Python >= 3.5
  2. TensorFlow >= 1.10
  3. CUDA >= 9.0, cuDNN >= 7.0
  4. Horovod >= 0.13 (Horovod is optional, but highly recommended for multi-GPU setups)
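
The minimum versions above can be checked before installation. The sketch below is a minimal, standard-library-only example. The helper names (parse_version, meets_minimum, unmet_requirements) are hypothetical, and the installed versions are passed in as plain strings rather than read from the packages themselves. Versions are compared numerically, because a plain string comparison would wrongly rank "1.9" above "1.10".

```python
import sys

# Minimum versions copied from the requirements list above.
MINIMUMS = {"tensorflow": "1.10", "horovod": "0.13"}


def parse_version(text):
    """Turn '1.10.0' or '1.12.0-rc1' into a tuple of ints such as (1, 10, 0)."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare versions numerically, padding the shorter one with zeros."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def unmet_requirements(installed):
    """Return the names in `installed` that fall below MINIMUMS.

    Packages missing from `installed` are skipped, since Horovod is optional.
    """
    return sorted(
        name for name, minimum in MINIMUMS.items()
        if name in installed and not meets_minimum(installed[name], minimum)
    )


python_ok = sys.version_info >= (3, 5)
print("Python >= 3.5:", python_ok)
print("Unmet:", unmet_requirements({"tensorflow": "1.9.0", "horovod": "0.16.1"}))
```

In practice the version strings would come from each package's own version attribute or from the package manager. CUDA and cuDNN versions are not covered by this sketch.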

Acknowledgments

The speech-to-text workflow uses some parts of the Mozilla DeepSpeech project.

The implementation of the beam search decoder with language model re-scoring (in decoders) is based on Baidu DeepSpeech.

The text-to-text workflow uses some functions from Tensor2Tensor and the Neural Machine Translation (seq2seq) Tutorial.

Disclaimer

This is a research project, not an official NVIDIA product.

Related resources

Paper

If you use OpenSeq2Seq, please cite this paper:

@misc{openseq2seq,
    title={Mixed-Precision Training for NLP and Speech Recognition with OpenSeq2Seq},
    author={Oleksii Kuchaiev and Boris Ginsburg and Igor Gitman and Vitaly Lavrukhin and Jason Li and Huyen Nguyen and Carl Case and Paulius Micikevicius},
    year={2018},
    eprint={1805.10387},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}