
Kyubyong / Cross_vc

Licence: apache-2.0
Cross-lingual Voice Conversion

Cross-lingual Voice Conversion

I wish I could speak many languages. Wait. Actually I do. But only 4 or 5 languages with limited proficiency. Instead, can I create a voice model that can copy any voice in any language? Possibly! A while ago, my colleague Dabi and I opened a simple voice conversion project. Building on it, I extended the idea to the cross-lingual setting. I found it very challenging with my limited knowledge. Unfortunately, the results I have so far are not good, but hopefully they will be helpful for some people.

February 2018
Author: Kyubyong Park
Version: 1.0

Requirements

  • NumPy >= 1.11.1
  • TensorFlow >= 1.3
  • librosa
  • tqdm
  • scipy

Data

Architecture

  • Train 1: MFCCs of TIMIT speakers -> Triphone PPGs
  • Train 2: MFCCs of ARCTIC speaker -> Triphone PPGs -> linear spectrogram
  • Convert: MFCCs of any speaker -> Triphone PPGs -> linear spectrogram -> (Griffin-Lim) -> wav file

(PPGs are phonetic posteriorgrams: for each frame, a vector of posterior probabilities over phonetic classes, here triphones.)
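The data flow above can be sketched with NumPy alone. This is only an illustration of the shapes involved, not the actual networks: the two "models" are random linear maps, and every size (N_MFCC, N_CLASSES, N_FFT) is a placeholder I picked for the example, not a value taken from this repo.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
N_MFCC = 40      # MFCC coefficients per frame
N_CLASSES = 100  # number of triphone classes (placeholder)
N_FFT = 512      # FFT size, giving 1 + N_FFT // 2 linear-frequency bins


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


# Stand-ins for the two trained networks: random linear maps.
W_net1 = rng.normal(size=(N_MFCC, N_CLASSES))
W_net2 = rng.normal(size=(N_CLASSES, 1 + N_FFT // 2))


def mfcc_to_ppg(mfcc):
    """Net 1 stand-in: (frames, N_MFCC) -> (frames, N_CLASSES) posteriors."""
    return softmax(mfcc @ W_net1)


def ppg_to_spectrogram(ppg):
    """Net 2 stand-in: (frames, N_CLASSES) -> non-negative (frames, bins)."""
    return np.abs(ppg @ W_net2)


def convert(mfcc):
    """MFCCs -> PPGs -> linear magnitude spectrogram."""
    return ppg_to_spectrogram(mfcc_to_ppg(mfcc))


mfcc = rng.normal(size=(200, N_MFCC))  # 200 frames of fake MFCCs
ppg = mfcc_to_ppg(mfcc)
spec = convert(mfcc)
print(ppg.shape, spec.shape)
```

Each row of ppg is a probability distribution over the phonetic classes for one frame. That is what makes PPGs a roughly speaker-independent intermediate representation: net 1 can be trained on many speakers, and net 2 only on the target voice.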

Training

  • STEP 0. Prepare datasets
  • STEP 1. Run python train1.py to train the phoneme recognition model.
  • STEP 2. Run python train2.py to train the speech synthesis model.

Training Curves

  • Training 1
  • Training 2

Sample Synthesis

  • Run python convert.py and check the generated samples in the 50lang-output folder.
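The last stage of conversion turns a linear magnitude spectrogram into a waveform with Griffin-Lim. Here is a minimal sketch of that algorithm using scipy.signal, assuming a Hann window and made-up values for N_FFT, HOP and the iteration count. It is not the code in convert.py, and it uses scipy's (bins, frames) layout for the spectrogram.

```python
import numpy as np
from scipy.signal import stft, istft

N_FFT = 512  # assumed FFT size (not taken from this repo)
HOP = 128    # assumed hop length (not taken from this repo)


def griffin_lim(magnitude, n_iter=30, seed=0):
    """Estimate a waveform from a magnitude spectrogram of shape (bins, frames)."""
    rng = np.random.default_rng(seed)
    kwargs = dict(nperseg=N_FFT, noverlap=N_FFT - HOP)
    # Start from random phase, since the magnitude carries no phase information.
    phase = np.exp(2j * np.pi * rng.random(magnitude.shape))
    for _ in range(n_iter):
        # Go to the time domain, then back, and keep only the new phase.
        _, wav = istft(magnitude * phase, **kwargs)
        _, _, spec = stft(wav, **kwargs)
        phase = np.exp(1j * np.angle(spec))
    _, wav = istft(magnitude * phase, **kwargs)
    return wav


# Usage: discard the phase of a 440 Hz tone and reconstruct it.
t = np.arange(5120) / 16000.0
tone = np.sin(2 * np.pi * 440.0 * t)
_, _, target = stft(tone, nperseg=N_FFT, noverlap=N_FFT - HOP)
wav = griffin_lim(np.abs(target))
print(wav.shape)
```

Because the phase is only estimated, the output usually sounds somewhat metallic. More iterations help up to a point.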

Generated Samples

  • Check here and compare the original speech samples in 16 languages with their converted counterparts.
  • Don't expect too much!
