zabir-nabil / bangla-tts

Licence: other
Bangla text to speech: a multilingual (Bangla, English) speech synthesis library that runs in [almost] real time on a GPU

Byakto TTS [bangla text to speech]

Multilingual (Bangla, English) speech synthesis library that runs in [almost] real time on a GPU

Installation

  • Install Anaconda
  • conda create -n new_virtual_env python==3.6.8
  • conda activate new_virtual_env
  • pip install -r requirements.txt
  • On the first run, stay connected to the internet so that the weights of the speech synthesis models (>500 MB) can be downloaded
  • For fast inference, install tensorflow-gpu and use an NVIDIA GPU (CUDA).

Usage

'''
function: generate(text_arr = [""], save_path = None)
arguments: 
text_arr (array) : an array of strings
save_path (string, optional) : directory where the generated wav files are stored when save_path is not None; if the path is not valid, the wav files are saved in the current directory
returns:
if save_path is None, nothing is saved; a list of tuples, each containing a generated speech signal and its sampling rate, is returned
if save_path is not None, a list containing the (relative) file paths is returned
'''

from bangla_tts import generate

# usage 1 (saving to path)

file_names = generate(["আমার সোনার বাংলা আমি তোমাকে ভালোবাসি"], save_path = "static") # will be saved to static folder
print(file_names)

# usage 2 (getting numpy arrays for the signals)

gen_wavs = generate(["আমার সোনার বাংলা আমি তোমাকে ভালোবাসি"]) # returns a list of (signal, sampling rate) tuples
print(gen_wavs[0])
print(f"signal length: {gen_wavs[0][0].shape}")
print(f"sampling rate: {gen_wavs[0][1]}")
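
When `save_path` is None, the caller receives the raw signals and has to store them itself if needed. The sketch below shows one way to write a single (signal, sampling rate) pair to a wav file with only the standard library. The helper name `save_wav` is hypothetical and not part of bangla-tts, and it assumes the signal is a mono sequence of floats in [-1.0, 1.0], which the README does not state.

```python
import array
import sys
import wave


def save_wav(signal, sampling_rate, path):
    """Write a mono float signal as 16-bit PCM (hypothetical helper).

    Assumes the samples are floats in [-1.0, 1.0]; values outside
    that range are clipped.
    """
    pcm = array.array(
        "h",
        (int(max(-1.0, min(1.0, float(x))) * 32767) for x in signal),
    )
    # WAV data is little-endian; swap bytes on big-endian machines.
    if sys.byteorder == "big":
        pcm.byteswap()
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)                   # mono
        wav_file.setsampwidth(2)                   # 2 bytes = 16 bits per sample
        wav_file.setframerate(int(sampling_rate))
        wav_file.writeframes(pcm.tobytes())
    return path
```

With the output of usage 2, this could be called as `save_wav(gen_wavs[0][0], gen_wavs[0][1], "sample.wav")`, since iterating over a NumPy array yields values that `float()` accepts.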

Generated unseen speech samples

Sample 1 (আমার সোনার বাংলা আমি তোমাকে ভালোবাসি)

Sample 2 (আমার নাম জাবির আল নাজি নাবিল)

Sample 3 (I am still not a great speaker)

Sample 4 (This is just a test)

Update (18th April)

  • Synthesize longer sentences
  • Phonetic representation for English, Bangla numeric segments
  • Added a simple parser that converts numeric segments to their corresponding phonetic representation.

Example: ১৯৯৭ সালের ২১ জানুয়ারী তে আমার জন্ম হয় will be converted to ['ঊনিশশ সাতানব্বই সালের একুশ জানুয়ারী তে আমার জন্ম হয় '] by the parser.
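
The idea behind such a parser can be sketched as follows. This is not the library's actual implementation: the names `NUMBER_WORDS`, `number_to_words`, and `expand_numbers` are hypothetical, and the lookup table only covers the two numbers from the example above. A real parser would need a complete number-to-words routine for Bangla.

```python
import re

# Toy lookup covering only the README example (assumption, not library data).
# 1997 uses the year-style reading shown in the example.
NUMBER_WORDS = {1997: "ঊনিশশ সাতানব্বই", 21: "একুশ"}

# One or more consecutive Bangla digits (U+09E6 to U+09EF).
BANGLA_NUMBER = re.compile(r"[০-৯]+")


def number_to_words(value):
    """Return the spelled-out form, or None if the toy table has no entry."""
    return NUMBER_WORDS.get(value)


def expand_numbers(text):
    """Replace each run of Bangla digits with its spelled-out form."""
    def replace(match):
        digits = match.group(0)
        # int() accepts Unicode decimal digits, so Bangla digits parse directly.
        words = number_to_words(int(digits))
        # Leave the digits untouched when no spelling is known.
        return words if words is not None else digits

    return BANGLA_NUMBER.sub(replace, text)


text = "১৯৯৭ সালের ২১ জানুয়ারী তে আমার জন্ম হয়"
print(expand_numbers(text))
```

Keeping the digit detection separate from the spelling step means the regular expression stays simple, and the spelling rules (years, phone numbers, plain cardinals) can be swapped in later.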

  • Added a simple batch mechanism for synthesizing longer sentences. Because the attention window was fixed during training, the model previously failed on long inputs (n_characters > 200). A simple segmenting scheme now breaks the text into multiple parts, synthesizes them in a batch, and finally merges the results into a single audio file.
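
A minimal sketch of such a segment-and-merge scheme is shown below, assuming the 200-character limit mentioned above. The functions `split_text` and `merge_signals` are hypothetical illustrations, not the library's own code, and the actual splitting rules in bangla-tts may differ.

```python
import re

MAX_CHARS = 200  # limit taken from the README; the real threshold may differ


def split_text(text, max_chars=MAX_CHARS):
    """Greedily pack sentences, then words, into chunks of at most max_chars.

    A single word longer than max_chars is kept whole, so it can
    still exceed the limit.
    """
    # Split after a Bangla danda or Latin sentence punctuation mark.
    sentences = [s for s in re.split(r"(?<=[।.!?])\s+", text.strip()) if s]
    chunks, current = [], ""
    for sentence in sentences:
        # A sentence that is too long on its own is broken on whitespace.
        pieces = [sentence] if len(sentence) <= max_chars else sentence.split()
        for piece in pieces:
            candidate = f"{current} {piece}".strip()
            if len(candidate) <= max_chars:
                current = candidate
            else:
                if current:
                    chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks


def merge_signals(signals):
    """Concatenate per-chunk sample sequences into one list of samples."""
    merged = []
    for signal in signals:
        merged.extend(signal)
    return merged
```

In use, the chunks would be passed to `generate` as one batch, and the returned signals joined with `merge_signals` (or `numpy.concatenate` for NumPy arrays) before being written to a single file.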

New examples:

১৯৯৭ সালের ২১ জানুয়ারী তে আমার জন্ম হয়

আমার ফোন নাম্বার ০১৭১৩৩৫৩৪৩, তবে আমাকে সকাল ১০ টার আগে পাবেন না

বাংলাদেশে গত ২৪ ঘণ্টায় ৩০৬ জন কোভিড-১৯ আক্রান্ত হয়েছেন। এই সময়ের মধ্যে মৃত্যু হয়েছে ৯ জনের। এ নিয়ে দেশটিতে মোট আক্রান্ত হলেন ২১৪৪। আর করোনা ভাইরাসে আক্রান্ত হয়ে মৃত্যু হয়েছে ৮৪ জনের। নতুন করে ৮ জনের পরীক্ষা করার পর করোনা ভাইরাসের উপস্থিতি পাওয়া যায়নি। এনিয়ে মোট ৬৬ জন সুস্থ হলেন। - BBC Bangla

Test Dataset

I have designed a test dataset for comparing new Bangla TTS models with the benchmark model (bangla-tts).

Bakta dataset

Bakta dataset (multilingual, Bangla + English)

Wiki Pages

To-dos

  • PyPI
  • More training
  • Light model
  • Publish the RESTful API
  • Publish the Flask app

Other TTS projects

If this repository helps you in any way, show your love ❤️ by putting a star on this project ✌️
