
r9y9 / Jsut Lab

License: MIT
HTS-style full-context labels for JSUT v1.1

Projects that are alternatives of or similar to Jsut Lab

Multilingual text to speech
An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.
Stars: ✭ 324 (+1057.14%)
Mutual labels:  speech-synthesis, text-to-speech, tts
Voice Builder
An open-source text-to-speech (TTS) voice building tool
Stars: ✭ 362 (+1192.86%)
Mutual labels:  speech-synthesis, text-to-speech, tts
LVCNet
LVCNet: Efficient Condition-Dependent Modeling Network for Waveform Generation
Stars: ✭ 67 (+139.29%)
Mutual labels:  text-to-speech, tts, speech-synthesis
Hifi Gan
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Stars: ✭ 325 (+1060.71%)
Mutual labels:  speech-synthesis, text-to-speech, tts
Parallelwavegan
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN) with Pytorch
Stars: ✭ 682 (+2335.71%)
Mutual labels:  speech-synthesis, text-to-speech, tts
Parallel-Tacotron2
PyTorch Implementation of Google's Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling
Stars: ✭ 149 (+432.14%)
Mutual labels:  text-to-speech, tts, speech-synthesis
Parakeet
PAddle PARAllel text-to-speech toolKIT (supporting WaveFlow, WaveNet, Transformer TTS and Tacotron2)
Stars: ✭ 279 (+896.43%)
Mutual labels:  speech-synthesis, text-to-speech, tts
open-speech-corpora
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
Stars: ✭ 841 (+2903.57%)
Mutual labels:  text-to-speech, tts, speech-synthesis
Fre-GAN-pytorch
Fre-GAN: Adversarial Frequency-consistent Audio Synthesis
Stars: ✭ 73 (+160.71%)
Mutual labels:  text-to-speech, tts, speech-synthesis
spokestack-android
Extensible Android mobile voice framework: wakeword, ASR, NLU, and TTS. Easily add voice to any Android app!
Stars: ✭ 52 (+85.71%)
Mutual labels:  text-to-speech, tts, speech-synthesis
Cognitive Speech Tts
Microsoft Text-to-Speech API sample code in several languages, part of Cognitive Services.
Stars: ✭ 312 (+1014.29%)
Mutual labels:  speech-synthesis, text-to-speech, tts
esp32-flite
Speech synthesis running on ESP32 based on Flite engine.
Stars: ✭ 28 (+0%)
Mutual labels:  text-to-speech, tts, speech-synthesis
WaveGrad2
PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis
Stars: ✭ 55 (+96.43%)
Mutual labels:  text-to-speech, tts, speech-synthesis
Glow Tts
A Generative Flow for Text-to-Speech via Monotonic Alignment Search
Stars: ✭ 284 (+914.29%)
Mutual labels:  speech-synthesis, text-to-speech, tts
Daft-Exprt
PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis
Stars: ✭ 41 (+46.43%)
Mutual labels:  text-to-speech, tts, speech-synthesis
ttslearn
ttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)
Stars: ✭ 158 (+464.29%)
Mutual labels:  text-to-speech, tts, speech-synthesis
VAENAR-TTS
PyTorch Implementation of VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive Text-to-Speech Synthesis.
Stars: ✭ 66 (+135.71%)
Mutual labels:  text-to-speech, tts, speech-synthesis
AdaSpeech
AdaSpeech: Adaptive Text to Speech for Custom Voice
Stars: ✭ 108 (+285.71%)
Mutual labels:  text-to-speech, tts, speech-synthesis
talkie
Text-to-speech browser extension button. Select text on any web page, and have the computer read it out loud for you by simply clicking the Talkie button.
Stars: ✭ 43 (+53.57%)
Mutual labels:  text-to-speech, tts, speech-synthesis
editts
Official implementation of EdiTTS: Score-based Editing for Controllable Text-to-Speech
Stars: ✭ 74 (+164.29%)
Mutual labels:  text-to-speech, tts, speech-synthesis

jsut-lab


This repository provides HTK/HTS-style alignment files with additional full-context labels for the JSUT corpus (v1.1), the Japanese speech corpus of Saruwatari-lab., University of Tokyo. All alignment files (.lab) were extracted by forced alignment with Julius, and the full-context labels were generated by OpenJTalk.

The label files are expected to be used for speech research, e.g., text-to-speech and voice conversion.

The directory structure is exactly the same as that of JSUT, so you can place the label files directly into the JSUT data directory if you want (a copy sketch follows the listing):

$ tree ~/data/jsut_ver1.1/ -d -L 2
/home/ryuichi/data/jsut_ver1.1/
├── basic5000
│   ├── lab
│   └── wav
├── countersuffix26
│   ├── lab
│   └── wav
├── loanword128
│   ├── lab
│   └── wav
├── onomatopee300
│   ├── lab
│   └── wav
├── precedent130
│   ├── lab
│   └── wav
├── repeat500
│   ├── lab
│   └── wav
├── travel1000
│   ├── lab
│   └── wav
├── utparaphrase512
│   ├── lab
│   └── wav
└── voiceactress100
    ├── lab
    └── wav
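
For example, here is a minimal sketch for merging the labels into the corpus tree, assuming this repository is checked out at ~/data/jsut-lab and the corpus lives at ~/data/jsut_ver1.1 (both paths are assumptions; adjust them to your setup):

# Minimal sketch: copy each subset's lab/ directory from this repository
# into the JSUT corpus tree. The two paths below are assumptions.
import shutil
from pathlib import Path

repo = Path.home() / "data" / "jsut-lab"       # assumed checkout of this repository
corpus = Path.home() / "data" / "jsut_ver1.1"  # assumed JSUT v1.1 location

for lab_dir in sorted(repo.glob("*/lab")):
    subset = lab_dir.parent.name  # e.g., "basic5000"
    shutil.copytree(lab_dir, corpus / subset / "lab", dirs_exist_ok=True)
    print("copied", subset)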

Label format

Each line has three fields: <begin_time> <end_time> <full-context-label>. Times are given in units of 100 ns, the same as in HTK label files.

$ cat basic5000/lab/BASIC5000_0773.lab | head
0 2525000 xx^xx-sil+s=a/A:xx+xx+xx/B:xx-xx_xx/C:xx_xx+xx/D:18+xx_xx/E:xx_xx!xx_xx-xx/F:xx_xx#[email protected]_xx|xx_xx/G:6_3%0_xx_xx/H:xx_xx/I:[email protected]+xx&xx-xx|xx+xx/J:1_6/K:3+6-32
2525000 3825000 xx^sil-s+a=N/A:-2+1+6/B:xx-xx_xx/C:18_xx+xx/D:24+xx_xx/E:xx_xx!xx_xx-xx/F:6_3#[email protected]_1|1_6/G:3_1%0_xx_0/H:xx_xx/I:[email protected]+3&1-6|1+32/J:2_10/K:3+6-32
3825000 4825000 sil^s-a+N=g/A:-2+1+6/B:xx-xx_xx/C:18_xx+xx/D:24+xx_xx/E:xx_xx!xx_xx-xx/F:6_3#[email protected]_1|1_6/G:3_1%0_xx_0/H:xx_xx/I:[email protected]+3&1-6|1+32/J:2_10/K:3+6-32
4825000 5825000 s^a-N+g=i/A:-1+2+5/B:xx-xx_xx/C:18_xx+xx/D:24+xx_xx/E:xx_xx!xx_xx-xx/F:6_3#[email protected]_1|1_6/G:3_1%0_xx_0/H:xx_xx/I:[email protected]+3&1-6|1+32/J:2_10/K:3+6-32
5825000 6125000 a^N-g+i=i/A:0+3+4/B:xx-xx_xx/C:18_xx+xx/D:24+xx_xx/E:xx_xx!xx_xx-xx/F:6_3#[email protected]_1|1_6/G:3_1%0_xx_0/H:xx_xx/I:[email protected]+3&1-6|1+32/J:2_10/K:3+6-32
6125000 7524999 N^g-i+i=N/A:0+3+4/B:xx-xx_xx/C:18_xx+xx/D:24+xx_xx/E:xx_xx!xx_xx-xx/F:6_3#[email protected]_1|1_6/G:3_1%0_xx_0/H:xx_xx/I:[email protected]+3&1-6|1+32/J:2_10/K:3+6-32
7524999 8125000 g^i-i+N=w/A:1+4+3/B:xx-xx_xx/C:18_xx+xx/D:24+xx_xx/E:xx_xx!xx_xx-xx/F:6_3#[email protected]_1|1_6/G:3_1%0_xx_0/H:xx_xx/I:[email protected]+3&1-6|1+32/J:2_10/K:3+6-32
8125000 8425000 i^i-N+w=a/A:2+5+2/B:xx-xx_xx/C:18_xx+xx/D:24+xx_xx/E:xx_xx!xx_xx-xx/F:6_3#[email protected]_1|1_6/G:3_1%0_xx_0/H:xx_xx/I:[email protected]+3&1-6|1+32/J:2_10/K:3+6-32
8425000 10125000 i^N-w+a=pau/A:3+6+1/B:18-xx_xx/C:24_xx+xx/D:07+xx_xx/E:xx_xx!xx_xx-xx/F:6_3#[email protected]_1|1_6/G:3_1%0_xx_0/H:xx_xx/I:[email protected]+3&1-6|1+32/J:2_10/K:3+6-32
10125000 11325000 N^w-a+pau=d/A:3+6+1/B:18-xx_xx/C:24_xx+xx/D:07+xx_xx/E:xx_xx!xx_xx-xx/F:6_3#[email protected]_1|1_6/G:3_1%0_xx_0/H:xx_xx/I:[email protected]+3&1-6|1+32/J:2_10/K:3+6-32
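
The format is easy to parse. Here is a minimal dependency-free sketch that converts the 100 ns times to seconds (the file path is taken from the example above):

# Minimal sketch: parse an HTK/HTS-style .lab file into
# (begin_sec, end_sec, full_context_label) tuples.
def load_lab(path):
    segments = []
    with open(path) as f:
        for line in f:
            begin, end, label = line.split()
            # Times are in 100 ns units; multiplying by 1e-7 gives seconds.
            segments.append((int(begin) * 1e-7, int(end) * 1e-7, label))
    return segments

segments = load_lab("basic5000/lab/BASIC5000_0773.lab")
begin, end, label = segments[0]
print(f"{begin:.3f}-{end:.3f}s: {label[:40]}...")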

For details of the full-context label format, please refer to the HTS documentation: http://hts.sp.nitech.ac.jp

What can I do with this?

If you want to build traditional DNN-based TTS systems, please check out the tutorials at https://r9y9.github.io/nnmnkwii/latest/. You can use the alignments and full-context labels to generate linguistic features, as in the sketch below.
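
For example, a minimal sketch with nnmnkwii that turns one label file into a phone-level linguistic feature matrix; questions_jp.hed stands in for whatever HTS-style question set you actually use (an assumption, not a file shipped with this repository):

# Minimal sketch: linguistic features from a full-context label file
# with nnmnkwii (pip install nnmnkwii). "questions_jp.hed" is an
# assumed HTS-style question set; substitute your own.
from nnmnkwii.io import hts
from nnmnkwii.frontend import merlin as fe

labels = hts.load("basic5000/lab/BASIC5000_0773.lab")
binary_dict, numeric_dict = hts.load_question_set("questions_jp.hed")

# One row per phone; add_frame_features=True would give frame-level features.
features = fe.linguistic_features(labels, binary_dict, numeric_dict)
print(features.shape)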

If you are interested in end-to-end approaches, please have a look at https://github.com/espnet/espnet. The labels are used at the preprocessing stage of the JSUT recipe (see also https://r9y9.github.io/blog/2017/11/12/jsut_ver1/ for why alignments are still useful for end-to-end TTS).
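
One concrete preprocessing use is trimming the long leading and trailing silences in the JSUT recordings with the aligned sil segments. A minimal sketch, assuming scipy is installed and reusing the example file paths from above:

# Minimal sketch: trim leading/trailing silence from a JSUT utterance
# using the aligned "sil" segments. File paths are assumptions.
from scipy.io import wavfile

def load_lab(path):
    with open(path) as f:
        return [(int(b), int(e), l) for b, e, l in (line.split() for line in f)]

def phoneme(label):
    # The current phone sits between "-" and "+" in the quinphone part,
    # e.g. "xx^xx-sil+s=a/A:..." -> "sil"
    return label.split("-")[1].split("+")[0]

sr, wav = wavfile.read("basic5000/wav/BASIC5000_0773.wav")
segments = load_lab("basic5000/lab/BASIC5000_0773.lab")

# Keep audio from the end of the initial sil to the start of the final sil.
start_100ns = segments[0][1] if phoneme(segments[0][2]) == "sil" else 0
end_100ns = segments[-1][0] if phoneme(segments[-1][2]) == "sil" else segments[-1][1]

# 100 ns units -> sample indices
trimmed = wav[int(start_100ns * 1e-7 * sr):int(end_100ns * 1e-7 * sr)]
wavfile.write("BASIC5000_0773_trim.wav", sr, trimmed)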

Happy speech hacking!

Source code to generate labels

https://github.com/r9y9/segmentation-kit/tree/jsut2

Notice

  • Alignments are likely to contain mistakes because they were generated automatically by Julius; they are not hand-annotated labels.
