
as-ideas / DeepPhonemizer

Licence: MIT
Grapheme to phoneme conversion with deep learning.

Programming Languages

  • Python
  • Jupyter Notebook
  • Shell

Projects that are alternatives of or similar to DeepPhonemizer

multilingual-g2p
Multilingual Grapheme to Phoneme
Stars: ✭ 40 (-73.68%)
Mutual labels:  phonemes, g2p
laravel-scene
Laravel Transformer
Stars: ✭ 27 (-82.24%)
Mutual labels:  transformer
graph-transformer-pytorch
Implementation of Graph Transformer in Pytorch, for potential use in replicating Alphafold2
Stars: ✭ 81 (-46.71%)
Mutual labels:  transformer
transform-graphql
⚙️ Transformer function to transform GraphQL Directives. Create model CRUD directive for example
Stars: ✭ 23 (-84.87%)
Mutual labels:  transformer
COVID-19-Tweet-Classification-using-Roberta-and-Bert-Simple-Transformers
Rank 1 / 216
Stars: ✭ 24 (-84.21%)
Mutual labels:  transformer
image-classification
A collection of SOTA Image Classification Models in PyTorch
Stars: ✭ 70 (-53.95%)
Mutual labels:  transformer
cometa
Corpus of Online Medical EnTities: the cometA corpus
Stars: ✭ 31 (-79.61%)
Mutual labels:  transformer
graphtrans
Representing Long-Range Context for Graph Neural Networks with Global Attention
Stars: ✭ 45 (-70.39%)
Mutual labels:  transformer
PDN
The official PyTorch implementation of "Pathfinder Discovery Networks for Neural Message Passing" (WebConf '21)
Stars: ✭ 44 (-71.05%)
Mutual labels:  transformer
YOLOv5-Lite
🍅🍅🍅YOLOv5-Lite: lighter, faster and easier to deploy. Evolved from yolov5 and the size of model is only 930+kb (int8) and 1.7M (fp16). It can reach 10+ FPS on the Raspberry Pi 4B when the input size is 320×320~
Stars: ✭ 1,230 (+709.21%)
Mutual labels:  transformer
TransPose
PyTorch Implementation for "TransPose: Keypoint localization via Transformer", ICCV 2021.
Stars: ✭ 250 (+64.47%)
Mutual labels:  transformer
BossNAS
(ICCV 2021) BossNAS: Exploring Hybrid CNN-transformers with Block-wisely Self-supervised Neural Architecture Search
Stars: ✭ 125 (-17.76%)
Mutual labels:  transformer
Transformer-ocr
Handwritten text recognition using transformers.
Stars: ✭ 92 (-39.47%)
Mutual labels:  transformer
OpenPrompt
An Open-Source Framework for Prompt-Learning.
Stars: ✭ 1,769 (+1063.82%)
Mutual labels:  transformer
Xpersona
XPersona: Evaluating Multilingual Personalized Chatbot
Stars: ✭ 54 (-64.47%)
Mutual labels:  transformer
GTSRB Keras STN
German Traffic Sign Recognition Benchmark, Keras implementation with Spatial Transformer Networks
Stars: ✭ 48 (-68.42%)
Mutual labels:  transformer
transformer
A PyTorch Implementation of "Attention Is All You Need"
Stars: ✭ 28 (-81.58%)
Mutual labels:  transformer
towhee
Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.
Stars: ✭ 821 (+440.13%)
Mutual labels:  transformer
FNet-pytorch
Unofficial implementation of Google's FNet: Mixing Tokens with Fourier Transforms
Stars: ✭ 204 (+34.21%)
Mutual labels:  transformer
speech-transformer
Transformer implementation specialized in speech recognition tasks using PyTorch.
Stars: ✭ 40 (-73.68%)
Mutual labels:  transformer



A G2P library in PyTorch


DeepPhonemizer is a library for grapheme to phoneme conversion based on Transformer models. It is intended for use in text-to-speech production systems that require high accuracy and efficiency. You can choose between a forward Transformer model (trained with CTC) and its autoregressive counterpart. The former is faster and more stable, while the latter is slightly more accurate.

The main advantages of this repo are:

  • Easy-to-use API for training and inference.
  • Multilingual: You can train a single model on several languages.
  • Accuracy: Phoneme and word error rates are comparable to the state of the art.
  • Speed: The repo is highly optimized for fast inference by using dictionaries and batching.

Check out the inference and training tutorials on Colab!

Read the documentation at: https://as-ideas.github.io/DeepPhonemizer/

Installation

pip install deep-phonemizer

Quickstart

Download the pretrained model: en_us_cmudict_ipa_forward

from dp.phonemizer import Phonemizer

phonemizer = Phonemizer.from_checkpoint('en_us_cmudict_ipa_forward.pt')
phonemizer('Phonemizing an English text is imposimpable!', lang='en_us')

'foʊnɪmaɪzɪŋ æn ɪŋglɪʃ tɛkst ɪz ɪmpəzɪmpəbəl!'

Training

You can easily train your own autoregressive or forward transformer model. All necessary parameters are set in a config.yaml, which you can find under:

dp/configs/forward_config.yaml
dp/configs/autoreg_config.yaml

for the forward and autoregressive transformer model, respectively.
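
For orientation, here is a minimal sketch of the kind of fields such a config contains. The keys below follow the general structure of the shipped configs, but the exact names and values here are illustrative assumptions, so check the actual files under dp/configs:

paths:
  checkpoint_dir: checkpoints   # where model checkpoints are written
  data_dir: datasets            # where preprocessed data is stored

preprocessing:
  languages: ['en_us', 'de']    # all languages present in the training data

model:
  type: 'transformer'           # the autoregressive config uses the autoregressive model type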

Prepare the data in a tuple format and use the preprocess and train API:

from dp.preprocess import preprocess
from dp.train import train

train_data = [('en_us', 'young', 'jʌŋ'),
              ('de', 'benützten', 'bənʏt͡stn̩'),
              ('de', 'gewürz', 'ɡəvʏʁt͡s')] * 1000

val_data = [('en_us', 'young', 'jʌŋ'),
            ('de', 'benützten', 'bənʏt͡stn̩')] * 100

preprocess(config_file='config.yaml', train_data=train_data, 
           deduplicate_train_data=False)
train(config_file='config.yaml')

Model checkpoints will be stored in the checkpoint path provided by the config.yaml.

Inference

Load the phonemizer from a checkpoint and run a prediction. By default, the phonemizer stores a dictionary of word-phoneme mappings that is applied first, and it uses the Transformer model only to predict out-of-dictionary words.

from dp.phonemizer import Phonemizer

phonemizer = Phonemizer.from_checkpoint('checkpoints/best_model.pt')
phonemes = phonemizer('Phonemizing an English text is imposimpable!', lang='en_us')

If you need more inference information, you can use the following API:

from dp.phonemizer import Phonemizer

phonemizer = Phonemizer.from_checkpoint('checkpoints/best_model.pt')
result = phonemizer.phonemise_list(['Phonemizing an English text is imposimpable!'], lang='en_us')

for word, pred in result.predictions.items():
    print(f'{word} {pred.phonemes} {pred.confidence}')

Pretrained Models

Model                       Language                  Dataset      Repo Version
en_us_cmudict_ipa_forward   en_us                     cmudict-ipa  0.0.10
en_us_cmudict_forward       en_us                     cmudict      0.0.10
latin_ipa_forward           en_uk, en_us, de, fr, es  wikipron     0.0.10
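
To use one of these checkpoints, download the file and load it just as in the Quickstart. A minimal sketch, assuming the checkpoint has been downloaded from the releases page and saved as en_us_cmudict_ipa_forward.pt:

from dp.phonemizer import Phonemizer

# Path is an assumption: point it at wherever you saved the downloaded checkpoint.
phonemizer = Phonemizer.from_checkpoint('en_us_cmudict_ipa_forward.pt')
phonemes = phonemizer('Pretrained models need no training data.', lang='en_us')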

Torchscript Export

You can easily export the underlying transformer models with TorchScript:

import torch
from dp.phonemizer import Phonemizer

phonemizer = Phonemizer.from_checkpoint('checkpoints/best_model.pt')
model = phonemizer.predictor.model
phonemizer.predictor.model = torch.jit.script(model)
phonemizer('Running the torchscript model!')
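
If you want to persist the scripted model rather than re-script it on every start, you can use the standard torch.jit save/load round trip. A minimal sketch; the file name model_scripted.pt is just an example:

import torch

# Save the scripted transformer to disk ...
torch.jit.save(phonemizer.predictor.model, 'model_scripted.pt')

# ... and later swap a reloaded copy back into the phonemizer.
phonemizer.predictor.model = torch.jit.load('model_scripted.pt')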

Maintainers

References

Transformer based Grapheme-to-Phoneme Conversion

Grapheme-to-Phoneme Conversion Using Long Short-Term Memory Recurrent Neural Networks
