
upskyy / Transformer-Transducer

License: Apache-2.0
PyTorch implementation of "Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss" (ICASSP 2020)

Programming Languages

python

Projects that are alternatives to or similar to Transformer-Transducer

kosr
Korean speech recognition based on transformer (트랜스포머 기반 한국어 음성 인식)
Stars: ✭ 25 (-59.02%)
Mutual labels:  end-to-end, transformer, speech-recognition, transformer-transducer
Speech Transformer Tf2.0
Transformer for an ASR system (via TensorFlow 2.0)
Stars: ✭ 90 (+47.54%)
Mutual labels:  end-to-end, transformer, speech-recognition
Neural sp
End-to-end ASR/LM implementation with PyTorch
Stars: ✭ 408 (+568.85%)
Mutual labels:  transformer, speech-recognition, sequence-to-sequence
End2end Asr Pytorch
End-to-End Automatic Speech Recognition on PyTorch
Stars: ✭ 175 (+186.89%)
Mutual labels:  end-to-end, transformer, speech-recognition
Kospeech
Open-Source Toolkit for End-to-End Korean Automatic Speech Recognition.
Stars: ✭ 190 (+211.48%)
Mutual labels:  end-to-end, transformer, speech-recognition
Athena
An open-source implementation of a sequence-to-sequence based speech processing engine
Stars: ✭ 542 (+788.52%)
Mutual labels:  transformer, speech-recognition, sequence-to-sequence
kospeech
Open-Source Toolkit for End-to-End Korean Automatic Speech Recognition leveraging PyTorch and Hydra.
Stars: ✭ 456 (+647.54%)
Mutual labels:  end-to-end, transformer, speech-recognition
Tensorflow end2end speech recognition
End-to-End speech recognition implementation based on TensorFlow (CTC, Attention, and MTL training)
Stars: ✭ 305 (+400%)
Mutual labels:  end-to-end, speech-recognition
Espnet
End-to-End Speech Processing Toolkit
Stars: ✭ 4,533 (+7331.15%)
Mutual labels:  end-to-end, speech-recognition
Wav2letter
Facebook AI Research's Automatic Speech Recognition Toolkit
Stars: ✭ 5,907 (+9583.61%)
Mutual labels:  end-to-end, speech-recognition
E2e Asr
PyTorch Implementations for End-to-End Automatic Speech Recognition
Stars: ✭ 106 (+73.77%)
Mutual labels:  end-to-end, speech-recognition
Rus-SpeechRecognition-LSTM-CTC-VoxForge
Russian speech recognition using TensorFlow, trained on the VoxForge corpus
Stars: ✭ 50 (-18.03%)
Mutual labels:  end-to-end, speech-recognition
Espresso
Espresso: A Fast End-to-End Neural Speech Recognition Toolkit
Stars: ✭ 808 (+1224.59%)
Mutual labels:  end-to-end, speech-recognition
Rnn Transducer
MXNet implementation of RNN Transducer (Graves 2012): Sequence Transduction with Recurrent Neural Networks
Stars: ✭ 114 (+86.89%)
Mutual labels:  end-to-end, speech-recognition
SOLQ
"SOLQ: Segmenting Objects by Learning Queries", SOLQ is an end-to-end instance segmentation framework with Transformer.
Stars: ✭ 159 (+160.66%)
Mutual labels:  end-to-end, transformer
Speech Transformer
A PyTorch implementation of Speech Transformer, an End-to-End ASR with Transformer network on Mandarin Chinese.
Stars: ✭ 565 (+826.23%)
Mutual labels:  end-to-end, transformer
wav2letter
Facebook AI Research's Automatic Speech Recognition Toolkit
Stars: ✭ 6,026 (+9778.69%)
Mutual labels:  end-to-end, speech-recognition
Automatic speech recognition
End-to-end Automatic Speech Recognition for Mandarin and English in TensorFlow
Stars: ✭ 2,751 (+4409.84%)
Mutual labels:  end-to-end, speech-recognition
seq2seq-pytorch
Sequence to Sequence Models in PyTorch
Stars: ✭ 41 (-32.79%)
Mutual labels:  transformer, sequence-to-sequence
speech-transformer
Transformer implementation specialized for speech recognition tasks, using PyTorch.
Stars: ✭ 40 (-34.43%)
Mutual labels:  end-to-end, transformer

Transformer-Transducer

Transformer-Transducer replaces the LSTM encoders of the RNN-T architecture with Transformer encoders, and every layer is identical for both the audio and label encoders. Unlike the basic Transformer structure, the audio encoder and the label encoder are kept separate; alignment between them is handled by the forward-backward process of the RNN-T loss.

This repository contains only the model code; to train a Transformer Transducer, use openspeech.
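
As a rough, hedged illustration of this layout (not the repository's actual modules; the layer sizes and the joint network below are assumptions for illustration only), a minimal PyTorch sketch of the two identical-layer encoders and the joint network could look like this:

import torch
import torch.nn as nn


def make_encoder(d_model: int, num_heads: int, num_layers: int) -> nn.TransformerEncoder:
    # Both encoders are stacks of identical Transformer layers.
    layer = nn.TransformerEncoderLayer(d_model, num_heads, batch_first=True)
    return nn.TransformerEncoder(layer, num_layers)


class TransformerTransducerSketch(nn.Module):
    # Illustrative sketch only: hyperparameters and the joint network are assumptions,
    # not the defaults used by this repository.
    def __init__(self, num_vocabs: int, input_size: int = 80, d_model: int = 256,
                 num_heads: int = 4, num_layers: int = 2):
        super().__init__()
        self.audio_proj = nn.Linear(input_size, d_model)
        self.audio_encoder = make_encoder(d_model, num_heads, num_layers)  # encodes acoustic frames
        self.label_embedding = nn.Embedding(num_vocabs, d_model)
        self.label_encoder = make_encoder(d_model, num_heads, num_layers)  # encodes the label history
        self.joint = nn.Sequential(  # combines both encodings for every (frame, label) pair
            nn.Linear(2 * d_model, d_model),
            nn.Tanh(),
            nn.Linear(d_model, num_vocabs),
        )

    def forward(self, inputs: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
        # inputs: (batch, time, input_size), targets: (batch, label_len)
        enc = self.audio_encoder(self.audio_proj(inputs))        # (B, T, D)
        dec = self.label_encoder(self.label_embedding(targets))  # (B, U, D)
        # Pair every audio frame with every label position; the RNN-T loss then
        # marginalizes over all monotonic alignments between the two sequences.
        enc = enc.unsqueeze(2).expand(-1, -1, dec.size(1), -1)   # (B, T, U, D)
        dec = dec.unsqueeze(1).expand(-1, enc.size(1), -1, -1)   # (B, T, U, D)
        return self.joint(torch.cat([enc, dec], dim=-1)).log_softmax(dim=-1)

Streaming-related details such as positional encoding and limited-context attention masks are omitted from this sketch.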

Installation

pip install -e .   
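
The editable install above assumes the repository has already been cloned locally and that the command is run from the repository root. Assuming the project is hosted at github.com/upskyy/Transformer-Transducer (inferred from the project name, not stated here), the full sequence would look like:

git clone https://github.com/upskyy/Transformer-Transducer.git
cd Transformer-Transducer
pip install -e .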

Usage

from transformer_transducer.model_builder import build_transformer_transducer
import torch

BATCH_SIZE, SEQ_LENGTH, INPUT_SIZE, NUM_VOCABS = 3, 500, 80, 10

cuda = torch.cuda.is_available()
device = torch.device('cuda' if cuda else 'cpu')

model = build_transformer_transducer(
        device,
        num_vocabs=NUM_VOCABS,
        input_size=INPUT_SIZE,
)

inputs = torch.FloatTensor(BATCH_SIZE, INPUT_SIZE, SEQ_LENGTH).to(device)  # dummy (uninitialized) input features
input_lengths = torch.IntTensor([500, 450, 350])  # valid frame count of each utterance
targets = torch.LongTensor([[1, 3, 3, 3, 3, 3, 4, 5, 6, 2],  # dummy label sequences, zero-padded
                            [1, 3, 3, 3, 3, 3, 4, 5, 2, 0],
                            [1, 3, 3, 3, 3, 3, 4, 2, 0, 0]]).to(device)
target_lengths = torch.LongTensor([9, 8, 7])

# Forward propagate
outputs = model(inputs, input_lengths, targets, target_lengths)

# Recognize input speech
outputs = model.recognize(inputs, input_lengths)
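
This README covers only the forward pass and recognition; the loss and training loop live in openspeech. Purely as a hedged sketch, if the forward output follows the usual RNN-T joint shape of (batch, T, U, num_vocabs) and the length and blank-token conventions line up, it could be plugged into torchaudio's RNN-T loss roughly as follows (the blank id, the dropped leading token, and the assumption that input_lengths match the output's time dimension are guesses, not this repository's documented interface):

import torchaudio.functional as audio_F

# Assumption: `outputs` is (batch, T, U, num_vocabs). If the audio encoder
# subsamples frames, logit_lengths must be the reduced lengths instead of
# the raw input_lengths used here.
loss = audio_F.rnnt_loss(
    logits=outputs.float(),
    targets=targets[:, 1:].int().contiguous(),  # assumed: drop the leading special token
    logit_lengths=input_lengths.int().to(outputs.device),
    target_lengths=target_lengths.int().to(outputs.device),
    blank=0,                                    # assumed blank id
    reduction="mean",
)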

Reference

Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss (Qian Zhang et al., ICASSP 2020)

License

Copyright 2021 Sangchun Ha.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.