Edresson / Wav2Vec-Wrapper

License: Apache-2.0
Wav2Vec-Wrapper

An easy way to fine-tune Wav2Vec 2.0 for low-resource languages.

Pretrained Models and Reproducibility

| Paper | Description | Instructions |
|---|---|---|
| CORAA | Checkpoints for the paper: "CORAA: a large corpus of spontaneous and prepared speech manually validated for speech recognition in Brazilian Portuguese". More details here | link |
| SE&R-Challenge | Fine-tuning instructions for the ASR for Spontaneous and Prepared Speech, and Speech Emotion Recognition Shared task. More details here | link |
| YourTTS2ASR | Checkpoints for the paper: "A single speaker is almost all you need for automatic speech recognition". More details here | link |

Installation

Clone the repository and install the Python dependencies from inside it:

git clone https://github.com/Edresson/Wav2Vec-Wrapper
cd Wav2Vec-Wrapper
pip3 install -r requeriments.txt

Install Flashlight dependencies to use KenLM

Use Docker:

In the Wav2Vec-Wrapper repository execute:

nvidia-docker build ./ -t huggingface_flashlight

Next, look up the ID of the Docker image you just built:

docker images

Replace IMAGE_ID with that ID and run:

nvidia-docker run --runtime=nvidia -v ~/:/mnt/ --rm --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 --name wav2vec-wrapper -it IMAGE_ID bash

Manual installation:

Please see the Flashlight documentation here

Inference

You can run inference on a folder of wav files by running:

python3 test.py --config_path ./example/config_eval.json --checkpoint_path_or_name facebook/wav2vec2-large-xlsr-53-french --audio_path ../wavs/ --no_kenlm
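For readers who want to see roughly what LM-free inference involves, here is a minimal sketch that transcribes a folder of wav files with the Hugging Face transformers API directly. It is not the code in test.py: the function names list_wavs and transcribe_folder are hypothetical, and the sketch assumes greedy (argmax) CTC decoding, 16 kHz mono input, and that torch, torchaudio, and transformers are installed.

```python
from pathlib import Path


def list_wavs(folder):
    """Return the .wav files directly inside `folder`, sorted by path."""
    return sorted(p for p in Path(folder).iterdir() if p.suffix.lower() == ".wav")


def transcribe_folder(folder, model_name="facebook/wav2vec2-large-xlsr-53-french"):
    """Greedy CTC transcription of every wav file in `folder` (assumed behaviour,
    not necessarily what test.py does internally)."""
    # Heavy imports are kept local so the file listing helper works without them.
    import torch
    import torchaudio
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(model_name)
    model = Wav2Vec2ForCTC.from_pretrained(model_name).eval()

    results = {}
    for path in list_wavs(folder):
        waveform, sample_rate = torchaudio.load(str(path))
        # Resample to the 16 kHz rate the model expects, then mix down to mono.
        waveform = torchaudio.functional.resample(waveform, sample_rate, 16000)
        waveform = waveform.mean(dim=0)
        inputs = processor(waveform.numpy(), sampling_rate=16000, return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits
        # Greedy decoding: pick the most likely token per frame; the processor
        # collapses repeats and removes CTC blanks when decoding.
        predicted_ids = torch.argmax(logits, dim=-1)
        results[path.name] = processor.batch_decode(predicted_ids)[0]
    return results


# Example usage (downloads the checkpoint on first run):
# for name, text in transcribe_folder("../wavs/").items():
#     print(name, text)
```

The wrapper script remains the supported entry point; the sketch is only meant to make the steps behind the command visible.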

To run inference with a KenLM language model, specify the appropriate paths in the config file and remove the --no_kenlm flag.
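Before dropping --no_kenlm, it can help to confirm that the language model files referenced in the config actually exist. The sketch below is a hypothetical helper, not part of the repository. It makes no assumption about the config's key names: it walks the whole JSON tree and reports string values that look like LM artifacts (the suffix list is itself an assumption) but are missing on disk.

```python
import json
from pathlib import Path

# Assumed file suffixes for KenLM models and lexicons; adjust as needed.
LM_SUFFIXES = (".bin", ".arpa", ".lst")


def find_missing_paths(node, prefix=""):
    """Recursively collect (dotted_key, path) pairs whose file does not exist."""
    missing = []
    if isinstance(node, dict):
        for key, value in node.items():
            missing += find_missing_paths(value, f"{prefix}{key}.")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            missing += find_missing_paths(value, f"{prefix}{index}.")
    elif isinstance(node, str) and node.endswith(LM_SUFFIXES):
        if not Path(node).exists():
            missing.append((prefix.rstrip("."), node))
    return missing


def check_config(config_path):
    """Load a JSON config and return the LM-like paths that are missing."""
    with open(config_path, encoding="utf-8") as handle:
        return find_missing_paths(json.load(handle))


# Example usage:
# for key, path in check_config("./example/config_eval.json"):
#     print(f"{key}: {path} not found")
```

An empty result only means that no LM-like path is missing; it does not confirm that the config uses the keys the wrapper expects.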

To generate the lexicon.lst file, you can use the ./utils/generate-vocab.ipynb notebook on your corpus.
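As an illustration of what a lexicon file contains, the sketch below builds lexicon lines from a list of transcripts. It is not the notebook's code. It assumes the common Flashlight convention for character-based CTC models, where each line holds a word, a tab, the word spelled as space-separated characters, and a trailing "|" word-boundary token. The function name build_lexicon, the lowercasing, and the min_count parameter are assumptions.

```python
from collections import Counter


def build_lexicon(transcripts, min_count=1):
    """Return sorted lexicon lines of the form 'word<TAB>w o r d |'.

    Words are lowercased and split on whitespace; punctuation is NOT stripped,
    so clean the transcripts first if your corpus needs it.
    """
    counts = Counter(word for line in transcripts for word in line.lower().split())
    return [
        f"{word}\t{' '.join(word)} |"
        for word in sorted(counts)
        if counts[word] >= min_count
    ]


# Example usage:
# with open("lexicon.lst", "w", encoding="utf-8") as handle:
#     handle.write("\n".join(build_lexicon(["olá mundo", "olá"])) + "\n")
```

The characters in each spelling must match the tokens in the acoustic model's vocabulary, so check the output against the checkpoint's vocabulary before decoding.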
