rupakvignesh / Lyrics-to-Audio-Alignment

Licence: other

Aligns text (lyrics) with monophonic singing voice (audio). The algorithm first uses structural segmentation to split the audio into sections, then uses hidden Markov models to align the lyrics within each section. The final alignment for each song is the concatenation of the lyric timestamps from its sections.

Programming Languages

python


perl


shell


Projects that are alternatives of or similar to Lyrics-to-Audio-Alignment

Alignmentduration

Lyrics-to-audio alignment system based on machine learning algorithms: hidden Markov models with Viterbi forced alignment. The alignment is explicitly aware of the durations of musical notes. The phonetic models are classified with an MLP deep neural network.

Stars: ✭ 36 (-36.84%)

Mutual labels: lyrics, alignment

mrivis

medical image visualization library and development toolkit

Stars: ✭ 19 (-66.67%)

Mutual labels: alignment

mmrazor

OpenMMLab Model Compression Toolbox and Benchmark.

Stars: ✭ 644 (+1029.82%)

Mutual labels: segmentation

BuddySuite

Bioinformatics toolkits for manipulating sequence, alignment, and phylogenetic tree files

Stars: ✭ 106 (+85.96%)

Mutual labels: alignment

DeepPhonemizer

Grapheme to phoneme conversion with deep learning.

Stars: ✭ 152 (+166.67%)

Mutual labels: phonemes

colorify

Colorify - C# .Net Console Library with Text Format: colors, alignment and lot more [ Win+Mac+Linux ]

Stars: ✭ 49 (-14.04%)

Mutual labels: alignment

Point2Sequence

Point2Sequence: Learning the Shape Representation of 3D Point Clouds with an Attention-based Sequence to Sequence Network

Stars: ✭ 34 (-40.35%)

Mutual labels: segmentation

cath-tools

Protein structure comparison tools such as SSAP and SNAP

Stars: ✭ 40 (-29.82%)

Mutual labels: alignment

pcan

Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation, NeurIPS 2021 Spotlight

Stars: ✭ 294 (+415.79%)

Mutual labels: segmentation

MiVOS

[CVPR 2021] Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion. Semi-supervised VOS as well!

Stars: ✭ 302 (+429.82%)

Mutual labels: segmentation

hair-dye

Neural Network for Dying Hair💈

Stars: ✭ 45 (-21.05%)

Mutual labels: segmentation

lyricsmaster

LyricsMaster is a library for downloading lyrics from multiple lyrics providers.

Stars: ✭ 18 (-68.42%)

Mutual labels: lyrics

color-pop

🌈 Automatic Color Pop effect on any image inspired by Google Photos

Stars: ✭ 21 (-63.16%)

Mutual labels: segmentation

FluentDNA

FluentDNA allows you to browse sequence data of any size using a zooming visualization similar to Google Maps. You can use FluentDNA as a standalone program or as a python module for your own bioinformatics projects.

Stars: ✭ 52 (-8.77%)

Mutual labels: alignment

dcsp segmentation

No description or website provided.

Stars: ✭ 34 (-40.35%)

Mutual labels: segmentation

superpixelRefinement

Superpixel-based Refinement for Object Proposal Generation (ICPR 2020)

Stars: ✭ 24 (-57.89%)

Mutual labels: segmentation

reveal

Graph based multi genome aligner

Stars: ✭ 39 (-31.58%)

Mutual labels: alignment

payment alipay

odoo alipay module

Stars: ✭ 27 (-52.63%)

Mutual labels: alignment

CarND-Detect-Lane-Lines-And-Vehicles

Use segmentation networks to recognize lane lines and vehicles. Infer position and curvature of lane lines relative to self.

Stars: ✭ 66 (+15.79%)

Mutual labels: segmentation

uoais

Codes of paper "Unseen Object Amodal Instance Segmentation via Hierarchical Occlusion Modeling", ICRA 2022

Stars: ✭ 77 (+35.09%)

Mutual labels: segmentation


Lyrics-to-Audio-Alignment

This project aims to automatically align textual lyrics with monophonic singing vocals (audio). Such a system is useful in settings like karaoke, where a performer wants to stay in sync with the background track. Traditional hidden Markov models are used for phoneme modelling, and a structural segmentation approach is explored to break the audio (usually 4-5 minutes long) into smaller, structurally meaningful chunks (intro, verse, chorus, etc.) without any implicit assumptions.
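The final step of the pipeline described above, joining per-segment alignments into one song-level alignment, can be sketched as follows. This is a minimal illustration rather than the project's own code: the function name, the tuple layout, and the example timings are all assumptions made for this sketch.

```python
from typing import List, Tuple

# (start_sec, end_sec, word), timed relative to the start of a segment.
Alignment = List[Tuple[float, float, str]]


def merge_segment_alignments(
    segment_starts: List[float], segment_alignments: List[Alignment]
) -> Alignment:
    """Shift each segment's local timestamps by the segment's start time
    and concatenate the results into one song-level alignment."""
    if len(segment_starts) != len(segment_alignments):
        raise ValueError("need exactly one start time per segment")
    merged: Alignment = []
    for offset, alignment in zip(segment_starts, segment_alignments):
        for start, end, word in alignment:
            merged.append((offset + start, offset + end, word))
    return merged


# Hypothetical song with two segments starting at 0 s and 30 s.
verse = [(0.0, 0.5, "hello"), (0.5, 1.25, "world")]
chorus = [(0.25, 1.0, "sing"), (1.0, 2.0, "along")]
song = merge_segment_alignments([0.0, 30.0], [verse, chorus])
```

Because each segment is aligned independently, an alignment error in one segment cannot drift into the next one; only the segment start times tie the pieces together.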

Watch the Demo

Pre-requisites

[HTK tool-kit](http://htk.eng.cam.ac.uk/download.shtml)
[sph2pipe](https://www.ldc.upenn.edu/language-resources/tools/sphere-conversion-tools)
[Flite](http://www.speech.cs.cmu.edu/flite/download.html)
[MSAF](https://github.com/urinieto/msaf/releases)

Training Steps

Training Acoustic models

TIMIT

Create initial HMM models (isolated phoneme training)

tcsh scripts/model_gen.sh <phonelist> <proto_file>

Create connected HMM models (embedded re-estimation)

tcsh scripts/embedded_reestimation.sh <iterations>
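For readers unfamiliar with HTK, embedded re-estimation is typically run as repeated passes of the HERest tool, each pass reading the models written by the previous one. The sketch below only builds the command lines for such a loop. The file and directory names (config, phones.mlf, train.scp, monophones, hmm0, hmm1, ...) are assumptions borrowed from the usual HTK tutorial layout, and the repository's own script may be organised differently.

```python
from typing import List


def herest_commands(
    iterations: int,
    hmm_list: str = "monophones",  # hypothetical list of phoneme model names
    mlf: str = "phones.mlf",       # hypothetical phone-level master label file
    scp: str = "train.scp",        # hypothetical list of training feature files
    config: str = "config",        # hypothetical HTK configuration file
) -> List[List[str]]:
    """Return one HERest argv per pass; pass i reads hmm{i-1}/ and writes hmm{i}/."""
    commands = []
    for i in range(1, iterations + 1):
        src, dst = f"hmm{i - 1}", f"hmm{i}"
        commands.append([
            "HERest", "-C", config, "-I", mlf, "-S", scp,
            "-H", f"{src}/macros", "-H", f"{src}/hmmdefs",
            "-M", dst, hmm_list,
        ])
    return commands


for argv in herest_commands(3):
    print(" ".join(argv))
```

Each command could be executed with `subprocess.run(argv, check=True)` once HTK is installed and the output directories exist.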

DAMP

Align the DAMP dataset with the generated HMM models using forced Viterbi alignment.
Perform embedded re-estimation on the DAMP dataset to refine the phoneme models.
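To make the forced Viterbi alignment step concrete, here is a toy version in pure Python. It is a simplified sketch, not what HTK does internally: it assumes the state (phoneme) sequence is known, allows only "stay" or "advance by one" transitions, and ignores transition probabilities. The function name and the example scores are invented for illustration.

```python
from typing import List

NEG_INF = float("-inf")


def forced_align(log_likes: List[List[float]]) -> List[int]:
    """log_likes[t][s] is the log-likelihood of frame t under state s of the
    known state sequence. Returns the best state index for every frame, for a
    path that starts in state 0, ends in the last state, and at each frame
    either stays in its state or advances by exactly one."""
    n_frames, n_states = len(log_likes), len(log_likes[0])
    if n_frames < n_states:
        raise ValueError("need at least one frame per state")
    score = [[NEG_INF] * n_states for _ in range(n_frames)]
    advanced = [[False] * n_states for _ in range(n_frames)]
    score[0][0] = log_likes[0][0]
    for t in range(1, n_frames):
        for s in range(n_states):
            stay = score[t - 1][s]
            advance = score[t - 1][s - 1] if s > 0 else NEG_INF
            best = max(stay, advance)
            if best == NEG_INF:
                continue  # state s cannot be reached by frame t
            advanced[t][s] = advance > stay
            score[t][s] = best + log_likes[t][s]
    # Backtrack from the final state at the last frame.
    s = n_states - 1
    path = [s]
    for t in range(n_frames - 1, 0, -1):
        if advanced[t][s]:
            s -= 1
        path.append(s)
    path.reverse()
    return path


# Two states, five frames: the first two frames fit state 0 best.
scores = [[-1, -5], [-1, -5], [-5, -1], [-5, -1], [-5, -1]]
alignment = forced_align(scores)  # [0, 0, 1, 1, 1]
```

The frame index at which the path changes state gives the phoneme boundary; multiplying it by the frame hop size converts it to a timestamp.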

Structural Segmentation

Use the MSAF library to segment the DAMP training data into structural segments

python scripts/msaf_segmentation.py <wav_in_dir> <wav_out_dir>
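Once a segmenter such as MSAF has produced boundary times, cutting the audio into one file per segment needs only the standard library. The sketch below assumes the boundaries are already available as a list of seconds; the function name and the output naming scheme are hypothetical and not taken from the repository's script.

```python
import wave
from pathlib import Path
from typing import List


def split_wav(wav_path: str, boundaries_sec: List[float], out_dir: str) -> List[Path]:
    """Write one WAV file per consecutive pair of boundary times."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    with wave.open(wav_path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        total = src.getnframes()
        for i, (start, end) in enumerate(zip(boundaries_sec, boundaries_sec[1:])):
            first = min(int(start * rate), total)
            last = min(int(end * rate), total)
            src.setpos(first)
            frames = src.readframes(max(0, last - first))
            path = out / f"{Path(wav_path).stem}_seg{i:02d}.wav"
            with wave.open(str(path), "wb") as dst:
                dst.setparams(params)  # the frame count is corrected on close
                dst.writeframes(frames)
            written.append(path)
    return written
```

A call such as `split_wav("song.wav", [0.0, 12.4, 45.1, 80.0], "segments")` would write three files, one for each interval between neighbouring boundaries.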

Create MLF files corresponding to the segmented audio

python scripts/msaf_to_mlf.py <labfile_list>
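An HTK master label file (MLF) is plain text: a `#!MLF!#` header, then for each utterance a quoted label-file pattern, one label per line, and a terminating period. The sketch below renders that layout from a dictionary. The function name and the segment naming are assumptions for illustration, and the repository's script may build its MLF differently.

```python
from typing import Dict, List


def build_mlf(transcripts: Dict[str, List[str]]) -> str:
    """Render {utterance_name: [labels]} as the text of an HTK master label file."""
    lines = ["#!MLF!#"]
    for name, labels in transcripts.items():
        lines.append(f'"*/{name}.lab"')  # pattern matching the utterance's label file
        lines.extend(labels)             # one word or phoneme label per line
        lines.append(".")                # a lone period ends the entry
    return "\n".join(lines) + "\n"


# Hypothetical segment name and lyrics.
mlf_text = build_mlf({"song01_seg00": ["HELLO", "WORLD"]})
print(mlf_text)
```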

Perform embedded re-estimation within these segments to obtain the final phoneme models.

Testing

To test a model, first run forced Viterbi alignment:

sh scripts/force_align.sh

Set parameters such as the model, features, MLF, and dictionary inside the script.
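In HTK, forced alignment is normally done with the HVite tool in alignment mode. The sketch below assembles such a command from the parameters listed above. It is an assumption about what a script of this kind usually wraps, not a copy of `force_align.sh`, and every file name in the example is hypothetical.

```python
from typing import List, Optional


def hvite_align_command(
    hmm_defs: str,
    words_mlf: str,
    out_mlf: str,
    scp: str,
    dictionary: str,
    hmm_list: str,
    config: Optional[str] = None,
) -> List[str]:
    """Build an HVite argv for forced alignment.

    -a aligns against the supplied transcriptions instead of recognising,
    -m adds model-level (phoneme) boundaries to the output,
    -H loads the HMM definitions, -I the input MLF,
    -i names the output MLF, -S lists the feature files to align."""
    cmd = ["HVite", "-a", "-m"]
    if config:
        cmd += ["-C", config]
    cmd += ["-H", hmm_defs, "-I", words_mlf, "-i", out_mlf, "-S", scp]
    cmd += [dictionary, hmm_list]  # positional arguments come last
    return cmd


# Hypothetical file names.
argv = hvite_align_command(
    "hmm9/hmmdefs", "words.mlf", "aligned.mlf", "test.scp", "dict", "monophones"
)
print(" ".join(argv))
```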

To evaluate the model's performance, compute the overlap between its output and the manually annotated ground truth.

python scripts/lab_to_lrc.py <lyrics_list>

Set the ground-truth and output folders inside the script.
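The README does not define the overlap measure, so the following is one plausible reading rather than the project's metric: the share of ground-truth word duration that the predicted interval for the same word covers. The sketch also parses HTK label lines, whose start and end times are integers in units of 100 ns. All names and example values are invented for illustration.

```python
from typing import List, Tuple

Interval = Tuple[float, float, str]  # (start_sec, end_sec, word)


def parse_htk_lab(text: str) -> List[Interval]:
    """Parse 'start end label' lines whose times are in HTK's 100 ns units."""
    intervals = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 3:
            intervals.append((int(parts[0]) / 1e7, int(parts[1]) / 1e7, parts[2]))
    return intervals


def overlap_ratio(predicted: List[Interval], reference: List[Interval]) -> float:
    """Fraction of total reference duration covered by the matching prediction."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must contain the same words")
    shared = 0.0
    total = 0.0
    for (p_start, p_end, _), (r_start, r_end, _) in zip(predicted, reference):
        shared += max(0.0, min(p_end, r_end) - max(p_start, r_start))
        total += r_end - r_start
    return shared / total if total > 0 else 0.0


# Hypothetical aligner output and manual annotation for two words.
predicted = parse_htk_lab("0 5000000 hello\n5000000 15000000 world")
reference = [(0.0, 1.0, "hello"), (1.0, 1.5, "world")]
ratio = overlap_ratio(predicted, reference)
```

Here the prediction covers 0.5 s of each reference word out of 1.5 s in total, so the ratio is about 0.67.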

Authors

Phoneme Acoustic Modelling - Rupak Vignesh
Structural Segmentation with MSAF - Benjamin Genchel

Acknowledgments

Thanks to Alex Lerch for his guidance
S Aswin Shanmugham's hybrid segmentation framework
Stanford's DAMP dataset.
