FlorianKrey / Dnc

Licence: apache-2.0
Discriminative Neural Clustering for Speaker Diarisation


Discriminative Neural Clustering (DNC) for Speaker Diarisation

This repository contains the code for our paper:

Discriminative Neural Clustering for Speaker Diarisation

Qiujia Li*, Florian Kreyssig*, Chao Zhang, Phil Woodland (* indicates equal contribution)

Overview

We propose to use encoder-decoder models for supervised clustering. This repository contains:

  • a submodule for spectral clustering, a modified version of Google's spectral clustering repository
  • a submodule for DNC using Transformers, implemented in ESPnet
  • data processing procedures for data augmentation & curriculum learning in our paper
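
In DNC, clustering becomes a sequence-to-sequence task: the encoder consumes a sequence of speaker embeddings (d-vectors) and the decoder emits one cluster label per input position. To make the target labels permutation-invariant, speakers are numbered in order of first appearance. A minimal sketch of that relabelling (the function name is illustrative, not part of this repository):

```python
def relabel_first_appearance(labels):
    """Map arbitrary speaker labels to 1, 2, ... in order of first appearance."""
    mapping = {}
    out = []
    for lab in labels:
        if lab not in mapping:
            mapping[lab] = len(mapping) + 1  # next unseen speaker gets the next id
        out.append(mapping[lab])
    return out

relabel_first_appearance(["B", "B", "A", "C", "A"])  # → [1, 1, 2, 3, 2]
```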

Dependencies

Since this repository contains two submodules, first initialise them after cloning:

git submodule update --init --recursive

Then run the following to install Miniconda and set up a virtual environment with the required packages:

cd DNC
./install.sh

Note that you may want to change the CUDA version for PyTorch in install.sh to match your GPU driver.

Data generation

First activate the virtual environment:

source venv/bin/activate

To generate training and validation data with sub-meeting length 50 and 1000 random shifts:

python3 datapreperation/gen_augment_data.py --input-scps data/train.scp --input-mlfs data/train.mlf --filtEncomp --maxlen 50 --augment 1000 --varnormalise /path/to/datadir/m50.real.augment

python3 datapreperation/gen_augment_data.py --input-scps data/dev.scp --input-mlfs data/dev.mlf --filtEncomp --maxlen 50 --augment 1000 --varnormalise /path/to/datadir/m50.real.augment
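
Conceptually, --maxlen 50 splits each meeting into sub-meetings of at most 50 speaker segments, and --augment controls how many randomly shifted copies are produced; the exact logic lives in gen_augment_data.py. A rough sketch of the splitting, with an illustrative function name and a simplified notion of "shift" (segments before the random offset are simply dropped here):

```python
import random

def split_submeetings(segments, maxlen=50, num_shifts=3, seed=0):
    """Split a meeting into sub-meetings of at most `maxlen` segments,
    once per random starting offset ("shift")."""
    rng = random.Random(seed)
    submeetings = []
    for _ in range(num_shifts):
        start = rng.randrange(maxlen)  # random offset into the meeting
        for i in range(start, len(segments), maxlen):
            submeetings.append(segments[i:i + maxlen])
    return submeetings
```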

To generate training data with sub-meeting length 50 and 1000 random shifts using the meeting randomisation:

python3 datapreperation/gen_dvecdict.py --input-scps data/train.scp --input-mlfs data/train.mlf --filtEncomp --segLenConstraint 100 --meetingLevelDict /path/to/datadir/dvecdict.meeting.split100

python3 datapreperation/gen_augment_data.py --input-scps data/train.scp --input-mlfs data/train.mlf --filtEncomp --maxlen 50 --augment 100 --varnormalise --randomspeaker  --dvectordict /path/to/datadir/dvecdict.meeting.split100/train.npz /path/to/datadir/m50.meeting.augment/

To generate evaluation data:

python3 datapreperation/gen_augment_data.py --input-scps data/eval.scp --input-mlfs data/eval.mlf --filtEncomp --maxlen 50 --varnormalise /path/to/datadir/m50.real

Training and decoding of DNC models

Train a DNC Transformer

The example setup for AMI is in

cd espnet/egs/ami/dnc1

There are multiple configuration files you may want to change:

  • model training config: config/tuning/train_transformer.yaml
  • model decoding config: config/decode.yaml
  • submission config: cmd_backend variable should be set in cmd.sh to use your preferred setup. You may also want to modify the corresponding submission settings for the queuing system, e.g. config/queue.conf for SGE or conf/slurm.conf for SLURM.

To start training, run

./run.sh --stage 4 --stop_stage 4 --train_json path/to/train.json --dev_json path/to/dev.json --tag tag.for.model --init_model path/to/model/for/initialisation

To train from scratch, omit the --init_model option. For more options, see run.sh and config/tuning/train_transformer.yaml.

To track the progress of the training, run

tail -f exp/mdm_train_pytorch_tag.for.model/train.log

Decode a DNC Transformer

Similar to the command used for training, run

./run.sh --stage 5 --decode_json path/to/eval.json --tag tag.for.model

For more options, please look into run.sh and config/decode.yaml.

The decoding results are stored, by default, across multiple JSON files exp/mdm_train_pytorch_tag.for.model/decode_dev_xxxxx/data.JOB.json, where JOB indexes the parallel decoding jobs.
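
Assuming the usual ESPnet data.json layout (an "utts" dictionary whose entries hold the decoded token string under output[0]["rec_token"] — verify this against your own files), the predicted cluster-label sequences can be collected like this:

```python
import json

def load_dnc_labels(json_path):
    """Collect decoded cluster-label sequences from an ESPnet result file.

    Assumes the common ESPnet layout: utts -> <utt-id> -> output[0] -> rec_token,
    where rec_token is a space-separated token string such as "1 1 2 <eos>".
    """
    with open(json_path) as f:
        data = json.load(f)
    labels = {}
    for utt_id, info in data["utts"].items():
        tokens = info["output"][0]["rec_token"].split()
        labels[utt_id] = [t for t in tokens if t != "<eos>"]
    return labels
```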

Running spectral clustering

To run spectral clustering on previously generated evaluation data, e.g. for sub-meeting length 50:

python3 scoring/run_spectralclustering.py --p-percentile 0.95 --custom-dist cosine --json-out /path/to/scoringdir/eval95k24.1.json  /path/to/datadir/m50.real/eval.json
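
The script wraps a modified version of Google's spectral clustering: a cosine affinity matrix is built over the d-vectors, thinned with row-wise p-percentile thresholding (--p-percentile 0.95), and the eigengap of the normalised graph Laplacian picks the number of speakers. A simplified numpy sketch of that pipeline (it zeroes sub-threshold entries instead of the soft scaling used in the full refinement chain, and the function name is illustrative):

```python
import numpy as np

def estimate_num_speakers(embeddings, p_percentile=0.95):
    """Estimate the speaker count via p-percentile thresholding + eigengap."""
    X = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    A = X @ X.T  # cosine affinity matrix
    # Row-wise p-percentile thresholding: keep only the strongest links per row.
    thresh = np.percentile(A, p_percentile * 100, axis=1, keepdims=True)
    A = np.where(A >= thresh, A, 0.0)
    A = np.maximum(A, A.T)  # re-symmetrise after row-wise thresholding
    # Normalised graph Laplacian L = I - D^{-1/2} A D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    L = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    eigvals = np.linalg.eigvalsh(L)  # ascending order
    # Eigengap heuristic: the largest jump in the spectrum marks the cluster count.
    return int(np.argmax(np.diff(eigvals))) + 1
```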

Evaluation of clustering results

First, the DNC or spectral clustering (SC) output has to be converted into the RTTM format. For SC:

python3 scoring/gen_rttm.py --input-scp data/eval.scp --js-dir /path/to/scoringdir --js-num 1 --js-name eval95k24 --rttm-name eval95k24

For DNC:

python3 scoring/gen_rttm.py --input-scp data/eval.scp --js-dir espnet/egs/ami/dnc1/exp/mdm_train_pytorch_tag.for.model/decode_dev_xxxxx/ --js-num 16 --js-name data --rttm-name evaldnc
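
The generated files follow the standard NIST RTTM format: one SPEAKER line per segment, with the file ID, channel, onset time, duration, and speaker name at fixed field positions. A minimal reader for such files:

```python
def read_rttm(path):
    """Parse SPEAKER lines of an RTTM file into (file_id, onset, duration, speaker)."""
    segments = []
    with open(path) as f:
        for line in f:
            fields = line.split()
            if not fields or fields[0] != "SPEAKER":
                continue
            # RTTM fields: SPEAKER file chnl tbeg tdur ortho stype name conf slat
            segments.append((fields[1], float(fields[3]), float(fields[4]), fields[7]))
    return segments
```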

To score the result, the reference RTTM first has to be split into the appropriate sub-meeting lengths:

python3 scoring/split_rttm.py --submeeting-rttm /path/to/scoringdir/eval95k24.rttm --input-rttm scoring/refoutputeval.rttm --output-rttm /path/to/scoringdir/reference.rttm

Finally, the speaker error rate has to be calculated using:

python3 scoring/score_rttm.py --score-rttm /path/to/scoringdir/eval95k24.rttm --ref-rttm /path/to/scoringdir/reference.rttm --output-scoredir /path/to/scoringdir/eval95k24

Reference

@misc{LiKreyssig2019DNC,
  title={Discriminative Neural Clustering for Speaker Diarisation},
  author={Li, Qiujia and Kreyssig, Florian L. and Zhang, Chao and Woodland, Philip C.},
  year={2019},
  eprint={1910.09703},
  archivePrefix={arXiv},
  url={https://arxiv.org/abs/1910.09703}
}