wq2012 / Spectralcluster

Licence: apache-2.0
Python re-implementation of the spectral clustering algorithm in the paper "Speaker Diarization with LSTM"

Spectral Clustering

Overview

This is a Python re-implementation of the spectral clustering algorithm in the paper Speaker Diarization with LSTM.

[Figure: refinement operations]

Disclaimer

This is not the original implementation used by the paper.

Specifically, this implementation uses the K-Means from scikit-learn, which does NOT support custom distance measures such as cosine distance.

Dependencies

  • numpy
  • scipy
  • scikit-learn

Installation

Install the package by:

pip3 install spectralcluster

or

python3 -m pip install spectralcluster

Tutorial

Simply use the predict() method of class SpectralClusterer to perform spectral clustering:

from spectralcluster import SpectralClusterer

clusterer = SpectralClusterer(
    min_clusters=2,
    max_clusters=100,
    p_percentile=0.95,
    gaussian_blur_sigma=1)

labels = clusterer.predict(X)

The input X is a numpy array of shape (n_samples, n_features), and the returned labels is a numpy array of shape (n_samples,).

For the complete list of parameters of the clusterer, see spectralcluster/spectral_clusterer.py.

Citations

Our paper is cited as:

@inproceedings{wang2018speaker,
  title={Speaker diarization with {LSTM}},
  author={Wang, Quan and Downey, Carlton and Wan, Li and Mansfield, Philip Andrew and Moreno, Ignacio Lopez},
  booktitle={2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  pages={5239--5243},
  year={2018},
  organization={IEEE}
}

FAQs

Laplacian matrix

Question: Why are you performing eigen-decomposition directly on the similarity matrix instead of its Laplacian matrix? (source)

Answer: No, we are not performing eigen-decomposition directly on the similarity matrix. The first of the refinement operations is CropDiagonal, which replaces each diagonal element of the similarity matrix with the maximum off-diagonal value in its row. After this operation, the matrix has properties similar to those of a standard Laplacian matrix.
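To make the CropDiagonal step concrete, here is a minimal NumPy sketch of the operation as described above. The function name crop_diagonal and the toy matrix are illustrative assumptions, not the code from this repository, and the sketch assumes a square matrix with at least two rows.

```python
import numpy as np


def crop_diagonal(affinity):
    """Return a copy of `affinity` whose diagonal entries are replaced by
    the maximum off-diagonal value of the corresponding row.

    Hypothetical helper for illustration; assumes a square matrix with n >= 2.
    """
    cropped = np.array(affinity, dtype=float)  # work on a copy
    # Temporarily mask the diagonal so it is ignored by the row-wise max.
    np.fill_diagonal(cropped, -np.inf)
    row_max = cropped.max(axis=1)
    # Write each row's off-diagonal maximum back onto the diagonal.
    np.fill_diagonal(cropped, row_max)
    return cropped


# Toy symmetric similarity matrix (made-up values).
affinity = np.array([
    [1.0, 0.2, 0.7],
    [0.2, 1.0, 0.4],
    [0.7, 0.4, 1.0],
])
cropped = crop_diagonal(affinity)
print(cropped)
```

In this toy example the diagonal changes from 1.0 everywhere to 0.7, 0.4, and 0.7, while the off-diagonal entries are left untouched.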

Question: Why don't you just use the standard Laplacian matrix?

Answer: Our Laplacian matrix is less sensitive (thus more robust) to the Gaussian blur operation.

Cosine vs. Euclidean distance

Question: Your paper says the K-Means should be based on Cosine distance, but this repository is using Euclidean distance. Do you have a Cosine distance version?

Answer: You can find a variant of this repository using Cosine distance for K-means instead of Euclidean distance here: FlorianKrey/DNC
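As background for this question, the sketch below illustrates one well-known relationship between the two distances: for L2-normalized vectors, the squared Euclidean distance equals twice the cosine distance. This is not the approach taken by this repository or by FlorianKrey/DNC; the helper l2_normalize and the random data are assumptions for illustration only. Euclidean K-means on normalized vectors is still only an approximation of cosine-based K-means, because the centroids are not renormalized.

```python
import numpy as np


def l2_normalize(X):
    """Scale each row of X to unit L2 norm (hypothetical helper)."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.maximum(norms, 1e-12)  # guard against zero rows


rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))  # made-up data: 5 samples, 4 features
U = l2_normalize(X)

a, b = U[0], U[1]
cosine_distance = 1.0 - float(a @ b)
squared_euclidean = float(np.sum((a - b) ** 2))

# For unit vectors: ||a - b||^2 = 2 * (1 - cos(a, b)).
print(squared_euclidean, 2.0 * cosine_distance)
```

The two printed values agree up to floating-point error, so nearest-neighbor rankings under the two distances coincide once the inputs are normalized.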

Misc

Our new speaker diarization systems are now fully supervised, powered by uis-rnn. Check this Google AI Blog.

To learn more about speaker diarization, here is a curated list of resources: awesome-diarization.
