matteodellamico / Flexible Clustering

Licence: bsd-3-clause
Clustering for arbitrary data and dissimilarity function

Programming Languages

python

Flexible clustering

A project for scalable hierarchical clustering, built on the Flexible, Incremental, Scalable, Hierarchical Density-Based Clustering algorithm (FISHDBC, for friends).

Please see the paper at https://arxiv.org/abs/1910.07283

Dependencies

Installation

python3 setup.py install

This project allows scalable hierarchical clustering on arbitrary data and distance measures, thanks to an approximated version of OPTICS.

Quickstart

The only compulsory argument is a dissimilarity function between arbitrary data elements; the many other configuration options are inherited from HNSW and HDBSCAN. See the HDBSCAN documentation for the meaning of the values returned by the cluster method::

import flexible_clustering

clusterer = flexible_clustering.FISHDBC(my_dissimilarity)
for elem in my_data:
    clusterer.add(elem)
labels, probs, stabilities, condensed_tree, slt, mst = clusterer.cluster()

for elem in some_new_data: # support cheap incremental clustering
    clusterer.add(elem)
# new clustering according to the newly available data
labels, probs, stabilities, condensed_tree, slt, mst = clusterer.cluster()
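
The snippet above leaves ``my_dissimilarity`` undefined. As a minimal sketch, assuming the data elements are strings, a Levenshtein edit distance could play that role. The name ``edit_distance`` is a hypothetical helper written for this illustration; it is not part of flexible_clustering.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings.

    Counts the minimum number of single-character insertions,
    deletions or substitutions needed to turn a into b.
    """
    if len(a) < len(b):
        a, b = b, a  # the distance is symmetric; keep the row short
    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
```

Following the snippet above, such a function would be passed as ``flexible_clustering.FISHDBC(edit_distance)``. Any callable taking two elements and returning a number can be substituted in the same way.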

Make sure to run everything from outside the source directory, so that Python imports the installed package rather than the local source tree.
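
Once ``cluster()`` has returned, the labels usually need to be mapped back to the data. The sketch below rests on two assumptions that should be checked against the HDBSCAN documentation: that ``labels[i]`` refers to the i-th element added, and that ``-1`` marks noise as in HDBSCAN. ``group_by_label`` is a hypothetical helper, and the sample data is made up so that the block runs without the library installed.

```python
from collections import defaultdict


def group_by_label(elements, labels, noise_label=-1):
    """Group elements by cluster label, keeping noise points apart.

    Assumes labels[i] belongs to elements[i] and that noise_label
    marks unclustered points (the HDBSCAN convention).
    """
    clusters = defaultdict(list)
    noise = []
    for elem, label in zip(elements, labels):
        if label == noise_label:
            noise.append(elem)
        else:
            clusters[int(label)].append(elem)  # int() also accepts NumPy integers
    return dict(clusters), noise


# Made-up stand-ins for my_data and for the labels returned by cluster().
elements = ["apple", "apples", "banana", "bananas", "zzz"]
labels = [0, 0, 1, 1, -1]

clusters, noise = group_by_label(elements, labels)
print(clusters)
print(noise)
```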

Demo/Example

See the fishdbc_example.py file for a fuller example (it requires matplotlib to run).

Want More Info?

Send me an email at [email protected]. I'll improve the docs as and if people use this.

Author

Matteo Dell'Amico

Copyright

BSD 3-clause; see the LICENSE file.
