
joshlk / K Means Constrained

Licence: bsd-3-clause
K-Means clustering - constrained with minimum and maximum cluster size

Projects that are alternatives of or similar to K Means Constrained

Model Optimization
A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.
Stars: ✭ 992 (+2906.06%)
Mutual labels:  ml, optimization
Tribuo
Tribuo - A Java machine learning library
Stars: ✭ 882 (+2572.73%)
Mutual labels:  ml, clustering
Timeseriesclustering.jl
Julia implementation of unsupervised learning methods for time series datasets. It provides functionality for clustering and aggregating, detecting motifs, and quantifying similarity between time series datasets.
Stars: ✭ 49 (+48.48%)
Mutual labels:  clustering, optimization
Advisor
Open-source implementation of Google Vizier for hyperparameter tuning
Stars: ✭ 1,359 (+4018.18%)
Mutual labels:  ml, optimization
Ml Dl Scripts
The repository provides useful Python scripts for ML and data analysis
Stars: ✭ 119 (+260.61%)
Mutual labels:  ml, clustering
Hyperparameter hunter
Easy hyperparameter optimization and automatic result saving across machine learning algorithms and libraries
Stars: ✭ 648 (+1863.64%)
Mutual labels:  ml, optimization
Pycaret
An open-source, low-code machine learning library in Python
Stars: ✭ 4,594 (+13821.21%)
Mutual labels:  ml, clustering
Wheels
Performance-optimized wheels for TensorFlow (SSE, AVX, FMA, XLA, MPI)
Stars: ✭ 891 (+2600%)
Mutual labels:  ml, optimization
Data mining
The Ruby DataMining Gem, is a little collection of several Data-Mining-Algorithms
Stars: ✭ 10 (-69.7%)
Mutual labels:  clustering
Kitti Track Collection
Data and devtools for the "Large-Scale Object Discovery and Detector Adaptation from Unlabeled Video" paper.
Stars: ✭ 20 (-39.39%)
Mutual labels:  clustering
Desdeo
An open source framework for interactive multiobjective optimization methods
Stars: ✭ 8 (-75.76%)
Mutual labels:  optimization
Cutest.jl
Julia's CUTEst Interface
Stars: ✭ 10 (-69.7%)
Mutual labels:  optimization
Okalgo
Idiomatic Kotlin extensions for ojAlgo
Stars: ✭ 20 (-39.39%)
Mutual labels:  optimization
Attention Ocr
A Tensorflow model for text recognition (CNN + seq2seq with visual attention) available as a Python package and compatible with Google Cloud ML Engine.
Stars: ✭ 844 (+2457.58%)
Mutual labels:  ml
Tsp solver
Solving the TSP (travelling salesman problem) using the ruin & recreate method.
Stars: ✭ 29 (-12.12%)
Mutual labels:  optimization
Bfgs Neldermead Trustregion
Python implementation of some numerical (optimization) methods
Stars: ✭ 8 (-75.76%)
Mutual labels:  optimization
Awesome Seo
Google SEO research and traffic monetization
Stars: ✭ 942 (+2754.55%)
Mutual labels:  optimization
Events
Repository for *SEM Paper on Event Coreference Resolution in ECB+
Stars: ✭ 20 (-39.39%)
Mutual labels:  clustering
Rl Baselines Zoo
A collection of 100+ pre-trained RL agents using Stable Baselines, training and hyperparameter optimization included.
Stars: ✭ 839 (+2442.42%)
Mutual labels:  optimization
Clustering
fast clustering algorithms
Stars: ✭ 14 (-57.58%)
Mutual labels:  clustering


k-means-constrained

K-means clustering implementation whereby a minimum and/or maximum size for each cluster can be specified.

This K-means implementation modifies the cluster assignment step (the E step in EM) by formulating it as a Minimum Cost Flow (MCF) linear network optimisation problem. The problem is solved with a cost-scaling push-relabel algorithm, using SimpleMinCostFlow from Google's Operations Research tools, which is a fast C++ implementation.

This package is inspired by Bradley et al. The original Minimum Cost Flow (MCF) network proposed by Bradley et al. has been modified so that maximum cluster sizes can be specified along with minimum cluster sizes.
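
To make the flow formulation concrete, here is a minimal pure-Python sketch of a size-constrained assignment step. It is not the package's code: the function name `constrained_assign`, the node layout, and the "overflow" node are assumptions made for illustration, and it swaps in a simple successive-shortest-path solver in place of the cost-scaling push-relabel algorithm. Each point supplies one unit of flow, each cluster has an edge to the sink with capacity `size_min`, and extra members pass through an overflow node whose edges cap each cluster at `size_max`. Costs are squared Euclidean distances, so integer coordinates keep them exact.

```python
from collections import deque


def constrained_assign(points, centers, size_min, size_max):
    """Assign each point to one center subject to cluster size bounds.

    Illustrative sketch: uses successive shortest paths, not the
    cost-scaling push-relabel solver named in the text.
    """
    n, k = len(points), len(centers)
    if k * size_min > n or k * size_max < n:
        raise ValueError("size bounds are infeasible for this many points")

    # Node layout: source, n point nodes, k cluster nodes, overflow, sink.
    source, overflow, sink = 0, n + k + 1, n + k + 2
    graph = [[] for _ in range(n + k + 3)]  # per-node lists of edge ids
    head, cap, cost = [], [], []  # edge e and its reverse e ^ 1 are adjacent

    def add_edge(u, v, capacity, weight):
        # Store the forward edge at an even id and its residual twin right after.
        for a, b, c, w in ((u, v, capacity, weight), (v, u, 0, -weight)):
            graph[a].append(len(head))
            head.append(b)
            cap.append(c)
            cost.append(w)

    for i, p in enumerate(points):
        add_edge(source, 1 + i, 1, 0)  # each point supplies one unit of flow
        for j, c in enumerate(centers):
            sq_dist = sum((a - b) ** 2 for a, b in zip(p, c))
            add_edge(1 + i, 1 + n + j, 1, sq_dist)
    for j in range(k):
        add_edge(1 + n + j, sink, size_min, 0)  # saturated once all n units flow
        add_edge(1 + n + j, overflow, size_max - size_min, 0)  # optional extras
    add_edge(overflow, sink, n - k * size_min, 0)

    for _ in range(n):  # every augmenting path carries exactly one point
        dist = [float("inf")] * len(graph)
        prev_edge = [-1] * len(graph)
        in_queue = [False] * len(graph)
        dist[source] = 0
        queue = deque([source])
        while queue:  # queue-based Bellman-Ford over the residual graph
            u = queue.popleft()
            in_queue[u] = False
            for e in graph[u]:
                v = head[e]
                if cap[e] > 0 and dist[u] + cost[e] < dist[v]:
                    dist[v] = dist[u] + cost[e]
                    prev_edge[v] = e
                    if not in_queue[v]:
                        queue.append(v)
                        in_queue[v] = True
        v = sink
        while v != source:  # push one unit back along the cheapest path
            e = prev_edge[v]
            cap[e] -= 1
            cap[e ^ 1] += 1
            v = head[e ^ 1]

    labels = []
    for i in range(n):
        for e in graph[1 + i]:
            if e % 2 == 0 and cap[e] == 0:  # saturated point -> cluster edge
                labels.append(head[e] - (1 + n))
    return labels


points = [[1, 2], [1, 4], [1, 0], [4, 2], [4, 4], [4, 0]]
print(constrained_assign(points, [[1, 2], [4, 2]], size_min=2, size_max=5))
# prints [0, 0, 0, 1, 1, 1]
```

Because the capacities into the sink sum to exactly the number of points, routing every point forces each cluster's `size_min` edge to be full, which is what enforces the minimum size. A full K-means loop would alternate this step with recomputing the centers.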

The code is based on scikit-learn's KMeans and implements the same API with modifications.

Ref:

  1. Bradley, P. S., K. P. Bennett, and Ayhan Demiriz. "Constrained k-means clustering." Microsoft Research, Redmond (2000): 1-8.
  2. Google's SimpleMinCostFlow C++ implementation

Installation

You can install k-means-constrained from PyPI:

pip install k-means-constrained

It is supported on Python 3.6 and above.

Example

More details can be found in the API documentation.

>>> from k_means_constrained import KMeansConstrained
>>> import numpy as np
>>> X = np.array([[1, 2], [1, 4], [1, 0],
...                [4, 2], [4, 4], [4, 0]])
>>> clf = KMeansConstrained(
...     n_clusters=2,
...     size_min=2,
...     size_max=5,
...     random_state=0
... )
>>> clf.fit_predict(X)
array([0, 0, 0, 1, 1, 1], dtype=int32)
>>> clf.cluster_centers_
array([[ 1.,  2.],
       [ 4.,  2.]])
>>> clf.labels_
array([0, 0, 0, 1, 1, 1], dtype=int32)
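
As a quick sanity check, the labels returned above can be counted to confirm that every cluster respects the requested bounds. The helper below is a hypothetical sketch, not part of the package's API, and it works on a plain list copied from the output so that it runs without any dependencies.

```python
from collections import Counter


def cluster_sizes_ok(labels, n_clusters, size_min, size_max):
    """Return True if every cluster's size lies within [size_min, size_max]."""
    counts = Counter(int(label) for label in labels)
    # A Counter returns 0 for missing keys, so empty clusters are checked too.
    return all(size_min <= counts[c] <= size_max for c in range(n_clusters))


labels = [0, 0, 0, 1, 1, 1]  # copied from the clf.labels_ output above
print(cluster_sizes_ok(labels, n_clusters=2, size_min=2, size_max=5))
# prints True
```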