cmtt / kmpp

Licence: other

k-means clustering algorithm with k-means++ initialization.

Projects that are alternatives of or similar to kmpp

Clustering methods in machine learning, with both theory and Python code for each algorithm. Algorithms include K-Means, K-Modes, Hierarchical, DBSCAN and Gaussian Mixture Model (GMM). Interview questions on clustering are included at the end.

Stars: ✭ 27 (-3.57%)

Mutual labels: clustering-algorithm, kmeans-algorithm

skmeans

Super fast, simple k-means implementation for unidimensional and multidimensional data.

Stars: ✭ 59 (+110.71%)

Mutual labels: kmeans-algorithm

genieclust

Genie++ Fast and Robust Hierarchical Clustering with Noise Point Detection - for Python and R

Stars: ✭ 34 (+21.43%)

Mutual labels: clustering-algorithm

Clustering4Ever

C4E, a JVM friendly library written in Scala for both local and distributed (Spark) Clustering.

Stars: ✭ 126 (+350%)

Mutual labels: clustering-algorithm

cyoptics-clustering

Fast OPTICS clustering in Cython + gradient cluster extraction

Stars: ✭ 23 (-17.86%)

Mutual labels: clustering-algorithm

Statistical-Learning-using-R

A Statistical Learning application consisting of various Machine Learning algorithms, their implementation in R, and their in-depth interpretation. Documents and reports related to these techniques can be found on the author's RPubs profile.

Stars: ✭ 27 (-3.57%)

Mutual labels: clustering-algorithm

clope

Elixir implementation of CLOPE: A Fast and Effective Clustering Algorithm for Transactional Data

Stars: ✭ 18 (-35.71%)

Mutual labels: clustering-algorithm

Genetic-Algorithm-on-K-Means-Clustering

Implementing Genetic Algorithm on K-Means and compare with K-Means++

Stars: ✭ 37 (+32.14%)

Mutual labels: clustering-algorithm

Project17-C-Map

POI clustering interaction development using the Map SDK.

Stars: ✭ 41 (+46.43%)

Mutual labels: clustering-algorithm

tsp-essay

A fun study of some heuristics for the Travelling Salesman Problem.

Stars: ✭ 15 (-46.43%)

Mutual labels: kmeans-algorithm

clueminer

interactive clustering platform

Stars: ✭ 13 (-53.57%)

Mutual labels: clustering-algorithm

Hdbscan

A high performance implementation of HDBSCAN clustering.

Stars: ✭ 2,032 (+7157.14%)

Mutual labels: clustering-algorithm

Project17-B-Map

POI clustering interaction development using the Map SDK.

Stars: ✭ 78 (+178.57%)

Mutual labels: clustering-algorithm

Clustering

Implements "Clustering a Million Faces by Identity"

Stars: ✭ 128 (+357.14%)

Mutual labels: clustering-algorithm

Study-of-David-Mackay-s-book-

Review of David Mackay's book, with worked problems, Python code and Mathematica files

Stars: ✭ 46 (+64.29%)

Mutual labels: clustering-algorithm

ST-DBSCAN

Implementation of ST-DBSCAN algorithm based on Birant 2007

Stars: ✭ 25 (-10.71%)

Mutual labels: clustering-algorithm

clustering-python

Different clustering approaches applied on different problemsets

Stars: ✭ 36 (+28.57%)

Mutual labels: clustering-algorithm

kmeans-dbscan-tutorial

A clustering tutorial with scikit-learn for beginners.

Stars: ✭ 20 (-28.57%)

Mutual labels: clustering-algorithm

neural clustering process

Implementation of the Neural Clustering Process algorithm in Pytorch

Stars: ✭ 24 (-14.29%)

Mutual labels: clustering-algorithm

kelp-core

www.kelp-ml.org

Stars: ✭ 19 (-32.14%)

Mutual labels: clustering-algorithm


kmpp

Clustering algorithms group large sets of data points. The k-means algorithm partitions n data points into k clusters and finds the centroids of these clusters iteratively.

The algorithm assigns each data point to the cluster with the closest centroid and then recalculates the centroid of each cluster. These two steps are repeated until the centroids no longer change.
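The two alternating steps can be sketched in plain JavaScript. This is a minimal illustration under stated assumptions, not kmpp's actual source: the function names (`squaredDistance`, `assignPoints`, `updateCentroids`, `lloyd`) are hypothetical, the distance is fixed to squared Euclidean, and the loop stops when assignments stop changing, which implies the centroids have stopped changing too.

```javascript
// Squared Euclidean distance between two points of equal dimension.
function squaredDistance(a, b) {
  var sum = 0;
  for (var i = 0; i < a.length; i++) {
    var d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}

// Assignment step: index of the closest centroid for every point.
// Ties go to the centroid with the lower index.
function assignPoints(points, centroids) {
  return points.map(function (p) {
    var best = 0;
    for (var c = 1; c < centroids.length; c++) {
      if (squaredDistance(p, centroids[c]) < squaredDistance(p, centroids[best])) {
        best = c;
      }
    }
    return best;
  });
}

// Update step: each centroid becomes the mean of its assigned points.
// A centroid with no assigned points keeps its previous position.
function updateCentroids(points, assignments, centroids) {
  return centroids.map(function (old, c) {
    var sum = old.map(function () { return 0; });
    var count = 0;
    points.forEach(function (p, i) {
      if (assignments[i] !== c) return;
      count++;
      for (var d = 0; d < p.length; d++) sum[d] += p[d];
    });
    return count === 0 ? old.slice() : sum.map(function (s) { return s / count; });
  });
}

// Repeat both steps until the assignments stop changing or the
// iteration limit is reached.
function lloyd(points, initialCentroids, maxIterations) {
  var centroids = initialCentroids;
  var assignments = [];
  for (var iter = 1; iter <= maxIterations; iter++) {
    var next = assignPoints(points, centroids);
    var changed = next.some(function (a, i) { return a !== assignments[i]; });
    assignments = next;
    if (!changed) {
      return { converged: true, centroids: centroids, assignments: assignments, iterations: iter };
    }
    centroids = updateCentroids(points, assignments, centroids);
  }
  return { converged: false, centroids: centroids, assignments: assignments, iterations: maxIterations };
}

var demo = lloyd([[0, 0], [0, 1], [10, 10], [10, 11]], [[0, 0], [10, 10]], 100);
```

In this sketch the shape of the returned object only loosely mirrors the library's documented result; it is chosen for readability.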

The basic k-means algorithm is initialized with k centroids at random positions. This implementation uses the k-means++ algorithm to address some disadvantages of arbitrary initialization (see "Further reading" at the end).
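For orientation, k-means++ seeding picks the first centroid uniformly at random from the points, then picks each further centroid with probability proportional to its squared distance from the nearest centroid chosen so far. The sketch below illustrates that idea only; it is not taken from kmpp's source. The names `sqDist` and `seedCentroids`, and the injectable `random` parameter (used here to make the example reproducible), are assumptions.

```javascript
// Squared Euclidean distance between two points.
function sqDist(a, b) {
  return a.reduce(function (sum, ai, i) {
    var d = ai - b[i];
    return sum + d * d;
  }, 0);
}

// Pick k initial centroids from `points` using k-means++ seeding.
// `random` is a hypothetical hook for a random source returning numbers
// in [0, 1); it defaults to Math.random.
function seedCentroids(points, k, random) {
  random = random || Math.random;
  // First centroid: chosen uniformly at random from the points.
  var centroids = [points[Math.floor(random() * points.length)].slice()];
  // Squared distance from each point to its nearest chosen centroid.
  var nearest = points.map(function (p) { return sqDist(p, centroids[0]); });

  while (centroids.length < k) {
    var total = nearest.reduce(function (s, d) { return s + d; }, 0);
    // Sample index i with probability nearest[i] / total.
    var threshold = random() * total;
    var chosen = nearest.length - 1;
    var cumulative = 0;
    for (var i = 0; i < nearest.length; i++) {
      cumulative += nearest[i];
      if (cumulative > threshold) { chosen = i; break; }
    }
    var next = points[chosen].slice();
    centroids.push(next);
    // Refresh the nearest-centroid distances with the new centroid.
    nearest = nearest.map(function (d, j) {
      return Math.min(d, sqDist(points[j], next));
    });
  }
  return centroids;
}

// Reproducible run with a scripted random source.
var scripted = [0, 0.5];
var cursor = 0;
var seeds = seedCentroids([[0, 0], [1, 0], [10, 0]], 2, function () {
  return scripted[cursor++];
});
```

Because already-chosen points have a weight of zero, they are not picked again while any other point remains at a positive distance, which spreads the initial centroids across the data.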

Installation

Installing via npm

Install kmpp as a Node.js module via npm:

$ npm install kmpp

Example

var kmpp = require('kmpp');

// Each point is an array of coordinates; all points share one dimension.
kmpp([
  [x1, y1, ...],
  [x2, y2, ...],
  [x3, y3, ...],
  ...
], {
  k: 3 // number of clusters; matches the three centroids shown below
});

// =>
// { converged: true,
//   centroids: [[xm1, ym1, ...], [xm2, ym2, ...], [xm3, ym3, ...]],
//   counts: [ 7, 6, 7 ],
//   assignments: [ 2, 2, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1 ]
// }

API

`kmpp(points[, opts])`

Executes the k-means++ algorithm on points.

Arguments:

- points (Array): An array-of-arrays containing the points in format [[x1, y1, ...], [x2, y2, ...], [x3, y3, ...], ...]
- opts (Object): Object containing configuration parameters. Parameters are:
  - distance (function): Optional function that takes two points and returns the distance between them.
  - initialize (Boolean): Perform initialization. If false, uses the initial state provided in centroids and assignments. Otherwise discards any initial state and performs initialization.
  - k (Number): Number of centroids. If not provided, sqrt(n / 2) is used, where n is the number of points.
  - kmpp (Boolean, default: true): If true, uses k-means++ initialization. Otherwise uses naive random assignment.
  - maxIterations (Number, default: 100): Maximum allowed number of iterations.
  - norm (Number, default: 2): L-norm used for distance computation. 1 is Manhattan norm, 2 is Euclidean norm. Ignored if distance function is provided.
  - centroids (Array): An array of centroids. If initialize is false, used as initialization for the algorithm, otherwise overwritten in-place if of the correct size.
  - assignments (Array): An array of assignments. Used for initialization, otherwise overwritten.
  - counts (Array): An output array used to avoid extra allocation. Values are discarded and overwritten.
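To make the norm and distance options more concrete, the sketch below shows what an L-norm distance of order p computes, plus one metric that no finite norm value expresses and that would therefore need to be supplied as a custom distance function. Both functions are illustrative assumptions with hypothetical names (`normDistance`, `chebyshev`); they are not kmpp's internal implementation.

```javascript
// L-norm (Minkowski) distance of order p between two points.
// p = 1 gives the Manhattan distance, p = 2 the Euclidean distance.
function normDistance(a, b, p) {
  var sum = 0;
  for (var i = 0; i < a.length; i++) {
    sum += Math.pow(Math.abs(a[i] - b[i]), p);
  }
  return Math.pow(sum, 1 / p);
}

// Chebyshev distance: the largest absolute coordinate difference.
// It takes two points and returns a number, which is the shape the
// distance option is documented to expect.
function chebyshev(a, b) {
  var max = 0;
  for (var i = 0; i < a.length; i++) {
    max = Math.max(max, Math.abs(a[i] - b[i]));
  }
  return max;
}

var manhattan = normDistance([0, 0], [3, 4], 1); // |3| + |4|
var euclidean = normDistance([0, 0], [3, 4], 2); // sqrt(9 + 16)
var maxCoordinate = chebyshev([0, 0], [3, 4]);   // max(3, 4)
```

Under the documented options, a function like `chebyshev` would be passed as `{ k: 3, distance: chebyshev }`, in which case norm is ignored. As a worked instance of the default for k: with n = 20 points, sqrt(20 / 2) is about 3.16.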

Returns an object containing information about the centroids and point assignments. Values are:

- converged: true if the algorithm converged successfully
- centroids: a list of centroids
- counts: the number of points assigned to each respective centroid
- assignments: a list of integer assignments of each point to the respective centroid
- iterations: number of iterations used
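A common next step is to group the original points by cluster. The sketch below assumes that each entry of assignments is an index into centroids, as the example output suggests. `groupByCluster` is a hypothetical helper, and the result object is a hand-written stand-in with illustrative values rather than real kmpp output.

```javascript
// Group the input points by cluster, given a result object shaped like
// the return value described above.
function groupByCluster(points, result) {
  var clusters = result.centroids.map(function (centroid) {
    return { centroid: centroid, points: [] };
  });
  result.assignments.forEach(function (clusterIndex, pointIndex) {
    clusters[clusterIndex].points.push(points[pointIndex]);
  });
  return clusters;
}

// Hand-written stand-in for a kmpp result (illustrative values only).
var points = [[1, 1], [1, 2], [9, 9], [8, 9]];
var result = {
  converged: true,
  centroids: [[1, 1.5], [8.5, 9]],
  counts: [2, 2],
  assignments: [0, 0, 1, 1],
  iterations: 2
};

var clusters = groupByCluster(points, result);
// Each clusters[i].points.length equals result.counts[i].
```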

Credits

Jared Harkins improved performance by reducing the number of function calls and reverting to Manhattan distance for measurements, and improved the random initialization by choosing from the points.
Ricky Reusser refactored the API.

License

