gbuesing / kmeans-clusterer

License: MIT

k-means clustering in Ruby

KMeansClusterer

k-means clustering in Ruby. Uses NArray under the hood for fast calculations.

Jump to the examples directory to see this in action.

Features

Runs multiple clustering attempts and keeps the best result (a single run can get stuck in a non-optimal local minimum)
Initializes centroids via the k-means++ algorithm, for faster convergence
Calculates a silhouette score for evaluating cluster quality
Optionally scales data before clustering, so that output isn't biased by features on different scales
Works with high-dimensional data
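
To illustrate the k-means++ feature above, here is a minimal pure-Ruby sketch of that seeding step. It is not the gem's NArray-based code: the names `squared_distance` and `kmpp_init` and the `rng:` parameter are hypothetical, chosen only for this example. Each new centroid is sampled with probability proportional to its squared distance from the nearest centroid already chosen.

```ruby
# Squared Euclidean distance between two points (arrays of numbers).
def squared_distance(a, b)
  a.zip(b).sum { |x, y| (x - y)**2 }
end

# Pick k starting centroids from data using k-means++ seeding.
# rng is injectable so that a run can be reproduced (illustrative parameter).
def kmpp_init(data, k, rng: Random.new)
  raise ArgumentError, "need at least k distinct points" if data.uniq.size < k

  # The first centroid is chosen uniformly at random.
  centroids = [data[rng.rand(data.size)]]

  while centroids.size < k
    # Weight of each point: squared distance to its nearest chosen centroid.
    weights = data.map { |p| centroids.map { |c| squared_distance(p, c) }.min }

    # Sample one point with probability proportional to its weight.
    target = rng.rand * weights.sum
    cumulative = 0.0
    index = weights.index { |w| (cumulative += w) > target }

    # Fall back to the farthest point if rounding left no match.
    index ||= weights.index(weights.max)
    centroids << data[index]
  end

  centroids
end

data = [[40.71, -74.01], [34.05, -118.24], [39.29, -76.61],
        [45.52, -122.68], [38.9, -77.04], [36.11, -115.17]]

p kmpp_init(data, 2, rng: Random.new(1))
```

Points that are already centroids have weight zero and are never picked again, which is why the seeds tend to be spread out across the data.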

Install

gem install kmeans-clusterer

Usage

Simple example:

require 'kmeans-clusterer'

data = [[40.71,-74.01],[34.05,-118.24],[39.29,-76.61],
        [45.52,-122.68],[38.9,-77.04],[36.11,-115.17]]

labels = ['New York', 'Los Angeles', 'Baltimore', 
          'Portland', 'Washington DC', 'Las Vegas']

k = 2 # find 2 clusters in data

kmeans = KMeansClusterer.run k, data, labels: labels, runs: 5

kmeans.clusters.each do |cluster|
  labels_in_cluster = cluster.points.map(&:label).join(", ")
  puts "#{cluster.id}. #{labels_in_cluster}\t#{cluster.centroid}"
end

# Use existing clusters for prediction with new data:
predicted = kmeans.predict [[41.85,-87.65]] # Chicago
puts "\nClosest cluster to Chicago: #{predicted[0]}"

# Clustering quality score. Value between -1.0..1.0 (1.0 is best)
puts "\nSilhouette score: #{kmeans.silhouette.round(2)}"

Output of simple example:

0. New York, Baltimore, Washington DC [39.63, -75.89]
1. Los Angeles, Portland, Las Vegas [38.56, -118.7]

Closest cluster to Chicago: 0

Silhouette score: 0.91
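
For readers unfamiliar with the silhouette score printed above, the following is a rough pure-Ruby sketch of the textbook mean silhouette coefficient. It is an illustration only: the helper names `euclidean` and `silhouette` are made up for this example, plain Euclidean distance is an assumption, and the gem may compute its score differently, so the number printed here need not match the output above.

```ruby
# Euclidean distance between two points (arrays of numbers).
def euclidean(a, b)
  Math.sqrt(a.zip(b).sum { |x, y| (x - y)**2 })
end

# Mean silhouette coefficient.
# clusters: array of clusters, each an array of points.
def silhouette(clusters)
  raise ArgumentError, "need at least 2 clusters" if clusters.size < 2

  scores = clusters.each_with_index.flat_map do |cluster, i|
    cluster.map do |point|
      # By convention a point alone in its cluster scores 0.
      next 0.0 if cluster.size == 1

      # a: mean distance to the other members of the point's own cluster.
      a = cluster.sum { |other| euclidean(point, other) } / (cluster.size - 1)

      # b: smallest mean distance to the members of any other cluster.
      b = clusters.each_with_index
                  .reject { |_, j| j == i }
                  .map { |other, _| other.sum { |q| euclidean(point, q) } / other.size }
                  .min

      max = [a, b].max
      max.zero? ? 0.0 : (b - a) / max
    end
  end

  scores.sum / scores.size
end

east = [[40.71, -74.01], [39.29, -76.61], [38.9, -77.04]]
west = [[34.05, -118.24], [45.52, -122.68], [36.11, -115.17]]

puts silhouette([east, west]).round(2)
```

Values near 1.0 mean points sit much closer to their own cluster than to the nearest other cluster; values near 0 or below suggest overlapping or misassigned points.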

Options

The following options can be passed in to KMeansClusterer.run:

| option | default | description |
| --- | --- | --- |
| :labels | nil | optional array of Ruby objects to collate with data array |
| :runs | 10 | number of times to run kmeans |
| :log | false | print stats after each run |
| :init | :kmpp | algorithm for picking initial cluster centroids. Accepts :kmpp, :random, or an array of k centroids |
| :scale_data | false | scales features before clustering using formula (data - mean) / std |
| :float_precision | :double | float precision to use. :double or :single |
| :max_iter | 300 | max iterations per run |
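
The :scale_data formula, (data - mean) / std, can be sketched in plain Ruby as follows. This is only an illustration of the formula, applied per feature (column). The `scale` helper is hypothetical, and the use of the population standard deviation (dividing by n) is an assumption; the gem may use the sample standard deviation instead.

```ruby
# Standardize each feature (column) of data: (value - mean) / std.
# data: array of rows, each row an array of numbers.
def scale(data)
  # Per-column mean and standard deviation.
  stats = data.transpose.map do |column|
    mean = column.sum.to_f / column.size
    # Population variance (divide by n); an assumption for this sketch.
    variance = column.sum { |v| (v - mean)**2 } / column.size
    [mean, Math.sqrt(variance)]
  end

  data.map do |row|
    row.each_with_index.map do |value, j|
      mean, std = stats[j]
      # A constant column has std 0; map it to 0.0 to avoid dividing by zero.
      std.zero? ? 0.0 : (value - mean) / std
    end
  end
end

# Two features on very different scales.
p scale([[1, 1000], [2, 2000], [3, 3000]])
```

After scaling, both columns have mean 0 and unit variance, so neither feature dominates the distance calculation purely because of its units.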
