
mlampros / ClusterR

Licence: other
Gaussian mixture models, k-means, mini-batch-kmeans and k-medoids clustering




ClusterR


The ClusterR package provides Gaussian mixture models, k-means, mini-batch-kmeans, k-medoids and affinity propagation clustering algorithms, with options to plot, validate, predict (on new data) and find the optimal number of clusters. The package uses 'RcppArmadillo' to speed up the computationally intensive parts of the functions. More details on the functionality of ClusterR can be found in the blog post, the Vignette and the package Documentation (scroll down for information on how to use the Docker image).

UPDATE 16-08-2018

As of version 1.1.4 the ClusterR package allows R package maintainers to link to it at the C++ (Rcpp) level. This means that the Rcpp functions of the ClusterR package can be called from the C++ files of another package. The next lines explain in detail how this can be done:


Assuming that an R package ('PackageA') calls one of the ClusterR Rcpp functions, the maintainer of 'PackageA' has to:


  • 1st. install the ClusterR package to take advantage of the new functionality either from CRAN using,

install.packages("ClusterR")
 

or download the latest version from GitHub using the remotes package,


remotes::install_github('mlampros/ClusterR', upgrade = 'always', dependencies = TRUE, repos = 'https://cloud.r-project.org/')
 

  • 2nd. update the DESCRIPTION file of 'PackageA' and especially the LinkingTo field by adding the ClusterR package (besides any other packages),

LinkingTo: ClusterR

  • 3rd. open a new C++ file (for instance in RStudio) and at the top of the file add the following 'headers', 'depends' and 'plugins',

# include <RcppArmadillo.h>
# include <ClusterRHeader.h>
# include <affinity_propagation.h>
// [[Rcpp::depends("RcppArmadillo")]]
// [[Rcpp::depends(ClusterR)]]
// [[Rcpp::plugins(cpp11)]]


The available functions can be found in the following files: inst/include/ClusterRHeader.h and inst/include/affinity_propagation.h


A complete minimal example would be:


# include <RcppArmadillo.h>
# include <ClusterRHeader.h>
# include <affinity_propagation.h>
// [[Rcpp::depends("RcppArmadillo")]]
// [[Rcpp::depends(ClusterR)]]
// [[Rcpp::plugins(cpp11)]]

using namespace clustR;

// Thin wrapper that forwards all arguments to ClustHeader::mini_batch_kmeans
// from the ClusterR headers and exports the result to R.
// [[Rcpp::export]]
Rcpp::List mini_batch_kmeans(arma::mat& data, int clusters, int batch_size, int max_iters, int num_init = 1,
                             double init_fraction = 1.0, std::string initializer = "kmeans++",
                             int early_stop_iter = 10, bool verbose = false,
                             Rcpp::Nullable<Rcpp::NumericMatrix> CENTROIDS = R_NilValue,
                             double tol = 1e-4, double tol_optimal_init = 0.5, int seed = 1) {

  ClustHeader clust_header;

  return clust_header.mini_batch_kmeans(data, clusters, batch_size, max_iters, num_init, init_fraction,
                                        initializer, early_stop_iter, verbose, CENTROIDS, tol,
                                        tol_optimal_init, seed);
}


Then, by opening an R file a user can call the mini_batch_kmeans function using,


Rcpp::sourceCpp('example.cpp')              # assuming that the previous Rcpp code is saved in 'example.cpp'

set.seed(1)
dat = matrix(runif(100000), nrow = 1000, ncol = 100)

mbkm = mini_batch_kmeans(dat, clusters = 3, batch_size = 50, max_iters = 100, num_init = 2,
                         init_fraction = 1.0, initializer = "kmeans++", early_stop_iter = 10,
                         verbose = TRUE, CENTROIDS = NULL, tol = 1e-4, tol_optimal_init = 0.5, seed = 1)

str(mbkm)


Use the following link to report bugs/issues,

https://github.com/mlampros/ClusterR/issues


UPDATE 28-11-2019


Docker images of the ClusterR package are available to download from my Docker Hub account. The images come with RStudio and the R-development version (latest) installed. The whole process was tested on Ubuntu 18.04. To pull & run the image do the following,


docker pull mlampros/clusterr:rstudiodev

docker run -d --name rstudio_dev -e USER=rstudio -e PASSWORD=give_here_your_password --rm -p 8787:8787 mlampros/clusterr:rstudiodev

The user can also bind a home directory / folder to the container to use its files by specifying the -v option,


docker run -d --name rstudio_dev -e USER=rstudio -e PASSWORD=give_here_your_password --rm -p 8787:8787 -v /home/YOUR_DIR:/home/rstudio/YOUR_DIR mlampros/clusterr:rstudiodev


In the latter case you might first have to grant write permissions on the YOUR_DIR directory (this is not always necessary) using,


chmod -R 777 /home/YOUR_DIR
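
Note that chmod -R 777 gives every user on the host read, write and execute access to the directory. As a narrower sketch, and assuming that the owner of YOUR_DIR has the same user ID as the user inside the container (an assumption that should be verified on your system), write access can be granted to the owner only. The helper name grant_owner_write is illustrative and the demonstration runs on a temporary directory rather than on /home/YOUR_DIR,

```shell
#!/bin/sh
# Sketch: give the directory owner read/write access (plus directory
# traversal) without opening the tree to every user, unlike `chmod -R 777`.
# `grant_owner_write` is a hypothetical helper name, not part of Docker or ClusterR.
grant_owner_write() {
  # u+rwX: read + write for the owner; the capital X adds execute only on
  # directories (and on files that are already executable for someone).
  chmod -R u+rwX "$1"
}

# Demonstration on a throwaway directory instead of /home/YOUR_DIR.
demo_dir=$(mktemp -d)
touch "$demo_dir/data.csv"
chmod 400 "$demo_dir/data.csv"   # start from a file the owner can only read
grant_owner_write "$demo_dir"
ls -l "$demo_dir"                # the file is now writable by its owner only
```

If the container still cannot write to the bound directory after this, the user IDs most likely differ and the broader chmod shown above remains the fallback.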


The USER defaults to rstudio but you have to give your PASSWORD of preference (see https://rocker-project.org/ for more information).
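
To show how the parts of the bind-mount command fit together, the following sketch prints the docker run command for a given host directory and password so that it can be reviewed before it is run. The function name clusterr_run_cmd and its two arguments are illustrative assumptions and not part of ClusterR or Docker; the flags and the image tag are the ones used in the commands above,

```shell
#!/bin/sh
# Sketch: print the bind-mount `docker run` command from this README for a
# given host directory and password. Nothing is executed here; the command is
# only printed. `clusterr_run_cmd` is a hypothetical helper name.
clusterr_run_cmd() {
  host_dir=$1    # absolute path on the host, e.g. /home/YOUR_DIR
  password=$2    # password for the 'rstudio' user
  # Mount the host directory under /home/rstudio using the same folder name.
  mount_name=$(basename "$host_dir")
  printf '%s\n' "docker run -d --name rstudio_dev -e USER=rstudio -e PASSWORD=$password --rm -p 8787:8787 -v $host_dir:/home/rstudio/$mount_name mlampros/clusterr:rstudiodev"
}

clusterr_run_cmd /home/YOUR_DIR give_here_your_password
```

The printed line can be pasted into a terminal once the path and password look right. Paths or passwords that contain spaces need to be quoted when pasting.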


Open your web browser and, depending on where the Docker image was built / run, enter,


1st. Option on your personal computer,


http://0.0.0.0:8787 

2nd. Option on a cloud instance,


http://Public DNS:8787

to access the RStudio login page, where you enter your username and password.


Citation:

If you use the code of this repository in your paper or research please cite both ClusterR and the original articles / software https://CRAN.R-project.org/package=ClusterR:


@Manual{,
  title = {{ClusterR}: Gaussian Mixture Models, K-Means, Mini-Batch-Kmeans, K-Medoids and Affinity Propagation Clustering},
  author = {Lampros Mouselimis},
  year = {2022},
  note = {R package version 1.2.7},
  url = {https://CRAN.R-project.org/package=ClusterR},
}
