zdebruine / RcppML

License: GPL-2.0 (LICENSE) and GPL-3.0 (LICENSE.md)
Rcpp Machine Learning: Fast robust NMF, divisive clustering, and more

Rcpp Machine Learning Library

License: GPL v2

RcppML is an R package for fast non-negative matrix factorization and divisive clustering using large sparse matrices.

See the pkgdown site: https://zdebruine.github.io/RcppML/

RcppML NMF is:

  • The fastest NMF implementation in any language for sparse and dense matrices
  • More interpretable than other implementations due to diagonal scaling
  • Easy to regularize with an L1 penalty

Installation

Install from CRAN or the development version from GitHub:

install.packages('RcppML')                       # install CRAN version
devtools::install_github("zdebruine/RcppML")     # compile dev version

NOTE: RcppML is being actively developed. Please check that your packageVersion("RcppML") is current before raising issues.
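
A minimal sketch of that version check, using only base R. The helper name `installed_version` is hypothetical and not part of RcppML; it simply wraps `requireNamespace` and `utils::packageVersion`.

```r
# Hypothetical helper (not part of RcppML): return the installed version of a
# package, or NULL if the package is not installed.
installed_version <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    utils::packageVersion(pkg)
  } else {
    NULL
  }
}

v <- installed_version("RcppML")
if (is.null(v)) {
  message("RcppML is not installed")
} else {
  message("RcppML version: ", as.character(v))
}
```

Including the printed version in an issue report makes it easier to tell whether the problem is already fixed in a newer release.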

Check out the CRAN manual.

Once RcppML is installed and loaded, the C++ headers that define its classes can be used in the C++ files of any R package with #include <RcppML.hpp>.

Matrix Factorization

Sparse matrix factorization by alternating least squares:

  • Non-negativity constraints
  • L1 regularization
  • Diagonal scaling
  • Rank-1 and Rank-2 specializations (~2x faster than irlba SVD equivalents)
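
The features listed above can be illustrated with a small base R sketch. This is not RcppML's implementation: it uses dense matrices, and it enforces non-negativity by clipping the least-squares solution at zero, a simplification standing in for a proper non-negative least squares solver. The names `update_factor` and `als_nmf_sketch`, the `ridge` term, and the way the L1 penalty is subtracted from the right-hand side are assumptions made for illustration.

```r
set.seed(42)
A <- matrix(runif(200 * 30), 200, 30)  # dense, non-negative toy data

# Hypothetical helper: least-squares update of one factor given the other.
# `fixed` is (rows x k) and `target` is (rows x n); returns a k x n factor
# whose rows sum to 1, plus the scaling diagonal `d`.
update_factor <- function(fixed, target, L1 = 0, ridge = 1e-8) {
  k <- ncol(fixed)
  lhs <- crossprod(fixed) + diag(ridge, k)  # k x k normal equations
  rhs <- crossprod(fixed, target) - L1      # L1 penalty shifts the right-hand side
  x <- solve(lhs, rhs)
  x[x < 0] <- 0                             # crude non-negativity: clip at zero
  d <- rowSums(x) + 1e-15                   # diagonal scaling values
  list(x = x / d, d = d)                    # each row of x is divided by its sum
}

als_nmf_sketch <- function(A, k, iters = 50, L1 = 0) {
  w <- matrix(runif(nrow(A) * k), nrow(A), k)
  for (i in seq_len(iters)) {
    h_up <- update_factor(w, A, L1)             # h is k x ncol(A)
    w_up <- update_factor(t(h_up$x), t(A), L1)  # t(w) is k x nrow(A)
    w <- t(w_up$x)
  }
  list(w = w, d = w_up$d, h = h_up$x)
}

model <- als_nmf_sketch(A, k = 5)
A_hat <- model$w %*% (model$d * model$h)  # W %*% diag(d) %*% H
mse <- mean((A - A_hat)^2)
```

Because the columns of `w` and the rows of `h` are scaled to sum to 1, the magnitude of each factor is carried by `d` alone, which is what makes factors comparable with one another.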

Read (and cite) our bioRxiv manuscript on NMF for single-cell experiments.

R functions

The nmf function runs matrix factorization by alternating least squares in the form A = WDH. The project function updates w or h given the other, while the mse function calculates the mean squared error of the factor model. The example below calls predict and evaluate on the fitted model for these same two steps.

library(RcppML)
A <- Matrix::rsparsematrix(1000, 100, 0.1) # sparse Matrix::dgCMatrix
model <- RcppML::nmf(A, k = 10)
h0 <- predict(model, A)
evaluate(model, A) # calculate mean squared error
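
To make the A = WDH form concrete, the following base R sketch builds an exactly rank-3 matrix, recovers h from w and d by least squares, and computes the mean squared error of the reconstruction. The helpers `project_h` and `mse_wdh` are hypothetical names for illustration and are not the RcppML functions; the data are dense and synthetic.

```r
set.seed(1)
w <- matrix(runif(50 * 3), 50, 3)  # 50 features x 3 factors
h <- matrix(runif(3 * 20), 3, 20)  # 3 factors x 20 samples
d <- c(3, 2, 1)                    # diagonal scaling values
A <- w %*% (d * h)                 # exact rank-3 data: A = W D H

# Hypothetical helper: solve for h given A, w and d by least squares,
# clipping any negative values at zero.
project_h <- function(A, w, d) {
  wd <- w %*% diag(d)
  h <- solve(crossprod(wd), crossprod(wd, A))
  h[h < 0] <- 0
  h
}

# Hypothetical helper: mean squared error of the factor model.
mse_wdh <- function(A, w, d, h) mean((A - w %*% (d * h))^2)

h_hat <- project_h(A, w, d)
err <- mse_wdh(A, w, d, h_hat)  # close to zero because A is exactly rank 3
```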

Divisive Clustering

Divisive clustering by rank-2 spectral bipartitioning.

  • The second SVD vector is linearly related to the difference between the two factors of a rank-2 matrix factorization.
  • Rank-2 matrix factorization (with optional non-negativity constraints) performs spectral bipartitioning ~2x faster than irlba SVD
  • Sensitive distance-based stopping criteria similar to Newman-Girvan modularity, but orders of magnitude faster
  • Stopping criteria based on minimum number of samples

R functions

The dclust function runs divisive clustering by recursive spectral bipartitioning, while the bipartition function exposes the rank-2 NMF specialization and returns statistics of the bipartition.

library(RcppML)
A <- Matrix::rsparsematrix(1000, 1000, 0.1) # sparse Matrix::dgCMatrix
clusters <- dclust(A, min_dist = 0.001, min_samples = 5)
cluster0 <- bipartition(A)
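
The recursion itself can be sketched in a few lines of base R. This is not the dclust implementation: it splits on the sign of the second right singular vector from `svd()`, and it implements only a minimum-size stopping rule, not the distance-based one. The function name `dclust_sketch` and the interpretation of `min_samples` as the smallest allowed cluster size are assumptions.

```r
# Hypothetical sketch of divisive clustering by recursive bipartitioning.
# Returns a list of integer vectors, one vector of sample indices per cluster.
dclust_sketch <- function(A, min_samples = 5, samples = seq_len(ncol(A))) {
  # Stop if the cluster cannot be split into two children of min_samples each
  if (length(samples) < 2 * min_samples) return(list(samples))
  v2 <- svd(A[, samples, drop = FALSE], nu = 0, nv = 2)$v[, 2]
  left  <- samples[v2 > 0]
  right <- samples[v2 <= 0]
  # Reject the split if either child would be too small
  if (min(length(left), length(right)) < min_samples) return(list(samples))
  c(dclust_sketch(A, min_samples, left),
    dclust_sketch(A, min_samples, right))
}

set.seed(3)
A <- matrix(runif(30 * 40), 30, 40)  # 30 features x 40 samples, dense toy data
clusters <- dclust_sketch(A, min_samples = 5)
```

Each sample ends up in exactly one leaf cluster, and a split is only accepted when both children keep at least `min_samples` samples.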