nlp-lt - Natural Language Processing for the Lithuanian language
RATTLE - Reference-free reconstruction and error correction of transcriptomes from Nanopore long-read sequencing
Clustering-Datasets - A collection of UCI (real-life) and synthetic (artificial) datasets, with cluster labels and MATLAB files, ready to use with clustering algorithms.
faythe - An experimental cluster brings Prometheus and OpenStack together
Zeitline - A polylinear timeline with clustering, centred on interactions. Doc and demo: https://octree-gva.github.io/Zeitline/
minicore - Fast and memory-efficient clustering + coreset construction, including fast distance kernels for Bregman and f-divergences.
py-lbg - Python implementation of the Linde-Buzo-Gray / Generalized Lloyd algorithm for vector quantization.
Apartment-Interest-Prediction - Predict people's interest in renting specific NYC apartments. The challenge combines structured data, geolocation, time data, free text and images.
product-quantization - 🙃 Implementation of vector quantization algorithms, with code for Norm-Explicit Quantization: Improving Vector Quantization for Maximum Inner Product Search.
dbscan - DBSCAN clustering algorithm, C# implementation
GrouProx - FedGroup, a clustered federated learning framework based on TensorFlow
point-cloud-clusters - A catkin workspace in ROS which uses DBSCAN to identify which points in a point cloud belong to the same object.
autoplait - Python implementation of AutoPlait (SIGMOD'14) without the smoothing algorithm. NOTE: This repository is for my personal use.
topometry - A comprehensive dimensionality reduction framework to recover the latent topology from high-dimensional data.
dtw-python - Python port of R's Comprehensive Dynamic Time Warp algorithms package
acoustic-keylogger - Pipeline of a keylogging attack using just an audio signal and unsupervised learning.
Machine-learning - This repository will contain everything beginners in ML and DL need. Follow and star this repo for regular updates.
AnnA Anki neuronal Appendix - Using machine learning on your Anki collection to enhance scheduling via semantic clustering and semantic similarity
postsack - Visually cluster your emails by sender, domain, and more to identify waste
torch DCEC - PyTorch implementation of Deep Clustering with Convolutional Autoencoders
RAE - A recursive autoencoder neural network built with TensorFlow, used for sentence clustering
algorithms - The All ▲lgorithms documentation website.
scarf - Toolkit for highly memory-efficient analysis of single-cell RNA-Seq, scATAC-Seq and CITE-Seq data. Analyze atlas-scale datasets with millions of cells on a laptop.
TrajSuite - TrajSuite is a cross-platform Java application that provides a suite of trajectory data-mining and visualisation features.
SHARP - SHARP: Single-cell RNA-seq Hyper-fast and Accurate processing via ensemble Random Projection
nbodykit - Analysis kit for large-scale structure datasets, the massively parallel way
M-NMF - An implementation of "Community Preserving Network Embedding" (AAAI 2017)
VOSviewer-Online - VOSviewer Online is a tool for network visualization. It is a web-based version of VOSviewer, a popular tool for constructing and visualizing bibliometric networks.
ParallelKMeans.jl - Parallel & lightning-fast implementation of available classic and contemporary variants of the KMeans clustering algorithm
LabelPropagation - A NetworkX implementation of Label Propagation from "Near Linear Time Algorithm to Detect Community Structures in Large-Scale Networks" (Physical Review E 2008).
tsam - A Python-based time series aggregation module (tsam) which can be used to reduce the number of time steps using typical periods or by decreasing the temporal resolution
st dbscan - ST-DBSCAN: Simple and effective tool for spatial-temporal clustering
gouda - Golang Utilities for Data Analysis
teanaps - An open-source Python library for natural language processing and text analysis.
ClusterTransformer - Topic clustering library built on Transformer embeddings and cosine similarity metrics. Compatible with all BERT base transformers from Hugging Face.
FSDA - Flexible Statistics and Data Analysis (FSDA) extends MATLAB for a robust analysis of data sets affected by different sources of heterogeneity. It is open source software licensed under the European Union Public Licence (EUPL). FSDA is a joint project by the University of Parma and the Joint Research Centre of the European Commission.
ML-Track - This repository is a recommended track, designed to help you get started with machine learning.
audio noise clustering - An experiment with a variety of clustering (and clustering-like) techniques to reduce noise on an audio speech recording. Results: https://dodiku.github.io/audio_noise_clustering/results/
Machine Learning - A repository of resources for understanding the concepts of machine learning/deep learning.
scikit-cmeans - Flexible, extensible fuzzy c-means clustering in Python.
CoronaDash - COVID-19 spread shiny dashboard with a forecasting model, countries' trajectories graphs, and cluster analysis tools