mathematics-statistics-for-data-science: Mathematical and statistical topics for performing statistical analysis and tests: linear regression, probability theory, Monte Carlo simulation, statistical sampling, bootstrapping, dimensionality reduction techniques (PCA, FA, CCA), imputation techniques, statistical tests (Kolmogorov-Smirnov), robust estimators (FastMCD), and more, in Python and R.
Stars: ✭ 56 (+69.7%)
ezancestry: Easy genetic ancestry predictions in Python
Stars: ✭ 38 (+15.15%)
twpca: 🕝 Time-warped principal components analysis (twPCA)
Stars: ✭ 118 (+257.58%)
Data-Science: Using Kaggle data and real-world data for data science and prediction in Python, R, Excel, Power BI, and Tableau.
Stars: ✭ 15 (-54.55%)
Miller: Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON
Stars: ✭ 4,633 (+13939.39%)
audio noise clustering: An experiment with a variety of clustering (and clustering-like) techniques to reduce noise on an audio speech recording. Results: https://dodiku.github.io/audio_noise_clustering/results/
Stars: ✭ 24 (-27.27%)
Machine Learning With Python: Practice and tutorial-style notebooks covering a wide variety of machine learning techniques
Stars: ✭ 2,197 (+6557.58%)
Awesome Single Cell: Community-curated list of software packages and data resources for single-cell, including RNA-seq, ATAC-seq, etc.
Stars: ✭ 1,937 (+5769.7%)
Umap: Uniform Manifold Approximation and Projection
Stars: ✭ 5,268 (+15863.64%)
HAR: Recognize one of six human activities such as standing, sitting, and walking using a Softmax Classifier trained on mobile phone sensor data.
Stars: ✭ 18 (-45.45%)
Competitive-Feature-Learning: Online feature-extraction and classification algorithm that learns representations of input patterns.
Stars: ✭ 32 (-3.03%)
tGPLVM: A nonparametric, generative model for manifold learning with scRNA-seq experimental data
Stars: ✭ 16 (-51.52%)
Unsupervised-Learning-in-R: A 6-hour workshop on clustering (HDBSCAN, LCA, HOPACH), dimension reduction (UMAP, GLRM), and anomaly detection (isolation forests).
Stars: ✭ 34 (+3.03%)
dbMAP: A fast, accurate, and modularized dimensionality reduction approach based on diffusion harmonics and graph layouts. Scales to millions of samples on a personal laptop. Adds the intrinsic structure of high-dimensional big data to your clustering and data visualization workflow.
Stars: ✭ 39 (+18.18%)
NIDS-Intrusion-Detection: Simple implementation of a Network Intrusion Detection System using the KDD Cup '99 data set. kdd_cup_10_percent is used for training and the correct set is used for testing. PCA is used for dimension reduction. SVM and KNN are the supervised classification algorithms. Accuracy: 83.5% for SVM, 80% for KNN.
Stars: ✭ 45 (+36.36%)
dml: R package for Distance Metric Learning
Stars: ✭ 58 (+75.76%)
UMAP.jl: Uniform Manifold Approximation and Projection (UMAP) implementation in Julia
Stars: ✭ 93 (+181.82%)
bhtsne: Parallel Barnes-Hut t-SNE implementation written in Rust.
Stars: ✭ 43 (+30.3%)
Spectre: A computational toolkit in R for the integration, exploration, and analysis of high-dimensional single-cell cytometry and imaging data.
Stars: ✭ 31 (-6.06%)
timecorr: Estimate dynamic high-order correlations in multivariate timeseries data
Stars: ✭ 30 (-9.09%)
adenine: A Data ExploratioN PipelINE (ADENINE)
Stars: ✭ 15 (-54.55%)
pymde: Minimum-distortion embedding with PyTorch
Stars: ✭ 420 (+1172.73%)
walklets: A lightweight implementation of Walklets from "Don't Walk Skip! Online Learning of Multi-scale Network Embeddings" (ASONAM 2017).
Stars: ✭ 94 (+184.85%)
scHPF: Single-cell Hierarchical Poisson Factorization
Stars: ✭ 52 (+57.58%)
ReductionWrappers: R wrappers to connect Python dimensionality reduction tools and single-cell data objects (Seurat, SingleCellExperiment, etc.)
Stars: ✭ 31 (-6.06%)
tldr: TLDR is an unsupervised dimensionality reduction method that combines neighborhood embedding learning with the simplicity and effectiveness of recent self-supervised learning losses
Stars: ✭ 95 (+187.88%)
50-days-of-Statistics-for-Data-Science: This repository consists of a 50-day program. All the statistics required for a complete understanding of data science will be uploaded to this repository.
Stars: ✭ 19 (-42.42%)
lfda: Local Fisher Discriminant Analysis in R
Stars: ✭ 74 (+124.24%)
topometry: A comprehensive dimensionality reduction framework to recover the latent topology from high-dimensional data.
Stars: ✭ 64 (+93.94%)
enstop: Ensemble topic modelling with pLSA
Stars: ✭ 104 (+215.15%)
ParametricUMAP paper: Parametric UMAP embeddings for representation and semi-supervised learning. From the paper "Parametric UMAP: learning embeddings with deep neural networks for representation and semi-supervised learning" (Sainburg, McInnes, Gentner, 2020).
Stars: ✭ 132 (+300%)
moses: Streaming, Memory-Limited, r-truncated SVD Revisited!
Stars: ✭ 19 (-42.42%)
uapca: Uncertainty-aware principal component analysis.
Stars: ✭ 16 (-51.52%)
federated pca: Federated Principal Component Analysis Revisited!
Stars: ✭ 30 (-9.09%)
Machine Learning: A repository of resources for understanding the concepts of machine learning/deep learning.
Stars: ✭ 29 (-12.12%)
DRComparison: Comparison of dimensionality reduction methods
Stars: ✭ 29 (-12.12%)
sef: A Python Library for Similarity-based Dimensionality Reduction
Stars: ✭ 24 (-27.27%)