zhengxwen / SNPRelate

Licence: other
R package: parallel computing toolset for relatedness and principal component analysis of SNP data (Development Version)

Programming Languages

C++
36643 projects - #6 most used programming language
R
7636 projects

Projects that are alternatives of or similar to SNPRelate

ml-simulations
Animated Visualizations of Popular Machine Learning Algorithms
Stars: ✭ 33 (-55.41%)
Mutual labels:  pca
qHilbert
qHilbert is a vectorized speedup of Hilbert curve generation using SIMD intrinsics
Stars: ✭ 22 (-70.27%)
Mutual labels:  simd
xor
Move to: https://github.com/templexxx/xorsimd
Stars: ✭ 27 (-63.51%)
Mutual labels:  simd
Quickenshtein
Making the quickest and most memory efficient implementation of Levenshtein Distance with SIMD and Threading support
Stars: ✭ 204 (+175.68%)
Mutual labels:  simd
federated pca
Federated Principal Component Analysis Revisited!
Stars: ✭ 30 (-59.46%)
Mutual labels:  pca
Loan-Prediction-Dataset
No description or website provided.
Stars: ✭ 21 (-71.62%)
Mutual labels:  pca
Machine-Learning-Models
In This repository I made some simple to complex methods in machine learning. Here I try to build template style code.
Stars: ✭ 30 (-59.46%)
Mutual labels:  pca
geeSharp.js
Pan-sharpening in the Earth Engine code editor
Stars: ✭ 25 (-66.22%)
Mutual labels:  pca
penguinV
Simple and fast C++ image processing library with focus on heterogeneous systems
Stars: ✭ 110 (+48.65%)
Mutual labels:  simd
ClassifierToolbox
A MATLAB toolbox for classifier: Version 1.0.7
Stars: ✭ 72 (-2.7%)
Mutual labels:  pca
Amplifier.NET
Amplifier allows .NET developers to easily run complex applications with intensive mathematical computation on Intel CPU/GPU, NVIDIA, AMD without writing any additional C kernel code. Write your function in .NET and Amplifier will take care of running it on your favorite hardware.
Stars: ✭ 142 (+91.89%)
Mutual labels:  simd
SIMDArray
SIMD enhanced Array operations
Stars: ✭ 123 (+66.22%)
Mutual labels:  simd
linear-vs-binary-search
Comparing linear and binary searches
Stars: ✭ 28 (-62.16%)
Mutual labels:  simd
HIBAG
R package – HLA Genotype Imputation with Attribute Bagging (development version only)
Stars: ✭ 23 (-68.92%)
Mutual labels:  snp
mir-glas
[Experimental] LLVM-accelerated Generic Linear Algebra Subprograms
Stars: ✭ 99 (+33.78%)
Mutual labels:  simd
ocr-machine-learning
OCR Machine Learning in python
Stars: ✭ 42 (-43.24%)
Mutual labels:  pca
psimd
Portable 128-bit SIMD intrinsics
Stars: ✭ 48 (-35.14%)
Mutual labels:  simd
playing with vae
Comparing FC VAE / FCN VAE / PCA / UMAP on MNIST / FMNIST
Stars: ✭ 53 (-28.38%)
Mutual labels:  pca
moses
Streaming, Memory-Limited, r-truncated SVD Revisited!
Stars: ✭ 19 (-74.32%)
Mutual labels:  pca
runtime
AnyDSL Runtime Library
Stars: ✭ 17 (-77.03%)
Mutual labels:  simd

SNPRelate: Parallel computing toolset for relatedness and principal component analysis of SNP data

License: GNU General Public License, GPLv3

Features

Genome-wide association studies are widely used to investigate the genetic basis of diseases and traits, but they pose many computational challenges. We developed SNPRelate, an R package for multi-core symmetric multiprocessing computer architectures, to accelerate two key computations on SNP data: principal component analysis (PCA) and relatedness analysis using identity-by-descent measures. The kernels of our algorithms are written in C/C++ and highly optimized.
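
The two computations named above can be sketched as follows. This is a minimal illustration, not a recommended analysis pipeline: it assumes SNPRelate is already installed, uses the small example GDS file bundled with the package, and the thread count and filtering thresholds are arbitrary choices made for the example.

```r
# Sketch only: assumes SNPRelate (and its dependency gdsfmt) is installed.
library(SNPRelate)

# Open the small example GDS file shipped with the package.
genofile <- snpgdsOpen(snpgdsExampleFileName())

# Principal component analysis using two threads.
pca <- snpgdsPCA(genofile, num.thread = 2)
# Percentage of variance explained by the first four components.
print(round(head(pca$varprop, 4) * 100, 2))

# Relatedness: method-of-moments identity-by-descent estimates.
# The thresholds are illustrative; a real analysis would normally restrict
# the samples to one homogeneous population and use LD-pruned SNPs.
ibd <- snpgdsIBDMoM(genofile, maf = 0.05, missing.rate = 0.05, num.thread = 2)
# One row per sample pair, with k0, k1 and the kinship coefficient.
print(head(snpgdsIBDSelection(ibd)))

# Always close the GDS file when finished.
snpgdsClose(genofile)
```

The `num.thread` argument is where the multi-core support shows up; the tutorial linked under Tutorials covers sample and SNP selection in more detail.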

The GDS format offers efficient operations designed specifically for two-bit integers, since a SNP genotype can be stored in only two bits. The SNP GDS format in this package is also used by the GWASTools package, with the support of S4 classes and generic functions. The extended GDS format is implemented in the SeqArray package to support the storage of single nucleotide variation (SNV), insertion/deletion polymorphism (indel) and structural variation calls. For large-scale whole-exome and whole-genome sequencing variant data, SeqArray is strongly recommended instead of SNPRelate.
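
To make the two-bit idea concrete, here is a toy sketch in base R that packs four genotype codes into each byte. It assumes the coding 0, 1, 2 (allele counts) and 3 (missing); the helper names `pack2bit` and `unpack2bit` are hypothetical, and the byte layout is chosen for illustration only. It is not the on-disk layout used by GDS.

```r
# Toy illustration: store genotype codes 0..3 in two bits each, four per byte.
# pack2bit/unpack2bit are hypothetical helpers, not part of SNPRelate or gdsfmt.
pack2bit <- function(g) {
  stopifnot(all(g %in% 0:3))
  # Pad with the "missing" code (3) so the length is a multiple of four.
  pad <- (4L - length(g) %% 4L) %% 4L
  g <- c(as.integer(g), rep(3L, pad))
  m <- matrix(g, nrow = 4L)  # each column holds the four genotypes of one byte
  as.raw(m[1, ] + m[2, ] * 4L + m[3, ] * 16L + m[4, ] * 64L)
}

unpack2bit <- function(bytes, n) {
  b <- as.integer(bytes)
  # Extract the four 2-bit fields of every byte, lowest bits first.
  m <- rbind(b %% 4L, (b %/% 4L) %% 4L, (b %/% 16L) %% 4L, (b %/% 64L) %% 4L)
  as.vector(m)[seq_len(n)]
}

geno <- c(2L, 1L, 0L, 3L, 2L, 2L, 1L)
packed <- pack2bit(geno)
restored <- unpack2bit(packed, length(geno))
print(length(packed))             # bytes used for seven genotypes
print(identical(restored, geno))  # round trip preserves the genotypes
```

Compared with one 32-bit integer per genotype, this kind of packing uses one sixteenth of the space, which is the motivation for a storage format with native two-bit support.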

Bioconductor

Release Version: v1.30.0

http://www.bioconductor.org/packages/SNPRelate

News

Tutorials

http://www.bioconductor.org/packages/release/bioc/vignettes/SNPRelate/inst/doc/SNPRelate.html

Citations

Zheng X, Levine D, Shen J, Gogarten SM, Laurie C, Weir BS (2012). A High-performance Computing Toolset for Relatedness and Principal Component Analysis of SNP Data. Bioinformatics. DOI: 10.1093/bioinformatics/bts606.

Zheng X, Gogarten S, Lawrence M, Stilp A, Conomos M, Weir BS, Laurie C, Levine D (2017). SeqArray -- A storage-efficient high-performance data format for WGS variant calls. Bioinformatics. DOI: 10.1093/bioinformatics/btx145.

Installation

  • Bioconductor repository:
if (!requireNamespace("BiocManager", quietly=TRUE))
    install.packages("BiocManager")
BiocManager::install("SNPRelate")
  • Development version from GitHub (for developers/testers only):
library("devtools")
install_github("zhengxwen/gdsfmt")
install_github("zhengxwen/SNPRelate")

The install_github() approach requires that you build from source, i.e. make and compilers must be installed on your system -- see the R FAQ for your operating system; you may also need to install dependencies manually.

Implementation with Intel Intrinsics

Functions No SIMD SSE2 AVX AVX2 AVX-512
snpgdsDiss » X
snpgdsEIGMIX » X X X
snpgdsGRM » X X X .
snpgdsIBDKING » X X X
snpgdsIBDMoM » X
snpgdsIBS » X X
snpgdsIBSNum » X X
snpgdsIndivBeta » X X P X
snpgdsPCA » X X X
snpgdsPCACorr » X
snpgdsPCASampLoading » X
snpgdsPCASNPLoading » X
...

X: fully supported; .: partially supported; P: POPCNT instruction.

Install the package from the source code with the support of Intel SIMD Intrinsics:

You have to customize the package compilation, see: CRAN: Customizing-package-compilation

Assuming the GNU compilers (gcc/g++) or the Clang compilers (clang/clang++) are installed, set the following in ~/.R/Makevars:

## for C code
CFLAGS=-g -O3 -march=native -mtune=native
## for C++ code
CXXFLAGS=-g -O3 -march=native -mtune=native
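
One way to check that the customized flags are picked up is to ask R which compiler flags it will use for source builds. The snippet below is a sketch that assumes a working build toolchain (`R CMD config` relies on `make`); the variable names are only illustrative.

```r
# Query the C and C++ flags R will use when compiling packages from source.
# Requires make and a compiler toolchain to be installed.
r_bin <- file.path(R.home("bin"), "R")
cflags   <- system2(r_bin, c("CMD", "config", "CFLAGS"),   stdout = TRUE)
cxxflags <- system2(r_bin, c("CMD", "config", "CXXFLAGS"), stdout = TRUE)
cat("CFLAGS:  ", cflags, "\n")
cat("CXXFLAGS:", cxxflags, "\n")
```

If the output includes `-march=native`, reinstalling SNPRelate from source should compile its kernels with whatever SIMD instruction sets the local CPU supports.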