
stephenslab / rss

Licence: MIT license
Regression with Summary Statistics.



Regression with Summary Statistics (RSS)

Overview

Multiple regression analyses often assume that the response and covariates of each individual are observed, and use them to infer the regression coefficients. Here, motivated by applications in genetics, we assume that these individual-level data are not available, and that only the summary statistics of univariate regressions (essentially, the effect size estimates and their standard errors) are provided. We also assume that information on the correlation structure among covariates is available. The aim is to infer the multiple regression coefficients from these marginal regression summary statistics.
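To make the input concrete, the following minimal sketch computes univariate summary statistics from simulated individual-level data using the textbook simple-regression formulas. The function name `univariate_summary_stats` and the simulated data are illustrative assumptions, not part of the RSS software, and the exact definitions used in the RSS papers may differ in details such as the degrees-of-freedom correction.

```python
import numpy as np

def univariate_summary_stats(X, y):
    """Regress y on each column of X separately (with an intercept).

    Hypothetical helper for illustration only.
    Returns (betahat, se): per-covariate slope estimates and standard errors.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[0]
    Xc = X - X.mean(axis=0)              # centering absorbs the intercept
    yc = y - y.mean()
    sxx = (Xc ** 2).sum(axis=0)          # per-covariate sum of squares
    betahat = Xc.T @ yc / sxx            # marginal slope estimates
    resid = yc[:, None] - Xc * betahat   # n x p matrix of per-regression residuals
    sigma2 = (resid ** 2).sum(axis=0) / (n - 2)
    se = np.sqrt(sigma2 / sxx)           # standard errors of the slopes
    return betahat, se

# Simulated example: 500 individuals, 4 covariates (assumed values).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = X @ np.array([0.3, 0.0, -0.2, 0.0]) + rng.normal(size=500)
betahat, se = univariate_summary_stats(X, y)
print(betahat, se)
```

Only `betahat` and `se` would be shared in the summary-data setting; `X` and `y` themselves stay private.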

This work is motivated by applications in genome-wide association studies (GWAS). When fitting the multiple regression model to individual-level GWAS data, the covariates are the genotypes typed at different genetic variants (typically SNPs), the response is the quantitative phenotype (e.g. height or blood lipid level), and the regression coefficients are the effects of each SNP on the phenotype. Due to privacy and logistical issues, the individual-level data are often not easily available. In contrast, the GWAS summary statistics (from standard single-SNP analysis) are widely available in the public domain (e.g. GIANT and PGC). Moreover, the correlation among covariates (genotypes of SNPs), known as linkage disequilibrium, can also be obtained from public databases (e.g. the 1000 Genomes Project). When the protected individual-level data are not available, can we perform "multiple-SNP" analysis using these public assets?
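As a rough illustration of the correlation input, the sketch below estimates a linkage disequilibrium matrix as the plain sample correlation among SNP genotypes in a reference panel. The function name `ld_matrix` and the simulated 0/1/2 genotypes are assumptions made for this example; real analyses often regularize or shrink this estimate, which the sketch does not attempt.

```python
import numpy as np

def ld_matrix(G):
    """Sample correlation among SNPs (hypothetical helper for illustration).

    G is an (individuals x SNPs) array of genotypes coded 0/1/2.
    A SNP with no variation in the panel would produce NaN entries.
    """
    return np.corrcoef(np.asarray(G, dtype=float), rowvar=False)

# Simulated reference panel: 200 individuals, 5 independent SNPs (assumed values).
rng = np.random.default_rng(0)
G = rng.binomial(2, 0.3, size=(200, 5))
R = ld_matrix(G)
print(R.shape)
```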

Here we provide a generally applicable framework for multiple-SNP analyses using GWAS single-SNP summary data. Specifically, we introduce a “Regression with Summary Statistics” (RSS) likelihood, which relates the multiple regression coefficients to univariate regression results. We then combine the RSS likelihood with suitable priors to perform Bayesian inference for the regression coefficients.
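For readers who want to see the likelihood in concrete terms: in Zhu and Stephens (2017), the RSS likelihood treats the vector of single-SNP estimates as multivariate normal with mean S R S^{-1} beta and covariance S R S, where S is the diagonal matrix of standard errors, R is the LD correlation matrix and beta is the vector of multiple regression coefficients. The sketch below evaluates this log-likelihood with NumPy. The function name `rss_loglik` and the toy inputs are illustrative assumptions and do not reflect the interface of the RSS software.

```python
import numpy as np

def rss_loglik(beta, betahat, se, R):
    """Log of N(betahat; S R S^{-1} beta, S R S), with S = diag(se).

    Hypothetical helper for illustration; R must make S R S positive definite.
    """
    beta = np.asarray(beta, dtype=float)
    betahat = np.asarray(betahat, dtype=float)
    se = np.asarray(se, dtype=float)
    R = np.asarray(R, dtype=float)
    mean = se * (R @ (beta / se))       # S R S^{-1} beta
    cov = R * np.outer(se, se)          # S R S
    resid = betahat - mean
    L = np.linalg.cholesky(cov)         # cov = L L^T
    z = np.linalg.solve(L, resid)       # z @ z equals resid^T cov^{-1} resid
    logdet = 2.0 * np.log(np.diag(L)).sum()
    p = betahat.size
    return -0.5 * (p * np.log(2.0 * np.pi) + logdet + z @ z)

# Toy inputs (assumed values): three SNPs with modest LD.
betahat = np.array([0.10, -0.20, 0.05])
se = np.array([0.05, 0.10, 0.02])
R = np.array([[1.0, 0.3, 0.1],
              [0.3, 1.0, 0.2],
              [0.1, 0.2, 1.0]])
print(rss_loglik(np.zeros(3), betahat, se, R))
```

When R is the identity matrix the expression reduces to independent normal likelihoods centered at beta, which is a convenient sanity check. Adding a log-prior for beta to this function gives an unnormalized log-posterior, the starting point for the Bayesian inference described above.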

License

The repository is licensed under the MIT License.

Support

  1. Get started with the short tutorials.
  2. Refer to the FAQ for answers to some common questions.
  3. Create a new issue to report bugs and/or request features.
  4. Send an email to xiangzhu[at]psu[dot]edu.

Citation

  • The Regression with Summary Statistics (RSS) likelihood
    Xiang Zhu and Matthew Stephens (2017). Bayesian large-scale multiple regression with summary statistics from genome-wide association studies. Annals of Applied Statistics 11(3): 1561-1592. [Article PDF] [Journal Page] [bioRxiv Page] [Supplementary Information] [Software]

  • RSS-E: Enrichment and prioritization analysis based on RSS likelihood
    Xiang Zhu and Matthew Stephens (2018). Large-scale genome-wide enrichment analyses identify new trait-associated genes and pathways across 31 human phenotypes. Nature Communications 9, 4361. [Article PDF] [Journal Page] [bioRxiv Page] [Supplementary Information] [Online Results] [Software]

  • RSS-NET: Integrated analysis of regulatory networks based on RSS likelihood
    Xiang Zhu, Zhana Duren and Wing Hung Wong (2021). Modeling regulatory network topology improves genome-wide analyses of complex human traits. Nature Communications 12, 2851. [Article PDF] [Journal Page] [bioRxiv Page] [Supplementary Information] [Online Results] [Software]

  • Genetic architecture inference of complex traits based on RSS likelihood
    TBA

  • Simple and robust heritability estimation based on RSS likelihood
    TBA

  • Cross-population genetic analysis of complex traits based on RSS likelihood
    TBA

Collaboration

Here we have developed a likelihood function of multiple regression coefficients based on univariate regression summary data, which opens the door to a wide range of statistical machinery for inference. Using this likelihood, we have implemented Bayesian methods to estimate SNP heritability, detect genetic association, assess gene set or network enrichment, prioritize trait-associated genes and infer genetic architecture. Please check our progress updates regularly.

If you have specific applications that use GWAS summary data as input, and want to build new statistical methods based on the RSS likelihood, please feel free to contact us. We are glad to help!

Contact

Xiang Zhu
Matthew Stephens Lab
Department of Statistics
University of Chicago
