biobakery / melonnpan

Licence: MIT license
Model-based Genomically Informed High-dimensional Predictor of Microbial Community Metabolic Profiles


MelonnPan - Model-based Genomically Informed High-dimensional Predictor of Microbial Community Metabolic Profiles

Himel Mallick 2021-02-08

Introduction

MelonnPan is a computational method for predicting metabolite compositions from microbiome sequencing data.

Overview of MelonnPan

MelonnPan is composed of two high-level workflows: MelonnPan-Predict and MelonnPan-Train.

The MelonnPan-Predict workflow takes a table of microbial sequence features (i.e., taxonomic or functional abundances on a per sample basis) as input, and outputs a predicted metabolomic table (i.e., relative abundances of metabolite compounds across samples).

The MelonnPan-Train workflow creates a weight matrix that links an optimal set of sequence features to a subset of predictable metabolites, selected through rigorous internal validation. This weight matrix is then used to generate a table of predicted metabolite compounds (i.e., relative abundances of metabolite compounds per sample). When sufficiently accurate, these predicted metabolite relative abundances can be used for downstream statistical analysis and end-to-end biomarker discovery.
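To make the role of the weight matrix concrete, here is a minimal conceptual sketch in base R. It assumes a plain linear model in which each metabolite is a weighted sum of a few selected features. The feature names, metabolite names, and weight values are invented for illustration, and any transformations or scaling the package applies internally are omitted, so this is not the package's actual implementation.

```r
# Toy feature table: 3 samples (rows) x 4 sequence features (columns).
set.seed(1)
features <- matrix(runif(12), nrow = 3,
                   dimnames = list(paste0("sample", 1:3), paste0("gene", 1:4)))
features <- features / rowSums(features)  # convert each row to relative abundances

# Hypothetical weight matrix: 4 features (rows) x 2 metabolites (columns).
# Zeros mark features that were not selected for a given metabolite.
weights <- matrix(c(0.8, 0.0, 0.0, 0.3,
                    0.0, 1.2, 0.0, 0.0),
                  nrow = 4,
                  dimnames = list(colnames(features),
                                  c("metaboliteA", "metaboliteB")))

# Linear prediction: each metabolite is a weighted sum of its selected features.
predicted <- features %*% weights
print(dim(predicted))
```

The result has one row per sample and one column per metabolite, which mirrors the shape of the predicted metabolomic table described above.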

How to Install

Before installing MelonnPan, please install the prerequisites as follows (execute from within a fresh R session):

devtools::install_version("GenABEL.data", version = "1.0.0", repos = "http://cran.us.r-project.org")
devtools::install_version("GenABEL", version = "1.8-0", repos = "http://cran.us.r-project.org")

Once these packages are installed, there are three options for installing MelonnPan:

  • Within R
  • Directly from Bitbucket
  • Directly from GitHub

From Within R

You can install MelonnPan from within R with the devtools package, using either the install_github or the install_bitbucket function (only one of the two calls is needed):

# Option 1: install from GitHub
devtools::install_github("biobakery/melonnpan")
# Option 2: install from Bitbucket
devtools::install_bitbucket("biobakery/melonnpan")

From Bitbucket (Directly)

Clone the repository using git clone, which downloads the package as its own directory called melonnpan.

git clone https://<your-user-name>@bitbucket.org/biobakery/melonnpan.git

Then, install MelonnPan using R CMD INSTALL.

R CMD INSTALL melonnpan

From GitHub (Directly)

Clone the repository using git clone, which downloads the package as its own directory called melonnpan.

git clone https://github.com/biobakery/melonnpan.git

Then, install MelonnPan using R CMD INSTALL.

R CMD INSTALL melonnpan
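Whichever installation route you choose, a quick check from R can confirm that the prerequisites and MelonnPan itself are available. The sketch below uses only base R functions; it assumes the installed package names match those used in the commands above.

```r
# Packages named in the installation steps above.
pkgs <- c("GenABEL.data", "GenABEL", "melonnpan")

# TRUE for each package whose namespace can be loaded, FALSE otherwise.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Report the installed version, or flag the package as missing.
for (p in pkgs) {
  status <- if (installed[[p]]) {
    as.character(utils::packageVersion(p))
  } else {
    "not installed"
  }
  message(p, ": ", status)
}
```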

Usage

MelonnPan can be run from the command line or from within R. Both methods require the same arguments, have the same options, and use the same default settings. Check out the MelonnPan tutorial for an example application.

  • MelonnPan-Predict can be run either by executing the script predict_metabolites.R from the command line or by calling the function melonnpan.predict() within R. By default it uses a pre-trained human gut model based on UniRef90 gene families (functionally profiled by HUMAnN2), as described in Franzosa et al. (2019) and the original MelonnPan paper (Mallick et al., 2019). The model is included in the package and can also be downloaded from the data/ sub-directory.

  • If you have paired metabolite and microbial sequencing data (possibly measured from the same biospecimen), you can also train a MelonnPan model by running the script train_metabolites.R from the command line or within R using the function melonnpan.train().

  • MelonnPan currently requires input data that is specified using UniRef90 gene families (functionally profiled by HUMAnN2). If you do not have functionally profiled UniRef90 gene families from the human gut or other environments, you may need to first train a MelonnPan model using the MelonnPan-Train workflow and supply the resulting weights to the MelonnPan-Predict module to get the relevant predictions.
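Because this README does not list the arguments of the two functions, one cautious way to get started is to inspect their signatures directly rather than guess at them. The sketch below assumes only that the package is installed under the name melonnpan and that it defines the two functions named above; it prints whatever arguments it finds and makes no assumption about what they are.

```r
# Function names taken from the usage notes above.
fns <- c("melonnpan.predict", "melonnpan.train")

if (requireNamespace("melonnpan", quietly = TRUE)) {
  ns <- asNamespace("melonnpan")
  for (fn in fns) {
    if (exists(fn, envir = ns, inherits = FALSE)) {
      # Print the argument list and defaults without running the function.
      print(args(get(fn, envir = ns)))
    } else {
      message(fn, " was not found in the melonnpan namespace.")
    }
  }
} else {
  message("melonnpan is not installed; see the installation section.")
}
```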

Input

  • MelonnPan-Predict workflow requires the following input:
    • a table of microbial sequence features' relative abundances (samples in rows)
  • MelonnPan-Train workflow requires the following inputs:
    • a table of metabolite relative abundances (samples in rows)
    • a table of microbial sequence features' relative abundances (samples in rows)
  • For a complete description of the possible parameters for specific MelonnPan functions and their default values and output, run the help within R with the ? operator.
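As an illustration of the expected layout (samples in rows, features in columns, relative abundances), the following base R sketch builds a small table and round-trips it through a tab-delimited file. The sample IDs, feature names, and counts are invented, and the tab-delimited format is an assumption; check the function help for the exact format the package expects.

```r
# Hypothetical raw counts: 3 samples (rows) x 3 gene families (columns).
counts <- matrix(c(10, 30, 60,
                    5,  5, 90,
                   20, 20, 60),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("S1", "S2", "S3"),
                                 c("UniRef90_A", "UniRef90_B", "UniRef90_C")))

# Normalize each sample (row) so that its features sum to 1.
rel_abund <- prop.table(counts, margin = 1)

# Write and re-read a tab-delimited table with sample IDs as row names.
path <- tempfile(fileext = ".txt")
write.table(rel_abund, file = path, sep = "\t", quote = FALSE, col.names = NA)
metag <- read.table(path, header = TRUE, sep = "\t",
                    row.names = 1, check.names = FALSE)
print(metag)
```

Setting check.names = FALSE keeps feature identifiers exactly as written in the file, which matters when they must later be matched against the features of a trained model.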

Output

  • The MelonnPan-Predict workflow outputs the following:
    • MelonnPan_Predicted_Metabolites.txt: Predicted relative abundances of metabolites as determined by MelonnPan-Predict.
    • MelonnPan_RTSI.txt: Table summarizing RTSI scores per sample.
  • Similarly, the MelonnPan-Train workflow outputs the following:
    • MelonnPan_Training_Summary.txt: Significant compounds list with per-compound prediction accuracy (correlation coefficient) and the associated p-value and q-value.
    • MelonnPan_Trained_Metabolites.txt: Predicted relative abundances of statistically significant metabolites as determined by MelonnPan-Train.
    • MelonnPan_Trained_Weights.txt: Table summarizing coefficient estimates (weights) per compound.
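To show how a per-compound summary with a correlation coefficient, p-value, and q-value can be put together, here is a minimal sketch on simulated data. It assumes Spearman correlation and Benjamini-Hochberg adjustment, neither of which is stated in this README, and the compound names and column names are hypothetical, so treat it as an illustration of the idea rather than a reproduction of MelonnPan_Training_Summary.txt.

```r
set.seed(42)
n <- 30

# Hypothetical observed abundances for three metabolites (samples in rows).
observed <- matrix(rnorm(n * 3), nrow = n,
                   dimnames = list(NULL, c("m1", "m2", "m3")))

# Hypothetical predictions: m1 and m2 track the observations, m3 is pure noise.
predicted <- observed
predicted[, "m1"] <- observed[, "m1"] + rnorm(n, sd = 0.3)
predicted[, "m2"] <- observed[, "m2"] + rnorm(n, sd = 1.0)
predicted[, "m3"] <- rnorm(n)

# Per-compound correlation between observed and predicted values, with p-value.
stats <- t(sapply(colnames(observed), function(m) {
  ct <- cor.test(observed[, m], predicted[, m], method = "spearman")
  c(correlation = unname(ct$estimate), p.value = ct$p.value)
}))
summary_table <- data.frame(compound = rownames(stats), stats, row.names = NULL)

# Benjamini-Hochberg adjustment converts the p-values into q-values.
summary_table$q.value <- p.adjust(summary_table$p.value, method = "BH")
print(summary_table)
```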

Contributions

Thanks go to these wonderful people:

References

Zou H, Hastie T (2005). Regularization and Variable Selection via the Elastic Net. Journal of the Royal Statistical Society. Series B (Statistical Methodology) 67(2):301–320.

Franzosa EA et al. (2019). Gut microbiome structure and metabolic activity in inflammatory bowel disease. Nature Microbiology 4(2):293–305.

Citation

Mallick H, Franzosa EA, McIver LJ, Banerjee S, Sirota-Madi A, Kostic AD, Clish CB, Vlamakis H, Xavier R, Huttenhower C (2019). Predictive metabolomic profiling of microbial communities using amplicon or metagenomic sequences. Nature Communications 10(1):3136-3146.
