kussell-lab / mcorr

Licence: other
Inferring bacterial recombination rates from large-scale sequencing datasets.

Programming Languages

go
python

mcorr

Using Correlation Profiles of mutations to infer the recombination rate from large-scale sequencing data in bacteria.

Requirements

Installation

  1. Install mcorr-xmfa, mcorr-bam, and mcorr-fit from your terminal:
go get -u github.com/kussell-lab/mcorr/cmd/mcorr-xmfa
go get -u github.com/kussell-lab/mcorr/cmd/mcorr-bam
cd $HOME/go/src/github.com/kussell-lab/mcorr/cmd/mcorr-fit
python3 setup.py install

Alternatively, to install mcorr-fit into your user directory (~/.local/bin on Linux or ~/Library/Python/3.6/bin on macOS), run:

python3 setup.py install --user
  2. Add $HOME/go/bin and $HOME/.local/bin to your $PATH environment variable. On Linux, run the following in your terminal:
export PATH=$PATH:$HOME/go/bin:$HOME/.local/bin

On macOS, run the following instead:

export PATH=$PATH:$HOME/go/bin:$HOME/Library/Python/3.6/bin

We have tested installation on Windows 10, Ubuntu 17.10, and macOS Big Sur (on both Intel and M1 chips), using Python 3 and Go 1.15 and 1.16.

Typical installation time on an iMac is 10 minutes.
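
A quick way to confirm that the PATH change took effect is to check that each installed command resolves. The sketch below uses Python's standard `shutil.which`; the tool names come from the installation step above, and the helper name `find_missing_tools` is made up for illustration.

```python
import shutil

# Command names taken from the installation step of this README.
MCORR_TOOLS = ["mcorr-xmfa", "mcorr-bam", "mcorr-fit"]


def find_missing_tools(tools=MCORR_TOOLS):
    """Return the commands from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = find_missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All mcorr commands were found on PATH.")
```

If a command is reported missing, re-check that the directory it was installed into matches the one added to $PATH.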

Basic Usage

The inference of recombination parameters requires two steps:

  1. Calculate Correlation Profile

    1. For whole-genome alignments (multiple gene alignments), use mcorr-xmfa:

      mcorr-xmfa <input XMFA file> <output prefix>

      The XMFA files should contain only coding sequences. A description of the XMFA file format can be found at http://darlinglab.org/mauve/user-guide/files.html. We provide two useful pipelines to generate whole-genome alignments:

    2. For read alignments, use mcorr-bam:

      mcorr-bam <GFF3 file> <sorted BAM file> <output prefix>

      The GFF3 file is used to extract the coding regions from the sorted BAM file.

    3. For calculating correlation profiles between two clades or sequence clusters from whole-genome alignments, you can use mcorr-xmfa-2clades:

      mcorr-xmfa-2clades <input XMFA file 1> <input XMFA file 2> <output prefix>

      Here, file 1 and file 2 are the multiple gene alignments for the two clades.

    Each program produces two files:

    • a .csv file that stores the calculated Correlation Profile, which is used for fitting in the next step;
    • a .json file that stores the (intermediate) Correlation Profile for each gene.
  2. Fit the Correlation Profile using mcorr-fit:

    1. To fit correlation profiles as described in the 2019 Nature Methods paper, use mcorr-fit:

      mcorr-fit <.csv file> <output_prefix>

      It will produce four files:

      • <output_prefix>_best_fit.svg shows the plots of the Correlation Profile, fitting, and residuals;
      • <output_prefix>_fit_reports.txt shows the summary of the fitted parameters;
      • <output_prefix>_fit_results.csv shows the table of fitted parameters;
      • <output_prefix>_lmfit_report.csv shows goodness-of-fit statistics from LMFIT.
    2. To fit correlation profiles using the method from the Nature Methods paper and do model selection with AIC by comparing to the zero recombination case, use mcorrFitCompare:

      mcorrFitCompare <.csv file> <output_prefix>

      It will produce five files:

      • <output_prefix>_recombo_best_fit.svg and <output_prefix>_zero-recombo_best_fit.svg show the plots of the Correlation Profile, fitting, and residuals for the model with recombination and for the zero recombination case;
      • <output_prefix>_comparemodels.csv shows the table of fitted parameters and AIC values;
      • <output_prefix>_recombo_residuals.csv and <output_prefix>_zero-recombo_residuals.csv contain the residuals for the model with recombination and for the zero-recombination case.
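
To make the AIC-based model selection concrete, here is a minimal sketch of how two fits can be compared from their residuals. It uses the least-squares form of AIC that LMFIT documents, n ln(RSS/n) + 2k, where n is the number of data points, RSS the sum of squared residuals, and k the number of fitted parameters. Whether mcorrFitCompare computes exactly this quantity is an assumption; the function names, residual values, and parameter counts below are invented for illustration.

```python
import math


def aic_least_squares(residuals, n_params):
    """AIC for a least-squares fit: n * ln(RSS / n) + 2 * k."""
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    if n == 0 or rss == 0:
        raise ValueError("need at least one non-zero residual")
    return n * math.log(rss / n) + 2 * n_params


def preferred_model(aic_by_model):
    """Return the name of the model with the lowest AIC."""
    return min(aic_by_model, key=aic_by_model.get)


# Made-up residuals and parameter counts, for illustration only.
scores = {
    "recombo": aic_least_squares([0.1, -0.1, 0.1, -0.1], n_params=3),
    "zero-recombo": aic_least_squares([0.2, -0.2, 0.2, -0.2], n_params=2),
}
print(preferred_model(scores))
```

The model with the lower AIC is preferred: extra parameters are penalized, so the recombination model only wins when it reduces the residuals enough to justify them.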

Examples

  1. Inferring recombination rates of Helicobacter pylori from whole genome sequences of a set of global strains;
  2. Inferring recombination rates of Helicobacter pylori from reads sequenced from a transformation experiment.
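
The two steps under Basic Usage can also be chained from a script. The sketch below only assembles the two command lines shown there and runs them with Python's `subprocess` module. It assumes that mcorr-xmfa names its correlation profile `<output prefix>.csv`; the function names and the `_fit` suffix for the fitting output are hypothetical.

```python
import subprocess


def build_commands(xmfa_file, prefix):
    """Build the mcorr-xmfa and mcorr-fit command lines for one alignment."""
    # Assumption: mcorr-xmfa writes its correlation profile to "<prefix>.csv".
    profile_csv = prefix + ".csv"
    # Hypothetical naming choice for the mcorr-fit output prefix.
    fit_prefix = prefix + "_fit"
    return [
        ["mcorr-xmfa", xmfa_file, prefix],
        ["mcorr-fit", profile_csv, fit_prefix],
    ]


def run_pipeline(commands):
    """Run each command in order, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


# Example (requires the mcorr tools on PATH and a real XMFA file):
# run_pipeline(build_commands("genes.xmfa", "genes"))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.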