sunbeam-labs / sunbeam

Licence: other
A robust, extensible metagenomics pipeline

Sunbeam: a robust, extensible metagenomic sequencing pipeline

DOI: 10.1186/s40168-019-0658-x

Sunbeam is a pipeline written in Snakemake that simplifies and automates many of the steps in metagenomic sequencing analysis. It uses conda to manage dependencies, so it requires no pre-installed dependencies or admin privileges and can be deployed on most Linux workstations and clusters. To read more, check out our paper in Microbiome.

Sunbeam currently automates the following tasks:

  • Quality control, including adaptor trimming, host read removal, and quality filtering;
  • Taxonomic assignment of reads to databases using Kraken;
  • Assembly of reads into contigs using Megahit;
  • Contig annotation using BLAST[n/p/x];
  • Mapping of reads to target genomes; and
  • ORF prediction using Prodigal.

Sunbeam was designed to be modular and extensible. Some extensions have been built for:

  • IGV for viewing read alignments
  • KrakenHLL, an alternate read classifier
  • Kaiju, a read classifier that uses the Burrows-Wheeler transform rather than k-mers
  • Anvi'o, a downstream analysis and visualization platform for 'omics data

More extensions can be found at the extension page: https://www.sunbeam-labs.org/.

To get started, see our documentation!

If you use the Sunbeam pipeline in your research, please cite:

EL Clarke, LJ Taylor, C Zhao, A Connell, J Lee, FD Bushman, K Bittinger. Sunbeam: an extensible pipeline for analyzing metagenomic sequencing experiments. Microbiome 7:46 (2019)


Changelog:

v3.0.0 (June 27, 2022)

  • Support use of .smk file extensions in Sunbeam extensions (in addition to .rules)
  • Use Snakemake's built-in environment management to separate dependencies and shrink environments
  • Support mamba as an alternate package dependency solver at install time, for faster installs
  • New command sunbeam extend to automatically install Sunbeam extensions! Usage: sunbeam extend https://github.com/sunbeam-labs/sbx_report
  • sunbeam init and sunbeam config update now add options for extensions you've installed to your default config file! (#247)
  • Updated the path to the Illumina adapter sequences from hardcoded to templated (fixes #150 and #152)
  • Use the updated kraken2 classifier instead of kraken
  • Update other dependencies (trimmomatic -> 0.3.9; grabseqs -> 0.6.1; snakemake -> <5.7.0)
  • Use diamond instead of blastx/p for a significant speed increase
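
As an illustration of the first item in this release (accepting .smk alongside .rules), here is a minimal sketch of how rule files in an extensions folder could be discovered. The function name find_extension_rules and the one-folder-per-extension layout are assumptions made for this example, not Sunbeam's actual implementation.

```python
from pathlib import Path

# Both suffixes are accepted for extension rule files as of v3.0.0.
RULE_SUFFIXES = (".rules", ".smk")


def find_extension_rules(extensions_dir):
    """Return rule files found one level below extensions_dir, sorted by path.

    Hypothetical helper: assumes each extension lives in its own subfolder.
    """
    root = Path(extensions_dir)
    return sorted(
        p for p in root.glob("*/*")
        if p.is_file() and p.suffix in RULE_SUFFIXES
    )
```

Files with other suffixes (a README, a config template) are ignored, so an extension can ship either kind of rule file without further configuration.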

v2.1.0 (November 26, 2019)

  • Added a build manifest, which is produced on every integration test run and can be fed into conda by users to install the most recent set of dependencies that built successfully
  • Updates to documentation (#169, #230, #231)
  • Fix missing samtools (#224)
  • Integration test updates to schedule weekly builds (#222)
  • Fix issues with old paired-end Illumina adapters (#221)
  • Script updates to use conda commands instead of source commands (#220)
  • Add h5py package explicitly to avoid dependency metadata problem (#219)
  • Add MultiQC to build QC report (#203)
  • Use multithreading for cutadapt in QC (#202)
  • Correct conda channel priority during install (#201)
  • Update documentation to spell out requirements (#199)
  • New megahit failure handling (#194)
  • Enforce sample wildcard constraints in Snakemake rules (#190)
  • Run megahit multithreaded (#189)
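
To show why constraining the sample wildcard matters (#190), the sketch below mimics wildcard matching with plain regular expressions. Snakemake treats an unconstrained wildcard as the regex .+, so it can match unintended file names. The helper wildcard_regex, the file-name template, and the sample names are all hypothetical; this is not Sunbeam's or Snakemake's code.

```python
import re


def wildcard_regex(template, constraints):
    """Compile a '{name}' template into a regex.

    Wildcards without an entry in `constraints` fall back to '.+',
    which is how an unconstrained wildcard behaves.
    """
    parts = []
    pos = 0
    for m in re.finditer(r"\{(\w+)\}", template):
        parts.append(re.escape(template[pos:m.start()]))  # literal text
        name = m.group(1)
        parts.append(f"(?P<{name}>{constraints.get(name, '.+')})")
        pos = m.end()
    parts.append(re.escape(template[pos:]))  # trailing literal text
    return re.compile("".join(parts))


TEMPLATE = "{sample}_{rp}.fastq.gz"  # illustrative naming scheme

loose = wildcard_regex(TEMPLATE, {})
strict = wildcard_regex(TEMPLATE, {"sample": "stoolA|stoolB", "rp": "1|2"})
```

With these definitions, loose.fullmatch("stoolA_unpaired.fastq.gz") succeeds with rp set to "unpaired", while strict.fullmatch rejects the same name and still accepts "stoolA_1.fastq.gz".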

v2.0.2 (August 28, 2019)

  • Add implicit dependencies (samtools and bcftools) to environment file to make them explicit

v2.0.1 (July 24, 2019)

  • Increment Snakemake version requirement for compatibility with recent conda
  • Specify earlier megahit version to ensure compatibility with existing assembly behavior
  • Integration test improvements

v2.0.0 (January 22, 2019)

  • Start a project using resources directly from the SRA using sunbeam init --data_acc [SRA ###]. For more information, see the docs
  • New extension website: https://www.sunbeam-labs.org/
  • Improved documentation
  • Numerous bugfixes and optimizations

v1.2.1 (May 24, 2018)

  • Minor bugfixes

v1.2.0 (May 2, 2018)

  • Low-complexity reads are now removed by default rather than masked
  • Bug fixes related to single-end sequencing experiments
  • Documentation updates
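
The difference between removing and masking low-complexity reads can be sketched as follows. The score used here (unique k-mers divided by read length) follows the general idea behind komplexity, but the function names, the default threshold of 0.55, k=4, and whole-read masking with N are assumptions for illustration and may not match the real tool or Sunbeam's defaults.

```python
def complexity(seq, k=4):
    """Score a read as (number of distinct k-mers) / (read length)."""
    if len(seq) < k:
        return 0.0
    kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
    return len(kmers) / len(seq)


def handle_low_complexity(reads, threshold=0.55, k=4, mode="remove"):
    """Filter (name, sequence) pairs by complexity.

    mode="remove": drop reads scoring below the threshold.
    mode="mask":   keep them, but replace every base with 'N'
                   (a simplification of real masking).
    """
    if mode not in ("remove", "mask"):
        raise ValueError("mode must be 'remove' or 'mask'")
    kept = []
    for name, seq in reads:
        if complexity(seq, k) >= threshold:
            kept.append((name, seq))
        elif mode == "mask":
            kept.append((name, "N" * len(seq)))
    return kept
```

Removal shrinks the read set passed to later steps, whereas masking preserves read counts and pairing at the cost of carrying uninformative sequences downstream.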

v1.1.0 (April 8, 2018)

  • Reports include number of filtered reads per host, rather than in aggregate
  • Static binary dependency for komplexity for easier deployment
  • Remove max length filter for contigs

v1.0.0 (March 22, 2018)

  • First stable release!
  • Support for single-end sequencing experiments
  • Low-complexity read masking via komplexity
  • Support for extensions
  • Documentation on ReadTheDocs.io
  • Better assembler (megahit)
  • Better ORF finder (prodigal)
  • Can remove reads from any number of host/contaminant genomes
  • Semantic versioning checks
  • Integration tests and continuous deployment
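
A semantic versioning check of the kind listed above might look like the sketch below, which compares the version recorded in a config file against the installed package version. The compatibility policy (same major version, config not newer than the package) and the function names are assumptions for this example rather than Sunbeam's documented behavior.

```python
def parse_version(text):
    """Parse 'MAJOR.MINOR.PATCH' (optional leading 'v') into a tuple of ints."""
    major, minor, patch = text.lstrip("v").split(".")
    return int(major), int(minor), int(patch)


def is_compatible(config_version, package_version):
    """Hypothetical policy: majors must match and the config must not be newer."""
    cfg = parse_version(config_version)
    pkg = parse_version(package_version)
    return cfg[0] == pkg[0] and cfg <= pkg
```

Under this policy a config written for 2.0.1 works with package 2.1.0, while a 1.2.1 config is rejected by 2.0.0 because the major version changed.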
