
fulcrumgenomics / Fgbio

License: MIT
Tools for working with genomic and high throughput sequencing data.

Programming Languages

scala
5932 projects

Projects that are alternatives of or similar to Fgbio

Deeptools
Tools to process and analyze deep sequencing data.
Stars: ✭ 448 (+169.88%)
Mutual labels:  bioinformatics, ngs
Fusiondirect.jl
(No maintenance) Detect gene fusion directly from raw fastq files
Stars: ✭ 23 (-86.14%)
Mutual labels:  bioinformatics, ngs
Htslib
C library for high-throughput sequencing data formats
Stars: ✭ 529 (+218.67%)
Mutual labels:  bioinformatics, ngs
ctdna-pipeline
A simplified pipeline for ctDNA sequencing data analysis
Stars: ✭ 29 (-82.53%)
Mutual labels:  bioinformatics, ngs
Bioconvert
Bioconvert is a collaborative project to facilitate the interconversion of life science data from one format to another.
Stars: ✭ 112 (-32.53%)
Mutual labels:  bioinformatics, ngs
platon
Identification & characterization of bacterial plasmid-borne contigs from short-read draft assemblies.
Stars: ✭ 52 (-68.67%)
Mutual labels:  bioinformatics, ngs
Manorm
A robust model for quantitative comparison of ChIP-Seq data sets.
Stars: ✭ 16 (-90.36%)
Mutual labels:  bioinformatics, ngs
OpenGene.jl
(No maintenance) OpenGene, core libraries for NGS data analysis and bioinformatics in Julia
Stars: ✭ 60 (-63.86%)
Mutual labels:  bioinformatics, ngs
Gatk
Official code repository for GATK versions 4 and up
Stars: ✭ 1,002 (+503.61%)
Mutual labels:  bioinformatics, ngs
Migmap
HTS-compatible wrapper for IgBlast V-(D)-J mapping tool
Stars: ✭ 38 (-77.11%)
Mutual labels:  bioinformatics, ngs
gencore
Generate duplex/single consensus reads to reduce sequencing noises and remove duplications
Stars: ✭ 91 (-45.18%)
Mutual labels:  bioinformatics, ngs
Ngless
NGLess: NGS with less work
Stars: ✭ 115 (-30.72%)
Mutual labels:  bioinformatics, ngs
peppy
Project metadata manager for PEPs in Python
Stars: ✭ 29 (-82.53%)
Mutual labels:  bioinformatics, ngs
Jvarkit
Java utilities for Bioinformatics
Stars: ✭ 313 (+88.55%)
Mutual labels:  bioinformatics, ngs
reg-gen
Regulatory Genomics Toolbox: Python library and set of tools for the integrative analysis of high throughput regulatory genomics data.
Stars: ✭ 64 (-61.45%)
Mutual labels:  bioinformatics, ngs
Galaxy
Data intensive science for everyone.
Stars: ✭ 812 (+389.16%)
Mutual labels:  bioinformatics, ngs
catch
A package for designing compact and comprehensive capture probe sets.
Stars: ✭ 55 (-66.87%)
Mutual labels:  bioinformatics, ngs
SVCollector
Method to optimally select samples for validation and resequencing
Stars: ✭ 20 (-87.95%)
Mutual labels:  bioinformatics, ngs
Fastp
An ultra-fast all-in-one FASTQ preprocessor (QC/adapters/trimming/filtering/splitting/merging...)
Stars: ✭ 966 (+481.93%)
Mutual labels:  bioinformatics, ngs
Ugene
UGENE is free open-source cross-platform bioinformatics software
Stars: ✭ 112 (-32.53%)
Mutual labels:  bioinformatics, ngs


fgbio

A set of tools to analyze genomic data with a focus on Next Generation Sequencing. This README is mostly for developers/contributors and those attempting to build the project from source. Detailed user documentation, including tool usage and documentation of the metrics produced, is available on the project website. Detailed developer documentation is published separately.

Goals

There are many toolkits available for analyzing genomic data; fgbio does not aim to be all things to all people but is specifically focused on providing:

  • Robust, well-tested tools.
  • An easy-to-use command line.
  • Clear and thorough documentation for each tool.
  • Open source development for the benefit of the community and our clients.

Overview

Fgbio is a set of command-line tools to perform bioinformatic/genomic data analysis. The collection of tools within fgbio is used by our customers and others both for ad-hoc data analysis and within production pipelines. These tools typically operate on read-level data (e.g. FASTQ, SAM, or BAM) or variant-level data (e.g. VCF or BCF). They range from simple tools to filter reads in a BAM file, to tools to compute consensus reads from reads with the same molecular index/tag. See the list of tools for more detail.

List of tools

For a full list of available tools please see the tools section of the project website.

Below we highlight a few tools that you may find useful.

  • Tools for working with Unique Molecular Indexes (UMIs, aka Molecular IDs or MIDs).
    • Annotating/extracting UMIs from read-level data: AnnotateBamWithUmis and ExtractUmisFromBam.
    • Tools to manipulate read-level data containing UMIs: CorrectUmis, GroupReadsByUmi, CallMolecularConsensusReads and CallDuplexConsensusReads.
  • Tools to manipulate read-level data:
    • FASTQ manipulation: DemuxFastqs and FastqToBam.
    • Filter read-level data: FilterBam.
    • Clipping of reads: ClipBam.
    • Randomize the order of read-level data: RandomizeBam.
    • Update read-level metadata: SetMateInformation and UpdateReadGroups.
  • Quality assessment tools:
    • Detailed substitution error rate evaluation: ErrorRateByReadPosition.
    • Sample pooling QC: EstimatePoolingFractions.
    • Splice-aware insert size QC for RNA-seq libraries: EstimateRnaSeqInsertSize.
    • Assessment of duplex sequencing experiments: CollectDuplexSeqMetrics.
  • Miscellaneous tools:
    • Pick molecular indices (e.g. sample barcodes or molecular indexes): PickIlluminaIndices and PickLongIndices.
    • Convert the output of HAPCUT (a tool for phasing variants): HapCutToVcf.
    • Find technical or synthetic sequences in read-level data: FindTechnicalReads.
    • Assess phased variant calls: AssessPhasing.

Building

Cloning the Repository

Git LFS is used to store large files used in testing fgbio. In order to compile and run tests it is necessary to install git lfs. To retrieve the large files either:

  1. Clone the repository after installing git lfs, or
  2. In a previously cloned repository, run git lfs pull once.

After initial setup, regular git commands (e.g. pull, fetch, push) will also operate on large files, and no special handling is needed.

To clone the repository: git clone https://github.com/fulcrumgenomics/fgbio.git
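The two retrieval options above can be sketched as one small shell function. This is a minimal sketch, assuming git and the git-lfs extension are already installed; the function name fetch_fgbio and the default target directory are hypothetical and not part of fgbio.

```shell
# Sketch: fetch fgbio together with its Git LFS test files.
# fetch_fgbio and its default directory are illustrative names only.
set -eu

fetch_fgbio() {
  local dest="${1:-fgbio}"   # hypothetical target directory
  git lfs install            # register the LFS filters for the current user
  if [ -d "$dest/.git" ]; then
    # Option 2: a clone made before git lfs was installed
    git -C "$dest" lfs pull
  else
    # Option 1: a fresh clone; large files are fetched during checkout
    git clone https://github.com/fulcrumgenomics/fgbio.git "$dest"
  fi
}
```

Calling fetch_fgbio with no argument clones into ./fgbio; passing the path of an existing clone runs git lfs pull there instead.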

Running the build

fgbio is built using sbt.

Use sbt assembly to build an executable jar in target/scala-2.13/.

Tests may be run with sbt test.

Java SE 8 is required.
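As a rough illustration, the build steps above might be combined as follows. This sketch assumes sbt and java are on the PATH and reads "Java SE 8" as a runtime that reports a 1.8.x version string; whether newer JDKs also work is not stated here, so the check only warns. The names is_java8 and build_fgbio are hypothetical helpers.

```shell
# Sketch: check the Java version, then test and assemble with sbt.
# is_java8 and build_fgbio are illustrative names, not part of fgbio.
set -eu

# Succeeds when the given `java -version` output reports a 1.8.x runtime.
is_java8() {
  case "$1" in
    *'version "1.8.'*) return 0 ;;
    *) return 1 ;;
  esac
}

build_fgbio() {
  # java -version writes to stderr, hence the redirect.
  if ! is_java8 "$(java -version 2>&1)"; then
    echo "warning: Java SE 8 not detected; the build may fail" >&2
  fi
  sbt test       # run the test suite (needs the Git LFS files)
  sbt assembly   # write the executable jar to target/scala-2.13/
}
```

Run build_fgbio from the root of the cloned repository.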

Command line

Run java -jar target/scala-2.13/fgbio-<version>.jar to see the supported commands. Use java -jar target/scala-2.13/fgbio-<version>.jar <command> to see the help message for a particular command.
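Because the jar name includes the version, a small wrapper can save typing. The following is a minimal sketch that assumes the jar was built with sbt assembly and that the wrapper is called from the repository root; find_fgbio_jar and the fgbio shell function are hypothetical conveniences, not something shipped with the project.

```shell
# Sketch: locate the assembled jar and run fgbio through it.
# find_fgbio_jar and fgbio() are illustrative names only.
set -eu

# Print the path of the first fgbio jar under the given build directory.
find_fgbio_jar() {
  local dir="${1:-target/scala-2.13}"
  local jar
  for jar in "$dir"/fgbio-*.jar; do
    if [ -f "$jar" ]; then
      echo "$jar"
      return 0
    fi
  done
  echo "no fgbio jar in $dir; run 'sbt assembly' first" >&2
  return 1
}

# Forward all arguments to the jar.
fgbio() {
  local jar
  jar="$(find_fgbio_jar)" || return 1
  java -jar "$jar" "$@"
}
```

With these defined, fgbio alone lists the supported commands and fgbio FilterBam prints the help message for FilterBam.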

Include fgbio in your project

You can include fgbio in your project using:

"com.fulcrumgenomics" %% "fgbio" % "1.0.0"

for the latest released version or (buyer beware):

"com.fulcrumgenomics" %% "fgbio" % "0.9.0-<commit-hash>-SNAPSHOT"

for the latest development snapshot.

Contributing

Contributions are welcome and encouraged. We will do our best to provide an initial response to any pull request or issue within one week. For urgent matters, please contact us directly.

Authors

License

fgbio is open source software released under the MIT License.

Sponsorship

Become a sponsor

As a free and open source project, fgbio relies on the support of the community of users for its development. If you work for an organization that uses and benefits from fgbio, please consider supporting fgbio. There are different ways to do so, such as employing people to work on fgbio, funding the project, or becoming a sponsor to support the broader ecosystem. Please contact [email protected] to discuss.

Sponsors

Sponsors provide support for fgbio through direct funding or employing contributors. Public sponsors include:

Fulcrum Genomics   TwinStrand Biosciences   Jumpcode Genomics   iGenomX   Myriad Genetics   Mission Bio   Singular Genomics   Verogen   Integrated DNA Technologies   Strata Oncology

The full list of sponsors supporting fgbio is available on the sponsor page.
