dib-lab / Elvers

Licence: other
(formerly eelpond) an automated RNA-Seq workflow system

Programming Languages

python

elvers

                           ___
                        .-'   `'.
                       /         \
                      |           ;
                      |           |           ___.--,
             _.._     |O)  ~  (O) |    _.---'`__.-( (_.       
      __.--'`_.. '.__.\      '--. \_.-' ,.--'`     `""`
     ( ,.--'`   ',__ /./;     ;, '.__.'`    __
     _`) )  .---.__.' / |     |\   \__..--""  """--.,_
    `---' .'.''-._.-'`_./    /\ '.  \_.-~~~````~~~-.__`-.__.'
          | |  .' _.-' |    |  \  \  '.
           \ \/ .'     \    \   '. '-._)
            \/ /        \    \    `=.__`-~-.
            / /\         `)   )     / / `"".`\
      , _.-'.'\ \        /   /     (  (   /  /
       `--~`  )  )    .-'  .'       '.'. |  (
             (/`     (   (`           ) ) `-;
              `       '--;            (' 

elvers started as a snakemake update of the Eel Pond Protocol for de novo RNAseq analysis. It has evolved slightly to enable a number of workflows for (mostly) RNA data, which can all be run via the elvers workflow wrapper. elvers uses snakemake for workflow management and conda for software installation. The code can be found here.

Getting Started

Linux is the recommended OS. Nearly everything also works on macOS, but some programs (fastqc, Trinity) are troublesome there.

If you don't have conda yet, install miniconda (the commands below are for the Ubuntu 16.04 Jetstream image):

wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh

Be sure to answer 'yes' to all yes/no questions. You'll need to restart your terminal for conda to be active.

Create a working environment and install elvers!

elvers needs a few programs installed in order to run properly. To handle this, we run elvers within a conda environment that contains all dependencies.

Get the elvers code

git clone https://github.com/dib-lab/elvers.git
cd elvers

When you first get elvers, you'll need to create this environment on your machine:

conda env create --file environment.yml -n elvers-env

Now, activate that environment:

conda activate elvers-env

To deactivate after you've finished running elvers, type conda deactivate. You'll need to reactivate this environment anytime you want to run elvers.

Now, install the elvers package:

pip install -e .
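
Before moving on, it can help to confirm that the tools this setup relies on are visible from the active environment. The following is a minimal sketch, not part of elvers. It assumes that `conda`, `snakemake`, and `elvers` are the executable names placed on your PATH by the steps above, and the helper name `missing_tools` is made up for illustration.

```python
import shutil


def missing_tools(tools=("conda", "snakemake", "elvers")):
    """Return the names in `tools` that cannot be found on PATH."""
    # shutil.which returns None when no matching executable is found.
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        # Assumption: a missing name usually means elvers-env is not active.
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("conda, snakemake, and elvers were all found on PATH.")
```

If anything is reported as missing, the most likely cause is that the elvers-env environment has not been activated in the current terminal.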

Now you can start running workflows on test data!

Default workflow: Eel Pond Protocol for de novo RNAseq analysis

The Eel Pond protocol (which inspired the elvers name) included line-by-line commands that the user could follow along with using a test dataset provided in the instructions. We have re-implemented the protocol here to enable automated de novo transcriptome assembly, annotation, and quick differential expression analysis on a set of short-read Illumina data using a single command. See more about this protocol here.

To test the default workflow:

elvers examples/nema.yaml default

This will download a small set of Nematostella vectensis test data (from Tulin et al., 2013) and run the default workflow on it.
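
If you would rather launch the test run from a script, the same command can be wrapped with Python's standard library. This is only a sketch: the function names `elvers_command` and `run_elvers` are hypothetical, and the only thing taken from elvers itself is the `elvers <config> <workflow>` command form shown above.

```python
import subprocess


def elvers_command(config, target="default"):
    """Build the argument list for `elvers <config> <target>`."""
    return ["elvers", str(config), target]


def run_elvers(config, target="default"):
    """Run elvers and raise CalledProcessError on a non-zero exit status."""
    return subprocess.run(elvers_command(config, target), check=True)


# Example (requires an activated elvers-env):
# run_elvers("examples/nema.yaml", "default")
```

Keeping the command construction in its own function makes it easy to print or log the exact command before anything is executed.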

Running Your Own Data

To run your own data, you'll need to create one or more files:

  • a yaml file containing basic configuration info

This yaml config file must specify either:

  • a tsv file containing your read sample info, or
  • a reference file input (.fasta file and optional gene_trans_map)

Generate these files by following instructions here: Understanding and Configuring Workflows.
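
As a rough illustration of the sample tsv file, the sketch below builds a tab-separated table with Python's csv module. The column names (`sample`, `unit`, `fq1`, `fq2`, `condition`), the sample names, and the file paths are all assumptions made for this example. The required columns are defined in the elvers documentation linked above, so check there before using this for real data.

```python
import csv
import io

# Hypothetical column names: confirm the real ones in the elvers docs.
COLUMNS = ["sample", "unit", "fq1", "fq2", "condition"]


def samples_tsv_text(rows):
    """Render a list of row dictionaries as tab-separated text with a header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up samples and paths, for illustration only.
rows = [
    {"sample": "sampleA", "unit": "1", "fq1": "reads/sampleA_R1.fq.gz",
     "fq2": "reads/sampleA_R2.fq.gz", "condition": "control"},
    {"sample": "sampleB", "unit": "1", "fq1": "reads/sampleB_R1.fq.gz",
     "fq2": "reads/sampleB_R2.fq.gz", "condition": "treated"},
]
text = samples_tsv_text(rows)
print(text)
```

The resulting text can be saved to a file (for example with `pathlib.Path("my_samples.tsv").write_text(text)`) and then referenced from your yaml config.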

Available Workflows

  • preprocess: Read Quality Trimming and Filtering (fastqc, trimmomatic)
  • kmer_trim: Kmer Trimming and/or Digital Normalization (khmer)
  • assemble: Transcriptome Assembly (trinity)
  • annotate: Annotate the transcriptome (dammit)
  • sourmash_compute: Build sourmash signatures for the reads and assembly (sourmash)
  • quantify: Quantify transcripts (salmon)
  • diffexp: Conduct differential expression (DESeq2)
  • plass_assemble: assemble at the protein level with PLASS
  • paladin_map: map to a protein assembly using paladin

End-to-end workflows:

  • default: preprocess, kmer_trim, assemble, annotate, quantify
  • protein assembly: preprocess, kmer_trim, plass_assemble, paladin_map
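
One way to think about the lists above is that each end-to-end workflow is an ordered sequence of the individual workflows. The sketch below records that relationship in a plain dictionary. It is an illustration of the lists in this README, not elvers' internal representation, and the names `STEPS`, `END_TO_END`, and `expand` are made up for this example.

```python
# Individual workflows, as listed under "Available Workflows".
STEPS = {
    "preprocess", "kmer_trim", "assemble", "annotate", "sourmash_compute",
    "quantify", "diffexp", "plass_assemble", "paladin_map",
}

# End-to-end workflows and the ordered steps each one runs.
END_TO_END = {
    "default": ["preprocess", "kmer_trim", "assemble", "annotate", "quantify"],
    "protein assembly": ["preprocess", "kmer_trim", "plass_assemble", "paladin_map"],
}


def expand(name):
    """Return the ordered list of steps that a workflow name stands for."""
    if name in END_TO_END:
        return list(END_TO_END[name])
    if name in STEPS:
        return [name]
    raise ValueError(f"unknown workflow: {name!r}")


print(expand("default"))
print(expand("quantify"))
```

The two end-to-end workflows share their first two steps (preprocess and kmer_trim) and diverge only at the assembly stage.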

You can see the available workflows (and which programs they run) by using the --print_workflows flag:

elvers examples/nema.yaml --print_workflows

Each included tool can also be run independently if appropriate input files are provided. This is not always intuitive, so please see our documentation on running each tool for details (described as "Advanced Usage"). To see all available tools, run:

elvers examples/nema.yaml --print_rules

Citation information

This is pre-publication code; a manuscript is in preparation. Please contact the authors for the current citation information if you wish to use it and cite it.

Additional Info

To see the command-line help, run:

elvers -h
