ababaian / LIONS

Licence: GPL-3.0 License
LIONS is a bioinformatic analysis pipeline which brings together a few pieces of software and some home-brewed scripts to annotate a paired-end RNAseq library to detect TE-initiated transcripts.

LIONS

Detecting TE-initiated transcripts from paired-end RNAseq

LIONS Publication ( sci-hub link )

LIONS is a bioinformatic analysis pipeline which brings together a few pieces of software and some home-brewed scripts to annotate a paired-end RNAseq library against a reference TE annotation set (such as RepeatMasker).

The East Lion scripts process BAM file input, re-align it to a genome, and build an ab initio assembly using Tophat2. This assembly is then processed, and local read searches are done at the 5' ends to find additional transcript start sites and to quality-control the 5' ends of the assembly. The output is a .lions file, which annotates the intersection between the assembly, a reference gene set, and a repeat set.

The West Lion scripts compile different .lions files, group them into biological categories (e.g. Cancer vs. Normal or Treatment vs. Control), and compare and analyze the data to create graphs and a meaningful interpretation of the results.

Installation

  1. Download the LIONS repo

  2. Install the dependencies for LIONS

  3. Initialize the LIONS 'Parameter Files' for your system:

    1. $LIONS_PATH/controls/<system>.sysctrl: System-specific variables
    2. $LIONS_PATH/controls/<project>.ctrl: Project-specific variables
    3. $LIONS_PATH/controls/<input>.ctrl: List of RNA-seq file inputs for project
  4. Add Reference / Annotation files for LIONS

  5. Populate the resource files. NOTE: UCSC files are downloaded from the UCSC Genome Browser. There is an example folder showing what the files should look like.

    1. In $LIONS_PATH/resources/<genomeName>/genome/ add a .fa genome sequence file
    2. In $LIONS_PATH/resources/<genomeName>/repeat/ add the UCSC RepeatMasker annotation
    3. (Optional) In $LIONS_PATH/resources/<genomeName>/annotation/ add the UCSC annotation for protein-coding genes
  6. Run the master lions.sh in bash:

    bash $LIONS_PATH/lions.sh <$LIONS_PATH/controls/parameter.ctrl>
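
The resource layout from step 5 can be prepared and sanity-checked before launching the pipeline. The following is only a minimal sketch, not part of LIONS itself: the helper names `lions_make_resources` and `lions_check_resources` and the genome name `hg19` are hypothetical, and the checks only confirm that files are present, not that their contents are valid.

```shell
# Hypothetical helpers (not shipped with LIONS) for the layout in step 5.

# Create the three resource sub-directories for one genome.
# $1 = LIONS install path, $2 = genome name (e.g. hg19, an assumed example)
lions_make_resources() {
    local lions_path="$1" genome="$2"
    mkdir -p "$lions_path/resources/$genome/genome" \
             "$lions_path/resources/$genome/repeat" \
             "$lions_path/resources/$genome/annotation"
}

# Report missing required inputs on stderr; return non-zero if any are missing.
lions_check_resources() {
    local lions_path="$1" genome="$2" missing=0
    local base="$lions_path/resources/$genome"

    # A .fa genome sequence file is required.
    if ! ls "$base"/genome/*.fa >/dev/null 2>&1; then
        echo "missing: $base/genome/*.fa" >&2
        missing=1
    fi

    # The repeat directory must contain at least one file.
    if [ -z "$(ls -A "$base/repeat" 2>/dev/null)" ]; then
        echo "missing: RepeatMasker annotation in $base/repeat/" >&2
        missing=1
    fi

    # The annotation directory is optional, so it is not checked.
    return "$missing"
}

# Example use (commented out; assumes LIONS_PATH is set):
#   lions_make_resources "$LIONS_PATH" hg19
#   ... copy the genome and RepeatMasker files into place ...
#   lions_check_resources "$LIONS_PATH" hg19 \
#       && bash "$LIONS_PATH/lions.sh" "$LIONS_PATH/controls/parameter.ctrl"
```

Running the check first means a missing genome or repeat file is reported immediately rather than partway through a long pipeline run.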
    

LIONS Folder Map

LIONS Error Codes

If you have any questions please email me: Artem Babaian. I'll do my best to respond and help get this working!
