
alexpreynolds / soda

Licence: MIT license
Python-based UCSC genome browser snapshot-taker and gallery-maker



soda

Python-based UCSC genome browser snapshot gallery-maker

Description

soda is a Python script that generates a gallery of images made from snapshots from a UCSC genome browser instance, so-called "soda plots". Snapshots can also be taken from an external browser instance by pointing soda to that instance's host name.

You provide the script with four required parameters:

  1. A BED-formatted file containing your regions of interest.
  2. The genome build name, such as hg19, hg38, mm10, etc.
  3. The session ID from your genome browser session, which specifies the browser tracks you want to visualize, as well as other visual display parameters that are specific to your session.
  4. Where you want to store the gallery end-product.

If the BED file contains a fourth column (commonly used to store the name of the region), its values are used as labels for each page in the gallery.
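
As a rough illustration of how page labels could be derived from a BED file, the sketch below reads tab-delimited records and uses the fourth column when it is present. The function name region_labels and the chrom:start-end fallback for three-column records are assumptions made for this example; they are not taken from the soda source code.

```python
def region_labels(bed_lines):
    """Return one label per BED record, in input order."""
    labels = []
    for line in bed_lines:
        line = line.rstrip("\n")
        # Skip blank lines and common header or comment lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        if len(fields) >= 4 and fields[3]:
            # A name column is present, so use it as the label.
            labels.append(fields[3])
        else:
            # Assumed fallback for this sketch; soda's own behavior may differ.
            labels.append(f"{chrom}:{start}-{end}")
    return labels


example = ["chr1\t1000\t2000\tpromoter_A\n", "chr2\t500\t900\n"]
print(region_labels(example))
```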

Additional options are available; please see the Options section.

Note

The BED file does not need to be in BEDOPS sort-bed order. In fact, it can be useful to order the regions in a BED file by some criteria other than genomic position, such as some numerical value stored in the BED file's score column, e.g.:

$ sort -k5,5n input.bed > input_sorted_by_scores.bed

Any ordering is allowed. Gallery snapshots are presented in the same order as rows in the input BED file.
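
The same reordering can be done in Python, which may be convenient when the ordering criterion is more involved than a single column. This is a minimal sketch, assuming a tab-delimited file with a numeric score in the fifth column; sort_bed_by_score is a hypothetical helper and not part of soda.

```python
def sort_bed_by_score(bed_lines, descending=False):
    """Return BED lines ordered by the numeric value in column 5 (score)."""
    records = [line.rstrip("\n") for line in bed_lines if line.strip()]
    # Python's sort is stable, so records with equal scores keep their input order.
    return sorted(
        records,
        key=lambda rec: float(rec.split("\t")[4]),
        reverse=descending,
    )


example = [
    "chr1\t100\t200\tpeak_a\t12.5",
    "chr2\t300\t400\tpeak_b\t3",
    "chr3\t500\t600\tpeak_c\t40",
]
# Highest-scoring regions first, so they appear first in the gallery.
for record in sort_bed_by_score(example, descending=True):
    print(record)
```

Passing descending=True puts the highest-scoring regions at the front of the gallery, which the ascending shell command above does not do.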

Installation

Set up a virtual environment via virtualenv or conda create and activate it. Then install via pip:

$ python3 -m pip install soda-gallery

Or via Bioconda:

$ conda config --add channels bioconda
$ conda install soda-gallery

Development

Clone it from GitHub and install locally into a virtual environment:

$ git clone https://github.com/alexpreynolds/soda.git
$ cd soda
$ python3 -m pip install -e .

Usage

As a usage example, you may have a BED file in some home directory called /home/abc/regions.bed. You have a session ID from the UCSC genome browser called 123456_abcdef, with all your tracks selected and display parameters set, using hg38 as the reference genome build. Finally, you want to store the results in a folder called /home/abc/my-soda-plot-results:

$ soda -r "/home/abc/regions.bed" -b "hg38" -s "123456_abcdef" -o "/home/abc/my-soda-plot-results"
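
When soda is called from a pipeline rather than typed by hand, the command can be assembled programmatically. The sketch below uses only the four required flags documented here; the helper names build_soda_command and run_soda are hypothetical. Because soda exits with a fatal error when the output directory already exists (see the --outputDir option), the wrapper checks for that before launching.

```python
import os
import subprocess


def build_soda_command(regions_fn, build_id, session_id, output_dir):
    """Assemble the argument list for the four required soda options."""
    return [
        "soda",
        "-r", regions_fn,
        "-b", build_id,
        "-s", session_id,
        "-o", output_dir,
    ]


def run_soda(regions_fn, build_id, session_id, output_dir):
    """Run soda, failing early if the output directory already exists."""
    if os.path.exists(output_dir):
        raise FileExistsError(f"output path already exists: {output_dir}")
    # check=True raises CalledProcessError if soda exits with a non-zero status.
    subprocess.run(
        build_soda_command(regions_fn, build_id, session_id, output_dir),
        check=True,
    )


cmd = build_soda_command(
    "/home/abc/regions.bed",
    "hg38",
    "123456_abcdef",
    "/home/abc/my-soda-plot-results",
)
print(" ".join(cmd))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.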

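If you run this locally, you can open the result folder's index.html file with your web browser to load the gallery. For example, on macOS (OS X), where the results folder would typically live under /Users/abc rather than /home/abc, you can run the following from the Terminal application: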

$ open /Users/abc/my-soda-plot-results/index.html

which opens the gallery index in your default web browser.

Options

A full listing of options is available via soda --help.

Required

Four options are required:

-r, --regionsFn

Use -r or --regionsFn to specify the path to the input BED file containing regions of interest.

-b, --browserBuildID

The -b or --browserBuildID option specifies the genome build, e.g., hg19, mm10, etc.

-s, --browserSessionID

The -s or --browserSessionID option specifies the browser session ID, which references a configuration of tracks and display parameters from a genome browser instance.

-o, --outputDir

Use the -o or --outputDir option to specify where the image gallery is saved. If this path already exists, soda will exit with a fatal error message.

Optional

Other options are available depending on how you want to customize the run.

-t, --title

Use -t or --title to specify a gallery title.

[ -i, --addIntervalAnnotation | -d, --addMidpointAnnotation ]

Use -i or --addIntervalAnnotation to add a rectangle underneath all tracks that demarcates the original genomic range (useful when used with --range). Alternatively, use -d or --addMidpointAnnotation to add a vertical line underneath all tracks, centered on the midpoint of the input genomic range. In both cases, the annotation is labeled with the genomic coordinates of the original interval or the calculated midpoint, respectively. The two options are mutually exclusive and cannot be specified together.
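
For readers writing their own wrappers, an either-or rule like this one is commonly expressed with an argparse mutually exclusive group. The sketch below is an illustration of that technique using the two flag names above; it is not claimed to be how soda itself parses its options, and the program name soda-sketch is made up.

```python
import argparse


def make_parser():
    """Build a parser that accepts -i or -d, but never both."""
    parser = argparse.ArgumentParser(prog="soda-sketch")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("-i", "--addIntervalAnnotation", action="store_true")
    group.add_argument("-d", "--addMidpointAnnotation", action="store_true")
    return parser


args = make_parser().parse_args(["-i"])
print(args.addIntervalAnnotation, args.addMidpointAnnotation)
```

Supplying both -i and -d to this parser makes argparse print a usage error and exit.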

-w, --annotationRgba
-z, --annotationFontPointSize
-f, --annotationFontFamily

When used with -i or -d to add an interval or midpoint annotation, these options may be used to override the default rgba() color, typeface point size, and typeface family (where supported by the local installation of ImageMagick), which are parameters used to render the appearance of the annotation components. The default color is rgba(255, 0, 0, 0.333) and the default point size and font family values are 5 and Helvetica-Bold, respectively.
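
Before passing a custom color to --annotationRgba, it can help to check that the string is well formed. The following sketch validates a value written in the same rgba(r, g, b, a) form as the default; the function parse_rgba and the exact set of accepted spellings are assumptions for this example, and soda or ImageMagick may accept other forms.

```python
import re

# Matches strings such as "rgba(255, 0, 0, 0.333)".
_RGBA = re.compile(
    r"^rgba\(\s*(\d{1,3})\s*,\s*(\d{1,3})\s*,\s*(\d{1,3})\s*,\s*(\d*\.?\d+)\s*\)$"
)


def parse_rgba(text):
    """Return (r, g, b, alpha) or raise ValueError for a malformed value."""
    match = _RGBA.match(text.strip())
    if match is None:
        raise ValueError(f"not an rgba() value: {text!r}")
    r, g, b = (int(match.group(i)) for i in (1, 2, 3))
    alpha = float(match.group(4))
    # Channels are limited to 0-255 and alpha to the range 0-1.
    if max(r, g, b) > 255 or not 0.0 <= alpha <= 1.0:
        raise ValueError(f"rgba() component out of range: {text!r}")
    return r, g, b, alpha


print(parse_rgba("rgba(255, 0, 0, 0.333)"))
```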

-a, --range

Use the -a or --range option to pad the BED input's midpoint symmetrically by the specified number of bases. This works regardless of the sort order of the input.
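
The arithmetic behind symmetric padding can be sketched as follows. The integer midpoint and the clamp of the start coordinate at zero are assumptions made for illustration; the soda documentation does not state how it rounds the midpoint or handles regions near the start of a chromosome.

```python
def pad_midpoint(start, end, pad):
    """Return (midpoint, padded_start, padded_end) for a BED interval."""
    # Assumed: integer midpoint, rounded down.
    midpoint = start + (end - start) // 2
    # Assumed: the padded start is never allowed to go below zero.
    padded_start = max(0, midpoint - pad)
    padded_end = midpoint + pad
    return midpoint, padded_start, padded_end


# A 2 kb region padded by 5000 bases on each side of its midpoint.
print(pad_midpoint(100000, 102000, 5000))  # (101000, 96000, 106000)
```

Every padded window has the same width regardless of the original interval's size, which keeps gallery pages visually comparable.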

-g, --browserURL

Use the -g or --browserURL option to specify a genome browser URL other than that of the UCSC genome browser. If a different host is specified and credentials are required, please use the -u and -p options (see below).

-u, --browserUsername
-p, --browserPassword

Use these two options to specify a username and password for the browser instance, if you pick a different --browserURL and that browser instance requires basic credentials. If these options are not specified, no credentials are passed along. If authentication is required and it fails, soda may exit with an error.
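
To make "basic credentials" concrete: assuming this refers to standard HTTP Basic authentication, the username and password are joined with a colon, base64-encoded, and sent in an Authorization header. The sketch below shows that encoding with the Python standard library; basic_auth_header is a hypothetical helper, and how soda passes credentials internally is not described here.

```python
import base64


def basic_auth_header(username, password):
    """Build the Authorization header used by HTTP Basic authentication."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


print(basic_auth_header("Aladdin", "open sesame"))
```

Base64 is an encoding, not encryption, so Basic credentials should only be sent to a browser instance over HTTPS.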

-y, --useKerberosAuthentication

Use this option if access to your custom browser instance requires a Kerberos ticket (obtained via kinit, for example).

-v, --verbose

Use -v or --verbose to print debug messages, which may be useful for automation or debugging.

Credits

The general "soda" gallery tool has been authored in various bash- and Perl-flavored incarnations since 2008 by primary authors Richard Sandstrom and Scott Kuehn, with modifications over time by Bob Thurman, Jay Hesselberth, Richard Humbert, Brady Miller and Alex Reynolds.

This Python rewrite and new functionality were authored by Alex Reynolds.

This tool uses the blueimp Gallery and GitHub Octicons projects, both of which are MIT-licensed.

So what's up with the name?

"Scott Kuehn: he came up with the name. The legend is that when asked what to call the program, he lifted a can of cola, sighed deeply, and said: 'soda plot'." - R. Sandstrom
