fomightez / sequencework

Licence: other
programs and scripts, mainly python, for analyses related to nucleic or protein sequences

sequencework

Mainly python scripts related to nucleic or protein sequence work

I sorely need to put an index here with links that better guide you to the appropriate folders. <== TO DO
(For now, look at the titles of the folders to discern whether something is of interest.)

Descriptions of the scripts are found within README.md files in the sub folders.

Several have demonstrations in sessions served by MyBinder.org from my associated command-line sequence repo; however, it is probably best to follow the guide listed with each individual script so that you quickly find the right location. If you already know where you are going, you can launch a session via this button:

Binder

Related 'Binderized' Utilities

Collection of links to launchable Jupyter environments where various sequence analysis tools work WITHOUT ANY NEED FOR ADDITIONAL EFFORT/INSTALLS. Many of my recent scripts are built with use in these environments in mind:

(Many of these include/feature Biopython, too, but I haven't made a single all-encompassing one for that yet, since I use it a lot as an underlying library.)

  • patmatch-binder - launchable Jupyter sessions for running command line-based PatMatch in a Jupyter environment provided via Binder (Perl and Python-based).

  • blast-binder - launchable Jupyter sessions for running command line-based BLAST+ in a Jupyter environment provided via Binder.

  • InterMine-binder - InterMine Web Services available in a Jupyter environment running via the Binder service. (See the guide to getting started with using InterMine sites and Jupyter using MyBinder-served Jupyter notebooks.)

  • mcscan-binder - MCscan software available in a launchable Jupyter environment running via the Binder service (Python 2-based), with an example workflow and some other use examples.

  • mcscan-blast-binder - MCscan and BLAST+ command line software available in a launchable Jupyter environment running via the Binder service (Python 2-based).

  • synchro-binder - SynChro software available in a launchable Jupyter environment running via the Binder service with Quick start and some other illustrations of its use.

  • cl_sq_demo-binder - launchable, working Jupyter-based environment that has a collection of demonstrations of useful resources on the command line (or usable in Jupyter sessions) for manipulating sequence files. (Note: this was started after several other demo notebooks, many meant to be static, were made for sequence scripts; hopefully those will slowly be added here as well so that they are available in active form.)

  • clausen_ribonucleotides binder - Analyze ribonucleotide incorporation data from Clausen et al. 2015 using the script plot_5prime_end_from_bedgraph.py.

  • circos-binder - Circos software available in a launchable Jupyter environment running via the Binder service, with tutorials illustrating use (TBD) (Perl and Python-based).

Related resources by others

genomepy - "Install and use genomes & gene annotations the easy way!
genomepy is designed to provide a simple and straightforward way to download and use genomic data. This includes (1) searching available data, (2) showing the available metadata, (3) automatically downloading, preprocessing and matching data and (4) generating optional aligner indexes. All with sensible, yet controllable defaults. Currently, genomepy supports UCSC, Ensembl and NCBI." - Includes an S. cerevisiae example.

SeqFu - "A general-purpose program to manipulate and parse information from FASTA/FASTQ files, supporting gzipped input files. Includes functions to interleave and de-interleave FASTQ files, to rename sequences and to count and print statistics on sequence lengths. SeqFu is available for Linux and MacOS. - A compiled program delivering high performance analyses - Supports FASTA/FASTQ files, also Gzip compressed - A growing collection of handy utilities, also for quick inspection of the datasets." - Example uses Biopython to make a Pandas dataframe from FASTA sequences.
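
To give a feel for what tabulating FASTA sequences involves, here is a minimal sketch using only the Python standard library. It is not the linked example: the function name `fasta_to_rows` and the mock sequences are hypothetical, and the hand-rolled parser stands in for Biopython's `SeqIO.parse`, which is what you would more likely use in practice. The list of dicts it returns has the shape that `pandas.DataFrame(rows)` accepts.

```python
import io


def fasta_to_rows(handle):
    """Parse FASTA text from a file-like object into a list of dicts.

    Hypothetical helper for illustration; with Biopython installed,
    SeqIO.parse(handle, "fasta") would normally do the parsing.
    """
    rows = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]
            # The identifier is the first whitespace-delimited token.
            seq_id = header.split(None, 1)[0] if header else ""
            rows.append({"id": seq_id, "description": header, "sequence": ""})
        elif rows:
            # Sequence lines may wrap; append to the current record.
            rows[-1]["sequence"] += line
    for row in rows:
        row["length"] = len(row["sequence"])
    return rows


# Mock data, made up purely for this sketch.
demo = io.StringIO(">seq1 mock fragment\nATGGCC\nTTAA\n>seq2\nGGGCCC\n")
rows = fasta_to_rows(demo)
for row in rows:
    print(row["id"], row["length"])
```

If pandas is available, `pandas.DataFrame(rows)` turns the result into a table with one row per sequence and columns for id, description, sequence and length.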

See also

My simulated data repo has some useful scripts and resources for generating simulated (mock / fake) sequence data, gene expression data, or gene lists.
