biotite-dev / Biotite

Licence: bsd-3-clause
A comprehensive library for computational molecular biology

.. image:: https://img.shields.io/pypi/v/biotite.svg
   :target: https://pypi.python.org/pypi/biotite
   :alt: Biotite at PyPI
.. image:: https://img.shields.io/pypi/pyversions/biotite.svg
   :alt: Python version
.. image:: https://img.shields.io/travis/biotite-dev/biotite.svg
   :target: https://travis-ci.org/biotite-dev/biotite
   :alt: Travis CI status

.. image:: https://www.biotite-python.org/_static/assets/general/biotite_logo_m.png
   :alt: The Biotite Project

Biotite project

Biotite is your Swiss army knife for bioinformatics. Whether you want to identify homologous sequence regions in a protein family or you would like to find disulfide bonds in a protein structure: Biotite has the right tool for you. This package bundles popular tasks in computational molecular biology into a uniform Python library. It can handle a major part of the typical workflow for sequence and biomolecular structure data:

  • Searching and fetching data from biological databases
  • Reading and writing popular sequence/structure file formats
  • Analyzing and editing sequence/structure data
  • Visualizing sequence/structure data
  • Interfacing external applications for further analysis

Biotite internally stores most of the data as NumPy ndarray objects, enabling

  • fast C-accelerated analysis,
  • intuitive usability through NumPy-like indexing syntax,
  • extensibility through direct access of the internal NumPy arrays.
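
The idea behind this storage scheme can be illustrated with plain NumPy. The following is a minimal sketch, not Biotite's actual implementation: the four-letter ``ALPHABET`` and the helper functions ``encode`` and ``decode`` are hypothetical names chosen for this example. It shows how a sequence stored as an integer array can be sliced, masked and transformed with ordinary array operations.

```python
import numpy as np

# Hypothetical alphabet for this sketch; not Biotite's internal one
ALPHABET = "ACGT"


def encode(sequence: str) -> np.ndarray:
    """Map each symbol to its integer position in ALPHABET."""
    lookup = {symbol: code for code, symbol in enumerate(ALPHABET)}
    return np.array([lookup[s] for s in sequence], dtype=np.uint8)


def decode(codes: np.ndarray) -> str:
    """Map integer codes back to their symbols."""
    return "".join(ALPHABET[c] for c in codes)


codes = encode("ACGTTGCA")

# Slicing works as for any other ndarray
first_half = decode(codes[:4])

# A boolean mask marks all positions that hold G or C
gc_mask = np.isin(codes, encode("GC"))
gc_content = gc_mask.mean()

# Vectorized complement: with the symbol order A, C, G, T
# the complement of code c is 3 - c
reverse_complement = decode((3 - codes)[::-1])
```

Because every step operates on the whole array at once, no explicit Python loop over the sequence positions is needed for the analysis itself.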

As a result, users can skip writing code for basic functionality (like file parsers) and focus on what makes their code unique - from small analysis scripts to entire bioinformatics software packages.

If you use Biotite in a scientific publication, please cite:

| Kunzmann, P. & Hamacher, K. BMC Bioinformatics (2018) 19:346.
| `<https://doi.org/10.1186/s12859-018-2367-z>`_

Installation

Biotite requires the following packages:

  • numpy
  • requests
  • msgpack

Some functions require additional packages:

  • mdtraj - Required for trajectory file I/O operations.
  • matplotlib - Required for plotting purposes.

Biotite can be installed via Conda...

.. code-block:: console

   $ conda install -c conda-forge biotite

... or pip

.. code-block:: console

   $ pip install biotite
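
After installation it can be useful to check whether the optional packages listed above are present. The following is a small sketch using only the Python standard library; the helper ``missing_optional_packages`` is a hypothetical name for this example and is not part of Biotite.

```python
from importlib.util import find_spec

# Optional packages named in this README and what they enable
OPTIONAL_PACKAGES = {
    "mdtraj": "trajectory file I/O",
    "matplotlib": "plotting",
}


def missing_optional_packages(packages=OPTIONAL_PACKAGES):
    """Return the names of the given packages that cannot be imported."""
    return [name for name in packages if find_spec(name) is None]


for name in missing_optional_packages():
    feature = OPTIONAL_PACKAGES[name]
    print(f"{name} is not installed; {feature} will be unavailable")
```

The check only looks the packages up and does not import them, so it stays fast even when large packages are installed.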

Usage

Here is a small example that downloads two protein sequences from the NCBI Entrez database and aligns them:

.. code-block:: python

   import biotite.sequence.align as align
   import biotite.sequence.io.fasta as fasta
   import biotite.database.entrez as entrez

   # Download FASTA file for the sequences of avidin and streptavidin
   file_name = entrez.fetch_single_file(
       uids=["CAC34569", "ACL82594"], file_name="sequences.fasta",
       db_name="protein", ret_type="fasta"
   )

   # Parse the downloaded FASTA file
   # and create 'ProteinSequence' objects from it
   fasta_file = fasta.FastaFile.read(file_name)
   avidin_seq, streptavidin_seq = fasta.get_sequences(fasta_file).values()

   # Align sequences using the BLOSUM62 matrix with affine gap penalty
   matrix = align.SubstitutionMatrix.std_protein_matrix()
   alignments = align.align_optimal(
       avidin_seq, streptavidin_seq, matrix,
       gap_penalty=(-10, -1), terminal_penalty=False
   )
   print(alignments[0])

.. code-block::

   MVHATSPLLLLLLLSLALVAPGLSAR------KCSLTGKWDNDLGSNMTIGAVNSKGEFTGTYTTAV-TA
   -------------------DPSKESKAQAAVAEAGITGTWYNQLGSTFIVTA-NPDGSLTGTYESAVGNA

   TSNEIKESPLHGTQNTINKRTQPTFGFTVNWKFS----ESTTVFTGQCFIDRNGKEV-LKTMWLLRSSVN
   ESRYVLTGRYDSTPATDGSGT--ALGWTVAWKNNYRNAHSATTWSGQYV---GGAEARINTQWLLTSGTT

   DIGDDWKATRVGINIFTRLRTQKE---------------------
   -AANAWKSTLVGHDTFTKVKPSAASIDAAKKAGVNNGNPLDAVQQ
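
To make the ``gap_penalty=(-10, -1)`` and ``terminal_penalty=False`` arguments more tangible, the following pure-Python sketch scores an already aligned pair of gapped strings. It is an illustration under stated assumptions, not Biotite's code: ``alignment_score`` and ``toy_matrix`` are hypothetical helpers, the toy scoring stands in for BLOSUM62, and the sketch assumes the convention that the first position of a gap costs the opening penalty while every further position costs the extension penalty.

```python
def non_gap_bounds(row):
    """Return (first, last + 1) indices of the non-gap symbols in a row."""
    positions = [i for i, symbol in enumerate(row) if symbol != "-"]
    if not positions:
        raise ValueError("Row contains only gaps")
    return positions[0], positions[-1] + 1


def alignment_score(row1, row2, substitution, gap_open=-10, gap_extend=-1,
                    terminal_penalty=False):
    """Score two equally long gapped strings ('-' marks a gap)."""
    if len(row1) != len(row2):
        raise ValueError("Aligned rows must have equal length")
    start, stop = 0, len(row1)
    if not terminal_penalty:
        # Ignore columns before the first and after the last symbol
        # of either sequence, so terminal gaps are not penalized
        bounds = [non_gap_bounds(row) for row in (row1, row2)]
        start = max(first for first, _ in bounds)
        stop = min(last for _, last in bounds)
    score = 0
    in_gap = [False, False]  # Is a gap currently open in row 1 / row 2?
    for a, b in zip(row1[start:stop], row2[start:stop]):
        if a != "-" and b != "-":
            score += substitution(a, b)
        for i, symbol in enumerate((a, b)):
            if symbol == "-":
                # Assumed convention: opening penalty for the first gap
                # position, extension penalty for each further one
                score += gap_extend if in_gap[i] else gap_open
                in_gap[i] = True
            else:
                in_gap[i] = False
    return score


def toy_matrix(a, b):
    """Toy substitution scores standing in for BLOSUM62."""
    return 4 if a == b else -2


row1 = "TTACG--TA--"
row2 = "--ACGCCTAGG"
free_ends = alignment_score(row1, row2, toy_matrix)
penalized_ends = alignment_score(row1, row2, toy_matrix, terminal_penalty=True)
```

With terminal gaps ignored, only the internal two-position gap is penalized. Penalizing the ends as well lowers the score, which is why unpenalized terminal gaps suit sequences of different length, such as the two in the example above.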

More documentation, including a tutorial, an example gallery and the API reference, is available at `<https://www.biotite-python.org/>`_.

Contribution

Interested in improving Biotite? Have a look at the `contribution guidelines <https://www.biotite-python.org/contribute.html>`_. Feel free to join our community chat on `Discord <https://discord.gg/cUjDguF>`_.
