rasbt / Biopandas

Licence: bsd-3-clause
Working with molecular structures in pandas DataFrames

Projects that are alternatives of or similar to Biopandas

Rmsd
Calculate Root-mean-square deviation (RMSD) of two molecules, using rotation, in xyz or pdb format
Stars: ✭ 215 (-34.65%)
Mutual labels:  molecule, pdb
Biosyntax
Syntax highlighting for computational biology
Stars: ✭ 164 (-50.15%)
Mutual labels:  bioinformatics, pdb
10 Simple Hacks To Speed Up Your Data Analysis In Python
Some useful Tips and Tricks to speed up the data analysis process in Python.
Stars: ✭ 45 (-86.32%)
Mutual labels:  pandas-dataframe, pdb
Dgl Lifesci
Python package for graph neural networks in chemistry and biology
Stars: ✭ 194 (-41.03%)
Mutual labels:  bioinformatics, molecule
Biojava
📖🔬☕️ BioJava is an open-source project dedicated to providing a Java library for processing biological data.
Stars: ✭ 434 (+31.91%)
Mutual labels:  bioinformatics, pdb
VSCoding-Sequence
VSCode Extension for interactively visualising protein structure data in the editor
Stars: ✭ 41 (-87.54%)
Mutual labels:  pdb, molecule
Data Science Hacks
Data Science Hacks consists of tips, tricks to help you become a better data scientist. Data science hacks are for all - beginner to advanced. Data science hacks consist of python, jupyter notebook, pandas hacks and so on.
Stars: ✭ 273 (-17.02%)
Mutual labels:  pandas-dataframe
Edlib
Lightweight, super fast C/C++ (& Python) library for sequence alignment using edit (Levenshtein) distance.
Stars: ✭ 298 (-9.42%)
Mutual labels:  bioinformatics
Manta
Structural variant and indel caller for mapped sequencing data
Stars: ✭ 271 (-17.63%)
Mutual labels:  bioinformatics
Seq
A high-performance, Pythonic language for bioinformatics
Stars: ✭ 263 (-20.06%)
Mutual labels:  bioinformatics
Mdtraj
An open library for the analysis of molecular dynamics trajectories
Stars: ✭ 317 (-3.65%)
Mutual labels:  pdb
Dash Cytoscape
Interactive network visualization in Python and Dash, powered by Cytoscape.js
Stars: ✭ 309 (-6.08%)
Mutual labels:  bioinformatics
Tdc
Therapeutics Data Commons: Machine Learning Datasets and Tasks for Therapeutics
Stars: ✭ 291 (-11.55%)
Mutual labels:  bioinformatics
Arvados
An open source platform for managing and analyzing biomedical big data
Stars: ✭ 274 (-16.72%)
Mutual labels:  bioinformatics
Gwa tutorial
A comprehensive tutorial about GWAS and PRS
Stars: ✭ 303 (-7.9%)
Mutual labels:  bioinformatics
Anvio
An analysis and visualization platform for 'omics data
Stars: ✭ 273 (-17.02%)
Mutual labels:  bioinformatics
Jvarkit
Java utilities for Bioinformatics
Stars: ✭ 313 (-4.86%)
Mutual labels:  bioinformatics
Cobrapy
COBRApy is a package for constraint-based modeling of metabolic networks.
Stars: ✭ 267 (-18.84%)
Mutual labels:  bioinformatics
Python
This repository helps you understand python from the scratch.
Stars: ✭ 285 (-13.37%)
Mutual labels:  pandas-dataframe
Bioinformatics One Liners
Bioinformatics one liners from Ming Tang
Stars: ✭ 309 (-6.08%)
Mutual labels:  bioinformatics

Working with molecular structures in pandas DataFrames



If you are a computational biologist, chances are that you have cursed protein structure files one too many times. Yes, I am talking about ye Goode Olde Protein Data Bank format, aka "PDB files." Nothing against PDB: it's a neatly structured format (if deployed correctly). Yet it is a bit cumbersome to work with PDB files in "modern" programming languages. I am pretty sure we all agree on this.

As a machine learning and "data science" person, I fell in love with pandas DataFrames for handling just about everything that can be loaded into memory.
So, why don't we take pandas to the structural biology world? Working with molecular structures of biological macromolecules (from PDB and MOL2 files) in pandas DataFrames is what BioPandas is all about!
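
To make the idea concrete, here is a minimal sketch of how fixed-width PDB coordinate records can end up as rows of a DataFrame. It is not BioPandas's actual parser: the `parse_records` helper and the three sample lines are made up for illustration, the column slices follow the published ATOM/HETATM record layout, and the column names are only chosen to resemble the ones BioPandas exposes (an assumption).

```python
import pandas as pd

# Three made-up coordinate records in fixed-width PDB layout.
PDB_TEXT = """\
ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N
ATOM      2  CA  MET A   1      26.266  25.413   2.842  1.00 10.38           C
HETATM 1332  O   HOH A 201      10.123  -5.456   7.890  1.00 20.00           O
"""

# (column name, start, stop, type) as 0-based, end-exclusive slices.
# Column names are assumptions modelled on BioPandas's naming.
COLUMNS = [
    ("record_name", 0, 6, str),
    ("atom_number", 6, 11, int),
    ("atom_name", 12, 16, str),
    ("residue_name", 17, 20, str),
    ("chain_id", 21, 22, str),
    ("residue_number", 22, 26, int),
    ("x_coord", 30, 38, float),
    ("y_coord", 38, 46, float),
    ("z_coord", 46, 54, float),
    ("b_factor", 60, 66, float),
    ("element_symbol", 76, 78, str),
]


def parse_records(text):
    """Hypothetical helper: slice ATOM/HETATM lines into typed columns."""
    rows = []
    for line in text.splitlines():
        if not line.startswith(("ATOM", "HETATM")):
            continue  # skip every other record type
        rows.append(
            {name: cast(line[start:stop].strip())
             for name, start, stop, cast in COLUMNS}
        )
    return pd.DataFrame(rows)


df = parse_records(PDB_TEXT)
print(df)
```

Once the records are in a DataFrame, every column is typed, so the usual pandas selection, grouping, and arithmetic apply directly.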


Examples

# Initialize a new PandasPdb object
# and fetch the PDB file from rcsb.org
>>> from biopandas.pdb import PandasPdb
>>> ppdb = PandasPdb().fetch_pdb('3eiy')
>>> ppdb.df['ATOM'].head()

[Image: first rows of the 3eiy ATOM DataFrame]
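
Because the result is an ordinary DataFrame, typical follow-up steps are plain pandas. The sketch below uses a small hand-built frame as a stand-in for `ppdb.df['ATOM']` so that it runs without network access. The values are made up, and the column names (`atom_name`, `residue_number`, `b_factor`) are assumed to match what BioPandas produces.

```python
import pandas as pd

# Stand-in for ppdb.df['ATOM']; values are invented for illustration.
atom_df = pd.DataFrame({
    "atom_name": ["N", "CA", "C", "N", "CA", "C"],
    "residue_name": ["MET", "MET", "MET", "SER", "SER", "SER"],
    "residue_number": [1, 1, 1, 2, 2, 2],
    "b_factor": [10.0, 12.0, 14.0, 20.0, 22.0, 24.0],
})

# Boolean indexing: keep only the C-alpha atoms.
c_alpha = atom_df[atom_df["atom_name"] == "CA"]

# Group-wise aggregation: mean B-factor per residue.
mean_b_factor = atom_df.groupby("residue_number")["b_factor"].mean()

print(c_alpha)
print(mean_b_factor)
```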

# Load structures from your drive and compute the
# Root Mean Square Deviation
>>> from biopandas.pdb import PandasPdb
>>> pl1 = PandasPdb().read_pdb('./docking_pose_1.pdb')
>>> pl2 = PandasPdb().read_pdb('./docking_pose_2.pdb')
>>> r = PandasPdb.rmsd(pl1.df['HETATM'], pl2.df['HETATM'],
                       s='hydrogen', invert=True)
>>> print('RMSD: %.4f Angstrom' % r)

RMSD: 2.6444 Angstrom
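
For readers who want to see what such a number means, here is a rough sketch of a heavy-atom RMSD computed by hand with NumPy. It assumes both frames list the same atoms in the same order and applies no superposition. The `heavy_atom_rmsd` helper and the toy coordinates are hypothetical, and whether this matches `PandasPdb.rmsd` in every detail is an assumption, not something stated above.

```python
import numpy as np
import pandas as pd

COORDS = ["x_coord", "y_coord", "z_coord"]


def heavy_atom_rmsd(df1, df2):
    """Hypothetical helper: RMSD over non-hydrogen atoms, no alignment."""
    heavy1 = df1[df1["element_symbol"] != "H"]
    heavy2 = df2[df2["element_symbol"] != "H"]
    if len(heavy1) != len(heavy2):
        raise ValueError("both structures need the same number of heavy atoms")
    diff = heavy1[COORDS].to_numpy() - heavy2[COORDS].to_numpy()
    # Mean of squared per-atom distances, then the square root.
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))


# Toy poses: both heavy atoms move by (3, 4, 0); the hydrogen moves far away.
pose_1 = pd.DataFrame({
    "element_symbol": ["C", "O", "H"],
    "x_coord": [0.0, 1.0, 2.0],
    "y_coord": [0.0, 0.0, 0.0],
    "z_coord": [0.0, 0.0, 0.0],
})
pose_2 = pose_1.assign(x_coord=[3.0, 4.0, 52.0], y_coord=[4.0, 4.0, 0.0])

r = heavy_atom_rmsd(pose_1, pose_2)
print('RMSD: %.4f Angstrom' % r)  # prints "RMSD: 5.0000 Angstrom"
```

The hydrogen's large displacement does not affect the result, which is the point of excluding hydrogens as in the `s='hydrogen', invert=True` call above.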





Quick Install

  • install the latest version (from GitHub): pip install git+https://github.com/rasbt/biopandas.git#egg=biopandas
  • install the latest PyPI version: pip install biopandas
  • install biopandas via conda-forge: conda install biopandas -c conda-forge
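
After any of these routes, a quick way to confirm which version ended up in the active environment is to query the package metadata. This is a small sketch that uses only the standard library. The `installed_version` helper is a made-up name, and it returns `None` instead of failing when the package is missing.

```python
from importlib import metadata


def installed_version(package):
    """Hypothetical helper: return the installed version string, or None."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("biopandas")
print("biopandas", version if version else "is not installed")
```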

Requirements

For more information, please see http://rasbt.github.io/biopandas/installation/.





Cite as

If you use BioPandas as part of your workflow in a scientific publication, please consider citing the BioPandas repository with the following DOI:

  • Sebastian Raschka. Biopandas: Working with molecular structures in pandas dataframes. The Journal of Open Source Software, 2(14), jun 2017. doi: 10.21105/joss.00279. URL http://dx.doi.org/10.21105/joss.00279.
@article{raschkas2017biopandas,
  doi = {10.21105/joss.00279},
  url = {http://dx.doi.org/10.21105/joss.00279},
  year  = {2017},
  month = {jun},
  publisher = {The Open Journal},
  volume = {2},
  number = {14},
  author = {Sebastian Raschka},
  title = {BioPandas: Working with molecular structures in pandas DataFrames},
  journal = {The Journal of Open Source Software}
}