Licence: MIT license
Paratope Prediction using Deep Learning

Parapred: antibody binding residue prediction

The original implementation of the methodology described in "Parapred: antibody paratope prediction using convolutional and recurrent neural networks" by Liberis et al.

Install

Requirements:

  • Python 3.6+ (or Python 2.7 for just running the predictor)

To install:

  • Run python setup.py install in the root directory. If you are using a Python installation manager, such as Anaconda or Canopy, follow its package installation instructions.
  • If you would rather not install Parapred and instead run it directly from a clone of this repository, install the required packages using pip install -r requirements.txt.

Usage

  • If installed, Parapred should be available as a parapred executable on the command line (run parapred --help to check).
  • If you choose to run Parapred directly from a clone, make sure you have installed the required packages from requirements.txt, then run python -m parapred --help or ./parapred-runner.py --help in the root of this repository.
➜  ~ parapred --help
Parapred - neural network-based paratope predictor.

Parapred works on parts of antibody's amino acid sequence that correspond to the
CDR and two extra residues on the either side of it. The program will output
binding probability for every residue in the input. The program accepts two
kinds of input (see usage section below for examples):

(a) The full sequence of a VH or VL domain, or a larger stretch of the sequence
    of either the heavy or light chain comprising the CDR loops. (requires the
    additional module anarci for Chothia numbering of sequences, available at
    http://opig.stats.ox.ac.uk/webapps/sabdab-sabpred/ANARCI.php).
    Command example: `parapred seq DIEMTQSPSSLSASVGDRVTITCR...`

(b) A fasta file with various antibody sequences, either light or heavy chains
    (requires anarci http://opig.stats.ox.ac.uk/webapps/sabdab-sabpred/ANARCI.php)

(c) A Chothia-numbered PDB file together with antibody's heavy and light chain
    IDs (provided using `--abh` and `--abl` options). The program will overwrite
    the B-factor value in the PDB to a residue's binding probability.

    Multiple PDB files can be processed by specifying a description file (using
    the `--pdb-list` option), which is a CSV file containing columns `pdb`,
    `Hchain` and `Lchain` meaning PDB file name (.pdb extention will be appended
    if missing), heavy chain ID and light chain ID respectively. For example:

    pdb,Hchain,Lchain
    2uzi,H,L
    4leo,A,B

    Extra columns in the CSV file are allowed and will be ignored. A folder
    containing PDB files can be specified using the `--pdb-folder` option
    (defaults to the current directory).

(d) An amino acid sequence corresponding to a CDR with 2 extra residues on
    either side, e.g. `parapred cdr ARSGYYGDSDWYFDVGG`.

    Multiple CDR sequences can be processed at once by specifying a file,
    containing each sequence on a separate line (using the `--cdr-list` option).

Usage:
  parapred seq <sequence>
  parapred fasta <fasta_file>
  parapred pdb <pdb_file> [--abh <ab_h_chain_id>] [--abl <ab_l_chain_id>]
  parapred pdb --pdb-list <pdb_descr_file> [--pdb-folder=<path>]
  parapred cdr <cdr_seq>
  parapred cdr --cdr-list <cdr_file>
  parapred (-h | --help)

Options:
  -h --help                    Show this help.
  seq                          Takes the full amino acid sequence of a VH or VL
                               domain (requires anarci).
  fasta                        Takes a fasta-formatted file of amino acid
                               sequences corresponding or containing VH and VL
                               domains (requires anarci).
  pdb                          PDB-annotating mode. Replaces B-factor entries
                               with binding probabilities (in percentages). PDBs
                               must be Chothia-numbered.
  --abh <ab_h_chain_id>        Antibody's heavy chain ID [default: H].
  --abl <ab_l_chain_id>        Antibody's light chain ID [default: L].
  --pdb-list <pdb_descr_file>  List containing PDB file names and chain IDs in CSV format.
  --pdb-folder <path>          Path to a folder with PDB files [default: .].
  cdr <cdr_seq>                Given an individual CDR sequence with 2 extra residues
                               on either side, outputs binding probabilities for each residue.
  --cdr-list <cdr_file>        List containing CDR amino acid sequences, one per line.
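
The description file for the `--pdb-list` option is a plain CSV with `pdb`, `Hchain` and `Lchain` columns, so it can be generated from a script. The following is a minimal sketch, not part of Parapred itself: the helper name `write_pdb_list` is hypothetical, and the two entries are the ones from the help text above.

```python
import csv
import io


def write_pdb_list(entries, handle):
    """Write (pdb name, heavy chain ID, light chain ID) rows as CSV.

    Hypothetical helper: it only produces the column layout that the
    help text describes for --pdb-list.
    """
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["pdb", "Hchain", "Lchain"])
    for pdb_name, h_chain, l_chain in entries:
        writer.writerow([pdb_name, h_chain, l_chain])


# The two entries shown in the help text.
entries = [("2uzi", "H", "L"), ("4leo", "A", "B")]

# Written to an in-memory buffer here; pass an open file object instead
# to create the description file on disk.
buffer = io.StringIO()
write_pdb_list(entries, buffer)
csv_text = buffer.getvalue()
print(csv_text, end="")
```

Saving the text to a file (say `pdbs.csv`, an arbitrary name) should then allow a run of the form `parapred pdb --pdb-list pdbs.csv --pdb-folder <path>`, as listed in the usage section.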

Example output

➜  ~ parapred cdr ASGYTFTSYWI
... omitted TensorFlow messages ... 
# ParaPred annotation of ASGYTFTSYWI
A 0.005611494
S 0.022217814
G 0.13472338
Y 0.3498
T 0.4621269
F 0.077797584
T 0.7191864
S 0.9059194
Y 0.8069638
W 0.9702157
I 0.014193774
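
The per-residue output above is simple to post-process. The sketch below assumes the layout shown in the example (a `#` header line followed by one `residue probability` pair per line). The function name `parse_parapred_output` and the 0.5 cut-off are illustrative choices, not part of Parapred.

```python
def parse_parapred_output(text):
    """Return a list of (residue, probability) pairs from `parapred cdr` output.

    Assumes the layout shown in the example output: a '#' header line
    followed by lines of the form '<one-letter residue> <probability>'.
    """
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "# ParaPred annotation of ..." header.
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        # Skip anything that is not "<single letter> <value>", e.g. log lines.
        if len(parts) != 2 or len(parts[0]) != 1:
            continue
        try:
            pairs.append((parts[0], float(parts[1])))
        except ValueError:
            continue
    return pairs


sample = """\
... omitted TensorFlow messages ...
# ParaPred annotation of ASGYTFTSYWI
A 0.005611494
S 0.022217814
G 0.13472338
Y 0.3498
T 0.4621269
F 0.077797584
T 0.7191864
S 0.9059194
Y 0.8069638
W 0.9702157
I 0.014193774
"""

pairs = parse_parapred_output(sample)

# Arbitrary illustrative cut-off; choose a threshold that suits your analysis.
THRESHOLD = 0.5
likely_binders = [
    (index, residue, prob)
    for index, (residue, prob) in enumerate(pairs)
    if prob >= THRESHOLD
]

for index, residue, prob in likely_binders:
    print(f"position {index}: {residue} ({prob:.3f})")
```

Positions are zero-based offsets into the submitted CDR string, which includes the two extra residues on either side.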