mircare / Porter5

Licence: other
Fast, state-of-the-art ab initio prediction of protein secondary structure in 3 and 8 classes


Porter 5

Light, fast and high quality prediction of protein secondary structure in 3 and 8 classes

The web server, train and test sets of Porter 5 are available at http://distilldeep.ucd.ie/porter/.
The docker container is available at https://hub.docker.com/r/mircare/porter5 (see "Use the docker image" below).

See https://github.com/mircare/Brewery to predict more protein structure annotations, and download COVID-19 predictions.

Figure: Pipeline of Brewery. Diagram of the pipeline we propose to gather and exploit deeper profiles.

Setup

$ git clone https://github.com/mircare/Porter5/ --depth 1 && rm -rf Porter5/.git

Requirements

  1. Python3 (https://www.python.org/downloads/);
  2. NumPy (https://www.scipy.org/scipylib/download.html);
  3. HHblits (https://github.com/soedinglab/hh-suite/);
  4. uniprot20 (http://wwwuser.gwdg.de/~compbiol/data/hhsuite/databases/hhsuite_dbs/old-releases/uniprot20_2016_02.tgz).

Optionally (for more accurate predictions):

  1. PSI-BLAST (ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/);
  2. UniRef90 (ftp://ftp.uniprot.org/pub/databases/uniprot/uniref/uniref90/uniref90.fasta.gz).
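Before running a prediction, it can help to confirm that the tools listed above are reachable. The following is a minimal sketch, not part of Porter 5: it assumes the HHblits and PSI-BLAST executables are installed under their usual names (`hhblits`, `psiblast`) and are on the PATH, and the function name `check_requirements` is hypothetical. It does not check the uniprot20 or UniRef90 databases, whose locations depend on your setup.

```python
import importlib.util
import shutil


def check_requirements(executables=("hhblits", "psiblast"), modules=("numpy",)):
    """Return a dict mapping each requirement name to True (found) or False."""
    status = {}
    for exe in executables:
        # shutil.which returns None when the executable is not on the PATH.
        status[exe] = shutil.which(exe) is not None
    for mod in modules:
        # find_spec returns None when a top-level module cannot be imported.
        status[mod] = importlib.util.find_spec(mod) is not None
    return status


if __name__ == "__main__":
    for name, found in check_requirements().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

PSI-BLAST is optional, so a missing `psiblast` only matters when running without `--fast`.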

How to run Porter 5

# For fast and accurate predictions (exploiting HHblits only)
$ python3 Porter5/Porter5.py -i Porter5/example/2FLGA.fasta --cpu 4 --fast

# For very accurate predictions (exploiting both HHblits and PSI-BLAST)
$ python3 Porter5/Porter5.py -i Porter5/example/2FLGA.fasta --cpu 4
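To call Porter 5 from another Python script, the two commands above can be wrapped with `subprocess`. This is a sketch under the assumption that the repository was cloned into `Porter5/` as in the Setup step; the helper names `build_porter5_command` and `run_porter5` are hypothetical, and only the flags shown above (`-i`, `--cpu`, `--fast`) are used.

```python
import subprocess


def build_porter5_command(fasta_path, cpus=4, fast=False, script="Porter5/Porter5.py"):
    """Build the argument list for one Porter 5 run.

    fast=True adds --fast (HHblits only); fast=False uses HHblits and PSI-BLAST.
    """
    cmd = ["python3", script, "-i", str(fasta_path), "--cpu", str(cpus)]
    if fast:
        cmd.append("--fast")
    return cmd


def run_porter5(fasta_path, cpus=4, fast=False):
    """Run Porter 5 and raise CalledProcessError if it exits with an error."""
    return subprocess.run(build_porter5_command(fasta_path, cpus, fast), check=True)


if __name__ == "__main__":
    # Only print the command here; call run_porter5(...) to execute it.
    print(" ".join(build_porter5_command("Porter5/example/2FLGA.fasta", fast=True)))
```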

How to run Porter 5 on multiple sequences

# To split a FASTA file with multiple sequences (Optional)
$ python3 Porter5/split_fasta.py many_sequences.fasta

# To predict all the fasta files in a given directory (Fastas)
$ python3 Porter5/multiple_fasta.py -i Fastas/ --cpu 4 --fast

# To run multiple predictions in parallel (using a total of 8 cores)
$ python3 Porter5/multiple_fasta.py -i Fastas/ --cpu 4 --parallel 2 --fast
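For readers who want to see what the splitting step involves, here is a minimal sketch of dividing a multi-sequence FASTA file into one file per sequence. It is an illustration only and is not the code of `split_fasta.py`; the function names and the file-naming rule (first token of the header, sanitised) are assumptions.

```python
from pathlib import Path


def read_fasta_records(path):
    """Return a list of (header, sequence) tuples from a FASTA file."""
    records = []
    header, chunks = None, []
    with open(path) as handle:
        for raw in handle:
            line = raw.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if header is not None:
                    records.append((header, "".join(chunks)))
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def split_fasta(multi_fasta_path, out_dir):
    """Write one FASTA file per record into out_dir; return the written paths.

    Records sharing the same identifier would overwrite each other.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for index, (header, sequence) in enumerate(read_fasta_records(multi_fasta_path), start=1):
        tokens = header.split()
        name = tokens[0] if tokens else f"seq{index}"
        # Replace characters that are awkward in file names.
        safe = "".join(c if c.isalnum() or c in "-_." else "_" for c in name)
        path = out_dir / f"{safe}.fasta"
        path.write_text(f">{header}\n{sequence}\n")
        written.append(path)
    return written
```

The resulting directory can then be passed to `multiple_fasta.py` with `-i`, as in the commands above.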

Use the docker image

# Set-up docker image
$ docker pull mircare/porter5

# Set the absolute PATHs for the databases and the query sequences (stored locally)
$ docker run --name porter5 -v /**PATH_to_uniprot20_2016_02**:/uniprot20 \
-v /**PATH_to_UniRef90_optional**:/uniref90 -v /**PATH_to_fasta_to_predict**:/Porter5/query \
--cap-add IPC_LOCK mircare/porter5 sleep infinity &

# How to run a prediction using 5 cores and HHblits only
$ docker exec porter5 python3 Porter5.py -i query/2FLGA.fasta --cpu 5 --fast

Performance in 3 states on a large independent test set

Method                      Q3 per AA   SOV'99 per AA   Q3 per protein   SOV'99 per protein
Porter 5                    83.81%      80.41%          84.32%           81.05%
SPIDER 3                    83.15%      79.43%          83.42%           79.79%
Porter 5 HHblits only       83.06%      79.49%          83.68%           80.26%
SSpro 5.1 with templates    82.58%      78.54%          83.94%           80.29%
PSIPRED 4.01                81.88%      77.36%          82.48%           78.22%
RaptorX-Property            81.86%      78.08%          82.57%           78.99%
Porter 4                    81.66%      78.05%          82.29%           78.61%
SSpro 5.1 ab initio         81.17%      76.87%          81.10%           76.92%
DeepCNF                     81.04%      76.74%          81.16%           76.99%

Calculated with http://dna.cs.miami.edu/SOV/.
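The "per AA" and "per protein" columns can differ for the same predictions. The sketch below illustrates the distinction under the assumption that "per AA" pools all residues of the test set, while "per protein" averages the accuracy of each sequence; it is not the evaluation code behind the table, and SOV'99 is not covered. The function names are hypothetical.

```python
def count_matches(predicted, observed):
    """Count positions where the predicted and observed states agree."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings must have equal length")
    return sum(p == o for p, o in zip(predicted, observed))


def accuracy_per_residue(pairs):
    """Percentage of correctly predicted residues, pooled over all proteins."""
    correct = sum(count_matches(p, o) for p, o in pairs)
    total = sum(len(o) for _, o in pairs)
    return 100.0 * correct / total


def accuracy_per_protein(pairs):
    """Mean of the per-protein percentages, so short proteins weigh as much as long ones."""
    scores = [100.0 * count_matches(p, o) / len(o) for p, o in pairs]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy (predicted, observed) pairs in 3 states: H helix, E strand, C coil.
    pairs = [("HHHH", "HHHH"), ("EECCCCCC", "CCCCCCCC")]
    print(f"per residue: {accuracy_per_residue(pairs):.2f}%")  # 10 of 12 residues correct
    print(f"per protein: {accuracy_per_protein(pairs):.2f}%")  # mean of 100% and 75%
```

The same functions apply unchanged to 8-state strings, since they only compare characters.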

Performance in 8 states on a large independent test set

Method                      Q8 per AA   SOV8'99 per AA   Q8 per protein   SOV8'99 per protein
Porter 5                    73.02%      69.91%           73.92%           70.76%
SSpro 5.1 with templates    71.91%      68.68%           74.46%           71.74%
Porter 5 HHblits only       71.80%      68.87%           72.83%           69.79%
RaptorX-Property            70.74%      67.59%           71.78%           68.36%
DeepCNF                     69.76%      66.42%           70.14%           66.44%
SSpro 5.1 ab initio         68.85%      65.33%           69.27%           65.97%

Calculated with http://dna.cs.miami.edu/SOV/.
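The 8-state and 3-state results are related through a reduction of the DSSP alphabet. The sketch below shows one widely used mapping (H, G, I to helix; E, B to strand; everything else to coil). Whether Porter 5 uses exactly this reduction is an assumption here, so consult the paper before relying on it; the names `EIGHT_TO_THREE` and `reduce_to_three_states` are hypothetical.

```python
# Assumed reduction from DSSP 8 states to 3 states (a common convention).
EIGHT_TO_THREE = {
    "H": "H", "G": "H", "I": "H",   # helices
    "E": "E", "B": "E",             # strands and bridges
    "T": "C", "S": "C",             # turns and bends
    "C": "C", "-": "C",             # coil, written as C or - depending on the tool
}


def reduce_to_three_states(ss8):
    """Map an 8-state secondary structure string to 3 states (H, E, C)."""
    try:
        return "".join(EIGHT_TO_THREE[state] for state in ss8)
    except KeyError as err:
        raise ValueError(f"unknown secondary structure state: {err.args[0]!r}") from None


if __name__ == "__main__":
    print(reduce_to_three_states("CCHHHHGGGTTEEEEBSC"))
```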

Citation

If you use Porter 5, please cite our Scientific Reports paper:

@article{torrisi_porter_2019,
	title = {Deeper Profiles and Cascaded Recurrent and Convolutional Neural Networks for state-of-the-art Protein Secondary Structure Prediction},
	volume = {9},
	issn = {2045-2322},
	doi = {10.1038/s41598-019-48786-x},
	journal = {Scientific Reports},
	author = {Torrisi, Mirko and Kaleel, Manaz and Pollastri, Gianluca},
	month = aug,
	year = {2019}
}

References

Deeper Profiles and Cascaded Recurrent and Convolutional Neural Networks for state-of-the-art Protein Secondary Structure Prediction, Scientific Reports, Nature Publishing Group;
Mirko Torrisi, Manaz Kaleel and Gianluca Pollastri; doi: https://doi.org/10.1038/s41598-019-48786-x.

Protein Structure Annotations; Essentials of Bioinformatics, Volume I. Springer Nature
Mirko Torrisi and Gianluca Pollastri; doi: https://doi.org/10.1007/978-3-030-02634-9_10.

Brewery: Deep Learning and deeper profiles for the prediction of 1D protein structure annotations,
Bioinformatics, Oxford University Press; Mirko Torrisi and Gianluca Pollastri;
Toll-free link: https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btaa204/5811232?guestAccessKey=9a73ae2a-2cb6-4fe1-b333-a4f3261f02cf.

Porter 5: fast, state-of-the-art ab initio prediction of protein secondary structure in 3 and 8 classes
Mirko Torrisi, Manaz Kaleel and Gianluca Pollastri; bioRxiv 289033; doi: https://doi.org/10.1101/289033.

License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Email us at gianluca[dot]pollastri[at]ucd[dot]ie if you wish to use it for purposes not permitted by the CC BY-NC-SA 4.0.
