HMMER2GO

Annotate DNA sequences for Gene Ontology terms

What is HMMER2GO?

HMMER2GO is a command line application that maps DNA sequences, typically transcripts, to Gene Ontology (GO) terms based on the similarity of the query sequences to curated HMMs for protein families represented in Pfam.

These GO term mappings allow you to make inferences about the function of the gene products, or about changes in function in the case of expression studies. The GAF mapping file that is produced can be used with Ontologizer or other tools to visualize a graph of the term relationships along with their significance values.

INSTALLATION

It is recommended to use Docker, as shown below:

docker run -it --name hmmer2go-con -v $(pwd)/db:/db:Z sestaton/hmmer2go

This creates a container called "hmmer2go-con" and starts an interactive shell. The command assumes that your working directory contains a directory called db, which holds your database files (a formatted Pfam HMM file) and your input sequences. To run the full analysis, change to the mounted directory with cd db in your container and run the commands shown below.
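Before starting the container, it can help to confirm that the db directory actually holds the files the analysis needs. The following is a minimal sketch, not part of HMMER2GO: the helper name check_db_dir is hypothetical, and the file names Pfam-A.hmm and genes.fasta are assumed from the usage examples in this README.

```shell
#!/bin/sh
# Sketch: verify the host directory before mounting it at /db.
# Assumptions: `check_db_dir` is a hypothetical helper, and the file names
# Pfam-A.hmm and genes.fasta are borrowed from the usage examples.

check_db_dir() {
    dir="$1"
    rc=0
    for f in Pfam-A.hmm genes.fasta; do
        # -s is true only for files that exist and are not empty.
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $dir/$f" >&2
            rc=1
        fi
    done
    return "$rc"
}

if check_db_dir "$(pwd)/db"; then
    echo "db looks ready; start the container with:"
    echo "docker run -it --name hmmer2go-con -v \$(pwd)/db:/db:Z sestaton/hmmer2go"
fi
```

The script only prints the docker command rather than running it, so it is safe to run on a machine without Docker.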

Alternatively, you can follow the steps in the INSTALL file to install HMMER2GO on any Mac or Linux system. It will likely work on Windows as well, though I have not tested this yet (advice is welcome).

Please see the wiki Demonstration page for a full working example and a demo script that will download and run HMMER2GO. That page also contains a brief description of how to begin analyzing the results.

BRIEF USAGE

Starting with a file of DNA sequences, we first want to get the longest open reading frame (ORF) for each gene and translate those sequences.

hmmer2go getorf -i genes.fasta -o genes_orfs.faa

Next, we search our ORFs for protein domains.

hmmer2go run -i genes_orfs.faa -d Pfam-A.hmm -o genes_orfs_Pfam-A.tblout

Now we can map the protein domain matches to GO terms.

hmmer2go mapterms -i genes_orfs_Pfam-A.tblout -o genes_orfs_Pfam-A_GO.tsv --map

If we want to perform a statistical analysis on the GO mappings, it may be necessary to create a GAF file.

hmmer2go map2gaf -i genes_orfs_Pfam-A_GO_GOterm_mapping.tsv -o genes_orfs_Pfam-A_GO_GOterm_mapping.gaf -s 'Helianthus annuus'

For a full explanation of these commands, see the HMMER2GO wiki. In particular, see the tutorial page for a walk-through of all the commands. There is also an example script on the demonstration page to fetch data for Arabidopsis thaliana and run the full analysis.
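The four steps above can also be chained in a single script. This is a minimal sketch under stated assumptions rather than an official wrapper: the prefix variable, the step helper, and the fallback that only prints the commands when hmmer2go is not installed are all illustrative. The file names follow the examples above, including the _GOterm_mapping.tsv name that the map2gaf example takes as input.

```shell
#!/bin/sh
# Sketch: run the four HMMER2GO steps from this README in order.
# Assumptions: `prefix`, `step`, and `run_pipeline` are hypothetical names,
# and the print-only fallback is not a feature of HMMER2GO itself.
set -e

prefix="genes"                 # input is expected at ${prefix}.fasta
pfam_db="Pfam-A.hmm"           # formatted Pfam HMM database
species="Helianthus annuus"    # species name passed to map2gaf

orfs="${prefix}_orfs.faa"
tblout="${prefix}_orfs_Pfam-A.tblout"
go_tsv="${prefix}_orfs_Pfam-A_GO.tsv"
# File names taken from the map2gaf example in the text.
mapping_tsv="${prefix}_orfs_Pfam-A_GO_GOterm_mapping.tsv"
mapping_gaf="${prefix}_orfs_Pfam-A_GO_GOterm_mapping.gaf"

# Print each command; execute it only when hmmer2go is on the PATH.
step() {
    printf '+'
    printf ' %s' "$@"
    printf '\n'
    if command -v hmmer2go >/dev/null 2>&1; then
        "$@"
    fi
}

run_pipeline() {
    step hmmer2go getorf -i "${prefix}.fasta" -o "$orfs"
    step hmmer2go run -i "$orfs" -d "$pfam_db" -o "$tblout"
    step hmmer2go mapterms -i "$tblout" -o "$go_tsv" --map
    step hmmer2go map2gaf -i "$mapping_tsv" -o "$mapping_gaf" -s "$species"
}

run_pipeline
```

Because of set -e, the script stops at the first step that fails, so a later step never runs on missing input.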

DOCUMENTATION

Each subcommand can be executed with no arguments to generate a help menu. Alternatively, you may request the help message explicitly. For example,

hmmer2go help run

More information about each command is available by accessing the full documentation at the command line. For example,

hmmer2go run --man

Also, the HMMER2GO wiki is a source of online documentation.
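To browse the help for several subcommands at once, the help command can be wrapped in a loop. This is a small sketch: show_help is a hypothetical helper, and the list contains only the four subcommands used in this README, so it may not be complete.

```shell
#!/bin/sh
# Sketch: show the help message for each subcommand named in this README.
# Assumptions: `show_help` is a hypothetical helper, and the subcommand
# list is limited to the four commands used in the usage section.

subcommands="getorf run mapterms map2gaf"

show_help() {
    for sub in $subcommands; do
        echo "== hmmer2go help $sub =="
        # Run the real command only when hmmer2go is installed.
        if command -v hmmer2go >/dev/null 2>&1; then
            hmmer2go help "$sub"
        fi
    done
}

show_help
```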

ISSUES

Report any issues at the HMMER2GO issue tracker: https://github.com/sestaton/HMMER2GO/issues

LICENSE AND COPYRIGHT

Copyright (C) 2014-2022 S. Evan Staton

This program is distributed under the MIT (X11) License, which should be distributed with the package. If not, it can be found here: http://www.opensource.org/licenses/mit-license.php
