bio-ontology-research-group / Multi Drug Embedding

Method for drug repurposing from knowledge graphs and literature

Drug repurposing through joint learning on knowledge graphs and literature

Here, we developed a novel method that combines information in literature and structured databases, and applies feature learning to generate vector space embeddings. We apply our method to the identification of drug targets and indications for known drugs based on heterogeneous information about drugs, target proteins, and diseases. We demonstrate that our method is able to combine complementary information from both structured databases and from literature.

Below are the steps of the drug repurposing pipeline.

Requirements

Running

  1. Build the graph as described in link

  2. The output graph is in the data folder in this repository

  3. Before generating the corpus, remove the has-target edges for drug-target interaction prediction, and the has-indication edges for drug indication prediction:

     python remove_relation_links.py

  4. Generate the knowledge graph corpus from the edge list after removing the edges:

     ./deepwalk ../data/edgelist_WalkingRDFOWL_has_indication_free.txt ../data/corpus_WalkingRDFOWL_has_indication_free.txt

  5. Run Word2Vec on the generated corpus:

     python word2vec_gensim.py

  6. Normalize the knowledge graph entities with the PubMed abstracts corpus:

     python normalize_text.py

  7. Run Word2Vec on the normalized PubMed abstracts corpus from the previous step to create independent PubMed abstract embeddings.

  8. Combine the normalized PubMed abstracts corpus with the knowledge graph corpus, for example:

     cat ../data/corpus_WalkingRDFOWL_has_indication_free.txt ../data/medline_abstracts_mapped_drugsrepo.txt > ../data/combined_corpus.txt

  9. Run Word2Vec on the combined corpus.

  10. Run Ind_ann_graph_common.py and the other scripts to train the artificial neural networks with the different embeddings from the knowledge graph and PubMed abstracts available in the data folder.
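The edge removal in step 3 presumably keeps the embeddings from already encoding the links that are later predicted. The following is a minimal sketch of that idea, not the contents of remove_relation_links.py. It assumes each edge list line holds a source, a target and a relation label separated by whitespace; the function name `drop_relation`, the node names and the relation labels are hypothetical.

```python
from typing import Iterable, List


def drop_relation(lines: Iterable[str], relation: str) -> List[str]:
    """Return the edges whose relation label differs from `relation`."""
    kept = []
    for line in lines:
        fields = line.split()
        # Assumed layout: source, target, relation label.
        if len(fields) >= 3 and fields[2] == relation:
            continue  # hide this edge from the embedding model
        kept.append(line)
    return kept


# Toy edge list with hypothetical node names and relation labels.
edges = [
    "drug_a disease_x has_indication",
    "drug_a gene_y has_target",
    "gene_y disease_x associated_with",
]
filtered = drop_relation(edges, "has_indication")
print(filtered)
```

The same function called with the target relation label would produce the edge list for drug target prediction.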

Data

Knowledge graph and literature

The PubMed abstracts used in this project were downloaded from [PubTator](ftp://ftp.ncbi.nlm.nih.gov/pub/lu/PubTator/). The normalization script can be used to normalize the knowledge graph and literature. The normalized corpus used in this study is available upon request. The knowledge graph edge list is edgelist_WalkingRDFOWL.txt, and the mapping to knowledge graph nodes is mapping_WalkingRDFOWL.txt.
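Normalization means that an entity mentioned in an abstract and the same entity in the knowledge graph end up as one shared token, so that Word2Vec can learn from both sources. The sketch below illustrates this with plain single-word lookup. It is an assumption about the general idea, not the logic of normalize_text.py, which may instead rely on the entity annotations that PubTator provides. The function name and the toy mapping are hypothetical.

```python
import re
from typing import Dict


def normalize_abstract(text: str, mention_to_id: Dict[str, str]) -> str:
    """Replace known single-word mentions with knowledge graph identifiers."""
    # Lowercase, then split into word tokens and standalone punctuation.
    tokens = re.findall(r"\w+|[^\w\s]", text.lower())
    return " ".join(mention_to_id.get(tok, tok) for tok in tokens)


# Toy mapping for illustration only; the real mappings live in the
# dictionary files described in this README.
mapping = {"metformin": "CID00004091", "diabetes": "DOID:9351"}
normalized = normalize_abstract("Metformin is used to treat diabetes.", mapping)
print(normalized)
```

Multi-word mentions such as "diabetes mellitus" would need phrase matching or annotation offsets, which this sketch leaves out.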

Embeddings

- embeddings_WalkingRDFOWL_has_indication_free.txt: knowledge graph embeddings for predicting drug indications.
- embeddings_WalkingRDFOWL_has_targets_free.txt: knowledge graph embeddings for predicting drug targets.
- drugs_text_embeddings.txt, diseases_text_embeddings.txt and genes_text_embeddings.txt: Medline abstracts embeddings.
- drugs_embeddings_combined_has_indication.txt, diseases_embeddings_combined_has_indication.txt and genes_embeddings_combined_has_indication.txt: embeddings trained jointly on the knowledge graph and Medline abstracts.
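One of the settings compared in the sample results is "Concatenated embeddings", where an entity's knowledge graph vector and its text vector are joined into a single feature vector. The sketch below shows one way to do that. It assumes each embedding file line is a token followed by its vector components, with no header line; the function names and toy vectors are hypothetical, and the actual file layout should be checked before use.

```python
from typing import Dict, Iterable, List

Vectors = Dict[str, List[float]]


def parse_embeddings(lines: Iterable[str]) -> Vectors:
    """Parse lines of the assumed form 'token v1 v2 ...' into a dictionary."""
    vectors: Vectors = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def concatenate(graph: Vectors, text: Vectors) -> Vectors:
    """Join graph and text vectors for entities present in both sources."""
    return {key: graph[key] + text[key] for key in graph if key in text}


# Toy vectors with hypothetical entity names.
graph_lines = ["drug_1 0.1 0.2", "disease_1 0.3 0.4"]
text_lines = ["drug_1 0.5 0.6 0.7"]
combined = concatenate(parse_embeddings(graph_lines), parse_embeddings(text_lines))
print(combined)
```

Entities missing from either source are dropped here; another reasonable choice would be to pad the missing half with zeros.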

Evaluations and Mapping

All generated embeddings and the mapping data used to normalize literature information to the knowledge graph are available as Python dictionaries in the data folder. The evaluation sets for drug indications (drugs2ind_doid.dict) and drug targets (drugs2tars_stitch.dict) are available as well. The drug indications are from the SIDER database, and the drug targets are from the STITCH database. Chemical aliases from STITCH were used to convert drug mentions in text to STITCH IDs; this mapping is available in chemical_map.dict.

The Disease Ontology was used to extract the MeSH-to-DOID mapping in mesh2doid.dict and the OMIM-to-DOID mapping in omim2doid.dict.

Predictions

We make drug indication predictions for approved drugs from SIDER available in predicted_indications_approved_processed.tsv in the data folder. Its columns give the drug ID and drug name, the indication's Disease Ontology ID and name, and the prediction score. The full lists of tested drugs and the predicted ranks for indications and targets are included as indications_ranked_graph.txt, indications_ranked_concat_embeddings.txt, indications_ranked_concat_corpus.txt, etc. In these files, the first field is the drug's PubChem ID, followed by the diseases and their ranks.
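A rank such as those reported in the ranked files can be derived from prediction scores by sorting all candidate diseases for a drug and locating the known indication. The sketch below assumes that a higher score means a more likely indication and that ranks start at 1. The function name, the disease names and the scores are hypothetical and are not results from this project.

```python
from typing import Dict


def rank_of(target: str, scores: Dict[str, float]) -> int:
    """Return the 1-based rank of `target` under descending score order."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return ordered.index(target) + 1


# Made-up scores for one drug over three candidate diseases.
scores = {"disease_a": 0.91, "disease_b": 0.40, "disease_c": 0.75}
best = rank_of("disease_a", scores)
worst = rank_of("disease_b", scores)
print(best, worst)
```

With tied scores the rank depends on insertion order, so a real evaluation would need an explicit tie-breaking rule.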

For the complete data including the mapping files, embeddings and normalized PubMed corpus, please download from here

Sample results

The tables below illustrate a few examples of the method's ability to combine complementary information from the knowledge graph and the literature, which results in improved prediction ranks for drug indications and targets.

| Drug | Indication | Knowledge graph | PubMed abstracts | Concatenated embeddings | Concatenated corpora |
|---|---|---|---|---|---|
| CID00002678 (Cetirizine) | allergic hypersensitivity disease (DOID:1205) | ranked 34 | ranked 4 | ranked 1 | ranked 10 |
| CID05464096 (Ramiprilat) | cerebrovascular disease (DOID:6713) | ranked 76 | ranked 1 | ranked 1 | ranked 3 |
| CID00002786 (Clindamycin) | impetigo (DOID:8504) | ranked 16 | ranked 11 | ranked 1 | ranked 1 |
| CID00002658 (Cefuroxime) | pneumonia (DOID:552) | ranked 46 | ranked 7 | ranked 3 | ranked 1 |
| CID00004091 (Metformin) | diabetes mellitus (DOID:9351) | ranked 3 | ranked 6 | ranked 1 | ranked 3 |
| CID00003310 (Etoposide) | leukemia (DOID:1240) | ranked 177 | ranked 3 | ranked 11 | ranked 1 |

| Drug | Target (gene Entrez) | Knowledge graph | PubMed abstracts | Concatenated embeddings | Concatenated corpora |
|---|---|---|---|---|---|
| CID00004048 (Megestrol acetate) | 2908 | ranked 13 | ranked 10 | ranked 6 | ranked 4 |
| CID00004934 (Propantheline) | 1131 | ranked 91 | ranked 13 | ranked 1 | ranked 1 |
| CID00003155 (Dothiepin) | 1129 | ranked 62 | ranked 26 | ranked 19 | ranked 1 |
| CID00004666 (Paclitaxel) | 7157 | ranked 5 | ranked 3 | ranked 5 | ranked 2 |
| CID00003640 (Cortisol) | 1551 | ranked 13 | ranked 20 | ranked 3 | ranked 10 |
| CID00004594 (Omeprazole) | 1544 | ranked 53 | ranked 18 | ranked 7 | ranked 2 |

Citation
