
paulamartingonzalez / Targeted_literature_reviews_via_webscraping

Web scraping to get articles for a given query. It returns a spreadsheet with the titles, abstracts, DOIs and references of the articles.

Projects that are alternatives of or similar to Targeted literature reviews via webscraping

Continuousparetomtl
[ICML 2020] PyTorch Code for "Efficient Continuous Pareto Exploration in Multi-Task Learning"
Stars: ✭ 52 (-1.89%)
Mutual labels:  jupyter-notebook
Cvnd Udacity
Computer Vision Nanodegree program from Udacity
Stars: ✭ 52 (-1.89%)
Mutual labels:  jupyter-notebook
Python Tutorial Notebooks
Python tutorials as Jupyter Notebooks for NLP, ML, AI
Stars: ✭ 52 (-1.89%)
Mutual labels:  jupyter-notebook
Feature Selection For Machine Learning
Code Repository for the online course Feature Selection for Machine Learning
Stars: ✭ 52 (-1.89%)
Mutual labels:  jupyter-notebook
Metrotwitter
What Twitter reveals about the differences between cities and the monoculture of the Bay Area
Stars: ✭ 52 (-1.89%)
Mutual labels:  jupyter-notebook
Cs224n Gpu That Talks
Attention, I'm Trying to Speak: End-to-end speech synthesis (CS224n '18)
Stars: ✭ 52 (-1.89%)
Mutual labels:  jupyter-notebook
Coronamaskon
Mask On-Off control with computer vision
Stars: ✭ 52 (-1.89%)
Mutual labels:  jupyter-notebook
Aws Machine Learning University Accelerated Cv
Machine Learning University: Accelerated Computer Vision Class
Stars: ✭ 1,068 (+1915.09%)
Mutual labels:  jupyter-notebook
Hsuantienlin Ml Camp
Stars: ✭ 52 (-1.89%)
Mutual labels:  jupyter-notebook
Average Word2vec
🔤 Calculate average word embeddings (word2vec) from documents for transfer learning
Stars: ✭ 52 (-1.89%)
Mutual labels:  jupyter-notebook
Onnx tflite yolov3
A Conversion tool to convert YOLO v3 Darknet weights to TF Lite model (YOLO v3 PyTorch > ONNX > TensorFlow > TF Lite), and to TensorRT (YOLO v3 Pytorch > ONNX > TensorRT).
Stars: ✭ 52 (-1.89%)
Mutual labels:  jupyter-notebook
Bert Stack Overflow
Train a BERT model with TensorFlow 2.0 to automatically tag StackOverflow questions!
Stars: ✭ 52 (-1.89%)
Mutual labels:  jupyter-notebook
Lung Diseases Classifier
Diseases Detection from NIH Chest X-ray data
Stars: ✭ 52 (-1.89%)
Mutual labels:  jupyter-notebook
Nlp Various Tutorials
A repository of various tutorials related to natural language processing
Stars: ✭ 52 (-1.89%)
Mutual labels:  jupyter-notebook
Tensorflow
TensorFlow hands-on study notes, code, and an advanced machine learning series
Stars: ✭ 1,066 (+1911.32%)
Mutual labels:  jupyter-notebook
Sirmodel covid 19
Stars: ✭ 52 (-1.89%)
Mutual labels:  jupyter-notebook
Deeplens Workshops
Stars: ✭ 52 (-1.89%)
Mutual labels:  jupyter-notebook
Tensorflow Tutorials For Time Series
TensorFlow Tutorial for Time Series Prediction
Stars: ✭ 1,067 (+1913.21%)
Mutual labels:  jupyter-notebook
Fasttext multilingual
Multilingual word vectors in 78 languages
Stars: ✭ 1,067 (+1913.21%)
Mutual labels:  jupyter-notebook
Sklearn Deeprl
Deep reinforcement learning. In scikit-learn. In less than 50 effective lines.
Stars: ✭ 52 (-1.89%)
Mutual labels:  jupyter-notebook


Targeted Literature Reviews using webscraping

Web scraping to get articles for a given query. It returns a spreadsheet with titles, abstracts and PMIDs.

It works on PubMed and is based on Biopython: https://biopython.org
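To give a feel for what a Biopython-based PubMed lookup can look like, here is a minimal sketch. It is not taken from this repository's notebook: the helper names `build_query` and `search_pubmed`, the `max_results` parameter and the placeholder email address are assumptions made for illustration. It relies on Biopython's `Bio.Entrez` module (`Entrez.esearch` and `Entrez.read`).

```python
def build_query(terms):
    """Join search terms into a PubMed query: each term quoted, combined with AND."""
    return " AND ".join(f'"{term}"' for term in terms)


def search_pubmed(query, email, max_results=20):
    """Return the PMIDs matching `query` (needs Biopython and network access)."""
    # Imported here so that build_query can be used without Biopython installed.
    from Bio import Entrez

    Entrez.email = email  # NCBI asks callers to identify themselves
    handle = Entrez.esearch(db="pubmed", term=query, retmax=max_results)
    try:
        result = Entrez.read(handle)
    finally:
        handle.close()
    return list(result["IdList"])


# Hypothetical helper names; the query terms are the README's own example.
query = build_query(["Radiomics", "CT", "Ovarian Cancer"])
print(query)

# Uncomment to query PubMed (replace the placeholder address with your own):
# pmids = search_pubmed(query, email="you@example.org")
```

The search call is left commented out so the snippet runs offline; the returned PMIDs would then be the starting point for fetching titles and abstracts.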

You can run it on Google Colab without downloading anything locally! :) https://research.google.com/colaboratory/faq.html

How does it work?

For a given query, you can get:

  1. an xlsx file with the titles and abstracts of the papers matching your query
  2. a graph of the papers matching your query and their references, which helps find highly cited papers in a given field
  3. an xlsx file with the titles and abstracts of the references, together with their degree (i.e. the number of connections in the graph). The higher the degree, the more papers in your query cite that reference
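The degree ranking described in the third item can be sketched with the standard library alone. This is an illustration under assumptions, not the project's own code: the function name `reference_degrees`, the input layout (a dict from each query paper to the list of references it cites) and the identifiers are all made up for the example.

```python
from collections import Counter


def reference_degrees(citations):
    """Count how many query papers cite each reference.

    `citations` maps a query paper's identifier to the list of identifiers
    it cites, or to None when no reference list is available.
    """
    degrees = Counter()
    for refs in citations.values():
        # set() so a reference listed twice by the same paper counts once;
        # `refs or []` skips papers without a reference list.
        for ref in set(refs or []):
            degrees[ref] += 1
    return degrees


# Hypothetical identifiers, for illustration only.
citations = {
    "paper_A": ["ref_1", "ref_2"],
    "paper_B": ["ref_1", "ref_3"],
    "paper_C": ["ref_1", "ref_2"],
    "paper_D": None,  # no reference list available
}

# Highest degree first; ties are broken alphabetically by identifier.
ranked = sorted(reference_degrees(citations).items(), key=lambda kv: (-kv[1], kv[0]))
for ref, degree in ranked:
    print(ref, degree)
```

In this toy input `ref_1` comes out on top because three of the four query papers cite it, which is the sense in which a high degree points to a highly cited paper.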

For the example query "Radiomics" AND "CT" AND "Ovarian Cancer" we get:

[Image: output for the example query]

Next steps:

  • At the moment it only works on PubMed. I'm working on making it work on arXiv and bioRxiv as well. Implementing it for Google Scholar is complicated, but I am also trying to get my head around it.
  • I'm working on an implementation that requires no code whatsoever - via website or widgets.
  • It would be great to import the articles to Mendeley, so I'm also working on that!

If you have any suggestions to improve the code, please feel free to raise an Issue!

Questions:

What happens to articles behind a paywall?

You'll be able to get the abstract but unfortunately not the references. So those won't be added to the graph. Open science is the way to go!!
