
ferru97 / PyPaperBot

License: MIT
PyPaperBot is a Python tool for downloading scientific papers using Google Scholar, Crossref, and SciHub.

Programming language: Python

Projects that are alternatives of or similar to PyPaperBot

papis-zotero
Zotero compatibility scripts for papis
Stars: ✭ 29 (-84.24%)
Mutual labels:  papers, crossref, scihub
Object Detection
Summary of object detection (modules & improvements)
Stars: ✭ 50 (-72.83%)
Mutual labels:  papers
Visual-Semantic-Embeddings-an-incomplete-list
A paper list of visual semantic embeddings and text-image retrieval.
Stars: ✭ 42 (-77.17%)
Mutual labels:  papers
understanding-ai
personal repository
Stars: ✭ 34 (-81.52%)
Mutual labels:  papers
Database-Optimization
📚 A collection of work related to Database Optimization.
Stars: ✭ 31 (-83.15%)
Mutual labels:  papers
crminer
⛔ ARCHIVED ⛔ Fetch 'Scholary' Full Text from 'Crossref'
Stars: ✭ 17 (-90.76%)
Mutual labels:  crossref
Academic Phrases
Bypass that mental block when writing your papers.
Stars: ✭ 244 (+32.61%)
Mutual labels:  papers
PyScholar
A 'supervised' parser for Google Scholar
Stars: ✭ 74 (-59.78%)
Mutual labels:  google-scholar
paper seacher
where where where paper
Stars: ✭ 45 (-75.54%)
Mutual labels:  papers
love-a-paper
Twitter bot that tweets randomly selected papers from Papers We Love.
Stars: ✭ 20 (-89.13%)
Mutual labels:  papers
LaTeX-Templates
Commented templates for CVs, homework, lecture notes, presentations, research papers, and essays, with commands for math/statistics symbols
Stars: ✭ 45 (-75.54%)
Mutual labels:  papers
gscholar-citations-crawler
Crawl all your citations from Google Scholar
Stars: ✭ 43 (-76.63%)
Mutual labels:  google-scholar
Awesome-Federated-Learning-on-Graph-and-GNN-papers
Federated learning on graph, especially on graph neural networks (GNNs), knowledge graph, and private GNN.
Stars: ✭ 206 (+11.96%)
Mutual labels:  papers
nlp-papers
Must-read papers on Natural Language Processing (NLP)
Stars: ✭ 87 (-52.72%)
Mutual labels:  papers
tools-generation-detection-synthetic-content
Compilation of the state of the art of tools, articles, forums and links of interest to generate and detect any type of synthetic content using deep learning.
Stars: ✭ 107 (-41.85%)
Mutual labels:  papers
Awesome Grounding
awesome grounding: A curated list of research papers in visual grounding
Stars: ✭ 247 (+34.24%)
Mutual labels:  papers
PaperWeeklyAI
📚「@MaiweiAI」Studying papers in the fields of computer vision, NLP, and machine learning algorithms every week.
Stars: ✭ 50 (-72.83%)
Mutual labels:  papers
Diverse-RecSys
Collection of diverse recommendation papers
Stars: ✭ 39 (-78.8%)
Mutual labels:  papers
Paper-Notes
Paper notes in deep learning/machine learning and computer vision
Stars: ✭ 37 (-79.89%)
Mutual labels:  papers
COVID-19-Resources
Resources for Covid-19
Stars: ✭ 25 (-86.41%)
Mutual labels:  papers

PyPaperBot

PyPaperBot is a Python tool for downloading scientific papers using Google Scholar, Crossref, and SciHub. The tool tries to download each paper from several sources, such as the PDF provided by Scholar, Scholar-related links, and SciHub. PyPaperBot can also download the BibTeX entry of each paper.

Features

  • Download papers given a query
  • Download papers given a list of DOIs
  • Download papers given a Google Scholar link
  • Generate the BibTeX entry of each downloaded paper
  • Filter downloaded papers by year, journal, and number of citations

Installation

For normal Users

Use pip to install from PyPI:

pip install PyPaperBot

If, on Windows, you get an error saying "error: Microsoft Visual C++ 14.0 is required", try installing the Microsoft C++ Build Tools or Visual Studio.

For Termux users

Since numpy cannot be installed directly through pip on Termux, first install it from the its-pointless repository:

pkg install wget
wget https://its-pointless.github.io/setup-pointless-repo.sh
bash setup-pointless-repo.sh
pkg install numpy
export CFLAGS="-Wno-deprecated-declarations -Wno-unreachable-code"
pip install pandas

and then:

pip install PyPaperBot

How to use

PyPaperBot arguments:

Argument           Description                                                                            Type
--query            Query to make on Google Scholar, or a Google Scholar page link                         string
--doi              DOI of the paper to download (this option uses only SciHub)                            string
--doi-file         .txt file containing the list of paper DOIs to download                                string
--scholar-pages    Number or range of Google Scholar pages to inspect; each page holds at most 10 papers  string
--dwn-dir          Directory path in which to save the results                                            string
--min-year         Minimum publication year of the papers to download                                     int
--max-dwn-year     Maximum number of papers to download, sorted by year                                   int
--max-dwn-cites    Maximum number of papers to download, sorted by number of citations                    int
--journal-filter   Path of the journal-filter CSV file (see the note below)                               string
--restrict         0: download only the BibTeX entries; 1: download only the paper PDFs                   int
--scihub-mirror    Mirror for downloading papers from Sci-Hub; selected automatically if not set          string
--scholar-results  Number of Scholar results to be downloaded when --scholar-pages=1                      int
--proxy            Proxies to be used; specify the protocol to be used                                    string
--single-proxy     Use a single proxy; recommended if using --proxy gives errors                          string
-h                 Show the help                                                                          --

Note

You can use only one of the arguments in each of the following groups:

  • --query, --doi-file, and --doi
  • --max-dwn-year and --max-dwn-cites

One of the arguments --query, --doi, and --doi-file is mandatory. The argument --scholar-pages is mandatory when using --query. The argument --dwn-dir is always mandatory.

The argument --journal-filter requires the path of a CSV file containing a list of journal names, each paired with a boolean that indicates whether to consider that journal (0: don't consider / 1: consider). An example file is available in the GitHub repository.
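Based on the description above (journal name paired with a 0/1 flag), a hypothetical filter CSV might look like the sketch below; the journal names are placeholders, and the exact column layout should be checked against the example file in the repository:

```
Nature,1
Science,1
Predatory Journal of Examples,0
```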

The argument --doi-file requires the path of a .txt file containing the list of paper DOIs to download, one DOI per line. An example file is available in the GitHub repository.
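As a sketch, a DOI list in this one-per-line format can be sanity-checked before being handed to --doi-file. The file name, the load_dois helper, and the "10." prefix check below are illustrative and not part of PyPaperBot:

```python
# Sketch: load and sanity-check a DOI list file (one DOI per line),
# the format expected by --doi-file. Names here are illustrative.
from pathlib import Path

def load_dois(path):
    """Return the non-empty, stripped lines of a DOI list file."""
    lines = Path(path).read_text().splitlines()
    dois = [ln.strip() for ln in lines if ln.strip()]
    # Every DOI starts with the "10." directory prefix.
    bad = [d for d in dois if not d.startswith("10.")]
    if bad:
        raise ValueError(f"Lines that do not look like DOIs: {bad}")
    return dois

# Example: write a small list and load it back.
Path("dois.txt").write_text("10.1000/182\n10.1109/5.771073\n")
print(load_dois("dois.txt"))  # ['10.1000/182', '10.1109/5.771073']
```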

Place the --proxy argument after all the other arguments, and specify the protocol to be used. See the examples below to understand how to use the option.

SciHub access

If access to SciHub is blocked in your country, consider using a free VPN service such as ProtonVPN. Alternatively, you can use the proxy options described above.

Example

Download a maximum of 30 papers from the first 3 pages returned for a query, published no earlier than 2018, using the mirror https://sci-hub.do:

python -m PyPaperBot --query="Machine learning" --scholar-pages=3  --min-year=2018 --dwn-dir="C:\User\example\papers" --scihub-mirror="https://sci-hub.do"

Download papers from pages 4 to 7 (inclusive) given a query:

python -m PyPaperBot --query="Machine learning" --scholar-pages=4-7 --dwn-dir="C:\User\example\papers"

Download a paper given the DOI:

python -m PyPaperBot --doi="10.0086/s41037-711-0132-1" --dwn-dir="C:\User\example\papers"

Download papers given a file containing the DOIs:

python -m PyPaperBot --doi-file="C:\User\example\papers\file.txt" --dwn-dir="C:\User\example\papers"

If the command doesn't work, try using py instead of python, i.e.:

py -m PyPaperBot --doi="10.0086/s41037-711-0132-1" --dwn-dir="C:\User\example\papers"

Using a proxy

python -m PyPaperBot --query=rheumatoid+arthritis --scholar-pages=1 --scholar-results=7 --dwn-dir=/download --proxy http://1.1.1.1:8080 https://8.8.8.8:8080
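It is an assumption about PyPaperBot's internals, but values like these are conventionally forwarded as the protocol-to-URL proxies mapping used by Python's requests library, which is why each value above carries an explicit http:// or https:// prefix. A minimal sketch with placeholder addresses:

```python
# Sketch of the protocol -> proxy-URL mapping used by the Python
# "requests" library; the addresses are placeholders, and it is an
# assumption that PyPaperBot consumes its --proxy values this way.
proxies = {
    "http": "http://1.1.1.1:8080",    # proxy for plain-HTTP requests
    "https": "https://8.8.8.8:8080",  # proxy for HTTPS requests
}

# Each entry must carry an explicit scheme, matching the note above
# that the protocol has to be specified.
for scheme, url in proxies.items():
    assert url.startswith(("http://", "https://"))

print(proxies)
```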

On Termux, you can invoke PyPaperBot directly, followed by its arguments, instead of using python -m.

Contributions

Feel free to contribute to this project by proposing any change, fix, or enhancement on the dev branch.

To do

  • Tests
  • Code documentation
  • General improvements

Disclaimer

This application is for educational purposes only. I do not take responsibility for what you choose to do with this application.

Donation

If you like this project, you can buy me a cup of coffee :)

paypal
