AbstractGeek / comic-scraper

Licence: MIT License
[Python] Scrapes comics and manga from various websites and creates cbz files from them

Comic-scraper (Comic/Manga Downloader)

Downloads comics and manga from various websites and creates pdf or cbz files from them. Currently supports mangafox.me and mangahere.co (more coming up soon).

Installation

Via pip

To install the comic scraper, simply type this into your terminal (sudo -EH might be necessary):

pip install comic-scraper

Via pip (local)

Clone a copy of the repository using the following command:

git clone https://github.com/AbstractGeek/comic-scraper.git

Open a terminal in the cloned folder and run the following (sudo might be necessary):

pip install .

Manual Installation

Requirements

The script is written in Python and requires the following packages:

  1. BeautifulSoup4
  2. requests
  3. futures (concurrent.futures)
  4. img2pdf

These can simply be installed by:

pip install -r requirements.txt

That's it. Use comic_scraper.py to download comics and manga.
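
The requirements list includes concurrent.futures, which suggests that pages are fetched in parallel. The following is a minimal sketch of that idea, not the project's actual code. The function name download_all and its fetch parameter are hypothetical; fetch is injected so the sketch runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor


def download_all(urls, fetch, max_workers=4):
    """Fetch every URL concurrently and return the results in input order.

    `fetch` is any callable that takes a URL and returns its content.
    Both this function and its parameters are illustrative assumptions.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the same order as `urls`,
        # even though the calls themselves run in parallel threads.
        return list(pool.map(fetch, urls))
```

With the requests package installed, fetch could be something like lambda url: requests.get(url).content. Threads fit here because downloading is I/O bound.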

Usage

Manga

Find your comic of interest on mangafox or mangahere and copy the URL of the comic page (https is supported). For example, to download the Kingdom manga, copy this URL: https://mangafox.me/manga/kingdom/

To download all chapters of the comic, call the script with the URL:

comic-scraper https://mangafox.me/manga/kingdom/

If you want to set a custom download location, add -l followed by the location:

comic-scraper -l ~/Comics/ https://mangafox.me/manga/kingdom/

If you want to download only a few chapters, add -c followed by the chapter numbers. For example, to download chapters 10-20, use the following command:

comic-scraper -l ~/Comics/ -c 10:20 https://mangafox.me/manga/kingdom/

Note: Only individual chapters or sequential chunks (start:stop) can currently be downloaded.
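
To illustrate the two accepted forms of the -c value (a single chapter or start:stop), here is a rough sketch of how such a value could be interpreted. The helpers parse_chapters and select_chapters are hypothetical and are not taken from the project. Treating both ends of the range as inclusive, and allowing fractional chapter numbers, are assumptions.

```python
def parse_chapters(spec):
    """Parse a chapter spec such as '15' or '10:20' into (start, stop).

    Hypothetical helper. Both ends are treated as inclusive (an assumption).
    """
    parts = spec.split(":")
    if len(parts) == 1:
        # A single chapter is a range that starts and stops at itself.
        start = stop = float(parts[0])
    elif len(parts) == 2:
        start, stop = float(parts[0]), float(parts[1])
    else:
        raise ValueError("expected 'N' or 'start:stop', got %r" % spec)
    if start > stop:
        raise ValueError("start must not exceed stop in %r" % spec)
    return start, stop


def select_chapters(available, spec):
    """Return the chapter numbers in `available` that fall within the spec."""
    start, stop = parse_chapters(spec)
    return sorted(ch for ch in available if start <= ch <= stop)
```

For instance, select_chapters(all_chapter_numbers, "10:20") would keep only the chapters numbered from 10 through 20.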

The download format can be specified too, using -f. The current default is pdf, but comics can be downloaded as cbz files using the following command:

comic-scraper -l ~/Comics/ -c 10:20 -f cbz https://mangafox.me/manga/kingdom/
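
A cbz file is an ordinary zip archive of page images, so the packing step can be sketched with the standard library alone. This is an illustration under stated assumptions, not the project's implementation: make_cbz is a hypothetical name, and the sketch assumes page images are already saved in one folder with zero-padded file names so that alphabetical order matches reading order.

```python
import os
import zipfile


def make_cbz(image_dir, cbz_path):
    """Pack the image files in image_dir into a cbz archive at cbz_path.

    Hypothetical helper. Returns the list of page file names that were added.
    """
    extensions = (".jpg", ".jpeg", ".png", ".gif")
    # Sort by name; this relies on zero-padded names such as 001.jpg.
    pages = sorted(
        name for name in os.listdir(image_dir)
        if name.lower().endswith(extensions)
    )
    # ZIP_STORED skips compression, since images are already compressed.
    with zipfile.ZipFile(cbz_path, "w", zipfile.ZIP_STORED) as archive:
        for name in pages:
            # arcname keeps the archive flat, without the folder path.
            archive.write(os.path.join(image_dir, name), arcname=name)
    return pages
```

Most comic readers open the resulting file directly, showing pages in the order of their names inside the archive.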

Comics

Coming soon...
