Papercut is a scraping/crawling library for Node.js built on top of JSDOM. It provides basic selector features together with features like Page Caching and Geosearch.

Stars: ✭ 15 (-6.25%)

Mutual labels: web-scraping

restaurant-finder-featureReviews

A Flask web application that helps users retrieve key restaurant information and feature-based reviews (generated by applying a market-basket model, the Apriori algorithm, and NLP to user reviews).

Stars: ✭ 21 (+31.25%)

Mutual labels: web-scraping

raspagem-de-dados-fatec

📓 Short course on web data scraping with Python, taught at the FATEC Jundiaí Technology Week

Stars: ✭ 22 (+37.5%)

Mutual labels: web-scraping

top-github-scraper

Scrape top GitHub repositories and users based on keywords

Stars: ✭ 40 (+150%)

Mutual labels: web-scraping

audiobooker

Audiobook scraper

Stars: ✭ 14 (-12.5%)

Mutual labels: web-scraping

Stock-Fundamental-data-scraping-and-analysis

A project that builds a web crawler to collect stock fundamentals and review their performance in one go

Stars: ✭ 40 (+150%)

Mutual labels: web-scraping

Text-Analysis

Explains textual analysis tools in Python, including preprocessing, skip-gram (word2vec), and topic modelling.

Stars: ✭ 48 (+200%)

Mutual labels: web-scraping

article-summary-deep-learning

📖 Using deep learning and scraping to analyze/summarize articles! Just drop in any URL!

Stars: ✭ 18 (+12.5%)

Mutual labels: web-scraping


Comic-scraper (Comic/Manga Downloader)

Downloads comics and manga from various websites and creates PDF or CBZ files from them. Currently supports mangafox.me and mangahere.co (more coming soon).

Installation

Via pip

To install the comic scraper, simply type this into your terminal (sudo -EH might be necessary):

pip install comic-scraper

Via pip (local)

Clone a copy of the repository using the following command:

git clone https://github.com/AbstractGeek/comic-scraper.git

Open a terminal in the cloned folder and run the following (sudo might be necessary):

pip install .

Manual Installation

Requirements

The script is written in Python. It requires the following packages:

BeautifulSoup4
requests
futures (concurrent.futures)
img2pdf

These can be installed with:

pip install -r requirements.txt

That's it. Use comic_scraper.py to download comics and manga.
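
If the script fails on import after a manual installation, a quick check like the one below can show which requirement is missing. This is only a sketch and not part of comic-scraper: the function name missing_modules is hypothetical, and the module names are the usual import names of the packages listed above (BeautifulSoup4 is imported as bs4).

```python
import importlib.util

# Import names for the packages listed under Requirements.
# BeautifulSoup4 installs as the "bs4" module.
REQUIRED_MODULES = ("bs4", "requests", "concurrent.futures", "img2pdf")


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names from `names` that cannot be found.

    Hypothetical helper for illustration; comic-scraper does not ship it.
    """
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except ModuleNotFoundError:
            # Raised when the parent package of a dotted name is absent.
            found = False
        if not found:
            missing.append(name)
    return missing
```

Calling missing_modules() returns an empty list when everything is installed; any names it returns point at the packages to reinstall from requirements.txt.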

Usage

Manga

Find your comic of interest on mangafox/mangahere and copy the URL of the comic page (https is supported). For example, to download the Kingdom manga, copy this URL: https://mangafox.me/manga/kingdom/

To download all the chapters of the comic, simply call the script and input the url.

comic-scraper https://mangafox.me/manga/kingdom/
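
To make the expected URL shape concrete, here is a minimal sketch of how a comic page URL could be checked before downloading. It is an illustration only and does not describe how comic-scraper handles URLs internally: the function describe_comic_url and the SUPPORTED_HOSTS tuple are hypothetical, and the host list is simply the two sites named in this README.

```python
from urllib.parse import urlparse

# Assumption: the two sites this README lists as supported.
SUPPORTED_HOSTS = ("mangafox.me", "mangahere.co")


def describe_comic_url(url):
    """Return (host, comic_name) for a comic page URL, or raise ValueError.

    Hypothetical helper for illustration; not part of comic-scraper.
    """
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        raise ValueError(f"expected an http(s) URL, got: {url!r}")

    # hostname is already lower-cased and excludes any port.
    host = parsed.hostname or ""
    if host.startswith("www."):
        host = host[len("www."):]
    if host not in SUPPORTED_HOSTS:
        raise ValueError(f"unsupported site: {host!r}")

    # The comic name is taken to be the last non-empty path segment,
    # e.g. "kingdom" in /manga/kingdom/.
    segments = [part for part in parsed.path.split("/") if part]
    if not segments:
        raise ValueError("URL does not point at a comic page")
    return host, segments[-1]
```

For the example above, describe_comic_url("https://mangafox.me/manga/kingdom/") returns ("mangafox.me", "kingdom").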

To set a custom download location, add -l followed by the path:

comic-scraper -l ~/Comics/ https://mangafox.me/manga/kingdom/

To download only selected chapters, add -c followed by the chapter numbers. For example, to download chapters 10-20, use the following command:

comic-scraper -l ~/Comics/ -c 10:20 https://mangafox.me/manga/kingdom/

Note: Only individual chapters or sequential chunks (start:stop) can currently be downloaded.
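
The start:stop form can be read as an inclusive range, since 10:20 above selects chapters 10 through 20. The sketch below shows one way such a value could be interpreted. It is an assumption made for illustration, not the script's actual parsing code: the names parse_chapter_spec and select_chapters are hypothetical, and treating chapter numbers as floats (to allow something like 10.5) is also an assumption.

```python
def parse_chapter_spec(spec):
    """Parse "N" or "start:stop" into an inclusive (start, stop) pair.

    Hypothetical helper for illustration; not comic-scraper's own parser.
    """
    parts = spec.split(":")
    if len(parts) == 1:
        # A single chapter is a range of length one.
        start = stop = float(parts[0])
    elif len(parts) == 2:
        start, stop = float(parts[0]), float(parts[1])
    else:
        raise ValueError(f"expected N or start:stop, got: {spec!r}")
    if start > stop:
        raise ValueError(f"start is after stop in: {spec!r}")
    return start, stop


def select_chapters(available, spec):
    """Keep the chapter numbers in `available` that fall inside the range."""
    start, stop = parse_chapter_spec(spec)
    return [chapter for chapter in available if start <= chapter <= stop]
```

With this reading, select_chapters([9, 10, 10.5, 20, 21], "10:20") keeps 10, 10.5 and 20, and a value such as "7" selects only chapter 7.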

The download format can also be specified with -f. The current default is pdf, but comics can be downloaded as cbz files using the following command:

comic-scraper -l ~/Comics/ -c 10:20 -f cbz https://mangafox.me/manga/kingdom/
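
For context, a cbz file is an ordinary zip archive of page images, so a chapter's downloaded pages can be bundled with the standard library alone. The following is a rough sketch rather than comic-scraper's own packaging code: the function make_cbz is hypothetical, and it assumes the page files sort into reading order by name (for example, zero-padded numbers).

```python
import zipfile
from pathlib import Path


def make_cbz(image_paths, cbz_path):
    """Bundle page images into a .cbz (zip) archive and return its path.

    Hypothetical helper for illustration; not part of comic-scraper.
    Assumes the file names sort into reading order.
    """
    pages = sorted(Path(path) for path in image_paths)
    # Images are already compressed, so store them without recompressing.
    with zipfile.ZipFile(cbz_path, "w", zipfile.ZIP_STORED) as archive:
        for index, page in enumerate(pages, start=1):
            # Rename pages to 001.jpg, 002.jpg, ... so readers keep the order.
            archive.write(page, arcname=f"{index:03d}{page.suffix}")
    return cbz_path
```

A pdf can be produced from the same page list with img2pdf, which is why that package appears in the requirements.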

Comics

Coming soon...
