ARGUS is an easy-to-use web scraping tool built on the Scrapy Python framework. It can crawl a broad range of websites and perform tasks such as scraping text or collecting the hyperlinks between websites. See: https://link.springer.com/article/10.1007/s11192-020-03726-9

✭ 68

python Jupyter Notebook Batchfile scraping crawling scrapy webscraping scrapyd webcrawling

scrapyr

A simple, tiny Scrapy clustering solution, considered a drop-in replacement for scrapyd

✭ 50

go HCL Dockerfile clustering scrapy scrapyd-server

toutiao

Crawler for the Toutiao (今日头条) technology news API

✭ 17

python spider scrapy

policy-data-analyzer

Building a model to recognize incentives for landscape restoration in environmental policies from Latin America, the US and India. Bringing NLP to the world of policy analysis through an extensible framework that includes scraping, preprocessing, active learning and text analysis pipelines.

memes-api

API for scraping common meme sites

✭ 17

python Dockerfile flask scraping scrapy memes-api parsel

douban-spider

Douban Movie crawler based on the Scrapy framework

✭ 25

python spider scrapy

ptt-web-crawler

Crawler for the web version of PTT

✭ 20

python HTML javascript crawler scrapy ptt

scrapy-pipelines

A collection of pipelines for Scrapy

✭ 16

python pipelines scrapy

dannyAVgleDownloader

Downloader for the well-known website avgle

✭ 27

python downloader qt multithreading qt5 scrapy threading

Python Master Courses

Life is short, I use Python

✭ 61

python HTML javascript c course spark scrapy

SpiderManager

Crawler management platform

✭ 27

python django scrapy scrapyd

allitebooks.com

Download all the ebooks from "allitebooks.com", with an indexed CSV

✭ 24

python CSS HTML ebook scrapy allitebooks webscraping

pythonSpider

🕷️ Some Python spiders built with BeautifulSoup or Scrapy

✭ 28

python scrapy beautifulsoup

scrapy-zyte-smartproxy

Zyte Smart Proxy Manager (formerly Crawlera) middleware for Scrapy

✭ 317

python plugin crawler proxy scraping scrapy crawler-detection

XMQ-BackUp

Backup tool for Xiaomiquan (小密圈): groups, topics, images, and files.

✭ 22

python selenium scrapy

scrapy xiuren

Crawler for the Xiuren website (秀人网) and 55156

✭ 43

python scrapy meizitu xiuren

GPlayCrawler

No description or website provided.

✭ 47

python crawler scrapy googleplay

scrapy-admin

A django admin site for scrapy

✭ 44

python HTML crawler spider scrapy scrapyd

scrapy facebooker

Collection of scrapy spiders which can scrape posts, images, and so on from public Facebook Pages.

✭ 22

HTML python scraper facebook spider scraping scrapy

hk0weather

Web scraper project that collects useful Hong Kong weather data from the HKO website

✭ 49

python shell weather scrapy webscraping hongkong

V2EX Spider

V2EX crawler

✭ 21

python spider scrapy v2ex

ImageGrabber

A Scrapy demo: download all images from a site

✭ 33

python scrapy

restaurant-finder-featureReviews

Build a Flask web application to help users retrieve key restaurant information and feature-based reviews, generated by applying a market-basket model (the Apriori algorithm) and NLP to user reviews.

✭ 21

python HTML flask data-science machine-learning natural-language-processing text-mining mongodb sentiment-analysis web-scraping webapp flask-application scrapy tableau tripadvisor dataprocessing restaurant-reviews

Scrapy-Spiders

A Scrapy-based code repository of data-collection crawlers

✭ 34

python javascript crawler spider scrapy appium fiddler selenuim

BOC FER Spider

Use Scrapy to crawl foreign exchange rates from BOC (Bank of China)

✭ 18

python scrapy

NovelCrawler

A Scrapy-based crawler demo

✭ 15

HTML python scrapy

JustDownlink

Distributed movie search built on Scrapy + Elasticsearch + Django

✭ 28

javascript python CSS HTML shell elasticsearch django scrapy

scrapy plus

An essential toolkit of commonly used Scrapy crawling utilities

✭ 18

python scrapy-spider tor middlewares scrapy spiders scrapy-extension

python-fxxk-spider

A collection of various free Python crawler projects

✭ 184

crawler spider requests scrapy

python-spider

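Small Python crawler projects [continuously updated] [Biquge novel downloads, Tweet data scraping, weather lookup, NetEase Cloud Music reverse engineering, Tiantian Fund (天天基金网) lookup, Weibo data scraping (cookie generation), Youdao Translate reverse engineering, Qichacha crawler without login, Dianping SVG encryption cracking, Bilibili user crawler, Lagou crawler without login, Ziroom rental font encryption, Zhihu Q&A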

✭ 45

javascript python spider scrapy

Data-Engineering-Projects

Personal Data Engineering Projects

✭ 167

Jupyter Notebook python postgres airflow spark cassandra mongodb data-warehouse data-engineering data-lake scrapy data-modeling aws-redshift star-schema ingest-data data-engineering-nanodegree

proxi

Proxy pool. Finds and checks proxies, with a REST API for querying results. Can find over 25k proxies in under 5 minutes.

✭ 32

go shell Makefile Dockerfile crawler proxy web-crawler scraping http-proxy scrapy proxypool proxy-list

scraping-ebay

Scraping eBay's products using the Scrapy web crawling framework

✭ 79

python web-scraping scrapy

logparser

A tool for parsing Scrapy log files periodically and incrementally, extending the HTTP JSON API of Scrapyd.

✭ 70

python visualization scrapy log-parser scrapyd log-parsing log-analyse scrapy-log-analysis scrapyd-log-analysis

bgmtools

Bangumi utilities

✭ 66

python javascript HTML django scrapy tampermonkey bangumi

scrapy-distributed

A series of distributed components for Scrapy, including RabbitMQ-based, Kafka-based, and RedisBloom-based components.

✭ 38