Web spider framework built on puppeteer. Provides flexible task-queue management and task scheduling via decorators, convenient data storage (nedb / mongodb), and an approach for implementing data visualization and user interaction.

Stars: ✭ 237 (-29.88%)

Mutual labels: crawler, spider

flink-crawler

Continuous scalable web crawler built on top of Flink and crawler-commons

Stars: ✭ 48 (-85.8%)

Mutual labels: crawler, spider

arachnod

High-performance crawler for Node.js

Stars: ✭ 17 (-94.97%)

Mutual labels: crawler, spider

Go spider

[Crawler framework (golang)] An awesome concurrent crawler (spider) framework for Go. The crawler is flexible and modular. It can easily be extended into a customized crawler, or you can use only the default crawl components.

Stars: ✭ 1,745 (+416.27%)

Mutual labels: crawler, spider

galer

A fast tool to fetch URLs from HTML attributes by crawl-in.

Stars: ✭ 138 (-59.17%)

Mutual labels: crawler, spider

Mm131

Image scraper for the MM131 website 🚨

Stars: ✭ 129 (-61.83%)

Mutual labels: crawler, spider

Crawler China Mainland Universities

Crawler for a list of universities in mainland China

Stars: ✭ 143 (-57.69%)

Mutual labels: crawler, spider

Digger

Digger is a powerful and flexible web crawler implemented in pure golang

Stars: ✭ 130 (-61.54%)

Mutual labels: crawler, spider

Gospider

A crawler framework implemented in golang. Users only need to define page rules. Provides a web management interface. Built on colly.

Stars: ✭ 285 (-15.68%)

Mutual labels: crawler, spider

Abot

Cross Platform C# web crawler framework built for speed and flexibility. Please star this project! +1.

Stars: ✭ 1,961 (+480.18%)

Mutual labels: crawler, spider

Zhihu Login

Simulated Zhihu login; supports captcha extraction and saving cookies

Stars: ✭ 340 (+0.59%)

Mutual labels: crawler, spider

Weibo Topic Spider

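Weibo super-topic crawler with Weibo word-frequency statistics, sentiment analysis, and simple classification; newly added crawled data from the pneumonia super topic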

Stars: ✭ 128 (-62.13%)

Mutual labels: crawler, spider

Proxy pool

Proxy IP pool for Python crawlers (proxy pool)

Stars: ✭ 13,964 (+4031.36%)

Mutual labels: crawler, spider

Gain

Web crawling framework based on asyncio.

Stars: ✭ 2,002 (+492.31%)

Mutual labels: crawler, spider

Hacker News Digest

📰 A responsive interface of Hacker News with summaries and thumbnails.

Stars: ✭ 278 (-17.75%)

Mutual labels: crawler, spider

Fun crawler

Crawl some pictures for fun

Stars: ✭ 169 (-50%)

Mutual labels: crawler, spider

Gopa

[WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn

Stars: ✭ 277 (-18.05%)

Mutual labels: crawler, spider

Lianjia Beike Spider

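House-price crawler for Lianjia and Beike. Collects housing price data (residential communities, second-hand homes, rentals, new homes) for 21 major Chinese cities, including Beijing, Shanghai, Guangzhou, and Shenzhen. Stable, reliable, and fast! Supports CSV, MySQL, MongoDB, Excel, and JSON storage; supports Python 2 and 3; displays data in charts; richly commented. Stars are appreciated. For learning and reference only; do not use for commercial purposes, at your own risk.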

Stars: ✭ 2,257 (+567.75%)

Mutual labels: crawler, spider

Fooproxy

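A robust, efficient, score-based, targeted IP proxy pool + API service. You can plug in your own collectors to crawl proxy IPs, and it builds a separate database of valid proxy IPs for each of your crawler's one or more target websites. Supports MongoDB 4.0; uses Python 3.7. (Scored IP proxy pool; custom proxy data crawlers can be added at any time.)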

Stars: ✭ 195 (-42.31%)

Mutual labels: crawler, spider

Crawlab Lite

Lite version of Crawlab, a crawler management platform

Stars: ✭ 122 (-63.91%)

Mutual labels: crawler, spider

Colly

Elegant Scraper and Crawler Framework for Golang

Stars: ✭ 15,535 (+4496.15%)

Mutual labels: crawler, spider

Jssoup

JavaScript + BeautifulSoup = JSSoup