A web spider framework built on puppeteer. It provides flexible task-queue management and task scheduling through decorators, convenient data storage (nedb / mongodb), and data visualization with user interaction.

Stars: ✭ 237 (-14.75%)

Mutual labels: crawler, spider

Fast Lianjia Crawler

A blazing-fast crawler that fetches data directly through the Lianjia API, the fastest in the universe 🚀

Stars: ✭ 247 (-11.15%)

Mutual labels: crawler, spider

Hnrss

Custom, realtime RSS feeds for Hacker News

Stars: ✭ 277 (-0.36%)

Mutual labels: hacker-news, rss

Ttrss plugin Feediron

Evolution of ttrss_plugin-af_feedmod

Stars: ✭ 172 (-38.13%)

Mutual labels: rss, article

Morss

Get full text RSS feeds

Stars: ✭ 184 (-33.81%)

Mutual labels: rss, article

galer

A fast tool that crawls pages to fetch URLs from HTML attributes.

Stars: ✭ 138 (-50.36%)

Mutual labels: crawler, spider

crawler

A simple and flexible web crawler framework for Java.

Stars: ✭ 20 (-92.81%)

Mutual labels: crawler, spider

flink-crawler

Continuous scalable web crawler built on top of Flink and crawler-commons

Stars: ✭ 48 (-82.73%)

Mutual labels: crawler, spider

slime

🍰 A visual crawler platform

Stars: ✭ 27 (-90.29%)

Mutual labels: crawler, spider

Crawlab Lite

Lite version of Crawlab, a crawler management platform.

Stars: ✭ 122 (-56.12%)

Mutual labels: crawler, spider

Go spider

[Crawler framework (golang)] An awesome concurrent crawler (spider) framework for Go. The crawler is flexible and modular: it can easily be extended into a customized crawler, or you can use only the default crawl components.

Stars: ✭ 1,745 (+527.7%)

Mutual labels: crawler, spider

Pspider

A simple, easy-to-use Python crawler framework. QQ discussion group: 597510560

Stars: ✭ 1,611 (+479.5%)

Mutual labels: crawler, spider

Python3 Spider

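Python crawlers in practice: simulated login for major websites, including but not limited to slider captcha verification, Pinduoduo, Meituan, Baidu, bilibili, Dianping, and Taobao. If you like it, please star ❤️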

Stars: ✭ 2,129 (+665.83%)

Mutual labels: crawler, spider

Jlitespider

A lite distributed Java spider framework :-)

Stars: ✭ 151 (-45.68%)

Mutual labels: crawler, spider

Js Reverse

JS reverse-engineering research

Stars: ✭ 159 (-42.81%)

Mutual labels: crawler, spider

Free proxy website

A collection of websites for obtaining free socks/https/http proxies

Stars: ✭ 119 (-57.19%)

Mutual labels: crawler, spider

Linkedin Profile Scraper

🕵️‍♂️ LinkedIn profile scraper returning structured profile data in JSON. Works in 2020.

Stars: ✭ 171 (-38.49%)

Mutual labels: crawler, spider

Proxy pool

Proxy IP pool for Python crawlers (proxy pool)

Stars: ✭ 13,964 (+4923.02%)

Mutual labels: crawler, spider

Zhihu Crawler People

A simple distributed crawler for Zhihu, plus data analysis

Stars: ✭ 182 (-34.53%)

Mutual labels: crawler, spider

Gain

Web crawling framework based on asyncio.

Stars: ✭ 2,002 (+620.14%)

Mutual labels: crawler, spider

Fooproxy

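A robust, efficient, score-based, targeted IP proxy pool with an API service. You can plug in your own collectors at any time to crawl proxy IPs, and it builds a separate database of valid proxy IPs for each of your crawler's target websites. Supports MongoDB 4.0 and uses Python 3.7.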

Stars: ✭ 195 (-29.86%)

Mutual labels: crawler, spider

Goribot

[Crawler/Scraper for Golang] 🕷 A lightweight, distribution-friendly Golang crawler framework.

Stars: ✭ 190 (-31.65%)

Mutual labels: crawler, spider

Querylist

🕷️ The elegant, progressive PHP crawler framework!

Stars: ✭ 2,392 (+760.43%)

Mutual labels: crawler, spider

Decryptlogin

APIs for logging in to some websites using requests.