[Crawler framework (golang)] An awesome concurrent crawler (spider) framework in Go. The crawler is flexible and modular: it can easily be extended into a customized crawler, or you can use only the default crawl components.

✭ 1,745

go crawler spider pipeline schedule

Ipproxy

IP proxies for crawlers: scrapes proxy IPs from nine websites, then validates, cleans, stores, and updates them; also adds a calling API

✭ 136

python spider proxies

Bilibili User Information Spider

Crawler for the profile information of 300 million Bilibili users (mid, nickname, gender, following, followers, level)

✭ 136

python spider bilibili user

Mm131

Image scraper for the MM131 website 🚨

✭ 129

python crawler spider

Digger

Digger is a powerful and flexible web crawler implemented in pure golang

✭ 130

go crawler spider

Guwen Spider

A complete Node.js serial crawler that scrapes more than 30,000 pages.

✭ 129

javascript es7 nodejs async spider mongoose

Weibo Topic Spider

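Weibo Super Topic crawler with Weibo word-frequency statistics, sentiment analysis, and simple classification; newly adds data scraped from the pneumonia Super Topic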

✭ 128

python crawler spider weibo topic

Scrapy demo

All kinds of Scrapy demos

✭ 128

python mongodb demo kafka spider example pipeline scrapy sqlalchemy oss

Feapder

feapder is a Python crawler framework that supports distributed crawling, batch collection, task-loss prevention, and rich alerting

✭ 110

python spider scrapy

Yspider

yspider: a lightweight crawler system

✭ 125

web mongodb flask spider abstraction

Douban crawler

Douban backup project

✭ 124

python spider douban

Apiproject

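[https://www.sofineday.com], a golang project development scaffold that integrates best practices (gin + gorm + go-redis + mongo + cors + jwt + JSON logging library zap (supports collecting logs to kafka or mongo) + kafka message queue + WeChat and Alipay payment via gopay + API encryption + API reverse proxy + go modules dependency management + headless crawler chromedp + makefile + binary compression + livereload hot reloading)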

✭ 124

go golang redis makefile kafka jwt spider headless alipay mongo api-server cors gorm livereload compress wxpay

Crawlab Lite

Lite version of Crawlab, a crawler management platform

✭ 122

vue crawler spider scrapy platform web-crawler

Pspider

A simple, easy-to-use Python crawler framework. QQ discussion group: 597510560

✭ 1,611

python crawler spider multiprocessing multi-threading web-crawler proxies python-spider web-spider

Pddspider

Pinduoduo crawler that scrapes all products, reviews, and other information

✭ 121

python spider selenium

Wechat article

Scrapes articles from WeChat official accounts

✭ 121

python python3 wechat spider pyqt5

Free proxy website

A collection of websites for obtaining free socks/https/http proxies

✭ 119

python proxy crawler spider ip

Decryptlogin

APIs for logging in to some websites using requests.

✭ 1,861

python crawler twitter spider pypi login requests bilibili xiaomi baidu weibo zhihu stackoverflow taobao tencent baiduyun xiami 12306 jingdong migu

Copybook

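Uses a crawler to scrape all the novels on a novel website, stores them in a database, and builds its own novel website from the scraped data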

✭ 117

python css django spider scrapy

Examples Of Web Crawlers

Some very interesting Python crawler examples that are friendly to beginners, mainly scraping sites such as Taobao, Tmall, WeChat, Douban, and QQ.

✭ 10,724

python HTML javascript CSS wechat crawler spider example selenium multithreading stock taobao pyquery tmall fund agent-pool wechat-report

House Price Prediction

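A complete house price prediction project: 1. scrape data from Lianjia; 2. after processing, build models to predict house prices using several logistic regression machine learning models from sklearn and a keras neural network. In the final results the neural network performs better, with an R^2 of about 0.75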

✭ 116

python machine-learning keras spider sklearn

Dingdian

A novel website built with a Python crawler and Flask

✭ 115

python python3 spider flask-application

Bilibili member crawler

Bilibili user crawler. Yay, it's a crawler~

✭ 115

python python3 mysql web crawler spider queue requests bilibili multithreading

Douban Movie

A Golang crawler that scrapes the Douban Movie Top 250

✭ 114

go golang crawler spider movie douban

Geetest

Slider CAPTCHA; hope it helps you ❤️

✭ 114

python python3 spider bilibili crawl

Douyin Api

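Douyin API, Douyin data, Douyin livestream data, Douyin livestream API, Douyin video API, Douyin crawler, Douyin watermark removal, Douyin video download, Douyin video parsing, Douyin livestream monitoring, Douyin data collection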

✭ 112

api spider api-client

Scrala

Unmaintained 🐳 ☕️ 🕷 Scala crawler(spider) framework, inspired by scrapy, created by @gaocegege

✭ 113

scala docker spider scrapy actor-model

Pkulaw spider

Scrapes the PKULaw (北大法宝) website http://www.pkulaw.cn/Case/

✭ 113

python ai crawler spider law

Cockroach

Yet another Java content-retrieval tool (that is, a "pa chong", or crawler)

✭ 112

java spider

Baiduspider

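BaiduSpider, a crawler that scrapes Baidu search results. It currently supports Baidu web search, Baidu image search, Baidu Zhidao search, Baidu video search, Baidu news search, Baidu Wenku search, Baidu Jingyan search, and Baidu Baike search.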

✭ 105

python search crawler spider baidu

Jobs Search

🕷 A collection of crawlers for job-recruitment websites; branches are updated irregularly

✭ 111

python mysql spider

Hive

Lots of spiders

✭ 110

python python3 spider scrapy selenium-webdriver beautifulsoup

Not Your Average Web Crawler

A web crawler (for bug hunting) that gathers more than you can imagine.

✭ 107

python security crawler spider scanner scraper vulnerability custom request bug-bounty

Crawler Detect

🕷 CrawlerDetect is a PHP class for detecting bots/crawlers/spiders via the user agent

✭ 1,549

PHP hacktoberfest crawler spider bots user-agent detect

Daily scripts

Small everyday scripts; lazy people have more fun.

✭ 105

python spider scripts python-script daily

Nl2lf

Resources for "Natural Language to Logical Form": a collection of research materials.

✭ 105

spider

Animesearcher

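Integrates video and danmaku (bullet-comment) resources from third-party websites to give freeloaders the best experience for watching anime and following series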

✭ 101

python spider player bilibili movies danmaku cctv

Pspider

A simple distributed crawler framework

✭ 102

python spider celery crawl

Ruia

Async Python 3.6+ web scraping micro-framework based on asyncio

✭ 1,366

python crawler asyncio spider aiohttp

Luoo.spider

🤖 A spider and server for Luoo.qy

✭ 99

python typescript music server spider koa

Douyinsdk

Douyin SDK: data collection and crawling are no longer just a dream

✭ 99

python sdk crawler spider

Gopa Abandoned

GOPA, a spider written in Go. (NOTE: this project moved to https://github.com/infinitbyte/gopa)

✭ 98

go golang crawler spider lightweight

Economic audit knowledge graph

Economic responsibility audit knowledge graph: web crawling, relation extraction, domain vocabulary identification

✭ 98

javascript java spider knowledge-graph neo4j

Spider

🕷 Some website spider applications based on a proxy pool (supports HTTP & WebSocket)

✭ 93

python spider

Zhihuspider

Crawler for Zhihu users' public profile information; can scrape users' follow relationships. Based on Python, uses proxies and multithreading

✭ 92

python mysql redis spider

Ant nest

Simple, clear and fast web crawler framework built on python3.6+, powered by asyncio.

✭ 90

python python36 framework asyncio spider

Csdn Spider

Scrapes blog articles from CSDN

✭ 89

python spider

Spider

A plain and simple spider

✭ 88

python spider shadowsocks

Zhihu Spider

Zhihu crawler that tracks question data on a schedule and pushes trending topics on a schedule

✭ 87

javascript spider zhihu

Alipayorderssupervisor Gui

GUI of AlipayOrdersSupervisor, implemented in Java and Swing

✭ 85

java spider alipay swing

Geziyor

Geziyor, a fast web crawling & scraping framework for Go. Supports JS rendering.

✭ 1,246

go crawler spider scraper scraping

Puppeteer Walker

a puppeteer walker 🕷 🕸

✭ 78

javascript chrome crawler spider puppeteer headless

61-120 of 395 spider projects