Top 395 spider open source projects

Netease Music Spider
netease-music-spider is a spider that helps you find a beautiful girlfriend or a handsome boyfriend.
Taobaoscrapy
😩 Tool for Taobao/Tmall | A childhood toy, now out of date
Papa
A browser-side data crawler, meant to be everyone's data assistant
Venom
All Terrain Autonomous Quadruped
Qiandao
🌟⏳🌟 Automatic check-ins for various websites (no longer maintained)
Amazonbigspider
😱 Fully automatic distributed Amazon spider | Distributed collection of products for selection across four international Amazon sites | Account: admin, password: adminadmin
Go spider
[Crawler framework (Golang)] An awesome concurrent crawler (spider) framework for Go. The crawler is flexible and modular. It can easily be extended into a customized crawler, or you can use only the default crawl components.
Ipproxy
IP proxies for crawlers: scrapes proxy IPs from nine websites, then checks, cleans, stores, and updates them; also adds a calling interface
Bilibili User Information Spider
Crawler for the information of 300 million Bilibili users (mid, nickname, gender, following, followers, level)
Mm131
Image scraper for the MM131 website 🚨
Digger
Digger is a powerful and flexible web crawler implemented in pure Golang
Guwen Spider
A complete serial Node.js crawler that scrapes more than 30,000 pages.
Weibo Topic Spider
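Weibo super-topic crawler with word-frequency statistics, sentiment analysis, and simple classification for Weibo posts; newly added crawled data from the pneumonia super topic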
Feapder
feapder is a Python crawler framework that supports distributed crawling, batch collection, task-loss prevention, and rich alerting
Yspider
yspider -- a lightweight crawler system
Douban crawler
A project to back up Douban
Apiproject
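[https://www.sofineday.com], a Golang project development scaffold that integrates best practices (gin + gorm + go-redis + mongo + cors + jwt + the JSON logging library zap (supports collecting logs to kafka or mongo) + kafka message queue + gopay for WeChat Pay and Alipay payments + API encryption + API reverse proxy + Go modules dependency management + the headless crawler chromedp + makefile + binary compression + livereload hot reloading)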
Crawlab Lite
Lite version of Crawlab, a crawler management platform.
Pspider
A simple, easy-to-use Python crawler framework. QQ discussion group: 597510560
Pddspider
Pinduoduo crawler that scrapes all products, reviews, and other information
Wechat article
Scrapes articles from WeChat official accounts
Free proxy website
A collection of websites for obtaining free SOCKS/HTTPS/HTTP proxies
Copybook
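Uses a crawler to scrape all novels from a novel website, stores them in a database, and builds your own novel website from the scraped data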
Examples Of Web Crawlers
Some very interesting examples of Python crawlers that are friendly to beginners, mainly scraping sites such as Taobao, Tmall, WeChat, Douban, and QQ.
House Price Prediction
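A complete house price prediction project: 1. Scrape data from Lianjia. 2. After processing, build models to predict house prices using several logistic regression machine learning models from sklearn and a Keras neural network. In the final results the neural network performs better, with an R^2 of about 0.75.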
Dingdian
A novel website implemented with a Python crawler and Flask
Douban Movie
Golang crawler that scrapes the Douban Movie Top 250
Geetest
Slider CAPTCHA; hope it helps you ❤️
Douyin Api
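Douyin API, Douyin data, Douyin live-stream data, Douyin live-stream API, Douyin video API, Douyin crawler, Douyin watermark removal, Douyin video download, Douyin video parsing, Douyin live-stream monitoring, Douyin data collection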
Scrala
Unmaintained 🐳 ☕️ 🕷 Scala crawler(spider) framework, inspired by scrapy, created by @gaocegege
Pkulaw spider
Scrapes the Pkulaw (北大法宝) website: http://www.pkulaw.cn/Case/
Cockroach
Yet another Java content-retrieval (that is, crawler) tool
Baiduspider
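BaiduSpider, a crawler that scrapes Baidu search results. It currently supports Baidu web search, Baidu image search, Baidu Zhidao search, Baidu video search, Baidu news search, Baidu Wenku search, Baidu Jingyan search, and Baidu Baike search.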
Jobs Search
🕷 A collection of crawlers for job-recruitment websites; branches are updated from time to time
Not Your Average Web Crawler
A web crawler (for bug hunting) that gathers more than you can imagine.
Crawler Detect
🕷 CrawlerDetect is a PHP class for detecting bots/crawlers/spiders via the user agent
Daily scripts
Small everyday scripts; plenty of fun for lazy people.
Nl2lf
Resources for "Natural Language to Logical Form": a collection of research materials.
Animesearcher
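Integrates video and danmaku (bullet-comment) resources from third-party websites to give freeloaders the best experience for watching anime and following series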
Pspider
A simple distributed crawler framework
Ruia
Async Python 3.6+ web scraping micro-framework based on asyncio
Luoo.spider
🤖 A spider and server for Luoo.qy
Douyinsdk
Douyin SDK for data collection; crawler scraping is no longer a dream
Gopa Abandoned
GOPA, a spider written in Go. (NOTE: this project has moved to https://github.com/infinitbyte/gopa )
Economic audit knowledge graph
Knowledge graph for economic responsibility auditing: web crawling, relation extraction, and domain vocabulary identification
Spider
🕷 Some website spider applications based on a proxy pool (supports HTTP & WebSocket)
Zhihuspider
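Crawler for the public profile information of Zhihu users; can crawl users' follow relationships. Based on Python, with proxies and multithreading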
Ant nest
Simple, clear and fast web crawler framework built on Python 3.6+, powered by asyncio.
Csdn Spider
Scrapes blog articles from CSDN
Spider
A plain and simple spider
Zhihu Spider
Zhihu crawler that periodically tracks question data and periodically pushes hot topics
Alipayorderssupervisor Gui
GUI of AlipayOrdersSupervisor, implemented in Java and Swing
Geziyor
Geziyor, a fast web crawling & scraping framework for Go. Supports JS rendering.