Haipproxy💖 High available distributed ip proxy pool, powerd by Scrapy and Redis
Stars: ✭ 4,993 (-0.1%)
Scrapy ClusterThis Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster.
Stars: ✭ 921 (-81.57%)
Spoon🥄 A package for building specific Proxy Pool for different Sites.
Stars: ✭ 173 (-96.54%)
ScrapyrtHTTP API for Scrapy spiders
Stars: ✭ 637 (-87.25%)
Awesome crawl腾讯新闻、知乎话题、微博粉丝,Tumblr爬虫、斗鱼弹幕、妹子图爬虫、分布式设计等
Stars: ✭ 246 (-95.08%)
CrawlabDistributed web crawler admin platform for spiders management regardless of languages and frameworks. 分布式爬虫管理平台,支持任何语言和框架
Stars: ✭ 8,392 (+67.91%)
ScrapoxyScrapoxy hides your scraper behind a cloud. It starts a pool of proxies to send your requests. Now, you can crawl without thinking about blacklisting!
Stars: ✭ 1,322 (-73.55%)
WeibospiderThis is a sina weibo spider built by scrapy [微博爬虫/持续维护]
Stars: ✭ 2,408 (-51.82%)
NetdiscoveryNetDiscovery 是一款基于 Vert.x、RxJava 2 等框架实现的通用爬虫框架/中间件。
Stars: ✭ 573 (-88.54%)
FbcrawlA Facebook crawler
Stars: ✭ 536 (-89.28%)
Lizard💐 Full Amazon Automatic Download
Stars: ✭ 41 (-99.18%)
Crawlab LiteLite version of Crawlab. 轻量版 Crawlab 爬虫管理平台
Stars: ✭ 122 (-97.56%)
JlitespiderA lite distributed Java spider framework :-)
Stars: ✭ 151 (-96.98%)
Proxy poolPython爬虫代理IP池(proxy pool)
Stars: ✭ 13,964 (+179.39%)
Goribot[Crawler/Scraper for Golang]🕷A lightweight distributed friendly Golang crawler framework.一个轻量的分布式友好的 Golang 爬虫框架。
Stars: ✭ 190 (-96.2%)
FilesensorDynamic file detection tool based on crawler 基于爬虫的动态敏感文件探测工具
Stars: ✭ 227 (-95.46%)
DotnetspiderDotnetSpider, a .NET standard web crawling library. It is lightweight, efficient and fast high-level web crawling & scraping framework
Stars: ✭ 3,233 (-35.31%)
Redsync.go*DEPRECATED* Please use https://gopkg.in/redsync.v1 (https://github.com/go-redsync/redsync)
Stars: ✭ 292 (-94.16%)
GerapyDistributed Crawler Management Framework Based on Scrapy, Scrapyd, Django and Vue.js
Stars: ✭ 2,601 (-47.96%)
FunpyspidersearchengineWord2vec 千人千面 个性化搜索 + Scrapy2.3.0(爬取数据) + ElasticSearch7.9.1(存储数据并提供对外Restful API) + Django3.1.1 搜索
Stars: ✭ 782 (-84.35%)
Xxl CrawlerA distributed web crawler framework.(分布式爬虫框架XXL-CRAWLER)
Stars: ✭ 561 (-88.78%)
IcrawlerA multi-thread crawler framework with many builtin image crawlers provided.
Stars: ✭ 629 (-87.41%)
Python Spider豆瓣电影top250、斗鱼爬取json数据以及爬取美女图片、淘宝、有缘、CrawlSpider爬取红娘网相亲人的部分基本信息以及红娘网分布式爬取和存储redis、爬虫小demo、Selenium、爬取多点、django开发接口、爬取有缘网信息、模拟知乎登录、模拟github登录、模拟图虫网登录、爬取多点商城整站数据、爬取微信公众号历史文章、爬取微信群或者微信好友分享的文章、itchat监听指定微信公众号分享的文章
Stars: ✭ 615 (-87.7%)
DisecDistributed Image Search Engine Crawler
Stars: ✭ 11 (-99.78%)
PoopakPOOPAK - TOR Hidden Service Crawler
Stars: ✭ 78 (-98.44%)
ScrappleA framework for creating semi-automatic web content extractors
Stars: ✭ 464 (-90.72%)
Qqmusicspider基于Scrapy的QQ音乐爬虫(QQ Music Spider),爬取歌曲信息、歌词、精彩评论等,并且分享了QQ音乐中排名前6400名的内地和港台歌手的49万+的音乐语料
Stars: ✭ 120 (-97.6%)
Docs《数据采集从入门到放弃》源码。内容简介:爬虫介绍、就业情况、爬虫工程师面试题 ;HTTP协议介绍; Requests使用 ;解析器Xpath介绍; MongoDB与MySQL; 多线程爬虫; Scrapy介绍 ;Scrapy-redis介绍; 使用docker部署; 使用nomad管理docker集群; 使用EFK查询docker日志
Stars: ✭ 118 (-97.64%)
Python3 SpiderPython爬虫实战 - 模拟登陆各大网站 包含但不限于:滑块验证、拼多多、美团、百度、bilibili、大众点评、淘宝,如果喜欢请start ❤️
Stars: ✭ 2,129 (-57.4%)
Marmot💐Marmot | Web Crawler/HTTP protocol Download Package 🐭
Stars: ✭ 186 (-96.28%)
Vaultswiss army knife for hackers
Stars: ✭ 346 (-93.08%)
Ruiji.netcrawler framework, distributed crawler extractor
Stars: ✭ 220 (-95.6%)
Crawler爬虫, http代理, 模拟登陆!
Stars: ✭ 106 (-97.88%)
NScrapyNScrapy is a .net core corss platform Distributed Spider Framework which provide an easy way to write your own Spider
Stars: ✭ 88 (-98.24%)
RedissonRedisson - Redis Java client with features of In-Memory Data Grid. Over 50 Redis based Java objects and services: Set, Multimap, SortedSet, Map, List, Queue, Deque, Semaphore, Lock, AtomicLong, Map Reduce, Publish / Subscribe, Bloom filter, Spring Cache, Tomcat, Scheduler, JCache API, Hibernate, MyBatis, RPC, local cache ...
Stars: ✭ 17,972 (+259.58%)
scrapy-kafka-redisDistributed crawling/scraping, Kafka And Redis based components for Scrapy
Stars: ✭ 45 (-99.1%)
DsockDistributed WebSocket broker
Stars: ✭ 197 (-96.06%)
PotteryRedis for humans. 🌎🌍🌏
Stars: ✭ 204 (-95.92%)
DotnetcrawlerDotnetCrawler is a straightforward, lightweight web crawling/scrapying library for Entity Framework Core output based on dotnet core. This library designed like other strong crawler libraries like WebMagic and Scrapy but for enabling extandable your custom requirements. Medium link : https://medium.com/@mehmetozkaya/creating-custom-web-crawler-with-dotnet-core-using-entity-framework-core-ec8d23f0ca7c
Stars: ✭ 100 (-98%)
Ecommercecrawlers码云仓库链接:AJay13/ECommerceCrawlers
Github 仓库链接:DropsDevopsOrg/ECommerceCrawlers
项目展示平台链接:http://wechat.doonsec.com
Stars: ✭ 3,073 (-38.52%)
Summer这是一个支持分布式和集群的java游戏服务器框架,可用于开发棋牌、回合制等游戏。基于netty实现高性能通讯,支持tcp、http、websocket等协议。支持消息加解密、攻击拦截、黑白名单机制。封装了redis缓存、mysql数据库的连接与使用。轻量级,便于上手。
Stars: ✭ 336 (-93.28%)
RedislockSimplified distributed locking implementation using Redis
Stars: ✭ 370 (-92.6%)