Redisson - Redis Java client with features of In-Memory Data Grid. Over 50 Redis based Java objects and services: Set, Multimap, SortedSet, Map, List, Queue, Deque, Semaphore, Lock, AtomicLong, Map Reduce, Publish / Subscribe, Bloom filter, Spring Cache, Tomcat, Scheduler, JCache API, Hibernate, MyBatis, RPC, local cache ...

Stars: ✭ 17,972 (+259.94%)

Mutual labels: scheduler, redis, distributed

Netdiscovery

NetDiscovery is a general-purpose crawler framework/middleware built on Vert.x, RxJava 2, and other frameworks.

Stars: ✭ 573 (-88.52%)

Mutual labels: crawler, spider, redis

Lizard

💐 Full Amazon Automatic Download

Stars: ✭ 41 (-99.18%)

Mutual labels: crawler, spider, distributed

Scrapingoutsourcing

ScrapingOutsourcing focuses on sharing crawler code, aiming to add one new crawler each week.

Stars: ✭ 164 (-96.72%)

Mutual labels: crawler, spider, scrapy

Gerapy

Distributed Crawler Management Framework Based on Scrapy, Scrapyd, Django and Vue.js

Stars: ✭ 2,601 (-47.91%)

Mutual labels: spider, scrapy, distributed

Xxl Crawler

A distributed web crawler framework (XXL-CRAWLER).

Stars: ✭ 561 (-88.76%)

Mutual labels: crawler, spider, distributed

Zi5book

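Crawls the entire book.zi5.me site for Kindle e-books, sorted by author and book title; each book is available in mobi and epub formats. The full-site crawl runs in distributed mode.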

Stars: ✭ 191 (-96.17%)

Mutual labels: spider, redis, distributed

Marmot

💐Marmot | Web Crawler/HTTP protocol Download Package 🐭

Stars: ✭ 186 (-96.27%)

Mutual labels: crawler, spider, scrapy

Fbcrawl

A Facebook crawler

Stars: ✭ 536 (-89.26%)

Mutual labels: crawler, spider, scrapy

Funpyspidersearchengine

Word2vec personalized search + Scrapy 2.3.0 (data crawling) + ElasticSearch 7.9.1 (data storage and an external RESTful API) + Django 3.1.1 search

Stars: ✭ 782 (-84.34%)

Mutual labels: spider, scrapy, redis

Python Spider

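Douban Movie Top 250; crawling JSON data and photos of women from Douyu; Taobao; Youyuan; using CrawlSpider to crawl basic profile information from the Hongniang matchmaking site, plus distributed crawling of Hongniang with storage in Redis; small crawler demos; Selenium; crawling Duodian; developing API endpoints with Django; crawling Youyuan profile data; simulated logins for Zhihu, GitHub, and Tuchong; crawling the entire Duodian store site; crawling the article history of WeChat official accounts; crawling articles shared in WeChat groups or by WeChat friends; using itchat to monitor articles shared by specified WeChat official accounts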

Stars: ✭ 615 (-87.68%)

Mutual labels: spider, scrapy, redis

Crawlab

Distributed web crawler admin platform for managing spiders regardless of language or framework.

Stars: ✭ 8,392 (+68.08%)

Mutual labels: crawler, spider, scrapy

Goribot

[Crawler/Scraper for Golang] 🕷 A lightweight, distributed-friendly Golang crawler framework.

Stars: ✭ 190 (-96.19%)

Mutual labels: crawler, spider, scrapy

Scrapy IPProxyPool

Free IP proxy pool. A plugin for the Scrapy crawler framework.

Stars: ✭ 100 (-98%)

Mutual labels: spider, scrapy, ipproxy

Awesome Crawler

A collection of awesome web crawlers and spiders in different languages

Stars: ✭ 4,793 (-4.01%)

Mutual labels: crawler, spider

V2EX Spider

V2EX crawler

Stars: ✭ 21 (-99.58%)

Mutual labels: spider, scrapy

scrapy-admin

A Django admin site for Scrapy

Stars: ✭ 44 (-99.12%)

Mutual labels: spider, scrapy

Scrapy-Spiders

A Scrapy-based repository of data-collection crawlers

Stars: ✭ 34 (-99.32%)

Mutual labels: spider, scrapy

python-fxxk-spider

A collection of free Python crawler projects

Stars: ✭ 184 (-96.31%)

Mutual labels: spider, scrapy

scrapy facebooker

Collection of scrapy spiders which can scrape posts, images, and so on from public Facebook Pages.

Stars: ✭ 22 (-99.56%)

Mutual labels: spider, scrapy

python-spider

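Small Python crawler projects [continuously updated] [Biquge novel downloads, Tweet data scraping, weather lookup, NetEase Cloud Music reverse engineering, Tiantian Fund lookup, Weibo data scraping (cookie generation), Youdao Translate reverse engineering, Qichacha crawler without login, Dianping SVG encryption cracking, Bilibili user crawler, Lagou crawler without login, Ziroom rental font encryption, Zhihu Q&A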

Stars: ✭ 45 (-99.1%)

Mutual labels: spider, scrapy

crawler

A simple and flexible web crawler framework for Java.

Stars: ✭ 20 (-99.6%)

Mutual labels: crawler, spider

flink-crawler

Continuous scalable web crawler built on top of Flink and crawler-commons

Stars: ✭ 48 (-99.04%)

Mutual labels: crawler, spider

arachnod

High performance crawler for Nodejs

Stars: ✭ 17 (-99.66%)

Mutual labels: crawler, spider

douban-spider

A Douban movie crawler built on the Scrapy framework

Stars: ✭ 25 (-99.5%)

Mutual labels: spider, scrapy

scrapy-distributed

A series of distributed components for Scrapy, including RabbitMQ-based, Kafka-based, and RedisBloom-based components.

Stars: ✭ 38 (-99.24%)

Mutual labels: spider, scrapy

k8s-lemp

LEMP stack in a Kubernetes cluster

Stars: ✭ 74 (-98.52%)

Mutual labels: distributed, high-availability

ptt-web-crawler

Crawler for the web version of PTT

Stars: ✭ 20 (-99.6%)

Mutual labels: crawler, scrapy

slime

🍰 A visual crawler platform

Stars: ✭ 27 (-99.46%)

Mutual labels: crawler, spider

Scrapple

A framework for creating semi-automatic web content extractors

Stars: ✭ 464 (-90.71%)

Mutual labels: crawler, scrapy

ZhengFang System Spider

🐛 A small crawler that logs in to the ZhengFang academic administration system and scrapes data

Stars: ✭ 21 (-99.58%)

Mutual labels: crawler, spider

PttImageSpider

PTT image downloader (fetches all images from an entire board and uses article titles as folder names) (uses Scrapy)

Stars: ✭ 16 (-99.68%)

Mutual labels: spider, scrapy

Douban Crawler

A crawler for https://douban.com

Stars: ✭ 13 (-99.74%)

Mutual labels: spider, scrapy

toutiao

Crawler for the Toutiao technology news API

Stars: ✭ 17 (-99.66%)

Mutual labels: spider, scrapy

ip proxy pool

Dynamically generates spiders with Scrapy to crawl and check free proxy IPs on the internet.

Stars: ✭ 39 (-99.22%)

Mutual labels: spider, scrapy

galer

A fast tool to fetch URLs from HTML attributes by crawl-in.

Stars: ✭ 138 (-97.24%)

Mutual labels: crawler, spider

Bt Btt

Introduction to the magnet-link site U3C3 and updates on its domain names

Stars: ✭ 261 (-94.77%)

Mutual labels: crawler, spider

Happy Spiders

🔧 🔩 🔨 A curated collection of crawler-related tools, simulated-login techniques, proxy IPs, Scrapy template code, and more.

Stars: ✭ 261 (-94.77%)

Mutual labels: spider, scrapy

Gopa

[WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn

Stars: ✭ 277 (-94.45%)

Mutual labels: crawler, spider

Hacker News Digest

📰 A responsive interface of Hacker News with summaries and thumbnails.

Stars: ✭ 278 (-94.43%)

Mutual labels: crawler, spider

OpenScraper

An open source webapp for scraping: towards a public service for webscraping

Stars: ✭ 80 (-98.4%)

Mutual labels: spider, scrapy

WebCrawler

A lightweight, fast, multi-threaded, multi-pipeline web crawler with flexible configuration.

Stars: ✭ 39 (-99.22%)

Mutual labels: crawler, spider

Tieba spider

Baidu Tieba crawler (based on Scrapy and MySQL)

Stars: ✭ 257 (-94.85%)

Mutual labels: spider, scrapy

Alltheplaces

A set of spiders and scrapers to extract location information from places that post their location on the internet.

Stars: ✭ 277 (-94.45%)

Mutual labels: spider, scrapy

Dotnetspider

DotnetSpider, a .NET Standard web crawling library. It is a lightweight, efficient, and fast high-level web crawling & scraping framework.

Stars: ✭ 3,233 (-35.25%)

Mutual labels: crawler, distributed

Weixin Spider

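WeChat official account crawler: article history, article comments, and article read and "Wow" counts, with a visual web interface; deployable on Windows servers. Built on Python 3 with flask/mysql/redis/mitmproxy/pywin32 and others. An efficient WeChat crawler covering official accounts, article history, article comments, and data updates.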

Stars: ✭ 287 (-94.25%)

Mutual labels: crawler, spider

Gospider

A crawler framework implemented in Golang; users only need to care about page rules. Provides a web management interface. Built on colly.

Stars: ✭ 285 (-94.29%)

Mutual labels: crawler, spider

Redsync.go

*DEPRECATED* Please use https://gopkg.in/redsync.v1 (https://github.com/go-redsync/redsync)

Stars: ✭ 292 (-94.15%)

Mutual labels: redis, distributed

Elves

🎊 Design and implementation of a lightweight crawler framework.

Stars: ✭ 315 (-93.69%)

Mutual labels: spider, scrapy

Toapi

Every web site provides APIs.

Stars: ✭ 3,209 (-35.73%)

Mutual labels: crawler, spider

Xxl Job

A distributed task scheduling framework (XXL-JOB).

Stars: ✭ 20,197 (+304.51%)

Mutual labels: scheduler, distributed

1-60 of 2337 similar projects