2016 open source projects that are alternatives to or similar to Scrapy Cluster

Haipproxy
💖 Highly available distributed IP proxy pool, powered by Scrapy and Redis
Stars: ✭ 4,993 (+442.13%)
Mutual labels:  scrapy, redis, distributed
Jeesuite Libs
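A development toolkit for distributed architectures. Includes caching (first- and second-level caches, automatic cache management), queues, distributed scheduled tasks, file services (Qiniu, Alibaba Cloud OSS, fastDFS), logging, search, distributed locks, distributed transactions, Dubbo integration, Spring Boot support, and common utility packages.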
Stars: ✭ 584 (-36.59%)
Mutual labels:  kafka, redis, distributed
Scrapy Redis
Redis-based components for Scrapy.
Stars: ✭ 4,998 (+442.67%)
Mutual labels:  scrapy, redis, distributed
Dataengineeringproject
Example end to end data engineering project.
Stars: ✭ 82 (-91.1%)
Mutual labels:  kafka, scraping, redis
torchestrator
Spin up Tor containers and then proxy HTTP requests via these Tor instances
Stars: ✭ 32 (-96.53%)
Mutual labels:  scraping, scrapy
Szt Bigdata
Shenzhen Metro big data passenger flow analysis system 🚇🚄🌟
Stars: ✭ 826 (-10.31%)
Mutual labels:  kafka, redis
proxi
Proxy pool. Finds and checks proxies, with a REST API for querying results. Can find over 25k proxies in under 5 minutes.
Stars: ✭ 32 (-96.53%)
Mutual labels:  scraping, scrapy
Redsync.go
*DEPRECATED* Please use https://gopkg.in/redsync.v1 (https://github.com/go-redsync/redsync)
Stars: ✭ 292 (-68.3%)
Mutual labels:  redis, distributed
Surging
Surging is a micro-service engine that provides a lightweight, high-performance, modular RPC request pipeline. The service engine supports HTTP, TCP, WS, gRPC, Thrift, MQTT, UDP, and DNS protocols. It uses ZooKeeper and Consul as a registry, and integrates hash, random, polling, and fair polling as load balancing algorithms, built-in service gove…
Stars: ✭ 3,088 (+235.29%)
Mutual labels:  kafka, redis
Zenko
Zenko is the open source multi-cloud data controller: own and keep control of your data on any cloud.
Stars: ✭ 353 (-61.67%)
Mutual labels:  kafka, redis
Post Tuto Deployment
Build and deploy a machine learning app from scratch 🚀
Stars: ✭ 368 (-60.04%)
Mutual labels:  scrapy, scraping
Spring Cloud Shop
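A distributed e-commerce project built on Spring Cloud, aiming to be a top-tier multi-module, highly available, highly scalable e-commerce project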
Stars: ✭ 248 (-73.07%)
Mutual labels:  kafka, redis
double-agent
A test suite of common scraper detection techniques. See how detectable your scraper stack is.
Stars: ✭ 123 (-86.64%)
Mutual labels:  scraping, scrapy
InstaBot
Simple and friendly Bot for Instagram, using Selenium and Scrapy with Python.
Stars: ✭ 32 (-96.53%)
Mutual labels:  scraping, scrapy
scrapy-kafka-redis
Distributed crawling/scraping, Kafka And Redis based components for Scrapy
Stars: ✭ 45 (-95.11%)
Mutual labels:  distributed, scrapy
ARGUS
ARGUS is an easy-to-use web scraping tool. The program is based on the Scrapy Python framework and is able to crawl a broad range of different websites. On the websites, ARGUS is able to perform tasks like scraping texts or collecting hyperlinks between websites. See: https://link.springer.com/article/10.1007/s11192-020-03726-9
Stars: ✭ 68 (-92.62%)
Mutual labels:  scraping, scrapy
Scrapy Crawlera
Crawlera middleware for Scrapy
Stars: ✭ 281 (-69.49%)
Mutual labels:  scrapy, scraping
Springboot Learning
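Built with Gradle; Spring Boot applications across a range of scenarios, including message middleware integration, front-end/back-end separation, databases, caching, distributed locks, and distributed transactions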
Stars: ✭ 340 (-63.08%)
Mutual labels:  kafka, redis
scrapy facebooker
Collection of scrapy spiders which can scrape posts, images, and so on from public Facebook Pages.
Stars: ✭ 22 (-97.61%)
Mutual labels:  scraping, scrapy
Cookbook
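🎉🎉🎉 Senior Java architect technology stack == any skill can be mastered through "deliberate practice"; just like cooking, here is a Java development handbook, and all you need to do is practice more. 🏃🏃🏃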
Stars: ✭ 428 (-53.53%)
Mutual labels:  kafka, redis
Spring Samples For All
Common integration examples for spring, spring-boot, and spring-cloud
Stars: ✭ 401 (-56.46%)
Mutual labels:  kafka, redis
Spring Boot Study
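Hands-on source code for the Spring Boot framework (updated to a Spring Boot 2 implementation): basic usage, REST, controllers, event listeners, MySQL database connections, JPA, Redis integration, MyBatis integration (both declarative and XML approaches, with the corresponding create, delete, query, and update functions), log handling, devtools configuration, interceptor usage, reading resource configuration, test integration, request mapping in the web layer, security authentication, RabbitMQ integration, Kafka integration, a distributed ID generator, and more. Real-world project: https://github.com/hemin1003/yfax-parent (already in production use)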
Stars: ✭ 440 (-52.23%)
Mutual labels:  kafka, redis
Javakeeper
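✍️ A summary of essential architecture knowledge for Java engineers: covers architectures commonly used at internet companies, such as distributed systems, microservices, and RPC, as well as essential skills such as data storage, caching, and search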
Stars: ✭ 502 (-45.49%)
Mutual labels:  kafka, redis
Testcontainers Spring Boot
Container auto-configurations for spring-boot based integration tests
Stars: ✭ 460 (-50.05%)
Mutual labels:  kafka, redis
Dnc
dnc: decentralized open source community, lightweight alliance. dncto.com, QQ group 779699538
Stars: ✭ 551 (-40.17%)
Mutual labels:  kafka, redis
Java Study
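java-study is a collection of code I recorded while learning Java! It ranges from basic Java data types, JDK 1.8 Lambda, Stream, and date usage, I/O streams, collections, multithreading, concurrent programming, sample code for the 23 design patterns, and common utility classes, to some commonly used frameworks: netty, mina, springboot, kafka, storm, zookeeper, redis, elasticsearch, hbase, hive, and so on.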
Stars: ✭ 571 (-38%)
Mutual labels:  kafka, redis
Syncclient
syncClient, a real-time data synchronization middleware (syncs mysql to kafka, redis, elasticsearch, and httpmq)!
Stars: ✭ 227 (-75.35%)
Mutual labels:  kafka, redis
Bcmall
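An e-commerce system built for teaching purposes. Covers complex ToB business, high-concurrency internet business, and caching applications; DDD and microservice guidance. Model-driven and data-driven. Learn the evolution path of large-scale services, coding techniques, Linux, and performance tuning. Docker/k8s support, monitoring, log collection, and middleware study. Front-end technologies, back-end practice, and more. Main technologies: SpringBoot+JPA+Mybatis-plus+Antd+Vue3.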
Stars: ✭ 188 (-79.59%)
Mutual labels:  kafka, redis
RARBG-scraper
With Selenium headless browsing and CAPTCHA solving
Stars: ✭ 38 (-95.87%)
Mutual labels:  scraping, scrapy
Thunder
⚡️ Nepxion Thunder is a distributed RPC framework based on Netty + Hessian + Kafka + ActiveMQ + Tibco + Zookeeper + Redis + Spring Web MVC + Spring Boot + Docker. A distributed RPC invocation framework with multiple protocols, multiple components, and multiple serialization formats
Stars: ✭ 204 (-77.85%)
Mutual labels:  kafka, redis
NScrapy
NScrapy is a .NET Core cross-platform distributed spider framework which provides an easy way to write your own spider
Stars: ✭ 88 (-90.45%)
Mutual labels:  distributed, scrapy
scrapy-fieldstats
A Scrapy extension to log items coverage when the spider shuts down
Stars: ✭ 17 (-98.15%)
Mutual labels:  scraping, scrapy
scrapy-distributed
A series of distributed components for Scrapy. Including RabbitMQ-based components, Kafka-based components, and RedisBloom-based components for Scrapy.
Stars: ✭ 38 (-95.87%)
Mutual labels:  scraping, scrapy
Tech Blog
My personal tech blog (Python, Django, Docker, Go, Redis, ElasticSearch, Kafka, Linux)
Stars: ✭ 203 (-77.96%)
Mutual labels:  kafka, redis
policy-data-analyzer
Building a model to recognize incentives for landscape restoration in environmental policies from Latin America, the US and India. Bringing NLP to the world of policy analysis through an extensible framework that includes scraping, preprocessing, active learning and text analysis pipelines.
Stars: ✭ 22 (-97.61%)
Mutual labels:  scraping, scrapy
memes-api
API for scraping common meme sites
Stars: ✭ 17 (-98.15%)
Mutual labels:  scraping, scrapy
Redisson
Redisson - Redis Java client with features of In-Memory Data Grid. Over 50 Redis based Java objects and services: Set, Multimap, SortedSet, Map, List, Queue, Deque, Semaphore, Lock, AtomicLong, Map Reduce, Publish / Subscribe, Bloom filter, Spring Cache, Tomcat, Scheduler, JCache API, Hibernate, MyBatis, RPC, local cache ...
Stars: ✭ 17,972 (+1851.36%)
Mutual labels:  redis, distributed
scrapy-zyte-smartproxy
Zyte Smart Proxy Manager (formerly Crawlera) middleware for Scrapy
Stars: ✭ 317 (-65.58%)
Mutual labels:  scraping, scrapy
Summer
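A Java game server framework that supports distributed and clustered deployment, suitable for developing board and card games, turn-based games, and the like. Uses netty for high-performance communication and supports TCP, HTTP, WebSocket, and other protocols. Supports message encryption and decryption, attack interception, and blacklist/whitelist mechanisms. Wraps the connection and use of the Redis cache and the MySQL database. Lightweight and easy to get started with.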
Stars: ✭ 336 (-63.52%)
Mutual labels:  redis, distributed
Linkedin
Linkedin Scraper using Selenium Web Driver, Chromium headless, Docker and Scrapy
Stars: ✭ 309 (-66.45%)
Mutual labels:  scrapy, scraping
Full Stack Notes
Full-stack engineer's handbook
Stars: ✭ 366 (-60.26%)
Mutual labels:  kafka, redis
Voik
♒︎ [WIP] An experimental ~distributed~ commit-log
Stars: ✭ 200 (-78.28%)
Mutual labels:  kafka, distributed
Gnomock
Test your code without writing mocks with ephemeral Docker containers 📦 Setup popular services with just a couple lines of code ⏱️ No bash, no yaml, only code 💻
Stars: ✭ 398 (-56.79%)
Mutual labels:  kafka, redis
Spiderman
A general-purpose distributed crawler framework based on scrapy-redis
Stars: ✭ 392 (-57.44%)
Mutual labels:  kafka, scrapy
Springbootexamples
Spring Boot learning tutorials
Stars: ✭ 794 (-13.79%)
Mutual labels:  kafka, redis
Kafka Connect Ui
Web tool for Kafka Connect
Stars: ✭ 388 (-57.87%)
Mutual labels:  kafka, redis
Scrapple
A framework for creating semi-automatic web content extractors
Stars: ✭ 464 (-49.62%)
Mutual labels:  scrapy, scraping
Go Streams
A lightweight stream processing library for Go
Stars: ✭ 615 (-33.22%)
Mutual labels:  kafka, redis
Workflow
C++ Parallel Computing and Asynchronous Networking Engine
Stars: ✭ 6,680 (+625.3%)
Mutual labels:  kafka, redis
Python Spider
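Douban Movie top 250; crawling JSON data and pictures of beautiful women from Douyu; Taobao; Youyuan; using CrawlSpider to crawl some basic profile information from the Hongniang matchmaking site, plus distributed crawling of Hongniang with storage in Redis; small crawler demos; Selenium; crawling Duodian; developing APIs with Django; crawling Youyuan profile information; simulated login to Zhihu, GitHub, and Tuchong; crawling the entire Duodian store site; crawling the article history of WeChat official accounts; crawling articles shared in WeChat groups or by WeChat friends; using itchat to monitor articles shared by specified WeChat official accounts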
Stars: ✭ 615 (-33.22%)
Mutual labels:  scrapy, redis
Freestyle
A cohesive & pragmatic framework of FP centric Scala libraries
Stars: ✭ 627 (-31.92%)
Mutual labels:  kafka, redis
Goodskill
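🐂 A simulated flash-sale project built on springcloud + dubbo, with a modular design, integrating common open source components such as database and table sharding, elasticsearch🔍, gateway, mybatis-plus, and spring-session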
Stars: ✭ 786 (-14.66%)
Mutual labels:  kafka, redis
Funpyspidersearchengine
Word2vec per-user personalized search + Scrapy2.3.0 (crawls data) + ElasticSearch7.9.1 (stores data and exposes a RESTful API) + Django3.1.1 search
Stars: ✭ 782 (-15.09%)
Mutual labels:  scrapy, redis
Easy Scraping Tutorial
Simple but useful Python web scraping tutorial code.
Stars: ✭ 583 (-36.7%)
Mutual labels:  scrapy, scraping
Arq
Fast job queuing and RPC in python with asyncio and redis.
Stars: ✭ 695 (-24.54%)
Mutual labels:  redis, distributed
Istio Micro
istio microservice sample code: grpc+protobuf+echo+websocket+mysql+redis+kafka+docker-compose
Stars: ✭ 194 (-78.94%)
Mutual labels:  kafka, redis
Firecamp
Serverless Platform for the stateful services
Stars: ✭ 194 (-78.94%)
Mutual labels:  kafka, redis
Redislock
Simplified distributed locking implementation using Redis
Stars: ✭ 370 (-59.83%)
Mutual labels:  redis, distributed
Netdiscovery
NetDiscovery is a general-purpose crawler framework/middleware built on frameworks such as Vert.x and RxJava 2.
Stars: ✭ 573 (-37.79%)
Mutual labels:  kafka, redis
Redlock Php
Redis distributed locks in PHP
Stars: ✭ 651 (-29.32%)
Mutual labels:  redis, distributed