582 open source projects that are alternatives to or similar to scrapy-kafka-redis

Scrapy Cluster
This Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster.
Stars: ✭ 921 (+1946.67%)
Mutual labels:  distributed, scrapy
Scrapy Redis
Redis-based components for Scrapy.
Stars: ✭ 4,998 (+11006.67%)
Mutual labels:  distributed, scrapy
NScrapy
NScrapy is a .NET Core cross-platform distributed spider framework that provides an easy way to write your own spider
Stars: ✭ 88 (+95.56%)
Mutual labels:  distributed, scrapy
Haipproxy
💖 Highly available distributed IP proxy pool, powered by Scrapy and Redis
Stars: ✭ 4,993 (+10995.56%)
Mutual labels:  distributed, scrapy
Gerapy
Distributed Crawler Management Framework Based on Scrapy, Scrapyd, Django and Vue.js
Stars: ✭ 2,601 (+5680%)
Mutual labels:  distributed, scrapy
go-cita
A Go implementation of CITA. https://docs.nervos.org/cita
Stars: ✭ 25 (-44.44%)
Mutual labels:  distributed
meesee
Task queue, Long lived workers for work based parallelization, with processes and Redis as back-end. For distributed computing.
Stars: ✭ 14 (-68.89%)
Mutual labels:  distributed
intelli-swift-core
Distributed, Column-oriented storage, Realtime analysis, High performance Database
Stars: ✭ 17 (-62.22%)
Mutual labels:  distributed
pooljs
Browser computing unleashed!
Stars: ✭ 17 (-62.22%)
Mutual labels:  distributed
itemadapter
Common interface for data container classes
Stars: ✭ 47 (+4.44%)
Mutual labels:  scrapy
goimpulse
Highly available, high-performance distributed ID generation service
Stars: ✭ 17 (-62.22%)
Mutual labels:  distributed
dist-framework
A prototype for distributed training/validation/evaluation/extraction with PyTorch.
Stars: ✭ 14 (-68.89%)
Mutual labels:  distributed
GraviT
GraviT is a distributed ray tracing framework that enables applications to leverage hardware-optimized ray tracers within a single environment across many nodes for large-scale rendering tasks.
Stars: ✭ 18 (-60%)
Mutual labels:  distributed
sprawl
Alpha implementation of the Sprawl distributed marketplace protocol.
Stars: ✭ 27 (-40%)
Mutual labels:  distributed
Web-Iota
Iota is a web scraper which can find all of the images and links/suburls on a webpage
Stars: ✭ 60 (+33.33%)
Mutual labels:  scrapy
blockchain-hackathon
An electronic health record (EHR) system built on Hyperledger Composer blockchain
Stars: ✭ 67 (+48.89%)
Mutual labels:  distributed
p2p-project
A peer-to-peer networking framework to work across languages
Stars: ✭ 68 (+51.11%)
Mutual labels:  distributed
DemonHunter
Distributed Honeypot
Stars: ✭ 54 (+20%)
Mutual labels:  distributed
orbit-db-cli
CLI for orbit-db
Stars: ✭ 60 (+33.33%)
Mutual labels:  distributed
scrapy-mysql-pipeline
scrapy mysql pipeline
Stars: ✭ 47 (+4.44%)
Mutual labels:  scrapy
simplx
C++ development framework for building reliable cache-friendly distributed and concurrent multicore software
Stars: ✭ 61 (+35.56%)
Mutual labels:  distributed
Scrape-Finance-Data
My code for scraping financial data in Vietnam
Stars: ✭ 13 (-71.11%)
Mutual labels:  scrapy
arche
Analyze scraped data
Stars: ✭ 49 (+8.89%)
Mutual labels:  scrapy
tool-db
A peer-to-peer decentralized database
Stars: ✭ 15 (-66.67%)
Mutual labels:  distributed
dask-sql
Distributed SQL Engine in Python using Dask
Stars: ✭ 271 (+502.22%)
Mutual labels:  distributed
scrapy-LBC
LeBonCoin spider with Scrapy and ElasticSearch
Stars: ✭ 14 (-68.89%)
Mutual labels:  scrapy
majordodo
Distributed Operations and Data Organizer built on Apache BookKeeper
Stars: ✭ 25 (-44.44%)
Mutual labels:  distributed
FedScale
FedScale is a scalable and extensible open-source federated learning (FL) platform.
Stars: ✭ 274 (+508.89%)
Mutual labels:  distributed
heat
Distributed tensors and Machine Learning framework with GPU and MPI acceleration in Python
Stars: ✭ 127 (+182.22%)
Mutual labels:  distributed
asyncpy
Lightweight asynchronous coroutine-based web crawler framework built with asyncio and aiohttp
Stars: ✭ 86 (+91.11%)
Mutual labels:  scrapy
fernando-pessoa
Classifier of Fernando Pessoa's poems according to his heteronyms
Stars: ✭ 31 (-31.11%)
Mutual labels:  scrapy
xmutca-rpc
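Xmutca-rpc is a distributed service framework built on Netty that provides stable, high-performance RPC remote service invocation, supports a registry center, service governance, load balancing, and other features, and works out of the box.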
Stars: ✭ 18 (-60%)
Mutual labels:  distributed
ArticleSpider
Crawling zhihu, jobbole, lagou by Scrapy, and using Elasticsearch+Django to build a Search Engine website --- README_zh.md (including: implementation roadmap, distributed-crawler and coping with anti-crawling strategies).
Stars: ✭ 34 (-24.44%)
Mutual labels:  scrapy
elfo
Your next actor system
Stars: ✭ 38 (-15.56%)
Mutual labels:  distributed
Credits
Credits(CRDS) - An Evolving Currency For An Evolving Society
Stars: ✭ 14 (-68.89%)
Mutual labels:  distributed
scrapy helper
Dynamically configurable crawler
Stars: ✭ 84 (+86.67%)
Mutual labels:  scrapy
toy-rpc
A distributed RPC framework in Java, implemented with Netty, Protostuff, and Zookeeper
Stars: ✭ 55 (+22.22%)
Mutual labels:  distributed
soundstorm
The Federated Social Audio Platform
Stars: ✭ 26 (-42.22%)
Mutual labels:  distributed
Galaxy
Galaxy is an asynchronous parallel visualization ray tracer for performant rendering in distributed computing environments. Galaxy builds upon Intel OSPRay and Intel Embree, including ray queueing and sending logic inspired by TACC GraviT.
Stars: ✭ 18 (-60%)
Mutual labels:  distributed
scrapy-rotated-proxy
A scrapy middleware to use rotated proxy ip list.
Stars: ✭ 22 (-51.11%)
Mutual labels:  scrapy
rockgo
A game server framework under development, based on Entity Component System (ECS).
Stars: ✭ 617 (+1271.11%)
Mutual labels:  distributed
pytorch-distributed
Ape-X DQN & DDPG with pytorch & tensorboard
Stars: ✭ 98 (+117.78%)
Mutual labels:  distributed
Inventus
Inventus is a spider designed to find subdomains of a specific domain by crawling it and any subdomains it discovers.
Stars: ✭ 80 (+77.78%)
Mutual labels:  scrapy
zlimiter
A toolkit for rate limiting, with support for in-memory and Redis backends
Stars: ✭ 17 (-62.22%)
Mutual labels:  distributed
double-agent
A test suite of common scraper detection techniques. See how detectable your scraper stack is.
Stars: ✭ 123 (+173.33%)
Mutual labels:  scrapy
Scrapy-tripadvisor-reviews
Using scrapy to scrape tripadvisor in order to get users' reviews.
Stars: ✭ 24 (-46.67%)
Mutual labels:  scrapy
erl dist
Rust Implementation of Erlang Distribution Protocol
Stars: ✭ 110 (+144.44%)
Mutual labels:  distributed
Distributed-ResNet-Tensorflow
A Distributed ResNet on multi-machines each with one GPU card.
Stars: ✭ 20 (-55.56%)
Mutual labels:  distributed
WeIdentity
Blockchain-based distributed identity solution compliant with the W3C DID and Verifiable Credential specifications
Stars: ✭ 1,063 (+2262.22%)
Mutual labels:  distributed
lgcrawl
Crawls job postings across the entire Lagou site using Python, Scrapy, and Splash
Stars: ✭ 22 (-51.11%)
Mutual labels:  scrapy
scrapy-html-storage
Scrapy downloader middleware that stores response HTMLs to disk.
Stars: ✭ 17 (-62.22%)
Mutual labels:  scrapy
pagser
Pagser is a simple, extensible, configurable library for parsing and deserializing HTML pages into structs, based on goquery and struct tags, for Golang crawlers
Stars: ✭ 82 (+82.22%)
Mutual labels:  scrapy
tips
TiKV based Pub/Sub server
Stars: ✭ 31 (-31.11%)
Mutual labels:  distributed
domains
World’s single largest Internet domains dataset
Stars: ✭ 461 (+924.44%)
Mutual labels:  scrapy
osilo
Personal data silos with secure sharing
Stars: ✭ 15 (-66.67%)
Mutual labels:  distributed
hazelcast-csharp-client
Hazelcast .NET Client
Stars: ✭ 98 (+117.78%)
Mutual labels:  distributed
vietnam-ecommerce-crawler
Crawling the data from lazada, websosanh, compare.vn, cdiscount and cungmua with flexible configs
Stars: ✭ 28 (-37.78%)
Mutual labels:  scrapy
spicedb
Open Source, Google Zanzibar-inspired fine-grained permissions database
Stars: ✭ 3,358 (+7362.22%)
Mutual labels:  distributed
webhunger
WebHunger is an extensible, full-scale crawler framework that supports distributed crawling, letting users focus on web page parsing without worrying about the crawling process.
Stars: ✭ 17 (-62.22%)
Mutual labels:  distributed
FastNN
FastNN provides distributed training examples that use EPL.
Stars: ✭ 79 (+75.56%)
Mutual labels:  distributed