A series of distributed components for Scrapy, including RabbitMQ-based, Kafka-based, and RedisBloom-based components.

Stars: ✭ 38 (+137.5%)

Mutual labels: scrapy

dspatch

The Refreshingly Simple Cross-Platform C++ Dataflow / Pipelining / Stream Processing / Reactive Programming Framework

Stars: ✭ 124 (+675%)

Mutual labels: pipelines

photo-spider-scrapy

10 photo website spiders: Scrapy spider code for 10 foreign stock-photo sites

Stars: ✭ 17 (+6.25%)

Mutual labels: scrapy

restaurant-finder-featureReviews

Build a Flask web application to help users retrieve key restaurant information and feature-based reviews (generated by applying market-basket model – Apriori algorithm and NLP on user reviews).

Stars: ✭ 21 (+31.25%)

Mutual labels: scrapy

OLX Scraper

📻 An OLX scraper using Scrapy + MongoDB. It scrapes recent ads posted for a requested product and dumps them to MongoDB (NoSQL).

Stars: ✭ 15 (-6.25%)

Mutual labels: scrapy

julia-workshop

"Integrating Julia in real-world, distributed pipelines" for JuliaCon 2017

Stars: ✭ 39 (+143.75%)

Mutual labels: pipelines

scrapy-cookies

A cookie persistence middleware for Scrapy

Stars: ✭ 19 (+18.75%)

Mutual labels: scrapy

JustDownlink

A distributed movie search engine built with Scrapy + Elasticsearch + Django

Stars: ✭ 28 (+75%)

Mutual labels: scrapy

django-slack-oauth

Handles OAuth and stores the Slack token

Stars: ✭ 51 (+218.75%)

Mutual labels: pipelines

scrapy facebooker

Collection of scrapy spiders which can scrape posts, images, and so on from public Facebook Pages.

Stars: ✭ 22 (+37.5%)

Mutual labels: scrapy

InstaBot

Simple and friendly Bot for Instagram, using Selenium and Scrapy with Python.

Stars: ✭ 32 (+100%)

Mutual labels: scrapy

python-spider

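Small Python crawler projects [continuously updated] [Biquge novel downloads, Tweet data scraping, weather lookup, NetEase Cloud Music reverse engineering, Tiantian Fund lookup, Weibo data scraping (cookie generation), Youdao Translate reverse engineering, Qichacha login-free crawler, Dianping SVG encryption cracking, Bilibili user crawler, Lagou login-free crawler, Ziroom rental font encryption, Zhihu Q&A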

Stars: ✭ 45 (+181.25%)

Mutual labels: scrapy

scraping-ebay

Scraping eBay products using the Scrapy web crawling framework

Stars: ✭ 79 (+393.75%)

Mutual labels: scrapy

devsearch

A web search engine built with Python which uses TF-IDF and PageRank to sort search results.

Stars: ✭ 52 (+225%)

Mutual labels: scrapy

ImageGrabber

A Scrapy demo: download all images from a site

Stars: ✭ 33 (+106.25%)

Mutual labels: scrapy

logparser

A tool for parsing Scrapy log files periodically and incrementally, extending the HTTP JSON API of Scrapyd.

Stars: ✭ 70 (+337.5%)

Mutual labels: scrapy

XMQ-BackUp

Xiaomiquan backup: circles, topics, images, and files.

Stars: ✭ 22 (+37.5%)

Mutual labels: scrapy

bgmtools

Small tools for Bangumi

Stars: ✭ 66 (+312.5%)

Mutual labels: scrapy

codeflare

Simplifying the definition and execution, scaling and deployment of pipelines on the cloud.

Stars: ✭ 163 (+918.75%)

Mutual labels: pipelines

IMDB-Scraper

Scrapy project for scraping data from IMDb, with a movie dataset covering 58,623 movies.

Stars: ✭ 37 (+131.25%)

Mutual labels: scrapy

allitebooks.com

Download all the ebooks from "allitebooks.com", with an indexed CSV

Stars: ✭ 24 (+50%)

Mutual labels: scrapy

Intelligent Document Finder

Document Search Engine Tool

Stars: ✭ 45 (+181.25%)

Mutual labels: scrapy

tibanna

Tibanna helps you run your genomic pipelines on Amazon cloud (AWS). It is used by the 4DN DCIC (4D Nucleome Data Coordination and Integration Center) to process data. Tibanna supports CWL/WDL (w/ docker), Snakemake (w/ conda) and custom Docker/shell command.

Stars: ✭ 61 (+281.25%)

Mutual labels: pipelines

factory

Docker microservice and crawler built with Scrapy

Stars: ✭ 56 (+250%)

Mutual labels: scrapy

scrapy xiuren

Xiuren site crawler; 55156 crawler

Stars: ✭ 43 (+168.75%)

Mutual labels: scrapy

scrapy.dart

Scrapy, a fast high-level web crawling & scraping framework for Dart and Flutter

Stars: ✭ 50 (+212.5%)

Mutual labels: scrapy

Scrapy-Spiders

A Scrapy-based code repository of data-collection crawlers

Stars: ✭ 34 (+112.5%)

Mutual labels: scrapy

Scrapy IPProxyPool

Free IP proxy pool. A plugin for the Scrapy crawler framework

Stars: ✭ 100 (+525%)

Mutual labels: scrapy

Python Master Courses

Life is short, I use Python

Stars: ✭ 61 (+281.25%)

Mutual labels: scrapy

prime-re.github.io

Open resource exchange platform for non-human primate neuroimaging

Stars: ✭ 13 (-18.75%)

Mutual labels: pipelines

NovelCrawler

A Scrapy-based crawler demo

Stars: ✭ 15 (-6.25%)

Mutual labels: scrapy

163Music

163music spider built with Scrapy.

Stars: ✭ 60 (+275%)

Mutual labels: scrapy

scrapy-admin

A Django admin site for Scrapy

Stars: ✭ 44 (+175%)

Mutual labels: scrapy

torchestrator

Spin up Tor containers and then proxy HTTP requests via these Tor instances

Stars: ✭ 32 (+100%)

Mutual labels: scrapy

gee

🏵 Gee is a tool that copies stdin to each file and to stdout. It is similar to the tee command, but with more convenience functions. It is written in Go.

Stars: ✭ 65 (+306.25%)

Mutual labels: pipelines

animecenter

The source code for animecenter

Stars: ✭ 16 (+0%)

Mutual labels: scrapy

pythonSpider

🕷️ Some Python spiders with BeautifulSoup or Scrapy

Stars: ✭ 28 (+75%)

Mutual labels: scrapy

invana-bot

A web crawler that scrapes using YAML and Python code.