
scrapy-plugins / Scrapy Crawlera

Crawlera middleware for Scrapy

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Scrapy Crawlera

Scrapoxy
Scrapoxy hides your scraper behind a cloud. It starts a pool of proxies to send your requests. Now, you can crawl without thinking about blacklisting!
Stars: ✭ 1,322 (+370.46%)
Mutual labels:  crawler, scrapy, proxy
Scrapple
A framework for creating semi-automatic web content extractors
Stars: ✭ 464 (+65.12%)
Mutual labels:  crawler, scrapy, scraping
Dotnetcrawler
DotnetCrawler is a straightforward, lightweight web crawling/scraping library with Entity Framework Core output, built on .NET Core. The library is designed like other strong crawler libraries such as WebMagic and Scrapy, but can be extended with your own custom requirements. Medium link: https://medium.com/@mehmetozkaya/creating-custom-web-crawler-with-dotnet-core-using-entity-framework-core-ec8d23f0ca7c
Stars: ✭ 100 (-64.41%)
Mutual labels:  crawler, scrapy, scraping
Easy Scraping Tutorial
Simple but useful Python web scraping tutorial code.
Stars: ✭ 583 (+107.47%)
Mutual labels:  crawler, scrapy, scraping
Marmot
💐Marmot | Web Crawler/HTTP protocol Download Package 🐭
Stars: ✭ 186 (-33.81%)
Mutual labels:  crawler, scrapy, proxy
double-agent
A test suite of common scraper detection techniques. See how detectable your scraper stack is.
Stars: ✭ 123 (-56.23%)
Mutual labels:  scraping, scrapy
RARBG-scraper
A RARBG scraper with Selenium headless browsing and CAPTCHA solving
Stars: ✭ 38 (-86.48%)
Mutual labels:  scraping, scrapy
InstaBot
Simple and friendly Bot for Instagram, using Selenium and Scrapy with Python.
Stars: ✭ 32 (-88.61%)
Mutual labels:  scraping, scrapy
proxi
Proxy pool. Finds and checks proxies, with a REST API for querying results. Can find over 25k proxies in under 5 minutes.
Stars: ✭ 32 (-88.61%)
Mutual labels:  scraping, scrapy
Filesensor
A crawler-based dynamic sensitive-file detection tool
Stars: ✭ 227 (-19.22%)
Mutual labels:  crawler, scrapy
torchestrator
Spin up Tor containers and then proxy HTTP requests via these Tor instances
Stars: ✭ 32 (-88.61%)
Mutual labels:  scraping, scrapy
scrapy facebooker
Collection of scrapy spiders which can scrape posts, images, and so on from public Facebook Pages.
Stars: ✭ 22 (-92.17%)
Mutual labels:  scraping, scrapy
Gopa
[WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn
Stars: ✭ 277 (-1.42%)
Mutual labels:  crawler, scraping
Ppspider
A web spider framework built on Puppeteer, supporting task queues and task scheduling via decorators, data persistence with NeDB/MongoDB, and data visualization with user interaction
Stars: ✭ 237 (-15.66%)
Mutual labels:  crawler, proxy
scrapy-fieldstats
A Scrapy extension to log items coverage when the spider shuts down
Stars: ✭ 17 (-93.95%)
Mutual labels:  scraping, scrapy
Ecommercecrawlers
Gitee repository: AJay13/ECommerceCrawlers; GitHub repository: DropsDevopsOrg/ECommerceCrawlers; project showcase: http://wechat.doonsec.com
Stars: ✭ 3,073 (+993.59%)
Mutual labels:  crawler, scrapy
scrapy-distributed
A series of distributed components for Scrapy. Including RabbitMQ-based components, Kafka-based components, and RedisBloom-based components for Scrapy.
Stars: ✭ 38 (-86.48%)
Mutual labels:  scraping, scrapy
ptt-web-crawler
A crawler for the web version of PTT
Stars: ✭ 20 (-92.88%)
Mutual labels:  crawler, scrapy
papercut
Papercut is a scraping/crawling library for Node.js built on top of JSDOM. It provides basic selector features together with features like Page Caching and Geosearch.
Stars: ✭ 15 (-94.66%)
Mutual labels:  crawler, scraping
memes-api
API for scraping common meme sites
Stars: ✭ 17 (-93.95%)
Mutual labels:  scraping, scrapy

scrapy-crawlera
===============

.. image:: https://img.shields.io/pypi/v/scrapy-crawlera.svg
   :target: https://pypi.python.org/pypi/scrapy-crawlera
   :alt: PyPI Version

.. image:: https://travis-ci.org/scrapy-plugins/scrapy-crawlera.svg?branch=master
   :target: http://travis-ci.org/scrapy-plugins/scrapy-crawlera
   :alt: Build Status

.. image:: http://codecov.io/github/scrapy-plugins/scrapy-crawlera/coverage.svg?branch=master
   :target: http://codecov.io/github/scrapy-plugins/scrapy-crawlera?branch=master
   :alt: Code Coverage

scrapy-crawlera provides easy use of `Crawlera <http://scrapinghub.com/crawlera>`_ with Scrapy.

Requirements
------------

* Python 2.7 or Python 3.4+
* Scrapy

Installation
------------

You can install scrapy-crawlera using pip::

    pip install scrapy-crawlera
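
Once installed, the middleware is enabled through your project's Scrapy settings. A minimal ``settings.py`` sketch (the middleware path and setting names follow the scrapy-crawlera documentation; the API key below is a placeholder)::

    # settings.py -- enable the Crawlera middleware for all spiders in the project
    DOWNLOADER_MIDDLEWARES = {
        'scrapy_crawlera.CrawleraMiddleware': 610,
    }

    CRAWLERA_ENABLED = True
    CRAWLERA_APIKEY = '<your Crawlera API key>'  # placeholder -- use your own key

Individual spiders can also opt in or out of Crawlera; see the documentation below for the per-spider settings.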

Documentation
-------------

Documentation is available online at https://scrapy-crawlera.readthedocs.io/ and in the ``docs`` directory.
