scrapinghub / Spidermon

Licence: bsd-3-clause
Scrapy extension for monitoring spider execution.

Programming Languages

python

Projects that are alternatives of or similar to Spidermon

Ferret
Declarative web scraping
Stars: ✭ 4,837 (+1465.37%)
Mutual labels:  hacktoberfest, scraping, crawling
Scrapy
Scrapy, a fast high-level web crawling & scraping framework for Python.
Stars: ✭ 42,343 (+13603.24%)
Mutual labels:  hacktoberfest, scraping, crawling
Openitcockpit
openITCOCKPIT is an Open Source system monitoring tool built for different monitoring engines like Nagios, Naemon and Prometheus.
Stars: ✭ 108 (-65.05%)
Mutual labels:  hacktoberfest, monitoring, monitoring-tool
scrapy-distributed
A series of distributed components for Scrapy. Including RabbitMQ-based components, Kafka-based components, and RedisBloom-based components for Scrapy.
Stars: ✭ 38 (-87.7%)
Mutual labels:  scraping, crawling
proxycrawl-python
ProxyCrawl Python library for scraping and crawling
Stars: ✭ 51 (-83.5%)
Mutual labels:  scraping, crawling
wget-lua
Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.
Stars: ✭ 52 (-83.17%)
Mutual labels:  scraping, crawling
diffbot-php-client
[Deprecated - Maintenance mode - use APIs directly please!] The official Diffbot client library
Stars: ✭ 53 (-82.85%)
Mutual labels:  scraping, crawling
bots-zoo
No description or website provided.
Stars: ✭ 59 (-80.91%)
Mutual labels:  scraping, crawling
feedsearch-crawler
Crawl sites for RSS, Atom, and JSON feeds.
Stars: ✭ 23 (-92.56%)
Mutual labels:  scraping, crawling
ARGUS
ARGUS is an easy-to-use web scraping tool. The program is based on the Scrapy Python framework and is able to crawl a broad range of different websites. On the websites, ARGUS is able to perform tasks like scraping texts or collecting hyperlinks between websites. See: https://link.springer.com/article/10.1007/s11192-020-03726-9
Stars: ✭ 68 (-77.99%)
Mutual labels:  scraping, crawling
Gopa
[WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn
Stars: ✭ 277 (-10.36%)
Mutual labels:  scraping, crawling
zcrawl
An open source web crawling platform
Stars: ✭ 21 (-93.2%)
Mutual labels:  scraping, crawling
crawling-framework
Easily crawl news portals or blog sites using Storm Crawler.
Stars: ✭ 22 (-92.88%)
Mutual labels:  scraping, crawling
go-scrapy
Web crawling and scraping framework for Golang
Stars: ✭ 17 (-94.5%)
Mutual labels:  scraping, crawling
scrapy-fieldstats
A Scrapy extension to log items coverage when the spider shuts down
Stars: ✭ 17 (-94.5%)
Mutual labels:  scraping, crawling
pomp
Screen scraping and web crawling framework
Stars: ✭ 61 (-80.26%)
Mutual labels:  scraping, crawling
Apify Js
Apify SDK — The scalable web scraping and crawling library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.
Stars: ✭ 3,154 (+920.71%)
Mutual labels:  scraping, crawling
Stopstalk Deployment
Stop stalking and start StopStalking 😉
Stars: ✭ 276 (-10.68%)
Mutual labels:  hacktoberfest, crawling
Static status
🚦Bash script to generate a static status page.
Stars: ✭ 286 (-7.44%)
Mutual labels:  monitoring, monitoring-tool
double-agent
A test suite of common scraper detection techniques. See how detectable your scraper stack is.
Stars: ✭ 123 (-60.19%)
Mutual labels:  scraping, crawling

=========
Spidermon
=========

.. image:: https://img.shields.io/travis/scrapinghub/spidermon/master
    :target: https://travis-ci.org/scrapinghub/spidermon
    :alt: travis build status master branch

.. image:: https://img.shields.io/codecov/c/github/scrapinghub/spidermon.svg
    :target: http://codecov.io/github/scrapinghub/spidermon?branch=master
    :alt: Coverage report

.. image:: https://img.shields.io/pypi/v/spidermon.svg
    :target: https://pypi.python.org/pypi/spidermon
    :alt: pypi version

.. image:: https://img.shields.io/pypi/l/spidermon.svg
    :target: https://github.com/scrapinghub/spidermon/blob/master/LICENSE
    :alt: licence

.. image:: https://img.shields.io/pypi/pyversions/spidermon.svg
    :target: https://pypi.python.org/pypi/spidermon
    :alt: python versions

.. image:: https://img.shields.io/badge/code%20style-black-000000.svg
    :target: https://github.com/ambv/black
    :alt: Code style: black

Overview
========

Spidermon is an extension for Scrapy spiders. It provides tools for data validation, stats monitoring, and notification messages, so you can leave the monitoring to Spidermon and simply review the reports and notifications it produces.
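
To make "stats monitoring" concrete, the sketch below shows the kind of check a monitor performs once a crawl finishes. It is a plain-Python illustration, not Spidermon's API: the function ``check_stats``, its thresholds, and the sample values are hypothetical, while ``item_scraped_count`` and ``log_count/ERROR`` are stat keys that Scrapy itself collects.

```python
# Plain-Python illustration of a stats check; this is NOT Spidermon's API.
from typing import Dict, List


def check_stats(
    stats: Dict[str, int], min_items: int = 10, max_errors: int = 0
) -> List[str]:
    """Return human-readable failures for one finished crawl (hypothetical helper)."""
    failures = []
    items = stats.get("item_scraped_count", 0)
    errors = stats.get("log_count/ERROR", 0)
    if items < min_items:
        failures.append(f"Extracted {items} items, expected at least {min_items}")
    if errors > max_errors:
        failures.append(f"Logged {errors} errors, expected at most {max_errors}")
    return failures


# Hypothetical stats from a finished crawl.
sample_stats = {"item_scraped_count": 4, "log_count/ERROR": 2}
for failure in check_stats(sample_stats):
    print(failure)
```

In Spidermon itself, checks like these are written as monitor classes and their failures can trigger notifications; see the documentation for the actual interfaces.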

Requirements
============

* Python 3.6, Python 3.7, Python 3.8 or Python 3.9

Install
=======

The quick way::

    pip install spidermon

For more details see the install section in the documentation: https://spidermon.readthedocs.io/en/latest/installation.html
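
Installing the package does not by itself activate it in a Scrapy project. A minimal sketch of the additions to a project's ``settings.py`` follows, assuming the setting name, extension path, and priority value given in the Spidermon documentation at the time of writing; confirm them against the installation guide for your version.

```python
# settings.py (sketch; names assumed from the Spidermon documentation)

# Turn Spidermon on for this Scrapy project.
SPIDERMON_ENABLED = True

# Register the Spidermon extension with Scrapy. The priority value (500)
# is the one used in the documentation examples, not a requirement.
EXTENSIONS = {
    "spidermon.contrib.scrapy.extensions.Spidermon": 500,
}
```

If the project already defines ``EXTENSIONS``, add the Spidermon entry to the existing dictionary rather than replacing it.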

Documentation
=============

Documentation is available online at https://spidermon.readthedocs.io/ and in the docs directory.
