my8100 / Scrapydweb

Licence: gpl-3.0
Web app for Scrapyd cluster management, Scrapy log analysis & visualization, auto packaging, timer tasks, monitor & alert, and mobile UI.

ScrapydWeb: Web app for Scrapyd cluster management, with support for Scrapy log analysis & visualization.

Scrapyd ScrapydWeb LogParser

📖 Recommended Reading

🔗 How to efficiently manage your distributed web scraping projects

🔗 How to set up Scrapyd cluster on Heroku

👀 Demo

🔗 scrapydweb.herokuapp.com

Features

  • 💠 Scrapyd Cluster Management

    • 💯 All Scrapyd JSON API Supported
    • ☑️ Group, filter and select any number of nodes
    • 🖱️ Execute commands on multiple nodes with just a few clicks
  • 🔍 Scrapy Log Analysis

    • 📊 Stats collection
    • 📈 Progress visualization
    • 📑 Logs categorization
  • 🔋 Enhancements

    • 📦 Auto packaging
    • 🕵️‍♂️ Integrated with 🔗 LogParser
    • Timer tasks
    • 📧 Monitor & Alert
    • 📱 Mobile UI
    • 🔐 Basic auth for web UI
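
The cluster management features above build on Scrapyd's HTTP JSON API. As a rough sketch (not ScrapydWeb's own code), the snippet below shows what running one command on several nodes amounts to: one schedule.json request per node. The node addresses, project name, and spider name are placeholders, and build_schedule_requests is a hypothetical helper introduced only for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_schedule_requests(nodes, project, spider):
    """Build one POST request to Scrapyd's schedule.json per node.

    `nodes` is a list of "host:port" strings (placeholders in this sketch).
    """
    body = urlencode({"project": project, "spider": spider}).encode("utf-8")
    return [
        Request(f"http://{node}/schedule.json", data=body, method="POST")
        for node in nodes
    ]


if __name__ == "__main__":
    # Hypothetical nodes; nothing is sent until urlopen() is called on a request.
    for req in build_schedule_requests(
        ["127.0.0.1:6800", "127.0.0.1:6801"], "myproject", "myspider"
    ):
        print(req.get_method(), req.full_url)
```

Passing each request to urllib.request.urlopen would actually schedule the spider on that node; the sketch stops short of that so it can be read without a running cluster.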

💻 Getting Started

⚠️ Prerequisites

Make sure that 🔗 Scrapyd has been installed and started on all of your hosts.
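
One way to confirm that a host is ready is to query Scrapyd's daemonstatus.json endpoint. The following is a minimal sketch using only the standard library; the helper names, the default port 6800, and the timeout are assumptions chosen for illustration.

```python
import json
from urllib.request import urlopen


def daemon_status_url(host, port=6800):
    """Return the daemonstatus.json URL for one Scrapyd host."""
    return f"http://{host}:{port}/daemonstatus.json"


def is_healthy(payload):
    """Return True if a daemonstatus.json response body reports status "ok"."""
    try:
        return json.loads(payload).get("status") == "ok"
    except (ValueError, AttributeError):
        # Not JSON, or JSON that is not an object.
        return False


def check_scrapyd(host, port=6800, timeout=5):
    """Return True if Scrapyd answers on host:port and reports status "ok"."""
    try:
        with urlopen(daemon_status_url(host, port), timeout=timeout) as resp:
            return is_healthy(resp.read())
    except OSError:
        # Connection refused, DNS failure, timeout, and similar network errors.
        return False
```

Calling check_scrapyd("127.0.0.1") on each host before starting ScrapydWeb gives a quick yes/no answer per node.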

‼️ Note that for remote access, you must manually set 'bind_address = 0.0.0.0' in 🔗 the configuration file of Scrapyd and then restart Scrapyd so that it is reachable from other hosts.
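
The change can also be scripted. The sketch below rewrites the text of a Scrapyd configuration file so that the [scrapyd] section contains bind_address = 0.0.0.0. It is an illustration only: enable_remote_access is a hypothetical helper, the location of your configuration file is not assumed, and configparser discards any comments in the file, so editing by hand may be preferable.

```python
import configparser
import io


def enable_remote_access(conf_text):
    """Return config text with bind_address = 0.0.0.0 in the [scrapyd] section."""
    # interpolation=None keeps any literal "%" in existing values untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    if not parser.has_section("scrapyd"):
        parser.add_section("scrapyd")
    parser.set("scrapyd", "bind_address", "0.0.0.0")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


if __name__ == "__main__":
    original = "[scrapyd]\nbind_address = 127.0.0.1\nhttp_port = 6800\n"
    print(enable_remote_access(original))
```

Scrapyd still has to be restarted after the file is written back for the new address to take effect.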

⬇️ Install

  • Use pip:
pip install scrapydweb

Note that you may need to run python -m pip install --upgrade pip first in order to get the latest version of scrapydweb. Alternatively, download the tar.gz file from https://pypi.org/project/scrapydweb/#files and install it via pip install scrapydweb-x.x.x.tar.gz

  • Use git:
pip install --upgrade git+https://github.com/my8100/scrapydweb.git

Or:

git clone https://github.com/my8100/scrapydweb.git
cd scrapydweb
python setup.py install
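
To see which version ended up installed, whichever method was used, a short check with the standard library is enough. This is a sketch that assumes Python 3.8 or later (for importlib.metadata); installed_version is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="scrapydweb"):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print(found if found else "scrapydweb is not installed in this environment")
```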

▶️ Start

  1. Start ScrapydWeb with the command scrapydweb. (On the first startup, a config file is generated for customizing settings.)
  2. Visit http://127.0.0.1:5000 (It's recommended to use Google Chrome for a better experience.)
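
The generated config file is a Python module, so customizing it means editing plain assignments. The fragment below sketches the kind of values one might set. The option names are recalled from ScrapydWeb's default settings and should be checked against the file generated on your machine; the addresses and credentials are placeholders.

```python
# Sketch of values one might set in the generated settings file.
# Option names are assumptions recalled from ScrapydWeb's defaults;
# confirm them against your generated file before relying on them.

# Address and port the ScrapydWeb UI listens on.
SCRAPYDWEB_BIND = "0.0.0.0"
SCRAPYDWEB_PORT = 5000

# Basic auth for the web UI (placeholder credentials).
ENABLE_AUTH = True
USERNAME = "admin"
PASSWORD = "change-me"

# Scrapyd nodes to manage, as "host:port" strings (placeholder addresses).
SCRAPYD_SERVERS = [
    "127.0.0.1:6800",
    "10.0.0.2:6800",
]
```

Restart ScrapydWeb after editing the file so that the new values are picked up.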

🌐 Browser Support

The latest versions of Google Chrome, Firefox, and Safari.

✔️ Running the tests

$ git clone https://github.com/my8100/scrapydweb.git
$ cd scrapydweb

# To create isolated Python environments
$ pip install virtualenv
$ virtualenv venv/scrapydweb
# Or specify your Python interpreter: $ virtualenv -p /usr/local/bin/python3.7 venv/scrapydweb
$ source venv/scrapydweb/bin/activate

# Install dependent libraries
(scrapydweb) $ python setup.py install
(scrapydweb) $ pip install pytest
(scrapydweb) $ pip install coverage

# Make sure Scrapyd has been installed and started, then update the custom_settings item in tests/conftest.py
(scrapydweb) $ vi tests/conftest.py
(scrapydweb) $ curl http://127.0.0.1:6800

# '-x': stop on first failure
(scrapydweb) $ coverage run --source=scrapydweb -m pytest tests/test_a_factory.py -s -vv -x
(scrapydweb) $ coverage run --source=scrapydweb -m pytest tests -s -vv --disable-warnings
(scrapydweb) $ coverage report
# Create an HTML report, then open htmlcov/index.html
(scrapydweb) $ coverage html

🏗️ Built With

📋 Changelog

Detailed changes for each release are documented in the 🔗 HISTORY.md.

👨‍💻 Author


my8100

👥 Contributors


Kaisla

©️ License

This project is licensed under the GNU General Public License v3.0 - see the 🔗 LICENSE file for details.
