scrapy-distributedA series of distributed components for Scrapy, including RabbitMQ-based, Kafka-based, and RedisBloom-based components.
Stars: ✭ 38 (-93.09%)
scrapy-fieldstatsA Scrapy extension that logs item field coverage when the spider shuts down
Stars: ✭ 17 (-96.91%)
Pdf downloaderA Scrapy Spider for downloading PDF files from a webpage.
Stars: ✭ 18 (-96.73%)
InstaBotSimple and friendly Bot for Instagram, using Selenium and Scrapy with Python.
Stars: ✭ 32 (-94.18%)
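Python3 SpiderPython web scraping in practice: simulated login for major websites, including but not limited to slider CAPTCHA verification, Pinduoduo, Meituan, Baidu, bilibili, Dianping, and Taobao. If you like it, please star ❤️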
Stars: ✭ 2,129 (+287.09%)
ARGUSARGUS is an easy-to-use web scraping tool. The program is based on the Scrapy Python framework and is able to crawl a broad range of different websites. On the websites, ARGUS is able to perform tasks like scraping texts or collecting hyperlinks between websites. See: https://link.springer.com/article/10.1007/s11192-020-03726-9
Stars: ✭ 68 (-87.64%)
Instagram BotAn Instagram bot developed using the Selenium Framework
Stars: ✭ 138 (-74.91%)
SeleniumcrawlerAn example that uses Selenium WebDriver for Python and the Scrapy framework to build a web scraper that crawls an ASP site
Stars: ✭ 117 (-78.73%)
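Python SpiderDouban Movie Top 250; scraping JSON data and images of women from Douyu; Taobao; Youyuan; using CrawlSpider to scrape basic profile information of matchmaking users on Hongniang, plus distributed crawling of Hongniang with storage in Redis; small crawler demos; Selenium; scraping Duodian; building API endpoints with Django; scraping Youyuan profile information; simulated login to Zhihu, GitHub, and Tuchong; scraping full-site data from the Duodian store; scraping historical articles from WeChat official accounts; scraping articles shared in WeChat groups or by WeChat friends; using itchat to monitor articles shared by specified WeChat official accounts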
Stars: ✭ 615 (+11.82%)
WswpCode for the second edition of the book Web Scraping with Python, from Packt Publishing
Stars: ✭ 112 (-79.64%)
ScrapyrtHTTP API for Scrapy spiders
Stars: ✭ 637 (+15.82%)
DotnetcrawlerDotnetCrawler is a straightforward, lightweight web crawling/scraping library with Entity Framework Core output, based on .NET Core. It is designed like other strong crawler libraries such as WebMagic and Scrapy, but is extensible to fit your custom requirements. Medium link: https://medium.com/@mehmetozkaya/creating-custom-web-crawler-with-dotnet-core-using-entity-framework-core-ec8d23f0ca7c
Stars: ✭ 100 (-81.82%)
RARBG-scraperWith Selenium headless browsing and CAPTCHA solving
Stars: ✭ 38 (-93.09%)
Alipayspider ScrapyAlipaySpider on Scrapy (uses the Chrome driver); an Alipay crawler based on Scrapy
Stars: ✭ 70 (-87.27%)
Cdp4jcdp4j - Chrome DevTools Protocol for Java
Stars: ✭ 232 (-57.82%)
double-agentA test suite of common scraper detection techniques. See how detectable your scraper stack is.
Stars: ✭ 123 (-77.64%)
bots-zooNo description or website provided.
Stars: ✭ 59 (-89.27%)
Awesome ScrapyA curated list of awesome packages, articles, and other cool resources from the Scrapy community.
Stars: ✭ 360 (-34.55%)
GolemA complete test automation tool
Stars: ✭ 441 (-19.82%)
Vaultswiss army knife for hackers
Stars: ✭ 346 (-37.09%)
AtataC#/.NET test automation framework for web
Stars: ✭ 362 (-34.18%)
CrawlyCrawly, a high-level web crawling & scraping framework for Elixir.
Stars: ✭ 440 (-20%)
InstagramcrawlerA non-API Python program to crawl public photos, posts, or followers
Stars: ✭ 349 (-36.55%)
Isp Data PollutionISP Data Pollution to Protect Private Browsing History with Obfuscation
Stars: ✭ 425 (-22.73%)
Serenity JsA next generation, full-stack acceptance testing framework optimised for collaboration, speed and scale!
Stars: ✭ 346 (-37.09%)
Elves🎊 Design and implementation of a lightweight crawler framework.
Stars: ✭ 315 (-42.73%)
Scrapy RedisRedis-based components for Scrapy.
Stars: ✭ 4,998 (+808.73%)
FerretDeclarative web scraping
Stars: ✭ 4,837 (+779.45%)
SeldomWebUI automation testing framework based on Selenium
Stars: ✭ 422 (-23.27%)
Docker AndroidAndroid-in-Docker solution with noVNC support and video recording
Stars: ✭ 4,042 (+634.91%)
SpidermonScrapy extension for monitoring spider execution.
Stars: ✭ 309 (-43.82%)
LinkedinLinkedIn scraper using Selenium WebDriver, headless Chromium, Docker, and Scrapy
Stars: ✭ 309 (-43.82%)
WebdriversKeep your Selenium WebDrivers updated automatically
Stars: ✭ 466 (-15.27%)
SinglefileWeb Extension for Firefox/Chrome/MS Edge and CLI tool to save a faithful copy of an entire web page in a single HTML file
Stars: ✭ 4,417 (+703.09%)
SasilaA flexible, friendly crawler framework
Stars: ✭ 286 (-48%)
Docker SeleniumDocker images for the Selenium Grid Server
Stars: ✭ 5,476 (+895.64%)
ScrappleA framework for creating semi-automatic web content extractors
Stars: ✭ 464 (-15.64%)
SpidermanA general-purpose distributed crawler framework based on scrapy-redis
Stars: ✭ 392 (-28.73%)
RseleniumAn R client for Selenium Remote WebDriver
Stars: ✭ 278 (-49.45%)
FilesDocs and files for ScrapydWeb, Scrapyd, Scrapy, and other projects
Stars: ✭ 390 (-29.09%)
AlltheplacesA set of spiders and scrapers to extract location information from places that post their location on the internet.
Stars: ✭ 277 (-49.64%)
Gopa[WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn
Stars: ✭ 277 (-49.64%)
Docker Python ChromedriverDockerfile for running Python Selenium in headless Chrome (Python 2.7 / 3.6 / 3.7 / 3.8 / Alpine based Python / Chromedriver / Selenium / Xvfb included in different versions)
Stars: ✭ 385 (-30%)