Pyspider: A powerful spider (web crawler) system in Python.
Magic google: Google search results crawler; get the Google search results that you need.
Ppspider: Web crawler framework built on puppeteer. Provides flexible task queue management and scheduling via decorators, convenient data storage (nedb / mongodb), and data visualization with user interaction.
Skrape.it: A Kotlin-based testing/scraping/parsing library providing the ability to analyze and extract data from HTML (server & client-side rendered). It places particular emphasis on ease of use and a high level of readability by providing an intuitive DSL. It aims to be a testing lib, but can also be used to scrape websites in a convenient fashion.
Ecommercecrawlers: Gitee repository link: AJay13/ECommerceCrawlers
GitHub repository link: DropsDevopsOrg/ECommerceCrawlers
Project showcase platform link: http://wechat.doonsec.com
Filesensor: Dynamic sensitive-file detection tool based on a crawler.
Annie: 👾 Fast and simple video download library and CLI tool written in Go.
Arachnid: Crawl all unique internal links found on a given website and extract SEO-related information; supports JavaScript-based sites.
Proxybroker: Proxy [Finder | Checker | Server]. HTTP(S) & SOCKS 🎭
Ruiji.net: Crawler framework, distributed crawler extractor.
Pychromeless: Python Lambda Chrome Automation (naming pending).
Gorecon: Gorecon is an all-in-one reconnaissance tool, a.k.a. a Swiss Army knife for reconnaissance; a tool that every pentester/bug hunter might want to add to their arsenal.
Goose Parser: Universal scraping tool that allows you to extract data using multiple environments.
Algoliasearch Netlify: Official Algolia Plugin for Netlify. Index your website to Algolia when deploying your project to Netlify with the Algolia Crawler.
Media Scraper: Scrapes all photos and videos in a web page / Instagram / Twitter / Tumblr / Reddit / pixiv / TikTok.
Tianyancha: A pip-installable, batteries-included scraper API for Tianyancha, the best Chinese business data and investigation platform. Saves the business registration information of one or more specified companies to Excel/JSON in one step.
Colly: Elegant Scraper and Crawler Framework for Golang.
Woid: Simple news aggregator displaying top stories in real time.
Jssoup: JavaScript + BeautifulSoup = JSSoup
Googlescraper: A Python module to scrape several search engines (like Google, Yandex, Bing, Duckduckgo, ...), including asynchronous networking support.
Querylist: 🕷️ The progressive PHP crawler framework! An elegant, progressive PHP scraping framework.
Laosj: Lightweight image crawler for Golang.
Antch: Antch, a fast, powerful and extensible web crawling & scraping framework for Go.
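Fooproxy: A robust, efficient, score-based, targeted IP proxy pool plus API service. You can plug in your own collectors at any time to crawl proxy IPs, and it builds a separate database of valid proxy IPs for each of the one or more target sites your crawler works on. Supports MongoDB 4.0; uses Python 3.7.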
Gecco: Easy-to-use lightweight web crawler.
Goribot: [Crawler/Scraper for Golang] 🕷 A lightweight, distributed-friendly Golang crawler framework.
Marmot: 💐 Marmot | Web Crawler/HTTP protocol Download Package 🐭
Web Bee: 🐝 Web vertical crawler framework for fun.
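Lianjia Beike Spider: Housing price crawler for Lianjia and Beike. Collects housing price data (residential communities, second-hand homes, rentals, new homes) for 21 major Chinese cities, including Beijing, Shanghai, Guangzhou, and Shenzhen. Stable, reliable, and fast! Supports storage as CSV, MySQL, MongoDB, Excel, and JSON; supports Python 2 and 3; displays data in charts; richly commented. Star the project to show support. For learning and reference only; do not use it for commercial purposes, and use it at your own risk.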
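Crawler illegal cases in china: Collection of illegal cases in China involving web crawlers. This project compiles news, materials, and laws and regulations related to lawsuits and violations involving crawler developers in mainland China. It aims to help crawler practitioners working in mainland China understand the relevant Chinese laws and avoid crossing data compliance red lines.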
Nosmoke: A cross-platform UI crawler which scans view trees, then generates and executes UI test cases.
Spoon: 🥄 A package for building site-specific proxy pools for different sites.