Top 538 crawler open source projects

Pyspider
A Powerful Spider (Web Crawler) System in Python.
Magic google
Google search results crawler; get the Google search results you need
Weibopicdownloader
Crawler that downloads Weibo images without logging in
Fast Lianjia Crawler
A high-speed crawler that fetches data directly through the Lianjia API; the fastest in the universe~~ 🚀
Strong Web Crawler
An advanced web crawler built on C#.NET + PhantomJS + Selenium. It can execute JavaScript code, trigger various events, and manipulate the page DOM structure.
Ppspider
A puppeteer-based web crawler framework. Provides flexible task-queue management and task scheduling via decorators, convenient data storage (nedb / mongodb), and support for data visualization and user interaction.
Skrape.it
A Kotlin-based testing/scraping/parsing library providing the ability to analyze and extract data from HTML (server & client-side rendered). It places particular emphasis on ease of use and a high level of readability by providing an intuitive DSL. It aims to be a testing lib, but can also be used to scrape websites in a convenient fashion.
Ecommercecrawlers
Gitee repository: AJay13/ECommerceCrawlers. GitHub repository: DropsDevopsOrg/ECommerceCrawlers. Project showcase platform: http://wechat.doonsec.com
Filesensor
Crawler-based tool for dynamically detecting sensitive files
Awesome Java Crawler
This repository collects and organizes crawler-related resources, mainly for Java
Annie
👾 Fast and simple video download library and CLI tool written in Go
Laravel Crawler Detect
A Laravel wrapper for CrawlerDetect - the web crawler detection library
Arachnid
Crawl all unique internal links found on a given website, and extract SEO-related information - supports JavaScript-based sites
Ruiji.net
crawler framework, distributed crawler extractor
Chromium for spider
dynamic crawler for web vulnerability scanner
Pychromeless
Python Lambda Chrome Automation (naming pending)
Sitemap Generator Cli
Creates an XML-Sitemap by crawling a given site.
Jd mask robot
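JD.com face mask stock monitoring crawler (not Selenium-based): QR-code login, price checking, add to cart, order placement, and flash sales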
Webvideobot
Web crawler.
Gorecon
Gorecon is an all-in-one reconnaissance tool, a.k.a. a Swiss knife for reconnaissance; a tool that every pentester/bug hunter might want to consider adding to their arsenal
Goose Parser
Universal scraping tool, which allows you to extract data using multiple environments
Algoliasearch Netlify
Official Algolia Plugin for Netlify. Index your website to Algolia when deploying your project to Netlify with the Algolia Crawler
Media Scraper
Scrapes all photos and videos in a web page / Instagram / Twitter / Tumblr / Reddit / pixiv / TikTok
Tianyancha
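A pip-installable Tianyancha scraper API that saves the business registration information of one or more specified companies to Excel/JSON in one step. A batteries-included scraper API for Tianyancha, the best Chinese business data and investigation platform.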
Colly
Elegant Scraper and Crawler Framework for Golang
Woid
Simple news aggregator displaying top stories in real time
Googlescraper
A Python module to scrape several search engines (like Google, Yandex, Bing, Duckduckgo, ...). Including asynchronous networking support.
Querylist
🕷️ The elegant, progressive PHP crawler framework!
Zhihuspider
Multithreaded Zhihu user crawler, based on Python 3
Videoserver
A video resource backend implemented in Node.js with Express and a crawler
Laosj
golang light-weight image crawler
Antch
Antch, a fast, powerful and extensible web crawling & scraping framework for Go
Ok ip proxy pool
🍿 Proxy IP pool for crawlers, in Python 🍟 A pretty OK IP proxy pool
Fooproxy
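A robust, efficient, score-based, target-specific IP proxy pool plus API service. You can plug in your own collectors to crawl proxy IPs, and it builds a separate database of valid proxy IPs for each of your crawler's target websites. Supports MongoDB 4.0; uses Python 3.7. (Scored IP proxy pool; custom proxy data crawlers can be added at any time.)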
Jvppeteer
Headless Chrome for Java (Java crawler)
Google Group Crawler
Get (almost) original messages from Google Groups archives. Your data is yours.
Github Spider
Crawler for analyzing GitHub repositories and users
Gecco
Easy-to-use lightweight web crawler
Goribot
[Crawler/Scraper for Golang] 🕷 A lightweight, distributed-friendly Golang crawler framework.
Marmot
💐Marmot | Web Crawler/HTTP protocol Download Package 🐭
Comiccrawler
An image crawler written in Python.
Zhihu fun
Selenium-based Zhihu keyword crawler
Web Bee
🐝 Web vertical crawler framework for fun
Lianjia Beike Spider
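House price crawler for Lianjia and Beike. Collects house price data (residential communities, second-hand homes, rentals, new homes) for 21 major Chinese cities, including Beijing, Shanghai, Guangzhou, and Shenzhen. Stable, reliable, and fast! Supports storage to CSV, MySQL, MongoDB, Excel, and JSON; supports Python 2 and 3; charts the data; richly commented. Stars are appreciated. For learning and reference only; do not use for commercial purposes, at your own risk.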
Zhihu Crawler People
A simple distributed crawler for zhihu && data analysis
Crawler illegal cases in china
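A collection of legal cases in China involving web crawlers. This project compiles news, materials, laws, and regulations related to lawsuits and violations involving crawler developers in mainland China. It aims to help crawler practitioners working in mainland China understand the relevant laws and avoid crossing data-compliance red lines.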
Nosmoke
A cross-platform UI crawler which scans view trees, then generates and executes UI test cases.
Instagram Crawler
Crawl instagram photos, posts and videos for download.
N2h4
A tool for collecting Naver news
Leetcode Spider
Crawl your own LeetCode solution source code with Node.js
Ncov2019 data crawler
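Epidemic data crawler: a data repository for the 2019 novel coronavirus, including movement trajectory data, co-passenger data, and news reports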
Spoon
🥄 A package for building specific Proxy Pool for different Sites.
Linkedin Profile Scraper
🕵️‍♂️ LinkedIn profile scraper returning structured profile data in JSON. Works in 2020.
Scrapedin Linkedin Crawler
Crawler for LinkedIn full profiles 2019