Ppspider: Web spider built on Puppeteer. Supports flexible task-queue management and task scheduling via decorators, convenient data storage (nedb / mongodb), and data visualization with user interaction.
Stars: ✭ 237 (+203.85%)
Linkedin Profile Scraper: 🕵️‍♂️ LinkedIn profile scraper returning structured profile data in JSON. Works in 2020.
Stars: ✭ 171 (+119.23%)
Url To Pdf Api: Web page PDF/PNG rendering done right. Self-hosted service for rendering receipts, invoices, or any content.
Stars: ✭ 6,544 (+8289.74%)
Squidwarc: Squidwarc is a high-fidelity, user-scriptable, archival crawler that uses Chrome or Chromium with or without a head
Stars: ✭ 125 (+60.26%)
Jvppeteer: Headless Chrome for Java (Java crawler)
Stars: ✭ 193 (+147.44%)
Sms Boom: An SMS bomber that uses Chrome's headless mode to simulate user registrations
Stars: ✭ 507 (+550%)
Webster: A reliable high-level web crawling & scraping framework for Node.js.
Stars: ✭ 364 (+366.67%)
Puppetron: Puppeteer (Headless Chrome Node API)-based rendering solution.
Stars: ✭ 429 (+450%)
Ferret: Declarative web scraping
Stars: ✭ 4,837 (+6101.28%)
Awesome Crawler: A collection of awesome web crawlers and spiders in different languages
Stars: ✭ 4,793 (+6044.87%)
Haipproxy: 💖 Highly available distributed IP proxy pool, powered by Scrapy and Redis
Stars: ✭ 4,993 (+6301.28%)
Go jobs: A look at market conditions for Golang
Stars: ✭ 526 (+574.36%)
Xxl Crawler: A distributed web crawler framework (XXL-CRAWLER).
Stars: ✭ 561 (+619.23%)
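Infospider: INFO-SPIDER is a crawler toolbox 🧰 that brings together many data sources, designed to help users retrieve their own data safely and quickly. The code is open source and the process is transparent. Supported data sources include GitHub, QQ Mail, NetEase Mail, Alibaba Mail, Sina Mail, Hotmail, Outlook, JD.com, Taobao, Alipay, China Mobile, China Unicom, China Telecom, Zhihu, Bilibili, NetEase Cloud Music, QQ friends, QQ groups, WeChat Moments album generation, browser history, 12306, Cnblogs, CSDN blogs, OSChina blogs, and Jianshu.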
Stars: ✭ 5,984 (+7571.79%)
Html Pdf Chrome: HTML to PDF converter via Chrome/Chromium
Stars: ✭ 629 (+706.41%)
Mocha Chrome: ☕️ Run Mocha tests using headless Google Chrome
Stars: ✭ 66 (-15.38%)
Netdiscovery: NetDiscovery is a general-purpose crawler framework/middleware built on Vert.x, RxJava 2, and other frameworks.
Stars: ✭ 573 (+634.62%)
Spidr: A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.
Stars: ✭ 656 (+741.03%)
Chromedp: A faster, simpler way to drive browsers supporting the Chrome DevTools Protocol.
Stars: ✭ 7,057 (+8947.44%)
Creeper: 🐾 Creeper - The Next Generation Crawler Framework (Go)
Stars: ✭ 762 (+876.92%)
Arachnid: Powerful web scraping framework for Crystal
Stars: ✭ 68 (-12.82%)
Pychrome: A Python Package for the Google Chrome Dev Protocol [threading base]
Stars: ✭ 469 (+501.28%)
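Qzoneexport: Qzone export assistant for backing up Qzone posts, blogs, private diaries, albums, videos, message boards, QQ friends, favorites, shares, and recent visitors to files for easy migration and preservation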
Stars: ✭ 456 (+484.62%)
Fbcrawl: A Facebook crawler
Stars: ✭ 536 (+587.18%)
Xsrfprobe: The Prime Cross Site Request Forgery (CSRF) Audit and Exploitation Toolkit.
Stars: ✭ 532 (+582.05%)
Learnpython: Basic Python practice code and assorted crawler code
Stars: ✭ 451 (+478.21%)
Torbot: Dark Web OSINT Tool
Stars: ✭ 821 (+952.56%)
Newcrawler: Free Web Scraping Tool with Java
Stars: ✭ 589 (+655.13%)
Icrawler: A multi-thread crawler framework with many builtin image crawlers provided.
Stars: ✭ 629 (+706.41%)
Douyin: API of DouYin for Humans used to Crawl Popular Videos and Musics
Stars: ✭ 580 (+643.59%)
Grab Site: The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns
Stars: ✭ 680 (+771.79%)
Zhihu Crawler: zhihu-crawler is a high-performance, horizontally scalable, distributed crawler project based on Java, with support for free HTTP proxy pools
Stars: ✭ 890 (+1041.03%)
Spider: Python crawler spider
Stars: ✭ 70 (-10.26%)
Cuprite: Headless Chrome/Chromium driver for Capybara
Stars: ✭ 743 (+852.56%)
Crawly: Crawly, a high-level web crawling & scraping framework for Elixir.
Stars: ✭ 440 (+464.1%)
Abotx: Cross Platform C# Web crawler framework, headless browser, parallel crawler. Please star this project! +1.
Stars: ✭ 63 (-19.23%)
Scrapit: Scraping scripts for various websites.
Stars: ✭ 25 (-67.95%)
Gospider: Gospider - Fast web spider written in Go
Stars: ✭ 785 (+906.41%)
Navalia: A bullet-proof, fast, and reliable headless browser API
Stars: ✭ 950 (+1117.95%)
Nodespider: [DEPRECATED] Simple, flexible, delightful web crawler/spider package
Stars: ✭ 33 (-57.69%)
Maman: Rust Web Crawler saving pages on Redis
Stars: ✭ 39 (-50%)
Foxr: 🦊 Node.js API to control Firefox
Stars: ✭ 783 (+903.85%)
Axegrinder: Crawl websites for accessibility issues from the command line.
Stars: ✭ 12 (-84.62%)
Gowitness: 🔍 gowitness - a golang, web screenshot utility using Chrome Headless
Stars: ✭ 996 (+1176.92%)
Lizard: 💐 Full Amazon Automatic Download
Stars: ✭ 41 (-47.44%)
Photon: Incredibly fast crawler designed for OSINT.
Stars: ✭ 8,332 (+10582.05%)
Avbook: AV movie management system; crawler for avmoo, javbus, and javlibrary; online AV film library; AV magnet link database; Japanese Adult Video Library; Adult Video Magnet Links - Japanese Adult Video Database
Stars: ✭ 8,133 (+10326.92%)
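Puppeteer Deep: Puppeteer, Headless Chrome; scraping the book 《es6标准入门》 (an introduction to the ES6 standard), automatically posting articles to Juejin, and site performance analysis; advanced crawlers, automated UI testing, performance analysis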
Stars: ✭ 1,033 (+1224.36%)
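Awesome Python Primer: An index of quality Chinese-language resources for self-taught Python beginners, including books, documentation, and videos, covering crawlers, web, data analysis, and machine learning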
Stars: ✭ 57 (-26.92%)
Crawlab: Distributed web crawler admin platform for managing spiders regardless of language or framework.
Stars: ✭ 8,392 (+10658.97%)