A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.

Stars: ✭ 656 (+239.9%)

Mutual labels: crawler, scraper

Goribot

[Crawler/Scraper for Golang] 🕷 A lightweight, distributed-friendly Golang crawler framework.

Stars: ✭ 190 (-1.55%)

Mutual labels: crawler, scraper

Puppeteer Lambda Starter Kit

Starter Kit for running Headless-Chrome by Puppeteer on AWS Lambda.

Stars: ✭ 563 (+191.71%)

Mutual labels: chrome, puppeteer

Puppeteer Api Zh cn

📖 Puppeteer Chinese documentation (the officially designated Chinese documentation)

Stars: ✭ 697 (+261.14%)

Mutual labels: chrome, puppeteer

Api

API that uncovers the technologies used on websites and generates thumbnail from screenshot of website

Stars: ✭ 189 (-2.07%)

Mutual labels: chrome, chrome-headless

Lulu

[Unmaintained] A simple and clean video/music/image downloader 👾

Stars: ✭ 789 (+308.81%)

Mutual labels: crawler, scraper

Scrapit

Scraping scripts for various websites.

Stars: ✭ 25 (-87.05%)

Mutual labels: crawler, scraper

Instagram Crawler

Crawl Instagram photos, posts and videos for download.

Stars: ✭ 178 (-7.77%)

Mutual labels: crawler, scraper

Chart To Aws

Microservice to generate a screenshot of a webpage and upload it to an AWS S3 bucket.

Stars: ✭ 43 (-77.72%)

Mutual labels: puppeteer, chrome-headless

Puppeteer Deep

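Puppeteer, Headless Chrome: scraping the book 《es6标准入门》 (an introduction to the ES6 standard), auto-posting articles to Juejin, and site performance analysis; advanced crawling, automated UI testing, and performance analysis.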

Stars: ✭ 1,033 (+435.23%)

Mutual labels: chrome, puppeteer

Viewfinderjs

📷 ViewFinder - NodeJS product to make the browser into a web app. WTF RBI. CBII. Remote browser isolation, embeddable browserview, secure chrome saas. Licenses, managed, self-hosted. Like S2, WebGap, Bromium, Authentic8, Menlo Security and Broadcom, but open source with free live demos available now! Also, integrated RBI/CDR with CDR from https://github.com/dosyago/p2%2e

Stars: ✭ 1,175 (+508.81%)

Mutual labels: chrome, chrome-headless

Pypergrabber

Fetches PubMed article IDs (PMIDs) from email inbox, then crawls PubMed, Google Scholar and Sci-Hub for respective PDF files.

Stars: ✭ 14 (-92.75%)

Mutual labels: crawler, scraper

Social Scraper

A collection of scripts for crawling data from social networks and Vietnamese-language websites

Stars: ✭ 47 (-75.65%)

Mutual labels: crawler, scraper

Jd Autobuy

Python crawler: automatic JD.com login and online flash purchasing of products

Stars: ✭ 1,174 (+508.29%)

Mutual labels: crawler, scraper

Headless Recorder

Chrome extension that records your browser interactions and generates a Playwright or Puppeteer script.

Stars: ✭ 13,786 (+7043.01%)

Mutual labels: chrome, puppeteer

Lightcrawler

Crawl a website and run it through Google Lighthouse

Stars: ✭ 1,339 (+593.78%)

Mutual labels: crawler, chrome

Sillynium

Automate the creation of Python Selenium Scripts by drawing coloured boxes on webpage elements

Stars: ✭ 100 (-48.19%)

Mutual labels: scraper, chrome

Node Frontend

Node.js Docker image with all Puppeteer dependencies installed for frontend Chrome Headless testing and default Nginx config, for multi-stage Docker building

Stars: ✭ 104 (-46.11%)

Mutual labels: puppeteer, chrome-headless

Fbcrawl

A Facebook crawler

Stars: ✭ 536 (+177.72%)

Mutual labels: crawler, scraper

Scrapyrt

HTTP API for Scrapy spiders

Stars: ✭ 637 (+230.05%)

Mutual labels: crawler, scraper

Crawler

A high performance web crawler in Elixir.

Stars: ✭ 781 (+304.66%)

Mutual labels: crawler, scraper

Alpine Chrome

Chrome Headless Docker images built upon the official Alpine image

Stars: ✭ 754 (+290.67%)

Mutual labels: chrome, chrome-headless

Url To Pdf Api

Web page PDF/PNG rendering done right. Self-hosted service for rendering receipts, invoices, or any content.

Stars: ✭ 6,544 (+3290.67%)

Mutual labels: chrome, puppeteer

Awesome Crawler

A collection of awesome web crawlers and spiders in different languages

Stars: ✭ 4,793 (+2383.42%)

Mutual labels: crawler, scraper

Avbook

AV movie management system; crawlers for avmoo, javbus, and javlibrary; online AV video library; AV magnet link database. Japanese Adult Video Library, Adult Video Magnet Links - Japanese Adult Video Database

Stars: ✭ 8,133 (+4113.99%)

Mutual labels: crawler, scraper

Gowitness

🔍 gowitness - a golang, web screenshot utility using Chrome Headless

Stars: ✭ 996 (+416.06%)

Mutual labels: chrome, chrome-headless

Public Instagram

Tool to fetch Instagram's public content.

Stars: ✭ 43 (-77.72%)

Mutual labels: scraper, puppeteer

Not Your Average Web Crawler

A web crawler (for bug hunting) that gathers more than you can imagine.

Stars: ✭ 107 (-44.56%)

Mutual labels: crawler, scraper

Tracker Radar Collector

🕸 Modular, multithreaded, puppeteer-based crawler

Stars: ✭ 67 (-65.28%)

Mutual labels: crawler, puppeteer

Puppeteer Docs Zh Cn

Chinese version of the Google Puppeteer documentation, target version 1.9.0, translation in progress...

Stars: ✭ 61 (-68.39%)

Mutual labels: chrome, puppeteer

Rod

A Devtools driver for web automation and scraping

Stars: ✭ 1,392 (+621.24%)

Mutual labels: scraper, chrome-headless

Google Play Scraper

Node.js scraper to get data from Google Play

Stars: ✭ 1,606 (+732.12%)

Mutual labels: crawler, scraper

Sushi Browser

Sushi Browser is a next-generation browser with multi-panel support, video support functions, and more. Its goal is to be as fantastic as sushi. 🍣

Stars: ✭ 116 (-39.9%)

Mutual labels: chrome, puppeteer

Scrapoxy

Scrapoxy hides your scraper behind a cloud. It starts a pool of proxies to send your requests. Now, you can crawl without thinking about blacklisting!

Stars: ✭ 1,322 (+584.97%)

Mutual labels: crawler, scraper

Roam Research Private Api

Private API to enable API access for Roam Research. Now you can connect Roam to your other projects.

Stars: ✭ 88 (-54.4%)

Mutual labels: chrome, puppeteer

Puppeteer Webperf

Automating Web Performance testing with Puppeteer 🎪

Stars: ✭ 1,392 (+621.24%)

Mutual labels: chrome, puppeteer

Geziyor

Geziyor, a fast web crawling & scraping framework for Go. Supports JS rendering.