Apify Js: Apify SDK, the scalable web scraping and crawling library for JavaScript/Node.js. Enables development of data extraction and web automation jobs with (but not limited to) headless Chrome and Puppeteer.
Stars: ✭ 3,154 (+82.52%)
Nickjs: Web scraping library made by the Phantombuster team. Modern, simple & works on all websites. (Deprecated)
Stars: ✭ 494 (-71.41%)
Phantomas: Headless Chromium-based web performance metrics collector and monitoring tool
Stars: ✭ 2,191 (+26.79%)
bots-zoo: No description or website provided.
Stars: ✭ 59 (-96.59%)
double-agent: A test suite of common scraper detection techniques. See how detectable your scraper stack is.
Stars: ✭ 123 (-92.88%)
Webster: A reliable high-level web crawling & scraping framework for Node.js.
Stars: ✭ 364 (-78.94%)
Api Store: Contains all the public APIs listed in Phantombuster's API store. Pull requests welcome!
Stars: ✭ 69 (-96.01%)
Linkedin Profile Scraper: 🕵️ LinkedIn profile scraper returning structured profile data in JSON. Works in 2020.
Stars: ✭ 171 (-90.1%)
Squidwarc: Squidwarc is a high-fidelity, user-scriptable archival crawler that uses Chrome or Chromium with or without a head.
Stars: ✭ 125 (-92.77%)
Puphpeteer: A Puppeteer bridge for PHP, supporting the entire API.
Stars: ✭ 1,014 (-41.32%)
Puppeteer Extra: 💯 Teach Puppeteer new tricks through plugins.
Stars: ✭ 3,397 (+96.59%)
Deno Puppeteer: A port of Puppeteer running on Deno
Stars: ✭ 128 (-92.59%)
puppet-master: Puppeteer as a service hosted on Saasify.
Stars: ✭ 25 (-98.55%)
Grawler: Grawler is a PHP tool with a web interface that automates running Google dorks, scrapes the results, and stores them in a file.
Stars: ✭ 98 (-94.33%)
puppeteer-github: GitHub automation driven by headless Chrome.
Stars: ✭ 15 (-99.13%)
thal: Translated article on Puppeteer and Chrome Headless, from getting started to web crawling
Stars: ✭ 651 (-62.33%)
Ayakashi: ⚡️ Ayakashi.io, the next generation web scraping framework
Stars: ✭ 117 (-93.23%)
ARGUS: ARGUS is an easy-to-use web scraping tool. The program is based on the Scrapy Python framework and can crawl a broad range of websites. On those websites, ARGUS can perform tasks such as scraping text or collecting hyperlinks between websites. See: https://link.springer.com/article/10.1007/s11192-020-03726-9
Stars: ✭ 68 (-96.06%)
Playwright Go: Playwright for Go, a browser automation library to control Chromium, Firefox and WebKit with a single API.
Stars: ✭ 272 (-84.26%)
Sasila: A flexible, friendly crawler framework
Stars: ✭ 286 (-83.45%)
Pyppeteer: Headless Chrome/Chromium automation library (unofficial port of Puppeteer)
Stars: ✭ 3,480 (+101.39%)
Comic Dl: Comic-dl is a command line tool to download manga and comics from various comic and manga sites. Supported sites: readcomiconline.to, mangafox.me, comic naver and many more.
Stars: ✭ 365 (-78.88%)
Undetected Chromedriver: Custom Selenium Chromedriver | Zero-Config | Passes ALL bot mitigation systems (like Distil / Imperva / DataDome / CloudFlare IUAM)
Stars: ✭ 365 (-78.88%)
Crawly: Crawly, a high-level web crawling & scraping framework for Elixir.
Stars: ✭ 440 (-74.54%)
pomp: Screen scraping and web crawling framework
Stars: ✭ 61 (-96.47%)
Spidermon: Scrapy extension for monitoring spider execution.
Stars: ✭ 309 (-82.12%)
Md To Pdf: Hackable CLI tool for converting Markdown files to PDF using Node.js and headless Chrome.
Stars: ✭ 374 (-78.36%)
Dataflowkit: Extract structured data from websites. Website scraping.
Stars: ✭ 456 (-73.61%)
Ferret: Declarative web scraping
Stars: ✭ 4,837 (+179.92%)
hc-pdf-server: HTML-to-PDF conversion server using headless Chrome, written in TypeScript. The new version of hcep-pdf-server.
Stars: ✭ 24 (-98.61%)
Gopa: [WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn
Stars: ✭ 277 (-83.97%)
naos: 📉 Uptime and error monitoring CLI
Stars: ✭ 30 (-98.26%)
Mochify.js: ☕️ TDD with Browserify, Mocha, Headless Chrome and WebDriver
Stars: ✭ 338 (-80.44%)
Autoscraper: A Smart, Automatic, Fast and Lightweight Web Scraper for Python
Stars: ✭ 4,077 (+135.94%)
Tinking: 🧶 Extract data from any website without code, just clicks.
Stars: ✭ 331 (-80.84%)
Scrapy: Scrapy, a fast high-level web crawling & scraping framework for Python.
Stars: ✭ 42,343 (+2350.41%)
Pptraas.com: Puppeteer as a service
Stars: ✭ 433 (-74.94%)
puppeteer-email: Email automation driven by headless Chrome.
Stars: ✭ 135 (-92.19%)
Differencify: Differencify is a library for visual regression testing
Stars: ✭ 572 (-66.9%)
Rendertron: A Headless Chrome rendering solution
Stars: ✭ 5,593 (+223.67%)
Try Puppeteer: Run Puppeteer code in the cloud
Stars: ✭ 642 (-62.85%)
Browserless: A browser driver on top of Puppeteer, ready for production scenarios.
Stars: ✭ 664 (-61.57%)
Url To Pdf Api: Web page PDF/PNG rendering done right. Self-hosted service for rendering receipts, invoices, or any content.
Stars: ✭ 6,544 (+278.7%)
Lulu: [Unmaintained] A simple and clean video/music/image downloader 👾
Stars: ✭ 789 (-54.34%)
Navalia: A bullet-proof, fast, and reliable headless browser API
Stars: ✭ 950 (-45.02%)
Ferrum: Headless Chrome Ruby API
Stars: ✭ 1,009 (-41.61%)
Page2image: 📷 page2image is an npm package for taking screenshots that also provides a CLI command
Stars: ✭ 66 (-96.18%)
Pyppeteer: Headless Chrome/Chromium automation library (unofficial port of Puppeteer)
Stars: ✭ 1,286 (-25.58%)