websemantics / codepen-puppeteer

Use Puppeteer to download pens from Codepen.io as single HTML pages.

╭─╮     ╭─╮     ╭┬╮     ╭─╮     ╭─╮     ╭─╮     ╭╮╭        ┬    ╭─╮
│       │ │      ││     ├┤      ├─╯     ├┤      │││        │    │ │
╰─╯     ╰─╯     ─┴╯     ╰─╯     ┴       ╰─╯     ╯╰╯    o   ┴    ╰─╯
╭────╮  ╭──╮╭╮  ╭────╮  ╭────╮  ╭──▞─╮   ╭──╮   ╭────╮ ╭────╮ ╭─┬─╮                 
│  ╭╮│  │  │││  │  ╭╮│  │  ╭╮│  │ `◯ │  ╭╯  ╰╮  │ ─  │ │ ─  │ │   │                 
│  ╰╯│  │  ╰╯│  │  ╰╯│  │  ╰╯│  │    │  ╰╮  ╭╯  │    │ │    │ │  ╭╯                 
│   ╭╯  │    │  │   ╭╯  │   ╭╯  │ ───┤   │ ─┤   │ ───┤ │ ───┤ │  │                  
╰───╯   ╰───┴╯  ╰───╯   ╰───╯   ╰────╯   ╰──╯   ╰────╯ ╰────╯ ╰──╯  

Use Puppeteer to download pens from Codepen.io as single HTML pages.

Features

  • Download example pens as single HTML pages
  • Easy preview with an index page
  • Built-in error recovery to resume downloads
  • Skip already downloaded pens
  • Easy to debug using screenshots
  • Custom template pages
  • Easy-to-follow source code with comments
  • Support for loading external resources (e.g. jQuery, Google Fonts)
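
As a rough illustration of the "skip already downloaded pens" and "resume download" features, the sketch below filters a list of pen ids down to those with no saved page yet. It is not taken from the project's source: the function name pendingPens and the <outDir>/<penId>.html file layout are assumptions made for the example.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical layout: each pen is saved as <outDir>/<penId>.html.
// Returns the ids that still need downloading, so that a rerun after a
// crash only fetches what is missing.
function pendingPens(penIds, outDir) {
  return penIds.filter((id) => !fs.existsSync(path.join(outDir, `${id}.html`)));
}

// Example call (ids are made up):
// pendingPens(['abc', 'xyz'], './pens')
```

Checking the file system rather than keeping a separate progress file means an interrupted run needs no extra bookkeeping, at the cost of treating a partially written page as complete.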

Usage

  • Clone this project locally:
git clone https://github.com/websemantics/codepen-puppeteer
cd codepen-puppeteer
  • Install dependencies (puppeteer):
npm i

There are two commands:

  1. The search command downloads pens matching a search query:
penpet search flexbox

You can specify the start and end pages with the -s and -e options.

Browse to ./pens/index.html to preview the full list of downloads.

  2. The file command downloads a provided list of pens:
penpet file pens.json

The file pens.json is provided as an example.

For examples and more help, use the -h option with either command.
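
To give an idea of how a preview page such as ./pens/index.html could be produced, here is a minimal sketch that lists every downloaded page as a link. It does not reflect the project's actual template: the function names and the assumption that each pen is a single .html file directly inside the output directory are hypothetical.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Escape characters that are special in HTML text and attribute values.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Assumption: every downloaded pen is an .html file directly inside pensDir.
// Returns the markup for an index page linking to each of them.
function buildIndexHtml(pensDir) {
  const links = fs
    .readdirSync(pensDir)
    .filter((name) => name.endsWith('.html') && name !== 'index.html')
    .sort()
    .map((name) => `    <li><a href="${escapeHtml(encodeURI(name))}">${escapeHtml(name)}</a></li>`);
  return [
    '<!DOCTYPE html>',
    '<html>',
    '<head><meta charset="utf-8"><title>Downloaded pens</title></head>',
    '<body>',
    '  <ul>',
    ...links,
    '  </ul>',
    '</body>',
    '</html>',
  ].join('\n');
}

// Example: write the index next to the downloaded pages.
// fs.writeFileSync(path.join('./pens', 'index.html'), buildIndexHtml('./pens'));
```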

Debug

This project is a proof of concept, so you might find problematic pens that do not download fully. Turn on the debug flag -d with the file command to enable screenshots, which might help you debug the issue:

penpet file pens.json -d
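
For readers curious how screenshot-based debugging can work with Puppeteer, the following sketch saves a full-page screenshot for a given pen. The helper name, the debugDir and penId parameters, and the <penId>.png naming are assumptions for illustration; only page.screenshot() with its path and fullPage options comes from the Puppeteer API. What the -d flag actually does inside this project may differ.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical helper: `page` is a Puppeteer Page object; `debugDir` and
// `penId` are names chosen for this example only.
async function captureDebugScreenshot(page, debugDir, penId) {
  // Make sure the target directory exists before Puppeteer writes to it.
  fs.mkdirSync(debugDir, { recursive: true });
  const file = path.join(debugDir, `${penId}.png`);
  // fullPage captures the whole scrollable page, not just the viewport.
  await page.screenshot({ path: file, fullPage: true });
  return file;
}

// Example call inside an async function, after the pen has loaded:
// await captureDebugScreenshot(page, './debug', 'abc');
```

Taking the screenshot right after a pen finishes loading (or right after a failure) makes it easier to see whether external resources were missing when the page was saved.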

Hint

I find the following command useful to force-quit running Chromium processes on macOS:

pkill -f -- "chromium"

Preview Downloads

[Image: Codepen Puppeteer preview page]

Support

Need help or have a question? Post on Stack Overflow.

Please don't use the issue tracker for support or questions.

Star if you find this project useful, to show support or simply for being awesome :)

Contribution

Contributions to this project are accepted in the form of feedback, bug reports and, even better, pull requests.

License

MIT license. Copyright (c) Web Semantics, Inc.
