pyppeteer / Pyppeteer

Licence: other
Headless chrome/chromium automation library (unofficial port of puppeteer)

Programming Languages

python

Projects that are alternatives of or similar to Pyppeteer

Phantomas
Headless Chromium-based web performance metrics collector and monitoring tool
Stars: ✭ 2,191 (+70.37%)
Mutual labels:  automation, puppeteer, chromium
Apify Js
Apify SDK — The scalable web scraping and crawling library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.
Stars: ✭ 3,154 (+145.26%)
Mutual labels:  automation, puppeteer
mitm-play
Man in the middle using Playwright
Stars: ✭ 13 (-98.99%)
Mutual labels:  chromium, puppeteer
Puppetron
Puppeteer (Headless Chrome Node API)-based rendering solution.
Stars: ✭ 429 (-66.64%)
Mutual labels:  puppeteer, chromium
Recorder
A browser extension that generates Cypress, Playwright and Puppeteer test scripts from your interactions 🖱 ⌨
Stars: ✭ 277 (-78.46%)
Mutual labels:  chromium, puppeteer
FlareSolverrSharp
FlareSolverr .Net / Proxy server to bypass Cloudflare protection
Stars: ✭ 62 (-95.18%)
Mutual labels:  chromium, puppeteer
Webster
a reliable high-level web crawling & scraping framework for Node.js.
Stars: ✭ 364 (-71.7%)
Mutual labels:  puppeteer, chromium
playwright-demos
Playwright for scraping and UI testing / automating testing workflows
Stars: ✭ 65 (-94.95%)
Mutual labels:  chromium, puppeteer
Headless Chrome Crawler
Distributed crawler powered by Headless Chrome
Stars: ✭ 5,129 (+298.83%)
Mutual labels:  puppeteer, chromium
Puppeteer Api Zh cn
📖 Puppeteer中文文档(官方指定的中文文档)
Stars: ✭ 697 (-45.8%)
Mutual labels:  automation, puppeteer
Edge Selenium Tools
An updated EdgeDriver implementation for Selenium 3 with newly-added support for Microsoft Edge (Chromium).
Stars: ✭ 41 (-96.81%)
Mutual labels:  automation, chromium
throughout
🎪 End-to-end testing made simple (using Jest and Puppeteer)
Stars: ✭ 16 (-98.76%)
Mutual labels:  chromium, puppeteer
LInkedIn-Reverese-Lookup
🔎Search LinkedIn profile by email address📧
Stars: ✭ 20 (-98.44%)
Mutual labels:  chromium, puppeteer
pccomponentes-buy-bot
A script made to buy any out-of-stock product off Spanish stores
Stars: ✭ 34 (-97.36%)
Mutual labels:  chromium, puppeteer
simplechrome
Webrecorders DevTools Protocol Automation Library
Stars: ✭ 16 (-98.76%)
Mutual labels:  chromium, puppeteer
Playwright Go
Playwright for Go, a browser automation library to control Chromium, Firefox and WebKit with a single API.
Stars: ✭ 272 (-78.85%)
Mutual labels:  automation, chromium
Puphpeteer
A Puppeteer bridge for PHP, supporting the entire API.
Stars: ✭ 1,014 (-21.15%)
Mutual labels:  automation, puppeteer
pupflare
A webpage proxy that sends requests through Chromium (puppeteer) - can be used to bypass Cloudflare anti-bot / anti-DDoS protection from any application (like curl)
Stars: ✭ 183 (-85.77%)
Mutual labels:  chromium, puppeteer
clusteer
Clusteer is a Puppeteer wrapper written for Laravel, with the super-power of parallelizing pages across multiple browser instances.
Stars: ✭ 81 (-93.7%)
Mutual labels:  chromium, puppeteer
Playwright Sharp
.NET version of the Playwright testing and automation library.
Stars: ✭ 459 (-64.31%)
Mutual labels:  automation, chromium

pyppeteer

Note: this is a continuation of the pyppeteer project. Before undertaking any development, it is highly recommended that you look at #16, which tracks the ongoing effort to update this library, so that you avoid duplicating work.

Unofficial Python port of puppeteer, the JavaScript (headless) Chrome/Chromium browser automation library.

Installation

pyppeteer requires Python >= 3.6

Install with pip from PyPI:

pip install pyppeteer

Or install the latest version from this GitHub repo:

pip install -U git+https://github.com/pyppeteer/[email protected]

Usage

Note: When you run pyppeteer for the first time, it downloads the latest version of Chromium (~150MB) if it is not found on your system. If you would rather avoid this behavior, ensure that a suitable Chrome binary is installed. One way to do this is to run the pyppeteer-install command before using this library.
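
Another way to avoid the download is to point launch() at a browser that is already installed, using the executablePath option that pyppeteer mirrors from puppeteer. The following is a minimal sketch rather than part of pyppeteer: the helper names find_browser and launch_options are hypothetical, and the list of binary names is an assumption that will vary by platform.

```python
import shutil

# Binary names to look for on PATH. These are assumptions; adjust them
# for your platform and distribution.
CANDIDATES = ['chromium', 'chromium-browser', 'google-chrome', 'chrome']


def find_browser(candidates=CANDIDATES):
    """Return the path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


def launch_options(path):
    """Build launch() options; an empty dict keeps pyppeteer's defaults."""
    return {'executablePath': path} if path else {}


async def main():
    # Imported here so the helpers above work without pyppeteer installed.
    from pyppeteer import launch

    browser = await launch(launch_options(find_browser()))
    page = await browser.newPage()
    await page.goto('https://example.com')
    print(await page.title())
    await browser.close()
```

Run it the same way as the examples below, with asyncio.get_event_loop().run_until_complete(main()). When no candidate is found, the options are empty and launch() behaves as described in the note above.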

Full documentation can be found here. Puppeteer's documentation and its troubleshooting guide are also great resources for pyppeteer users.

Examples

Open a web page and take a screenshot:

import asyncio
from pyppeteer import launch

async def main():
    browser = await launch()
    page = await browser.newPage()
    await page.goto('https://example.com')
    await page.screenshot({'path': 'example.png'})
    await browser.close()

asyncio.get_event_loop().run_until_complete(main())

Evaluate JavaScript on a page:

import asyncio
from pyppeteer import launch

async def main():
    browser = await launch()
    page = await browser.newPage()
    await page.goto('https://example.com')
    await page.screenshot({'path': 'example.png'})

    dimensions = await page.evaluate('''() => {
        return {
            width: document.documentElement.clientWidth,
            height: document.documentElement.clientHeight,
            deviceScaleFactor: window.devicePixelRatio,
        }
    }''')

    print(dimensions)
    # >>> {'width': 800, 'height': 600, 'deviceScaleFactor': 1}
    await browser.close()

asyncio.get_event_loop().run_until_complete(main())

Differences between puppeteer and pyppeteer

pyppeteer strives to replicate the puppeteer API as closely as possible; however, fundamental differences between JavaScript and Python make this difficult to do precisely. More information on specifics can be found in the documentation.

Keyword arguments for options

puppeteer uses an object for passing options to functions/methods. pyppeteer functions/methods accept options both as a dictionary (Python's equivalent of a JavaScript object) and as keyword arguments.

Dictionary style options (similar to puppeteer):

browser = await launch({'headless': True})

Keyword argument style options (more pythonic, isn't it?):

browser = await launch(headless=True)
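
The pattern behind accepting both styles can be sketched in a few lines. This is an illustration only, not pyppeteer's actual code: merge_options and fake_launch are hypothetical names, and the rule that keyword arguments override dictionary entries is an assumption made for the example.

```python
def merge_options(options=None, **kwargs):
    """Combine a puppeteer-style dict with Python keyword arguments."""
    merged = dict(options or {})  # copy so the caller's dict is untouched
    merged.update(kwargs)         # in this sketch, keyword arguments win
    return merged


def fake_launch(options=None, **kwargs):
    """Stand-in for a launch()-like function; returns the options it saw."""
    return merge_options(options, **kwargs)


print(fake_launch({'headless': True}))                  # dictionary style
print(fake_launch(headless=True))                       # keyword style
print(fake_launch({'headless': True}, headless=False))  # both, keyword wins
```

The first two calls produce the same options, which is why the two launch() spellings above are interchangeable.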

Element selector method names

In Python, $ is not a valid character in identifiers. The equivalents of puppeteer's $, $$, and $x methods are listed below, along with shorthand aliases for convenience:

| puppeteer | pyppeteer | pyppeteer shorthand |
|---|---|---|
| Page.$() | Page.querySelector() | Page.J() |
| Page.$$() | Page.querySelectorAll() | Page.JJ() |
| Page.$x() | Page.xpath() | Page.Jx() |
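
The three methods can be combined as in the sketch below. The function name summarize and the selectors are chosen for illustration only; the code assumes, as in puppeteer, that querySelector() returns None when nothing matches and that querySelectorAll() and xpath() return lists.

```python
async def summarize(page):
    """Collect a few facts about a page using the three selector methods."""
    heading = await page.querySelector('h1')   # same as page.J('h1')
    links = await page.querySelectorAll('a')   # same as page.JJ('a')
    paragraphs = await page.xpath('//p')       # same as page.Jx('//p')
    return {
        'has_heading': heading is not None,
        'link_count': len(links),
        'paragraph_count': len(paragraphs),
    }


async def main():
    from pyppeteer import launch  # requires pyppeteer to be installed

    browser = await launch()
    page = await browser.newPage()
    await page.goto('https://example.com')
    print(await summarize(page))
    await browser.close()
```

Run main() with asyncio.get_event_loop().run_until_complete(main()), as in the examples above.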

Arguments of Page.evaluate() and Page.querySelectorEval()

puppeteer's version of evaluate() takes a JavaScript function or a string representation of a JavaScript expression. pyppeteer takes a string representation of a JavaScript expression or function. pyppeteer tries to detect automatically whether the string is a function or an expression, but this detection sometimes fails. If an expression is erroneously treated as a function and an error is raised, try setting force_expr to True to force pyppeteer to treat the string as an expression.

Examples:

Get a page's textContent:

content = await page.evaluate('document.body.textContent', force_expr=True)

Get an element's textContent:

element = await page.querySelector('h1')
title = await page.evaluate('(element) => element.textContent', element)
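
The calls above can be gathered into one helper. This is a sketch under assumptions: read_page is a hypothetical name, the page is expected to contain an h1 element (querySelectorEval() raises an error when nothing matches the selector), and extra positional arguments are assumed to be forwarded to the JavaScript function as in puppeteer.

```python
async def read_page(page):
    """Evaluate an expression, a function with arguments, and a
    function applied to a selected element."""
    # A bare expression: force_expr=True skips the function/expression guess.
    title = await page.evaluate('document.title', force_expr=True)

    # A function: extra positional arguments are passed to it in order.
    total = await page.evaluate('(a, b) => a + b', 2, 3)

    # querySelectorEval() runs the function against the first element
    # that matches the selector.
    heading = await page.querySelectorEval('h1', '(el) => el.textContent')

    return {'title': title, 'total': total, 'heading': heading}
```

Call it with a page obtained as in the earlier examples, for instance print(await read_page(page)) after page.goto('https://example.com').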

Roadmap

See projects

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.