tutorcruncher / pydf

Licence: MIT license
PDF generation in python using wkhtmltopdf for heroku and docker

pydf

PDF generation in python using wkhtmltopdf.

wkhtmltopdf binaries are precompiled and included in the package, which makes pydf easier to use. In particular, this means pydf works on Heroku.

Currently uses wkhtmltopdf 0.12.5 for Ubuntu 18.04 (bionic). Requires Python 3.6+.

If you're not on Linux amd64: pydf comes bundled with a wkhtmltopdf binary which will only work on Linux amd64 architectures. If you're on another OS or architecture, your mileage may vary. It is likely that you'll need to supply your own wkhtmltopdf binary and point pydf towards it by setting the WKHTMLTOPDF_PATH environment variable.
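As a rough sketch of the platform check described above, the helper below decides whether the bundled binary is likely to work and, if not, looks for a system-wide wkhtmltopdf to point WKHTMLTOPDF_PATH at. The function names are hypothetical and not part of pydf. It also assumes the variable should be set before pydf is imported, which is the safer order in case it is read at import time.

```python
import os
import platform
import shutil


def bundled_binary_likely_works(system=None, machine=None):
    """Return True on Linux amd64, the only platform the bundled binary targets."""
    system = system or platform.system()
    machine = machine or platform.machine()
    return system == 'Linux' and machine.lower() in ('x86_64', 'amd64')


def ensure_wkhtmltopdf_path(environ=None):
    """Hypothetical helper: set WKHTMLTOPDF_PATH from the system PATH if needed.

    Returns the configured path, or None when the bundled binary should be
    used or no system-wide binary was found.
    """
    environ = os.environ if environ is None else environ
    if 'WKHTMLTOPDF_PATH' in environ:
        # An explicit setting always wins.
        return environ['WKHTMLTOPDF_PATH']
    if bundled_binary_likely_works():
        return None
    found = shutil.which('wkhtmltopdf')
    if found:
        environ['WKHTMLTOPDF_PATH'] = found
    return found
```

Calling ensure_wkhtmltopdf_path() at the top of your program, before the pydf import, keeps the override in one place.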

Install

pip install python-pdf

For python 2 use pip install python-pdf==0.30.0.

Basic Usage

import pydf
pdf = pydf.generate_pdf('<h1>this is html</h1>')
with open('test_doc.pdf', 'wb') as f:
    f.write(pdf)

Async Usage

Generating lots of documents with wkhtmltopdf can be slow, as wkhtmltopdf can only generate one document per process. To get round this, pydf uses Python 3's asyncio create_subprocess_exec to generate multiple PDFs at the same time, so the time taken to spin up processes doesn't slow you down.

import asyncio
from pathlib import Path
from pydf import AsyncPydf

async def generate_async():
    apydf = AsyncPydf()

    async def gen(i):
        # each call runs wkhtmltopdf in its own subprocess
        pdf_content = await apydf.generate_pdf('<h1>this is html</h1>')
        Path(f'output_{i:03}.pdf').write_bytes(pdf_content)

    # build all 50 coroutines, then run them concurrently
    coros = [gen(i) for i in range(50)]
    await asyncio.gather(*coros)

loop = asyncio.get_event_loop()
loop.run_until_complete(generate_async())

See benchmarks/run.py for a full example.

Locally generating an entire invoice goes from 0.372s/pdf to 0.035s/pdf with the async model.
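To illustrate the underlying mechanism, here is a minimal sketch of running several subprocesses concurrently with asyncio.create_subprocess_exec. It is not pydf's implementation: the Python interpreter stands in for the wkhtmltopdf binary so the sketch runs anywhere, the function names are made up for illustration, and asyncio.run requires Python 3.7+.

```python
import asyncio
import sys


async def run_one(i):
    # Stand-in command: a real setup would launch wkhtmltopdf here.
    proc = await asyncio.create_subprocess_exec(
        sys.executable, '-c', f'print("document {i}")',
        stdout=asyncio.subprocess.PIPE,
    )
    stdout, _ = await proc.communicate()
    return stdout


async def run_many(n):
    # All subprocesses are started before any result is awaited,
    # so their start-up costs overlap instead of adding up.
    return await asyncio.gather(*(run_one(i) for i in range(n)))


outputs = asyncio.run(run_many(5))
```

asyncio.gather returns results in the order the coroutines were passed in, so outputs[i] belongs to document i regardless of which process finished first.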

Docker

pydf is available as a docker image with a very simple http API for generating pdfs.

Simply POST (or GET with data, if possible) your HTML data to /generate.pdf.

Arguments can be passed using HTTP headers: any header starting with pdf- or pdf_ has that prefix removed, is converted to lower case and is passed to wkhtmltopdf.

For example:

docker run --rm -p 8000:80 -d samuelcolvin/pydf
curl -d '<h1>this is html</h1>' -H "pdf-orientation: landscape" http://localhost:8000/generate.pdf > created.pdf
open "created.pdf"
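The header rule described above can be sketched as a small function. This is an illustration of the stated behaviour, not the server's actual code; the function name is hypothetical, and matching the prefix case-insensitively is an assumption based on HTTP header names being case-insensitive.

```python
def headers_to_options(headers):
    """Hypothetical sketch: turn pdf-* / pdf_* headers into wkhtmltopdf options."""
    options = {}
    for name, value in headers.items():
        lowered = name.lower()
        # Assumption: the prefix is matched case-insensitively.
        if lowered.startswith(('pdf-', 'pdf_')):
            # Drop the 4-character prefix and keep the lower-cased remainder.
            options[lowered[4:]] = value
    return options


options = headers_to_options({
    'Content-Type': 'text/html',
    'Pdf-Orientation': 'landscape',
    'pdf_page_size': 'A4',
})
```

Headers without the prefix, such as Content-Type here, are ignored.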

In docker compose:

services:
  pdf:
    image: samuelcolvin/pydf

Other services can then generate PDFs by making requests to pdf/generate.pdf. Pretty cool.
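As a sketch of what such a request might look like from another Python service, the snippet below builds a POST to /generate.pdf using only the standard library. The helper names are hypothetical. The host name pdf is the compose service name from the example above, and port 80 is assumed from the port mapping in the docker run example.

```python
import urllib.request


def build_pdf_request(html, base_url='http://pdf', **options):
    """Build a POST request for /generate.pdf; options become pdf-* headers."""
    headers = {f'pdf-{name}': str(value) for name, value in options.items()}
    return urllib.request.Request(
        f'{base_url}/generate.pdf',
        data=html.encode('utf-8'),
        headers=headers,
        method='POST',
    )


def fetch_pdf(html, **options):
    """Send the request and return the response body (the PDF bytes)."""
    with urllib.request.urlopen(build_pdf_request(html, **options)) as response:
        return response.read()


request = build_pdf_request('<h1>this is html</h1>', orientation='landscape')
```

Only fetch_pdf touches the network, so build_pdf_request can be checked without the service running.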

API

generate_pdf(source, [**kwargs])

Generate a pdf from either a url or a html string.

After the source argument, all other arguments are passed straight to wkhtmltopdf.

For details on extra arguments, see the output of get_help() and get_extended_help().

All arguments whether specified or caught with extra_kwargs are converted to command line args with '--' + original_name.replace('_', '-').

Arguments which are True are passed with no value (e.g. just --quiet), False and None arguments are omitted, and everything else is passed with str(value).

Arguments:

  • source: html string to generate pdf from or url to get
  • quiet: bool
  • grayscale: bool
  • lowquality: bool
  • margin_bottom: string eg. 10mm
  • margin_left: string eg. 10mm
  • margin_right: string eg. 10mm
  • margin_top: string eg. 10mm
  • orientation: Portrait or Landscape
  • page_height: string eg. 10mm
  • page_width: string eg. 10mm
  • page_size: string: A4, Letter, etc.
  • image_dpi: int default 600
  • image_quality: int default 94
  • extra_kwargs: any exotic extra options for wkhtmltopdf

Returns the PDF content as bytes, suitable for writing to a file opened in binary mode as in the usage examples above.
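The conversion rule described for generate_pdf arguments can be sketched as follows. This is a minimal illustration of the stated rule, not the library's own code, and the function name to_cli_args is hypothetical.

```python
def to_cli_args(**kwargs):
    """Hypothetical sketch: convert keyword arguments to wkhtmltopdf-style flags."""
    args = []
    for name, value in kwargs.items():
        # False and None arguments are left out entirely.
        if value is None or value is False:
            continue
        flag = '--' + name.replace('_', '-')
        if value is True:
            # True becomes a bare flag with no value.
            args.append(flag)
        else:
            # Everything else is passed as a string value.
            args.extend([flag, str(value)])
    return args


cli_args = to_cli_args(
    quiet=True,
    grayscale=False,
    margin_top='10mm',
    image_dpi=600,
    page_size=None,
)
```

With the inputs above, cli_args is ['--quiet', '--margin-top', '10mm', '--image-dpi', '600'].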

get_version()

Get version of pydf and wkhtmltopdf binary

get_help()

Get the help string from the wkhtmltopdf binary. Uses the -h command line option.

get_extended_help()

Get the extended help string from the wkhtmltopdf binary. Uses the -H command line option.

execute_wk(*args)

Low-level function to call wkhtmltopdf. Arguments are appended to the wkhtmltopdf binary path and passed to subprocess with no processing.
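For readers unfamiliar with the pattern, passing arguments to a subprocess with no processing looks roughly like the sketch below. It is an assumption-laden illustration rather than execute_wk itself: run_binary is a hypothetical name, and the Python interpreter stands in for the wkhtmltopdf binary so the sketch runs anywhere.

```python
import subprocess
import sys


def run_binary(binary, *args):
    """Hypothetical helper: run `binary` with `args` exactly as given."""
    return subprocess.run(
        [binary, *args],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )


# Stand-in for the wkhtmltopdf binary.
result = run_binary(sys.executable, '-c', 'print("ok")')
```

Because the arguments are passed as a list rather than a shell string, no quoting or escaping is applied to them.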

Heroku

If you are deploying onto Heroku, you will need to install a couple of dependencies before wkhtmltopdf will work.

Add the Heroku buildpack https://buildpack-registry.s3.amazonaws.com/buildpacks/heroku-community/apt.tgz

Then create an Aptfile in your root directory with the dependencies:
