Yuras / Pdf Toolbox
A collection of tools for processing PDF files in Haskell
Stars: ✭ 145
Projects that are alternatives to or similar to Pdf Toolbox
Cheat Sheets
🌟 All the cheat-sheets mentioned on my blog in pdf format
Stars: ✭ 136 (-6.21%)
Mutual labels: pdf
Cs Books Pdf
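Programming e-books in PDF: a curated collection of commonly used computing e-books (high quality, with download links), covering Java, Python, Linux, Go, C, C++, data structures and algorithms, AI, computer fundamentals, interviews, design patterns, databases, front-end development, and other programming topics.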
Stars: ✭ 140 (-3.45%)
Mutual labels: pdf
Pdfplumber
Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.
Stars: ✭ 2,261 (+1459.31%)
Mutual labels: pdf
Educative.io Downloader
📖 This tool downloads courses from educative.io for offline use. It logs in with your credentials and downloads the course.
Stars: ✭ 139 (-4.14%)
Mutual labels: pdf
Pdfcreatorandroid
Simple library to generate and view PDF in Android
Stars: ✭ 128 (-11.72%)
Mutual labels: pdf
Easytable
Small table drawing library built upon Apache PDFBox
Stars: ✭ 136 (-6.21%)
Mutual labels: pdf
Pdfcropmargins
pdfCropMargins -- a program to crop the margins of PDF files
Stars: ✭ 141 (-2.76%)
Mutual labels: pdf
Net Core Docx Html To Pdf Converter
.NET Core library to create custom reports based on Word docx or HTML documents and convert to PDF
Stars: ✭ 133 (-8.28%)
Mutual labels: pdf
Pdf reports
📕 Python library and CSS theme to generate PDF reports from HTML/Pug
Stars: ✭ 142 (-2.07%)
Mutual labels: pdf
Markdown Themeable Pdf
ARCHIVED. NOT MAINTAINED. Themeable Markdown Converter (Print to PDF, HTML, JPEG or PNG)
Stars: ✭ 130 (-10.34%)
Mutual labels: pdf
Doctron
Docker-powered HTML to PDF conversion (html2pdf) and HTML to image conversion (html2image, e.g. JPEG, PNG) using a Chrome kernel (Go); it can also add watermarks to PDFs, convert PDFs to images, etc.
Stars: ✭ 141 (-2.76%)
Mutual labels: pdf
Pyecharts Snapshot
renders the output of pyecharts as png, jpeg, gif, svg, eps, pdf and raw base64
Stars: ✭ 142 (-2.07%)
Mutual labels: pdf
pdf-toolbox
A collection of tools for processing PDF files
Stable and HEAD
See the "stable" branch for the Hackage version. The current "master" branch is in the middle of an API rewrite; see here for details.
Features
- Written in Haskell
- Parsing on demand. You don't need to parse or load into memory the entire PDF file just to extract one image
- Different levels of abstraction. You can inspect the high-level (catalog, page tree, pages) or low-level (xref, trailer, object) structure of a PDF file. You can even switch between levels of detail on the fly.
- Extremely fast and memory efficient when you need to inspect only part of the document
- Reasonably fast and memory efficient in the general case
- Text extraction with exact glyph positions (mostly works, but still in progress). It can be used, e.g., to implement text selection and copying in a PDF viewer
- Full support of xref streams and object streams
- Supports editing of PDF files (incremental updates)
- Basic support for generating PDF files
- Encrypted PDF documents are partially supported
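To illustrate why parsing on demand is possible at all, here is a minimal, self-contained sketch (not pdf-toolbox's actual implementation). A PDF file ends with a `startxref` keyword followed by the byte offset of the cross-reference data, so a reader can locate the xref by inspecting only the tail of the file. The names `findStartXref`, `lastAfter`, `readTail`, and `sample`, and the 1024-byte tail size, are assumptions made for this example; it relies only on the `bytestring` package.

```haskell
import qualified Data.ByteString.Char8 as B
import Data.Char (isSpace)
import System.Environment (getArgs)
import System.IO
  (IOMode (ReadMode), SeekMode (AbsoluteSeek), hFileSize, hSeek, withBinaryFile)

-- | Return whatever follows the LAST occurrence of a (non-empty) needle.
lastAfter :: B.ByteString -> B.ByteString -> Maybe B.ByteString
lastAfter needle = go Nothing
  where
    go found hay =
      let (_, rest) = B.breakSubstring needle hay
      in if B.null rest
           then found
           else let after = B.drop (B.length needle) rest
                in go (Just after) after

-- | Given the trailing bytes of a PDF file, return the byte offset
-- written after the last "startxref" keyword, if there is one.
findStartXref :: B.ByteString -> Maybe Int
findStartXref tailBytes = do
  after <- lastAfter (B.pack "startxref") tailBytes
  (offset, _) <- B.readInt (B.dropWhile isSpace after)
  return offset

-- | Read at most n bytes from the end of a file without loading the rest.
readTail :: FilePath -> Int -> IO B.ByteString
readTail path n =
  withBinaryFile path ReadMode $ \h -> do
    size <- hFileSize h
    hSeek h AbsoluteSeek (max 0 (size - fromIntegral n))
    B.hGet h n

-- | Hypothetical tail of a PDF file, used when no path is given.
sample :: B.ByteString
sample = B.pack "trailer\n<< /Size 22 /Root 1 0 R >>\nstartxref\n18799\n%%EOF\n"

main :: IO ()
main = do
  args <- getArgs
  tailBytes <- case args of
    (path : _) -> readTail path 1024 -- tail size is an arbitrary choice
    []         -> return sample
  print (findStartXref tailBytes) -- with no arguments: Just 18799
```

From that offset a reader can jump straight to the cross-reference data and then to individual objects, which is what makes it unnecessary to load the whole file to extract a single item.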
Still on the TODO list
- Linearized PDF files
- Content stream tools: extract text, images, etc (basic implementation is already included)
- Higher-level API for incremental updates and PDF generation
Examples
(Also see the examples and viewer directories)
Inspect high level structure:
import Control.Monad (unless, when)
import Pdf.Document

main :: IO ()
main =
  withPdfFile "input.pdf" $ \pdf -> do
    -- encrypted files need a password before anything else can be read
    encrypted <- isEncrypted pdf
    when encrypted $ do
      ok <- setUserPassword pdf defaultUserPassword
      unless ok $
        fail "need password"
    doc <- document pdf
    catalog <- documentCatalog doc
    rootNode <- catalogPageNode catalog
    count <- pageNodeNKids rootNode
    print count
    -- the first page of the document
    page <- pageNodePageByNum rootNode 0
    -- extract text
    txt <- pageExtractText page
    print txt
    -- ...