dotemacs / Pdfboxing

Nice wrapper of PDFBox in Clojure

Projects that are alternatives of or similar to Pdfboxing

Bepasty Server
binary pastebin server
Stars: ✭ 111 (-9.02%)
Mutual labels:  pdf
Technology books
Premium eBook free for Geeks
Stars: ✭ 115 (-5.74%)
Mutual labels:  pdf
Moodle Downloader 2
A Moodle downloader that downloads course content fast from Moodle (eg. lecture pdfs)
Stars: ✭ 118 (-3.28%)
Mutual labels:  pdf
Gpnd
Generative Probabilistic Novelty Detection with Adversarial Autoencoders
Stars: ✭ 112 (-8.2%)
Mutual labels:  pdf
C Sharp Cheatsheet
C# Cheatsheet
Stars: ✭ 111 (-9.02%)
Mutual labels:  pdf
Cheatsheet
Pretty cheat sheets, or ``reference cards'', obtainable from Org files.
Stars: ✭ 116 (-4.92%)
Mutual labels:  pdf
Pdfsam
PDFsam, a desktop application to extract pages, split, merge, mix and rotate PDF files
Stars: ✭ 1,829 (+1399.18%)
Mutual labels:  pdf
Markdownslides
MarkdownSlides is a Reveal.js and PDF slides generator from MARKDOWN files, that also generate HTML, EPUB and DOCX documents. The idea is that from a same MARKDOWN file we can get slides and books without worrying about style, just worrying about content.
Stars: ✭ 121 (-0.82%)
Mutual labels:  pdf
Vue Pdf
vue.js pdf viewer
Stars: ✭ 1,700 (+1293.44%)
Mutual labels:  pdf
Labelmake
Declarative style JavaScript PDF generator library. Works on Node and the browser 🖨︎
Stars: ✭ 112 (-8.2%)
Mutual labels:  pdf
Studybook
Study E-Book(ComputerVision DeepLearning MachineLearning Math NLP Python ReinforcementLearning)
Stars: ✭ 1,457 (+1094.26%)
Mutual labels:  pdf
Terraform Docs As Pdf
Complete Terraform documentation (core + all official providers) as PDF files. Updating nightly.
Stars: ✭ 113 (-7.38%)
Mutual labels:  pdf
Kramdown
kramdown is a fast, pure Ruby Markdown superset converter, using a strict syntax definition and supporting several common extensions.
Stars: ✭ 1,546 (+1167.21%)
Mutual labels:  pdf
Prawn Rails
Prawn Handler for Rails. Handles and registers pdf formats.
Stars: ✭ 111 (-9.02%)
Mutual labels:  pdf
Py Pdf Parser
A Python tool to help extracting information from structured PDFs.
Stars: ✭ 120 (-1.64%)
Mutual labels:  pdf
Email To Pdf Converter
Converts email files (eml, msg) to pdf
Stars: ✭ 110 (-9.84%)
Mutual labels:  pdf
Hypertag
Knowledge Management for Humans using Machine Learning & Tags
Stars: ✭ 116 (-4.92%)
Mutual labels:  pdf
Chromehtmltopdf
Convert HTML to PDF with Chrome
Stars: ✭ 122 (+0%)
Mutual labels:  pdf
Pdf2img
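Convert PDF to images: uses JS to convert a PDF into images and compress them into a .zip for download. A demo can be viewed online. For questions, contact the author via the email address on their profile page.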
Stars: ✭ 121 (-0.82%)
Mutual labels:  pdf
React Antd Admin
Admin front-end management system, built with react, typescript, antd, dva, and several other excellent open source libraries
Stars: ✭ 117 (-4.1%)
Mutual labels:  pdf

pdfboxing

Clojure PDF manipulation library & wrapper for PDFBox.

Usage

Extract text

(require '[pdfboxing.text :as text])
(text/extract "test/pdfs/hello.pdf")

Merge multiple PDFs

(require '[pdfboxing.merge :as pdf])
(pdf/merge-pdfs :input ["test/pdfs/clojure-1.pdf" "test/pdfs/clojure-2.pdf"] :output "foo.pdf")
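
When the input list is not known up front, it can be built from a directory listing. The sketch below is only an illustration: pdf-paths is a hypothetical helper (not part of pdfboxing), and the reports directory and all.pdf output name are made up. The merge call sits in a comment form so that it is only evaluated at a REPL with pdfboxing on the classpath.

```clojure
(require '[clojure.java.io :as io]
         '[clojure.string :as str])

(defn pdf-paths
  "Hypothetical helper: returns the sorted paths of the .pdf files
  directly inside dir (an empty vector if dir does not exist)."
  [dir]
  (->> (.listFiles (io/file dir))
       (filter #(.isFile %))
       (map #(.getPath %))
       (filter #(str/ends-with? (str/lower-case %) ".pdf"))
       sort
       vec))

(comment
  ;; Evaluate at a REPL with pdfboxing available.
  (require '[pdfboxing.merge :as pdf])
  (pdf/merge-pdfs :input (pdf-paths "reports") :output "all.pdf"))
```

Sorting the paths keeps the page order of the merged file predictable across runs.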

Merge multiple images into single PDF

Use merge-images-from-path to supply the images as a vector of path strings, or merge-images-from-byte-array to supply them as a vector of byte arrays. Each image is placed on its own page.

(require '[pdfboxing.merge :as pdf])
(pdf/merge-images-from-path ["image1.png" "image2.png"] "output.pdf")
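
The byte-array variant is not shown above, so here is a hedged sketch of how it might be called. It assumes merge-images-from-byte-array takes its arguments in the same order as merge-images-from-path (the images first, then the output path). file->bytes is a hypothetical helper used only to produce byte arrays for the example.

```clojure
(require '[clojure.java.io :as io])
(import '(java.nio.file Files))

(defn file->bytes
  "Hypothetical helper: reads the whole file at path into a byte array."
  [path]
  (Files/readAllBytes (.toPath (io/file path))))

(comment
  ;; Evaluate at a REPL with pdfboxing available.
  ;; Assumption: same argument order as merge-images-from-path.
  (require '[pdfboxing.merge :as pdf])
  (pdf/merge-images-from-byte-array
   (mapv file->bytes ["image1.png" "image2.png"])
   "output.pdf"))
```

This form is mainly useful when the images arrive as bytes already, for example from an upload or a database, and never touch the disk.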

Split a PDF into multiple PDDocuments

(require '[pdfboxing.split :as pdf])

Return pages 1 through 8 as a list of PDDocuments:

(pdf/split-pdf :input "test/pdfs/multi-page.pdf" :start 1 :end 8)

Split the PDF into single pages, returned as a list of PDDocuments:

(pdf/split-pdf :input "test/pdfs/multi-page.pdf")
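
Since split-pdf returns PDDocuments rather than writing files, saving the pages is left to the caller. The following is a minimal sketch, assuming the returned documents are still open and using PDFBox's own PDDocument save and close methods through Java interop. save-pages! and the page prefix are hypothetical names, not part of pdfboxing.

```clojure
(defn save-pages!
  "Hypothetical helper: saves each PDDocument in docs as
  <prefix>-<n>.pdf (n starts at 1), closes it, and returns the
  vector of written paths."
  [docs prefix]
  (vec
   (map-indexed
    (fn [i doc]
      (let [path (str prefix "-" (inc i) ".pdf")]
        ;; with-open closes the document once it has been saved.
        (with-open [d doc]
          (.save d path))
        path))
    docs)))

(comment
  ;; Evaluate at a REPL with pdfboxing available.
  (require '[pdfboxing.split :as pdf])
  (save-pages! (pdf/split-pdf :input "test/pdfs/multi-page.pdf") "page"))
```

Closing each document after saving avoids holding every page in memory for longer than needed.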

Split the PDF in half and write the halves to disk as multi-page-1.pdf and multi-page-2.pdf:

(pdf/split-pdf-at :input "test/pdfs/multi-page.pdf")

Split into two PDFs, the first containing 5 pages and the second containing the rest:

(pdf/split-pdf-at :input "test/pdfs/multi-page.pdf" :split 5)

List form fields of a PDF

To list fields and values:

(require '[pdfboxing.form :as form])
(form/get-fields "test/pdfs/interactiveform.pdf")
{"Emergency_Phone" "", "ZIP" "", "COLLEGE NO DEGREE" "", ...}
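
Because get-fields returns an ordinary map, the usual Clojure sequence functions apply to it. As an illustration, the sketch below lists the fields that are still empty. blank-fields is a hypothetical helper, not part of pdfboxing, and it only assumes the map shape shown above.

```clojure
(require '[clojure.string :as str])

(defn blank-fields
  "Hypothetical helper: returns the sorted names of the fields whose
  value is nil or blank."
  [fields]
  (->> fields
       (filter (fn [[_ v]] (str/blank? (str v))))
       (map key)
       sort
       vec))

(comment
  ;; Evaluate at a REPL with pdfboxing available.
  (require '[pdfboxing.form :as form])
  (blank-fields (form/get-fields "test/pdfs/interactiveform.pdf")))
```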

Fill in PDF forms

To fill in a form's fields, supply a hash map of field names to the desired values. The call below creates a copy of fillable.pdf named new.pdf with the fields filled in:

(require '[pdfboxing.form :as form])
(form/set-fields "test/pdfs/fillable.pdf" "test/pdfs/new.pdf" {"Text10" "My first name"})
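
This README does not say how set-fields treats a name that is missing from the form, so it can be worth checking the names first. The sketch below compares the keys to be set against the result of get-fields. unknown-fields is a hypothetical helper, and throwing ex-info is just one possible way to react.

```clojure
(require '[clojure.set :as set])

(defn unknown-fields
  "Hypothetical helper: returns the set of keys in values that are
  not field names in fields."
  [fields values]
  (set/difference (set (keys values)) (set (keys fields))))

(comment
  ;; Evaluate at a REPL with pdfboxing available.
  (require '[pdfboxing.form :as form])
  (let [in      "test/pdfs/fillable.pdf"
        values  {"Text10" "My first name"}
        missing (unknown-fields (form/get-fields in) values)]
    (if (seq missing)
      (throw (ex-info "Unknown form fields" {:fields missing}))
      (form/set-fields in "test/pdfs/new.pdf" values))))
```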

Rename form fields of a PDF

To rename PDF form fields, supply a hash map where the keys are the current names and the values are the new names:

(require '[pdfboxing.form :as form])
(form/rename-fields "test/pdfs/interactiveform.pdf" "test/pdfs/addr1.pdf" {"Address_1" "NewAddr"})

Get page count of a PDF document

(require '[pdfboxing.info :as info])
(info/page-number "test/pdfs/interactiveform.pdf")

Get info about a PDF document

Returns metadata such as the title, author, subject, keywords, creator, and producer:

(require '[pdfboxing.info :as info])
(info/about-doc "test/pdfs/interactiveform.pdf")

Draw lines on a PDF document

Supply the input PDF document, a name for the output PDF document, and the coordinates of the line together with the number of the page on which to draw it:

(require '[pdfboxing.draw :as draw])
(draw/draw-line :input-pdf "test/pdfs/clojure-1.pdf"
                :output-pdf "ninja.pdf"
                :coordinates {:page-number 0
                              :x 0
                              :y 160
                              :x1 650
                              :y1 160})
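
For lines that are always horizontal, the :coordinates map can be built by a small function so that :y and :y1 cannot drift apart. This is only a convenience sketch. horizontal-line is a hypothetical helper, and it uses the same keys as the call above.

```clojure
(defn horizontal-line
  "Hypothetical helper: builds a :coordinates map for a horizontal
  line at height y running from x-start to x-end."
  [page-number y x-start x-end]
  {:page-number page-number
   :x  x-start
   :y  y
   :x1 x-end
   :y1 y})

(comment
  ;; Evaluate at a REPL with pdfboxing available.
  (require '[pdfboxing.draw :as draw])
  (draw/draw-line :input-pdf "test/pdfs/clojure-1.pdf"
                  :output-pdf "ninja.pdf"
                  :coordinates (horizontal-line 0 160 0 650)))
```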

Compatibility with PDFBox's PDDocuments

The following functions shown above also accept PDFBox's PDDocument type directly:

  • text/extract
  • pdf/split-pdf
  • form/get-fields
  • form/set-fields
  • form/rename-fields
  • info/page-number
  • draw/draw-line

This lets you pass a PDDocument wherever these functions take a file path as input. For example, to split a PDF into pages and then extract the text from only the third page:

(require '[pdfboxing.text :as text])
(require '[pdfboxing.split :as split])
(-> (split/split-pdf :input "test/pdfs/multi-page.pdf")
    (nth 2)
    text/extract)