
pankajr141 / pdf2jpg

Licence: other
Utility to convert PDF into JPG files

Programming Languages

java
python

Projects that are alternatives of or similar to pdf2jpg

Remarks
Extract highlights, scribbles, and annotations from PDFs marked with the reMarkable tablet. Export to Markdown, PDF, PNG, and SVG
Stars: ✭ 94 (+141.03%)
Mutual labels:  pdf-converter
Doctron
Docker-powered HTML-to-PDF (html2pdf) and HTML-to-image (html2image: JPEG, PNG) conversion using the Chrome kernel (Golang); also adds watermarks to PDFs, converts PDFs to images, etc.
Stars: ✭ 141 (+261.54%)
Mutual labels:  pdf-converter
Npm Pdfreader
🚜 Read text and parse tables from PDF files.
Stars: ✭ 225 (+476.92%)
Mutual labels:  pdf-converter
Pdf2docx
Parse PDF file with PyMuPDF and generate docx with python-docx
Stars: ✭ 114 (+192.31%)
Mutual labels:  pdf-converter
Hrconvert2
A self-hosted, drag-and-drop, & nosql file conversion server that supports 62x file formats.
Stars: ✭ 132 (+238.46%)
Mutual labels:  pdf-converter
Docviewer
Document/file viewer (supports Word, Excel, PDF, RTF, and other file formats, stored locally or shared from other apps)
Stars: ✭ 155 (+297.44%)
Mutual labels:  pdf-converter
Laravel Pdf
A Simple package for easily generating PDF documents from HTML. This package is specially for laravel but you can use this without laravel.
Stars: ✭ 79 (+102.56%)
Mutual labels:  pdf-converter
node-poppler
Asynchronous node.js wrapper for the Poppler PDF rendering library
Stars: ✭ 97 (+148.72%)
Mutual labels:  pdf-converter
Pdfcropmargins
pdfCropMargins -- a program to crop the margins of PDF files
Stars: ✭ 141 (+261.54%)
Mutual labels:  pdf-converter
Athenapdf
Drop-in replacement for wkhtmltopdf built on Go, Electron and Docker
Stars: ✭ 2,160 (+5438.46%)
Mutual labels:  pdf-converter
Ptext Release
pText is a library for reading, creating and manipulating PDF files in python.
Stars: ✭ 124 (+217.95%)
Mutual labels:  pdf-converter
Net Core Docx Html To Pdf Converter
.NET Core library to create custom reports based on Word docx or HTML documents and convert to PDF
Stars: ✭ 133 (+241.03%)
Mutual labels:  pdf-converter
Aws Lambda Wkhtmltopdf
Convert HTML to PDF using Webkit (QtWebKit) on AWS Lambda
Stars: ✭ 165 (+323.08%)
Mutual labels:  pdf-converter
Email To Pdf Converter
Converts email files (eml, msg) to pdf
Stars: ✭ 110 (+182.05%)
Mutual labels:  pdf-converter
Stapler
A small utility making use of the pypdf library to provide a (somewhat) lighter alternative to pdftk
Stars: ✭ 238 (+510.26%)
Mutual labels:  pdf-converter
Gotenberg Php Client
PHP client for the Gotenberg API
Stars: ✭ 80 (+105.13%)
Mutual labels:  pdf-converter
Aws Lambda Libreoffice
85 MB LibreOffice to fit inside AWS Lambda compressed with Brotli
Stars: ✭ 145 (+271.79%)
Mutual labels:  pdf-converter
WeReadScan
A crawler that scans purchased books on WeRead (微信读书) and downloads them as local PDFs
Stars: ✭ 273 (+600%)
Mutual labels:  pdf-converter
Wkhtmltopdf
C# wrapper around excellent wkhtmltopdf console utility.
Stars: ✭ 243 (+523.08%)
Mutual labels:  pdf-converter
Golang Html To Pdf Converter
Golang HTML to PDF Converter
Stars: ✭ 177 (+353.85%)
Mutual labels:  pdf-converter

Building the jar file from source

Maven is used to build the package. By default, PDFBox does not include a converter for certain images embedded in PDFs (the jar named below is the JBIG2 ImageIO plugin). To add support, put the jar provided in the project's data/dependency directory on your classpath, then run the Maven compile.

Dependency Jar location - pdf2jpg/data/dependency/jbig2-imageio-3.0.0-SNAPSHOT.jar

Below is the entry in pom.xml for this jar file.

	<dependency> 
	    <groupId>org.apache.pdfbox</groupId>
	    <artifactId>jbig2-imageio</artifactId>
	    <version>3.0.0-SNAPSHOT</version>
	    <type>jar</type> <!-- The artifact is resolved from a jar file; add that jar to the classpath -->
	</dependency>

After adding the entry above to pom.xml, add the jar to your classpath and you are good to go.

Installation

Only Python 3 is supported.

pip3 install pdf2jpg

Also make sure Java is installed on the system: entering java in a terminal should run without throwing an error.
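
The same check can be scripted. This is a minimal sketch, assuming the `java` executable is expected on `PATH`; the helper name `java_available` is hypothetical and is not part of pdf2jpg.

```python
import shutil
import subprocess


def java_available(executable="java"):
    """Return True if `executable` is on PATH and `<executable> -version` exits with 0."""
    path = shutil.which(executable)
    if path is None:
        return False
    try:
        completed = subprocess.run(
            [path, "-version"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return completed.returncode == 0


if __name__ == "__main__":
    print("java found" if java_available() else "java not found; install Java first")
```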

The demo video shows installation and usage.

Usage

The utility can be run in two ways: through the Python bindings or directly through the jar.

Python bindings

Conversion of a PDF into JPG files

from pdf2jpg import pdf2jpg
inputpath = r"D:\inputdir\pdf1.pdf"
outputpath = r"D:\outputdir"
# To convert a single page (page numbers are zero-based, so "1" is the second page)
result = pdf2jpg.convert_pdf2jpg(inputpath, outputpath, dpi=300, pages="1")
print(result)

# To convert multiple pages
result = pdf2jpg.convert_pdf2jpg(inputpath, outputpath, dpi=300, pages="1,0,3")
print(result)

# To convert all pages
result = pdf2jpg.convert_pdf2jpg(inputpath, outputpath, dpi=300, pages="ALL")
print(result)

Output results

[{
    'cmd': 'java -jar D:\\pdf2jpg-bindings\\pdf2jpg\\pdf2jpg.jar -i "D:\\inputdir\\pdf1.pdf" -o "D:\\outputdir" -d 300 -p 0,1,2,3',
    'input_path': 'D:\\inputdir\\pdf1.pdf',
    'output_jpgfiles': [
        'D:\\outputdir\\pdf1.pdf\\0_pdf1.pdf.jpg',
        'D:\\outputdir\\pdf1.pdf\\1_pdf1.pdf.jpg',
        'D:\\outputdir\\pdf1.pdf\\2_pdf1.pdf.jpg',
        'D:\\outputdir\\pdf1.pdf\\3_pdf1.pdf.jpg'
    ],
    'output_pdfpath': 'D:\\outputdir\\pdf1.pdf'
}]
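
The result shown above is a list of dictionaries, so the generated image paths can be collected with a small loop. This is a sketch that relies only on the `output_jpgfiles` key visible in that output; the helper name `list_output_images` and the sample paths are hypothetical.

```python
def list_output_images(result):
    """Collect every JPG path from a result shaped like the output above."""
    images = []
    for entry in result:
        # Entries without the key contribute nothing instead of raising.
        images.extend(entry.get("output_jpgfiles", []))
    return images


# Illustrative sample with the same keys as the real output (paths are made up).
sample = [{
    "input_path": "inputdir/pdf1.pdf",
    "output_jpgfiles": [
        "outputdir/pdf1.pdf/0_pdf1.pdf.jpg",
        "outputdir/pdf1.pdf/1_pdf1.pdf.jpg",
    ],
    "output_pdfpath": "outputdir/pdf1.pdf",
}]

for image_path in list_output_images(sample):
    print(image_path)
```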

Conversion of a readable PDF into a PDF of scanned images. In the converted PDF, the user will not be able to select any text.

from pdf2jpg import pdf2jpg
inputpath = r"D:\inputdir\pdf1.pdf"
outputpath = r"D:\outputdir\pdf1.pdf"

# To convert pdf to imgpdf
result = pdf2jpg.convert_pdf2imgpdf(inputpath, outputpath, dpi=300)
print(result)

Directly through jar - data/pdf2jpg.jar

To use the jar directly, run one of the commands below.

To convert a single PDF page to an image (e.g., the command below converts the 3rd page, since page numbers are zero-based)
$ java -jar data/pdf2jpg.jar -i path_to_pdf -o output_directory -d 300 -p 2

To convert multiple PDF pages to images
$ java -jar data/pdf2jpg.jar -i path_to_pdf -o output_directory -d 300 -p 0,1,2,3

To convert all PDF pages to images
$ java -jar data/pdf2jpg.jar -i path_to_pdf -o output_directory -d 300 -p ALL
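
The jar commands above can also be assembled from a script. This is a minimal sketch using only the `-i`, `-o`, `-d`, and `-p` flags shown in those commands; the helper names `pages_argument`, `build_command`, and `run_conversion` are hypothetical and not part of pdf2jpg. It assumes zero-based page numbers, as in the 3rd-page example where `-p 2` is used.

```python
import subprocess


def pages_argument(page_numbers):
    """Turn 1-based page numbers into the zero-based, comma-separated -p value."""
    if any(n < 1 for n in page_numbers):
        raise ValueError("page numbers start at 1")
    return ",".join(str(n - 1) for n in page_numbers)


def build_command(jar_path, pdf_path, output_dir, dpi=300, pages="ALL"):
    """Build the argument list for the jar; `pages` is a -p value such as "2" or "ALL"."""
    return [
        "java", "-jar", jar_path,
        "-i", pdf_path,
        "-o", output_dir,
        "-d", str(dpi),
        "-p", pages,
    ]


def run_conversion(jar_path, pdf_path, output_dir, dpi=300, pages="ALL"):
    """Run the jar and raise CalledProcessError if java exits with a non-zero status."""
    command = build_command(jar_path, pdf_path, output_dir, dpi, pages)
    return subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.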

To do

  • Bulk mode implementation in Java, to convert a directory instead of a single PDF [yet to decide whether to use multithreading or multiprocessing]
  • Python bindings
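
Until a bulk mode exists, a directory can be converted from the Python side. This is one possible sketch, not the planned Java implementation: the helper `convert_directory` is hypothetical, and the per-file conversion is passed in as a callable so that nothing about the pdf2jpg internals is assumed. Threads are used on the assumption that each conversion mostly waits on an external `java` process, as the `cmd` entry in the output results suggests.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def convert_directory(input_dir, convert_one, max_workers=4):
    """Call convert_one(pdf_path) for every *.pdf in input_dir; return {path: result}."""
    # The glob is not recursive, and it is case-sensitive on most non-Windows systems.
    pdf_paths = sorted(str(p) for p in Path(input_dir).glob("*.pdf"))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map keeps results in the same order as pdf_paths.
        results = list(pool.map(convert_one, pdf_paths))
    return dict(zip(pdf_paths, results))
```

For example, `convert_one` could be a small function that calls `pdf2jpg.convert_pdf2jpg(path, outputpath, dpi=300, pages="ALL")` for each path.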