Asutosh11 / Documentreader

License: MIT
This library reads Word documents (.doc and .docx), plain-text (.txt) files, and PDF files, and returns the content of the document as a String.

DocumentReader

If you have ever tried to read the contents of a PDF or MS Word document on Android, you know how painful it is. This library reduces the task to a single function call.


Dependency for build.gradle (Module: app)


repositories {
  ...
  maven { url 'https://jitpack.io' }
}

dependencies {
  ...
  implementation 'com.github.Asutosh11:DocumentReader:0.12'

  // NOTE: add this only if you get a multidex exception
  implementation 'androidx.multidex:multidex:2.0.1'
}

// The two blocks below belong inside the android { } block of the module's build.gradle
android {
  // NOTE: add this only if you get a multidex exception
  defaultConfig {
    ...
    multiDexEnabled true
  }

  // NOTE: add this only if you get an error like "More than one file was found with OS independent path"
  packagingOptions {
    exclude 'META-INF/DEPENDENCIES'
    exclude 'META-INF/INDEX.LIST'
    exclude 'META-INF/spring.handlers'
    exclude 'META-INF/spring.schemas'
    exclude 'META-INF/cxf/bus-extensions.txt'
  }
}
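
For projects that use the Gradle Kotlin DSL (build.gradle.kts) instead of Groovy, the same configuration might look roughly like the sketch below. It assumes Android Gradle Plugin 7.x, where resource exclusions are written through packagingOptions.resources.excludes; other plugin versions may need a different syntax. The multidex and packaging entries remain optional and are only needed if you hit the corresponding errors.

```kotlin
// build.gradle.kts (Module: app) -- a sketch, assuming Android Gradle Plugin 7.x
repositories {
    // JitPack hosts the DocumentReader artifact
    maven { url = uri("https://jitpack.io") }
}

android {
    defaultConfig {
        // Only needed if you get a multidex exception
        multiDexEnabled = true
    }

    // Only needed if you get "More than one file was found with OS independent path"
    packagingOptions {
        resources {
            excludes += setOf(
                "META-INF/DEPENDENCIES",
                "META-INF/INDEX.LIST",
                "META-INF/spring.handlers",
                "META-INF/spring.schemas",
                "META-INF/cxf/bus-extensions.txt"
            )
        }
    }
}

dependencies {
    implementation("com.github.Asutosh11:DocumentReader:0.12")

    // Only needed if you get a multidex exception
    implementation("androidx.multidex:multidex:2.0.1")
}
```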


How to use it?


// Read a PDF file from a Uri
val pdfFromUri: String = DocumentReaderUtil.readPdfFromUri(fileUri, applicationContext)
// Read a PDF file from a File
val pdfFromFile: String = DocumentReaderUtil.readPdfFromFile(file, applicationContext)
// Read a .doc or .docx file from a Uri (the same function handles both formats)
val wordFromUri: String = DocumentReaderUtil.readWordDocFromUri(fileUri, applicationContext)
// Read a .doc or .docx file from a File (the same function handles both formats)
val wordFromFile: String = DocumentReaderUtil.readWordDocFromFile(file, applicationContext)
// Read a .txt file from a Uri
val txtFromUri: String = DocumentReaderUtil.readTxtFromUri(fileUri, applicationContext)
/*
 If you don't know the file type in advance, ask the library for the file's
 MIME type and call the matching reader. Unsupported types yield an empty String here.
*/
val docString: String = when (DocumentReaderUtil.getMimeType(fileUri, applicationContext)) {
    "text/plain" -> DocumentReaderUtil.readTxtFromUri(fileUri, applicationContext)
    "application/pdf" -> DocumentReaderUtil.readPdfFromUri(fileUri, applicationContext)
    "application/msword" -> DocumentReaderUtil.readWordDocFromUri(fileUri, applicationContext)
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document" ->
        DocumentReaderUtil.readWordDocFromUri(fileUri, applicationContext)
    else -> ""
}
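
The MIME-type dispatch above returns an empty String for unsupported types, so a caller cannot tell an unsupported file from an empty document. One way around that is sketched below. readByMimeType and MIME_DOCX are hypothetical names, not part of DocumentReader, and the reader calls are passed in as lambdas so the sketch runs on a plain JVM without Android. It also accepts a null MIME type, on the assumption that detection might fail; the source does not say what getMimeType returns in that case.

```kotlin
// Hypothetical helper (not part of DocumentReader): picks a reader by MIME type and
// returns null for unsupported types, so callers can tell "unsupported" from "empty".
const val MIME_DOCX =
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document"

fun readByMimeType(
    mimeType: String?,
    readTxt: () -> String,
    readPdf: () -> String,
    readWord: () -> String
): String? = when (mimeType) {
    "text/plain" -> readTxt()
    "application/pdf" -> readPdf()
    // .doc and .docx go through the same Word reader
    "application/msword", MIME_DOCX -> readWord()
    else -> null
}

fun main() {
    // Stand-in lambdas; in an app each one would wrap the matching library call,
    // e.g. { DocumentReaderUtil.readPdfFromUri(fileUri, applicationContext) }.
    val pdf = readByMimeType("application/pdf", { "txt" }, { "pdf" }, { "word" })
    val png = readByMimeType("image/png", { "txt" }, { "pdf" }, { "word" })

    println(pdf) // prints "pdf"
    println(png) // prints "null"
}
```

Because the readers are lambdas, only the one that matches the MIME type is invoked. The caller decides what to do with a null result, for example showing an "unsupported file type" message.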

Thanks

The Apache Tika project
Apache's PdfBox port by TomRoush