hrbrmstr / pubcrawl

Licence: other
🍺📖 Convert 'epub' Files to Text (Use https://github.com/ropensci/epubr instead)

Projects that are alternatives to or similar to pubcrawl

read-offline
Read Offline allows you to download or print posts and pages. You can download the post as PDF, ePub and mobi
Stars: ✭ 28 (+27.27%)
Mutual labels:  epub
PanBook
Pandoc LaTeX and EPUB templates for generating books, slides (beamer), résumés, theses, etc. (cv, thesis, ebook, beamer)
Stars: ✭ 190 (+763.64%)
Mutual labels:  epub
react-native-ebook
React Native E-book (.mobi, .epub)
Stars: ✭ 45 (+104.55%)
Mutual labels:  epub
fimfic2epub
📚 Chrome/Firefox extension & npm package for improved EPUB export on fimfiction.net
Stars: ✭ 17 (-22.73%)
Mutual labels:  epub
Epub
Learn how to make an EPub e-book reader step by step.
Stars: ✭ 22 (+0%)
Mutual labels:  epub
plato
Document reader
Stars: ✭ 117 (+431.82%)
Mutual labels:  epub
kaf-cli
A command-line tool that converts txt files into epub and mobi e-books (formerly TmdTextEpub)
Stars: ✭ 133 (+504.55%)
Mutual labels:  epub
rePocketable
Tool to fetch articles from (getPocket|the web) and turn them into epub
Stars: ✭ 49 (+122.73%)
Mutual labels:  epub
jEpub
Simple EPUB builder library, works in modern browsers.
Stars: ✭ 33 (+50%)
Mutual labels:  epub
epub kitty
a beautiful flutter epub reader!
Stars: ✭ 49 (+122.73%)
Mutual labels:  epub
SDKLauncher-OSX
A small OS X application to serve as a launcher/testbed for the Readium SDK on the Mac.
Stars: ✭ 21 (-4.55%)
Mutual labels:  epub
readiator
a cross-platform epub reader app
Stars: ✭ 30 (+36.36%)
Mutual labels:  epub
Kingpin
📖 EPUB and PDF versions of a translation of the book Kingpin (Kevin Poulsen)
Stars: ✭ 18 (-18.18%)
Mutual labels:  epub
epub-parser
A powerful yet easy-to-use epub parser
Stars: ✭ 103 (+368.18%)
Mutual labels:  epub
EveReader
Epub Reader, focused on annotation.
Stars: ✭ 68 (+209.09%)
Mutual labels:  epub
epub3guide
Taiwan EPUB3 production guidelines
Stars: ✭ 95 (+331.82%)
Mutual labels:  epub
readium-css
🌈 A set of reference stylesheets for EPUB Reading Systems, starting with Readium Mobile
Stars: ✭ 78 (+254.55%)
Mutual labels:  epub
epub-viewer
android epub viewer
Stars: ✭ 32 (+45.45%)
Mutual labels:  epub
acsm-calibre-plugin
Calibre plugin for ACSM->EPUB and ACSM->PDF conversion.
Stars: ✭ 118 (+436.36%)
Mutual labels:  epub
iRead
iRead is an EPUB reader for iOS written in Swift
Stars: ✭ 83 (+277.27%)
Mutual labels:  epub

*** IMPORTANT ***

No further development will occur in this package, as it has been superseded by the actively maintained (and quite spiffy!) epubr package.


pubcrawl

Convert ‘epub’ Files to Text

Description

The ‘epub’ file format is really just a structured ‘ZIP’ archive with metadata, graphics and (usually) ‘HTML’ text. Tools are provided to turn an ‘epub’ file into a tidy data frame.
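
Because an epub is a ZIP archive, base R can already inspect one without any extra packages. The following is a minimal sketch, not pubcrawl's own implementation: html_entries() is a hypothetical helper introduced here for illustration, and the path in the usage lines is only a placeholder.

```r
# Hypothetical helper (not part of pubcrawl): given an archive listing in the
# shape returned by utils::unzip(list = TRUE) (columns Name, Length, Date),
# keep only the entries that look like (X)HTML content files.
html_entries <- function(listing) {
  is_html <- grepl("\\.(x?html?)$", listing$Name, ignore.case = TRUE)
  listing[is_html, , drop = FALSE]
}

# Usage: list the archive contents without extracting anything.
# The path below is a placeholder; the block is skipped if the file is absent.
epub_path <- "~/Data/R Packages.epub"
if (file.exists(epub_path)) {
  listing <- utils::unzip(epub_path, list = TRUE)
  print(html_entries(listing))
}
```

Filtering on the file extension is a shortcut; the authoritative list of content documents lives in the package manifest inside the archive, which this sketch does not read.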

What’s Inside The Tin

The following functions are implemented:

  • epub_to_text: Convert an epub file into a data frame of plaintext chapters
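
To give a rough idea of what "plaintext chapters in a data frame" means, here is a deliberately crude base-R sketch. It is an assumption-laden illustration, not how epub_to_text is implemented: html_to_text() and chapters_to_df() are hypothetical names, the tag stripping uses regular expressions rather than a real HTML parser, and HTML entities such as &amp; are left undecoded.

```r
# Hypothetical helper (not pubcrawl's code): crude tag stripping with regular
# expressions. A real HTML parser would be more robust.
html_to_text <- function(html) {
  # Drop <script> and <style> blocks together with their contents.
  txt <- gsub("(?is)<(script|style)\\b.*?</\\1>", " ", html, perl = TRUE)
  txt <- gsub("<[^>]+>", " ", txt)       # replace remaining tags with spaces
  txt <- gsub("[[:space:]]+", " ", txt)  # collapse runs of whitespace
  trimws(txt)
}

# Assemble one row per chapter, loosely mirroring the path and content
# columns of the data frame that epub_to_text() returns.
chapters_to_df <- function(paths, html) {
  data.frame(
    path = paths,
    content = html_to_text(html),
    stringsAsFactors = FALSE
  )
}

# Usage with two inline, made-up chapter documents.
chapters_to_df(
  c("OEBPS/ch01.html", "OEBPS/ch02.html"),
  c("<html><body><h1>Introduction</h1><p>In R, packages matter.</p></body></html>",
    "<html><body><h1>Package  Structure</h1></body></html>")
)
```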

NOTE

There are edge cases I’ve totally not covered yet. Feel free to jump in and make this a real, useful package!

TODO

  • Refactor so there aren’t so many heavy dependencies
  • Try to get hgr on CRAN so it’s not a GH dep (update: moved the cleaner code into here)
  • Better docs
  • Embed some epubs for examples and tests
  • Set up Travis, AppVeyor, code coverage

Installation

devtools::install_github("hrbrmstr/pubcrawl")

Usage

library(pubcrawl)
library(tidyverse)

# current version
packageVersion("pubcrawl")
## [1] '0.1.0'

An O’Reilly epub

epub_to_text("~/Data/R Packages.epub")
## # A tibble: 26 x 4
##    path                         size date                content                                                       
##    <chr>                       <dbl> <dttm>              <chr>                                                         
##  1 OEBPS/cover.html              315 2015-03-24 21:49:16 Cover                                                         
##  2 OEBPS/titlepage01.html        466 2015-03-24 21:49:16 "R Packages\n\nHadley Wickham"                                
##  3 OEBPS/copyright-page01.html  3286 2015-03-24 21:49:16 "R Packages\n\nby Hadley  Wickham\n\n\n\nPrinted in the Unite…
##  4 OEBPS/toc01.html            17557 2015-03-24 21:49:16 "navPrefaceIn This Book\n\nConventions Used in This Book\n\nU…
##  5 OEBPS/preface01.html        17784 2015-03-24 21:49:16 "Preface\n\n\nIn This Book\n\nThis book will guide you from b…
##  6 OEBPS/part01.html             444 2015-03-24 21:49:16 Getting Started                                               
##  7 OEBPS/ch01.html             12007 2015-03-24 21:49:16 "Introduction\n\nIn R, the fundamental unit of shareable code…
##  8 OEBPS/ch02.html             28633 2015-03-24 21:49:18 "Package Structure\n\nThis chapter will start you on the road…
##  9 OEBPS/part02.html             454 2015-03-24 21:49:18 Package Components                                            
## 10 OEBPS/ch03.html             28629 2015-03-24 21:49:18 "R Code\n\nThe first principle of using a package is that all…
## # ... with 16 more rows

A Project Gutenberg epub that comes with the package

epub_to_text(system.file("extdat", "augustine.epub", package="pubcrawl")) %>% 
  mutate(path = abbreviate(path))
## # A tibble: 10 x 4
##    path                             size date                content                                                   
##    <chr>                           <dbl> <dttm>              <chr>                                                     
##  1 OEBPS/@@@@@@@3296@3296-@3296--0 63804 2017-10-02 07:00:00 "THE CONFESSIONS\nOF\nSAINT AUGUSTINE\n\nBy Saint Augusti…
##  2 OEBPS/@@@@@@@3296@3296-@3296--1 68504 2017-10-02 07:00:00 "BOOK III\nTo Carthage I came, where there sang all aroun…
##  3 OEBPS/@@@@@@@3296@3296-@3296--2 80192 2017-10-02 07:00:00 "BOOK V\nAccept the sacrifice of my confessions from the …
##  4 OEBPS/@@@@@@@3296@3296-@3296--3 51898 2017-10-02 07:00:00 "O crooked paths! Woe to the audacious soul, which hoped,…
##  5 OEBPS/@@@@@@@3296@3296-@3296--4 80194 2017-10-02 07:00:00 "Anubis, barking Deity, and all         The monster Gods …
##  6 OEBPS/@@@@@@@3296@3296-@3296--5 80718 2017-10-02 07:00:00 "The boy then being stilled from weeping, Euodius took up…
##  7 OEBPS/@@@@@@@3296@3296-@3296--6 65956 2017-10-02 07:00:00 "And Thou knowest how far Thou hast already changed me, w…
##  8 OEBPS/@@@@@@@3296@3296-@3296--7 57022 2017-10-02 07:00:00 "BOOK XII\nMy heart, O Lord, touched with the words of Th…
##  9 OEBPS/@@@@@@@3296@3296-@3296--8 69513 2017-10-02 07:00:00 "BOOK XIII\nI call upon Thee, O my God, my mercy, Who cre…
## 10 OEBPS/@@@@@@@3296@3296-@3296--9 21223 2017-10-02 07:00:00 "The Confessions of Saint Augustine, by Saint Augustine\n…
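
Since the result is an ordinary data frame with one row per chapter, downstream analysis needs nothing special. As a small sketch, the following counts words per chapter in base R. The chapters object is a stand-in with the same path and content column names and abbreviated sample text, and count_words() is a hypothetical helper, not something pubcrawl exports.

```r
# Stand-in for epub_to_text() output: same column names, abbreviated text.
chapters <- data.frame(
  path = c("OEBPS/ch01.html", "OEBPS/ch02.html"),
  content = c(
    "Introduction\n\nIn R, the fundamental unit",
    "Package Structure\n\nThis chapter"
  ),
  stringsAsFactors = FALSE
)

# Hypothetical helper: count whitespace-separated words in each string.
count_words <- function(x) {
  lengths(strsplit(trimws(x), "[[:space:]]+"))
}

chapters$n_words <- count_words(chapters$content)
chapters[, c("path", "n_words")]
```

With a real epub, the same two lines at the end should work on the data frame returned by epub_to_text, assuming its content column is a character vector as shown in the output above.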

Code of Conduct

Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.