mkearney / readthat

License: MIT (LICENSE.md); the repository also contains a LICENSE file of unrecognized type
Read Text Data

Programming Languages: R, C++

readthat

Lifecycle: experimental

Quickly read text/source from local files and web pages.

Installation

You can install the development version of readthat from GitHub with:

remotes::install_github("mkearney/readthat")

Examples

Let’s say we want to read in the source of the following websites:

## a vector of URLs
urls <- c(
  "https://mikewk.com",
  "https://cnn.com",
  "https://www.cnn.com/us"
)

Use readthat::read() to read the text/source of a single file or URL:

## read single web/file (returns text vector)
x <- read(urls[1])

## preview output
substr(x, 1, 60)
#> [1] "<!DOCTYPE html>\n<html lang=\"en\">\n<head>\n\n  <meta charset=\"ut"

## use apply functions to read multiple pages
xx <- sapply(urls, read)

## preview output
lapply(xx, substr, 1, 60)
#> $`https://mikewk.com`
#> [1] "<!DOCTYPE html>\n<html lang=\"en\">\n<head>\n\n  <meta charset=\"ut"
#> 
#> $`https://cnn.com`
#> [1] "<!DOCTYPE html><html class=\"no-js\"><head><meta content=\"IE=e"
#> 
#> $`https://www.cnn.com/us`
#> [1] "<!DOCTYPE html><html class=\"no-js\"><head><meta content=\"IE=e"
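
When several sources are read in one call, an error raised for any one of them aborts the whole `sapply()`. Below is a minimal sketch of one way to guard against that. It assumes the reader signals an error or warning for an unreadable source. `safe_read()` is a hypothetical helper, not part of readthat, and its `reader` argument defaults to a base R reader so the example runs without the package:

```r
## hypothetical helper (not part of readthat): return NA instead of failing
safe_read <- function(path,
                      reader = function(p) {
                        paste(readLines(p, warn = FALSE), collapse = "\n")
                      }) {
  tryCatch(
    reader(path),
    ## an unreadable source yields NA so the remaining sources are still read
    error = function(e) NA_character_,
    warning = function(w) NA_character_
  )
}

## demo on local files: one exists, one does not
ok <- tempfile(fileext = ".txt")
writeLines(c("first line", "second line"), ok)
absent <- file.path(tempdir(), "no-such-file.txt")

## vapply guarantees one string (or NA) per source
out <- vapply(c(ok, absent), safe_read, character(1))
unname(out)
```

To use it with the URLs above, pass `reader = readthat::read`, for example `vapply(urls, safe_read, character(1), reader = readthat::read)`.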

Comparisons

Benchmark comparison for reading a text file:

## save a text file
x <- tempfile()
writeLines(read(urls[1]), x)

## compare read times
bm_file <- bench::mark(
  readr = readr::read_lines(x),
  readthat = read(x),
  readLines = readLines(x),
  check = FALSE
)

## view results
bm_file
#> # A tibble: 3 x 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 readr       262.2µs  273.9µs     3517.    3.69MB     4.05
#> 2 readthat     83.7µs   86.2µs    10741.   28.65KB     4.04
#> 3 readLines   144.8µs  150.7µs     6311.   13.16KB     0
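
To put the table in relative terms, the median times can be divided by the readthat median. This is a small worked calculation in base R, with the values copied by hand from the output above; the vector name `median_us` is only for illustration:

```r
## median times copied from the benchmark output above, in microseconds
median_us <- c(readr = 273.9, readthat = 86.2, readLines = 150.7)

## how many times slower each reader is than readthat
ratio <- round(median_us / median_us[["readthat"]], 2)
ratio
```

In this run, readr took roughly 3.2 times as long as readthat and readLines roughly 1.7 times as long. These ratios describe one small file on one machine and may not generalize.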

Benchmark comparison for reading a web page:

x <- "https://www.espn.com/nfl/scoreboard"
bm_html <- bench::mark(
  httr = httr::content(httr::GET(x), as = "text", encoding = "UTF-8"),
  xml2 = xml2::read_html(x),
  readthat = read(x),
  readLines = readLines(x, warn = FALSE),
  readr = readr::read_lines(x),
  check = FALSE,
  iterations = 25,
  filter_gc = TRUE
)
bm_html
#> # A tibble: 5 x 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 httr         71.5ms    104ms      6.84     2.7MB    0.285
#> 2 xml2        187.9ms    200ms      4.50    1.85MB    1.42 
#> 3 readthat     48.6ms     52ms     14.1     23.9KB    0    
#> 4 readLines   375.9ms    472ms      1.95  620.33KB    0    
#> 5 readr       158.6ms    169ms      5.57   799.7KB    0.232
