ropensci / Fulltext

Licence: other
Search across and get full text for OA & closed journals

Projects that are alternatives of or similar to Fulltext

Rplos
R client for the PLoS Journals API
Stars: ✭ 289 (+30.77%)
Mutual labels:  xml, metadata, pdf, r-package, rstats
Rcrossref
R client for various CrossRef APIs
Stars: ✭ 137 (-38.01%)
Mutual labels:  metadata, r-package, rstats
Tabulizer
Bindings for Tabula PDF Table Extractor Library
Stars: ✭ 413 (+86.88%)
Mutual labels:  pdf, r-package, rstats
Dataspice
🌶 Create lightweight schema.org descriptions of your datasets
Stars: ✭ 137 (-38.01%)
Mutual labels:  metadata, r-package, rstats
Gender
Predict Gender from Names Using Historical Data
Stars: ✭ 149 (-32.58%)
Mutual labels:  r-package, rstats
Shinyalert
🗯️ Easily create pretty popup messages (modals) in Shiny
Stars: ✭ 148 (-33.03%)
Mutual labels:  r-package, rstats
Rentrez
talk with NCBI entrez using R
Stars: ✭ 151 (-31.67%)
Mutual labels:  r-package, rstats
Writexl
Portable, light-weight data frame to xlsx exporter for R
Stars: ✭ 162 (-26.7%)
Mutual labels:  r-package, rstats
Rnaturalearth
an R package to hold and facilitate interaction with natural earth map data 🌍
Stars: ✭ 140 (-36.65%)
Mutual labels:  r-package, rstats
Textreuse
Detect text reuse and document similarity
Stars: ✭ 156 (-29.41%)
Mutual labels:  r-package, rstats
Plotly
An interactive graphing library for R
Stars: ✭ 2,096 (+848.42%)
Mutual labels:  r-package, rstats
Googlelanguager
R client for the Google Translation API, Google Cloud Natural Language API and Google Cloud Speech API
Stars: ✭ 145 (-34.39%)
Mutual labels:  r-package, rstats
Biomartr
Genomic Data Retrieval with R
Stars: ✭ 144 (-34.84%)
Mutual labels:  r-package, rstats
Qualtrics
Download ⬇️ Qualtrics survey data directly into R!
Stars: ✭ 151 (-31.67%)
Mutual labels:  r-package, rstats
Colourpicker
🎨 A colour picker tool for Shiny and for selecting colours in plots (in R)
Stars: ✭ 144 (-34.84%)
Mutual labels:  r-package, rstats
Tokenizers
Fast, Consistent Tokenization of Natural Language Text
Stars: ✭ 161 (-27.15%)
Mutual labels:  r-package, rstats
Datasaurus
R Package 📦 Containing the Datasaurus Dozen datasets 📊
Stars: ✭ 193 (-12.67%)
Mutual labels:  r-package, rstats
Dataretrieval
This R package is designed to obtain USGS or EPA water quality sample data, streamflow data, and metadata directly from web services. See: http://usgs-r.github.io/dataRetrieval/
Stars: ✭ 176 (-20.36%)
Mutual labels:  r-package, rstats
Tesseract
Bindings to Tesseract OCR engine for R
Stars: ✭ 192 (-13.12%)
Mutual labels:  r-package, rstats
Charlatan
Create fake data in R
Stars: ✭ 209 (-5.43%)
Mutual labels:  r-package, rstats

fulltext

Project Status: Active. The project has reached a stable, usable state and is being actively developed.

Get full text research articles

Check out the package docs and the fulltext manual to get started.


rOpenSci has a number of R packages to get either full text, metadata, or both from various publishers. The goal of fulltext is to integrate these packages to create a single interface to many data sources.

fulltext makes it easy to do text-mining by supporting the following steps:

  • Search for articles - ft_search
  • Fetch articles - ft_get
  • Get links for full text articles (xml, pdf) - ft_links
  • Extract text from articles / convert formats - ft_extract
  • Collect all texts into a data.frame - ft_table
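
The steps above can be chained in a few lines. The following is a minimal sketch rather than a canonical recipe: the wrapper name mine_articles, the query string, and the limit of 5 are assumptions made for illustration, and running it requires the fulltext package and network access.

```r
# Hypothetical wrapper (not part of fulltext) chaining the documented steps.
mine_articles <- function(query, limit = 5) {
  # 1. Search PLOS for articles matching the query
  hits <- fulltext::ft_search(query = query, from = "plos", limit = limit)
  # 2. Fetch full text for the hits; files are written to the fulltext cache
  fulltext::ft_get(hits)
  # 3. Collect the cached texts into a data.frame
  #    (this reads everything in the cache, not only the hits above)
  fulltext::ft_table()
}

# Usage (requires internet access):
# texts <- mine_articles("ecology")
# str(texts)
```

Because ft_table reads from the cache directory, pointing the cache at a fresh path first keeps unrelated articles out of the result.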

Previously supported use cases, extracted out to other packages:

  • Collect bits of articles that you actually need - moved to package pubchunks
  • Get supplementary data from papers - moved to package suppdata

It's easy to go from the outputs of ft_get to text-mining packages such as tm and quanteda.

Data sources in fulltext include:

  • Crossref - via the rcrossref package
  • Public Library of Science (PLOS) - via the rplos package
  • BioMed Central
  • arXiv - via the aRxiv package
  • bioRxiv - via the biorxivr package
  • PMC/PubMed (Entrez) - via the rentrez package
  • Scopus - internal tooling
  • Semantic Scholar - internal tooling
  • Many more are supported via the above sources (e.g., Royal Society Open Science is available via PubMed)
  • We will add more as publishers open up and as we have time. See the issues.
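
A source is chosen at search time. The sketch below assumes that the from argument of ft_search accepts a vector of source names and that "plos" and "crossref" are valid values; the helper name search_sources is hypothetical, and the call needs network access.

```r
# Hypothetical helper: run one query against several data sources at once.
search_sources <- function(query, sources = c("plos", "crossref"), limit = 5) {
  fulltext::ft_search(query = query, from = sources, limit = limit)
}

# Usage (requires internet access):
# res <- search_sources("ecology")
# res$plos      # results from PLOS
# res$crossref  # results from Crossref
```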

Authentication: Several publishers require an API key, and some impose stricter checks such as verifying your IP address. We are working on supporting each publisher's authentication scheme; open access content is already available without any of this. See the Authentication section in ?fulltext-package after loading the package.
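
API keys are best kept out of scripts. A minimal sketch, assuming the key is supplied through an environment variable: the variable name MY_PUBLISHER_KEY, the value, and the helper require_key are all hypothetical. The variable names that fulltext actually reads are listed in ?fulltext-package.

```r
# Hypothetical helper: fail early with a clear message if a key is missing.
require_key <- function(var) {
  key <- Sys.getenv(var, unset = "")
  if (!nzchar(key)) {
    stop("Environment variable '", var, "' is not set; ",
         "add it to ~/.Renviron and restart R.", call. = FALSE)
  }
  invisible(key)
}

# Placeholder name and value for illustration only (not real credentials)
Sys.setenv(MY_PUBLISHER_KEY = "abc123")
key <- require_key("MY_PUBLISHER_KEY")
```

In practice the key would live in ~/.Renviron rather than being set with Sys.setenv in the script.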

We'd love your feedback. Let us know what you think in the issue tracker.

Installation

Stable version from CRAN

install.packages("fulltext")

Development version from GitHub

remotes::install_github("ropensci/fulltext")

Load library

library('fulltext')

Interoperability with other packages downstream

Note: this example is not included in the vignettes because that would require adding the two packages below to Suggests. For many more examples and documentation, see the package docs and the fulltext manual.

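# Store downloaded articles in a cache directory named 'foobar'
cache_options_set(path = 'foobar')
# Fetch two eLife articles as PDFs
res <- ft_get(c('10.7554/eLife.03032', '10.7554/eLife.32763'), type = "pdf")
# Read every cached PDF into a readtext data.frame
library(readtext)
x <- readtext(file.path(cache_options_get()$path, "*.pdf"))
# Build a quanteda corpus from the extracted text
library(quanteda)
corpus(x)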

Contributors

  • Scott Chamberlain
  • Will Pearse
  • Katrin Leinweber

Meta

  • Please report any issues or bugs.
  • License: MIT
  • Get citation information for fulltext: citation(package = 'fulltext')
  • Please note that this package is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.