EmilHvitfeldt / Textdata

Licence: other

Download, parse, store, and load text datasets instead of storing them in packages

textdata

The goal of textdata is to provide access to text-related data sets for easy access without bundling them inside a package. Some text datasets are too large to store within an R package or are licensed in such a way that prevents them from being included in an OSS-licensed package. Instead, this package provides a framework to download, parse, and store the datasets on the disk and load them when needed.

Installation

You can install the released version of textdata from CRAN with:

install.packages("textdata")

And the development version from GitHub with:

# install.packages("remotes")
remotes::install_github("EmilHvitfeldt/textdata")

Example

The first time you use one of the functions for accessing an included text dataset, such as lexicon_afinn() or dataset_sentence_polarity(), the function will prompt you to agree that you understand the dataset’s license or terms of use and then download the dataset to your computer.

After the first use, each time you use a function like lexicon_afinn(), the function will load the dataset from disk.
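
The download-once, load-from-disk behaviour described above can be sketched in base R. This is only an illustration of the pattern, not textdata's actual implementation: `load_cached_dataset()`, `fake_fetch()`, and the `toy_lexicon` name are hypothetical, the "download" is a stand-in that builds a tiny data frame, and the licence prompt is left out.

```r
# Hypothetical helper (not part of textdata): fetch a dataset on first use,
# store it on disk, and read the stored copy on every call.
load_cached_dataset <- function(name, fetch, cache_dir) {
  path <- file.path(cache_dir, paste0(name, ".rds"))
  if (!file.exists(path)) {
    # First use only: obtain the data and save it to the cache directory.
    saveRDS(fetch(), path)
  }
  # Every use: load the dataset from disk.
  readRDS(path)
}

# Stand-in for a real download; counts how many times it runs.
fetch_calls <- 0
fake_fetch <- function() {
  fetch_calls <<- fetch_calls + 1
  data.frame(word = c("good", "bad"), value = c(3, -3))
}

# Use a fresh temporary directory as the cache for this sketch.
cache <- tempfile("textdata-sketch-")
dir.create(cache)

first  <- load_cached_dataset("toy_lexicon", fake_fetch, cache)
second <- load_cached_dataset("toy_lexicon", fake_fetch, cache)

# The fetch step ran once; the second call was served from disk.
print(fetch_calls)
```

With textdata itself no such helper is needed: calling a function like lexicon_afinn() a second time follows the same load-from-disk path.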

Included text datasets

As of today, the datasets included in textdata are:

| Dataset | Function |
| --- | --- |
| v1.0 sentence polarity dataset | `dataset_sentence_polarity()` |
| AFINN-111 sentiment lexicon | `lexicon_afinn()` |
| Hu and Liu’s opinion lexicon | `lexicon_bing()` |
| NRC word-emotion association lexicon | `lexicon_nrc()` |
| NRC Emotion Intensity Lexicon | `lexicon_nrc_eil()` |
| The NRC Valence, Arousal, and Dominance Lexicon | `lexicon_nrc_vad()` |
| Loughran and McDonald’s opinion lexicon for financial documents | `lexicon_loughran()` |
| AG’s News | `dataset_ag_news()` |
| DBpedia ontology | `dataset_dbpedia()` |
| Trec-6 and Trec-50 | `dataset_trec()` |
| IMDb Large Movie Review Dataset | `dataset_imdb()` |
| Stanford NLP GloVe pre-trained word vectors | `embedding_glove6b()` |
| | `embedding_glove27b()` |
| | `embedding_glove42b()` |
| | `embedding_glove840b()` |

Check out each function’s documentation for detailed information (including citations) for the relevant dataset.

Community Guidelines

Note that this project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms. Feedback, bug reports (and fixes!), and feature requests are welcome; file issues or seek support here. For details on how to add a new dataset to this package, check out the vignette!
