quanteda / quanteda.corpora

Licence: other

A collection of corpora for quanteda

Corpora for quanteda

This package provides easy access to large corpora for use with quanteda.

How to Install

You can either download the files and build the package from source, or use the devtools package to install it directly from GitHub:

devtools::install_github("quanteda/quanteda.corpora")

Available corpora

The package contains the following corpora:

Corpus	Name
Amicus curiae briefs from Bakke (1978) and Bollinger (2008)	data_corpus_amicus
Annual budget speeches from the Irish Dáil, 2008-2012	data_corpus_irishbudgets
UK news articles from 2014 that mention immigration	data_corpus_immigrationnews
Movie reviews from Pang, Lee, and Vaithyanathan (2002)	moved to quanteda.textmodels
US State of the Union addresses from 1790 to present	data_corpus_sotu
UK political party manifestos, 1945-2005	data_corpus_ukmanifestos
UN General Debate speeches, 2017	data_corpus_ungd2017
Universal Declaration of Human Rights in 464 languages	data_corpus_udhr
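
As a minimal sketch of how one of the bundled corpora in the table might be inspected, the snippet below loads data_corpus_sotu and looks at its size and document variables. It assumes that both quanteda and quanteda.corpora are already installed and that the object name matches the Name column above; the variables returned by docvars() are not documented here, so none are named.

```r
# Assumption: quanteda and quanteda.corpora are already installed.
library(quanteda)

# Load one bundled corpus by the name given in the table.
data("data_corpus_sotu", package = "quanteda.corpora")

# Number of documents in the corpus.
ndoc(data_corpus_sotu)

# First few rows of the document-level variables.
head(docvars(data_corpus_sotu))
```

The same pattern should apply to the other names in the table, except for the movie reviews corpus, which the table lists as moved to quanteda.textmodels.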

Larger corpora are also available from online locations using download():

Corpus	Name
Guardian newspaper articles in politics, economy, society and international sections from 2012 to 2016	data_corpus_guardian
Transcripts of speeches at Japan's Committee on Foreign Affairs and Defense of the lower house (Shugiin) from 1947 to 2017	data_corpus_foreignaffairscommittee
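
A hedged sketch of fetching one of these larger corpora follows. It assumes that download() accepts the corpus name from the table as its first argument and returns a quanteda corpus object; an internet connection is required, and the download may take some time. The variable name corp_guardian is chosen for illustration only.

```r
# Assumption: quanteda and quanteda.corpora are already installed.
library(quanteda)
library(quanteda.corpora)

# Assumption: download() takes the corpus name listed in the table.
# This fetches the data from its online location.
corp_guardian <- download("data_corpus_guardian")

# Check how many documents were retrieved.
ndoc(corp_guardian)
```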
