trinker / Lexicon

A data package containing lexicons and dictionaries for text analysis

lexicon

Project Status: Active. The project has reached a stable, usable state and is being actively developed.

Description

lexicon is a collection of lexical hash tables, dictionaries, and word lists. Each dataset name carries a prefix that indicates its data type:

| Prefix | Meaning |
|---|---|
| key_ | A data.frame with a lookup and return value |
| hash_ | A keyed data.table hash table |
| freq_ | A data.table of terms with frequencies |
| profanity_ | A profane words vector |
| pos_ | A part of speech vector |
| pos_df_ | A part of speech data.frame |
| sw_ | A stopword vector |
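
As a rough illustration of how the `key_` and `sw_` object types might be used, the sketch below applies a lookup table and a stopword vector to a token vector in base R. The objects `key_toy` and `sw_toy` and the helpers `lookup_replace()` and `drop_stopwords()` are hypothetical stand-ins invented for this example; they are not part of lexicon, and the toy data does not reproduce the package's contents.

```r
# Toy stand-ins for lexicon-style objects (hypothetical data, not the package's contents).
# A key_-style data.frame: one lookup column and one return-value column.
key_toy <- data.frame(
  contraction = c("can't", "won't", "it's"),
  expanded    = c("cannot", "will not", "it is"),
  stringsAsFactors = FALSE
)

# A sw_-style stopword vector.
sw_toy <- c("the", "a", "is", "it")

# Replace each token found in the key's first column with the value in its second column.
lookup_replace <- function(tokens, key) {
  idx <- match(tokens, key[[1]])            # NA where a token has no entry
  ifelse(is.na(idx), tokens, key[[2]][idx])
}

# Drop tokens that appear in a stopword vector.
drop_stopwords <- function(tokens, stopwords) {
  tokens[!tokens %in% stopwords]
}

tokens   <- c("it's", "the", "best", "we", "can't", "lose")
expanded <- lookup_replace(tokens, key_toy)
kept     <- drop_stopwords(tokens, sw_toy)
```

A real `key_` data.frame or `sw_` vector from the package could presumably be passed in place of the toy objects, assuming the lookup values sit in the first column and the return values in the second; check the column layout of the specific dataset before relying on that.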

Data

| Data | Description |
|---|---|
| cliches | Common Cliches |
| common_names | First Names (U.S.) |
| constraining_loughran_mcdonald | Loughran-McDonald Constraining Words |
| emojis_sentiment | Emoji Sentiment Data |
| freq_first_names | Frequent U.S. First Names |
| freq_last_names | Frequent U.S. Last Names |
| function_words | Function Words |
| grady_augmented | Augmented List of Grady Ward’s English Words and Mark Kantrowitz’s Names List |
| hash_emojis | Emoji Description Lookup Table |
| hash_emojis_identifier | Emoji Identifier Lookup Table |
| hash_emoticons | Emoticons |
| hash_grady_pos | Grady Ward’s Moby Parts of Speech |
| hash_internet_slang | List of Internet Slang and Corresponding Meanings |
| hash_lemmas | Lemmatization List |
| hash_nrc_emotions | NRC Emotion Table |
| hash_sentiment_emojis | Emoji Sentiment Polarity Lookup Table |
| hash_sentiment_huliu | Hu Liu Polarity Lookup Table |
| hash_sentiment_jockers | Jockers Sentiment Polarity Table |
| hash_sentiment_jockers_rinker | Combined Jockers & Rinker Polarity Lookup Table |
| hash_sentiment_loughran_mcdonald | Loughran-McDonald Polarity Table |
| hash_sentiment_nrc | NRC Sentiment Polarity Table |
| hash_sentiment_senticnet | Augmented SenticNet Polarity Table |
| hash_sentiment_sentiword | Augmented Sentiword Polarity Table |
| hash_sentiment_slangsd | SlangSD Sentiment Polarity Table |
| hash_sentiment_socal_google | SO-CAL Google Polarity Table |
| hash_valence_shifters | Valence Shifters |
| key_contractions | Contraction Conversions |
| key_corporate_social_responsibility | Nadra Pencle and Irina Malaescu’s Corporate Social Responsibility Dictionary |
| key_grade | Grades Data Set |
| key_rating | Ratings Data Set |
| key_regressive_imagery | Colin Martindale’s English Regressive Imagery Dictionary |
| key_sentiment_jockers | Jockers Sentiment Data Set |
| modal_loughran_mcdonald | Loughran-McDonald Modal List |
| nrc_emotions | NRC Emotions |
| pos_action_verb | Action Word List |
| pos_df_irregular_nouns | Irregular Nouns Data Frame |
| pos_df_pronouns | Pronouns |
| pos_interjections | Interjections |
| pos_preposition | Preposition Words |
| profanity_alvarez | Alejandro U. Alvarez’s List of Profane Words |
| profanity_arr_bad | Stack Overflow user2592414’s List of Profane Words |
| profanity_banned | bannedwordlist.com’s List of Profane Words |
| profanity_racist | Titus Wormer’s List of Racist Words |
| profanity_zac_anger | Zac Anger’s List of Profane Words |
| sw_dolch | Leveled Dolch List of 220 Common Words |
| sw_fry_100 | Fry’s 100 Most Commonly Used English Words |
| sw_fry_1000 | Fry’s 1000 Most Commonly Used English Words |
| sw_fry_200 | Fry’s 200 Most Commonly Used English Words |
| sw_fry_25 | Fry’s 25 Most Commonly Used English Words |
| sw_jockers | Matthew Jockers’s Expanded Topic Modeling Stopword List |
| sw_loughran_mcdonald_long | Loughran-McDonald Long Stopword List |
| sw_loughran_mcdonald_short | Loughran-McDonald Short Stopword List |
| sw_lucene | Lucene Stopword List |
| sw_mallet | MALLET Stopword List |
| sw_python | Python Stopword List |
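
Because the prefix encodes the data type, dataset names can be grouped by prefix to see which objects share a structure. The following is a minimal base R sketch; the helper `dataset_prefix()` is a hypothetical name introduced here, not a lexicon function, and the names in `nms` are a small sample copied from the table above.

```r
# Prefixes from the prefix table. "pos_df_" is listed before "pos_" so that
# the longer prefix wins when both match.
prefixes <- c("pos_df_", "profanity_", "key_", "hash_", "freq_", "pos_", "sw_")

# Return the first matching prefix for each name, or NA if none matches.
dataset_prefix <- function(nms, prefixes) {
  vapply(nms, function(nm) {
    hit <- prefixes[startsWith(nm, prefixes)]
    if (length(hit) == 0) NA_character_ else hit[1]
  }, character(1), USE.NAMES = FALSE)
}

# A few dataset names copied from the table above.
nms <- c("hash_lemmas", "key_contractions", "pos_df_pronouns",
         "pos_interjections", "sw_fry_100", "cliches")

# Names without a known prefix (here "cliches") map to NA and are dropped by split().
grouped <- split(nms, dataset_prefix(nms, prefixes))
```

With the package installed, the full set of names could likely be obtained with base R's `data(package = "lexicon")$results[, "Item"]` and passed to the same helper.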

Installation

To install the development version of lexicon, download the zip ball or tar ball, decompress it, and run R CMD INSTALL on it. Alternatively, use the pacman package to install it directly from GitHub:

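```r
# Install pacman if it is not already available
if (!require("pacman")) install.packages("pacman")
# Install (if needed) and load lexicon from the trinker/lexicon GitHub repository
pacman::p_load_gh("trinker/lexicon")
```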

Contact

You are welcome to:
