License: ISC
A list of words from the SUBTLEX movie subtitles corpus, sorted by frequency.



subtlex-word-frequencies


List of 74,286 words sorted by frequency of use in spoken English.

The word counts are derived from SUBTLEXus, a corpus of American English subtitles of movies.

Install

npm:

npm install subtlex-word-frequencies

Use

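// Load the full list (an array of {word, count} entries).
var words = require('subtlex-word-frequencies')

// Total number of entries.
console.log(words.length)

// The three most frequent words.
console.log(words.slice(0, 3))

// The five most frequent words containing "chick".
console.log(words.filter(d => d.word.match(/chick/)).slice(0, 5))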

Yields:

74286
[
  {word: 'you', count: 2134713},
  {word: 'I', count: 2038529},
  {word: 'the', count: 1501908}
]
[
  {word: 'chicken', count: 3148},
  {word: 'chick', count: 1334},
  {word: 'chicks', count: 742},
  {word: 'chickens', count: 520},
  {word: 'chickenshit', count: 85}
]
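
Because the list is a plain array, finding the count for one word means scanning it. When many lookups are needed, one option is to index the entries in a Map first. The following is a minimal sketch, not part of the package: `sampleWords` is a hypothetical stand-in holding a few entries copied from the output above so the snippet runs without the package installed, and `countByWord` is an illustrative name.

```javascript
// Sample entries copied from the output above. In real use, replace this
// array with: var words = require('subtlex-word-frequencies')
var sampleWords = [
  {word: 'you', count: 2134713},
  {word: 'I', count: 2038529},
  {word: 'the', count: 1501908},
  {word: 'chicken', count: 3148},
  {word: 'chick', count: 1334}
]

// Build the index once so later lookups do not scan the array.
var countByWord = new Map(sampleWords.map(function (d) {
  return [d.word, d.count]
}))

console.log(countByWord.get('chicken')) // 3148
console.log(countByWord.get('zebra')) // undefined (not in the sample)
```

Keys are matched exactly, so 'I' is found but 'i' is not.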

API

subtlexWordFrequencies

Array.<Entry>: list of all entries in SUBTLEXus. Each entry has the following properties:

  • word (string): unique word (example: git)
  • count (number): number of times the word appears in the corpus (example: 101)

word is capitalized when it appears more often with an initial uppercase letter than with a lowercase one (example: I).
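
Since some entries are capitalized, user input such as 'i' will not match 'I' in an exact comparison. One possible approach is to index the entries by lowercase key. This is only a sketch under assumptions: `sampleEntries` is hypothetical sample data in the shape shown in this README, `lookupCount` is an illustrative helper that the package does not provide, and summing counts for keys that collide after lowercasing is a design choice made here, not documented behavior.

```javascript
// Hypothetical sample in the {word, count} shape used by this package.
var sampleEntries = [
  {word: 'you', count: 2134713},
  {word: 'I', count: 2038529},
  {word: 'the', count: 1501908}
]

var countByLowercase = new Map()
sampleEntries.forEach(function (d) {
  var key = d.word.toLowerCase()
  // Sum counts in case two entries differ only by case (an assumption;
  // the real list may not contain such pairs).
  countByLowercase.set(key, (countByLowercase.get(key) || 0) + d.count)
})

// Illustrative helper, not part of the package: returns 0 for unknown words.
function lookupCount(input) {
  return countByLowercase.get(String(input).toLowerCase()) || 0
}

console.log(lookupCount('i')) // 2038529
console.log(lookupCount('THE')) // 1501908
console.log(lookupCount('zebra')) // 0
```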

The entire original corpus consists of 51 million words.
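
With the corpus size known, a raw count can be turned into a rate per million words, which is easier to compare across corpora. The sketch below assumes the corpus size is exactly 51 million, although the figure above is rounded, so the results are approximate. `CORPUS_SIZE` and `perMillion` are illustrative names, not part of the package.

```javascript
// Rounded corpus size taken from the figure above (an approximation).
var CORPUS_SIZE = 51e6

// Convert a raw count into occurrences per million words.
function perMillion(count) {
  return count / CORPUS_SIZE * 1e6
}

// Counts taken from the example output in this README.
console.log(perMillion(3148).toFixed(1)) // "61.7" for 'chicken'
console.log(perMillion(2134713).toFixed(1)) // "41857.1" for 'you'
```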

License

ISC © Zeke Sikelianos
