moos / Wordpos

Part-of-speech utilities for node.js based on the WordNet database.

wordpos

wordpos is a set of part-of-speech (POS) utilities for Node.js and the browser that use fast lookups in the WordNet database.

Version 1.x is a major update: it no longer depends directly on natural's WordNet module, adds support for Promises, and is roughly 5x faster than the previous version.

CAUTION: The WordNet database (wordnet-db) comprises 155,287 words (WordNet 3.0 figures), which uncompress to over 30 MB of data in several files that cannot be bundled with browserify. It is not meant for the browser environment.

🔥 Version 2.x is totally refactored and works in browsers also -- see wordpos-web.

Installation

 npm install -g wordpos

To run the tests (or just run npm test):

npm install -g mocha
mocha test

Quick usage

Node.js:

var WordPOS = require('wordpos'),
    wordpos = new WordPOS();

wordpos.getAdjectives('The angry bear chased the frightened little squirrel.', function(result){
    console.log(result);
});
// [ 'little', 'angry', 'frightened' ]

wordpos.isAdjective('awesome', function(result){
    console.log(result);
});
// true 'awesome'

Command-line: (see CLI for full command list)

$ wordpos def git
git
  n: a person who is deemed to be despicable or contemptible; "only a rotter would do that"; "kill the rat"; "throw the bum out"; "you cowardly little pukes!"; "the British call a contemptible person a 'git'"  

$ wordpos def git | wordpos get --adj
# Adjective 6:
despicable
contemptible
bum
cowardly
little
British

Options

WordPOS.defaults = {
  /**
   * enable profiling, time in msec returned as last argument in callback
   */
  profile: false,

  /**
   * if true, exclude standard stopwords.
   * if array, stopwords to exclude, eg, ['all','of','this',...]
   * if false, do not filter any stopwords.
   */
  stopwords: true,

  /**
   * preload files (in browser only)
   *    true - preload all POS
   *    false - do not preload any POS
   *    'a' - preload adj
   *    ['a','v'] - preload adj & verb
   * @type {boolean|string|Array}
   */
  preload: false,

  /**
   * include data files in preload
   * @type {boolean}
   */
  includeData: false,

  /**
   * set to true to enable debug logging
   * @type {boolean}
   */
  debug: false

};

To override, pass an options hash to the constructor. With the profile option, most callbacks receive a last argument that is the execution time in msec of the call.

    wordpos = new WordPOS({profile: true});
    wordpos.isAdjective('fast', console.log);
    // true 'fast' 29
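
A callback can pick up the extra timing argument when profiling is on. The sketch below is a hypothetical helper (formatProfiled is not part of wordpos) that formats the (result, word, msec) arguments shown above. It assumes the timing arrives as a number in the last position and is simply absent when profile is false.

```javascript
// Hypothetical helper, not part of wordpos: formats the arguments that an
// isX() callback receives. With {profile: true} a third argument (msec)
// is present; without it the timing suffix is omitted.
function formatProfiled(result, word, msec) {
  var timing = typeof msec === 'number' ? ' (' + msec + ' ms)' : '';
  return word + ': ' + result + timing;
}

// Usage (assumes wordpos is installed):
//   var wordpos = new WordPOS({profile: true});
//   wordpos.isAdjective('fast', function (result, word, msec) {
//     console.log(formatProfiled(result, word, msec));
//   });

console.log(formatProfiled(true, 'fast', 29)); // fast: true (29 ms)
console.log(formatProfiled(true, 'fast'));     // fast: true
```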

API

Please note: all API methods are asynchronous because the underlying WordNet library is asynchronous.

getPOS(text, callback)

getNouns(text, callback)

getVerbs(text, callback)

getAdjectives(text, callback)

getAdverbs(text, callback)

Get part-of-speech from text. callback(results) receives an array of words for specified POS, or a hash for getPOS():

wordpos.getPOS(text, callback) -- callback receives a result object:
    {
      nouns:[],       Array of words that are nouns
      verbs:[],       Array of words that are verbs
      adjectives:[],  Array of words that are adjectives
      adverbs:[],     Array of words that are adverbs
      rest:[]         Array of words that are not in dict or could not be categorized as a POS
    }
    Note: a word may appear in multiple POS (eg, 'great' is both a noun and an adjective)

If you're only interested in a certain POS (say, adjectives), the corresponding getX() method is faster than getPOS(), which looks up each word in all index files. Stopwords are stripped from text before lookup.

If text is an array, all words are looked up as given: no deduplication, stopword filtering, or tokenization is applied.

getX() functions return a Promise.

Example:

wordpos.getNouns('The angry bear chased the frightened little squirrel.', console.log)
// [ 'bear', 'squirrel', 'little', 'chased' ]

wordpos.getPOS('The angry bear chased the frightened little squirrel.', console.log)
// output:
  {
    nouns: [ 'bear', 'squirrel', 'little', 'chased' ],
    verbs: [ 'bear' ],
    adjectives: [ 'little', 'angry', 'frightened' ],
    adverbs: [ 'little' ],
    rest: [ 'the' ]
  }

These results reflect dictionary membership, not the grammar of the given sentence: grammatically, only 'bear' and 'squirrel' are nouns here.
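
Because a word can be listed under several POS, it can help to see which words are ambiguous. The following is a minimal sketch, assuming the result object has the shape documented above; multiPosWords is a hypothetical helper and not part of wordpos.

```javascript
// Hypothetical helper, not part of wordpos: given a getPOS() result object,
// list each word that appears under more than one part of speech.
function multiPosWords(result) {
  var categories = ['nouns', 'verbs', 'adjectives', 'adverbs'];
  var seen = new Map(); // word -> array of categories it appears in

  categories.forEach(function (category) {
    (result[category] || []).forEach(function (word) {
      if (!seen.has(word)) seen.set(word, []);
      seen.get(word).push(category);
    });
  });

  var multi = [];
  seen.forEach(function (found, word) {
    if (found.length > 1) multi.push({ word: word, categories: found });
  });
  return multi;
}

// The result object from the getPOS() example above.
var result = {
  nouns: ['bear', 'squirrel', 'little', 'chased'],
  verbs: ['bear'],
  adjectives: ['little', 'angry', 'frightened'],
  adverbs: ['little'],
  rest: ['the']
};

multiPosWords(result).forEach(function (entry) {
  console.log(entry.word + ': ' + entry.categories.join(', '));
});
// bear: nouns, verbs
// little: nouns, adjectives, adverbs
```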

isNoun(word, callback)

isVerb(word, callback)

isAdjective(word, callback)

isAdverb(word, callback)

Determine if word is a particular POS. callback(result, word) receives true/false as first argument and the looked-up word as the second argument. The resolved Promise receives true/false.

Examples:

wordpos.isVerb('fish', console.log);
// true 'fish'

wordpos.isNoun('fish', console.log);
// true 'fish'

wordpos.isAdjective('fishy', console.log);
// true 'fishy'

wordpos.isAdverb('fishly', console.log);
// false 'fishly'
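
Since each isX() method returns a Promise that resolves to true or false, the four checks can be combined for a single word. This is a sketch under that assumption; classify is a hypothetical helper, and the wordpos argument is any object exposing the four methods.

```javascript
// Hypothetical helper, not part of wordpos: asks all four isX() methods about
// one word and collects the answers into a single object. `wordpos` is any
// object exposing Promise-returning isNoun/isVerb/isAdjective/isAdverb.
function classify(wordpos, word) {
  return Promise.all([
    wordpos.isNoun(word),
    wordpos.isVerb(word),
    wordpos.isAdjective(word),
    wordpos.isAdverb(word)
  ]).then(function (flags) {
    return {
      word: word,
      noun: flags[0],
      verb: flags[1],
      adjective: flags[2],
      adverb: flags[3]
    };
  });
}

// Usage (assumes wordpos is installed):
//   var WordPOS = require('wordpos');
//   classify(new WordPOS(), 'fish').then(console.log);
```

For whole sentences, getPOS() is likely the simpler choice; a helper like this only makes sense when checking individual words.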

lookup(word, callback)

lookupNoun(word, callback)

lookupVerb(word, callback)

lookupAdjective(word, callback)

lookupAdverb(word, callback)

Get complete definition object for word. The lookupX() variants can be faster if you already know the POS of the word. Signature of the callback is callback(result, word) where result is an array of lookup object(s).

Example:

wordpos.lookupAdjective('awesome', console.log);
// output:
[ { synsetOffset: 1285602,
    lexFilenum: 0,
    lexName: 'adj.all',
    pos: 's',
    wCnt: 5,
    lemma: 'amazing',
    synonyms: [ 'amazing', 'awe-inspiring', 'awesome', 'awful', 'awing' ],
    lexId: '0',
    ptrs: [],
    gloss: 'inspiring awe or admiration or wonder; [...] awing majesty, so vast, so high, so silent"  ',
    def: 'inspiring awe or admiration or wonder',     
    ...
} ], 'awesome'

In this case only one lookup was found, but there could be several.

Version 1.1 adds the lexName parameter, which maps the lexFilenum to one of 45 lexicographer domains.
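
Lookup objects carry many fields, so callers often reduce them to a few. The sketch below assumes the field names shown in the example output (pos, lexName, def, synonyms); summarize is a hypothetical helper, and the sample is a trimmed copy of that output.

```javascript
// Hypothetical helper, not part of wordpos: reduces lookup results to the
// fields most callers want. Field names follow the example output above.
function summarize(results) {
  return results.map(function (entry) {
    return {
      pos: entry.pos,
      domain: entry.lexName,
      definition: entry.def,
      synonyms: entry.synonyms
    };
  });
}

// Trimmed copy of the lookupAdjective('awesome') result shown above.
var sample = [{
  synsetOffset: 1285602,
  lexName: 'adj.all',
  pos: 's',
  synonyms: ['amazing', 'awe-inspiring', 'awesome', 'awful', 'awing'],
  def: 'inspiring awe or admiration or wonder'
}];

summarize(sample).forEach(function (s) {
  console.log('[' + s.domain + '] ' + s.definition + ' (' + s.synonyms.length + ' synonyms)');
});
// [adj.all] inspiring awe or admiration or wonder (5 synonyms)

// Usage (assumes wordpos is installed):
//   wordpos.lookupAdjective('awesome').then(summarize).then(console.log);
```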

seek(offset, pos, callback)

Version 1.1 introduces the seek method to look up a record directly from its synsetOffset for a given POS. Unlike the other methods, the callback (if provided) receives (err, result) arguments.

Examples:

wordpos.seek(1285602, 'a').then(console.log)
// same result as wordpos.lookupAdjective('awesome', console.log);

rand(options, callback)

randNoun(options, callback)

randVerb(options, callback)

randAdjective(options, callback)

randAdverb(options, callback)

Get random word(s). (Introduced in version 0.1.10) callback(results, startsWith) receives array of random words and the startsWith option, if one was given. options, if given, is:

{
  startsWith : <string> -- get random words starting with this
  count : <number> -- number of words to return (default = 1)
}

Examples:

wordpos.rand(console.log)
// ['wulfila'] ''

wordpos.randNoun(console.log)
// ['bamboo_palm'] ''

wordpos.rand({startsWith: 'foo'}, console.log)
// ['foot'] 'foo'

wordpos.randVerb({startsWith: 'bar', count: 3}, console.log)
// ['barge', 'barf', 'barter_away'] 'bar'

wordpos.rand({startsWith: 'zzz'}, console.log)
// [] 'zzz'

Note on performance: (node only) random lookups could involve heavy disk reads. It is better to use the count option to get words in batches. This may benefit from the cached reads of similarly keyed entries as well as shared open/close of the index files.

Getting random POS (randNoun(), etc.) is generally faster than rand(), which may look at multiple POS files until count requirement is met.
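
Following the performance note above, one call with count is preferable to many single-word calls. A minimal sketch, assuming rand() resolves its Promise with the array of words; randomBatch is a hypothetical helper and not part of wordpos.

```javascript
// Hypothetical helper, not part of wordpos: builds the options object for
// rand() and requests all the words in one call instead of looping.
function randomBatch(wordpos, count, prefix) {
  var options = { count: count };
  if (prefix) options.startsWith = prefix; // omit the key when no prefix is wanted
  return wordpos.rand(options);
}

// Usage (assumes wordpos is installed):
//   var WordPOS = require('wordpos');
//   randomBatch(new WordPOS(), 5, 'foo').then(console.log);
```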

parse(text)

Returns a tokenized array of the words in text, with duplicates and stopwords removed. This method is called internally by all getX() calls.
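
To make the three steps concrete (tokenize, drop stopwords, drop duplicates), here is an illustrative sketch. It is not wordpos's implementation: the splitting rule, the case handling, and the caller-supplied stopword list are all assumptions made for the example.

```javascript
// Illustrative sketch only, NOT wordpos's implementation: tokenize text,
// drop stopwords, and remove duplicates while keeping first-seen order.
// `stopwords` is a caller-supplied list (an assumption for this example).
function parseSketch(text, stopwords) {
  var stop = new Set(stopwords.map(function (w) { return w.toLowerCase(); }));
  var seen = new Set();

  return text.split(/\W+/).filter(function (token) {
    // Skip empty tokens, stopwords (compared case-insensitively), and repeats.
    if (!token || stop.has(token.toLowerCase()) || seen.has(token)) return false;
    seen.add(token);
    return true;
  });
}

var words = parseSketch('The angry bear chased the frightened little squirrel.', ['the']);
console.log(words.join(' '));
// angry bear chased frightened little squirrel
```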

WordPOS.WNdb

Access to the wordnet-db object containing the dictionary & index files.

WordPOS.stopwords

Access the array of stopwords.

Promises

As of v1.0, all get, is, rand, and lookup methods return a standard ES6 Promise.

wordpos.isVerb('fish').then(console.log);
// true

Compound, with error handler:

wordpos.isVerb('fish')
  .then(console.log)
  .then(doSomethingElse)
  .catch(console.error);

Callbacks, if given, are executed before the Promise is resolved.

wordpos.isVerb('fish', console.log)
  .then(console.log)
  .catch(console.error);
// true 'fish' 13
// true

Note that callback receives full arguments (including profile, if enabled), while the Promise receives only the result of the call. Also, beware that exceptions in the callback will result in the Promise being rejected and caught by catch(), if provided.
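
The ordering and rejection behavior described above can be modeled in a few lines. This is an illustrative model only, not wordpos's code; resolveWithCallback and its compute argument are hypothetical names.

```javascript
// Illustrative model only, NOT wordpos's code: run an optional callback with
// the result before the returned Promise resolves. If the callback throws,
// the Promise rejects, so .catch() sees the callback's error.
function resolveWithCallback(compute, callback) {
  return Promise.resolve()
    .then(compute)
    .then(function (result) {
      if (typeof callback === 'function') callback(result);
      return result;
    });
}

resolveWithCallback(
  function () { return true; },
  function () { throw new Error('callback failed'); }
)
  .then(function () { console.log('not reached'); })
  .catch(function (err) { console.log('caught: ' + err.message); });
// caught: callback failed
```

In practice this means a callback that may throw should do its own error handling, or the caller should always attach a catch().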

Running inside the browser?

See wordpos-web.

Fast Index (node)

Version 0.1.4 introduces the fastIndex option. This uses a secondary index on the index files and is much faster. It is on by default. Secondary index files are generated at install time and placed in the same directory as WNdb.path. Details can be found in tools/stat.js.

Fast index improves performance 30x over Natural's native methods. See the blog article Optimizing WordPos.

As of version 1.0, fast index is always on and cannot be turned off.

Command-line (CLI) usage

For CLI usage and examples, see bin/README.

Benchmark

See bench/README.

Changes

See CHANGELOG.

License

https://github.com/moos/wordpos Copyright (c) 2012-2020 [email protected] (The MIT License)
