wooorm / Parse English

Licence: MIT
English (natural language) parser

parse-english

English language parser for retext producing nlcst nodes.

Install

npm:

npm install parse-english

Use

// Utility that pretty-prints unist (and therefore nlcst) trees.
var inspect = require('unist-util-inspect')
var English = require('parse-english')

// Parse a string into an nlcst tree.
var tree = new English().parse(
  'Mr. Henry Brown: A hapless but friendly City of London worker.'
)

console.log(inspect(tree))

Yields:

RootNode[1] (1:1-1:63, 0-62)
└─ ParagraphNode[1] (1:1-1:63, 0-62)
   └─ SentenceNode[23] (1:1-1:63, 0-62)
      ├─ WordNode[2] (1:1-1:4, 0-3)
      │  ├─ TextNode: "Mr" (1:1-1:3, 0-2)
      │  └─ PunctuationNode: "." (1:3-1:4, 2-3)
      ├─ WhiteSpaceNode: " " (1:4-1:5, 3-4)
      ├─ WordNode[1] (1:5-1:10, 4-9)
      │  └─ TextNode: "Henry" (1:5-1:10, 4-9)
      ├─ WhiteSpaceNode: " " (1:10-1:11, 9-10)
      ├─ WordNode[1] (1:11-1:16, 10-15)
      │  └─ TextNode: "Brown" (1:11-1:16, 10-15)
      ├─ PunctuationNode: ":" (1:16-1:17, 15-16)
      ├─ WhiteSpaceNode: " " (1:17-1:18, 16-17)
      ├─ WordNode[1] (1:18-1:19, 17-18)
      │  └─ TextNode: "A" (1:18-1:19, 17-18)
      ├─ WhiteSpaceNode: " " (1:19-1:20, 18-19)
      ├─ WordNode[1] (1:20-1:27, 19-26)
      │  └─ TextNode: "hapless" (1:20-1:27, 19-26)
      ├─ WhiteSpaceNode: " " (1:27-1:28, 26-27)
      ├─ WordNode[1] (1:28-1:31, 27-30)
      │  └─ TextNode: "but" (1:28-1:31, 27-30)
      ├─ WhiteSpaceNode: " " (1:31-1:32, 30-31)
      ├─ WordNode[1] (1:32-1:40, 31-39)
      │  └─ TextNode: "friendly" (1:32-1:40, 31-39)
      ├─ WhiteSpaceNode: " " (1:40-1:41, 39-40)
      ├─ WordNode[1] (1:41-1:45, 40-44)
      │  └─ TextNode: "City" (1:41-1:45, 40-44)
      ├─ WhiteSpaceNode: " " (1:45-1:46, 44-45)
      ├─ WordNode[1] (1:46-1:48, 45-47)
      │  └─ TextNode: "of" (1:46-1:48, 45-47)
      ├─ WhiteSpaceNode: " " (1:48-1:49, 47-48)
      ├─ WordNode[1] (1:49-1:55, 48-54)
      │  └─ TextNode: "London" (1:49-1:55, 48-54)
      ├─ WhiteSpaceNode: " " (1:55-1:56, 54-55)
      ├─ WordNode[1] (1:56-1:62, 55-61)
      │  └─ TextNode: "worker" (1:56-1:62, 55-61)
      └─ PunctuationNode: "." (1:62-1:63, 61-62)
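
The tree above is made of plain objects: parent nodes carry `children` and literal nodes carry `value`, as nlcst defines them. As a minimal sketch of how such a tree could be walked, the snippet below uses a hand-written fragment of the output (positions omitted) instead of calling the parser, so it runs without any dependencies. The helpers `nodeToString` and `collectWords` are hypothetical names for illustration and are not part of parse-english.

```javascript
// Hand-written fragment shaped like the start of the tree printed above.
// In real use this object would come from `new English().parse(...)`.
var fragment = {
  type: 'SentenceNode',
  children: [
    {
      type: 'WordNode',
      children: [
        {type: 'TextNode', value: 'Mr'},
        {type: 'PunctuationNode', value: '.'}
      ]
    },
    {type: 'WhiteSpaceNode', value: ' '},
    {type: 'WordNode', children: [{type: 'TextNode', value: 'Henry'}]},
    {type: 'WhiteSpaceNode', value: ' '},
    {type: 'WordNode', children: [{type: 'TextNode', value: 'Brown'}]},
    {type: 'PunctuationNode', value: ':'}
  ]
}

// Hypothetical helper: concatenate the values of all literal descendants.
function nodeToString(node) {
  if (typeof node.value === 'string') return node.value
  return (node.children || []).map(nodeToString).join('')
}

// Hypothetical helper: collect the text of every WordNode, depth-first.
function collectWords(node, words) {
  words = words || []
  if (node.type === 'WordNode') {
    words.push(nodeToString(node))
  } else if (node.children) {
    node.children.forEach(function (child) {
      collectWords(child, words)
    })
  }
  return words
}

console.log(collectWords(fragment)) // [ 'Mr.', 'Henry', 'Brown' ]
```

Note that `Mr.` comes back as a single word because its period sits inside the WordNode. In practice, existing utilities such as nlcst-to-string and unist-util-visit cover this kind of traversal.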

API

parse-english has the same API as parse-latin.

Algorithm

All of parse-latin is included, plus the following support for the English natural language:

  • Unit abbreviations (tsp., tbsp., oz., ft., and more)
  • Time references (sec., min., tues., thu., feb., and more)
  • Business abbreviations (Inc. and Ltd.)
  • Social titles (Mr., Mmes., Sr., and more)
  • Rank and academic titles (Dr., Rep., Gen., Prof., Pres., and more)
  • Geographical abbreviations (Ave., Blvd., Ft., Hwy., and more)
  • American state abbreviations (Ala., Minn., La., Tex., and more)
  • Canadian province abbreviations (Alta., Qué., Yuk., and more)
  • English county abbreviations (Beds., Leics., Shrops., and more)
  • Common elision (omission of letters) (’n’, ’o, ’em, ’twas, ’80s, and more)
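
To illustrate why abbreviation support matters, here is a toy sentence splitter that refuses to break after a known abbreviation. It is a simplified sketch and not how parse-english or parse-latin are implemented: the `abbreviations` array is a small hand-picked subset of the list above, and `splitSentences` is a hypothetical helper.

```javascript
// Illustrative subset of the abbreviations listed above
// (lowercase, without the trailing period).
var abbreviations = ['mr', 'dr', 'prof', 'inc', 'ltd', 'tsp', 'oz', 'ave']

// Hypothetical helper: split `text` after ".", "!", or "?" followed by
// white space, unless the period belongs to a known abbreviation.
function splitSentences(text) {
  var sentences = []
  var start = 0
  var pattern = /[.!?]\s+/g
  var match

  while ((match = pattern.exec(text))) {
    // Word directly before the candidate terminal marker.
    var before = text.slice(start, match.index)
    var lastWord = before.split(/\s+/).pop().toLowerCase()
    var isAbbreviation =
      text.charAt(match.index) === '.' &&
      abbreviations.indexOf(lastWord) !== -1

    // Keep scanning: this period does not end the sentence.
    if (isAbbreviation) continue

    sentences.push(text.slice(start, match.index + 1))
    start = match.index + match[0].length
  }

  // Whatever remains is the final sentence.
  if (start < text.length) sentences.push(text.slice(start))

  return sentences
}

console.log(
  splitSentences('Mr. Brown moved to London. He works for Acme Ltd. in the City.')
)
// [ 'Mr. Brown moved to London.', 'He works for Acme Ltd. in the City.' ]
```

A naive split on every period would cut this input after `Mr.` and `Ltd.`. The sketch has an obvious limitation: it never breaks after an abbreviation, even when one genuinely ends a sentence.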

License

MIT © Titus Wormer