gerardobort / node-corenlp

License: GPL-3.0
CoreNLP @ NodeJS

CoreNLP for NodeJS

This library helps you build NodeJS/Web applications on top of Stanford CoreNLP, a state-of-the-art toolkit for Natural Language Processing. It is compatible with the latest release of CoreNLP, 3.9.0.

This project is under active development. Please stay tuned for updates; more documentation and examples are coming.

Example

Assuming that StanfordCoreNLPServer is running on http://localhost:9000:

import CoreNLP, { Properties, Pipeline } from 'corenlp';

const props = new Properties({
  annotators: 'tokenize,ssplit,pos,lemma,ner,parse',
});
const pipeline = new Pipeline(props, 'English'); // uses ConnectorServer by default

const sent = new CoreNLP.simple.Sentence('The little dog runs so fast.');
pipeline.annotate(sent)
  .then(sent => {
    console.log('parse', sent.parse());
    console.log(CoreNLP.util.Tree.fromSentence(sent).dump());
  })
  .catch(err => {
    console.log('err', err);
  });

API

Read the full API documentation.

Setup

1. Install the package:

npm i --save corenlp

2. Download Stanford CoreNLP

2.1. Shortcut (recommended to give this library a first try)

Via npm, run this command from your own project after having installed this library:

npm explore corenlp -- npm run corenlp:download

Once downloaded you can easily start the server by running

npm explore corenlp -- npm run corenlp:server

Alternatively, you can download the project manually from Stanford's CoreNLP download page (https://stanfordnlp.github.io/CoreNLP/download.html). Besides the full package, you may also want to download other language models (see that page for details).

2.2. Via sources

For advanced projects that need to customize the library further, we highly recommend downloading StanfordCoreNLP from the original repository and compiling the source code with ant jar.

NOTE: Some functionality in this library (TokensRegex, Semgrex and Tregex) requires the latest version from that repository, which contains fixes needed by this library that are not included in the latest stable release.

3. Configure Stanford CoreNLP

There are two methods to connect your NodeJS application to Stanford CoreNLP:

  1. Via HTTP (recommended). This is the preferred method, since CoreNLP initializes just once to serve many requests. It also avoids the extra I/O of the CLI method, which needs to write temporary files to run.
  2. Via the Command Line Interface, that is, by spawning CoreNLP processes from your app.

3.1. Using StanfordCoreNLPServer

# Run the server using all jars in the current directory (e.g., the CoreNLP home directory)
java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 15000
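
Before wiring in this library, it can help to confirm that the server answers at all. The following is a minimal sketch, assuming Node 18+ (for the global fetch) and the server's convention of reading pipeline settings from a JSON `properties` query parameter. The helper names `buildAnnotateUrl` and `annotateRaw` are hypothetical and are not part of corenlp.

```javascript
// Build the request URL for a running StanfordCoreNLPServer.
// Pipeline settings travel as a JSON string in the `properties` query parameter.
function buildAnnotateUrl(dsn, annotators) {
  const url = new URL(dsn);
  url.searchParams.set('properties', JSON.stringify({
    annotators,
    outputFormat: 'json',
  }));
  return url.toString();
}

// POST raw text to the server and return the parsed JSON response.
// Assumes Node 18+, where fetch() is available globally.
async function annotateRaw(dsn, annotators, text) {
  const response = await fetch(buildAnnotateUrl(dsn, annotators), {
    method: 'POST',
    body: text,
  });
  if (!response.ok) {
    throw new Error(`CoreNLP server responded with HTTP ${response.status}`);
  }
  return response.json();
}

// Usage (needs the server started with the command above):
// annotateRaw('http://localhost:9000', 'tokenize,ssplit,pos', 'The little dog runs so fast.')
//   .then(doc => console.log(doc.sentences[0].tokens.map(t => t.word)))
//   .catch(err => console.log('err', err));
```

If this request fails, the problem is in the server setup (port, classpath, memory) rather than in your application code.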

CoreNLP connects by default via StanfordCoreNLPServer, using port 9000. You can also opt to set up the connection differently:

import CoreNLP, { Properties, Pipeline, ConnectorServer } from 'corenlp';

const connector = new ConnectorServer({ dsn: 'http://localhost:9000' });
const props = new Properties({
  annotators: 'tokenize,ssplit,pos,lemma,ner,parse',
});
const pipeline = new Pipeline(props, 'English', connector);

3.2. Use CoreNLP via CLI

By default, CoreNLP expects the StanfordCoreNLP package to be placed (unzipped) inside the path ${YOUR_NPM_PROJECT_ROOT}/corenlp/. You can also opt to set up the CLI interface differently:

import CoreNLP, { Properties, Pipeline, ConnectorCli } from 'corenlp';

const connector = new ConnectorCli({
  classPath: 'corenlp/stanford-corenlp-full-2017-06-09/*', // specify the paths relative to your npm project root
  mainClass: 'edu.stanford.nlp.pipeline.StanfordCoreNLP', // optional
  props: 'StanfordCoreNLP-spanish.properties', // optional
});
const props = new Properties({
  annotators: 'tokenize,ssplit,pos,lemma,ner,parse',
});
const pipeline = new Pipeline(props, 'English', connector);

4. Usage

4.1 Pipeline

// ... include dependencies

// pos must come before lemma: the lemma annotator depends on part-of-speech tags
const props = new Properties({ annotators: 'tokenize,ssplit,pos,lemma,ner' });
const pipeline = new Pipeline(props, 'English', connector);
const sent = new CoreNLP.simple.Sentence('Hello world');
pipeline.annotate(sent)
  .then(sent => {
    console.log(sent.words());
    console.log(sent.nerTags());
  })
  .catch(err => {
    console.log('err', err);
  });
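
Since pipeline.annotate returns a promise, the same flow can be written with async/await, which is convenient when annotating several texts one after another. This is a sketch only: `annotateAll` is a hypothetical helper, and it receives the pipeline and the Sentence class as arguments so that it does not depend on how the connector was configured.

```javascript
// Annotate a list of texts sequentially and collect words and NER tags.
// `pipeline` is a configured Pipeline; `Sentence` is CoreNLP.simple.Sentence.
async function annotateAll(pipeline, Sentence, texts) {
  const results = [];
  for (const text of texts) {
    // Wait for each annotation before sending the next request
    const sent = await pipeline.annotate(new Sentence(text));
    results.push({ text, words: sent.words(), nerTags: sent.nerTags() });
  }
  return results;
}

// Usage, with the pipeline defined above:
// annotateAll(pipeline, CoreNLP.simple.Sentence, ['Hello world', 'Goodbye world'])
//   .then(results => console.log(results))
//   .catch(err => console.log('err', err));
```

Processing the texts sequentially keeps the load on a single server instance predictable; for larger batches you may prefer a bounded number of parallel requests.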

4.2 Penn TreeBank traversing

// ... include dependencies

const props = new Properties();
props.setProperty('annotators', 'tokenize,ssplit,pos,lemma,ner,parse');
const pipeline = new Pipeline(props, 'Spanish');

const sent = new CoreNLP.simple.Sentence('Jorge quiere cinco empanadas de queso y carne.');
pipeline.annotate(sent)
  .then(sent => {
    console.log('parse', sent.parse()); // constituency parsing string representation
    const tree = CoreNLP.util.Tree.fromSentence(sent);
    tree.visitLeaves(node =>
      console.log(node.word(), node.pos(), node.token().ner()));
    console.log(tree.dump());
  })
  .catch(err => {
    console.log('err', err);
  });
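
To make the traversal more concrete, here is a standalone sketch of what walking a Penn Treebank bracketed string involves. It is an illustration only, not the implementation behind CoreNLP.util.Tree: `parsePennTree` and this `visitLeaves` function are hypothetical helpers, and the input is a hand-written example string rather than actual parser output.

```javascript
// Parse a Penn Treebank bracketed string into nested { label, children } nodes.
// Leaf nodes carry a `word` and have no children, e.g. (NN dog).
function parsePennTree(text) {
  const tokens = text.match(/\(|\)|[^\s()]+/g) || [];
  if (tokens[0] !== '(') {
    throw new Error('Expected the tree to start with "("');
  }
  let pos = 0;

  function readNode() {
    pos += 1; // consume '('
    const node = { label: tokens[pos++], children: [] };
    while (pos < tokens.length && tokens[pos] !== ')') {
      if (tokens[pos] === '(') {
        node.children.push(readNode()); // nested constituent
      } else {
        node.word = tokens[pos++];      // terminal: (POS word)
      }
    }
    pos += 1; // consume ')'
    return node;
  }

  return readNode();
}

// Depth-first visit of the leaves, left to right.
function visitLeaves(node, callback) {
  if (node.children.length === 0) {
    callback(node);
  } else {
    node.children.forEach(child => visitLeaves(child, callback));
  }
}

const tree = parsePennTree('(ROOT (S (NP (DT The) (JJ little) (NN dog)) (VP (VBZ runs))))');
visitLeaves(tree, leaf => console.log(leaf.label, leaf.word));
// prints: DT The, JJ little, NN dog, VBZ runs (one pair per line)
```

In the library, the leaves additionally give access to the underlying token (as in node.token().ner() above), which a plain string parse like this one cannot provide.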

4.3 TokensRegex, Tregex and Semgrex

// ... include dependencies

const props = new Properties();
props.setProperty('annotators', 'tokenize,ssplit,regexner,depparse');
const expression = new CoreNLP.simple.Expression(
  'John Snow eats snow.',
  '{ner:PERSON}=who <nsubj ({pos:VBZ}=action >dobj {}=what)');
const pipeline = new Pipeline(props, 'English');

pipeline.annotateSemgrex(expression, true)  // similarly use pipeline.annotateTokensRegex / pipeline.annotateTregex
  .then(expression => expression.sentence(0).matches().forEach(match => {
      console.log('match', match.group('who'), match.group('action'), match.group('what'));
  }))
  .catch(err => {
    console.log('err', err);
  });
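
When a pattern does not match as expected, it can be useful to try it against the CoreNLP server directly. The server exposes /tokensregex, /semgrex and /tregex endpoints that take the pattern as a `pattern` query parameter and the text as POST data. The sketch below only builds such a URL; `buildPatternUrl` is a hypothetical helper, and whether this library issues its requests in exactly this form is an assumption.

```javascript
// Build a URL for one of the CoreNLP server's pattern-matching endpoints.
// `kind` is 'tokensregex', 'semgrex' or 'tregex'; `filter` is optional.
function buildPatternUrl(dsn, kind, pattern, filter) {
  const allowed = ['tokensregex', 'semgrex', 'tregex'];
  if (!allowed.includes(kind)) {
    throw new Error(`Unknown pattern endpoint: ${kind}`);
  }
  const url = new URL(`/${kind}`, dsn);
  url.searchParams.set('pattern', pattern); // URL-encoded automatically
  if (filter !== undefined) {
    url.searchParams.set('filter', String(filter));
  }
  return url.toString();
}

// Usage (Node 18+ for the global fetch, server running on port 9000):
// fetch(buildPatternUrl('http://localhost:9000', 'semgrex',
//     '{ner:PERSON}=who <nsubj ({pos:VBZ}=action >dobj {}=what)'),
//   { method: 'POST', body: 'John Snow eats snow.' })
//   .then(res => res.json())
//   .then(json => console.log(JSON.stringify(json, null, 2)));
```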

5. Client Side

This library is isomorphic, which means that it also works in a browser. The API is exactly the same, and you can use it directly by loading it via a <script> tag, using AMD (RequireJS), or within your app bundle.

The browser-ready version of corenlp can be found at dist/index.browser.min.js once built (npm run build).

See the examples folder for more details.

6. External Documentation

Properties
Pipeline
Service
ConnectorServer                   # https://stanfordnlp.github.io/CoreNLP/corenlp-server.html
ConnectorCli                      # https://stanfordnlp.github.io/CoreNLP/cmdline.html
CoreNLP
  simple                          # https://stanfordnlp.github.io/CoreNLP/simple.html
    Annotable
    Annotator
    Document
    Sentence
    Token
    annotator                     # https://stanfordnlp.github.io/CoreNLP/annotators.html
      TokenizerAnnotator          # https://stanfordnlp.github.io/CoreNLP/tokenize.html
      WordsToSentenceAnnotator    # https://stanfordnlp.github.io/CoreNLP/ssplit.html
      POSTaggerAnnotator          # https://stanfordnlp.github.io/CoreNLP/pos.html
      MorphaAnnotator             # https://stanfordnlp.github.io/CoreNLP/lemma.html
      NERClassifierCombiner       # https://stanfordnlp.github.io/CoreNLP/ner.html
      ParserAnnotator             # https://stanfordnlp.github.io/CoreNLP/parse.html
      DependencyParseAnnotator    # https://stanfordnlp.github.io/CoreNLP/depparse.html
      RelationExtractorAnnotator  # https://stanfordnlp.github.io/CoreNLP/relation.html
      CorefAnnotator              # https://stanfordnlp.github.io/CoreNLP/coref.html
      SentimentAnnotator          # https://stanfordnlp.github.io/CoreNLP/sentiment.html - Coming soon...
      RelationExtractorAnnotator  # https://stanfordnlp.github.io/CoreNLP/relation.html - TODO
      NaturalLogicAnnotator       # https://stanfordnlp.github.io/CoreNLP/natlog.html - TODO
      QuoteAnnotator              # https://stanfordnlp.github.io/CoreNLP/quote.html - TODO
  util
    Tree                          # http://www.cs.cornell.edu/courses/cs474/2004fa/lec1.pdf

7. References

This library is not maintained by StanfordNLP. However, it's based on and depends on StanfordNLP/CoreNLP to function.

7.1 Stanford CoreNLP Reference

Manning, Christopher D., Mihai Surdeanu, John Bauer, Jenny Finkel, Steven J. Bethard, and David McClosky. 2014. The Stanford CoreNLP Natural Language Processing Toolkit. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pp. 55-60.
