bastienbot / Nlp Js Tools French
Licence: mit
POS tagger, lemmatizer and stemmer for the French language in JavaScript
NLP JavaScript tools for the French language
Tokenizer, POS tagger, lemmatizer and stemmer
This package is partly based on the Snowball stemming algorithm and its JavaScript adaptation by Kasun Gajasinghe, University of Moratuwa.
This package offers four NLP tools in JavaScript for the French language:
- Tokenizing
- POS Tagging
- Lemmatizing
- Stemming
Install
npm install nlp-js-tools-french
Usage
var NlpjsTFr = require('nlp-js-tools-french');
Corpus to use
var corpus = "Elle semble se nourrir essentiellement de plancton, et de hotdog.";
Configs
var config = {
  tagTypes: ['art', 'ver', 'nom'],
  strictness: false,
  minimumLength: 3,
  debug: true
};
New instance with specific corpus and configs
var nlpToolsFr = new NlpjsTFr(corpus, config);
The available attributes and methods are listed here. Note: the sentence passed to the constructor is tokenized automatically.
var tokenizedWords = nlpToolsFr.tokenized;
var posTaggedWords = nlpToolsFr.posTagger();
var lemmatizedWords = nlpToolsFr.lemmatizer();
var stemmedWords = nlpToolsFr.stemmer();
var stemmedWord = nlpToolsFr.wordStemmer("aléatoirement");
Attributes
config
The configuration object in use
tokenized
["semble", "nourrir", "de"]
Methods return
posTagger()
[{
  "id": 1,
  "word": "semble",
  "pos": [
    "VER",
    "VER"
  ]
},
{
  "id": 2,
  "word": "nourrir",
  "pos": [
    "VER"
  ]
},
{
  "id": 3,
  "word": "de",
  "pos": [
    "NOM",
    "ART:def",
    "PRE"
  ]
}]
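Because each entry can carry several candidate tags, a common follow-up step is to keep only the words that may belong to a given category. The snippet below is a minimal sketch of such a filter, working on data shaped like the posTagger() output above. `wordsWithPos` is a hypothetical helper, not part of the package, and it assumes a tag is either a bare category such as "VER" or a category followed by a colon-separated subtype such as "ART:def".

```javascript
// Sample data copied from the posTagger() output above.
const posTagged = [
  { id: 1, word: "semble", pos: ["VER", "VER"] },
  { id: 2, word: "nourrir", pos: ["VER"] },
  { id: 3, word: "de", pos: ["NOM", "ART:def", "PRE"] }
];

// Hypothetical helper (not provided by nlp-js-tools-french).
// Returns the words that have at least one tag in the given category.
// Assumption: the category is the part of the tag before any ":".
function wordsWithPos(entries, category) {
  const wanted = category.toUpperCase();
  return entries
    .filter(entry =>
      entry.pos.some(tag => tag.split(":")[0].toUpperCase() === wanted))
    .map(entry => entry.word);
}

const verbs = wordsWithPos(posTagged, "ver");
const articles = wordsWithPos(posTagged, "art");
console.log(verbs, articles);
```

Matching on the part before the colon lets a lowercase config-style name such as 'art' select subtyped tags like "ART:def" as well.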
lemmatizer()
[{
  "id": 1,
  "word": "semble",
  "lemma": "sembler"
},
{
  "id": 2,
  "word": "nourrir",
  "lemma": "nourrir"
},
{
  "id": 3,
  "word": "de",
  "lemma": "de"
}]
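Since posTagger() and lemmatizer() both return an id for every word, their results can be combined into one record per token. The following is only a sketch: `mergeById` is a hypothetical helper, and it assumes both methods assign the same id to the same token. That holds for the two samples above, but the README does not state it as a guarantee, so it is worth checking on real output before relying on it.

```javascript
// Sample data copied from the posTagger() and lemmatizer() outputs above.
const posResult = [
  { id: 1, word: "semble", pos: ["VER", "VER"] },
  { id: 2, word: "nourrir", pos: ["VER"] },
  { id: 3, word: "de", pos: ["NOM", "ART:def", "PRE"] }
];
const lemmaResult = [
  { id: 1, word: "semble", lemma: "sembler" },
  { id: 2, word: "nourrir", lemma: "nourrir" },
  { id: 3, word: "de", lemma: "de" }
];

// Hypothetical helper (not provided by nlp-js-tools-french).
// Joins the two result lists on "id"; lemma is null when no match is found.
function mergeById(taggedEntries, lemmaEntries) {
  const lemmaById = new Map(lemmaEntries.map(entry => [entry.id, entry.lemma]));
  return taggedEntries.map(entry => ({
    id: entry.id,
    word: entry.word,
    pos: entry.pos,
    lemma: lemmaById.has(entry.id) ? lemmaById.get(entry.id) : null
  }));
}

const merged = mergeById(posResult, lemmaResult);
console.log(merged);
```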
stemmer()
[{
  "id": 1,
  "word": "semble",
  "stem": "sembl"
},
{
  "id": 3,
  "word": "nourrir",
  "stem": "nourr"
},
{
  "id": 5,
  "word": "de",
  "stem": "de"
}]
wordStemmer(word)
{
  word: "aléatoirement",
  stem: "aléatoir"
}
Config
Option | Type | Default | Description |
---|---|---|---|
tagTypes | Array | ["adj", "adv", "art", "con", "nom", "ono", "pre", "ver", "pro"] | List of dictionaries the package will look in, in case you only need verbs, nouns, both, or any other combination. If a word does not belong to any type, it is tagged as "UNK". |
strictness | Bool | false | If you set strictness to true and try to POS tag the word generalement, it will fail because the word is missing its accents. On the other hand, POS tagging the word dé with strictness set to false will return the types art, pre and nom, because the word will match de in these dictionaries. |
minimumLength | Int | 1 | Algorithms will ignore words that are shorter than this parameter. |
debug | Bool | false | Enable console debug output. |
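
The strictness option comes down to whether accents are taken into account when a word is compared with a dictionary entry. The sketch below illustrates that idea with standard JavaScript only. It is an assumption about how accent-insensitive matching can be done, not the package's actual implementation, and `foldAccents` and `matchesEntry` are hypothetical names.

```javascript
// Remove diacritics: decompose to NFD, then drop combining marks
// in the range U+0300 to U+036F.
function foldAccents(word) {
  return word.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
}

// Hypothetical comparison (not the package's code).
// strictness = true  -> accents must match exactly
// strictness = false -> accents are ignored on both sides
function matchesEntry(word, dictionaryEntry, strictness) {
  if (strictness) {
    return word === dictionaryEntry;
  }
  return foldAccents(word) === foldAccents(dictionaryEntry);
}

console.log(matchesEntry("generalement", "généralement", true));
console.log(matchesEntry("generalement", "généralement", false));
console.log(matchesEntry("dé", "de", false));
```

With strictness enabled, the first comparison fails because the accents are missing, while the two non-strict comparisons succeed, which mirrors the generalement and dé cases described in the table.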