Top 64 nlp-library open source projects

Cn2an
📦 Quickly convert between Chinese numerals and Arabic numerals (latest features: conversion of fractions, dates, temperatures, and more)
Multi Task Nlp
multi_task_NLP is a utility toolkit enabling NLP developers to easily train and infer a single model for multiple tasks.
Fnlp
Toolkit for Chinese natural language processing
Sudachipy
Python version of Sudachi, a Japanese tokenizer.
Urduhack
An NLP library for the Urdu language. It comes with many batteries-included features to help you process Urdu data as easily as possible.
Nlp profiler
A simple NLP library for profiling datasets with one or more text columns. Given a dataset and the name of a column containing text data, NLP Profiler returns either high-level insights or low-level, granular statistical information about the text in that column.
Camel tools
A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.
Lingo
package lingo provides the data structures and algorithms required for natural language processing
Danlp
DaNLP is a repository for Natural Language Processing resources for the Danish Language.
Turkish Deasciifier
Turkish deasciifier in Python based on Deniz Yüret's turkish-mode for Emacs
Punkt Segmenter
Ruby port of the NLTK Punkt sentence segmentation algorithm
Simstring
A Python implementation of SimString, a simple and efficient algorithm for approximate string matching.
Farm
🏡 Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.
Node Opennlp
Apache OpenNLP wrapper for Nodejs
Tika Python
Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.
Sentiment Analyser
Machine learning that can extract sentiment from German and English text
Natas
Python 3 library for processing historical English
Atr4s
Toolkit with state-of-the-art Automatic Terms Recognition methods in Scala
Underthesea
Underthesea - Vietnamese NLP Toolkit
Kuromoji
Kuromoji is a self-contained and very easy to use Japanese morphological analyzer designed for search
Janome
Japanese morphological analysis engine written in pure Python
Ekphrasis
Ekphrasis is a text processing tool geared towards text from social networks such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting hashtags), and spell correction, using word statistics from two big corpora (English Wikipedia and Twitter, 330 million English tweets).
Pynlpl
PyNLPl, pronounced 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as extracting n-grams and frequency lists, and for building simple language models. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL), as well as clients to interface with various NLP-specific servers. Most notably, PyNLPl features a very extensive library for working with FoLiA XML (Format for Linguistic Annotation).
Lingua
👄 The most accurate natural language detection library for Java and the JVM, suitable for long and short text alike
Contextualized Topic Models
A Python package for contextualized topic modeling. CTMs combine BERT with topic models to get coherent topics. Multilingual tasks are also supported. The cross-lingual zero-shot model was published at EACL 2021.
Giveme5w1h
Extraction of the journalistic five W and one H questions (5W1H) from news articles: who did what, when, where, why, and how?
Quick Nlp
PyTorch NLP library based on FastAI
Nagisa
A Japanese tokenizer based on recurrent neural networks
NLP-tools
Useful Python NLP tools (evaluation, GUI, tokenization)
clj-duckling
Language, engine, and tooling for expressing, testing, and evaluating composable language rules on input strings (a Duckling Clojure fork).
Nuts
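Experimental playground of solutions for common natural language processing tasks (mainly text classification, sequence labeling, automatic question answering, and so on)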
extra-model
Code to run the ExtRA algorithm for unsupervised topic/aspect extraction on English texts.
TextFeatureSelection
Python library for feature selection on text features. It provides a filter method, a genetic algorithm, and TextFeatureSelectionEnsemble for improving text classification models.
mlconjug3
A Python library to conjugate verbs in French, English, Spanish, Italian, Portuguese and Romanian (more soon) using Machine Learning techniques.
wefe
WEFE: The Word Embeddings Fairness Evaluation Framework. WEFE is a framework that standardizes bias measurement and mitigation in word embedding models. Feel free to open an issue if you have any questions, or a pull request if you want to contribute to the project.
bllip-parser
BLLIP reranking parser (also known as the Charniak-Johnson parser, Charniak parser, or Brown reranking parser). See http://pypi.python.org/pypi/bllipparser/ for the Python module.
rsmorphy
Morphological analyzer / inflection engine for Russian and Ukrainian languages rewritten in Rust
ppdb
Interface for reading the Paraphrase Database (PPDB)