A simple NLP library for profiling datasets with one or more text columns. Given a dataset and the name of a column containing text data, NLP Profiler returns either high-level insights or low-level, granular statistical information about the text in that column.

Stars: ✭ 181 (+129.11%)

Mutual labels: nlp-library

Sudachi

A Japanese Tokenizer for Business

Stars: ✭ 496 (+527.85%)

Mutual labels: nlp-library

Lingo

package lingo provides the data structures and algorithms required for natural language processing

Stars: ✭ 113 (+43.04%)

Mutual labels: nlp-library

extra-model

Code to run the ExtRA algorithm for unsupervised topic/aspect extraction on English texts.

Stars: ✭ 43 (-45.57%)

Mutual labels: nlp-library

NLP-Natural-Language-Processing

Projects and useful articles / links

Stars: ✭ 149 (+88.61%)

Mutual labels: nlp-library

Transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Stars: ✭ 55,742 (+70459.49%)

Mutual labels: nlp-library

Pynlpl

PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and frequency lists, and to build simple language models. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL). There are also clients to interface with various NLP-specific servers. PyNLPl most notably features a very extensive library for working with FoLiA XML (Format for Linguistic Annotation).

Stars: ✭ 426 (+439.24%)

Mutual labels: nlp-library

TextFeatureSelection

Python library for feature selection on text features. It provides a filter method, a genetic algorithm, and TextFeatureSelectionEnsemble for improving text classification models. It helps improve your machine learning models.

Stars: ✭ 42 (-46.84%)

Mutual labels: nlp-library

Underthesea

Underthesea - Vietnamese NLP Toolkit

Stars: ✭ 823 (+941.77%)

Mutual labels: nlp-library

mlconjug3

A Python library to conjugate verbs in French, English, Spanish, Italian, Portuguese and Romanian (more soon) using Machine Learning techniques.

Stars: ✭ 47 (-40.51%)

Mutual labels: nlp-library

Contextualized Topic Models

A Python package for contextualized topic modeling. CTMs combine BERT with topic models to get coherent topics, and also support multilingual tasks. The cross-lingual zero-shot model was published at EACL 2021.

Stars: ✭ 318 (+302.53%)

Mutual labels: nlp-library

wefe

WEFE: The Word Embeddings Fairness Evaluation Framework. WEFE is a framework that standardizes the bias measurement and mitigation in Word Embeddings models. Please feel welcome to open an issue in case you have any questions or a pull request if you want to contribute to the project!

Stars: ✭ 164 (+107.59%)

Mutual labels: nlp-library

Simplenetnlp

.NET NLP library

Stars: ✭ 38 (-51.9%)

Mutual labels: nlp-library

bllip-parser

BLLIP reranking parser (also known as Charniak-Johnson parser, Charniak parser, Brown reranking parser). See http://pypi.python.org/pypi/bllipparser/ for the Python module.

Stars: ✭ 217 (+174.68%)

Mutual labels: nlp-library

Quick Nlp

PyTorch NLP library based on FastAI

Stars: ✭ 279 (+253.16%)

Mutual labels: nlp-library

ppdb

Interface for reading the Paraphrase Database (PPDB)

Stars: ✭ 22 (-72.15%)

Mutual labels: nlp-library

Janome

Japanese morphological analysis engine written in pure Python

Stars: ✭ 630 (+697.47%)

Mutual labels: nlp-library

schrutepy

The Entire Transcript from the Office in Tidy Format

Stars: ✭ 22 (-72.15%)

Mutual labels: nlp-library

Nagisa

A Japanese tokenizer based on recurrent neural networks

Stars: ✭ 260 (+229.11%)

Mutual labels: nlp-library

spaczz

Fuzzy matching and more functionality for spaCy.

Stars: ✭ 215 (+172.15%)

Mutual labels: nlp-library

Node Opennlp

Apache OpenNLP wrapper for Node.js

Stars: ✭ 55 (-30.38%)

Mutual labels: nlp-library

Multi Task Nlp

multi_task_NLP is a utility toolkit enabling NLP developers to easily train and infer a single model for multiple tasks.

Stars: ✭ 221 (+179.75%)

Mutual labels: nlp-library

clj-duckling

Language, engine, and tooling for expressing, testing, and evaluating composable language rules on input strings. (a duckling clojure fork)

Stars: ✭ 15 (-81.01%)

Mutual labels: nlp-library

Sudachipy

Python version of Sudachi, a Japanese tokenizer.

Stars: ✭ 207 (+162.03%)

Mutual labels: nlp-library

Kagome

Self-contained Japanese Morphological Analyzer written in pure Go

Stars: ✭ 554 (+601.27%)

Mutual labels: nlp-library

Pyarabic

pyarabic

Stars: ✭ 183 (+131.65%)

Mutual labels: nlp-library

Giveme5W

Extraction of the five journalistic W-questions (5W) from news articles

Stars: ✭ 16 (-79.75%)

Mutual labels: nlp-library

Fastnlp

fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.

Stars: ✭ 2,441 (+2989.87%)

Mutual labels: nlp-library

Natas

Python 3 library for processing historical English

Stars: ✭ 28 (-64.56%)

Mutual labels: nlp-library

Camel tools

A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.

Stars: ✭ 124 (+56.96%)

Mutual labels: nlp-library

Nuts

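A testing ground for solutions to common natural language processing tasks (mainly text classification, sequence labeling, automatic question answering, etc.)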

Stars: ✭ 21 (-73.42%)

Mutual labels: nlp-library

Danlp

DaNLP is a repository for Natural Language Processing resources for the Danish Language.

Stars: ✭ 111 (+40.51%)

Mutual labels: nlp-library

Spacy

💫 Industrial-strength Natural Language Processing (NLP) in Python

Stars: ✭ 21,978 (+27720.25%)

Mutual labels: nlp-library

simple NER

Simple rule-based named entity recognition

Stars: ✭ 29 (-63.29%)

Mutual labels: nlp-library

Farm

🏡 Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.

Stars: ✭ 1,140 (+1343.04%)

Mutual labels: nlp-library

Tika Python

Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.

Stars: ✭ 997 (+1162.03%)

Mutual labels: nlp-library

Atr4s

Toolkit with state-of-the-art Automatic Terms Recognition methods in Scala

Stars: ✭ 23 (-70.89%)

Mutual labels: nlp-library

Ekphrasis

Ekphrasis is a text processing tool geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting hashtags) and spell correction, using word statistics from two big corpora (English Wikipedia and Twitter, the latter comprising 330 million English tweets).

Stars: ✭ 433 (+448.1%)

Mutual labels: nlp-library

minie

An open information extraction system that provides compact extractions

Stars: ✭ 83 (+5.06%)

Mutual labels: nlp-library
