A simple NLP library allows profiling datasets with one or more text columns. When given a dataset and a column name containing text data, NLP Profiler will return either high-level insights or low-level/granular statistical information about the text in that column.

✭ 181

jupyter-notebook hacktoberfest nlp natural-language-processing jupyter profiler nlp-machine-learning text-mining profiling nlp-library

Fastnlp

fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.

✭ 2,441

python Jupyter Notebook shell deep-learning natural-language-processing text-classification text-processing nlp-library chinese-nlp nlp-parsing

Awesome Pytorch List

A comprehensive list of PyTorch-related content on GitHub, such as different models, implementations, helper libraries, tutorials, etc.

✭ 12,475

deep-learning machine-learning pytorch awesome awesome-list computer-vision nlp data-science neural-network natural-language-processing facebook tutorials papers cv nlp-library utility-library pytorch-tutorials pytorch-model

Camel tools

A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.

✭ 124

python nlp sentiment-analysis named-entity-recognition nlp-library arabic morphological-analysis

Lingo

package lingo provides the data structures and algorithms required for natural language processing

✭ 113

go golang nlp natural-language-processing language-model nlp-machine-learning nlp-library part-of-speech-tagger

Danlp

DaNLP is a repository for Natural Language Processing resources for the Danish Language.

✭ 111

python machine-learning nlp natural-language-processing named-entity-recognition word-embeddings nlp-library

Turkish Deasciifier

Turkish deasciifier in Python based on Deniz Yüret's turkish-mode for Emacs

✭ 108

python nlp nlp-library

Transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Awesome Pytorch List Cnversion

Awesome-pytorch-list: translation work in progress...

✭ 1,361

python jupyter-notebook deep-learning machine-learning pytorch computer-vision nlp neural-network facebook tutorials papers cv nlp-library utility-library pytorch-tutorials

Toiro

A comparison tool of Japanese tokenizers

✭ 95

python nlp natural-language-processing japanese nlp-library word-segmentation

Punkt Segmenter

Ruby port of the NLTK Punkt sentence segmentation algorithm

✭ 88

ruby nlp-library nltk

Simstring

A Python implementation of the SimString, a simple and efficient algorithm for approximate string matching.

✭ 79

python nlp nlp-library

Farm

🏡 Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.

✭ 1,140

python deep-learning pytorch nlp transfer-learning ner question-answering pretrained-models nlp-library

Node Opennlp

Apache OpenNLP wrapper for Node.js

✭ 55

javascript nlp nlp-library

Tika Python

Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.

✭ 997

python nlp detection parse nlp-machine-learning recognition buffer text-recognition nlp-library mime extraction text-extraction

Simplenetnlp

.NET NLP library

✭ 38

nlp wrapper nuget nlp-library

Sentiment Analyser

ML that can extract German and English sentiment

✭ 35

javascript nodejs nlp node-js sentiment-analysis english nlp-library

Natas

Python 3 library for processing historical English

✭ 28

python english nlp-library

Atr4s

Toolkit with state-of-the-art Automatic Terms Recognition methods in Scala

✭ 23

scala nlp-library

Underthesea

Underthesea - Vietnamese NLP Toolkit

✭ 823

python nlp natural-language-processing nlp-library

Kuromoji

Kuromoji is a self-contained and very easy to use Japanese morphological analyzer designed for search

✭ 745

java japanese nlp-library part-of-speech-tagger

Janome

Japanese morphological analysis engine written in pure Python

✭ 630

python nlp-library japanese-language

Pythainlp

Thai Natural Language Processing in Python.

✭ 582

python hacktoberfest natural-language-processing nlp-library word-segmentation

Kagome

Self-contained Japanese Morphological Analyzer written in pure Go

✭ 554

go hacktoberfest segmentation japanese korean tokenizer nlp-library pos-tagging japanese-language morphological-analysis

Sudachi

A Japanese Tokenizer for Business

✭ 496

java segmentation nlp-library pos-tagging morphological-analysis

Spacy

💫 Industrial-strength Natural Language Processing (NLP) in Python

Ekphrasis

Ekphrasis is a text processing tool geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting hashtags) and spell correction, using word statistics from 2 big corpora (English Wikipedia, and Twitter: 330 million English tweets).

✭ 433

python nlp text-processing tokenizer nlp-library word-segmentation

Pynlpl

PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and frequency lists, and to build a simple language model. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL). There are also clients to interface with various NLP-specific servers. PyNLPl most notably features a very extensive library for working with FoLiA XML (Format for Linguistic Annotation).

✭ 426

python machine-learning library nlp natural-language-processing text-processing nlp-library linguistics

Lingua

👄 The most accurate natural language detection library for Java and the JVM, suitable for long and short text alike

✭ 341

kotlin nlp android-library natural-language-processing nlp-machine-learning natural-language nlp-library language-detection

Contextualized Topic Models

A Python package to run contextualized topic modeling. CTMs combine BERT with topic models to get coherent topics. Also supports multilingual tasks. Cross-lingual Zero-shot model published at EACL 2021.

✭ 318

python nlp transformer nlp-machine-learning embeddings topic-modeling nlp-library

Giveme5w1h

Extraction of the journalistic five W and one H questions (5W1H) from news articles: who did what, when, where, why, and how?

✭ 316

html nlp news question-answering nlp-library text-analysis

Quick Nlp

Pytorch NLP library based on FastAI

✭ 279

python pytorch seq2seq nlp-library

Chatbot ner

chatbot_ner: Named Entity Recognition for chatbots.

✭ 273

python nlp natural-language-processing elasticsearch chatbot named-entity-recognition ner entity chatbots nlp-library

Nagisa

A Japanese tokenizer based on recurrent neural networks

✭ 260

python japanese nlp-library sequence-labeling pos-tagging word-segmentation

NLP-tools

Useful python NLP tools (evaluation, GUI interface, tokenization)

✭ 39

python nlp gui evaluation text-processing nlp-parsing nlp-library evaluation-metrics bleu-score

clj-duckling

Language, engine, and tooling for expressing, testing, and evaluating composable language rules on input strings. (a duckling clojure fork)

✭ 15

clojure nlp nlp-library

classy

classy is a simple-to-use library for building high-performance Machine Learning models in NLP.

✭ 61

python Jupyter Notebook nlp natural-language-processing deep-learning neural-network transformers pytorch seq2seq sequence-to-sequence natural-language-generation nlp-library bert natural-language-understanding pytorch-lightning bert-fine-tuning

Giveme5W

Extraction of the five journalistic W-questions (5W) from news articles

✭ 16

python nlp qa news text-analysis question question-answering answer event-detection event-extraction fivewoneh nlp-library news-articles fivew

NLP Toolkit

Library of state-of-the-art models (PyTorch) for NLP tasks

✭ 92

python Roff nlp natural-language-processing text-classification machine-translation pytorch style-transfer speech-recognition text-summarization nlp-library text-clustering punctuation-restoration

Nuts

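An experimental testbed of solutions for common natural language processing tasks (mainly text classification, sequence labeling, automatic question answering, etc.)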

✭ 21

python deep-learning seq2seq nlp-library nlp-machine-learning sequence-labeling text-categorization

extra-model

Code to run the ExtRA algorithm for unsupervised topic/aspect extraction on English texts.

✭ 43

python shell Dockerfile nlp machine-learning-algorithms nlp-library nlp-keywords-extraction aspect-based-sentiment-analysis aspect-extraction

simple NER

Simple rule-based named entity recognition

✭ 29

python nlp extract-information information-extraction named-entity-recognition keywords annotator ner nlp-library extract-text nlp-keywords-extraction annotation-tool ner-entities

minie

An open information extraction system that provides compact extractions

✭ 83

java natural-language-processing paper extract-information information-extraction nlp-apis nlp-resources nlp-library natural-language-understanding open-information-extraction

OpenPrompt

An Open-Source Framework for Prompt-Learning.

✭ 1,769

python shell nlp natural-language-processing ai deep-learning prompt pytorch transformer prompt-toolkit nlp-library nlp-machine-learning prompts natural-language-understanding pre-trained-model pre-trained-language-models prompt-based-tuning prompt-learning

TextFeatureSelection

Python library for feature selection for text features. It provides a filter method, a genetic algorithm, and TextFeatureSelectionEnsemble for improving text classification models. Helps improve your machine learning models.

✭ 42

python nlp machine-learning natural-language-processing text-classification natural-language feature-selection machinelearning natural-language-generation nlp-resources nlp-library natural-language-inference nlp-machine-learning natural-language-understanding text-categorization nlproc naturallanguageprocessing

GrammarEngine

Grammatical Dictionary of the Russian Language (+ English, Japanese, etc.)

✭ 68

C++ C# c shell HTML Roff nlp machine-learning syntax-parser chunking lemmatizer nlp-parsing nlp-library part-of-speech-tagger morphological-analysis russian-morphology morphological-analyser lemmatization

mlconjug3

A Python library to conjugate verbs in French, English, Spanish, Italian, Portuguese and Romanian (more soon) using Machine Learning techniques.

✭ 47

python nlp devops machine-learning linguistics conjugation test-driven-development nlp-library nlp-machine-learning conjugator

empythy

Automated NLP sentiment predictions: batteries included, or use your own data

✭ 17

python nlp machine-learning sentiment machinelearning nlp-library nlp-machine-learning batteries-included sentiment-classifier sentiment-classification automated-machine-learning nlp-sentiment-classifier sentiment-predictions

wefe

WEFE: The Word Embeddings Fairness Evaluation Framework. WEFE is a framework that standardizes the bias measurement and mitigation in Word Embeddings models. Please feel welcome to open an issue in case you have any questions or a pull request if you want to contribute to the project!

✭ 164

python nlp library word-embeddings nlp-library bias-reduction bias-detection fairness-ai fairness-ml word-embedding-evaluation word-embedding-fairness

Node-Ark-TweetNLP

Node wrapper for Ark-TweetNLP.

✭ 16

javascript nodejs nlp nlp-library ark-tweetnlp mbejda

bllip-parser

BLLIP reranking parser (also known as Charniak-Johnson parser, Charniak parser, Brown reranking parser). See http://pypi.python.org/pypi/bllipparser/ for Python module.

✭ 217

nlp machine-learning natural-language-processing ai parsing artificial-intelligence computational-linguistics nlp-library

rsmorphy

Morphological analyzer / inflection engine for Russian and Ukrainian languages rewritten in Rust

✭ 27

rust russian inflection ukrainian rust-library nlp-library

ppdb

Interface for reading the Paraphrase Database (PPDB)

✭ 22

python nlp natural-language-processing nlp-resources nlp-library

1-60 of 64 nlp-library projects