Detect common phrases in large amounts of text using a data-driven approach. Discovered phrases can be of arbitrary size. Can be used for languages other than English.

✭ 115

python nlp natural-language-processing spark pyspark nlp-machine-learning phrase-extraction collocation-extraction multiword-expressions phrase-discovery multiword-extraction

qutrub

Qutrub: Arabic verb conjugator

✭ 48

python HTML javascript natural-language-processing arabic verb-conjugation

mongolian-nlp

Useful resources for Mongolian NLP

✭ 119

Jupyter Notebook nlp natural-language-processing text-to-speech deep-learning pytorch speech-recognition language-model mongolian

lingvo--Ner-ru

Named entity recognition (NER) in Russian texts

✭ 38

C# javascript HTML nlp natural-language-processing linguistics named-entity-recognition lingvo ner nlp-machine-learning

keras-crf-layer

Implementation of CRF layer in Keras.

✭ 76

python machine-learning natural-language-processing deep-learning crf keras

llda

Labeled LDA in Python

✭ 19

python machine-learning natural-language-processing information-retrieval

CoLAKE

COLING'2020: CoLAKE: Contextualized Language and Knowledge Embedding

✭ 86

python shell natural-language-processing deep-learning knowledge-graph language-model knowledge-embedding

machine-learning-notebooks

🤖 An authorial collection of fundamental python recipes on Machine Learning and Artificial Intelligence.

✭ 63

Jupyter Notebook python data-science machine-learning natural-language-processing deep-learning algorithms machine-learning-algorithms mathematics artificial-intelligence machine-learning-notebooks

image-recognition-and-information-extraction-from-image-documents

Image Recognition and Information Extraction from Image Documents using Keras and Watson NLU

✭ 71

Jupyter Notebook natural-language-processing image-processing image-classification

LDA thesis

Hierarchical, multi-label topic modelling with LDA

✭ 49

python natural-language-processing bayesian-inference latent-dirichlet-allocation gibbs-sampler multilabel-multiclass

task-transferability

Data and code for our paper "Exploring and Predicting Transferability across NLP Tasks", to appear at EMNLP 2020.

✭ 35

python natural-language-processing transfer-learning bert nlp-tasks emnlp2020 task-transferability

spacy-french-models

French models for spacy

✭ 22

machine-learning natural-language-processing spacy

nlp-akash

Natural Language Processing notes and implementations.

✭ 66

python nlp natural-language-processing nltk text-summarization summarization nlp-akash

slackotron

A plugin-extensible Slack bot.

✭ 13

python HTML CSS slack flask natural-language-processing rabbitmq chatbot plugins slack-bot plugin-system rabbitmq-consumer chatbots-framework

deeplearning-papernotes

Brief summaries of papers on NLP, deep learning, and dialogue agents

✭ 17

review natural-language-processing deep-learning arxiv dialogue-systems russian-notes

compound-word-splitter

A compound word splitter for Python

✭ 41

python natural-language-processing

watson-document-co-relation

Correlate text content across documents using Watson NLU, Python NLTK and Watson Studio.

✭ 28

Jupyter Notebook natural-language-processing nlu jupyter-notebook watson-nlu ibmcode text-correlation document-correlation ibm-data-science-experience

datalinguist

Stanford CoreNLP in idiomatic Clojure.

✭ 93

clojure nlp graphviz natural-language-processing stanford stanford-corenlp computational-linguistics dependency-parser pos-tagging part-of-speech-tagger dependency-parsing pos-tagger corenlp rebl datafy

ake-datasets

Large, curated set of benchmark datasets for evaluating automatic keyphrase extraction algorithms.

✭ 125

shell python nlp benchmarking natural-language-processing information-retrieval datasets keyword-extraction nlp-machine-learning keyphrase-extraction keyphrase-generation

mipt-nlp2021

NLP course, MIPT

✭ 26

Jupyter Notebook natural-language-processing course

chariot

Deliver the ready-to-train data to your NLP model.

✭ 123

Jupyter Notebook python natural-language-processing tensorflow keras preprocessing

allennlp imdb

AllenNLP Startup Guide

✭ 13

python Jsonnet machine-learning natural-language-processing deep-learning pytorch allennlp

Turkish-Lemmatizer

Lemmatization for Turkish Language

✭ 72

python natural-language-processing turkish lemmatizer lemmatization

BTM

Biterm Topic Modelling for Short Text with R

✭ 78

C++ r natural-language-processing topic-modeling biterm-topic-modelling

SwiftUIMLKitTranslator

SwiftUI MLKit Language Identification & Translator

✭ 23

swift ruby ios natural-language-processing language-detection machinelearning mlkit naturallanguage translator-text-api swiftui

airy

💬 Open source conversational platform to power conversations with an open source Live Chat, Messengers like Facebook Messenger, WhatsApp and more - 💎 UI from Inbox to dashboards - 🤖 Integrations to Conversational AI / NLP tools and standard enterprise software - ⚡ APIs, WebSocket, Webhook - 🔧 Create any conversational experience

Hierarchical-Typing

Code and Data for all experiments from our ACL 2018 paper "Hierarchical Losses and New Resources for Fine-grained Entity Typing and Linking"

✭ 44

python natural-language-processing deep-learning neural-networks convolutional-neural-network

LM-CNLC

Chinese Natural Language Correction via Language Model

✭ 15

python natural-language-processing deep-learning tensorflow chinese language-model natural-language-correction

TeBaQA

A question answering system which utilises machine learning.

✭ 17

java HTML machine-learning natural-language-processing weka question-answering

FewSum

Few-shot learning framework for opinion summarization published at EMNLP 2020.

✭ 29

python Jupyter Notebook machine-learning natural-language-processing deep-learning summarization opinion-summarization

PyLDA

A Latent Dirichlet Allocation implementation in Python.

✭ 51

python shell nlp machine-learning natural-language-processing machine-learning-algorithms topic-modeling bayesian-inference lda variational-inference latent-dirichlet-allocation gibbs-sampling gibbs-sampler topic-models

word2vec-from-scratch-with-python

A very simple, bare-bones, inefficient implementation of skip-gram word2vec from scratch with Python

✭ 85

python nlp natural-language-processing tutorial word2vec

Relation-Classification

Relation Classification - SEMEVAL 2010 task 8 dataset

✭ 46

Jupyter Notebook perl java machine-learning natural-language-processing text-classification keras ipython-notebook cnn gru classification semeval relation-extraction relation-classification semeval-2010

TextFeatureSelection

Python library for feature selection for text features. It provides a filter method, a genetic algorithm, and TextFeatureSelectionEnsemble for improving text classification models. Helps improve your machine learning models.

✭ 42

python nlp machine-learning natural-language-processing text-classification natural-language feature-selection machinelearning natural-language-generation nlp-resources nlp-library natural-language-inference nlp-machine-learning natural-language-understanding text-categorization nlproc naturallanguageprocessing

label-studio-transformers

Label data using HuggingFace's transformers and automatically get a prediction service

✭ 117

python nlp natural-language-processing transformers bert natural-language-understanding text-labeling data-labeling pytorch-transformers label-studio

awesome-text-classification

Text classification meets word embeddings.

✭ 27

python machine-learning natural-language-processing sentiment-analysis text-classification classification

Attention mechanism-event-extraction

Attention mechanism in CNNs to extract events of interest

✭ 17

python natural-language-processing discourse-analysis attention-mechanism

natural-language-preprocessings

Some recipes of natural language pre-processing

✭ 123

python Jupyter Notebook HTML machine-learning natural-language-processing preprocessnig

allsummarizer

Multilingual automatic text summarizer using a statistical approach and extraction

✭ 28

java python perl nlp natural-language-processing information-retrieval ai statistical-methods text-summarization ir information-retrival automatic-text-summarization sentence-relevance sentence-extraction

bert nli

A Natural Language Inference (NLI) model based on Transformers (BERT and ALBERT)

✭ 97

python natural-language-processing albert natural-language-inference bert nli mixed-precision-training nli-model

STS-CNN-STSbenchmark-Semantic-Similarity-Semantic-Textual-Similarity-CNN-HCTI-Tensorflow-Keras

A simple implementation of paper "HCTI at SemEval-2017 Task 1: Use convolutional neural network to evaluate semantic textual similarity."

✭ 26

python nlp natural-language-processing tensorflow keras cnn sts convolutional-neural-networks semantic-similarity natural-language-understanding semantic-textual-similarity stsbenchmark dataset-sts

frog

Frog is an integration of memory-based natural language processing (NLP) modules developed for Dutch. All NLP modules are based on Timbl, the Tilburg memory-based learning software package.

✭ 70

C++ M4 nlp syntax natural-language-processing morphology named-entity-recognition computational-linguistics text-processing dutch dependency-parser pos-tagger folia lemmatiser morphological-analyser

SelSum

Abstractive opinion summarization system (SelSum) and the largest dataset of Amazon product summaries (AmaSum). EMNLP 2021 conference paper.

✭ 36

python shell natural-language-processing reinforcement-learning deep-learning amazon summarization opinion-mining variational-inference natural-language-understanding

NLP-Review-Scorer

Score your NLP paper review

✭ 25

Jupyter Notebook python natural-language-processing conference bert paper-review

unihandecode

unihandecode is a transliteration library that converts all characters/words in Unicode into the ASCII alphabet and is aware of language preference priorities

✭ 71

python unicode natural-language-processing japanese transliteration character chinese korean romanization latin-alphabet python3-library transiliterator romanization-systems

roberta-wwm-base-distill

This is a RoBERTa wwm base distilled model, distilled from RoBERTa wwm by RoBERTa wwm large

✭ 61

python natural-language-processing tensorflow pretrained-models bert distillation roberta

character-extraction

Extracts character names from a text file and performs analysis of text sentences containing the names.

✭ 40

python natural-language-processing analysis character nltk gutenberg character-extraction

Kevinpro-NLP-demo

All NLP you Need Here. A personal collection of fun NLP demos, currently including PyTorch implementations of 13 NLP applications

✭ 117

python Jupyter Notebook nlp natural-language-processing text-classification pytorch transformer baseline bert vae-gan textclassification

bookworm

📚 social networks from novels

✭ 72

Jupyter Notebook data-science natural-language-processing information-retrieval data-mining social-network graph-theory network-analysis bookworm

sarcasm-detection-for-sentiment-analysis

Sarcasm Detection for Sentiment Analysis

✭ 21

python natural-language-processing deep-learning sentiment-analysis text-classification tensorflow word2vec cnn lstm glove sarcasm-detection

revery

A personal semantic search engine capable of surfacing relevant bookmarks, journal entries, notes, blogs, contacts, and more, built on an efficient document embedding algorithm and Monocle's personal search index.

✭ 200

javascript go CSS search-engine natural-language-processing word2vec browser-extension torus-dom

MP-CNN-Variants

Variants of Multi-Perspective Convolutional Neural Networks

✭ 22

Jupyter Notebook python natural-language-processing convolutional-neural-networks semantic-textual-similarity sentence-similarity paraphrase-identification answer-selection mp-cnn

Question-Answering-based-on-SQuAD

Question Answering System using BiDAF Model on SQuAD v2.0

✭ 20

nlp machine-learning natural-language-processing neural-network question-answering squad nlp-machine-learning bidaf natural-language-understanding nlp-datasets

sentencepiece-jni

Java JNI wrapper for SentencePiece: unsupervised text tokenizer for Neural Network-based text generation.