hama-py: 🦛 Python library for Korean (Hangul) text processing. Python Korean Morphological Analyzer
tap: Text Analytics Pipeline (TAP)
phrase-at-scale: Detects common phrases in large amounts of text using a data-driven approach. Discovered phrases can be of arbitrary length. Can be used for languages other than English.
qutrub: Arabic verb conjugator
lingvo--Ner-ru: Named entity recognition (NER) in Russian texts
llda: Labeled LDA in Python
CoLAKE: Contextualized Language and Knowledge Embedding (COLING 2020)
machine-learning-notebooks: 🤖 An authorial collection of fundamental Python recipes on Machine Learning and Artificial Intelligence.
LDA thesis: Hierarchical, multi-label topic modelling with LDA
task-transferability: Data and code for our paper "Exploring and Predicting Transferability across NLP Tasks", to appear at EMNLP 2020.
nlp-akash: Natural Language Processing notes and implementations.
ake-datasets: Large, curated set of benchmark datasets for evaluating automatic keyphrase extraction algorithms.
chariot: Deliver ready-to-train data to your NLP model.
BTM: Biterm Topic Modelling for Short Text with R
airy: 💬 Open source conversational platform to power conversations via an open source live chat, messengers such as Facebook Messenger and WhatsApp, and more - 💎 UI from inbox to dashboards - 🤖 Integrations with conversational AI / NLP tools and standard enterprise software - ⚡ APIs, WebSocket, Webhook - 🔧 Create any conversational experience
Hierarchical-Typing: Code and data for all experiments from our ACL 2018 paper "Hierarchical Losses and New Resources for Fine-grained Entity Typing and Linking"
LM-CNLC: Chinese Natural Language Correction via Language Model
TeBaQA: A question answering system which utilises machine learning.
FewSum: Few-shot learning framework for opinion summarization, published at EMNLP 2020.
PyLDA: A Latent Dirichlet Allocation implementation in Python.
TextFeatureSelection: Python library for feature selection on text features. It provides a filter method, a genetic algorithm, and TextFeatureSelectionEnsemble for improving text classification models.
allsummarizer: Multilingual automatic text summarizer using a statistical, extractive approach
bert nli: A Natural Language Inference (NLI) model based on Transformers (BERT and ALBERT)
frog: Frog is an integration of memory-based natural language processing (NLP) modules developed for Dutch. All NLP modules are based on Timbl, the Tilburg memory-based learning software package.
SelSum: Abstractive opinion summarization system (SelSum) and the largest dataset of Amazon product summaries (AmaSum). EMNLP 2021 conference paper.
unihandecode: A transliteration library that converts all Unicode characters/words into the ASCII alphabet and is aware of language preference priorities
roberta-wwm-base-distill: A distilled RoBERTa-wwm base model, distilled using RoBERTa-wwm large
character-extraction: Extracts character names from a text file and analyzes the sentences containing those names.
revery: A personal semantic search engine capable of surfacing relevant bookmarks, journal entries, notes, blogs, contacts, and more, built on an efficient document embedding algorithm and Monocle's personal search index.
sentencepiece-jni: Java JNI wrapper for SentencePiece: unsupervised text tokenizer for Neural Network-based text generation.
pair2vec: Compositional Word-Pair Embeddings for Cross-Sentence Inference