TextFeatureSelectionPython library for feature selection for text features. It has filter method, genetic algorithm and TextFeatureSelectionEnsemble for improving text classification models. Helps improve your machine learning models
Stars: ✭ 42 (+100%)
NagisaA Japanese tokenizer based on recurrent neural networks
Stars: ✭ 260 (+1138.1%)
Contextualized Topic ModelsA python package to run contextualized topic modeling. CTMs combine BERT with topic models to get coherent topics. Also supports multilingual tasks. Cross-lingual Zero-shot model published at EACL 2021.
Stars: ✭ 318 (+1414.29%)
Lingua👄 The most accurate natural language detection library for Java and the JVM, suitable for long and short text alike
Stars: ✭ 341 (+1523.81%)
Rnn For Joint NluTensorflow implementation of "Attention-Based Recurrent Neural Network Models for Joint Intent Detection and Slot Filling" (https://arxiv.org/abs/1609.01454)
Stars: ✭ 281 (+1238.1%)
Cluener2020CLUENER2020 中文细粒度命名实体识别 Fine Grained Named Entity Recognition
Stars: ✭ 689 (+3180.95%)
Transformers🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
Stars: ✭ 55,742 (+265338.1%)
CVAE DialCVAE_XGate model in paper "Xu, Dusek, Konstas, Rieser. Better Conversations by Modeling, Filtering, and Optimizing for Coherence and Diversity"
Stars: ✭ 16 (-23.81%)
Lingopackage lingo provides the data structures and algorithms required for natural language processing
Stars: ✭ 113 (+438.1%)
fairseq-tagginga Fairseq fork for sequence tagging/labeling tasks
Stars: ✭ 26 (+23.81%)
Nlp profilerA simple NLP library allows profiling datasets with one or more text columns. When given a dataset and a column name containing text data, NLP Profiler will return either high-level insights or low-level/granular statistical information about the text in that column.
Stars: ✭ 181 (+761.9%)
Tika PythonTika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.
Stars: ✭ 997 (+4647.62%)
schrutepyThe Entire Transcript from the Office in Tidy Format
Stars: ✭ 22 (+4.76%)
KashgariKashgari is a production-level NLP Transfer learning framework built on top of tf.keras for text-labeling and text-classification, includes Word2Vec, BERT, and GPT2 Language Embedding.
Stars: ✭ 2,235 (+10542.86%)
classyclassy is a simple-to-use library for building high-performance Machine Learning models in NLP.
Stars: ✭ 61 (+190.48%)
mlconjug3A Python library to conjugate verbs in French, English, Spanish, Italian, Portuguese and Romanian (more soon) using Machine Learning techniques.
Stars: ✭ 47 (+123.81%)
Quick NlpPytorch NLP library based on FastAI
Stars: ✭ 279 (+1228.57%)
OpenPromptAn Open-Source Framework for Prompt-Learning.
Stars: ✭ 1,769 (+8323.81%)
Multi Task Nlpmulti_task_NLP is a utility toolkit enabling NLP developers to easily train and infer a single model for multiple tasks.
Stars: ✭ 221 (+952.38%)
empythyAutomated NLP sentiment predictions- batteries included, or use your own data
Stars: ✭ 17 (-19.05%)
Quora question pairs NLP KaggleQuora Kaggle Competition : Natural Language Processing using word2vec embeddings, scikit-learn and xgboost for training
Stars: ✭ 17 (-19.05%)
adversarial-code-generationSource code for the ICLR 2021 work "Generating Adversarial Computer Programs using Optimized Obfuscations"
Stars: ✭ 16 (-23.81%)
alter-nluNatural language understanding library for chatbots with intent recognition and entity extraction.
Stars: ✭ 45 (+114.29%)
dynmt-pyNeural machine translation implementation using dynet's python bindings
Stars: ✭ 17 (-19.05%)
coreWIP - A personal life helper providing solutions and happiness
Stars: ✭ 17 (-19.05%)
seq3Source code for the NAACL 2019 paper "SEQ^3: Differentiable Sequence-to-Sequence-to-Sequence Autoencoder for Unsupervised Abstractive Sentence Compression"
Stars: ✭ 121 (+476.19%)
embeddingsEmbeddings: State-of-the-art Text Representations for Natural Language Processing tasks, an initial version of library focus on the Polish Language
Stars: ✭ 27 (+28.57%)
lucene-geo-gazetteerUses Apache Lucene, OpenNLP and geonames and extracts locations from text and geocodes them.
Stars: ✭ 34 (+61.9%)
Hutoma-Conversational-AI-PlatformHu:toma AI is an open source stack designed to help you create compelling conversational interfaces with little effort and above industry accuracy
Stars: ✭ 35 (+66.67%)
Pytorch-NLUPytorch-NLU,一个中文文本分类、序列标注工具包,支持中文长文本、短文本的多类、多标签分类任务,支持中文命名实体识别、词性标注、分词等序列标注任务。 Ptorch NLU, a Chinese text classification and sequence annotation toolkit, supports multi class and multi label classification tasks of Chinese long text and short text, and supports sequence annotation tasks such as Chinese named entity recognition, part of speech ta…
Stars: ✭ 151 (+619.05%)
chatbotkbqa task-oriented qa seq2seq ir neo4j jena seq2seq tf chatbot chat
Stars: ✭ 32 (+52.38%)
TextCategorization⚡ Using deep learning (MLP, CNN, Graph CNN) to classify text in TensorFlow.
Stars: ✭ 30 (+42.86%)
DeepLearning-LabCode lab for deep learning. Including rnn,seq2seq,word2vec,cross entropy,bidirectional rnn,convolution operation,pooling operation,InceptionV3,transfer learning.
Stars: ✭ 83 (+295.24%)
transformerA PyTorch Implementation of "Attention Is All You Need"
Stars: ✭ 28 (+33.33%)
keras-chatbot-web-apiSimple keras chat bot using seq2seq model with Flask serving web
Stars: ✭ 51 (+142.86%)
arabic-taggerAQMAR Arabic Tagger: Sequence tagger with cost-augmented structured perceptron training
Stars: ✭ 38 (+80.95%)
Machine-learningThis repository will contain all the stuffs required for beginners in ML and DL do follow and star this repo for regular updates
Stars: ✭ 27 (+28.57%)
word2vec-tsneGoogle News and Leo Tolstoy: Visualizing Word2Vec Word Embeddings using t-SNE.
Stars: ✭ 59 (+180.95%)
extra-modelCode to run the ExtRA algorithm for unsupervised topic/aspect extraction on English texts.
Stars: ✭ 43 (+104.76%)
lingua-go👄 The most accurate natural language detection library for Go, suitable for long and short text alike
Stars: ✭ 684 (+3157.14%)
minimal-nmtA minimal nmt example to serve as an seq2seq+attention reference.
Stars: ✭ 36 (+71.43%)
chatbot一个基于深度学习的中文聊天机器人,这里有详细的教程与代码,每份代码都有详细的注释,作为学习是美好的选择。A Chinese chatbot based on deep learning.
Stars: ✭ 94 (+347.62%)
pytorch-translmAn implementation of transformer-based language model for sentence rewriting tasks such as summarization, simplification, and grammatical error correction.
Stars: ✭ 22 (+4.76%)
vlainic.github.ioMy GitHub blog: things you might be interested, and probably not...
Stars: ✭ 26 (+23.81%)
minieAn open information extraction system that provides compact extractions
Stars: ✭ 83 (+295.24%)
datastories-semeval2017-task6Deep-learning model presented in "DataStories at SemEval-2017 Task 6: Siamese LSTM with Attention for Humorous Text Comparison".
Stars: ✭ 20 (-4.76%)
GrammarEngineГрамматический Словарь Русского Языка (+ английский, японский, etc)
Stars: ✭ 68 (+223.81%)
NLP-paper🎨 🎨NLP 自然语言处理教程 🎨🎨 https://dataxujing.github.io/NLP-paper/
Stars: ✭ 23 (+9.52%)
phrase-at-scaleDetect common phrases in large amounts of text using a data-driven approach. Size of discovered phrases can be arbitrary. Can be used in languages other than English
Stars: ✭ 115 (+447.62%)
deepsegChinese word segmentation in tensorflow 2.x
Stars: ✭ 23 (+9.52%)