Textvec
Text vectorization tool to outperform TFIDF for classification tasks
Stars: ✭ 167 (+17.61%)
text-analysis
Weaving analytical stories from text data
Stars: ✭ 12 (-91.55%)
Shifterator
Interpretable data visualizations for understanding how texts differ at the word level
Stars: ✭ 209 (+47.18%)
Lingua Franca
Mycroft's multilingual text parsing and formatting library
Stars: ✭ 51 (-64.08%)
Artificial Adversary
🗣️ Tool to generate adversarial text examples and test machine learning models against them
Stars: ✭ 348 (+145.07%)
Textpipe
Textpipe: clean and extract metadata from text
Stars: ✭ 284 (+100%)
Prenlp
Preprocessing Library for Natural Language Processing
Stars: ✭ 130 (-8.45%)
Fastnlp
fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.
Stars: ✭ 2,441 (+1619.01%)
TRUNAJOD2.0
An easy-to-use library to extract indices from texts.
Stars: ✭ 18 (-87.32%)
Pynlpl
PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and frequency lists, and to build simple language models. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL). There are also clients to interface with various NLP-specific servers. PyNLPl most notably features a very extensive library for working with FoLiA XML (Format for Linguistic Annotation).
Stars: ✭ 426 (+200%)
Text mining resources
Resources for learning about Text Mining and Natural Language Processing
Stars: ✭ 358 (+152.11%)
Padatious
A neural network intent parser
Stars: ✭ 124 (-12.68%)
Stringi
THE String Processing Package for R (with ICU)
Stars: ✭ 204 (+43.66%)
Graphbrain
Language, Knowledge, Cognition
Stars: ✭ 294 (+107.04%)
Open Korean Text
Open Korean Text Processor - An Open-source Korean Text Processor
Stars: ✭ 438 (+208.45%)
ConTexto
Python library for text mining and NLP
Stars: ✭ 43 (-69.72%)
Cogcomp Nlpy
CogComp's light-weight Python NLP annotators
Stars: ✭ 115 (-19.01%)
Konoha
🌿 An easy-to-use Japanese text processing tool that makes it possible to switch tokenizers with small code changes.
Stars: ✭ 130 (-8.45%)
Nlpre
Python library for Natural Language Preprocessing (NLPre)
Stars: ✭ 158 (+11.27%)
support-tickets-classification
This case study shows how to create a model for text analysis and classification and deploy it as a web service in the Azure cloud in order to automatically classify support tickets. This project is a proof of concept made by Microsoft (Commercial Software Engineering team) in collaboration with Endava http://endava.com/en
Stars: ✭ 142 (+0%)
Biomedicus
Code for the old version of BioMedICUS; for the new version, see the biomedicus3 repository.
Stars: ✭ 45 (-68.31%)
Keita
My personal toolkit for PyTorch development.
Stars: ✭ 124 (-12.68%)
Neuro
🔮 Neuro.js is a machine learning library for building AI assistants and chat-bots (WIP).
Stars: ✭ 126 (-11.27%)
Tokenizer
Fast and customizable text tokenization library with BPE and SentencePiece support
Stars: ✭ 132 (-7.04%)
Kaggle Crowdflower
1st Place Solution for CrowdFlower Product Search Results Relevance Competition on Kaggle.
Stars: ✭ 1,708 (+1102.82%)
Scattertext Pydata
Notebooks for the Seattle PyData 2017 talk on Scattertext
Stars: ✭ 132 (-7.04%)
Dan Jurafsky Chris Manning Nlp
My solution to the Natural Language Processing course by Dan Jurafsky and Chris Manning in Winter 2012.
Stars: ✭ 124 (-12.68%)
Ncrfpp
NCRF++, a Neural Sequence Labeling Toolkit. Easy to use for any sequence labeling task (e.g. NER, POS, Segmentation). It includes character LSTM/CNN, word LSTM/CNN and softmax/CRF components.
Stars: ✭ 1,767 (+1144.37%)
Uda
Unsupervised Data Augmentation (UDA)
Stars: ✭ 1,877 (+1221.83%)
Fnc 1 Baseline
A baseline implementation for FNC-1
Stars: ✭ 123 (-13.38%)
Clicr
Machine reading comprehension on clinical case reports
Stars: ✭ 123 (-13.38%)
Spacy Js
🎀 JavaScript API for spaCy with Python REST API
Stars: ✭ 123 (-13.38%)
Sluice Networks
Code for Sluice networks: Learning what to share between loosely related tasks
Stars: ✭ 135 (-4.93%)
Files2rouge
Calculating ROUGE score between two files (line-by-line)
Stars: ✭ 120 (-15.49%)
Nlp Pretrained Model
A collection of Natural language processing pre-trained models.
Stars: ✭ 122 (-14.08%)
Cs230 Code Examples
Code examples in PyTorch and TensorFlow for CS230
Stars: ✭ 1,701 (+1097.89%)
Nlpaug
Data augmentation for NLP
Stars: ✭ 2,761 (+1844.37%)
Learn To Select Data
Code for Learning to select data for transfer learning with Bayesian Optimization
Stars: ✭ 140 (-1.41%)
Rasa
💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants
Stars: ✭ 13,219 (+9209.15%)
Textacy
NLP, before and after spaCy
Stars: ✭ 1,849 (+1202.11%)
Dialoglue
DialoGLUE: A Natural Language Understanding Benchmark for Task-Oriented Dialogue
Stars: ✭ 120 (-15.49%)
Scattertext
Beautiful visualizations of how language differs among document types.
Stars: ✭ 1,722 (+1112.68%)
Discobert
Code for paper "Discourse-Aware Neural Extractive Text Summarization" (ACL20)
Stars: ✭ 120 (-15.49%)
Tmtoolkit
Text Mining and Topic Modeling Toolkit for Python with parallel processing power
Stars: ✭ 135 (-4.93%)
Chars2vec
Character-based word embeddings model based on RNN for handling real world texts
Stars: ✭ 130 (-8.45%)
Pymetamap
Python wrapper for MetaMap
Stars: ✭ 119 (-16.2%)
Ml Dl Scripts
The repository provides useful Python scripts for ML and data analysis
Stars: ✭ 119 (-16.2%)