text-analysis
Weaving analytical stories from text data
Stars: ✭ 12 (-55.56%)
support-tickets-classification
This case study shows how to create a model for text analysis and classification and deploy it as a web service in the Azure cloud to automatically classify support tickets. The project is a proof of concept made by Microsoft (Commercial Software Engineering team) in collaboration with Endava (http://endava.com/en).
Stars: ✭ 142 (+425.93%)
Text-Analysis
Explaining textual analysis tools in Python, including preprocessing, Skip Gram (word2vec), and topic modelling.
Stars: ✭ 48 (+77.78%)
WeTextProcessing
Text Normalization & Inverse Text Normalization
Stars: ✭ 213 (+688.89%)
perke
A keyphrase extractor for Persian
Stars: ✭ 60 (+122.22%)
Xioc
Extract indicators of compromise from text, including "escaped" ones.
Stars: ✭ 148 (+448.15%)
Colibri Core
Colibri core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipgrams (i.e. patterns with one or more gaps, either of fixed or dynamic size) in a quick and memory-efficient way. At the core is the tool ``colibri-patternmodeller``, which allows you to build, view, manipulate and query pattern models.
Stars: ✭ 112 (+314.81%)
Pipeit
PipeIt is a text transformation, conversion, cleansing and extraction tool.
Stars: ✭ 57 (+111.11%)
Artificial Adversary
🗣️ Tool to generate adversarial text examples and test machine learning models against them
Stars: ✭ 348 (+1188.89%)
teanaps
An open-source Python library for natural language processing and text analysis.
Stars: ✭ 91 (+237.04%)
Pynlpl
PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and frequency lists, and to build simple language models. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL). There are also clients to interface with various NLP-specific servers. PyNLPl most notably features a very extensive library for working with FoLiA XML (Format for Linguistic Annotation).
Stars: ✭ 426 (+1477.78%)
TRUNAJOD2.0
An easy-to-use library to extract indices from texts.
Stars: ✭ 18 (-33.33%)
Cogcomp Nlpy
CogComp's light-weight Python NLP annotators
Stars: ✭ 115 (+325.93%)
Text-Classification-LSTMs-PyTorch
The aim of this repository is to show a baseline model for text classification by implementing an LSTM-based model in PyTorch. To provide a better understanding of the model, it uses a Tweets dataset provided by Kaggle.
Stars: ✭ 45 (+66.67%)
Textcluster
Preprocessing module for short-text clustering
Stars: ✭ 115 (+325.93%)
estratto
Parsing fixed-width file content made easy
Stars: ✭ 12 (-55.56%)
deduce
Deduce: de-identification method for Dutch medical text
Stars: ✭ 40 (+48.15%)
textstat
Ruby gem to calculate statistics from text to determine readability, complexity and grade level of a particular corpus.
Stars: ✭ 25 (-7.41%)
R.TeMiS
R.TeMiS: R Text Mining Solution
Stars: ✭ 21 (-22.22%)
neji
Flexible and powerful platform for biomedical information extraction from text
Stars: ✭ 37 (+37.04%)
tf-idf-python
Term frequency–inverse document frequency for Chinese novels and documents, implemented in Python.
Stars: ✭ 98 (+262.96%)
linguisticsdown
Easy Linguistics Document Writing with R Markdown
Stars: ✭ 24 (-11.11%)
OpenOctober
Open-October contribution destination. The Contest has now ended.
Stars: ✭ 27 (+0%)
TabInOut
Framework for information extraction from tables
Stars: ✭ 37 (+37.04%)
civicmine
Text mining cancer biomarkers for the CIVIC database
Stars: ✭ 19 (-29.63%)
Compare-UserJS
PowerShell script for comparing user.js (or prefs.js) files.
Stars: ✭ 79 (+192.59%)
SeqTools
A Python library to manipulate and transform indexable data (lists, arrays, ...)
Stars: ✭ 42 (+55.56%)
Hr
Easy Access to Uppercase H
Stars: ✭ 56 (+107.41%)
lameta
The Metadata Editor for Transparent Archiving of language document materials
Stars: ✭ 18 (-33.33%)
sparklanes
A lightweight data processing framework for Apache Spark
Stars: ✭ 17 (-37.04%)
Textrude
Code generation from YAML/JSON/CSV models via SCRIBAN templates
Stars: ✭ 79 (+192.59%)
hama-py
🦛 Python library for Korean (Hangul) text processing. Python Korean Morphological Analyzer
Stars: ✭ 16 (-40.74%)
tap
Text Analytics Pipeline (TAP)
Stars: ✭ 17 (-37.04%)
lda2vec
Mixing Dirichlet Topic Models and Word Embeddings to Make lda2vec, from the paper https://arxiv.org/abs/1605.02019
Stars: ✭ 27 (+0%)
textlearnR
A simple collection of well working NLP models (Keras, H2O, StarSpace) tuned and benchmarked on a variety of datasets.
Stars: ✭ 16 (-40.74%)
textreadr
Tools to uniformly read in text data including semi-structured transcripts
Stars: ✭ 65 (+140.74%)
thrones2vec
Using Word2Vec to explore semantic similarities between the entities of "A Song of Ice and Fire" ("Game of Thrones").
Stars: ✭ 27 (+0%)
lingvo--Ner-ru
Named entity recognition (NER) in Russian texts
Stars: ✭ 38 (+40.74%)
extractnet
A Dragnet that also extracts author, headline, date, keywords from context
Stars: ✭ 52 (+92.59%)
HelloWorld
Simple hello world in different language syntax
Stars: ✭ 9 (-66.67%)
MLLabelUtils.jl
Utility package for working with classification targets and label-encodings
Stars: ✭ 30 (+11.11%)
lingua-go
👄 The most accurate natural language detection library for Go, suitable for long and short text alike
Stars: ✭ 684 (+2433.33%)
JoSH
[KDD 2020] Hierarchical Topic Mining via Joint Spherical Tree and Text Embedding
Stars: ✭ 55 (+103.7%)
ICU4N
International Components for Unicode for .NET
Stars: ✭ 18 (-33.33%)
odinson
Odinson is a powerful and highly optimized open-source framework for rule-based information extraction. Odinson couples a simple, yet powerful pattern language that can operate over multiple representations of text, with a runtime system that operates in near real time.
Stars: ✭ 59 (+118.52%)
andaluh-js
Transliterate español (Spanish) spelling to andaluz proposals using JavaScript
Stars: ✭ 22 (-18.52%)
converse
Conversational text Analysis using various NLP techniques
Stars: ✭ 147 (+444.44%)
lab-dotphy
The Virtual Lab for Physics
Stars: ✭ 14 (-48.15%)
react-drip-form
☕ HoC-based React form state manager with support for validation and normalization.
Stars: ✭ 66 (+144.44%)
misinfo
📊 Tools to Perform ‘Misinformation’ Analysis on a Text Corpus (wrapper for methods in https://github.com/PDXBek/Misinformation)
Stars: ✭ 17 (-37.04%)