385 open source projects that are alternatives to or similar to TextDatasetCleaner

corpusexplorer2.0
Corpus linguistics has never been so easy...
Stars: ✭ 16 (-40.74%)
text-analysis
Weaving analytical stories from text data
Stars: ✭ 12 (-55.56%)
Mutual labels:  text-mining, text-processing
support-tickets-classification
This case study shows how to create a model for text analysis and classification and deploy it as a web service in Azure cloud in order to automatically classify support tickets. This project is a proof of concept made by Microsoft (Commercial Software Engineering team) in collaboration with Endava http://endava.com/en
Stars: ✭ 142 (+425.93%)
Mutual labels:  text-mining, text-processing
Text-Analysis
Explaining textual analysis tools in Python. Including Preprocessing, Skip Gram (word2vec), and Topic Modelling.
Stars: ✭ 48 (+77.78%)
Mutual labels:  text-mining, text-processing
WeTextProcessing
Text Normalization & Inverse Text Normalization
Stars: ✭ 213 (+688.89%)
Mutual labels:  text-processing, normalization
perke
A keyphrase extractor for Persian
Stars: ✭ 60 (+122.22%)
Mutual labels:  text-mining, text-processing
Xioc
Extract indicators of compromise from text, including "escaped" ones.
Stars: ✭ 148 (+448.15%)
Mutual labels:  text-mining, text-processing
rosette-elasticsearch-plugin
Document Enrichment plugin for Elasticsearch
Stars: ✭ 25 (-7.41%)
Mutual labels:  text-mining, text-analytics
Text Mining
Text Mining in Python
Stars: ✭ 18 (-33.33%)
Mutual labels:  text-mining, text-processing
Colibri Core
Colibri core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipgrams (i.e., patterns with one or more gaps, either of fixed or dynamic size) in a quick and memory-efficient way. At the core is the tool ``colibri-patternmodeller``, which allows you to build, view, manipulate and query pattern models.
Stars: ✭ 112 (+314.81%)
Mutual labels:  linguistics, text-processing
Applied Text Mining In Python
Repo for Applied Text Mining in Python (coursera) by University of Michigan
Stars: ✭ 59 (+118.52%)
Mutual labels:  text-mining, text-processing
Pipeit
PipeIt is a text transformation, conversion, cleansing and extraction tool.
Stars: ✭ 57 (+111.11%)
Mutual labels:  text-mining, text-processing
Artificial Adversary
🗣️ Tool to generate adversarial text examples and test machine learning models against them
Stars: ✭ 348 (+1188.89%)
Mutual labels:  text-mining, text-processing
teanaps
An open-source Python library for natural language processing and text analysis.
Stars: ✭ 91 (+237.04%)
Mutual labels:  text-mining, text-processing
Pynlpl
PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and frequency lists, and to build simple language models. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL). There are also clients to interface with various NLP-specific servers. PyNLPl most notably features a very extensive library for working with FoLiA XML (Format for Linguistic Annotation).
Stars: ✭ 426 (+1477.78%)
Mutual labels:  linguistics, text-processing
advanced-text-mining
Covers methodologies for natural language processing and text analysis using the TEANAPS library.
Stars: ✭ 15 (-44.44%)
Mutual labels:  text-mining, text-processing
TRUNAJOD2.0
An easy-to-use library to extract indices from texts.
Stars: ✭ 18 (-33.33%)
Mutual labels:  text-mining, text-processing
Cogcomp Nlpy
CogComp's light-weight Python NLP annotators
Stars: ✭ 115 (+325.93%)
Mutual labels:  text-mining, text-processing
Text-Classification-LSTMs-PyTorch
The aim of this repository is to show a baseline model for text classification by implementing an LSTM-based model coded in PyTorch. To provide a better understanding of the model, it uses a Tweets dataset provided by Kaggle.
Stars: ✭ 45 (+66.67%)
Mutual labels:  text-mining, text-processing
Textcluster
Short-text clustering preprocessing module
Stars: ✭ 115 (+325.93%)
Mutual labels:  text-mining, text-processing
estratto
Parsing fixed-width file content made easy
Stars: ✭ 12 (-55.56%)
Mutual labels:  text-mining, text-processing
deduce
Deduce: de-identification method for Dutch medical text
Stars: ✭ 40 (+48.15%)
Mutual labels:  text-mining, text-processing
textstat
Ruby gem to calculate statistics from text to determine readability, complexity and grade level of a particular corpus.
Stars: ✭ 25 (-7.41%)
Mutual labels:  text-processing
R.TeMiS
R.TeMiS: R Text Mining Solution
Stars: ✭ 21 (-22.22%)
Mutual labels:  text-mining
neji
Flexible and powerful platform for biomedical information extraction from text
Stars: ✭ 37 (+37.04%)
Mutual labels:  text-mining
tf-idf-python
Term frequency–inverse document frequency for Chinese novel/documents implemented in python.
Stars: ✭ 98 (+262.96%)
Mutual labels:  text-mining
linguisticsdown
Easy Linguistics Document Writing with R Markdown
Stars: ✭ 24 (-11.11%)
Mutual labels:  linguistics
OpenOctober
Open-October contribution destination. The Contest has now ended.
Stars: ✭ 27 (+0%)
Mutual labels:  hactoberfest2021
TabInOut
Framework for information extraction from tables
Stars: ✭ 37 (+37.04%)
Mutual labels:  text-mining
neural-net-linguistics
Papers about NN and linguistics
Stars: ✭ 14 (-48.15%)
Mutual labels:  linguistics
civicmine
Text mining cancer biomarkers for the CIVIC database
Stars: ✭ 19 (-29.63%)
Mutual labels:  text-mining
Compare-UserJS
PowerShell script for comparing user.js (or prefs.js) files.
Stars: ✭ 79 (+192.59%)
Mutual labels:  text-processing
SeqTools
A python library to manipulate and transform indexable data (lists, arrays, ...)
Stars: ✭ 42 (+55.56%)
Mutual labels:  preprocessing
Hr
Easy Access to Uppercase H
Stars: ✭ 56 (+107.41%)
Mutual labels:  text-processing
lameta
The Metadata Editor for Transparent Archiving of language document materials
Stars: ✭ 18 (-33.33%)
Mutual labels:  linguistics
sparklanes
A lightweight data processing framework for Apache Spark
Stars: ✭ 17 (-37.04%)
Mutual labels:  preprocessing
Textrude
Code generation from YAML/JSON/CSV models via SCRIBAN templates
Stars: ✭ 79 (+192.59%)
Mutual labels:  text-processing
hama-py
🦛 Python library for processing Korean (Hangul) text. Python Korean Morphological Analyzer
Stars: ✭ 16 (-40.74%)
Mutual labels:  text-processing
tap
Text Analytics Pipeline (TAP)
Stars: ✭ 17 (-37.04%)
Mutual labels:  text-analytics
lda2vec
Mixing Dirichlet Topic Models and Word Embeddings to Make lda2vec from this paper https://arxiv.org/abs/1605.02019
Stars: ✭ 27 (+0%)
Mutual labels:  text-mining
contextualSpellCheck
✔️Contextual word checker for better suggestions
Stars: ✭ 274 (+914.81%)
Mutual labels:  preprocessing
textlearnR
A simple collection of well working NLP models (Keras, H2O, StarSpace) tuned and benchmarked on a variety of datasets.
Stars: ✭ 16 (-40.74%)
Mutual labels:  text-mining
textreadr
Tools to uniformly read in text data including semi-structured transcripts
Stars: ✭ 65 (+140.74%)
Mutual labels:  text-mining
thrones2vec
Using Word2Vec to explore semantic similarities between the entities of "A Song of Ice and Fire" ("Game of Thrones").
Stars: ✭ 27 (+0%)
Mutual labels:  text-mining
lingvo--Ner-ru
Named entity recognition (NER) in Russian texts
Stars: ✭ 38 (+40.74%)
Mutual labels:  linguistics
extractnet
A Dragnet that also extracts author, headline, date, keywords from context
Stars: ✭ 52 (+92.59%)
Mutual labels:  text-mining
HelloWorld
Simple hello world in different language syntax
Stars: ✭ 9 (-66.67%)
Mutual labels:  hactoberfest2021
MLLabelUtils.jl
Utility package for working with classification targets and label-encodings
Stars: ✭ 30 (+11.11%)
Mutual labels:  preprocessing
keras-layer-normalization
Layer normalization implemented in Keras
Stars: ✭ 58 (+114.81%)
Mutual labels:  normalization
lingua-go
👄 The most accurate natural language detection library for Go, suitable for long and short text alike
Stars: ✭ 684 (+2433.33%)
Mutual labels:  text-processing
JoSH
[KDD 2020] Hierarchical Topic Mining via Joint Spherical Tree and Text Embedding
Stars: ✭ 55 (+103.70%)
Mutual labels:  text-mining
SEDTWik-Event-Detection-from-Tweets
Segmentation based event detection from Tweets. Published at NAACL SRW 2019
Stars: ✭ 58 (+114.81%)
Mutual labels:  text-mining
ICU4N
International Components for Unicode for .NET
Stars: ✭ 18 (-33.33%)
Mutual labels:  normalization
odinson
Odinson is a powerful and highly optimized open-source framework for rule-based information extraction. Odinson couples a simple, yet powerful pattern language that can operate over multiple representations of text, with a runtime system that operates in near real time.
Stars: ✭ 59 (+118.52%)
Mutual labels:  text-mining
text-preprocess-python
Text preprocessing tools in python.
Stars: ✭ 22 (-18.52%)
Mutual labels:  text-processing
andaluh-js
Transliterate español (Spanish) spelling to andaluz proposals using JavaScript
Stars: ✭ 22 (-18.52%)
Mutual labels:  text-processing
converse
Conversational text Analysis using various NLP techniques
Stars: ✭ 147 (+444.44%)
Mutual labels:  text-mining
lab-dotphy
The Virtual Lab for Physics
Stars: ✭ 14 (-48.15%)
Mutual labels:  hactoberfest2021
react-drip-form
☕ HoC based React forms state manager, Support for validation and normalization.
Stars: ✭ 66 (+144.44%)
Mutual labels:  normalization
misinfo
📊 Tools to Perform ‘Misinformation’ Analysis on a Text Corpus (wrapper for methods in https://github.com/PDXBek/Misinformation)
Stars: ✭ 17 (-37.04%)
Mutual labels:  text-mining