Top 4 text-cleaning open source projects
grammarify
Grammarify is an npm package that safely cleans up text containing misspellings, improper capitalization, and lexical illusions, among other things.
✭ 43
javascript
spelling-correction
grammar-checker
text-cleaning
extractnet
A Dragnet that also extracts the author, headline, date, and keywords from context
✭ 52
HTML
python
cython
C++
text-mining
news
web-scraping
webscraping
news-articles
news-extractor
content-extraction
news-extraction
text-cleaning
date-extraction
author-extraction
text-preprocess-python
Text preprocessing tools in Python.
✭ 22
python
nlp
text-processing
nlp-machine-learning
text-cleaner
text-cleaning
trafilatura
Python and command-line tool to gather text on the Web: web crawling/scraping and extraction of text, metadata, and comments
✭ 711
python
nlp
crawler
text-mining
news
html-to-markdown
scraping
corpus
news-aggregator
text-extraction
web-scraping
rss-feed
readability
tei
html2text
news-crawler
corpus-builder
corpus-tools
article-extractor
text-cleaning
text-preprocessing