odinson
Odinson is a powerful and highly optimized open-source framework for rule-based information extraction. Odinson couples a simple yet powerful pattern language that can operate over multiple representations of text with a runtime system that operates in near real time.
Stars: ✭ 59 (+84.38%)
Mutual labels: text-mining, information-extraction
TableDisentangler
Functional and structural analysis of tables in research papers (Table disentangling)
Stars: ✭ 21 (-34.37%)
Mutual labels: text-mining, information-extraction
Awesome Hungarian Nlp
A curated list of NLP resources for Hungarian
Stars: ✭ 121 (+278.13%)
Mutual labels: text-mining, information-extraction
deduce
Deduce: de-identification method for Dutch medical text
Stars: ✭ 40 (+25%)
Mutual labels: text-mining, information-extraction
neji
Flexible and powerful platform for biomedical information extraction from text
Stars: ✭ 37 (+15.63%)
Mutual labels: text-mining, information-extraction
TabInOut
Framework for information extraction from tables
Stars: ✭ 37 (+15.63%)
Mutual labels: text-mining, information-extraction
Chemdataextractor
Automatically extract chemical information from scientific documents
Stars: ✭ 152 (+375%)
Mutual labels: text-mining, information-extraction
Nlp profiler
A simple NLP library that profiles datasets with one or more text columns. Given a dataset and the name of a column containing text data, NLP Profiler returns either high-level insights or low-level, granular statistical information about the text in that column.
Stars: ✭ 181 (+465.63%)
Mutual labels: text-mining
Tokenizers
Fast, Consistent Tokenization of Natural Language Text
Stars: ✭ 161 (+403.13%)
Mutual labels: text-mining
Udpipe
R package for Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing Based on the UDPipe Natural Language Processing Toolkit
Stars: ✭ 160 (+400%)
Mutual labels: text-mining
Texthero
Text preprocessing, representation and visualization from zero to hero.
Stars: ✭ 2,407 (+7421.88%)
Mutual labels: text-mining
Gwu data mining
Materials for GWU DNSC 6279 and DNSC 6290.
Stars: ✭ 217 (+578.13%)
Mutual labels: text-mining
Multi rake
Multilingual Rapid Automatic Keyword Extraction (RAKE) for Python
Stars: ✭ 162 (+406.25%)
Mutual labels: text-mining
koshort
(deprecated) 🐱 koshort is a Python package for Korean internet spoken language crawling and processing... or maybe Korean domestic cat.
Stars: ✭ 62 (+93.75%)
Mutual labels: text-mining
Lazynlp
Library to scrape and clean web pages to create massive datasets.
Stars: ✭ 1,985 (+6103.13%)
Mutual labels: text-mining
lima
The Libre Multilingual Analyzer, a Natural Language Processing (NLP) C++ toolkit.
Stars: ✭ 75 (+134.38%)
Mutual labels: information-extraction
text-analysis
Weaving analytical stories from text data
Stars: ✭ 12 (-62.5%)
Mutual labels: text-mining
Shallowlearn
An experiment in re-implementing supervised learning models based on shallow neural network approaches (e.g. fastText), with some additional exclusive features and a nice API. Written in Python and fully compatible with Scikit-learn.
Stars: ✭ 196 (+512.5%)
Mutual labels: text-mining
Fake news detection
Fake News Detection in Python
Stars: ✭ 194 (+506.25%)
Mutual labels: text-mining