odinson
Odinson is a powerful and highly optimized open-source framework for rule-based information extraction. Odinson couples a simple yet powerful pattern language, which can operate over multiple representations of text, with a runtime system that operates in near real time.
Stars: ✭ 59 (+59.46%)
Chemdataextractor
Automatically extract chemical information from scientific documents
Stars: ✭ 152 (+310.81%)
palladian
Palladian is a Java-based toolkit with functionality for text processing, classification, information extraction, and data retrieval from the Web.
Stars: ✭ 32 (-13.51%)
TableDisentangler
Functional and structural analysis of tables in research papers (Table disentangling)
Stars: ✭ 21 (-43.24%)
deduce
Deduce: de-identification method for Dutch medical text
Stars: ✭ 40 (+8.11%)
slotminer
Tool for slot extraction from text
Stars: ✭ 15 (-59.46%)
crminer
⛔ ARCHIVED ⛔ Fetch 'Scholarly' Full Text from 'Crossref'
Stars: ✭ 17 (-54.05%)
neji
Flexible and powerful platform for biomedical information extraction from text
Stars: ✭ 37 (+0%)
malay-dataset
Text corpus for Bahasa Malaysia, https://malaya.readthedocs.io/en/latest/Dataset.html
Stars: ✭ 189 (+410.81%)
Saaghar
“Saaghar” (ساغر) is Persian poetry software written in C++ with the Qt framework; it uses the "ganjoor" database. Both its “Viewer” and “Search” pages support tabs, which makes it suitable for research purposes.
Stars: ✭ 42 (+13.51%)
batterydatabase
Tools for auto-generating the battery-materials database.
Stars: ✭ 29 (-21.62%)
CogIE
CogIE: An Information Extraction Toolkit for Bridging Text and CogNet. ACL 2021
Stars: ✭ 47 (+27.03%)
SFDCRules
Simple yet powerful Rule Engine for Salesforce - SFDCRules
Stars: ✭ 38 (+2.7%)
mllp
The code of AAAI 2020 paper "Transparent Classification with Multilayer Logical Perceptrons and Random Binarization".
Stars: ✭ 15 (-59.46%)
textreadr
Tools to uniformly read in text data including semi-structured transcripts
Stars: ✭ 65 (+75.68%)
reproducible-continual-learning
Continual learning baselines and strategies from popular papers, using Avalanche. We include EWC, SI, GEM, AGEM, LwF, iCarl, GDumb, and other strategies.
Stars: ✭ 118 (+218.92%)
alter-nlu
Natural language understanding library for chatbots with intent recognition and entity extraction.
Stars: ✭ 45 (+21.62%)
teanaps
An open-source Python library for natural language processing and text analysis.
Stars: ✭ 91 (+145.95%)
iis
Information Inference Service of the OpenAIRE system
Stars: ✭ 16 (-56.76%)
DocuNet
Code and dataset for the IJCAI 2021 paper "Document-level Relation Extraction as Semantic Segmentation".
Stars: ✭ 84 (+127.03%)
naacl2018-fever
Fact Extraction and VERification baseline published in NAACL2018
Stars: ✭ 109 (+194.59%)
woolly
The Text Mining Elixir
Stars: ✭ 48 (+29.73%)
sentometrics
An integrated framework in R for textual sentiment time series aggregation and prediction
Stars: ✭ 77 (+108.11%)
rita-dsl
A Domain Specific Language (DSL) for building language patterns. These can later be compiled into spaCy patterns, pure regex, or any other format.
Stars: ✭ 60 (+62.16%)
Rulette
A pragmatic business rule management system
Stars: ✭ 91 (+145.95%)
greek scansion
Python library for automatic analysis of Ancient Greek hexameter. The algorithm uses linguistic rules and finite-state technology.
Stars: ✭ 16 (-56.76%)
textlearnR
A simple collection of well-performing NLP models (Keras, H2O, StarSpace) tuned and benchmarked on a variety of datasets.
Stars: ✭ 16 (-56.76%)
iww
AI-based web wrapper for web content extraction
Stars: ✭ 61 (+64.86%)
SkillNER
A (smart) rule-based NLP module to extract job skills from text
Stars: ✭ 69 (+86.49%)
tf-idf-python
Term frequency–inverse document frequency for Chinese novels and documents, implemented in Python.
Stars: ✭ 98 (+164.86%)
PubMed-Best-Match
Machine-learning-based pipeline relying on LambdaMART, currently used in PubMed for relevance (Best Match) searches
Stars: ✭ 36 (-2.7%)
extractnet
A Dragnet that also extracts author, headline, date, and keywords from context
Stars: ✭ 52 (+40.54%)
gotor
This program provides efficient web scraping services for Tor and non-Tor sites. The program has both a CLI and REST API.
Stars: ✭ 97 (+162.16%)
rule-engine
A flow-based, event-driven, extensible, reactive, lightweight rule engine.
Stars: ✭ 165 (+345.95%)
trafilatura
Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments
Stars: ✭ 711 (+1821.62%)
estratto
Parsing fixed-width file content made easy
Stars: ✭ 12 (-67.57%)
mmqtt
An Open-Source, Distributed MQTT Broker for IoT.
Stars: ✭ 58 (+56.76%)
JoSH
[KDD 2020] Hierarchical Topic Mining via Joint Spherical Tree and Text Embedding
Stars: ✭ 55 (+48.65%)
Knowledge Graph Wander
A collection of papers, code, projects, tutorials ... for Knowledge Graph and other NLP methods
Stars: ✭ 26 (-29.73%)
intertext
Detect and visualize text reuse
Stars: ✭ 97 (+162.16%)
rools
A small rule engine for Node.
Stars: ✭ 118 (+218.92%)
ATGValidator
iOS validation framework with form validation support
Stars: ✭ 51 (+37.84%)
powerflows-dmn
Power Flows DMN - Powerful decisions and rules engine
Stars: ✭ 46 (+24.32%)
liteflow
Small but powerful rules engine: lightweight, powerful, and elegant
Stars: ✭ 1,119 (+2924.32%)
Search
Blue Brain text mining toolbox for semantic search and structured information extraction
Stars: ✭ 26 (-29.73%)
DiD
Keeping track of what is going on with the latest DiD innovations.
Stars: ✭ 299 (+708.11%)