Text-Classification-LSTMs-PyTorch
The aim of this repository is to show a baseline model for text classification by implementing an LSTM-based model in PyTorch. To make the model easier to understand, it is demonstrated on a Tweets dataset provided by Kaggle.
Stars: ✭ 45 (+12.5%)
text-analysis
Weaving analytical stories from text data
Stars: ✭ 12 (-70%)
Dan Jurafsky Chris Manning Nlp
My solution to the Natural Language Processing course taught by Dan Jurafsky and Chris Manning in Winter 2012.
Stars: ✭ 124 (+210%)
support-tickets-classification
This case study shows how to create a model for text analysis and classification and deploy it as a web service in the Azure cloud to automatically classify support tickets. This project is a proof of concept made by Microsoft (Commercial Software Engineering team) in collaboration with Endava http://endava.com/en
Stars: ✭ 142 (+255%)
perke
A keyphrase extractor for Persian
Stars: ✭ 60 (+50%)
TableDisentangler
Functional and structural analysis of tables in research papers (Table disentangling)
Stars: ✭ 21 (-47.5%)
palladian
Palladian is a Java-based toolkit with functionality for text processing, classification, information extraction, and data retrieval from the Web.
Stars: ✭ 32 (-20%)
odinson
Odinson is a powerful and highly optimized open-source framework for rule-based information extraction. Odinson couples a simple yet powerful pattern language, which can operate over multiple representations of text, with a runtime system that operates in near real time.
Stars: ✭ 59 (+47.5%)
TextDatasetCleaner
🔬 Cleaning datasets of junk (normalization, preprocessing)
Stars: ✭ 27 (-32.5%)
teanaps
An open-source Python library for natural language processing and text analysis.
Stars: ✭ 91 (+127.5%)
Cogcomp Nlpy
CogComp's light-weight Python NLP annotators
Stars: ✭ 115 (+187.5%)
Pipeit
PipeIt is a text transformation, conversion, cleansing and extraction tool.
Stars: ✭ 57 (+42.5%)
TabInOut
Framework for information extraction from tables
Stars: ✭ 37 (-7.5%)
Xioc
Extract indicators of compromise from text, including "escaped" ones.
Stars: ✭ 148 (+270%)
Textcluster
Short-text clustering preprocessing module
Stars: ✭ 115 (+187.5%)
Artificial Adversary
🗣️ Tool to generate adversarial text examples and test machine learning models against them
Stars: ✭ 348 (+770%)
estratto
Parsing fixed-width file content made easy
Stars: ✭ 12 (-70%)
frog
Frog is an integration of memory-based natural language processing (NLP) modules developed for Dutch. All NLP modules are based on Timbl, the Tilburg memory-based learning software package.
Stars: ✭ 70 (+75%)
TRUNAJOD2.0
An easy-to-use library to extract indices from texts.
Stars: ✭ 18 (-55%)
neji
Flexible and powerful platform for biomedical information extraction from text
Stars: ✭ 37 (-7.5%)
Text-Analysis
Explaining textual analysis tools in Python, including preprocessing, skip-gram (word2vec), and topic modelling.
Stars: ✭ 48 (+20%)
Chemdataextractor
Automatically extract chemical information from scientific documents
Stars: ✭ 152 (+280%)
syn
syn - the thesaurus
Stars: ✭ 45 (+12.5%)
Emotion-recognition-from-tweets
A comprehensive approach to recognizing emotion (sentiment) in a given tweet. Supervised machine learning.
Stars: ✭ 17 (-57.5%)
learn perl oneliners
Example-based guide for text processing with Perl from the command line
Stars: ✭ 63 (+57.5%)
crminer
⛔ ARCHIVED ⛔ Fetch 'Scholarly' Full Text from 'Crossref'
Stars: ✭ 17 (-57.5%)
neural name tagging
Code for "Reliability-aware Dynamic Feature Composition for Name Tagging" (ACL2019)
Stars: ✭ 39 (-2.5%)
text2video
Text to Video Generation Problem
Stars: ✭ 28 (-30%)
frangipanni
Program to convert lines of text into a tree structure.
Stars: ✭ 1,176 (+2840%)
TypeNet
A hierarchical type system for fine-grained entity typing
Stars: ✭ 51 (+27.5%)
batterydatabase
Tools for auto-generating the battery-materials database.
Stars: ✭ 29 (-27.5%)
iis
Information Inference Service of the OpenAIRE system
Stars: ✭ 16 (-60%)
ReQuest
Indirect Supervision for Relation Extraction Using Question-Answer Pairs (WSDM'18)
Stars: ✭ 26 (-35%)
s3-concat
Concatenate Amazon S3 files remotely using flexible patterns
Stars: ✭ 32 (-20%)
ConTexto
Python library for text mining and NLP
Stars: ✭ 43 (+7.5%)
PubMed-Best-Match
Machine-learning-based pipeline relying on LambdaMART, currently used in PubMed for relevance (Best Match) searches
Stars: ✭ 36 (-10%)
sliceslice-rs
A fast implementation of single-pattern substring search using SIMD acceleration.
Stars: ✭ 66 (+65%)
DocuNet
Code and dataset for the IJCAI 2021 paper "Document-level Relation Extraction as Semantic Segmentation".
Stars: ✭ 84 (+110%)
r4strings
Handling Strings in R
Stars: ✭ 39 (-2.5%)
Answerable
Recommendation system for Stack Overflow unanswered questions
Stars: ✭ 13 (-67.5%)
Search
Blue Brain text mining toolbox for semantic search and structured information extraction
Stars: ✭ 26 (-35%)
CogIE
CogIE: An Information Extraction Toolkit for Bridging Text and CogNet. ACL 2021
Stars: ✭ 47 (+17.5%)
s3-utils
Utilities and tools based around Amazon S3 to provide convenience APIs in a CLI
Stars: ✭ 45 (+12.5%)
naacl2018-fever
Fact Extraction and VERification baseline published in NAACL2018
Stars: ✭ 109 (+172.5%)
lima
The Libre Multilingual Analyzer, a Natural Language Processing (NLP) C++ toolkit.
Stars: ✭ 75 (+87.5%)
readability
Fast readability scores for text data
Stars: ✭ 22 (-45%)
WeTextProcessing
Text Normalization & Inverse Text Normalization
Stars: ✭ 213 (+432.5%)
koshort
(deprecated) 🐱 koshort is a Python package for Korean internet spoken language crawling and processing... or maybe Korean domestic cat.
Stars: ✭ 62 (+55%)
vi-rs
Vietnamese Input Method library
Stars: ✭ 69 (+72.5%)
dif
'dif' is a Linux preprocessing front end to gvimdiff/meld/kompare
Stars: ✭ 18 (-55%)
TVGemist
An *Unofficial* Uitzending Gemist application for TV
Stars: ✭ 23 (-42.5%)
woolly
The Text Mining Elixir
Stars: ✭ 48 (+20%)