Text tone analyzer
A system that analyzes the sentiment of texts and utterances.
Stars: ✭ 15 (-58.33%)
GrammarEngine
Grammatical Dictionary of the Russian Language (+ English, Japanese, etc.)
Stars: ✭ 68 (+88.89%)
thrones2vec
Using Word2Vec to explore semantic similarities between the entities of "A Song of Ice and Fire" ("Game of Thrones").
Stars: ✭ 27 (-25%)
extractnet
A Dragnet that also extracts author, headline, date, and keywords from context
Stars: ✭ 52 (+44.44%)
TableDisentangler
Functional and structural analysis of tables in research papers (Table disentangling)
Stars: ✭ 21 (-41.67%)
R.TeMiS
R.TeMiS: R Text Mining Solution
Stars: ✭ 21 (-41.67%)
TextDatasetCleaner
🔬 Cleaning datasets of junk (normalization, preprocessing)
Stars: ✭ 27 (-25%)
PubMed-Best-Match
Machine-learning based pipeline relying on LambdaMART currently used in PubMed for relevance (Best Match) searches
Stars: ✭ 36 (+0%)
reader
Distant Reader, a tool for using & understanding a corpus
Stars: ✭ 18 (-50%)
textlearnR
A simple collection of well-working NLP models (Keras, H2O, StarSpace) tuned and benchmarked on a variety of datasets.
Stars: ✭ 16 (-55.56%)
learning2hash.github.io
Website for "A survey of learning to hash for Computer Vision" https://learning2hash.github.io
Stars: ✭ 14 (-61.11%)
Guten-gutter
Strips boilerplate from Project Gutenberg text files
Stars: ✭ 16 (-55.56%)
TRUNAJOD2.0
An easy-to-use library to extract indices from texts.
Stars: ✭ 18 (-50%)
deduce
Deduce: de-identification method for Dutch medical text
Stars: ✭ 40 (+11.11%)
gofastr
Make a DocumentTermMatrix faster
Stars: ✭ 19 (-47.22%)
nlp-cheat-sheet-python
NLP Cheat Sheet, Python, spacy, LexNLP, NLTK, tokenization, stemming, sentence detection, named entity recognition
Stars: ✭ 69 (+91.67%)
misinfo
📊 Tools to Perform ‘Misinformation’ Analysis on a Text Corpus (wrapper for methods in https://github.com/PDXBek/Misinformation)
Stars: ✭ 17 (-52.78%)
teanaps
An open-source Python library for natural language processing and text analysis.
Stars: ✭ 91 (+152.78%)
TweebankNLP
[LREC 2022] An off-the-shelf pre-trained Tweet NLP Toolkit (NER, tokenization, lemmatization, POS tagging, dependency parsing) + Tweebank-NER dataset
Stars: ✭ 84 (+133.33%)
neji
Flexible and powerful platform for biomedical information extraction from text
Stars: ✭ 37 (+2.78%)
alix
A Lucene Indexer for XML, with lexical analysis (lemmatization for French)
Stars: ✭ 15 (-58.33%)
sentometrics
An integrated framework in R for textual sentiment time series aggregation and prediction
Stars: ✭ 77 (+113.89%)
lda2vec
Mixing Dirichlet Topic Models and Word Embeddings to Make lda2vec from this paper https://arxiv.org/abs/1605.02019
Stars: ✭ 27 (-25%)
textreadr
Tools to uniformly read in text data including semi-structured transcripts
Stars: ✭ 65 (+80.56%)
simplemma
Simple multilingual lemmatizer for Python, especially useful for speed and efficiency
Stars: ✭ 32 (-11.11%)
JoSH
[KDD 2020] Hierarchical Topic Mining via Joint Spherical Tree and Text Embedding
Stars: ✭ 55 (+52.78%)
Adjutant
Runs a PubMed query, returns results, and allows the user to explore the high-level structure of the returned documents
Stars: ✭ 59 (+63.89%)
odinson
Odinson is a powerful and highly optimized open-source framework for rule-based information extraction. Odinson couples a simple, yet powerful pattern language that can operate over multiple representations of text, with a runtime system that operates in near real time.
Stars: ✭ 59 (+63.89%)
ipo-miner
IPO Investment via Text Mining.
Stars: ✭ 20 (-44.44%)
beagle
Beagle helps you identify keywords, phrases, regexes, and complex search queries of interest in streams of text documents.
Stars: ✭ 46 (+27.78%)
lemma
A Morphological Parser (Analyser) / Lemmatizer written in Elixir.
Stars: ✭ 45 (+25%)
libmorph
libmorph rus/ukr - fast & accurate morphological analyzer/analyses for Russian and Ukrainian
Stars: ✭ 16 (-55.56%)
restaurant-finder-featureReviews
Build a Flask web application to help users retrieve key restaurant information and feature-based reviews (generated by applying market-basket model – Apriori algorithm and NLP on user reviews).
Stars: ✭ 21 (-41.67%)
ling
Natural Language Processing Toolkit in Golang
Stars: ✭ 57 (+58.33%)
Search
Blue Brain text mining toolbox for semantic search and structured information extraction
Stars: ✭ 26 (-27.78%)
converse
Conversational text Analysis using various NLP techniques
Stars: ✭ 147 (+308.33%)
malay-dataset
Text corpus for Bahasa Malaysia, https://malaya.readthedocs.io/en/latest/Dataset.html
Stars: ✭ 189 (+425%)
Emotion-recognition-from-tweets
A comprehensive approach to recognizing emotion (sentiment) in a given tweet. Supervised machine learning.
Stars: ✭ 17 (-52.78%)
VERSE
Vancouver Event and Relation System for Extraction
Stars: ✭ 13 (-63.89%)
iis
Information Inference Service of the OpenAIRE system
Stars: ✭ 16 (-55.56%)
SparseLSH
A Locality Sensitive Hashing (LSH) library with an emphasis on large, high-dimensional datasets.
Stars: ✭ 127 (+252.78%)
estratto
Parsing fixed-width file content made easy
Stars: ✭ 12 (-66.67%)
TabInOut
Framework for information extraction from tables
Stars: ✭ 37 (+2.78%)
woolly
The Text Mining Elixir
Stars: ✭ 48 (+33.33%)
blueprints-text
Jupyter notebooks for our O'Reilly book "Blueprints for Text Analysis Using Python"
Stars: ✭ 103 (+186.11%)
textdigester
TextDigester: document summarization Java library
Stars: ✭ 23 (-36.11%)
sacred
📖 Sacred texts in R
Stars: ✭ 19 (-47.22%)
civicmine
Text mining cancer biomarkers for the CIVIC database
Stars: ✭ 19 (-47.22%)
tf-idf-python
Term frequency–inverse document frequency for Chinese novels/documents, implemented in Python.
Stars: ✭ 98 (+172.22%)