This repository provides a baseline model for text classification: an LSTM-based model implemented in PyTorch. To make the model easier to understand, it is applied to a Tweets dataset provided by Kaggle.

Stars: ✭ 45 (+136.84%)

Mutual labels: text-mining

perke

A keyphrase extractor for Persian

Stars: ✭ 60 (+215.79%)

Mutual labels: text-mining

awesome-biomarkers

Curated List of Biomarkers, Blood Tests, and Blood Tracking

Stars: ✭ 214 (+1026.32%)

Mutual labels: biomarkers

neji

Flexible and powerful platform for biomedical information extraction from text

Stars: ✭ 37 (+94.74%)

Mutual labels: text-mining

Blue Brain text mining toolbox for semantic search and structured information extraction

Stars: ✭ 26 (+36.84%)

Mutual labels: text-mining

thrones2vec

Using Word2Vec to explore semantic similarities between the entities of "A Song of Ice and Fire" ("Game of Thrones").

Stars: ✭ 27 (+42.11%)

Mutual labels: text-mining

malay-dataset

Text corpus for Bahasa Malaysia, https://malaya.readthedocs.io/en/latest/Dataset.html

Stars: ✭ 189 (+894.74%)

Mutual labels: text-mining

tf-idf-python

Term frequency–inverse document frequency for Chinese novels/documents, implemented in Python.

Stars: ✭ 98 (+415.79%)

Mutual labels: text-mining

PubMed-Best-Match

Machine-learning based pipeline relying on LambdaMART currently used in PubMed for relevance (Best Match) searches

Stars: ✭ 36 (+89.47%)

Mutual labels: text-mining

learning2hash.github.io

Website for "A survey of learning to hash for Computer Vision" https://learning2hash.github.io

Stars: ✭ 14 (-26.32%)

Mutual labels: text-mining

civic-server

Backend Server for CIViC Project

Stars: ✭ 39 (+105.26%)

Mutual labels: cancer

cometa

Corpus of Online Medical EnTities: the cometA corpus

Stars: ✭ 31 (+63.16%)

Mutual labels: bionlp

TableDisentangler

Functional and structural analysis of tables in research papers (Table disentangling)

Stars: ✭ 21 (+10.53%)

Mutual labels: text-mining

misinfo

📊 Tools to Perform ‘Misinformation’ Analysis on a Text Corpus (wrapper for methods in https://github.com/PDXBek/Misinformation)

Stars: ✭ 17 (-10.53%)

Mutual labels: text-mining

estratto

Parsing the content of fixed-width files made easy

Stars: ✭ 12 (-36.84%)

Mutual labels: text-mining

textreadr

Tools to uniformly read in text data including semi-structured transcripts

Stars: ✭ 65 (+242.11%)

Mutual labels: text-mining

oncoEnrichR

Cancer-dedicated gene set interpretation

Stars: ✭ 35 (+84.21%)

Mutual labels: cancer

cacao

Callable Cancer Loci - assessment of sequencing coverage for actionable and pathogenic loci in cancer

Stars: ✭ 21 (+10.53%)

Mutual labels: cancer

corpusexplorer2.0

Corpus linguistics has never been so easy...

Stars: ✭ 16 (-15.79%)

Mutual labels: text-mining

JoSH

[KDD 2020] Hierarchical Topic Mining via Joint Spherical Tree and Text Embedding

Stars: ✭ 55 (+189.47%)

Mutual labels: text-mining

trafilatura

Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments

Stars: ✭ 711 (+3642.11%)

Mutual labels: text-mining

reader

Distant Reader, a tool for using & understanding a corpus

Stars: ✭ 18 (-5.26%)

Mutual labels: text-mining

ci4cc-informatics-resources

Community-maintained list of resources that the CI4CC organization and the larger cancer informatics community have found useful or are developing.

Stars: ✭ 22 (+15.79%)

Mutual labels: cancer

odinson

Odinson is a powerful and highly optimized open-source framework for rule-based information extraction. Odinson couples a simple, yet powerful pattern language that can operate over multiple representations of text, with a runtime system that operates in near real time.

Stars: ✭ 59 (+210.53%)

Mutual labels: text-mining

BioMedical-NLP-corpus

Biomedical NLP Corpus or Datasets.

Stars: ✭ 44 (+131.58%)

Mutual labels: text-mining

R.TeMiS

R.TeMiS: R Text Mining Solution

Stars: ✭ 21 (+10.53%)

Mutual labels: text-mining

mageri

MAGERI - Assemble, align and call variants for targeted genome re-sequencing with unique molecular identifiers