Guten-gutter
Strips boilerplate from Project Gutenberg text files
Stars: ✭ 16 (-71.93%)
PubMed-Best-Match
Machine-learning-based pipeline relying on LambdaMART, currently used in PubMed for relevance (Best Match) searches
Stars: ✭ 36 (-36.84%)
misinfo
📊 Tools to Perform ‘Misinformation’ Analysis on a Text Corpus (wrapper for methods in https://github.com/PDXBek/Misinformation)
Stars: ✭ 17 (-70.18%)
teanaps
An open-source Python library for natural language processing and text analysis.
Stars: ✭ 91 (+59.65%)
textdigester
TextDigester: document summarization Java library
Stars: ✭ 23 (-59.65%)
reader
Distant Reader, a tool for using & understanding a corpus
Stars: ✭ 18 (-68.42%)
gstate
A crazy state management for lazy programmers
Stars: ✭ 27 (-52.63%)
TextDatasetCleaner
🔬 Cleaning datasets of junk (normalization, preprocessing)
Stars: ✭ 27 (-52.63%)
estratto
Parsing fixed-width file content made easy
Stars: ✭ 12 (-78.95%)
graph datasets
A Repository of Benchmark Graph Datasets for Graph Classification (31 Graph Datasets In Total).
Stars: ✭ 227 (+298.25%)
woolly
The Text Mining Elixir
Stars: ✭ 48 (-15.79%)
seabolt
Neo4j Bolt Connector for C
Stars: ✭ 37 (-35.09%)
Cayley.Net
.NET client for Cayley, an open-source graph database
Stars: ✭ 14 (-75.44%)
neji
Flexible and powerful platform for biomedical information extraction from text
Stars: ✭ 37 (-35.09%)
trafilatura
Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments
Stars: ✭ 711 (+1147.37%)
ipo-miner
IPO Investment via Text Mining.
Stars: ✭ 20 (-64.91%)
angular-neo4j
Neo4j Bolt driver wrapper for Angular
Stars: ✭ 18 (-68.42%)
simplegraphdb
Basic Golang implementation of a Triple Store. Built to learn the Golang language before an internship.
Stars: ✭ 17 (-70.18%)
dgraph
Dgraph Dart client which communicates with the server using gRPC.
Stars: ✭ 27 (-52.63%)
word2vec-pt-br
Implementation and model generated by training (trigram) on the pt-br Wikipedia
Stars: ✭ 34 (-40.35%)
Text-Classification-LSTMs-PyTorch
The aim of this repository is to show a baseline model for text classification by implementing an LSTM-based model in PyTorch. To make the model easier to understand, it uses a Tweets dataset provided by Kaggle.
Stars: ✭ 45 (-21.05%)
liquigraph
Migrations for Neo4j
Stars: ✭ 122 (+114.04%)
perke
A keyphrase extractor for Persian
Stars: ✭ 60 (+5.26%)
named-entity-recognition
Notebooks for teaching Named Entity Recognition at the Cultural Heritage Data School, run by Cambridge Digital Humanities
Stars: ✭ 18 (-68.42%)
palladian
Palladian is a Java-based toolkit with functionality for text processing, classification, information extraction, and data retrieval from the Web.
Stars: ✭ 32 (-43.86%)
textlearnR
A simple collection of well-working NLP models (Keras, H2O, StarSpace) tuned and benchmarked on a variety of datasets.
Stars: ✭ 16 (-71.93%)
Answerable
Recommendation system for Stack Overflow unanswered questions
Stars: ✭ 13 (-77.19%)
GraphDBLP
A graph-based instance of DBLP
Stars: ✭ 33 (-42.11%)
database-journal
Databases: Concepts, commands, codes, interview questions and more...
Stars: ✭ 50 (-12.28%)
extractnet
A Dragnet that also extracts author, headline, date, keywords from context
Stars: ✭ 52 (-8.77%)
readability
Fast readability scores for text data
Stars: ✭ 22 (-61.4%)
gofastr
Make a DocumentTermMatrix faster
Stars: ✭ 19 (-66.67%)
text-analysis
Weaving analytical stories from text data
Stars: ✭ 12 (-78.95%)
lda2vec
Mixing Dirichlet Topic Models and Word Embeddings to Make lda2vec from this paper https://arxiv.org/abs/1605.02019
Stars: ✭ 27 (-52.63%)
mizo
Super-fast Spark RDD for Titan Graph Database on HBase
Stars: ✭ 24 (-57.89%)
Cypher.js
Cypher graph database for JavaScript
Stars: ✭ 30 (-47.37%)
blueprints-text
Jupyter notebooks for our O'Reilly book "Blueprints for Text Analysis Using Python"
Stars: ✭ 103 (+80.7%)
Gwu data mining
Materials for GWU DNSC 6279 and DNSC 6290.
Stars: ✭ 217 (+280.7%)
Qminer
Analytic platform for real-time large-scale streams containing structured and unstructured data.
Stars: ✭ 206 (+261.4%)
Adjutant
Runs a PubMed query, returns results, and allows the user to explore the high-level structure of the returned documents
Stars: ✭ 59 (+3.51%)
Hdltex
HDLTex: Hierarchical Deep Learning for Text Classification
Stars: ✭ 191 (+235.09%)
SparseLSH
A Locality Sensitive Hashing (LSH) library with an emphasis on large, high-dimensional datasets.
Stars: ✭ 127 (+122.81%)
deduce
Deduce: de-identification method for Dutch medical text
Stars: ✭ 40 (-29.82%)
aera-workshop
This workshop introduces participants to Learning Analytics (LA) and provides a brief overview of LA methodologies, literature, applications, and ethical issues as they relate to STEM education.
Stars: ✭ 14 (-75.44%)
sensim
Sentence Similarity Estimator (SenSim)
Stars: ✭ 15 (-73.68%)
sacred
📖 Sacred texts in R
Stars: ✭ 19 (-66.67%)
TRUNAJOD2.0
An easy-to-use library to extract indices from texts.
Stars: ✭ 18 (-68.42%)
Search
Blue Brain text mining toolbox for semantic search and structured information extraction
Stars: ✭ 26 (-54.39%)