Datasciencer - a curated list of R tutorials for Data Science, NLP and Machine Learning
Stars: ✭ 1,727 (+828.49%)
Open Semantic Search - Open Source research tool to search, browse, analyze and explore large document collections by Semantic Search Engine and Open Source Text Mining & Text Analytics platform (integrates ETL for document processing, OCR for images & PDF, named entity recognition for persons, organizations & locations, metadata management by thesaurus & ontologies, search user interface & search apps for fulltext search, faceted search & knowledge graph)
Stars: ✭ 386 (+107.53%)
Pyphonetics - A Python 3 phonetics library.
Stars: ✭ 61 (-67.2%)
Nlp - [UNMAINTAINED] Extract values from strings and fill your structs with nlp.
Stars: ✭ 367 (+97.31%)
Pdftools - Text Extraction, Rendering and Converting of PDF Documents
Stars: ✭ 349 (+87.63%)
Graphbrain - Language, Knowledge, Cognition
Stars: ✭ 294 (+58.06%)
Textract - extract text from any document. no muss. no fuss.
Stars: ✭ 3,165 (+1601.61%)
Pipeit - PipeIt is a text transformation, conversion, cleansing and extraction tool.
Stars: ✭ 57 (-69.35%)
Textmining - Python text mining system (Research of Text Mining System)
Stars: ✭ 268 (+44.09%)
Tokenizers - Fast, Consistent Tokenization of Natural Language Text
Stars: ✭ 161 (-13.44%)
Ngram - Fast n-Gram Tokenization
Stars: ✭ 55 (-70.43%)
ocr - Simple app to extract text from pictures using Tesseract
Stars: ✭ 98 (-47.31%)
Scattertext - Beautiful visualizations of how language differs among document types.
Stars: ✭ 1,722 (+825.81%)
snorkeling - Extracting biomedical relationships from literature with Snorkel 🏊
Stars: ✭ 56 (-69.89%)
Tadw - An implementation of "Network Representation Learning with Rich Text Information" (IJCAI '15).
Stars: ✭ 43 (-76.88%)
kwx - BERT, LDA, and TFIDF based keyword extraction in Python
Stars: ✭ 33 (-82.26%)
Chemdataextractor - Automatically extract chemical information from scientific documents
Stars: ✭ 152 (-18.28%)
support-tickets-classification - This case study shows how to create a model for text analysis and classification and deploy it as a web service in the Azure cloud to classify support tickets automatically. The project is a proof of concept made by Microsoft (Commercial Software Engineering team) in collaboration with Endava http://endava.com/en
Stars: ✭ 142 (-23.66%)
Tika Python - Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.
Stars: ✭ 997 (+436.02%)
any-text - Get text content from any file
Stars: ✭ 19 (-89.78%)
Textcluster - Preprocessing module for short-text clustering (Short text cluster)
Stars: ✭ 115 (-38.17%)
ruimtehol - R package to Embed All the Things! using StarSpace
Stars: ✭ 95 (-48.92%)
Tidytext - Text mining using tidy tools ✨📄✨
Stars: ✭ 975 (+424.19%)
aera-workshop - This workshop introduces participants to Learning Analytics (LA) and provides a brief overview of LA methodologies, literature, applications, and ethical issues as they relate to STEM education.
Stars: ✭ 14 (-92.47%)
Nlp profiler - A simple NLP library that allows profiling datasets with one or more text columns. When given a dataset and the name of a column containing text data, NLP Profiler will return either high-level insights or low-level, granular statistical information about the text in that column.
Stars: ✭ 181 (-2.69%)
sensim - Sentence Similarity Estimator (SenSim)
Stars: ✭ 15 (-91.94%)
blueprints-text - Jupyter notebooks for our O'Reilly book "Blueprints for Text Analysis Using Python"
Stars: ✭ 103 (-44.62%)
Learning Social Media Analytics With R - This repository contains code and bonus content, added from time to time, for the book "Learning Social Media Analytics with R" by Packt
Stars: ✭ 102 (-45.16%)
textdigester - TextDigester: document summarization Java library
Stars: ✭ 23 (-87.63%)
Nlppln - NLP pipeline software using Common Workflow Language
Stars: ✭ 31 (-83.33%)
gofastr - Make a DocumentTermMatrix faster
Stars: ✭ 19 (-89.78%)
Xioc - Extract indicators of compromise from text, including "escaped" ones.
Stars: ✭ 148 (-20.43%)
sacred - 📖 Sacred texts in R
Stars: ✭ 19 (-89.78%)
restaurant-finder-featureReviews - Build a Flask web application to help users retrieve key restaurant information and feature-based reviews, generated by applying a market-basket model (the Apriori algorithm) and NLP to user reviews.
Stars: ✭ 21 (-88.71%)
Lda Topic Modeling - A PureScript, browser-based implementation of LDA topic modeling.
Stars: ✭ 91 (-51.08%)
Autophrase - AutoPhrase: Automated Phrase Mining from Massive Text Corpora
Stars: ✭ 835 (+348.92%)
Udpipe - R package for Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing Based on the UDPipe Natural Language Processing Toolkit
Stars: ✭ 160 (-13.98%)
wagtail textract - Text extraction for Wagtail document search
Stars: ✭ 27 (-85.48%)
Rake Nltk - Python implementation of the Rapid Automatic Keyword Extraction algorithm using NLTK.
Stars: ✭ 793 (+326.34%)
learning2hash.github.io - Website for "A survey of learning to hash for Computer Vision" https://learning2hash.github.io
Stars: ✭ 14 (-92.47%)
R Text Data - List of textual data sources to be used for text mining in R
Stars: ✭ 85 (-54.3%)
Image Text Localization Recognition - A general list of resources for image text localization and recognition: a collection of papers and implementations for scene text localization and recognition
Stars: ✭ 788 (+323.66%)
TRUNAJOD2.0 - An easy-to-use library to extract indices from texts.
Stars: ✭ 18 (-90.32%)
Hands On Natural Language Processing With Python - This repository is for my Udemy students. It contains all the lecture code along with the files mentioned for reading. Feel free to clone it, and if you have any problems, just raise a question.
Stars: ✭ 146 (-21.51%)
Nlp Notebooks - A collection of notebooks for Natural Language Processing from NLP Town
Stars: ✭ 513 (+175.81%)
Unidoc - This repository has moved! https://github.com/unidoc/unipdf
Stars: ✭ 694 (+273.12%)
Texthero - Text preprocessing, representation and visualization from zero to hero.
Stars: ✭ 2,407 (+1194.09%)
Multi rake - Multilingual Rapid Automatic Keyword Extraction (RAKE) for Python
Stars: ✭ 162 (-12.9%)
Lambda Text Extractor - AWS Lambda functions to extract text from various binary formats.
Stars: ✭ 159 (-14.52%)
Kate - Code & data accompanying the KDD 2017 paper "KATE: K-Competitive Autoencoder for Text"
Stars: ✭ 135 (-27.42%)
Python nlp tutorial - This repository provides everything you need to get started with Python for Text Mining / Natural Language Processing (NLP)
Stars: ✭ 72 (-61.29%)