⛈️ Rumble 1.11.0 "Banyan Tree"🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more

Stars: ✭ 58 (-83.33%)

Mutual labels: data-science, text

Tsrepr

TSrepr: R package for time series representations

Stars: ✭ 75 (-78.45%)

Mutual labels: data-science, data-mining

Clevercsv

CleverCSV is a Python package for handling messy CSV files. It provides a drop-in replacement for the builtin CSV module with improved dialect detection, and comes with a handy command line application for working with CSV files.

Stars: ✭ 887 (+154.89%)

Mutual labels: data-science, data-mining

Dex

Dex : The Data Explorer -- A data visualization tool written in Java/Groovy/JavaFX capable of powerful ETL and publishing web visualizations.

Stars: ✭ 1,238 (+255.75%)

Mutual labels: data-science, data-mining

Mlxtend

A library of extension and helper modules for Python's data analysis and machine learning libraries.

Stars: ✭ 3,729 (+971.55%)

Mutual labels: data-science, data-mining

Phormatics

Using A.I. and computer vision to build a virtual personal fitness trainer. (Most Startup-Viable Hack - HackNYU2018)

Stars: ✭ 79 (-77.3%)

Mutual labels: data-science, classification

Lda Topic Modeling

A PureScript, browser-based implementation of LDA topic modeling.

Stars: ✭ 91 (-73.85%)

Mutual labels: data-science, text-mining

Vvedenie Mashinnoe Obuchenie

📝 A collection of machine learning resources

Stars: ✭ 1,282 (+268.39%)

Mutual labels: data-science, data-mining

Neuroflow

Artificial Neural Networks for Scala

Stars: ✭ 105 (-69.83%)

Mutual labels: data-science, classification

Dataflowjavasdk

Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.

Stars: ✭ 854 (+145.4%)

Mutual labels: data-science, data-mining

Matrixprofile

A Python 3 library making time series data mining tasks, utilizing matrix profile algorithms, accessible to everyone.

Stars: ✭ 141 (-59.48%)

Mutual labels: data-science, data-mining

Interactive machine learning

IPython widgets, interactive plots, interactive machine learning

Stars: ✭ 140 (-59.77%)

Mutual labels: data-science, classification

Efficient Apriori

An efficient Python implementation of the Apriori algorithm.

Stars: ✭ 145 (-58.33%)

Mutual labels: data-science, data-mining

Accelerator

The Accelerator is a tool for fast and reproducible processing of large amounts of data.

Stars: ✭ 137 (-60.63%)

Mutual labels: data-science, data-mining

Machine Learning With Python

Practice and tutorial-style notebooks covering a wide variety of machine learning techniques

Stars: ✭ 2,197 (+531.32%)

Mutual labels: data-science, classification

Fantasy Basketball

Scraping statistics, predicting NBA player performance with neural networks and boosting algorithms, and optimising lineups for Draft Kings with a genetic algorithm. Capstone Project for Machine Learning Engineer Nanodegree by Udacity.

Stars: ✭ 146 (-58.05%)

Mutual labels: data-science, data-mining

Pycaret

An open-source, low-code machine learning library in Python

Stars: ✭ 4,594 (+1220.11%)

Mutual labels: data-science, classification

Datasciencer

a curated list of R tutorials for Data Science, NLP and Machine Learning

Stars: ✭ 1,727 (+396.26%)

Mutual labels: data-science, text-mining

Chefboost

A Lightweight Decision Tree Framework supporting regular algorithms: ID3, C4.5, CART, CHAID and Regression Trees; some advanced techniques: Gradient Boosting (GBDT, GBRT, GBM), Random Forest and Adaboost, with categorical features support, for Python

Stars: ✭ 176 (-49.43%)

Mutual labels: data-science, data-mining

Data Science Resources

👨🏽‍🏫You can learn about what data science is and why it's important in today's modern world. Are you interested in data science?🔋

Stars: ✭ 171 (-50.86%)

Mutual labels: data-science, data-mining

Dataaspirant codes

Complete machine learning model codes

Stars: ✭ 185 (-46.84%)

Mutual labels: data-science, data-mining

Lightautoml

LAMA - automatic model creation framework

Stars: ✭ 196 (-43.68%)

Mutual labels: data-science, classification

Instascrape

Powerful and flexible Instagram scraping library for Python, providing easy-to-use and expressive tools for accessing data programmatically

Stars: ✭ 202 (-41.95%)

Mutual labels: data-science, data-mining

Tweetfeels

Real-time sentiment analysis in Python using Twitter's streaming API

Stars: ✭ 249 (-28.45%)

Mutual labels: data-science, data-mining

Data Mining Conferences

Ranking, acceptance rate, deadline, and publication tips

Stars: ✭ 236 (-32.18%)

Mutual labels: data-science, data-mining

Datascience

Curated list of Python resources for data science.

Stars: ✭ 3,051 (+776.72%)

Mutual labels: data-science, data-mining

Deepgraph

Analyze Data with Pandas-based Networks.

Stars: ✭ 232 (-33.33%)

Mutual labels: data-science, data-mining

Awesome Datascience

📝 An awesome Data Science repository to learn and apply for real world problems.

Stars: ✭ 17,520 (+4934.48%)

Mutual labels: data-science, data-mining

Pm4py Core

Public repository for the PM4Py (Process Mining for Python) project.

Stars: ✭ 313 (-10.06%)

Mutual labels: data-science, data-mining

clustext

Easy, fast clustering of texts

Stars: ✭ 18 (-94.83%)

Mutual labels: text-mining, text-classification

TextClassification

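Text classification of Sina News articles implemented with scikit-learn. The dataset contains 1 million documents in 10 categories, split 1:1 between test and training sets. The classifiers are SVM and Bayes, with Bayes serving as the baseline.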

Stars: ✭ 86 (-75.29%)

Mutual labels: data-mining, text-classification

Graph Fraud Detection Papers

A curated list of fraud detection papers using graph information or graph neural networks

Stars: ✭ 339 (-2.59%)

Mutual labels: data-science, data-mining

estratto

parsing fixed width files content made easy

Stars: ✭ 12 (-96.55%)

Mutual labels: text-mining, text-processing

iis

Information Inference Service of the OpenAIRE system

Stars: ✭ 16 (-95.4%)

Mutual labels: text-mining, data-mining

woolly

The Text Mining Elixir

Stars: ✭ 48 (-86.21%)

Mutual labels: text-mining, text-analysis

knime-textprocessing

KNIME - Text Processing Extension (Labs)

Stars: ✭ 17 (-95.11%)

Mutual labels: text-analysis, text-processing

text-classification-baseline

Pipeline for fast building text classification TF-IDF + LogReg baselines.

Stars: ✭ 55 (-84.2%)

Mutual labels: text-classification, text

deduce

Deduce: de-identification method for Dutch medical text

Stars: ✭ 40 (-88.51%)

Mutual labels: text-mining, text-processing

classy

Super simple text classifier using Naive Bayes. Plug-and-play, no dependencies

Stars: ✭ 12 (-96.55%)

Mutual labels: text, classification

Relation-Classification

Relation Classification - SEMEVAL 2010 task 8 dataset

Stars: ✭ 46 (-86.78%)

Mutual labels: text-classification, classification

textlearnR

A simple collection of well working NLP models (Keras, H2O, StarSpace) tuned and benchmarked on a variety of datasets.

Stars: ✭ 16 (-95.4%)

Mutual labels: text-mining, classification

Statistical Learning

Lecture Slides and R Sessions for Trevor Hastie and Rob Tibshirani's "Statistical Learning" Stanford course

Stars: ✭ 223 (-35.92%)

Mutual labels: data-science, data-mining

tf-idf-python

Term frequency–inverse document frequency for Chinese novel/documents implemented in python.

Stars: ✭ 98 (-71.84%)

Mutual labels: text-mining, data-mining

awesome-text-classification

Text classification meets word embeddings.

Stars: ✭ 27 (-92.24%)

Mutual labels: text-classification, classification

COVID-19-Tweet-Classification-using-Roberta-and-Bert-Simple-Transformers

Rank 1 / 216

Stars: ✭ 24 (-93.1%)

Mutual labels: text-classification, classification

Loan-Approval-Prediction

Loan Application Data Analysis

Stars: ✭ 61 (-82.47%)

Mutual labels: data-mining, classification

FNet-pytorch

Unofficial implementation of Google's FNet: Mixing Tokens with Fourier Transforms

Stars: ✭ 204 (-41.38%)

Mutual labels: text-classification, text

lda2vec

Mixing Dirichlet Topic Models and Word Embeddings to Make lda2vec from this paper https://arxiv.org/abs/1605.02019

Stars: ✭ 27 (-92.24%)

Mutual labels: text-mining, text

TextDatasetCleaner

🔬 Cleaning junk out of datasets (normalization, preprocessing)

Stars: ✭ 27 (-92.24%)

Mutual labels: text-mining, text-processing

nlpbuddy

A text analysis application for performing common NLP tasks through a web dashboard interface and an API

Stars: ✭ 115 (-66.95%)

Mutual labels: text-classification, text-analysis

nlp classification

Implementing NLP papers relevant to classification with PyTorch, gluonnlp

Stars: ✭ 224 (-35.63%)

Mutual labels: text-classification, classification

stringx

Drop-in replacements for base R string functions powered by stringi

Stars: ✭ 14 (-95.98%)

Mutual labels: text, text-processing

Kaggle-project-list

Summary of my projects on kaggle

Stars: ✭ 20 (-94.25%)

Mutual labels: data-mining, text-classification

converse

Conversational text Analysis using various NLP techniques

Stars: ✭ 147 (-57.76%)

Mutual labels: text-mining, text

SparseLSH

A Locality Sensitive Hashing (LSH) library with an emphasis on large, highly-dimensional datasets.

Stars: ✭ 127 (-63.51%)

Mutual labels: text-mining, data-mining

61-120 of 2695 similar projects
