A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.

Stars: ✭ 13,293 (+32321.95%)

Mutual labels: data-mining

Spypi

An (un-)ethical hacking-station based on Raspberry Pi and Python

Stars: ✭ 167 (+307.32%)

Mutual labels: data-mining

Emuto

manipulate JSON files

Stars: ✭ 180 (+339.02%)

Mutual labels: data-mining

Chirp

Interface to manage and centralize Google Alert information

Stars: ✭ 227 (+453.66%)

Mutual labels: data-mining

Python practice of data analysis and mining

Source code and data accompanying the book "Python Data Analysis and Mining in Action" (《Python数据分析与挖掘实战》)

Stars: ✭ 172 (+319.51%)

Mutual labels: data-mining

Tweetfeels

Real-time sentiment analysis in Python using Twitter's streaming API

Stars: ✭ 249 (+507.32%)

Mutual labels: data-mining

Welly

Well handling

Stars: ✭ 168 (+309.76%)

Mutual labels: data-mining

Prefixspan Py

The shortest yet efficient Python implementation of the sequential pattern mining algorithm PrefixSpan, closed sequential pattern mining algorithm BIDE, and generator sequential pattern mining algorithm FEAT.

Stars: ✭ 214 (+421.95%)

Mutual labels: data-mining

kenchi

A scikit-learn compatible library for anomaly detection

Stars: ✭ 36 (-12.2%)

Mutual labels: data-mining

Pzad

Course "Applied Problems in Data Analysis" (CMC Faculty, Lomonosov Moscow State University)

Stars: ✭ 160 (+290.24%)

Mutual labels: data-mining

Suod

(MLSys '21) An Acceleration System for Large-scale Unsupervised Heterogeneous Outlier Detection (Anomaly Detection)

Stars: ✭ 245 (+497.56%)

Mutual labels: data-mining

Zhihu Analysis Python

Social Network Analysis of Zhihu with Python

Stars: ✭ 215 (+424.39%)

Mutual labels: data-mining

Etl unicorn

Data visualization, data mining, and data processing (ETL)

Stars: ✭ 156 (+280.49%)

Mutual labels: data-mining

Xioc

Extract indicators of compromise from text, including "escaped" ones.

Stars: ✭ 148 (+260.98%)

Mutual labels: data-mining

Smartproxy

HTTP(S) Rotating Residential proxies - Code examples & General information

Stars: ✭ 205 (+400%)

Mutual labels: data-mining

Rosie Pattern Language

Rosie Pattern Language (RPL) and the Rosie Pattern Engine have MOVED!

Stars: ✭ 146 (+256.1%)

Mutual labels: data-mining

Python Machine Learning Book

The "Python Machine Learning (1st edition)" book code repository and info resource

Stars: ✭ 11,428 (+27773.17%)

Mutual labels: data-mining

Dataaspirant codes

Complete machine learning model codes

Stars: ✭ 185 (+351.22%)

Mutual labels: data-mining

Deepgraph

Analyze Data with Pandas-based Networks.

Stars: ✭ 232 (+465.85%)

Mutual labels: data-mining

Awesome Machine Learning Interpretability

A curated list of awesome machine learning interpretability resources.

Stars: ✭ 2,404 (+5763.41%)

Mutual labels: data-mining

Orange3

🍊 📊 💡 Orange: Interactive data analysis

Stars: ✭ 3,152 (+7587.8%)

Mutual labels: data-mining

Chefboost

A Lightweight Decision Tree Framework for Python supporting regular algorithms: ID3, C4.5, CART, CHAID and Regression Trees; some advanced techniques: Gradient Boosting (GBDT, GBRT, GBM), Random Forest and AdaBoost; with support for categorical features

Stars: ✭ 176 (+329.27%)

Mutual labels: data-mining

Automlpipeline.jl

A package that makes it trivial to create and evaluate machine learning pipeline architectures.

Stars: ✭ 223 (+443.9%)

Mutual labels: data-mining

Data Science Resources

👨🏽‍🏫You can learn about what data science is and why it's important in today's modern world. Are you interested in data science?🔋

Stars: ✭ 171 (+317.07%)

Mutual labels: data-mining

quick-adc

Quick ADC

Stars: ✭ 20 (-51.22%)

Mutual labels: nearest-neighbor-search

Data Science Toolkit

Collection of stats, modeling, and data science tools in Python and R.

Stars: ✭ 169 (+312.2%)

Mutual labels: data-mining

Amazing Feature Engineering

Feature engineering is the process of using domain knowledge to extract features from raw data via data mining techniques. These features can be used to improve the performance of machine learning algorithms. Feature engineering can be considered as applied machine learning itself.

Stars: ✭ 218 (+431.71%)

Mutual labels: data-mining

Pipeline

the `pipeline` shell command

Stars: ✭ 168 (+309.76%)

Mutual labels: data-mining

Python Projects

some python projects

Stars: ✭ 247 (+502.44%)

Mutual labels: data-mining

Pdftabextract

A set of tools for extracting tables from PDF files helping to do data mining on (OCR-processed) scanned documents.

Stars: ✭ 1,969 (+4702.44%)

Mutual labels: data-mining

Gwu data mining

Materials for GWU DNSC 6279 and DNSC 6290.

Stars: ✭ 217 (+429.27%)

Mutual labels: data-mining

Gensim

Topic Modelling for Humans

Stars: ✭ 12,763 (+31029.27%)

Mutual labels: data-mining

software-analytics

A repository with my data analysis results of software artifacts

Stars: ✭ 37 (-9.76%)

Mutual labels: data-mining

Sourced Ce

source{d} Community Edition (CE)

Stars: ✭ 153 (+273.17%)

Mutual labels: data-mining

Qminer

Analytic platform for real-time large-scale streams containing structured and unstructured data.

Stars: ✭ 206 (+402.44%)

Mutual labels: data-mining

Alimusic

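🎼 Tianchi Ali Music popularity trend prediction competition. The project covers all the core code from the preliminary round through the second round. The aggregated data for the second round can be downloaded from Baidu Netdisk; for a more detailed walkthrough of the approach, please visit my blog.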

Stars: ✭ 147 (+258.54%)

Mutual labels: data-mining

Reaper

Social media scraping / data collection tool for the Facebook, Twitter, Reddit, YouTube, Pinterest, and Tumblr APIs

Stars: ✭ 240 (+485.37%)

Mutual labels: data-mining

Fantasy Basketball

Scraping statistics, predicting NBA player performance with neural networks and boosting algorithms, and optimising lineups for DraftKings with a genetic algorithm. Capstone Project for Machine Learning Engineer Nanodegree by Udacity.

Stars: ✭ 146 (+256.1%)

Mutual labels: data-mining

Estadistica Con R

Personal notes on statistics, machine learning, and the R programming language

Stars: ✭ 201 (+390.24%)

Mutual labels: data-mining

Awesome Datascience

📝 An awesome Data Science repository to learn and apply for real world problems.

Stars: ✭ 17,520 (+42631.71%)

Mutual labels: data-mining

Twitterdatamining

Twitter data mining and visualization