1425 open source projects that are alternatives to or similar to Benchm Ml

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.

Stars: ✭ 5,656 (+208.23%)

Mutual labels: data-science, spark, random-forest, h2o

Rsparkling

RSparkling: Use H2O Sparkling Water from R (Spark + R + Machine Learning)

Stars: ✭ 65 (-96.46%)

Mutual labels: data-science, spark, h2o

Awesome Gradient Boosting Papers

A curated list of gradient boosting research papers with implementations.

Stars: ✭ 704 (-61.63%)

Mutual labels: xgboost, random-forest, h2o

Mljar Supervised

Automated Machine Learning Pipeline with Feature Engineering and Hyper-Parameters Tuning 🚀

Stars: ✭ 961 (-47.63%)

Mutual labels: data-science, xgboost, random-forest

Awesome Decision Tree Papers

A collection of research papers on decision, classification and regression trees with implementations.

Stars: ✭ 1,908 (+3.98%)

Mutual labels: xgboost, random-forest, gradient-boosting-machine

Mli Resources

H2O.ai Machine Learning Interpretability Resources

Stars: ✭ 428 (-76.68%)

Mutual labels: data-science, xgboost, h2o

Tpot

A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.

Stars: ✭ 8,378 (+356.57%)

Mutual labels: data-science, xgboost, random-forest

decision-trees-for-ml

Building Decision Trees From Scratch In Python

Stars: ✭ 61 (-96.68%)

Mutual labels: random-forest, xgboost, gradient-boosting-machine

Machine Learning In R

Workshop (6 hours): preprocessing, cross-validation, lasso, decision trees, random forest, xgboost, superlearner ensembles

Stars: ✭ 144 (-92.15%)

Mutual labels: xgboost, random-forest

Featran

A Scala feature transformation library for data science and machine learning

Stars: ✭ 420 (-77.11%)

Mutual labels: spark, xgboost

Data Science Ipython Notebooks

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

Stars: ✭ 22,048 (+1101.53%)

Mutual labels: data-science, spark

Sk Dist

Distributed scikit-learn meta-estimators in PySpark

Stars: ✭ 260 (-85.83%)

Mutual labels: data-science, spark

My Data Competition Experience

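A summary of my experience from multiple Top-5 finishes in machine learning and big data competitions, packed with practical tips. Help yourself.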

Stars: ✭ 271 (-85.23%)

Mutual labels: data-science, xgboost

Datacompy

Pandas and Spark DataFrame comparison for humans

Stars: ✭ 147 (-91.99%)

Mutual labels: data-science, spark

Spark Notebook

Interactive and Reactive Data Science using Scala and Spark.

Stars: ✭ 3,081 (+67.9%)

Mutual labels: data-science, spark

Sparkling Water

Sparkling Water provides H2O functionality inside a Spark cluster

Stars: ✭ 887 (-51.66%)

Mutual labels: spark, h2o

Hyperparameter hunter

Easy hyperparameter optimization and automatic result saving across machine learning algorithms and libraries

Stars: ✭ 648 (-64.69%)

Mutual labels: data-science, xgboost

Text Classification Benchmark

Text classification benchmark

Stars: ✭ 18 (-99.02%)

Mutual labels: xgboost, random-forest

Vds

Verteego Data Suite

Stars: ✭ 9 (-99.51%)

Mutual labels: data-science, h2o

Interpretable machine learning with python

Examples of techniques for training interpretable ML models, explaining ML models, and debugging ML models for accuracy, discrimination, and security.

Stars: ✭ 530 (-71.12%)

Mutual labels: data-science, h2o

Tiledb Vcf

Efficient variant-call data storage and retrieval library using the TileDB storage library.

Stars: ✭ 26 (-98.58%)

Mutual labels: data-science, spark

Open Solution Value Prediction

Open solution to the Santander Value Prediction Challenge 🐠

Stars: ✭ 34 (-98.15%)

Mutual labels: data-science, xgboost

Optimus

🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark

Stars: ✭ 986 (-46.27%)

Mutual labels: data-science, spark

Data Science Cookbook

🎓 Jupyter notebooks from UFC data science course

Stars: ✭ 60 (-96.73%)

Mutual labels: data-science, spark

Roffildlibrary

Library for MQL5 (MetaTrader) with Python, Java, Apache Spark, AWS

Stars: ✭ 63 (-96.57%)

Mutual labels: spark, random-forest

Data science blogs

A repository to keep track of all the code that I end up writing for my blog posts.

Stars: ✭ 139 (-92.43%)

Mutual labels: spark, xgboost

MLDay18

Material from "Random Forests and Gradient Boosting Machines in R" presented at Machine Learning Day '18

Stars: ✭ 15 (-99.18%)

Mutual labels: random-forest, gradient-boosting-machine

aws-machine-learning-university-dte

Machine Learning University: Decision Trees and Ensemble Methods

Stars: ✭ 119 (-93.51%)

Mutual labels: random-forest, xgboost

Awesome H2o

A curated list of research, applications and projects built using the H2O Machine Learning platform

Stars: ✭ 293 (-84.03%)

Mutual labels: data-science, h2o

handson-ml

Jupyter notebooks containing the examples and exercises from the book "Hands-On Machine Learning".

Stars: ✭ 285 (-84.47%)

Mutual labels: random-forest, xgboost

Agile data code 2

Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition

Stars: ✭ 413 (-77.49%)

Mutual labels: data-science, spark

User Machine Learning Tutorial

useR! 2016 Tutorial: Machine Learning Algorithmic Deep Dive http://user2016.org/tutorials/10.html

Stars: ✭ 393 (-78.58%)

Mutual labels: data-science, random-forest

Predicting real estate prices using scikit Learn

Predicting Amsterdam house / real estate prices using Ordinary Least Squares-, XGBoost-, KNN-, Lasso-, Ridge-, Polynomial-, Random Forest-, and Neural Network MLP Regression (via scikit-learn)

Stars: ✭ 78 (-95.75%)

Mutual labels: xgboost, random-forest

telco-customer-churn-in-r-and-h2o

Showcase for using H2O and R for churn prediction (inspired by ZhouFang928 examples)

Stars: ✭ 59 (-96.78%)

Mutual labels: h2o, gradient-boosting-machine

Pyspark Example Project

Example project implementing best practices for PySpark ETL jobs and applications.

Stars: ✭ 633 (-65.5%)

Mutual labels: data-science, spark

Sci Pype

A Machine Learning API with native redis caching and export + import using S3. Analyze entire datasets using an API for building, training, testing, analyzing, extracting, importing, and archiving. This repository can run from a docker container or from the repository.

Stars: ✭ 90 (-95.1%)

Mutual labels: data-science, xgboost

Setl

A simple Spark-powered ETL framework that just works 🍺

Stars: ✭ 79 (-95.69%)

Mutual labels: data-science, spark

H2o Tutorials

Tutorials and training material for the H2O Machine Learning Platform

Stars: ✭ 1,305 (-28.88%)

Mutual labels: data-science, h2o

Data Science Competitions

The goal of this repo is to provide solutions to all data science competitions (Kaggle, Data Hack, Machine Hack, Driven Data, etc.).

Stars: ✭ 572 (-68.83%)

Mutual labels: data-science, xgboost

Awesome Fraud Detection Papers

A curated list of data mining papers about fraud detection.

Stars: ✭ 843 (-54.06%)

Mutual labels: data-science, random-forest

Benchmarks

Comparison tools

Stars: ✭ 139 (-92.43%)

Mutual labels: xgboost, h2o

Github-Stars-Predictor

A GitHub repo star predictor that tries to predict the star count of any GitHub repository with more than 100 stars.

Stars: ✭ 34 (-98.15%)

Mutual labels: random-forest, xgboost

Rumble

⛈️ Rumble 1.11.0 "Banyan Tree"🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more

Stars: ✭ 58 (-96.84%)

Mutual labels: data-science, spark

Pulsar Spark

When Apache Pulsar meets Apache Spark

Stars: ✭ 55 (-97%)

Mutual labels: data-science, spark

W2v

Word2Vec models with Twitter data using Spark. Blog:

Stars: ✭ 64 (-96.51%)

Mutual labels: data-science, spark

25daysinmachinelearning

I will update this repository with statistics content and materials for learning machine learning with Python