H2o 3H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (+208.23%)
RsparklingRSparkling: Use H2O Sparkling Water from R (Spark + R + Machine Learning)
Stars: ✭ 65 (-96.46%)
Mljar SupervisedAutomated Machine Learning Pipeline with Feature Engineering and Hyper-Parameters Tuning 🚀
Stars: ✭ 961 (-47.63%)
Awesome Decision Tree PapersA collection of research papers on decision, classification and regression trees with implementations.
Stars: ✭ 1,908 (+3.98%)
Mli ResourcesH2O.ai Machine Learning Interpretability Resources
Stars: ✭ 428 (-76.68%)
TpotA Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.
Stars: ✭ 8,378 (+356.57%)
Machine Learning In RWorkshop (6 hours): preprocessing, cross-validation, lasso, decision trees, random forest, xgboost, superlearner ensembles
Stars: ✭ 144 (-92.15%)
FeatranA Scala feature transformation library for data science and machine learning
Stars: ✭ 420 (-77.11%)
Data Science Ipython NotebooksData science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Stars: ✭ 22,048 (+1101.53%)
Sk DistDistributed scikit-learn meta-estimators in PySpark
Stars: ✭ 260 (-85.83%)
DatacompyPandas and Spark DataFrame comparison for humans
Stars: ✭ 147 (-91.99%)
Spark NotebookInteractive and Reactive Data Science using Scala and Spark.
Stars: ✭ 3,081 (+67.9%)
Sparkling WaterSparkling Water provides H2O functionality inside Spark cluster
Stars: ✭ 887 (-51.66%)
Hyperparameter hunterEasy hyperparameter optimization and automatic result saving across machine learning algorithms and libraries
Stars: ✭ 648 (-64.69%)
VdsVerteego Data Suite
Stars: ✭ 9 (-99.51%)
Interpretable machine learning with pythonExamples of techniques for training interpretable ML models, explaining ML models, and debugging ML models for accuracy, discrimination, and security.
Stars: ✭ 530 (-71.12%)
Tiledb VcfEfficient variant-call data storage and retrieval library using the TileDB storage library.
Stars: ✭ 26 (-98.58%)
Optimus🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (-46.27%)
RoffildlibraryLibrary for MQL5 (MetaTrader) with Python, Java, Apache Spark, AWS
Stars: ✭ 63 (-96.57%)
Data science blogsA repository to keep track of all the code that I end up writing for my blog posts.
Stars: ✭ 139 (-92.43%)
MLDay18Material from "Random Forests and Gradient Boosting Machines in R" presented at Machine Learning Day '18
Stars: ✭ 15 (-99.18%)
Awesome H2oA curated list of research, applications and projects built using the H2O Machine Learning platform
Stars: ✭ 293 (-84.03%)
handson-mlJupyter notebooks containing the examples and exercises from the book "Hands-On Machine Learning".
Stars: ✭ 285 (-84.47%)
Agile data code 2Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
Stars: ✭ 413 (-77.49%)
User Machine Learning TutorialuseR! 2016 Tutorial: Machine Learning Algorithmic Deep Dive http://user2016.org/tutorials/10.html
Stars: ✭ 393 (-78.58%)
Predicting real estate prices using scikit LearnPredicting Amsterdam house / real estate prices using Ordinary Least Squares-, XGBoost-, KNN-, Lasso-, Ridge-, Polynomial-, Random Forest-, and Neural Network MLP Regression (via scikit-learn)
Stars: ✭ 78 (-95.75%)
Pyspark Example ProjectExample project implementing best practices for PySpark ETL jobs and applications.
Stars: ✭ 633 (-65.5%)
Sci PypeA Machine Learning API with native redis caching and export + import using S3. Analyze entire datasets using an API for building, training, testing, analyzing, extracting, importing, and archiving. This repository can run from a docker container or from the repository.
Stars: ✭ 90 (-95.1%)
SetlA simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (-95.69%)
H2o TutorialsTutorials and training material for the H2O Machine Learning Platform
Stars: ✭ 1,305 (-28.88%)
Data Science CompetitionsThe goal of this repo is to provide solutions to all Data Science Competitions (Kaggle, Data Hack, Machine Hack, Driven Data, etc.).
Stars: ✭ 572 (-68.83%)
Github-Stars-PredictorA GitHub repo star predictor that tries to predict the star count of any GitHub repository with more than 100 stars.
Stars: ✭ 34 (-98.15%)
Rumble⛈️ Rumble 1.11.0 "Banyan Tree"🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
Stars: ✭ 58 (-96.84%)
Pulsar SparkWhen Apache Pulsar meets Apache Spark
Stars: ✭ 55 (-97%)
W2vWord2Vec models with Twitter data using Spark. Blog:
Stars: ✭ 64 (-96.51%)
25daysinmachinelearningI will update this repository with statistics content and materials for learning machine learning with Python
Stars: ✭ 53 (-97.11%)
MlboxMLBox is a powerful Automated Machine Learning python library.
Stars: ✭ 1,199 (-34.66%)
Allstate capstoneAllstate Kaggle Competition ML Capstone Project
Stars: ✭ 72 (-96.08%)
Python BigdataData science and Big Data with Python
Stars: ✭ 112 (-93.9%)
Spark AlchemyCollection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive
Stars: ✭ 122 (-93.35%)
DtreevizA python library for decision tree visualization and model interpretation.
Stars: ✭ 1,857 (+1.2%)
Pyspark Cheatsheet🐍 Quick reference guide to common patterns & functions in PySpark.
Stars: ✭ 108 (-94.11%)
Auto ml[UNMAINTAINED] Automated machine learning for analytics & production
Stars: ✭ 1,559 (-15.04%)
Sigmoidal aiPython, Data Science, Machine Learning, and Deep Learning tutorials - Sigmoidal
Stars: ✭ 103 (-94.39%)
Automl alexState-of-the-art Automated Machine Learning Python library for Tabular Data
Stars: ✭ 132 (-92.81%)
Machine-Learning-ModelsIn this repository I implement machine learning methods ranging from simple to complex, and I try to write them as template-style code.
Stars: ✭ 30 (-98.37%)
PixiedustPython Helper library for Jupyter Notebooks
Stars: ✭ 998 (-45.61%)
Spark Py NotebooksApache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 1,338 (-27.08%)
Cape PythonCollaborate on privacy-preserving policy for data science projects in Pandas and Apache Spark
Stars: ✭ 125 (-93.19%)