Spark Py Notebooks: Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 1,338 (+1127.52%)
Pandas Profiling: Create HTML profiling reports from pandas DataFrame objects
Stars: ✭ 8,329 (+7541.28%)
Data Science Your Way: Ways of doing Data Science Engineering and Machine Learning in R and Python
Stars: ✭ 530 (+386.24%)
Quantitative Notebooks: Educational notebooks on quantitative finance, algorithmic trading, financial modelling and investment strategy
Stars: ✭ 356 (+226.61%)
Jupytemplate: Templates for Jupyter notebooks
Stars: ✭ 85 (-22.02%)
Pythondata: Repo for code published on pythondata.com
Stars: ✭ 113 (+3.67%)
Optimus: 🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (+804.59%)
Cookbook 2nd Code: Code of the IPython Cookbook, Second Edition, by Cyrille Rossant, Packt Publishing 2018 [read-only repository]
Stars: ✭ 541 (+396.33%)
Courses: Quizzes & assignments from Coursera
Stars: ✭ 454 (+316.51%)
Nteract: 📘 The interactive computing suite for you! ✨
Stars: ✭ 5,713 (+5141.28%)
Cookbook 2nd: IPython Cookbook, Second Edition, by Cyrille Rossant, Packt Publishing 2018
Stars: ✭ 704 (+545.87%)
Datasciencevm: Tools and Docs on the Azure Data Science Virtual Machine (http://aka.ms/dsvm)
Stars: ✭ 153 (+40.37%)
Ml Workspace: 🛠 All-in-one web-based IDE specialized for machine learning and data science.
Stars: ✭ 2,337 (+2044.04%)
Tennis Crystal Ball: Ultimate Tennis Statistics and Tennis Crystal Ball - Tennis Big Data Analysis and Prediction
Stars: ✭ 107 (-1.83%)
Cortx: CORTX Community Object Storage is 100% open source object storage uniquely optimized for mass capacity storage devices.
Stars: ✭ 426 (+290.83%)
Jupyter pivottablejs: Drag’n’drop Pivot Tables and Charts for Jupyter/IPython Notebook, care of PivotTable.js
Stars: ✭ 428 (+292.66%)
Sklearn Classification: Data Science Notebook on a Classification Task, using sklearn and TensorFlow.
Stars: ✭ 518 (+375.23%)
Intro To Python: An intro to Python & programming for wanna-be data scientists
Stars: ✭ 536 (+391.74%)
Setl: A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (-27.52%)
Sci Pype: A Machine Learning API with native Redis caching and export + import using S3. Analyze entire datasets using an API for building, training, testing, analyzing, extracting, importing, and archiving. This repository can run from a Docker container or from the repository.
Stars: ✭ 90 (-17.43%)
Jupyterlab Lsp: Coding assistance for JupyterLab (code navigation + hover suggestions + linters + autocompletion + rename) using Language Server Protocol
Stars: ✭ 796 (+630.28%)
Pachyderm: Reproducible Data Science at Scale!
Stars: ✭ 5,305 (+4766.97%)
Lux: Python API for Intelligent Visual Data Discovery
Stars: ✭ 787 (+622.02%)
Data Science: Collection of useful data science topics along with code and articles
Stars: ✭ 315 (+188.99%)
Nbconflux: nbconflux converts Jupyter Notebooks to Atlassian Confluence pages
Stars: ✭ 82 (-24.77%)
Articles: A repository for the source code, notebooks, data, files, and other assets used in the data science and machine learning articles on LearnDataSci
Stars: ✭ 350 (+221.1%)
Fastai2: Temporary home for fastai v2 while it's being developed
Stars: ✭ 630 (+477.98%)
H2o 3: H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (+5088.99%)
Dataflowjavasdk: Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.
Stars: ✭ 854 (+683.49%)
Spark Movie Lens: An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset
Stars: ✭ 745 (+583.49%)
Lambdaschooldatascience: Completed assignments and coding challenges from the Lambda School Data Science program.
Stars: ✭ 22 (-79.82%)
Data Science On Gcp: Source code accompanying the book Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017
Stars: ✭ 864 (+692.66%)
Kaggle Competitions: There are plenty of courses and tutorials that can help you learn machine learning from scratch, but here on GitHub I want to solve some Kaggle competitions as a comprehensive workflow with Python packages. After reading, you can use this workflow as a template for solving other real problems.
Stars: ✭ 86 (-21.1%)
Resources: PyMC3 educational resources
Stars: ✭ 930 (+753.21%)
Pyspark Setup Demo: Demo of PySpark and Jupyter Notebook with the Jupyter Docker Stacks
Stars: ✭ 24 (-77.98%)
Prml: PRML algorithms implemented in Python
Stars: ✭ 10,206 (+9263.3%)
Crime Analysis: Association Rule Mining from Spatial Data for Crime Analysis
Stars: ✭ 20 (-81.65%)
Machinelearningcourse: A collection of notebooks from my Machine Learning class, written in Python 3
Stars: ✭ 35 (-67.89%)
Python Training: Python training for business analysts and traders
Stars: ✭ 972 (+791.74%)
Sparkmagic: Jupyter magics and kernels for working with remote Spark clusters
Stars: ✭ 954 (+775.23%)
Datacamp: 🍧 A repository that contains courses I have taken on DataCamp
Stars: ✭ 69 (-36.7%)
Ppd599: USC urban data science course series with Python and Jupyter
Stars: ✭ 1,062 (+874.31%)
Countly Sdk Cordova: Countly Product Analytics SDK for Cordova, Icenium and PhoneGap
Stars: ✭ 69 (-36.7%)
Allstate capstone: Allstate Kaggle Competition ML Capstone Project
Stars: ✭ 72 (-33.94%)
Tensorwatch: Debugging, monitoring and visualization for Python Machine Learning and Data Science
Stars: ✭ 3,191 (+2827.52%)
Pydataroad: Open-source code for the WeChat official account (ID: PyDataLab)
Stars: ✭ 302 (+177.06%)
Skdata: Python tools for data analysis
Stars: ✭ 16 (-85.32%)