Spark Py Notebooks: Apache Spark & Python (PySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 1,338 (+1990.63%)
Optimus: 🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (+1440.63%)
Mydatascienceportfolio: Applying Data Science and Machine Learning to Solve Real World Business Problems
Stars: ✭ 227 (+254.69%)
kafka-compose: 🎼 Docker Compose files for various Kafka stacks
Stars: ✭ 32 (-50%)
Sparkmagic: Jupyter magics and kernels for working with remote Spark clusters
Stars: ✭ 954 (+1390.63%)
Pyspark Cheatsheet: 🐍 Quick reference guide to common patterns & functions in PySpark.
Stars: ✭ 108 (+68.75%)
Handyspark: HandySpark - bringing pandas-like capabilities to Spark dataframes
Stars: ✭ 158 (+146.88%)
Pysparkgeoanalysis: 🌐 Interactive Workshop on GeoAnalysis using PySpark
Stars: ✭ 63 (-1.56%)
Pyspark Example Project: Example project implementing best practices for PySpark ETL jobs and applications.
Stars: ✭ 633 (+889.06%)
Python Bigdata: Data science and Big Data with Python
Stars: ✭ 112 (+75%)
Pixiedust: Python Helper library for Jupyter Notebooks
Stars: ✭ 998 (+1459.38%)
Spark Practice: Apache Spark (PySpark) Practice on Real Data
Stars: ✭ 200 (+212.5%)
Agile data code 2: Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
Stars: ✭ 413 (+545.31%)
Spark With Python: Fundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (+134.38%)
H2o 3: H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (+8737.5%)
Mds: Modern Data Science
Stars: ✭ 19 (-70.31%)
Har Keras Coreml: Human Activity Recognition (HAR) with Keras and CoreML
Stars: ✭ 23 (-64.06%)
Resources: PyMC3 educational resources
Stars: ✭ 930 (+1353.13%)
Lambdaschooldatascience: Completed assignments and coding challenges from the Lambda School Data Science program.
Stars: ✭ 22 (-65.62%)
Skdata: Python tools for data analysis
Stars: ✭ 16 (-75%)
Pyspark Setup Demo: Demo of PySpark and Jupyter Notebook with the Jupyter Docker Stacks
Stars: ✭ 24 (-62.5%)
Sparkling Titanic: Training models with Apache Spark and PySpark for the Titanic Kaggle competition
Stars: ✭ 12 (-81.25%)
Data Science On Gcp: Source code accompanying the book Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017
Stars: ✭ 864 (+1250%)
Awesome Google Colab: Google Colaboratory Notebooks and Repositories (by @firmai)
Stars: ✭ 863 (+1248.44%)
Live log analyzer spark: Spark application that analyzes Apache access logs and detects anomalies, with an accompanying Medium article.
Stars: ✭ 14 (-78.12%)
Pandas Profiling: Create HTML profiling reports from pandas DataFrame objects
Stars: ✭ 8,329 (+12914.06%)
Tedsds: Apache Spark - Turbofan Engine Degradation Simulation Data Set example in Apache Spark
Stars: ✭ 14 (-78.12%)
Mlnet Workshop: ML.NET Workshop to predict car sales prices
Stars: ✭ 29 (-54.69%)
Intro Python: Python for Statistics and Data Science -- Syntax, Data Wrangling, Graphs, Programming, Learning
Stars: ✭ 21 (-67.19%)
Python for ml: Brief introduction to Python for machine learning
Stars: ✭ 29 (-54.69%)
Coursera: Quizzes & Assignments from Coursera
Stars: ✭ 774 (+1109.38%)
Tiledb Vcf: Efficient variant-call data storage and retrieval library using the TileDB storage library.
Stars: ✭ 26 (-59.37%)
Crime Analysis: Association Rule Mining from Spatial Data for Crime Analysis
Stars: ✭ 20 (-68.75%)
Docker Iocaml Datascience: Dockerfile of Jupyter (IPython notebook) and IOCaml (OCaml kernel) with libraries for data science and machine learning
Stars: ✭ 30 (-53.12%)
Python Training: Python training for business analysts and traders
Stars: ✭ 972 (+1418.75%)
Ds Take Home: My solutions to the book A Collection of Data Science Take-Home Challenges
Stars: ✭ 1,004 (+1468.75%)
Computervision Recipes: Best Practices, code samples, and documentation for Computer Vision.
Stars: ✭ 8,214 (+12734.38%)
Machine Learning From Scratch: Succinct Machine Learning algorithm implementations from scratch in Python, solving real-world problems (Notebooks and Book). Examples of Logistic Regression, Linear Regression, Decision Trees, K-means clustering, Sentiment Analysis, Recommender Systems, Neural Networks and Reinforcement Learning.
Stars: ✭ 42 (-34.37%)
Numerical Linear Algebra: Free online textbook of Jupyter notebooks for fast.ai Computational Linear Algebra course
Stars: ✭ 8,263 (+12810.94%)
Presentations: Talks & Workshops by the CODAIT team
Stars: ✭ 50 (-21.87%)
Ppd599: USC urban data science course series with Python and Jupyter
Stars: ✭ 1,062 (+1559.38%)
25daysinmachinelearning: I will update this repository with statistics content and materials for learning machine learning with Python
Stars: ✭ 53 (-17.19%)
Mlj.jl: A Julia machine learning framework
Stars: ✭ 982 (+1434.38%)
Mckinsey Smartcities Traffic Prediction: Adventure into using multi-attention recurrent neural networks for time series (city traffic) in the 2017-11-18 McKinsey IronMan (24h non-stop) prediction challenge
Stars: ✭ 49 (-23.44%)
Metrotwitter: What Twitter reveals about the differences between cities and the monoculture of the Bay Area
Stars: ✭ 52 (-18.75%)