Pixiedust: Python Helper library for Jupyter Notebooks
Stars: ✭ 998 (+531.65%)
Optimus: 🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (+524.05%)
Agile data code 2: Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
Stars: ✭ 413 (+161.39%)
Python Bigdata: Data science and Big Data with Python
Stars: ✭ 112 (-29.11%)
Mydatascienceportfolio: Applying Data Science and Machine Learning to Solve Real World Business Problems
Stars: ✭ 227 (+43.67%)
W2v: Word2Vec models with Twitter data using Spark. Blog:
Stars: ✭ 64 (-59.49%)
H2o 3: H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (+3479.75%)
Spark Py Notebooks: Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 1,338 (+746.84%)
Fastbook: The fastai book, published as Jupyter Notebooks
Stars: ✭ 13,998 (+8759.49%)
Pandas Videos: Jupyter notebooks and datasets from the pandas Q&A video series
Stars: ✭ 1,716 (+986.08%)
Datasist: A Python library for easy data analysis, visualization, exploration and modeling
Stars: ✭ 123 (-22.15%)
Pythondata: Repo for code published on pythondata.com
Stars: ✭ 113 (-28.48%)
Seaborn Tutorial: This repository is my attempt to help Data Science aspirants gain the Data Visualization skills required to progress in their careers. It includes all the types of plots offered by Seaborn, applied to random datasets.
Stars: ✭ 114 (-27.85%)
Zigzag: Python library for identifying the peaks and valleys of a time series.
Stars: ✭ 156 (-1.27%)
Dat8: General Assembly's 2015 Data Science course in Washington, DC
Stars: ✭ 1,516 (+859.49%)
Seq2seq tutorial: Code For Medium Article "How To Create Data Products That Are Magical Using Sequence-to-Sequence Models"
Stars: ✭ 132 (-16.46%)
2016 Ml Contest: Machine learning contest - October 2016 TLE
Stars: ✭ 135 (-14.56%)
Dive Into Machine Learning: Dive into Machine Learning with Python Jupyter notebook and scikit-learn! First posted in 2016, maintained as of 2021. Pull requests welcome.
Stars: ✭ 10,810 (+6741.77%)
Data Science Wg: SF Brigade's Data Science Working Group.
Stars: ✭ 135 (-14.56%)
Data science blogs: A repository to keep track of all the code that I end up writing for my blog posts.
Stars: ✭ 139 (-12.03%)
Scipy con 2019: Tutorial Sessions for SciPy Con 2019
Stars: ✭ 142 (-10.13%)
Kaggle Houseprices: Kaggle Kernel for House Prices competition https://www.kaggle.com/massquantity/all-you-need-is-pca-lb-0-11421-top-4
Stars: ✭ 113 (-28.48%)
Geni: A Clojure dataframe library that runs on Spark
Stars: ✭ 152 (-3.8%)
Algocode: Welcome everyone! 🌟 Here you can solve problems, build scrapers and much more 💻
Stars: ✭ 113 (-28.48%)
Elassandra: Elassandra = Elasticsearch + Apache Cassandra
Stars: ✭ 1,610 (+918.99%)
Automunge: Artificial Learning, Intelligent Machines
Stars: ✭ 119 (-24.68%)
Spark Alchemy: Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive
Stars: ✭ 122 (-22.78%)
Krisk: Statistical Interactive Visualization with pandas+Jupyter integration on top of Echarts.
Stars: ✭ 111 (-29.75%)
Griffon Vm: Griffon Data Science Virtual Machine
Stars: ✭ 128 (-18.99%)
Beyond Jupyter: 🐍💻📊 All material from the PyCon.DE 2018 Talk "Beyond Jupyter Notebooks - Building your own data science platform with Python & Docker" (incl. Slides, Video, Udemy MOOC & other References)
Stars: ✭ 135 (-14.56%)
Cape Python: Collaborate on privacy-preserving policy for data science projects in Pandas and Apache Spark
Stars: ✭ 125 (-20.89%)
Machine Learning And Data Science: This is a repository which contains all my work related to Machine Learning, AI and Data Science. This includes my graduate projects, machine learning competition code, algorithm implementations and reading material.
Stars: ✭ 137 (-13.29%)
Nlpaug: Data augmentation for NLP
Stars: ✭ 2,761 (+1647.47%)
Textbook: Principles and Techniques of Data Science, the textbook for Data 100 at UC Berkeley
Stars: ✭ 145 (-8.23%)
Datacompy: Pandas and Spark DataFrame comparison for humans
Stars: ✭ 147 (-6.96%)
Nyc Transport: A Unified Database of NYC transport (subway, taxi/Uber, and citibike) data.
Stars: ✭ 148 (-6.33%)
Machine Learning: 🌎 machine learning tutorials (mainly in Python3)
Stars: ✭ 1,924 (+1117.72%)
Raspberryturk: The Raspberry Turk is a robot that can play chess; it's entirely open source, based on Raspberry Pi, and inspired by the 18th century chess playing machine, the Mechanical Turk.
Stars: ✭ 140 (-11.39%)
Fantasy Basketball: Scraping statistics, predicting NBA player performance with neural networks and boosting algorithms, and optimising lineups for Draft Kings with a genetic algorithm. Capstone Project for Machine Learning Engineer Nanodegree by Udacity.
Stars: ✭ 146 (-7.59%)
Project kojak: Training a Neural Network to Detect Gestures and Control Smart Home Devices with OpenCV in Python
Stars: ✭ 147 (-6.96%)
Testovoe: Home assignments for data science positions
Stars: ✭ 149 (-5.7%)
Spark With Python: Fundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (-5.06%)
Benchm Ml: A minimal benchmark for scalability, speed and accuracy of commonly used open source implementations (R packages, Python scikit-learn, H2O, xgboost, Spark MLlib etc.) of the top machine learning algorithms for binary classification (random forests, gradient boosted trees, deep neural networks etc.).
Stars: ✭ 1,835 (+1061.39%)
Py Quantmod: Powerful financial charting library based on R's Quantmod | http://py-quantmod.readthedocs.io/en/latest/
Stars: ✭ 155 (-1.9%)
Ml Workspace: 🛠 All-in-one web-based IDE specialized for machine learning and data science.
Stars: ✭ 2,337 (+1379.11%)
Machine Learning With Python: Practice and tutorial-style notebooks covering a wide variety of machine learning techniques
Stars: ✭ 2,197 (+1290.51%)