Top 1642 data-science open source projects

Data Science Interview Resources
A repository listing resources to help you prepare for a Data Science/Machine Learning interview. New resources are added frequently.
Python Crfsuite
A Python binding for CRFsuite
Cracking The Data Science Interview
A Collection of Cheatsheets, Books, Questions, and Portfolio For DS/ML Interview Prep
Roughviz
Reusable JavaScript library for creating sketchy/hand-drawn styled charts in the browser.
Deep learning and the game of go
Code and other material for the book "Deep Learning and the Game of Go"
Kaggle Cli
(Deprecated, use https://github.com/Kaggle/kaggle-api instead) An unofficial Kaggle command line tool.
Ipython Dashboard
A standalone, lightweight web server for building and sharing graphs created in IPython. Built for data scientists and data analysts. It aims to provide interactive visualization, collaborative dashboards, and real-time streaming graphs.
Test Tube
Python library to easily log experiments and parallelize hyperparameter search for neural networks
Deep Recommender System
Applications of deep learning in recommender systems, with summaries of related papers.
Tsfresh
Automatic extraction of relevant features from time series
Data Science Blogs
A curated list of data science blogs
Dataprep
DataPrep — The easiest way to prepare data in Python
Zero To Mastery Ml
All course materials for the Zero to Mastery Machine Learning and Data Science course.
Speech Emotion Analyzer
A neural network model capable of detecting five different emotions in male and female speech audio. (Deep Learning, NLP, Python)
Pyspark Example Project
Example project implementing best practices for PySpark ETL jobs and applications.
Data Science Career
A repository of career resources for Data Science, Machine Learning, Big Data, and Business Analytics
Fastai2
Temporary home for fastai v2 while it's being developed
Boltons
🔩 Like builtins, but boltons. 250+ constructs, recipes, and snippets which extend (and rely on nothing but) the Python standard library. Nothing like Michael Bolton.
Lazydata
Lazydata: Scalable data dependencies for Python projects
Matrixprofile Ts
A Python library for detecting patterns and anomalies in massive datasets using the Matrix Profile
Engsoccerdata
English and European soccer results 1871-2020
H2o 3
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Dist Keras
Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark.
Sigma coding youtube
This is a collection of all the code that can be found on my YouTube channel Sigma Coding.
Book sample
Another book on data science
Moviegeek
A django website used in the book Practical Recommender Systems to illustrate how recommender algorithms can be implemented.
Datasheets
Read data from, write data to, and modify the formatting of Google Sheets
Pdpipe
Easy pipelines for pandas DataFrames.
Awesome Ai Usecases
A list of awesome and proven Artificial Intelligence use cases and applications
Imbalanced Learn
A Python Package to Tackle the Curse of Imbalanced Datasets in Machine Learning
Vehicle counting tensorflow
🚘 "MORE THAN VEHICLE COUNTING!" This project provides prediction for speed, color and size of the vehicles with TensorFlow Object Counting API.
Data Science Competitions
The goal of this repo is to provide solutions to all data science competitions (Kaggle, Data Hack, Machine Hack, Driven Data, etc.).
Baikal
A graph-based functional API for building complex scikit-learn pipelines.
Pygam
[HELP REQUESTED] Generalized Additive Models in Python
Datasets For Recommender Systems
A repository of high-quality, topic-centric public data sources for Recommender Systems (RS)
Data Analysis And Machine Learning Projects
Repository of teaching materials, code, and data for my data analysis and machine learning projects.
Data Science Portfolio
Portfolio of data science projects completed by me for academic, self learning, and hobby purposes.
Nipype
Workflows and interfaces for neuroimaging packages
Probabilistic Programming And Bayesian Methods For Hackers
aka "Bayesian Methods for Hackers": An introduction to Bayesian methods + probabilistic programming with a computation/understanding-first, mathematics-second point of view. All in pure Python ;)
Cookbook 2nd Code
Code of the IPython Cookbook, Second Edition, by Cyrille Rossant, Packt Publishing 2018 [read-only repository]
Intro To Python
An intro to Python & programming for wanna-be data scientists
Feature Selection
Feature selector based on a self-selected algorithm, loss function, and validation method