Top 1642 data-science open source projects

Benchm Ml
A minimal benchmark for scalability, speed and accuracy of commonly used open source implementations (R packages, Python scikit-learn, H2O, xgboost, Spark MLlib etc.) of the top machine learning algorithms for binary classification (random forests, gradient boosted trees, deep neural networks etc.).
Labs
Labs for the Foundations of Applied Mathematics curriculum.
Testovoe
Home assignments for data science positions
Project kojak
Training a Neural Network to Detect Gestures and Control Smart Home Devices with OpenCV in Python
Ml Hub
🧰 Multi-user development platform for machine learning teams. Simple to set up within minutes.
Nyc Transport
A Unified Database of NYC transport (subway, taxi/Uber, and citibike) data.
Datacompy
Pandas and Spark DataFrame comparison for humans
Pycwt
A Python module for continuous wavelet spectral analysis. It includes a collection of routines for wavelet transform and statistical analysis via the FFT algorithm. The module also includes cross-wavelet transforms, wavelet coherence tests and sample scripts.
Fantasy Basketball
Scraping statistics, predicting NBA player performance with neural networks and boosting algorithms, and optimising lineups for Draft Kings with a genetic algorithm. Capstone Project for Machine Learning Engineer Nanodegree by Udacity.
Selfie2anime
Anime2Selfie Backend Services - Lambda, Queue, API Gateway and traffic processing
Docker tutorial
Code and helper scripts for the Medium article "How Docker Can Help You Become A More Effective Data Scientist"
Textbook
Principles and Techniques of Data Science, the textbook for Data 100 at UC Berkeley
Py Rse
Research Software Engineering with Python course material
Tscv
Time Series Cross-Validation -- an extension for scikit-learn
Bodywork Core
Deploy machine learning projects developed in Python to Kubernetes. Accelerated MLOps 🚀
Efficient Apriori
An efficient Python implementation of the Apriori algorithm.
Scalable Data Science
Scalable Data Science: course sets in big data using Apache Spark on Databricks, and their mathematical, statistical and computational foundations using SageMath.
Doddle Model
🍰 doddle-model: machine learning in Scala.
Raspberryturk
The Raspberry Turk is a robot that can play chess. It's entirely open source, based on Raspberry Pi, and inspired by the 18th-century chess-playing machine, the Mechanical Turk.
Coffee Quality Database
Building the Coffee Quality Institute Database
Book
This book serves as an introduction to a whole new way of thinking systematically about geographic data, using geographical analysis and computation to unlock new insights hidden within data.
Matrixprofile
A Python 3 library making time series data mining tasks, utilizing matrix profile algorithms, accessible to everyone.
Ripser.py
A Lean Persistent Homology Library for Python
Toma
Helps you write algorithms in PyTorch that adapt to the available (CUDA) memory
Datasciencecoursera
Data Science Repo and blog for Johns Hopkins Coursera Courses. Please let me know if you have any questions.
Python For Data Science
A blog for data analytics using data science technologies
Traffic
A toolbox for processing and analysing air traffic data
Scilab
Free and Open Source software for numerical computation providing a powerful computing environment for engineering and scientific applications.
Machine Learning And Data Science
This is a repository which contains all my work related to Machine Learning, AI and Data Science. This includes my graduate projects, machine learning competition code, algorithm implementations and reading material.
Accelerator
The Accelerator is a tool for fast and reproducible processing of large amounts of data.
Pandasschema
A validation library for Pandas data frames using user-friendly schemas
Qlik Py Tools
Data Science algorithms for Qlik implemented as a Python Server Side Extension (SSE).
Beyond Jupyter
🐍💻📊 All material from the PyCon.DE 2018 Talk "Beyond Jupyter Notebooks - Building your own data science platform with Python & Docker" (incl. Slides, Video, Udemy MOOC & other References)
Blockchain2graph
Blockchain2graph extracts blockchain data (bitcoin) and inserts it into a graph database (neo4j).
Accelerators
Data science and AI solution accelerator suite that provides templates for prototyping, reporting, and presenting data science analytics of specific domains