Top 142 datascience open source projects

Introduction Datascience Python Book
Introduction to Data Science: A Python Approach to Concepts, Techniques and Applications
Mimesis
Mimesis is a high-performance fake data generator for Python, which provides data for a variety of purposes in a variety of languages.
ODSC India 2018
My presentation at ODSC India 2018 about Deep Learning with Apache Spark
data science chile
List of Data Science courses in Chile 📈📊🇨🇱
ETL-Starter-Kit
📁 Extract, Transform, Load (ETL) 👷 refers to a process in database usage and especially in data warehousing. This repository contains a starter kit featuring ETL related work.
data-science-best-practices
The goal of this repository is to enable data scientists and ML engineers to develop data science use cases and make them ready for production use. This means focusing on the versioning, scalability, monitoring and engineering of the solution.
HackyHourHandbook
A handbook for those who want to start coordinating Hacky Hour events in their University/Institute
nl4dv
A Python toolkit to create Visualizations (Vis) using natural language (NL) or add an NL interface to existing Vis.
genero-nomes
Classifies names by gender according to the IBGE API
awesome-open-mlops
The Fuzzy Labs guide to the universe of open source MLOps
d20datascience
Data science investigations into the mechanics of the world's greatest role playing game
RcppDynProg
Dynamic Programming implemented in Rcpp. Includes example partition and out of sample fitting applications.
AgePredictor
Age classification from text using PAN16, blogs, Fisher Callhome, and Cancer Forum
nyc-2019-scikit-sprint
NYC WiMLDS scikit-learn open source sprint (Aug 24, 2019)
ScalaTIKZ
ScalaTIKZ is an open-source library for PGF/TIKZ vector graphics.
R-data-wrangling
Materials for my R data workshop. https://cengel.github.io/R-data-wrangling/
data-science-popular-algorithms
Data Science algorithms and topics that you must know. (Newly Designed) Recommender Systems, Decision Trees, K-Means, LDA, RFM-Segmentation, XGBoost in Python, R, and Scala.
Python-For-DataScience-Machine-Learning-Bootcamp-Udemy
Repository for the Udemy course Python for Data Science and Machine Learning Bootcamp, by Jose Portilla
ML-CaPsule
ML-capsule is a project for beginner and experienced data science enthusiasts who don't have a mentor or guidance and wish to learn machine learning. Using our repo, they can learn ML, DL, and many related technologies through different real-world projects and become interview-ready.
66Days NaturalLanguageProcessing
I am sharing my Journey of 66DaysofData in Natural Language Processing.
primrose
Primrose modeling framework for simple production models
Naive-Bayes-Evening-Workshop
Companion code for Introduction to Python for Data Science: Coding the Naive Bayes Algorithm evening workshop
xgboost-smote-detect-fraud
Can we predict accurately on skewed data? What sampling techniques can be used? Which models/techniques can be used in this scenario? Find the answers in this code pattern!
k3ai
A lightweight tool to get an AI infrastructure stack up in minutes, not days. K3ai will take care of setting up K8s for you, deploy the AI tool of your choice, and even run your code on it.