ottogroup / dstoolbox

License: Apache-2.0
Tools that make working with scikit-learn and pandas easier.

Otto Group BI Data Science Toolbox

NOTE: This project is on life support. New features are unlikely to be added, but there will be regular updates to support upcoming versions of sklearn and pandas.

This repository contains tools that make working with scikit-learn and pandas easier.

What is this?

dstoolbox is not one big tool but rather a collection of small, re-usable tools. They are intended to work well with scikit-learn and pandas and to make the integration of those libraries easier.

The best way to get started is to have a look at the notebooks folder, especially at the showcase notebook.

The tools included here are used by us at Otto Group BI for our production services, as well as by individual members for other machine learning work, such as participating in Kaggle competitions.

Installation instructions

Using pip:

pip install dstoolbox

There is a conda recipe for those who want to build their own conda package.

Contributing

Pull requests are welcome. Here are some directions:

Tests

To run the tests, you need to install the dev requirements using pip:

pip install -r requirements-dev.txt

or conda:

conda install --file requirements-dev.txt

Next you should check that all unit tests and all static code checks pass:

py.test
pylint dstoolbox

Guidelines

  • Python 3 only.
  • Code should be re-usable and succinct.
  • Where applicable, it should be compatible with scikit-learn, pandas, and Palladium.
  • It should be documented and unit-tested using pytest (100% code coverage desired).
  • It should conform to the coding standards prescribed by pylint (where it makes sense).
  • There should be usage examples that cover the most common use cases (the best place would be an IPython/Jupyter notebook).
  • Don't add dependencies unless absolutely necessary.
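
As a rough illustration of these guidelines, here is a minimal sketch of what a small, documented, unit-tested tool could look like. The class `ColumnMeanImputer` and its tests are hypothetical and not part of dstoolbox. To stay dependency-free, the sketch only mimics the scikit-learn `fit`/`transform` convention on plain lists; an actual contribution would presumably subclass scikit-learn's base classes and accept pandas or numpy input.

```python
"""Hypothetical example module; not part of dstoolbox."""


class ColumnMeanImputer:
    """Replace missing values (None) in each column by the column mean.

    Follows the scikit-learn ``fit``/``transform`` convention by duck
    typing only (an assumption made to keep the sketch dependency-free).
    """

    def fit(self, X, y=None):
        # X is a list of equally long rows; learn one mean per column.
        self.means_ = []
        for col in zip(*X):
            present = [v for v in col if v is not None]
            # Fall back to 0.0 for a column without any observed value.
            self.means_.append(sum(present) / len(present) if present else 0.0)
        return self

    def transform(self, X):
        # Build new rows so that the caller's data is never modified.
        return [
            [self.means_[j] if v is None else v for j, v in enumerate(row)]
            for row in X
        ]


# pytest collects functions named test_*; plain asserts are sufficient.
def test_column_mean_imputer_fills_missing_values():
    X = [[1.0, None], [3.0, 4.0], [None, 8.0]]
    result = ColumnMeanImputer().fit(X).transform(X)
    assert result == [[1.0, 6.0], [3.0, 4.0], [2.0, 8.0]]


def test_column_mean_imputer_leaves_input_untouched():
    X = [[None, 2.0]]
    ColumnMeanImputer().fit(X).transform(X)
    assert X == [[None, 2.0]]
```

Saved under a name such as test_imputer.py (also hypothetical), the two test functions would be picked up by the py.test run described in the Tests section.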