aws / sagemaker-xgboost-container

Licence: Apache-2.0 license
This is the Docker container based on the open source framework XGBoost (https://xgboost.readthedocs.io/en/latest/) that allows customers to use their own XGBoost scripts in SageMaker.

SageMaker XGBoost Container

SageMaker XGBoost Container is an open source library for making the XGBoost framework run on Amazon SageMaker.

This repository also contains Dockerfiles which install this library and dependencies for building SageMaker XGBoost Framework images.

The SageMaker team uses this repository to build its official XGBoost Framework image. To use this image on SageMaker, see the SageMaker Python SDK. For end users, this repository is typically of interest if you need implementation details of the official image, or if you want to build your own customized XGBoost Framework image.

Table of Contents

  1. Getting Started
  2. Building your Image
  3. Running the tests

Getting Started

Prerequisites

Make sure you have installed all of the following prerequisites on your development machine:

Note: CMake is required for XGBoost. If using macOS, install CMake (pip install cmake)
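
As a quick sanity check before building, the command-line tools this README refers to (Docker for the image builds, CMake for XGBoost, Conda for the tests) can be looked up on the PATH. The sketch below is illustrative only: the tool list and the helper name missing_tools are assumptions, not part of this repository.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Executables referred to in this README; the list is an assumption for illustration.
REQUIRED_TOOLS = ("docker", "cmake", "conda")


def missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All listed tools were found on PATH.")
```

The which parameter exists only so the lookup can be replaced in tests; by default the standard library's shutil.which is used.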

Recommended

Building your image

Amazon SageMaker utilizes Docker containers to run all training jobs & inference endpoints.

The Docker images are built from the Dockerfiles specified in docker/.

The Dockerfiles are grouped by XGBoost version and separated by Python version and processor type.

The Docker images, used to run training & inference jobs, are built from both corresponding "base" and "final" Dockerfiles.

Base Images

The "base" Dockerfile encompasses the installation of the framework and all of the dependencies needed.

The tagging scheme is <SageMaker-XGBoost-version>-cpu-py3 (e.g. 1.5-1-cpu-py3), where
<SageMaker-XGBoost-version> consists of <XGBoost-version>-<SageMaker-version>.
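
To make the tagging scheme concrete, here is a minimal sketch that composes and parses such tags. The class ImageTag, the function parse_tag, and the regular expression are hypothetical; in particular, the pattern assumes a dotted numeric XGBoost version and a numeric SageMaker version, which matches the 1.5-1 example but is not confirmed for every released tag.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageTag:
    xgboost_version: str    # e.g. "1.5"
    sagemaker_version: str  # e.g. "1"

    @property
    def framework_version(self) -> str:
        """<SageMaker-XGBoost-version>: <XGBoost-version>-<SageMaker-version>."""
        return f"{self.xgboost_version}-{self.sagemaker_version}"

    def __str__(self) -> str:
        # Only the CPU / Python 3 variant described in this README is covered.
        return f"{self.framework_version}-cpu-py3"


# Assumed shape of the two version components (see the note above).
_TAG_PATTERN = re.compile(r"^(\d+(?:\.\d+)*)-(\d+)-cpu-py3$")


def parse_tag(tag: str) -> ImageTag:
    """Split a tag such as '1.5-1-cpu-py3' into its version components."""
    match = _TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"unexpected tag format: {tag!r}")
    return ImageTag(match.group(1), match.group(2))
```

For example, parse_tag("1.5-1-cpu-py3") yields XGBoost version 1.5 and SageMaker version 1, and str(ImageTag("1.5", "1")) gives back the same tag.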

All "final" Dockerfiles build images using base images that use the tagging scheme above.

If you want to build your base Docker image, then use:

# All build instructions assume you're building from the root directory of the sagemaker-xgboost-container.

# CPU
docker build -t xgboost-container-base:<SageMaker-XGBoost-version>-cpu-py3 -f docker/<SageMaker-XGBoost-version>/base/Dockerfile.cpu .
# Example

# CPU
docker build -t xgboost-container-base:1.5-1-cpu-py3 -f docker/1.5-1/base/Dockerfile.cpu .

Final Images

The "final" Dockerfiles encompass the installation of the SageMaker-specific support code.

All "final" Dockerfiles use base images for building.

These "base" images are specified with the naming convention of xgboost-container-base:<SageMaker-XGBoost-version>-cpu-py3.

Before building "final" images:

Build your "base" image. Make sure it is named and tagged in accordance with your "final" Dockerfile.

# Create the SageMaker XGBoost Container Python package.
cd sagemaker-xgboost-container
python setup.py bdist_wheel --universal

If you want to build "final" Docker images, then use:

# All build instructions assume you're building from the root directory of the sagemaker-xgboost-container.

# CPU
docker build -t <image_name>:<tag> -f docker/<SageMaker-XGBoost-version>/final/Dockerfile.cpu .
# Example

# CPU
docker build -t preprod-xgboost-container:1.5-1-cpu-py3 -f docker/1.5-1/final/Dockerfile.cpu .
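
The build order described in this section (base image, then the Python wheel, then the final image) can be written down as an ordered list of commands. This is a sketch rather than tooling from the repository: build_plan and its parameters are hypothetical names, and the function only assembles the commands shown above without running them.

```python
import shlex
from typing import List


def build_plan(framework_version: str, image_name: str) -> List[List[str]]:
    """Return, in order, the commands for building the base and final CPU images.

    framework_version is <SageMaker-XGBoost-version>, e.g. "1.5-1".
    """
    tag = f"{framework_version}-cpu-py3"
    return [
        # 1. Base image, named and tagged the way the "final" Dockerfile expects.
        ["docker", "build", "-t", f"xgboost-container-base:{tag}",
         "-f", f"docker/{framework_version}/base/Dockerfile.cpu", "."],
        # 2. The SageMaker XGBoost Container Python package.
        ["python", "setup.py", "bdist_wheel", "--universal"],
        # 3. Final image containing the SageMaker-specific support code.
        ["docker", "build", "-t", f"{image_name}:{tag}",
         "-f", f"docker/{framework_version}/final/Dockerfile.cpu", "."],
    ]


if __name__ == "__main__":
    # Print the commands for review; run them from the repository root.
    for command in build_plan("1.5-1", "preprod-xgboost-container"):
        print(shlex.join(command))
```

Each entry is an argument list, so a caller could pass it to subprocess.run(command, check=True) from the repository root to stop at the first failing step.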

Running the tests

Running the tests requires installation of the SageMaker XGBoost Framework container code and its test dependencies.

git clone https://github.com/aws/sagemaker-xgboost-container.git
cd sagemaker-xgboost-container
# Quoting the extras specifier keeps the command working in shells such as zsh, not only bash.
pip install -e ".[test]"

Conda is also required and can be installed by following the instructions at https://conda.io/projects/conda/en/latest/user-guide/install/index.html. For convenience, the Linux installation commands are provided as an example.

curl -LO https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -bfp /miniconda3
rm Miniconda3-latest-Linux-x86_64.sh
export PATH=/miniconda3/bin:${PATH}
conda update -y conda

Tests are defined in test/ and include unit, local integration, and SageMaker integration tests.

Unit Tests

If you want to run unit tests, then use:

# All test instructions should be run from the top level directory

pytest test/unit

# or you can use tox to run unit tests as well as flake8 and code coverage

tox
tox -e py3-xgboost1.0,flake8
tox -e py3-xgboost0.90,py3-xgboostlatest
tox -e py3-xgboost0.72

Local Integration Tests

Running local integration tests requires Docker and AWS credentials, as the local integration tests make calls to a couple of AWS services. The local integration tests and SageMaker integration tests require configurations specified within their respective conftest.py.

Before running local integration tests:

  1. Build your Docker image.
  2. Pass in the correct pytest arguments to run tests against your Docker image.

If you want to run local integration tests, then use:

# Required arguments for integration tests are found in test/conftest.py

pytest test/integration/local --docker-base-name <your_docker_image> \
                  --tag <your_docker_image_tag> \
                  --py-version <2_or_3> \
                  --framework-version <xgboost-version>
# Example
pytest test/integration/local --docker-base-name preprod-xgboost-container \
                  --tag 1.5-1-cpu-py3 \
                  --py-version 3 \
                  --framework-version 1.5-1

SageMaker Integration Tests

SageMaker integration tests require your Docker image to be within an Amazon ECR repository.

The Docker base name is your ECR repository namespace.

The instance type is the Amazon SageMaker instance type that the SageMaker integration tests will run on.

Before running SageMaker integration tests:

  1. Build your Docker image.
  2. Push the image to your ECR repository.
  3. Pass in the correct pytest arguments to run tests on SageMaker against the image within your ECR repository.
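
For step 2, the image has to be tagged with its full ECR URI before it can be pushed. The sketch below composes that URI and the matching docker tag and docker push commands. The function names are hypothetical, the region parameter is an assumption (it is not among the pytest arguments shown in this README), and the URI layout is the one used in the standard AWS partition. Authenticating Docker to ECR is not shown.

```python
from typing import List


def ecr_image_uri(aws_id: str, region: str, docker_base_name: str, tag: str) -> str:
    """Compose <aws_id>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>."""
    return f"{aws_id}.dkr.ecr.{region}.amazonaws.com/{docker_base_name}:{tag}"


def push_commands(local_image: str, uri: str) -> List[List[str]]:
    """Return the commands that tag a local image with its ECR URI and push it."""
    return [
        ["docker", "tag", local_image, uri],
        ["docker", "push", uri],
    ]


if __name__ == "__main__":
    # Placeholder account ID and region, for illustration only.
    uri = ecr_image_uri("123456789012", "us-west-2",
                        "preprod-xgboost-container", "1.5-1-cpu-py3")
    for command in push_commands("preprod-xgboost-container:1.5-1-cpu-py3", uri):
        print(" ".join(command))
```

The repository part of the URI is the value passed as --docker-base-name, and the part after the colon is the value passed as --tag.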

If you want to run an end-to-end SageMaker integration test on Amazon SageMaker, then use:

# Required arguments for integration tests are found in test/conftest.py

pytest test/integration/sagemaker --aws-id <your_aws_id> \
                       --docker-base-name <your_docker_image> \
                       --instance-type <amazon_sagemaker_instance_type> \
                       --tag <your_docker_image_tag>
# Example
pytest test/integration/sagemaker --aws-id 123456789012 \
                       --docker-base-name preprod-xgboost-container \
                       --instance-type ml.m4.xlarge \
                       --tag 1.0

Contributing

Please read CONTRIBUTING.md for details on our code of conduct, and the process for submitting pull requests to us.

License

SageMaker XGBoost Framework Container is licensed under the Apache 2.0 License. It is copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. The license is available at: http://aws.amazon.com/apache2.0/