PaddlePaddle / Serving

License: apache-2.0
A flexible, high-performance carrier for machine learning models (the PaddlePaddle『飞桨』serving deployment framework)

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Serving

Bodywork Core
Deploy machine learning projects developed in Python to Kubernetes. Accelerated MLOps 🚀
Stars: ✭ 145 (-64.02%)
Mutual labels:  pipeline, serving
Pex Context
Modern WebGL state wrapper for PEX: allocate GPU resources (textures, buffers), setup state pipelines and passes, and combine them into commands.
Stars: ✭ 117 (-70.97%)
Mutual labels:  pipeline, gpu
Imageai
A python library built to empower developers to build applications and systems with self-contained Computer Vision capabilities
Stars: ✭ 6,734 (+1570.97%)
Mutual labels:  gpu, prediction
lineage
Generate beautiful documentation for your data pipelines in markdown format
Stars: ✭ 16 (-96.03%)
Mutual labels:  pipeline, dag
STOCK-RETURN-PREDICTION-USING-KNN-SVM-GUASSIAN-PROCESS-ADABOOST-TREE-REGRESSION-AND-QDA
Forecast stock prices using a machine learning approach: a time-series analysis that employs predictive modeling to forecast stock returns, an approach used by hedge funds to select tradeable stocks.
Stars: ✭ 94 (-76.67%)
Mutual labels:  pipeline, prediction
Open Solution Toxic Comments
Open solution to the Toxic Comment Classification Challenge
Stars: ✭ 154 (-61.79%)
Mutual labels:  pipeline, prediction
Mlbox
MLBox is a powerful Automated Machine Learning python library.
Stars: ✭ 1,199 (+197.52%)
Mutual labels:  pipeline, prediction
bodywork-ml-pipeline-project
Deployment template for a continuous training pipeline.
Stars: ✭ 22 (-94.54%)
Mutual labels:  pipeline, serving
sagemaker-sparkml-serving-container
This code is used to build & run a Docker container for performing predictions against a Spark ML Pipeline.
Stars: ✭ 44 (-89.08%)
Mutual labels:  pipeline, serving
bifrost
A stream processing framework for high-throughput applications.
Stars: ✭ 48 (-88.09%)
Mutual labels:  pipeline, gpu
Ilgpu
ILGPU JIT Compiler for high-performance .Net GPU programs
Stars: ✭ 374 (-7.2%)
Mutual labels:  gpu
Neuraxle
A Sklearn-like Framework for Hyperparameter Tuning and AutoML in Deep Learning projects. Finally have the right abstractions and design patterns to properly do AutoML. Let your pipeline steps have hyperparameter spaces. Enable checkpoints to cut duplicate calculations. Go from research to production environment easily.
Stars: ✭ 377 (-6.45%)
Mutual labels:  pipeline
Cloud Gpus
This repository contains information about Cloud GPU offerings for Machine Learning practitioners.
Stars: ✭ 395 (-1.99%)
Mutual labels:  gpu
Oceananigans.jl
🌊 Fast and friendly fluid dynamics on CPUs and GPUs
Stars: ✭ 400 (-0.74%)
Mutual labels:  gpu
Cuda.jl
CUDA programming in Julia.
Stars: ✭ 370 (-8.19%)
Mutual labels:  gpu
Go Spacemesh
Go Implementation of the Spacemesh protocol full node. 💾⏰💪
Stars: ✭ 389 (-3.47%)
Mutual labels:  dag
Piper
piper - a distributed workflow engine
Stars: ✭ 374 (-7.2%)
Mutual labels:  pipeline
Stats
macOS system monitor in your menu bar
Stars: ✭ 7,134 (+1670.22%)
Mutual labels:  gpu
Kashti
Kashti is a dashboard for your Brigade pipelines.
Stars: ✭ 370 (-8.19%)
Mutual labels:  pipeline
Gpuvideo Android
This library applies video filters when generating an MP4 and to ExoPlayer video, and supports video recording with Camera2.
Stars: ✭ 403 (+0%)
Mutual labels:  gpu

(简体中文|English)

Motivation

We believe that serving deep learning inference online will be a core part of user-facing applications in the future. The goal of this project: once you have trained a deep neural network with Paddle, you can also deploy the model online with ease.

Some Key Features of Paddle Serving

  • Integrates seamlessly with the Paddle training pipeline; most Paddle models can be deployed with a single command.
  • Supports industrial serving features such as model management, online loading, and online A/B testing.
  • Supports highly concurrent and efficient communication between clients and servers.
  • Supports multiple programming languages on the client side, such as C++, Python, and Java.

AIStudio Tutorial

Here we provide a tutorial on AIStudio (Chinese version): AIStudio教程-Paddle Serving服务化部署框架 (AIStudio tutorial for the Paddle Serving deployment framework).

The tutorial covers:

  • Paddle Serving Environment Setup
    • Running in docker images
    • pip install Paddle Serving
  • Quick Experience of Paddle Serving
  • Advanced Tutorial of Model Deployment
    • Save/Convert Models for Paddle Serving
    • Setup Online Inference Service
  • Paddle Serving Examples
    • Paddle Serving for Detections
    • Paddle Serving for OCR

Installation

We strongly recommend running Paddle Serving in Docker; please visit Run in Docker. See the document for additional Docker images.

Attention: Currently, the default GPU environment of paddlepaddle 2.0 is CUDA 10.2, so the GPU Docker sample code below is based on CUDA 10.2. We also provide Docker images and whl packages for other GPU environments; if you use a different environment, please carefully check and select the appropriate version.

# Run CPU Docker
docker pull registry.baidubce.com/paddlepaddle/serving:0.5.0-devel
docker run -p 9292:9292 --name test -dit registry.baidubce.com/paddlepaddle/serving:0.5.0-devel bash
docker exec -it test bash
git clone https://github.com/PaddlePaddle/Serving
# Run GPU Docker
nvidia-docker pull registry.baidubce.com/paddlepaddle/serving:0.5.0-cuda10.2-cudnn8-devel
nvidia-docker run -p 9292:9292 --name test -dit registry.baidubce.com/paddlepaddle/serving:0.5.0-cuda10.2-cudnn8-devel bash
nvidia-docker exec -it test bash
git clone https://github.com/PaddlePaddle/Serving
pip install paddle-serving-client==0.5.0
pip install paddle-serving-server==0.5.0 # CPU
pip install paddle-serving-app==0.3.0
# DO NOT RUN ALL OF THE FOLLOWING GPU COMMANDS! Check your GPU environment and select the right one.
pip install paddle-serving-server-gpu==0.5.0.post102 # GPU with CUDA10.2 + TensorRT7
pip install paddle-serving-server-gpu==0.5.0.post9 # GPU with CUDA9.0
pip install paddle-serving-server-gpu==0.5.0.post10 # GPU with CUDA10.0
pip install paddle-serving-server-gpu==0.5.0.post101 # GPU with CUDA10.1 + TensorRT6
pip install paddle-serving-server-gpu==0.5.0.post11 # GPU with CUDA11.0 + TensorRT7

You may need to use a domestic mirror to speed up the download (in China, you can use the Tsinghua mirror by adding -i https://pypi.tuna.tsinghua.edu.cn/simple to the pip command).

If you need to install packages built from the develop branch, please download them from the latest packages list and install them with pip. If you want to compile Paddle Serving yourself, please refer to How to compile Paddle Serving?

The paddle-serving-server and paddle-serving-server-gpu packages support CentOS 6/7, Ubuntu 16/18, and Windows 10.

The paddle-serving-client and paddle-serving-app packages support Linux and Windows, but paddle-serving-client requires Python 2.7/3.5/3.6/3.7/3.8.

We recommend installing paddle >= 2.0.0:

# CPU users, please run
pip install paddlepaddle==2.0.0

# GPU users with CUDA 10.2, please run
pip install paddlepaddle-gpu==2.0.0 

Note: If your CUDA version is not 10.2, please do not execute the above command directly; instead, refer to the Paddle official documentation-multi-version whl package list.

Select the URL for your GPU environment and install it. For example, Python 2.7 users with CUDA 9.0 should select cp27-cp27mu and the URL corresponding to cuda9.0_cudnn7-mkl, copy it, and run:

pip install https://paddle-wheel.bj.bcebos.com/2.0.0-gpu-cuda9-cudnn7-mkl/paddlepaddle_gpu-2.0.0.post90-cp27-cp27mu-linux_x86_64.whl

The default paddlepaddle-gpu==2.0.0 is built for CUDA 10.2 without TensorRT. If you want to install PaddlePaddle with TensorRT support, please also check the documentation-multi-version whl package list and look for the keyword cuda10.2-cudnn8.0-trt7.1.3. For more information, see Paddle Serving uses TensorRT.

For any other environment or Python version, please find the corresponding link in the table and install it with pip.
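
Whichever package you choose, you can sanity-check the installation from Python. A minimal check (paddle.utils.run_check() is PaddlePaddle's built-in installation self-test):

import paddle

print(paddle.__version__)  # expect 2.0.0
# runs a small program to verify that PaddlePaddle works on this machine
paddle.utils.run_check()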

For Windows users, please read the document Paddle Serving for Windows Users.

Quick Start Example

This quick start example is mainly for users who already have a model to deploy; we also provide a model that can be used for deployment. If you want to learn the complete process from offline training to online serving, please refer to the AIStudio tutorial above.

Boston House Price Prediction model

Enter the Serving git directory and change to the fit_a_line example directory:

cd Serving/python/examples/fit_a_line
sh get_data.sh

Paddle Serving provides both HTTP- and RPC-based services for users to access.

RPC service

A user can start an RPC service with paddle_serving_server.serve. An RPC service is usually faster than an HTTP service, although the user needs to do some coding against Paddle Serving's Python client API. Note that we do not specify --name here.

python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9292

Argument                     Type  Default  Description
thread                       int   4        Concurrency of the current service
port                         int   9292     Exposed port of the current service to users
model                        str   ""       Path of the Paddle model directory to be served
mem_optim_off                -     -        Disable memory / graphics-memory optimization
ir_optim                     -     -        Enable analysis and optimization of the computation graph
use_mkl (CPU version only)   -     -        Run inference with MKL
use_trt (TRT version only)   -     -        Run inference with TensorRT
use_lite (ARM only)          -     -        Run PaddleLite inference
use_xpu (ARM + XPU only)     -     -        Run PaddleLite XPU inference

# A user can access the RPC service through the paddle_serving_client API
from paddle_serving_client import Client
import numpy as np

client = Client()
# load the client-side configuration saved together with the servable model
client.load_client_config("uci_housing_client/serving_client_conf.prototxt")
# connect to the RPC server started above
client.connect(["127.0.0.1:9292"])
# one sample with the 13 normalized Boston housing features
data = [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727,
        -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]
fetch_map = client.predict(feed={"x": np.array(data).reshape(1, 13, 1)}, fetch=["price"])
print(fetch_map)
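# For reference, a typical fetch_map for this model looks like
# {'price': array([[18.901152]], dtype=float32)}  (exact values depend on the trained model)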

Here, the client.predict function takes two arguments: feed is a Python dict mapping model input variable alias names to values, and fetch names the prediction variables to be returned from the server. In this example, the names "x" and "price" were assigned when the servable model was saved during training.
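
Since fetch_map is a plain Python dict of numpy arrays, the returned value can be unpacked directly. A minimal sketch, continuing from the client snippet above:

# pull the scalar prediction out of the (1, 1) "price" array
price = float(fetch_map["price"][0][0])
print("predicted price:", price)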

WEB service

Users can also put the data-format processing logic on the server side, so that the service can be accessed directly with curl. See the following example, located at python/examples/fit_a_line:

python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9292 --name uci

On the client side, run:

curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"x": [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727, -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]}], "fetch":["price"]}' http://127.0.0.1:9292/uci/prediction

The response is:

{"result":{"price":[[18.901151657104492]]}}

Document

New to Paddle Serving

Developers

About Efficiency

Design

FAQ

Community

Slack

To connect with other users and contributors, you are welcome to join our Slack channel.

Contribution

If you want to contribute code to Paddle Serving, please refer to the Contribution Guidelines.

  • Special thanks to @BeyondYourself for completing the gRPC tutorial, updating the FAQ doc, and fixing the mkdir command
  • Special thanks to @mcl-stone for updating the faster_rcnn benchmark
  • Special thanks to @cg82616424 for updating the unet benchmark and fixing the resize comment error

Feedback

For any feedback or to report a bug, please open a GitHub Issue.

License

Apache 2.0 License
