This guide introduces how to run a model serving job on OpenPAI. A serving system for machine learning models is designed for production environments, making it easy to deploy new algorithms and experiments to users. The following content shows a basic model serving example; other customized serving code can be run similarly.
To run TensorFlow model serving, you need to prepare a job configuration file and submit it through the webportal.
OpenPAI packages the Docker environment required by the job for users. Refer to DOCKER.md to customize this example Docker environment. If you have built a customized image and pushed it to Docker Hub, replace our pre-built image openpai/pai.example.tensorflow-serving with your own.
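If you prefer to script the image customization step, the build and push can also be driven from Python with the Docker SDK. The sketch below is illustrative only: your-dockerhub-user/pai.example.tensorflow-serving is a hypothetical placeholder, and it assumes a Dockerfile prepared as described in DOCKER.md sits in the current directory and that you are already logged in to Docker Hub.

import docker

# Illustrative sketch: build a customized serving image and push it to
# Docker Hub using the Docker SDK for Python (pip install docker).
# "your-dockerhub-user/pai.example.tensorflow-serving" is a hypothetical
# placeholder; the Dockerfile is assumed to follow DOCKER.md, and
# "docker login" is assumed to have been run already.
client = docker.from_env()

# Build the image from the Dockerfile in the current directory.
image, build_logs = client.images.build(
    path=".",
    tag="your-dockerhub-user/pai.example.tensorflow-serving",
)

# Push the customized image so OpenPAI nodes can pull it.
for line in client.images.push(
    "your-dockerhub-user/pai.example.tensorflow-serving",
    stream=True,
    decode=True,
):
    print(line)

After the push succeeds, point the "image" field of the job configuration below at your own image name.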
Here is a configuration file example:
{
  "jobName": "tensorflow-serving",
  "image": "openpai/pai.example.tensorflow-serving",
  "taskRoles": [
    {
      "name": "serving",
      "taskNumber": 1,
      "cpuNumber": 4,
      "memoryMB": 8192,
      "gpuNumber": 1,
      "portList": [
        {
          "label": "model_server",
          "beginAt": 0,
          "portNumber": 1
        }
      ],
      "command": "bazel-bin/tensorflow_serving/example/mnist_saved_model /tmp/mnist_model && while :; do tensorflow_model_server --port=$PAI_CONTAINER_HOST_model_server_PORT_LIST --model_name=mnist --model_base_path=/tmp/mnist_model; done"
    }
  ],
  "retryCount": -2
}
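The command first exports the MNIST demo model to /tmp/mnist_model, then runs tensorflow_model_server on the port OpenPAI allocates for the model_server label (exposed to the container through the PAI_CONTAINER_HOST_model_server_PORT_LIST environment variable); the endless loop restarts the server if it ever exits, keeping the serving endpoint long-running. Once the job is up, the model can be queried over gRPC. Below is a minimal client sketch, assuming a recent tensorflow-serving-api Python package is installed; SERVER_HOST and SERVER_PORT are hypothetical placeholders for the container host and allocated port of your job, and predict_images / images / scores are the signature, input, and output names used by the TensorFlow Serving MNIST example.

import grpc
import numpy as np
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc

# Hypothetical placeholders: replace with the container host and the port
# allocated for the "model_server" label of your job.
SERVER_HOST = "10.0.0.1"
SERVER_PORT = 4000

# Open an insecure gRPC channel to the model server and create a stub.
channel = grpc.insecure_channel("%s:%d" % (SERVER_HOST, SERVER_PORT))
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

# Build a Predict request against the "mnist" model served above.
# "predict_images" and "images" are the signature and input names used by
# the TensorFlow Serving MNIST example.
request = predict_pb2.PredictRequest()
request.model_spec.name = "mnist"
request.model_spec.signature_name = "predict_images"

# A dummy 28x28 image flattened to 784 floats, just to exercise the API.
image = np.zeros(784, dtype=np.float32)
request.inputs["images"].CopyFrom(tf.make_tensor_proto(image, shape=[1, 784]))

# Issue the request with a 10-second deadline and print the class scores.
result = stub.Predict(request, 10.0)
print(result.outputs["scores"])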
For more details on how to write a job configuration file, please refer to the job tutorial.