This example deploys a ResNet model and invokes a client to get prediction results.
We have already built a ResNet50 model into the container image seedjeffwan/tensorflow-serving-gpu:resnet. You can build your own image using the Dockerfile.
$ kubectl apply -f resnet.yaml
Since we plan to use ClusterIP for the model service, we will create a client inside the cluster to communicate with it.
$ kubectl apply -f client.yaml
Enter the Python client pod we created.
$ kubectl exec -it python-client bash
$ apt update && apt install -y vim && pip install requests
Prepare the model client by copying the script from resnet_client.py.
$ vim client.py
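For reference, below is a minimal sketch of what client.py could look like, based on the standard TensorFlow Serving REST API. The service name (resnet-service), REST port (8501), model name (resnet), and test image URL are assumptions; adjust them to match resnet.yaml and the actual resnet_client.py. It also assumes the model was exported to accept base64-encoded JPEG input, as in the official TensorFlow Serving ResNet example.

import base64
import requests

# Assumed values: change the service name, port, and model name to match
# the Service defined in resnet.yaml.
SERVER_URL = 'http://resnet-service:8501/v1/models/resnet:predict'
# Example test image; any JPEG works.
IMAGE_URL = 'https://tensorflow.org/images/blogs/serving/cat.jpg'

def main():
    # Download the test image and wrap it in the TensorFlow Serving REST
    # request format (base64-encoded JPEG bytes).
    jpeg_bytes = base64.b64encode(requests.get(IMAGE_URL).content).decode('utf-8')
    payload = {'instances': [{'b64': jpeg_bytes}]}

    # Send the predict request and print the returned class index.
    response = requests.post(SERVER_URL, json=payload)
    response.raise_for_status()
    prediction = response.json()['predictions'][0]
    print('Prediction class: {}'.format(prediction['classes']))

if __name__ == '__main__':
    main()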
Invoke model prediction. The first call will take some time to warm up; the rest of the calls will be stable.
$ python client.py
Prediction class: 286, avg latency: 26.615 ms
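The warm-up behaviour can be made explicit with a small timing loop like the sketch below: skip a few initial calls, then average the elapsed time of the remaining requests. The warm-up and request counts are arbitrary choices, and SERVER_URL/payload refer to the client sketch above; this is not the exact logic of resnet_client.py.

import time
import requests

def measure(server_url, payload, warmup=3, runs=10):
    # The first calls are slower while the model warms up, so exclude them
    # from the average.
    for _ in range(warmup):
        requests.post(server_url, json=payload)

    total = 0.0
    for _ in range(runs):
        start = time.time()
        response = requests.post(server_url, json=payload)
        response.raise_for_status()
        total += time.time() - start

    prediction = response.json()['predictions'][0]
    print('Prediction class: {}, avg latency: {:.3f} ms'.format(
        prediction['classes'], total / runs * 1000))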