PIM stack for triton inference server to run AI/ML applications #25
base: main
Conversation
manju956 commented on Sep 9, 2025
- Downloads the ONNX model and config.pbtxt file from a hosted HTTP server
- Brings up the Triton inference server to serve the ONNX model for the fraud detection example
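Once the server is up, a quick way to confirm it is serving is Triton's standard readiness endpoint (a minimal sketch; the host and port assume Triton's HTTP defaults):

```shell
# Returns HTTP 200 when the server and its models are ready.
# Replace localhost with the PIM partition's IP if probing remotely.
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8000/v2/health/ready
```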
Force-pushed caec625 to 57262c2
@dharaneeshvrd Added e2e flow steps to run the Triton server with example AI apps. Also updated the PR title to be generic.
examples/tritonserver/Containerfile
Outdated
| @@ -0,0 +1,8 @@
| FROM na.artifactory.swg-devops.com/sys-pcloud-docker-local/devops/pim/base
Let's use quay image instead of jfrog registry like in other examples.
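A minimal sketch of what that swap could look like (the quay.io path below is an assumption, not the project's actual registry):

```dockerfile
# Hypothetical base-image reference; substitute the real quay.io path used by the other examples.
FROM quay.io/<org>/pim-base:latest
```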
examples/tritonserver/README.md
Outdated
| ## Step 1: Building the images
| Build the container image for the AI example application covered in [ai-demos](https://github.com/PDeXchange/ai-demos) using the [build-steps](app/README.md)
| To reuse the built container image, push it to a container registry.
| ```shell
| podman push <registry>/build_env
| ```
|
| ## Step 2: Train the model
| The model with ONNX runtime can be trained by running the container image built in Step 1. Follow the [training steps](app/README.md)
| After training completes successfully, the model (model.onnx) and config (config.pbtxt) files will be available in **<current_dir>/app/model_repository/fraud**
| ## Step 1: Building the images
| Build the container image for the AI example application covered in [ai-demos](https://github.com/PDeXchange/ai-demos) using the [build-steps](app/README.md)
| To reuse the built container image, push it to a container registry.
| ```shell
| podman push <registry>/build_env
| ```
| ## Step 2: Train the model
| The model with ONNX runtime can be trained by running the container image built in Step 1. Follow the [training steps](app/README.md)
| After training completes successfully, the model (model.onnx) and config (config.pbtxt) files will be available in **<current_dir>/app/model_repository/fraud**
| ### Step 1: Preparing the model and config file
| As mentioned earlier, the Triton inference server can be used to serve any machine learning model with its respective configuration file stored in the model repository. You can build the model and config for your own use case. To showcase the e2e flow of the Triton inference server deployment from PIM, we will be utilising the existing ai-demos provided in https://github.com/PDeXchange/ai-demos. Please follow the below steps to build the model and config file.
| ### Using https://github.com/PDeXchange/ai-demos
| #### Step I: Building the image
| To easily train the model with the provided Python application, we have built the Containerfile with the necessary packages and tools to run the Python application, which can train the model for you. Build the container image for the AI example application covered in [ai-demos](https://github.com/PDeXchange/ai-demos) using the [build-steps](app/README.md)
| To reuse the built container image, push it to a container registry.
| `podman push <registry>/build_env`
| #### Step II: Train the model
| The model with ONNX runtime can be trained by running the container image built in Step I. Follow the [training steps](app/README.md)
| After training completes successfully, the model (model.onnx) and config (config.pbtxt) files will be available in **<current_dir>/app/model_repository/fraud**
I was expecting something like this; please rephrase and improve it as required.
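For illustration, the e2e build-and-train flow could look roughly like this (the image tag, mount path, and training entrypoint are assumptions; the authoritative steps are in app/README.md):

```shell
# Hypothetical flow: build the training image, optionally push it, then run it.
# Assumes the training app writes model.onnx and config.pbtxt into
# app/model_repository/fraud on the host via the bind mount.
podman build -t <registry>/build_env app/
podman push <registry>/build_env
podman run --rm -v "$(pwd)/app/model_repository:/workspace/model_repository:Z" \
  <registry>/build_env
```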
examples/tritonserver/README.md
Outdated
| "configSource": "http://<Host/IP>/fraud_detection/config.pbtxt" | ||
| } | ||
| ``` | ||
| Both of the model files will be available in `<current_dir>/model_repository/fraud` dir on the machine wher you have trained the model. Store these files in a simple HTTP server and pass the URI path to the PIM partition like above. |
| Both of the model files will be available in `<current_dir>/model_repository/fraud` dir on the machine wher you have trained the model. Store these files in a simple HTTP server and pass the URI path to the PIM partition like above.
| Both of the model files will be available in `<current_dir>/model_repository/fraud` dir on the machine where you have trained the model. Store these files in a simple HTTP server and pass the URI path to the PIM partition like above.
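As a minimal sketch of the "simple HTTP server" part (the port and serving directory are arbitrary choices here; the modelSource/configSource URIs above must be adjusted to match):

```shell
# Serve the trained artifacts over plain HTTP so the PIM partition can download them.
cd <current_dir>/model_repository && python3 -m http.server 8080
```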
examples/tritonserver/README.md
Outdated
| ```
|
| ### Validate AI application functionality
| To verify the AI example application served from the Triton server, apply the below specified configurations in [config.ini](../../config.ini).
Explain this generically, and mention that if you have used ai-demos/fraud-detection, the below payload can be used to validate it.
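For reference, a validation request against Triton's KServe v2 HTTP API could look like this (the model name, input tensor name, shape, and feature values are assumptions; they must match the model's config.pbtxt):

```shell
# Hypothetical fraud-detection inference call against Triton's default HTTP port.
curl -s -X POST http://localhost:8000/v2/models/fraud/infer \
  -H "Content-Type: application/json" \
  -d '{"inputs":[{"name":"input","shape":[1,4],"datatype":"FP32","data":[[0.1,0.2,0.3,0.4]]}]}'
```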
examples/tritonserver/README.md
Outdated
| }
| ```
|
| ### Build PIM triton server
This needs to be provided before the partition setup section.
| EnvironmentFile=/etc/pim/env.conf
|
| [Container]
| Image=na.artifactory.swg-devops.com/sys-linux-power-team-ftp3distro-docker-images-docker-local/tritonserver:latest
Same as the previous comment regarding using the jfrog registry.
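A minimal sketch of the quadlet unit with a quay-hosted image (the registry path, published port, and section layout are assumptions mirroring the snippet above):

```ini
# tritonserver.container (hypothetical quadlet unit)
[Service]
EnvironmentFile=/etc/pim/env.conf

[Container]
# Substitute the project's real quay.io image path.
Image=quay.io/<org>/tritonserver:latest
PublishPort=8000:8000

[Install]
WantedBy=multi-user.target
```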
Signed-off-by: Manjunath-A-C <[email protected]>
Force-pushed 5182613 to 3cf2ff3
Signed-off-by: Manjunath-A-C <[email protected]>
Force-pushed 3cf2ff3 to fbaa642
| @@ -0,0 +1,25 @@
| # Triton server
|
| [Triton server](https://github.com/triton-inference-server/server) can be used to run inference on AI workloads using machine learning models. Some pre-built example AI workloads, like fraud detection and Iris classification, are covered in the [ai-demos-repo](https://github.com/PDeXchange/ai-demos). Users can utilise them to try out the Triton inference server.
Please mention somewhere that currently only the fraud detection example is supported via the provided script.
With the recent changes in containerization of the ai-demos apps, the Triton server supports both examples, so this line is not needed.
| # Triton
|
| Triton inference server can be used to serve machine learning or deep learning models, such as classification and regression models, on CPU/GPU platforms.
| Triton inference server is built on top of the base image [here](../../base-image/)
I think this line might not be required if you address the below comment.
Signed-off-by: Manjunath-A-C <[email protected]>