PIM stack for triton inference server to run AI/ML applications #25
base: main
Conversation
manju956 commented on Sep 9, 2025
- Downloads the ONNX model and config.pbtxt file from a hosted HTTP server
- Brings up the Triton inference server to serve the ONNX model for the fraud detection example
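Once the server is up, a quick way to confirm it is serving is Triton's standard readiness endpoint (a minimal sketch; the host and port assume Triton's HTTP defaults):

```shell
# Returns HTTP 200 when the server and its models are ready.
# Replace localhost with the PIM partition's IP if probing remotely.
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8000/v2/health/ready
```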
Force-pushed caec625 to 57262c2
@dharaneeshvrd Added e2e flow steps to run the Triton server with example AI apps. Also updated the PR title to be generic.
examples/tritonserver/Containerfile
Outdated
| @@ -0,0 +1,8 @@
| FROM na.artifactory.swg-devops.com/sys-pcloud-docker-local/devops/pim/base
Let's use quay image instead of jfrog registry like in other examples.
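A minimal sketch of what that swap could look like (the quay.io path below is an assumption, not the project's actual registry):

```dockerfile
# Hypothetical base-image reference; substitute the real quay.io path used by the other examples.
FROM quay.io/<org>/pim-base:latest
```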
examples/tritonserver/README.md
Outdated
| ## Step 1: Building the images
| Build the container image for the AI example application covered in [ai-demos](https://github.com/PDeXchange/ai-demos) using the [build-steps](app/README.md)
| To reuse the built container image, push it to a container registry.
| ```shell
| podman push <registry>/build_env
| ```
|
| ## Step 2: Train the model
| The model with ONNX runtime can be trained by running the container image built in Step 1. Follow the [training steps](app/README.md)
| After training completes successfully, the model (model.onnx) and config (config.pbtxt) files will be available in **<current_dir>/app/model_repository/fraud**
| ## Step 1: Building the images
| Build the container image for the AI example application covered in [ai-demos](https://github.com/PDeXchange/ai-demos) using the [build-steps](app/README.md)
| To reuse the built container image, push it to a container registry.
| ```shell
| podman push <registry>/build_env
| ```
| ## Step 2: Train the model
| The model with ONNX runtime can be trained by running the container image built in Step 1. Follow the [training steps](app/README.md)
| After training completes successfully, the model (model.onnx) and config (config.pbtxt) files will be available in **<current_dir>/app/model_repository/fraud**
| ### Step 1: Preparing the model and config file
| As mentioned earlier, the Triton inference server can be used to serve any machine learning model with its respective configuration file stored in the model repository. You can build the model and config for your own use case. To showcase the e2e flow of the Triton inference server deployment from PIM, we will be utilising the existing ai-demos provided in https://github.com/PDeXchange/ai-demos. Please follow the below steps to build the model and config file.
| ### Using https://github.com/PDeXchange/ai-demos
| #### Step I: Building the image
| To easily train the model with the provided Python application, we have built the Containerfile with the necessary packages and tools to run the Python application, which can train the model for you. Build the container image for the AI example application covered in [ai-demos](https://github.com/PDeXchange/ai-demos) using the [build-steps](app/README.md)
| To reuse the built container image, push it to a container registry.
| `podman push <registry>/build_env`
| #### Step II: Train the model
| The model with ONNX runtime can be trained by running the container image built in Step I. Follow the [training steps](app/README.md)
| After training completes successfully, the model (model.onnx) and config (config.pbtxt) files will be available in **<current_dir>/app/model_repository/fraud**
I was expecting something like this; please rephrase and improve it as required.
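For illustration, the e2e build-and-train flow could look roughly like this (the image tag, mount path, and training entrypoint are assumptions; the authoritative steps are in app/README.md):

```shell
# Hypothetical flow: build the training image, optionally push it, then run it.
# Assumes the training app writes model.onnx and config.pbtxt into
# app/model_repository/fraud on the host via the bind mount.
podman build -t <registry>/build_env app/
podman push <registry>/build_env
podman run --rm -v "$(pwd)/app/model_repository:/workspace/model_repository:Z" \
  <registry>/build_env
```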
examples/tritonserver/README.md
Outdated
| "configSource": "http://<Host/IP>/fraud_detection/config.pbtxt" | ||
| } | ||
| ``` | ||
| Both of the model files will be available in `<current_dir>/model_repository/fraud` dir on the machine wher you have trained the model. Store these files in a simple HTTP server and pass the URI path to the PIM partition like above. |
| Both of the model files will be available in `<current_dir>/model_repository/fraud` dir on the machine wher you have trained the model. Store these files in a simple HTTP server and pass the URI path to the PIM partition like above.
| Both of the model files will be available in `<current_dir>/model_repository/fraud` dir on the machine where you have trained the model. Store these files in a simple HTTP server and pass the URI path to the PIM partition like above.
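As a minimal sketch of the "simple HTTP server" part (the port and serving directory are arbitrary choices here; the modelSource/configSource URIs above must be adjusted to match):

```shell
# Serve the trained artifacts over plain HTTP so the PIM partition can download them.
cd <current_dir>/model_repository && python3 -m http.server 8080
```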
examples/tritonserver/README.md
Outdated
| ```
|
| ### Validate AI application functionality
| To verify the AI example application served from the Triton server, apply the below specified configurations in [config.ini](../../config.ini).
Explain this generically, and mention that if you have used ai-demos/fraud-detection, the below payload can be used to validate it.
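For reference, a validation request against Triton's KServe v2 HTTP API could look like this (the model name, input tensor name, shape, and feature values are assumptions; they must match the model's config.pbtxt):

```shell
# Hypothetical fraud-detection inference call against Triton's default HTTP port.
curl -s -X POST http://localhost:8000/v2/models/fraud/infer \
  -H "Content-Type: application/json" \
  -d '{"inputs":[{"name":"input","shape":[1,4],"datatype":"FP32","data":[[0.1,0.2,0.3,0.4]]}]}'
```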
examples/tritonserver/README.md
Outdated
| }
| ```
|
| ### Build PIM triton server
This needs to be provided before the partition setup section.
| EnvironmentFile=/etc/pim/env.conf
|
| [Container]
| Image=na.artifactory.swg-devops.com/sys-linux-power-team-ftp3distro-docker-images-docker-local/tritonserver:latest
Same as the previous comment regarding using the jfrog registry.
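A minimal sketch of the quadlet unit with a quay-hosted image (the registry path, published port, and section layout are assumptions mirroring the snippet above):

```ini
# tritonserver.container (hypothetical quadlet unit)
[Service]
EnvironmentFile=/etc/pim/env.conf

[Container]
# Substitute the project's real quay.io image path.
Image=quay.io/<org>/tritonserver:latest
PublishPort=8000:8000

[Install]
WantedBy=multi-user.target
```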
Signed-off-by: Manjunath-A-C <[email protected]>
Force-pushed 5182613 to 3cf2ff3
Signed-off-by: Manjunath-A-C <[email protected]>
Force-pushed 3cf2ff3 to fbaa642
| @@ -0,0 +1,25 @@
| # Triton server
|
| [Triton server](https://github.com/triton-inference-server/server) can be used to run inference on AI workloads using machine learning models. Some pre-built example AI workloads, like fraud detection and Iris classification, are covered in the [ai-demos-repo](https://github.com/PDeXchange/ai-demos). Users can utilise them to try out the Triton inference server.
Please mention somewhere that currently only the fraud detection example is supported via the provided script.
With the recent changes in containerization of the ai-demos apps, the Triton server supports both examples, so this line is not needed.
| # Triton
|
| Triton inference server can be used to serve machine learning or deep learning models, such as classification and regression models, on CPU/GPU platforms.
| Triton inference server is built on top of the base image [here](../../base-image/)
I think this line might not be required if you address the below comment.
Signed-off-by: Manjunath-A-C <[email protected]>