Conversation

@manju956 manju956 commented Sep 9, 2025

  • Downloads the ONNX model and `config.pbtxt` file from a hosted HTTP server
  • Brings up the Triton inference server to serve the ONNX model for the fraud detection example
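The two bullets above can be sketched as a small shell flow. Everything here is illustrative: `MODEL_HOST` and the "fraud" model name are placeholders rather than values from this PR, and the directory layout is the standard one Triton expects (`config.pbtxt` next to a numbered version directory holding `model.onnx`):

```shell
# Illustrative sketch only: MODEL_HOST and the "fraud" model name are
# placeholders, not values taken from this PR.
set -e
REPO=model_repository
# Triton's expected layout: <repo>/<model>/config.pbtxt and <repo>/<model>/<version>/model.onnx
mkdir -p "$REPO/fraud/1"
# In the real flow these files come from the hosted HTTP server, e.g.:
#   curl -fsSL "http://$MODEL_HOST/fraud_detection/model.onnx"   -o "$REPO/fraud/1/model.onnx"
#   curl -fsSL "http://$MODEL_HOST/fraud_detection/config.pbtxt" -o "$REPO/fraud/config.pbtxt"
touch "$REPO/fraud/1/model.onnx" "$REPO/fraud/config.pbtxt"  # stand-ins for the sketch
# Then serve the repository (HTTP 8000, gRPC 8001, metrics 8002 by default):
#   tritonserver --model-repository="$PWD/$REPO"
```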

@manju956 manju956 changed the title [WIP] PIM stack for fraud detection usecase using triton inference server PIM stack for fraud detection usecase using triton inference server Sep 12, 2025
@manju956 manju956 self-assigned this Sep 12, 2025
@manju956 manju956 added the enhancement New feature or request label Sep 12, 2025

@dharaneeshvrd dharaneeshvrd left a comment


@manju956
can you please update the examples/tritonserver/README.md with steps to run the e2e flow like this?

Also rename PR title to remove fraud detection?

@manju956 manju956 changed the title PIM stack for fraud detection usecase using triton inference server PIM stack for triton inference server to run AI/ML applications Sep 24, 2025
@manju956

> @manju956 can you please update the examples/tritonserver/README.md with steps to run the e2e flow like this?
>
> Also rename PR title to remove fraud detection?

@dharaneeshvrd Added e2e flow steps to run the Triton server with example AI apps. Also updated the PR title to be generic.

@@ -0,0 +1,8 @@
FROM na.artifactory.swg-devops.com/sys-pcloud-docker-local/devops/pim/base

Let's use quay image instead of jfrog registry like in other examples.

Comment on lines 8 to 17
## Step 1: Building the images
Build container image for AI example application covered in [ai-demos](https://github.com/PDeXchange/ai-demos) using [build-steps](app/README.md)
To reuse the built container image, push the built image to container registry.
```shell
podman push <registry>/build_env
```

## Step2: Train the model
Model with ONNX runtime can be trained by running the container image built in Step 1. Follow the [training steps](app/README.md)
After the successful training completion, model(mode.onnx) and config(config.pbtxt) files will be available in path **<current_dir>/app/model_repository/fraud**

Suggested change
## Step 1: Building the images
Build container image for AI example application covered in [ai-demos](https://github.com/PDeXchange/ai-demos) using [build-steps](app/README.md)
To reuse the built container image, push the built image to container registry.
```shell
podman push <registry>/build_env
```
## Step2: Train the model
Model with ONNX runtime can be trained by running the container image built in Step 1. Follow the [training steps](app/README.md)
After the successful training completion, model(mode.onnx) and config(config.pbtxt) files will be available in path **<current_dir>/app/model_repository/fraud**
### Step 1: Preparing the model and config file
As mentioned earlier triton inference server can be used to serve any machine learning models with their respective configuration files stored in model repository. You can build your model and config for your use case. To show case the e2e flow of triton inference server deployment from PIM, we would be utilising the existing ai-demos provided in https://github.com/PDeXchange/ai-demos. Please follow below steps to build the model and config file.
### Using https://github.com/PDeXchange/ai-demos
#### Step I: Building the image
To easily train the model with the provided python application, we have built the Containerfile with the necessary packages and tools to run the python application which can train the model for you. Build the container image for AI example application covered in [ai-demos](https://github.com/PDeXchange/ai-demos) using [build-steps](app/README.md)
To reuse the built container image, push the built image to container registry.
`podman push <registry>/build_env`
#### Step II: Train the model
Model with ONNX runtime can be trained by running the container image built in Step I. Follow the [training steps](app/README.md)
After the successful training completion, model(mode.onnx) and config(config.pbtxt) files will be available in path **<current_dir>/app/model_repository/fraud**

I was expecting something like this; please rephrase and improve it as required.

"configSource": "http://<Host/IP>/fraud_detection/config.pbtxt"
}
```
Both of the model files will be available in `<current_dir>/model_repository/fraud` dir on the machine wher you have trained the model. Store these files in a simple HTTP server and pass the URI path to the PIM partition like above.

Suggested change
Both of the model files will be available in `<current_dir>/model_repository/fraud` dir on the machine wher you have trained the model. Store these files in a simple HTTP server and pass the URI path to the PIM partition like above.
Both of the model files will be available in `<current_dir>/model_repository/fraud` dir on the machine where you have trained the model. Store these files in a simple HTTP server and pass the URI path to the PIM partition like above.
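One minimal way to "store these files in a simple HTTP server", assuming `python3` is available on the training machine, is its built-in static server; in this sketch, placeholder files stand in for the trained artifacts:

```shell
# Sketch: expose the trained artifacts over plain HTTP so the PIM partition
# can fetch them. Placeholder files stand in for model.onnx/config.pbtxt here.
set -e
mkdir -p serve/fraud_detection
touch serve/fraud_detection/model.onnx serve/fraud_detection/config.pbtxt
# Serve the directory, making the files reachable at e.g.
#   http://<Host/IP>:8000/fraud_detection/config.pbtxt
# by running:
#   (cd serve && python3 -m http.server 8000)
```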

```

### Validate AI application functionality
To verify the AI example application served from the Triton server, apply the configurations specified below in [config.ini](../../config.ini).

Explain this in a generic way, and mention that if you have used ai-demos/fraud-detection, the payload below can be used to validate it.
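For illustration, a request against Triton's KServe v2 HTTP inference API generally takes the shape below; the input name, shape, datatype and values are hypothetical placeholders, not the real signature of the fraud-detection model:

```shell
# Hypothetical v2 inference payload; the input name, shape and values are
# placeholders and must match your model's actual config.pbtxt.
cat > infer.json <<'EOF'
{
  "inputs": [
    {
      "name": "input",
      "shape": [1, 4],
      "datatype": "FP32",
      "data": [0.1, 0.2, 0.3, 0.4]
    }
  ]
}
EOF
# Send it to the served model (host, port and model name depend on the deployment):
#   curl -s -X POST "http://<triton-host>:8000/v2/models/fraud/infer" -d @infer.json
```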

}
```

### Build PIM triton server

This needs to be provided before the partition setup section.

EnvironmentFile=/etc/pim/env.conf

[Container]
Image=na.artifactory.swg-devops.com/sys-linux-power-team-ftp3distro-docker-images-docker-local/tritonserver:latest

Same as the previous comment regarding use of the jfrog registry.
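Following that suggestion, the unit's [Container] section would point at a quay.io image instead; the exact repository path below is a placeholder, since the thread does not name one:

```ini
[Container]
# Placeholder quay.io path; substitute the project's actual repository.
Image=quay.io/<org>/tritonserver:latest
```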

Signed-off-by: Manjunath-A-C <[email protected]>
@@ -0,0 +1,25 @@
# Triton server

[Triton server](https://github.com/triton-inference-server/server) can be used to run inference for AI workloads using machine learning models. Some pre-built example AI workloads, such as fraud detection and Iris classification, are covered in the [ai-demos-repo](https://github.com/PDeXchange/ai-demos). Users can utilise them to try out the Triton inference server.

Please mention somewhere that currently only the fraud detection example is supported via the provided script.

Collaborator Author


With the recent containerization changes in the ai-demos apps, the Triton server supports both examples, so this line is not needed.

# Triton

Triton inference server can be used to serve machine learning or deep learning models (e.g. classification, regression) on CPU/GPU platforms.
Triton inference server is built on top of the base image [here](../../base-image/)

I think this line might not be required if you address the comment below.
