# PIM stack for triton inference server to run AI/ML applications (#25)
**Containerfile**

```dockerfile
FROM quay.io/powercloud/pim:base

# First-boot configuration script and the systemd unit that runs it
COPY tritonserver_config.sh /usr/bin/
COPY tritonserver_config.service /etc/systemd/system
RUN systemctl unmask tritonserver_config.service
RUN systemctl enable tritonserver_config.service

# Quadlet definition that runs the Triton server container under systemd
COPY tritonserver.container /usr/share/containers/systemd
```
**README.md**
# Triton

Triton inference server can be used to serve machine learning or deep learning models, such as classification or regression models, on CPU and GPU platforms.
Triton inference server is built on top of the base image [here](../../base-image/).
## Build PIM triton server
**Step 1: Build the base image**
Follow the steps provided [here](../../base-image/README.md) to build the base image.

**Step 2: Build the triton server PIM image**
The bootc PIM-based triton server image brings up an AI partition that can serve trained machine learning models to AI applications.
Ensure you replace the `FROM` image in the [Containerfile](Containerfile) with the base image you built before building this image.
```shell
podman build . -t <your_registry>/pim:triton-server

podman push <your_registry>/pim:triton-server
```
## Steps to set up the e2e inference flow

### Step 1: Prepare the model and config file
As mentioned earlier, triton inference server can serve any machine learning model whose model and configuration files are stored in a model repository. You can build the model and config file for your own use case.
To showcase the e2e flow of a triton inference server deployment from PIM, we will utilise the existing [fraud-detection](https://github.com/PDeXchange/ai-demos/tree/main/02_Fraud_Detection) application. Follow the steps below to build the model and config file.
#### Step I: Build the image
To make it easy to train the model with the provided python application, we provide a Containerfile with the packages, environment, and tools needed to run the python application that trains the model for you. The source files for training are volume mounted during training so that the container can be reused across the AI example applications.

Build the container image for the AI example applications covered in [ai-demos](https://github.com/PDeXchange/ai-demos) using the [build-steps](app/README.md).

To consume the already built and hosted container image, use `quay.io/powercloud/build_env`.
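If you take the pre-built route, pulling the image looks like this (assuming the host can reach the public quay.io registry):

```shell
podman pull quay.io/powercloud/build_env
```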
#### Step II: Train the model
The model with the ONNX runtime can be trained by running the container image built in Step I. Follow the [training steps](app/README.md).
After training completes successfully, the model (model.onnx) and config (config.pbtxt) files will be available under **<current_dir>/app/model_repository/fraud_detection**.
### Step 2: Store model artifacts in a model repository
Store both the model file (model.onnx) and the config file (config.pbtxt) on a simple HTTP server.

#### Steps to start the http server and copy the model artifacts
```shell
# Install httpd
yum install httpd -y
systemctl enable httpd
systemctl start httpd
# Copy AI app specific artifacts, i.e. the model file and the model config file
mkdir -p /var/www/html/fraud_detection/
cp <current_dir>/model_repository/fraud_detection/config.pbtxt /var/www/html/fraud_detection/
cp <current_dir>/model_repository/fraud_detection/1/model.onnx /var/www/html/fraud_detection/
```
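Before moving on, it is worth confirming both artifacts are reachable over HTTP (the host below is a placeholder for your server):

```shell
curl -I http://<Host/IP>/fraud_detection/model.onnx
curl -I http://<Host/IP>/fraud_detection/config.pbtxt
```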
### Step 3: Set up the PIM partition
Follow the [deployer section](../../README.md#deployer-steps) to set up the PIM cli, configure your AI partition, and launch it.

To configure the AI application served from triton server, provide the generated model artifacts, i.e. the model file and config file, to the PIM partition in the `ai.config-json` section as shown below.
```ini
config-json = """
{
    "modelSource": "http://<Host/IP>/fraud_detection/model.onnx",
    "configSource": "http://<Host/IP>/fraud_detection/config.pbtxt",
    "aiApp": "fraud_detection"
}
"""
```
`modelSource` and `configSource` are the URI paths to the model artifacts stored in the model repository set up in Step 2. `aiApp` specifies the name of the AI application whose model and config files need to be pulled from the model repository.
### Step 4: Validate AI application functionality
To verify the AI example application served from Triton server, fill the ai.validation section with the application specific REST schema: URL, headers, and payload. If you have built and trained the model for the fraud detection usecase, apply the configuration specified below in [config.ini](../../config.ini).
```ini
[[validation]]
# yes, no - set yes to make the request that validates the AI app deployed as part of the PIM partition
request = "yes"
url = "http://<PIM_LPAR_IP>:8000/v2/models/fraud/infer"
method = "POST" # GET, POST
# provide the headers to use, in json format inside triple quotes
headers = """
{
  "Content-Type": "application/json"
}
"""
# provide the payload to use, in json format inside triple quotes.
# The JSON payload below is used when the fraud-detection example is served from triton server
payload = """
{
  "inputs": [
    {
      "name": "float_input",
      "shape": [1, 7],
      "datatype": "FP32",
      "data": [[20, 0.5, 2, 1.0, 1.0, 1.0, 1.0]]
    }
  ],
  "outputs": [
    { "name": "label" },
    { "name": "probabilities" }
  ]
}
"""
```
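The same validation request can also be issued by hand with curl once the partition is up; a sketch, assuming the model is registered under the name `fraud` as in the sample response below:

```shell
curl -X POST "http://<PIM_LPAR_IP>:8000/v2/models/fraud/infer" \
  -H "Content-Type: application/json" \
  -d '{"inputs":[{"name":"float_input","shape":[1,7],"datatype":"FP32","data":[[20,0.5,2,1.0,1.0,1.0,1.0]]}],"outputs":[{"name":"label"},{"name":"probabilities"}]}'
```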
Once the PIM partition is deployed with triton server serving the model of the configured AI application (fraud-detection in the example above), you should observe output like the following:
```json
{
  "model_name": "fraud",
  "model_version": "1",
  "outputs": [
    {
      "name": "label",
      "datatype": "INT64",
      "shape": [1, 1],
      "data": [1]
    },
    {
      "name": "probabilities",
      "datatype": "FP32",
      "shape": [1, 2],
      "data": [4.172325134277344e-7, 0.9999995827674866]
    }
  ]
}
```
**app/README.md**

# Triton server

[Triton server](https://github.com/triton-inference-server/server) can be used to run inference for AI workloads using machine learning models. Some pre-built example AI workloads, such as fraud detection and Iris classification, are covered in the [ai-demos repo](https://github.com/PDeXchange/ai-demos). Users can utilise them to try out the triton inference server.

Users can deploy AI workloads with their choice of model and configuration by supplying the trained model file (model.onnx) and configuration file (config.pbtxt) to an http server, to be used by Triton server when it runs on a PIM partition.
## Fraud detection/Iris usecase with ONNX runtime
### Pre-requisites
The following pre-requisites are needed to build the container image for the fraud detection example (an install sketch follows the list):
- podman
- a container registry to push the built fraud detection container image
- protobuf
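On a Fedora/RHEL-style host the tools can typically be installed as shown below; the package names are an assumption and may differ on your distribution:

```shell
sudo dnf install -y podman protobuf
```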
### Build application container image
The [script](build_and_train.sh) builds the base container image for the AI example applications given in [ai-demos](https://github.com/PDeXchange/ai-demos).
```shell
bash build_and_train.sh build
```
### Training the model with ONNX runtime
Run the `build_env` base container image built above to train the model and generate the model configuration for the AI usecase. Provide both the AI application name and the container image built above as arguments to the script. The command below demonstrates training for the fraud detection usecase.
```shell
bash build_and_train.sh train fraud_detection localhost/build_env
```

After successful execution, the **model.onnx** file will be available at `ai-demos/fraud_detection/model_repository/fraud_detection/1/model.onnx`. The script also persists the model configuration **config.pbtxt** at `ai-demos/fraud_detection/model_repository/fraud_detection/config.pbtxt`.
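A quick way to confirm both artifacts were produced, using the output paths above:

```shell
ls -l ai-demos/fraud_detection/model_repository/fraud_detection/config.pbtxt \
      ai-demos/fraud_detection/model_repository/fraud_detection/1/model.onnx
```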
**app/build_and_train.sh**
```shell
#!/bin/bash

AI_DEMOS_REPO="https://github.com/PDeXchange/ai-demos"
REPO_NAME="ai-demos"
REGISTRY="localhost"
CONTAINER_IMAGE="$REGISTRY/build_env"

show_help() {
    cat << EOF
Usage: $(basename "$0") [build|train] [options]

This is a bash script to build the AI application container image and train its machine learning/deep learning model.

Available commands:
  build   Build the container image for AI applications.
  train   Train a model by passing the AI application container image as an argument.
  help    Display the help message.

EOF
}

build_image() {
    if [ ! -d "$REPO_NAME" ]; then
        echo "Cloning source code from $AI_DEMOS_REPO"
        git clone "$AI_DEMOS_REPO"
    fi

    cd "$REPO_NAME" || exit 1

    echo "Building container image: $CONTAINER_IMAGE"
    podman build . -t "$CONTAINER_IMAGE"
}

train_model() {
    # Drop the 'train' subcommand; the remaining args are app name and image
    shift

    if [ "$#" -ne 2 ]; then
        echo "Error: 'train' command requires exactly two arguments: application_name and container_image" >&2
        echo "Usage: $0 train <app_name> <container_image>" >&2
        exit 1
    fi

    local APP="$1"
    local CONTAINER_IMAGE="$2"

    echo "Executing 'train' command..."
    echo "  APPLICATION: $APP"
    echo "  CONTAINER IMAGE: $CONTAINER_IMAGE"

    if [ ! -d "$REPO_NAME" ]; then
        echo "Cloning source code from $AI_DEMOS_REPO"
        git clone "$AI_DEMOS_REPO"
    fi

    cd "$REPO_NAME" || exit 1

    # Locate the application directory matching the app name
    app_dir=$(find . -maxdepth 1 -type d -iname "*$APP*" | head -n 1)
    echo "app dir: $app_dir"

    mkdir -p "$(pwd)/${app_dir}/model_repository"

    echo "Train the model using $CONTAINER_IMAGE container"
    # Run the app image to generate the model file
    podman run --rm --name "$APP" -v "$(pwd)/$app_dir:/app:Z" -v "$(pwd)/Makefile:/app/Makefile:Z" \
        --entrypoint="/bin/sh" "$CONTAINER_IMAGE" -c "cd /app && make train APP=$APP"
    echo "Model has been trained successfully and is available at: $(pwd)/$app_dir/model_repository/$APP/1"

    # Clean up redundant volume-hosted directories
    rm -rf "$(pwd)/$APP/$APP"

    echo "Generate model config file for app: $APP"
    make generate-config APP=$APP || { echo "Failed to generate model config.pbtxt file for $APP" >&2; exit 1; }
    echo "Model config file config.pbtxt has been generated for app: $APP"
}

# If no subcommands or args passed, display help
if [ $# -eq 0 ]; then
    show_help
    exit 1
fi

SUBCOMMAND="$1"
case "$SUBCOMMAND" in
    build)
        build_image "$@"
        ;;
    train)
        train_model "$@"
        ;;
    help)
        show_help
        ;;
    *)
        echo "Error: Unknown command '$SUBCOMMAND'" >&2
        show_help
        exit 1
        ;;
esac
```
**tritonserver.container**
```ini
[Unit]
Description=Run tritonserver with ONNX runtime to serve deep learning/machine learning models
Requires=tritonserver_config.service
After=tritonserver_config.service

[Service]
Restart=on-failure
RestartSec=60
EnvironmentFile=/etc/pim/env.conf

[Container]
Image=quay.io/powercloud/tritonserver:latest
ContainerName=tritonserver
EnvironmentFile=/etc/pim/tritonserver.conf
Network=host
PublishPort=8000-8002:8000-8002
Volume=/var/models/model_repository:/models:Z
Exec=/bin/sh -c 'tritonserver --model-repository=/models --'
SecurityLabelType=unconfined_t

[Install]
WantedBy=multi-user.target default.target
```
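This file is a Podman Quadlet unit: once copied to /usr/share/containers/systemd (as the Containerfile does), the podman-systemd generator turns it into a regular systemd service at boot. On a running partition it can be managed with standard systemd commands:

```shell
systemctl daemon-reload
systemctl start tritonserver.service
systemctl status tritonserver.service
```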
**tritonserver_config.service**
```ini
[Unit]
Description=Mount and setup triton server config
Requires=network-online.target cloud-config.target
After=network-online.target cloud-config.target

[Service]
Type=oneshot
ExecStart=/usr/bin/env /bin/bash /usr/bin/tritonserver_config.sh
RemainAfterExit=yes
TimeoutSec=0
StandardOutput=journal+console

[Install]
WantedBy=multi-user.target default.target
```
**tritonserver_config.sh**
```shell
#!/bin/bash

set -x

# Make sure the environment file consumed by the tritonserver container exists
[ -f /etc/pim/tritonserver.conf ] || touch /etc/pim/tritonserver.conf

AI_APP=$(jq -r '.aiApp' /etc/pim/pim_config.json)
echo "Application: ${AI_APP}"

mkdir -p /var/models/model_repository/${AI_APP}/1

# Fetch the model artifacts referenced in the PIM config.
# jq -r prints the literal string "null" for missing keys, so guard against it.
ONNX_MODEL_SOURCE=$(jq -r '.modelSource' /etc/pim/pim_config.json)
if [[ -n "$ONNX_MODEL_SOURCE" && "$ONNX_MODEL_SOURCE" != "null" ]]; then
    curl --fail "$ONNX_MODEL_SOURCE" --output /var/models/model_repository/${AI_APP}/1/model.onnx
fi

CONFIG_FILE=$(jq -r '.configSource' /etc/pim/pim_config.json)
if [[ -n "$CONFIG_FILE" && "$CONFIG_FILE" != "null" ]]; then
    curl --fail "$CONFIG_FILE" --output /var/models/model_repository/${AI_APP}/config.pbtxt
fi

# Record the model repository path, replacing any stale entry
var_to_add=MODEL_REPOSITORY=/var/models/model_repository
sed -i "/^MODEL_REPOSITORY=.*/d" /etc/pim/tritonserver.conf && echo "$var_to_add" >> /etc/pim/tritonserver.conf
```
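For reference, a minimal /etc/pim/pim_config.json that this script consumes, mirroring the `ai.config-json` section shown in the README above:

```json
{
  "modelSource": "http://<Host/IP>/fraud_detection/model.onnx",
  "configSource": "http://<Host/IP>/fraud_detection/config.pbtxt",
  "aiApp": "fraud_detection"
}
```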