8 changes: 8 additions & 0 deletions examples/tritonserver/Containerfile
@@ -0,0 +1,8 @@
FROM quay.io/powercloud/pim:base

COPY tritonserver_config.sh /usr/bin/
COPY tritonserver_config.service /etc/systemd/system
RUN systemctl unmask tritonserver_config.service
RUN systemctl enable tritonserver_config.service

COPY tritonserver.container /usr/share/containers/systemd
138 changes: 138 additions & 0 deletions examples/tritonserver/README.md
@@ -0,0 +1,138 @@
# Triton

Triton inference server can be used to serve machine learning and deep learning models (classification, regression, etc.) on CPU/GPU platforms.
The triton server PIM image is built on top of the base image [here](../../base-image/).
> **Review comment (Member):** I think this line might not be required if you do below comment

## Build the PIM triton server
**Step 1: Build the base image**
Follow the steps provided [here](../../base-image/README.md) to build the base image.

**Step 2: Build the triton server PIM image**
The bootc PIM-based triton server image brings up an AI partition that can serve trained machine learning models to AI applications.
Be sure to replace the `FROM` image in the [Containerfile](Containerfile) with the base image you built before building this image.
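
For example, assuming you pushed your base image as `<your_registry>/pim:base` (an illustrative tag), you could update the `FROM` line in place:

```shell
# Point the Containerfile at your own base image (tag is illustrative)
sed -i 's|^FROM .*|FROM <your_registry>/pim:base|' Containerfile
```

Then build and push the triton server image: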

```shell
podman build -t <your_registry>/pim:triton-server .

podman push <your_registry>/pim:triton-server
```

## Steps to set up the e2e inference flow

### Step 1: Preparing the model and config file
As mentioned earlier, the triton inference server can serve any machine learning model whose model and configuration files are stored in a model repository. You can build the model and config file for your own use case.
To showcase the e2e flow of a triton inference server deployment from PIM, we will use the existing [fraud-detection](https://github.com/PDeXchange/ai-demos/tree/main/02_Fraud_Detection) application. Follow the steps below to build the model and config file.

#### Step I: Building the image
To make training easy, we provide a Containerfile with the necessary packages, environment, and tools to run the Python application that trains the model for you. The source files for training are volume-mounted at training time so the container can be reused across the AI example applications.

Build the container image for the AI example applications covered in [ai-demos](https://github.com/PDeXchange/ai-demos) using the [build steps](app/README.md).

To use the pre-built, hosted container image instead, pull `quay.io/powercloud/build_env`.

#### Step II: Train the model
A model with the ONNX runtime can be trained by running the container image built in Step I; follow the [training steps](app/README.md).
After training completes successfully, the model (model.onnx) and config (config.pbtxt) files will be available under **<current_dir>/app/model_repository/fraud**.
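
For reference, a minimal `config.pbtxt` for this model might look like the sketch below. It is inferred from the request and response shapes shown later in this README; the real file is generated for you during training, so treat it purely as an illustration:

```protobuf
name: "fraud"
platform: "onnxruntime_onnx"
input [
  {
    name: "float_input"
    data_type: TYPE_FP32
    dims: [ -1, 7 ]
  }
]
output [
  {
    name: "label"
    data_type: TYPE_INT64
    dims: [ -1, 1 ]
  },
  {
    name: "probabilities"
    data_type: TYPE_FP32
    dims: [ -1, 2 ]
  }
]
```

Note that in Triton the model name must match the model's directory name in the model repository.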

### Step 2: Store model artifacts in a model repository
Store both the model file (model.onnx) and the config file (config.pbtxt) on a simple HTTP server.

#### Steps to start the HTTP server and copy the model artifacts
```shell
# Install httpd
yum install httpd -y
systemctl enable httpd
systemctl start httpd
# Copy AI app specific artifacts like model file and model config file
mkdir -p /var/www/html/fraud_detection/
cp <current_dir>/model_repository/fraud_detection/config.pbtxt /var/www/html/fraud_detection/
cp <current_dir>/model_repository/fraud_detection/1/model.onnx /var/www/html/fraud_detection/
```
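
Before launching the partition, it may be worth confirming the artifacts are actually reachable over HTTP (hostname placeholder as above):

```shell
# Both requests should return HTTP 200
curl -I http://<Host/IP>/fraud_detection/model.onnx
curl -I http://<Host/IP>/fraud_detection/config.pbtxt
```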

### Step 3: Setting up PIM partition
Follow the [deployer section](../../README.md#deployer-steps) to set up the PIM CLI, configure your AI partition, and launch it.

To configure the AI application served from the triton server, point the PIM partition at the generated model artifacts (model file and config file) in the `ai.config-json` section, as shown below.
```ini
config-json = """
{
  "modelSource": "http://<Host/IP>/fraud_detection/model.onnx",
  "configSource": "http://<Host/IP>/fraud_detection/config.pbtxt",
  "aiApp": "fraud_detection"
}
"""
```
`modelSource` and `configSource` are the URI paths to the model artifacts stored in the model repository set up in Step 2. `aiApp` specifies the name of the AI application whose model and config files should be pulled from the model repository.
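
Based on [tritonserver_config.sh](tritonserver_config.sh) in this example, the artifacts end up on the partition in the directory layout Triton expects:

```
/var/models/model_repository/
└── fraud_detection/
    ├── config.pbtxt
    └── 1/
        └── model.onnx
```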

### Step 4: Validate AI application functionality
To verify the AI example application served from the Triton server, fill the `ai.validation` section with the application-specific REST details (URL, headers, and payload). If you have built and trained the model for the fraud detection use case, apply the configuration specified below in [config.ini](../../config.ini).


```ini
[[validation]]
# yes, no - set "yes" to make the request that validates the AI app deployed as part of the PIM partition
request = "yes"
url = "http://<PIM_LPAR_IP>:8000/v2/models/fraud/infer"
method = "POST" # GET, POST
# provide headers to use, in JSON format, inside triple quotes
headers = """
{
  "Content-Type": "application/json"
}
"""
# provide the payload to use, in JSON format, inside triple quotes.
# The JSON payload below is used when the fraud-detection example is served from the triton server
payload = """
{
  "inputs": [
    {
      "name": "float_input",
      "shape": [1, 7],
      "datatype": "FP32",
      "data": [[20, 0.5, 2, 1.0, 1.0, 1.0, 1.0]]
    }
  ],
  "outputs": [
    {"name": "label"},
    {"name": "probabilities"}
  ]
}
"""
```
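
If you prefer to validate by hand instead of (or in addition to) the `ai.validation` request, the equivalent call can be issued with curl; this is a sketch using the same partition IP placeholder as above:

```shell
curl -s -X POST "http://<PIM_LPAR_IP>:8000/v2/models/fraud/infer" \
  -H "Content-Type: application/json" \
  -d '{
        "inputs": [
          {"name": "float_input", "shape": [1, 7], "datatype": "FP32",
           "data": [[20, 0.5, 2, 1.0, 1.0, 1.0, 1.0]]}
        ],
        "outputs": [{"name": "label"}, {"name": "probabilities"}]
      }'
```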

Once the PIM partition is deployed with the triton server serving the configured AI application's model (fraud-detection in the example above), you should observe output like the following:
```json
{
  "model_name": "fraud",
  "model_version": "1",
  "outputs": [
    {
      "name": "label",
      "datatype": "INT64",
      "shape": [1, 1],
      "data": [1]
    },
    {
      "name": "probabilities",
      "datatype": "FP32",
      "shape": [1, 2],
      "data": [4.172325134277344e-7, 0.9999995827674866]
    }
  ]
}
```
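
Before sending inference requests, you can also confirm the server itself is up; Triton exposes a standard readiness endpoint on the HTTP port:

```shell
# Returns HTTP 200 once the server and its models are ready
curl -s -o /dev/null -w "%{http_code}\n" http://<PIM_LPAR_IP>:8000/v2/health/ready
```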
25 changes: 25 additions & 0 deletions examples/tritonserver/app/README.md
@@ -0,0 +1,25 @@
# Triton server

[Triton server](https://github.com/triton-inference-server/server) can be used to run inference for AI workloads using machine learning models. Pre-built example AI workloads such as fraud detection and Iris classification are covered in the [ai-demos repo](https://github.com/PDeXchange/ai-demos); users can utilise them to try out the triton inference server.
> **Review comment (Member):** Please mention somewhere that currently only the fraud detection example is supported via the provided script.
>
> **Reply (Collaborator, Author):** With the recent changes to containerization of the ai-demos apps, the triton server supports both examples, so this line is not needed.

Users can deploy an AI workload with their own model and configuration by supplying the trained model file (model.onnx) and configuration file (config.pbtxt) to the HTTP server, to be consumed by the Triton server when it runs on a PIM partition.

## Fraud detection/Iris use case with ONNX runtime
### Prerequisites
The following prerequisites are needed to build the container image for the fraud detection example:
- podman
- container registry to push the built fraud detection container image
- protobuf

### Build application container image
The [script](build_and_train.sh) builds the base container image for the AI example applications given in [ai-demos](https://github.com/PDeXchange/ai-demos).
```shell
bash build_and_train.sh build
```

### Training model with ONNX runtime
Run the `build_env` base container image built above to train the model and generate the model configuration for the AI use case. Provide both the AI application name and the container image as arguments to the script. The command below demonstrates training for the fraud detection use case.
```shell
bash build_and_train.sh train fraud_detection localhost/build_env
```

After successful execution, the **model.onnx** file will be available at `ai-demos/fraud_detection/model_repository/fraud_detection/1/model.onnx`. The script also persists the model configuration **config.pbtxt** to `ai-demos/fraud_detection/model_repository/fraud_detection/config.pbtxt`.
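
A quick sanity check that both artifacts exist at the paths stated above:

```shell
ls -l ai-demos/fraud_detection/model_repository/fraud_detection/config.pbtxt \
      ai-demos/fraud_detection/model_repository/fraud_detection/1/model.onnx
```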
102 changes: 102 additions & 0 deletions examples/tritonserver/app/build_and_train.sh
@@ -0,0 +1,102 @@
#!/bin/bash

AI_DEMOS_REPO="https://github.com/PDeXchange/ai-demos"
REPO_NAME="ai-demos"
REGISTRY="localhost"
CONTAINER_IMAGE="$REGISTRY/build_env"

show_help() {
cat << EOF
Usage: $(basename "$0") {build|train|help} [options]

This is a bash script to build the AI application container image and train its machine learning/deep learning model.

Available commands:
build Build the container image for AI applications.
train Train a model by passing the AI application container image as an argument.
help Display the help message.

EOF
}

build_image() {
if [ ! -d "$REPO_NAME" ]; then
echo "Cloning source code from $AI_DEMOS_REPO"
git clone "$AI_DEMOS_REPO"
fi

cd "$REPO_NAME" || exit 1

echo "Building container image: $CONTAINER_IMAGE"
podman build . -t "$CONTAINER_IMAGE"
}

train_model() {
shift

if [ "$#" -ne 2 ]; then
echo "Error: 'train' command requires exactly two arguments: application_name and container_image" >&2
echo "Usage: $0 train <app_name> <container_image>" >&2
exit 1
fi

local APP="$1"
local CONTAINER_IMAGE="$2"

echo "Executing 'train' command..."
echo " APPLICATION: $APP"
echo " CONTAINER IMAGE: $CONTAINER_IMAGE"

if [ ! -d "$REPO_NAME" ]; then
echo "Cloning source code from $AI_DEMOS_REPO"
git clone "$AI_DEMOS_REPO"
fi

cd "$REPO_NAME" || exit 1

echo "find the app directory"
app_dir=$(find . -maxdepth 1 -type d -iname "*$APP*" | head -n 1)
echo "app dir: $app_dir"
#if [ -d "$app_dir" ]; then
# cd "$app_dir" || return
#fi

mkdir -p $(pwd)/${app_dir}/model_repository

echo "Train the model using $CONTAINER_IMAGE container"
# Run the app image to generate the model file
podman run --rm --name $APP -v $(pwd)/$app_dir:/app:Z -v $(pwd)/Makefile:/app/Makefile:Z \
--entrypoint="/bin/sh" $CONTAINER_IMAGE -c "cd /app && make train APP=$APP"
echo "Model has been trained successfuly and available at: $(pwd)/model_repository/$APP/1"

# Cleanup redunduntant volume hosted directories
rm -rf $(pwd)/$APP/$APP

echo "Generate model config file for app: $APP"
make generate-config APP=$APP || { echo "Failed to generate model config.pbtxt file for $APP" >&2; exit 1; }
echo "Model config file config.pbtxt has been generated for app: $APP"
}

# If no subcommands or args passed, display help
if [ $# -eq 0 ]; then
show_help
exit 1
fi

SUBCOMMAND="$1"
case "$SUBCOMMAND" in
build)
build_image "$@"
;;
train)
train_model "$@"
;;
help)
show_help
;;
*)
echo "Error: Unknown command '$SUBCOMMAND'" >&2
show_help
exit 1
;;
esac
22 changes: 22 additions & 0 deletions examples/tritonserver/tritonserver.container
@@ -0,0 +1,22 @@
[Unit]
Description=Run tritonserver with ONNX runtime to serve deep learning/machine learning models
Requires=tritonserver_config.service
After=tritonserver_config.service

[Service]
Restart=on-failure
RestartSec=60
EnvironmentFile=/etc/pim/env.conf

[Container]
Image=quay.io/powercloud/tritonserver:latest
ContainerName=tritonserver
EnvironmentFile=/etc/pim/tritonserver.conf
Network=host
PublishPort=8000-8002:8000-8002
Volume=/var/models/model_repository:/models:Z
Exec=/bin/sh -c 'tritonserver --model-repository=/models --'
SecurityLabelType=unconfined_t

[Install]
WantedBy=multi-user.target default.target
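
Once the PIM partition boots, quadlet generates a `tritonserver.service` unit from this file. As a sketch (assuming shell access to the partition), the service can be checked with:

```shell
systemctl status tritonserver.service
podman logs tritonserver
```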
15 changes: 15 additions & 0 deletions examples/tritonserver/tritonserver_config.service
@@ -0,0 +1,15 @@
[Unit]
Description=Mount and setup triton server config
Requires=network-online.target cloud-config.target
After=network-online.target cloud-config.target

[Service]
Type=oneshot
ExecStart=/usr/bin/env /bin/bash /usr/bin/tritonserver_config.sh
RemainAfterExit=yes
TimeoutSec=0

StandardOutput=journal+console

[Install]
WantedBy=multi-user.target default.target
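
If model or config downloads fail, the journal of this one-shot unit is the first place to look (again assuming shell access to the partition):

```shell
journalctl -u tritonserver_config.service
```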
23 changes: 23 additions & 0 deletions examples/tritonserver/tritonserver_config.sh
@@ -0,0 +1,23 @@
#!/bin/bash

set -x

[ -f /etc/pim/tritonserver.conf ] || touch /etc/pim/tritonserver.conf

AI_APP=$(jq -r '.aiApp' /etc/pim/pim_config.json)
echo "Application: ${AI_APP}"

mkdir -p /var/models/model_repository/${AI_APP}/1

# jq -r prints the literal string "null" for missing keys, so guard against it explicitly
ONNX_MODEL_SOURCE=$(jq -r '.modelSource' /etc/pim/pim_config.json)
if [[ -n "$ONNX_MODEL_SOURCE" && "$ONNX_MODEL_SOURCE" != "null" ]]; then
curl -fSL "$ONNX_MODEL_SOURCE" --output /var/models/model_repository/${AI_APP}/1/model.onnx
fi

CONFIG_FILE=$(jq -r '.configSource' /etc/pim/pim_config.json)
if [[ -n "$CONFIG_FILE" && "$CONFIG_FILE" != "null" ]]; then
curl -fSL "$CONFIG_FILE" --output /var/models/model_repository/${AI_APP}/config.pbtxt
fi

# Ensure MODEL_REPOSITORY is set exactly once in the env file consumed by the tritonserver quadlet
var_to_add="MODEL_REPOSITORY=/var/models/model_repository"
sed -i "/^MODEL_REPOSITORY=.*/d" /etc/pim/tritonserver.conf && echo "$var_to_add" >> /etc/pim/tritonserver.conf
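
If the script ran cleanly, the downloaded artifacts and the quadlet environment file should look roughly like this (illustrative):

```shell
ls -R /var/models/model_repository
grep MODEL_REPOSITORY /etc/pim/tritonserver.conf
# MODEL_REPOSITORY=/var/models/model_repository
```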