# examples/tritonserver/README.md
Triton inference server can be used to serve machine learning or deep learning models, such as classification or regression models, on CPU/GPU platforms.

The Triton inference server image is built on top of the base image [here](../../base-image/).
## Build the PIM Triton server image

**Step 1: Build the base image**

Follow the steps provided [here](../../base-image/README.md) to build the base image.

**Step 2: Build the Triton server PIM image**

Before building this image, make sure you replace the `FROM` image in the [Containerfile](Containerfile) with the base image you built in Step 1.
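
For example, the `FROM` line can be updated in place as sketched below; the image reference is a placeholder for whatever tag you gave your base image:

```shell
# Point the Containerfile at your locally built base image (hypothetical tag).
sed -i 's|^FROM .*|FROM <your_registry>/pim-base:latest|' Containerfile
```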

Then build and push the image:

```shell
podman build -t <your_registry>/pim-triton-server .

podman push <your_registry>/pim-triton-server
```
## Steps to set up the e2e inference flow
### Step 1: Preparing the model and config file

As mentioned earlier, the Triton inference server can serve any machine learning model whose model and configuration files are stored in a model repository. You can build the model and config file for your own use case.

To showcase the e2e flow of a Triton inference server deployment from PIM, we will use the existing [fraud-detection](https://github.com/PDeXchange/ai-demos/tree/main/02_Fraud_Detection) application. Follow the steps below to build the model and config file.
#### Step I: Building the image

To make training easy, we provide a Containerfile with the packages, environment, and tools needed to run the Python application that trains the model. The source files for the Python application are volume-mounted during training.
Build the container image for the AI example application covered in [ai-demos](https://github.com/PDeXchange/ai-demos) using the [build steps](app/README.md).
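
A minimal build sketch is shown below; the `build_env` tag matches the push command that follows, while the build directory and Containerfile location are assumptions (the authoritative steps live in [app/README.md](app/README.md)):

```shell
# Build the training image from the app directory (illustrative paths).
cd app
podman build -t <registry>/build_env .
```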

To reuse the built container image, push it to a container registry:

`podman push <registry>/build_env`
#### Step II: Train the model
The model with the ONNX runtime can be trained by running the container image built in Step I. Follow the [training steps](app/README.md).
After training completes successfully, the model (model.onnx) and config (config.pbtxt) files will be available under **<current_dir>/app/model_repository/fraud**.
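
You can quickly confirm that the artifacts are in place before moving on; the path follows the training output described above:

```shell
ls <current_dir>/app/model_repository/fraud
# expected output: config.pbtxt  model.onnx
```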
### Step 2: Store model artifacts in a model repository
Store both the model file (model.onnx) and the config file (config.pbtxt) on a simple HTTP server.
#### Steps to start the HTTP server and copy the model artifacts
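
A minimal sketch using Python's built-in HTTP server; any static file server works, and the serving directory and port here are assumptions:

```shell
# Copy the trained artifacts into the directory the HTTP server will serve.
mkdir -p /srv/models/fraud
cp <current_dir>/app/model_repository/fraud/model.onnx /srv/models/fraud/
cp <current_dir>/app/model_repository/fraud/config.pbtxt /srv/models/fraud/

# Serve the model repository over HTTP on port 8000.
cd /srv/models && python3 -m http.server 8000
```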
### Step 3: Setting up the PIM partition
Follow the [deployer section](../../README.md#deployer-steps) to set up the PIM CLI, configure your AI partition, and launch it.

To configure the AI application served from the Triton server, you need to provide the generated model artifacts (model file and config file) to the PIM partition, as shown below in the `ai.config-json` section. `modelSource` and `configSource` are the URI paths to the model artifacts stored in the model repository covered in Step 2. Also specify the name of the AI application for which the model and config files should be pulled from the model repository.
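
The snippet below is a hypothetical illustration only: the `modelSource`/`configSource` keys and the application name come from the text above, while the surrounding section structure is an assumption to be checked against the actual [config.ini](../../config.ini):

```ini
# Hypothetical sketch of the ai config-json section; verify against config.ini.
[ai]
config-json = '''
{
  "name": "fraud",
  "modelSource": "http://<http_server>:8000/fraud/model.onnx",
  "configSource": "http://<http_server>:8000/fraud/config.pbtxt"
}
'''
```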
### Step 4: Validate AI application functionality
To verify the AI example application served from the Triton server, fill the `ai.validation` section with the application-specific REST schema (URL, headers, and payload). If you have built and trained the model for the fraud detection use case, apply the configuration specified below in [config.ini](../../config.ini).
```ini
[[validation]]
# ...
}]
}
```
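
Once the PIM partition is deployed, you can also probe the served model manually over Triton's standard HTTP/REST inference API (v2 protocol); the host, port, and payload file below are illustrative:

```shell
# Send an inference request to the fraud model (payload.json is a placeholder
# for a request body matching the validation payload above).
curl -s -X POST http://<partition_ip>:8000/v2/models/fraud/infer \
  -H 'Content-Type: application/json' \
  -d @payload.json
```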

# examples/tritonserver/app/README.md
Users can deploy AI workloads with the model and configuration of their choice by supplying the trained model file (model.onnx) and configuration file (config.pbtxt) to an HTTP server, to be used by the Triton server when it runs on a PIM partition.
## Fraud detection use case with ONNX runtime
### Prerequisites
The following prerequisites are needed to build the container image for the fraud detection example (a quick check is sketched after the list):

- podman
- a container registry to push the built fraud detection container image to
- protobuf
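
An illustrative way to confirm the prerequisites are available on the build machine:

```shell
podman --version
protoc --version   # protoc is provided by the protobuf package
```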
### Build fraud detection container image
The [script](build_and_train.sh) builds the base container image for the AI example applications given in [ai-demos](https://github.com/PDeXchange/ai-demos). The name of the AI application for which the container image should be built is passed as an argument to the script.
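
A hypothetical invocation for the fraud detection example, assuming the argument matches the application's directory name in the ai-demos repository:

```shell
./build_and_train.sh 02_Fraud_Detection
```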