Build: Trigger CI for new vllm_backend Triton releases #49
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
name: Welcome message | ||
on: | ||
pull_request_target: | ||
types: [opened] | ||
|
||
jobs: | ||
pr_reminder: | ||
runs-on: ubuntu-latest | ||
steps: | ||
- name: Add first comment | ||
uses: actions/github-script@v6 | ||
with: | ||
script: | | ||
github.rest.issues.createComment({ | ||
owner: context.repo.owner, | ||
repo: context.repo.repo, | ||
issue_number: context.issue.number, | ||
body: '👋 Hi! \nThank you for contributing to the project.\n Just a reminder: PRs will trigger a full CI run by default. We will add verified labels to the PR once the build and test steps are successful.\n🚀' | ||
}) | ||
env: | ||
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -36,4 +36,3 @@ jobs: | |
- uses: actions/checkout@v3 | ||
- uses: actions/setup-python@v3 | ||
- uses: pre-commit/[email protected] | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,45 @@ | ||
name: Validate Triton Pull request by running our change on the latest version of vLLM | ||
on: | ||
pull_request: | ||
jobs: | ||
mirror_repo: | ||
environment: GITLAB | ||
runs-on: self-hosted | ||
steps: | ||
- name: Sync Mirror Repository | ||
run: | | ||
#!/bin/bash | ||
curl --request POST --header "PRIVATE-TOKEN:${{ secrets.TOKEN }}" "${{ secrets.MIRROR_URL }}" | ||
trigger-ci: | ||
environment: GITLAB | ||
needs: mirror_repo | ||
runs-on: self-hosted | ||
steps: | ||
- name: Trigger Pipeline | ||
run: | | ||
#!/bin/bash | ||
# Get latest Triton released version from https://github.com/triton-inference-server/vllm_backend/releases | ||
TAG=$(curl https://api.github.com/repos/triton-inference-server/vllm_backend/releases/latest | grep -i "tag_name" | awk -F '"' '{print $4}') | ||
export TRITON_CONTAINER_VERSION=${TAG#v} # example: 24.10 | ||
if [ -z "$TRITON_CONTAINER_VERSION" ] | ||
then | ||
echo "\$TRITON_CONTAINER_VERSION is NULL, setting it to 24.10" | ||
TRITON_CONTAINER_VERSION=24.10 | ||
else | ||
echo "\$TRITON_CONTAINER_VERSION is NOT NULL" | ||
fi | ||
echo "TRITON_CONTAINER_VERSION = ${TRITON_CONTAINER_VERSION}" | ||
|
||
# Get latest VLLM RELEASED VERSION from https://github.com/vllm-project/vllm/releases | ||
TAG=$(curl https://api.github.com/repos/vllm-project/vllm/releases/latest | grep -i "tag_name" | awk -F '"' '{print $4}') | ||
export VLLM_VERSION=${TAG#v} # example: 0.5.5 | ||
if [ -z "$VLLM_VERSION" ] | ||
then | ||
echo "\$VLLM_VERSION is NULL, setting it to 0.5.5" | ||
VLLM_VERSION=0.5.5 | ||
else | ||
echo "\$VLLM_VERSION is NOT NULL" | ||
fi | ||
echo "VLLM_VERSION = ${VLLM_VERSION}" | ||
|
||
curl --fail --request POST --form token=${{ secrets.PIPELINE_TOKEN }} -F ref=${GITHUB_HEAD_REF} -F variables[BUILD_OPTION]="BUILD_SOURCE" -F variables[TRITON_CONTAINER_VERSION]="${TRITON_CONTAINER_VERSION}" -F variables[VLLM_VERSION]="${VLLM_VERSION}" -F variables[TEST_OPTION]="ALL_TESTS" "${{ secrets.PIPELINE_URL }}" |
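The step above does three things for each version: fetch the latest release JSON, pull out `tag_name`, strip the leading `v`, and fall back to a known-good default when the lookup returns nothing. A minimal sketch of that logic as reusable functions follows. The function names `extract_tag` and `version_or_default` are hypothetical, and the sample JSON is illustrative only. It assumes the API response is pretty-printed with one key per line, which is what the `grep`/`awk` approach in the workflow already relies on.

```shell
#!/bin/bash
# Hypothetical helpers; these names are not part of the PR.

# Read a GitHub "latest release" JSON document on stdin and print tag_name.
# Uses the same grep/awk approach as the workflow step.
extract_tag() {
  grep -i '"tag_name"' | head -n 1 | awk -F '"' '{print $4}'
}

# Strip a leading "v" from the tag ($1); print the default ($2) if the result is empty.
version_or_default() {
  local tag="$1" default="$2"
  local version="${tag#v}"
  if [ -z "$version" ]; then
    echo "$default"
  else
    echo "$version"
  fi
}

# Illustrative input only; a real run would pipe the output of
# `curl -s <releases/latest URL>` into extract_tag instead.
sample='{
  "tag_name": "v0.5.5",
  "name": "v0.5.5"
}'
tag=$(printf '%s\n' "$sample" | extract_tag)
echo "VLLM_VERSION = $(version_or_default "$tag" "0.5.5")"
```

Keeping the parsing separate from the `curl` call means the fallback path can be checked without network access.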
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
name: Validate latest vLLM release from https://github.com/vllm-project/vllm/releases against latest Triton release https://github.com/triton-inference-server/vllm_backend/releases | ||
on: | ||
schedule: | ||
- cron: "30 09 */3 * *" | ||
jobs: | ||
mirror_repo: | ||
environment: GITLAB | ||
runs-on: self-hosted | ||
steps: | ||
- name: Sync Mirror Repository | ||
run: | | ||
#!/bin/bash | ||
curl --request POST --header "PRIVATE-TOKEN:${{ secrets.TOKEN }}" "${{ secrets.MIRROR_URL }}" | ||
trigger-ci: | ||
environment: GITLAB | ||
needs: mirror_repo | ||
runs-on: self-hosted | ||
steps: | ||
- name: Trigger Pipeline | ||
run: | | ||
#!/bin/bash | ||
# Get latest Triton released version from https://github.com/triton-inference-server/vllm_backend/releases | ||
TAG=$(curl https://api.github.com/repos/triton-inference-server/vllm_backend/releases/latest | grep -i "tag_name" | awk -F '"' '{print $4}') | ||
export TRITON_CONTAINER_VERSION=${TAG#v} # example: 24.08 | ||
# Get latest VLLM RELEASED VERSION from https://github.com/vllm-project/vllm/releases | ||
TAG=$(curl https://api.github.com/repos/vllm-project/vllm/releases/latest | grep -i "tag_name" | awk -F '"' '{print $4}') | ||
export VLLM_VERSION=${TAG#v} # example: 0.5.5 | ||
echo "VLLM_VERSION = ${VLLM_VERSION}" | ||
if [ -z "$TRITON_CONTAINER_VERSION" ] || [ -z "$VLLM_VERSION" ] | ||
then | ||
echo "Can't find latest Triton or vllm version.. Skipping CI run" | ||
else | ||
echo "TRITON_CONTAINER_VERSION = ${TRITON_CONTAINER_VERSION}" | ||
echo "VLLM_VERSION = ${VLLM_VERSION}" | ||
curl --fail --request POST --form token=${{ secrets.PIPELINE_TOKEN }} -F ref=${GITHUB_HEAD_REF} -F variables[BUILD_OPTION]="PULL_DOCKER" -F variables[TRITON_CONTAINER_VERSION]="${TRITON_CONTAINER_VERSION}" -F variables[TEST_OPTION]="ALL_HARDWARE" -F variables[VLLM_VERSION]="${VLLM_VERSION}" -F variables[TEST_OPTION]="ALL_TESTS" "${{ secrets.PIPELINE_URL }}" | ||
fi |
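One thing worth checking in the scheduled workflow: GitHub Actions sets `GITHUB_HEAD_REF` only for `pull_request` and `pull_request_target` events, so on a `schedule` run `ref=${GITHUB_HEAD_REF}` expands to an empty string. Below is a sketch of one possible fallback. It assumes the mirrored branch name matches `GITHUB_REF_NAME` and that `main` is an acceptable last resort; both are assumptions, and `pick_ref` is a hypothetical name.

```shell
#!/bin/bash
# Hypothetical helper: choose the ref to pass to the pipeline trigger.
# Order: PR head branch, then the branch that triggered the run, then "main".
pick_ref() {
  local head_ref="$1" ref_name="$2"
  echo "${head_ref:-${ref_name:-main}}"
}

REF=$(pick_ref "${GITHUB_HEAD_REF:-}" "${GITHUB_REF_NAME:-}")
echo "Triggering pipeline for ref: ${REF}"
```

The resulting `REF` could then be passed as `-F ref="${REF}"` in the trigger request.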
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -27,6 +27,9 @@ | |
--> | ||
|
||
[](https://opensource.org/licenses/BSD-3-Clause) | ||
 | ||
 | ||
 | ||
Could you please clarify how these static badges work?
Ideally this should be automated, but it is a manual process at the moment. Once the cron task is finished and the pipeline is green, I'd have to issue a PR to update the badges here.
Do we want to hold off on adding badges until we have an automated workflow in place?
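As a starting point for the automation discussed in this thread, here is a rough sketch of generating the badge Markdown from the versions the cron job already computes. It assumes the badges are shields.io static badges, whose path has the form `/badge/<label>-<message>-<color>`. The labels, colors, and function names below are hypothetical and not taken from the PR.

```shell
#!/bin/bash
# Hypothetical helpers: build a shields.io static badge in Markdown.
# In a static badge path a literal "-" is written "--", a literal "_" is
# written "__", and a space is written "_".
badge_escape() {
  printf '%s' "$1" | sed -e 's/_/__/g' -e 's/-/--/g' -e 's/ /_/g'
}

badge_markdown() {
  local label="$1" message="$2" color="$3"
  printf '![%s](https://img.shields.io/badge/%s-%s-%s)\n' \
    "$label" "$(badge_escape "$label")" "$(badge_escape "$message")" "$color"
}

# Labels and colors are placeholders.
badge_markdown "vLLM" "0.5.5" "green"
badge_markdown "Triton" "24.10" "blue"
```

A follow-up job could write these lines into the README once the pipeline is green, which would remove the manual PR step.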
||
|
||
# vLLM Backend | ||
|
||
|
@@ -82,7 +85,18 @@ latest YY.MM (year.month) of [Triton release](https://github.com/triton-inferenc | |
|
||
``` | ||
# YY.MM is the version of Triton. | ||
export TRITON_CONTAINER_VERSION=<YY.MM> | ||
# Get latest Triton released version from https://github.com/triton-inference-server/vllm_backend/releases | ||
TAG=$(curl https://api.github.com/repos/triton-inference-server/vllm_backend/releases/latest | grep -i "tag_name" | awk -F '"' '{print $4}') | ||
export TRITON_CONTAINER_VERSION=${TAG#v} # example: 24.06 | ||
echo "TRITON_CONTAINER_VERSION = ${TRITON_CONTAINER_VERSION}" | ||
|
||
# Get latest VLLM RELEASED VERSION from https://github.com/vllm-project/vllm/releases | ||
TAG=$(curl https://api.github.com/repos/vllm-project/vllm/releases/latest | grep -i "tag_name" | awk -F '"' '{print $4}') | ||
export VLLM_VERSION=${TAG#v} # example: 0.5.3.post1 | ||
echo "VLLM_VERSION = ${VLLM_VERSION}" | ||
|
||
git clone -b r${TRITON_CONTAINER_VERSION} https://github.com/triton-inference-server/server.git | ||
|
||
cd server | ||
./build.py -v --enable-logging | ||
--enable-stats | ||
--enable-tracing | ||
|
@@ -101,6 +115,11 @@ export TRITON_CONTAINER_VERSION=<YY.MM> | |
--backend=python:r${TRITON_CONTAINER_VERSION} | ||
--backend=vllm:r${TRITON_CONTAINER_VERSION} | ||
--backend=ensemble | ||
--vllm-version=${VLLM_VERSION} | ||
# Build Triton Server | ||
cd build | ||
bash -x ./docker_build | ||
|
||
``` | ||
|
||
### Option 3. Add the vLLM Backend to the Default Triton Container | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
#!/bin/bash | ||
# Copyright 2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved. | ||
# | ||
# Redistribution and use in source and binary forms, with or without | ||
# modification, are permitted provided that the following conditions | ||
# are met: | ||
# * Redistributions of source code must retain the above copyright | ||
# notice, this list of conditions and the following disclaimer. | ||
# * Redistributions in binary form must reproduce the above copyright | ||
# notice, this list of conditions and the following disclaimer in the | ||
# documentation and/or other materials provided with the distribution. | ||
# * Neither the name of NVIDIA CORPORATION nor the names of its | ||
# contributors may be used to endorse or promote products derived | ||
# from this software without specific prior written permission. | ||
# | ||
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY | ||
# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE | ||
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR | ||
# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR | ||
# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, | ||
# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, | ||
# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR | ||
# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY | ||
# OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT | ||
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE | ||
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. | ||
|
||
while getopts t: flag | ||
do | ||
case "${flag}" in | ||
t) PROD_CONTAINER=${OPTARG};; | ||
esac | ||
done | ||
|
||
echo "Pulling container image ${PROD_CONTAINER}" | ||
docker pull ${PROD_CONTAINER} |
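With `getopts`, every letter handled in the `case` statement has to appear in the optstring, otherwise the branch never runs and the variable stays empty. A small sketch of the same parsing with a guard for a missing image follows. The function name `parse_container`, the return codes, and the image name are hypothetical.

```shell
#!/bin/bash
# Hypothetical sketch: parse -t <image> and refuse to continue without it.
parse_container() {
  local OPTIND=1 flag
  PROD_CONTAINER=""
  while getopts "t:" flag; do
    case "${flag}" in
      t) PROD_CONTAINER="${OPTARG}" ;;
      *) return 2 ;;   # unknown option or missing argument
    esac
  done
  if [ -z "${PROD_CONTAINER}" ]; then
    echo "error: pass the image with -t <image>" >&2
    return 1
  fi
}

# Example invocation; the image name is a placeholder, not a real registry path.
if parse_container -t "registry.example.com/tritonserver:24.10-vllm"; then
  echo "Pulling container image ${PROD_CONTAINER}"
  # docker pull "${PROD_CONTAINER}"   # commented out so the sketch has no side effects
fi
```

Failing early here avoids calling `docker pull` with an empty argument.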
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,63 @@ | ||
#!/bin/bash | ||
# Copyright 2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved. | ||
# | ||
# Redistribution and use in source and binary forms, with or without | ||
# modification, are permitted provided that the following conditions | ||
# are met: | ||
# * Redistributions of source code must retain the above copyright | ||
# notice, this list of conditions and the following disclaimer. | ||
# * Redistributions in binary form must reproduce the above copyright | ||
# notice, this list of conditions and the following disclaimer in the | ||
# documentation and/or other materials provided with the distribution. | ||
# * Neither the name of NVIDIA CORPORATION nor the names of its | ||
# contributors may be used to endorse or promote products derived | ||
# from this software without specific prior written permission. | ||
# | ||
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY | ||
# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE | ||
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR | ||
# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR | ||
# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, | ||
# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, | ||
# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR | ||
# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY | ||
# OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT | ||
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE | ||
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. | ||
|
||
while getopts t:v: flag | ||
do | ||
case "${flag}" in | ||
t) TRITON_CONTAINER_VERSION=${OPTARG};; | ||
v) VLLM_VERSION=${OPTARG};; | ||
esac | ||
done | ||
|
||
echo "Triton version is ${TRITON_CONTAINER_VERSION} and vllm version is ${VLLM_VERSION}" | ||
# This change will start working for r24.12 release | ||
#git clone -b r${TRITON_CONTAINER_VERSION} https://github.com/triton-inference-server/server.git | ||
|
||
git clone https://github.com/triton-inference-server/server.git | ||
set -x && python3 server/build.py -v \ | ||
--enable-logging \ | ||
--enable-stats \ | ||
--enable-tracing \ | ||
--enable-metrics \ | ||
--enable-gpu-metrics \ | ||
--enable-cpu-metrics \ | ||
--enable-gpu \ | ||
--no-container-interactive \ | ||
--container-prebuild-command="docker login -u gitlab-ci-token -p ${CI_JOB_TOKEN} ${CI_REGISTRY}" \ | ||
|
||
--filesystem=gcs \ | ||
--filesystem=s3 \ | ||
--filesystem=azure_storage \ | ||
--endpoint=http \ | ||
--endpoint=grpc \ | ||
--endpoint=sagemaker \ | ||
--endpoint=vertex-ai \ | ||
--upstream-container-version=${TRITON_CONTAINER_VERSION} \ | ||
--backend=python:r${TRITON_CONTAINER_VERSION} \ | ||
--backend=vllm:r${TRITON_CONTAINER_VERSION} \ | ||
--vllm-version=${VLLM_VERSION} 2>&1 | ||
# Build Triton Server | ||
cd server/build | ||
bash -x ./docker_build |
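The script clones the default branch because the `r${TRITON_CONTAINER_VERSION}` branch only becomes usable from r24.12. One way to avoid hard-coding that switch is to check whether the release branch exists and fall back otherwise. This is a sketch only: `pick_branch` is a hypothetical helper, and falling back to `main` is an assumption about the desired behavior.

```shell
#!/bin/bash
# Hypothetical helper: print "r<version>" if that branch exists in the
# repository at $1, otherwise print the fallback branch ($3, default "main").
pick_branch() {
  local repo="$1" version="$2" fallback="${3:-main}"
  local branch="r${version}"
  # --exit-code makes git ls-remote return non-zero when no ref matches.
  if git ls-remote --exit-code --heads "$repo" "refs/heads/${branch}" >/dev/null 2>&1; then
    echo "$branch"
  else
    echo "$fallback"
  fi
}

# Usage (needs network access, so it is shown rather than run):
#   BRANCH=$(pick_branch https://github.com/triton-inference-server/server.git "${TRITON_CONTAINER_VERSION}")
#   git clone -b "${BRANCH}" https://github.com/triton-inference-server/server.git
```

This keeps a single code path for releases before and after the branch becomes available.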
could you please clarify why the version 0.5.5 was picked?
We have a passing pipeline with this version. Hence, I picked this as the default until a new version is tested and verified.
I think we should have the latest version we support, or wait until we have tests indicating that we can migrate to the latest. Since we don't support 0.5.5 and vLLM's latest version is now 0.6.1.post2, I can see this badge confusing users.
done