Pin versions in AMD Docker #4

Merged: 4 commits into mlcommons:main, Oct 8, 2024

Conversation

JArnoldAMD (Contributor) commented:

The Docker build used in AMD's Llama-2-70b inference results worked properly at the time of submission. However, the versions of certain dependencies were not pinned, and changes in those upstream dependencies over the past few weeks have caused the Docker build to fail.

This PR updates the Docker build to pin those dependencies to the same versions/revisions that were used for the actual submission.

In addition, the run instructions and scripts assumed that shell scripts inside the container were executable. That was the case in our internal repo, but not in the official MLPerf results repo. The instructions and scripts have been updated to run the scripts via bash so that they don't have to be executable.

Finally, the run_scenarios.sh script referenced a submission packaging tool that was not distributed as part of the actual submission and therefore generated an error; the error has no impact on the results themselves. The script has been updated to remove the call to this tool. AMD looks forward to providing such a tool in the future to simplify the process of preparing MLPerf submissions, but it is not ready for release in its current form.

The Dockerfile used for building vLLM on ROCm points to the Triton
main branch by default.  This results in a build that is not repeatable,
and recent Triton updates have introduced incompatibilities which cause
the build to fail.

Update the build_llama2.sh script to build vLLM with a specific commit
of Triton; the revision used here is the same revision that was used
for the MLPerf 4.1 submission.
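
A minimal sketch of the pinning, assuming the vLLM ROCm Dockerfile exposes the Triton revision as a TRITON_BRANCH build argument and using a placeholder hash (the real revision and variable names are the ones in build_llama2.sh):

```bash
# Placeholder revision; the actual pinned commit is recorded in
# build_llama2.sh and matches the MLPerf 4.1 submission.
TRITON_COMMIT="0123abcd"

# Pass the pinned revision into the image build instead of letting
# the Dockerfile default to the Triton main branch.
docker build \
  --build-arg TRITON_BRANCH="${TRITON_COMMIT}" \
  -f Dockerfile.rocm \
  -t vllm-rocm .
```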

Also make the second stage of the build (adding the MLPerf-specific
code on top of the generic vLLM image) conditional on having a
successful vLLM build.  Without this change, vLLM build failures
would result in a confusing error message.
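
A hedged sketch of the guard, with hypothetical Dockerfile names and image tags; the `&&` runs the second stage only when the vLLM build exits successfully:

```bash
# Build the generic vLLM image first, then layer the MLPerf-specific
# code on top only if that build succeeded.  Without the guard, a
# vLLM build failure surfaces later as a confusing error about a
# missing base image.
docker build -f Dockerfile.rocm -t vllm-rocm . \
  && docker build -f Dockerfile.mlperf -t mlperf-llama2 .
```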
The Dockerfile for AMD's Llama-2 results didn't use specific
versions when installing pip packages, resulting in failures when
newer versions were released.  This update pins those versions to
those that were used for the submission runs.
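
A before/after sketch for illustration; the package names and versions below are placeholders, not the actual pinned set:

```bash
# Before: unpinned installs resolve to whatever is newest at build
# time, so a new upstream release can silently break the image.
pip install transformers numpy

# After: pinned installs reproduce the environment used for the
# submission runs (placeholder versions shown).
pip install "transformers==4.39.0" "numpy==1.26.4"
```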
The run_scenarios.sh script used in AMD's Llama-2 submission includes
a call to a submission packaging tool, but this tool was not part of
the submission package.  While AMD looks forward to including this
tool in a future submission to make it easier for others to submit
MLPerf results with AMD GPUs, the packaging tool is not yet ready
for broader use.  We are removing the call from the run_scenarios.sh
script to eliminate an error message (which doesn't affect the actual
runs).

AMD's internal repo had execution permissions enabled on the scripts
for launching the workload, but the executable permission was
lost in the submission package.  Switch to using bash to execute
these scripts so that they will work properly without being
executable.
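
The difference in a nutshell, using run_scenarios.sh as the example:

```bash
# Relies on the execute bit, which was lost in the submission package:
./run_scenarios.sh        # fails with "Permission denied"

# bash reads the script directly, so no execute permission is needed:
bash run_scenarios.sh
```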
JArnoldAMD requested a review from a team as a code owner on October 3, 2024, 17:17
github-actions bot commented on Oct 3, 2024:

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

JArnoldAMD (Contributor, Author) commented:

recheck

mrmhodak merged commit 30d19b7 into mlcommons:main on Oct 8, 2024
1 check passed
github-actions bot locked and limited conversation to collaborators on Oct 8, 2024