I'm seeing intermittent failures from the vLLM tests on an LSF cluster when run with
uv run --all-extras --all-groups pytest --isolate-heavy -v
For example:
==== 723 passed, 142 skipped, 2 xfailed, 90 warnings in 1572.83s (0:26:12) =====
when all worked well, and
FAILED test/backends/test_openai_vllm.py::test_instruct - openai.NotFoundErro...
FAILED test/backends/test_openai_vllm.py::test_multiturn - openai.NotFoundErr...
FAILED test/backends/test_openai_vllm.py::test_chat - openai.NotFoundError: E...
FAILED test/backends/test_openai_vllm.py::test_chat_stream - openai.NotFoundE...
FAILED test/backends/test_openai_vllm.py::test_format - openai.NotFoundError:...
FAILED test/backends/test_openai_vllm.py::test_generate_from_raw - openai.Not...
FAILED test/backends/test_openai_vllm.py::test_generate_from_raw_with_format
= 7 failed, 716 passed, 142 skipped, 2 xfailed, 90 warnings in 1409.38s (0:23:29) =
at other times.
Across multiple runs, the success rate seems to be about 50-75% (i.e., roughly a 25-50% failure rate).
On further investigation the underlying error for all these cases is:
E openai.NotFoundError: Error code: 404 - {'error': {'message': 'The model `ibm-granite/granite-4.0-micro` does not exist.', 'type': 'NotFoundError', 'param': None, 'code': 404}}
Question to pursue: how is the vLLM server initialized when the tests are run with uv on a GPU-enabled cluster? Clearly, sometimes we get a vLLM environment with the right model loaded, and other times we don't.
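One way to narrow this down would be to query the server's OpenAI-compatible /v1/models endpoint before the tests run and log what is actually loaded. The sketch below is a hypothetical diagnostic, not part of the test suite; the helper names and the localhost:8000 base URL are assumptions, only the model id comes from the error above.

```python
import json
import urllib.request


def has_model(models_payload: dict, model_name: str) -> bool:
    """Check a /v1/models JSON payload for the given model id."""
    return any(
        m.get("id") == model_name for m in models_payload.get("data", [])
    )


def model_is_loaded(base_url: str, model_name: str) -> bool:
    """Query a running server's OpenAI-compatible /v1/models endpoint
    and report whether `model_name` is among the served models."""
    with urllib.request.urlopen(f"{base_url}/v1/models") as resp:
        return has_model(json.load(resp), model_name)


# Hypothetical usage against a local vLLM server:
# model_is_loaded("http://localhost:8000", "ibm-granite/granite-4.0-micro")
```

If a check like this (e.g. in a session-scoped fixture) sometimes reports the model missing, that would point at server startup/readiness on the cluster rather than at the tests themselves.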