v0.6.1
What's Changed
- Added Slurm dependency example
- Added unit tests for vec-inf client and missing unit tests for vec-inf API
- Fixed multi-node launch GPU placement group issue: the `--exclusive` option is needed in the Slurm script, and the compilation config needs to stay at 0
- Set environment variables in the generated Slurm script instead of in the helper to ensure reusability
- Replaced `python3.10 -m vllm.entrypoints.openai.api_server` with `vllm serve` to support custom chat template usage
- Added additional launch options: `--exclude` for excluding certain nodes, `--node-list` for targeting a specific list of nodes, and `--bind` for binding additional directories
- Added remaining vLLM engine argument short-to-long name mappings for robustness
- Added notes in documentation to capture some gotchas and added vLLM version info
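Taken together, the multi-node fixes above imply a generated Slurm script shaped roughly like the sketch below. This is an illustrative config fragment, not the actual template vec-inf emits; the node counts, paths, model name, and environment variable names are assumptions.

```shell
#!/bin/bash
#SBATCH --job-name=vec-inf-example   # illustrative job name
#SBATCH --nodes=2                    # multi-node launch
#SBATCH --exclusive                  # needed for correct GPU placement groups

# Environment variables are now set here, in the generated script,
# rather than in the Python helper, so the script is reusable on its own.
export EXAMPLE_VAR=example_value     # hypothetical variable

# `vllm serve` replaces `python3.10 -m vllm.entrypoints.openai.api_server`,
# which allows passing a custom chat template.
vllm serve /model-weights/example-model --chat-template /path/to/template.jinja
```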
Tests Added
`tests/vec-inf/client/test_api.py`:
- `shutdown_model()`
- `wait_until_ready()`
`tests/vec-inf/client/test_helper.py`:
- `ModelRegistry`
- `PerformanceMetricsCollector`
- `ModelStatusMonitor`
- `ModelLauncher`