v0.6.1
What's Changed
- Added Slurm dependency example
- Added unit tests for vec-inf client and missing unit tests for vec-inf API
- Fixed multi-node launch GPU placement group issue: the `--exclusive` option is needed in the Slurm script, and the compilation config needs to stay at 0
- Set environment variables in the generated Slurm script instead of in the helper to ensure reusability
- Replaced `python3.10 -m vllm.entrypoints.openai.api_server` with `vllm serve` to support custom chat template usage
- Added additional launch options: `--exclude` for excluding certain nodes, `--node-list` for targeting a specific list of nodes, and `--bind` for binding additional directories
- Added remaining vLLM engine argument short-to-long name mappings for robustness
- Added notes in documentation to capture some gotchas and added vLLM version info
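Taken together, the multi-node fixes above imply a generated Slurm script shaped roughly like the sketch below. This is an illustrative config fragment, not the actual template vec-inf emits; the node counts, paths, model name, and environment variable names are assumptions.

```shell
#!/bin/bash
#SBATCH --job-name=vec-inf-example   # illustrative job name
#SBATCH --nodes=2                    # multi-node launch
#SBATCH --exclusive                  # needed for correct GPU placement groups

# Environment variables are now set here, in the generated script,
# rather than in the Python helper, so the script is reusable on its own.
export EXAMPLE_VAR=example_value     # hypothetical variable

# `vllm serve` replaces `python3.10 -m vllm.entrypoints.openai.api_server`,
# which allows passing a custom chat template.
vllm serve /model-weights/example-model --chat-template /path/to/template.jinja
```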
Tests Added
`tests/vec-inf/client/test_api.py`:
- `shutdown_model()`
- `wait_until_ready()`
`tests/vec-inf/client/test_helper.py`:
- `ModelRegistry`
- `PerformanceMetricsCollector`
- `ModelStatusMonitor`
- `ModelLauncher`