[FEATURE] KServe gRPC frontend expose full ModelConfig specification #3438

@GuanLuo

Description

The current ModelConfig endpoint in the KServe gRPC frontend populates the ModelConfig only from TensorModelConfig; all other fields are left at their default values. However, some Triton Inference Server deployments rely on ModelConfig, and allowing a Dynamo worker to provide the full ModelConfig specification would ease migration.

For this feature request, I may add a new "extra" field to TensorModelConfig so that Dynamo can still treat a "tensor-based model" as generically as possible, while specialized information can be passed along and interpreted by the components that understand it.
For ModelConfig, which I categorize as specialized Triton Inference Server metadata, you would pass {..., "extra": {"triton_model_config": {...}}}, where the value of triton_model_config is the JSON representation of the ModelConfig. The gRPC frontend will then deserialize triton_model_config if it is present; otherwise it will populate the ModelConfig only from the base fields in TensorModelConfig.
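To make the proposed lookup order concrete, here is a minimal sketch in Python using plain dictionaries. It is not Dynamo or Triton code: the function name `build_model_config` is hypothetical, the base field names `inputs` and `outputs` on TensorModelConfig are assumptions, and the fallback copies tensor entries as-is without any data-type mapping a real frontend may need.

```python
import json
from typing import Any, Dict


def build_model_config(name: str, tensor_model_config: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: build a ModelConfig-like dict for the endpoint."""
    extra = tensor_model_config.get("extra") or {}
    triton_cfg = extra.get("triton_model_config")

    if triton_cfg is not None:
        # Assumption: the worker may send either a JSON string or a parsed object.
        if isinstance(triton_cfg, str):
            triton_cfg = json.loads(triton_cfg)
        config = dict(triton_cfg)
        # Keep the registered model name if the worker did not set one.
        config.setdefault("name", name)
        return config

    # Fallback: only the base fields are known; everything else stays at defaults.
    return {
        "name": name,
        "input": list(tensor_model_config.get("inputs", [])),
        "output": list(tensor_model_config.get("outputs", [])),
    }


# A worker that supplies the full specification through "extra".
with_extra = {
    "inputs": [],
    "outputs": [],
    "extra": {"triton_model_config": {"max_batch_size": 8, "backend": "python"}},
}
print(build_model_config("my_model", with_extra))

# A worker that only supplies the generic tensor description.
print(build_model_config("my_model", {"inputs": [{"name": "x"}], "outputs": []}))
```

Keeping the Triton-specific payload under a single key in "extra" means components that do not understand it can ignore it, which matches the intent of keeping TensorModelConfig generic.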
