The specification defines an open standard Artificial Intelligence model. It is defined through the artifact extension based on the OCI image specification, and extends model features through artifactType and annotations. Model storage and distribution can be optimized based on the artifact extension.
The goal of this specification is to package models in an OCI artifact to take advantage of OCI distribution and ensure efficient model deployment.
The model specification needs to consider two factors:
- The model needs to be stored in an OCI registry, and the registry needs to display the parameters of the model. The model should therefore use the artifact extension to package content other than what the OCI image specification covers.
- The model needs to be mounted by the container runtime as a read-only volume, based on the OCI artifact support in Kubernetes 1.31+. Container runtimes can only pull OCI artifacts that follow the OCI image specification.
Therefore, the model specification must be defined through the artifact extension based on the OCI image specification. This makes it more compatible with the Kubernetes ecosystem.
The model specification is defined through the artifact extension based on the OCI image specification, and extends model features through artifactType and annotations. Model storage and distribution can be optimized based on the artifact extension.
The model specification running workflow is divided into two stages: BUILD & PUSH and PULL & SERVE.
Use tools (ORAS, Ollama, etc.) to build the required resources in the model repository into an artifact that conforms to the model specification. Note that the model layer MUST NOT be compressed, because the model weight files are already compressed. If the model layer is compressed, the container runtime takes a long time to decompress it. It is therefore recommended to use the application/vnd.oci.image.layer.v1.tar format for the model layer to avoid compression.
Next, push the artifact to an OCI registry (Harbor, Docker Hub, etc.), and use the functionality of the OCI registry to manage the model artifact.
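As an illustration of the build step, the following minimal sketch packs files into an uncompressed tar layer and derives the descriptor fields (media type, digest, size) that a manifest would reference. It uses only the Python standard library and is not the method of any particular tool. The function name build_layer and the file name model.safetensors are hypothetical, chosen only for this example.

```python
import hashlib
import io
import json
import tarfile

# Media type that the text recommends for uncompressed model layers.
UNCOMPRESSED_LAYER = "application/vnd.oci.image.layer.v1.tar"


def build_layer(files: dict[str, bytes]) -> tuple[bytes, dict]:
    """Pack files into an uncompressed tar and return (blob, descriptor).

    build_layer is a hypothetical helper for illustration only.
    """
    buf = io.BytesIO()
    # Mode "w" writes a plain tar stream with no gzip/bzip2/xz compression.
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in sorted(files):
            data = files[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = 0  # fixed timestamp keeps the digest reproducible
            tar.addfile(info, io.BytesIO(data))
    blob = buf.getvalue()
    descriptor = {
        "mediaType": UNCOMPRESSED_LAYER,
        "digest": "sha256:" + hashlib.sha256(blob).hexdigest(),
        "size": len(blob),
    }
    return blob, descriptor


# Hypothetical weight file; real weights would be read from disk.
blob, descriptor = build_layer({"model.safetensors": b"\x00" * 1024})
print(json.dumps(descriptor, indent=2))
```

Sorting the file names and fixing the modification time are design choices of this sketch: they make the same inputs produce the same digest, which lets a registry deduplicate identical layers.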
The container runtime (containerd, CRI-O, etc.) pulls the model artifact from the OCI registry and mounts it as a read-only volume. Model distribution can therefore use P2P technology (Dragonfly, Kraken, etc.) to reduce the load on the registry and to preheat the model artifact onto each node. If the model artifact is already present on a node, the container runtime can reuse it and mount it into different containers on the same node.
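To make the serving stage concrete, the sketch below builds a Kubernetes Pod manifest that mounts a model artifact through the image volume source introduced in Kubernetes 1.31. It assumes the cluster has the ImageVolume feature gate enabled and a container runtime that supports it. The helper name model_pod, the registry references, and the mount path are hypothetical placeholders.

```python
import json


def model_pod(name: str, runtime_image: str, model_ref: str, mount_path: str) -> dict:
    """Return a Pod manifest that mounts model_ref as a read-only volume.

    model_pod is a hypothetical helper; all arguments are placeholders.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": "server",
                    "image": runtime_image,
                    "volumeMounts": [
                        {"name": "model", "mountPath": mount_path, "readOnly": True}
                    ],
                }
            ],
            "volumes": [
                {
                    "name": "model",
                    # "image" volume source: the kubelet asks the runtime to
                    # pull the OCI reference and expose its contents.
                    "image": {"reference": model_ref, "pullPolicy": "IfNotPresent"},
                }
            ],
        },
    }


pod = model_pod(
    name="model-server",
    runtime_image="registry.example.com/serving/runtime:latest",  # placeholder
    model_ref="registry.example.com/models/demo:v1",  # placeholder
    mount_path="/models",
)
print(json.dumps(pod, indent=2))
```

The printed JSON can be saved to a file and applied with kubectl, which accepts JSON as well as YAML. The IfNotPresent pull policy in this sketch matches the reuse behavior described above: a model already present on the node is not pulled again.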
The model specification is based on the OCI image specification and focuses on defining the artifact extension according to the Artifacts Guidance.
- artifactType string

  This REQUIRED property MUST contain the media type application/vnd.cnai.model.manifest.v1+json.

- layers array of objects

  - artifactType string

    Implementations MUST support at least the following media types:

    - application/vnd.cnai.model.layer.v1.tar
    - application/vnd.cnai.model.layer.v1.tar+gzip

    If mediaType is application/vnd.oci.image.layer.v1.tar, the artifactType MUST be application/vnd.cnai.model.layer.v1.tar. If mediaType is application/vnd.oci.image.layer.v1.tar+gzip, the artifactType MUST be application/vnd.cnai.model.layer.v1.tar+gzip. The mediaType and artifactType MUST be consistent. For detailed definitions of Filesystem Layers, please refer to the Image Layer Filesystem Changeset.

  - annotations string-string map

    This OPTIONAL property contains arbitrary metadata for the layer. For the model specification, implementations SHOULD set the pre-defined annotation keys; refer to the Layer Annotation Keys.

- annotations string-string map

  This OPTIONAL property contains arbitrary metadata for the image manifest. For the model specification, implementations SHOULD set the pre-defined annotation keys; refer to the Manifest Annotation Keys.
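The property rules above can be checked mechanically. The following minimal sketch validates a manifest, represented as a Python dict, against the manifest artifactType requirement and the mediaType/artifactType pairing for layers. The function name check_manifest is hypothetical, and the example manifests omit fields such as digest and size for brevity, so they are not complete manifests.

```python
MANIFEST_ARTIFACT_TYPE = "application/vnd.cnai.model.manifest.v1+json"

# Layer mediaType -> required layer artifactType, as stated in the text.
LAYER_TYPE_FOR_MEDIA_TYPE = {
    "application/vnd.oci.image.layer.v1.tar":
        "application/vnd.cnai.model.layer.v1.tar",
    "application/vnd.oci.image.layer.v1.tar+gzip":
        "application/vnd.cnai.model.layer.v1.tar+gzip",
}


def check_manifest(manifest: dict) -> list[str]:
    """Return a list of rule violations (empty if none were found).

    check_manifest is a hypothetical helper, not part of any tool.
    """
    errors = []
    if manifest.get("artifactType") != MANIFEST_ARTIFACT_TYPE:
        errors.append("manifest artifactType must be " + MANIFEST_ARTIFACT_TYPE)
    for i, layer in enumerate(manifest.get("layers", [])):
        expected = LAYER_TYPE_FOR_MEDIA_TYPE.get(layer.get("mediaType"))
        if expected is None:
            # The pairing rule only covers the two media types listed above.
            continue
        if layer.get("artifactType") != expected:
            errors.append(f"layers[{i}] artifactType must be {expected}")
    return errors


# Example manifests (digest and size omitted for brevity).
good = {
    "artifactType": MANIFEST_ARTIFACT_TYPE,
    "layers": [
        {
            "mediaType": "application/vnd.oci.image.layer.v1.tar",
            "artifactType": "application/vnd.cnai.model.layer.v1.tar",
        }
    ],
}
bad = {
    "artifactType": MANIFEST_ARTIFACT_TYPE,
    "layers": [
        {
            "mediaType": "application/vnd.oci.image.layer.v1.tar",
            "artifactType": "application/vnd.cnai.model.layer.v1.tar+gzip",
        }
    ],
}

print(check_manifest(good))  # prints []
print(len(check_manifest(bad)))  # prints 1
```

Returning a list of messages instead of raising on the first failure is a choice made for this sketch, so that every inconsistent layer is reported in one pass.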