The Open Inference Protocol (OIP) specification defines a standard protocol for performing machine learning model inference across serving runtimes for different ML frameworks. The protocol facilitates the implementation of a standardized, high-performance data plane, promoting interoperability among model serving runtimes. The specification enables the creation of a cohesive inference experience, empowering the development of versatile client and benchmarking tools that work with all supported serving runtimes. The specification consists of two parts:
- The inference REST specification
- The inference gRPC specification
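As an illustrative sketch (not part of the specification text itself), the snippet below sends an inference request using the REST specification's standard endpoints. It assumes a conforming server running at `localhost:8000` that serves a hypothetical model named `my-model` accepting a single FP32 tensor:

```python
# Minimal sketch of an OIP (v2) REST inference request.
# Assumptions: a conforming server at localhost:8000 hosting a
# hypothetical model "my-model" that takes one FP32 input tensor.
import requests

BASE_URL = "http://localhost:8000"  # assumed server address
MODEL = "my-model"                  # hypothetical model name

# Standard v2 readiness endpoints for the server and the model.
assert requests.get(f"{BASE_URL}/v2/health/ready").status_code == 200
assert requests.get(f"{BASE_URL}/v2/models/{MODEL}/ready").status_code == 200

# A v2 inference request: each input carries a name, shape,
# datatype, and a flat data array.
payload = {
    "inputs": [
        {
            "name": "input-0",
            "shape": [1, 4],
            "datatype": "FP32",
            "data": [0.1, 0.2, 0.3, 0.4],
        }
    ]
}

resp = requests.post(f"{BASE_URL}/v2/models/{MODEL}/infer", json=payload)
resp.raise_for_status()

# The response mirrors the request: a list of named output tensors.
for output in resp.json()["outputs"]:
    print(output["name"], output["shape"], output["datatype"], output["data"])
```

Because every conforming runtime exposes the same endpoints and payload schema, the same client code can be pointed at any of the implementations listed below.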
The protocol has been adopted by a number of model serving runtimes, including:
- KServe v2 inference protocol
- NVIDIA Triton inference server protocol
- Seldon MLServer
- Seldon Core v2 inference protocol
- OpenVINO RESTful API and gRPC API
- AMD Inference Server
- TorchServe Inference API
Changes to the specification are versioned according to Semantic Versioning 2.0 and described in CHANGELOG.md. Layout changes are not versioned. Implementations of the specification should state which version they implement.
We hold a public monthly community meeting on Wednesdays at 10 AM US/Pacific; please convert to your local time.
You can also find these meetings on the community calendar, along with other major community events.
You can find the meeting minutes from the monthly work group sessions in this Google Doc.
You can access the meeting recordings on the community calendar by clicking on the respective date's event details.
For questions or issues, use the #kserve-oip-collaboration channel in CNCF Slack.
For bug reports and feature requests, please use Open Inference Protocol issues.
By contributing to the Open Inference Protocol Specification repository, you agree that your contributions will be licensed under its Apache 2.0 License.