OpenVINO™ Model Server

OpenVINO™ Model Server (OVMS) is a scalable, high-performance solution for serving machine learning models optimized for Intel® architectures. The server provides an inference service via gRPC or REST API, making it easy to deploy new algorithms and AI experiments using the same architecture as TensorFlow Serving, for any model trained in a framework that is supported by OpenVINO.

The server implements a gRPC and REST API framework, with data serialization and deserialization using the TensorFlow Serving API and OpenVINO™ as the inference execution provider. Model repositories may reside on a locally accessible file system (e.g. NFS), Google Cloud Storage (GCS), Amazon S3, Minio or Azure Blob Storage.

OVMS is now implemented in C++ and provides much higher scalability than its Python predecessor. You can take advantage of the full capabilities of Xeon CPUs or AI accelerators and expose them over the network interface. Read the release notes to find out what's new in the C++ version.

Review the Architecture concept document for more details.

A few key features:

Note: OVMS has been tested on RedHat*, CentOS* and Ubuntu*. The latest publicly released Docker images are based on Ubuntu and UBI. They are stored in:

Run OpenVINO Model Server

A demonstration of how to use OpenVINO Model Server can be found in the quick start guide.

More detailed guides to using Model Server in various scenarios can be found here:

API documentation

gRPC

Learn more about the gRPC API

Refer to the gRPC example client code to learn how to submit requests using the gRPC interface.

REST

Learn more about the REST API

Refer to the REST API example client code to learn how to use the REST API.
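Because the REST interface follows the TensorFlow Serving API, the endpoints for model status, model metadata and prediction share one URL scheme. The sketch below only builds the URLs and a row-format request body; it does not contact a server. The host, the port `8000`, the model name `resnet` and the helper names are illustrative assumptions, not values taken from this repository.

```python
import json
from typing import Optional


def model_url(host: str, port: int, model: str,
              version: Optional[int] = None, suffix: str = "") -> str:
    """Build a TensorFlow Serving style REST URL (hypothetical helper)."""
    path = f"/v1/models/{model}"
    if version is not None:
        # The version segment is optional in the TensorFlow Serving REST API.
        path += f"/versions/{version}"
    return f"http://{host}:{port}{path}{suffix}"


def predict_body(instances: list) -> bytes:
    """Encode a row-format ("instances") Predict request body as JSON."""
    return json.dumps({"instances": instances}).encode("utf-8")


# Assumed deployment: server on localhost:8000 serving a model named "resnet".
status_url = model_url("localhost", 8000, "resnet")
metadata_url = model_url("localhost", 8000, "resnet", version=1, suffix="/metadata")
predict_url = model_url("localhost", 8000, "resnet", version=1, suffix=":predict")
body = predict_body([[0.0, 1.0, 2.0]])

print(status_url)
print(metadata_url)
print(predict_url)
```

Status and metadata are read with a GET request, while `body` would be sent with a POST to `predict_url`, for example through `urllib.request` or any other HTTP client.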

OVMS Python Client Library

For simplified interaction with the model server API, the Python client library has been introduced. It's a set of Python functions and objects that wrap things like:

  • setting connection with the server
  • creating TensorProto from data
  • creating requests for model status, model metadata and prediction
  • sending requests to appropriate endpoints

Testing

Learn more about tests in the developer guide

Known Limitations

  • Currently, Predict, GetModelMetadata and GetModelStatus calls are implemented using the TensorFlow Serving API.
  • Classify, Regress and MultiInference are not included.
  • Output_filter is not effective in the Predict call. All outputs defined in the model are returned to the clients.
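Since the server returns every output regardless of the output filter, a client that needs only some outputs can drop the rest after the response is decoded. The following is a minimal sketch, assuming the response has already been converted into a dictionary keyed by output name; the function name and the output names `prob` and `features` are hypothetical.

```python
from typing import Any, Dict, Iterable


def filter_outputs(outputs: Dict[str, Any], wanted: Iterable[str]) -> Dict[str, Any]:
    """Keep only the requested outputs from a decoded Predict response."""
    wanted = list(wanted)
    missing = [name for name in wanted if name not in outputs]
    if missing:
        # Fail loudly instead of silently returning fewer outputs than requested.
        raise KeyError(f"outputs not present in response: {missing}")
    return {name: outputs[name] for name in wanted}


# Hypothetical decoded response containing two model outputs.
response_outputs = {"prob": [[0.1, 0.9]], "features": [[1.0, 2.0, 3.0]]}
selected = filter_outputs(response_outputs, ["prob"])
print(selected)
```

Filtering on the client does not reduce the amount of data sent over the network; it only narrows what the calling code has to handle.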

OpenVINO Model Server Contribution Policy

  • All contributed code must be compatible with the Apache 2 license.

  • All changes have to pass style, unit and functional tests.

  • All new features need to be covered by tests.

Follow the contributor guide and the developer guide.

References

Contact

Submit a GitHub issue to ask a question, request a feature or report a bug.


* Other names and brands may be claimed as the property of others.
