ONNXRuntime Benchmarking

This repository contains an AI model benchmarking on EtGlowExecutionProvider provider hardware. Executions on CPU EP are also provided for comparison.

Performance Results

Model	Size	Batch	ET Provider (Inf/s)	ET Provider (Inf/s/W)
Esperanto/resnet50-onnx	fp32	1	269.806	9.958698
Esperanto/resnet50-onnx	fp32	2	345.299	12.543673
Esperanto/resnet50-onnx	fp32	4	466.124	16.233861
Esperanto/resnet50-onnx	fp32	8	427.453	15.111834
Esperanto/vgg-16-bn-fp32-onnx	fp32	1	106.370	3.929013
Esperanto/vgg-16-bn-fp32-onnx	fp32	2	148.595	5.259819
Esperanto/vgg-16-bn-fp32-onnx	fp32	4	168.185	5.978496
Esperanto/vgg-16-bn-fp32-onnx	fp32	8	167.038	6.096589
Esperanto/vgg-16-bn-fp32-onnx	fp32	16	142.018	5.347647
Esperanto/vgg19-7-onnx	fp32	1	93.033	3.379863
Esperanto/vgg19-7-onnx	fp32	2	125.453	4.437935
Esperanto/vgg19-7-onnx	fp32	4	137.954	4.873554
Esperanto/vgg19-7-onnx	fp32	8	113.090	4.374226
Esperanto/mobilenet-v2-fp32-onnx	fp32	1	722.274	33.004601
Esperanto/mobilenet-v2-fp32-onnx	fp32	2	1155.423	49.460166
Esperanto/mobilenet-v2-fp32-onnx	fp32	4	1628.399	64.735260
Esperanto/mobilenet-v2-fp32-onnx	fp32	8	1478.598	56.441255
Esperanto/mobilenet-v3-small-onnx	fp32	1	319.873	17.816432
Esperanto/mobilenet-v3-small-onnx	fp32	2	598.362	32.487561
Esperanto/mobilenet-v3-small-onnx	fp32	4	1040.481	53.614453
Esperanto/mobilenet-v3-small-onnx	fp32	8	1748.935	82.216533
Esperanto/mobilenet-v3-small-onnx	fp32	16	2405.113	109.051403
Esperanto/mobilenet-v3-small-onnx	fp32	32	2120.889	89.259395
onnx-community/mobilenet_v2_1.4_224	fp32	1	139.007	6.622033
onnx-community/mobilenet_v2_1.4_224	fp32	2	190.644	8.958979
onnx-community/mobilenet_v2_1.4_224	fp32	4	218.698	9.774437
onnx-community/mobilenet_v2_1.4_224	fp32	8	232.392	10.256368
onnx-community/mobilenet_v2_1.4_224	fp32	16	210.324	9.370816

Model (results for MMLUDataset)	Size	Batch	Sequence Len	Window	Implementation	ET Time To First Token (TTFT)	ET Prompt parsing TPS (tok/s)	ET Prompt parsing TPS/W (tok/s/W)
Esperanto/llama3.1-8b-Instruct-kvc-fp16-onnx	fp16	1	2048	1	Default	20.806	3.36	0.15
Esperanto/llama3.1-8b-Instruct-kvc-fp16-onnx	fp16	1	2048	1	IOBindings	14.282	4.9	0.2
Esperanto/llama3.1-8b-Instruct-kvc-fp16-onnx	fp16	1	4096	1	Default	40.342	1.74	0.08
Esperanto/llama3.1-8b-Instruct-kvc-fp16-onnx	fp16	1	4096	1	IOBindings	25.332	2.76	0.12
Esperanto/llama3.1-8b-Instruct-kvc-fp16-onnx	fp16	1	2048	32	Default	0.901	77.7	3.19
Esperanto/llama3.1-8b-Instruct-kvc-fp16-onnx	fp16	1	2048	32	IOBindings	0.683	102.46	4.17
Esperanto/llama3.1-8b-Instruct-kvc-fp16-onnx	fp16	1	4096	32	Default	1.671	41.9	1.98
Esperanto/llama3.1-8b-Instruct-kvc-fp16-onnx	fp16	1	4096	32	IOBindings	1.171	59.77	2.59
Esperanto/llama3.1-8b-Instruct-kvc-fp16-onnx	fp16	1	2048	96	Default	0.352	199.04	7.13
Esperanto/llama3.1-8b-Instruct-kvc-fp16-onnx	fp16	1	2048	96	IOBindings	0.287	334.07	12.88
Esperanto/llama3.1-8b-Instruct-kvc-fp16-onnx	fp16	1	4096	96	Default	0.622	112.46	5.0
Esperanto/llama3.1-8b-Instruct-kvc-fp16-onnx	fp16	1	4096	96	IOBindings	0.526	182.39	8.36
Esperanto/llama3.1-8b-Instruct-kvc-fp16-onnx	fp16	1	2048	128	Default	0.352	199.01	6.27
Esperanto/llama3.1-8b-Instruct-kvc-fp16-onnx	fp16	1	2048	128	IOBindings	0.318	403.04	15.57
Esperanto/llama3.1-8b-Instruct-kvc-fp16-onnx	fp16	1	4096	128	Default	0.666	105.13	4.38
Esperanto/llama3.1-8b-Instruct-kvc-fp16-onnx	fp16	1	4096	128	IOBindings	0.578	221.59	9.55

Prepare development environment

The onnxruntime-benchmarking repository requires a system that has the Esperanto SDK pre-installed. The SDK is a set of utilities and tools that allow the user to transparently use the Esperanto SW stack, including the onnxruntime esperanto fork.

Getting sources

First step is to get the onnxruntime-benchmarking sources.

git clone [email protected]:esperantotech/software/onnxruntime-benchmarking.git
cd onnxruntime-benchmarking/

Start dockerized environment

We have to load a docker from somewhere!!! Assuming you have a sw-platform installation in your $HOME:

./sw-platform/dock.py --image=convoke/ubuntu-22.04-et-sw-develop-stack prompt

Installing dependencies

In order to successfully execute this benchmark we first need to install all its dependencies. This can be performed with these two commands:

python3 -m pip install -r requirements.txt --extra-index-url https://sc-artifactory1.esperanto.ai/artifactory/api/pypi/pypi-virtual/simple
python3 -m pip install opencv-python-headless~=4.10.0.84

Setting environment

This benchmark utilizes HuggingFace to download models and datasets automatically.

Make sure to go to your HuggingFace account settings -> Access Tokens and create a token with the following permisisons:

(Personal permissions) Read access to contents of all repos under your personal namespace
(Personal permissions) Read access to contents of all public gated repos you can access
(Org permissions) Read access to contents of all repos in selected organizations

Then set HF_TOKEN with the newly created token

export HF_TOKEN=<your_hugging_face_token>

OBS: you will need to have access to Esperanto EtSoC-1 accelerators to be able to successfully execute

Dataset Access:

Visit HuggingFace Datasets - ImageNet-1k and follow the instructions to grant access to the ImageNet-1k dataset.
Make sure you can view and access the dataset details without any permission errors.

How to run

To run this benchmark is as easy as running python3 bench.py <params>.

Several examples:

Listing models and the configurations available:

$ python3 bench.py  -lm
2025-01-30 09:41:58.541933 | INFO     | Available models
Models       configs
-----------  -----------------------------------------
mobilenetv2  ['small', 'large']
resnet       ['xx', '50']
bert         ['base', 'large', 'albert', 'distilbert']
llama3       ['', 'kvc']
```

2. Running Resnet50

$ python3 bench.py -m resnet -c 50


3. Running Resnet50 limiting number of launches to 10

$ python3 bench.py -m resnet -c 50 -l 10



4. Running Resnet50 limiting the benchmark execution time to 7 seconds

$ python3 bench.py -m resnet -c 50 -tc 7

Name		Name	Last commit message	Last commit date
Latest commit History 68 Commits
models		models
scripts		scripts
.gitignore		.gitignore
.gitlab-ci.yml		.gitlab-ci.yml
.pylintrc		.pylintrc
LICENSE.md		LICENSE.md
README.md		README.md
bench.py		bench.py
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt
sonar-project.properties		sonar-project.properties
tox.ini		tox.ini

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

ONNXRuntime Benchmarking

Performance Results

Prepare development environment

Getting sources

Start dockerized environment

Installing dependencies

Setting environment

Dataset Access:

How to run

About

Releases

Packages

Languages

License

esperantotech/onnxruntime-benchmarking

Folders and files

Latest commit

History

Repository files navigation

ONNXRuntime Benchmarking

Performance Results

Prepare development environment

Getting sources

Start dockerized environment

Installing dependencies

Setting environment

Dataset Access:

How to run

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages