Skip to content

Commit

Permalink
Results from GH action on NVIDIA_RTX4090x1
Browse files Browse the repository at this point in the history
  • Loading branch information
arjunsuresh committed Feb 4, 2025
1 parent f33a521 commit 34218da
Show file tree
Hide file tree
Showing 52 changed files with 20,564 additions and 20,564 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -40,4 +40,4 @@ Model Precision: int8
### Accuracy Results

### Performance Results
`Samples per query`: `11520571.0`
`Samples per query`: `11595368.0`
Original file line number Diff line number Diff line change
@@ -1,8 +1,8 @@
[2025-02-03 16:47:08,854 main.py:229 INFO] Detected system ID: KnownSystem.Nvidia_0365aa3a001c
[2025-02-03 16:47:08,936 harness.py:249 INFO] The harness will load 2 plugins: ['build/plugins/NMSOptPlugin/libnmsoptplugin.so', 'build/plugins/retinanetConcatPlugin/libretinanetconcatplugin.so']
[2025-02-03 16:47:08,937 generate_conf_files.py:107 INFO] Generated measurements/ entries for Nvidia_0365aa3a001c_TRT/retinanet/MultiStream
[2025-02-03 16:47:08,937 __init__.py:46 INFO] Running command: ./build/bin/harness_default --plugins="build/plugins/NMSOptPlugin/libnmsoptplugin.so,build/plugins/retinanetConcatPlugin/libretinanetconcatplugin.so" --logfile_outdir="/mlc-mount/home/arjun/gh_action_results/valid_results/RTX4090x1-nvidia_original-gpu-tensorrt-vdefault-default_config/retinanet/multistream/accuracy" --logfile_prefix="mlperf_log_" --performance_sample_count=64 --test_mode="AccuracyOnly" --gpu_copy_streams=1 --gpu_inference_streams=1 --use_deque_limit=true --gpu_batch_size=2 --map_path="data_maps/open-images-v6-mlperf/val_map.txt" --mlperf_conf_path="/home/mlcuser/MLC/repos/local/cache/get-git-repo_02ea1bfc/inference/mlperf.conf" --tensor_path="build/preprocessed_data/open-images-v6-mlperf/validation/Retinanet/int8_linear" --use_graphs=true --user_conf_path="/home/mlcuser/MLC/repos/mlcommons@mlperf-automations/script/generate-mlperf-inference-user-conf/tmp/830d3f2fe5614831a503db704a077549.conf" --gpu_engines="./build/engines/Nvidia_0365aa3a001c/retinanet/MultiStream/retinanet-MultiStream-gpu-b2-int8.lwis_k_99_MaxP.plan" --max_dlas=0 --scenario MultiStream --model retinanet --response_postprocess openimageeffnms
[2025-02-03 16:47:08,937 __init__.py:53 INFO] Overriding Environment
[2025-02-03 23:03:57,764 main.py:229 INFO] Detected system ID: KnownSystem.Nvidia_35fc6d42651a
[2025-02-03 23:03:57,842 harness.py:249 INFO] The harness will load 2 plugins: ['build/plugins/NMSOptPlugin/libnmsoptplugin.so', 'build/plugins/retinanetConcatPlugin/libretinanetconcatplugin.so']
[2025-02-03 23:03:57,843 generate_conf_files.py:107 INFO] Generated measurements/ entries for Nvidia_35fc6d42651a_TRT/retinanet/MultiStream
[2025-02-03 23:03:57,843 __init__.py:46 INFO] Running command: ./build/bin/harness_default --plugins="build/plugins/NMSOptPlugin/libnmsoptplugin.so,build/plugins/retinanetConcatPlugin/libretinanetconcatplugin.so" --logfile_outdir="/mlc-mount/home/arjun/gh_action_results/valid_results/RTX4090x1-nvidia_original-gpu-tensorrt-vdefault-default_config/retinanet/multistream/accuracy" --logfile_prefix="mlperf_log_" --performance_sample_count=64 --test_mode="AccuracyOnly" --gpu_copy_streams=1 --gpu_inference_streams=1 --use_deque_limit=true --gpu_batch_size=2 --map_path="data_maps/open-images-v6-mlperf/val_map.txt" --mlperf_conf_path="/home/mlcuser/MLC/repos/local/cache/get-git-repo_02ea1bfc/inference/mlperf.conf" --tensor_path="build/preprocessed_data/open-images-v6-mlperf/validation/Retinanet/int8_linear" --use_graphs=true --user_conf_path="/home/mlcuser/MLC/repos/mlcommons@mlperf-automations/script/generate-mlperf-inference-user-conf/tmp/df1126a310fa40a8b43effd5f7c0ff8e.conf" --gpu_engines="./build/engines/Nvidia_35fc6d42651a/retinanet/MultiStream/retinanet-MultiStream-gpu-b2-int8.lwis_k_99_MaxP.plan" --max_dlas=0 --scenario MultiStream --model retinanet --response_postprocess openimageeffnms
[2025-02-03 23:03:57,843 __init__.py:53 INFO] Overriding Environment
benchmark : Benchmark.Retinanet
buffer_manager_thread_count : 0
data_dir : /home/mlcuser/MLC/repos/local/cache/get-mlperf-inference-nvidia-scratch-space_fe95ede4/data
Expand All @@ -12,7 +12,7 @@ gpu_copy_streams : 1
gpu_inference_streams : 1
input_dtype : int8
input_format : linear
log_dir : /home/mlcuser/MLC/repos/local/cache/get-git-repo_e7fa5107/repo/closed/NVIDIA/build/logs/2025.02.03-16.47.07
log_dir : /home/mlcuser/MLC/repos/local/cache/get-git-repo_e7fa5107/repo/closed/NVIDIA/build/logs/2025.02.03-23.03.56
map_path : data_maps/open-images-v6-mlperf/val_map.txt
mlperf_conf_path : /home/mlcuser/MLC/repos/local/cache/get-git-repo_02ea1bfc/inference/mlperf.conf
multi_stream_expected_latency_ns : 0
Expand All @@ -21,14 +21,14 @@ multi_stream_target_latency_percentile : 99
precision : int8
preprocessed_data_dir : /home/mlcuser/MLC/repos/local/cache/get-mlperf-inference-nvidia-scratch-space_fe95ede4/preprocessed_data
scenario : Scenario.MultiStream
system : SystemConfiguration(host_cpu_conf=CPUConfiguration(layout={CPU(name='AMD Ryzen 9 7950X 16-Core Processor', architecture=<CPUArchitecture.x86_64: AliasedName(name='x86_64', aliases=(), patterns=())>, core_count=16, threads_per_core=2): 1}), host_mem_conf=MemoryConfiguration(host_memory_capacity=Memory(quantity=131.080068, byte_suffix=<ByteSuffix.GB: (1000, 3)>, _num_bytes=131080068000), comparison_tolerance=0.05), accelerator_conf=AcceleratorConfiguration(layout=defaultdict(<class 'int'>, {GPU(name='NVIDIA GeForce RTX 4090', accelerator_type=<AcceleratorType.Discrete: AliasedName(name='Discrete', aliases=(), patterns=())>, vram=Memory(quantity=23.98828125, byte_suffix=<ByteSuffix.GiB: (1024, 3)>, _num_bytes=25757220864), max_power_limit=450.0, pci_id='0x268410DE', compute_sm=89): 1})), numa_conf=None, system_id='Nvidia_0365aa3a001c')
system : SystemConfiguration(host_cpu_conf=CPUConfiguration(layout={CPU(name='AMD Ryzen 9 7950X 16-Core Processor', architecture=<CPUArchitecture.x86_64: AliasedName(name='x86_64', aliases=(), patterns=())>, core_count=16, threads_per_core=2): 1}), host_mem_conf=MemoryConfiguration(host_memory_capacity=Memory(quantity=131.080068, byte_suffix=<ByteSuffix.GB: (1000, 3)>, _num_bytes=131080068000), comparison_tolerance=0.05), accelerator_conf=AcceleratorConfiguration(layout=defaultdict(<class 'int'>, {GPU(name='NVIDIA GeForce RTX 4090', accelerator_type=<AcceleratorType.Discrete: AliasedName(name='Discrete', aliases=(), patterns=())>, vram=Memory(quantity=23.98828125, byte_suffix=<ByteSuffix.GiB: (1024, 3)>, _num_bytes=25757220864), max_power_limit=450.0, pci_id='0x268410DE', compute_sm=89): 1})), numa_conf=None, system_id='Nvidia_35fc6d42651a')
tensor_path : build/preprocessed_data/open-images-v6-mlperf/validation/Retinanet/int8_linear
test_mode : AccuracyOnly
use_deque_limit : True
use_graphs : True
user_conf_path : /home/mlcuser/MLC/repos/mlcommons@mlperf-automations/script/generate-mlperf-inference-user-conf/tmp/830d3f2fe5614831a503db704a077549.conf
system_id : Nvidia_0365aa3a001c
config_name : Nvidia_0365aa3a001c_retinanet_MultiStream
user_conf_path : /home/mlcuser/MLC/repos/mlcommons@mlperf-automations/script/generate-mlperf-inference-user-conf/tmp/df1126a310fa40a8b43effd5f7c0ff8e.conf
system_id : Nvidia_35fc6d42651a
config_name : Nvidia_35fc6d42651a_retinanet_MultiStream
workload_setting : WorkloadSetting(HarnessType.LWIS, AccuracyTarget.k_99, PowerSetting.MaxP)
optimization_level : plugin-enabled
num_profiles : 1
Expand All @@ -40,15 +40,15 @@ power_limit : None
cpu_freq : None
&&&& RUNNING Default_Harness # ./build/bin/harness_default
[I] mlperf.conf path: /home/mlcuser/MLC/repos/local/cache/get-git-repo_02ea1bfc/inference/mlperf.conf
[I] user.conf path: /home/mlcuser/MLC/repos/mlcommons@mlperf-automations/script/generate-mlperf-inference-user-conf/tmp/830d3f2fe5614831a503db704a077549.conf
[I] user.conf path: /home/mlcuser/MLC/repos/mlcommons@mlperf-automations/script/generate-mlperf-inference-user-conf/tmp/df1126a310fa40a8b43effd5f7c0ff8e.conf
Creating QSL.
Finished Creating QSL.
Setting up SUT.
[I] [TRT] Loaded engine size: 73 MiB
[I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +6, GPU +10, now: CPU 124, GPU 888 (MiB)
[I] [TRT] [MemUsageChange] Init cuDNN: CPU +2, GPU +10, now: CPU 126, GPU 898 (MiB)
[I] [TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +68, now: CPU 0, GPU 68 (MiB)
[I] Device:0.GPU: [0] ./build/engines/Nvidia_0365aa3a001c/retinanet/MultiStream/retinanet-MultiStream-gpu-b2-int8.lwis_k_99_MaxP.plan has been successfully loaded.
[I] Device:0.GPU: [0] ./build/engines/Nvidia_35fc6d42651a/retinanet/MultiStream/retinanet-MultiStream-gpu-b2-int8.lwis_k_99_MaxP.plan has been successfully loaded.
[E] [TRT] 3: [runtime.cpp::~Runtime::401] Error Code 3: API Usage Error (Parameter check failed at: runtime/rt/runtime.cpp::~Runtime::401, condition: mEngineCounter.use_count() == 1 Destroying a runtime before destroying deserialized engines created by the runtime leads to undefined behavior.)
[I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 53, GPU 900 (MiB)
[I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 53, GPU 908 (MiB)
Expand All @@ -59,7 +59,7 @@ Setting up SUT.
[I] Creating batcher thread: 0 EnableBatcherThreadPerDevice: false
Finished setting up SUT.
Starting warmup. Running for a minimum of 5 seconds.
Finished warmup. Ran for 5.14346s.
Finished warmup. Ran for 5.14326s.
Starting running actual test.

No warnings encountered during test.
Expand All @@ -72,8 +72,8 @@ Device Device:0.GPU processed:
PerSampleCudaMemcpy Calls: 0
BatchedCudaMemcpy Calls: 12392
&&&& PASSED Default_Harness # ./build/bin/harness_default
[2025-02-03 16:48:10,907 run_harness.py:166 INFO] Result: Accuracy run detected.
[2025-02-03 16:48:10,907 __init__.py:46 INFO] Running command: python3 /home/mlcuser/MLC/repos/local/cache/get-git-repo_e7fa5107/repo/closed/NVIDIA/build/inference/vision/classification_and_detection/tools/accuracy-openimages.py --mlperf-accuracy-file /mlc-mount/home/arjun/gh_action_results/valid_results/RTX4090x1-nvidia_original-gpu-tensorrt-vdefault-default_config/retinanet/multistream/accuracy/mlperf_log_accuracy.json --openimages-dir /home/mlcuser/MLC/repos/local/cache/get-mlperf-inference-nvidia-scratch-space_fe95ede4/preprocessed_data/open-images-v6-mlperf --output-file build/retinanet-results.json
[2025-02-03 23:05:00,001 run_harness.py:166 INFO] Result: Accuracy run detected.
[2025-02-03 23:05:00,001 __init__.py:46 INFO] Running command: python3 /home/mlcuser/MLC/repos/local/cache/get-git-repo_e7fa5107/repo/closed/NVIDIA/build/inference/vision/classification_and_detection/tools/accuracy-openimages.py --mlperf-accuracy-file /mlc-mount/home/arjun/gh_action_results/valid_results/RTX4090x1-nvidia_original-gpu-tensorrt-vdefault-default_config/retinanet/multistream/accuracy/mlperf_log_accuracy.json --openimages-dir /home/mlcuser/MLC/repos/local/cache/get-mlperf-inference-nvidia-scratch-space_fe95ede4/preprocessed_data/open-images-v6-mlperf --output-file build/retinanet-results.json
loading annotations into memory...
Done (t=0.42s)
creating index...
Expand All @@ -84,22 +84,22 @@ creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=126.30s).
DONE (t=125.96s).
Accumulating evaluation results...
DONE (t=30.83s).
DONE (t=31.08s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.373
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.522
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.403
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.404
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.023
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.124
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.412
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.413
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.419
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.598
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.599
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.628
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.082
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.343
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.677
mAP=37.317%
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.081
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.344
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.678
mAP=37.348%

======================== Result summaries: ========================

Loading

0 comments on commit 34218da

Please sign in to comment.