Results from GH action on NVIDIA_RTX4090x1
arjunsuresh committed Feb 4, 2025
1 parent f5277ec commit c96af0b
Showing 48 changed files with 1,297 additions and 1,297 deletions.
@@ -1,2 +1,2 @@

-hash=fa3fcf53e1b87bb0575cfd8ea062a84f017a3195d3330bcf6262e8080604942c
+hash=37d0e870423295f6fccefb574a04178fef5208681c5a6a2f521722a0fc569b87
@@ -1,4 +1,4 @@
-{"exact_match": 26.17786187322611, "f1": 28.440232048529055}
+{"exact_match": 25.827814569536425, "f1": 28.206707691181904}
Reading examples...
No cached features at 'eval_features.pickle'... converting from examples...
Creating tokenizer...
@@ -1,4 +1,4 @@
-{"exact_match": 26.17786187322611, "f1": 28.440232048529055}
+{"exact_match": 25.818353831598866, "f1": 28.204815543594393}
Reading examples...
Loading cached features from 'eval_features.pickle'...
Loading LoadGen logs...

Large diffs are not rendered by default.

Large diffs are not rendered by default.

@@ -4,7 +4,7 @@ MLPerf Results Summary
SUT name : BERT SERVER
Scenario : Offline
Mode : PerformanceOnly
-Samples per second: 1671.89
+Samples per second: 4130.23
Result is : VALID
Min duration satisfied : Yes
Min queries satisfied : Yes
@@ -13,21 +13,21 @@ Result is : VALID
================================================
Additional Stats
================================================
-Min latency (ns) : 832988139
-Max latency (ns) : 666502179782
-Mean latency (ns) : 403210362357
-50.00 percentile latency (ns) : 429488304459
-90.00 percentile latency (ns) : 635515355299
-95.00 percentile latency (ns) : 653949765117
-97.00 percentile latency (ns) : 660036187945
-99.00 percentile latency (ns) : 664759798628
-99.90 percentile latency (ns) : 666402009515
+Min latency (ns) : 673447439
+Max latency (ns) : 666377332374
+Mean latency (ns) : 403960047115
+50.00 percentile latency (ns) : 430132484001
+90.00 percentile latency (ns) : 635493053900
+95.00 percentile latency (ns) : 653717048485
+97.00 percentile latency (ns) : 659767492035
+99.00 percentile latency (ns) : 664565075772
+99.90 percentile latency (ns) : 666241854528

================================================
Test Parameters Used
================================================
-samples_per_query : 1114315
-target_qps : 1688.36
+samples_per_query : 2752291
+target_qps : 4170.14
target_latency (ns): 0
max_async_queries : 1
min_duration (ms): 600000
@@ -4,9 +4,9 @@ Reading performance mode results...
num_acc_log_entries = 10833
num_acc_log_duplicate_keys = 0
num_acc_log_data_mismatch = 0
-num_perf_log_entries = 4109
-num_perf_log_qsl_idx_match = 4109
-num_perf_log_data_mismatch = 48
+num_perf_log_entries = 4100
+num_perf_log_qsl_idx_match = 4100
+num_perf_log_data_mismatch = 24
num_missing_qsl_idxs = 0
TEST FAIL

@@ -1,4 +1,4 @@
Verifying performance.
-reference score = 1671.64
-test score = 1671.89
+reference score = 4128.85
+test score = 4130.23
TEST PASS
@@ -1,2 +1,2 @@

-hash=9c846f3ba27836464ab01e63081f2974c2cfbe89cac69c161b3f6fe3111e2a86
+hash=125af90a292e3e12d730c1e69c016e488dfaf8df79262ee2b5d2be1aa50da913

Large diffs are not rendered by default.

Large diffs are not rendered by default.

@@ -4,38 +4,38 @@ MLPerf Results Summary
SUT name : BERT SERVER
Scenario : SingleStream
Mode : PerformanceOnly
-90th percentile latency (ns) : 2172588
+90th percentile latency (ns) : 1010026
Result is : VALID
Min duration satisfied : Yes
Min queries satisfied : Yes
Early stopping satisfied: Yes
Early Stopping Result:
-* Processed at least 64 queries (390753).
-* Would discard 38638 highest latency queries.
-* Early stopping 90th percentile estimate: 2173721
-* Early stopping 99th percentile estimate: 2626707
+* Processed at least 64 queries (645137).
+* Would discard 63952 highest latency queries.
+* Early stopping 90th percentile estimate: 1010329
+* Early stopping 99th percentile estimate: 1182908

================================================
Additional Stats
================================================
-QPS w/ loadgen overhead : 651.25
-QPS w/o loadgen overhead : 656.86
+QPS w/ loadgen overhead : 1075.23
+QPS w/o loadgen overhead : 1079.93

-Min latency (ns) : 1165352
-Max latency (ns) : 2822770
-Mean latency (ns) : 1522390
-50.00 percentile latency (ns) : 1437728
-90.00 percentile latency (ns) : 2172588
-95.00 percentile latency (ns) : 2304814
-97.00 percentile latency (ns) : 2607469
-99.00 percentile latency (ns) : 2626523
-99.90 percentile latency (ns) : 2640331
+Min latency (ns) : 853322
+Max latency (ns) : 1352664
+Mean latency (ns) : 925990
+50.00 percentile latency (ns) : 903600
+90.00 percentile latency (ns) : 1010026
+95.00 percentile latency (ns) : 1084037
+97.00 percentile latency (ns) : 1168865
+99.00 percentile latency (ns) : 1182838
+99.90 percentile latency (ns) : 1198603

================================================
Test Parameters Used
================================================
samples_per_query : 1
-target_qps : 1643.65
+target_qps : 2699.17
target_latency (ns): 0
max_async_queries : 1
min_duration (ms): 600000
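
Note on the SingleStream summary above: in both versions of the log, `QPS w/o loadgen overhead` matches the reciprocal of `Mean latency (ns)`, which is what one would expect when queries are issued strictly one at a time. A minimal sketch of that check, using only numbers from this hunk. The helper name `qps_from_mean_latency_ns` is hypothetical and is not part of LoadGen.

```python
def qps_from_mean_latency_ns(mean_latency_ns: float) -> float:
    """Queries per second for a serial stream with the given mean latency (ns)."""
    return 1e9 / mean_latency_ns


# Mean latencies taken from the two versions of the summary above.
qps_before = qps_from_mean_latency_ns(1_522_390)
qps_after = qps_from_mean_latency_ns(925_990)

# Compare against the logged "QPS w/o loadgen overhead" values.
print(f"before: {qps_before:.2f} (logged 656.86)")
print(f"after:  {qps_after:.2f} (logged 1079.93)")
```

The `QPS w/ loadgen overhead` figures (651.25 and 1075.23) are slightly lower, as the label suggests.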
@@ -4,8 +4,8 @@ Reading performance mode results...
num_acc_log_entries = 10833
num_acc_log_duplicate_keys = 0
num_acc_log_data_mismatch = 0
-num_perf_log_entries = 1619
-num_perf_log_qsl_idx_match = 1619
+num_perf_log_entries = 1664
+num_perf_log_qsl_idx_match = 1664
num_perf_log_data_mismatch = 0
num_missing_qsl_idxs = 0
TEST PASS
@@ -1,4 +1,4 @@
Verifying performance.
-reference score = 2170590
-test score = 2173721
+reference score = 1010858
+test score = 1010329
TEST PASS
@@ -1,4 +1,4 @@
-| Model | Scenario | Accuracy | Throughput | Latency (in ms) | Power Efficiency (in samples/J) | TEST01 |
-|-----------|--------------|------------|--------------|-------------------|-----------------------------------|----------|
-| bert-99.9 | offline | 90.8832 | 1671.64 | - | | passed |
-| bert-99.9 | singlestream | 90.8811 | 460.617 | 2.171 | | passed |
+| Model | Scenario | Accuracy | Throughput | Latency (in ms) | Power Efficiency (in samples/J) | TEST01 |
+|---------|--------------|------------|--------------|-------------------|-----------------------------------|----------|
+| bert-99 | offline | 90.1528 | 4128.85 | - | | passed |
+| bert-99 | singlestream | 90.2668 | 989.12 | 1.011 | | passed |
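
Note on the table above: the `singlestream` Throughput values appear to equal 1000 divided by the latency in milliseconds. The sketch below reproduces both rows and also computes the ratios between the two versions of the table. This is an observation about these four rows only, not a statement of how the submission tooling fills the column, and the function name `stream_throughput` is hypothetical.

```python
def stream_throughput(latency_ms: float) -> float:
    """Samples per second if samples run serially, each taking latency_ms."""
    return 1000.0 / latency_ms


# Latency column (ms) from the bert-99.9 and bert-99 singlestream rows.
throughput_bert_99_9 = stream_throughput(2.171)
throughput_bert_99 = stream_throughput(1.011)

# Ratios between the two versions of the table.
offline_speedup = 4128.85 / 1671.64
latency_ratio = 2.171 / 1.011

print(f"singlestream throughput: {throughput_bert_99_9:.3f} -> {throughput_bert_99:.2f}")
print(f"offline throughput ratio: {offline_speedup:.2f}x")
print(f"singlestream latency ratio: {latency_ratio:.2f}x")
```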
@@ -35,10 +35,10 @@ mlc rm cache -f

Platform: RTX4090x1-nvidia-gpu-TensorRT-default_config

-Model Precision: fp16
+Model Precision: int8

### Accuracy Results
-`F1`: `90.88324`, Required accuracy for closed division `>= 90.78313`
+`F1`: `90.15279`, Required accuracy for closed division `>= 89.96526`

### Performance Results
-`Samples per second`: `1671.64`
+`Samples per second`: `4128.85`
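
Note on the accuracy thresholds in this hunk: the two "Required accuracy" values are consistent with 99.9% and 99% of a single reference F1 of 90.874. That reference value is an assumption, inferred here from the thresholds themselves (89.96526 / 0.99), and the helper `required_f1` is hypothetical. A short sketch of the arithmetic and the resulting margins:

```python
REFERENCE_F1 = 90.874  # assumption: inferred from 89.96526 / 0.99, not stated in the diff


def required_f1(fraction: float, reference: float = REFERENCE_F1) -> float:
    """Accuracy threshold as a fraction of the reference F1 score."""
    return fraction * reference


threshold_99_9 = required_f1(0.999)  # threshold quoted for the fp16 result
threshold_99 = required_f1(0.99)     # threshold quoted for the int8 result

# Margin between each measured F1 and its threshold.
margin_fp16 = 90.88324 - threshold_99_9
margin_int8 = 90.15279 - threshold_99

print(f"99.9% threshold: {threshold_99_9:.5f}, fp16 margin: {margin_fp16:.5f}")
print(f"99%   threshold: {threshold_99:.5f}, int8 margin: {margin_int8:.5f}")
```

Under that assumption, the int8 F1 of 90.15279 would clear the 99% threshold but not the 99.9% one.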
@@ -2,6 +2,6 @@
"starting_weights_filename": "https://armi.in/files/bert_large_v1_1_fake_quant.onnx",
"retraining": "no",
"input_data_types": "int32",
-"weight_data_types": "fp16",
+"weight_data_types": "int8",
"weight_transformations": "quantization, affine fusion"
}