Failure of TensorRT 10.8.0.43 when running Unimatch FP32-to-FP16 conversion on Jetson Orin 8GB and NVIDIA RTX 4500 GPUs #4355
Labels
- internal-bug-tracked: Tracked internally, will be fixed in a future release.
- Investigating: Issue needs further investigation.
- Module:Accuracy: Output mismatch between TensorRT and other frameworks.
- triaged: Issue has been triaged by maintainers.
Description
I have tried to convert the Unimatch FP32 gmflow-scale1 model to an FP16 engine. However, the resulting FP16 model is not really usable; this conclusion comes both from visually inspecting the results and from the polygraphy tool. Both conversion commands I used share the same warning output concerning the LayerNorm layers.
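For reference, the same FP16 build can be reproduced through the TensorRT Python API; the sketch below assumes the standard TensorRT 10 bindings, and the file names are placeholders rather than the exact ones I used:

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

builder = trt.Builder(TRT_LOGGER)
network = builder.create_network(0)  # TensorRT 10: networks are explicit-batch by default
parser = trt.OnnxParser(network, TRT_LOGGER)

# Placeholder file name for the exported Unimatch gmflow-scale1 ONNX model.
with open("gmflow-scale1.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("failed to parse the ONNX model")

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)  # omit this line to build the FP32 baseline engine

serialized = builder.build_serialized_network(network, config)
with open("trexec_fp16_model.engine", "wb") as f:
    f.write(serialized)
```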
I then compared the FP32 and FP16 models with polygraphy, using the commands listed under Steps To Reproduce below. Applying the last of those commands reveals that the output is completely different. Following the warning message concerning the layernorm, I tried to actively keep it out of the FP16 conversion with the TensorRT Python API; the goal was to let the entire attention block run in FP32. However, it yielded the same result.
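A minimal sketch of that kind of precision pinning, assuming the relevant layers can be found by name substrings (the actual layer names depend on how the ONNX graph was exported, so the keywords here are an assumption):

```python
import tensorrt as trt

def pin_layers_to_fp32(network: trt.INetworkDefinition,
                       config: trt.IBuilderConfig,
                       keywords=("layernorm", "attn")):
    """Force matching layers to FP32 while the rest of the network may use FP16."""
    config.set_flag(trt.BuilderFlag.FP16)
    # OBEY makes the build fail loudly instead of silently dropping the constraints.
    config.set_flag(trt.BuilderFlag.OBEY_PRECISION_CONSTRAINTS)
    for i in range(network.num_layers):
        layer = network.get_layer(i)
        if any(k in layer.name.lower() for k in keywords):
            layer.precision = trt.float32
            for j in range(layer.num_outputs):
                layer.set_output_type(j, trt.float32)
```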
I observed the same behaviour when converting this network on the Jetson Orin 8GB (TensorRT 8.6 and 10.0). For extended tests I switched to another device, the NVIDIA RTX 4500.
Environment
I am using the nvcr.io/nvidia/tensorrt:25.01-py3 Docker environment.
TensorRT Version: 10.8.0.43-1
NVIDIA GPU: NVIDIA RTX 4500
NVIDIA Driver Version: 535.183.01
CUDA Version: 12.8
CUDNN Version:
Operating System: Ubuntu 24.04.1 LTS
Python Version (if applicable): 3.12.3
Tensorflow Version (if applicable): None
PyTorch Version (if applicable): None
Baremetal or Container (if so, version): Container, nvcr.io/nvidia/tensorrt:25.01-py3
Relevant Files
Model link: Unimatch gmflow-scale1
Steps To Reproduce
Commands or scripts:

```
polygraphy run … --save-inputs inputs.json --save-outputs outputs_fp32.json
polygraphy run --trt trexec_fp16_model.engine --load-inputs inputs.json --load-outputs outputs_fp32.json --atol 0.001 --rtol 0
```
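For orientation, with `--atol 0.001 --rtol 0` polygraphy flags any element whose absolute difference from the saved FP32 output exceeds 0.001; its elementwise criterion is equivalent to this NumPy check (a sketch with placeholder arrays):

```python
import numpy as np

def outputs_match(ref: np.ndarray, test: np.ndarray,
                  atol: float = 1e-3, rtol: float = 0.0) -> bool:
    # Combined tolerance as used by polygraphy: |test - ref| <= atol + rtol * |ref|.
    # With rtol = 0 this reduces to a pure absolute-difference check.
    return bool(np.all(np.abs(test - ref) <= atol + rtol * np.abs(ref)))
```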
Have you tried the latest release?: Yes, the Docker image is quite recent.
Can this model run on other frameworks? For example run ONNX model with ONNXRuntime (`polygraphy run <model.onnx> --onnxrt`): Yes, but not in FP16 I suppose.
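For a reference run outside TensorRT, the ONNX model can also be driven directly from ONNX Runtime; a minimal sketch (the model path and the 1x3x480x640 input shape are assumptions, not taken from the issue):

```python
import numpy as np
import onnxruntime as ort

# Placeholder model path and input shape; the real input names and shapes
# come from inspecting the exported gmflow-scale1 ONNX model.
sess = ort.InferenceSession("gmflow-scale1.onnx",
                            providers=["CPUExecutionProvider"])
feeds = {inp.name: np.random.rand(1, 3, 480, 640).astype(np.float32)
         for inp in sess.get_inputs()}
outputs = sess.run(None, feeds)
print([o.shape for o in outputs])
```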