Upgrade perf_run script to support TRT 10 and fix some issues #3650

zewenli98 · 2025-07-02T22:48:53Z

Description

Removed the embedding layer from the constant folding lowering pass
Upgraded the TRT API call to execute_async_v3
Added an optimization_level argument, defaulting to the highest level (5)
Fixed some known issues
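
As a rough sketch of the optimization_level change listed above, the argument could be wired up with argparse roughly like this. The flag spelling `--optimization_level`, the help text, the 0-5 `choices` range, and the `build_parser` helper are assumptions for illustration and are not taken from perf_run.py; only the default of 5 comes from the description.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser; the real perf_run.py defines many more arguments.
    parser = argparse.ArgumentParser(description="perf_run sketch")
    parser.add_argument(
        "--optimization_level",
        type=int,
        default=5,  # highest level, per the PR description
        choices=range(0, 6),  # assumed valid range for this sketch
        help="Builder optimization level; higher may build slower but run faster",
    )
    return parser


# No flag given on the command line, so the default applies.
args = build_parser().parse_args([])
print(args.optimization_level)
```

Defaulting to the highest level favors runtime latency over build time, which suits a benchmarking script; a lower value can still be passed explicitly when build time matters.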

Fixes #3634

Type of change

Bug fix (non-breaking change which fixes an issue)

Checklist:

My code follows the style guidelines of this project (You can use the linters)
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas and hacks
I have made corresponding changes to the documentation
I have added tests to verify my fix or my feature
New and existing unit tests pass locally with my changes
I have added the relevant labels to my PR so that relevant reviewers are notified

github-actions

There are some changes that do not conform to Python style guidelines:

--- /home/runner/work/TensorRT/TensorRT/py/torch_tensorrt/dynamo/conversion/aten_ops_converters.py	2025-07-17 21:15:44.135432+00:00
+++ /home/runner/work/TensorRT/TensorRT/py/torch_tensorrt/dynamo/conversion/aten_ops_converters.py	2025-07-17 21:16:11.238491+00:00
@@ -3596,6 +3596,6 @@
        SourceIR.ATEN,
        name,
        input=args[0],
        weight=args[1],
        bias=args_bounds_check(args, 2, None),
-    )
\ No newline at end of file
+    )
--- /home/runner/work/TensorRT/TensorRT/tools/perf/perf_run.py	2025-07-17 21:15:44.172433+00:00
+++ /home/runner/work/TensorRT/TensorRT/tools/perf/perf_run.py	2025-07-17 21:16:16.387016+00:00
@@ -776,11 +776,13 @@
        raise ValueError(
            "No valid models specified. Please provide a torchscript model file or model name (defined in hub.py) or model_hf name in huggingface models "
        )

    backends = parse_backends(params["backends"])
-    if any(backend in ["dynamo", "torch_compile", "tensorrt"] for backend in backends) and (model_torch is None):
+    if any(
+        backend in ["dynamo", "torch_compile", "tensorrt"] for backend in backends
+    ) and (model_torch is None):
        raise ValueError(
            "No Pytorch model (nn.Module) is provided for torchdynamo compilation. Please provide a pytorch model using --model_torch argument"
        )

    batch_size = params["batch_size"]

peri044

Remove lower_linear
Consider wrapping the ONNX export in try/except
Update the perf_run README with the latest arguments
Run the LLM example with constant_folding to test accuracy
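
The try/except suggestion for the ONNX export could look something like the sketch below. `try_export`, `failing_export`, and the logger name are hypothetical; the real export call in perf_run.py is not shown in this thread, so a plain callable stands in for it.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger("perf_run_sketch")


def try_export(export_fn: Callable[[], str]) -> Optional[str]:
    """Run an export callable; return its result, or None if it raises."""
    try:
        return export_fn()
    except Exception as exc:  # broad on purpose: export can fail in many ways
        logger.warning("ONNX export failed, skipping this backend: %s", exc)
        return None


def failing_export() -> str:
    # Stand-in for an export call that hits an unsupported op.
    raise RuntimeError("unsupported op")


print(try_export(lambda: "model.onnx"))
print(try_export(failing_export))
```

Returning None instead of re-raising lets a benchmark loop continue with the remaining backends when one export fails.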

peri044

Modify the start timer in run_tensorrt so that it also includes creation of the input bindings
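
A minimal sketch of the timing change requested above, assuming a per-iteration loop: the timer starts before the inputs are prepared, so setup cost is counted in the latency. `time_iterations`, `make_inputs`, and `run` are hypothetical stand-ins and do not reflect the actual run_tensorrt code.

```python
import time
from typing import Callable, List


def time_iterations(
    make_inputs: Callable[[], object],
    run: Callable[[object], object],
    iterations: int,
) -> List[float]:
    """Return per-iteration latency in ms, including input setup time."""
    timings = []
    for _ in range(iterations):
        start = time.perf_counter()  # start before input setup, not after
        inputs = make_inputs()       # stand-in for creating input bindings
        run(inputs)                  # stand-in for engine execution
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings


timings = time_iterations(lambda: [0.0] * 1024, lambda x: sum(x), iterations=5)
print(len(timings))
```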

HolyWu · 2025-07-18T14:41:04Z

py/torch_tensorrt/dynamo/conversion/aten_ops_converters.py

+
+
+@dynamo_tensorrt_converter(torch.ops.aten.linear.default, supports_dynamic_shapes=True)
+@dynamo_tensorrt_converter(torch.ops.aten.linear, supports_dynamic_shapes=True)


I think that registering a converter for OpOverloadPacket has no effect.

zewenli98 · Jul 18, 2025

I found that for some models in fp16 (for example, BERT), registering a converter for the linear op can reduce latency. It seems to have no effect in fp32, though.
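
The OpOverloadPacket remark above can be pictured with a toy model. This is not torch_tensorrt's registry: the classes, `REGISTRY`, `register`, and `lookup` are all hypothetical, and the sketch assumes that graph nodes carry the concrete overload (such as aten.linear.default) as their target, so an entry keyed by the packet is never looked up.

```python
from typing import Callable, Dict, Optional


class OpOverload:
    """Toy stand-in for a concrete overload such as aten.linear.default."""

    def __init__(self, name: str) -> None:
        self.name = name


class OpOverloadPacket:
    """Toy stand-in for the packet (aten.linear) that groups overloads."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.default = OpOverload(f"{name}.default")


# Hypothetical converter registry keyed by node target.
REGISTRY: Dict[object, Callable[[], str]] = {}


def register(target: object) -> Callable:
    def decorator(fn: Callable[[], str]) -> Callable[[], str]:
        REGISTRY[target] = fn
        return fn
    return decorator


linear = OpOverloadPacket("aten.linear")


@register(linear.default)
@register(linear)  # stored, but never matched if targets are always overloads
def convert_linear() -> str:
    return "linear converter"


def lookup(node_target: object) -> Optional[Callable[[], str]]:
    return REGISTRY.get(node_target)


# Assumption: the graph node's target is the overload, not the packet.
node_target = linear.default
print(lookup(node_target) is convert_linear)
```

Under that assumption the packet registration is harmless but redundant, which would be consistent with the latency difference coming from the `.default` registration alone.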

HolyWu · 2025-07-18T14:42:21Z

py/torch_tensorrt/dynamo/conversion/impl/linear.py

+from torch_tensorrt.dynamo.conversion import impl
+from torch_tensorrt.dynamo.conversion._ConversionContext import ConversionContext
+from torch_tensorrt.dynamo.conversion.converter_utils import SourceIR, get_trt_tensor
+from torch_tensorrt.fx.types import TRTTensor


Import this from torch_tensorrt.dynamo.types instead.

zewenli98 requested review from narendasan, peri044 and cehongwang July 2, 2025 22:48

zewenli98 self-assigned this Jul 2, 2025

facebook-github-bot added the cla signed label Jul 2, 2025

github-actions bot added the component: lowering, component: api [Python], and component: dynamo labels Jul 2, 2025

github-actions bot requested a review from gs-olive July 2, 2025 22:49

zewenli98 removed the request for review from gs-olive July 2, 2025 22:49

zewenli98 added 2 commits July 7, 2025 13:30

fix perf bug

8c69f8d

minor update

962fb48

zewenli98 force-pushed the fix_perf_bug branch from 1f88d6c to 962fb48 Compare July 7, 2025 20:58

github-actions bot added the component: conversion and component: converters labels Jul 17, 2025

github-actions bot requested changes Jul 17, 2025

revert linear converter

785c25a

zewenli98 force-pushed the fix_perf_bug branch from 640e96d to 785c25a Compare July 17, 2025 21:19

peri044 reviewed Jul 17, 2025

fix comments

22721c2

HolyWu reviewed Jul 18, 2025
