Upgrade perf_run script to support TRT 10 and fix some issues #3650
base: main
Conversation
There are some changes that do not conform to Python style guidelines:
--- /home/runner/work/TensorRT/TensorRT/py/torch_tensorrt/dynamo/conversion/aten_ops_converters.py 2025-07-17 21:15:44.135432+00:00
+++ /home/runner/work/TensorRT/TensorRT/py/torch_tensorrt/dynamo/conversion/aten_ops_converters.py 2025-07-17 21:16:11.238491+00:00
@@ -3596,6 +3596,6 @@
SourceIR.ATEN,
name,
input=args[0],
weight=args[1],
bias=args_bounds_check(args, 2, None),
- )
\ No newline at end of file
+ )
--- /home/runner/work/TensorRT/TensorRT/tools/perf/perf_run.py 2025-07-17 21:15:44.172433+00:00
+++ /home/runner/work/TensorRT/TensorRT/tools/perf/perf_run.py 2025-07-17 21:16:16.387016+00:00
@@ -776,11 +776,13 @@
raise ValueError(
"No valid models specified. Please provide a torchscript model file or model name (defined in hub.py) or model_hf name in huggingface models "
)
backends = parse_backends(params["backends"])
- if any(backend in ["dynamo", "torch_compile", "tensorrt"] for backend in backends) and (model_torch is None):
+ if any(
+ backend in ["dynamo", "torch_compile", "tensorrt"] for backend in backends
+ ) and (model_torch is None):
raise ValueError(
"No Pytorch model (nn.Module) is provided for torchdynamo compilation. Please provide a pytorch model using --model_torch argument"
)
batch_size = params["batch_size"]
- remove lower_linear
- consider a try/except around the ONNX export (see the sketch after this list)
- Update the perf_run README with the latest arguments
- Run the LLM example with constant_folding to test accuracy
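A minimal sketch of the try/except guard suggested above, assuming the export goes through torch.onnx.export; the helper name and fallback behavior are illustrative, not the actual perf_run code:

```python
import torch

def export_onnx_safely(model, example_inputs, onnx_path):
    # Hypothetical helper: guard the ONNX export so a failure skips the
    # ONNX-based backends instead of aborting the whole benchmark run.
    try:
        torch.onnx.export(model, example_inputs, onnx_path)
        return onnx_path
    except Exception as e:
        print(f"ONNX export failed, skipping ONNX backends: {e}")
        return None
```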
modify the start timer in run_tensorrt to also include the creation of the input bindings (see the sketch below)
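A minimal sketch of that suggestion; the function structure and names below are assumed for illustration, not the actual run_tensorrt code:

```python
import timeit

import torch

def timed_tensorrt_iteration(trt_module, input_tensors):
    # Sketch of the reviewer's suggestion: start the timer *before* the
    # input bindings are created, so binding creation is counted as part
    # of the measured latency rather than excluded from it.
    start_time = timeit.default_timer()
    bindings = [t.contiguous().cuda() for t in input_tensors]  # input binding creation
    outputs = trt_module(*bindings)
    torch.cuda.synchronize()  # ensure GPU work finishes before stopping the timer
    end_time = timeit.default_timer()
    return outputs, end_time - start_time
```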
@dynamo_tensorrt_converter(torch.ops.aten.linear.default, supports_dynamic_shapes=True)
@dynamo_tensorrt_converter(torch.ops.aten.linear, supports_dynamic_shapes=True)
I think that registering a converter for OpOverloadPacket has no effect.
I found that for some models in fp16, for example BERT, registering the linear op converter can reduce latency. It seems to have no effect for fp32, though.
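For background on the distinction discussed here (not part of the PR): torch.ops.aten.linear is an OpOverloadPacket, while torch.ops.aten.linear.default is the concrete OpOverload that appears as the call target in traced graphs, which a quick check confirms:

```python
import torch

# The packet bundles all overloads of aten::linear; the .default attribute
# is the specific overload that shows up in FX graphs produced by dynamo.
print(type(torch.ops.aten.linear))          # <class 'torch._ops.OpOverloadPacket'>
print(type(torch.ops.aten.linear.default))  # <class 'torch._ops.OpOverload'>
```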
from torch_tensorrt.dynamo.conversion import impl
from torch_tensorrt.dynamo.conversion._ConversionContext import ConversionContext
from torch_tensorrt.dynamo.conversion.converter_utils import SourceIR, get_trt_tensor
from torch_tensorrt.fx.types import TRTTensor
Use torch_tensorrt.dynamo.types here instead of torch_tensorrt.fx.types.
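A minimal sketch of the suggested change, assuming TRTTensor is also exported by the dynamo types module as the reviewer implies:

```python
# Replaces the torch_tensorrt.fx.types import in the snippet above.
from torch_tensorrt.dynamo.types import TRTTensor
```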
Description
Add an optimization_level arg and set the highest optimization level, 5, as the default.
Fixes #3634
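A minimal sketch of what this could look like in perf_run.py's argument parsing; only optimization_level comes from the PR description, the surrounding names are assumptions:

```python
import argparse

parser = argparse.ArgumentParser(description="perf_run options (sketch)")
# Expose TensorRT's builder optimization level (0-5) and default to the
# highest level, 5, per the PR description.
parser.add_argument(
    "--optimization_level",
    type=int,
    default=5,
    choices=range(0, 6),
    help="TensorRT builder optimization level (default: 5, the highest)",
)

args = parser.parse_args(["--optimization_level", "3"])
print(args.optimization_level)  # prints 3
```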
Type of change
Checklist: