rustc_codegen_llvm: Require `opt-level >= 1` for index-based write_operand_repeatedly() loop #147462

Enselic · 2025-10-08T04:46:05Z

To make debugger stepping intuitive with -Copt-level=0. See the adjusted basic-stepping.rs test.

This is kind of a revert of bd0aae9 (cg_llvm: use index-based loop in write_operand_repeatedly), except we don't revert it, we just make it conditional on opt-level. That commit regressed basic-stepping.rs, but it was not noticed since that test did not exist back then (it was added later in #144876). I have retroactively bisected to find that out.

It seems messy to sprinkle if-cases inside of
write_operand_repeatedly() so make the whole function conditional.

The test that bd0aae9 added in
tests/codegen/issues/issue-111603.rs already use -Copt-level=3, so we don't need to adjust the compiler flags for it to keep passing.

This PR takes us one step closer to fixing #33013.

CC #147426 which is related (there will be trivial conflicts for me to resolve in basic-stepping.rs once one of them lands)

rustbot · 2025-10-08T04:46:10Z

r? @jdonszelmann

rustbot has assigned @jdonszelmann.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

saethlin · 2025-10-11T19:48:48Z

r? saethlin

saethlin · 2025-10-12T04:14:46Z

Do you know why this codegen change made the stepping work differently? It's not clear to me at all what this diff has to do with the stepping order, but clearly it does. So I can't tell if this is actually papering over a bug in LLVM, or deeper in our debuginfo handling.

Enselic · 2025-10-24T04:16:46Z

@saethlin Good call. I have investigated more deeply now. Stepping stopped working because the only emitted LLVM instructions with debug location metadata were removed. No LLVM instructions with debug location metadata remained.

However, simply adding debug info to all remaining instructions does not work well, because then stepping will loop for as many times as we repeat.

Let's look at the details with this program:

fn main() {
    let whatever = ["whatever"; 8];
}

When stepping worked, it looked like this:

rustc +nightly-2023-04-23 -g --emit=llvm-ir $src/whatever.rs

; whatever::main
; Function Attrs: nonlazybind uwtable
define internal void @_ZN8whatever4main17h46e92f5877e9c5b3E() unnamed_addr #1 !dbg !175 {
start:
  %whatever = alloca [8 x { ptr, i64 }], align 8
  call void @llvm.dbg.declare(metadata ptr %whatever, metadata !179, metadata !DIExpression()), !dbg !189
  %0 = getelementptr inbounds [8 x { ptr, i64 }], ptr %whatever, i64 0, i64 0, !dbg !190
  %1 = getelementptr inbounds [8 x { ptr, i64 }], ptr %whatever, i64 0, i64 8, !dbg !190
  br label %repeat_loop_header, !dbg !190

repeat_loop_header:                               ; preds = %repeat_loop_body, %start
  %2 = phi ptr [ %0, %start ], [ %6, %repeat_loop_body ]
  %3 = icmp ne ptr %2, %1
  br i1 %3, label %repeat_loop_body, label %repeat_loop_next

repeat_loop_body:                                 ; preds = %repeat_loop_header
  %4 = getelementptr inbounds { ptr, i64 }, ptr %2, i32 0, i32 0
  store ptr @alloc_3db654700ddfbbbd22c59221279a79d2, ptr %4, align 8
  %5 = getelementptr inbounds { ptr, i64 }, ptr %2, i32 0, i32 1
  store i64 8, ptr %5, align 8
  %6 = getelementptr inbounds { ptr, i64 }, ptr %2, i64 1
  br label %repeat_loop_header

repeat_loop_next:                                 ; preds = %repeat_loop_header
  ret void, !dbg !191
}

!190 = !DILocation(line: 2, column: 20, scope: !175)

stepping worked because of (note that these instructions don't repeat):

  %0 = getelementptr inbounds [8 x { ptr, i64 }], ptr %whatever, i64 0, i64 0, !dbg !190
  %1 = getelementptr inbounds [8 x { ptr, i64 }], ptr %whatever, i64 0, i64 8, !dbg !190

We can make stepping work if we add !dbg to the looped over instructions which I do in #148058. Here is what the IR looks like in that PR:

./build/x86_64-unknown-linux-gnu/stage1/bin/rustc -g --emit=llvm-ir $src/whatever.rs

; whatever::main
; Function Attrs: nonlazybind uwtable
define hidden void @_ZN8whatever4main17h0930ac910df4df44E() unnamed_addr #0 !dbg !159 {
start:
  %whatever = alloca [128 x i8], align 8
    #dbg_declare(ptr %whatever, !163, !DIExpression(), !173)
  br label %repeat_loop_header, !dbg !174

repeat_loop_header:                               ; preds = %repeat_loop_body, %start
  %0 = phi i64 [ 0, %start ], [ %4, %repeat_loop_body ], !dbg !174
  %1 = icmp ult i64 %0, 8, !dbg !174
  br i1 %1, label %repeat_loop_body, label %repeat_loop_next, !dbg !174

repeat_loop_body:                                 ; preds = %repeat_loop_header
  %2 = getelementptr inbounds nuw { ptr, i64 }, ptr %whatever, i64 %0, !dbg !174
  store ptr @alloc_3db654700ddfbbbd22c59221279a79d2, ptr %2, align 8, !dbg !174
  %3 = getelementptr inbounds i8, ptr %2, i64 8, !dbg !174
  store i64 8, ptr %3, align 8, !dbg !174
  %4 = add nuw i64 %0, 1, !dbg !174
  br label %repeat_loop_header, !dbg !174

repeat_loop_next:                                 ; preds = %repeat_loop_header
  ret void, !dbg !175
}

!174 = !DILocation(line: 2, column: 20, scope: !159)

Now stepping works, but since all instructions with !174 are repeated, we will now step 8 times (because of ; 8];). br label %repeat_loop_header, !dbg !174 probably don't count since no machine code is emitted for that instruction (I assume. I haven't double-checked. But why would it?.)

I don't have any good idea on how to solve this. The only thing I can think of is to force emission of a no-op instruction before we enter the loop and then attach !dbg !174 to it. It might work in non-optimized builds. But I don't know if that's possible or wanted.

Maybe you have some other idea?

saethlin · 2025-10-24T22:04:10Z

Stepping stopped working because the only emitted LLVM instructions with debug location metadata were removed. No LLVM instructions with debug location metadata remained.

Is that true? I'm looking at this IR:

; demo::main
; Function Attrs: nonlazybind uwtable
define hidden void @_ZN4demo4main17h1e5d7b00d81e2d71E() unnamed_addr #0 !dbg !151 {
start:
  %h = alloca [128 x i8], align 8
    #dbg_declare(ptr %h, !155, !DIExpression(), !165)
  br label %repeat_loop_header, !dbg !166

repeat_loop_header:                               ; preds = %repeat_loop_body, %start
  %0 = phi i64 [ 0, %start ], [ %4, %repeat_loop_body ]
  %1 = icmp ult i64 %0, 8
  br i1 %1, label %repeat_loop_body, label %repeat_loop_next

repeat_loop_body:                                 ; preds = %repeat_loop_header
  %2 = getelementptr inbounds nuw { ptr, i64 }, ptr %h, i64 %0
  store ptr @alloc_3db654700ddfbbbd22c59221279a79d2, ptr %2, align 8
  %3 = getelementptr inbounds i8, ptr %2, i64 8
  store i64 8, ptr %3, align 8
  %4 = add nuw i64 %0, 1
  br label %repeat_loop_header

repeat_loop_next:                                 ; preds = %repeat_loop_header
  ret void, !dbg !167
}

In the start block, I see dbg 165 and 166 mentioned, and those are

!165 = !DILocation(line: 2, column: 9, scope: !156) 
!166 = !DILocation(line: 2, column: 13, scope: !151)

Which are the the h and the [ in let h = ["whatever; 8].

It almost seems like the getelementptr inbounds instructions that we used to emit in the start block matter to LLVM in some way that the other instructions don't, right?

Enselic · 2025-11-07T21:40:02Z

@saethlin I investigated this further by stepping around in LLVM code among other things.

My understanding is that #dbg_declare does not affect line stepping behaviour in a debugger. Instead, #dbg_declare is used to tell a debugger how to figure out the value of a variable at various points in the function. So we can disregard #dbg_declare when it comes to stepping.

For stepping, what matters is the !dbg metadata that are attached to LLVM IR instructions. So why doesn't

br label %repeat_loop_header, !dbg !166

save us? After all, it attaches the right line info to the instruction, so what's wrong? What's happening is that there is no machine code emitted that corresponds to that LLVM IR instruction. That instruction just says "go to the block that comes next". But no explicit machine instruction is needed for that. Execution from start can simply fall through to repeat_loop_header.

This can be seen in the final machine code:

0000000000038b30 <_ZN13whatever_main4main17h2b49df056371578eE>:
   38b30:	48 83 ec 10          	sub    $0x10,%rsp
   38b34:	31 c0                	xor    %eax,%eax
   38b36:	48 89 44 24 88       	mov    %rax,-0x78(%rsp)
   38b3b:	48 8b 44 24 88       	mov    -0x78(%rsp),%rax
   38b40:	48 89 44 24 80       	mov    %rax,-0x80(%rsp)
   38b45:	48 83 f8 08          	cmp    $0x8,%rax
   38b49:	73 31                	jae    38b7c <_ZN13whatever_main4main17h2b49df056371578eE+0x4c>
   38b4b:	48 8b 44 24 80       	mov    -0x80(%rsp),%rax
   38b50:	48 8d 4c 24 90       	lea    -0x70(%rsp),%rcx
   38b55:	48 89 c2             	mov    %rax,%rdx
   38b58:	48 c1 e2 04          	shl    $0x4,%rdx
   38b5c:	48 01 d1             	add    %rdx,%rcx
   38b5f:	48 8d 15 fa 9c fd ff 	lea    -0x26306(%rip),%rdx        # 12860 <GCC_except_table0+0x38>
   38b66:	48 89 11             	mov    %rdx,(%rcx)
   38b69:	48 c7 41 08 08 00 00 	movq   $0x8,0x8(%rcx)
   38b70:	00 
   38b71:	48 83 c0 01          	add    $0x1,%rax
   38b75:	48 89 44 24 88       	mov    %rax,-0x78(%rsp)
   38b7a:	eb bf                	jmp    38b3b <_ZN13whatever_main4main17h2b49df056371578eE+0xb>
   38b7c:	48 83 c4 10          	add    $0x10,%rsp

We only have two jumps, namely (conditionally) from repeat_loop_header to repeat_loop_next (at 38b49) and from repeat_loop_body to repeat_loop_header (at 38b7a).

Since there is no machine code for br label %repeat_loop_header, !dbg !166, the debugger can't stop on that LLVM IR instruction.

It almost seems like the getelementptr inbounds instructions that we used to emit in the start block matter to LLVM in some way that the other instructions don't, right?

As explained above, I don't think we are observing any strange side effects of removing getelementptr. I think what we observe follows naturally from the new LLVM IR.

compiler/rustc_codegen_llvm/src/builder.rs

saethlin · 2025-11-12T23:49:36Z

Ah-ha! That's what I was missing: debuginfo on a br label doesn't end up on anything. I swear I've heard of this before, and it just completely slipped my mind.

Thank you for bearing with me on this. I'm not excited about the approach, but I'm content that we're pretty deep into fencing with LLVM debuginfo handling that this is a good compromise implementation.

Just add a comment in the code about the motivation (so that it's obvious to a user that this branch is about debuginfo quality) then r=me.

To make debugger stepping intuitive with `-Copt-level=0`. See the adjusted `basic-stepping.rs` test. This is kind of a revert of bd0aae9, except we don't revert it, we just make it conditional on `opt-level`. That commit regressed `basic-stepping.rs`, but it was not noticed since that test did not exist back then. I have retroactively bisected to find that out. It seems messy to sprinkle if-cases inside of the `write_operand_repeatedly()` so make the whole function conditional. The test that bd0aae9 added in `tests/codegen/issues/issue-111603.rs` already use `-Copt-level=3`, so we don't need to adjust the compiler flags for it to keep passing.

rustbot · 2025-11-13T05:35:23Z

This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.

Enselic · 2025-11-13T05:41:00Z

My pleasure. I learned a lot in the process.

saethlin · 2025-11-13T23:18:49Z

@bors r+ rollup=iffy

bors · 2025-11-13T23:18:52Z

📌 Commit 5ad2f43 has been approved by saethlin

It is now in the queue for this repository.

rustc_codegen_llvm: Require `opt-level >= 1` for index-based write_operand_repeatedly() loop To make debugger stepping intuitive with `-Copt-level=0`. See the adjusted `basic-stepping.rs` test. This is kind of a revert of **bd0aae92dc76d9 (cg_llvm: use index-based loop in write_operand_repeatedly)**, except we don't revert it, we just make it conditional on `opt-level`. That commit regressed `basic-stepping.rs`, but it was not noticed since that test did not exist back then (it was added later in #144876). I have retroactively bisected to find that out. It seems messy to sprinkle if-cases inside of `write_operand_repeatedly()` so make the whole function conditional. The test that bd0aae9 added in `tests/codegen/issues/issue-111603.rs` already use `-Copt-level=3`, so we don't need to adjust the compiler flags for it to keep passing. This PR takes us one step closer to fixing #33013. CC #147426 which is related (there will be trivial conflicts for me to resolve in basic-stepping.rs once one of them lands)

bors · 2025-11-14T01:33:58Z

⌛ Testing commit 5ad2f43 with merge 8fa656e...

bors · 2025-11-14T01:41:21Z

💔 Test failed - checks-actions

rust-log-analyzer · 2025-11-14T01:52:02Z

A job failed! Check out the build log: (web) (plain enhanced) (plain)

Click to see the possible cause of the failure (guessed by this bot)

saethlin · 2025-11-14T01:56:18Z

denied: permission_denied: write_package
##[error]Process completed with exit code 1.

@bors retry

bors · 2025-11-14T02:12:19Z

⌛ Testing commit 5ad2f43 with merge 6d41834...

bors · 2025-11-14T05:28:45Z

☀️ Test successful - checks-actions
Approved by: saethlin
Pushing 6d41834 to main...

github-actions · 2025-11-14T05:32:07Z

What is this?

This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.

Comparing 7a72c54 (parent) -> 6d41834 (this PR)

Test differences

Show 4 test diffs

4 doctest diffs were found. These are ignored, as they are noisy.

Test dashboard

Run

cargo run --manifest-path src/ci/citool/Cargo.toml -- \
    test-dashboard 6d41834e251a65a994bf0acf415ddafd4adbf71f --output-dir test-dashboard

And then open test-dashboard/index.html in your browser to see an overview of all executed tests.

Job duration changes

dist-x86_64-apple: 8360.2s -> 5891.4s (-29.5%)
aarch64-apple: 7988.0s -> 10183.8s (+27.5%)
x86_64-rust-for-linux: 2498.6s -> 3050.3s (+22.1%)
armhf-gnu: 4548.6s -> 5318.7s (+16.9%)
i686-gnu-nopt-1: 6913.1s -> 8022.8s (+16.1%)
aarch64-gnu-debug: 3913.3s -> 4538.1s (+16.0%)
x86_64-gnu-tools: 3226.3s -> 3685.3s (+14.2%)
x86_64-gnu-llvm-20: 2257.7s -> 2575.2s (+14.1%)
x86_64-gnu-aux: 6403.5s -> 7282.7s (+13.7%)
x86_64-gnu-gcc: 3034.9s -> 3426.5s (+12.9%)

How to interpret the job duration changes?

Job durations can vary a lot, based on the actual runner instance
that executed the job, system noise, invalidated caches, etc. The table above is provided
mostly for t-infra members, for simpler debugging of potential CI slow-downs.

rust-timer · 2025-11-14T06:51:22Z

Finished benchmarking commit (6d41834): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

This benchmark run did not return any relevant results for this metric.

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 475.201s -> 476.969s (0.37%)
Artifact size: 388.70 MiB -> 388.67 MiB (-0.01%)

rustbot assigned jdonszelmann Oct 8, 2025

Enselic changed the title ~~rustc_codegen_llvm: Require opt-level >= 1 for index-based loop~~ rustc_codegen_llvm: Require opt-level >= 1 for index-based write_operand_repeatedly() loop Oct 8, 2025

Enselic mentioned this pull request Oct 8, 2025

tests: basic-[debugger-]stepping.rs: Disable SingleUseConsts MIR pass temporarily #147426

Open

rustbot assigned saethlin and unassigned jdonszelmann Oct 11, 2025

saethlin added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 12, 2025

Enselic mentioned this pull request Oct 24, 2025

rustc_codegen_llvm: Add debug locations to write_operand_repeatedly() #148058

Closed

saethlin reviewed Nov 12, 2025

View reviewed changes

compiler/rustc_codegen_llvm/src/builder.rs Show resolved Hide resolved

Enselic force-pushed the fix-basic-stepping-array branch from 5fb7903 to 5ad2f43 Compare November 13, 2025 05:35

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Nov 13, 2025

bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Nov 14, 2025

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 14, 2025

bors added the merged-by-bors This PR was explicitly merged by bors. label Nov 14, 2025

bors merged commit 6d41834 into rust-lang:main Nov 14, 2025
12 checks passed

rustbot added this to the 1.93.0 milestone Nov 14, 2025

bors mentioned this pull request Nov 14, 2025

compiletest: Use //@ prefixes also for debuginfo test directives #147799

Open

1 task

rustc_codegen_llvm: Require opt-level >= 1 for index-based write_operand_repeatedly() loop #147462

rustc_codegen_llvm: Require opt-level >= 1 for index-based write_operand_repeatedly() loop #147462

Uh oh!

Conversation

Enselic commented Oct 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rustbot commented Oct 8, 2025

Uh oh!

saethlin commented Oct 11, 2025

Uh oh!

saethlin commented Oct 12, 2025

Uh oh!

Enselic commented Oct 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

saethlin commented Oct 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Enselic commented Nov 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

saethlin commented Nov 12, 2025

Uh oh!

rustbot commented Nov 13, 2025

Uh oh!

Enselic commented Nov 13, 2025

Uh oh!

saethlin commented Nov 13, 2025

Uh oh!

bors commented Nov 13, 2025

Uh oh!

bors commented Nov 14, 2025

Uh oh!

bors commented Nov 14, 2025

Uh oh!

rust-log-analyzer commented Nov 14, 2025

Uh oh!

saethlin commented Nov 14, 2025

Uh oh!

bors commented Nov 14, 2025

Uh oh!

bors commented Nov 14, 2025

Uh oh!

Uh oh!

github-actions bot commented Nov 14, 2025

Test differences

Job duration changes

Uh oh!

rust-timer commented Nov 14, 2025

Overall result: no relevant changes - no action needed

Instruction count

Max RSS (memory usage)

Cycles

Binary size

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

7 participants

rustc_codegen_llvm: Require `opt-level >= 1` for index-based write_operand_repeatedly() loop #147462

rustc_codegen_llvm: Require `opt-level >= 1` for index-based write_operand_repeatedly() loop #147462

Enselic commented Oct 8, 2025 •

edited

Loading

Enselic commented Oct 24, 2025 •

edited

Loading

saethlin commented Oct 24, 2025 •

edited

Loading

Enselic commented Nov 7, 2025 •

edited

Loading