Skip to content

Conversation

@Enselic
Copy link
Member

@Enselic Enselic commented Oct 8, 2025

To make debugger stepping intuitive with -Copt-level=0. See the adjusted basic-stepping.rs test.

This is kind of a revert of bd0aae9 (cg_llvm: use index-based loop in write_operand_repeatedly), except we don't revert it, we just make it conditional on opt-level. That commit regressed basic-stepping.rs, but it was not noticed since that test did not exist back then (it was added later in #144876). I have retroactively bisected to find that out.

It seems messy to sprinkle if-cases inside of
write_operand_repeatedly() so make the whole function conditional.

The test that bd0aae9 added in
tests/codegen/issues/issue-111603.rs already use -Copt-level=3, so we don't need to adjust the compiler flags for it to keep passing.

This PR takes us one step closer to fixing #33013.

CC #147426 which is related (there will be trivial conflicts for me to resolve in basic-stepping.rs once one of them lands)

@rustbot rustbot added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Oct 8, 2025
@rustbot
Copy link
Collaborator

rustbot commented Oct 8, 2025

r? @jdonszelmann

rustbot has assigned @jdonszelmann.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@Enselic Enselic changed the title rustc_codegen_llvm: Require opt-level >= 1 for index-based loop rustc_codegen_llvm: Require opt-level >= 1 for index-based write_operand_repeatedly() loop Oct 8, 2025
@saethlin
Copy link
Member

r? saethlin

@rustbot rustbot assigned saethlin and unassigned jdonszelmann Oct 11, 2025
@saethlin
Copy link
Member

Do you know why this codegen change made the stepping work differently? It's not clear to me at all what this diff has to do with the stepping order, but clearly it does. So I can't tell if this is actually papering over a bug in LLVM, or deeper in our debuginfo handling.

@saethlin saethlin added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 12, 2025
@Enselic
Copy link
Member Author

Enselic commented Oct 24, 2025

@saethlin Good call. I have investigated more deeply now. Stepping stopped working because the only emitted LLVM instructions with debug location metadata were removed. No LLVM instructions with debug location metadata remained.

However, simply adding debug info to all remaining instructions does not work well, because then stepping will loop for as many times as we repeat.

Let's look at the details with this program:

fn main() {
    let whatever = ["whatever"; 8];
}

When stepping worked, it looked like this:

rustc +nightly-2023-04-23 -g --emit=llvm-ir $src/whatever.rs
; whatever::main
; Function Attrs: nonlazybind uwtable
define internal void @_ZN8whatever4main17h46e92f5877e9c5b3E() unnamed_addr #1 !dbg !175 {
start:
  %whatever = alloca [8 x { ptr, i64 }], align 8
  call void @llvm.dbg.declare(metadata ptr %whatever, metadata !179, metadata !DIExpression()), !dbg !189
  %0 = getelementptr inbounds [8 x { ptr, i64 }], ptr %whatever, i64 0, i64 0, !dbg !190
  %1 = getelementptr inbounds [8 x { ptr, i64 }], ptr %whatever, i64 0, i64 8, !dbg !190
  br label %repeat_loop_header, !dbg !190

repeat_loop_header:                               ; preds = %repeat_loop_body, %start
  %2 = phi ptr [ %0, %start ], [ %6, %repeat_loop_body ]
  %3 = icmp ne ptr %2, %1
  br i1 %3, label %repeat_loop_body, label %repeat_loop_next

repeat_loop_body:                                 ; preds = %repeat_loop_header
  %4 = getelementptr inbounds { ptr, i64 }, ptr %2, i32 0, i32 0
  store ptr @alloc_3db654700ddfbbbd22c59221279a79d2, ptr %4, align 8
  %5 = getelementptr inbounds { ptr, i64 }, ptr %2, i32 0, i32 1
  store i64 8, ptr %5, align 8
  %6 = getelementptr inbounds { ptr, i64 }, ptr %2, i64 1
  br label %repeat_loop_header

repeat_loop_next:                                 ; preds = %repeat_loop_header
  ret void, !dbg !191
}

!190 = !DILocation(line: 2, column: 20, scope: !175)

stepping worked because of (note that these instructions don't repeat):

  %0 = getelementptr inbounds [8 x { ptr, i64 }], ptr %whatever, i64 0, i64 0, !dbg !190
  %1 = getelementptr inbounds [8 x { ptr, i64 }], ptr %whatever, i64 0, i64 8, !dbg !190

We can make stepping work if we add !dbg to the looped over instructions which I do in #148058. Here is what the IR looks like in that PR:

./build/x86_64-unknown-linux-gnu/stage1/bin/rustc -g --emit=llvm-ir $src/whatever.rs
; whatever::main
; Function Attrs: nonlazybind uwtable
define hidden void @_ZN8whatever4main17h0930ac910df4df44E() unnamed_addr #0 !dbg !159 {
start:
  %whatever = alloca [128 x i8], align 8
    #dbg_declare(ptr %whatever, !163, !DIExpression(), !173)
  br label %repeat_loop_header, !dbg !174

repeat_loop_header:                               ; preds = %repeat_loop_body, %start
  %0 = phi i64 [ 0, %start ], [ %4, %repeat_loop_body ], !dbg !174
  %1 = icmp ult i64 %0, 8, !dbg !174
  br i1 %1, label %repeat_loop_body, label %repeat_loop_next, !dbg !174

repeat_loop_body:                                 ; preds = %repeat_loop_header
  %2 = getelementptr inbounds nuw { ptr, i64 }, ptr %whatever, i64 %0, !dbg !174
  store ptr @alloc_3db654700ddfbbbd22c59221279a79d2, ptr %2, align 8, !dbg !174
  %3 = getelementptr inbounds i8, ptr %2, i64 8, !dbg !174
  store i64 8, ptr %3, align 8, !dbg !174
  %4 = add nuw i64 %0, 1, !dbg !174
  br label %repeat_loop_header, !dbg !174

repeat_loop_next:                                 ; preds = %repeat_loop_header
  ret void, !dbg !175
}

!174 = !DILocation(line: 2, column: 20, scope: !159)

Now stepping works, but since all instructions with !174 are repeated, we will now step 8 times (because of ; 8];). br label %repeat_loop_header, !dbg !174 probably don't count since no machine code is emitted for that instruction (I assume. I haven't double-checked. But why would it?.)

I don't have any good idea on how to solve this. The only thing I can think of is to force emission of a no-op instruction before we enter the loop and then attach !dbg !174 to it. It might work in non-optimized builds. But I don't know if that's possible or wanted.

Maybe you have some other idea?

@saethlin
Copy link
Member

saethlin commented Oct 24, 2025

Stepping stopped working because the only emitted LLVM instructions with debug location metadata were removed. No LLVM instructions with debug location metadata remained.

Is that true? I'm looking at this IR:

; demo::main
; Function Attrs: nonlazybind uwtable
define hidden void @_ZN4demo4main17h1e5d7b00d81e2d71E() unnamed_addr #0 !dbg !151 {
start:
  %h = alloca [128 x i8], align 8
    #dbg_declare(ptr %h, !155, !DIExpression(), !165)
  br label %repeat_loop_header, !dbg !166

repeat_loop_header:                               ; preds = %repeat_loop_body, %start
  %0 = phi i64 [ 0, %start ], [ %4, %repeat_loop_body ]
  %1 = icmp ult i64 %0, 8
  br i1 %1, label %repeat_loop_body, label %repeat_loop_next

repeat_loop_body:                                 ; preds = %repeat_loop_header
  %2 = getelementptr inbounds nuw { ptr, i64 }, ptr %h, i64 %0
  store ptr @alloc_3db654700ddfbbbd22c59221279a79d2, ptr %2, align 8
  %3 = getelementptr inbounds i8, ptr %2, i64 8
  store i64 8, ptr %3, align 8
  %4 = add nuw i64 %0, 1
  br label %repeat_loop_header

repeat_loop_next:                                 ; preds = %repeat_loop_header
  ret void, !dbg !167
}

In the start block, I see dbg 165 and 166 mentioned, and those are

!165 = !DILocation(line: 2, column: 9, scope: !156) 
!166 = !DILocation(line: 2, column: 13, scope: !151)

Which are the the h and the [ in let h = ["whatever; 8].

It almost seems like the getelementptr inbounds instructions that we used to emit in the start block matter to LLVM in some way that the other instructions don't, right?

@Enselic
Copy link
Member Author

Enselic commented Nov 7, 2025

@saethlin I investigated this further by stepping around in LLVM code among other things.

My understanding is that #dbg_declare does not affect line stepping behaviour in a debugger. Instead, #dbg_declare is used to tell a debugger how to figure out the value of a variable at various points in the function. So we can disregard #dbg_declare when it comes to stepping.

For stepping, what matters is the !dbg metadata that are attached to LLVM IR instructions. So why doesn't

br label %repeat_loop_header, !dbg !166

save us? After all, it attaches the right line info to the instruction, so what's wrong? What's happening is that there is no machine code emitted that corresponds to that LLVM IR instruction. That instruction just says "go to the block that comes next". But no explicit machine instruction is needed for that. Execution from start can simply fall through to repeat_loop_header.

This can be seen in the final machine code:

0000000000038b30 <_ZN13whatever_main4main17h2b49df056371578eE>:
   38b30:	48 83 ec 10          	sub    $0x10,%rsp
   38b34:	31 c0                	xor    %eax,%eax
   38b36:	48 89 44 24 88       	mov    %rax,-0x78(%rsp)
   38b3b:	48 8b 44 24 88       	mov    -0x78(%rsp),%rax
   38b40:	48 89 44 24 80       	mov    %rax,-0x80(%rsp)
   38b45:	48 83 f8 08          	cmp    $0x8,%rax
   38b49:	73 31                	jae    38b7c <_ZN13whatever_main4main17h2b49df056371578eE+0x4c>
   38b4b:	48 8b 44 24 80       	mov    -0x80(%rsp),%rax
   38b50:	48 8d 4c 24 90       	lea    -0x70(%rsp),%rcx
   38b55:	48 89 c2             	mov    %rax,%rdx
   38b58:	48 c1 e2 04          	shl    $0x4,%rdx
   38b5c:	48 01 d1             	add    %rdx,%rcx
   38b5f:	48 8d 15 fa 9c fd ff 	lea    -0x26306(%rip),%rdx        # 12860 <GCC_except_table0+0x38>
   38b66:	48 89 11             	mov    %rdx,(%rcx)
   38b69:	48 c7 41 08 08 00 00 	movq   $0x8,0x8(%rcx)
   38b70:	00 
   38b71:	48 83 c0 01          	add    $0x1,%rax
   38b75:	48 89 44 24 88       	mov    %rax,-0x78(%rsp)
   38b7a:	eb bf                	jmp    38b3b <_ZN13whatever_main4main17h2b49df056371578eE+0xb>
   38b7c:	48 83 c4 10          	add    $0x10,%rsp

We only have two jumps, namely (conditionally) from repeat_loop_header to repeat_loop_next (at 38b49) and from repeat_loop_body to repeat_loop_header (at 38b7a).

Since there is no machine code for br label %repeat_loop_header, !dbg !166, the debugger can't stop on that LLVM IR instruction.

It almost seems like the getelementptr inbounds instructions that we used to emit in the start block matter to LLVM in some way that the other instructions don't, right?

As explained above, I don't think we are observing any strange side effects of removing getelementptr. I think what we observe follows naturally from the new LLVM IR.

@saethlin
Copy link
Member

Ah-ha! That's what I was missing: debuginfo on a br label doesn't end up on anything. I swear I've heard of this before, and it just completely slipped my mind.

Thank you for bearing with me on this. I'm not excited about the approach, but I'm content that we're pretty deep into fencing with LLVM debuginfo handling that this is a good compromise implementation.

Just add a comment in the code about the motivation (so that it's obvious to a user that this branch is about debuginfo quality) then r=me.

To make debugger stepping intuitive with `-Copt-level=0`. See the
adjusted `basic-stepping.rs` test.

This is kind of a revert of bd0aae9, except we don't revert it,
we just make it conditional on `opt-level`. That commit regressed
`basic-stepping.rs`, but it was not noticed since that test did not
exist back then. I have retroactively bisected to find that out.

It seems messy to sprinkle if-cases inside of the
`write_operand_repeatedly()` so make the whole function conditional.

The test that bd0aae9 added in
`tests/codegen/issues/issue-111603.rs` already use `-Copt-level=3`, so
we don't need to adjust the compiler flags for it to keep passing.
@Enselic Enselic force-pushed the fix-basic-stepping-array branch from 5fb7903 to 5ad2f43 Compare November 13, 2025 05:35
@rustbot
Copy link
Collaborator

rustbot commented Nov 13, 2025

This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.

@Enselic
Copy link
Member Author

Enselic commented Nov 13, 2025

My pleasure. I learned a lot in the process.

@saethlin
Copy link
Member

@bors r+ rollup=iffy

@bors
Copy link
Collaborator

bors commented Nov 13, 2025

📌 Commit 5ad2f43 has been approved by saethlin

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Nov 13, 2025
bors added a commit that referenced this pull request Nov 14, 2025
rustc_codegen_llvm: Require `opt-level >= 1` for index-based write_operand_repeatedly() loop

To make debugger stepping intuitive with `-Copt-level=0`. See the adjusted `basic-stepping.rs` test.

This is kind of a revert of **bd0aae92dc76d9 (cg_llvm: use index-based loop in write_operand_repeatedly)**, except we don't revert it, we just make it conditional on `opt-level`. That commit regressed `basic-stepping.rs`, but it was not noticed since that test did not exist back then (it was added later in #144876). I have retroactively bisected to find that out.

It seems messy to sprinkle if-cases inside of
`write_operand_repeatedly()` so make the whole function conditional.

The test that bd0aae9 added in
`tests/codegen/issues/issue-111603.rs` already use `-Copt-level=3`, so we don't need to adjust the compiler flags for it to keep passing.

This PR takes us one step closer to fixing #33013.

CC #147426 which is related (there will be trivial conflicts for me to resolve in basic-stepping.rs once one of them lands)
@bors
Copy link
Collaborator

bors commented Nov 14, 2025

⌛ Testing commit 5ad2f43 with merge 8fa656e...

@bors
Copy link
Collaborator

bors commented Nov 14, 2025

💔 Test failed - checks-actions

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Nov 14, 2025
@rust-log-analyzer
Copy link
Collaborator

A job failed! Check out the build log: (web) (plain enhanced) (plain)

Click to see the possible cause of the failure (guessed by this bot)

@saethlin
Copy link
Member

denied: permission_denied: write_package
##[error]Process completed with exit code 1.

@bors retry

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 14, 2025
@bors
Copy link
Collaborator

bors commented Nov 14, 2025

⌛ Testing commit 5ad2f43 with merge 6d41834...

@bors
Copy link
Collaborator

bors commented Nov 14, 2025

☀️ Test successful - checks-actions
Approved by: saethlin
Pushing 6d41834 to main...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Nov 14, 2025
@bors bors merged commit 6d41834 into rust-lang:main Nov 14, 2025
12 checks passed
@rustbot rustbot added this to the 1.93.0 milestone Nov 14, 2025
@github-actions
Copy link
Contributor

What is this? This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.

Comparing 7a72c54 (parent) -> 6d41834 (this PR)

Test differences

Show 4 test diffs

4 doctest diffs were found. These are ignored, as they are noisy.

Test dashboard

Run

cargo run --manifest-path src/ci/citool/Cargo.toml -- \
    test-dashboard 6d41834e251a65a994bf0acf415ddafd4adbf71f --output-dir test-dashboard

And then open test-dashboard/index.html in your browser to see an overview of all executed tests.

Job duration changes

  1. dist-x86_64-apple: 8360.2s -> 5891.4s (-29.5%)
  2. aarch64-apple: 7988.0s -> 10183.8s (+27.5%)
  3. x86_64-rust-for-linux: 2498.6s -> 3050.3s (+22.1%)
  4. armhf-gnu: 4548.6s -> 5318.7s (+16.9%)
  5. i686-gnu-nopt-1: 6913.1s -> 8022.8s (+16.1%)
  6. aarch64-gnu-debug: 3913.3s -> 4538.1s (+16.0%)
  7. x86_64-gnu-tools: 3226.3s -> 3685.3s (+14.2%)
  8. x86_64-gnu-llvm-20: 2257.7s -> 2575.2s (+14.1%)
  9. x86_64-gnu-aux: 6403.5s -> 7282.7s (+13.7%)
  10. x86_64-gnu-gcc: 3034.9s -> 3426.5s (+12.9%)
How to interpret the job duration changes?

Job durations can vary a lot, based on the actual runner instance
that executed the job, system noise, invalidated caches, etc. The table above is provided
mostly for t-infra members, for simpler debugging of potential CI slow-downs.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (6d41834): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

This benchmark run did not return any relevant results for this metric.

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 475.201s -> 476.969s (0.37%)
Artifact size: 388.70 MiB -> 388.67 MiB (-0.01%)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants