@Fidget-Spinner
Contributor

No description provided.

@ltratt
Contributor

ltratt commented Dec 28, 2025

This is a quick partial review. Broadly speaking, I want to get this in, but I would like it to slightly better match existing conventions in the codebase (even trivial things like naming and comment width). I also think there are some trivial comments that are too long, and some surprising things that aren't documented (e.g. functions with surprising pre/post-conditions). Nothing major, but they'll just make the code easier to live with in the long term.

@Fidget-Spinner
Contributor Author

Yep, I'm still getting a feel for the conventions. Please bear with me on this one, thanks!

/// Known-bits analysis.
pub(super) struct KnownBits {
    known_bits: IndexVec<InstIdx, Option<KnownBitValue>>,
    pending_commit: Option<KnownBitValue>,

Contributor

This attribute needs documenting (I know what it does because I know its existence is because of an incomplete API... which is shameful on my part!).


fn inst_committed(&mut self, _opt: &CommitInstOpt, _iidx: InstIdx, _inst: &Inst) {
    self.known_bits.push(self.pending_commit.clone());
    assert_eq!(_iidx.index(), self.known_bits.len() - 1);

Contributor

`iidx` doesn't need the leading `_` if it's used in the `assert`. Also, wouldn't this assert be clearer before the `push`, where the `- 1` wouldn't be needed?
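
A minimal sketch of the ordering suggested here, using hypothetical stand-ins (`Analysis`, a plain `Vec` and `usize` indices) rather than the real `KnownBits`, `IndexVec`, and `InstIdx` types:

```rust
/// Hypothetical stand-in for the pass: `facts` plays the role of `known_bits`.
struct Analysis {
    facts: Vec<Option<u8>>,
    pending: Option<u8>,
}

impl Analysis {
    /// Record the pending fact for the instruction at `iidx`.
    fn inst_committed(&mut self, iidx: usize) {
        // Checked before the push: the next free slot must be `iidx`, so no `- 1` is needed.
        assert_eq!(iidx, self.facts.len());
        self.facts.push(self.pending.clone());
    }
}
```

Asserting first also means a mismatched index is caught before the vector is modified.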

})
}

fn set_knownbits(&mut self, bits: KnownBitValue) {

Contributor

Should be `set_pending` perhaps?

// If no new information (set zeroes) was gained, that means this
// op is useless.
if rhs_b.all_known()
&& (rhs_b

Contributor

I'm surprised `cargo fmt` hasn't removed the brackets. If it doesn't, we should remove them, as "over bracketing" isn't really needed in Rust like it is in C.

}

/// Full credits to the PyPy blog post for some of the functions here:
/// https://pypy.org/posts/2024/08/toy-knownbits.html#the-knownbits-abstract-domain

Contributor

I think it would be clearer / less repetitive to say once, in the top-level docstring (which we almost do currently), "This is heavily influenced by the [PyPy blog post](https://...)" (using Markdown syntax for URLs), rather than repeating it three times (as is currently the case).

/// https://pypy.org/posts/2024/08/toy-knownbits.html#the-knownbits-abstract-domain
///
/// In short:
/// one unknown knownbit

Contributor

Probably best to either use Markdown formatting for the table (or, at a push, a code block).

Contributor

I would also probably make the first row the ? state since that's the first. Then it becomes clear that one can go from row 1 to either rows 2 or 3, but no other transitions are legal (so we can condense the slightly long lattice comment perhaps?).

/// https://pypy.org/posts/2024/08/toy-knownbits.html#the-knownbits-abstract-domain
impl KnownBitValue {
/// Constructs a KnownBitValue from a constant.
fn from_constant(num: &ArbBitInt) -> Self {

Contributor

If we make this `from_const(num: ArbBitInt)` we force ownership onto the caller, which might allow them to sometimes sidestep a clone (i.e. this will, sometimes, be a more efficient API; and it's never less efficient). [Note the suggested name `from_const`, which is the conventional j2 shortening of constant.]
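
A minimal sketch of the ownership point, assuming a constant type whose clone is worth avoiding. `Const` and `Known` are hypothetical stand-ins, not the real `ArbBitInt` or `KnownBitValue`, and the `clones` counter exists only to make the difference observable:

```rust
/// Hypothetical stand-in for a constant type; `clones` counts how often it was cloned.
#[derive(Debug)]
struct Const {
    val: u64,
    clones: u32,
}

impl Clone for Const {
    fn clone(&self) -> Self {
        Const { val: self.val, clones: self.clones + 1 }
    }
}

/// Hypothetical stand-in for a value that stores a constant.
struct Known {
    ones: Const,
}

impl Known {
    /// Borrowing constructor: must clone in order to store the value.
    fn from_constant(num: &Const) -> Self {
        Known { ones: num.clone() }
    }

    /// Owning constructor: stores the value it was handed, with no clone.
    fn from_const(num: Const) -> Self {
        Known { ones: num }
    }
}
```

A caller that still needs its constant afterwards can write `Known::from_const(c.clone())`, which costs the same as the borrowing version; a caller that is finished with it pays nothing.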


#[test]
fn opt_and() {
// any = any & 01

Contributor

I must admit these comments confused me because I thought the arrow was pointing to 11, not to the whole line! I tend to think these particular comments are best removed, since the test itself more clearly shows what's going on. That said, I quite like high-level comments as a grouping exercise: that might not be relevant here, yet, though.

);

// any & any
test_known_bits(

Contributor

I'm not sure what this is testing.

Contributor Author

oops yeah, this test is useless

}

#[test]
fn opt_and_or() {

Contributor

I'm not convinced by this test because scaling it is going to be really hard as we analyse more IR instructions! I'd probably remove it.


#[test]
fn with_intermediate() {
// Test that other instructions stay around

Contributor

I'm not sure what these tests are testing.

}
}

/// Return a new [ArbBitInt] that performs bitwise `NEG` on `self`.

Contributor

arbbitint instructions all need corresponding tests. [Mostly those are proptest tests.]

@ltratt
Contributor

ltratt commented Dec 28, 2025

I've now done a thorough review: don't be scared! I take code in the optimiser very seriously, because it's the easiest code to get wrong. Because there are lots of little things, feel free in this (unusual) case to bunch changes up into large commits (you can even do one big commit if you want): I will review the whole PR again afterwards.

@Fidget-Spinner
Contributor Author

@ltratt I tried to address all your reviews in a few commits. I tried to make sure I didn't miss out anything. Please take a look again when you have the time, thanks!

@ltratt
Contributor

ltratt commented Dec 28, 2025

Dumb question: with this PR do all the tests pass and benchmarks run?

@ltratt
Contributor

ltratt commented Dec 28, 2025

I just realised that we probably need an additional test(s) where we prove that the knownbits pass can turn operations into constants. So I assume that we should be able to optimise this sort of thing:

%0: i8 = arg [reg]
%1: i8 = 3
%2: i8 = or %0, %1
%3: i8 = 1
%4: i8 = and %2, %3

into %4: i8 = 1 (assuming I've got my set bits vaguely right)?
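
The reasoning behind that example can be sketched with a toy known-bits domain over `u8`. This is an illustration under assumptions, not the PR's implementation: `KnownBits8`, `unknown`, `as_const`, and `fold_example` are hypothetical names, and the `ones`/`unknowns` representation follows the PyPy blog post rather than anything confirmed about `KnownBitValue`:

```rust
/// Toy known-bits value over `u8` (hypothetical). A bit set in `ones` is known to be 1, a bit
/// set in `unknowns` is not known, and every other bit is known to be 0.
#[derive(Clone, Copy, Debug, PartialEq)]
struct KnownBits8 {
    ones: u8,
    unknowns: u8,
}

impl KnownBits8 {
    /// Every bit of a constant is known.
    fn from_const(c: u8) -> Self {
        Self { ones: c, unknowns: 0 }
    }

    /// Nothing is known, e.g. for an argument.
    fn unknown() -> Self {
        Self { ones: 0, unknowns: u8::MAX }
    }

    /// Known zeroes are computed, not stored.
    fn zeroes(&self) -> u8 {
        !(self.ones | self.unknowns)
    }

    /// If every bit is known, return the constant this value must be.
    fn as_const(&self) -> Option<u8> {
        if self.unknowns == 0 { Some(self.ones) } else { None }
    }

    /// `and`: a bit is 1 only if both sides are 1, and 0 if either side is 0.
    fn bitand(&self, other: &Self) -> Self {
        let ones = self.ones & other.ones;
        let zeroes = self.zeroes() | other.zeroes();
        Self { ones, unknowns: !(ones | zeroes) }
    }

    /// `or`: a bit is 1 if either side is 1, and 0 only if both sides are 0.
    fn bitor(&self, other: &Self) -> Self {
        let ones = self.ones | other.ones;
        let zeroes = self.zeroes() & other.zeroes();
        Self { ones, unknowns: !(ones | zeroes) }
    }
}

/// Mirrors the five-instruction trace above and returns what `%4` folds to, if anything.
fn fold_example() -> Option<u8> {
    let v0 = KnownBits8::unknown(); // %0: i8 = arg [reg]
    let v1 = KnownBits8::from_const(3); // %1: i8 = 3
    let v2 = v0.bitor(&v1); // %2: i8 = or %0, %1
    let v3 = KnownBits8::from_const(1); // %3: i8 = 1
    let v4 = v2.bitand(&v3); // %4: i8 = and %2, %3
    v4.as_const()
}
```

After the `or`, the low two bits of `%2` are known ones and the rest are unknown; the `and` with `1` turns every bit other than bit 0 into a known zero, so in this sketch `fold_example()` returns `Some(1)`.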

One other (very minor) comment: comments need formatting to 100 chars. [I'm somewhat neutral on comment width, but the current code is formatted to 100 chars so we should follow that unless we make a consistent decision to reformat all comments.]

@Fidget-Spinner
Contributor Author

100-width comments added in 332cee0
Constant tests added in 9938104

All tests pass on b16, and all of AWFY passed last I tried. I will try the full benchmark suite.

@Fidget-Spinner
Contributor Author

FWIW, all the benchmarks pass; only one showed any difference (which I'm also suspicious of). Considering this only implements `and` and `or`, I'm not surprised.

confidence level: 99%

 Benchmark                 Datum0 (ms)  Datum1 (ms)  Ratio  Summary           
 knucleotide/yklua/        13013 ± 115  13387 ± 214   1.03  2.87% slower      
 queens/yklua/1000          5028 ± 298   4757 ± 231   0.95  indistinguishable 
 nbody/yklua/250000         4602 ± 291   4462 ± 239   0.97  indistinguishable 
 list/yklua/1500            7038 ± 234   6861 ± 115   0.97  indistinguishable 
 bounce/yklua/1500          8870 ± 370   8715 ± 508   0.98  indistinguishable 
 binarytrees/yklua/15      11798 ± 298  11682 ± 193   0.99  indistinguishable 
 fannkuchredux/yklua/10    12991 ± 166  12910 ± 181   0.99  indistinguishable 
 storage/yklua/1000        23638 ± 101  23515 ± 175   0.99  indistinguishable 
 deltablue/yklua/12000      8794 ± 217   8775 ±  34   1.00  indistinguishable 
 spectralnorm/yklua/1000   12375 ±  53  12354 ±  21   1.00  indistinguishable 
 Heightmap/yklua/2000       7969 ± 134   7959 ± 167   1.00  indistinguishable 
 havlak/yklua/1500         75744 ± 513  75654 ± 963   1.00  indistinguishable 
 mandelbrot/yklua/500       2291 ±   1   2292 ±   3   1.00  indistinguishable 
 bigloop/yklua/1000000000  32210 ±  24  32244 ±  36   1.00  indistinguishable 
 json/yklua/100            11842 ± 131  11855 ± 357   1.00  indistinguishable 
 permute/yklua/1000         8781 ± 398   8793 ± 936   1.00  indistinguishable 
 sieve/yklua/3000           4848 ±  60   4856 ± 105   1.00  indistinguishable 
 LuLPeg/yklua/             13659 ± 286  13685 ± 378   1.00  indistinguishable 
 HashIds/yklua/6000         9034 ± 102   9052 ±  52   1.00  indistinguishable 
 cd/yklua/250              32834 ± 365  33101 ± 418   1.01  indistinguishable 
 towers/yklua/600          10012 ± 250  10139 ± 326   1.01  indistinguishable 
 richards/yklua/100        41466 ± 869  42080 ±2248   1.01  indistinguishable

/// around. `illegal` occurs when both `0` and `1` are set and known, which is impossible in a
/// valid program. `illegal` indicates a likely bug in the optimizer/IR.
#[derive(Clone)]
pub struct KnownBitValue {

Contributor

Oops, just realised: why are these pub? They shouldn't be known outside this file AFAICS?

Contributor Author

Oops, yeah, this is an artifact left over from the olden days. Will remove it.


fn bitand(&self, other: &KnownBitValue) -> KnownBitValue {
    let set_ones = self.ones.bitand(&other.ones);
    let set_zeroes = self.zeroes().bitor(&other.zeroes());

Contributor

Why is `ones` a field access and `zeroes` a method?

Contributor Author

This is due to the bit representation: `ones` is not computed, while `zeroes` is.

@ltratt
Contributor

ltratt commented Dec 29, 2025

Two minor comments: we're almost there!

@ltratt
Contributor

ltratt commented Dec 29, 2025

Please squash.

@Fidget-Spinner
Contributor Author

Done! I will run all the tests and benchmarks one more time to be sure nothing crashes.

@Fidget-Spinner
Contributor Author

All tests pass. All benchmarks on yk-benchmarks pass.

@ltratt ltratt added this pull request to the merge queue Dec 29, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Dec 29, 2025
@Fidget-Spinner
Contributor Author

Oops, Clippy fails on the buildbot but not locally; I guess the buildbot has a newer Clippy? In any case, I force-pushed a formatting fix. Please retry, thanks!

@ltratt ltratt added this pull request to the merge queue Dec 29, 2025
@ltratt
Contributor

ltratt commented Dec 29, 2025

buildbot always uses the latest nightly Clippy. I personally find it worth rustuping my local rustc roughly monthly, as rustc and the other tools do introduce "won't merge" changes every so often.

Merged via the queue into ykjit:master with commit ac0a3ee Dec 29, 2025
2 checks passed