Merge -Zhir-stats into -Zinput-stats #133023

Open
wants to merge 3 commits into
base: master

samestep
Contributor

Currently -Z hir-stats prints the size and count of various kinds of nodes, and the total size of all the nodes it counted, but not the total count of nodes. So, before this PR:

$ git clone https://github.com/BurntSushi/ripgrep
$ cd ripgrep
$ cargo +nightly rustc -- -Z hir-stats
ast-stats-1 PRE EXPANSION AST STATS
ast-stats-1 Name                Accumulated Size         Count     Item Size
ast-stats-1 ----------------------------------------------------------------
ast-stats-1 ...
ast-stats-1 ----------------------------------------------------------------
ast-stats-1 Total                 93_576
ast-stats-1
ast-stats-2 POST EXPANSION AST STATS
ast-stats-2 Name                Accumulated Size         Count     Item Size
ast-stats-2 ----------------------------------------------------------------
ast-stats-2 ...
ast-stats-2 ----------------------------------------------------------------
ast-stats-2 Total              2_430_648
ast-stats-2
hir-stats HIR STATS
hir-stats Name                Accumulated Size         Count     Item Size
hir-stats ----------------------------------------------------------------
hir-stats ...
hir-stats ----------------------------------------------------------------
hir-stats Total              3_678_512
hir-stats

For consistency, this PR adds a total for the count as well:

$ cargo +stage1 rustc -- -Z hir-stats
ast-stats-1 PRE EXPANSION AST STATS
ast-stats-1 Name                Accumulated Size         Count     Item Size
ast-stats-1 ----------------------------------------------------------------
ast-stats-1 ...
ast-stats-1 ----------------------------------------------------------------
ast-stats-1 Total                 93_576                 1_877
ast-stats-1
ast-stats-2 POST EXPANSION AST STATS
ast-stats-2 Name                Accumulated Size         Count     Item Size
ast-stats-2 ----------------------------------------------------------------
ast-stats-2 ...
ast-stats-2 ----------------------------------------------------------------
ast-stats-2 Total              2_430_648                48_625
ast-stats-2
hir-stats HIR STATS
hir-stats Name                Accumulated Size         Count     Item Size
hir-stats ----------------------------------------------------------------
hir-stats ...
hir-stats ----------------------------------------------------------------
hir-stats Total              3_678_512                73_418
hir-stats
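
As a rough illustration of the change described above (a standalone sketch, not the actual rustc implementation), the totals row can be produced by summing both columns and formatting the results with `_` as the thousands separator. The `Row` type, the function names, and the sample rows are all hypothetical; the real per-kind rows are elided in the output above.

```rust
/// One row of a stats table: a node kind, how many were seen, and the size of each.
struct Row {
    name: &'static str,
    count: usize,
    item_size: usize,
}

/// Formats an integer with `_` as the thousands separator, e.g. 93576 -> "93_576".
fn with_underscores(n: usize) -> String {
    let digits = n.to_string();
    let mut out = String::new();
    for (i, ch) in digits.chars().enumerate() {
        // Insert a separator before each group of three digits, except at the start.
        if i > 0 && (digits.len() - i) % 3 == 0 {
            out.push('_');
        }
        out.push(ch);
    }
    out
}

/// Returns (total accumulated size, total count) over all rows.
fn totals(rows: &[Row]) -> (usize, usize) {
    rows.iter().fold((0, 0), |(size, count), r| {
        (size + r.count * r.item_size, count + r.count)
    })
}

fn main() {
    // Hypothetical rows, chosen only to exercise the formatting.
    let rows = [
        Row { name: "Expr", count: 10, item_size: 72 },
        Row { name: "Ty", count: 5, item_size: 64 },
    ];
    for r in &rows {
        let size = with_underscores(r.count * r.item_size);
        let count = with_underscores(r.count);
        println!("{:<18}{:>18}{:>14}{:>14}", r.name, size, count, r.item_size);
    }
    let (size, count) = totals(&rows);
    // The second column here is the new part: a total for the count as well.
    println!("{:<18}{:>18}{:>14}", "Total", with_underscores(size), with_underscores(count));
}
```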

I wasn't sure if I was supposed to update tests/ui/stats/hir-stats.stderr to reflect this. I ran it locally, thinking it would fail, but it didn't:

$ ./x test tests/ui/stats
...

running 2 tests
i.

test result: ok. 1 passed; 0 failed; 1 ignored; 0 measured; 17949 filtered out

Also: is there a reason -Z hir-stats and -Z input-stats both exist? The former seems like it should completely supersede the latter. But strangely, the two give very different numbers for node counts:

$ cargo +nightly rustc -- -Z input-stats
...
Lines of code:             483
Pre-expansion node count:  2386
Post-expansion node count: 63844

That's a 30% difference in this case. Is it intentional that these numbers are so different? I see comments for both saying that they are merely approximations and should not be expected to be correct:

// Simply gives a rough count of the number of nodes in an AST.

// The visitors in this module collect sizes and counts of the most important
// pieces of AST and HIR. The resulting numbers are good approximations but not
// completely accurate (some things might be counted twice, others missed).
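
The size of the gap can be checked from the figures quoted above. This small sketch (the function name `percent_more` is made up for illustration) compares the -Z input-stats node counts with the -Z hir-stats total counts from the same ripgrep run; it comes out to roughly 27% pre-expansion and 31% post-expansion.

```rust
/// Relative excess of `a` over `b`, in percent.
fn percent_more(a: u64, b: u64) -> f64 {
    (a as f64 - b as f64) / b as f64 * 100.0
}

fn main() {
    // (label, -Z input-stats node count, -Z hir-stats total count), from the runs above.
    let runs = [
        ("pre-expansion", 2386u64, 1877u64),
        ("post-expansion", 63844, 48625),
    ];
    for (label, input_stats, hir_stats) in runs {
        println!(
            "{label}: input-stats counts {:.1}% more nodes",
            percent_more(input_stats, hir_stats)
        );
    }
}
```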

@rustbot
Collaborator

rustbot commented Nov 14, 2024

r? @jieyouxu

rustbot has assigned @jieyouxu.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Nov 14, 2024
@bjorn3
Member

bjorn3 commented Nov 14, 2024

> Also: is there a reason -Z hir-stats and -Z input-stats both exist?

AST and HIR are two separate IRs with different structures. As such, they should get separate statistics.

@jieyouxu
Member

jieyouxu commented Nov 14, 2024

Not too familiar with the use case for this, maybe r? @nnethercote?

@rustbot rustbot assigned nnethercote and unassigned jieyouxu Nov 14, 2024
@samestep
Contributor Author

> AST and HIR are two separate IRs with different structures. As such, they should get separate statistics.

@bjorn3 I understand that, but the reason I'm asking is because -Z hir-stats doesn't only print HIR stats: it also prints (what seems like it should be) a superset of the AST stats printed by -Z input-stats. So from your message, it sounds like you wouldn't be in favor of merging them into one flag, but perhaps you'd be in favor of moving the more detailed AST statistics from -Z hir-stats to -Z input-stats?

@bjorn3
Member

bjorn3 commented Nov 14, 2024

> -Z hir-stats doesn't only print HIR stats:

Right, missed that.

> So from your message, it sounds like you wouldn't be in favor of merging them into one flag, but perhaps you'd be in favor of moving the more detailed AST statistics from -Z hir-stats to -Z input-stats?

I never used either flag, so I guess I'm not the right person to decide whether that should be done or not.

@nnethercote
Contributor

The code changes here look fine.

You should update the test. It's currently marked as //@ only-x86_64. Are you running on an ARM64 machine such as an M-series Mac? I think the test will succeed on any 64-bit processor; try changing it to //@ only-64bit.
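
One plausible reason the expected output would hold on any 64-bit target (an assumption here, not something stated in the thread) is that node sizes depend on pointer width rather than on the specific architecture. A minimal sketch with a hypothetical `ToyNode`, which is not a real rustc type and uses `#[repr(C)]` only to make the layout predictable:

```rust
use std::mem::size_of;

/// A hypothetical node type: one optional boxed child plus a 32-bit id.
#[allow(dead_code)]
#[repr(C)]
struct ToyNode {
    child: Option<Box<ToyNode>>,
    id: u32,
}

/// Size of the toy node on the current target.
fn toy_node_size() -> usize {
    size_of::<ToyNode>()
}

fn main() {
    // `Option<Box<T>>` has the same size as a pointer, so the node's size
    // tracks the target's pointer width: the same on x86_64 and AArch64,
    // but smaller on a 32-bit target.
    println!("pointer size: {} bytes", size_of::<usize>());
    println!("ToyNode size: {} bytes", toy_node_size());
}
```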

As for the bigger questions about the two flags:

  • -Zhir-stats is misnamed, because it prints both AST stats and HIR stats.
  • -Zinput-stats prints one source code stat (the line count) and two AST stats (pre-expansion and post-expansion node counts). It is very old, and was added in #29764 ("Add -Zinput-stats").
  • -Zhir-stats gives a smaller number for node counts because it doesn't count AST nodes that are embedded within other AST nodes, because it's approximating memory usage and we don't want to double count anything. -Zinput-stats counts every AST node, so it's a more abstract measurement of size that doesn't relate as closely to memory usage.
  • I'm pretty sure both flags are rarely used, so changing them is unlikely to cause any significant disruption.
  • I suggest merging -Zhir-stats into -Zinput-stats and removing the -Zinput-stats node count. This means the entire file compiler/rustc_ast_passes/src/node_count.rs can be removed. I would put the source code line count in a section at the start like this:
    src-stats SOURCE CODE STATS
    src-stats Lines of code:  100
    src-stats
    
    tests/ui/stats/hir-stats.* should also be renamed as tests/ui/stats/input-stats.*. (Also grep for hir-stats and hir_stats; don't miss the entry in triagebot.toml.)
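
The difference between the two counting strategies described in the list above can be sketched on a toy tree. This is only a model: the `Node` type and its `embedded` flag are hypothetical, and how rustc actually decides which nodes to count is not shown here.

```rust
/// Toy tree node, not rustc's AST. `embedded` marks a node that is stored
/// inline inside its parent rather than in its own allocation.
struct Node {
    embedded: bool,
    children: Vec<Node>,
}

fn node(embedded: bool, children: Vec<Node>) -> Node {
    Node { embedded, children }
}

/// Counts every node, in the spirit of the -Zinput-stats node count.
fn count_all(n: &Node) -> usize {
    1 + n.children.iter().map(count_all).sum::<usize>()
}

/// Skips embedded nodes so their memory is not counted twice,
/// in the spirit of the -Zhir-stats counts.
fn count_standalone(n: &Node) -> usize {
    let own = if n.embedded { 0 } else { 1 };
    own + n.children.iter().map(count_standalone).sum::<usize>()
}

/// A five-node tree in which two of the nodes are embedded in their parents.
fn example_tree() -> Node {
    node(false, vec![
        node(true, vec![]),
        node(false, vec![node(true, vec![]), node(false, vec![])]),
    ])
}

fn main() {
    let tree = example_tree();
    println!("every node:       {}", count_all(&tree));
    println!("standalone nodes: {}", count_standalone(&tree));
}
```

The standalone count is never larger than the full count, which matches the direction of the gap between the two flags.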

@samestep: Would you like to do this? You could do it in this PR, in a second commit.

Finally, I'm curious why you were looking at these flags. Have you been using them yourself?

@rustbot
Collaborator

rustbot commented Nov 14, 2024

Changes to the size of AST and/or HIR nodes.

cc @nnethercote

@rustbot rustbot added the A-meta Area: Issues & PRs about the rust-lang/rust repository itself label Nov 14, 2024
@rustbot
Collaborator

rustbot commented Nov 14, 2024

triagebot.toml has been modified, there may have been changes to the review queue.

cc @davidtwco, @wesleywiser

@samestep samestep changed the title Print total node count in -Z hir-stats Merge -Zhir-stats into -Zinput-stats Nov 14, 2024
@samestep
Contributor Author

samestep commented Nov 14, 2024

@nnethercote

> The code changes here look fine.

Thanks for the quick review!

> You should update the test. It's currently marked as //@ only-x86_64. Are you running on an ARM64 machine such as an M-series Mac? I think the test will succeed on any 64-bit processor; try changing it to //@ only-64bit.

Done; yeah, I am on an M-series Mac.

> • I suggest merging -Zhir-stats into -Zinput-stats and removing the -Zinput-stats node count. This means the entire file compiler/rustc_ast_passes/src/node_count.rs can be removed. I would put the source code line count in a section at the start like this:
>     src-stats SOURCE CODE STATS
>     src-stats Lines of code:  100
>     src-stats

I've merged them, but I missed the part about src-stats when I first read your message. Should I still do this? Personally I feel inclined not to, since it's not particularly useful to have a stat that just tells you how many lines there are in the crate root; that's extremely easy to see with other non-Rust-specific tools.

> tests/ui/stats/hir-stats.* should also be renamed as tests/ui/stats/input-stats.*. (Also grep for hir-stats and hir_stats; don't miss the entry in triagebot.toml.)

Done.

> Finally, I'm curious why you were looking at these flags. Have you been using them yourself?

It's a bit of a long story, and somewhat off-topic for this thread, so I'll write you an email.

@nnethercote
Contributor

Looks good. My only suggestion is to squash the first two commits together, because they are logically part of the same change. Once that's done it's good to go, thanks.

@bors delegate=samestep

@bors
Contributor

bors commented Nov 14, 2024

✌️ @samestep, you can now approve this pull request!

If @nnethercote told you to "r=me" after making some further change, please make that change, then do @bors r=@nnethercote

Labels
A-meta Area: Issues & PRs about the rust-lang/rust repository itself S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

6 participants