-
Notifications
You must be signed in to change notification settings - Fork 6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add id dispenser for numerical yul node ids #15838
base: develop
Are you sure you want to change the base?
Conversation
libyul/optimiser/NodeIdDispenser.cpp
Outdated
// this can be replaced by the actually used ids in the provided block once the AST uses ids instead of YulString | ||
std::set<NodeId> usedIds = ranges::views::iota(static_cast<size_t>(0), m_mapping.size() + m_offset) | ranges::to<std::set<NodeId>>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Version where AST contains numerical ids:
solidity/libyul/optimiser/NodeIdDispenser.cpp
Lines 67 to 71 in 9372af9
std::set<NodeId> usedIds = NameCollector(_root).names(); | |
// add ghosts to usedIds as they're not referenced in the regular ast | |
for (size_t i = 0; i < m_mapping.size(); ++i) | |
if (m_mapping[i] == ASTNodeRegistry::ghostId()) | |
usedIds.insert(i + m_offset); |
10dde18
to
0076c71
Compare
5ade700
to
4602bf7
Compare
0076c71
to
26ea8ed
Compare
0fdc32e
to
5bc5d98
Compare
26ea8ed
to
96f8138
Compare
837871f
to
43a00c6
Compare
96f8138
to
1ae159b
Compare
a6ae0c7
to
b0f5c6b
Compare
1ae159b
to
37112b5
Compare
b0f5c6b
to
2d04f51
Compare
37112b5
to
ada52c0
Compare
44cace1
to
f330e45
Compare
b514841
to
9f54785
Compare
f330e45
to
4d96648
Compare
9f54785
to
f50d498
Compare
libyul/optimiser/LabelIDDispenser.h
Outdated
/// Reserved labels, equipped with the transparent less comparison operator to be able to handle string_view. | ||
std::set<std::string, std::less<>> m_reservedLabels; | ||
/// Offset by which LabelIDs must be shifted to be used in the dispenser's `m_idToLabelMapping` | ||
size_t m_offset; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to suggest a clearer name for this, but actually I think it would it be much more self-explanatory to just use ASTLabelRegistry::maxID()
directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yeah, absolutely. Thanks for the suggestion!
ASTLabelRegistry const& labels() const { return m_labels; } | ||
|
||
/// Spawns a new LabelID which depends on a parent LabelID that will be used for its string representation. | ||
LabelID newID(LabelID _parent = 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The docstring should say that the parent label must not be an unused one.
Same for resolveParentLabelID()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And for generateNewLabels()
we should mention that the resulting registry will never contain any unused IDs. They will always have new labels generated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And for
generateNewLabels()
we should mention that the resulting registry will never contain any unused IDs. They will always have new labels generated.
It will have unused IDs. Right now I am not compressing the id range but it's append-only. As ids grow, so does the number of unused ones in the lower ranges. I thought about compressing them to the used ones, which would mean potentially remapping existing IDs in the AST and make generateNewLabels
return a fresh AST
. We could still do that if we figure it's a performance bottleneck or we waste too much memory. But personally I don't think so.
bool LabelIDDispenser::ghost(LabelID const _id) const | ||
{ | ||
yulAssert(_id < m_idToLabelMapping.size() + m_offset, "ID exceeds bounds."); | ||
if (_id >= m_offset) | ||
return m_idToLabelMapping[_id - m_offset] == ASTLabelRegistry::ghostLabelIndex(); | ||
|
||
return m_labels.ghost(_id); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function will return false
if _id
's parent is a ghost. Is that intentional? If not, then we should assert against that possibility in newID()
. Note that currently newID()
only rejects IDs which are ghosts themselves (due to range assert in resolveParentLabelID()
).
newID()
's docstring should also document how it handles ghosts in either form.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah in my mind one should only use newGhost
to generate new ghosts. Deriving labels from ghosts has the potential to break things in an intricate fashion. Definitely not intended. I'm adding an assert + documentation of this.
} | ||
|
||
|
||
ASTLabelRegistry LabelIDDispenser::generateNewLabels(Block const&, Dialect const& _dialect) const |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we can't use the _block
yet, IMO it would be best not add this parameter for now. I'd rather see the places that need to pass it in modified in the same PR that will implement the functionality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is fair:) I thought it would be easier this way but if you prefer it without the parameter then I'll remove it for now.
|
||
#include <libyul/optimiser/LabelIDDispenser.h> | ||
|
||
#include <libyul/optimiser/NameCollector.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The class does not really seem to use anything from NameCollector.h
. Or did I miss something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't miss anything, it's in the same vein as the block argument: to be used once fully integrated. I'll remove the include for now.
auto const parentLabelID = resolveParentLabelID(id); | ||
|
||
auto const originalLabelIndex = m_labels.idToLabelIndex(parentLabelID); | ||
auto const& originalLabel = originalLabels[originalLabelIndex]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a string view, not a string. Is there any point in taking a reference to it?
auto const& originalLabel = originalLabels[originalLabelIndex]; | |
std::string_view const originalLabel = originalLabels[originalLabelIndex]; |
BTW, this yet another good example of how auto
obfuscates things that would have been obvious otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is a string :) case in point, though!
auto const& parentLabel = originalLabels[parentLabelIndex]; | ||
|
||
std::string generatedLabel = parentLabel; | ||
size_t suffix = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One small optimization we could do here would be to remember the counter value we stopped at when we last visited the same parent.
Not sure if it's worth it in that form, because it would require us to keep track of counters for all labels from the original registry, but a simplified version would be to just use a single counter for all labels. That could be achieved simply by moving the suffix
variable initialization out of the loop. The downside being that numbers used in names would have more digits than they could otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea! The memory impact isn't too big, just a bunch of size_t
s.
libyul/optimiser/LabelIDDispenser.h
Outdated
/// A set of reserved labels may be provided, which is excluded when generating new labels. If a reserved label | ||
/// already appears in the label registry and is used as-is in the AST, it will be reused despite it being | ||
/// provided here. | ||
explicit LabelIDDispenser( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One important thing to mention is that the original labels will be preserved even if they're not valid Yul identifiers.
And for generateNewLabels()
I'd add that labels are guaranteed to be valid and not reserved IFF the original registry satisfies this condition. And it's guaranteed that at least no additional invalid/reserved names will be introduced.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, noted these things.
if (usedIDs.empty()) | ||
return {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This condition will never be satisfied, because ASTLabelRegistry
always contains at least one label (the empty one) and therefore usedIDs
will always contain 0
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another consequence of including 0
in usedIDs
is that you'll try to generate a new label for it (you also mark it as reused), which will overwrite idToLabelMap[0]
. I think that'll actually trip an assertion in ASTLabelRegistry
constructor, because it requires that the label at 0
is ""
.
On the other hand the ""
you put in labels
will go unreferenced (unless there are unused IDs in the original registry).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This condition will never be satisfied, because
ASTLabelRegistry
always contains at least one label (the empty one) and thereforeusedIDs
will always contain0
.
Right. Although this is a bit temporary and to be replaced with whatever the NameCollector
yields on an input block in which it may happen to have an empty usedIDs
set. I have removed the condition for now so it can be re-introduced when properly integrating the block input.
m_reservedLabels(_reservedLabels.begin(), _reservedLabels.end()), | ||
m_offset(_labels.maxID() + 1) | ||
{ | ||
m_reservedLabels += std::set{""}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it even necessary to reserve ""
? As far as I can tell, m_reservedLabels
just prevents a label from being reused in generateNewLabels()
, but you set reusedLabels[0] = true
there, so it won't hit the reserved name check anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. I don't think it is necessary and having fewer reserved labels is good for performance. :)
0d84ba2
to
d2d9d5b
Compare
d2d9d5b
to
d7aa648
Compare
Depends on #15823