Add id dispenser for numerical yul node ids #15838
libyul/optimiser/NodeIdDispenser.cpp
Outdated
// this can be replaced by the actually used ids in the provided block once the AST uses ids instead of YulString
std::set<NodeId> usedIds = ranges::views::iota(static_cast<size_t>(0), m_mapping.size() + m_offset) | ranges::to<std::set<NodeId>>;
Version where AST contains numerical ids:
libyul/optimiser/LabelIDDispenser.h
Outdated
/// Reserved labels, equipped with the transparent less comparison operator to be able to handle string_view.
std::set<std::string, std::less<>> m_reservedLabels;
/// Offset by which LabelIDs must be shifted to be used in the dispenser's `m_idToLabelMapping`
size_t m_offset;
I wanted to suggest a clearer name for this, but actually I think it would be much more self-explanatory to just use ASTLabelRegistry::maxID() directly.
Oh yeah, absolutely. Thanks for the suggestion!
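The `m_reservedLabels` member quoted above uses the transparent comparator `std::less<>` so that lookups with a `std::string_view` need no temporary `std::string`. A minimal standalone sketch of that behaviour follows; the helper name `isReserved` is hypothetical and not part of the PR.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <string>
#include <string_view>

// Hypothetical helper, for illustration only: checks membership in a set of
// reserved labels without constructing a temporary std::string.
inline bool isReserved(
	std::set<std::string, std::less<>> const& _reservedLabels,
	std::string_view _label
)
{
	// With std::less<> (a transparent comparator), find() accepts any key type
	// that is comparable with std::string, including std::string_view.
	return _reservedLabels.find(_label) != _reservedLabels.end();
}
```

With the default `std::less<std::string>` comparator the same call would first have to convert the view into a `std::string`.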
ASTLabelRegistry const& labels() const { return m_labels; }

/// Spawns a new LabelID which depends on a parent LabelID that will be used for its string representation.
LabelID newID(LabelID _parent = 0);
The docstring should say that the parent label must not be an unused one.
Same for resolveParentLabelID().
And for generateNewLabels() we should mention that the resulting registry will never contain any unused IDs. They will always have new labels generated.
> And for generateNewLabels() we should mention that the resulting registry will never contain any unused IDs. They will always have new labels generated.
It will have unused IDs. Right now I am not compressing the ID range; it is append-only. As IDs grow, so does the number of unused ones in the lower ranges. I thought about compressing them down to the used ones, which would mean potentially remapping existing IDs in the AST and making generateNewLabels() return a fresh AST. We could still do that if it turns out to be a performance bottleneck or to waste too much memory, but personally I don't think it will.
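For reference, the compaction alternative mentioned above (which the PR does not implement) could be sketched roughly as follows. The function name `compressIDs` and the choice of `std::size_t` for `LabelID` are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>

using LabelID = std::size_t; // assumption: IDs are plain unsigned integers

// Hypothetical helper: maps every used ID onto the dense range [0, n),
// preserving the relative order of the IDs.
inline std::map<LabelID, LabelID> compressIDs(std::set<LabelID> const& _usedIDs)
{
	std::map<LabelID, LabelID> remapping;
	LabelID next = 0;
	for (LabelID const id: _usedIDs)
		remapping.emplace(id, next++);
	return remapping;
}
```

Every ID stored in the AST would then have to be rewritten through this mapping, which is the extra cost the append-only scheme avoids.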
bool LabelIDDispenser::ghost(LabelID const _id) const
{
	yulAssert(_id < m_idToLabelMapping.size() + m_offset, "ID exceeds bounds.");
	if (_id >= m_offset)
		return m_idToLabelMapping[_id - m_offset] == ASTLabelRegistry::ghostLabelIndex();

	return m_labels.ghost(_id);
}
The function will return false if _id's parent is a ghost. Is that intentional? If not, then we should assert against that possibility in newID(). Note that currently newID() only rejects IDs which are ghosts themselves (due to range assert in resolveParentLabelID()).
newID()'s docstring should also document how it handles ghosts in either form.
Yeah, in my mind one should only use newGhost to generate new ghosts. Deriving labels from ghosts has the potential to break things in an intricate fashion. Definitely not intended. I'm adding an assert and documentation for this.
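The rule agreed on above (ghosts come only from newGhost, and no label may be derived from a ghost) can be illustrated with a heavily simplified model. `MiniDispenser` is a hypothetical stand-in, not the PR's `LabelIDDispenser`, and it throws exceptions where the real code would use `yulAssert`.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

using LabelID = std::size_t; // assumption: IDs are plain unsigned integers

// Hypothetical, simplified model: original IDs occupy [0, offset), newly
// dispensed IDs are appended after them.
class MiniDispenser
{
public:
	// _originalGhosts[i] tells whether original ID i is a ghost.
	explicit MiniDispenser(std::vector<bool> _originalGhosts):
		m_originalGhosts(std::move(_originalGhosts))
	{}

	LabelID newGhost() { return append(true); }

	LabelID newID(LabelID _parent)
	{
		// Reject ghost parents, so a derived ID can never have a ghost ancestor.
		if (ghost(_parent))
			throw std::invalid_argument("Labels must not be derived from ghosts.");
		return append(false);
	}

	bool ghost(LabelID _id) const
	{
		std::size_t const offset = m_originalGhosts.size();
		if (_id >= offset + m_newIsGhost.size())
			throw std::out_of_range("ID exceeds bounds.");
		if (_id >= offset)
			return m_newIsGhost[_id - offset];
		return m_originalGhosts[_id];
	}

private:
	LabelID append(bool _isGhost)
	{
		m_newIsGhost.push_back(_isGhost);
		return m_originalGhosts.size() + m_newIsGhost.size() - 1;
	}

	std::vector<bool> m_originalGhosts;
	std::vector<bool> m_newIsGhost;
};
```

Because `newID()` refuses ghost parents, `ghost()` never has to walk a parent chain: a derived ID is a non-ghost by construction.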
}

ASTLabelRegistry LabelIDDispenser::generateNewLabels(Block const&, Dialect const& _dialect) const
If we can't use the _block yet, IMO it would be best not to add this parameter for now. I'd rather see the places that need to pass it in modified in the same PR that will implement the functionality.
That is fair :) I thought it would be easier this way, but if you prefer it without the parameter then I'll remove it for now.
#include <libyul/optimiser/LabelIDDispenser.h>

#include <libyul/optimiser/NameCollector.h>
The class does not really seem to use anything from NameCollector.h. Or did I miss something?
Didn't miss anything, it's in the same vein as the block argument: to be used once fully integrated. I'll remove the include for now.
auto const parentLabelID = resolveParentLabelID(id);

auto const originalLabelIndex = m_labels.idToLabelIndex(parentLabelID);
auto const& originalLabel = originalLabels[originalLabelIndex];
This is a string view, not a string. Is there any point in taking a reference to it?
- auto const& originalLabel = originalLabels[originalLabelIndex];
+ std::string_view const originalLabel = originalLabels[originalLabelIndex];
BTW, this is yet another good example of how auto obfuscates things that would have been obvious otherwise.
It is a string :) Case in point, though!
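The confusion in this exchange comes from `auto const&` hiding the element type. A small standalone sketch shows what is deduced in each case; the container and variable names are made up for illustration, and it requires C++17.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <type_traits>
#include <vector>

// Illustrative containers only; they are not taken from the PR.
std::vector<std::string> const ownedLabels{"x", "y"};
std::vector<std::string_view> const viewLabels{"x", "y"};

// Owning strings: a reference avoids copying the string.
auto const& ownedRef = ownedLabels[0];
// Views: a reference gains nothing, since copying a view is cheap.
auto const& viewRef = viewLabels[0];

static_assert(std::is_same_v<decltype(ownedRef), std::string const&>);
static_assert(std::is_same_v<decltype(viewRef), std::string_view const&>);
```

Spelling the type out, as the suggestion proposes, makes it visible at the declaration which of the two cases applies.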
auto const& parentLabel = originalLabels[parentLabelIndex];

std::string generatedLabel = parentLabel;
size_t suffix = 1;
One small optimization we could do here would be to remember the counter value we stopped at when we last visited the same parent.
Not sure if it's worth it in that form, because it would require us to keep track of counters for all labels from the original registry, but a simplified version would be to just use a single counter for all labels. That could be achieved simply by moving the suffix variable initialization out of the loop. The downside is that the numbers used in names would have more digits than they otherwise need.
Good idea! The memory impact isn't too big, just a bunch of size_ts.
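The per-parent counter idea discussed above can be sketched as follows. This is only an illustration under stated assumptions: the class name `SuffixedLabelGenerator` is hypothetical, and the `<parent>_<n>` naming scheme is assumed rather than taken from the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical helper: generates fresh labels of the form "<parent>_<n>" and
// remembers, per parent, the suffix at which the last search stopped.
class SuffixedLabelGenerator
{
public:
	explicit SuffixedLabelGenerator(std::set<std::string> _taken):
		m_taken(std::move(_taken))
	{}

	std::string generate(std::string const& _parent)
	{
		// Resume from the stored counter instead of restarting at 1.
		std::size_t& suffix = m_nextSuffix.try_emplace(_parent, 1).first->second;
		std::string candidate;
		do
			candidate = _parent + "_" + std::to_string(suffix++);
		while (m_taken.count(candidate) != 0);
		m_taken.insert(candidate);
		return candidate;
	}

private:
	std::set<std::string> m_taken;
	std::unordered_map<std::string, std::size_t> m_nextSuffix;
};
```

The single-counter variant from the comment above would replace the map with one `std::size_t` member, trading longer numeric suffixes for less bookkeeping.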
/// A set of reserved labels may be provided, which is excluded when generating new labels. If a reserved label
/// already appears in the label registry and is used as-is in the AST, it will be reused despite it being
/// provided here.
explicit LabelIDDispenser(
One important thing to mention is that the original labels will be preserved even if they're not valid Yul identifiers.
And for generateNewLabels() I'd add that labels are guaranteed to be valid and not reserved IFF the original registry satisfies this condition. And it's guaranteed that at least no additional invalid/reserved names will be introduced.
Makes sense, noted these things.
if (usedIDs.empty())
	return {};
This condition will never be satisfied, because ASTLabelRegistry always contains at least one label (the empty one) and therefore usedIDs will always contain 0.
Another consequence of including 0 in usedIDs is that you'll try to generate a new label for it (you also mark it as reused), which will overwrite idToLabelMap[0]. I think that'll actually trip an assertion in the ASTLabelRegistry constructor, because it requires that the label at 0 is "".
On the other hand, the "" you put in labels will go unreferenced (unless there are unused IDs in the original registry).
> This condition will never be satisfied, because ASTLabelRegistry always contains at least one label (the empty one) and therefore usedIDs will always contain 0.
Right, although this is temporary: it is to be replaced with whatever the NameCollector yields for an input block, in which case the usedIDs set may happen to be empty. I have removed the condition for now so it can be reintroduced when properly integrating the block input.
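The invariant raised in this thread, that ID 0 always maps to the empty label and must not be overwritten during label generation, can be sketched with a toy model. `MiniRegistry`, `relabel` and the `label_<id>` naming are hypothetical and only stand in for the real `ASTLabelRegistry` logic.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

using LabelID = std::size_t; // assumption: IDs are plain unsigned integers

// Hypothetical stand-in for a registry whose ID 0 must map to the empty label.
struct MiniRegistry
{
	explicit MiniRegistry(std::vector<std::string> _labels): labels(std::move(_labels))
	{
		if (labels.empty() || !labels.front().empty())
			throw std::invalid_argument("Label at ID 0 must be the empty label.");
	}

	std::vector<std::string> labels;
};

// Generates placeholder labels for the used IDs while leaving ID 0 untouched.
inline MiniRegistry relabel(std::set<LabelID> const& _usedIDs, LabelID _maxID)
{
	std::vector<std::string> labels(_maxID + 1); // all empty, including slot 0
	for (LabelID const id: _usedIDs)
	{
		if (id == 0)
			continue; // never overwrite the empty label
		labels.at(id) = "label_" + std::to_string(id);
	}
	return MiniRegistry{std::move(labels)};
}
```

Skipping ID 0 explicitly keeps the constructor check satisfied even though 0 is always a member of the used set.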
	m_reservedLabels(_reservedLabels.begin(), _reservedLabels.end()),
	m_offset(_labels.maxID() + 1)
{
	m_reservedLabels += std::set{""};
Is it even necessary to reserve ""? As far as I can tell, m_reservedLabels just prevents a label from being reused in generateNewLabels(), but you set reusedLabels[0] = true there, so it won't hit the reserved name check anyway.
Good catch. I don't think it is necessary and having fewer reserved labels is good for performance. :)
This pull request is stale because it has been open for 14 days with no activity.
Depends on #15823