-
Notifications
You must be signed in to change notification settings - Fork 47
AST Rust API #1477
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
AST Rust API #1477
Conversation
|
crates/solidity/outputs/cargo/crate/src/backend/semantic/mod.rs
Outdated
Show resolved
Hide resolved
| define_variant!(YulVariable, Identifier); | ||
| } | ||
|
|
||
| include!("ast_nodes.generated.rs"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was confused by the structure/imports until I found this include!() here.
I suggest using the same pattern in other IRs, by renaming ast_nodes to nodes, and the new nodes here to node_extensions. In that case, ast/mod.rs can simply pub use nodes::* and pub use node_extensions::* to achieve the same result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also suggest splitting the new types/extensions into type-specific modules under ast/node_extensions/*, as I expect them to only grow in size, and it would make reviewing them much easier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. I thought it was not possible to extend the implementations (defining new impl blocks) for types in a different module, but I was mistaken. I'll change the structure of this code.
| } | ||
|
|
||
| impl IdentifierStruct { | ||
| fn create(ir_node: &Rc<TerminalNode>, semantic: &Rc<SemanticAnalysis>) -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be pub(crate)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is an oversight.
| .unwrap() | ||
| } | ||
|
|
||
| define_variant!(Constant); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition to the extra SemanticAnalysis field for every def in the compilation, this struct/define_variant!() pattern makes it harder to match on the type, resorting to the added is_/as_ extensions.
I wonder if we should make it a proper enum instead? Each *Definition type contains a SemanticAnalysis object already.
enum Definition {
Contract(ContractDefinition),
Enum(EnumDefinition),
// etc...
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also suggest adding a sync marker between the variants of this Definition and binder::Definition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll try changing this to an enum. Seeing the code in perspective it makes perfect sense.
| } | ||
| } | ||
|
|
||
| fn internal_definition(&self) -> &binder::Definition { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see this PR adds definitions, but not references.
Is this planned for a future PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
References would need to be added to the the Definition (or each *Definition AST node). It's not added yet because we don't have the inverse mapping computed yet on the SemanticAnalysis object, but that's easy to do. I think that can go into this PR. I'll add that.
crates/solidity/outputs/cargo/crate/src/backend/ir/ast/visitor.rs.jinja2
Show resolved
Hide resolved
| let binder::Definition::Contract(contract) = definition else { | ||
| return None; | ||
| }; | ||
| if definition.identifier().unparse() == name { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not blocking this PR:
I notice in many places in the backend we are using unparse() (which clones the string value) just to get a reference to check/compare. I wonder if we should use/add a fn TerminalNode.value() -> &str instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to add on this, we should intern all of these strings so these comparisons are just over ids (numbers).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea of interning, which will also further improve performance of the binder, as there's a big chunk of time spent in hashing identifiers for lookup. This should be done in the parser though.
| @@ -0,0 +1,6549 @@ | |||
| // This file is generated automatically by infrastructure scripts. Please don't edit by hand. | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we measured the impact of adding this additional Rc<SemanticAnalysis> to every single node in the tree?
I like the new API, and that people don't have to see/interact with the SemanticAnalysis object at all, but I want to make sure that the increase in memory/allocation time is acceptable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think an issue for me right now is that we're cloning everything by default, so if a user wants to traverse the AST they will create a huge amount of (probably short lived) Rcs to it. The API looks almost like a typed cursor, so I wonder if maybe the methods should consume self rather than borrow it:
pub fn members(self) -> SourceUnitMembersStruct {
SourceUnitMembersStruct::create(
&self.ir_node.members,
self.semantic,
))
}This way, if a user actually needs to keep the reference they can cheaply clone it (we should #[derive(Clone)])
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, whenever we start looking into allocators we'll probably bring in lifetimes in, so a lot of this Rc<SemanticAnalysis> should become &'arena SemanticAnalysis
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would adding a Nevermind this, I read the opening comment wrong.Rc<SemanticAnalysis> to each IR node be even possible? I think there is a circular dependency there, and the borrow checker will not like that.
Re: the use of Rc, maybe we could be using value structs directly for Rust code, but once we cross into Wasm territory we need to keep track of the memory somehow, and our wrappers will use Rcs anyway. The AST is a façade/cursor API so maybe it's not that impactful?
| let unit = build_compilation_unit()?; | ||
| let semantic_analysis = unit.semantic_analysis(); | ||
| assert_eq!(2, semantic_analysis.files().len()); | ||
| assert_eq!(3, semantic_analysis.files().len()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we move this test to sample.rs as well, eliminating this module?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point!
OmarTawfik
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a few questions/suggestions. Thanks!
teofr
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a few comments and questions
crates/solidity/outputs/cargo/crate/src/backend/ir/ast/nodes.generated.rs
Show resolved
Hide resolved
| @@ -0,0 +1,6549 @@ | |||
| // This file is generated automatically by infrastructure scripts. Please don't edit by hand. | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think an issue for me right now is that we're cloning everything by default, so if a user wants to traverse the AST they will create a huge amount of (probably short lived) Rcs to it. The API looks almost like a typed cursor, so I wonder if maybe the methods should consume self rather than borrow it:
pub fn members(self) -> SourceUnitMembersStruct {
SourceUnitMembersStruct::create(
&self.ir_node.members,
self.semantic,
))
}This way, if a user actually needs to keep the reference they can cheaply clone it (we should #[derive(Clone)])
| @@ -0,0 +1,6549 @@ | |||
| // This file is generated automatically by infrastructure scripts. Please don't edit by hand. | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, whenever we start looking into allocators we'll probably bring in lifetimes in, so a lot of this Rc<SemanticAnalysis> should become &'arena SemanticAnalysis
crates/solidity/outputs/cargo/crate/src/backend/ir/ast/nodes.rs.jinja2
Outdated
Show resolved
Hide resolved
| let binder::Definition::Contract(contract) = definition else { | ||
| return None; | ||
| }; | ||
| if definition.identifier().unparse() == name { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to add on this, we should intern all of these strings so these comparisons are just over ids (numbers).
Also constructs the this (reverse) mapping after resolving all references.
teofr
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a small comment, but LGTM
| definitions.insert(*node_id, vec![*reference_id]); | ||
| } | ||
| } | ||
| Resolution::Ambiguous(_) | Resolution::BuiltIn(_) | Resolution::Unresolved => {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small note, we may be interested in computing references for BuiltIns as well.
This PR defines an AST API as a façade over the last IR language currently available (
ir2_flat_contracts) that allows opaque access to the IR nodes and the semantic information computed in the frontend passes.SemanticAnalysisobject that is obtainable from theCompilationUnitit's possible to retrieve the root of the AST of each file in the unitget_file_ast_root(), which returnsast::SourceUnit.SourceUnitdown through the intermediate AST nodes, down to the terminal nodes.SemanticAnalysis.Identifiernodes, which allow attempting to resolve to aDefinitionobject.Visitortrait can be implemented and used viaaccept_source_unit(and related functions) to traverse the wrapped AST.SourceUnitdefinescontracts()to retrieve the list of contracts defined in the file)