Merge constructor augmentation changes into the augmentations proposal #4164
Added a bunch of comments. ;-)
* Evaluate the initializer expression of that definition to a value *v*
  (taking any augmentations of it into account).

* Initialize the corresponding instance variable of *o* to the value *v*.
I thought we would initialize all the inherited instance variables with an initializing expression as well at this time.
I guess we need to define "class definition", such that it includes the inherited variable definitions?
I think the missing part is exactly an explicit definition of 'class definition'.
Considering the class definition to be all declarations (introductory + augmenting) of the given class, in introductory textual order, we will initialize the instance variables of the current class that have an initializing expression. That's fine.
But the next step should be to execute the initializer list definition, which would be all the initializer list elements, gathered from the constructor declarations in introductory-to-last-augment order (appending, except for the superinitializer, if any, which will remain last).
The actual ordering is the semi-opposite: We're evaluating the initializer list elements from the most-augmented end, in textual order. So if we have initializer lists i1, i2, i3 (introductory), i4, i5 (an augmenting declaration), i6, i7 (another augmentation) then we will evaluate them in the order i6, i7, i4, i5, i1, i2, i3.
Of course, the ordering usually doesn't matter. Still, we do specify the order because we want to be able to give precise answers when it does matter, and we do want to provide a predictable program behavior to developers (such that they don't get weird behaviors after a recompile, or even at run time if the ordering could change from run to run). So the ordering does matter after all.
And I doubt this ordering will be helpful to anyone. I'd prefer if the ordering follows the pattern which is laid out by a single constructor declaration: i1, i2, i3, i4, i5, i6, i7.
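To make the two candidate orderings concrete, here is a minimal Python sketch. It is not part of the proposal: the function names and the list-of-lists encoding of an augmentation chain are assumptions made purely for illustration.

```python
def order_as_specified(chain):
    """Initializer lists run from the last augmentation back to the
    introductory declaration, each list in textual order."""
    return [elem for decl in reversed(chain) for elem in decl]


def order_as_single_declaration(chain):
    """Initializer lists run introductory-first, as if the whole chain
    had been written as one constructor declaration."""
    return [elem for decl in chain for elem in decl]


# Introductory declaration first, then each augmentation in order.
chain = [["i1", "i2", "i3"], ["i4", "i5"], ["i6", "i7"]]

print(order_as_specified(chain))
print(order_as_single_declaration(chain))
```

The first function models the ordering the draft currently implies; the second models the ordering preferred in the comment above.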
* Field initializers of class, from all augmentations, in “base declaration
  source order”.

* Augmenting declaration parameter lists and initializer lists, for
We generally do not evaluate parameter lists, it's probably an actual argument list. But I'd expect the evaluation of actual arguments to be the very first step, before such things as allocation of the fresh object (OK, we probably can't know, but certainly before) initialization of non-late/abstract/external instance variable declarations with an initializing expression.
Then we'd evaluate the initializer list (asserts and instance variable initializers, in textual order, for augmentations in last-to-first order).
There is exactly one superinitializer in the augmentation chain (or it's an error); it may have been implicitly induced because there is no `super...` syntax anywhere in the augmentation chain. The next step is to invoke the constructor denoted by that superinitializer by evaluating its actual arguments, binding them to parameters, and performing the same steps recursively as we just did on d.
The "evaluate parameter list" here is actually performing "binding actuals to formals", which is the step that handles initializing formals and super parameters.
Or in more detail... When you execute a (non-redirecting generative) constructor of a class to initialize the object, given an argument list, it does the following:
- For every instance variable of the class, in source order of their base declarations, execute any initializer expressions for the variable.
  - Start with the last variable declaration for that variable that has an initializing expression.
  - If that declaration is in an augmenting declaration, and it contains `augmented`, then each occurrence of `augmented` evaluates the initializer expression of the next prior declaration of that variable that has an initializing expression.
  - Then initialize the variable to the resulting value.
- Then start with the last (in source order) declaration of that constructor, and execute that to initialize the new object with the given argument list.
- Executing a constructor declaration in this way means:
  - Bind the actual argument list to the declared formal parameters of that constructor declaration. This handles initializing formals and super parameters (which assign the value to some position in a super constructor invocation argument list), and binds names to values in the parameter and initializer list scopes.
  - Execute the initializer list of that constructor declaration in the initializer list scope.
  - If the constructor declares a super-constructor invocation, remember that superclass constructor, and the accumulated super-constructor argument list.
  - If the declaration is an augmenting constructor declaration, then execute (recursively) the augmented declaration to initialize the same object with the same argument list.
  - Otherwise execute the remembered superclass constructor with the constructed argument list to initialize the same object. (If there is no remembered superclass constructor, execute the unnamed constructor of the superclass with the collected argument list).
  - When either of these complete, execute the body (if any) of this declaration in the parameter scope of.
  - Then the execution of this constructor declaration completes.
- When the last constructor declaration execution has completed, the execution of the class constructor completes.
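The constructor-declaration part of these steps can be traced with a small Python sketch. Everything here is an assumption for illustration: `ConstructorDecl`, `execute`, and `run_constructor` are hypothetical names, argument values are not modelled, and the field-initializer step is left out. The sketch only records the order in which the steps would run.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ConstructorDecl:
    """One declaration in an augmentation chain (hypothetical model)."""
    name: str
    initializers: List[str] = field(default_factory=list)
    super_call: Optional[str] = None  # e.g. "super.named"
    has_body: bool = False


def execute(chain, index, trace, remembered_super=None):
    """Execute chain[index]; chain[0] is the introductory declaration."""
    decl = chain[index]
    trace.append(f"bind parameters of {decl.name}")
    for init in decl.initializers:
        trace.append(f"initializer {init}")
    if decl.super_call is not None:
        remembered_super = decl.super_call
    if index > 0:
        # Augmenting declaration: recurse on the declaration it augments.
        execute(chain, index - 1, trace, remembered_super)
    else:
        # Introductory declaration: run the remembered superclass
        # constructor, defaulting to the unnamed one.
        trace.append(f"invoke {remembered_super or 'super()'}")
    if decl.has_body:
        trace.append(f"body of {decl.name}")


def run_constructor(chain):
    """Start from the last declaration, as the steps above describe."""
    trace = []
    execute(chain, len(chain) - 1, trace)
    return trace


chain = [
    ConstructorDecl("intro", ["i1"], has_body=True),
    ConstructorDecl("aug1", ["i2"], super_call="super.named", has_body=True),
]
for step in run_constructor(chain):
    print(step)
```

Under this model, initializer lists run from the last augmentation inward, the superclass constructor runs once the introductory declaration has been reached, and bodies run on the way back out, introductory declaration first.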
> The "evaluate parameter list" here is actually performing "binding actuals to formals", which is the step that handles initializing formals and super parameters.
OK, so we might call this step something like "binding formals to actuals and performing parameter based side effects". (The language specification still says "binding actuals to formals" in the section title, and the order is not consistent, but the NNBD update says "binding formals to actuals" everywhere, as it should.)
The binding of formals to actuals in S is somewhat magical because it relies on being able to bind positional parameters before we know how many of those we have in the superinitializer. Similarly, we don't know the set of names of named parameters. This means that a dynamic type check which would otherwise be performed during this binding of formals to actuals must be done later, when we encounter the syntactic construct which is to become the value of s.
I would prefer a specification where the superinitializer is found first, using a declarative specification: "For an augmentation chain d1 .. dk (introductory-to-last-augment) declaring a non-redirecting generative constructor, the superinitializer is the unique occurrence of a superinitializer in one of the declarations d1 .. dk. If there is none then the superinitializer is `super()`. If there is more than one then a compile-time error occurs." With that, we can start the execution of the constructor augmentation chain knowing exactly which superinitializer it has, and we don't need `late` variables like s, or apparent run-time errors that won't happen anyway.
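As a rough illustration of that declarative rule, the following Python sketch resolves the superinitializer of a chain up front. The string encoding of superinitializers, the `CompileTimeError` class, and the function name are all hypothetical; the sketch only mirrors the three cases in the quoted wording.

```python
from typing import List, Optional


class CompileTimeError(Exception):
    """Stands in for a compile-time error in this sketch."""


def resolve_superinitializer(chain: List[Optional[str]]) -> str:
    """chain[k] is the superinitializer written in declaration d(k+1),
    or None if that declaration has none."""
    found = [s for s in chain if s is not None]
    if len(found) > 1:
        raise CompileTimeError("more than one superinitializer in the chain")
    # No explicit superinitializer anywhere: fall back to super().
    return found[0] if found else "super()"


print(resolve_superinitializer([None, "super.named(x)", None]))
print(resolve_superinitializer([None, None]))
```

Because the result depends only on the declarations, it can be computed before any constructor code runs, which is the point made in the comment.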
* Base declaration parameter list, initializer list

* Base declaration super-constructor invocation, recurses to this entire list
  on superclass.
This would be the invocation of the unique superinitializer, which should happen when the introductory declaration of d has executed its initializer list.
Next, the introductory declaration of d runs its body (if any), and it returns to the first augmenting declaration of d, which will run its body, etc.
Co-authored-by: Erik Ernst <[email protected]>
(I'm OOO for a couple of days, I'll return to this on Monday.)
I think this is a heroic effort to specify the semantics of generative non-redirecting constructor invocations based on the syntax as specified by the programmer.
I think it almost works.
At the same time, it illustrates that this approach is quite costly in terms of detailed elements that are specified from scratch. It would be very helpful to introduce a number of abstractions (search for 'declarative specification' to see one example).
One of the consequences of this is that we get the surprising evaluation order for initializer list elements (search for 'i7' in the comments to see more). I certainly think we'd deliver a better quality of service if we preserve the evaluation order for a constructor which is declared in one piece.
Another thing I'm worried about is the fact that initializing formals and super-parameters will block further augmentation. I would certainly expect that an initializing formal could be introduced (like `A(this.x);`) and later augmentations could just confirm its status (like `@metadata augment A(this.x);`), but this does not seem to be possible.
In other words, every parameter of the form `this.x` or `super.x` must occur in the last augmentation. Is it really scalable (and compatible with macros) to have this requirement?
* Add initializers to the initializer list. If the augmenting constructor has
  an initializer list then:
* Otherwise the parameter is a normal parameter. Bind the name of the
  parameter to *v* in the parameter scope.
I still don't think it is allowed to have `A(super.x);` and then `augment A(int x);`. I do see things that imply we can have `A(int x);` and `augment A(super.x);` (as you mention), but (as I mentioned in comment on line 951 and 962) it seems to be impossible to have further augmentations (which is probably not intended, or at least seems quite inconvenient).
In any case, we should make sure we're careful about the scopes: The wording in line 987-988 implies that the parameter name is bound to v in the body scope of the constructor, but this should not happen in the case where the definition of this parameter is an initializing formal (they are only in scope in the initializer list, not in the body).
* Invoke *d’* on *C\<T1,…,Tn>* with arguments *L*, super-arguments *S*,
  and super constructor invocation *s* to initialize *o*.

* *This recurses on the augmented definition*.
That sounds good!
@@ -869,49 +880,187 @@ It is a compile-time error if:

These are probably the most complex constructor, but also the most common.
- These are probably the most complex constructor, but also the most common.
+ These are probably the most complex constructors, but also the most common.
initialize the value *o*.

We then define the invocation of a *constructor definition* as follows, allowing
an augmenting constructor’s definition to delegate to the definition it
I think we could have an augmenting constructor declaration, which is a constructor declaration whose first token is `augment`.
It would then delegate to the declaration that it augments.
Otherwise, an augmenting constructor declaration d could delegate to the definition of the augmented constructor declaration d1 (which would then be something like the chain from d1 and up to the introductory declaration of a constructor with the same name).
But probably:
- an augmenting constructor’s definition to delegate to the definition it
+ an augmenting constructor’s declaration to delegate to the declaration it
initialize an object *o* (with a runtime type that is known to extend *C* and
implement *C*\<*T1*,..,*Tn*\>):

* Let *d* be the constructor definition named *g* of *C*.
The constructor definition would be an entity that includes all the declarations of constructors named g in all the declarations in the same library (w/parts) named C, ordered textually last-to-first.
It would be helpful to have a definition of this terminology, but I couldn't find it.
Similarly for 'class definition' a few lines earlier.
I've previously considered 'class definition' to denote the semantic entity which is the result of processing the chain of augmentations into a single entity with a semantics that allows us to talk about "an instance member of the class" and other concepts that we've used for many years (and similarly for "X definition" for other X). However, the treatment here seems to imply that we must at least also consider a 'class definition' to be a syntactic entity, consisting of a lot of bits and pieces.
* It is an error if *s* is already initialized.

* Initialize *s* to the *superInitializer* of *d*.
This line is completely inscrutable as long as the nature of s hasn't been defined anywhere. Based on the assumption that s is a late variable whose type is syntax (or some semantic type modeling things like `super.name(42, n: true)`), it starts to make sense.
However, it is needlessly turning a statically known property ("what's the superinitializer of this constructor definition?") into a run-time adventure.
* Initialize *s* to the *superInitializer* of *d*.

* If *d* has an *augmentedDefinition* *d’*
Considering that the word 'definition' is otherwise used to denote the semantics (or syntax) of a fragmented declaration as a single entity, I'd certainly expect that d' would be a declaration.
- * If *d* has an *augmentedDefinition* *d’*
+ * If *d* has an *augmentedDeclaration* *d’*
* Invoke *d’* on *C\<T1,…,Tn>* with arguments *L*, super-arguments *S*,
  and super constructor invocation *s* to initialize *o*.

* *This recurses on the augmented definition*.
By the way, I still don't think we can have an augmented definition, it would be an augmented declaration.
* Invoke the constructor named *g1* on *U* with arguments *S* to
  initialize *o*.
The previous "invocation of this semantic function" (line 929) passed the super parameter list and the uninitialized superinitializer as well. I guess this is the proper invocation signature, and line 929 should be adjusted to make the super parameter list and the uninitialized superinitializer "local variables" of the procedure.
This is a pared-down version of #4164. It contains only the changes to the proposal we had previously discussed, making augmented have no special meaning in generative constructor bodies. It does not change rules around initializing formals or super parameters (these must be consistent across the introductory and any augmenting declarations still). I wanted to separate out this change from the much more complicated explanation of how these constructors should be evaluated.
I am closing this PR; a simplified version has been merged which doesn't describe the full process of running a constructor, and just contains the key changes surrounding the use of `augmented` in these constructor bodies.
Merges #4063 into the augmentations feature spec, with modifications from the comments.
I left off the requirement that the super constructor invocation must appear in the same declaration as any super parameters, and instead allow it to come in any subsequent declaration. It was simply easier to specify, and seemed fairly harmless, but let me know if you think this should change.
cc @lrhn should I bring in the "details of the definitions" section as well or is this sufficient?