Merge constructor augmentation changes into the augmentations proposal #4164

Closed
wants to merge 5 commits into from
247 changes: 200 additions & 47 deletions working/augmentation-libraries/feature-specification.md
@@ -239,14 +239,14 @@ An augmentation that replaces the body of a function may also want to
preserve and run the code of the augmented declaration (hence the name
"augmentation"). It may want to run its own code before the augmented
code, after it, or both. To support that, we allow a new expression syntax
inside the "bodies" of augmenting declarations (function bodies,
constructor bodies, and variable initializers). Inside an expression in an
augmenting member declaration, the identifier `augmented` can be used to
refer to the augmented function, getter, or setter body, or variable
initializer. This is a contextual reserved word within `augment`
declarations, and has no special meaning outside of that context. See the
next section for a full specification of what `augmented` means, and how it
must be used, in the various contexts.
inside the "bodies" of certain augmenting declarations (some function bodies and
variable initializers). Inside an expression in an augmenting member
declaration, the identifier `augmented` can be used to refer to the augmented
function, getter, or setter body, or variable initializer. This is a contextual
reserved word within `augment` declarations, and has no special meaning outside
of that context. See the [augmented expression](#augmented-expression) section
for a full specification of what `augmented` means, and how it must be used, in
the various contexts.

*Note that within an augmenting member declaration, a reference to a member
by the same name refers to the final version of the member (and not the one
Expand Down Expand Up @@ -343,9 +343,20 @@ augmented, but it generally follows the same rules as any normal identifier:
variable's initializer if the member being augmented is not a variable
declaration with an initializing expression.

* **Augmenting functions**: When augmenting a function, `augmented`
refers to the augmented function. Tear offs are not allowed, so this
function must immediately be invoked.
* **Augmenting functions**: Inside an augmenting function body (including
factory constructors but not generative constructors) `augmented` refers to
the augmented function. Tear-offs are not allowed, and this function must
immediately be invoked.

* **Augmenting non-redirecting generative constructors**: Unlike other
functions, `augmented` has no special meaning in non-redirecting generative
constructors. It is still a reserved word inside the body of these
constructors, since they are within the scope of an augmenting declaration.

There is instead an implicit order in which these augmented constructors are
invoked, and they all receive the same arguments. See
[this section](#non-redirecting-generative-constructors) for more
information.

* **Augmenting operators**: When augmenting an operator, `augmented`
refers to the augmented operator method, which must be immediately
Expand Down Expand Up @@ -869,49 +880,187 @@ It is a compile-time error if:

These are probably the most complex constructor, but also the most common.
Member

Suggested change
These are probably the most complex constructor, but also the most common.
These are probably the most complex constructors, but also the most common.


A non-redirecting generative constructor marked `augment` may:
At a high level, a non-redirecting generative constructor marked `augment` may:

* Augment the constructor with an _additional_ constructor body (all bodies
eventually invoked as described below).

* Add initializers (and/or asserts) to the initializer list, as well as a
`super` call at the end of the initializer list.

* Opt a previously normal parameter into being either an initializing formal
parameter or super parameter.

The full process for evaluating a non-redirecting generative constructor is as
follows.

When we invoke a non-redirecting generative constructor named *g* (for example
`C.id` or `C`) on a class definition *C* with type parameters `X1`,..,`Xn`,
instantiated with type arguments *T1*,…,*Tn*, with an argument list *L* to
initialize an object *o* (with a runtime type that is known to extend *C* and
implement *C*\<*T1*,..,*Tn*\>):

* Let *d* be the constructor definition named *g* of *C*.
Member

The constructor definition would be an entity that includes all the declarations of constructors named g in all the declarations in the same library (w/parts) named C, ordered textually last-to-first.

It would be helpful to have a definition of this terminology, but I couldn't find it.

Similarly for 'class definition' a few lines earlier.

I've previously considered 'class definition' to denote the semantic entity which is the result of processing the chain of augmentations into a single entity with a semantics that allows us to talk about "an instance member of the class" and other concepts that we've used for many years (and similarly for "X definition" for other X). However, the treatment here seems to imply that we must at least also consider a 'class definition' to be a syntactic entity, consisting of a lot of bits and pieces.


* For each *instance variable definition* of *C* in
*introductory declaration order* (the order of the *introductory*
declaration in the total traversal ordering of a library), where the
variable definition *has an initializer expression*, and is not declared
`late`:

* Evaluate the initializer expression of that definition to a value *v*
(taking any augmentations of it into account).

* Initialize the corresponding instance variable of *o* to the value *v*.
Member

I thought we would initialize all the inherited instance variables with an initializing expression as well at this time.

Contributor Author

I guess we need to define "class definition", such that it includes the inherited variable definitions?

Member

I think the missing part is exactly an explicit definition of 'class definition'.

Considering the class definition to be all declarations (introductory + augmenting) of the given class, in introductory textual order, we will initialize the instance variables of the current class that have an initializing expression. That's fine.

But the next step should be to execute the initializer list definition, which would be all the initializer list elements, gathered from the constructor declarations in introductory-to-last-augment order (appending, except for the superinitializer, if any, which will remain last).

The actual ordering is the semi-opposite: We're evaluating the initializer list elements from the most-augmented end, in textual order. So if we have initializer lists i1, i2, i3 (introductory), i4, i5 (an augmenting declaration), i6, i7 (another augmentation) then we will evaluate them in the order i6, i7, i4, i5, i1, i2, i3.

Of course, the ordering usually doesn't matter. Still, we do specify the order because we want to be able to give precise answers when it does matter, and we do want to provide a predictable program behavior to developers (such that they don't get weird behaviors after a recompile, or even at run time if the ordering could change from run to run). So the ordering does matter after all.

And I doubt this ordering will be helpful to anyone. I'd prefer if the ordering follows the pattern which is laid out by a single constructor declaration: i1, i2, i3, i4, i5, i6, i7.
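To make the two orderings in this comment concrete, here is a minimal sketch in Python (not Dart). The `chain` list and both function names are hypothetical; each inner list stands for the initializer-list entries of one declaration, introductory declaration first.

```python
# Hypothetical model: one inner list per declaration in the augmentation
# chain, holding that declaration's initializer-list entries in textual order.
chain = [
    ["i1", "i2", "i3"],  # introductory declaration
    ["i4", "i5"],        # first augmenting declaration
    ["i6", "i7"],        # second augmenting declaration
]


def last_to_first(chain):
    """Order described for the PR: start at the last augmentation,
    keeping textual order within each declaration."""
    return [entry for decl in reversed(chain) for entry in decl]


def first_to_last(chain):
    """Order preferred in the comment: as if all entries were written
    in a single constructor declaration."""
    return [entry for decl in chain for entry in decl]


if __name__ == "__main__":
    print(last_to_first(chain))
    print(first_to_last(chain))
```

The two functions differ only in the direction in which the chain is walked; the order inside each declaration is textual in both.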


* Let *S* be an uninitialized argument list with the number of positional
Member

An "uninitialized argument list" is again not a known concept. We could say that it is a kind of semantic record, and we will store actual arguments in its positional and named slots as we proceed.

arguments and names of named arguments of *superParameters* of *d* (which is
the shape of the argument list of the super-constructor invocation including
all explicit arguments and super parameters).
Member

Note that we have not yet encountered the superinitializer and hence we can't know which shape S has.


* Invoke the constructor definition *d* of *C*\<*T1*,…,*Tn*> with argument
list *L*, super parameters *S*, and uninitialized super initializer *s* to
Member

It should be defined explicitly somewhere that s is essentially a late variable whose type is syntax (or some semantic type which is capable of holding a superinitializer, e.g., super.name(42, x: true)).

We don't usually have this kind of entity (or that kind of type) in specification language and it needs to be properly introduced. As it stands, everything about s is seriously confusing to read.

initialize the value *o*.

We then define the invocation of a *constructor definition* as follows, allowing
an augmenting constructor’s definition to delegate to the definition it
Member

I think we could have an augmenting constructor declaration, which is a constructor declaration whose first token is augment.

It would then delegate to the declaration that it augments.

Otherwise, an augmenting constructor declaration d could delegate to the definition of the augmented constructor declaration d1 (which would then be something like the chain from d1 and up to the introductory declaration of a constructor with the same name).

But probably:

Suggested change
an augmenting constructor’s definition to delegate to the definition it
an augmenting constructor’s declaration to delegate to the declaration it

augments.

An invocation of a non-redirecting generative constructor definition *d* on an
instantiated class definition *C*\<*T1*,…,*Tn*\> with argument list *L*, super
parameters *S*, and uninitialized super initializer *s* to initialize an object
*o* proceeds as follows (at least for any class other than `Object`, which just
completes immediately):

* Bind actual arguments to formal parameters. For each parameter in the
*parameterList* of *d* (in sequence order):

* If the argument list *L* has a value for the corresponding positional
position or named argument name, let *v* be that value.

* Otherwise, the parameter is guaranteed to be optional.

* If the parameter has a default value expression, let *v* be the
value of that expression.

* Otherwise let *v* be the `null` value.

* If the parameter is an initializing formal with name *n*,

* It is an error if *d* has an *augmentedDefinition* *d’* in which the
same parameter is anything other than a normal parameter.
Member

So we can have `A(int i);` followed by `augment A(this.i);` but not `A(this.i);` followed by `augment A(this.i);`? Is that a useful constraint? It seems to block all further augmentations as soon as any single parameter has been transformed into an initializing formal (or a super parameter, taking lines 959++ into account).

Also, do these constraints get expressed in terms of compile-time errors anywhere? I'd expect all these operational issues ("oops, that's not a normal parameter, let's throw!") to be expressed as compile-time requirements somewhere, and then we could argue at this point that the bad situation cannot arise because it would have incurred a compile-time error.


* Initialize the variable of *o* corresponding to the instance
variable named *n* of *C* to the value *v*. It is an error if this
variable is already initialized.

* Bind the name *n* to *v* in the initializer-list scope.

* Otherwise, if the parameter is a super-parameter,

* It is an error if *d* has an *augmentedDefinition* *d’* in which the
same parameter is anything other than a normal parameter.
Member

Same issue as in line 951.


* It is an error if *d* has an *augmentedDefinition* *d’*, which has
any super parameters. Only one declaration may contain super
parameters.

* Add or replace the body of the augmented constructor with a new body.
* It is an error if *s* is initialized. Super parameters must appear
before any super constructor invocations.
Member

This is again an operational "oops" approach, it should be a compile-time error if there's anything wrong with the placement of initializer list elements, including superinitializers, in the chain of declarations that amount to the definition of a constructor.


* If the augmenting constructor has an explicit block body, then that body
replaces any existing constructor body.
* If the parameter is named, set the value of the argument with name
*n* of *S* to *v*.

* In the augmenting constructor's body, an `augmented()` call executes the
augmented constructor's body in the same parameter scope that the
augmenting body is executing in. The expression has type `void` and
evaluates to `null`. **(TODO: This is slightly under-specified. We can
use the current bindings of the parameters of the augmenting constructor
as the initial binding of parameter variables in the augmented body, or
we can execute the body in the current *scope*, using the same variables
as the current body. The latter is not what we do with functions
elsewhere, and allows the `augmented()` expression to modify local
variables, but the former introduces different variables than the ones
that existed when evaluating the initializer list. If the initializer
list captures variables in closures, that body may not work.)**
* Otherwise the parameter is positional:

* Initializer lists _are not_ re-run, they have already executed and
shouldn't be executed twice. The same goes for initializing formals and
super parameters.
* Let *i* be one plus the number of prior positional super
parameters in the parameter list of *d*.

* If a parameter variable is overwritten prior to calling `augmented()`,
the augmented body will see the updated value, because the parameter
scope is identical.
* Otherwise if *d* has a *superInitializer*, add the number of
Member

'Otherwise' other than what?

positional arguments in the argument list of that invocation.
Member

Add them to what? To i? Is i mutable?


* Local variables in scope where `augmented()` is evaluated are not in
scope for the execution of the augmented constructor's body.
* Set the positional argument value with position *i* in *S* to
*v*.

* Add initializers to the initializer list. If the augmenting constructor has
an initializer list then:
* Bind the name of the parameter to *v* in the initializer-list scope.
Member

and in the parameter scope. A super-parameter should be accessible in the body.


* It's a compile-time error if the augmented constructor has
super-initializer, and the augmenting constructor's initializer list
also contains a super-initializer.
* Otherwise the parameter is a normal parameter. Bind the name of the
parameter to *v* in the parameter scope.
Member

Is there a rule somewhere that it is an error if the parameter is normal, but it has an augmented parameter declaration which is not?

Contributor Author

No, we are explicitly allowing this now (it wasn't allowed before).

It allows a macro or augmentation to add an extends clause and simultaneously add the super call, while taking advantage of the super parameter syntax to construct that call by augmenting some parameters to be super parameters.

Member

I still don't think it is allowed to have `A(super.x);` and then `augment A(int x);`. I do see things that imply we can have `A(int x);` and `augment A(super.x);` (as you mention), but (as I mentioned in the comments on lines 951 and 962) it seems to be impossible to have further augmentations (which is probably not intended, or at least seems quite inconvenient).

In any case, we should make sure we're careful about the scopes: The wording in line 987-988 implies that the parameter name is bound to v in the body scope of the constructor, but this should not happen in the case where the definition of this parameter is an initializing formal (they are only in scope in the initializer list, not in the body).


* Otherwise the result of applying the augmenting constructor has an
initializer list containing first the assertions and field initializers
of the augmented constructor, if any, then the assertions and field
initializers of the augmenting constructor, and finally any
super-initializer of either the augmented or augmenting constructor.
* For each entry in the initializer list of *d*, in sequence order:

* If assert entry, evaluate the first operand in the initializer list
scope. If evaluate to `false`, evaluate the second operand to a message
value *m*, if there is a second operand, otherwise let *m* be the
`null` value. Throw an `AssertionError` with *m* as message.

* If variable initializer entry, evaluate expression in initializer list
scope to value *v*. Initialize the variable of *o* corresponding to
variable with the initializer entry name in *C* to the value *v*.

* If *d* has a *superInitializer*.

* It is an error if *s* is already initialized.

* Initialize *s* to the *superInitializer* of *d*.
Member

This line is completely inscrutable as long as the nature of s hasn't been defined anywhere. Based on the assumption that s is a late variable whose type is syntax (or some semantic type modeling things like super.name(42, n: true)), it starts to make sense.

However, it is needlessly turning a statically known property ("what's the superinitializer of this constructor definition?") into a run-time adventure.


* If *d* has an *augmentedDefinition* *d’*
Member

Considering that the word 'definition' is otherwise used to denote the semantics (or syntax) of a fragmented declaration as a single entity, I'd certainly expect that d' would be a declaration.

Suggested change
* If *d* has an *augmentedDefinition* *d’*
* If *d* has an *augmentedDeclaration* *d’*


* Invoke *d’* on *C\<T1,…,Tn>* with arguments *L*, super-arguments *S*,
and super constructor invocation *s* to initialize *o*.

* *This recurses on the augmented definition*.
Member

Ah, so the correctness of the super-constructor invocation relies on an augmentation chain property: Exactly one declaration (introductory or augmenting) contributes a superinitializer. This is known statically, so we can rely on it being true at run time (so we shouldn't need to specify an error in line 1000, we could just have commentary saying that it is a known property).

Contributor Author

Well, a static error isn't called out anywhere right now I don't believe.

It certainly could be a static error though. Maybe I just mention on this line that this error should be reported at compile time?

Member

That sounds good!

Member

By the way, I still don't think we can have an augmented definition, it would be an augmented declaration.
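The chain property discussed in this thread (exactly one declaration contributes a superinitializer, with the unnamed superclass constructor used when none does) can be checked before any constructor runs. A minimal sketch in Python, not Dart: the function name, the `DEFAULT_SUPER` constant, and the representation of a superinitializer as source text are all assumptions made for illustration.

```python
from typing import List, Optional

# Assumed stand-in for the implicit call to the unnamed superclass constructor.
DEFAULT_SUPER = "super()"


def resolve_super_initializer(chain: List[Optional[str]]) -> str:
    """Return the single superinitializer of an augmentation chain.

    chain[k] is the superinitializer source text of declaration k
    (introductory declaration first), or None if that declaration has none.
    """
    found = [s for s in chain if s is not None]
    if len(found) > 1:
        # Models the static error: more than one declaration contributes one.
        raise ValueError("more than one superinitializer in the augmentation chain")
    return found[0] if found else DEFAULT_SUPER


if __name__ == "__main__":
    print(resolve_super_initializer([None, "super.named(42, x: true)", None]))
    print(resolve_super_initializer([None, None]))
```

With the superinitializer resolved up front like this, the run-time procedure would not need to track whether *s* has been initialized.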


* Otherwise:

* Let *U* be the instantiated *superclass definition* of
*C*\<*T1*,…,*Tn*\>.

* If *s* is initialized:

* Evaluate each argument of the argument list of the super-constructor
invocation in the initializer list scope, in source order, and set
the entry of *S* with the same position or name to the resulting
value.

* Let *g1* be the name of the superclass constructor of *U* targeted
by the *superInitializer* (class-name of *U* plus `.id` if
referenced as `super.id`).

* Otherwise let *g1* be the name of the unnamed constructor of *U*.

* Invoke the constructor named *g1* on *U* with arguments *S* to
initialize *o*.
Comment on lines +1032 to +1033
Member

The previous "invocation of this semantic function" (line 929) passed the super parameter list and the uninitialized superinitializer as well. I guess this is the proper invocation signature, and line 929 should be adjusted to make the super parameter list and the uninitialized superinitializer "local variables" of the procedure.


* *This recurses on the superclass constructor.*

* When this has completed, execute the *body* of *d*, if any, in a scope which
has the parameter scope of the invocation of *d* as parent scope.

* Then invocation of *d* completes normally.

The consequence of this definition is an execution order for initialization of
an instance of a class of:

* Field initializers of class, from all augmentations, in “introductory
declaration source order”.

* Augmenting declaration parameter lists and initializer lists, for
Member

We generally do not evaluate parameter lists, it's probably an actual argument list. But I'd expect the evaluation of actual arguments to be the very first step, before such things as allocation of the fresh object (OK, we probably can't know, but certainly before) initialization of non-late/abstract/external instance variable declarations with an initializing expression.

Then we'd evaluate the initializer list (asserts and instance variable initializers, in textual order, for augmentations in last-to-first order).

There is exactly one superinitializer in the augmentation chain (or it's an error); it may have been implicitly induced because there is no super... syntax anywhere in the augmentation chain. The next step is to invoke the constructor denoted by that superinitializer by evaluating its actual arguments, binding them to parameters, and performing the same steps recursively as we just did on d.

Member

The "evaluate parameter list" here is actually performing "binding actuals to formals", which is the step that handles initializing formals and super parameters.

Or in more detail... When you execute a (non-redirecting generative) constructor of a class to initialize the object, given an argument list, it does the following:

  • For every instance variable of the class, in source order of their base declarations,
    execute any initializer expressions for the variable.
    • Start with the last variable declaration for that variable that has an initializing expression.
    • If that declaration is in an augmenting declaration, and it contains augmented,
      then each occurrence of augmented evaluates the initializer expression of the
      next prior declaration of that variable that has an initializing expression.
    • Then initialize the variable to the resulting value.
  • Then start with the last (in source order) declaration of that constructor, and execute that
    to initialize the new object with the given argument list.
  • Executing a constructor declaration in this way means:
    • Bind the actual argument list to the declared formal parameters of that constructor
      declaration. This handles initializing formals and super parameters (which assign the
      value to some position in a super constructor invocation argument list), and binds
      names to values in the parameter and initializer list scopes.
    • Execute the initializer list of that constructor declaration in the initializer list scope.
    • If the constructor declares a super-constructor invocation, remember that superclass
      constructor, and the accumulated super-constructor argument list.
    • If the declaration is an augmenting constructor declaration, then
      execute (recursively) the augmented declaration to initialize the same object
      with the same argument list.
    • Otherwise execute the remembered superclass constructor with the constructed
      argument list to initialize the same object. (If there is no remembered superclass
      constructor, execute the unnamed constructor of the superclass with the collected
      argument list).
    • When either of these complete, execute the body (if any) of this declaration in the
      parameter scope of that declaration.
    • Then the execution of this constructor declaration completes.
  • When the last constructor declaration execution has completed, the execution of the
    class constructor completes.

Member

> The "evaluate parameter list" here is actually performing "binding actuals to formals", which is the step that handles initializing formals and super parameters.

OK, so we might call this step something like "binding formals to actuals and performing parameter based side effects". (The language specification still says "binding actuals to formals" in the section title, and the order is not consistent, but the NNBD update says "binding formals to actuals" everywhere, as it should.)

The binding of formals to actuals in S is somewhat magical because it relies on being able to bind positional parameters before we know how many of those we have in the superinitializer. Similarly, we don't know the set of names of named parameters. This means that a dynamic type check which would otherwise be performed during this binding of formals to actuals must be done later, when we encounter the syntactic construct which is to become the value of s.

I would prefer a specification where the superinitializer is found at first, using a declarative specification: "For an augmentation chain d1 .. dk (introductory-to-last-augment) declaring a non-redirecting generative constructor, the superinitializer is the unique occurrence of a superinitializer in one of the declarations d1 .. dk. If there is none then the superinitializer is super(). If there are more than one then a compile-time error occurs." With that, we can start the execution of the constructor augmentation chain knowing exactly which superinitializer it has, and we don't need late variables like s, or apparent run-time errors that won't happen anyway.

augmentations in last-to-first order.

* Introductory declaration parameter list, initializer list

* Introductory declaration super initializer, recurses to this entire list on
superclass.

* Body of the introductory declaration.

* Bodies of any augmenting declarations in first-to-last order.

It ensures that we only recurse in *one* place, keeping a stack discipline. The
parameter scope of the invocation can be stack-allocated, and be on top of the
stack when the scope is used. (If we execute initializer list entries *after*
the ones of an augmented definition, we fail to maintain that stack order.)
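The execution order listed above can be reproduced with a small simulation. This is a sketch in Python rather than Dart, under simplifying assumptions: `ConstructorDecl`, `ClassDef`, `construct`, `invoke`, and `demo` are hypothetical names, parameter binding and the initializer list are collapsed into one trace entry, and `None` stands in for `Object`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ConstructorDecl:
    """One declaration in a constructor's augmentation chain (hypothetical model)."""
    label: str
    augmented: Optional["ConstructorDecl"] = None  # the declaration this one augments


@dataclass
class ClassDef:
    """A class, its initialized fields, and the LAST declaration of its constructor."""
    name: str
    fields: List[str]
    last_decl: ConstructorDecl
    superclass: Optional["ClassDef"] = None  # None stands in for Object


def construct(cls: Optional[ClassDef], trace: List[str]) -> None:
    if cls is None:
        return  # Object's constructor completes immediately
    for f in cls.fields:  # field initializers, introductory declaration order
        trace.append(f"{cls.name}: field {f}")
    invoke(cls, cls.last_decl, trace)


def invoke(cls: ClassDef, decl: ConstructorDecl, trace: List[str]) -> None:
    trace.append(f"{cls.name}.{decl.label}: parameters + initializer list")
    if decl.augmented is not None:
        invoke(cls, decl.augmented, trace)   # recurse on the augmented declaration
    else:
        construct(cls.superclass, trace)     # recurse on the superclass constructor
    trace.append(f"{cls.name}.{decl.label}: body")  # bodies run on the way back out


def demo() -> List[str]:
    a = ClassDef("A", ["x"], ConstructorDecl("intro"))
    b_intro = ConstructorDecl("intro")
    b_aug1 = ConstructorDecl("aug1", augmented=b_intro)
    b_aug2 = ConstructorDecl("aug2", augmented=b_aug1)
    b = ClassDef("B", ["y"], b_aug2, superclass=a)
    trace: List[str] = []
    construct(b, trace)
    return trace


if __name__ == "__main__":
    print("\n".join(demo()))
```

Each call to `invoke` recurses in exactly one place, so the trace nests like a call stack: parameter and initializer-list entries appear last-to-first on the way in, and bodies appear first-to-last on the way out.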

#### Non-redirecting factory constructors

@@ -932,8 +1081,12 @@ potentially non-redirecting property of the constructor.

It is a compile-time error if:

* The augmented constructor has an initializer list or a body, or it has a
redirection.
* The augmented constructor has any initializers.
* The augmented constructor has a body.
* The augmented constructor has a redirection.

This redirecting generative constructor now behaves exactly like any other
redirecting generative constructor when it is invoked.

#### Redirecting factory constructors
