Merge constructor augmentation changes into the augmentations proposal #4164

Closed
wants to merge 5 commits into from

Conversation

jakemac53
Contributor

@jakemac53 jakemac53 commented Nov 18, 2024

Merges #4063 into the augmentations feature spec, with modifications from the comments.

  • Changes how non-redirecting generative constructor augmentations work, roughly:
    • Remove special meaning of augmented in these constructor bodies.
    • Remove requirement that initializing formal and super parameters are a part of the "identity" of a constructor call, so future augmentations can change them.
      • Instead, only a single declaration opts a parameter into being either an initializing formal or super parameter.
      • All super parameters must appear in the same declaration, but it need not be the base one.
      • Any declaration introducing super parameters must come before any super constructor invocations.
    • Constructor bodies are invoked in order, starting at the base declaration and then going through the augmentations in first to last order.
  • Describes in detail the non-redirecting generative constructor invocation semantics.

I left off the requirement that the super constructor invocation must appear in the same declaration as any super parameters, and instead allow it to come in any subsequent declaration. It was simply easier to specify, and seemed fairly harmless, but let me know if you think this should change.

cc @lrhn should I bring in the "details of the definitions" section as well or is this sufficient?

@jakemac53 jakemac53 requested review from lrhn and eernstg November 18, 2024 19:28
Member

@eernstg eernstg left a comment

Added a bunch of comments. ;-)

* Evaluate the initializer expression of that definition to a value *v*
(taking any augmentations of it into account).

* Initialize the corresponding instance variable of *o* to the value *v*.
Member

I thought we would initialize all the inherited instance variables with an initializing expression as well at this time.

Contributor Author

I guess we need to define "class definition", such that it includes the inherited variable definitions?

Member

I think the missing part is exactly an explicit definition of 'class definition'.

Considering the class definition to be all declarations (introductory + augmenting) of the given class, in introductory textual order, we will initialize the instance variables of the current class that have an initializing expression. That's fine.

But the next step should be to execute the initializer list definition, which would be all the initializer list elements, gathered from the constructor declarations in introductory-to-last-augment order (appending, except for the superinitializer, if any, which will remain last).

The actual ordering is the semi-opposite: We're evaluating the initializer list elements from the most-augmented end, in textual order. So if we have initializer lists i1, i2, i3 (introductory), i4, i5 (an augmenting declaration), i6, i7 (another augmentation) then we will evaluate them in the order i6, i7, i4, i5, i1, i2, i3.

Of course, the ordering usually doesn't matter. Still, we do specify the order because we want to be able to give precise answers when it does matter, and we do want to provide a predictable program behavior to developers (such that they don't get weird behaviors after a recompile, or even at run time if the ordering could change from run to run). So the ordering does matter after all.

And I doubt this ordering will be helpful to anyone. I'd prefer if the ordering follows the pattern which is laid out by a single constructor declaration: i1, i2, i3, i4, i5, i6, i7.
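
To make the two orderings concrete, here is a small Python sketch. It is only a toy model (not Dart, and not part of the proposal): each declaration in the chain is a list of initializer names using the i1..i7 labels from the comment above, and the function names are made up for illustration.

```python
# Toy model: each constructor declaration contributes a list of initializer
# list elements. The chain is ordered introductory-first, as in source order.
chain = [
    ["i1", "i2", "i3"],  # introductory declaration
    ["i4", "i5"],        # first augmenting declaration
    ["i6", "i7"],        # second augmenting declaration
]


def last_augmentation_first(declarations):
    """Order described above as the actual one: start at the most-augmented
    declaration, keeping textual order inside each declaration."""
    return [elem for decl in reversed(declarations) for elem in decl]


def single_declaration_order(declarations):
    """Order preferred in the comment: as if everything were written in a
    single constructor declaration."""
    return [elem for decl in declarations for elem in decl]


print(last_augmentation_first(chain))
print(single_declaration_order(chain))
```

The two functions differ only in whether the chain is reversed before flattening, which is the whole disagreement about evaluation order.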

* Field initializers of class, from all augmentations, in “base declaration
source order”.

* Augmenting declaration parameter lists and initializer lists, for
Member

We generally do not evaluate parameter lists, so this is probably an actual argument list. But I'd expect the evaluation of actual arguments to be the very first step: before allocation of the fresh object (OK, we probably can't observe that), and certainly before the initialization of instance variables that have an initializing expression and are not late, abstract, or external.

Then we'd evaluate the initializer list (asserts and instance variable initializers, in textual order, for augmentations in last-to-first order).

There is exactly one superinitializer in the augmentation chain (or it's an error); it may have been implicitly induced because there is no super... syntax anywhere in the augmentation chain. The next step is to invoke the constructor denoted by that superinitializer by evaluating its actual arguments, binding them to parameters, and performing the same steps recursively as we just did on d.

Member

The "evaluate parameter list" here is actually performing "binding actuals to formals", which is the step that handles initializing formals and super parameters.

Or in more detail... When you execute a (non-redirecting generative) constructor of a class to initialize the object, given an argument list, it does the following:

  • For every instance variable of the class, in source order of their base declarations,
    execute any initializer expressions for the variable.
    • Start with the last variable declaration for that variable that has an initializing expression.
    • If that declaration is in an augmenting declaration, and it contains augmented,
      then each occurrence of augmented evaluates the initializer expression of the
      next prior declaration of that variable that has an initializing expression.
    • Then initialize the variable to the resulting value.
  • Then start with the last (in source order) declaration of that constructor, and execute that
    to initialize the new object with the given argument list.
  • Executing a constructor declaration in this way means:
    • Bind the actual argument list to the declared formal parameters of that constructor
      declaration. This handles initializing formals and super parameters (which assign the
      value to some position in a super constructor invocation argument list), and binds
      names to values in the parameter and initializer list scopes.
    • Execute the initializer list of that constructor declaration in the initializer list scope.
    • If the constructor declares a super-constructor invocation, remember that superclass
      constructor, and the accumulated super-constructor argument list.
    • If the declaration is an augmenting constructor declaration, then
      execute (recursively) the augmented declaration to initialize the same object
      with the same argument list.
    • Otherwise execute the remembered superclass constructor with the constructed
      argument list to initialize the same object. (If there is no remembered superclass
      constructor, execute the unnamed constructor of the superclass with the collected
      argument list).
    • When either of these completes, execute the body (if any) of this declaration in the
      parameter scope of this declaration.
    • Then the execution of this constructor declaration completes.
  • When the last constructor declaration execution has completed, the execution of the
    class constructor completes.
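
The constructor-declaration part of these steps can be sketched as a small Python model. This is a simplified illustration under stated assumptions, not the specified semantics: `ConstructorDecl` and `execute` are hypothetical names, field initializers and the accumulated super-argument list are left out, and "executing" a step just appends a line to a trace.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ConstructorDecl:
    """One declaration in an augmentation chain (hypothetical model type)."""
    name: str
    initializers: list = field(default_factory=list)  # initializer list entries
    super_call: Optional[str] = None                   # e.g. "super.named"
    has_body: bool = True
    augmented: Optional["ConstructorDecl"] = None      # declaration it augments


def execute(decl, args, trace, remembered_super=None):
    """Run one declaration, following the order of the steps listed above."""
    trace.append(f"bind {args} to formals of {decl.name}")
    for init in decl.initializers:
        trace.append(f"init {init}")
    if decl.super_call is not None:
        remembered_super = decl.super_call
    if decl.augmented is not None:
        # Augmenting declaration: recurse on the declaration it augments,
        # with the same argument list.
        execute(decl.augmented, args, trace, remembered_super)
    else:
        # Base declaration: run the remembered superclass constructor,
        # or the unnamed one if none was declared.
        trace.append(f"call {remembered_super or 'super()'}")
    if decl.has_body:
        trace.append(f"body of {decl.name}")


base = ConstructorDecl("base", ["i1"])
aug1 = ConstructorDecl("aug1", ["i2"], super_call="super.named", augmented=base)
aug2 = ConstructorDecl("aug2", ["i3"], augmented=aug1)

trace = []
execute(aug2, (1, 2), trace)
print("\n".join(trace))
```

In this model the initializer lists run from the last augmentation towards the base declaration, while the bodies run in the opposite direction because each body executes only after the recursive call returns.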

Member

> The "evaluate parameter list" here is actually performing "binding actuals to formals", which is the step that handles initializing formals and super parameters.

OK, so we might call this step something like "binding formals to actuals and performing parameter based side effects". (The language specification still says "binding actuals to formals" in the section title, and the order is not consistent, but the NNBD update says "binding formals to actuals" everywhere, as it should.)

The binding of formals to actuals in S is somewhat magical because it relies on being able to bind positional parameters before we know how many of those we have in the superinitializer. Similarly, we don't know the set of names of named parameters. This means that a dynamic type check which would otherwise be performed during this binding of formals to actuals must be done later, when we encounter the syntactic construct which is to become the value of s.

I would prefer a specification where the superinitializer is determined up front, using a declarative specification: "For an augmentation chain d1 .. dk (introductory-to-last-augment) declaring a non-redirecting generative constructor, the superinitializer is the unique occurrence of a superinitializer in one of the declarations d1 .. dk. If there is none then the superinitializer is super(). If there is more than one then a compile-time error occurs." With that, we can start the execution of the constructor augmentation chain knowing exactly which superinitializer it has, and we don't need late variables like s, or apparent run-time errors that won't happen anyway.
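
As a rough illustration of that declarative rule, here is a minimal Python sketch. It assumes a made-up representation in which each declaration of the chain is reduced to its superinitializer text (or `None` if it has none); `find_superinitializer` is a hypothetical helper name, and the raised `ValueError` merely stands in for the compile-time error.

```python
def find_superinitializer(chain):
    """chain: declarations d1..dk, introductory first. Each entry is that
    declaration's superinitializer as a string, or None if it has none."""
    found = [s for s in chain if s is not None]
    if len(found) > 1:
        # Stands in for the compile-time error in the quoted rule.
        raise ValueError("more than one superinitializer in augmentation chain")
    # With no explicit superinitializer, the implicit one is super().
    return found[0] if found else "super()"


print(find_superinitializer([None, "super.named(x)", None]))
print(find_superinitializer([None, None]))
```

Because the result depends only on the declarations, it can be computed once before any execution starts, which is the point of the suggestion: no late variable is needed to carry it through the chain.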

* Base declaration parameter list, initializer list

* Base declaration super-constructor invocation, recurses to this entire list
on superclass.
Member

This would be the invocation of the unique superinitializer, which should happen when the introductory declaration of d has executed its initializer list.

Next, the introductory declaration of d runs its body (if any), and it returns to the first augmenting declaration of d, which will run its body, etc.

@eernstg
Member

eernstg commented Nov 21, 2024

(I'm OOO for a couple of days, I'll return to this on Monday.)

Member

@eernstg eernstg left a comment

I think this is a heroic effort to specify the semantics of generative non-redirecting constructor invocations based on the syntax as specified by the programmer.

I think it almost works.

At the same time, it illustrates that this approach is quite costly in terms of detailed elements that are specified from scratch. It would be very helpful to introduce a number of abstractions (search for 'declarative specification' to see one example).

One of the consequences of this is that we get the surprising evaluation order for initializer list elements (search for 'i7' in the comments to see more). I certainly think we'd deliver a better quality of service if we preserve the evaluation order for a constructor which is declared in one piece.

Another thing I'm worried about is the fact that initializing formals and super-parameters will block further augmentation. I would certainly expect that an initializing formal could be introduced (like A(this.x);) and later augmentations could just confirm its status (like @metadata augment A(this.x);), but this does not seem to be possible.

In other words, every parameter of the form this.x or super.x must occur in the last augmentation. Is it really scalable (and compatible with macros) to have this requirement?

* Add initializers to the initializer list. If the augmenting constructor has
an initializer list then:
* Otherwise the parameter is a normal parameter. Bind the name of the
parameter to *v* in the parameter scope.
Member

I still don't think it is allowed to have A(super.x); and then augment A(int x);. I do see things that imply we can have A(int x); and augment A(super.x); (as you mention), but (as I mentioned in the comments on lines 951 and 962) it seems to be impossible to have further augmentations (which is probably not intended, or at least seems quite inconvenient).

In any case, we should make sure we're careful about the scopes: The wording in line 987-988 implies that the parameter name is bound to v in the body scope of the constructor, but this should not happen in the case where the definition of this parameter is an initializing formal (they are only in scope in the initializer list, not in the body).

* Invoke *d’* on *C\<T1,…,Tn>* with arguments *L*, super-arguments *S*,
and super constructor invocation *s* to initialize *o*.

* *This recurses on the augmented definition*.
Member

That sounds good!

@@ -869,49 +880,187 @@ It is a compile-time error if:

These are probably the most complex constructor, but also the most common.
Member

Suggested change
- These are probably the most complex constructor, but also the most common.
+ These are probably the most complex constructors, but also the most common.

initialize the value *o*.

We then define the invocation of a *constructor definition* as follows, allowing
an augmenting constructor’s definition to delegate to the definition it
Member

I think we could have an augmenting constructor declaration, which is a constructor declaration whose first token is augment.

It would then delegate to the declaration that it augments.

Otherwise, an augmenting constructor declaration d could delegate to the definition of the augmented constructor declaration d1 (which would then be something like the chain from d1 and up to the introductory declaration of a constructor with the same name).

But probably:

Suggested change
- an augmenting constructor’s definition to delegate to the definition it
+ an augmenting constructor’s declaration to delegate to the declaration it

initialize an object *o* (with a runtime type that is known to extend *C* and
implement *C*\<*T1*,..,*Tn*\>):

* Let *d* be the constructor definition named *g* of *C*.
Member

The constructor definition would be an entity that includes all the declarations of constructors named g in all the declarations in the same library (w/parts) named C, ordered textually last-to-first.

It would be helpful to have a definition of this terminology, but I couldn't find it.

Similarly for 'class definition' a few lines earlier.

I've previously considered 'class definition' to denote the semantic entity which is the result of processing the chain of augmentations into a single entity with a semantics that allows us to talk about "an instance member of the class" and other concepts that we've used for many years (and similarly for "X definition" for other X). However, the treatment here seems to imply that we must at least also consider a 'class definition' to be a syntactic entity, consisting of a lot of bits and pieces.


* It is an error if *s* is already initialized.

* Initialize *s* to the *superInitializer* of *d*.
Member

This line is completely inscrutable as long as the nature of s hasn't been defined anywhere. Based on the assumption that s is a late variable whose type is syntax (or some semantic type modeling things like super.name(42, n: true)), it starts to make sense.

However, it is needlessly turning a statically known property ("what's the superinitializer of this constructor definition?") into a run-time adventure.


* Initialize *s* to the *superInitializer* of *d*.

* If *d* has an *augmentedDefinition* *d’*
Member

Considering that the word 'definition' is otherwise used to denote the semantics (or syntax) of a fragmented declaration as a single entity, I'd certainly expect that d' would be a declaration.

Suggested change
- * If *d* has an *augmentedDefinition* *d’*
+ * If *d* has an *augmentedDeclaration* *d’*

* Invoke *d’* on *C\<T1,…,Tn>* with arguments *L*, super-arguments *S*,
and super constructor invocation *s* to initialize *o*.

* *This recurses on the augmented definition*.
Member

By the way, I still don't think we can have an augmented definition, it would be an augmented declaration.

Comment on lines +1032 to +1033
* Invoke the constructor named *g1* on *U* with arguments *S* to
initialize *o*.
Member

The previous "invocation of this semantic function" (line 929) passed the super parameter list and the uninitialized superinitializer as well. I guess this is the proper invocation signature, and line 929 should be adjusted to make the super parameter list and the uninitialized superinitializer "local variables" of the procedure.

jakemac53 added a commit that referenced this pull request Dec 6, 2024
This is a pared-down version of #4164.

It contains only the changes to the proposal we had previously discussed, making augmented have no special meaning in generative constructor bodies.

It does not change rules around initializing formals or super parameters (these must be consistent across the introductory and any augmenting declarations still).

I wanted to separate out this change from the much more complicated explanation of how these constructors should be evaluated.
@jakemac53
Contributor Author

I am closing this PR. A simplified version has been merged which doesn't describe the full process of running a constructor, and just contains the key changes surrounding the use of augmented in constructor bodies (not allowed any more for generative, non-redirecting constructors).

@jakemac53 jakemac53 closed this Dec 9, 2024
3 participants