Add initial draft of static immutability #126

Merged
merged 2 commits into from
Dec 6, 2018
230 changes: 230 additions & 0 deletions working/0125-static-immutability/feature-specification.md
@@ -0,0 +1,230 @@
# Shared immutable objects

[email protected]

Status: Draft

This describes a possible solution for:
- [Communication between isolates](https://github.com/dart-lang/language/issues/124)
- [Building immutable collections](https://github.com/dart-lang/language/issues/117)
- [Unwanted mutation of lists in Flutter](https://github.com/dart-lang/sdk/issues/27755)

## Summary

This describes a way to declare classes that produce deeply immutable object
graphs that are shared across isolates.

## Syntax

We add a section to class headers for expressing class and generic constraints,
along with an "immutable" constraint.

```dart
class Value<T> extends Scalar<T> implements Constant is immutable
where T is immutable {
}
```

Mixin declarations may also be marked `immutable`.

Generic method headers may also express generic constraints.

```dart
foo<S, T, where T is immutable>(Value<T> v) {

}
```

### Alternative syntax 1

Instead of adding constraints, a simpler approach is to add a marker interface
`Immutable`. The property expressed by the constraint `T is immutable` then
becomes expressed by `implements Immutable` in the case of a class, or `T
extends Immutable` in the case of a type variable `T`.
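
As a rough illustration of this alternative, the marker-interface approach can already be written in today's Dart. The sketch below assumes a hypothetical `Immutable` interface (not an existing core library type), along with illustrative `Point`, `Box`, and `unwrap` names. Note that a marker alone enforces none of the static checks, and built-in types such as `int` could not implement it without core library changes.

```dart
// Hypothetical marker interface; it is not part of the Dart core libraries.
abstract class Immutable {}

// A class opts in with `implements Immutable` instead of `is immutable`.
class Point implements Immutable {
  final int x;
  final int y;
  const Point(this.x, this.y);
}

// `T extends Immutable` stands in for the constraint `where T is immutable`.
class Box<T extends Immutable> implements Immutable {
  final T value;
  const Box(this.value);
}

// A generic function with the same bound on its type parameter.
T unwrap<T extends Immutable>(Box<T> box) => box.value;
```

With these declarations, `unwrap(Box(Point(1, 2)))` type-checks, while `Box<List<int>>` is rejected because `List<int>` does not implement the marker.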

### Alternative syntax 2

Instead of adding general constraints, we could expose a dedicated syntax. For
example, this proposal from @yjbanov.

```dart
data Value<data T> extends Scalar<T> {
}

foo<S, data T>(Value<T> v) {
}

```


## Static checking
A class marked with `immutable` is subject to the following additional static
checks.

- Every field in an immutable class (including any superclass fields) must be
final.
- Every field in an immutable class (including any superclass fields) must have
a static type which is immutable.
- Every other class which implements the interface of an immutable class
(including via extension or mixing in) must also be immutable.

The types `int`, `double`, `bool`, `String`, `Type`, and `Symbol` are considered
immutable.
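
To make the second check concrete, the following sketch in present-day Dart (with made-up class names `Money` and `Ledger`) shows why `final` alone is not enough: a final field whose static type is a mutable collection still lets the object graph change. Under the rules above, one would expect `Money` to qualify as immutable and `Ledger` to be rejected, though the exact diagnostics are not specified here.

```dart
// All fields are final and have immutable static types (int, String),
// so this class would presumably pass the proposed checks.
class Money {
  final int cents;
  final String currency;
  const Money(this.cents, this.currency);
}

// The field is final, but List<int> is not an immutable type, so this
// class would presumably be rejected if it were marked immutable.
class Ledger {
  final List<int> entries;
  Ledger(this.entries);
}

// Shows the mutation that finality alone does not prevent.
int entriesAfterMutation() {
  final ledger = Ledger([1, 2]);
  ledger.entries.add(3); // Legal today: only the reference is final.
  return ledger.entries.length;
}
```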

## Generated methods

We may wish to consider automatically generating hashCode and equality methods
Member

This is still dangerous because auto-generated code can only rely on fields, not logical properties (unless we start annotating the logical properties, getters, which should be used for the hash code/equality, and that's just ugly).

We cannot define an automatic equality based on private fields, so any class with a private field and public getter would not work with automatic equality. The class needs to be sealed to ensure that other objects of the same type have the same private members.

Member Author

> This is still dangerous because auto-generated code can only rely on fields, not logical properties

I was a bit concerned about this. @munificent seemed to feel it was a non-issue. Something to think about.

> We cannot define an automatic equality based on private fields

I don't understand why not? Can you elaborate?

Member

Maybe you can, but since nobody knows the order or names of the private members, it is nigh impossible to make another implementation of the same interface that has the same equality/hash-code. So, rather than "we can't", it's probably "we shouldn't". It exposes the (arbitrary) internal implementation in a public method.

Member Author

I don't really understand this. Suppose I have a class. It has private fields. I want to put it in a hash table. What is it that you suggest I do, if not hash the private fields? Make the fields public so that I can hash them? And if it's reasonable for me to do it, why is it not reasonable for a compiler to do it for me?

Member

@lrhn, May 9, 2019

True, it works. That class interface has private fields, so it is effectively unimplementable, so my concern that nobody else can implement the same equality is probably not a real problem.

My concern about hash code based on private fields may (I don't remember any more) have come from subclassing. If you subclass an immutable class with automatic hash codes and private fields in the superclass from another library, then you can't access the private field. You can call super.hashCode, and you might even be able to call super.==, so it could work.

immutable class C {
  final int _i;
  C(this._i);
  // generated:
  int get hashCode => hash1(_i.hashCode);
  bool operator==(Object other) => other is C && _i == other._i;
}
... other library ...
immutable class D extends C {
  final int x;
  D(this.x, int i) : super(i);
  // generate what?
}

Then the hashCode/equality of D would have to be something like:

  int get hashCode => hash2(super.hashCode, x.hashCode);
  bool operator==(Object other) => other is D && x == other.x && super == other;

That could work.
It's still not symmetric: C(1) == D(2, 1) and D(2, 1) != C(1). That's a separate, and general, problem with equality and subclassing a concrete class. You should never subclass a concrete class, but I guess that rule is hard to sell :)

(Or maybe my concern was that the author might not want private fields to be part of the equality, because they are implementation details that may vary without affecting logical equality. I guess they can write their own equalities then).

for immutable classes (possibly with caching of hashCode).
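
For reference, this is roughly what such generated members might look like if written by hand in current Dart. The `Point` class and the use of a `late final` field for caching are illustrative assumptions, not something the proposal specifies; `Object.hash` is the existing core library helper.

```dart
class Point {
  final int x;
  final int y;
  Point(this.x, this.y);

  // Computed on first access and then reused, since the fields never change.
  late final int _cachedHash = Object.hash(x, y);

  @override
  int get hashCode => _cachedHash;

  // Field-by-field equality, the kind a compiler could derive mechanically.
  @override
  bool operator ==(Object other) =>
      other is Point && x == other.x && y == other.y;
}
```

Two `Point` instances with the same field values compare equal and report the same hash code, so they behave as a single key in a `Map` or `Set`.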

We may wish to consider automatically generating functional update methods (or
Member

Again assumes that these classes are simple data objects which only represent their public fields. So far everything has suggested that these are full Dart classes, not data-types.

providing some other form of functional update).
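
One common hand-written form of functional update in Dart today is a `copyWith` method. The sketch below, with an invented `Settings` class, shows the kind of member that might be generated; this is an assumption about the shape of the feature, since the proposal leaves it open.

```dart
class Settings {
  final String theme;
  final int fontSize;
  const Settings({required this.theme, required this.fontSize});

  // Returns a new instance; any argument left out keeps the current value.
  Settings copyWith({String? theme, int? fontSize}) => Settings(
        theme: theme ?? this.theme,
        fontSize: fontSize ?? this.fontSize,
      );
}
```

A known limitation of this pattern is that a nullable field cannot be reset to `null` through `??`, which is one reason a dedicated language feature might be preferable.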

## Allocation of immutable objects

Immutable objects are allocated as usual in an isolate local
nursery. (Alternatively, it might be preferable to maintain a separate isolate
local shared object nursery for allocating only shared objects). However, when
they are tenured, they are tenured to a global heap which is shared by all
isolates in the process, and which is inhabited solely by immutable shared
objects.

The shared object heap cannot have pointers into the isolate local heaps, and so
garbage collection of an isolate local heap does not require coordination with
other isolates.

The isolate local heap can have pointers into the shared global heap, and so
either these must be tracked via write barriers and treated as roots when
collecting the shared global heap, or else collection of the shared global heap
might require cross-isolate coordination.

Tenuring objects into the shared global heap requires locking or pausing
isolates. Bulk reservation of allocation regions could potentially be used to
mitigate this.

Issue: It is possible that a large object may need to be tenured before it has
been fully initialized. This would allow writes into the shared heap. This
should not be problematic semantically since the object cannot be visible in
other isolates prior to initialization, but it may complicate the GC model.
This does not seem deeply problematic - a number of solutions seem plausible.

## Sharing of immutable objects

The SendPort class is extended with a new method `void share<T, where T is
immutable>(T message)` which, given a reference to an immutable object graph,
shares that reference with all receivers of the SendPort. Note that the object
Member

What does "shares" mean?
Is it sent on the port, or is it just prepared for sending?

What class does the object have on the "other side"?
Can it access static state? Other types? Can an immutable object have a method like:

foo() => new _Foo(_staticCounter++);

If the receiving isolate has not imported the library declaring the class of the object, can you send it? If you do, what is its run-time type? Which static state can it access?

If the receiving isolate has imported the same library, what is the run-time type of the received object? We have two different instances of the library (they have different static state), so which static state will the object access?

Member Author

It is sent on the port.

I would expect that it can access static state, and every isolate has its own instance of the static state.

Broadly, to your questions, I would expect isolates to share the same code, hence have the same classes loaded, etc. This is what the (very minimal, vague) docs for Isolate.spawn seem to describe? If there are other ways to create isolates that don't share this property, I would suggest that they not be able to access the shared heap.

Member

You can use Isolate.spawnUri to run arbitrary code that doesn't need to have anything in common with the spawning isolate. You can usually only send JSON-like structures between such isolates.

is not copied since it and all sub-components of it are in the shared heap.

An object which is shared before it has been tenured will likely need to be
tenured when it is shared.

It should be the case that every object is fully initialized before it can be
shared. The intent of the static checks specified above is to guarantee this.

It should be the case that no object that has been shared can be mutated. The
intent of the static checks specified above is to guarantee this.

## Immutable collections

The following additional immutable classes are added to the core libraries:
`ImmutableList` which implements `List`, `ImmutableMap` which implements `Map`,
and `ImmutableSet` which implements `Set`.

### Collection initialization
Instances of these collections may be allocated and assigned to local variables
in a modifiable state. Mutation operations may be performed on such an instance
up until the first point at which the instance escapes (that is, is captured by
a closure, is assigned to another variable or setter, or is passed as a
parameter). It is a static error if a mutation operation is performed on an
Member

So we have "mutate until frozen" collections.
Do we allow that for other, user written, immutable types too?

(We can perhaps get away with this by saying that you can only mutate the object using cascades on the object creation expression, and that mutating methods on an immutable object may not leak this in any way. Still sounds complicated).

Member Author

I raised this question below - if there's need for it we can do it. It's basically a limited use of typestate. How limited depends on how much power we need or don't need.

instance of one of these classes:
- at any point not intra-procedurally dominated by the allocation point of the
instance
- at any point where the instance escapes along any path from the allocation
point to the mutation operation.

Instances that are allocated to initialize fields or top level variables are
always initialized in an unmodifiable state.
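
The proposed "mutate until escape" rule has no direct equivalent in current Dart. The nearest pattern, sketched below with a hypothetical `buildSquares` helper, fills a local mutable list and then freezes it with the existing `List.unmodifiable` constructor, at the cost of one copy that the proposal would avoid.

```dart
List<int> buildSquares(int n) {
  // Local scratch list: mutable, and it never escapes this function.
  final scratch = <int>[];
  for (var i = 0; i < n; i++) {
    scratch.add(i * i);
  }
  // Freezing copies the elements into a new unmodifiable list.
  return List.unmodifiable(scratch);
}
```

Under the proposal, the allocation and the loop would presumably operate on the immutable collection directly, and the static analysis would reject any mutation after the instance escapes.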

Question: Is this functionality needed? With spread collections, many patterns
will be expressible directly as a literal.
Member

Likely not needed, and it's a very fragile functionality, so I'd prefer to avoid it.

Member Author

Maybe. @munificent agrees with you. After years of struggling with vector initialization in functional languages.... I'm a little more skeptical.


Question: Is this sufficient? The analysis as specified is brittle: you cannot
factor out initialization code into a different scope from the allocation. We
could add type level support for tracking uninitialized instances, but this
raises the footprint of this feature substantially.
Member

If you need computation when creating the object, then you can use a factory constructor.
For collections, I do believe comprehensions ("control flow") will be sufficient.

Member

> I do believe comprehensions ("control flow") will be sufficient.

That's my hope too.

Contributor

This needs to be considered in the context of built_collection and protocol buffers.

They use the builder pattern; built_collection currently copies only once when changing to a builder and back again, and I believe this would force it to copy twice.

Protocol buffers currently copy zero times--the 'builder' is 'frozen' and becomes the actual instance--I believe this would force it to copy once.

To support these without extra copying you'd need a way to explicitly call 'freeze' on a mutable collection to make it immutable.


Question: Should this functionality be extended to user classes?

### Runtime immutability
As with the result of the current `List.unmodifiable` constructor, mutation
operations on an instance of an immutable collection shall throw (except in the
limited cases described in the initialization section above). Note that the
static checks described above prevent mutation operations from being accessed on
an instance of immutable type. However, the immutable collections implement
their mutable interfaces, and hence the mutation operations may be reached by
subsuming into the mutable type.
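
The existing `List.unmodifiable` behavior that this section refers to can be observed in current Dart. In the sketch below, `mutationThrows` is an illustrative helper: it receives a value typed as the mutable `List` interface, so the call to `add` compiles, and any failure only shows up at run time.

```dart
// The parameter has the mutable interface type, so `add` is statically allowed.
bool mutationThrows(List<int> list) {
  try {
    list.add(4);
    return false; // The list accepted the mutation.
  } on UnsupportedError {
    return true; // Unmodifiable lists reject mutation at run time.
  }
}
```

Passing `List<int>.unmodifiable([1, 2, 3])` makes the helper return `true`, while an ordinary growable list makes it return `false`. The proposed immutable collections would presumably behave like the former when viewed through `List`.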

### Literals

A collection literal which appears in a context where the static type required
by the context is an immutable collection type shall be allocated as an
immutable collection.

```dart
ImmutableList<int> l = [ 3 ];
```
Question: Do we need additional syntax for the case where a static type context
is not required?

```dart
Member

Yes, please! Immutability is really useful for collections outside of supporting isolates, etc.


True, but if we get the semantics wrong we may find ourselves trying to retrofit immutables onto isolates. So I really prefer that we consider isolates immediately.

Member

Sure. Just saying that it's nice on its own, too.

Member

There is a difference between shallowly unmodifiable and deeply immutable objects, and we probably want both.

Member Author

What does shallow immutability give you that we don't already have with final fields?

var l = ^[3];
```

### Alternative collection approach

Instead of making `ImmutableList` a subtype of `List`, we could make it either
an unrelated type, or a supertype of `List`.
Member

As you say below, it can't be a supertype if subtypes of immutable types must be immutable. If that isn't true, then the type system won't be the thing enforcing immutability.
(It's also a good reason why you should never name a type negatively, because a restriction is usually something you can remove in a sub-class, and new MutableList() is ImmutableList being true is just weird).

We can add a supertype of List called Sequence, and then let ImmutableList implement Sequence. We probably won't because it introduces a new super-type of List that code authors should then change all their read-only List accepting functions to use.

So ImmutableList should either be a subtype of List or be unrelated to List.
The former allows it to be used with code that currently accepts a list.
The latter allows it to have a different API, say with functional updates. I don't think introducing a completely new API is worth it.


#### `ImmutableList` is a supertype
If `ImmutableList` is a supertype of `List`, then immutability is no longer type
based. If we wish to enforce deep immutability, then there would need to be
runtime checks during initialization, which may be expensive (particularly in
the case of collections). Alternatively, we could simply not enforce deep
immutability statically, and instead dynamically traverse an object graph before
sharing it to check for immutability. This is expensive, but perhaps marginally
less so than copying.
Member

The Dartino approach was to have one bit on each object telling whether it's immutable.
(That could also be based on which GC page it is allocated in, or something similar).

When you create a new object using a const constructor, or an .unmodifiable constructor of a system collection, then the bit gets set if all the fields/elements/entries are immutable.
You never need to do a deep check, but you do one more computation for each component object when creating a composite object.

Member Author

Yeah, I thought about writing this up as an alternative proposal, may still do (or feel free). I'm not very enthusiastic about it though. It makes every allocation in the program a little slower (or potentially a lot slower in JS), even if no immutability is used.


Another downside of this approach is that existing APIs that take `List`s but
only read them cannot be re-used with an `ImmutableList`. A wrapper can help
with this.

A benefit of this is that changing APIs (especially Flutter APIs) to take
`ImmutableList` as an argument would be non-breaking.
Member

It's not clear how this would work. A type annotation of ImmutableList should accept the subtype _GrowableList.
If it's not type based, then it should probably not show up in the type system at all.

Member Author

Yes, I don't see a coherent story here, but added it based on discussion for future reference.


#### `ImmutableList` is an unrelated type

If `ImmutableList` is unrelated to `List`, then we have the same issue with
re-using existing APIs. However, we retain all of the benefits of type based
immutability.

## Immutable functions

There is no way to describe the type of an immutable function. If important, we
could add a type for immutable closures. A function is immutable if every free
variable of the function is immutable.
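
As an informal illustration of that definition, compare the two closures below (the names `makeAdder` and `makeCounter` are invented). The first captures only a final `int`, so it would presumably count as immutable; the second captures a local variable that it reassigns, so it would not.

```dart
// Captures only `n`, a final variable of the immutable type int.
int Function(int) makeAdder(final int n) => (x) => x + n;

// Captures `count`, which the closure itself mutates on every call.
int Function() makeCounter() {
  var count = 0;
  return () => ++count;
}
```

Calling the adder with the same argument always gives the same result, whereas successive calls to the counter return different values, which is the kind of hidden state that could not safely be shared across isolates.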

## Immutable top type

There is no top type for immutable types. It might be useful to have a type
`Immutable`, to express the type of fields of immutable objects which are
intended to hold instances of multiple types which do not otherwise share a
common super-interface.

## JavaScript

There are no issues with supporting immutable objects on the web, but the
ability to support communication between isolates is limited. Currently,
isolates are not supported at all in JavaScript. If we revisit that, we are
unlikely to be able to support this in full on the web. It is possible that we
Member

I assume that all Immutable types will work fine in JS – just the isolate APIs won't be available.

I'm sure there will be places where we can generate better JS if we know things are immutable, though – so I still consider this a "feature" – even for web-only code.

CC @rakudrama @jmesserly for comments...

Member Author

Yes, it's the communication that wouldn't work out of the box.


I think cloning objects should be a sufficient polyfill. You don't get the performance benefits from memory sharing, but you still get the benefit of a second isolate (i.e. web worker).

may be able to define a subset of immutable objects which can be implemented as
a layer over shared typed data buffers.