[API Proposal]: Expose Version on List<T> and similar Enumerables #86795

Closed
MHDante opened this issue May 26, 2023 · 4 comments
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Collections

Comments

@MHDante

MHDante commented May 26, 2023

Background and motivation

The internal implementation of List, HashSet and others all contain a _version field.

This is then used to throw an InvalidOperationException ("Collection was modified") during enumeration.

It would be awesome if these versions would be available as a public property.

Reasons why this is useful:

  • I wanna know if the list I passed into a method was modified.
  • I wanna poll the list to see if it's changed since I last touched it.
  • I wanna implement my own funky iterator that throws its own CollectionModifiedException

As far as costs go, this one feels pretty free, given that it's already implemented on most enumerables within System.Collections.Generic. That said, I understand there's a cost of maintenance for any API surface.

Potential candidates for this change would include:

List<T>
HashSet<T>
Queue<T>
Dictionary<TKey, TValue>

perhaps also the non-generic ones.

API Proposal

namespace System.Collections.Generic;

public class List<T> : IEnumerable<T>, [...]
{
    ....

    ///<summary>
    /// A number indicating the current version of the data structure. Increments when the collection is modified.
    /// Not guaranteed to be sequential. 
    ///</summary>
    public int Version => _version;

    ...

}

API Usage

I wanna know if the list I passed into a method was modified

var myStuff = new List<Valuables>(10_000);
var oldVersion = myStuff.Version;
var value = otherObj.Appraise(myStuff);

if(oldVersion != myStuff.Version){
    // complicated iteration over long list
}

I wanna poll the list to see if it's changed since I last touched it.

public class Party {

    public List<Person> Invitees { get; set; }
    private int _version;
    
    public void OnIdCheck(){
        if(_version == Invitees.Version) return;
        // complicated iteration over long list
        _version = Invitees.Version; // remember the version we just processed
    }
}

I wanna implement my own funky iterator that throws its own CollectionModifiedException.

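public struct PrimeNumberListEnumerator<T> : IEnumerator<T> {
 
    ...
    public PrimeNumberListEnumerator(List<T> list) {
        _list = list;
        _index = 0;
        _version = list.Version; // !!! snapshot of the proposed public property
        _current = default;
    }
    
    public bool MoveNext() {
        // Private fields like _version and _size are not visible from outside List<T>,
        // so only the proposed Version property and the public Count are used here.
        if (_version != _list.Version) {
            throw new CollectionModifiedException(); // custom exception type, defined elsewhere
        }
              
        int primeIndex = CachedPrimeSieve.GetPrime(_index);
        if ((uint)primeIndex < (uint)_list.Count)
        {
            _current = _list[primeIndex];
            _index++;
            return true;
        }

        _index = _list.Count + 1;
        _current = default;
        return false;
    }
      ....
}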

Alternative Designs

There's currently ObservableCollection<T>, which can be used for observing collection changes, but its API is push-based. This Version-based approach allows for a polling or pull-based API without requiring a buncha CollectionChanged listeners. Furthermore, it lets the caller decide when to react to the change.

A lot of use cases could be circumvented with careful use of IReadOnlyList and separating input and output lists. The downside of this is Allocating and Copying. All my homies hate Allocating and Copying. Also, there are third-party APIs already shipped that take List as an input and can't be changed.
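For comparison, here is a rough sketch of how a pull-based version counter could be approximated today without touching List<T>. It builds on the overridable hooks of System.Collections.ObjectModel.Collection<T>. The type name VersionedCollection<T> and its Version property are made up for illustration and are not part of any existing API. It also only helps when you control the collection type, so it does nothing for third-party APIs that demand a List<T>.

```csharp
using System.Collections.ObjectModel;

// Hypothetical helper type: not part of the BCL and not part of this proposal.
public class VersionedCollection<T> : Collection<T>
{
    // Changes whenever an item is inserted, replaced, or removed, or the
    // collection is cleared. Wraps around on overflow instead of throwing.
    public int Version { get; private set; }

    private void Bump() => Version = unchecked(Version + 1);

    protected override void InsertItem(int index, T item)
    {
        base.InsertItem(index, item);
        Bump();
    }

    protected override void SetItem(int index, T item)
    {
        base.SetItem(index, item);
        Bump();
    }

    protected override void RemoveItem(int index)
    {
        base.RemoveItem(index);
        Bump();
    }

    protected override void ClearItems()
    {
        base.ClearItems();
        Bump();
    }
}
```

A caller would snapshot Version before handing the collection off and compare afterwards, the same way as in the API Usage snippets above. The cost is a virtual call per mutation, which List<T> avoids.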

Risks

  1. Some runtimes may use other arcane methods for detecting modification of such structures (memhash?), and are not guaranteed to have _version already implemented.

  2. This number could overflow. Currently it's implemented as an int everywhere, and it's not possible to change due to binary serialization.

@MHDante MHDante added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label May 26, 2023
@ghost ghost added the untriaged New issue has not been triaged by the area owner label May 26, 2023
@ghost

ghost commented May 26, 2023

Tagging subscribers to this area: @dotnet/area-system-collections
See info in area-owners.md if you want to be subscribed.


@Clockwork-Muse
Contributor

See #81523, where the proposal is to remove the version checks.

Part of the problem with this is that the modification exception isn't guaranteed to be thrown, because the version isn't always incremented. If we exposed this, we would have to make that behavior guaranteed.
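One case that seems to fit this description, as far as I know: on .NET Core 3.0 and later, Dictionary<TKey, TValue>.Remove and Clear are documented as not invalidating enumerators, while List<T>.Add still does. The sketch below is only an illustration of that difference. The class and method names are made up, and the Dictionary result depends on the runtime (older .NET Framework versions do throw).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo type, for illustration only.
public static class VersionCheckDemo
{
    // Returns true if adding to a List<T> during foreach raised
    // InvalidOperationException ("Collection was modified").
    public static bool ListAddThrows()
    {
        var list = new List<int> { 1, 2, 3 };
        try
        {
            foreach (var item in list)
            {
                if (item == 1) list.Add(4); // changes the internal version
            }
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    // Returns true if removing from a Dictionary during foreach raised
    // InvalidOperationException. Assumption: on .NET Core 3.0+ it does not.
    public static bool DictionaryRemoveThrows()
    {
        var map = new Dictionary<int, string> { [1] = "a", [2] = "b", [3] = "c" };
        try
        {
            foreach (var pair in map)
            {
                map.Remove(pair.Key);
            }
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }
}
```

If a public Version were tied to the same counter, a caller polling it would miss those removals, which is presumably why exposing it would mean guaranteeing the increment everywhere.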

@nth-commit

nth-commit commented May 27, 2023

I think using either IReadOnlyList (which is zero cost) or ImmutableList (which is lower cost than copying lists or doing sequence equals) will solve your problems.

If a third-party library is mutating your lists unexpectedly, then it's a bad library.

@eiriktsarpalis
Member

Given #81523 I don't think we would ever consider doing this.

@eiriktsarpalis eiriktsarpalis closed this as not planned May 29, 2023
@ghost ghost removed the untriaged New issue has not been triaged by the area owner label May 29, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Jun 28, 2023
4 participants