[API Proposal]: Predefined InlineArrayX<T> types up to some X #111973

Open
333fred opened this issue Jan 29, 2025 · 25 comments
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Runtime.CompilerServices untriaged New issue has not been triaged by the area owner

Comments

@333fred
Member

333fred commented Jan 29, 2025

Background and motivation

Today, Roslyn will emit anonymous inline array types when users call methods that take params ReadOnlySpan in params form, or when using collection expressions in some fashion. There can be a significant number of these types; for example, @jaredpar found that in Microsoft.AspNetCore.App, there are ~110 such types. In Microsoft.NetCore.App, there are ~136 such types. Most of these types are 5 elements or smaller, so I'm only proposing up to InlineArray5. However, we may want to consider going further; dotnet/roslyn#74538 asks for Roslyn to emit calls to string.Concat(ReadOnlySpan<string>), which may push the numbers here up more. I'll leave it to the BCL designers to decide what number is right, and Roslyn will take advantage of whatever is available.

API Proposal

namespace System.Runtime.CompilerServices;

[InlineArray(1)]
public struct InlineArray1<T>
{
    private T t;
}
[InlineArray(2)]
public struct InlineArray2<T>
{
    private T t;
}
[InlineArray(3)]
public struct InlineArray3<T>
{
    private T t;
}
[InlineArray(4)]
public struct InlineArray4<T>
{
    private T t;
}
[InlineArray(5)]
public struct InlineArray5<T>
{
    private T t;
}

API Usage

var finalString = string.Concat(str1, str2, str3, str4, str5); // The compiler uses InlineArray5<string> behind the scenes
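As a rough illustration of what "behind the scenes" could look like, the sketch below builds the span by hand from a user-declared inline array. Buffer5<T> and InlineArrayDemo are hypothetical names standing in for the proposed InlineArray5<T>; the lowering Roslyn actually emits may differ in detail. It assumes C# 12 and .NET 8 or later.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Text;

// Hypothetical stand-in for the proposed InlineArray5<T>.
[InlineArray(5)]
public struct Buffer5<T>
{
    private T _element0; // the runtime lays this single field out 5 times
}

public static class InlineArrayDemo
{
    public static string ConcatFive(string a, string b, string c, string d, string e)
    {
        // Stack storage for the five arguments; no heap array is allocated.
        Buffer5<string> buffer = default;
        buffer[0] = a;
        buffer[1] = b;
        buffer[2] = c;
        buffer[3] = d;
        buffer[4] = e;

        // C# 12 converts an inline array variable to a span implicitly.
        ReadOnlySpan<string> span = buffer;

        // A span-taking callee would consume 'span' here; a simple loop
        // stands in for it so the sketch does not depend on newer overloads.
        var builder = new StringBuilder();
        foreach (string s in span)
        {
            builder.Append(s);
        }
        return builder.ToString();
    }
}
```

With a predefined type in the BCL, every assembly could share one definition instead of each carrying its own synthesized copy of something like Buffer5<T>.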

Alternative Designs

No response

Risks

No response

@333fred 333fred added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Jan 29, 2025
@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Jan 29, 2025
@333fred
Member Author

333fred commented Jan 29, 2025

/cc @stephentoub

@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Jan 29, 2025
@333fred 333fred added area-System.Runtime.CompilerServices untriaged New issue has not been triaged by the area owner and removed untriaged New issue has not been triaged by the area owner needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Jan 29, 2025
Contributor

Tagging subscribers to this area: @dotnet/area-system-runtime-compilerservices
See info in area-owners.md if you want to be subscribed.

@jaredpar
Member

Here is the histogram breakdown of usage:

Microsoft.NetCore.App

0:  (0)
1:  (0)
2: ***************************************************************** (65)
3: ******************************************** (44)
4: ************************* (25)
5: ** (2)
6:  (0)
7: * (1)
8: * (1)
9:  (0)
10:  (0)

Microsoft.AspNetCore.App

0:  (0)
1:  (0)
2: ********************************* (33)
3: ******************************** (32)
4: ******************************************* (43)
5: **** (4)
6: *** (3)
7: ** (2)
8:  (0)
9:  (0)
10: * (1)
11:  (0)
12: * (1)
13:  (0)

@stephentoub
Member

In the data Jared shared with me, 95% of the cases were covered by 2, 3, and 4, and there were no occurrences with 1. As this is mainly a size optimization, I'd be inclined to start small and only add the types that have the biggest bang for the buck. We can always add more later as needed.

@stephentoub
Member

@333fred, how would the compiler decide what types are available to it to use? Is it based on a naming scheme? Anything that's generic and an InlineArray of the right size?

@hez2010
Contributor

hez2010 commented Jan 29, 2025

Oh no... How about doing this for them all: #89730

@jaredpar
Member

... and there were no occurrences with 1.

In the case of a single element collection expression targeting Span<T>, the compiler uses the Span<T>(ref T) constructor. That means the compiler can put a T local on the stack and use that for storage. I may be missing a case, but I'm pretty sure this allows us to avoid a one element inline array whenever targeting .NET 7+.
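A minimal sketch of that single-element case, written by hand rather than taken from compiler output. SingleElementSpanDemo and its methods are hypothetical names; the only API relied on is the public Span<T>(ref T) constructor available since .NET 7.

```csharp
using System;

public static class SingleElementSpanDemo
{
    public static int SumOne(int value)
    {
        // 'value' lives on the stack; wrapping it in a span needs
        // no inline array type and no heap allocation.
        Span<int> span = new Span<int>(ref value);
        return Sum(span);
    }

    private static int Sum(ReadOnlySpan<int> items)
    {
        int total = 0;
        foreach (int item in items)
        {
            total += item;
        }
        return total;
    }
}
```

This is why the histograms show no one-element inline arrays: a plain local is enough to back a span of length 1.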

@hamarb123
Contributor

hamarb123 commented Jan 29, 2025

I'd suspect that up to 16, and then a few powers of 2 after that should pretty much cover the vast majority of projects for all time, so that would be good. Not only for compiler emitted ones, but also for ones an advanced developer might write themselves (I certainly would switch to using these over ones I have defined myself where possible).

Would be good if we could add where T : allows ref struct as well, even though that part would not work from C# directly today, as these would ideally also be used in lowering [refStruct1, refStruct2] -> ReadOnlySpan<MyRefStructType> if and when that is supported.

@333fred
Member Author

333fred commented Jan 29, 2025

how would the compiler decide what types are available to it to use

Likely we'd look for all types in a specific namespace with the attribute and a specific naming pattern, but we can be pretty flexible. I'm not particularly attached to the proposed names, there just needs to be a pattern.

@alrz
Member

alrz commented Jan 30, 2025

Oh no... How about doing this for them all: #89730

I think initially T[N] type was considered to lower to a compiler-generated InlineArray. Not sure where it landed.

@hez2010
Contributor

hez2010 commented Jan 30, 2025

I think initially T[N] type was considered to lower to a compiler-generated InlineArray. Not sure where it landed.

Compiler synthesized type is not suitable for exposing through public ABI. This would make data exchange extremely difficult.

@alrz
Member

alrz commented Jan 30, 2025

I think initially T[N] type was considered to lower to a compiler-generated InlineArray. Not sure where it landed.

Compiler synthesized type is not suitable for exposing through public ABI. This would make data exchange extremely difficult.

How is that any different than using an explicit InlineArray type declaration? Provided the type name is speakable (though I don't think that's needed if there's syntax for it), then the compiler could reuse it if it already exists, effectively leading to the same outcome as this proposal.

@hamarb123
Contributor

hamarb123 commented Jan 30, 2025

How is that any different than using an explicit InlineArray type declaration? Provided the type name is speakable, I think the compiler could reuse it if it already exists, effectively leading to the same outcome as this proposal.

It needs to not only be speakable, but also permanently stable to be used in public API definitions. Implicitly generated types cannot achieve this, because each library may have its own definitions, which will not get unified; otherwise you'd have to define a type for every single size that you may want (the const generics option does this implicitly by moving the quantity to a generic parameter). Regardless, this proposal was made for reducing the number of synthesised definitions in dlls used for collection expressions (including especially params), not for allowing T[N].

@hez2010
Contributor

hez2010 commented Jan 31, 2025

Alternatively we can introduce some basic hex digit types under CompilerServices, for example,

interface IHex
{
    abstract static int Value { get; }
}

struct Hex0 : IHex
{
    public static int Value => 0;
}

struct Hex1 : IHex
{
    public static int Value => 1;
}

struct Hex2 : IHex
{
    public static int Value => 2;
}

// ... same for Hex3 to HexF

interface IValue<T>
{
    abstract static T Value { get; }
}

struct Int32<H7, H6, H5, H4, H3, H2, H1, H0> : IValue<int>
    where H7 : IHex
    where H6 : IHex
    where H5 : IHex
    where H4 : IHex
    where H3 : IHex
    where H2 : IHex
    where H1 : IHex
    where H0 : IHex
{
    public static int Value
    {
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        get => H7.Value << 28 | H6.Value << 24 | H5.Value << 20 | H4.Value << 16 | H3.Value << 12 | H2.Value << 8 | H1.Value << 4 | H0.Value;
    }
}

Then we can add a new generic type InlineArray<T, TSize> where TSize : IValue<int>:

struct InlineArray<T, TSize> where TSize : IValue<int>
{
    private T elem;
}

and the runtime repeats the field TSize.Value times.

i.e. instead of adding predefined InlineArrayX<T>, we add predefined number types and use them in InlineArray<T, TSize>.

In this way we can use arbitrary sized InlineArray with a unified type and interfaces. It embeds the number in the type so that it can be evaluated at JIT compile time as well:

var arr = new InlineArray<int, Int32<Hex0, Hex0, Hex0, Hex0, Hex0, Hex0, Hex0, HexF>>(); // InlineArray of int with size 15

This won't introduce any breaking changes to our ABI, and can still be efficient as well. And at runtime we can easily get its value by calling TSize.Value, which will be folded to constant value by the JIT so it's very efficient.

And this supports the full range of int32 while adding only 16+3=19 types to the BCL, and unused digit types can be easily trimmed away.

Take the example of Microsoft.NetCore.App:

0:  (0)
1:  (0)
2: ***************************************************************** (65)
3: ******************************************** (44)
4: ************************* (25)
5: ** (2)
6:  (0)
7: * (1)
8: * (1)
9:  (0)
10:  (0)

In this case we will instantiate Int32<Hex0, Hex0, Hex0, Hex0, Hex0, Hex0, Hex0, Hex2>, Int32<Hex0, Hex0, Hex0, Hex0, Hex0, Hex0, Hex0, Hex3>, Int32<Hex0, Hex0, Hex0, Hex0, Hex0, Hex0, Hex0, Hex4>, Int32<Hex0, Hex0, Hex0, Hex0, Hex0, Hex0, Hex0, Hex5>, Int32<Hex0, Hex0, Hex0, Hex0, Hex0, Hex0, Hex0, Hex7>, Int32<Hex0, Hex0, Hex0, Hex0, Hex0, Hex0, Hex0, Hex8>, which ends up being only Hex0 + Hex7 + 6 Int32 instantiations + two interfaces = 10 types. But with compiler synthesized types, we end up with 65+44+25+2+1+1=138 types, which is far more bloated.

And! This approach also unblocks the scenarios for exchanging data across managed ABI because now we are using unified types.

In C# we can then allow users to write 15 directly instead of Int32<Hex0, Hex0, Hex0, Hex0, Hex0, Hex0, Hex0, HexF>, and lower 15 to Int32<Hex0, Hex0, Hex0, Hex0, Hex0, Hex0, Hex0, HexF> under the hood.
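To make the lowering concrete, here is a small sketch that spells out which Int32<...> instantiation a given size would map to under this scheme. HexLowering and ToInt32TypeName are hypothetical helpers written for illustration only; they just format the eight hex digits of the value, most significant first, and are not part of any proposal.

```csharp
using System;
using System.Linq;

public static class HexLowering
{
    // Returns the name of the Int32<...> instantiation that would encode
    // 'size' in the hex digit scheme described above.
    public static string ToInt32TypeName(int size)
    {
        // "X8" pads to exactly 8 uppercase hex digits, e.g. 260 -> "00000104".
        string hexDigits = size.ToString("X8");

        // Each digit becomes one type argument: '0' -> Hex0, 'F' -> HexF.
        return "Int32<" + string.Join(", ", hexDigits.Select(d => "Hex" + d)) + ">";
    }
}
```

For example, ToInt32TypeName(15) gives Int32<Hex0, Hex0, Hex0, Hex0, Hex0, Hex0, Hex0, HexF>, the same type written out by hand in the comment above.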

If we ever want to extend it to support other number types as well, it can be easily extended. For example, floating point numbers:

struct Float32<H7, H6, H5, H4, H3, H2, H1, H0> : IValue<float>
    where H7 : IHex
    where H6 : IHex
    where H5 : IHex
    where H4 : IHex
    where H3 : IHex
    where H2 : IHex
    where H1 : IHex
    where H0 : IHex
{
    public static float Value
    {
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        get => Unsafe.BitCast<int, float>(H7.Value << 28 | H6.Value << 24 | H5.Value << 20 | H4.Value << 16 | H3.Value << 12 | H2.Value << 8 | H1.Value << 4 | H0.Value);
    }
}

@tannergooding
Member

Alternatively we can introduce some basic hex digit types under CompilerServices, for example,

This would be a lot of expense for something where typical real world usages need 2-4 members, maybe 2-7.

These types aren't really being added for broad general usage (hence the namespace choice), but rather for internal compiler usages as part of things like params Span<object>, allowing that to be allocation free, and allowing that to be trimmable without introducing unnecessary metadata bloat.

A general purpose set of InlineArray types would likely exist elsewhere (maybe System.Collections.Generic or System.Runtime.InteropServices for example). However, it is less meaningful for typical applications; particularly as it primarily applies to interop or very perf centric scenarios.


Proposals like this one are balancing real world considerations, not simply what is nice to have. They are weighing how it will get used in practice, the impact of doing the smaller feature for the immediate need, etc.

Things like integral generics or exposing some entirely new way to define inline arrays are very expensive and the benefits don't justify the costs, especially for something like what this proposal is trying to address.

@hez2010
Contributor

hez2010 commented Jan 31, 2025

Things like integral generics or exposing some entirely new way to define inline arrays are very expensive and the benefits don't justify the costs, especially for something like what this proposal is trying to address.

We can keep them as internal types so that they won't be exposed to users directly, while only compilers can use them. This would address the issue of InlineArray for compiler use in a general way while not exposing the complexity of integral generics to users.

@tannergooding
Member

We can keep them as internal types so that they won't be exposed to users directly, while only compilers can use them.

That still has significant cost and has its own other issues, especially in using and leaking internal types across assembly boundaries when the appropriate IVTs or similar aren't in place.

Just implementing integral generics, getting language support online, etc is a significant effort -- and this goes far beyond the prototype that you wrote, as it gets into design considerations how it extends the future of .NET/C#, etc which will take months of up front effort.

None of that is justified for the scenario that is driving the proposal here; it's way too much work/effort for something that is trivially solvable today almost for free.

@hez2010
Contributor

hez2010 commented Jan 31, 2025

None of that is justified for the scenario that is driving the proposal here; it's way too much work/effort for something that is trivially solvable today almost for free.

I don't think the current proposal would solve any issue. It will only mitigate the issue temporarily. We will end up needing more and more InlineArrayX<T> types, and it would only make things worse as time goes by, especially as we are now promoting collection expressions, which use an inline array to build the collection and turn it into a Span, so the need will be endless. And finally, if in the future we want to add integral types to the BCL, we will end up with a lot of useless types remaining in the BCL.

My opinion is that we either do nothing here, or do something after we have those integral types.

@tannergooding
Member

tannergooding commented Jan 31, 2025

And finally, if in the future we want to add integral types to the BCL, we will end up with a lot of useless types remaining in the BCL.

They won't be useless and are unlikely to be replaced in the near future. The cost of waiting is significantly worse to the ecosystem.

We will end up needing more and more InlineArrayX types, and it would only make things worse as time goes by.

This is unlikely. The Roslyn team has already done some real world analysis on what sizes are actually found in production. The general .NET team is able to do more analysis as well, but it will likely come to the same conclusion.

Things like what's used in interop are also not broad enough to be as big of a concern. It isn't something that is being implicitly done dozens of times for every compilation.

All of TerraFX.Interop.Windows, which is one of the larger and more comprehensive binding libraries out there, has 1023 InlineArray usages (and is using it for everything, including places that fixed int x[...] was previously applicable).

For the 1023 definitions we have:
[InlineArray(32)]   : 107
[InlineArray(260)]  : 104
[InlineArray(16)]   : 102
[InlineArray(8)]    : 94
[InlineArray(2)]    : 92
[InlineArray(256)]  : 78
[InlineArray(4)]    : 59
[InlineArray(3)]    : 55
[InlineArray(128)]  : 42
[InlineArray(64)]   : 39
[InlineArray(6)]    : 30
[InlineArray(20)]   : 16
[InlineArray(80)]   : 15
[InlineArray(512)]  : 14

[InlineArray(5)]    : 12
[InlineArray(12)]   : 12
[InlineArray(261)]  : 12

[InlineArray(36)]   : 9

[InlineArray(7)]    : 7
[InlineArray(30)]   : 7

[InlineArray(10)]   : 6
[InlineArray(2001)] : 6

[InlineArray(14)]   : 4
[InlineArray(15)]   : 4
[InlineArray(34)]   : 4
[InlineArray(40)]   : 4
[InlineArray(48)]   : 4
[InlineArray(132)]  : 4

[InlineArray(9)]    : 3
[InlineArray(18)]   : 3
[InlineArray(24)]   : 3
[InlineArray(39)]   : 3
[InlineArray(96)]   : 3
[InlineArray(240)]  : 3
[InlineArray(2084)] : 3

[InlineArray(28)]   : 2
[InlineArray(43)]   : 2
[InlineArray(50)]   : 2
[InlineArray(60)]   : 2
[InlineArray(68)]   : 2
[InlineArray(112)]  : 2
[InlineArray(176)]  : 2
[InlineArray(263)]  : 2
[InlineArray(504)]  : 2
[InlineArray(780)]  : 2
[InlineArray(1025)] : 2
[InlineArray(1040)] : 2
[InlineArray(4056)] : 2
[InlineArray(4096)] : 2

[InlineArray(11)]   : 1
[InlineArray(17)]   : 1
[InlineArray(22)]   : 1
[InlineArray(25)]   : 1
[InlineArray(26)]   : 1
[InlineArray(31)]   : 1
[InlineArray(38)]   : 1
[InlineArray(41)]   : 1
[InlineArray(44)]   : 1
[InlineArray(70)]   : 1
[InlineArray(72)]   : 1
[InlineArray(129)]  : 1
[InlineArray(130)]  : 1
[InlineArray(152)]  : 1
[InlineArray(162)]  : 1
[InlineArray(196)]  : 1
[InlineArray(200)]  : 1
[InlineArray(255)]  : 1
[InlineArray(257)]  : 1
[InlineArray(296)]  : 1
[InlineArray(304)]  : 1
[InlineArray(312)]  : 1
[InlineArray(402)]  : 1
[InlineArray(516)]  : 1
[InlineArray(520)]  : 1
[InlineArray(768)]  : 1
[InlineArray(1140)] : 1
[InlineArray(1744)] : 1
[InlineArray(1808)] : 1
[InlineArray(1952)] : 1
[InlineArray(2048)] : 1
[InlineArray(4076)] : 1

So you can see even here it's primarily small powers of 2 and random outliers unique to a given scenario.

I also maintain several other prominent interop binding libraries, and they are all similar in nature. There isn't a significant number of InlineArray in the first place and where they do exist the minor duplication isn't going to be a concern. Such usages are also prominent, typically public, and never changing. This is the complete opposite of the implicit usage the compiler will have where it is a hidden implementation detail that ends up being in most compilations and so where it does become a concern.

@hez2010
Contributor

hez2010 commented Jan 31, 2025

All of TerraFX.Interop.Windows, which is one of the larger and more comprehensive binding libraries out there, has 1023 InlineArray usages (and is using it for everything, including places that fixed int x[...] was previously applicable).

Let's do a simple calculation, if we use a unified InlineArray<T, TSize> type, and use those hex digit types to form a number type, we only need these instantiations:

InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex2,Hex0>> // [InlineArray(32)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex1,Hex0,Hex4>> // [InlineArray(260)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex1,Hex0>> // [InlineArray(16)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex8>> // [InlineArray(8)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex2>> // [InlineArray(2)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex1,Hex0,Hex0>> // [InlineArray(256)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex4>> // [InlineArray(4)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex3>> // [InlineArray(3)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex8,Hex0>> // [InlineArray(128)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex4,Hex0>> // [InlineArray(64)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex6>> // [InlineArray(6)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex1,Hex4>> // [InlineArray(20)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex5,Hex0>> // [InlineArray(80)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex2,Hex0,Hex0>> // [InlineArray(512)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex5>> // [InlineArray(5)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,HexC>> // [InlineArray(12)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex1,Hex0,Hex5>> // [InlineArray(261)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex2,Hex4>> // [InlineArray(36)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex7>> // [InlineArray(7)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex1,HexE>> // [InlineArray(30)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,HexA>> // [InlineArray(10)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex7,HexD,Hex1>> // [InlineArray(2001)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,HexE>> // [InlineArray(14)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,HexF>> // [InlineArray(15)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex2,Hex2>> // [InlineArray(34)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex2,Hex8>> // [InlineArray(40)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex3,Hex0>> // [InlineArray(48)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex8,Hex4>> // [InlineArray(132)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex9>> // [InlineArray(9)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex1,Hex2>> // [InlineArray(18)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex1,Hex8>> // [InlineArray(24)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex2,Hex7>> // [InlineArray(39)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex6,Hex0>> // [InlineArray(96)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,HexF,Hex0>> // [InlineArray(240)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex8,Hex2,Hex4>> // [InlineArray(2084)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex1,HexC>> // [InlineArray(28)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex2,HexB>> // [InlineArray(43)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex3,Hex2>> // [InlineArray(50)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex3,HexC>> // [InlineArray(60)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex4,Hex4>> // [InlineArray(68)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex7,Hex0>> // [InlineArray(112)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,HexB,Hex0>> // [InlineArray(176)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex1,Hex0,Hex7>> // [InlineArray(263)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex1,HexF,Hex8>> // [InlineArray(504)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex3,Hex0,HexC>> // [InlineArray(780)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex4,Hex0,Hex1>> // [InlineArray(1025)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex4,Hex1,Hex0>> // [InlineArray(1040)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,HexF,HexD,Hex8>> // [InlineArray(4056)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex1,Hex0,Hex0,Hex0>> // [InlineArray(4096)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,HexB>> // [InlineArray(11)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex1,Hex1>> // [InlineArray(17)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex1,Hex6>> // [InlineArray(22)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex1,Hex9>> // [InlineArray(25)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex1,HexA>> // [InlineArray(26)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex1,HexF>> // [InlineArray(31)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex2,Hex6>> // [InlineArray(38)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex2,Hex9>> // [InlineArray(41)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex2,HexC>> // [InlineArray(44)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex4,Hex6>> // [InlineArray(70)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex4,Hex8>> // [InlineArray(72)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex8,Hex1>> // [InlineArray(129)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex8,Hex2>> // [InlineArray(130)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,Hex9,Hex8>> // [InlineArray(152)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,HexA,Hex2>> // [InlineArray(162)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,HexC,Hex4>> // [InlineArray(196)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,HexC,Hex8>> // [InlineArray(200)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex0,HexF,HexF>> // [InlineArray(255)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex1,Hex0,Hex1>> // [InlineArray(257)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex1,Hex2,Hex8>> // [InlineArray(296)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex1,Hex3,Hex0>> // [InlineArray(304)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex1,Hex3,Hex8>> // [InlineArray(312)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex1,Hex9,Hex2>> // [InlineArray(402)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex2,Hex0,Hex4>> // [InlineArray(516)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex2,Hex0,Hex8>> // [InlineArray(520)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex3,Hex0,Hex0>> // [InlineArray(768)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex4,Hex7,Hex4>> // [InlineArray(1140)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex6,HexD,Hex0>> // [InlineArray(1744)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex7,Hex1,Hex0>> // [InlineArray(1808)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex7,HexA,Hex0>> // [InlineArray(1952)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,Hex8,Hex0,Hex0>> // [InlineArray(2048)]
InlineArray<T, Int32<Hex0,Hex0,Hex0,Hex0,Hex0,HexF,HexE,HexC>> // [InlineArray(4076)]

Number of distinct hex digit types: 16
Number of distinct InlineArray types: 1

Number of TypeDef: 16+1=17, for all scenarios.

Even if we consider all 81 instantiations and their type definition of InlineArray for the NativeAOT scenario, we still only need 81+17=98 types.

While currently we need 107+104+102+94+92+78+59+55+42+39+30+16+15+14+12+12+12+9+7+7+6+6+4+4+4+4+4+4+3+3+3+3+3+3+3+2+2+2+2+2+2+2+2+2+2+2+2+2+2+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1=1023 types.

If we add predefined InlineArrayX<T> to cover them all, we will need 81 distinct types. 81 >> 17, and there's not much difference between 81 and 98.

So apparently this won't introduce any meaningful regression for NativeAOT, and will be a big win for JIT scenario and managed binary size.

Besides, we can easily get the number of elements of an InlineArray by doing a constrained call to TSize.Value, which the JIT can already fold into a constant today. The alternative is computing Unsafe.SizeOf<Container>() / Unsafe.SizeOf<T>(), which involves unsafe APIs and can even be wrong due to padding and alignment.
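For comparison, this is roughly how the element count of an attribute-based inline array can be recovered today. FourInts and InlineArrayLength are hypothetical names used only for this sketch. The size division is shown for an int element type, where it gives the expected answer; reading InlineArrayAttribute.Length through reflection is an alternative that does not depend on layout, at the cost of a metadata lookup.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;

// Hypothetical 4-element inline array of int (requires .NET 8 or later).
[InlineArray(4)]
public struct FourInts
{
    private int _element0;
}

public static class InlineArrayLength
{
    // The size-based calculation discussed above: container size
    // divided by element size.
    public static int FromSizes() =>
        Unsafe.SizeOf<FourInts>() / Unsafe.SizeOf<int>();

    // Reads the declared length from the attribute instead.
    public static int FromAttribute() =>
        typeof(FourInts).GetCustomAttribute<InlineArrayAttribute>()!.Length;
}
```

Both calls return 4 for FourInts. Neither embeds the length in the type identity the way a TSize type argument would.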

@tannergooding
Member

Let's do a simple calculation, if we use a unified InlineArray<T, TSize> type, and use those hex digit types to form a number type, we only need these instantiations:

Something like TerraFX wouldn't use such a type, it hurts UX. The point of me sharing the numbers was to give more representation to how rare the need is beyond the common cases already called out by this proposal.

The entire approach of Int32<Hex....>, additionally, would require significantly more runtime work to enable and comes with its own complications and downsides. It also wouldn't work on existing in-support runtimes, so it isn't a practical solution to the problem here. It's adding needless complexity and overhead to something which is trivially handled for the 90+% of cases by this proposal.

@hez2010
Contributor

hez2010 commented Jan 31, 2025

The entire approach of Int32<Hex....>, additionally, would require significantly more runtime work to enable and comes with its own complications and downsides. It also wouldn't work on existing in-support runtimes, so it isn't a practical solution to the problem here.

They are supported by today's runtime and we don't need any additional work to support such integral types being used. You can define those types and use them today without any issue.

The only work we need to do is to calculate the value of a given Int32<> type from its instantiation when computing the type layout of an inline array. This is even cheaper than parsing the value from custom attributes: we already have the TypeHandle with its TypeInst, so no further metadata lookup is needed. We can simply get the value by iterating over typeInsts and combining the digits, and the result can even be cached in advance when the type is instantiated. The process is also not recursive: each digit type in typeInsts is a TypeRef that can't hold another TypeSpec, so the whole process can be very cheap.

@tannergooding
Member

They are supported by today's runtime and we don't need any additional work to support such integral types being used. You can define those types and use them today without any issue.

The consideration is the work required to integrate everything into the VM and the JIT such that the type definition and layout is correct, understood, handled as expected from the ABI level, etc. You're effectively defining another alternative to [InlineArray(...)], one that has worse UX, worse understandability, and more potential for issues. All the work around supporting [InlineArray] then needs to be repeated with additional work on top because it's a net new mechanism. That then also has to be picked up by languages, tools, and beyond.

This discussion is going around in circles at this point, so let's drop it. I've iterated why the alternatives are not likely to happen.

@hez2010
Contributor

hez2010 commented Jan 31, 2025

one that has worse UX, worse understandability

The UX issue can be resolved with the help of the language, by interpreting the Int32<> type as its value in source code. So you can write InlineArray<T, 15> in C# and it gets lowered to InlineArray<T, Int32<Hex0, Hex0, Hex0, Hex0, Hex0, Hex0, Hex0, HexF>>.
I don't think it's any different from the ValueTuple<> we have today, not to mention that ValueTuple is even recursive, while Int32<> is not.

This discussion is going around in circles at this point, so let's drop it.

Agree. It's kind of out-of-scope for this specific proposal. I may open a separate API proposal (and maybe language proposal as well) for this.

I've iterated why the alternatives are not likely to happen.

I don't think we should stop moving forward in this field :)

@aromaa
Contributor

aromaa commented Feb 2, 2025

Is this ask motivated around the assembly level IL bloat or more around the runtime overhead and trimmer's ability to do its job?

If the latter, I could see a more general approach that can be extended to all compiler-generated types with the help of the type system. As these types are already "unspeakable", the compiler could mark the appropriate types as having "no identity", so that the type system is free to merge and re-use already loaded types that also have "no identity".

While this approach is more expensive, it does extend the scope and as the compiler is ramping up on generating these assembly level types, the savings can add up.


9 participants