[API Proposal]: Predefined InlineArrayX<T> types up to some X #111973
Comments
/cc @stephentoub |
Tagging subscribers to this area: @dotnet/area-system-runtime-compilerservices |
Here is the histogram breakdown of usage for Microsoft.NetCore.App and Microsoft.AspNetCore.App. |
In the data Jared shared with me, the 95% cases were covered by 2, 3, and 4, and there were no occurrences with 1. As this is mainly a size optimization, I'd be inclined to start small and only add the types that have the biggest bang for the buck. We can always add more later as needed. |
@333fred, how would the compiler decide what types are available to it to use? Is it based on a naming scheme? Anything that's generic and an InlineArray of the right size? |
Oh no... How about doing this for them all: #89730 |
In the case of a single element collection expression targeting |
I'd suspect that up to Would be good if we could add |
Likely we'd look for all types in a specific namespace with the attribute and a specific naming pattern, but we can be pretty flexible. I'm not particularly attached to the proposed names, there just needs to be a pattern. |
I think initially |
Compiler synthesized type is not suitable for exposing through public ABI. This would make data exchange extremely difficult. |
How is that any different than using an explicit InlineArray type declaration? Provided the type name is speakable (though I don't think that's needed if there's syntax for it), then the compiler could reuse it if it already exist, effectively leading to the same outcome of this proposal. |
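For readers unfamiliar with the feature under discussion, here is a minimal sketch of an explicit inline array declaration. It assumes .NET 8 and C# 12 or later. The name `InlineArray4<T>` only follows the `InlineArrayX<T>` pattern from the issue title and is declared locally here; it is not an existing BCL type. `InlineArrayDemo` and the field name `_element0` are hypothetical.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical local declaration following the InlineArrayX<T> naming pattern.
// [InlineArray(4)] tells the runtime to lay out the single field 4 times.
[InlineArray(4)]
public struct InlineArray4<T>
{
    private T _element0;
}

public static class InlineArrayDemo
{
    public static int SumFirstFour()
    {
        var buffer = new InlineArray4<int>();

        // C# 12 allows indexing an inline array directly.
        for (int i = 0; i < 4; i++)
        {
            buffer[i] = i + 1;
        }

        // C# 12 also allows converting an inline array to a span without copying.
        ReadOnlySpan<int> span = buffer;

        int sum = 0;
        foreach (int value in span)
        {
            sum += value;
        }
        return sum;
    }
}
```

A type like this, declared by hand in one library, is a distinct type from an identically shaped declaration in another library, which is the unification concern raised in the replies.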
It needs to not only be speakable, but stable permanently to be on public API definitions - implicitly generated types cannot achieve this, because each library may have their own definitions, which will not get unified; or you'd have to define a type for every single size that you may want (the const generics option does implicitly by moving the quantity to a generic parameter). Regardless, this proposal was made for reducing the number of synthesised definitions in dlls used for collection expressions (including especially |
Alternatively we can introduce some basic hex digit types under
using System.Runtime.CompilerServices; // for MethodImpl and Unsafe

interface IHex
{
    abstract static int Value { get; }
}
struct Hex0 : IHex
{
    public static int Value => 0;
}
struct Hex1 : IHex
{
    public static int Value => 1;
}
struct Hex2 : IHex
{
    public static int Value => 2;
}
// ... same for Hex3 to HexF
interface IValue<T>
{
    abstract static T Value { get; }
}
struct Int32<H7, H6, H5, H4, H3, H2, H1, H0> : IValue<int>
    where H7 : IHex
    where H6 : IHex
    where H5 : IHex
    where H4 : IHex
    where H3 : IHex
    where H2 : IHex
    where H1 : IHex
    where H0 : IHex
{
    public static int Value
    {
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        get => H7.Value << 28 | H6.Value << 24 | H5.Value << 20 | H4.Value << 16 | H3.Value << 12 | H2.Value << 8 | H1.Value << 4 | H0.Value;
    }
}
Then we can add a new generic type
struct InlineArray<T, TSize> where TSize : IValue<int>
{
    private T elem;
}
and we repeat the field for i.e. instead of adding predefined
In this way we can use arbitrary sized
var arr = new InlineArray<int, Int32<Hex0, Hex0, Hex0, Hex0, Hex0, Hex0, Hex0, HexF>>(); // InlineArray of int with size 15
This won't introduce any breaking changes to our ABI, and can still be efficient as well. And at runtime we can easily get its value by calling
And this supports the full range of int32 while only adding 16+3=19 types to the BCL, and unused digit types can be easily trimmed away. Take the example of
In this case we will instantiate
And! This approach also unblocks the scenarios for exchanging data across managed ABI because now we are using unified types. In C# we can then allow users to use
If we ever want to extend it to support other number types as well, it can be easily extended. For example, floating point numbers:
struct Float32<H7, H6, H5, H4, H3, H2, H1, H0> : IValue<float>
    where H7 : IHex
    where H6 : IHex
    where H5 : IHex
    where H4 : IHex
    where H3 : IHex
    where H2 : IHex
    where H1 : IHex
    where H0 : IHex
{
    public static float Value
    {
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        get => Unsafe.BitCast<int, float>(H7.Value << 28 | H6.Value << 24 | H5.Value << 20 | H4.Value << 16 | H3.Value << 12 | H2.Value << 8 | H1.Value << 4 | H0.Value);
    }
}
|
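To make the encoding in the comment above concrete, here is a self-contained sketch that declares only the digits it needs (`Hex0`, `Hex1`, `HexF`) and reads the size back through the generic parameter. It assumes C# 11 and .NET 7 or later for static abstract interface members. `SizeDemo` and `SizeOf<TSize>` are hypothetical helpers for illustration and are not part of the comment's proposal.

```csharp
using System.Runtime.CompilerServices;

interface IHex { static abstract int Value { get; } }
struct Hex0 : IHex { public static int Value => 0; }
struct Hex1 : IHex { public static int Value => 1; }
struct HexF : IHex { public static int Value => 15; }

interface IValue<T> { static abstract T Value { get; } }

// Eight hex digits, most significant first, combined into one 32-bit value.
struct Int32<H7, H6, H5, H4, H3, H2, H1, H0> : IValue<int>
    where H7 : IHex where H6 : IHex where H5 : IHex where H4 : IHex
    where H3 : IHex where H2 : IHex where H1 : IHex where H0 : IHex
{
    public static int Value
    {
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        get => H7.Value << 28 | H6.Value << 24 | H5.Value << 20 | H4.Value << 16
             | H3.Value << 12 | H2.Value << 8 | H1.Value << 4 | H0.Value;
    }
}

static class SizeDemo
{
    // Hypothetical helper: recovers the encoded size from the type argument alone.
    public static int SizeOf<TSize>() where TSize : IValue<int> => TSize.Value;

    // 0x0000000F
    public static int Fifteen() =>
        SizeOf<Int32<Hex0, Hex0, Hex0, Hex0, Hex0, Hex0, Hex0, HexF>>();

    // 0x0000001F
    public static int ThirtyOne() =>
        SizeOf<Int32<Hex0, Hex0, Hex0, Hex0, Hex0, Hex0, Hex1, HexF>>();
}
```

The sketch only shows that the value is recoverable from the type. It says nothing about how the runtime would use that value to lay out the repeated field, which is the part the later replies identify as the real cost.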
This would be a lot of expense for something where typical real world usages need 2-4 members, maybe 2-7. These types aren't really being added for broad general usage (hence the namespace choice), but rather for internal compiler usages as part of things like A general purpose set of Proposals like this one are balancing real world considerations, not simply what is nice to have. They are weighing how it will get used in practice, the impact of doing the smaller feature for the immediate need, etc. Things like |
We can keep them as internal types so that they won't be exposed to users directly, while only compilers can use them. This would address the issue of |
That still has significant cost and has its own other issues, especially in using and leaking internal types across assembly boundaries when the appropriate IVTs or similar aren't in place. Just implementing integral generics, getting language support online, etc. is a significant effort, and this goes far beyond the prototype that you wrote, as it gets into design considerations about how it extends the future of .NET/C#, etc., which will take months of up front effort. None of that is justified for the scenario that is driving the proposal here; it's way too much work/effort for something that is trivially solvable today almost for free. |
I don't think the current proposal would solve any issue. It will only mitigate the issue temporarily. We will end up needing more and more My opinion is that we either do nothing here, or do something after we have those integral types. |
They won't be useless and are unlikely to be replaced in the near future. The cost of waiting is significantly worse to the ecosystem
This is unlikely. The Roslyn team has already done some real world analysis on what sizes are actually found in production. The general .NET team is able to do more analysis as well, but it will likely come to the same conclusion. Things like what's used in interop are also not broad enough to be as big of a concern. It isn't something that is being implicitly done dozens of times for every compilation. All of
For the 1023 definitions we have:
So you can see even here it's primarily small powers of 2 and random outliers unique to a given scenario. I also maintain several other prominent interop binding libraries, and they are all similar in nature. There isn't a significant number of |
Let's do a simple calculation, if we use a unified
Number of distinct hex digit types: 16
Number of
Even if we consider all 81 instantiations and their type definition of
While currently we need 107+104+102+94+92+78+59+55+42+39+30+16+15+14+12+12+12+9+7+7+6+6+4+4+4+4+4+4+3+3+3+3+3+3+3+2+2+2+2+2+2+2+2+2+2+2+2+2+2+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1=1023 types. If we add predefined
So apparently this won't introduce any meaningful regression for NativeAOT, and will be a big win for JIT scenario and managed binary size. Besides, we can easily get the number of elements of an |
Something like TerraFX wouldn't use such a type, it hurts UX. The point of me sharing the numbers was to give more representation to how rare the need is beyond the common cases already called out by this proposal. The entire approach of |
They are supported by today's runtime and we don't need any additional work to support such integral types being used. You can define those types and use them today without any issue. The only work we need to do is to calculate the value of a given |
The consideration is the work required to integrate everything into the VM and the JIT such that the type definition and layout is correct, understood, handled as expected from the ABI level, etc. You're effectively defining another alternative to This discussion is going around in circles at this point, so let's drop it. I've iterated why the alternatives are not likely to happen. |
The UX issue can be resolved with the help of language, to interpret the
Agree. It's kind of out-of-scope for this specific proposal. I may open a separate API proposal (and maybe language proposal as well) for this.
I don't think we should stop moving forward in this field :) |
Is this ask motivated around the assembly level IL bloat or more around the runtime overhead and trimmer's ability to do its job? If the latter, I could see more general approach that can be extended for all compiler generated types with the help of the type system. As these types are already "unspeakable", I could see that the compiler could mark the appropriate types as having "no identity" where the type system is free to merge and re-use already loaded types that also have "no identity". While this approach is more expensive, it does extend the scope and as the compiler is ramping up on generating these assembly level types, the savings can add up. |
Background and motivation
Today, Roslyn will emit anonymous inline array types when users call methods that take params ReadOnlySpan<T> in params form, or when using collection expressions in some fashion. There can be a significant number of these types; for example, @jaredpar found that in Microsoft.AspNetCore.App, there are ~110 such types. In Microsoft.NetCore.App, there are ~136 such types. Most of these types are 5 elements or smaller, thus I'm only proposing up to InlineArray5. However, we may want to consider going further; dotnet/roslyn#74538 asks for Roslyn to emit calls to string.Concat(ReadOnlySpan<string>), which may push the numbers here up more. I'll leave it to the BCL designers to decide what number is right, and Roslyn will take advantage of whatever is available.
API Proposal
API Usage
Alternative Designs
No response
Risks
No response