More string optimizations #18546

Open: brianrourkeboll wants to merge 10 commits into main

brianrourkeboll (Contributor) commented May 14, 2025

Description

(This might need an RFC? It also might be too much of a hack to accept... But it does work.)

This PR follows and improves upon #9549. It improves the string function's implementation for signed integer types (sbyte, int16, int32, int64) and enum types based on signed integer types by directly calling the appropriate ToString overload on the underlying type. The boxing and casting of the previous implementation (see #1714, #9153) are now avoided altogether when the type is known at compile-time. All existing culture-invariant and culture-dependent behavior is preserved.

This is done in a backwards- and forwards-compatible way by working within the confines of the existing when 'T : … library-only static optimization construct, avoiding the need to extend that feature with new syntax or constraints (as suggested in #9594) or to introduce a new construct to the pickling format. That is, this and newer compilers will still be able to compile code using older FSharp.Core versions, while older F# compilers will be able to consume this and newer FSharp.Core versions without any compile-time or runtime breaking changes.
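
As a small illustration of the culture-invariant behavior that is meant to be preserved, the following sketch compares `string` with a plain `ToString()` call under a custom culture. The `tildeCulture` value is a hypothetical culture made up for this example (an invariant-culture clone whose negative sign is `~`); it is not part of the PR.

```fsharp
open System
open System.Globalization

// Hypothetical culture for illustration only: a writable clone of the
// invariant culture whose negative sign is "~" instead of "-".
let tildeCulture = CultureInfo.InvariantCulture.Clone() :?> CultureInfo
tildeCulture.NumberFormat.NegativeSign <- "~"

let previousCulture = CultureInfo.CurrentCulture
CultureInfo.CurrentCulture <- tildeCulture

// Int32.ToString() with no arguments follows the current culture.
let viaToString = (-3).ToString()

// The string function formats signed integers with the invariant culture.
let viaString = string -3
let smallPositive = string 3
let largePositive = string 1152921504606846975L

// A named enum value is rendered as its member name.
let namedEnum = string DayOfWeek.Wednesday

// Restore the original culture so later code is unaffected.
CultureInfo.CurrentCulture <- previousCulture

printfn "ToString(): %s, string: %s" viaToString viaString
printfn "%s %s %s" smallPositive largePositive namedEnum
```

Under these assumptions, `string` should keep producing the invariant `-` sign no matter which culture is current, with or without the optimization described in this PR.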

Example of IL before

.method public static string  'string int32'(int32 'value') cil managed
{

  .maxstack  8
  IL_0000:  ldarg.0
  IL_0001:  box        [runtime]System.Int32
  IL_0006:  unbox.any  [runtime]System.IFormattable
  IL_000b:  ldnull
  IL_000c:  call       class [netstandard]System.Globalization.CultureInfo [netstandard]System.Globalization.CultureInfo::get_InvariantCulture()
  IL_0011:  tail.
  IL_0013:  callvirt   instance string [netstandard]System.IFormattable::ToString(string,
                                                                                  class [netstandard]System.IFormatProvider)
  IL_0018:  ret
} 
.method public static string  'string<Int32Enum>'(valuetype assembly/String/Int32Enum 'enum') cil managed
{
  
  .maxstack  8
  IL_0000:  ldarg.0
  IL_0001:  box        assembly/String/Int32Enum
  IL_0006:  unbox.any  [runtime]System.IFormattable
  IL_000b:  ldnull
  IL_000c:  call       class [netstandard]System.Globalization.CultureInfo [netstandard]System.Globalization.CultureInfo::get_InvariantCulture()
  IL_0011:  tail.
  IL_0013:  callvirt   instance string [netstandard]System.IFormattable::ToString(string,
                                                                                  class [netstandard]System.IFormatProvider)
  IL_0018:  ret
} 

Example of IL after

.method public static string  'string int32'(int32 'value') cil managed
{
  
  .maxstack  8
  IL_0000:  ldarga.s   'value'
  IL_0002:  ldnull
  IL_0003:  call       class [netstandard]System.Globalization.CultureInfo [netstandard]System.Globalization.CultureInfo::get_InvariantCulture()
  IL_0008:  call       instance string [netstandard]System.Int32::ToString(string,
                                                                           class [netstandard]System.IFormatProvider)
  IL_000d:  ret
}
.method public static string  'string<Int32Enum>'(valuetype assembly/String/Int32Enum 'enum') cil managed
{
  
  .maxstack  3
  .locals init (valuetype assembly/String/Int32Enum V_0)
  IL_0000:  ldarg.0
  IL_0001:  stloc.0
  IL_0002:  ldloca.s   V_0
  IL_0004:  constrained. assembly/String/Int32Enum
  IL_000a:  callvirt   instance string [netstandard]System.Object::ToString()
  IL_000f:  ret
} 
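
In source terms, the IL above corresponds roughly to the following hand-written F# functions. This is only a sketch for comparison, not the FSharp.Core implementation: the function names are invented here, and the members of `Int32Enum` are hypothetical because the IL shows only the type name.

```fsharp
open System
open System.Globalization

// Hypothetical enum for illustration; only its name appears in the IL above.
type Int32Enum =
    | Zero = 0
    | One = 1

// "Before" shape for int32: box, cast to IFormattable, then a virtual call.
let stringInt32Before (value: int) : string =
    (box value :?> IFormattable).ToString(null, CultureInfo.InvariantCulture)

// "After" shape for int32: call Int32.ToString(string, IFormatProvider) directly.
let stringInt32After (value: int) : string =
    value.ToString(null, CultureInfo.InvariantCulture)

// "Before" shape for an enum: the same box-and-cast path as for int32.
let stringEnumBefore (value: Int32Enum) : string =
    (box value :?> IFormattable).ToString(null, CultureInfo.InvariantCulture)

// "After" shape for an enum: a plain ToString() call on the enum value.
let stringEnumAfter (value: Int32Enum) : string =
    value.ToString()

printfn "%s %s" (stringInt32Before -3) (stringInt32After -3)
printfn "%s %s" (stringEnumBefore Int32Enum.One) (stringEnumAfter Int32Enum.One)
```

Both variants of each pair are expected to return the same text; the difference is only in the boxing and the interface dispatch that the "before" shape needs.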

Changes

  1. A new marker type—SupportsWhenTEnum—is added to the Microsoft.FSharp.Core.CompilerServices namespace in FSharp.Core. This type is marked [<Sealed; AbstractClass>], has no members or constructors, and is hidden by default via [<CompilerMessage(…)>]. (Could/should we use IsError = true, and/or use EditorBrowsable or Experimental to discourage use from other languages as well?)
  2. Special compiler support is added for two new library-only (i.e., FSharp.Core-only) static optimization constraints:
    • The compiler now recognizes the when 'T : Enum library-only static optimization constraint and treats it very similarly to the already-possible when 'T : 'T & #Enum, only the subtype constraint is not propagated back to the outer 'T.
    • The compiler now recognizes the special when 'T : SupportsWhenTEnum library-only static optimization constraint. This enables compilers that understand it to process any following static optimization constraints in a different order from compilers that do not understand it while remaining fully compatible with them.
  3. The sequence of library-only static optimizations in the string operator is updated to use a new ordering via the when 'T : SupportsWhenTEnum and when 'T : Enum constraints when compiled with newer compilers. Older compilers will not recognize the new constraints; they will simply skip over them and use the old sequence of constraints exactly as before.

The compiler change could be put behind a language feature if necessary.
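
For readers unfamiliar with the shape of such a marker type, here is a minimal sketch of how a comparable type could be declared in ordinary F# code. The name `SupportsWhenTEnumSketch`, the message text, and the message number 10001 are all assumptions made for this example; the actual declaration in FSharp.Core may differ.

```fsharp
open System

// A marker type in the style described above: sealed and abstract, so it can
// be neither instantiated nor inherited, and with no members or constructors.
// The message text and number are placeholders chosen for this sketch.
[<Sealed; AbstractClass>]
[<CompilerMessage("This type is a compiler marker and is not intended for direct use.", 10001, IsHidden = true)>]
type SupportsWhenTEnumSketch = class end

// Inspect the resulting .NET type via reflection.
// Referring to the type here may itself produce the compiler message above.
let markerType : Type = typeof<SupportsWhenTEnumSketch>

printfn "abstract: %b, sealed: %b" markerType.IsAbstract markerType.IsSealed
```

The type carries no behavior of its own; per the description above, its only role is to be recognized by name in a static optimization constraint.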

Tradeoffs

Pros

  • Enables significant speedups for string on very common types like int.
  • Fully backwards and forwards compatible with older and newer F# compilers and FSharp.Core versions.
  • Introduces essentially zero public-facing changes to the language.
  • Avoids the complex engineering and compatibility gymnastics that would likely be required to support altogether new kinds of static optimization constraints (like #9594, "Support compiler-specific statically-resolved 'when' syntax with 'enum<_>' and possibly others").
  • The changes to the compiler and core library that are made are very small, localized, and could conceivably be safely removed or changed in the future if necessary or desired.

Cons

  • Feels like a bit of a hack.
  • Introduces two subtle new special cases to the already-subtle typechecking of library-only static optimization constraints.
  • Not terribly scalable: due to the way that each new static optimization marker type requires one-upping anything that came before, significant application of this technique could quickly become ugly. It does seem rather unlikely for that to happen, though.

Alternatives

  1. Do nothing.
  2. Try something like #9594 ("Support compiler-specific statically-resolved 'when' syntax with 'enum<_>' and possibly others").
  3. Change the typechecking of generic constraints in static optimization constraints such that they no longer propagate up to the outer 'T (and explicitly add the necessary constraints to the core library functions that currently depend on this propagation). This would allow existing syntactically-valid generic constraints to be used (like when 'T : 'T & #Enum) instead of adding a special case for when 'T : Enum.
  4. Do special lowering tailored to the compiled output of the existing code, along the lines of LowerComputedCollections.fs, etc.
  5. Add some more general mechanism to the optimizer to strip away boxing/casts that could be known to be unnecessary at compile-time.

If 2. or 3. had been done in F# 1.0 (or whenever static optimization constraints were added), that would have been ideal. However, doing either 2. or 3. now would involve a significant amount of engineering work, and both would have compatibility problems.

Checklist

  • Test cases added.
  • Performance benchmarks added in case of performance changes.
  • Release notes entry updated.

Benchmarks

  • The string representation of small positive integer values is cached. Calling string on such values for signed integer types (sbyte, int16, int32, int64) now results in zero allocations and a ~3× speedup.
  • There is a noticeable speedup and reduction in allocations for string on negative and non-cached positive signed integers.
  • There is also a noticeable speedup for string on enum values, including negative values that do not correspond to a member of the given enum type.

Source

open System
open System.Runtime.CompilerServices

[<MethodImpl(MethodImplOptions.NoInlining)>]
let ``string 3`` () = string 3

[<MethodImpl(MethodImplOptions.NoInlining)>]
let ``string -3`` () = string -3

[<MethodImpl(MethodImplOptions.NoInlining)>]
let ``string 1152921504606846975L`` () = string 1152921504606846975L

[<MethodImpl(MethodImplOptions.NoInlining)>]
let ``string -1152921504606846975L`` () = string -1152921504606846975L

[<MethodImpl(MethodImplOptions.NoInlining)>]
let ``string DayOfWeek.Wednesday`` () = string DayOfWeek.Wednesday

[<MethodImpl(MethodImplOptions.NoInlining)>]
let ``string (enum<DayOfWeek> -3)`` () = string (enum<DayOfWeek> -3)

| Benchmark                    | Build  | Mean      | Ratio | Gen0   | Allocated | Alloc Ratio |
|----------------------------- |------- |----------:|------:|-------:|----------:|------------:|
| string 3                     | Before |  6.172 ns |  1.00 | 0.0019 |      24 B |        1.00 |
| string 3                     | After  |  1.925 ns |  0.31 |      - |         - |        0.00 |
|                              |        |           |       |        |           |             |
| string -3                    | Before | 12.879 ns |  1.00 | 0.0044 |      56 B |        1.00 |
| string -3                    | After  |  8.082 ns |  0.63 | 0.0025 |      32 B |        0.57 |
|                              |        |           |       |        |           |             |
| string 1152921504606846975L  | Before | 18.343 ns |  1.00 | 0.0070 |      88 B |        1.00 |
| string 1152921504606846975L  | After  | 13.976 ns |  0.76 | 0.0051 |      64 B |        0.73 |
|                              |        |           |       |        |           |             |
| string -1152921504606846975L | Before | 21.641 ns |  1.00 | 0.0070 |      88 B |        1.00 |
| string -1152921504606846975L | After  | 17.313 ns |  0.80 | 0.0051 |      64 B |        0.73 |
|                              |        |           |       |        |           |             |
| string DayOfWeek.Wednesday   | Before | 10.036 ns |  1.00 | 0.0019 |      24 B |        1.00 |
| string DayOfWeek.Wednesday   | After  |  7.940 ns |  0.79 | 0.0019 |      24 B |        1.00 |
|                              |        |           |       |        |           |             |
| string (enum<DayOfWeek> -3)  | Before | 28.096 ns |  1.00 | 0.0044 |      56 B |        1.00 |
| string (enum<DayOfWeek> -3)  | After  | 15.098 ns |  0.54 | 0.0044 |      56 B |        1.00 |
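
The allocation difference can also be checked roughly without a benchmark harness. The sketch below is not the benchmark code used for the table: `allocatedBytes`, `boxedCall`, and `directCall` are names invented here, and the two calls are hand-written stand-ins for the old and new code paths. The numbers it prints depend on the runtime version, so no expected output is given.

```fsharp
open System
open System.Globalization

/// Bytes allocated on the current thread while calling `f` `iterations` times.
let allocatedBytes (iterations: int) (f: unit -> string) : int64 =
    f () |> ignore // warm-up call so one-time initialization is not counted
    let before = GC.GetAllocatedBytesForCurrentThread()
    for _ in 1 .. iterations do
        f () |> ignore
    GC.GetAllocatedBytesForCurrentThread() - before

// Stand-in for the old path: box the int, then call through IFormattable.
let boxedCall () : string =
    (box 3 :?> IFormattable).ToString(null, CultureInfo.InvariantCulture)

// Stand-in for the new path: call Int32.ToString(string, IFormatProvider) directly.
let directCall () : string =
    (3).ToString(null, CultureInfo.InvariantCulture)

printfn "boxed:  %d bytes over 1000 calls" (allocatedBytes 1000 boxedCall)
printfn "direct: %d bytes over 1000 calls" (allocatedBytes 1000 directCall)
```

This only gives a coarse signal; a proper harness such as the one behind the table above controls for warm-up, JIT tiering, and measurement noise.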

github-actions bot commented May 14, 2025

❗ Release notes required

✅ Found changes and release notes in following paths:

| Change path | Release notes path | Description |
|-------------|--------------------|-------------|
| src/FSharp.Core | docs/release-notes/.FSharp.Core/10.0.100.md | |
| src/Compiler | docs/release-notes/.FSharp.Compiler.Service/10.0.100.md | |

brianrourkeboll force-pushed the more-string-optimizations branch 4 times, most recently from 18062d5 to 5eded0b, May 15, 2025 22:00
brianrourkeboll changed the base branch from main to T-Gro-patch-3, May 16, 2025 17:10
brianrourkeboll force-pushed the more-string-optimizations branch from 5eded0b to 0c48a37, May 16, 2025 17:11
brianrourkeboll changed the base branch from T-Gro-patch-3 to main, May 16, 2025 22:25
brianrourkeboll changed the base branch from main to T-Gro-patch-3, May 16, 2025 22:32
brianrourkeboll changed the base branch from T-Gro-patch-3 to main, May 17, 2025 16:34
brianrourkeboll force-pushed the more-string-optimizations branch 2 times, most recently from 277f5c4 to 2e83e22, May 17, 2025 17:39
brianrourkeboll marked this pull request as ready for review, May 17, 2025 23:42
brianrourkeboll requested a review from a team as a code owner, May 17, 2025 23:42

T-Gro (Member) left a comment

I like this addition. We can uplift what you called the type-marker hack into a more explicit (new) mechanism: a feature toggle for communicating between the compiler and statically optimized library code.

Testing of new compiler + new FSharp.Core is well covered by the existing FSharp.Core tests.
I think it would be wise to add a smoke test showing that new compiler + old FSharp.Core works fine.

(We could also brainstorm a good way to test the last combination, old compiler + new FSharp.Core. One way could be to run the FSharp.Core test suite against a freshly built FSharp.Core, using the last-known-good SDK, without compiler overrides.)

This special-casing allows us to update FSharp.Core to avoid boxing
when calling the `string` function on enums and signed integral types
going forward while still allowing the updated version of FSharp.Core
to be fully compatible with older compilers.

Adding support for some form of constraint in library-only static
optimizations instead would have been problematic for multiple reasons.
Supporting something like `when 'T : enum<'U>` would have required
additional modifications to the compiler and would not have been
consumable by older compilers. It would also introduce a new type
variable. While something like `when 'T : 'T & #Enum` is already
syntactically valid, it would add that constraint to the entire `string`
function without further modification to the typechecker. It would also
not be consumable by older compilers.

I think adding a special case for enums is justifiable since (1) enums are a
special kind of type to begin with, and (2) static optimization
constraints are only allowed in FSharp.Core, so the change to the
language itself is quite small.

brianrourkeboll force-pushed the more-string-optimizations branch from 7f2499f to d5dc4a5, May 19, 2025 23:49
brianrourkeboll force-pushed the more-string-optimizations branch from d5dc4a5 to 4d198f5, May 20, 2025 01:35

T-Gro (Member) left a comment

This looks good now.
The code is now explicit about a new construct, CompilerServices.Supports.., which we can use to communicate between FSharp.Core and the compiler.

The proto process covered via CI also ensures that:

  • The LKG compiler can build this code in fslib
  • The LKG compiler can build a project that references the fslib built this way (when LKG builds proto)

github-project-automation bot moved this from New to In Progress in F# Compiler and Tooling, May 21, 2025