Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 6 additions & 1 deletion src/coreclr/jit/compiler.h
Original file line number Diff line number Diff line change
Expand Up @@ -10599,7 +10599,12 @@ class Compiler
bool structSizeMightRepresentSIMDType(size_t structSize)
{
#ifdef FEATURE_SIMD
return (structSize >= getMinVectorByteLength()) && (structSize <= getMaxVectorByteLength());
#ifdef TARGET_ARM64
const uint32_t max = compExactlyDependsOn(InstructionSet_VectorT) ? MAX_SVE_REGSIZE_BYTES : FP_REGSIZE_BYTES;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to potentially cause light up for a lot of unintended structs, which can hurt startup perf

Won't SVE, in most scenarios, rather be "size unknown" and in isolated scenarios a JIT (but not AOT or pre-JIT) environment may be able to explicitly query the true size and optimize a few things (like frame layout)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes agreed, we could query the actual size here in JIT mode, and use this as the upper bound. We can also filter sizes that are powers of 2 in bits, which could help with AOT as well.

As I've added the primitive for it, I could just do this in this PR.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I've added the primitive for it, I could just do this in this PR.

Sorry please ignore this comment, I've got confused with another patch I'm preparing. I will add the optimization to that patch instead, which adds a primitive to read the VL from Vector<T> metadata.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a circular dependency between querying the size of Vector<T> and calling structSizeMightRepresentSIMDType. Vector<T> needs to have been seen by the JIT to query the size, but the JIT will typically not pattern match for Vector<T> (in getBaseTypeAndSizeOfSIMDType) until structSizeMightRepresentSIMDType is true.

I can't find a way to try and look for a class handle by name, and I'm assuming this is by design? So to make this optimization happen, I think I'd need to cache a Vector<T> handle when it's found, and have some sort of ready state to check whether it's been seen yet. Then switch to the optimal maximum bound when it's available.

#else
const uint32_t max = getMaxVectorByteLength();
#endif // TARGET_ARM64
return (structSize >= getMinVectorByteLength()) && (structSize <= max);
#else
return false;
#endif // FEATURE_SIMD
Expand Down
3 changes: 3 additions & 0 deletions src/coreclr/jit/targetarm64.h
Original file line number Diff line number Diff line change
Expand Up @@ -370,4 +370,7 @@

#define REG_UNKBASE REG_R19
#define RBM_UNKBASE RBM_R19

#define MAX_SVE_REGSIZE_BYTES 256

// clang-format on
10 changes: 6 additions & 4 deletions src/coreclr/vm/codeman.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1720,10 +1720,12 @@ void EEJitManager::SetCpuInfo()
uint32_t maxVectorTLength = (maxVectorTBitWidth / 8);
uint64_t sveLengthFromOS = GetSveLengthFromOS();

// For now, enable SVE only when the system vector length is 16 bytes (128-bits)
// TODO: https://github.com/dotnet/runtime/issues/101477
if (sveLengthFromOS == 16)
// if ((maxVectorTLength >= sveLengthFromOS) || (maxVectorTBitWidth == 0))
if (sveLengthFromOS == 16
#ifdef _DEBUG
|| (CLRConfig::GetConfigValue(CLRConfig::INTERNAL_JitUseScalableVectorT)
&& ((maxVectorTLength >= sveLengthFromOS) || (maxVectorTBitWidth == 0)))
#endif
)
{
CPUCompileFlags.Set(InstructionSet_Sve);

Expand Down
15 changes: 13 additions & 2 deletions src/coreclr/vm/methodtablebuilder.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1174,6 +1174,10 @@ MethodTableBuilder::CopyParentVtable()
}
}

#ifdef TARGET_ARM64
extern "C" uint64_t GetSveLengthFromOS();
#endif

//*******************************************************************************
// Determine if this is the special SIMD type System.Numerics.Vector<T>, whose
// size is determined dynamically based on the hardware and the presence of JIT
Expand All @@ -1186,7 +1190,7 @@ BOOL MethodTableBuilder::CheckIfSIMDAndUpdateSize()
{
STANDARD_VM_CONTRACT;

#if defined(TARGET_X86) || defined(TARGET_AMD64)
#if defined(TARGET_X86) || defined(TARGET_AMD64) || defined(TARGET_ARM64)
if (!bmtProp->fIsIntrinsicType)
return false;

Expand All @@ -1205,6 +1209,7 @@ BOOL MethodTableBuilder::CheckIfSIMDAndUpdateSize()
CORJIT_FLAGS CPUCompileFlags = ExecutionManager::GetEEJitManager()->GetCPUCompileFlags();
uint32_t numInstanceFieldBytes = 16;

#if defined(TARGET_X86) || defined(TARGET_AMD64)
if (CPUCompileFlags.IsSet(InstructionSet_VectorT512))
{
numInstanceFieldBytes = 64;
Expand All @@ -1213,13 +1218,19 @@ BOOL MethodTableBuilder::CheckIfSIMDAndUpdateSize()
{
numInstanceFieldBytes = 32;
}
#elif defined(TARGET_ARM64)
if (CPUCompileFlags.IsSet(InstructionSet_VectorT))
{
numInstanceFieldBytes = (uint32_t) GetSveLengthFromOS();
}
Comment on lines +1222 to +1225

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is "correct" because we'll rather have InstructionSet_VectorT128 if we have AdvSimd without SVE, correct?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I was thinking it's going to be InstructionSet_VectorT128 XOR InstructionSet_VectorT, never both enabled at the same time. So InstructionSet_VectorT is only serviced by SVE and will not be available when SVE is not there.

At some point we will need to decide if we prefer AdvSimd or SVE when the VL == 128 bits. This is dependent on micro-architecture, but we can find a different reason to pick one generally.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

never both enabled at the same time.

Correct, that should generally be an error scenario and effectively a bug in the ISA detection logic in the VM, but I wanted to make sure it was being persisted here and wasn't something "unique" for the scalable scenario.

At some point we will need to decide if we prefer AdvSimd or SVE when the VL == 128 bits. This is dependent on micro-architecture, but we can find a different reason to pick one generally.

👍

#endif

if (numInstanceFieldBytes != 16)
{
bmtFP->NumInstanceFieldBytes = numInstanceFieldBytes;
return true;
}
#endif // TARGET_X86 || TARGET_AMD64
#endif // TARGET_X86 || TARGET_AMD64 || TARGET_ARM64

return false;
}
Expand Down
Loading