Commit dacf9db

davidwrighton, jkotas, and AaronRobinsonMSFT authored
Change temporary entrypoints to be lazily allocated (#101580)
* WorkingOnIt
* It basically works for a single example.

Baseline
Loader Heap:
----------------------------------------
System Domain: 7ffab916ec00
LoaderAllocator: 7ffab916ec00
LowFrequencyHeap: Size: 0xf0000 (983040) bytes total.
HighFrequencyHeap: Size: 0x16a000 (1482752) bytes total, 0x3000 (12288) bytes wasted.
StubHeap: Size: 0x1000 (4096) bytes total.
FixupPrecodeHeap: Size: 0x168000 (1474560) bytes total.
NewStubPrecodeHeap: Size: 0x18000 (98304) bytes total.
IndirectionCellHeap: Size: 0x1000 (4096) bytes total.
CacheEntryHeap: Size: 0x1000 (4096) bytes total.
Total size: Size: 0x3dd000 (4050944) bytes total, 0x3000 (12288) bytes wasted.

Compare
Loader Heap:
----------------------------------------
System Domain: 7ff9eb49dc00
LoaderAllocator: 7ff9eb49dc00
LowFrequencyHeap: Size: 0xef000 (978944) bytes total.
HighFrequencyHeap: Size: 0x1b2000 (1777664) bytes total, 0x3000 (12288) bytes wasted.
StubHeap: Size: 0x1000 (4096) bytes total.
FixupPrecodeHeap: Size: 0x70000 (458752) bytes total.
NewStubPrecodeHeap: Size: 0x10000 (65536) bytes total.
IndirectionCellHeap: Size: 0x1000 (4096) bytes total.
CacheEntryHeap: Size: 0x1000 (4096) bytes total.
Total size: Size: 0x324000 (3293184) bytes total, 0x3000 (12288) bytes wasted.

LowFrequencyHeap is 4KB smaller
HighFrequencyHeap is 288KB bigger
FixupPrecodeHeap is 992KB smaller
NewStubPrecodeHeap is 32KB smaller

* If there isn't a parent methodtable and the slot matches... then by definition the method is defining the slot
* Fix a couple more issues found when running a subset of the coreclr tests
* Get X86 building again
* Attempt to use a consistent api to force slots to be set
* Put cache around RequiresStableEntryPoint
* Fix typo
* Fix interop identified issue where we sometimes set a non Precode into an interface
* Move ARM and X86 to disable compact entry points
* Attempt to fix build breaks
* fix typo
* Fix another Musl validation issue
* More tweaks around NULL handling
* Hopefully the last NULL issue
* Fix more NULL issues
* Fixup obvious issues
* Fix allocation behavior so we don't free the data too early or too late
* Fix musl validation issue
* Fix tiered compilation
* Remove Compact Entrypoint logic
* Add new ISOSDacInterface15 api
* Fix some naming of NoAlloc to a more clear IfExists suffix
* Remove way in which GetTemporaryEntryPoint behaves differently for DAC builds, and then remove GetTemporaryEntryPoint usage from DAC entirely in favor of GetTemporaryEntryPointIfExists
* Attempt to reduce most of the use of EnsureSlotFilled. Untested, but it's late.
* Fix the build before sending to github
* Fix unix build break, and invalid assert
* Improve assertion checks to validate that we don't allocate temporary entrypoints that will be orphaned if the type doesn't actually end up published.
* Remove unused parameters and add contracts
* Update method-descriptor.md
* Fix musl validation issue
* Adjust SOS api to be an enumerator
* Fix assertion issues noted
  Fix ISOSDacInterface15 to actually work
* Remove GetRestoredSlotIfExists
  - It's the same as GetSlot .... just replace it with that function.
* Update src/coreclr/debug/daccess/daccess.cpp
  Co-authored-by: Jan Kotas <[email protected]>
* Update docs/design/coreclr/botr/method-descriptor.md
  Co-authored-by: Jan Kotas <[email protected]>
* Update src/coreclr/vm/methodtable.inl
  Co-authored-by: Jan Kotas <[email protected]>
* Update src/coreclr/vm/methodtable.h
  Co-authored-by: Jan Kotas <[email protected]>
* Fix GetMethodDescForSlot_NoThrow
  Try removing EnsureSlotFilled
  Implement IsEligibleForTieredCompilation in terms of IsEligibleForTieredCompilation_NoCheckMethodDescChunk
* Fix missing change intended in last commit
* Fix some more IsPublished memory use issues
* Call the right GetSlot method
* Move another scenario to NoThrow, I think this should clear up our tests...
* Add additional IsPublished check
* Fix MUSL validation build error and Windows x86 build error
* Address code review feedback
* Fix classcompat build
* Update src/coreclr/vm/method.cpp
  Co-authored-by: Aaron Robinson <[email protected]>
* Remove assert that is invalid because TryGetMultiCallableAddrOfCode can return NULL ... and then another thread could produce a stable entrypoint and the assert could lose the race
* Final (hopefully) code review tweaks.
* It's possible for GetOrCreatePrecode to be called for cases where it isn't REQUIRED. We need to handle that case.

---------

Co-authored-by: Jan Kotas <[email protected]>
Co-authored-by: Aaron Robinson <[email protected]>
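The per-heap deltas quoted in the message can be rechecked from the raw sizes in the two dumps. The sketch below is only an illustration of that arithmetic: the struct, constants, and function names are made up for this example and are not part of the runtime. By these figures, the compare run's LowFrequencyHeap is 4KB smaller than the baseline, and the total shrinks by 740KB.

```cpp
#include <cassert>
#include <cstdint>

// Heap sizes in bytes, copied from the baseline and compare dumps in the
// commit message. All names in this block are illustrative only.
struct HeapSizes {
    std::int64_t lowFrequency;
    std::int64_t highFrequency;
    std::int64_t fixupPrecode;
    std::int64_t newStubPrecode;
    std::int64_t total;
};

constexpr HeapSizes kBaseline{0xf0000, 0x16a000, 0x168000, 0x18000, 0x3dd000};
constexpr HeapSizes kCompare{0xef000, 0x1b2000, 0x70000, 0x10000, 0x324000};

// Signed change in KB going from baseline to compare (negative = smaller).
constexpr std::int64_t DeltaKB(std::int64_t baseline, std::int64_t compare) {
    return (compare - baseline) / 1024;
}

// Sum of the four per-heap changes. The remaining heaps (StubHeap,
// IndirectionCellHeap, CacheEntryHeap) are 4096 bytes in both dumps.
constexpr std::int64_t SumOfListedDeltasKB() {
    return DeltaKB(kBaseline.lowFrequency, kCompare.lowFrequency) +
           DeltaKB(kBaseline.highFrequency, kCompare.highFrequency) +
           DeltaKB(kBaseline.fixupPrecode, kCompare.fixupPrecode) +
           DeltaKB(kBaseline.newStubPrecode, kCompare.newStubPrecode);
}

// The per-heap changes should account for the whole change in total size.
inline void CheckTotalsAgree() {
    assert(SumOfListedDeltasKB() == DeltaKB(kBaseline.total, kCompare.total));
}
```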
1 parent 00bddf9 commit dacf9db

42 files changed

+1168 -1073 lines changed

docs/design/coreclr/botr/method-descriptor.md

+6 -66
Original file line number | Diff line number | Diff line change
@@ -85,7 +85,9 @@ DWORD MethodDesc::GetAttrs()
8585
Method Slots
8686
------------
8787

88-
Each MethodDesc has a slot, which contains the entry point of the method. The slot and entry point must exist for all methods, even the ones that never run like abstract methods. There are multiple places in the runtime that depend on the 1:1 mapping between entry points and MethodDescs, making this relationship an invariant.
88+
Each MethodDesc has a slot, which contains the current entry point of the method. The slot must exist for all methods, even the ones that never run, like abstract methods. There are multiple places in the runtime that depend on the mapping between entry points and MethodDescs.
89+
90+
Each MethodDesc logically has an entry point, but we do not allocate these eagerly at MethodDesc creation time. The invariant is that once the method is identified as a method to run, or is used in virtual overriding, we will allocate the entrypoint.
8991

9092
The slot is either in MethodTable or in MethodDesc itself. The location of the slot is determined by `mdcHasNonVtableSlot` bit on MethodDesc.
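The lazy-allocation rule described in this hunk can be pictured with a small model. This is a minimal sketch under stated assumptions, not the runtime's implementation: `LazyMethodDesc`, `GetOrCreateTemporaryEntryPoint`, and the stub helpers are hypothetical names, and `PCODE` is a stand-in typedef. Only the `IfExists` naming mirrors the commit, which returns null when nothing has been allocated yet.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

using PCODE = std::uintptr_t;  // stand-in for the runtime's code-address type

// Hypothetical, simplified model of a method descriptor whose temporary
// entry point is allocated on first demand rather than at creation time.
class LazyMethodDesc {
public:
    ~LazyMethodDesc() {
        PCODE p = m_entryPoint.load(std::memory_order_acquire);
        if (p != 0) ReleaseStub(p);
    }

    // Returns the entry point if one has been allocated, otherwise 0.
    // Never allocates, which suits read-only inspection.
    PCODE GetTemporaryEntryPointIfExists() const {
        return m_entryPoint.load(std::memory_order_acquire);
    }

    // Returns the entry point, allocating it on first use. If two threads
    // race, only one allocation is published and both observe that value.
    PCODE GetOrCreateTemporaryEntryPoint() {
        PCODE existing = m_entryPoint.load(std::memory_order_acquire);
        if (existing != 0) return existing;

        PCODE fresh = AllocateStub();
        assert(fresh != 0);
        if (m_entryPoint.compare_exchange_strong(existing, fresh,
                                                 std::memory_order_acq_rel,
                                                 std::memory_order_acquire)) {
            return fresh;
        }
        ReleaseStub(fresh);  // lost the race; 'existing' holds the winner
        return existing;
    }

private:
    // Placeholder for real stub allocation: any unique non-zero address works.
    static PCODE AllocateStub() {
        return reinterpret_cast<PCODE>(new unsigned char[16]);
    }
    static void ReleaseStub(PCODE p) {
        delete[] reinterpret_cast<unsigned char*>(p);
    }

    std::atomic<PCODE> m_entryPoint{0};
};
```

The split between a non-allocating query and an allocating accessor is the point of the sketch: callers that only inspect state can stay on the first path and never force an allocation.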
9193

@@ -185,8 +187,6 @@ The target of the temporary entry point is a PreStub, which is a special kind of
185187

186188
The **stable entry point** is either the native code or the precode. The **native code** is either jitted code or code saved in NGen image. It is common to talk about jitted code when we actually mean native code.
187189

188-
Temporary entry points are never saved into NGen images. All entry points in NGen images are stable entry points that are never changed. It is an important optimization that reduced private working set.
189-
190190
![Figure 2](images/methoddesc-fig2.png)
191191

192192
Figure 2 Entry Point State Diagram
@@ -208,6 +208,7 @@ The methods to get callable entry points from MethodDesc are:
208208

209209
- `MethodDesc::GetSingleCallableAddrOfCode`
210210
- `MethodDesc::GetMultiCallableAddrOfCode`
211+
- `MethodDesc::TryGetMultiCallableAddrOfCode`
211212
- `MethodDesc::GetSingleCallableAddrOfVirtualizedCode`
212213
- `MethodDesc::GetMultiCallableAddrOfVirtualizedCode`
213214

@@ -220,7 +221,7 @@ The type of precode has to be cheaply computable from the instruction sequence.
220221

221222
**StubPrecode**
222223

223-
StubPrecode is the basic precode type. It loads MethodDesc into a scratch register and then jumps. It must be implemented for precodes to work. It is used as fallback when no other specialized precode type is available.
224+
StubPrecode is the basic precode type. It loads MethodDesc into a scratch register<sup>2</sup> and then jumps. It must be implemented for precodes to work. It is used as fallback when no other specialized precode type is available.
224225

225226
All other precodes types are optional optimizations that the platform specific files turn on via HAS\_XXX\_PRECODE defines.
226227

@@ -236,7 +237,7 @@ StubPrecode looks like this on x86:
236237

237238
FixupPrecode is used when the final target does not require MethodDesc in scratch register<sup>2</sup>. The FixupPrecode saves a few cycles by avoiding loading MethodDesc into the scratch register.
238239

239-
The most common usage of FixupPrecode is for method fixups in NGen images.
240+
Most stubs use this more efficient form. Currently it can be used for everything except interop methods, as long as a specialized form of Precode is not required.
240241

241242
The initial state of the FixupPrecode on x86:
242243

@@ -254,67 +255,6 @@ Once it has been patched to point to final target:
254255

255256
<sup>2</sup> Passing MethodDesc in scratch register is sometimes referred to as **MethodDesc Calling Convention**.
256257

257-
**FixupPrecode chunks**
258-
259-
FixupPrecode chunk is a space efficient representation of multiple FixupPrecodes. It mirrors the idea of MethodDescChunk by hoisting the similar MethodDesc pointers from multiple FixupPrecodes to a shared area.
260-
261-
The FixupPrecode chunk saves space and improves code density of the precodes. The code density improvement from FixupPrecode chunks resulted in 1% - 2% gain in big server scenarios on x64.
262-
263-
The FixupPrecode chunks looks like this on x86:
264-
265-
jmp Target2
266-
pop edi // dummy instruction that marks the type of the precode
267-
db MethodDescChunkIndex
268-
db 2 (PrecodeChunkIndex)
269-
270-
jmp Target1
271-
pop edi
272-
db MethodDescChunkIndex
273-
db 1 (PrecodeChunkIndex)
274-
275-
jmp Target0
276-
pop edi
277-
db MethodDescChunkIndex
278-
db 0 (PrecodeChunkIndex)
279-
280-
dw pMethodDescBase
281-
282-
One FixupPrecode chunk corresponds to one MethodDescChunk. There is no 1:1 mapping between the FixupPrecodes in the chunk and MethodDescs in MethodDescChunk though. Each FixupPrecode has index of the method it belongs to. It allows allocating the FixupPrecode in the chunk only for methods that need it.
283-
284-
**Compact entry points**
285-
286-
Compact entry point is a space efficient implementation of temporary entry points.
287-
288-
Temporary entry points implemented using StubPrecode or FixupPrecode can be patched to point to the actual code. Jitted code can call temporary entry point directly. The temporary entry point can be multicallable entry points in this case.
289-
290-
Compact entry points cannot be patched to point to the actual code. Jitted code cannot call them directly. They are trading off speed for size. Calls to these entry points are indirected via slots in a table (FuncPtrStubs) that are patched to point to the actual entry point eventually. A request for a multicallable entry point allocates a StubPrecode or FixupPrecode on demand in this case.
291-
292-
The raw speed difference is the cost of an indirect call for a compact entry point vs. the cost of one direct call and one direct jump on the given platform. The later used to be faster by a few percent in large server scenario since it can be predicted by the hardware better (2005). It is not always the case on current (2015) hardware.
293-
294-
The compact entry points have been historically implemented on x86 only. Their additional complexity, space vs. speed trade-off and hardware advancements made them unjustified on other platforms.
295-
296-
The compact entry point on x86 looks like this:
297-
298-
entrypoint0:
299-
mov al,0
300-
jmp short Dispatch
301-
302-
entrypoint1:
303-
mov al,1
304-
jmp short Dispatch
305-
306-
entrypoint2:
307-
mov al,2
308-
jmp short Dispatch
309-
310-
Dispatch:
311-
movzx eax,al
312-
shl eax, 3
313-
add eax, pBaseMD
314-
jmp PreStub
315-
316-
The allocation of temporary entry points always tries to pick the smallest temporary entry point from the available choices. For example, a single compact entry point is bigger than a single StubPrecode on x86. The StubPrecode will be preferred over the compact entry point in this case. The allocation of the precode for a stable entry point will try to reuse an allocated temporary entry point precode if one exists of the matching type.
317-
318258
**ThisPtrRetBufPrecode**
319259

320260
ThisPtrRetBufPrecode is used to switch a return buffer and the this pointer for open instance delegates returning valuetypes. It is used to convert the calling convention of MyValueType Bar(Foo x) to the calling convention of MyValueType Foo::Bar().

src/coreclr/debug/daccess/daccess.cpp

+42
@@ -3239,6 +3239,10 @@ ClrDataAccess::QueryInterface(THIS_
32393239
{
32403240
ifaceRet = static_cast<ISOSDacInterface14*>(this);
32413241
}
3242+
else if (IsEqualIID(interfaceId, __uuidof(ISOSDacInterface15)))
3243+
{
3244+
ifaceRet = static_cast<ISOSDacInterface15*>(this);
3245+
}
32423246
else
32433247
{
32443248
*iface = NULL;
@@ -8341,6 +8345,44 @@ HRESULT DacMemoryEnumerator::Next(unsigned int count, SOSMemoryRegion regions[],
83418345
return i < count ? S_FALSE : S_OK;
83428346
}
83438347

8348+
HRESULT DacMethodTableSlotEnumerator::Skip(unsigned int count)
8349+
{
8350+
mIteratorIndex += count;
8351+
return S_OK;
8352+
}
8353+
8354+
HRESULT DacMethodTableSlotEnumerator::Reset()
8355+
{
8356+
mIteratorIndex = 0;
8357+
return S_OK;
8358+
}
8359+
8360+
HRESULT DacMethodTableSlotEnumerator::GetCount(unsigned int* pCount)
8361+
{
8362+
if (!pCount)
8363+
return E_POINTER;
8364+
8365+
*pCount = mMethods.GetCount();
8366+
return S_OK;
8367+
}
8368+
8369+
HRESULT DacMethodTableSlotEnumerator::Next(unsigned int count, SOSMethodData methods[], unsigned int* pFetched)
8370+
{
8371+
if (!pFetched)
8372+
return E_POINTER;
8373+
8374+
if (!methods)
8375+
return E_POINTER;
8376+
8377+
unsigned int i = 0;
8378+
while (i < count && mIteratorIndex < mMethods.GetCount())
8379+
{
8380+
methods[i++] = mMethods.Get(mIteratorIndex++);
8381+
}
8382+
8383+
*pFetched = i;
8384+
return i < count ? S_FALSE : S_OK;
8385+
}
83448386

83458387
HRESULT DacGCBookkeepingEnumerator::Init()
83468388
{
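The `Next` method added in this file fills at most `count` entries and signals a short read with `S_FALSE`. A caller-side loop for that contract might look like the sketch below. It is a self-contained model, not the DAC code: `MethodData`, `SlotEnumerator`, `DrainAll`, and the `k...` result constants are hypothetical stand-ins for the COM types, and `kEPointer` is a placeholder negative value rather than the real `E_POINTER` code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Stand-ins for HRESULT values; kEPointer is a placeholder, not the real code.
using HResult = long;
constexpr HResult kSOk = 0;
constexpr HResult kSFalse = 1;
constexpr HResult kEPointer = -1;

struct MethodData {
    unsigned slot;
    std::uintptr_t entrypoint;  // 0 means no entry point allocated yet
};

// Simplified model of an enumerator with the same Next contract as the diff.
class SlotEnumerator {
public:
    explicit SlotEnumerator(std::vector<MethodData> methods)
        : methods_(std::move(methods)) {}

    HResult Next(unsigned count, MethodData out[], unsigned* fetched) {
        if (out == nullptr || fetched == nullptr) return kEPointer;
        unsigned i = 0;
        while (i < count && index_ < methods_.size()) {
            out[i++] = methods_[index_++];
        }
        *fetched = i;
        return i < count ? kSFalse : kSOk;  // short read means exhausted
    }

private:
    std::vector<MethodData> methods_;
    std::size_t index_ = 0;
};

// Drains the enumerator in fixed-size batches until a short read or an error.
inline std::vector<MethodData> DrainAll(SlotEnumerator& e, unsigned batch = 4) {
    assert(batch > 0);
    std::vector<MethodData> all;
    std::vector<MethodData> buffer(batch);
    for (;;) {
        unsigned fetched = 0;
        HResult hr = e.Next(batch, buffer.data(), &fetched);
        if (hr < 0) break;
        all.insert(all.end(), buffer.begin(), buffer.begin() + fetched);
        if (hr == kSFalse) break;
    }
    return all;
}
```

When the element count is an exact multiple of the batch size, the last full batch returns the success code and the following call returns zero items with the short-read code, so the loop still terminates.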

src/coreclr/debug/daccess/dacimpl.h

+28 -1
@@ -818,7 +818,8 @@ class ClrDataAccess
818818
public ISOSDacInterface11,
819819
public ISOSDacInterface12,
820820
public ISOSDacInterface13,
821-
public ISOSDacInterface14
821+
public ISOSDacInterface14,
822+
public ISOSDacInterface15
822823
{
823824
public:
824825
ClrDataAccess(ICorDebugDataTarget * pTarget, ICLRDataTarget * pLegacyTarget=0);
@@ -1223,6 +1224,9 @@ class ClrDataAccess
12231224
virtual HRESULT STDMETHODCALLTYPE GetThreadStaticBaseAddress(CLRDATA_ADDRESS methodTable, CLRDATA_ADDRESS thread, CLRDATA_ADDRESS *nonGCStaticsAddress, CLRDATA_ADDRESS *GCStaticsAddress);
12241225
virtual HRESULT STDMETHODCALLTYPE GetMethodTableInitializationFlags(CLRDATA_ADDRESS methodTable, MethodTableInitializationFlags *initializationStatus);
12251226

1227+
// ISOSDacInterface15
1228+
virtual HRESULT STDMETHODCALLTYPE GetMethodTableSlotEnumerator(CLRDATA_ADDRESS mt, ISOSMethodEnum **enumerator);
1229+
12261230
//
12271231
// ClrDataAccess.
12281232
//
@@ -1993,6 +1997,29 @@ class DacMemoryEnumerator : public DefaultCOMImpl<ISOSMemoryEnum, IID_ISOSMemory
19931997
unsigned int mIteratorIndex;
19941998
};
19951999

2000+
class DacMethodTableSlotEnumerator : public DefaultCOMImpl<ISOSMethodEnum, IID_ISOSMethodEnum>
2001+
{
2002+
public:
2003+
DacMethodTableSlotEnumerator() : mIteratorIndex(0)
2004+
{
2005+
}
2006+
2007+
virtual ~DacMethodTableSlotEnumerator() {}
2008+
2009+
HRESULT Init(PTR_MethodTable mTable);
2010+
2011+
HRESULT STDMETHODCALLTYPE Skip(unsigned int count);
2012+
HRESULT STDMETHODCALLTYPE Reset();
2013+
HRESULT STDMETHODCALLTYPE GetCount(unsigned int *pCount);
2014+
HRESULT STDMETHODCALLTYPE Next(unsigned int count, SOSMethodData methods[], unsigned int *pFetched);
2015+
2016+
protected:
2017+
DacReferenceList<SOSMethodData> mMethods;
2018+
2019+
private:
2020+
unsigned int mIteratorIndex;
2021+
};
2022+
19962023
class DacHandleTableMemoryEnumerator : public DacMemoryEnumerator
19972024
{
19982025
public:

src/coreclr/debug/daccess/request.cpp

+106 -7
@@ -214,11 +214,15 @@ BOOL DacValidateMD(PTR_MethodDesc pMD)
214214

215215
if (retval)
216216
{
217-
MethodDesc *pMDCheck = MethodDesc::GetMethodDescFromStubAddr(pMD->GetTemporaryEntryPoint(), TRUE);
218-
219-
if (PTR_HOST_TO_TADDR(pMD) != PTR_HOST_TO_TADDR(pMDCheck))
217+
PCODE tempEntryPoint = pMD->GetTemporaryEntryPointIfExists();
218+
if (tempEntryPoint != (PCODE)NULL)
220219
{
221-
retval = FALSE;
220+
MethodDesc *pMDCheck = MethodDesc::GetMethodDescFromStubAddr(tempEntryPoint, TRUE);
221+
222+
if (PTR_HOST_TO_TADDR(pMD) != PTR_HOST_TO_TADDR(pMDCheck))
223+
{
224+
retval = FALSE;
225+
}
222226
}
223227
}
224228

@@ -419,7 +423,11 @@ ClrDataAccess::GetMethodTableSlot(CLRDATA_ADDRESS mt, unsigned int slot, CLRDATA
419423
else if (slot < mTable->GetNumVtableSlots())
420424
{
421425
// Now get the slot:
422-
*value = mTable->GetRestoredSlot(slot);
426+
*value = mTable->GetSlot(slot);
427+
if (*value == 0)
428+
{
429+
hr = S_FALSE;
430+
}
423431
}
424432
else
425433
{
@@ -430,8 +438,16 @@ ClrDataAccess::GetMethodTableSlot(CLRDATA_ADDRESS mt, unsigned int slot, CLRDATA
430438
MethodDesc * pMD = it.GetMethodDesc();
431439
if (pMD->GetSlot() == slot)
432440
{
433-
*value = pMD->GetMethodEntryPoint();
434-
hr = S_OK;
441+
*value = pMD->GetMethodEntryPointIfExists();
442+
if (*value == 0)
443+
{
444+
hr = S_FALSE;
445+
}
446+
else
447+
{
448+
hr = S_OK;
449+
}
450+
break;
435451
}
436452
}
437453
}
@@ -440,6 +456,89 @@ ClrDataAccess::GetMethodTableSlot(CLRDATA_ADDRESS mt, unsigned int slot, CLRDATA
440456
return hr;
441457
}
442458

459+
HRESULT
460+
ClrDataAccess::GetMethodTableSlotEnumerator(CLRDATA_ADDRESS mt, ISOSMethodEnum **enumerator)
461+
{
462+
if (mt == 0 || enumerator == NULL)
463+
return E_INVALIDARG;
464+
465+
SOSDacEnter();
466+
467+
PTR_MethodTable mTable = PTR_MethodTable(TO_TADDR(mt));
468+
BOOL bIsFree = FALSE;
469+
if (!DacValidateMethodTable(mTable, bIsFree))
470+
{
471+
hr = E_INVALIDARG;
472+
}
473+
else
474+
{
475+
DacMethodTableSlotEnumerator *methodTableSlotEnumerator = new (nothrow) DacMethodTableSlotEnumerator();
476+
*enumerator = methodTableSlotEnumerator;
477+
if (*enumerator == NULL)
478+
{
479+
hr = E_OUTOFMEMORY;
480+
}
481+
else
482+
{
483+
hr = methodTableSlotEnumerator->Init(mTable);
484+
}
485+
}
486+
487+
SOSDacLeave();
488+
return hr;
489+
}
490+
491+
HRESULT DacMethodTableSlotEnumerator::Init(PTR_MethodTable mTable)
492+
{
493+
unsigned int slot = 0;
494+
495+
WORD numVtableSlots = mTable->GetNumVtableSlots();
496+
while (slot < numVtableSlots)
497+
{
498+
MethodDesc* pMD = mTable->GetMethodDescForSlot_NoThrow(slot);
499+
SOSMethodData methodData = {0};
500+
methodData.MethodDesc = HOST_CDADDR(pMD);
501+
methodData.Entrypoint = mTable->GetSlot(slot);
502+
methodData.DefininingMethodTable = PTR_CDADDR(pMD->GetMethodTable());
503+
methodData.DefiningModule = HOST_CDADDR(pMD->GetModule());
504+
methodData.Token = pMD->GetMemberDef();
505+
506+
methodData.Slot = slot++;
507+
508+
if (!mMethods.Add(methodData))
509+
return E_OUTOFMEMORY;
510+
}
511+
512+
MethodTable::IntroducedMethodIterator it(mTable);
513+
for (; it.IsValid(); it.Next())
514+
{
515+
MethodDesc* pMD = it.GetMethodDesc();
516+
WORD slot = pMD->GetSlot();
517+
if (slot >= numVtableSlots)
518+
{
519+
SOSMethodData methodData = {0};
520+
methodData.MethodDesc = HOST_CDADDR(pMD);
521+
methodData.Entrypoint = pMD->GetMethodEntryPointIfExists();
522+
methodData.DefininingMethodTable = PTR_CDADDR(pMD->GetMethodTable());
523+
methodData.DefiningModule = HOST_CDADDR(pMD->GetModule());
524+
methodData.Token = pMD->GetMemberDef();
525+
526+
if (slot == MethodTable::NO_SLOT)
527+
{
528+
methodData.Slot = 0xFFFFFFFF;
529+
}
530+
else
531+
{
532+
methodData.Slot = slot;
533+
}
534+
535+
if (!mMethods.Add(methodData))
536+
return E_OUTOFMEMORY;
537+
}
538+
}
539+
540+
return S_OK;
541+
}
443542

444543
HRESULT
445544
ClrDataAccess::GetCodeHeapList(CLRDATA_ADDRESS jitManager, unsigned int count, struct DacpJitCodeHeapInfo codeHeaps[], unsigned int *pNeeded)
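`DacMethodTableSlotEnumerator::Init` in the hunk above collects methods in two passes: every vtable slot first, then introduced methods whose slot lies outside the vtable, with a missing slot reported as `0xFFFFFFFF`. The following is a rough model of that ordering, assuming plain vectors in place of the runtime's data structures. `TypeModel`, `Method`, `Row`, and `CollectSlots` are hypothetical names, and the `0xFFFF` value used for the no-slot sentinel is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Assumed sentinel for "method has no slot"; a placeholder for this sketch.
constexpr std::uint16_t kNoSlot = 0xFFFF;
// Value reported to the consumer for such methods, as in the diff.
constexpr std::uint32_t kNoSlotReported = 0xFFFFFFFF;

struct Method {
    std::uint16_t slot;
    std::uintptr_t entryPointIfExists;  // 0 when nothing is allocated yet
};

struct Row {
    std::uint32_t slot;
    std::uintptr_t entrypoint;
};

// Hypothetical stand-in for a method table.
struct TypeModel {
    std::vector<std::uintptr_t> vtable;  // one entry per vtable slot
    std::vector<Method> introduced;      // methods introduced by this type
};

inline std::vector<Row> CollectSlots(const TypeModel& t) {
    assert(t.vtable.size() < kNoSlot);
    std::vector<Row> rows;

    // Pass 1: one row per vtable slot, in slot order. An entry of 0 is kept
    // as-is, since a lazily allocated entry point may not exist yet.
    for (std::uint32_t s = 0; s < t.vtable.size(); ++s) {
        rows.push_back({s, t.vtable[s]});
    }

    // Pass 2: introduced methods living outside the vtable. Methods already
    // covered by pass 1 are skipped so that no slot is reported twice.
    for (const Method& m : t.introduced) {
        if (m.slot >= t.vtable.size()) {
            std::uint32_t reported = (m.slot == kNoSlot) ? kNoSlotReported : m.slot;
            rows.push_back({reported, m.entryPointIfExists});
        }
    }
    return rows;
}
```

A consumer of such rows has to treat a zero entry point as "not allocated yet" rather than as an error, which matches the `S_FALSE` result that `GetMethodTableSlot` now returns for an empty slot.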

src/coreclr/inc/corinfo.h

+1 -1
@@ -894,7 +894,7 @@ enum CORINFO_ACCESS_FLAGS
894894
{
895895
CORINFO_ACCESS_ANY = 0x0000, // Normal access
896896
CORINFO_ACCESS_THIS = 0x0001, // Accessed via the this reference
897-
// UNUSED = 0x0002,
897+
CORINFO_ACCESS_PREFER_SLOT_OVER_TEMPORARY_ENTRYPOINT = 0x0002, // Prefer access to a method via slot over using the temporary entrypoint
898898

899899
CORINFO_ACCESS_NONNULL = 0x0004, // Instance is guaranteed non-null
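The new flag reuses the previously unused `0x0002` bit, so it can be combined with the neighbouring flags without colliding. The sketch below uses a local mirror of the four values visible in this hunk. The `ACCESS_` names, `kKnownAccessBits`, and both helper functions are illustrative and are not declared in `corinfo.h`.

```cpp
#include <cassert>

// Local mirror of the values shown in the hunk; the real enum is
// CORINFO_ACCESS_FLAGS in corinfo.h. Names here are illustrative only.
enum AccessFlags : unsigned {
    ACCESS_ANY = 0x0000,
    ACCESS_THIS = 0x0001,
    ACCESS_PREFER_SLOT_OVER_TEMPORARY_ENTRYPOINT = 0x0002,
    ACCESS_NONNULL = 0x0004,
};

constexpr unsigned kKnownAccessBits =
    ACCESS_THIS | ACCESS_PREFER_SLOT_OVER_TEMPORARY_ENTRYPOINT | ACCESS_NONNULL;

// True when the given flag bit is set in a combined flags value.
constexpr bool HasAccessFlag(unsigned flags, AccessFlags flag) {
    return (flags & static_cast<unsigned>(flag)) != 0;
}

// Adds the "prefer slot" request to an existing flags value, leaving the
// other bits untouched.
inline unsigned WithPreferSlot(unsigned flags) {
    assert((flags & ~kKnownAccessBits) == 0);  // only bits mirrored above
    return flags | ACCESS_PREFER_SLOT_OVER_TEMPORARY_ENTRYPOINT;
}
```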
900900

src/coreclr/inc/gfunc_list.h

-4
@@ -13,10 +13,6 @@
1313
DEFINE_DACGFN(DACNotifyCompilationFinished)
1414
DEFINE_DACGFN(ThePreStub)
1515

16-
#ifdef TARGET_ARM
17-
DEFINE_DACGFN(ThePreStubCompactARM)
18-
#endif
19-
2016
DEFINE_DACGFN(ThePreStubPatchLabel)
2117
#ifdef FEATURE_COMINTEROP
2218
DEFINE_DACGFN(Unknown_AddRef)
