Multiple 'System.OutOfMemoryException' errors in .NET 7 #78959
I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label. |
Tagging subscribers to this area: @dotnet/gc
|
Thanks for reporting this issue. This looks like it's separate from the original issue; we are investigating something similar with another customer. Would it be possible to share a dump when the OOM happens? |
Unfortunately not as this is running within a customer infrastructure and the dump would most probably contain confidential data. Is there an issue here on GitHub I can subscribe to? |
We don't have an issue yet, so we will use this one to provide updates. It's most likely something that is fixed in main and might need porting to 7: #77478 |
Hi @theolivenbaum, would it be possible for you to try out a private build to ensure the fix resolves your issue? Thanks |
That might be hard, as it would involve changing how we build our docker images. But fine to wait until this is backported to 7 - any idea on a timeline for the next service release? |
We're also seeing a lot of OOM exceptions since migrating to .NET 7 from .NET 5 (we're now testing .NET 6). In our case, we're running under Windows via Azure App Services. Reported memory usage is low - perhaps lower than what it was under .NET 5. The project in question loads large-ish files in memory. |
can you try if setting |
We'll try to find time to test this. |
@mangod9 We're seeing a similar issue with .NET7 on Setting |
it'd be helpful to see what !ao displays (it's an sos extension). would that be possible? that's always the 1st step if you have a dump. |
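For readers working with `dotnet-dump` rather than WinDbg, here is a minimal sketch of running the same analysis. It assumes `dotnet-dump` is installed and that `analyzeoom` is the `dotnet-dump` name for the SOS `!ao` command; the dump path is hypothetical.

```shell
# Hypothetical dump path; replace with the dump captured when the OOM happened.
DUMP_PATH="${DUMP_PATH:-./core_dump}"

if command -v dotnet-dump >/dev/null 2>&1 && [ -f "$DUMP_PATH" ]; then
  # Run analyzeoom (the SOS !ao command) non-interactively, then leave the session.
  dotnet-dump analyze "$DUMP_PATH" -c "analyzeoom" -c "exit"
else
  echo "dotnet-dump or the dump file at $DUMP_PATH is missing"
fi
```

The output reports the last OOM the GC recorded for an allocation request, which is what the maintainers ask for as a first step.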
ok good to know, yeah like Maoni suggests getting a dump or trace can help confirm whether it's the same issue. We hope to get it fixed in an upcoming servicing release. |
Hopefully I did this right. I followed this guide. Here is the output:
|
thanks, it does look similar to other cases we have seen. |
would it be possible to try out a private fix? we could deliver a libclrgc.so to you and you could use it the same way you used the shipped version. that would be really helpful. |
That would probably be possible! |
I have copied a private libcoreclr.so at https://1drv.ms/u/s!AtaveiZOervriJhkWC64gVEV8dAHug?e=IyBaP3, if you want to give that a try. You will want to remove the |
ok thanks for trying it out. We will do additional validation and add it to a .NET 7 servicing release (due to holidays might be in Feb). |
@mangod9 meanwhile what would you recommend? Keep using the version you shared or using some of the flags suggested above? |
you could keep using the private build if that works for your scenario. If you pick up a new servicing release it might not work, however. Using |
Thanks! Will keep that in mind then! Just out of curiosity, how come the COMPlus_GCName flag is a workaround? Does the runtime include two copies of the GC? |
in .NET 7 we have enabled new Regions functionality within the GC. Here are the details: #43844. Since this was a foundational change, we also shipped a separate GC which keeps the previous "Segments" functionality -- in case there are some issues like this one. Going forward, we do plan to use a similar mechanism to release newer GC changes and could have multiple GC implementations sometime in the future. |
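As a minimal sketch of the workaround discussed here: the `COMPlus_GCName` variable and the `libclrgc.so` name both appear in this thread, but the assumption that the library sits in the runtime directory next to `libcoreclr.so`, and the app name in the comment, are illustrative only.

```shell
# Select the separately shipped "Segments" GC instead of the default Regions GC.
# Assumption: libclrgc.so is present in the .NET 7 runtime directory.
export COMPlus_GCName=libclrgc.so

# Start the app from the same shell so it inherits the variable.
# "MyApp.dll" is a hypothetical name:
# dotnet MyApp.dll

echo "COMPlus_GCName=${COMPlus_GCName}"
```

In a container, the same variable would typically be set in the image or in the orchestrator's environment configuration rather than in an interactive shell.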
That makes a lot of sense and is what I imagined had happened! Looking forward then to the service release next year! |
👋 Hello! We recently upgraded a series of console apps/generic hosts (and 1 asp.net/webhost), running on alpine linux, from dotnet 6 to dotnet 7, and this issue (OOM while there's plenty of memory available) started happening on some of them, when under load. From what I can tell, it is not happening on the apps in which we have set the GC mode to server with I will try now to set |
I'm confused, if you are still using libclrgc and not getting OOM, and you want to know why it gets OOM without libclrgc, wouldn't you want to get rid of libclrgc and repro the OOM, and then do analysis there? the corresponding name of coreclr on linux would be libcoreclr.so. so if you want to look at this in windbg, you'd do libcoreclr instead of coreclr. |
Sorry for the confusion, let me try to clear it up. Using the older But we have another problem; we are still using
for a process with around 1.8 GB resident (same scenario as above). I also noticed (for one of our other processes in this scenario) that using workstation GC gave better GC performance (or at least the resident memory observed did not fluctuate to high values). This led me to wonder if you were actually seeing a problem common to both GCs. Not sure if this has been helpful in the end, however! Windbg is showing 0 for all those gc_heap values:
Sorry if this is unrelated/unhelpful, can open a different issue. I'll double check against .NET6, very possibly something we've caused here too. |
hi @dave-yotta, if your heap is actually growing, then it's a distinctly different issue from what I mentioned above. if you could open a new issue so we can track them better, that'd be great! would it be possible to capture a top level GC trace? that's the first step at diagnosing a memory problem. it's described here. it's very low overhead so you can keep it on for a long time. if this problem shows up pretty quickly you could start capturing right before the process is started and terminate tracing when it's exhibited the "memory not being released and the heap size is too large" behavior. if you cannot repro with libclrgc, that's most likely a problem in GC so we'd like to track this down with your help. thanks! |
hey @Maoni0, the (used) heap isn't growing, unmanaged memory is growing. not sure if that's actually the heap free space or something else - but there's a lot of GC time and we found a lot of allocations/deallocations totalling 12gb (but never exceeding about 300mb at any one point), will try reducing the memory traffic...and I'll run that gc-collect trace before I do. It'll take a while to get around to, though, sorry! :D |
no worries. whenever you get a chance, a gc-collect trace would be very helpful to us. |
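For reference, a top-level GC trace of the kind requested here can be sketched as follows. This assumes `dotnet-trace` is installed as a global tool; the process ID and output file name are hypothetical, and `gc-collect` is the profile name used in this thread.

```shell
# Hypothetical process ID and output file; adjust to the process under investigation.
TARGET_PID="${TARGET_PID:-1234}"
TRACE_FILE="${TRACE_FILE:-gc-collect.nettrace}"

if command -v dotnet-trace >/dev/null 2>&1; then
  # The gc-collect profile records GC collections only, at very low overhead.
  dotnet-trace collect --process-id "$TARGET_PID" --profile gc-collect --output "$TRACE_FILE"
else
  echo "dotnet-trace not found; install it with: dotnet tool install --global dotnet-trace"
fi
```

Tracing stops when interrupted (Ctrl+C), so it can be left running until the process shows the problematic memory behavior.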
@Maoni0 quick update, just tested the latest runtime without setting |
@theolivenbaum do you have a dump when it gets OOM that you could share? if there's privacy concerns, could you capture a top level GC trace so we can at least understand if "when starting" means "when starting and still in the initialization phase" or "after it's done some GCs"? |
@Maoni0 I'm having issues capturing a dump inside a container. I managed to install the dotnet tools, but gcdump gives incomplete results, and dump just fails with an error related to not running as the root user. Update: This is the error message from dotnet-dump: |
what about dotnet trace? |
How can I get a memory dump using dotnet-trace? |
you don't. you capture a GC trace -
|
@theolivenbaum can you make sure
|
@Maoni0 @hoyosjs good news: found the issue and it was not related to the .NET runtime. The memory allocator used by RocksDB by default on Linux can severely leak memory, and switching to Jemalloc fixed the issue on the server where we were observing the problem. Thanks again for the support and we can close the issue now! |
Has this really been resolved? We observed the same issue and have mitigated it since then using |
have you tried preview 5? if you are still seeing OOM without using libclrgc.so, is it possible to share a dump with us? |
Do you mean .NET 8 Preview 5? I'm talking about .NET 7. If this has been fixed in .NET 8, do we get a backport to .NET 7? |
yeah, .net 8 preview 7. if you cannot try it, could you share a dump from .net 7 but without using libclrgc.so? you may or may not be hitting the same issue that other people hit so there's no guarantee that even if we backported it would fix the issue you hit. you could also look at the symbols I mentioned above in a dump yourself. |
Also @NKnusperer might make sense to create a separate issue for it, since there could be different reasons for OOMs. |
Reopening - repro given in #78959 (comment) and derivatives from it (all 16MB allocations) are not all solved |
I'm seeing an issue very similar to this one when running a memory-heavy app on a linux container with memory limit >128GB RAM.
The app started throwing random OutOfMemoryException in many unexpected places since we migrated to net70, while under no memory pressure (usually with more than 30% free memory).
I can see the original issue was closed, but I'm not sure if it was fixed on the final net70 release or if the suggestion to set
COMPlus_GCRegionRange=10700000000
is the expected workaround.