Troubleshooting the ASP.net System.OutOfMemoryException
with DebugDiag v1.1
Outline
· Preliminary Steps (before the OOM exception occurs)
    · Configure a Perfmon Capture
· Wait
· Reactive Steps (after waiting for the OOM condition to be reached)
· Other Considerations
    · Isolation into Application Pools
This page is for troubleshooting cases where 'System.OutOfMemoryException' is reported by ASP.net in a client browser, in the event logs, or in a memory dump.
Unlike ASP.net 1.1, ASP.net 2.0 and 3.5 are very good about recording their exceptions in the System and Application event logs. The steps on this page are geared toward troubleshooting events like these:
Source: ASP.NET 2.0.50727.0
Event ID: 1334
An unhandled exception occurred and the process was terminated.
Exception: System.OutOfMemoryException
Message: Exception of type 'System.OutOfMemoryException' was thrown.

Source: ASP.NET 2.0.50727.0
Event ID: 1309
Event code: 3005
Event message: An unhandled exception has occurred.
Process name: w3wp.exe
Exception type: OutOfMemoryException
Exception message: Exception of type 'System.OutOfMemoryException' was thrown.

Source: .NET Runtime 2.0 Error Reporting
Event ID: 5000
EventType clr20r3, P1 w3wp.exe, P2 6.0.3790.3959, P3 45d6968e, P4 system, P5 2.0.0.0, P6 4889de7a, P7 2dc7, P8 16, P9 system.outofmemoryexception, P10 NIL.
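When sifting through many exported event log entries, it can help to pick out the Event ID 5000 records that mention the OOM exception. The following is a minimal sketch in Python, not part of DebugDiag or the event log tooling. The function names are hypothetical, and the assumption that the P9 field carries the exception type is taken only from the sample event above.

```python
def parse_bucket(text):
    """Split an 'EventType ..., P1 ..., P2 ...' string into a dict."""
    # Collapse line breaks and drop the trailing period of the record.
    flat = " ".join(text.split()).rstrip(".")
    fields = {}
    for part in flat.split(","):
        part = part.strip()
        if not part:
            continue
        key, _, value = part.partition(" ")
        fields[key] = value.strip()
    return fields


def is_oom_bucket(fields):
    # Assumption: P9 holds the exception type, as in the sample event.
    return fields.get("P9", "").lower() == "system.outofmemoryexception"


sample = (
    "EventType clr20r3, P1 w3wp.exe, P2 6.0.3790.3959, P3 45d6968e, "
    "P4 system, P5 2.0.0.0, P6 4889de7a, P7 2dc7, P8 16, "
    "P9 system.outofmemoryexception, P10 NIL."
)
fields = parse_bucket(sample)
```

Calling is_oom_bucket(fields) on the sample record reports whether it is one of the events this page is concerned with.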
Ensure that every web.config file has its debug setting set to false rather than true. Probably the most common cause of OOM conditions is leaving debug set to true on a high-traffic web server.
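On a server with many sites, checking each web.config by hand is tedious. The sketch below walks a directory tree and lists the web.config files whose compilation element has debug set to true. It is only an illustration: the function name is hypothetical, the root folder you pass in is your own choice, and it looks only at explicit debug attributes in the files it finds (it does not consider settings inherited from machine-level configuration).

```python
import os
import xml.etree.ElementTree as ET


def find_debug_enabled(root_dir):
    """Return paths of web.config files whose <compilation> has debug="true"."""
    hits = []
    for folder, _dirs, files in os.walk(root_dir):
        for name in files:
            if name.lower() != "web.config":
                continue
            path = os.path.join(folder, name)
            try:
                tree = ET.parse(path)
            except ET.ParseError:
                continue  # skip malformed files rather than guessing
            for node in tree.iter():
                # Strip any XML namespace before comparing the tag name.
                tag = node.tag.rsplit("}", 1)[-1]
                debug = node.get("debug", "").strip().lower()
                if tag == "compilation" and debug == "true":
                    hits.append(path)
                    break
    return sorted(hits)
```

Any path the function returns is a file worth opening and correcting before the next round of monitoring.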
Install DebugDiag v1.1 (or higher) by browsing to http://microsoft.com/downloads and searching All Downloads for DEBUGDIAG. (These instructions do not work with DebugDiag v1.0; v1.1 or higher is required.)
If it takes only a few minutes to reach the OOM condition, this step is not needed. If, however, it takes several hours or a few days to reach the OOM condition, I highly recommend replacing the SOS.dll on the server with an improved version. This prevents a possible memory leak in DebugDiag 1.1 that would ultimately kill the DebugDiag process and the IIS process to which it is attached.
To replace the SOS.dll (for use with a 32-bit IIS process and DebugDiag 1.1 x86):
A. If applicable, deactivate or remove any DebugDiag rules you may have already created.
B. Download the new sos.dll from http://viisual.net/tools/sosdll/SOS.dll and save it to the server(s) on which you need to monitor CLR exceptions.
C. On the affected server(s), open Windows Explorer and navigate to C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727. Rename the sos.dll in that folder to sos.dll.old and copy the new sos.dll into this folder.
D. Also in Windows Explorer, navigate to C:\WINDOWS\Microsoft.NET\Framework\v1.1.4322. Rename the sos.dll in that folder to sos.old and copy the new sos.dll into this folder.
E. Navigate to C:\Program Files\DebugDiag\Exts (or possibly C:\Program Files (x86)\DebugDiag\Exts) and copy the new sos.dll into this folder. For good measure, also copy the new sos.dll into the C:\Program Files\DebugDiag folder.
F. In the services console (Start > Run > Open: services.msc), stop and restart the Debug Diagnostic Service.
G. Create the crash rule using the steps below.
To replace the SOS.dll (for use with a 64-bit IIS process and DebugDiag 1.1 64-bit Beta):
A. If applicable, deactivate or remove any DebugDiag rules you may have already created.
B. Download the new sos.dll from http://viisual.net/tools/sosdll/AMD64/SOS.dll and save it to the server(s) on which you need to monitor CLR exceptions.
C. On the affected server(s), open Windows Explorer and navigate to C:\windows\Microsoft.NET\Framework64\v2.0.50727. Rename the sos.dll in that folder to sos.dll.old and copy the new sos.dll into this folder.
D. Also in Windows Explorer, navigate to C:\Program Files\DebugDiag\Exts and copy the new sos.dll into this folder. For good measure, also copy the new sos.dll into the C:\Program Files\DebugDiag folder.
E. In the services console (Start > Run > Open: services.msc), stop and restart the Debug Diagnostic Service.
F. Create the crash rule using the steps below.
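If the rename-and-copy has to be repeated on several servers, the file handling can be scripted. The following Python sketch is one hedged way to do it: the function name, the dry_run flag, and the backup_name parameter are all hypothetical, and it does not stop or restart the Debug Diagnostic Service for you. It also keeps a backup wherever a sos.dll already exists, which is slightly more cautious than the steps above for the DebugDiag folders.

```python
import os
import shutil


def replace_sos(new_sos, target_dir, backup_name="sos.dll.old", dry_run=True):
    """Back up target_dir/sos.dll as backup_name, then copy new_sos in.

    Returns the list of planned actions; nothing is changed when dry_run is True.
    """
    current = os.path.join(target_dir, "sos.dll")
    backup = os.path.join(target_dir, backup_name)
    actions = []
    if os.path.exists(current):
        if os.path.exists(backup):
            # Refuse to overwrite an earlier backup.
            raise FileExistsError(f"backup already exists: {backup}")
        actions.append(("rename", current, backup))
    actions.append(("copy", new_sos, current))
    if not dry_run:
        for verb, src, dst in actions:
            if verb == "rename":
                os.rename(src, dst)
            else:
                shutil.copy2(src, dst)
    return actions
```

Running it once with dry_run=True and reading the returned actions is a cheap way to confirm the paths before letting it touch the Framework folders.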
The following steps configure a DebugDiag crash rule so that a memory dump is triggered automatically at the moment the System.OutOfMemoryException is raised.
Launch Debugdiag from the Programs Menu
Select “Crash” as the rule type

Choose “All IIS/Com+ related processes”, “A specific process”, or “A Specific IIS Web Application Pool”. (If unsure, go with “All IIS/Com+ related processes.”)

Select the Exceptions button

Select the Add Exception button

Highlight the Exception Code E0434F4D in the left-side pane.

In the .NET Exception Type field, carefully type (or cut-and-paste):
System.OutOfMemoryException
If you mistype it, get the capitalization wrong, or leave a space at the beginning or end, the dumps will not be produced.
Set Action Type to: Full Userdump
Set Action Limit to: 1, 2, or 3 (we probably only need 1 dump)
Click the Save and Close button, then Next, then Next. Activate the rule whenever you wish to begin monitoring the IIS processes for the OOM condition.
As long as the rule is active, DebugDiag is monitoring the IIS process(es) and waiting for the next System.OutOfMemoryException to be thrown.
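Because a single stray space or wrong capital letter in the exception type silently prevents the dump, it can be worth checking the text before pasting it into the rule. The sketch below is a small Python helper for that check. It is not something DebugDiag provides, and the function name is hypothetical.

```python
EXPECTED = "System.OutOfMemoryException"


def check_exception_type(value):
    """Return a list of problems with the typed value; empty means it matches."""
    problems = []
    if value != value.strip():
        problems.append("leading or trailing whitespace")
    core = value.strip()
    if core != EXPECTED:
        if core.lower() == EXPECTED.lower():
            problems.append("wrong capitalization")
        else:
            problems.append("misspelled type name")
    return problems
```

An empty list means the string is exactly what the rule expects.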
1. Configure a Perfmon Capture
Before the problem occurs, set up perfmon captures on the web server(s). This can be done locally (a server’s own perfmon monitoring the same server) or remotely (one server’s perfmon monitoring another server). The following steps may not line up perfectly with all versions of Windows 2000, XP, 2003, Vista, 2008, and Windows 7, but they should be adequate to convey how to set it up.
Steps:
Click the Windows Start Button > Run > Open: Perfmon [Enter]
Expand “Performance Logs and Alerts”
Right-click on “Counter Logs”
Choose “New Log Settings…”
Enter a descriptive name (such as “OOM”)
Note the log file location for later (or go to the “Log Files” tab and change the location)
Click the “Add” button
Click the “All Counters” and “All Instances” radio buttons
Select the following from the “Performance Object” dropdown, being sure to “Add” each one as you select it:
· Every object that begins with “.NET” (such as .NET CLR Data, .NET CLR Exceptions, .NET CLR Interop, etc.)
· Every object that begins with “ASP.NET” (such as ASP.NET, ASP.NET Applications, etc.)
· Memory
· Process
· Processor
· Thread
· Web Service

Click “Close”
Click “OK”
Note: you may have to choose “Add Objects” instead. When an object is added, all counters for that object should presumably be included. This list of perfmon steps may need revision.
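The selection rule in the steps above (everything starting with “.NET” or “ASP.NET”, plus five named objects) can be written down compactly. The Python sketch below applies that rule to a list of object names. The function name is hypothetical, and the list you pass in is whatever your own server's “Performance Object” dropdown shows.

```python
# Objects named explicitly in the steps above.
EXACT = {"Memory", "Process", "Processor", "Thread", "Web Service"}
# Prefixes for the ".NET" and "ASP.NET" families of objects.
PREFIXES = (".NET", "ASP.NET")


def objects_to_add(available):
    """Pick the performance objects the steps above ask for, keeping input order."""
    return [name for name in available
            if name in EXACT or name.startswith(PREFIXES)]
```

Comparing the result with what was actually added to the counter log is a quick way to spot an object that was missed.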
Stop the perfmon capture by right-clicking it and selecting Stop. Leave it stopped until you’re ready to begin troubleshooting, then start the capture by right-clicking it and selecting Start.
We want the perfmon capture to be running long before the OOM condition is reached, but we also want to avoid perfmon .blg files larger than 1 gigabyte. So be cautious about when you start the perfmon capture. You may also want to stop and restart the capture every few hours.
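To keep an eye on the 1 gigabyte guideline while the capture runs, the log folder can be checked periodically. This is a minimal Python sketch: the function name is hypothetical, and log_dir is the log file location you noted when creating the counter log.

```python
import os

ONE_GB = 1024 ** 3


def oversized_logs(log_dir, limit_bytes=ONE_GB):
    """Return (filename, size) for each .blg file in log_dir above the limit."""
    found = []
    for name in sorted(os.listdir(log_dir)):
        path = os.path.join(log_dir, name)
        if name.lower().endswith(".blg") and os.path.isfile(path):
            size = os.path.getsize(path)
            if size > limit_bytes:
                found.append((name, size))
    return found
```

If it returns anything, that is a sign to stop and restart the capture so a fresh log file is started.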
Wait for the problem to occur.
The DebugDiag crash rule should be active while waiting.
The perfmon capture should be active while waiting (but don’t let it grow too big).
If you want to inject DebugDiag’s leak trackers, you may also do this while waiting.
Reactive Steps
When the OOM state is reached and problems are reported, log onto
the affected server and…
1. Check User Dump Count in DebugDiag
When the next System.OutOfMemoryException is thrown inside the selected process, DebugDiag should create a memory dump of the process(es) being monitored. In theory, a dump is triggered when ASP.net throws the first System.OutOfMemoryException. Assuming this works as planned, the userdump count in DebugDiag will increase from 0 to 1.
If for some reason the dumps are not produced automatically when the OOM condition begins, a good Plan B is to manually trigger a single set of hang dumps once the OOM condition has begun to hang the website (but before it crashes the IIS process).
Steps for Plan B:
a) Launch DebugDiag
b) Click Cancel if given the choice of making a Crash Rule or Hang Rule
c) Expand the Tools menu
d) Select “Create IIS/Com+ Hang Dump” and wait for dump creation to occur (may take 30 seconds or more depending on the size of the w3wp.exe)
Wait for the dumps to finish (this may take a few minutes)
Stop the Performance Monitor log that corresponds to the affected server. In Performance Monitor:
1. Right-click on your log that is now listed under "Counter Logs"
2. Choose "Stop Log"
3. Save it as a .blg file to the location of your choice (it should save automatically)
4. You can zip and upload this log later.
After DebugDiag has finished making its dumps, whether automated dumps from the crash rule or manually triggered dumps in the middle of a barrage of OOM exceptions, feel free to restart the w3wp.exe to recover from the OOM condition by (a) recycling the application pool, (b) running iisreset, or (c) rebooting the server.
If you kept the default settings, you can find the dump file(s) by clicking the manila folder icon.

You can load sos.dll into windbg.exe and try to interpret the dump yourself.
You can also run the two analysis scripts in DebugDiag’s Advanced Analysis tab against the dumps.
Or you can zip the dump, open a support case with Microsoft, and ask the ASP.net team to analyze the dump to reveal the root cause of the System.OutOfMemoryException. To collect and zip the dumps, event logs, IIS log, and .NET config files, expand the Tools menu in DebugDiag, select Advanced Data Collection, and select Create Full Cabinet File.

Other Considerations
Should leak tracking be injected? Sometimes yes and sometimes no. DebugDiag’s leak tracking feature is great for troubleshooting memory leaks in native code, but not for leaks in managed code. So if a memory leak in native code is what leads ASP.net to complain that it cannot find free blocks of contiguous memory, then, yes, we do want to inject leak tracking before the dump is triggered by a System.OutOfMemoryException. If, however, it is managed code that is occupying the memory space, we do not want to inject leak tracking, because leak tracking itself can use up a lot of memory.
Usually I prefer to start troubleshooting the System.OutOfMemoryException by getting
a System.OutOfMemoryException
dump without leak tracking. From that
dump I can judge whether or not the memory is being taken up by managed code
(no leak tracking needed) or by native code (leak tracking is a good
idea). If leak tracking is needed, we
go for a second dump.
If we decide that we need to “inject leak tracking” into a process, we still follow the same steps above. Then, while waiting for the System.OutOfMemoryException to be thrown, we inject leak tracking as follows.
Browse to the website in question so that the w3wp.exe for its AppPool is spawned. Put it under some load for 20 minutes or so before injecting leak tracking.
Switch to the Processes tab in DebugDiag.
Focus on the w3wp.exe processes. Scroll to the right on the Processes tab to see the Application Pool names. This way you can tell which w3wp.exe corresponds to which Application Pool.
Right-click each w3wp.exe process that we need to focus on, one at a time, and select "Monitor for leaks" from the gray menu.

Wait for ASP.net to begin reporting OOM exceptions and collect the dump(s) as usual.
One good question to ask in some cases is whether too many websites and/or web applications are assigned to the same application pool. If multiple webapps are assigned to the same IIS Application Pool, it may be good to consider isolating some of the main websites or webapps into their own new Application Pools. You might also want to isolate the websites/webapps you’re most suspicious of into their own application pools. But do this only after careful testing, as isolation can break dependencies or state that different web applications might be sharing. And, yes, it is possible to have too many application pools. Generally, however, ten or fifteen AppPools may be safe.