Cisco Systems UBR10012 Network Router User Manual


 
Cisco uBR10012 Universal Broadband Router Troubleshooting Guide
OL-1237-01
Chapter 4 Troubleshooting Line Cards
General Information for Troubleshooting Line Card Crashes
Step 2 If the results from the Output Interpreter indicate a hardware-related problem, try removing and
reinserting the hardware into the chassis. If this does not correct the problem, replace the DRAM chips
on the hardware. If the problem persists, replace the hardware.
Step 3 If the problem appears software-related, verify that you are running a released version of software, and
that this release of software supports all of the hardware that is installed in the router. If necessary,
upgrade the router to the latest version of software.
Tip The most effective way of using the Output Interpreter tool is to capture the output of the
show stacks and show tech-support commands and upload the output into the tool. If the
problem appears related to a line card, you can also try decoding the show context command.
Upgrading to the latest version of the Cisco IOS software applies the fixes for all known, already-resolved
bugs that can cause line card bus errors. If the crash persists after the upgrade, collect the relevant
information from the troubleshooting steps above, along with any information about recent network changes,
and contact Cisco TAC.
Software-Forced Crashes
Software-forced crashes (SIG type is 23) occur when the Cisco IOS software encounters a problem with
the line card and determines that it can no longer continue, so it forces the line card to crash. The original
problem could be either hardware-based or software-based.
The most common reason for a software-forced crash on a line card is a “Fabric Ping Timeout,” which
occurs when the PRE-1 module sends five keepalive messages (fabric pings) to the line card and does
not receive a reply. If this occurs, you should see error messages similar to the following in the router’s
console log:
%GRP-3-FABRIC_UNI: Unicast send timed out (4)
%GRP-3-COREDUMP: Core dump incident on slot 4, error: Fabric ping failure
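When a saved console log is long, it can help to search it for these messages rather than read it by eye. The following Python sketch is an illustration only: it assumes the log has been saved as plain text and that the message wording matches the sample above exactly, which may not hold for every software release. The function name fabric_ping_failure_slots is hypothetical.

```python
import re

# Pattern taken from the sample message above; other releases may word it differently.
COREDUMP_RE = re.compile(
    r"%GRP-3-COREDUMP: Core dump incident on slot (\d+), error: Fabric ping failure"
)


def fabric_ping_failure_slots(log_text):
    """Return the sorted, de-duplicated slot numbers that reported a fabric ping failure."""
    return sorted({int(m.group(1)) for m in COREDUMP_RE.finditer(log_text)})


# The two sample lines shown above, used here as test input.
sample_log = """\
%GRP-3-FABRIC_UNI: Unicast send timed out (4)
%GRP-3-COREDUMP: Core dump incident on slot 4, error: Fabric ping failure
"""

print(fabric_ping_failure_slots(sample_log))
```

The slot numbers returned identify which line cards to examine first.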
Fabric ping timeouts are usually caused by one of the following problems:
High CPU Utilization—Either the PRE-1 module or line card is experiencing high CPU utilization.
The PRE-1 module or line card could be so busy that either the ping request or ping reply message
was dropped. Use the show processes cpu command to determine whether CPU usage is
exceptionally high (at 95 percent or more). If so, see the “High CPU Utilization Problems” section
on page 3-9 for information on troubleshooting the problem.
CEF-Related Problems—If the crash is accompanied by system messages that begin with “%FIB,”
it could indicate a problem with Cisco Express Forwarding (CEF) on one of the line card’s
interfaces. For more information, see Troubleshooting CEF-Related Error Messages, at the
following URL:
http://www.cisco.com/en/US/products/hw/routers/ps359/products_tech_note09186a0080110d68.shtml
IPC Timeout—The InterProcess Communication (IPC) message that carried the original ping
request or the ping reply was lost. This could be caused by a software bug that is disabling interrupts
for an excessive period of time, high CPU usage on the PRE-1 module, or by excessive traffic on the
line card that is filling up all available IPC buffers.
If the router is not running the most current Cisco IOS software, upgrade the router to the latest
software release, so that any known IPC bugs are fixed. If the show processes cpu command shows that CPU
usage is exceptionally high (at 95 percent or more), or if traffic on the line card is excessive, see the
“High CPU Utilization Problems” section on page 3-9.
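The first two checks above (CPU usage at 95 percent or more, and system messages containing “%FIB”) can be roughed out in a short Python sketch run against captured output. This is a minimal illustration, not a Cisco tool. It assumes the show processes cpu output begins with a summary line of the form "CPU utilization for five seconds: X%/Y%; one minute: Z%; five minutes: W%", which should be confirmed against your own release. The function names and the sample message text are hypothetical.

```python
import re

CPU_THRESHOLD = 95  # percent; the "exceptionally high" level named in the text

# Assumed summary-line format; confirm it against output from your own release.
CPU_LINE_RE = re.compile(
    r"CPU utilization for five seconds: (\d+)%/(\d+)%; "
    r"one minute: (\d+)%; five minutes: (\d+)%"
)


def parse_cpu_summary(show_output):
    """Return the utilization percentages from the summary line, or None if absent."""
    match = CPU_LINE_RE.search(show_output)
    if match is None:
        return None
    five_sec, interrupt, one_min, five_min = (int(v) for v in match.groups())
    return {"five_sec": five_sec, "interrupt": interrupt,
            "one_min": one_min, "five_min": five_min}


def triage_hints(show_output, log_text):
    """Flag high CPU usage and the presence of %FIB messages in the captured text."""
    cpu = parse_cpu_summary(show_output)
    high_cpu = cpu is not None and max(
        cpu["five_sec"], cpu["one_min"], cpu["five_min"]) >= CPU_THRESHOLD
    cef_messages = any("%FIB" in line for line in log_text.splitlines())
    return {"high_cpu": high_cpu, "cef_messages": cef_messages}


# Hypothetical inputs for illustration only.
sample_cpu = ("CPU utilization for five seconds: 97%/45%; "
              "one minute: 96%; five minutes: 90%")
sample_log = "%FIB-3-EXAMPLE: placeholder message text"

print(triage_hints(sample_cpu, sample_log))
```

If neither flag is set, the IPC timeout causes described above remain to be investigated. The sketch does not attempt to detect them.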