This section lists the information that should be captured to monitor the state of the CSM at the device level.
The CSM-specific counters are included in two dedicated MIBs:
| MIB | OID | Full Path |
|---|---|---|
| CISCO-SLB-MIB | 1.3.6.1.4.1.9.9.161 | iso(1).org(3).dod(6).internet(1).private(4).enterprises(1).cisco(9).ciscoMgmt(9).ciscoSlbMIB(161) |
| CISCO-SLB-EXT-MIB | 1.3.6.1.4.1.9.9.254 | iso(1).org(3).dod(6).internet(1).private(4).enterprises(1).cisco(9).ciscoMgmt(9).ciscoSlbExtMIB(254) |
Fault tolerance state of the CSM can be monitored via SNMP.
MIB: CISCO-SLB-EXT-MIB:cslbxFtState.9
OID: 1.3.6.1.4.1.9.9.254.1.10.1.1.7.9
Possible Return Values:
notConfigFT(1), initializingFT(2), activeFT(3), standbyFT(4)
Description:
'notConfigFT' : Was not configured with FT.
'initializingFT' : Initializing Fault Tolerance.
'activeFT' : Active FT peer.
'standbyFT' : Standby FT peer.
The CSM has five IXP processors and one PowerPC. CPU utilization of the IXPs is not available via SNMP, so a CLI show command must be used to collect it.
CLI Command:
show mod con X tech-support utilization | include IXP
Note: X is the slot where CSM is installed
Sample Output:
show mod con 9 tech-support utilization | include IXP
IXP Engines
IXP1 0% - This is the Session Processor
IXP2 0% - This is the TCP Processor
IXP3 0% - This is the L7 Processor
IXP4 0% - This is the LB Processor
IXP5 0% - This is the NAT Processor
Similarly, memory utilization on the CSM is not available via SNMP, so a CLI show command must be used to collect it.
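Because these values are only available from the CLI, a collector has to scrape the command output. The following Python sketch parses text shaped like the sample above. The function name `parse_ixp_utilization` is hypothetical, and the sketch assumes every processor line has the form `IXPn <percent>% ...` as in the sample. How the text is retrieved from the switch is left out.

```python
import re

def parse_ixp_utilization(output: str) -> dict[str, int]:
    """Map each IXP name to its utilization percentage (hypothetical helper)."""
    usage = {}
    # Match "IXP1 0%" style entries; the "IXP Engines" header has no digit and is skipped.
    for name, percent in re.findall(r"\b(IXP\d+)\s+(\d+)%", output):
        usage[name] = int(percent)
    return usage

sample = """IXP Engines
IXP1 0% - This is the Session Processor
IXP2 0% - This is the TCP Processor
IXP3 0% - This is the L7 Processor
IXP4 0% - This is the LB Processor
IXP5 0% - This is the NAT Processor"""

print(parse_ixp_utilization(sample))
```

A monitoring script could compare each value against a locally chosen threshold and raise an alert per processor.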
CLI Command:
show mod con X tech-support utilization | include Memory
Note: X is the slot where CSM is installed
Sample Output:
sh mod con 9 tech-support utilization | in Memory
Memory
Availible Memory 71% 179M
Allocated Memory 20% 51M
OS Static Memory 9% 24M
This information can be obtained using CLI commands as well.
To calculate the bandwidth utilization of the CSM, the utilization of the port-channel between the CSM and MSFC and the GIG interfaces between the CSM and MSFC need to be monitored. There are no CLI commands on the CSM itself that indicate the bandwidth utilization, but there are CLI commands that indicate if the CSM is exceeding its throughput capabilities.
CLI Command:
show int port-channel 2xx | include rate
Note: 2xx is 256 plus the slot number where the CSM is installed (for example, 265 for slot 9).
Sample Output:
show interfaces port-channel 265 | include rate
5 minute input rate 2000 bits/sec, 3 packets/sec
5 minute output rate 44000 bits/sec, 76 packets/sec
CLI Command:
show int gig slot/port | include rate
Note: slot is where the CSM is installed, and port range is 1-4.
Sample Output:
show interfaces gigabitEthernet 9/1 | include rate
5 minute input rate 0 bits/sec, 0 packets/sec
5 minute output rate 38000 bits/sec, 67 packets/sec
The following additional CLI commands on the MSFC and CSM indicate either an error condition or a throughput limitation.
The following two MSFC commands show any errors reported on the port-channel and the individual interfaces.
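The port-channel numbering and the rate lines above lend themselves to a small helper. This Python sketch derives the port-channel number from the slot (256 plus the slot number, per the note above) and extracts the 5-minute rates from the command output. The function names are hypothetical, and the line format is assumed to match the samples.

```python
import re

def csm_port_channel(slot: int) -> int:
    """Port-channel number for a CSM slot: 256 + slot, per the note above."""
    return 256 + slot

def parse_rates(output: str) -> dict[str, int]:
    """Extract the 5-minute input/output rates in bits/sec (hypothetical helper)."""
    rates = {}
    pattern = r"5 minute (input|output) rate (\d+) bits/sec"
    for direction, bps in re.findall(pattern, output):
        rates[direction] = int(bps)
    return rates

sample = """5 minute input rate 2000 bits/sec, 3 packets/sec
5 minute output rate 44000 bits/sec, 76 packets/sec"""

print(csm_port_channel(9))
print(parse_rates(sample))
```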
CLI Command:
show int port-channel 2xx | include error
Note: 2xx is 256 plus the slot number where the CSM is installed (for example, 265 for slot 9).
Sample Output:
show interfaces port-channel 265 | include error
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
0 output errors, 0 collisions, 0 interface resets
CLI Command:
show int gig slot/port | include error
Note: slot is where the CSM is installed, and port range is 1-4.
Sample Output:
show interfaces gig 9/1 | include error
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
0 output errors, 0 collisions, 8 interface resets
The following CSM commands indicate whether the CSM is experiencing a capacity issue.
CLI Command:
show mod con X stats | include Overflow
Note: X is the slot where CSM is installed
Sample Output:
show mod c 9 stats | include Overflow
Overflow Errors: 0, CRC Errors: 0
A non-zero value for the Overflow Errors indicates that CSM is dropping packets.
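The non-zero check described above is easy to automate. The Python sketch below pulls the Overflow Errors counter out of the stats line and reports whether drops are indicated. The helper name `overflow_errors` is hypothetical, and the `Overflow Errors: <n>` format is assumed from the sample output.

```python
import re

def overflow_errors(output: str) -> int:
    """Return the 'Overflow Errors' counter from the stats output (hypothetical helper)."""
    match = re.search(r"Overflow Errors:\s*(\d+)", output)
    if match is None:
        raise ValueError("no 'Overflow Errors' counter found in output")
    return int(match.group(1))

sample = "Overflow Errors: 0, CRC Errors: 0"
count = overflow_errors(sample)
# Any non-zero value means the CSM is dropping packets.
print("CSM is dropping packets" if count else "no overflow drops reported")
```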
CLI Command:
show mod con X tech-support processor 5 | include TX
Note: X is the slot where CSM is installed
Sample Output:
show mod con 9 tech-support processor 5 | include TX
TX FIFO Full 0 0
TX Window Full 0 0
A non-zero value for the above counters indicates that the NAT processor is unable to transmit the traffic to the transmit queue.
CLI Command:
show mod con X tech-support processor 2 | include Overflow
Note: X is the slot where CSM is installed
Sample Output:
show mod con 9 tech-support processor 2 | include Overflow
Session Queue Overflow 0 0
Control->Term Queue Overflow 0 0
t_fifo Overflow 0 0
‘Session Reused while valid’ counter refers to the SRAM queue that exists between Session processor and TCP processor. When this increments, it means that the queue has been overrun.
The ‘t_fifo Overflow’ counter refers to the TCP processor transmit queue overflowing. As a result, the TCP processor is unable to send or forward packets.
Connections per second (CPS) on the CSM cannot be obtained via SNMP or from a single CLI counter. Instead, the CPS is calculated from the values of two counters in the output of one CLI command, using the formula below.
CLI Command:
show mod c X tech-support processor 1
Note: X is the slot where CSM is installed.
Formula for CPS:
cps ~= (Packets New Sessions value in the right column) / (0.81 x Current time value in the right column)
Sample Output:
show mod c 3 tech-support processor 1
--------------------------------------------------------------
--------------------- SESSION Statistics ---------------------
--------------------------------------------------------------
Current time 49156 365
Aborted rx 2680373453 506291019
Total Packets rx 1423543303 21213140
Packets Dropped 2596 0
Packets Drop Stale Connection 12761723 215779
Packets Drop No More Sessions 0 0
Packets Drop No VLAN 42401 315
Packets Drop Bad Checksum 0 0
Packets Drop IP Fragments 5284 92
Packets Drop SI with no SMAC 25070 0
Packets Drop: SI, Route Mode, no DMAC 0 0
Packets Drop: Not IP, SNAP 0 0
Packets Drop: Zero L3 offset 0 0
Packets Drop: vlan/vs Force Drop 2463997 20159
Packets Drop: Slowpath limit exceeded 0 0
Packets Drop: LP non-ip, non-arp 0 0
Packets Drop: TCP/UDP with zero port 2144 63
Packets Drop: CDP 890357 6614
Packets Spanning Tree DMAC 0 0
Packets Repeat: Slowpath limit exceeded 0 0
Packets Slowpath 1468435 10983
Packets High Priority 350090 2607
Packets Session Hit 1244543546 18762672
Packets New Sessions 88736246 1376523
So, for the output above:
cps ~= 1376523 / (0.81 * 365) ~= 4656 connections per second
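The calculation can be scripted so that it does not have to be done by hand for every poll. The Python sketch below reads the right-hand column of the two counters and applies the formula above. The helper names are hypothetical, and the parsing assumes each counter line ends with exactly two integer columns, as in the sample output.

```python
import re

def counter_columns(output: str, label: str) -> tuple[int, int]:
    """Return the (left, right) column values for a counter line (hypothetical helper)."""
    match = re.search(rf"^\s*{re.escape(label)}\s+(\d+)\s+(\d+)\s*$", output, re.MULTILINE)
    if match is None:
        raise ValueError(f"counter {label!r} not found")
    return int(match.group(1)), int(match.group(2))

def connections_per_second(output: str) -> float:
    """Apply cps ~= new_sessions / (0.81 * current_time) to the right-hand column."""
    _, new_sessions = counter_columns(output, "Packets New Sessions")
    _, current_time = counter_columns(output, "Current time")
    return new_sessions / (0.81 * current_time)

# Three lines taken from the sample output above.
sample = """Current time 49156 365
Packets Session Hit 1244543546 18762672
Packets New Sessions 88736246 1376523"""

print(round(connections_per_second(sample)))  # 4656
```

A production version would also need to guard against a Current time value of zero in the right column.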
This section includes the information that should be monitored for the real servers (reals) on the CSM.
Information about number of current connections per real can be obtained via SNMP.
MIB: CISCO-SLB-MIB:slbRealNumberOfConnections
OID: 1.3.6.1.4.1.9.9.161.1.3.1.1.5
Possible Return Values:
An integer value indicating the number of TCP and UDP connections currently assigned to this real server.
Information about the total number of failed connections per real can be obtained via SNMP.
MIB: CISCO-SLB-MIB:slbRealTotalFails
OID: 1.3.6.1.4.1.9.9.161.1.3.1.1.16
Possible Return Values:
An integer value indicating the total number of times this real server has failed since the creation of this row.
Information about the real server state can be obtained via SNMP.
MIB: CISCO-SLB-MIB:slbRealState
OID: 1.3.6.1.4.1.9.9.161.1.3.1.1.4
Possible Return Values:
outOfService(1), inService(2), failed(3), readyToTest(4), testing(5), maxConnsThrottle(6), maxClientsThrottle(7), dfpThrottle(8), probeFailed(9), probeTesting(10), operWait(11), testWait(12), inbandProbeFailed(13), returnCodeFailed(14)
Description:
'outOfService' : Server is not in use by SLB as a destination for client connections. This state can be written and read.
'inService' : Server is in use as a destination for SLB client connections. This state can be written and read.
'failed' : Server has failed and will not be retried for retry timer seconds. This state can only be read.
'readyToTest' : Server has failed and has an expired retry timer; test connections will begin to flow to it soon. This state can only be read.
'testing' : Server has failed and been given another test connection; the success of this connection is not known yet. This state can only be read.
'maxConnsThrottle' : Server has reached its maximum number of connections and is no longer being given connections. This state can only be read.
'maxClientsThrottle' : Server has reached the maximum allowed clients. This state can only be read.
'dfpThrottle' : DFP has lowered the weight of this server to throttle level, so that no new connections will be assigned to it until DFP raises its weight. This state can only be read.
'probeFailed' : SLB probe to this server has failed. No new connections will be assigned to it until a probe to this server succeeds. This state can only be read.
'probeTesting' : Server has received a test probe from SLB. This state can only be read.
'operWait' : Server is ready to go operational, but is waiting for the associated redirect virtual to be inservice. This state can only be read.
'testWait' : Server is ready to be tested. This state is applicable only when the server is used for HTTP redirect load balancing. This state can only be read.
'inbandProbeFailed' : Server has failed the inband Health Probe agent. This state can only be read.
'returnCodeFailed' : Server has been disabled because it returned an HTTP code that matched a configured value. This state can only be read.
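A poller usually needs to turn the integer returned for slbRealState back into a name. The Python sketch below encodes the value list above and also records which states are writable according to the descriptions. The names `REAL_STATES`, `WRITABLE_STATES`, and `describe_real_state` are illustrative, not part of any Cisco tool.

```python
# slbRealState values as listed above.
REAL_STATES = {
    1: "outOfService", 2: "inService", 3: "failed", 4: "readyToTest",
    5: "testing", 6: "maxConnsThrottle", 7: "maxClientsThrottle",
    8: "dfpThrottle", 9: "probeFailed", 10: "probeTesting",
    11: "operWait", 12: "testWait", 13: "inbandProbeFailed",
    14: "returnCodeFailed",
}

# Only these two states can be written as well as read, per the descriptions above.
WRITABLE_STATES = {"outOfService", "inService"}

def describe_real_state(value: int) -> str:
    """Return the state name and access mode for an SNMP integer (hypothetical helper)."""
    name = REAL_STATES.get(value)
    if name is None:
        return f"unknown({value})"
    access = "read-write" if name in WRITABLE_STATES else "read-only"
    return f"{name} ({access})"

print(describe_real_state(9))
print(describe_real_state(2))
```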
This section includes the information that should be monitored for the VSERVERs on the CSM.
The number of current connections per vserver can be obtained via SNMP.
MIB: CISCO-SLB-MIB:slbVirtualNumberOfConnections
OID: 1.3.6.1.4.1.9.9.161.1.4.1.1.17
Possible Return Values:
An integer value is returned indicating the number of currently assigned connections being handled by this virtual server.
The information about the vserver state can be obtained via SNMP.
MIB: CISCO-SLB-MIB:slbVirtualServerState
OID: 1.3.6.1.4.1.9.9.161.1.4.1.1.2
Possible Return Values:
outOfService(1), inService(2), standby(3), inOperReal(4), stbInOperReal(5), testReal(6), outOfMemory(7)
Description:
'outOfService' : Virtual server is not active and is not affecting client traffic in any way.
'inService' : Virtual server is active and is load-balancing matching client traffic to available real servers.
'standby' : Virtual server is a backup for a virtual server on another SLB device, and is currently inactive.
'inOperReal' : Real server associated with this redirect virtual server is not operational. This state can only be read.
'stbInOperReal' : Real server associated with this redirect virtual server is not operational, and this virtual server is in standby state. This state can only be read.
'testReal' : This is a redirect virtual server and the real server associated with it is being tested. This state can only be read.
'outOfMemory' : Virtual server is not enabled because it does not have enough memory to hold the configured matching policy information. This state can only be read.
This section includes the information that should be monitored for total number of current connections and connections rejected by the CSM.
Total number of current connections can be monitored via SNMP on the CSM.
MIB: CISCO-SLB-EXT-MIB:cslbxStatsCurrConnections
OID: 1.3.6.1.4.1.9.9.254.1.1.1.1.3
Possible Return Values:
An integer value is returned indicating the number of connections currently still open.
Information about connections rejected by the CSM for various reasons can be monitored via SNMP.
For Failed Connections:
MIB: CISCO-SLB-EXT-MIB:cslbxStatsFailedConns
OID: 1.3.6.1.4.1.9.9.254.1.1.1.1.5
Possible Return Values:
An integer value is returned indicating the number of connections that were load balanced to real servers that then failed to respond due to a timeout or TCP RST.
For Rejected Connections:
MIB: CISCO-SLB-EXT-MIB:cslbxStatsDroppedL4PolicyConns
OID: 1.3.6.1.4.1.9.9.254.1.1.1.1.9
Possible Return Values:
An integer value is returned indicating the number of connections dropped by virtual servers with only layer 4 configuration.
MIB: CISCO-SLB-EXT-MIB:cslbxStatsDroppedL7PolicyConns
OID: 1.3.6.1.4.1.9.9.254.1.1.1.1.10
Possible Return Values:
An integer value is returned indicating the number of connections dropped by virtual servers with some layer 7 policy.
MIB: CISCO-SLB-EXT-MIB:cslbxStatsMaxParseLenRejects
OID: 1.3.6.1.4.1.9.9.254.1.1.1.1.18
Possible Return Values:
An integer value is returned indicating the number of connections rejected because the length of an HTTP request or response header exceeded the maximum L7 parse length configured for the matching virtual server.
MIB: CISCO-SLB-EXT-MIB:cslbxStatsNoActiveServerRejects
OID: 1.3.6.1.4.1.9.9.254.1.1.1.1.16
Possible Return Values:
An integer value is returned indicating the number of connections rejected because the chosen server farm did not have any active servers.
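For alerting, the increase in these counters between two polls is usually more useful than the raw values. The Python sketch below collects the column OIDs quoted in this section and computes per-counter deltas. It assumes the objects are cumulative 32-bit counters that wrap to zero, which should be checked against the MIB definition. The function name and the poll values are made up for illustration, and the SNMP retrieval itself is not shown.

```python
# Column OIDs quoted in this section (CISCO-SLB-EXT-MIB).
REJECT_COUNTERS = {
    "cslbxStatsFailedConns": "1.3.6.1.4.1.9.9.254.1.1.1.1.5",
    "cslbxStatsDroppedL4PolicyConns": "1.3.6.1.4.1.9.9.254.1.1.1.1.9",
    "cslbxStatsDroppedL7PolicyConns": "1.3.6.1.4.1.9.9.254.1.1.1.1.10",
    "cslbxStatsNoActiveServerRejects": "1.3.6.1.4.1.9.9.254.1.1.1.1.16",
    "cslbxStatsMaxParseLenRejects": "1.3.6.1.4.1.9.9.254.1.1.1.1.18",
}

COUNTER_MODULUS = 2 ** 32  # assumption: 32-bit counters that wrap to zero

def counter_deltas(previous: dict[str, int], current: dict[str, int]) -> dict[str, int]:
    """Increase of each counter between two polls, allowing for a single wrap."""
    return {
        name: (current[name] - previous[name]) % COUNTER_MODULUS
        for name in REJECT_COUNTERS
        if name in previous and name in current
    }

# Made-up poll results, for illustration only.
earlier = {"cslbxStatsFailedConns": 4294967290, "cslbxStatsDroppedL4PolicyConns": 10}
later = {"cslbxStatsFailedConns": 4, "cslbxStatsDroppedL4PolicyConns": 25}
print(counter_deltas(earlier, later))
```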
The CSM can send SNMP TRAPs in software version 3.1.3 and above. The CSM sends traps to a host for the following categories:
The CSM sends an SNMP TRAP for the real server state change when:
A real server goes to Probe_Failed and then becomes OPERATIONAL. Depending upon the SNMP TRAP tool, the output will look similar to the following:
.iso.org.dod.internet.private.enterprises.9.9.161.1.3.1.1.4.9: STRING: SLB-NETMGT: TCP health probe failed for server 77.77.77.142: 23 in serverfarm 'GROUP-A-TCP':
OR
.iso.org.dod.internet.private.enterprises.9.9.161.1.3.1.1.4.9.11.71.82.79.85.80.45.65.45.84.67.80.77.77.77.142.23: INTEGER: 9:
.iso.org.dod.internet.private.enterprises.9.9.161.1.3.1.1.4.9: STRING: SLB-NETMGT: TCP health probe re-activated server 77.77.77.142: 23 in serverfarm 'GROUP-A-TCP':
OR
.iso.org.dod.internet.private.enterprises.9.9.161.1.3.1.1.4.9.11.71.82.79.85.80.45.65.45.84.67.80.77.77.77.142.23: INTEGER: 2
A Real Server is manually changed to ‘no inservice’, and then manually changed to ‘inservice’. Depending upon the SNMP TRAP tool, a sample output will look like the following:
.iso.org.dod.internet.private.enterprises.9.9.161.1.3.1.1.4.9: STRING: SLB-NETMGT: Configured server 77.77.77.142: 23 to OUT-OF-SERVICE in serverfarm 'GROUP-A-TCP':
OR
.iso.org.dod.internet.private.enterprises.9.9.161.1.3.1.1.4.9.11.71.82.79.85.80.45.65.45.84.67.80.77.77.77.142.23: INTEGER: 1:
.iso.org.dod.internet.private.enterprises.9.9.161.1.3.1.1.4.9: STRING: SLB-NETMGT: Configured server 77.77.77.142: 23 to INSERVICE in serverfarm 'GROUP-A-TCP'
OR
.iso.org.dod.internet.private.enterprises.9.9.161.1.3.1.1.4.9.11.71.82.79.85.80.45.65.45.84.67.80.77.77.77.142.23: INTEGER: 2:
The CSM sends an SNMP TRAP for a vserver state change only when:
A vserver is manually changed to ‘no inservice’. Depending upon the SNMP TRAP tool, a sample output will look like the following:
.iso.org.dod.internet.private.enterprises.9.9.161.1.4.1.1.2.9.14.71.82.79.85.80.45.65.45.84.69.76.78.69.84: INTEGER: 1:
where the value for INTEGER: 1 can be obtained from the MIB table:
outOfService(1), inService(2), standby(3), inOperReal(4), stbInOperReal(5), testReal(6), outOfMemory(7)
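The long numeric suffix in the INTEGER form of the real server and vserver traps can be decoded. In the samples it appears to consist of a leading 9 (assumed here to be the CSM slot), a length-prefixed name in ASCII codes, and, for real servers, the four IP address octets and the port. The Python sketch below decodes such a suffix. The function name `decode_index` is hypothetical, and the layout is inferred from the sample traps rather than taken from the MIB text.

```python
def decode_index(suffix: str) -> dict:
    """Split a trap OID index into slot, name, and any trailing address/port."""
    parts = [int(p) for p in suffix.split(".")]
    slot, length = parts[0], parts[1]
    # The name is encoded as one ASCII code per sub-identifier.
    name = "".join(chr(c) for c in parts[2:2 + length])
    rest = parts[2 + length:]
    decoded = {"slot": slot, "name": name}
    if len(rest) == 5:  # four IPv4 octets followed by a port
        decoded["ip"] = ".".join(str(o) for o in rest[:4])
        decoded["port"] = rest[4]
    return decoded

# Suffixes copied from the sample traps above (after the column OID).
real = "9.11.71.82.79.85.80.45.65.45.84.67.80.77.77.77.142.23"
vserver = "9.14.71.82.79.85.80.45.65.45.84.69.76.78.69.84"
print(decode_index(real))
print(decode_index(vserver))
```

For the real server sample this yields the serverfarm GROUP-A-TCP, address 77.77.77.142, and port 23, which matches the STRING form of the same trap.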
CSM sends an SNMP TRAP for the Fault Tolerance State change when it goes from Active to Standby, and Standby to Active. Depending upon the SNMP TRAP tool, a sample output will look like the following:
.iso.org.dod.internet.private.enterprises.9.9.254.1.10.1.1.7.9: INTEGER: 3
.iso.org.dod.internet.private.enterprises.9.9.254.1.10.1.1.7.9: INTEGER: 4:
where the values for INTEGER: 3 and INTEGER: 4 can be obtained from the MIB table:
notConfigFT(1), initializingFT(2), activeFT(3), standbyFT(4)
In addition to SNMP and CLI based monitoring, another important method to monitor the health of CSM is using SYSLOG messages. CSM sends SYSLOG messages for various events.
For CSM_SLB messages, only levels 3, 4, and 6 are used, where
Level 3 means ‘error conditions’
Level 4 means ‘warning conditions’
Level 6 means ‘informational’ log messages.
Before looking at log messages in the various categories, note that received syslog messages are preceded by one of the following banners, where # is the slot number of the CSM module:
Error Message:
00:00:00: CSM_SLB-4-INVALIDID Module # invalid ID
00:00:00: CSM_SLB-4-DUPLICATEID Module # duplicate ID
00:00:00: CSM_SLB-3-OUTOFMEM Module # memory error
00:00:00: CSM_SLB-4-REGEXMEM Module # regular expression memory error
00:00:00: CSM_SLB-4-ERRPARSING Module # configuration warning
00:00:00: CSM_SLB-4-PROBECONFIG Module # probe configuration error
00:00:00: CSM_SLB-4-ARPCONFIG Module # ARP configuration error
00:00:00: CSM_SLB-6-RSERVERSTATE Module # server state changed
00:00:00: CSM_SLB-6-GATEWAYSTATE Module # gateway state changed
00:00:00: CSM_SLB-3-UNEXPECTED Module # unexpected error
00:00:00: CSM_SLB-3-REDUNDANCY Module # FT error
00:00:00: CSM_SLB-4-REDUNDANCY_WARN Module # FT warning
00:00:00: CSM_SLB-6-REDUNDANCY_INFO Module %d FT info
00:00:00: CSM_SLB-3-ERROR Module # error
00:00:00: CSM_SLB-4-WARNING Module # warning
00:00:00: CSM_SLB-6-INFO Module # info
00:00:00: CSM_SLB-4-TOPOLOGY Module # warning
00:00:00: CSM_SLB-3-RELOAD Module # configuration reload failed
00:00:00: CSM_SLB-3-VERMISMATCH Module # image version mismatch
00:00:00: CSM_SLB-4-VERWILDCARD Received CSM-SLB module version wildcard on slot #
00:00:00: CSM_SLB-3-PORTCHANNEL Portchannel allocation failed for module #
00:00:00: CSM_SLB-3-IDB_ERROR Unknown error occurred while configuring IDB
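Since every banner follows the pattern CSM_SLB-level-mnemonic, a log collector can classify CSM messages by severity with a single regular expression. The Python sketch below is one way to do that. The names `SEVERITY`, `PATTERN`, and `parse_csm_syslog` are hypothetical, and the test line is the real server ARP example that appears in the first message table of this section.

```python
import re

# Severity levels used by CSM_SLB messages, as described above.
SEVERITY = {3: "error", 4: "warning", 6: "informational"}
PATTERN = re.compile(r"%?CSM_SLB-(\d)-([A-Z_]+):?\s*(.*)")

def parse_csm_syslog(line: str):
    """Return (severity, mnemonic, text) for a CSM_SLB message, or None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    level, mnemonic, text = match.groups()
    return SEVERITY.get(int(level), f"level {level}"), mnemonic, text

line = ("*Dec 10 09:17:55.254: %CSM_SLB-6-RSERVERSTATE: Module 5 server state "
        "changed: SLB-NETMGT: Server 155.155.155.30 failed ARP request")
print(parse_csm_syslog(line))
```

Filtering on the severity and mnemonic fields makes it straightforward to page on level 3 messages while only logging level 6 state changes.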
| Error | Notes |
|---|---|
| Error Message | SLB-LCSC: No ARP response from real server A.B.C.D. |
| Explanation | The configured real server A.B.C.D. did not respond to ARP requests. |
| Example | *Dec 10 09:17:55.254: %CSM_SLB-6-RSERVERSTATE: Module 5 server state changed: SLB-NETMGT: Server 155.155.155.30 failed ARP request |
| Error | Notes |
|---|---|
| Error Message | SLB-LCSC: Health probe failed for server A.B.C.D on port P. |
| Explanation | The configured real server on port P of A.B.C.D. failed health checks. |
| Example | *Dec 10 09:16:59.978: %CSM_SLB-6-RSERVERSTATE: Module 5 server state changed: SLB-NETMGT: ICMP health probe failed for server 155.155.155.30:80 in serverfarm 'TEST' |
| Error | Notes |
|---|---|
| Error Message | SLB-FT: No response from peer. Transitioning from Standby to Active. |
| Explanation | The CSM detected a failure in its fault-tolerant peer and has transitioned to the active state. |
| Example | *Dec 10 09:25:06.310: %CSM_SLB-6-REDUNDANCY_INFO: Module 5 FT info: State Transition Standby -> Active *Dec 10 09:25:06.310: %CSM_SLB-4-REDUNDANCY_WARN: Module 5 FT warning: Standby is Active now (no heartbeat from active unit) |
| Error | Notes |
|---|---|
| Error Message | SLB-FT: Heartbeat intervals are not identical between ft pair. SLB-FT: Standby is not monitoring active now. |
| Explanation | Proper configuration of the fault-tolerance feature requires that the heartbeat intervals be identical between CSMs within the same fault-tolerance group, and this is currently not the case. The fault-tolerance feature is disabled until the heartbeat intervals have been configured identically. |
| Example | *Dec 10 12:41:45.112: %CSM_SLB-3-REDUNDANCY: Module 5 FT error: heartbeat interval is not identical between ft pair 5:15: |
| Error | Notes |
|---|---|
| Error Message | SLB-FT: The configurations are not identical between the members of the fault tolerant pair. |
| Explanation | In order for the fault-tolerance system to preserve the sticky database, the different CSMs in the fault-tolerance group must be identically configured, and this is not currently the case. |
| Error | Notes |
|---|---|
| Error Message | SLB-DIAG: WatchDog task not responding |
| Explanation | A critical error occurred within the CSM hardware or software. |
| Error | Notes |
|---|---|
| Error Message | SLB-DIAG: Fatal Diagnostic Error %x, Info %x. |
| Explanation | A hardware fault was detected. The hardware is unusable and must be repaired or replaced. |
| Error | Notes |
|---|---|
| Error Message | SLB-DIAG: Diagnostic Warning %x, Info %x. |
| Explanation | A non-fatal hardware fault was detected. |
In addition, there are messages when the CSM crashes. The crash could be on any of the six processors (five IXPs plus one PowerPC): IXP1, IXP2, IXP3, IXP4, IXP5, or PPC.
An example of such a message:
Oct 14 10:22:57: %CSM_SLB-3-UNEXPECTED: Module unexpected error: IXP2 exception encountered.
Oct 14 10:22:59: %CSM_SLB-3-UNEXPECTED: Module unexpected error: Rebooting...
| Error | Notes |
|---|---|
| Error Message | SLB-LCSC: No ARP response from gateway address A.B.C.D. |
| Explanation | The configured gateway A.B.C.D. did not respond to ARP requests. |
| Error | Notes |
|---|---|
| Error Message | CSM_SLB-4-TOPOLOGY Module [dec] warning |
| Explanation | The CSM is detecting a "bridge loop" in the network. |