Content Switching Module Device Monitoring

CSM Device Monitoring

This section lists the information that should be captured to monitor the state of the CSM at the device level.

The CSM-specific counters are included in two dedicated MIBs:

MIB                  OID                    Full Path
CISCO-SLB-MIB        1.3.6.1.4.1.9.9.161    iso(1).org(3).dod(6).internet(1).private(4).enterprises(1).cisco(9).ciscoMgmt(9).ciscoSlbMIB(161)
CISCO-SLB-EXT-MIB    1.3.6.1.4.1.9.9.254    iso(1).org(3).dod(6).internet(1).private(4).enterprises(1).cisco(9).ciscoMgmt(9).ciscoSlbExtMIB(254)

Fault Tolerance State

Fault tolerance state of the CSM can be monitored via SNMP.

MIB: CISCO-SLB-EXT-MIB:cslbxFtState.9

OID: 1.3.6.1.4.1.9.9.254.1.10.1.1.7.9

Possible Return Values:

notConfigFT(1),
initializingFT(2),
activeFT(3),
standbyFT(4)

Description:

'notConfigFT' : Was not configured with FT.
'initializingFT'   : Initializing Fault Tolerance.
'activeFT'    : Active FT peer.
'standbyFT'   : Standby FT peer.
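
As a rough illustration, a poller could build the per-slot OID and translate the returned integer into the names listed above. This is a minimal Python sketch: the function names ft_state_oid and describe_ft_state are hypothetical, and it assumes (as the example OID above suggests) that the trailing sub-identifier is the slot number of the CSM.

```python
# Column OID for cslbxFtState (the OID above without the trailing index).
CSLBX_FT_STATE = "1.3.6.1.4.1.9.9.254.1.10.1.1.7"

# Enumeration values as listed in this section.
FT_STATES = {
    1: "notConfigFT",
    2: "initializingFT",
    3: "activeFT",
    4: "standbyFT",
}


def ft_state_oid(slot: int) -> str:
    """Build the instance OID, assuming the index is the CSM slot number."""
    return f"{CSLBX_FT_STATE}.{slot}"


def describe_ft_state(value: int) -> str:
    """Translate a polled integer into its symbolic name."""
    return FT_STATES.get(value, f"unknown({value})")


print(ft_state_oid(9))
print(describe_ft_state(3))
```

The SNMP query itself is left out on purpose; any SNMP tool or library can supply the integer passed to describe_ft_state.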

CPU Utilization and Memory Utilization

The CSM has five IXP processors and one PowerPC processor. CPU utilization of the IXPs is not available via SNMP, so a CLI show command must be used to collect it.

CLI Command:

show mod con X tech-support utilization | include IXP 

Note: X is the slot where CSM is installed

Sample Output:

show mod con 9 tech-support utilization | include IXP
IXP Engines
IXP1    0%   - This is the Session Processor
IXP2    0%   - This is the TCP Processor
IXP3    0%   - This is the L7 Processor
IXP4    0%   - This is the LB Processor
IXP5    0%   - This is the NAT Processor
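
Because this value is only available from the CLI, a collection script has to parse the text output. The following Python sketch shows one way to do that, assuming the output has already been captured as a string (how it is captured, for example over SSH, is outside this sketch). The name parse_ixp_utilization is hypothetical.

```python
import re
from typing import Dict

# Matches lines such as "IXP1    0%"; the header line "IXP Engines" is skipped
# because it has no digit after "IXP".
IXP_LINE = re.compile(r"^\s*(IXP\d+)\s+(\d+)%")


def parse_ixp_utilization(output: str) -> Dict[str, int]:
    """Return {"IXP1": percent, ...} from the captured CLI output."""
    usage = {}
    for line in output.splitlines():
        match = IXP_LINE.match(line)
        if match:
            usage[match.group(1)] = int(match.group(2))
    return usage


sample = """IXP Engines
IXP1    0%
IXP2    0%
IXP3    0%
IXP4    0%
IXP5    0%
"""
print(parse_ixp_utilization(sample))
```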

Similarly, memory utilization on the CSM is not available via SNMP, so a CLI show command must be used to collect it.

CLI Command:

show mod con X tech-support utilization | include Memory 

Note: X is the slot where CSM is installed

Sample Output:

sh mod con 9 tech-support utilization | in Memory
Memory
Availible Memory     71%     179M
Allocated Memory     20%      51M
OS Static Memory      9%      24M
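
The memory lines can be parsed in the same way. The Python sketch below is only an illustration: parse_memory is a hypothetical name, the labels are used exactly as they appear in the sample output, and the 20% free-memory threshold is an arbitrary assumption rather than a recommended value.

```python
import re
from typing import Dict, Tuple

# Matches "<label> Memory   <percent>%   <megabytes>M"; the bare "Memory"
# header line does not match because no label precedes it.
MEMORY_LINE = re.compile(r"^\s*(\S.*? Memory)\s+(\d+)%\s+(\d+)M\s*$")


def parse_memory(output: str) -> Dict[str, Tuple[int, int]]:
    """Return {label: (percent, megabytes)} from the captured CLI output."""
    result = {}
    for line in output.splitlines():
        match = MEMORY_LINE.match(line)
        if match:
            result[match.group(1)] = (int(match.group(2)), int(match.group(3)))
    return result


sample = """Memory
Availible Memory     71%     179M
Allocated Memory     20%      51M
OS Static Memory      9%      24M
"""
memory = parse_memory(sample)
MIN_FREE_PERCENT = 20  # assumed alert threshold, adjust to local policy
free_percent = memory["Availible Memory"][0]  # label spelled as in the output
print(memory)
print("low memory" if free_percent < MIN_FREE_PERCENT else "memory ok")
```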

Bandwidth/Throughput Utilization

This information can be obtained using CLI commands as well.

To calculate the bandwidth utilization of the CSM, monitor the utilization of the port-channel between the CSM and the MSFC and of the Gigabit interfaces between the CSM and the MSFC. No CLI command on the CSM itself reports bandwidth utilization, but there are CLI commands that indicate whether the CSM is exceeding its throughput capabilities.

CLI Command:

show int port-channel 2xx | include rate

Note: 2xx is 256 plus the slot number where the CSM is installed (for example, 265 for slot 9).

Sample Output:

show interfaces port-channel 265 | include rate
5 minute input rate 2000 bits/sec, 3 packets/sec
5 minute output rate 44000 bits/sec, 76 packets/sec
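
The port-channel numbering rule and the rate lines lend themselves to a small helper. The Python sketch below assumes the show output has already been captured as text; csm_port_channel and parse_rates are hypothetical names.

```python
import re
from typing import Dict

RATE_LINE = re.compile(
    r"5 minute (input|output) rate (\d+) bits/sec, (\d+) packets/sec"
)


def csm_port_channel(slot: int) -> int:
    """Port-channel number for a CSM: 256 plus the slot number."""
    return 256 + slot


def parse_rates(output: str) -> Dict[str, Dict[str, int]]:
    """Return {"input": {"bps": ..., "pps": ...}, "output": {...}}."""
    rates = {}
    for match in RATE_LINE.finditer(output):
        rates[match.group(1)] = {
            "bps": int(match.group(2)),
            "pps": int(match.group(3)),
        }
    return rates


sample = """5 minute input rate 2000 bits/sec, 3 packets/sec
5 minute output rate 44000 bits/sec, 76 packets/sec
"""
print(f"show interfaces port-channel {csm_port_channel(9)} | include rate")
print(parse_rates(sample))
```

The same parse_rates helper works on the Gigabit interface output, since the rate lines have the same format.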

CLI Command:

show int gig slot/port | include rate 

Note: slot is where the CSM is installed, and port range is 1-4.

Sample Output:

show interfaces gigabitEthernet 9/1 | include rate
5 minute input rate 0 bits/sec, 0 packets/sec
5 minute output rate 38000 bits/sec, 67 packets/sec

The following additional CLI commands on the MSFC and CSM indicate either an error condition or a throughput limitation.

The following two MSFC commands show any errors reported on the port-channel and on the individual interfaces.

CLI Command:

show int port-channel 2xx | include error

Note: 2xx is 256 plus the slot number where the CSM is installed (for example, 265 for slot 9).

Sample Output:

show interfaces port-channel 265 | include error
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
0 output errors, 0 collisions, 0 interface resets

CLI Command:

show int gig slot/port | include error 

Note: slot is where the CSM is installed, and port range is 1-4.

Sample Output:

show interfaces gig 9/1 | include error
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
0 output errors, 0 collisions, 8 interface resets

The following CSM commands indicate whether the CSM is experiencing a capacity issue.

CLI Command:

show mod con X stats | include Overflow

Note: X is the slot where CSM is installed

Sample Output:

show mod c 9 stats | include Overflow
Overflow Errors: 0, CRC Errors: 0

A non-zero value for the Overflow Errors indicates that CSM is dropping packets.

CLI Command:

show mod con X tech-support processor 5 | include TX

Note: X is the slot where CSM is installed

Sample Output:

show mod con 9 tech-support processor 5 | include TX
TX FIFO Full               0           0
TX Window Full             0           0

A non-zero value for the above counters indicates that the NAT processor is unable to transmit the traffic to the transmit queue.

CLI Command:

show mod con X tech-support processor 2 | include Overflow

Note: X is the slot where CSM is installed

Sample Output:

show mod con 9 tech-support processor 2 | include Overflow
Session Queue Overflow     0           0
Control->Term Queue Overflow               0           0
t_fifo Overflow            0           0

The ‘Session Queue Overflow’ counter refers to the SRAM queue that exists between the Session processor and the TCP processor. When this counter increments, the queue has been overrun.

The ‘t_fifo Overflow’ counter refers to the TCP processor transmit queue overflowing. As a result, the TCP processor is unable to send or forward packets.
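
The capacity counters in the tech-support output all share the same layout: a label followed by two numeric columns. A monitoring script could flag any counter that is non-zero, as in the Python sketch below. The names parse_counters and nonzero_counters are hypothetical, and since this document does not define the two columns, the sketch simply flags a counter when either column is non-zero.

```python
import re
from typing import Dict, Tuple

# A label, then two integer columns at the end of the line.
COUNTER_LINE = re.compile(r"^\s*(\S.*?)\s+(\d+)\s+(\d+)\s*$")


def parse_counters(output: str) -> Dict[str, Tuple[int, int]]:
    """Return {label: (left_column, right_column)} for every counter line."""
    counters = {}
    for line in output.splitlines():
        match = COUNTER_LINE.match(line)
        if match:
            counters[match.group(1)] = (int(match.group(2)), int(match.group(3)))
    return counters


def nonzero_counters(output: str) -> Dict[str, Tuple[int, int]]:
    """Keep only the counters where at least one column is non-zero."""
    return {k: v for k, v in parse_counters(output).items() if any(v)}


sample = """Session Queue Overflow     0           0
Control->Term Queue Overflow               0           0
t_fifo Overflow            0           0
"""
print(nonzero_counters(sample) or "no overflow reported")
```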

Connections Per Second

Connections per second (CPS) on the CSM cannot be read directly via SNMP or from a CLI show command. Instead, CPS is calculated with a formula from the values of two counters in the output of a CLI command.

CLI Command:

show mod c X tech-support processor 1

Note: X is the slot where CSM is installed.

Formula for CPS:

cps ~=  (Packet New Sessions, right-column value)
        -----------------------------------------
        0.81 x (Current time, right-column value)

Sample Output:

show mod c 3 tech-support processor 1
--------------------------------------------------------------
--------------------- SESSION Statistics ---------------------
--------------------------------------------------------------
     Current time               49156       365       
     Aborted rx                 2680373453  506291019 
     Total Packets rx           1423543303  21213140  
     Packets Dropped            2596        0         
     Packets Drop Stale Connection              12761723    215779    
     Packets Drop No More Sessions              0           0         
     Packets Drop No VLAN       42401       315       
     Packets Drop Bad Checksum  0           0         
     Packets Drop IP Fragments  5284        92        
     Packets Drop SI with no SMAC               25070       0         
     Packets Drop: SI, Route Mode, no DMAC      0           0         
     Packets Drop: Not IP, SNAP 0           0         
     Packets Drop: Zero L3 offset               0           0         
     Packets Drop: vlan/vs Force Drop           2463997     20159     
     Packets Drop: Slowpath limit exceeded      0           0         
     Packets Drop: LP non-ip, non-arp           0           0         
     Packets Drop: TCP/UDP with zero port       2144        63        
     Packets Drop: CDP          890357      6614      
     Packets Spanning Tree DMAC 0           0         
     Packets Repeat: Slowpath limit exceeded    0           0         
     Packets Slowpath           1468435     10983     
     Packets High Priority      350090      2607      
     Packets Session Hit        1244543546  18762672  
     Packets New Sessions       88736246    1376523  

So, for the output above:

cps ~=      1376523
        ---------------
          0.81 x 365

    ~= 4656 connections per second
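
The same arithmetic can be scripted. This Python sketch applies the formula above to the two right-column values from the sample output; the function name connections_per_second is hypothetical, and the 0.81 factor is taken from the formula in this section as-is.

```python
def connections_per_second(new_sessions: int, current_time: int) -> float:
    """Approximate CPS from the right-column values of
    'Packets New Sessions' and 'Current time'."""
    if current_time <= 0:
        raise ValueError("current_time must be positive")
    return new_sessions / (0.81 * current_time)


# Right-column values from the sample output above.
cps = connections_per_second(new_sessions=1376523, current_time=365)
print(round(cps))
```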



Real Servers Monitoring

This section includes the information that should be monitored for the real servers (reals) on the CSM.

Current Connections

Information about number of current connections per real can be obtained via SNMP.

MIB: CISCO-SLB-MIB:slbRealNumberOfConnections

OID: 1.3.6.1.4.1.9.9.161.1.3.1.1.5

Possible Return Values:

An integer value indicating the number of TCP and UDP connections currently assigned to this real server.

Total Failed Connections

Information about total number of failed connections per real can be obtained via SNMP.

MIB: CISCO-SLB-MIB:slbRealTotalFails

OID: 1.3.6.1.4.1.9.9.161.1.3.1.1.16

Possible Return Values:

An integer value indicating the total number of times this real server has failed since the creation of this row.

Real Server State

Information about the real server state can be obtained via SNMP.

MIB: CISCO-SLB-MIB:slbRealState

OID: 1.3.6.1.4.1.9.9.161.1.3.1.1.4

Possible Return Values:

outOfService(1),
inService(2),
failed(3),
readyToTest(4),
testing(5),
maxConnsThrottle(6),
maxClientsThrottle(7),
dfpThrottle(8),
probeFailed(9),
probeTesting(10),
operWait(11),
testWait(12),
inbandProbeFailed(13),
returnCodeFailed(14)

Description:

'outOfService' : Server is not in use by SLB as a destination for client connections.  This state can be written and read.
'inService' : Server is in use as a destination for SLB client connections.  This state can be written and read.
'failed' : Server has failed and will not be retried for retry timer seconds. This state can only be read.
'readyToTest' : Server has failed and has an expired retry timer; test connections will begin to flow to it soon. This state can only be read.
'testing' : Server has failed and been given another test connection, success of this connection is not known yet. This state can only be read.
'maxConnsThrottle' : Server has reached its maximum number of connections and is no longer being given connections. This state can only be read.
'maxClientsThrottle' : Server has reached the maximum allowed clients.  This state can only be read.
'dfpThrottle' : DFP has lowered the weight of this server to throttle level, so that no new connections will be assigned to it until DFP raises its weight. This state can only be read.
'probeFailed' : SLB probe to this server has failed.  No new connections will be assigned to it until a probe to this server succeeds.  This state can only be read.
'probeTesting' : Server has received a test probe from SLB.  This state can only be read.
'operWait' : Server is ready to go operational, but is waiting for the associated redirect virtual to be inservice. This state can only be read.
'testWait' : Server is ready to be tested. This state is applicable only when the server is used for http redirect load balancing. This state can only be read.
'inbandProbeFailed' : Server has failed the inband Health Probe agent.  This state can only be read.
'returnCodeFailed' : Server has been disabled because it returned an HTTP code that matched a configured value. This state can only be read.
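
A monitoring script usually needs the symbolic name rather than the raw integer. The Python sketch below encodes the values and the read/write notes from the descriptions above; describe_real_state is a hypothetical name, and the table is only as complete as the list in this section.

```python
from typing import Tuple

# Enumeration values as listed in this section.
REAL_STATES = {
    1: "outOfService",
    2: "inService",
    3: "failed",
    4: "readyToTest",
    5: "testing",
    6: "maxConnsThrottle",
    7: "maxClientsThrottle",
    8: "dfpThrottle",
    9: "probeFailed",
    10: "probeTesting",
    11: "operWait",
    12: "testWait",
    13: "inbandProbeFailed",
    14: "returnCodeFailed",
}

# Per the descriptions above, only these two states can be written.
WRITABLE_STATES = {"outOfService", "inService"}


def describe_real_state(value: int) -> Tuple[str, bool]:
    """Return (symbolic name, whether the state can be written)."""
    name = REAL_STATES.get(value, f"unknown({value})")
    return name, name in WRITABLE_STATES


print(describe_real_state(9))
print(describe_real_state(2))
```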



Vserver Monitoring

This section includes the information that should be monitored for the VSERVERs on the CSM.

Current Connections

The number of current connections per vserver can be obtained via SNMP.

MIB: CISCO-SLB-MIB:slbVirtualNumberOfConnections

OID: 1.3.6.1.4.1.9.9.161.1.4.1.1.17

Possible Return Values:

An integer value is returned indicating the number of currently assigned connections being handled by this virtual server.

Vserver State

The information about the vserver state can be obtained via SNMP.

MIB: CISCO-SLB-MIB:slbVirtualServerState

OID: 1.3.6.1.4.1.9.9.161.1.4.1.1.2

Possible Return Values:

outOfService(1),
inService(2),
standby(3),
inOperReal(4),
stbInOperReal(5),
testReal(6),
outOfMemory(7)

Description:

'outOfService' : Virtual server is not active and is not affecting client traffic in any way.
'inService' : Virtual server is active and is load-balancing matching client traffic to available real servers.
'standby' : Virtual server is a backup for a virtual server on another SLB device, and is currently inactive.
'inOperReal' : Real server associated with this redirect virtual server is not operational.  This state can only be read.
'stbInOperReal' : Real server associated with this redirect virtual server is not operational, and this virtual server is in standby state.  This state can only be read.
'testReal' : This is a redirect virtual server and the real server associated with it is being tested.  This state can only be read.
'outOfMemory' : Virtual server is not enabled because it does not have enough memory to hold the configured matching policy information.  This state can only be read.



Layer 4 and Layer 7 statistical monitoring for Total Current Connections, Failed and Rejected Connections

This section includes the information that should be monitored for total number of current connections and connections rejected by the CSM.

Total Current Connections

Total number of current connections can be monitored via SNMP on the CSM.

MIB: CISCO-SLB-EXT-MIB:cslbxStatsCurrConnections

OID: 1.3.6.1.4.1.9.9.254.1.1.1.1.3

Possible Return Values:

An integer value is returned indicating the number of connections currently still open.

Failed and Rejected Connections

Information about connections rejected by the CSM for various reasons can be monitored via SNMP.

For Failed Connections:

MIB: CISCO-SLB-EXT-MIB:cslbxStatsFailedConns

OID: 1.3.6.1.4.1.9.9.254.1.1.1.1.5

Possible Return Values:

An integer value is returned indicating the number of connections that were load balanced to real servers that then failed to respond due to timeout or TCP RST.

For Rejected Connections:

MIB: CISCO-SLB-EXT-MIB:cslbxStatsDroppedL4PolicyConns

OID: 1.3.6.1.4.1.9.9.254.1.1.1.1.9

Possible Return Values:

An integer value is returned indicating the number of connections dropped by virtual servers with only layer 4 configuration.

MIB: CISCO-SLB-EXT-MIB:cslbxStatsDroppedL7PolicyConns

OID: 1.3.6.1.4.1.9.9.254.1.1.1.1.10

Possible Return Values:

An integer value is returned indicating the number of connections dropped by virtual servers with some layer 7 policy.

MIB: CISCO-SLB-EXT-MIB:cslbxStatsMaxParseLenRejects

OID: 1.3.6.1.4.1.9.9.254.1.1.1.1.18

Possible Return Values:

An integer value is returned indicating the number of connections rejected because the length of an HTTP request or response header exceeded the maximum L7 parse length configured for the matching virtual server.

MIB: CISCO-SLB-EXT-MIB:cslbxStatsNoActiveServerRejects

OID: 1.3.6.1.4.1.9.9.254.1.1.1.1.16

Possible Return Values:

An integer value is returned indicating the number of connections rejected because the chosen server farm did not have any active servers.
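
For alerting, the change between two polls is usually more useful than the absolute value. The Python sketch below collects the OIDs from this section and computes per-counter increases. It rests on two assumptions that are not stated in this document: that these objects are cumulative counters, and that a decrease between polls means the counter was reset or wrapped. The name counter_deltas is hypothetical, and the OIDs are the column OIDs shown above, to which an instance index still has to be appended when polling.

```python
from typing import Dict, Optional

# Column OIDs as listed in this section.
REJECT_COUNTERS = {
    "cslbxStatsFailedConns": "1.3.6.1.4.1.9.9.254.1.1.1.1.5",
    "cslbxStatsDroppedL4PolicyConns": "1.3.6.1.4.1.9.9.254.1.1.1.1.9",
    "cslbxStatsDroppedL7PolicyConns": "1.3.6.1.4.1.9.9.254.1.1.1.1.10",
    "cslbxStatsNoActiveServerRejects": "1.3.6.1.4.1.9.9.254.1.1.1.1.16",
    "cslbxStatsMaxParseLenRejects": "1.3.6.1.4.1.9.9.254.1.1.1.1.18",
}


def counter_deltas(
    previous: Dict[str, int], current: Dict[str, int]
) -> Dict[str, Optional[int]]:
    """Increase per counter between two polls; None if the value went down
    (assumed to mean a reset or wrap)."""
    deltas: Dict[str, Optional[int]] = {}
    for name in REJECT_COUNTERS:
        if name in previous and name in current:
            diff = current[name] - previous[name]
            deltas[name] = diff if diff >= 0 else None
    return deltas


# Hypothetical poll results, for illustration only.
earlier = {"cslbxStatsFailedConns": 10, "cslbxStatsDroppedL4PolicyConns": 5}
later = {"cslbxStatsFailedConns": 25, "cslbxStatsDroppedL4PolicyConns": 2}
print(counter_deltas(earlier, later))
```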



CSM SNMP Traps

CSM is capable of sending SNMP TRAPs for software version 3.1.3 and above. CSM sends traps to a host for the following categories:

  • Real Server State Change
  • Vserver State Change
  • Fault Tolerance State Change

Real Server State Change

CSM sends an SNMP TRAP for the real server state change when:

A Real Server goes to Probe_Failed and then becomes OPERATIONAL again. Depending upon the SNMP TRAP tool, a sample output will look like the following:

“.iso.org.dod.internet.private.enterprises.9.9.161.1.3.1.1.4.9:
  STRING:  SLB-NETMGT:  TCP health probe failed for server 77.77.77.142: 23 in serverfarm 'GROUP-A-TCP':”
OR
“.iso.org.dod.internet.private.enterprises.9.9.161.1.3.1.1.4.9.
11.71.82.79.85.80.45.65.45.84.67.80.77.77.77.142.23:  INTEGER:  9:”
 		
“.iso.org.dod.internet.private.enterprises.9.9.161.1.3.1.1.4.9:
  STRING:  SLB-NETMGT:  TCP health probe re-activated server 77.77.77.142: 23 in serverfarm 'GROUP-A-TCP':”
OR
“.iso.org.dod.internet.private.enterprises.9.9.161.1.3.1.1.4.9.
11.71.82.79.85.80.45.65.45.84.67.80.77.77.77.142.23:  INTEGER:  2”

A Real Server is manually changed to ‘no inservice’, and then manually changed to ‘inservice’. Depending upon the SNMP TRAP tool, a sample output will look like the following:

“.iso.org.dod.internet.private.enterprises.9.9.161.1.3.1.1.4.9:
  STRING:  SLB-NETMGT:  Configured server 77.77.77.142: 23 to OUT-OF-SERVICE in serverfarm 'GROUP-A-TCP':”
OR
“.iso.org.dod.internet.private.enterprises.9.9.161.1.3.1.1.4.9.
11.71.82.79.85.80.45.65.45.84.67.80.77.77.77.142.23:  INTEGER:  1:”

“.iso.org.dod.internet.private.enterprises.9.9.161.1.3.1.1.4.9:
  STRING:  SLB-NETMGT:  Configured server 77.77.77.142: 23 to INSERVICE in serverfarm 'GROUP-A-TCP'”
OR
“.iso.org.dod.internet.private.enterprises.9.9.161.1.3.1.1.4.9.
11.71.82.79.85.80.45.65.45.84.67.80.77.77.77.142.23:  INTEGER:  2:”
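
The numeric form of the trap carries the same information as the STRING form, encoded in the OID index. Judging from the samples above, the index after the slbRealState column OID appears to be: slot number, length of the serverfarm name, the name as ASCII codes, the four octets of the real server IP address, and the port. The Python sketch below decodes an index under that assumption, which is inferred from the samples rather than taken from the MIB definition; decode_real_index is a hypothetical name.

```python
from typing import Dict, Union


def decode_real_index(index: str) -> Dict[str, Union[int, str]]:
    """Decode '<slot>.<len>.<name bytes>.<ip octets>.<port>' (assumed layout)."""
    parts = [int(p) for p in index.strip(".").split(".")]
    slot, name_len = parts[0], parts[1]
    name_end = 2 + name_len
    if len(parts) != name_end + 5:
        raise ValueError("index does not match the assumed layout")
    serverfarm = "".join(chr(code) for code in parts[2:name_end])
    ip = ".".join(str(octet) for octet in parts[name_end:name_end + 4])
    port = parts[name_end + 4]
    return {"slot": slot, "serverfarm": serverfarm, "ip": ip, "port": port}


# Index taken from the sample trap above (everything after ...161.1.3.1.1.4.).
decoded = decode_real_index(
    "9.11.71.82.79.85.80.45.65.45.84.67.80.77.77.77.142.23"
)
print(decoded)
```

The decoded serverfarm, address, and port match the values in the STRING form of the same trap, which supports the assumed layout.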

Vserver State Change

CSM sends an SNMP TRAP for the vserver state change only when:

A vserver is manually changed to ‘no inservice’. Depending upon the SNMP TRAP tool, a sample output will look like the following:

“.iso.org.dod.internet.private.enterprises.9.9.161.1.4.1.1.2.9.
14.71.82.79.85.80.45.65.45.84.69.76.78.69.84:  INTEGER:  1:” 

where the value for INTEGER: 1 can be obtained from the MIB table:

outOfService(1),
inService(2),
standby(3),
inOperReal(4),
stbInOperReal(5),
testReal(6),
outOfMemory(7)
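
The vserver trap can be decoded in a similar way. In the sample above, the index after the slbVirtualServerState column OID appears to be the slot number, the length of the vserver name, and the name as ASCII codes. The Python sketch below relies on that inferred layout (an assumption, not a quote from the MIB); decode_vserver_trap is a hypothetical name.

```python
from typing import Dict, Union

# Enumeration values as listed in this section.
VSERVER_STATES = {
    1: "outOfService",
    2: "inService",
    3: "standby",
    4: "inOperReal",
    5: "stbInOperReal",
    6: "testReal",
    7: "outOfMemory",
}


def decode_vserver_trap(index: str, value: int) -> Dict[str, Union[int, str]]:
    """Decode '<slot>.<len>.<name bytes>' (assumed layout) plus the state."""
    parts = [int(p) for p in index.strip(".").split(".")]
    slot, name_len = parts[0], parts[1]
    if len(parts) != 2 + name_len:
        raise ValueError("index does not match the assumed layout")
    name = "".join(chr(code) for code in parts[2:])
    state = VSERVER_STATES.get(value, f"unknown({value})")
    return {"slot": slot, "vserver": name, "state": state}


# Index and value taken from the sample trap above.
trap = decode_vserver_trap("9.14.71.82.79.85.80.45.65.45.84.69.76.78.69.84", 1)
print(trap)
```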

Fault Tolerance State Change

CSM sends an SNMP TRAP for the Fault Tolerance State change when it goes from Active to Standby, and Standby to Active. Depending upon the SNMP TRAP tool, a sample output will look like the following:

.iso.org.dod.internet.private.enterprises.9.9.254.1.10.1.1.7.9:  INTEGER:  3
.iso.org.dod.internet.private.enterprises.9.9.254.1.10.1.1.7.9:  INTEGER:  4:

where the value for INTEGER : 3 and INTEGER : 4 can be obtained from the MIB Table:

notConfigFT(1),
initializingFT(2),
activeFT(3),
standbyFT(4)



CSM Syslog Messages

In addition to SNMP and CLI based monitoring, another important method to monitor the health of CSM is using SYSLOG messages. CSM sends SYSLOG messages for various events.

For CSM_SLB messages, only levels 3, 4, and 6 are used, where

Level 3 means ‘error conditions’

Level 4 means ‘warning conditions’

Level 6 means ‘informational’ log messages.

Before looking at log messages in the various categories, note that received syslog messages are preceded by one of the banners in the list below, where # is the slot number of the CSM module.

General Syslog Messages List

Error Message:

00:00:00: CSM_SLB-4-INVALIDID Module # invalid ID 
00:00:00: CSM_SLB-4-DUPLICATEID Module # duplicate ID
00:00:00: CSM_SLB-3-OUTOFMEM Module # memory error
00:00:00: CSM_SLB-4-REGEXMEM Module # regular expression memory error
00:00:00: CSM_SLB-4-ERRPARSING Module # configuration warning
00:00:00: CSM_SLB-4-PROBECONFIG Module # probe configuration error
00:00:00: CSM_SLB-4-ARPCONFIG Module # ARP configuration error
00:00:00: CSM_SLB-6-RSERVERSTATE Module # server state changed
00:00:00: CSM_SLB-6-GATEWAYSTATE Module # gateway state changed
00:00:00: CSM_SLB-3-UNEXPECTED Module # unexpected error
00:00:00: CSM_SLB-3-REDUNDANCY Module # FT error
00:00:00: CSM_SLB-4-REDUNDANCY_WARN Module # FT warning
00:00:00: CSM_SLB-6-REDUNDANCY_INFO Module # FT info
00:00:00: CSM_SLB-3-ERROR Module # error
00:00:00: CSM_SLB-4-WARNING Module # warning
00:00:00: CSM_SLB-6-INFO Module # info
00:00:00: CSM_SLB-4-TOPOLOGY Module # warning
00:00:00: CSM_SLB-3-RELOAD Module # configuration reload failed
00:00:00: CSM_SLB-3-VERMISMATCH Module # image version mismatch
00:00:00: CSM_SLB-4-VERWILDCARD Received CSM-SLB module version wildcard on slot #
00:00:00: CSM_SLB-3-PORTCHANNEL Portchannel allocation failed for module #
00:00:00: CSM_SLB-3-IDB_ERROR Unknown error occurred while configuring IDB
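
A syslog collector can split these banners into severity, mnemonic, and slot. The Python sketch below is one possible approach, written against the full message form used in the examples in this document (with a leading % and a colon after the mnemonic); parse_csm_syslog is a hypothetical name, and the severity names are the three levels described above.

```python
import re
from typing import Dict, Optional

SEVERITY_NAMES = {3: "error", 4: "warning", 6: "informational"}

# "%CSM_SLB-<severity>-<MNEMONIC>: <text>"
BANNER = re.compile(r"%CSM_SLB-(\d)-([A-Z_]+):\s*(.*)")
# Slot number, when the text starts with "Module <n>".
MODULE = re.compile(r"Module\s+(\d+)\b")


def parse_csm_syslog(line: str) -> Optional[Dict[str, object]]:
    """Return the parsed fields, or None if the line is not a CSM_SLB message."""
    match = BANNER.search(line)
    if match is None:
        return None
    severity = int(match.group(1))
    text = match.group(3)
    module = MODULE.match(text)
    return {
        "severity": severity,
        "severity_name": SEVERITY_NAMES.get(severity, "other"),
        "mnemonic": match.group(2),
        "module": int(module.group(1)) if module else None,
        "text": text,
    }


line = (
    "*Dec 10 09:17:55.254: %CSM_SLB-6-RSERVERSTATE: Module 5 server state "
    "changed: SLB-NETMGT: Server 155.155.155.30 failed ARP request"
)
parsed = parse_csm_syslog(line)
print(parsed)
```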

Real Server State related Syslog Messages

Error Message: SLB-LCSC: No ARP response from real server A.B.C.D.
Explanation: The configured real server A.B.C.D. did not respond to ARP requests.
Example: *Dec 10 09:17:55.254: %CSM_SLB-6-RSERVERSTATE: Module 5 server state changed: SLB-NETMGT: Server 155.155.155.30 failed ARP request

Error Message: SLB-LCSC: Health probe failed for server A.B.C.D on port P.
Explanation: The configured real server on port P of A.B.C.D. failed health checks.
Example: *Dec 10 09:16:59.978: %CSM_SLB-6-RSERVERSTATE: Module 5 server state changed: SLB-NETMGT: ICMP health probe failed for server 155.155.155.30:80 in serverfarm 'TEST'

Fault Tolerant State related Syslog Messages

Error Message: SLB-FT: No response from peer. Transitioning from Standby to Active.
Explanation: The CSM detected a failure in its fault-tolerant peer and has transitioned to the active state.
Example: *Dec 10 09:25:06.310: %CSM_SLB-6-REDUNDANCY_INFO: Module 5 FT info: State Transition Standby -> Active
*Dec 10 09:25:06.310: %CSM_SLB-4-REDUNDANCY_WARN: Module 5 FT warning: Standby is Active now (no heartbeat from active unit)

Error Message: SLB-FT: Heartbeat intervals are not identical between ft pair.
SLB-FT: Standby is not monitoring active now.
Explanation: Proper configuration of the fault-tolerance feature requires that the heartbeat intervals be identical between CSMs within the same fault-tolerance group, and this is currently not the case. The fault-tolerance feature is disabled until the heartbeat intervals have been configured identically.
Example: *Dec 10 12:41:45.112: %CSM_SLB-3-REDUNDANCY: Module 5 FT error: heartbeat interval is not identical between ft pair 5:15:

Error Message: SLB-FT: The configurations are not identical between the members of the fault tolerant pair.
Explanation: In order for the fault-tolerance system to preserve the sticky database, the different CSMs in the fault-tolerance group must be identically configured, and this is not currently the case.

Hardware and Firmware crash related Syslog Messages

Error Message: SLB-DIAG: WatchDog task not responding
Explanation: A critical error occurred within the CSM hardware or software.

Error Message: SLB-DIAG: Fatal Diagnostic Error %x, Info %x.
Explanation: A hardware fault was detected. The hardware is unusable and must be repaired or replaced.

Error Message: SLB-DIAG: Diagnostic Warning %x, Info %x.
Explanation: A non-fatal hardware fault was detected.

In addition, there are messages when the CSM crashes. The crash could be on any of the 6 processors (5 IXP + 1 PowerPC): IXP1, IXP2, IXP3, IXP4, IXP5, or PPC.

An example of such a message:

Oct 14 10:22:57: %CSM_SLB-3-UNEXPECTED: Module  unexpected error:
IXP2 exception encountered.
Oct 14 10:22:59: %CSM_SLB-3-UNEXPECTED: Module  unexpected error:
Rebooting...

Gateway Health Monitoring related Syslog Messages

Error Message: SLB-LCSC: No ARP response from gateway address A.B.C.D.
Explanation: The configured gateway A.B.C.D. did not respond to ARP requests.

Topology Change related Syslog Message

Error Message: CSM_SLB-4-TOPOLOGY Module [dec] warning
Explanation: The CSM is detecting a "bridge loop" in the network.
