Interpret Incidents Related to SNMP Traps

In addition to tracking Root Cause Incidents, NNMi accepts SNMP traps and, if a corresponding incident configuration is enabled, generates a corresponding incident to inform you of a potential problem.

The following incidents are generated as a result of exceeding SNMP trap threshold or queue size limits.

If you are an NNMi administrator, see Control which Incoming Traps Are Visible in Incident Views for a list of the SNMP Trap Incidents that NNMi enables by default.

(NNMi Advanced) The following Incidents are generated only in a Global Network Management Environment:

  • Message Queue Incident Rate Exceeded
  • Message Queue Size Exceeded

Hosted Object Trap Storm

A Hosted Object Trap Storm incident indicates that the trap rate for an object on the specified node has exceeded a configured threshold.

The NNMi administrator can control traffic from traps with thresholds and blocking.

Use this incident to determine the following:

Trap Information Source
CIA Description
trapSource

The IP address of the SNMP agent from which the SNMP traps originated.

totalTrapRate

The total trap rate for the Node identified as the Source Node.

Suppressed Hosted Objects

A report containing all of the Source Objects on the specified Node that have exceeded a trap rate threshold. All SNMP traps are suppressed for the specified Object when the trap rate threshold is exceeded.

Note the following:

  • The report tracks the average trap rate over time.

    If a trap storm spike was the source of the problem, this report might show a lower than expected trap rate, because NNMi gathers the report data after the initial incident.

  • If the report exceeds the 2000 character limit, NNMi continues to log the information using subsequent CIAs numbered consecutively; for example: suppressedHostedObjects.1, suppressedHostedObjects.2, and so on.

Unsuppressed Hosted Objects

A report containing all of the Source Objects on the specified Node that have traps that are not currently suppressed. This means the trap rate threshold has not been exceeded for these objects.

Note the following:

  • The report tracks the average trap rate over time.

    If a trap storm spike was the source of the problem, this report might show a lower than expected trap rate, because NNMi gathers the report data after the initial incident.

  • If the report exceeds the 2000 character limit, NNMi continues to log the information using subsequent CIAs numbered consecutively; for example: unsuppressedHostedObjects.1, unsuppressedHostedObjects.2, and so on.

Note the following:

  • By default, NNMi determines threshold rates every 2 minutes. This means a trap rate must stay below the threshold for at least one full 2-minute interval before the incident is canceled.
  • If multiple objects are exceeding the specified threshold, NNMi generates the incident as soon as one object exceeds the configured threshold. The Source Object for the incident is the first object that exceeds the trap storm threshold.

When the trap rate returns to below the configured threshold, NNMi cancels the incident. See Incident Form: General Tab for more information.
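
When a Suppressed Hosted Objects or Unsuppressed Hosted Objects report runs past the 2000 character limit, it has to be read across several numbered CIAs. The following Python sketch shows one way to picture that split and how to reassemble the pieces. It is illustrative only: the function names are hypothetical, and it assumes that the first part keeps the base CIA name while later parts take the suffixes .1, .2, and so on, which the text above does not state explicitly.

```python
def split_report(base_name: str, report: str, limit: int = 2000) -> dict:
    """Split a report into parts of at most `limit` characters.

    Assumption (not confirmed by the NNMi help): the first part is stored
    under the base CIA name, and overflow parts use .1, .2, ... suffixes.
    """
    parts = [report[i:i + limit] for i in range(0, len(report), limit)] or [""]
    names = [base_name] + [f"{base_name}.{n}" for n in range(1, len(parts))]
    return dict(zip(names, parts))


def join_report(cias: dict, base_name: str) -> str:
    """Reassemble a report by reading the base CIA, then .1, .2, ... in order."""
    pieces = [cias.get(base_name, "")]
    n = 1
    while f"{base_name}.{n}" in cias:
        pieces.append(cias[f"{base_name}.{n}"])
        n += 1
    return "".join(pieces)
```

For example, a 4,500 character report would occupy suppressedHostedObjects, suppressedHostedObjects.1, and suppressedHostedObjects.2 under this assumption.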

Message Queue Incident Rate Exceeded (NNMi Advanced)

(NNMi Advanced) This incident applies to NNMi's Global Network Management feature. See NNMi's Global Network Management Feature (NNMi Advanced) for more information about this feature.

A queue is established on each Regional Manager. This queue holds information to be forwarded to the Global Manager.

A Message Queue Incident Rate Exceeded incident indicates that the volume of messages entering a Regional Manager's Global Network Management message queue has exceeded the rate limit: 20 incidents per second within a 5-minute period (6,000 incidents within 5 minutes). NNMi generates this incident when a sudden burst of incident forwarding occurs (for example, 6,001 incidents within 2 minutes).
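
The arithmetic behind this limit is 20 incidents per second multiplied by 300 seconds, which gives 6,000 incidents per 5-minute window. The Python sketch below illustrates why a short burst can cross the limit even though the burst lasts less than 5 minutes. How NNMi actually measures the window is not described here, so the sliding-window count and all names in the sketch are assumptions made for illustration.

```python
WINDOW_SECONDS = 5 * 60          # 5-minute period from the text
RATE_PER_SECOND = 20             # incidents per second from the text
WINDOW_LIMIT = RATE_PER_SECOND * WINDOW_SECONDS  # 6,000 incidents


def exceeds_rate_limit(arrival_times, now,
                       window=WINDOW_SECONDS, limit=WINDOW_LIMIT):
    """Return True if more than `limit` incidents arrived in the last `window` seconds.

    `arrival_times` holds arrival timestamps in seconds. The sliding window
    is an illustrative assumption, not NNMi's documented algorithm.
    """
    recent = sum(1 for t in arrival_times if now - window < t <= now)
    return recent > limit


# A burst of 6,001 incidents spread over 2 minutes crosses the limit.
burst = [i * (120 / 6001) for i in range(6001)]
print(exceeds_rate_limit(burst, now=120))
```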

When the message queue's incident rate High limit is crossed, NNMi does the following:

  • Generates a Message Queue Incident Rate Exceeded incident with the Severity set to Critical.
  • Generates a GlobalNetworkManagementIncidentRateLimitExceeded health conclusion with the Severity set to Major.
  • Stops forwarding to Global Managers any incidents generated from SNMP Traps.

The NNMi administrator must specifically configure SNMP Trap Incidents to be forwarded from this Regional Manager to Global Managers.

To view the associated conclusion information, check the health of the Regional Manager using the Health tab from Help > System Information.

NNMi closes the incident when the incident rate falls below 90 percent of the incident rate limit and the next incident has been successfully forwarded.
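
The closing rule combines two conditions, so the incident can stay open even after the rate drops. A minimal Python sketch of that rule follows. The function and parameter names are hypothetical, and the default limit of 20 incidents per second is taken from the rate limit described earlier in this section.

```python
def can_close_incident(current_rate, next_incident_forwarded, rate_limit=20):
    """Return True when both closing conditions from the text hold.

    1. The incident rate is below 90 percent of the rate limit.
    2. The next incident has been forwarded successfully.
    Integer percentages avoid floating-point rounding at the boundary.
    """
    below_90_percent = current_rate * 100 < rate_limit * 90
    return below_90_percent and next_incident_forwarded
```

With the default limit, 90 percent corresponds to 18 incidents per second, so a rate of exactly 18 does not yet satisfy the first condition.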

Message Queue Size Exceeded (NNMi Advanced)

(NNMi Advanced) When the Global Network Management feature is enabled, a queue is established on each Regional Manager. This queue holds information to be forwarded to Global Managers. See NNMi's Global Network Management Feature (NNMi Advanced) for more information about this feature.

A Message Queue Size Exceeded incident indicates that a Regional Manager's Global Network Management message queue has exceeded configured limits:

  • Default lower limit is 200,000 messages.
  • Default upper limit is 250,000 messages.

When the message queue size's lower limit is reached, NNMi generates the following:

  • A Message Queue Size Exceeded incident with the Severity set to Warning.
  • A GlobalNetworkManagementIncidentQueueSizeLimitExceeded health conclusion with the Severity set to Warning.

When the message queue size's upper limit is reached, NNMi does the following:

  • Generates a Message Queue Size Exceeded incident with the Severity set to Critical.
  • Generates a GlobalNetworkManagementIncidentQueueSizeLimitExceeded health conclusion with the Severity set to Major.
  • Stops forwarding to Global Managers any incidents generated from SNMP Traps.

    The NNMi administrator must specifically configure SNMP Trap Incidents to be forwarded from this Regional Manager to Global Managers.
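
The two limits and their outcomes can be summarized as a small lookup. The Python sketch below uses the default limits from this section; the type and function names are hypothetical and are not part of NNMi.

```python
from typing import NamedTuple, Optional

LOWER_LIMIT = 200_000  # default lower limit (messages)
UPPER_LIMIT = 250_000  # default upper limit (messages)


class QueueState(NamedTuple):
    incident_severity: Optional[str]      # Severity of the incident, if any
    conclusion_severity: Optional[str]    # Severity of the health conclusion, if any
    trap_forwarding_stopped: bool         # True once the upper limit is reached


def classify_queue_size(size, lower=LOWER_LIMIT, upper=UPPER_LIMIT):
    """Map a message queue size to the outcome described in the text."""
    if size >= upper:
        return QueueState("Critical", "Major", True)
    if size >= lower:
        return QueueState("Warning", "Warning", False)
    return QueueState(None, None, False)
```

Note that at the upper limit the incident Severity is Critical while the health conclusion Severity is Major, as stated in the lists above.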

This incident indicates a connection problem with a Global Manager. Click Help > System Information and select the Global Network Management tab to identify which Global Manager is not currently connected.

To resolve this issue, communication with that Global Manager must be reestablished.

Pipeline Queue Size Exceeded Limit

A Pipeline Queue Size Exceeded Limit incident indicates that one of the queues connecting the stages of the Event Pipeline is above its configured limit. NNMi determines queue size limits based on memory size.

Any incident information that appears in your incident views first travels through the Event Pipeline. The Event Pipeline guarantees that the incident data is analyzed in chronological order.

Not all information that travels through the pipeline results in an incident.

If at any time an incident does not meet the criteria for a stage in the Event Pipeline, it is either ignored and passed to the next stage in the pipeline, or it is dropped. For information about each of the stages in the Event Pipeline, see Help for Administrators.

When the lower queue size limit is reached, NNMi generates the following:

  • A Pipeline Queue Size Exceeded Limit incident with the Severity set to Major.
  • A PipelineQueueSizeLowerLimitExceeded health conclusion with the Severity set to Major.

When the upper limit is reached, NNMi does the following:

  • Generates a Pipeline Queue Size Exceeded Limit incident with the Severity set to Critical.
  • Generates a PipelineQueueSizeHigherLimitExceeded health conclusion with the Severity set to Major.

  • Drops incidents created from SNMP Traps, but continues to generate incidents created from Management Events. 

To reduce the number of incidents in the queue, ask your NNMi administrator to disable any SNMP Trap Incident configurations that are not essential.

SNMP Trap Limit (Warning, Major or Critical)

An SNMP Trap Limit (Warning, Major or Critical) incident indicates that the number of SNMP traps has reached or exceeded the maximum limit. The SNMP trap limit is 100,000.

When the maximum limit is reached, NNMi no longer accepts traps from the Event system. The NNMi administrator can reduce the number of traps in the NNMi database. If you are an NNMi administrator, see "Configuring the Auto-Trim Oldest SNMP Trap Incidents Feature" in the Network Node Manager i Software Deployment Reference for more information.

An SNMP Trap Limit (Warning, Major or Critical) incident is generated with Severity set to either Warning, Major, or Critical.

When the number of traps reaches 90 percent of the maximum limit, NNMi generates the following:

  • An SNMP Trap Limit Warning SNMP Trap incident with a Severity set to Warning.
  • A health conclusion SnmpTrapLimitExceeded with the Severity set to Warning.

When the number of traps reaches 95 percent of this maximum limit, NNMi generates the following:

  • An SNMP Trap Limit Major SNMP Trap incident with a Severity set to Major.
  • A health conclusion SnmpTrapLimitExceeded with the Severity set to Major.

When the number of traps reaches the maximum limit, NNMi generates the following:

  • An SNMP Trap Limit Critical SNMP Trap incident with a Severity set to Critical.
  • A health conclusion SnmpTrapLimitExceeded with the Severity set to Critical.
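
With the limit of 100,000 traps, the three thresholds work out to 90,000 (Warning), 95,000 (Major), and 100,000 (Critical). The Python sketch below expresses that mapping. The function name is hypothetical, and the sketch only restates the percentages given above.

```python
TRAP_LIMIT = 100_000  # SNMP trap limit stated in this section


def trap_limit_severity(trap_count, limit=TRAP_LIMIT):
    """Return the Severity for a given trap count, or None below 90 percent.

    Integer arithmetic keeps the 90 and 95 percent boundaries exact.
    """
    if trap_count >= limit:
        return "Critical"
    if trap_count * 100 >= limit * 95:
        return "Major"
    if trap_count * 100 >= limit * 90:
        return "Warning"
    return None
```

In each case the SnmpTrapLimitExceeded health conclusion carries the same Severity as the incident.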

Trap Storm

A Trap Storm incident indicates one of the following:

  • The overall trap rate in your network management domain exceeds a set threshold. Use the overallThresholdRate argument to the nnmtrapconfig.ovpl command to set this threshold.

    The incident's blockedSources and blockedTraps CIA values are set to all.

  • The trap rate on an IP address in a Node exceeds a set threshold. Use the thresholdRate argument to the nnmtrapconfig.ovpl command to set this threshold.

    The incident's blockedSources CIA value contains the IP address of the node that is the source of the trap storm. The blockedTraps CIA is set to all.

  • The overall trap rate for a specific trap (Object Identifier) exceeds a threshold. Use the thresholdRate argument to the nnmtrapconfig.ovpl command to set this threshold.

    The incident's blockedSources CIA value is set to all. The incident's blockedTraps CIA contains the Object Identifier (OID) of the trap that has exceeded the specified threshold value.
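
Read together, the blockedSources and blockedTraps CIA values identify which of the three thresholds was exceeded. The Python sketch below shows that decision as a small helper. The function name is hypothetical, and the IP address and OID in the usage lines are made-up sample values.

```python
def classify_trap_storm(blocked_sources, blocked_traps):
    """Infer which threshold triggered a Trap Storm incident from its CIA values."""
    sources_all = blocked_sources.strip().lower() == "all"
    traps_all = blocked_traps.strip().lower() == "all"
    if sources_all and traps_all:
        return "overall trap rate exceeded"
    if traps_all:
        return "trap rate exceeded for source address " + blocked_sources.strip()
    if sources_all:
        return "trap rate exceeded for trap OID " + blocked_traps.strip()
    return "unrecognized combination"


# Sample values chosen for illustration only.
print(classify_trap_storm("all", "all"))
print(classify_trap_storm("10.1.2.3", "all"))
print(classify_trap_storm("all", ".1.3.6.1.6.3.1.1.5.3"))
```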

Note the following:

  • NNMi determines threshold rates every 5 minutes. This means a trap rate must stay below the threshold for at least one full 5-minute interval before the incident is canceled.
  • If multiple nodes are exceeding the specified threshold, NNMi tracks information for only the first node that has exceeded the trap storm threshold until it can cancel the incident.

Use this incident to determine the following:

Trap Information Source
CIA Description
trapRate

The trap rate for the first trap that has exceeded the threshold limit.

blockedSources

The IP address for the Node, if any, that has suppressed traps.

This CIA value is all if the overall trap rate is exceeded or if the overall trap rate for a specific trap OID is exceeded.

blockedTraps

A report containing all of the nodes that have traps that are currently suppressed.

This CIA value is all if the overall trap rate is exceeded or if the overall trap rate for a specific node is exceeded.

When the trap rate returns to below the configured threshold, NNMi cancels the incident. See Incident Form: General Tab for more information.