8100 Views 10 Replies Latest reply: Jun 3, 2011 8:22 AM by jmp242
guyhlupi Rank: White Belt 10 posts since
Apr 20, 2011

May 10, 2011 2:07 PM

Interface Error Monitoring - Cisco Routers and Switches

I know that there is an interface error data source by default in ethernetCsmacd and ethernetCsmacd64 that looks at the aggregate of all interface errors. In the threshold "interface errors" I set the maximum value to 50, but I never get an alert. I have verified that more than 50 errors are incrementing on various monitored interfaces within the 300-second polling period. What I did next was set up a new data source under ethernetCsmacd and ethernetCsmacd64 for just CRC errors, using OID 1.3.6.1.4.1.9.2.2.1.1.12. I set the RRD type to Counter and put in a threshold with the minimum value blank and the maximum value set to 50. Now I get an alert on every interface for which the SNMP counter value for CRC errors is over 50. Since the SNMP counter only clears upon a device reload or wrap, this results in a lot of alerts on interfaces where errors are not incrementing. I set the RRD type to Derive and the same thing happens.

 

It was my understanding that leaving the minimum blank and setting the maximum to some value, let's call it 50, would only alert if the interface showed an increase of 50 or more CRC errors since the last poll. Is that correct, or does that setup cause the system to send an alert every polling period for every interface that has a CRC error count over 50? That seems to be the behavior.

 

If someone could tell me how to alert only if an interface has seen an increase of X errors within a polling period, that would be great.
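
To make the desired behavior concrete, here is a minimal Python sketch of "alert only on an increase of X errors between two polls". It runs outside Zenoss and is not how Zenoss evaluates thresholds; the function names `counter_delta` and `should_alert` are hypothetical, and a 32-bit counter is assumed.

```python
# Hypothetical helper names; a 32-bit SNMP counter is assumed.
COUNTER32_MODULUS = 2 ** 32


def counter_delta(previous, current, modulus=COUNTER32_MODULUS):
    """Return how much a counter grew between two polls.

    If the current value is smaller than the previous one, the counter is
    assumed to have wrapped. A device reload also resets the counter and
    would be misread as a wrap by this simple check.
    """
    if current >= previous:
        return current - previous
    return current + modulus - previous


def should_alert(previous, current, max_increase):
    """Alert only when the increase since the last poll exceeds max_increase."""
    return counter_delta(previous, current) > max_increase
```

With this logic an interface stuck at a large but unchanging error count produces a delta of zero and no alert, which is the behavior being asked for.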

  • nilie Rank: Green Belt 372 posts since
    May 27, 2010

    The OIDs for interface errors are 32-bit counters, so the data source type should be COUNTER or (more appropriately) DERIVE, and in that case the value you are looking at is the number of errors per second. With your threshold set to 50, an event is created only when the error rate exceeds 50 errors/sec, which is pretty high, and this is why you don't see many events. A threshold of 1 would be more relevant, since 1 error per second is already a sign that something is wrong.
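
The arithmetic behind that point can be sketched as follows, assuming the 300-second polling period from the original post. The function names are hypothetical and only illustrate the unit conversion between "errors per polling period" and "errors per second".

```python
POLL_INTERVAL_SECONDS = 300  # polling period mentioned in the original post


def to_rate_per_second(errors_in_interval, interval_seconds=POLL_INTERVAL_SECONDS):
    """Convert an error count per polling period into errors per second."""
    return errors_in_interval / interval_seconds


def to_errors_in_interval(rate_per_second, interval_seconds=POLL_INTERVAL_SECONDS):
    """Convert a per-second rate threshold into errors per polling period."""
    return rate_per_second * interval_seconds
```

Under these assumptions, a rate threshold of 50 corresponds to 15000 errors in one polling period, while 50 errors per period is a rate of roughly 0.17 errors/sec.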

     

    Hope this will help.

  • nilie Rank: Green Belt 372 posts since
    May 27, 2010

    OK, now I have a better understanding of your problem.

    First, can you please tell us the model of the Cisco device that is reporting lots of errors? Second, from your Zenoss server, can you run an snmpwalk command against that device with the OID you mentioned in your initial post? Wait 5 minutes or more, run it again, and then post the results here.
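
Comparing the two walks by hand gets tedious with many interfaces, so here is a rough Python sketch that diffs two saved outputs. It assumes each line looks like `<oid> = <TYPE>: <number>`, which is a common net-snmp layout but may differ with other output options; `parse_walk` and `changed_counters` are hypothetical names.

```python
import re

# Assumed line layout: "<oid> = <TYPE>: <number>"; adjust if your output differs.
LINE_PATTERN = re.compile(r"^(?P<oid>\S+)\s*=\s*\w+:\s*(?P<value>\d+)\s*$")


def parse_walk(text):
    """Map each OID in a saved snmpwalk output to its integer value."""
    values = {}
    for line in text.splitlines():
        match = LINE_PATTERN.match(line.strip())
        if match:  # lines that do not fit the assumed layout are skipped
            values[match.group("oid")] = int(match.group("value"))
    return values


def changed_counters(first_text, second_text):
    """Return {oid: difference} for OIDs whose value changed between two walks."""
    first = parse_walk(first_text)
    second = parse_walk(second_text)
    return {
        oid: second[oid] - first[oid]
        for oid in first
        if oid in second and second[oid] != first[oid]
    }
```

An empty result means no counter moved between the two walks, so a COUNTER or DERIVE data point should be reporting a rate of zero for every interface.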

     

    I find it strange Cisco calls the MIB you're trying to use "OLD-CISCO-INTERFACES-MIB".

  • jskeane Rank: Green Belt 67 posts since
    Sep 15, 2010
    4. May 18, 2011 12:10 PM (in response to nilie)
    Re: Interface Error Monitoring - Cisco Routers and Switches

    The OLD-CISCO-INTERFACES-MIB is also one of the MIBs for the Cisco 2960G and ME3400E switches. The full set of MIBs for these switches also includes OLD-CISCO-SYS-MIB, OLD-CISCO-TS-MIB, and OLD-CISCO-IP-MIB.

  • nilie Rank: Green Belt 372 posts since
    May 27, 2010

    According to the information you posted, the error counters are not increasing during the polling interval, so the rate is zero. In this case, a data point of type GAUGE will constantly show 6086 and will trigger the event every time if the threshold you set is lower than that. If the type of the data point is DERIVE or COUNTER, then your graph should show zero. Can you please post a screenshot of your data point and threshold configuration here?
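
The difference between the three types can be sketched in Python. This is a simplified model of how RRD data source types turn two raw samples into a stored value, not the RRDtool implementation; `stored_value` is a hypothetical name and a 32-bit counter is assumed for wrap handling.

```python
def stored_value(ds_type, previous, current, seconds, counter_bits=32):
    """Simplified model of what gets stored for two consecutive raw samples.

    GAUGE keeps the raw reading; COUNTER and DERIVE store a per-second rate.
    Only COUNTER compensates for a wrap (assumed 32-bit here); DERIVE can
    yield a negative rate when the raw value drops.
    """
    if ds_type == "GAUGE":
        return float(current)
    if ds_type not in ("COUNTER", "DERIVE"):
        raise ValueError("unsupported data source type: %r" % ds_type)
    delta = current - previous
    if ds_type == "COUNTER" and delta < 0:
        delta += 2 ** counter_bits
    return delta / seconds


def exceeds_max(value, max_threshold):
    """A max threshold is violated when the stored value is above it."""
    return value > max_threshold
```

With an unchanging raw reading of 6086 over 300 seconds, GAUGE yields 6086.0 and violates a max threshold of 50, while COUNTER and DERIVE both yield 0.0 and do not.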

     

     

    Thanks

  • nilie Rank: Green Belt 372 posts since
    May 27, 2010

    Your configuration is correct, which makes this problem quite puzzling. Do you have any events with "debug" severity showing up in the event console, and do you see anything abnormal in the zenperfsnmp daemon log?

  • jmp242 ZenossMaster 4,060 posts since
    Mar 7, 2007

    I don't know that you can change the type once the RRDs are created... a lot of this stuff gets set in the RRD once, on creation. You could try deleting the affected RRDs and see if they are re-generated appropriately.

     

    --

    James Pulver

    Information Technology Area Supervisor

    LEPP Computer Group

    Cornell University
