I am looking for an easy way to tell whether a data point should be a gauge or a derive.
I haven't been using SNMP long enough to tell easily, and I don't know a common rule for choosing between the two.
I am going to add the data points below and am looking for guidance.
LATENCY_AVG_MINUTE, SNMP OID: .220.127.116.11.4.1.15418.104.22.168.7.9.1
and if possible:
'BANDWIDTH_AVG_MINUTE': .22.214.171.124.4.1.154126.96.36.199.7.4.1 (output is in kbps)
Taking a few known highly loaded clusters of WSA units, the average response time is usually somewhere between 1200 and 5100 ms; however, a good alert value should be about 15000 to 30000 ms (15-30 sec).
This should be profiled together with TRANSACTION_RATE_AVG_MINUTE, because with only marginal traffic LATENCY_AVG is easily skewed by long-running requests. This in turn can be avoided by also taking BANDWIDTH_AVG into account (large downloads increase latency, but they also increase bandwidth usage).
This is important to understand in order to properly monitor WSA units in a high-performance environment. If there are any questions, please feel free to contact us anytime.
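To make the profiling advice above concrete, here is a minimal sketch of an alert check that only fires when high latency is not explained by marginal traffic or by heavy downloads. The 15000 ms threshold is the lower alert value quoted above; the function name, the transaction-rate floor, and the bandwidth ceiling are hypothetical placeholders that would have to come from your own baseline.

```python
def latency_alert(latency_ms, transaction_rate, bandwidth_kbps,
                  latency_threshold_ms=15000,   # lower alert value quoted above
                  min_transaction_rate=10,      # hypothetical: below this, traffic is "marginal"
                  max_bandwidth_kbps=50000):    # hypothetical: above this, large downloads are likely
    """Return True when high latency is not explained by low traffic or big downloads."""
    if latency_ms < latency_threshold_ms:
        return False  # latency is within the normal range
    if transaction_rate < min_transaction_rate:
        return False  # marginal traffic: one long-running request skews the average
    if bandwidth_kbps > max_bandwidth_kbps:
        return False  # heavy bandwidth use: large downloads probably explain the latency
    return True


print(latency_alert(20000, 100, 1000))    # True: busy unit, modest bandwidth, slow responses
print(latency_alert(20000, 1, 1000))      # False: almost no traffic
print(latency_alert(20000, 100, 100000))  # False: bandwidth is saturated by downloads
```

All three inputs are values as returned by the appliance, so no rate calculation is needed before the comparison.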
Total current connections is calculated by adding the following two variables:
If the sum of these values increases dramatically (by multiples of the usual amount) without any reasonable cause, this may indicate temporary leaks that might be worth customer support's attention.
To set up the initial values, our recommendation is to trial these numbers for, e.g., one business week in production to determine the typical behavior of your network / group of WSA units. These numbers are also worth recording because they can reveal new trends in user behavior.
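The baseline idea above could be sketched roughly as follows: record the summed connection count for a trial period, then flag a reading that is several times the typical value. The function names, the use of the median as "typical", and the factor of 3 are all assumptions for illustration, not vendor recommendations.

```python
from statistics import median


def total_connections(client_conn, server_conn):
    """Sum the two current-connection values into one total."""
    return client_conn + server_conn


def is_abnormal(current_total, baseline_totals, factor=3.0):
    """Flag a total that exceeds `factor` times the median of the recorded baseline.

    `factor` is a hypothetical multiplier; tune it after the trial week.
    """
    typical = median(baseline_totals)
    return current_total > factor * typical


# Totals recorded during a trial period (made-up numbers).
baseline = [100, 120, 110, 130, 90]

print(is_abnormal(total_connections(300, 200), baseline))  # True: 500 > 3 * 110
print(is_abnormal(total_connections(120, 80), baseline))   # False: 200 <= 330
```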
The only official way to get those answers is to read the detailed description in the MIB and hope the vendor took care to document it thoroughly. If you see "counter" mentioned, use derive; otherwise stay with gauge.
However, you can go ahead and try to guess the correct data source type, and you will notice right away if you got it wrong. If you use gauge for a "counter" OID, your graph will have a sawtooth shape (a slope rising to the maximum value that can be stored in that counter, followed by an abrupt return to zero). On the other hand, if you use derive (or counter) for a "gauge" OID, your graph will sit at or close to zero most of the time instead of showing the actual value. As a clue, you can run several snmpget requests for that OID at intervals of about 30 seconds to one minute; if the value doesn't change (or changes very little), you may assume it is a gauge type.
Hope this helps.
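The snmpget clue above can be turned into a rough heuristic. The sketch below assumes you have already collected a handful of successive readings of one OID (by whatever means) and simply checks whether they only ever go up. The function name is made up, counter wraps are not handled, and a gauge that happens to climb steadily during the sampling window would be misclassified, so treat the result as a hint only.

```python
def guess_type(samples):
    """Guess a data source type from successive readings taken ~30-60 s apart."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    deltas = [later - earlier for earlier, later in zip(samples, samples[1:])]
    if all(d == 0 for d in deltas):
        return "GAUGE"   # value does not change between polls
    if all(d >= 0 for d in deltas):
        return "DERIVE"  # value only ever increases: looks like a counter
    return "GAUGE"       # value moves up and down: an instantaneous reading


print(guess_type([1000, 1500, 2100, 2600]))  # DERIVE
print(guess_type([42, 42, 42]))              # GAUGE
print(guess_type([1200, 1180, 1250, 1190]))  # GAUGE
```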
A less scientific approach that might help...
If you want to graph the difference between the current data point and the last data point, you want a DERIVE.
If you want to graph the data point as it's retrieved, you want a GAUGE. An example of this would be temperature. If you are graphing temperature, it doesn't make sense to take a delta; you simply want to graph the temperature as it's returned.
From your examples above, I would suggest LATENCY_AVG_MINUTE, TRANSACTION_RATE_AVG_MINUTE, and BANDWIDTH_AVG_MINUTE all be GAUGE. The SNMP agent on the appliance is already calculating the rate, so you want to graph the value as it's returned.
For CURRENT_TOTAL_CLIENT_CONN & CURRENT_TOTAL_SERVER_CONN the decision is a little more complicated. You have to ask yourself: do I want to graph the TOTAL CONNECTIONS at a moment in time (GAUGE), or do I want to graph how fast that number is changing per second, i.e. as a rate? If you want the rate, use DERIVE. So you can see that any given OID can potentially be useful as either type.
So... Quick Cheat Sheet:
RATE/DELTA = DERIVE
EXACT VALUE = GAUGE
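To illustrate the cheat sheet, here is a small sketch of what each type does with the same polled samples. It is a simplified model of the idea (value as-is versus delta divided by elapsed time), not the actual RRDtool implementation, and the function names and sample numbers are invented.

```python
def as_gauge(samples):
    """GAUGE: graph each value exactly as it was returned."""
    return [value for _, value in samples]


def as_derive(samples):
    """DERIVE: graph the change per second between consecutive samples."""
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        rates.append((v1 - v0) / (t1 - t0))
    return rates


# (timestamp in seconds, raw value) pairs polled once a minute.
samples = [(0, 1000), (60, 1600), (120, 2800)]

print(as_gauge(samples))   # [1000, 1600, 2800]
print(as_derive(samples))  # [10.0, 20.0]
```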
Thank you both. I got it.