TY - GEN
T1 - Automated and adaptive threshold setting
T2 - 2nd International Conference on Autonomic Computing, ICAC 2005
AU - Breitgand, David
AU - Henis, Ealan
AU - Shehory, Onn
PY - 2005
Y1 - 2005
N2 - Threshold violations reported for system components signal undesirable conditions in the system. In complex computer systems, characterized by dynamically changing workload patterns and evolving business goals, the pre-computed performance thresholds on the operational values of performance metrics of individual system components are not available. This paper focuses on a fundamental enabling technology for performance management: automatic computation and adaptation of statistically meaningful performance thresholds for system components. We formally define the problem of adaptive threshold setting with controllable accuracy of the thresholds and propose a novel algorithm for solving it. Given a set of Service Level Objectives (SLOs) of the applications executing in the system, our algorithm continually adapts the per-component performance thresholds to the observed SLO violations. The purpose of this continual threshold adaptation is to control the average amounts of false positive and false negative alarms to improve the efficacy of the threshold-based management. We implemented the proposed algorithm and applied it to a relatively simple, albeit non-trivial, storage system. In our experiments we achieved a positive predictive value of 92% and a negative predictive value of 93% for component level performance thresholds.
AB - Threshold violations reported for system components signal undesirable conditions in the system. In complex computer systems, characterized by dynamically changing workload patterns and evolving business goals, the pre-computed performance thresholds on the operational values of performance metrics of individual system components are not available. This paper focuses on a fundamental enabling technology for performance management: automatic computation and adaptation of statistically meaningful performance thresholds for system components. We formally define the problem of adaptive threshold setting with controllable accuracy of the thresholds and propose a novel algorithm for solving it. Given a set of Service Level Objectives (SLOs) of the applications executing in the system, our algorithm continually adapts the per-component performance thresholds to the observed SLO violations. The purpose of this continual threshold adaptation is to control the average amounts of false positive and false negative alarms to improve the efficacy of the threshold-based management. We implemented the proposed algorithm and applied it to a relatively simple, albeit non-trivial, storage system. In our experiments we achieved a positive predictive value of 92% and a negative predictive value of 93% for component level performance thresholds.
UR - http://www.scopus.com/inward/record.url?scp=33745498905&partnerID=8YFLogxK
U2 - 10.1109/ICAC.2005.11
DO - 10.1109/ICAC.2005.11
M3 - ???researchoutput.researchoutputtypes.contributiontobookanthology.conference???
AN - SCOPUS:33745498905
SN - 0769522769
SN - 9780769522760
T3 - Proceedings - Second International Conference on Autonomic Computing, ICAC 2005
SP - 204
EP - 215
BT - Proceedings - Second International Conference on Autonomic Computing, ICAC 2005
Y2 - 13 June 2005 through 16 June 2005
ER -