TY - CONF
T1 - Improving resource matching through estimation of actual job requirements
AU - Yom-Tov, Elad
AU - Aridor, Yariv
PY - 2006
Y1 - 2006
N2 - Heterogeneous clusters and grid infrastructures are becoming increasingly popular. In these computing infrastructures, machines have different resources (e.g., memory sizes, disk space, and installed software packages). These differences give rise to a problem of over-provisioning, that is, sub-optimal utilization of a cluster due to users requesting resource capacities greater than what their jobs actually need. Our analysis of a real workload file (LANL CM5) revealed differences of up to two orders of magnitude between requested memory capacity and actual memory usage. The problem of over-provisioning has received very little attention so far. We discuss different approaches for applying machine learning methods to estimate the actual resource capacities used by jobs. These approaches are independent of the scheduling policies and the dynamic resource-matching schemes used. Our simulations show that these methods can yield an improvement of over 50% in utilization (throughput) of heterogeneous clusters.
UR - http://www.scopus.com/inward/record.url?scp=33845879131&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:33845879131
SN - 1424403073
SN - 9781424403073
T3 - Proceedings of the IEEE International Symposium on High Performance Distributed Computing
SP - 367
EP - 368
BT - Proceedings of the 15th IEEE International Symposium on High Performance Distributed Computing, HPDC-15
T2 - 15th IEEE International Symposium on High Performance Distributed Computing, HPDC-15
Y2 - 19 June 2006 through 23 June 2006
ER -