Statistical Virtualization: Scale as a Tool for Implementing Service Overlays

Google Tech Talks, January 10, 2008

In this talk, we will explore the use of on-line non-parametric time series analysis and prediction to build virtualized services in distributed computing environments. In particular, by analyzing and predicting the future behavior of one set of distributed services, we explore how they can be amalgamated dynamically to implement a "virtual" service overlay with properties that are not supported by any of the constituent service components (i.e., properties that derive strictly from aggregation). We illustrate the approach by detailing distributed batch-scheduling mechanisms that provide both reservation and co-allocation services in environments that explicitly do not support them.

While our work focuses on national-scale scientific computing infrastructure, we believe its alternative approach to virtualizing distributed systems abstractions is important in a larger scalable systems context for two reasons. First, because the methodology is inherently statistical, it improves with scale, making scale a tool (rather than an impediment) for implementation. Second, it shares many features, both conceptually and in implementation, with scalable search services, making it possible, we believe, to explore the use of commercial search infrastructure in future work.

Speaker: Rich Wolski

My origins, like those of most people born in North America during this century, are ambiguous and questionable. I am currently an Associate Professor in the …