
5.3 Modeling the Resource Demand

5.3.3 Modeling Approach

The resource demand of a VM is modeled by a discrete-time stochastic process Yi(t), as mentioned in the problem statement chapter. This process is characterized using the observed resource demand Ri(t). The stochastic process consists of individual discrete random variables Yi to account for the missing stationarity. Each random variable is described by its own probability distribution, comparable to the SARIMA approach. Furthermore, the random variables are discrete, since the observed resource demand that is used for characterization is discrete as well. The resolution of memory demand is technically limited to one byte. Relative CPU time could be a continuous measure but is discretized as well in all common virtualization environments.

A probability distribution function completely describes a discrete random variable that models the resource demand at a certain time t. Its probabilities state how often a certain demand is expected to occur. This information is exactly what is needed to support the SLO specification worked out in the previous section. The resource capacity that must be provided to a VM can be estimated based on this probability distribution and the SLO. Section 5.4 describes how this is done. This section is limited to the description of the model and the respective characterization process.

The model is split into two parts, comparable to the SARIMA approach. The first part captures possible long-term changes of the demand behavior. They are assumed to be strictly monotonically increasing or decreasing over time. The second part models seasonal trends and the noise behavior without any monotonic long-term changes. Both parts of the model are presented separately in the following.

Modeling the Long Term Trend

Conventionally, a long-term trend of a time series is assumed to be an additive component of the time series. This component is modeled by a polynomial that is fitted using linear regression. Such an additive model does not capture the resource demand behavior of a VM very well, as can be seen in Figure 5.3.

[Figure 5.3 plots omitted: three panels a), b), c) over time t; vertical axes labeled Ri(t), f1(Ri(t)), and f2(Ri(t)).]

Figure 5.3: a) An exemplary time series of seven months of resource demand that has a long-term trend. b) The time series detrended the conventional way by subtracting a linear trend function from it. c) The time series detrended by dividing it by a linear trend function.

The seasonal trend and the noise do not appear to be shifted by an additive component as time increases; instead, they appear to be scaled by a factor. Hence, the time series was divided by a linear function to detrend it, as follows:

$$R_i^*(t) = R_i(t) \cdot \frac{1}{a t + b} \tag{5.13}$$

The resulting time series is presented in Figure 5.3 c). The parameters a and b were found conventionally using linear regression [51] as well. The idea behind this modeling is the assumption that the number of clients that use the service increases over time. This means that in times of high utilization an increased number of clients has to be served. The number of clients also increases in times of low utilization, not by the same absolute number but by the same factor.

5 Statistical Static Resource Management

Based on this idea, the resource demand model can now be refined by the following equation:

$$Y_i(t) = Y_i^*(t) \cdot LT_i(t) \tag{5.14}$$

The function LTi(t) models the long-term trend by a linear equation that is characterized using linear regression. The stochastic process Yi*(t) captures the seasonal trend and the noise.
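The detrending by division and the multiplicative model of Equation (5.14) can be illustrated with a minimal sketch. It assumes NumPy, an equidistantly sampled demand series, and a synthetic example series; the function names `fit_linear_trend` and `detrend_by_division` are hypothetical and not taken from the original work.

```python
import numpy as np

def fit_linear_trend(r):
    """Least-squares fit of LT(t) = a*t + b to the observed demand r[t]."""
    t = np.arange(len(r), dtype=float)
    a, b = np.polyfit(t, r, deg=1)  # coefficients, highest power first
    return a, b

def detrend_by_division(r, a, b):
    """Divide the series by the linear trend, as in Equation (5.13)."""
    t = np.arange(len(r), dtype=float)
    trend = a * t + b
    if np.any(trend <= 0):
        raise ValueError("trend must stay positive for a division to make sense")
    return r / trend

# Synthetic data (an assumption for illustration): a daily pattern whose
# amplitude is scaled by a linearly growing factor, sampled hourly for a week.
t = np.arange(0, 7 * 24, dtype=float)
seasonal = 1.0 + 0.5 * np.sin(2 * np.pi * t / 24)
demand = seasonal * (0.1 * t + 20.0)

a, b = fit_linear_trend(demand)
detrended = detrend_by_division(demand, a, b)

# Equation (5.14): multiplying by the trend restores the original series.
restored = detrended * (a * t + b)
```

Note that the regression is fitted to the raw series, so the fitted parameters only approximate the true scaling factor when a seasonal pattern is present.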

Modeling the Seasonal Trend and the Noise

Classical time series analysis now suggests splitting up the seasonal trend and the noise, as discussed before. This is done for an exemplary time series in Figure 5.4. One can clearly see that the remaining residual εi(t) is not stationary, which was also found in [104].

[Figure 5.4 plots omitted: three panels over time t illustrating the decomposition Ri*(t) = STi(t) + εi(t); panel a) shows Ri*(t), panel b) shows STi(t), panel c) shows εi(t).]

Figure 5.4: a) One day of resource demand of an exemplary service. b) The seasonal trend derived using a moving average. c) The residual noise that remains when the seasonal trend is removed from the time series.

As a result, splitting up the seasonal trend and the noise will not lead to any advantage. Instead, each random variable Yi* of the process Yi*(t) must be characterized individually, comparable to the SARIMA approach. This means that one probability distribution for each variable must be derived from the data observed in the past. To this end, it is assumed that within a small time interval (e.g. half an hour), denoted by ∆tavg, neither the trend nor the noise behavior changes significantly. Hence, small intervals [t − ½∆tavg, t + ½∆tavg] of the time series Ri(t) can be treated as if the demand behavior within these intervals were stationary. Furthermore, it is assumed that the resource demand values are statistically independent in these intervals.

Based on these assumptions, the probability distribution of the random variable Yi* that describes the resource demand at time t0 can be derived from the data Ri([t0 − ½∆tavg, t0 + ½∆tavg]) observed in this interval. This approach is illustrated in Figure 5.5 for clarification. The validity of both assumptions, as well as the consequences for resource management when they are violated, will be discussed later on.
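The window-based characterization can be sketched as an empirical probability mass function over the demand values observed around t0. This is only an illustration under the two stated assumptions (stationarity and independence inside the window); the function name `windowed_pmf`, the `(t, demand)` pair layout, and the sample values are hypothetical.

```python
from collections import Counter

def windowed_pmf(samples, t0, half_width):
    """Empirical PMF of the demand observed in [t0 - half_width, t0 + half_width].

    samples: iterable of (t, demand) pairs with discrete demand values.
    Returns a dict {demand value: relative frequency}.
    """
    window = [r for (t, r) in samples if t0 - half_width <= t <= t0 + half_width]
    if not window:
        raise ValueError("no observations inside the window")
    counts = Counter(window)
    n = len(window)
    return {value: count / n for value, count in sorted(counts.items())}

# Made-up observations; half_width plays the role of one half of the
# averaging interval around t0.
samples = [(0, 10), (1, 10), (2, 12), (3, 12), (4, 12), (5, 30), (6, 31)]
pmf = windowed_pmf(samples, t0=2, half_width=2)
```

Here the window covers t = 0 to t = 4, so only the values 10 and 12 contribute to the distribution. A wider window yields more samples per distribution but makes the stationarity assumption harder to justify.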

[Figure 5.5 plot omitted: the time series Ri(t) over time t with a window of width ∆tavg around t0; for t0 and t0+1, the probability distributions P(R) of the random variables Yi*(t0) and Yi*(t0+1) are shown over the demand R.]

Figure 5.5: Each random variable Yi* of Yi*(t) must be individually characterized using the data Ri(t) observed in the past due to missing stationarity. Therefore, it is assumed that within a small interval around a time t0 the demand behavior is nearly stationary and statistically independent. Hence, the probability distribution that describes the random variable Yi* at t0 can be derived from the data within this interval.

Until now, the model only describes the demand behavior of a VM during the characterization phase. In the following, it will be shown how the demand behavior expected in the future can be extrapolated from these models.

Extrapolating into the Future

The part of the model that captures the long-term trend is described by a linear equation. This model can simply be extrapolated into the future under the assumption that these long-term changes will continue in the same way as observed during the characterization phase.

The part of the model that describes the seasonal trend and the noise should not show any monotonic behavior, since a possible long-term trend has been removed from the data used for characterization. It should only describe changes in the demand behavior that are caused by workload variations that depend on the time of day. It is further assumed that during the characterization phase of the models a time period with peak resource demand (e.g. on a Monday morning) was observed that will never be exceeded by the resource demand in the future⁵, which can be formally expressed as follows:

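$$\max_{\forall t \in [t_{p2},\, t_{p2}+\Delta t_{p2}]} \big(Y_i^*(t)\big) \;\geq\; \max_{\forall t \in [t_{p3},\, t_{p3}+\Delta t_{p3}]} \big(Y_i^*(t)\big) \tag{5.15}$$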

As a result, a discrete random variable Yimax that describes the maximal resource demand of VM i within the future time interval [tp3, tp3+∆tp3] with respect to trends and noise can be extracted from the model using the following equation:

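$$Y_i^{max} = \max_{\forall t \in [t_{p2},\, t_{p2}+\Delta t_{p2}]} \Big( Y_i^*(t) \cdot \max_{\forall t_{LT} \in [t_{p3},\, t_{p3}+\Delta t_{p3}]} \big(LT_i(t_{LT})\big) \Big) \tag{5.16}$$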

The part of the model that describes the seasonal trend and the noise is scaled by the maximal influence of the long-term trend expected in phase three. The random variable that describes the maximal resource demand is selected from the resulting stochastic process.

⁵ After having purged a possible long-term trend from it.

The definition of the outer function max depends on how the random variables are used for resource management. This will be described in Section 5.4. Hence, more details on this function are presented in that section as well.
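The structure of Equation (5.16) can be sketched as follows. The inner maximum is well defined for a linear trend, since it is attained at one end of the interval. The outer max is not defined in this section, so the selection rule below (pick the distribution with the largest q-quantile) is purely a placeholder assumption and not the definition given in Section 5.4. All function names and the example distributions are hypothetical.

```python
def max_trend_factor(a, b, t_start, t_end):
    """Largest value of LT(t) = a*t + b on [t_start, t_end].

    A linear function attains its maximum at one of the interval ends.
    """
    return max(a * t_start + b, a * t_end + b)

def scale_pmf(pmf, factor):
    """PMF of factor * Y when Y has the PMF `pmf` ({value: probability})."""
    scaled = {}
    for value, prob in pmf.items():
        key = value * factor
        scaled[key] = scaled.get(key, 0.0) + prob
    return scaled

def quantile(pmf, q):
    """Smallest value v with P(Y <= v) >= q."""
    cumulative = 0.0
    for value in sorted(pmf):
        cumulative += pmf[value]
        if cumulative >= q:
            return value
    return max(pmf)  # guards against rounding errors in the cumulative sum

def select_max_variable(pmfs, q=0.99):
    """ASSUMED stand-in for the outer max: the PMF with the largest q-quantile."""
    return max(pmfs, key=lambda pmf: quantile(pmf, q))

# Made-up detrended distributions, one per point in time of the
# characterization phase.
detrended_pmfs = [
    {1.0: 0.5, 2.0: 0.5},
    {1.0: 0.9, 4.0: 0.1},
]
# Maximal trend value expected in the future interval (made-up parameters).
factor = max_trend_factor(a=0.5, b=10.0, t_start=100.0, t_end=200.0)
scaled_pmfs = [scale_pmf(p, factor) for p in detrended_pmfs]
y_max = select_max_variable(scaled_pmfs, q=0.95)
```

Because the trend factor is the same for every point in time, scaling does not change which distribution the selection rule picks; it only stretches the demand axis.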