
5.4 Static Scheduling

5.4.2 Pessimistic Statistical Scheduling


below 0 because unused CPU time cannot be used later for compensation. Furthermore, no residual can exist at time $t = 0$. The real CPU time demand $JR_k^+(t, C_k)$ at time $t$ is now the initial demand $JR_k(t)$ plus the residual that remains from time $t-1$, which itself depends on $C_k$.

An exemplary initial time series JRk(t) of joint CPU time demand as well as the resulting one JR+k(t, Ck) are presented in Figure 5.7 a). One can clearly see that JR+k(t, Ck) exceeds JRk(t) when resource shortages occur. Hence, the real demand JR+k(t, Ck) will exceed Ck more often compared to the initial demand JRk(t), which results in an increased amount of resource shortages.
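The carry-over effect described above can be illustrated with a short Python sketch. It assumes discrete time steps and assumes that the residual after a step is the part of the real demand that exceeds $C_k$, clipped at 0. The function name `real_demand` and the sample values are illustrative and not taken from the thesis.

```python
def real_demand(initial_demand, capacity):
    """Return the real demand series JR+_k(t, C_k) for a given JR_k(t).

    Assumption: the residual left after a step is max(0, real demand - capacity),
    so unserved demand is carried into the next step and unused capacity
    is never saved for later.
    """
    real = []
    residual = 0.0  # no residual can exist at t = 0
    for demand in initial_demand:
        current = demand + residual
        real.append(current)
        residual = max(0.0, current - capacity)
    return real


# Illustrative values: the initial demand exceeds the capacity once,
# the real demand exceeds it twice.
print(real_demand([2.0, 6.0, 3.0, 1.0], capacity=4.0))  # [2.0, 6.0, 5.0, 2.0]
```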

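[Figure 5.7. Panel a): resources over time $t$, showing $JR_k(t)$, $JR_k^+(t, C_k)$ and the capacity $C_k$. Panel b): probability distributions $P(R)$ over $R$ for $JY_k$ and $JY_k^+$, annotated with $P(JY_k \leq C_k) = 85.1\%$ and $P(JY_k^+ \leq C_k) = 77.2\%$.]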

Figure 5.7: a) Resource demand JRk(t) of an exemplary time series and the resulting real resource demand JR+k(t, Ck) when JRk(t) exceeds the server’s capacity at some time. b) The respective probability distributions show that percentile based SLOs can be violated when interdependencies between resource demand and provided resources are neglected.

It is shown in Figure 5.7 b) that this changed demand behavior can lead to SLO violations.

The initial time series $JR_k(t)$ stays within the server's capacity 85.1% of the time, whereas the resulting real demand does so only 77.2% of the time. An SLO is therefore violated if the respective $P_i^{\min}$ is higher than 77.2%.
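A percentile check of this kind can be sketched as follows. The sketch assumes equally spaced demand samples; the function names and the sample series are illustrative only.

```python
def fraction_within_capacity(demand, capacity):
    """Fraction of time steps in which the demand does not exceed the capacity."""
    return sum(1 for d in demand if d <= capacity) / len(demand)


def violates_percentile_slo(demand, capacity, p_min):
    """True if the demand is met less often than the required fraction p_min."""
    return fraction_within_capacity(demand, capacity) < p_min


# Illustrative series: an initial demand and a real demand with carry-over.
initial = [2.0, 6.0, 3.0, 1.0]
real = [2.0, 6.0, 5.0, 2.0]
print(fraction_within_capacity(initial, 4.0))  # 0.75
print(fraction_within_capacity(real, 4.0))     # 0.5
print(violates_percentile_slo(real, 4.0, p_min=0.7))  # True
```

In this toy case the initial series alone would satisfy a requirement of 0.7, while the real series does not, which mirrors the effect shown in Figure 5.7 b).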

5 Statistical Static Resource Management

time.

Hence, the resources $A_i^{\max}$ maximally required by VM $i$ in the future interval $[t_{p3}, t_{p3}+\Delta t_{p3}]$ must be determined for each resource type. It has been shown in the modeling section how a random variable $Y_i^{\max}$ can be derived from the resource demand model that describes the maximally expected resource demand in the interval $[t_{p3}, t_{p3}+\Delta t_{p3}]$. $A_i^{\max}$ must be selected in a way that it provides enough resource capacity to meet the demand behavior described by exactly this random variable.

$Y_i^{\max}$ is selected as the maximum of the model $Y_i([t_{p2}, t_{p2}+\Delta t_{p2}])$ scaled by the influence of a possible long term trend according to Equation (5.16). To find this maximum, a size must be defined for the random variables $Y_i$ that describe the resource demand of VMs. For the resource management concept, this size is simply defined by the minimal resources $A_i$ that must be reserved to fulfill the SLO of the respective VM. A higher amount of required resources means a larger $Y_i$, which can be formally expressed as follows:

$$Y_i(t_0) > Y_i(t_1) \iff A_i(t_0) > A_i(t_1). \tag{5.20}$$

Hence, a function $q: Y(t) \mapsto A(t)$ must be found that determines the resources $A(t)$ required to fulfill the SLO with respect to the demand described by $Y(t)$. This function in conjunction with the method for finding $Y_i^{\max}$ (cf. Equation (5.16)) will result in the following equation:

$$A_i^{\max} = \max_{\forall t \in [t_{p2},\, t_{p2}+\Delta t_{p2}]} q\Big( Y_i(t) \cdot \max_{\forall t_{LT} \in [t_{p3},\, t_{p3}+\Delta t_{p3}]} \big( LT_i(t_{LT}) \big) \Big). \tag{5.21}$$

The resources $A_i^{\max}$ maximally required by VM $i$ in the future interval $[t_{p3}, t_{p3}+\Delta t_{p3}]$ to fulfill its SLO can be directly calculated using this equation.
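As a minimal sketch of how Equation (5.21) could be evaluated, the following Python function assumes that each random variable of the model is given as a discrete distribution (a dict from demand value to probability) and that the trend factors are positive numbers. The names `max_required_resources`, `model`, `trend_factors` and the sample data are illustrative assumptions, not part of the thesis.

```python
def max_required_resources(model, trend_factors, q):
    """Evaluate Equation (5.21) for one VM and one resource type.

    model:         list of PMFs, one per time step in [t_p2, t_p2 + dt_p2];
                   each PMF is a dict {demand value: probability}
    trend_factors: values LT_i(t_LT) for t_LT in [t_p3, t_p3 + dt_p3],
                   assumed to be positive
    q:             function that maps a PMF to the required resources A
    """
    worst_trend = max(trend_factors)
    # Scaling a discrete random variable by a constant scales its values.
    scaled = (
        {value * worst_trend: prob for value, prob in pmf.items()}
        for pmf in model
    )
    return max(q(pmf) for pmf in scaled)


# Illustrative data; q here simply takes the largest possible value.
model = [{1.0: 0.5, 2.0: 0.5}, {1.0: 0.9, 3.0: 0.1}]
largest_value = lambda pmf: max(v for v, p in pmf.items() if p > 0)
print(max_required_resources(model, [1.0, 1.5], largest_value))  # 4.5
```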

Different functions $q$ must be applied to derive $A_i(t)$ from $Y_i(t)$ depending on the resource type. In the following, one function will be presented for memory and one for CPU time.

Deriving Required Memory Capacity

The provided memory capacity $A_i(t)$ must meet the demand $R_i(t)$ all the time, since memory shortages can lead to complete service failures as discussed in Section 3.3.2.

Given a random variable $Y_i$ that describes the memory demand at a certain time $t$, memory capacity $A_i$ must be provided at $t$ in a way that $P(Y_i > A_i) = 0$ holds. This leads to $A_i = \max(Y_i)$ (cf. Section 5.1.2). Applying this to a whole stochastic process $Y_i(t)$ results in $A_i(t) = \max(Y_i(t)) = q(Y_i(t))$. Hence, function $q$ for memory simply determines the maximum of each random variable $Y_i$ of the stochastic process $Y_i(t)$.
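For a discrete distribution, this choice of $q$ amounts to picking the largest value with non-zero probability. A minimal sketch, assuming each $Y_i$ is represented as a dict from demand value to probability (the name `q_memory` and the values in MB are illustrative):

```python
def q_memory(pmf):
    """Required memory: the largest demand value with non-zero probability."""
    return max(value for value, prob in pmf.items() if prob > 0)


# One PMF per time step of an illustrative stochastic process Y_i(t).
process = [
    {512: 0.7, 768: 0.3, 1024: 0.0},
    {512: 0.2, 1024: 0.8},
]
print([q_memory(pmf) for pmf in process])  # [768, 1024]
```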

In principle, the resulting $A_i^{\max}$ should be the same as the maximal memory demand $R_i^{\max}$ determined using benchmarks in phase one of the concept. Hence, $R_i^{\max}$ could be used directly for resource management. Nevertheless, the way required memory capacity is determined from the random variables that describe the memory demand behavior has been presented here.

This method will be needed later on, when correlations are used for more optimistic statistical static resource management and for the dynamic resource management concept as well.

Deriving Required CPU Time

In contrast to memory, CPU time can be used to trade off provided capacity against service performance. Provided CPU time $A_i$ can be lower than the maximum of the demand $Y_i$, which can lead to performance losses as intended. These performance losses must be limited by the conditions derived from the SLOs.

A classical percentile based as well as a new more fine grained SLO specification have been presented in Section 5.2. Both are suited to support resource performance trade-offs. The resource management concept presented in this thesis can deal with both types of specifications.

However, only the support for fine grained SLOs will be presented, since any percentile based SLO can be expressed as a fine grained one as well.

Fine grained SLOs can be defined by a function $P^{\min}_{\eta_i}(\eta)$ as presented in Section 5.2. This function assigns a probability to each defined performance goal $\eta_i$ stating how often this goal must be achieved. Additionally, it has been shown how a function $P^{\min}_{\alpha_i}(\alpha)$ can be derived from $P^{\min}_{\eta_i}(\eta)$ that defines how often the ratio between provided resources $A_i$ and resource demand $R_i$ must not fall below a certain value $\alpha$. Finally, the following condition for resource management has been derived in Section 5.2.4, which must be met to not violate the SLO:

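$$\forall \alpha^{\min} \in\, ]0,1[\; \cap\, \mathrm{dom}(P^{\min}_{\alpha_i}),\ t_0:\quad P\big(g(R_i(IV^{SLO}_{t_0}), A_i(IV^{SLO}_{t_0})) \geq \alpha^{\min}\big) \geq P^{\min}_{\alpha_i}(\alpha^{\min}) \quad \text{with } IV^{SLO}_{t_0} = [t_0,\, t_0 + \Delta t^{SLO}_i]. \tag{5.22}$$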

This equation must hold for all possible realizations $R_i(t)$ of the stochastic process $Y_i(t)$ that describes the demand behavior of a VM. The major challenge is to calculate the probability $P\big(g(R_i(IV^{SLO}_{t_0}), A_i(IV^{SLO}_{t_0})) \geq \alpha^{\min}\big)$, which is needed to determine appropriate values for $A_i(t)$ with respect to the underlying process $Y_i(t)$.

To address this challenge, a method will first be presented that determines the minimal resource capacity $A_i$ required for the demand described by $Y_i$. For every value $\alpha^{\min} \in\, ]0,1[\; \cap\, \mathrm{dom}(P^{\min}_{\alpha_i})$, the ratio between provided and demanded resources must reach at least $\alpha^{\min}$ with a probability of at least $P^{\min}_{\alpha_i}(\alpha^{\min})$. This condition can be formally expressed as follows:

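$$\forall \alpha^{\min} \in\, ]0,1[\; \cap\, \mathrm{dom}(P^{\min}_{\alpha_i}):\quad P\big(g(Y_i, A_i) \geq \alpha^{\min}\big) \geq P^{\min}_{\alpha_i}(\alpha^{\min}). \tag{5.23}$$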


The transformation of a discrete random variable by a function leads to another discrete random variable [22]. Hence, the result $X_i$ of the function $X_i = g(Y_i, A_i)$ is a discrete random variable as well, because $Y_i$ is one. $Y_i$ describes the resource demand $R_i$ with respect to the noise. $X_i$ describes the resulting resource shortages $\alpha_i$ when a certain resource capacity $A_i$ is provided to the resource demand $Y_i$.

Based on this dependence, the probability $P\big(g(Y_i, A_i) \geq \alpha^{\min}\big)$ can be calculated using the probability distribution $f_{X_i}(\alpha)$ of the random variable $X_i$ as follows:

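$$\begin{aligned}
P\big(g(Y_i, A_i) \geq \alpha^{\min}\big) &= P(X_i \geq \alpha^{\min}) \\
&= 1 - P(X_i < \alpha^{\min}) \\
&= 1 - \int_{\alpha \in\, ]0,\alpha^{\min}[} P(X_i = \alpha)\, d\alpha \\
&= 1 - \int_{\alpha \in\, ]0,\alpha^{\min}[} f_{X_i}(\alpha)\, d\alpha.
\end{aligned} \tag{5.24}$$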

The probability distribution $f_{X_i}(\alpha)$ is not known so far. It will be shown in the next step how this probability distribution can be derived from the known distribution $f_{Y_i}(R)$ of $Y_i$.

It has been shown in Section 5.1.2 how the probability distribution $f_N(n)$ of a discrete random variable $N$ that is a mapping of another discrete random variable $M$ can be calculated from $M$'s distribution $f_M(m)$. Applying the respective equation (Equation (5.4)) to the random variables $Y_i$ and $X_i$, which are mapped onto each other by function $g$, leads to:

$$f_{X_i}(\alpha) = \sum_{R:\, g(R, A_i) = \alpha} f_{Y_i}(R). \tag{5.25}$$

One can derive from the definition of function $g$ in Equation (5.10) that $g$ reduces to the invertible function $g(R, A_i) = \frac{A_i}{R}$ for $\alpha < 1$. As a result, each value $\alpha \in\, ]0,1[$ will lead to exactly one value $R$. Hence, $R$ can be calculated from $\alpha$ using $R = \frac{A_i}{\alpha}$, which will reduce Equation (5.25) to:

$$f_{X_i}(\alpha) = f_{Y_i}\Big(\frac{A_i}{\alpha}\Big) \quad \text{for } \alpha \in\, ]0,1[. \tag{5.26}$$
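The mapping of Equation (5.25) can be sketched for a discrete demand distribution as follows. The sketch assumes that $g$ returns the ratio of provided to demanded resources and is capped at 1 whenever the demand is covered, which matches the description of $g$ in this section; the exact definition is Equation (5.10). The dict representation and the function names are illustrative.

```python
from collections import defaultdict


def g(demand, provided):
    """Ratio of provided to demanded resources, capped at 1 (assumed form of g)."""
    if demand <= provided:
        return 1.0
    return provided / demand


def shortage_pmf(demand_pmf, provided):
    """PMF of X_i = g(Y_i, A_i): sum f_Y(R) over all R mapped to the same alpha."""
    result = defaultdict(float)
    for demand, prob in demand_pmf.items():
        result[g(demand, provided)] += prob
    return dict(result)


# Illustrative distribution: demands 1 and 2 are fully covered by A_i = 2,
# demand 4 is served at a ratio of 0.5.
print(shortage_pmf({1.0: 0.5, 2.0: 0.25, 4.0: 0.25}, provided=2.0))
# {1.0: 0.75, 0.5: 0.25}
```

For every $\alpha < 1$ exactly one demand value contributes to the sum, which is the property used to obtain Equation (5.26).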

Inserting Equation (5.26) into Equation (5.24) and the result into Equation (5.23) leads to the following condition:

$$\forall \alpha^{\min} \in\, ]0,1[\; \cap\, \mathrm{dom}(P^{\min}_{\alpha_i}):\quad \int_{\alpha \in\, ]0,\alpha^{\min}[} f_{Y_i}\Big(\frac{A_i}{\alpha}\Big)\, d\alpha \leq 1 - P^{\min}_{\alpha_i}(\alpha^{\min}) \tag{5.27}$$

that is equivalent to the one initially expressed by Equation (5.23). Based on this equation and the probability distribution $f_{Y_i}(R)$ of the resource demand described by $Y_i$, it can now simply be tested whether a certain value of $A_i$ will violate the SLO or not. Starting with $R_i^{\max}$, possible values of $A_i$ are iterated downwards to find the minimal $A_i$ that barely satisfies the SLO.
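The downward search can be sketched in Python for a discrete demand distribution. This is a minimal sketch under several assumptions: $g$ is taken as the ratio of provided to demanded resources capped at 1, the SLO is given as a dict from $\alpha^{\min}$ to the required probability, and candidates are tried on a fixed grid with spacing `step`. All names and values are illustrative.

```python
def g(demand, provided):
    """Ratio of provided to demanded resources, capped at 1 (assumed form of g)."""
    if demand <= provided:
        return 1.0
    return provided / demand


def satisfies_slo(demand_pmf, provided, slo):
    """Check condition (5.23) for one candidate capacity.

    slo maps each alpha_min in ]0,1[ to the probability with which
    the ratio g must reach at least alpha_min.
    """
    for alpha_min, p_min in slo.items():
        achieved = sum(p for r, p in demand_pmf.items()
                       if g(r, provided) >= alpha_min)
        if achieved < p_min:
            return False
    return True


def minimal_capacity(demand_pmf, slo, step):
    """Iterate downwards from R_max and return the smallest passing capacity."""
    r_max = max(r for r, p in demand_pmf.items() if p > 0)
    best = r_max  # providing R_max always satisfies the condition
    candidate = r_max - step
    while candidate > 0 and satisfies_slo(demand_pmf, candidate, slo):
        best = candidate
        candidate -= step
    return best


demand_pmf = {1.0: 0.5, 2.0: 0.25, 4.0: 0.25}
slo = {0.5: 0.9, 0.9: 0.7}
print(minimal_capacity(demand_pmf, slo, step=0.5))  # 2.0
```

Stopping at the first failing candidate is sufficient here because the probability of reaching a given ratio cannot increase when less capacity is provided.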

In principle, this method can be applied individually for each time $t$ to determine $A_i(t)$ from $Y_i(t)$. But it is not obvious that $A_i(t)$ determined this way will also satisfy the condition this section starts with (Equation (5.22)). Determining $A_i(t)$ individually for each time $t$ only ensures that the univariate probability distribution of the resource demand at each fixed time $t$ will satisfy the SLOs. This does not necessarily imply that the same is true for all possible explicit time series $R_i(t)$ in each interval $IV^{SLO}_{t_0}$ as well (cf. Section 5.1.4). It will be shown in the following that this implication is actually valid under the assumption of statistical independence of the random variables of $Y_i(t)$.

Proof. Formally expressed, it needs to be shown that if condition (5.23) holds for all random variables $Y_i$ of the stochastic process $Y_i(t)$, the same will be true for any realization $R_i(t)$ in any interval $IV^{SLO}_{t_0}$ as well, which can be expressed by the following equation:

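$$\begin{aligned}
&\forall \alpha^{\min} \in\, ]0,1[\; \cap\, \mathrm{dom}(P^{\min}_{\alpha_i}),\ t_0,\ t \in IV^{SLO}_{t_0},\ Y_i = Y_i(t),\ A_i = A_i(t): \\
&\quad P\big(g(Y_i, A_i) \geq \alpha^{\min}\big) \geq P^{\min}_{\alpha_i}(\alpha^{\min}) \\
&\quad \Rightarrow\ P\big(g(R_i(IV^{SLO}_{t_0}), A_i(IV^{SLO}_{t_0})) \geq \alpha^{\min}\big) \geq P^{\min}_{\alpha_i}(\alpha^{\min}) \\
&\text{with } IV^{SLO}_{t_0} = [t_0,\, t_0 + \Delta t^{SLO}_i].
\end{aligned} \tag{5.28}$$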

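First, each $Y_i$ is mapped onto a random variable $X_i$ by function $g(Y_i, A_i)$, which turns the left part of the implication into $P(X_i \geq \alpha^{\min}) \geq P^{\min}_{\alpha_i}(\alpha^{\min})$. All $X_i$ together form the stochastic process $X_i(t)$. Now, a lower bound of all $X_i \in X_i(t)$ is defined by a new discrete random variable $\widetilde{X}_i$ as follows:

$$\forall \alpha^{\min} \in\, ]0,1[\; \cap\, \mathrm{dom}(P^{\min}_{\alpha_i}):\quad P(\widetilde{X}_i \geq \alpha^{\min}) = \min_{\forall X_i \in X_i(t)} \big( P(X_i \geq \alpha^{\min}) \big). \tag{5.29}$$

For this $\widetilde{X}_i$ it holds that: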

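$$\forall X_i \in X_i(t),\ \alpha^{\min} \in\, ]0,1[\; \cap\, \mathrm{dom}(P^{\min}_{\alpha_i}):\quad P(X_i \geq \alpha^{\min}) \geq P^{\min}_{\alpha_i}(\alpha^{\min}) \iff P(\widetilde{X}_i \geq \alpha^{\min}) \geq P^{\min}_{\alpha_i}(\alpha^{\min}). \tag{5.30}$$

Now let $x_i(t)$ be any possible realization of $X_i(t)$. As shown in Section 5.1.4, the probability that any $x_i(t)$ reaches or exceeds a certain threshold $\alpha^{\min}$ in an interval $IV^{SLO}_{t_0}$ can be calculated from the individual statistically independent random variables involved as follows: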

$$P\big(x_i(IV^{SLO}_{t_0}) \geq \alpha^{\min}\big) = \frac{1}{|IV^{SLO}_{t_0}|} \sum_{X_i \in X_i(IV^{SLO}_{t_0})} P(X_i \geq \alpha^{\min}). \tag{5.31}$$


With $\widetilde{X}_i$ being a lower bound of all $X_i$ (as defined by Equation (5.29)), it further holds that:

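$$\forall \alpha^{\min} \in\, ]0,1[\; \cap\, \mathrm{dom}(P^{\min}_{\alpha_i}),\ X_i \in X_i(t):\quad \frac{1}{|IV^{SLO}_{t_0}|} \sum_{X_i \in X_i(IV^{SLO}_{t_0})} P(X_i \geq \alpha^{\min}) \geq P(\widetilde{X}_i \geq \alpha^{\min}). \tag{5.32}$$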

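If $P(X_i \geq \alpha^{\min}) \geq P^{\min}_{\alpha_i}(\alpha^{\min})$ now holds for all $X_i \in X_i(t)$ and all $\alpha^{\min} \in\, ]0,1[\; \cap\, \mathrm{dom}(P^{\min}_{\alpha_i})$, then $P(\widetilde{X}_i \geq \alpha^{\min}) \geq P^{\min}_{\alpha_i}(\alpha^{\min})$ will hold as well according to Equation (5.30). In this case, the inequality $P\big(x_i(IV^{SLO}_{t_0}) \geq \alpha^{\min}\big) \geq P^{\min}_{\alpha_i}(\alpha^{\min})$ will also hold for any possible $x_i(t)$ and all $\alpha^{\min} \in\, ]0,1[\; \cap\, \mathrm{dom}(P^{\min}_{\alpha_i})$ because of Equations (5.31) and (5.32).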
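The core of this argument is that an average can never be smaller than the minimum of the averaged values. A minimal numeric sketch, with illustrative per-time-step probabilities that are not taken from the thesis:

```python
def interval_probability(per_step_probabilities):
    """Equation (5.31): mean of P(X_i >= alpha_min) over the time steps of an interval."""
    return sum(per_step_probabilities) / len(per_step_probabilities)


# Illustrative values of P(X_i >= alpha_min) for four time steps.
probabilities = [0.95, 0.90, 0.99, 0.92]
lower_bound = min(probabilities)  # P(X~_i >= alpha_min) as in Equation (5.29)
required = 0.90                   # hypothetical SLO requirement for this alpha_min

# Every single step meets the requirement, so the lower bound does too,
# and the interval average cannot fall below the lower bound.
assert all(p >= required for p in probabilities)
assert interval_probability(probabilities) >= lower_bound >= required
```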

Now let $R_i(t)$ be any concrete realization of the stochastic process $Y_i(t)$. Applying function $g$ to $R_i(t)$ leads to a time series $x_i(t)$ that is a realization of $X_i(t)$.⁷ In a final step, it needs to be shown that

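$$\begin{aligned}
\forall \alpha^{\min} \in\, ]0,1[\; \cap\, \mathrm{dom}(P^{\min}_{\alpha_i}):\quad & P\big(x_i(IV^{SLO}_{t_0}) \geq \alpha^{\min}\big) \geq P^{\min}_{\alpha_i}(\alpha^{\min}) \\
\iff\ & P\big(g(R_i(IV^{SLO}_{t_0}), A_i(IV^{SLO}_{t_0})) \geq \alpha^{\min}\big) \geq P^{\min}_{\alpha_i}(\alpha^{\min})
\end{aligned} \tag{5.33}$$

holds, which is not obvious since function $g$ is not invertible.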

Therefore, one can first transform the left side as follows:

$$P\big(x_i(IV^{SLO}_{t_0}) \geq \alpha^{\min}\big) = 1 - P\big(x_i(IV^{SLO}_{t_0}) < \alpha^{\min}\big). \tag{5.34}$$

Regarding the range of $\alpha^{\min}$ in Equation (5.33) and the definition of $g$ (cf. Equation (5.10) in Section 5.2.3), one can derive that indeed

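$$\begin{aligned}
1 - P\big(g(R_i(IV^{SLO}_{t_0}), A_i(IV^{SLO}_{t_0})) < \alpha^{\min}\big)
&= 1 - P\Big(\frac{A_i(IV^{SLO}_{t_0})}{R_i(IV^{SLO}_{t_0})} < \alpha^{\min}\Big) \\
&= 1 - P\big(x_i(IV^{SLO}_{t_0}) < \alpha^{\min}\big) \\
&= P\big(x_i(IV^{SLO}_{t_0}) \geq \alpha^{\min}\big)
\end{aligned} \tag{5.35}$$

holds. The reason is that function $g$ either returns the ratio between $A_i$ and $R_i$ or simply 1, depending on the parameters. A result of 1 will not influence the probability in Equation (5.35) for any $\alpha^{\min} \in\, ]0,1[\; \cap\, \mathrm{dom}(P^{\min}_{\alpha_i})$. Hence, only the invertible part of $g$ needs to be regarded.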

⁷ $X_i(t)$ is a mapping of $Y_i(t)$, which means that all random variables $X_i$ of $X_i(t)$ are mappings of the respective $Y_i$ of $Y_i(t)$. Mapping a random variable $Y_i$ onto a variable $X_i$ means mapping all possible realizations onto each other by the respective mapping function. Hence, if $X_i(t)$ is a mapping of $Y_i(t)$, a mapping of a concrete realization of $Y_i(t)$ is a concrete realization of $X_i(t)$.


Please note that Equation (5.31) will only be valid for sufficiently long intervals $IV^{SLO}_{t_0}$, as discussed in Section 5.1.4. For shorter intervals, a correction term must be added to the probability $P\big(g(Y_i, A_i) \geq \alpha^{\min}\big)$ in Equation (5.23) to determine $A_i(t)$. Such correction terms depend on the number of concrete realizations of a random variable (in this case, the samples in the interval $IV^{SLO}_{t_0}$). They can be characterized using known approaches from sampling theory, as presented for instance in [38]. This topic will not be detailed further within this thesis.

Finally, function $q$, which determines the CPU time capacity $A_i(t)$ required to support the CPU time demand $Y_i(t)$ with respect to an SLO, can determine $A_i$ individually at each time $t$ as described before.

Comparable to memory, Equation (5.21) can be used with function $q$ to determine the CPU time capacity $A_i^{\max}$ maximally required in the future. But in contrast to memory, $A_i^{\max}$ can be lower than the maximally expected demand $R_i^{\max}$, depending on the demand behavior of the VM and on how restrictive the SLO is.

Distributing VMs to Servers

Once the values $A_i^{\max}$ are found for each resource type of all VMs, the distribution of VMs to servers can be determined the same way the pessimistic static approach presented in Chapter 4 does. But the maximal CPU time $A_i^{\max}$ provided to VM $i$ can be lower compared to the value found by the initial approach because of the resource performance trade-off.

Providing less CPU time than required can lead to resource shortages as intended if the overall CPU time demand of all VMs placed on the same server exceeds its capacity. But it is important that the required resource capacity $A_i^{\max}$ is individually reserved for each VM to prevent SLO violations caused by these resource shortages. This way, the amount of provided capacity never falls below a value that can lead to SLO violations. Common virtualization environments allow reserving such a fixed amount of resource capacity individually for each VM, as described in Section 3.4.1.

Finally, VMs need to be sorted to apply heuristics such as first-fit or best-fit, while the distribution of VMs to servers is planned. This can be done by their Amaxi values the same way the pessimistic approach does.
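A generic first-fit heuristic on VMs sorted by descending $A_i^{\max}$ can be sketched as follows. This is not the procedure of Chapter 4 itself, which is not reproduced here. The sketch assumes a single resource type and identical servers, whereas the concept considers several resource types; the function name and the sample values are illustrative.

```python
def first_fit_decreasing(vm_demands, server_capacity):
    """Place VMs on servers in order of descending A_max (single resource type).

    vm_demands: dict {VM name: A_max}
    Returns a list of servers, each given as a list of VM names.
    """
    servers = []  # VM names per server
    free = []     # remaining capacity per server
    ordered = sorted(vm_demands.items(), key=lambda item: item[1], reverse=True)
    for vm, demand in ordered:
        if demand > server_capacity:
            raise ValueError(f"{vm} does not fit on any server")
        for index, remaining in enumerate(free):
            if demand <= remaining:
                servers[index].append(vm)
                free[index] -= demand
                break
        else:
            # No existing server has enough free capacity: open a new one.
            servers.append([vm])
            free.append(server_capacity - demand)
    return servers


print(first_fit_decreasing({"vm1": 5, "vm2": 3, "vm3": 4, "vm4": 2}, server_capacity=8))
# [['vm1', 'vm2'], ['vm3', 'vm4']]
```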