
5.4 Static Scheduling

5.4.1 Known Approaches

Statistical static resource management approaches similar to the one presented in this thesis have been suggested in [117, 58, 45]. All of them are based on stochastic processes and perform resource management in nearly the same way. Their main idea and their main weaknesses are briefly presented in the following.

These approaches model the resource demand of each VM by one single random variable $Y_i$ for each resource type, as mentioned in the modeling section. A random variable $JY_k$ that describes the joint resource demand of the VMs is derived for each resource type as follows:

$$JY_k = \sum_{i:B(i)=k} Y_i. \tag{5.17}$$

This variable is used to decide whether or not a certain combination of VMs will fit together on the same server k.

It has been shown in Section 5.1.2 that the probability distribution of the sum of statistically independent random variables can be determined by convolving their individual distributions. The authors of [117, 58, 45] suggest determining the probability distribution that describes $JY_k$ in exactly this way, under the assumption that the variables $Y_i$ are statistically independent. The validity of this assumption will be discussed later in this section.
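The convolution step can be illustrated with a minimal sketch. It assumes that demand is discretized into integer resource units, so that each $Y_i$ is given as a list of probabilities indexed by demand. The function names and the two example distributions are hypothetical and are not taken from [117, 58, 45].

```python
def convolve_pmf(p, q):
    """Distribution of X + Y for independent X ~ p and Y ~ q.

    p[r] is the probability that the demand equals r resource units.
    """
    out = [0.0] * (len(p) + len(q) - 1)
    for a, pa in enumerate(p):
        for b, qb in enumerate(q):
            out[a + b] += pa * qb
    return out


def joint_demand_pmf(vm_pmfs):
    """Fold convolve_pmf over the distributions of all VMs on one server."""
    joint = [1.0]  # an empty server demands 0 units with probability 1
    for pmf in vm_pmfs:
        joint = convolve_pmf(joint, pmf)
    return joint


# Two hypothetical VMs, each demanding 0, 1, or 2 resource units.
y1 = [0.5, 0.3, 0.2]
y2 = [0.6, 0.3, 0.1]
jy = joint_demand_pmf([y1, y2])  # distribution of JY_k over 0..4 units
```

The result is only the true distribution of the sum if the independence assumption holds.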

Once $JY_k$ is found, the jointly required resources $JA_k(t)$ that have to be provided to the VMs to fulfill their SLOs must be derived. $JA_k(t)$ is a constant value $JA_k$, since $JY_k$ is time independent.

A common way of specifying SLOs has been introduced in Section 5.2. A minimal required probability $P_i^{min}$ defines how often a defined performance goal must be satisfied. Such SLOs can be used to trade off resources against performance. The authors of [117, 58, 45] focused on such an SLO specification as well. They suggest determining the resource capacity $JA_k$ provided to the joint resource demand $JY_k$ as follows:

$$P(JY_k \le JA_k) \ge \max_{\forall i:B(i)=k}\left(P_i^{min}\right). \tag{5.18}$$

The resource capacity $JA_k$ provided to the VMs must satisfy the joint resource demand described by $JY_k$ at least with the maximum of the required probabilities $P_i^{min}$. The maximum ensures that the most restrictive SLO of the VMs is satisfied. In principle, $JA_k$ is selected as the $P_i^{min}$-th percentile of $JY_k$, using the most restrictive of the $P_i^{min}$ values.

According to the first constraint for static resource management expressed by Equation (3.3) in the problem statement chapter, $JA_k = JA_k(t)$ must be lower than or equal to the server's resource capacity $C_k$ to fulfill all SLOs.
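The percentile selection of Equation (5.18) and the capacity check can be sketched as follows. The sketch assumes a discrete joint distribution indexed by integer resource units. The function names, the example distribution, and the SLO values are hypothetical.

```python
def required_capacity(joint_pmf, p_min):
    """Smallest capacity a with P(JY_k <= a) >= p_min."""
    cumulative = 0.0
    for demand, prob in enumerate(joint_pmf):
        cumulative += prob
        # The small tolerance absorbs floating-point rounding in the sum.
        if cumulative >= p_min - 1e-12:
            return demand
    return len(joint_pmf) - 1


def fits_on_server(joint_pmf, p_mins, server_capacity):
    """Check JA_k <= C_k, with JA_k chosen for the most restrictive SLO."""
    ja = required_capacity(joint_pmf, max(p_mins))
    return ja <= server_capacity


# Hypothetical joint demand distribution over 0..4 resource units.
jy = [0.30, 0.33, 0.26, 0.09, 0.02]
p_mins = [0.85, 0.95]  # one required probability per VM

ja = required_capacity(jy, max(p_mins))         # 3 units
fits_small = fits_on_server(jy, p_mins, 2)      # False
fits_large = fits_on_server(jy, p_mins, 3)      # True
```

A bin packing heuristic could call such a check for every candidate combination of VMs.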

Based on this condition, classical bin packing algorithms can decide whether or not a certain combination of VMs will fit together on the same server. However, two main assumptions prevent this approach from being used directly in real data centers. They are discussed in the following.

Correlations

A major problem of this approach concerns the way the random variables $JY_k$ are derived from the $Y_i$ of the VMs. The convolution operation requires statistical independence, which is not given in real data centers.

Two main types of correlations can be distinguished. Structural correlations occur if services are distributed across different VMs. In this case, handling one request involves different VMs at nearly the same time. Creating a dynamic web page, for instance, typically requires the work of a web server, a database, and in some cases a storage system. Hence, the resource demands of the respective VMs are positively correlated. The second class of correlations, temporal correlations, typically occurs in data centers that support business activities during the day. At night, most of the services utilize the hardware only lightly, while they require maximal resources by day. Hence, their demand behavior is positively correlated as well.

When correlations are ignored, the joint resource demand of different VMs will be either under- or overestimated, depending on whether the random variables are positively or negatively correlated. An example of each case is illustrated in Figure 5.6. For positively correlated workload, the probability of the highest resource demand is higher than the one calculated using convolution. Negatively correlated workload, in contrast, reduces the probability that different applications demand many resources at the same time, so that the calculated distribution overestimates the real one.
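The effect can be reproduced with a small sketch. It uses hypothetical time series that alternate between 0 and 100 resource units, loosely following the setting of Figure 5.6, and compares the distribution predicted by convolution with the distributions actually observed when the series are summed. All names and values are illustrative assumptions.

```python
from collections import Counter


def empirical_pmf(series):
    """Relative frequency of each demand value in a time series."""
    counts = Counter(series)
    n = len(series)
    return {value: count / n for value, count in counts.items()}


def sum_assuming_independence(pmf_a, pmf_b):
    """Convolution of two discrete distributions given as dicts."""
    out = {}
    for a, pa in pmf_a.items():
        for b, pb in pmf_b.items():
            out[a + b] = out.get(a + b, 0.0) + pa * pb
    return out


r1 = [0, 100, 0, 100]
r2_pos = [0, 100, 0, 100]  # peaks together with r1
r2_neg = [100, 0, 100, 0]  # peaks when r1 is idle

# Both variants of r2 have the same marginal distribution as r1,
# so convolution predicts the same joint distribution in both cases.
predicted = sum_assuming_independence(empirical_pmf(r1), empirical_pmf(r2_pos))
actual_pos = empirical_pmf([a + b for a, b in zip(r1, r2_pos)])
actual_neg = empirical_pmf([a + b for a, b in zip(r1, r2_neg)])

# Probability of the peak demand of 200 units in each case.
peak = (predicted[200], actual_pos[200], actual_neg.get(200, 0.0))
```

In this example convolution assigns the peak demand a probability of 0.25, whereas it actually occurs half of the time in the positively correlated case and never in the negatively correlated one.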

As a result, either resources are wasted or SLO violations can occur when correlations are neglected while statistical resource management is performed. Two approaches that deal with correlations in different ways will be presented within this thesis. The first one pessimistically assumes completely positively correlated workload, which only leads to wasted resources but not to any SLO violations. The second one exploits negative correlations for a more optimistic resource management, while still guaranteeing that no SLO is violated.

Interdependencies between Required and Provided Resource Capacity

Memory demand must be met by the provided capacity at any time, as discussed in Section 3.3.2. Provided CPU time, in contrast, can be traded off against service performance by defining appropriate SLOs. Resource shortages are tolerated as long as performance goals are not violated more often than specified. The approaches presented in [117, 58, 45] try to ensure that this condition is met. The probability distribution of the joint CPU time demand $JY_k$ is used to decide whether or not a certain combination of VMs will fit together on the same server.

Figure 5.6: Influence of correlations when two exemplary random variables $Y_1$ and $Y_2$ are added. Their respective probability distributions $P(R)$ are presented in a). Three cases are regarded, in which the variables are either positively correlated b), uncorrelated c), or negatively correlated d). One exemplary pair of time series $R_1$, $R_2$ was selected for each case; it fits the variables as well as the respective correlation. Summing up the time series leads to different probability distributions $P(JR)$ of the joint demand $JR$. Hence, the sum of different random variables depends on possible correlations between them.

This method can only work if $JY_k$ is not influenced by resource shortages that actually occur. This assumption does not hold in most cases. An exemplary time series $JR_k(t)$ of the joint CPU time demand of different VMs will be analyzed in the following to point out why.

A certain amount of CPU time is needed by the VMs at each time step $t$. This demand can exceed the server's capacity $C_k$, as intended by the statistical resource management approach. When the capacity is actually exceeded at $t$, a residual of the CPU time demand remains. This residual is then demanded at time $t+1$ in addition to the initial demand $JR_k(t+1)$.

It is now assumed that the initial CPU time demand stated by $JR_k(t)$ is itself not changed by resource shortages. Based on this assumption, a second time series $JR^+_k(t, C_k)$ can be calculated from the initial one as follows:

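$$\begin{aligned}
JR^+_k(t, C_k) &= JR_k(t) + \epsilon_k(t-1, C_k) \quad \text{with} \\
\epsilon_k(t, C_k) &= \max\left(\epsilon_k(t-1, C_k) + JR_k(t) - C_k,\; 0\right) \quad \text{and} \\
\epsilon_k(0, C_k) &= 0.
\end{aligned} \tag{5.19}$$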

This time series additionally considers the remaining residuals. The residual $\epsilon_k(t, C_k)$ at time $t$ is the sum of the residual remaining from time $t-1$ and the difference between the resource demand at $t$ and the server's capacity $C_k$. The $\max$ function ensures that residuals do not fall below 0, because unused CPU time cannot be used later for compensation. Furthermore, no residual can exist at time $t = 0$. The real CPU time demand $JR^+_k(t, C_k)$ at time $t$ is then the initial demand $JR_k(t)$ plus the residual $\epsilon_k(t-1, C_k)$ that remains from time $t-1$.
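The recurrence of Equation (5.19) can be sketched in a few lines. The sketch assumes that the list entries correspond to the time steps $t = 1, 2, \dots$, so that the residual starts at its defined value of 0 for $t = 0$. The function name and the example values are hypothetical.

```python
def real_demand(initial_demand, capacity):
    """Compute JR+_k(t, C_k) from JR_k(t) according to Equation (5.19).

    initial_demand[0] is taken as JR_k(1); the residual at t = 0 is 0.
    """
    residual = 0  # residual at the previous time step
    real = []
    for demand in initial_demand:
        # Real demand: initial demand plus what was left over before.
        real.append(demand + residual)
        # New residual: never negative, unused CPU time is not carried over.
        residual = max(residual + demand - capacity, 0)
    return real


# Hypothetical joint CPU time demand and a server capacity of 10 units.
jr = [4, 12, 9, 6, 3]
jr_plus = real_demand(jr, 10)  # [4, 12, 11, 7, 3]
```

In this example the initial demand exceeds the capacity once, while the real demand exceeds it twice, because the shortage at the second step is carried into the third.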

An exemplary initial time series $JR_k(t)$ of joint CPU time demand as well as the resulting one $JR^+_k(t, C_k)$ are presented in Figure 5.7 a). One can clearly see that $JR^+_k(t, C_k)$ exceeds $JR_k(t)$ when resource shortages occur. Hence, the real demand $JR^+_k(t, C_k)$ will exceed $C_k$ more often than the initial demand $JR_k(t)$, which results in an increased number of resource shortages.

Figure 5.7: a) Resource demand $JR_k(t)$ of an exemplary time series and the resulting real resource demand $JR^+_k(t, C_k)$ when $JR_k(t)$ exceeds the server's capacity $C_k$ at some time. b) The respective probability distributions, with $P(JY_k \le C_k) = 85.1\%$ and $P(JY^+_k \le C_k) = 77.2\%$, show that percentile-based SLOs can be violated when interdependencies between resource demand and provided resources are neglected.

Figure 5.7 b) shows that this changed demand behavior can lead to SLO violations. The initial time series $JR_k(t)$ does not exceed the server's capacity 85.1% of the time. The resulting real time series, however, stays within the capacity only 77.2% of the time. An SLO is violated if the respective $P_i^{min}$ is higher than 77.2%.
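The same kind of comparison can be sketched for a small example. The two series below are hypothetical; the second one is what Equation (5.19) yields for the first with a capacity of 10 units. The function names and the SLO value are illustrative assumptions and do not reproduce the data behind Figure 5.7.

```python
def fraction_satisfied(series, capacity):
    """Share of time steps in which the demand does not exceed the capacity."""
    met = sum(1 for demand in series if demand <= capacity)
    return met / len(series)


def slo_met(series, capacity, p_min):
    """Percentile-based SLO: demand must be met at least p_min of the time."""
    return fraction_satisfied(series, capacity) >= p_min


capacity = 10
initial = [4, 12, 9, 6, 3, 11, 10, 8, 2, 5]   # JR_k(t)
real = [4, 12, 11, 7, 3, 11, 11, 9, 2, 5]     # JR+_k(t, C_k)

share_initial = fraction_satisfied(initial, capacity)  # 0.8
share_real = fraction_satisfied(real, capacity)        # 0.6

# An SLO with a required probability of 0.75 looks satisfiable when judged
# by the initial demand, but is violated by the real demand.
looks_ok = slo_met(initial, capacity, 0.75)
actually_ok = slo_met(real, capacity, 0.75)
```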