
6.4 Dynamic Scheduling

6.4.3 Ensuring Resource Constraints

Several resource constraints have been worked out in the problem statement chapter in Sections 3.4.2 and 3.4.3. For clarity, they are briefly summarized in the following.

Constraint One

The first constraint (Equation (3.3) in Section 3.4.2) ensures that enough resource capacity is provided to meet the demand of the VMs so that none of their SLOs is violated at any time. Therefore, the jointly required resources of all VMs placed on the same server must not exceed its capacity.

The jointly required resources are overestimated by the sum of the resources individually required by the VMs, comparable to the static approach, which leads to the following equation that describes constraint one:

\[
\forall t \in [t_{p3},\, t_{p3} + \Delta t_{p3}],\ \forall k:\quad \sum_{i:\,B(i,t)=k} \vec{a}_i(t) \le \vec{c}_k \tag{6.4}
\]

The sum of the resources \(\vec{a}_i(t)\) individually required by all VMs placed on the same server \(k\) must not exceed its capacity \(\vec{c}_k\) at any time⁵.

Constraints Two and Three

The second and third constraint ensure that the mapping restrictions concerning the distribution of VMs to servers are met. They are formally expressed by the Equations (3.7) and (3.8) in Section 3.4.3.

Constraint Four

A further constraint concerns the resources required by a VM \(i\) while it is migrated from a server \(k_1\) to a server \(k_2\). Both servers must provide the resource capacity (memory and CPU time) required by the VM during the whole migration phase (cf. Equation (3.9) in Section 3.4.3). Overestimating again the jointly required resources by the sum of the individually

⁵The operation \(\le\) in this equation has been defined by Equation (3.4) in Section 3.4.2.



required resources \(\vec{a}_i(t)\) leads to the following equation that describes constraint four:

\[
\forall t \in [t_0,\, t_0 + \Delta t^{\mathrm{mig}}_i[:\quad \vec{a}_i(t) + \sum_{j:\,B(j,t)=k_2} \vec{a}_j(t) \le \vec{c}_{k_2} \tag{6.5}
\]

The migration starts at \(t_0\) and has a duration of \(\Delta t^{\mathrm{mig}}_i\).

Constraint Five

A last resource constraint worked out in Section 3.4.3 states that the number of servers required by the initial safe distribution must not be exceeded by the dynamic scheduling algorithm at any time.
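Constraint four can likewise be checked per time instant: during a migration the VM's demand is counted on both the source and the destination server. A minimal sketch, with all identifiers illustrative:

```python
def migration_feasible(vm, src, dst, placement, demand, capacity):
    """Check that both servers can carry the migrating VM's demand
    (cf. Equation (6.5)) at one time instant of the migration phase."""
    for server in (src, dst):
        # The migrating VM loads both servers during the migration.
        load = list(demand[vm])
        for other, s in placement.items():
            if other != vm and s == server:
                load = [l + d for l, d in zip(load, demand[other])]
        # Elementwise comparison against the server's capacity vector.
        if any(l > c for l, c in zip(load, capacity[server])):
            return False
    return True
```

A full implementation would evaluate this over the whole interval \([t_0, t_0 + \Delta t^{\mathrm{mig}}_i[\), not just a single instant.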

A formal way to decide whether possible operations prevent a way back to the safe distribution with respect to all of these constraints will be presented in the following.

For this, a formal description of the “state of safety” of a distribution is introduced. This state is represented by a directed multigraph \(G = (N, E)\). The nodes \(N\) are the servers. An edge \(e \in E\) represents a way for an unsafe VM to become a safe one. Such a graph is presented for an exemplary unsafe distribution in Figure 6.6 b). The edges in the graph point to the servers to which the respective VMs must be moved to become safe. Moving all VMs back to their safe positions leads to a graph without any edges (cf. Figure 6.6 a)). Hence, a graph without any edges describes a safe distribution, while graphs that contain edges represent unsafe ones.


Figure 6.6: a) An exemplary safe distribution of VMs to servers and the respective graph. b) An exemplary unsafe distribution that belongs to the safe one of a) and the respective graph. Edges point the way for unsafe VMs to become safe.
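The “state of safety” multigraph can be represented directly as a set of server nodes and a list of edges, one edge per unsafe VM. The following is a minimal sketch of this bookkeeping; the class and method names are illustrative, not taken from the thesis.

```python
class SafetyGraph:
    """Directed multigraph G = (N, E): nodes are servers, each edge
    (src, dst, vm) records that the unsafe VM `vm`, currently on server
    src, must be moved to dst to become safe again."""

    def __init__(self, servers):
        self.nodes = set(servers)
        self.edges = []  # list of (src, dst, vm); parallel edges allowed

    def mark_unsafe(self, vm, current, home):
        # The VM left its safe server `home` and now resides on `current`.
        self.edges.append((current, home, vm))

    def mark_safe(self, vm):
        # The VM was migrated back; drop its edge(s).
        self.edges = [e for e in self.edges if e[2] != vm]

    def is_safe(self):
        # A distribution is safe iff the graph has no edges.
        return not self.edges
```

With this representation, the safe distribution of Figure 6.6 a) corresponds to an empty edge list, and every consolidation move adds one edge.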

In principle, any redistribution can be performed as long as the first, second, and third constraints are met. VMs can be consolidated onto fewer servers in times of reduced resource demand as long as Equation (6.4) holds and the mapping restrictions are not violated. The way back for unsafe VMs is annotated in the graph by inserting the respective edges. Unused servers can be switched off. In the worst case, all inactive servers must be reactivated and all unsafe VMs must simply be moved back to their initial servers according to the edges in the graph to restore the safe distribution.

But the last two resource constraints prevent moving back the VMs in some cases because of missing resource capacity on the destination server. Other VMs must first be moved away from this server to free resources for the VM. Moving away these other VMs requires free resource capacity at their initial servers as well. This can lead to situations in which cyclic dependencies prevent any of the VMs from being moved at all without an additional server. An upcoming resource shortage then cannot be resolved.

This situation is strongly related to classical deadlock problems known from operating system and database theory [113, 3]. There are essentially four different strategies to deal with deadlocks [113]:

• Ignoring the problem (ostrich strategy),

• detecting and resolving the deadlocks,

• dynamically preventing deadlocks at runtime, or

• conceptually preventing deadlocks.

The first two options are not applicable in the given context. Applying the first one cannot guarantee that any SLOs are met. Resolving deadlocks is not possible without an additional server, which is forbidden by one of the resource constraints. Hence, deadlocks must be prevented.

The scheduling algorithm developed within this thesis prevents deadlocks by avoiding cyclic dependencies, inspired by the idea presented in [14]. Such cyclic dependencies can be directly detected in the graph introduced before. Each distribution of VMs to servers whose graph contains a cycle can in principle cause such deadlocks: all servers whose nodes belong to a cycle can contain unsafe VMs that cannot be moved back home because other unsafe VMs allocate the resource capacity they require. Hence, no migration operation that leads to a cycle in the graph may be performed, in order to prevent such deadlock scenarios.
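The acyclicity test can be performed with a standard depth-first search before a migration edge is inserted. A minimal sketch, assuming edges are given as (source server, destination server) pairs; the function names are illustrative.

```python
def has_cycle(nodes, edges):
    """DFS-based cycle check on a directed (multi)graph."""
    adj = {n: [] for n in nodes}
    for src, dst in edges:
        adj[src].append(dst)
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited / on DFS stack / finished
    color = {n: WHITE for n in nodes}

    def dfs(n):
        color[n] = GREY
        for m in adj[n]:
            # A grey neighbour means we closed a cycle on the DFS stack.
            if color[m] == GREY or (color[m] == WHITE and dfs(m)):
                return True
        color[n] = BLACK
        return False

    return any(color[n] == WHITE and dfs(n) for n in nodes)


def migration_allowed(nodes, edges, new_edge):
    """Permit a consolidation move only if the graph stays acyclic."""
    return not has_cycle(nodes, edges + [new_edge])
```

Only moves for which `migration_allowed` returns `True` keep the guaranteed way back to the safe distribution.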

It will be shown in the following that indeed preventing cycles in the graph ensures a way back to the safe distribution with respect to all resource constraints no matter how the resource demand of the VMs develops. An algorithm will be presented that can resolve any upcoming resource shortage assuming that the underlying graph of a current distribution is acyclic.

First, nodes without any outgoing edges are considered (e.g., S4 in Figure 6.6 b)). These nodes represent servers that contain only safe VMs or no VMs at all. Hence, unsafe VMs that are safe on these servers can be moved back in any case with respect to the resource constraints. The initial safe distribution ensures that enough resource capacity is left for them to support their maximal demand at any time. Moreover, none of the mapping restrictions will be violated in this case. The respective incoming edges are removed from the graph after the respective VMs have been moved back.



Now, an arbitrary unsafe distribution with an acyclic graph is considered. All outgoing paths from any node in the graph end in nodes that have no outgoing edges. The outgoing path from S2 ends in S4 in the graph in Figure 6.6 b), for instance. All incoming edges of nodes at the end of the paths can be removed by migrating back the respective VMs as described before. This leads to other nodes along the path that no longer have any outgoing edges, so that their incoming edges can be removed as well. Finally, all edges of all nodes can be removed by recursively removing edges from all paths, starting at their respective ends. In the example, first the edge from S1 to S4 is removed by migrating VM 9 back home, before the edges between S3 and S1 can be removed by moving VM 2 and VM 3. Finally, the edge between S2 and S3 can be removed by moving VM 7, which results in the initial safe distribution.
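This unwinding procedure can be sketched as a simple loop: repeatedly pick servers without outgoing edges and migrate back the VMs whose edges point to them. The sketch below is illustrative; the edge list in the usage example mirrors the Figure 6.6 b) scenario described above (VM 9: S1→S4, VM 2 and VM 3: S3→S1, VM 7: S2→S3).

```python
def unwind_order(nodes, edges):
    """Return an order in which unsafe VMs can be migrated back home.

    edges: list of (src, dst, vm). Repeatedly remove the incoming edges of
    servers without outgoing edges; this terminates for any acyclic graph.
    """
    edges = list(edges)
    order = []
    while edges:
        # Servers without outgoing edges hold only safe (or no) VMs.
        sinks = {n for n in nodes if not any(s == n for s, _, _ in edges)}
        ready = [e for e in edges if e[1] in sinks]
        if not ready:  # only possible if the graph contains a cycle
            raise ValueError("cyclic graph: no safe unwinding exists")
        order.extend(vm for _, _, vm in ready)
        edges = [e for e in edges if e[1] not in sinks]
    return order
```

For the example graph, `unwind_order` yields VM 9 first, then VM 2 and VM 3, then VM 7, matching the migration order derived in the text.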

Please note that VMs must be recursively migrated back only in the worst case. In most cases, only a few VMs must be moved to resolve an upcoming resource shortage. But before VMs are moved, it has to be ensured that the graph remains acyclic. This ensures a way back to the complete safe distribution at any time.

It will be presented in the following how a set of feasible operations that prevents cycles in the graph can be extracted from a current distribution.