Current Implementation - Resource-Elasticity Support for Distributed Memory HPC Applications

Figure 7.1: Program structure of the simple EPOP example (with source in Listing 7.6).

A branch can select any phase except the initialization phase, which is restricted to a single execution. In the example shown in Fig. 7.1, a branch determines whether to finalize the application after Phase 1 or to continue execution from Phase 0.

7.2.3 Application Data

All application state is held outside of the phases in the application data, similarly to functional programming languages. Phases do not hold state themselves: all static or allocated application data and file descriptors must be created in the data block only and are then passed between phases. Once the computations of a block have been applied, the data is passed to the next computational block according to the control flow of the program.

In the example presented in Fig. 7.1, the application data is passed from phase to phase. In the case of the EPs, the data is also passed back to the same computational block on each iteration. Finally, if the branch is taken, the data is passed from the EP Phase 1 to the EP Phase 0.
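The data-passing scheme described above can be sketched in C as follows. This is a minimal illustration and not code from the EPOP implementation: the type `sim_data_t`, its fields, and the functions `sim_data_create`, `sim_data_destroy`, `phase0_exec` and `phase1_exec` are hypothetical names chosen for this example.

```c
#include <stdlib.h>

/* Hypothetical application data block: all state lives here, not in the phases. */
typedef struct {
    int     n;          /* number of local elements */
    double *values;     /* allocated once, as part of the data block */
    int     iteration;  /* loop state that a phase would otherwise keep locally */
} sim_data_t;

/* Creates the data block; in EPOP this would belong to the initialization phase. */
sim_data_t *sim_data_create(int n) {
    sim_data_t *d = malloc(sizeof *d);
    if (d == NULL) return NULL;
    d->values = calloc((size_t)n, sizeof *d->values);
    if (d->values == NULL) { free(d); return NULL; }
    d->n = n;
    d->iteration = 0;
    return d;
}

void sim_data_destroy(sim_data_t *d) {
    if (d != NULL) { free(d->values); free(d); }
}

/* A phase receives the data block as an opaque pointer and keeps no state itself. */
void phase0_exec(void *data) {
    sim_data_t *d = data;
    for (int i = 0; i < d->n; i++)
        d->values[i] += 1.0;
    d->iteration++;
}

/* The next phase sees exactly what the previous one left in the data block. */
void phase1_exec(void *data) {
    sim_data_t *d = data;
    for (int i = 0; i < d->n; i++)
        d->values[i] *= 2.0;
}
```

Because neither phase uses static or global variables, a driver can call them in any order dictated by the control flow, or repeat one of them, and the only thing that changes is the data block handed to the next call.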

7.3 Current Implementation

A minimalistic implementation of EPOP is presented in this section. It is currently implemented in the C programming language. The implementation will be used to show more concretely how EPOP solves the original problem of improving the structure of elastic MPI programs, while also providing some additional benefits, such as simplifying the creation of performance models to assist runtime schedulers.

7.3.1 Driver Program

EPOP programs are not compiled as executable binaries. Instead, they are built into archives that are loaded by a driver program. Currently only one driver is provided with EPOP, but multiple drivers can be provided later since driver programs are independent of EPOP applications. Profiling and automatic tuning drivers could be added to the implementation in the future.

7 Elastic-Phase Oriented Programming (EPOP)

Each EPOP program implements a get_program() routine that returns a program definition. The program definition includes its size, a program structure and some other metadata that describes it.
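As a rough illustration of what such a routine might look like, the sketch below assembles a program definition with an initialization element, one EP and one rigid phase. The actual types and the signature of the routine are not shown at this point in the text, so `program_t`, `element_t`, the type tags, the return type of `get_program` and all phase functions are assumptions made for this example.

```c
#include <stddef.h>

/* Assumed type tags; the real constants are not shown in the text. */
enum { INIT, EP, RP, BRANCH };

/* Reduced program element with only the fields this sketch needs. */
typedef struct {
    int   type;
    void (*init)(int *, char ***, void **);
    void (*phase_exec)(void *);
    int  (*loop_condition)(void *);
} element_t;

/* Assumed program definition: size, structure and application data. */
typedef struct {
    int        size;      /* number of program elements */
    element_t *elements;  /* program structure, in execution order */
    void      *data;      /* application data passed to every element */
} program_t;

static int counter;  /* stands in for real application data */

/* Initialization: publishes the data block through the void ** argument. */
static void setup(int *argc, char ***argv, void **data) {
    (void)argc; (void)argv;
    counter = 0;
    *data = &counter;
}
static void compute(void *data)  { *(int *)data += 1; }
static void report(void *data)   { (void)data; }
static int  run_once(void *data) { (void)data; return 0; }  /* never repeat */

static element_t elements[] = {
    { .type = INIT, .init = setup },
    { .type = EP,   .phase_exec = compute, .loop_condition = run_once },
    { .type = RP,   .phase_exec = report,  .loop_condition = run_once },
};

static program_t program = {
    .size     = (int)(sizeof elements / sizeof elements[0]),
    .elements = elements,
    .data     = NULL,  /* filled in by the initialization element */
};

/* Entry point that a driver would call after loading the application. */
program_t *get_program(void) {
    return &program;
}
```

Keeping the whole structure in one statically initialized array makes the order of the elements explicit, which matters once branches compute indices into it.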

mpiexec -n 1024 epop ./double_phase_example <parameter1> <parameter2>

Listing 7.2: EPOP + MPI application started with an MPI launcher. Multiple EPOP drivers are started (each with its own instance of the EPOP application and common parameters).

EPOP applications are not necessarily also MPI applications. In fact, these applications don’t even need to be parallel to be implemented with the EPOP framework. For EPOP + MPI applications, multiple driver programs are launched. Each driver instance created by the MPI launcher will load the EPOP + MPI application in a typical SPMD style launch.

Listing 7.2 shows a sample launch command with a typical mpiexec launcher, as provided by some resource manager and MPI library combinations.

// the initialization phase is called before this code block
// the initial pc value is provided by the resource manager
while (pc < program->size) {
    switch (program->elements[pc].type) {
    case EP:
        do {
            // probing for adaptations is done here
            if (resource_adaptation)
                program->elements[pc].phase_adapt(program->data);
            program->elements[pc].phase_exec(program->data);
        } while (program->elements[pc].loop_condition(program->data));
        pc++;
        break;
    case RP:
        do {
            program->elements[pc].phase_exec(program->data);
        } while (program->elements[pc].loop_condition(program->data));
        pc++;
        break;
    case BRANCH:
        pc = program->elements[pc].branch_condition(pc, program->data);
        // index 0 is the initialization phase and must not be re-entered;
        // the upper bound is the program size, not the data pointer
        if (pc == 0 || pc > program->size)
            return -2;
        continue;
    default:
        return -1;
    }
}

Listing 7.3: Simplified example control loop of a driver program.

Listing 7.3 shows a simplified example of a control loop used to implement a driver program. The program counter (the pc variable) is used as an index into an array of EPs, rigid phases and branches, and each element is called according to its type with the program's data.
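To make the control flow of Listing 7.3 concrete, the following self-contained sketch wraps the loop in a function and runs it on a toy program. The types `program_t`, `element_t` and `toy_data_t`, the functions `run_program`, `run_toy` and `probe_adaptation` (a stand-in for the resource manager probe), and the toy phases are all assumptions introduced here; no MPI library or resource manager is involved.

```c
#include <stddef.h>

enum { INIT, EP, RP, BRANCH };  /* assumed type tags */

typedef struct {
    int  type;
    void (*phase_adapt)(void *);
    void (*phase_exec)(void *);
    int  (*loop_condition)(void *);
    int  (*branch_condition)(int, void *);
} element_t;

typedef struct {
    int        size;
    element_t *elements;
    void      *data;
} program_t;

/* Toy application data: counts how often each block ran. */
typedef struct {
    int ep_runs, rp_runs, adaptations, rounds;
    int adapt_pending;  /* simulated request from the resource manager */
} toy_data_t;

/* Stand-in for probing the resource manager (hypothetical). */
static int probe_adaptation(void *d) { toy_data_t *t = d; return t->adapt_pending; }

static void ep_adapt(void *d) { toy_data_t *t = d; t->adaptations++; t->adapt_pending = 0; }
static void ep_exec(void *d)  { toy_data_t *t = d; t->ep_runs++; }
static int  ep_loop(void *d)  { toy_data_t *t = d; return t->ep_runs % 3 != 0; }
static void rp_exec(void *d)  { toy_data_t *t = d; t->rp_runs++; }
static int  rp_loop(void *d)  { (void)d; return 0; }

/* Jumps back to the EP once, then moves past the end of the program. */
static int toy_branch(int pc, void *d) {
    toy_data_t *t = d;
    return (++t->rounds < 2) ? 1 : pc + 1;
}

int run_program(program_t *program, int pc) {
    while (pc < program->size) {
        element_t *e = &program->elements[pc];
        switch (e->type) {
        case EP:
            do {
                if (probe_adaptation(program->data))
                    e->phase_adapt(program->data);
                e->phase_exec(program->data);
            } while (e->loop_condition(program->data));
            pc++;
            break;
        case RP:
            do {
                e->phase_exec(program->data);
            } while (e->loop_condition(program->data));
            pc++;
            break;
        case BRANCH:
            pc = e->branch_condition(pc, program->data);
            /* reject the initialization slot, negative and out-of-range targets */
            if (pc <= 0 || pc > program->size)
                return -2;
            continue;
        default:
            return -1;
        }
    }
    return 0;
}

/* Builds and runs the toy program: initialization slot, EP, RP, branch. */
int run_toy(toy_data_t *t) {
    element_t elements[] = {
        { .type = INIT },
        { .type = EP, .phase_adapt = ep_adapt, .phase_exec = ep_exec,
          .loop_condition = ep_loop },
        { .type = RP, .phase_exec = rp_exec, .loop_condition = rp_loop },
        { .type = BRANCH, .branch_condition = toy_branch },
    };
    program_t program = { .size = 4, .elements = elements, .data = t };
    return run_program(&program, 1);  /* index 0 is assumed to have run already */
}
```

Starting from a zeroed `toy_data_t` with `adapt_pending` set to 1, the EP runs three iterations per visit, the branch sends the program counter back to index 1 once, and the loop ends when the branch returns an index equal to the program size.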

7.3.2 Program Element

As seen in the sample code in Listing 7.3, the driver program traverses an array of both phases and branches. This is possible because the elements of the array share a single type, the program element structure. A program element can be one of the three phase types or a branch. The sample driver code has no case for the initialization program element type, since it is called only once before entering the driver loop. The current C language implementation of the program element type is shown in Listing 7.4.

typedef struct {
    int type;
    void (*init)(int *, char ***, void **);
    void (*phase_adapt)(void *);
    void (*phase_exec)(void *);
    int (*loop_condition)(void *);
    int (*branch_condition)(int, void *);
} program_element_t;

Listing 7.4: C structure of the program element.

The compute blocks of phases resemble kernels in other programming models. Each compute block is defined in nearly the same way as a PThread [169, 1, 172], with a single void pointer as input and an integer return type for error handling by the driver program.

The adaptation block has the same interface, but has a different purpose. When the driver program interacts with the resource manager and determines that there is an adaptation to be performed, this routine is called on preexisting processes. On joining processes, the driver program proceeds to get the program counter value from the resource manager first, and then calls the adaptation block of the appropriate phase. This design eliminates the need for complex control structures found in regular resource-elastic MPI applications.

The branch type can be used for arbitrary jumps. This is achieved by modifying the program counter passed to the branch implementation by the driver program. The branch operation takes both the application data and the program counter. Absolute or differential jumps can be implemented by either computing a concrete value for the program counter, or by adding or subtracting a differential to its current value. An example code snippet of an if-else block based on differential jumps can be seen in Listing 7.5.

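int branch(int pc, void *appdata) {
    // app_data_t stands for the application's own data type (name assumed);
    // the void pointer must be cast before its fields can be accessed
    app_data_t *data = (app_data_t *)appdata;
    if (data->branch_condition == 0)
        return (pc + 1);
    else
        return (pc + 2);
}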

Listing 7.5: C code for a differential branch performing an if-else operation. The driver continues to the program element at pc+1 or pc+2 depending on the branch_condition field of the application data. The program structure must be assembled in the correct order in the get_program() routine.
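For contrast with the differential jump in Listing 7.5, an absolute jump computes the target index directly. The sketch below shows a branch similar to the one in Fig. 7.1, which either returns to an earlier phase or moves past the last element so that the driver loop ends. The data type `loop_data_t`, the constants `PHASE0_INDEX` and `PROGRAM_SIZE`, and the function names are hypothetical, and the indices assume a layout of initialization, Phase 0, Phase 1 and branch.

```c
/* Assumed layout: 0 = initialization, 1 = Phase 0, 2 = Phase 1, 3 = branch. */
enum { PHASE0_INDEX = 1, PROGRAM_SIZE = 4 };

typedef struct {
    int remaining_rounds;  /* hypothetical application state driving the branch */
} loop_data_t;

/* Absolute jump: the result does not depend on the current pc. */
int branch_absolute(int pc, void *appdata) {
    loop_data_t *data = appdata;
    (void)pc;
    if (data->remaining_rounds > 0) {
        data->remaining_rounds--;
        return PHASE0_INDEX;   /* continue execution from Phase 0 */
    }
    return PROGRAM_SIZE;       /* one past the last element: the driver loop ends */
}

/* Differential jump with the same effect when the branch sits at index 3. */
int branch_differential(int pc, void *appdata) {
    loop_data_t *data = appdata;
    if (data->remaining_rounds > 0) {
        data->remaining_rounds--;
        return pc - 2;         /* two elements back: Phase 0 */
    }
    return pc + 1;             /* next index, which is past the end here */
}
```

The absolute variant keeps working if the branch element is moved but breaks if Phase 0 is moved; the differential variant behaves the other way around, which is why the order of elements in the program structure has to match the branch code.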


MPI_Comm_adapt_begin(&intercomm, &new,
                     &staying_count, &leaving_count, &joining_count);
program->elements[1].phase_adapt = adapt;
program->elements[1].phase_exec = phase0;

Listing 7.6: Simple elastic EPOP application (for comparison with MPI in Listing 7.1).

Examples of the initialization, EP and rigid-phase types can be seen in the double phase example in Listing 7.6; this example matches the code in the MPI example shown in
