

4.2 TOWARDS A GENERIC PARALLELIZATION OF THE CANN FRAMEWORK


Figure 4.2 – The parallel architecture solution: a Vector of NetManager instances, each running its CNMImplementation in a separate thread

The CANN simulation environment allows the user to create several ANN instances at the same time, grouping them in a Vector. The user can start the execution of each ANN at any time.
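As an illustration of this usage, the sketch below groups two nets in a Vector and starts each one on its own thread. The NetManager shown here is a simplified stand-in for illustration only; the actual CANN classes and their interfaces differ.

```java
import java.util.Vector;

// Illustrative sketch of the architecture in Figure 4.2: several nets are
// grouped in a Vector and each one runs its learning loop on its own
// thread. NetManager here is a simplified stand-in, not the CANN class.
public class ParallelRunner {
    static class NetManager implements Runnable {
        private final String netName;
        private final int epochs;

        NetManager(String netName, int epochs) {
            this.netName = netName;
            this.epochs = epochs;
        }

        public void run() {
            for (int epoch = 0; epoch < epochs; epoch++) {
                // one learning epoch of the wrapped ANN would execute here
            }
            System.out.println(netName + " finished");
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Vector<Thread> nets = new Vector<>();
        nets.add(new Thread(new NetManager("BP", 10000)));
        nets.add(new Thread(new NetManager("SOM", 5000)));
        for (Thread t : nets) t.start(); // both nets compete for the CPUs
        for (Thread t : nets) t.join();  // total time = slowest net
    }
}
```

The user-driven start in the CANN GUI corresponds to calling `start()` on each thread at an arbitrary moment; `join()` simply waits until the slowest net has finished.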

4.2.2 CANN parallel solution test results

The main goal of the tests is to verify whether having different ANN instances running at the same time influences each network's performance. The proposed test measures the performance of two different ANN instances when running alone and when running together, sharing the machine resources. The same test was performed on machines with one and with two processors, to verify the behavior of the solution when a single CPU is shared and when two CPUs can be allocated. The first machine is a Pentium III 550 MHz with 256 MB of RAM and the second is an IBM Netfinity 3000 with two Pentium III 667 MHz processors and 512 MB of RAM.

The two selected nets were the Backpropagation and the SOM. The time each network took to complete its learning process was measured. The Backpropagation network ran 10000 epochs to learn the XOR problem and the SOM ran 5000 epochs to learn the Bi-Dimensional problem. The learning process is started manually for both networks through the CANN GUI, so one network must be started before the other; the SOM network was started first.

Table 4.1 – Networks running on a machine with one CPU

  Network    Standalone    Parallel
  BP         14843 ms      20812 ms
  SOM        10750 ms      21429 ms

Table 4.1 shows the results of the tests performed on the single-CPU machine. The Standalone column shows the average time each network took to complete its learning when run in its own time frame, not competing for the CPU. The Parallel column shows the average time when they were executed in parallel. The results show that, when running in parallel, each network individually takes longer to complete its learning; however, since they run together, the total time is simply the time the last net took to learn, whereas running them separately requires summing both times. Figure 4.3 shows the difference between running the ANNs sequentially and in parallel. Running the Backpropagation and the SOM separately would take on average 25593 ms (14843 ms + 10750 ms), while running them together takes on average 21429 ms (the time the SOM took to finish; the BP had certainly finished before). Thus, it is worth running the networks in parallel even on a single-CPU machine.
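The arithmetic above can be restated in a short sketch: the sequential total is the sum of the standalone times, while the parallel total is only as long as the slowest net. The values are the measured averages from Table 4.1.

```java
// Sequential execution adds up the individual learning times; parallel
// execution costs only as much as the slowest net.
public class TimingCheck {
    static long sequentialTotal(long[] times) {
        long sum = 0;
        for (long t : times) sum += t;
        return sum;
    }

    static long parallelTotal(long[] times) {
        long max = 0;
        for (long t : times) max = Math.max(max, t);
        return max;
    }

    public static void main(String[] args) {
        long[] standalone = {14843, 10750}; // BP and SOM, Table 4.1
        long[] parallel   = {20812, 21429}; // the same nets run together
        System.out.println(sequentialTotal(standalone)); // 25593
        System.out.println(parallelTotal(parallel));     // 21429
    }
}
```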

Figure 4.3 – The time difference between running in parallel and sequentially (time axis in seconds: BP and SOM in parallel, 21.4 s; SOM sequential, 10.8 s; BP plus SOM sequentially, 25.5 s)

The Speed-up (Sp) calculation (Hwang & Xu, 1998) reinforces this conclusion. It is a simple acceleration factor given by the ratio of the sequential time (st) to the parallel time (pt), as shown in Equation 4.1 below.

Sp = st / pt

Equation 4.1 – Speed-up

Taking the sequential time as the sum of the sequential times of the Backpropagation and SOM executions (25593 ms), and the parallel time as the SOM time (21429 ms), the Speed-up is 1.19 (Equation 4.2). This means that running the two networks in parallel is 1.19 times, or 19%, faster.

Sp = (14843 + 10750) / 21429 = 1.19

Equation 4.2 – Speed-up for running BP and SOM in parallel on a single CPU
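Equation 4.1 is straightforward to encode. The sketch below reproduces the single-CPU result of Equation 4.2 from the measured averages.

```java
// Speed-up (Equation 4.1): Sp = st / pt, with st the summed sequential
// times and pt the duration of the parallel run.
public class SpeedUp {
    static double speedUp(double sequentialMs, double parallelMs) {
        return sequentialMs / parallelMs;
    }

    public static void main(String[] args) {
        // Equation 4.2: BP (14843 ms) plus SOM (10750 ms) run sequentially,
        // against the 21429 ms the SOM took in the parallel run.
        double sp = speedUp(14843 + 10750, 21429);
        System.out.printf("Sp = %.2f%n", sp); // 1.19
    }
}
```

The same helper reproduces the two-CPU results later in this section by substituting the corresponding sequential and parallel averages.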

For the machine with two CPUs, further tests were performed. The first test verified the performance of the Backpropagation running standalone and with two instances at the same time. Table 4.2 shows the average standalone performance of the Backpropagation.

Table 4.2 – Backpropagation running standalone on a machine with two CPUs

  Network    Standalone
  BP         5500 ms

Table 4.3 shows the performance when two instances of the Backpropagation run for 10000 epochs each on the machine with two CPUs.

Table 4.3 – Two Backpropagation instances running in parallel on a machine with two CPUs

  Network    Parallel
  BP 1       8367 ms
  BP 2       8586 ms

Once again it was worth running two ANNs at the same time: on average, running two instances of the Backpropagation simultaneously is faster than running them in sequence. The Speed-up for this execution is given in Equation 4.3 below. The sequential time is given by two Backpropagation simulations executed one after the other, and the parallel time by the longer of the two parallel Backpropagation executions. The result is that the parallel execution is 28% faster than the sequential one.

Sp = (5500 + 5500) / 8586 = 1.28

Equation 4.3 – Speed-up for running two BP instances in parallel on a 2-CPU machine

Figure 4.4 shows the processors being allocated during the execution of the two Backpropagation simulations in parallel. The two CPUs are nearly 100% allocated during the simulation period.

Figure 4.4 – Two CPUs running two Backpropagation instances in parallel

Table 4.4 below shows the average time for the learning of one SOM instance on the machine with two CPUs. Table 4.5 shows the average time two SOM instances take when run in parallel on the same machine.

Table 4.4 – SOM running standalone on a machine with two CPUs

  Network    Standalone
  SOM        8555 ms

Table 4.5 – Two SOM instances running in parallel on a machine with two CPUs

  Network    Parallel
  SOM 1      8974 ms
  SOM 2      8760 ms

There is no significant difference between running the SOM as a standalone learning process or running two learning processes at the same time. The Speed-up for this execution is given in Equation 4.4 below.

Sp = (8555 + 8555) / 8974 = 1.90

Equation 4.4 – Speed-up for running two SOM instances in parallel on a 2-CPU machine

This clearly shows the significant advantage of the parallel solution for running the SOM learning: during the same time frame two networks can be trained instead of one. The parallel solution for the SOM simulation is, on average, 90% faster than the sequential one.

Table 4.6 shows the learning times when instances of two different ANNs run in parallel on the machine with two CPUs. The Backpropagation instance took a little longer than when running standalone, and the SOM instance once again presented a similar average performance.

Table 4.6 – Backpropagation and SOM instances running in parallel on a machine with two CPUs

  Network    Parallel
  BP         6510 ms
  SOM        9291 ms

The results confirm that it is worthwhile to simulate different ANN models at the same time on the same machine. The Speed-up for this execution is given in Equation 4.5 below: the parallel execution was 51% faster than the sequential one.

Sp = (5500 + 8555) / 9291 = 1.51

Equation 4.5 – Speed-up for running BP and SOM in parallel on a 2-CPU machine

Figure 4.5 below shows the allocation of the CPUs during the learning of the Backpropagation and SOM instances.

Figure 4.5 – Two CPUs running Backpropagation and SOM instances in parallel

It is important to note that running more than one ANN in parallel inside the CANN simulation environment may lead to performance bottlenecks. Depending on the number and size of the nets running simultaneously, the machine resources can be exhausted quickly.

CANN runs in a single Java runtime, which acts as the main controller process. The threads created for the ANNs allocate the resources of this process: the threads of every created ANN instance share the memory and CPU granted to the process with all other ANN instances. It is therefore important, when running more than one ANN with CANN, to clearly understand the CPU and memory requirements of each ANN, as well as the machine resources and Java runtime constraints. Given the number of CPUs, the threads to be created, and the available memory, it may be better to run some ANN instances on separate machines to avoid competing for CPU and memory. Further tests could be designed and performed to measure such situations and map the exact implications of running parallel ANNs in the CANN simulation environment.
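A simple precaution in that spirit is to query the shared Java runtime for its resources before deciding how many instances to start. The sketch below uses the standard Runtime API; the one-net-per-CPU cap is an illustrative heuristic, not a CANN rule.

```java
// Inspect the resources of the single Java process that all ANN threads
// will share, and derive a rough cap on concurrent nets from it.
public class ResourceCheck {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        int cpus = rt.availableProcessors();             // CPUs visible to the JVM
        long maxHeapMb = rt.maxMemory() / (1024 * 1024); // heap ceiling in MB
        System.out.println("CPUs: " + cpus + ", max heap: " + maxHeapMb + " MB");

        // Illustrative heuristic: no more concurrent nets than CPUs.
        int maxParallelNets = Math.max(1, cpus);
        System.out.println("Suggested concurrent nets: " + maxParallelNets);
    }
}
```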

To extend the resources available for running parallel ANN instances, an extension of this general solution is presented in Chapter 5, where training-session parallelism distributes the different ANN models over networked computers for the learning and testing processes.

The solution presented here is general enough to be a good solution for the CANN framework. However, it does not take into consideration the parallelism inherent to the specific ANN models. Since a general solution at that level is difficult, as already explained, it is important to investigate how to implement parallelism at least for the CNM model, where parallelism could lead to significant gains due to the inherent structure of CNM nets. Such an investigation can also help to evaluate the CANN architecture. The next section explores the possibilities for implementing parallelism for the CNM.