Multicomputer Technology Initiative

PVM Results

The results are shown in the form of timing diagrams, extracted from the XPVM space-time view. Each horizontal bar represents a process running on the virtual machine. The colour of the bar indicates the state of the process at any moment in time. The possible colours are:

Green, indicating processing
Yellow, indicating PVM overhead
White, indicating that the process is waiting (usually for message receipt or process synchronisation)

Message passing between the processes in the virtual machine is represented by red lines joining the processes. In the active display it is possible to query both the processes and messages to obtain information about message direction, start and end times etc. In the static view presented here, this information is not available.

All results have been obtained using PVM version 3.4beta7 and, where applicable, XPVM version 1.2.5. The hardware used is described here.

PBIL Algorithm

We have used PVM to implement an antenna design algorithm based on Population Based Incremental Learning (PBIL). PBIL is a type of genetic algorithm which functions by randomly generating solution vectors and selecting the best of these according to pre-defined criteria. The results show the XPVM output for the first 3 generations of the algorithm.
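To make the generate-and-select loop concrete, the following is a minimal Python sketch. It is not the antenna design code used for these results. It assumes bit-string solution vectors, a stand-in fitness function (counting set bits) and the usual textbook PBIL update, in which a probability vector is nudged towards the best vector of each generation. The names pbil, learning_rate and seed are hypothetical.

```python
import random

def pbil(fitness, n_bits, pop_size=100, generations=3, learning_rate=0.1, seed=0):
    """Textbook-style PBIL sketch; all parameter names are illustrative."""
    rng = random.Random(seed)
    prob = [0.5] * n_bits                 # start with no bias towards 0 or 1
    best, best_score = None, float("-inf")
    for _ in range(generations):
        # Randomly generate solution vectors from the probability vector.
        population = [[1 if rng.random() < p else 0 for p in prob]
                      for _ in range(pop_size)]
        # Select the best vector according to the fitness criterion.
        winner = max(population, key=fitness)
        score = fitness(winner)
        if score > best_score:
            best, best_score = winner, score
        # Shift each probability a little towards the winning vector.
        prob = [(1.0 - learning_rate) * p + learning_rate * bit
                for p, bit in zip(prob, winner)]
    return best, best_score, prob

# Stand-in fitness: number of set bits (assumption, not the antenna criterion).
best, score, prob = pbil(sum, n_bits=16, generations=50)
print(score, best)
```

In a parallel version, the fitness evaluations within one generation are independent of each other, which is what makes the work easy to divide between slave processes.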

The algorithm has been run using different numbers of processors in order to indicate the advantage of implementing a parallel solution. Each run was performed for 3 generations and a population vector size of 100. The adjacent graph shows plots for different numbers of processors, with XPVM or just the PVM console running. Interesting features of the data include:

In the timings without XPVM, the time for n processors is approximately the time for 1 processor divided by n. There is thus approximately an 8-fold speedup when using 8 processors.
The absolute improvement in processing time flattens off as processors are added (the time falls roughly as 1/n), yielding diminishing returns for increased numbers of processors.
The XPVM / no XPVM curves are almost identical, except for the case where the processor running XPVM (processor 8) was used to process data. This indicates that the increased workload from XPVM can influence timing results to a large degree.
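The ideal 1/n relationship noted in the first point can be sketched numerically. The single-processor time of 80 seconds used below is a made-up placeholder, not a measured value from these runs, and the function names are hypothetical.

```python
def ideal_time(t1, n):
    """Run time on n processors if the work divides perfectly."""
    return t1 / n

def speedup(t1, tn):
    """Speedup relative to the single-processor time."""
    return t1 / tn

T1 = 80.0  # hypothetical single-processor time in seconds (not measured)

for n in range(1, 9):
    tn = ideal_time(T1, n)
    # Time saved by adding the n-th processor; shrinks as n grows.
    saved = ideal_time(T1, n - 1) - tn if n > 1 else 0.0
    print(f"{n} processors: {tn:6.2f} s, speedup {speedup(T1, tn):.1f}, "
          f"saved by last processor {saved:.2f} s")
```

The speedup keeps growing in proportion to n, but the seconds saved by each extra processor shrink, which is the flattening seen in the graph.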

The above results show that the PBIL algorithm is one which can be efficiently parallelised, yielding greatly improved processing times. The time spent passing data between processes is small relative to the time spent processing it, resulting in large speed improvements with increasing numbers of processors. This is a characteristic of this algorithm and it should not be inferred that this magnitude of speedup will occur in all cases.

Generic Detection Algorithm

PVM has been used to implement a parallel, image-based detection algorithm. In this algorithm, detection is performed on a number of image segments, using various detection kernels. Each detection is performed by first taking the FFT of the image, multiplying it with the frequency-domain representation of each kernel and then performing an inverse FFT (IFFT) on each product. After detection, either the full, 'convolved' images or just the object coordinates can be returned to the master process.
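The transform, multiply and inverse-transform sequence can be illustrated with a small, self-contained Python sketch. This is not the project's implementation: to avoid depending on an FFT library it uses a naive DFT (which gives the same result as an FFT, only more slowly), and the 8 x 8 test image, the 3 x 3 kernel, the threshold and all function names are assumptions made for illustration.

```python
import cmath

def dft(values, inverse=False):
    """Naive O(n^2) 1-D DFT; stands in for an FFT library call."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for m in range(n):
        acc = sum(v * cmath.exp(sign * 2j * cmath.pi * k * m / n)
                  for k, v in enumerate(values))
        out.append(acc / n if inverse else acc)
    return out

def dft2(grid, inverse=False):
    """2-D transform: transform every row, then every column."""
    rows = [dft(row, inverse) for row in grid]
    cols = [dft(list(col), inverse) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]

def detect(image, kernel, threshold):
    """Return (row, col) positions where the filtered response exceeds threshold."""
    f_image = dft2(image)
    f_kernel = dft2(kernel)
    # Pointwise product in the frequency domain = circular convolution.
    product = [[a * b for a, b in zip(ra, rb)]
               for ra, rb in zip(f_image, f_kernel)]
    response = dft2(product, inverse=True)
    return [(r, c) for r, row in enumerate(response)
            for c, value in enumerate(row) if value.real > threshold]

N = 8
# Hypothetical image: one 3 x 3 object centred on row 4, column 5.
image = [[1.0 if 3 <= r <= 5 and 4 <= c <= 6 else 0.0 for c in range(N)]
         for r in range(N)]
# Hypothetical 3 x 3 kernel centred on the origin (indices wrap around).
kernel = [[0.0] * N for _ in range(N)]
for a in (-1, 0, 1):
    for b in (-1, 0, 1):
        kernel[a % N][b % N] = 1.0

print(detect(image, kernel, threshold=8.5))  # prints [(4, 5)]
```

Returning only the list of coordinates, as in this sketch, corresponds to the coordinate-only option; returning the whole response array corresponds to returning the full 'convolved' image.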

Three output graphs (from XPVM) are available. The first illustrates the algorithm timing when only coordinates are returned to the master process. The second illustrates the algorithm timing when coordinates plus full images are returned to the master process.

It can be seen that the message passing overhead of the full image returns causes a bottleneck at the master process that vastly increases the time taken for the whole algorithm. This is a result of the image size (8 MBytes per image) and the network topology (a simple 100 Mbps Ethernet hub).
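A rough lower bound on the transfer time shows why the image returns dominate. The sketch below assumes 1 MByte = 10^6 bytes, ignores all protocol and PVM packing overhead, and assumes 8 slaves each returning one image over the shared hub; the slave count per run and the function name are assumptions, so the figures are only indicative.

```python
def transfer_seconds(size_bytes, link_bits_per_second):
    """Best-case time to move size_bytes over a link, ignoring all overhead."""
    return size_bytes * 8 / link_bits_per_second

IMAGE_BYTES = 8_000_000       # 8 MBytes per image, taking 1 MByte = 10**6 bytes
LINK_BPS = 100_000_000        # 100 Mbps Ethernet
SLAVES = 8                    # assumed: one returned image per slave

per_image = transfer_seconds(IMAGE_BYTES, LINK_BPS)
# A hub is a shared medium, so the returns effectively queue one after another.
all_images = per_image * SLAVES

print(f"one image:  at least {per_image:.2f} s")
print(f"{SLAVES} images:   at least {all_images:.2f} s")
```

Coordinate-only returns are a few bytes per detection, so their transfer time is negligible by comparison.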

The third graph shows timings for the updated algorithm (version 1.4, 12/5/1999), using 10 detection kernels of varying size, with each slave returning one composite thresholded image to the master. The total execution time for this version (excluding setup time) is 29.4 seconds on Gollach. Of this total time, approximately 3 seconds is taken to distribute the images to the 8 slave processes.
