Multicomputer Technology Initiative
PVM Results
The results are shown in the form of timing diagrams, extracted from the
XPVM space-time view. Each horizontal bar represents a process running on
the virtual machine. The colour of the bar indicates the state of the process
at any moment in time. The possible colours are:
- Green, indicating processing
- Yellow, indicating PVM overhead
- White, indicating that the process is waiting (usually for message receipt
  or process synchronisation)
Message passing between the processes in the virtual machine is represented
by red lines joining the processes. In the active display it is possible
to query both the processes and messages to obtain information about message
direction, start and end times etc. In the static view presented here,
this information is not available.
All results have been obtained using PVM version 3.4beta7 and XPVM version
1.2.5 where applicable. The hardware used is described here.
PBIL Algorithm
We have used PVM to implement an antenna design algorithm, based on PBIL.
Population Based Incremental Learning is a type of Genetic Algorithm which
functions by randomly generating solution vectors and selecting the best
of these according to pre-defined criteria. The results show the XPVM output
for the first 3 generations of the algorithm.
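The generate-and-select loop described above can be sketched in a few lines of Python. This is a minimal, serial illustration of the general PBIL idea, not the antenna design code used here: the bit-string encoding, the `learning_rate` value and the toy fitness function (`sum`, which simply counts set bits) are all assumptions made for the example.

```python
import random

def pbil(fitness, n_bits, pop_size=100, generations=3, learning_rate=0.1, seed=0):
    """Minimal PBIL sketch: nudge a probability vector towards the best sample."""
    rng = random.Random(seed)
    prob = [0.5] * n_bits                      # no initial bias on any bit
    best, best_score = None, float("-inf")
    for _ in range(generations):
        # Randomly generate candidate solution vectors from the probabilities.
        population = [[1 if rng.random() < p else 0 for p in prob]
                      for _ in range(pop_size)]
        # Select the best candidate according to the supplied criterion.
        winner = max(population, key=fitness)
        score = fitness(winner)
        if score > best_score:
            best, best_score = winner, score
        # Shift each probability a little towards the winning bit value.
        prob = [(1 - learning_rate) * p + learning_rate * b
                for p, b in zip(prob, winner)]
    return best, best_score, prob

# Hypothetical toy problem: maximise the number of set bits in a 16-bit vector.
best, score, prob = pbil(fitness=sum, n_bits=16)
```

In a sketch like this the fitness evaluations of one generation are independent of each other, which is what makes the approach a natural candidate for distribution across processes.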
The algorithm has been run using different numbers of processors in
order to indicate the advantage of implementing a parallel solution. Each
run was performed for 3 generations and a population vector size of 100.
The adjacent graph shows plots for different numbers of processors, with
XPVM or just the PVM console running. Interesting features of the data
include:
- Consulting the timings without XPVM, the time for n processors is
  approximately the time for 1 processor divided by n. There is thus
  approximately an 8-fold speedup when using 8 processors.
- The improvement in processing time flattens off as processors are added:
  because the time falls roughly as 1/n, each additional processor saves
  less time than the one before, yielding diminishing returns.
- The XPVM / no XPVM curves are almost identical, except for the case where
  the processor running XPVM (processor 8) was also used to process data.
  This indicates that the increased workload from XPVM can influence timing
  results to a large degree.
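The first two observations follow from the same approximate 1/n relationship, which a short calculation makes concrete. The single-processor time `t1` below is a hypothetical figure chosen for easy arithmetic, not a measured result, and the model ignores all communication cost.

```python
def ideal_time(t1, n):
    """Run time on n processors if the work divides perfectly (no overhead)."""
    return t1 / n

t1 = 80.0  # hypothetical single-processor time in seconds (not a measurement)

times = {n: ideal_time(t1, n) for n in range(1, 9)}
speedup = {n: t1 / t for n, t in times.items()}
# Time saved by adding the n-th processor to a run that already uses n - 1.
saving = {n: times[n - 1] - times[n] for n in range(2, 9)}

for n in range(1, 9):
    print(n, round(times[n], 2), round(speedup[n], 2), round(saving.get(n, 0.0), 2))
```

Under these assumptions the second processor saves 40 seconds while the eighth saves under 2 seconds, even though the speedup itself keeps growing linearly.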
The above results show that the PBIL algorithm is one which can be efficiently
parallelised, yielding greatly improved processing times. The time spent
passing data between processes is small relative to the time taken to
process it, resulting in large speed improvements with increasing
numbers of processors. This is a characteristic of this algorithm and it
should not be inferred that this magnitude of speed up will occur in all
cases.
Generic Detection Algorithm
PVM has been used to implement a parallel, image-based detection algorithm.
In this algorithm, detection is performed on a number of image segments,
using various detection kernels. The detections are performed by first
taking the FFT of the image, multiplying it by the frequency-domain
representation of each kernel and then performing an inverse FFT on each
product. After detection,
either the full, 'convolved' images or just the object coordinates can
be returned to the master process.
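The transform, multiply and inverse-transform sequence described above can be illustrated on a tiny image. The sketch below uses a naive DFT written out in plain Python so that it is self-contained; a real implementation would use an FFT library. The function names, the zero-padding of kernels into the top-left corner and the fixed threshold are assumptions made for the example, not details of the actual detection code.

```python
import cmath

def dft(x, inverse=False):
    """Naive 1-D discrete Fourier transform (O(n^2)), adequate for tiny inputs."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * m * k / n) for k in range(n))
           for m in range(n)]
    return [v / n for v in out] if inverse else out

def dft2(img, inverse=False):
    """2-D transform: transform every row, then every column."""
    rows = [dft(r, inverse) for r in img]
    cols = [dft(list(c), inverse) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def detect(image, kernels, threshold):
    """Return, per kernel, the coordinates where the convolved image exceeds threshold."""
    h, w = len(image), len(image[0])
    image_f = dft2(image)                      # transform the image once
    hits = []
    for kernel in kernels:
        padded = [[0.0] * w for _ in range(h)]  # zero-pad kernel to image size
        for r, row in enumerate(kernel):
            for c, v in enumerate(row):
                padded[r][c] = v
        kernel_f = dft2(padded)
        product = [[image_f[r][c] * kernel_f[r][c] for c in range(w)] for r in range(h)]
        convolved = dft2(product, inverse=True)  # circular convolution result
        hits.append([(r, c) for r in range(h) for c in range(w)
                     if convolved[r][c].real > threshold])
    return hits

# Hypothetical 8x8 test image containing one 2x2 bright object.
image = [[0.0] * 8 for _ in range(8)]
for r, c in [(2, 3), (2, 4), (3, 3), (3, 4)]:
    image[r][c] = 1.0
hits = detect(image, kernels=[[[1.0, 1.0], [1.0, 1.0]]], threshold=3.5)
```

Returning only `hits` corresponds to the co-ordinate-only case, while returning each `convolved` array would correspond to the full image case.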
Three output graphs (from XPVM) are available. The first illustrates
the algorithm timing for co-ordinate-only returns to the master process.
The second illustrates the algorithm timing for co-ordinate plus full image
returns to the master process.
It can be seen that the message passing overhead of the full image returns
causes a bottleneck at the master process that vastly increases the time
taken for the whole algorithm. This is a result of the image size (8 MBytes
per image) and the network topology (a simple 100 Mbps Ethernet hub).
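A rough lower bound on the cost of returning full images can be worked out from the two figures quoted above. The sketch assumes decimal megabytes, a link running at its full nominal rate with no protocol or PVM packing overhead, and a hypothetical count of 8 returned images; because a hub is a shared medium, the transfers are assumed to queue one after another rather than overlap.

```python
def transfer_seconds(n_bytes, link_bits_per_second):
    """Idealised time to move n_bytes over a link, ignoring all overhead."""
    return n_bytes * 8 / link_bits_per_second

IMAGE_BYTES = 8 * 10**6    # 8 MBytes per image (decimal megabytes assumed)
LINK_BPS = 100 * 10**6     # 100 Mbps shared Ethernet hub
N_IMAGES = 8               # hypothetical number of full images returned

per_image = transfer_seconds(IMAGE_BYTES, LINK_BPS)
total = N_IMAGES * per_image   # transfers serialise on the shared medium

print(f"{per_image:.2f} s per image, at least {total:.2f} s for {N_IMAGES} images")
```

Even this best case is several seconds of pure transfer time at the master, and real throughput on a shared hub would be lower.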
The third graph shows timings for the updated algorithm (version 1.4,
12/5/1999), using 10 detection kernels of varying size, with each slave
returning one composite thresholded image to the master. The total execution time
for this version (excluding setup time) is 29.4 seconds on Gollach. Of
this total time, approximately 3 seconds is taken to distribute the images
to the 8 slave processes.