Technical Report Number
The problem considered in this paper is to find an assignment of logic components to processors which will achieve logic simulation speed-ups approaching the ideal for large processor populations. This problem becomes particularly important when a significant portion of the speed-up expected from logic simulation engines is attributed to load sharing (as opposed to obtaining speed-up by employing specialized hardware to carry out specific tasks associated with the simulation process such as event queue manipulation or function evaluation). Our research considers this problem for a particular multiprocessor simulation architecture for which a performance model has bene developed. The model is parameterized on the basis of workload characteristics, architecture design parameters, and load (processing and communication) distribution. Workload characteristics were derived from actual VLSI circuit simulations. An approach to the component assignment problem is presented and evaluated using this data.
The circuits that we are considering have component populations in the tens and hundreds of thousands. Since the optimal component assignment problem is NP-complete, a heuristic assignment is considered and its performance compared against a simple random assignment algorithm (i.e. components are assigned to processors such that each processor is chosen with equal probability). The random assignment method, though simple, is shown analytically to have large communication requirements even for the small number of processors (e.g. five). This approach therefore limits the number of processors which can be effectively applied towards speeding up the simulation process through exploitation of computational parallelism. The heuristic examined in this paper attempts to improve on the random assignment method by reducing the communication volume while maintaining the degree of parallelism in the random method. Results show that for tens of processors and circuit sizes in the tens of thousands, the heuristic can reduce the message volume in the benchmark circuits by a factor of between 2 and 4.
Wong, Ken and Franklin, Mark A., "Load and Communications Balancing on Multiprocessor Logic Simulation Engines" Report Number: WUCS-87-27 (1987). All Computer Science and Engineering Research.