Document Type

Technical Report


Computer Science and Engineering

Publication Date






Technical Report Number



The Least Occupied Output First (LOOFA) scheduler is one of several unbuffered crossbar schedulers that provides strong performance guarantees when operated with a speedup of 2 or more. Because LOOFA requires the computation of a maximal matching, it has been considered too slow for use in systems with link rates of 10 Gb/s or more. This paper studies an approximate variant of LOOFA described briefly in [16]. We introduce a general family of schedulers that allows for partial sorting and that includes the LOOFA scheduler as a special case. We show that all schedulers in this class are work-conserving and use this to provide insight into the operation of the Approximate LOOFA scheduler and a stronger motivation for its use. We provide a detailed design of the ALOOFA scheduler in order to evaluate its implementation complexity and performance characteristics. We also introduce a simple, natural lower bound on the performance of crossbar schedulers and use it to show that a previously proposed “stress test” traffic pattern is in fact difficult to schedule well. Our result implies that non-trivial speedups are required for ideal worst-case scheduling performance, something that has been generally assumed to be true, but never conclusively demonstrated. We also compare the performance of both LOOFA variants to our lower bound on stress test traffic and observe that for speedups between 1 and 2, the performance of both variants stays within 25% of the lower bound, and that the performance characteristics of the two variants are essentially indistinguishable.


Permanent URL: