Date of Award

Spring 5-15-2023

Author's School

McKelvey School of Engineering

Author's Department

Computer Science & Engineering

Degree Name

Doctor of Philosophy (PhD)

Degree Type



Streaming dataflow applications exist in numerous fields of study including, but not limited to, bio-sequence analysis, astrophysics, network packet analysis, and data integration. The inputs for these applications are independent of one another, making the problems prime candidates for parallel acceleration using wide-SIMD vector processors, such as graphics cards. However, many of these streaming applications exhibit irregular dataflow, where the number of outputs per input from any computation within the application is data dependent and unknown a priori. To ameliorate the throughput issues caused by irregular dataflow on wide-SIMD systems, we utilize a framework such as MERCATOR, which queues data between computational stages, ensuring that each computational stage has full-width SIMD vectors with which to work.

Implementing queues within the application does not come without costs, however. Queueing between compute nodes incurs overheads, including physically moving data between queues, determining how many inputs from the queue can be safely consumed at one time, and the memory usage of the queues themselves. This dissertation looks to optimize irregular dataflow application throughput through four pieces of work, with direct analysis of the queues and feature additions to enable lower overheads with greater queue functionality. We first examine queues in the context of region-based state, and how to efficiently make execution boundaries within a data stream. Next, we explore how much space should be assigned to queues within an application and whether the overhead of queueing is worthwhile based on runtime statistics. We then cover how minimum queue size requirements affect application performance and limit viable application implementations, which we solve using interruptible execution. Finally, we examine how to efficiently implement iterative applications for throughput.


English (en)


Jeremy Buhler

Committee Members

Sanjoy Baruah, Roger Chamberlain, Shyam Dwaraknath, Christopher Gill,