Date of Award

5-14-2024

Author's School

McKelvey School of Engineering

Author's Department

Computer Science & Engineering

Degree Name

Doctor of Philosophy (PhD)

Degree Type

Dissertation

Abstract

Data streaming algorithms are a class of algorithms that process data as it moves through a system. When implementing such algorithms, a developer will spend time tuning the implementation for deployment, but on heterogeneous architectures, or when the nodes of computation are physically separate from one another, performance may be lost to unforeseen data-movement complications. In this work we aim to alleviate some of these pain points through a combination of programming advice and mathematical models. One pain point often unseen and underappreciated by developers is a type of data streaming task known as data integration: tasks that transform data elements from one form into another, usually targeting some other step in the overall processing pipeline. To study how to improve these tasks, we implement them across a variety of execution platforms; of particular interest, we give advice on how to implement such tasks on FPGA architectures. Beyond individual data streaming tasks, we then use mathematical modeling to understand how individual nodes of computation affect the full data flow of a streaming algorithm. Here we apply existing queueing theory models to reason about the average performance of the algorithm and to estimate the cost of using the system. To establish absolute bounds on the system, we turn to network calculus, which allows us to estimate the latency of data throughput in the system and to predict queue bounds at a node for given arrival and service processes. This represents the first known application of network calculus techniques to streaming data applications.
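The two modeling approaches named above can be illustrated with textbook results; the sketch below is not taken from the dissertation itself, but shows the standard quantities each framework produces. It assumes an M/M/1 queue for the queueing-theory side (Poisson arrivals at rate lam, exponential service at rate mu), and for the network-calculus side a token-bucket arrival curve alpha(t) = b + r*t served by a rate-latency server beta(t) = R*max(t - T, 0).

```python
def mm1_metrics(lam: float, mu: float) -> dict:
    """Average (steady-state) behaviour of an M/M/1 queue; requires lam < mu."""
    assert lam < mu, "queue is unstable unless arrival rate < service rate"
    rho = lam / mu  # utilization
    return {
        "utilization": rho,
        "avg_in_system": rho / (1 - rho),  # mean number in system (L)
        "avg_sojourn": 1 / (mu - lam),     # mean time in system (W)
    }

def network_calculus_bounds(b: float, r: float, R: float, T: float):
    """Worst-case bounds for token-bucket arrivals vs. a rate-latency server.

    Standard results (assuming sustained arrival rate r <= service rate R):
    the delay bound is the maximum horizontal distance between the curves,
    and the backlog (queue) bound is the maximum vertical distance.
    """
    assert r <= R, "service rate must cover the sustained arrival rate"
    delay_bound = T + b / R      # worst-case per-element latency
    backlog_bound = b + r * T    # worst-case queue occupancy
    return delay_bound, backlog_bound
```

Note the contrast the abstract draws: `mm1_metrics` gives averages (e.g. lam = 4, mu = 5 yields a mean sojourn time of 1.0), while `network_calculus_bounds` gives absolute worst-case guarantees for any arrival/service processes conforming to the curves.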

Language

English (en)

Chair

Roger Chamberlain
