Abstract
The increasing popularity of deep learning in workloads across vision, speech, and language has inspired many attempts to develop hardware accelerators for matrix-matrix multiplication. Both application-specific integrated circuits (ASICs) and field-programmable gate arrays (FPGAs) are used for this purpose. However, the two platforms present a trade-off: ASICs provide little flexibility after they are manufactured, while FPGA designs are flexible but application development on FPGAs is more time-consuming. In this work, we aim to find a balance between reconfigurability and development efficiency by designing a reconfigurable systolic architecture as an overlay on the FPGA. Our contribution to reconfigurable systolic architectures is a multiplexer-based crossbar network that interconnects every processing element in the network. The crossbar network grants the user run-time reconfigurability of the topology of the systolic array, enabling the user to specify the shape and size of the systolic architecture on the fly. The proposed overlay architecture achieves computational hardware resource usage and maximum clock frequency similar to those of the baseline designs.
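To illustrate what choosing the shape and size of a systolic array at run time can mean for matrix-matrix multiplication, the following is a minimal behavioral sketch in Python. It is not the thesis's hardware design and does not model the multiplexer-based crossbar. It assumes an output-stationary dataflow with skewed operand arrival, and the function name `systolic_matmul` and its parameters `rows` and `cols` are hypothetical names chosen for this example.

```python
def systolic_matmul(a, b, rows, cols):
    """Multiply a (n x k) by b (k x m) on a simulated rows x cols systolic array.

    Assumption for illustration: output-stationary dataflow, where each
    processing element (PE) accumulates one output element in place.
    Returns the product and the total number of simulated cycles.
    """
    n, k, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    total_cycles = 0

    # Tile the output so that each tile fits the configured array shape.
    for r0 in range(0, n, rows):
        for c0 in range(0, m, cols):
            tr = min(rows, n - r0)  # active PE rows for this tile
            tc = min(cols, m - c0)  # active PE columns for this tile
            acc = [[0] * tc for _ in range(tr)]

            # Operands arrive skewed: PE (i, j) receives the pair with
            # reduction index s = t - i - j at cycle t.
            cycles = k + tr + tc - 2
            for t in range(cycles):
                for i in range(tr):
                    for j in range(tc):
                        s = t - i - j
                        if 0 <= s < k:
                            acc[i][j] += a[r0 + i][s] * b[s][c0 + j]

            # Drain the accumulated tile into the result matrix.
            for i in range(tr):
                for j in range(tc):
                    c[r0 + i][c0 + j] = acc[i][j]
            total_cycles += cycles

    return c, total_cycles


if __name__ == "__main__":
    a = [[1, 2, 3], [4, 5, 6]]
    b = [[7, 8], [9, 10], [11, 12]]
    # Same PE budget, different shapes: the product is identical,
    # while the simulated cycle count depends on the shape.
    for shape in [(2, 2), (1, 4), (4, 1)]:
        product, cycles = systolic_matmul(a, b, *shape)
        print(shape, product, cycles)
```

In this simplified model, changing `rows` and `cols` changes only how the output is tiled and how many cycles the computation takes, not the result. The cycle counts are a property of this sketch and say nothing about the performance of the proposed overlay.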
Degree
Master of Science (MS)
Document Type
Thesis
Date of Award
Spring 4-25-2022
Language
English (en)
DOI
https://doi.org/10.7936/76vz-dv86
Recommended Citation
Chen, Zihao, "A Reconfigurable FPGA Overlay Architecture for Matrix-Matrix Multiplication" (2022). McKelvey School of Engineering Theses & Dissertations. 710.
The definitive version is available at https://doi.org/10.7936/76vz-dv86