Abstract
The increasing popularity of deep learning in workloads across vision, speech, and language has inspired many attempts to develop hardware accelerators for matrix-matrix multiplication. Both application-specific integrated circuits (ASICs), and field-programmable arrays (FPGAs) are used for this purpose. However, a trade-off between the two platforms is that ASICs provide little flexibility after they are manufactured while designs on FPGAs are flexible but application development on FPGAs is more time-consuming. In this work, we aim to find the balance between reconfigurability and development efficiency by designing a reconfigurable systolic architecture as an overlay on the FPGA. Our contribution to the reconfigurable systolic architectures is a multiplexer-based crossbar network that interconnects every processing element in the network. The crossbar network grants user run-time reconfigurability of the topology of the systolic array, enabling the user to specify the shape and size of the systolic architecture on-the-fly. The proposed overlay architecture achieves similar computational hardware resource usage and maximum clock frequency compared to the baseline designs.
Degree
Master of Science (MS)
Author's Department
Computer Science & Engineering
Document Type
Thesis
Date of Award
Spring 4-25-2022
Language
English (en)
DOI
https://doi.org/10.7936/76vz-dv86
Recommended Citation
Chen, Zihao, "A Reconfigurable FPGA Overlay Architecture for Matrix-Matrix Multiplication" (2022). McKelvey School of Engineering Theses & Dissertations. 710.
The definitive version is available at https://doi.org/10.7936/76vz-dv86