Date of Award
Summer 9-12-2023
Degree Name
Doctor of Philosophy (PhD)
Degree Type
Dissertation
Abstract
Today’s personalized recommendation systems leverage deep learning to deliver the best user experience in internet services such as search engines, social networks, online retail, and content streaming. The underlying deep learning (DL)-based recommendation models now consume the majority of the datacenter cycles spent on AI. Given the volume of personalized inferences in datacenters and their rapid growth, we first propose RecNMP, a lightweight, commodity-DRAM-compliant near-memory processing (NMP) solution that accelerates the sparse embedding operations in recommendation models. To demonstrate the performance potential of NMP technology in real hardware, we develop AxDIMM, an FPGA-enabled NMP platform that provides rapid prototyping under a realistic system setting using an industry-representative recommendation framework. Because untrusted near-data processing (NDP) units introduce new threats to private and sensitive workloads, we propose SecNDP, a lightweight encryption and verification scheme that lets untrusted NDP devices perform computation over ciphertext. Considering the fast evolution and vigorous growth of production-grade recommendation models at datacenter scale, we propose Hercules, a comprehensive optimization framework tailored to at-scale neural recommendation inference. Finally, to architect future datacenters, DisaggRec envisions constructing the datacenter in a disaggregated manner: resource disaggregation allows compute and memory resources to scale out independently to match the changing demands of fast-evolving workloads.
Language
English (en)
Chair
Xuan Zhang