Abstract
Adversarial patch attacks have been shown to cause deep neural networks to misclassify inputs as a target label even when the patch is small relative to the input image; however, their effectiveness against deep reinforcement learning algorithms has not previously been examined. We design algorithms to generate adversarial patches that attack two types of deep reinforcement learning algorithms: deep Q-networks (DQN) and proximal policy optimization (PPO). Our patch-generation algorithms consist of two parts: choosing an attack position and training an adversarial patch at that position. Under the same bound on total perturbation, adversarial patch attacks achieve results comparable to FGSM and PGD attacks, on Atari and Procgen environments for DQN and PPO respectively. In addition, we design a Context Re-Constructor to reconstruct a state when it is corrupted by the patch. Based on the reconstructed states, we can identify the patch position and then apply mask defense and recover defense against the adversarial patch. Lastly, we also test the transferability of adversarial patches.
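The two-part procedure sketched in the abstract (choose an attack position, then train a patch at that position) can be illustrated on a toy linear agent. Everything below is a hypothetical stand-in: the linear Q-function, the saliency heuristic for picking the position, and all sizes and step counts are illustrative assumptions, not the thesis's actual method, which attacks real DQN/PPO agents.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear stand-in for a Q-network: Q(s) = W @ s.flatten().
# (Hypothetical; a real attack would backpropagate through the agent.)
n_actions, H, W_ = 4, 8, 8
W = rng.normal(size=(n_actions, H * W_))

def q_values(state):
    return W @ state.flatten()

def choose_position(target_action, size):
    """Part 1: pick the size x size window where the input gradient of the
    target action's Q-value has the largest total magnitude (one plausible
    saliency heuristic, assumed here for illustration)."""
    g = np.abs(W[target_action].reshape(H, W_))
    best, best_pos = -1.0, (0, 0)
    for r in range(H - size + 1):
        for c in range(W_ - size + 1):
            score = g[r:r + size, c:c + size].sum()
            if score > best:
                best, best_pos = score, (r, c)
    return best_pos

def train_patch(state, target_action, pos, size, steps=200, lr=0.1):
    """Part 2: gradient-ascend the patch pixels at `pos` to raise the
    target action's Q-value; for a linear model the input gradient is
    just the corresponding weight row."""
    r, c = pos
    grad = W[target_action].reshape(H, W_)[r:r + size, c:c + size]
    patch = state[r:r + size, c:c + size].copy()
    for _ in range(steps):
        patch = np.clip(patch + lr * grad, 0.0, 1.0)
    adv = state.copy()
    adv[r:r + size, c:c + size] = patch
    return adv

state = rng.uniform(0.0, 1.0, size=(H, W_))
target = 2
pos = choose_position(target, size=3)
adv = train_patch(state, target, pos, size=3)
```

After optimization, the target action's Q-value on the patched state is at least as high as on the clean state, while every pixel outside the 3x3 patch window is unchanged, mirroring the locality constraint of a patch attack.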
Committee Chair
Yevgeniy Vorobeychik
Committee Members
Yevgeniy Vorobeychik Nathan Jacobs Ning Zhang
Degree
Master of Science (MS)
Author's Department
Computer Science & Engineering
Document Type
Thesis
Date of Award
Spring 5-15-2023
Language
English (en)
DOI
https://doi.org/10.7936/zxk1-v362
Recommended Citation
Tong, Peizhen, "Adversarial Patch Attacks on Deep Reinforcement Learning Algorithms" (2023). McKelvey School of Engineering Theses & Dissertations. 835.
The definitive version is available at https://doi.org/10.7936/zxk1-v362