Date of Award
Master of Science (MS)
Adversarial patch attacks have demonstrated that they can cause deep neural networks to misclassify inputs to a target label even when the patch is small relative to the input image; however, the effectiveness of adversarial patch attacks had not previously been evaluated on deep reinforcement learning algorithms. We design algorithms to generate adversarial patches that attack two types of deep reinforcement learning algorithms: deep Q-networks (DQN) and proximal policy optimization (PPO). Our patch-generation algorithms consist of two parts: choosing the attack position and training the adversarial patch at that position. Under the same bound on total perturbation, adversarial patch attacks achieve results comparable to FGSM and PGD attacks on Atari and Procgen environments, for DQN and PPO respectively. In addition, we design a Context Re-Constructor to reconstruct the state when it is corrupted by the patch. Based on the reconstructed states, we can identify the patch position and then apply a mask defense or a recover defense against the adversarial patch. Lastly, we test the transferability of the adversarial patch.
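The second part of the pipeline, training a patch at a fixed position, can be sketched as targeted gradient-based optimization of the patch pixels. The following is a minimal, hypothetical PyTorch illustration, not the thesis's actual algorithm: the loss, the sigmoid reparameterization used to keep pixels in [0, 1], and the hyperparameters are all assumptions for the sake of the sketch.

```python
import torch
import torch.nn.functional as F

def train_patch(policy, state, target_action, pos, size, steps=100, lr=0.05):
    """Optimize a square patch at position `pos` so that the policy's
    logits favor `target_action` on the patched state.
    A hedged sketch: the thesis's exact loss and position-selection
    procedure are not reproduced here.
    """
    r, c = pos
    # Unconstrained patch parameters; sigmoid maps them into valid pixel range.
    patch = torch.zeros(state.shape[0], size, size, requires_grad=True)
    opt = torch.optim.Adam([patch], lr=lr)
    target = torch.tensor([target_action])
    for _ in range(steps):
        adv = state.clone()
        adv[:, r:r + size, c:c + size] = torch.sigmoid(patch)
        logits = policy(adv.unsqueeze(0))
        # Targeted attack: minimize cross-entropy toward the target action.
        loss = F.cross_entropy(logits, target)
        opt.zero_grad()
        loss.backward()
        opt.step()
    adv = state.clone()
    adv[:, r:r + size, c:c + size] = torch.sigmoid(patch.detach())
    return adv
```

Only the pixels inside the patch region are modified, which is what distinguishes a patch attack from norm-bounded attacks like FGSM and PGD that perturb the whole image.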
Yevgeniy Vorobeychik, Nathan Jacobs, Ning Zhang