Abstract

Adversarial patch attacks have been shown to cause deep neural networks to misclassify inputs to a target label even when the patch is small relative to the input image; however, the effectiveness of adversarial patch attacks has not previously been evaluated against deep reinforcement learning algorithms. We design algorithms to generate adversarial patches that attack two types of deep reinforcement learning algorithms: deep Q-networks (DQN) and proximal policy optimization (PPO). Our patch-generation algorithms consist of two parts: choosing the attack position and training the adversarial patch at that position. Under the same bound on total perturbation, adversarial patch attacks achieve results comparable to FGSM and PGD attacks on Atari and Procgen environments, for DQN and PPO respectively. In addition, we design a Context Re-Constructor to reconstruct the state when it is corrupted by the patch. Based on the reconstructed states, we can identify the patch position and then apply a mask defense and a recover defense against the adversarial patch. Lastly, we test the transferability of the adversarial patch.
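To make the two-part attack concrete, the following is a minimal sketch of the patch-training step at a fixed attack position, written in PyTorch. The network `q_net`, the patch size, the position `pos`, the learning rate, and the targeted cross-entropy objective are illustrative assumptions for exposition, not the exact formulation used in the thesis.

```python
# Sketch: train a square adversarial patch at a fixed position so that a
# DQN's action choice on a batch of states shifts toward a target action.
# All hyperparameters and the objective here are assumptions for illustration.
import torch
import torch.nn.functional as F

def train_patch(q_net, states, target_action, pos, patch_size=8,
                steps=200, lr=0.05):
    """Optimize the patch pixels by gradient descent on a targeted loss."""
    r, c = pos
    # Patch has the same channel count as the input states (channels, h, w).
    patch = torch.rand(states.shape[1], patch_size, patch_size,
                       requires_grad=True)
    opt = torch.optim.Adam([patch], lr=lr)
    for _ in range(steps):
        x = states.clone()
        # Paste the patch at the chosen position on every state in the batch.
        x[:, :, r:r + patch_size, c:c + patch_size] = patch
        q = q_net(x)  # (batch, num_actions) Q-values, treated as logits
        # Push the argmax Q-value toward the target action.
        targets = torch.full((x.shape[0],), target_action)
        loss = F.cross_entropy(q, targets)
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            patch.clamp_(0.0, 1.0)  # keep pixels in the valid image range
    return patch.detach()
```

The first part of the attack, choosing the position, could then be approximated by running this optimization at candidate positions and keeping the one that degrades the agent's return the most; the thesis's actual position-selection criterion may differ.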

Committee Chair

Yevgeniy Vorobeychik

Committee Members

Yevgeniy Vorobeychik, Nathan Jacobs, Ning Zhang

Degree

Master of Science (MS)

Author's Department

Computer Science & Engineering

Author's School

McKelvey School of Engineering

Document Type

Thesis

Date of Award

Spring 5-15-2023

Language

English (en)
