
Testing Neural Network Robustness Against Adversarial Attacks

EasyChair Preprint 15066

22 pages · Date: September 25, 2024

Abstract

Neural networks (NNs) have become fundamental tools in various applications, including image classification, autonomous systems, and natural language processing. Despite their impressive performance, NNs are highly vulnerable to adversarial attacks—subtle input perturbations that lead to incorrect predictions. This paper explores the different types of adversarial attacks, such as white-box, black-box, and gray-box attacks, as well as specific techniques like the Fast Gradient Sign Method (FGSM) and Projected Gradient Descent (PGD). We delve into methods for testing the robustness of NNs against these attacks, including perturbation analysis, adversarial example generation, and evaluation metrics. Additionally, various defense mechanisms, such as adversarial training, defensive distillation, and input preprocessing, are discussed, along with their limitations.


Experimental setups for testing robustness, utilizing datasets like MNIST and ImageNet, and NN architectures like CNNs and ResNets, are outlined. The paper highlights key challenges, including the trade-off between robustness and performance, and the adaptive nature of adversarial attacks. Through case studies in real-world applications and an analysis of industry trends, this work underscores the critical need for ongoing research in securing neural networks against adversarial threats. By exploring emerging defense strategies and combining multiple approaches, we aim to strengthen the robustness of NNs and ensure their safe deployment in sensitive domains.
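As a rough illustration of the kind of perturbation the abstract refers to, below is a minimal FGSM-style attack sketch in PyTorch. It assumes a trained classifier, a loss function, and inputs scaled to [0, 1]; the function name and the epsilon value are illustrative placeholders, not details taken from the paper. PGD can be viewed as applying this step iteratively with a projection back into the epsilon-ball around the original input.

import torch

def fgsm_attack(model, loss_fn, x, y, epsilon=0.03):
    # Track gradients on a copy of the input so we can compute d(loss)/d(input).
    x_adv = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x_adv), y)
    loss.backward()
    # FGSM step: shift each input element by epsilon in the direction of the
    # sign of the loss gradient, which locally increases the loss the most.
    x_adv = x_adv + epsilon * x_adv.grad.sign()
    # Clamp back to the valid input range and detach from the graph.
    return x_adv.clamp(0.0, 1.0).detach()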

Keyphrases: robustness testing, adversarial attacks, machine learning, neural networks

BibTeX entry
BibTeX has no dedicated entry type for preprints, so the following @booklet entry is used as a workaround to produce a correct reference:
@booklet{EasyChair:15066,
  author    = {Harold Jonathan and Edwin Frank},
  title     = {Testing Neural Network Robustness Against Adversarial Attacks},
  howpublished = {EasyChair Preprint 15066},
  year      = {EasyChair, 2024}}