Robust Adversarial Resilience in Deep Neural Architectures via Multi-Objective Optimization for Secure Machine Learning Systems
Keywords:
adversarial attacks, deep neural networks, multi-objective optimization, secure machine learning, robustness, FGSM, PGD, DeepFool

Abstract
The increasing sophistication of adversarial attacks poses a significant threat to the robustness and trustworthiness of deep learning systems, especially in security-critical domains. This paper presents a multi-objective optimization framework that enhances adversarial resilience in deep neural networks (DNNs) by jointly optimizing accuracy, robustness, and computational efficiency. The proposed framework uses Pareto-front-based learning to balance these competing objectives and combines gradient masking, feature squeezing, and adversarial retraining into a layered defense. Empirical evaluations demonstrate significant improvements in resilience against diverse attacks, including FGSM, PGD, and DeepFool, without compromising model performance.
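The abstract describes the framework only at a high level, so the sketch below illustrates the core trade-off it refers to: a single FGSM-based adversarial retraining step that minimizes a weighted sum of a clean-accuracy loss and a robustness loss. This weighted scalarization is a simple stand-in for the paper's Pareto-front-based optimization, not the authors' implementation; the toy model, the perturbation budget epsilon, and the weights lam_clean/lam_adv are illustrative assumptions.

```python
# Minimal sketch (assumed, not from the paper): one training step that trades
# off clean accuracy against adversarial robustness via a weighted objective.
import torch
import torch.nn as nn
import torch.nn.functional as F

def fgsm_perturb(model, x, y, epsilon=0.03):
    """Craft an FGSM adversarial example: x_adv = x + epsilon * sign(grad_x loss)."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    grad, = torch.autograd.grad(loss, x_adv)
    return (x_adv + epsilon * grad.sign()).clamp(0.0, 1.0).detach()

def robust_training_step(model, optimizer, x, y,
                         lam_clean=1.0, lam_adv=1.0, epsilon=0.03):
    """One step minimizing lam_clean * clean loss + lam_adv * adversarial loss."""
    model.train()
    x_adv = fgsm_perturb(model, x, y, epsilon)

    optimizer.zero_grad()
    loss_clean = F.cross_entropy(model(x), y)      # accuracy objective
    loss_adv = F.cross_entropy(model(x_adv), y)    # robustness objective
    loss = lam_clean * loss_clean + lam_adv * loss_adv
    loss.backward()
    optimizer.step()
    return loss_clean.item(), loss_adv.item()

if __name__ == "__main__":
    # Toy model and random data purely to show that the step runs end to end.
    model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 64),
                          nn.ReLU(), nn.Linear(64, 10))
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    x = torch.rand(32, 1, 28, 28)
    y = torch.randint(0, 10, (32,))
    print(robust_training_step(model, optimizer, x, y))
```

In a fuller multi-objective treatment, the weights would not be fixed but swept or evolved to trace a Pareto front over accuracy, robustness, and computational cost, which is the role the abstract assigns to Pareto-front-based learning.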
References
Goodfellow, I. J., Shlens, J., & Szegedy, C. (2014). Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572.
Madry, A., Makelov, A., Schmidt, L., Tsipras, D., & Vladu, A. (2017). Towards deep learning models resistant to adversarial attacks. arXiv preprint arXiv:1706.06083.
Carlini, N., & Wagner, D. (2017). Towards evaluating the robustness of neural networks. IEEE Symposium on Security and Privacy (S&P), pp. 39–57.
Papernot, N., McDaniel, P., Wu, X., Jha, S., & Swami, A. (2016). Distillation as a defense to adversarial perturbations against deep neural networks. IEEE Symposium on Security and Privacy (S&P), pp. 582–597.
Athalye, A., Carlini, N., & Wagner, D. (2018). Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples. International Conference on Machine Learning (ICML), PMLR 80, pp. 274–283.
Zhang, H., Yu, Y., Jiao, J., Xing, E. P., El Ghaoui, L., & Jordan, M. I. (2019). Theoretically principled trade-off between robustness and accuracy. International Conference on Machine Learning (ICML), PMLR 97, pp. 7472–7481.
Moosavi-Dezfooli, S. M., Fawzi, A., & Frossard, P. (2016). DeepFool: A simple and accurate method to fool deep neural networks. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2574–2582.
Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., & Fergus, R. (2013). Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199.
Xiao, C., Li, B., Zhu, J., He, W., Liu, M., & Song, D. (2018). Generating adversarial examples with adversarial networks. International Joint Conference on Artificial Intelligence (IJCAI), pp. 3905–3911.
Kurakin, A., Goodfellow, I., & Bengio, S. (2017). Adversarial machine learning at scale. International Conference on Learning Representations (ICLR).
Wang, B., Yao, Y., Shan, S., & Viswanath, B. (2020). Symmetric feature denoising for robust learning. Advances in Neural Information Processing Systems (NeurIPS).
Tramèr, F., Kurakin, A., Papernot, N., Goodfellow, I., Boneh, D., & McDaniel, P. (2018). Ensemble adversarial training: Attacks and defenses. International Conference on Learning Representations (ICLR).
Song, Y., Shu, R., Kushman, N., & Ermon, S. (2019). Constructing unrestricted adversarial examples with generative models. Advances in Neural Information Processing Systems (NeurIPS), pp. 8312–8323.
Pang, T., Xu, K., Du, C., Chen, N., & Zhu, J. (2020). Rethinking softmax cross-entropy loss for adversarial robustness. International Conference on Learning Representations (ICLR).
Hein, M., & Andriushchenko, M. (2017). Formal guarantees on the robustness of a classifier against adversarial manipulation. Advances in Neural Information Processing Systems (NeurIPS), pp. 2266–2276.
License
Copyright (c) 2025 Ahmed El-Sayed (Author)

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.