SECURING ARTIFICIAL INTELLIGENCE AGAINST INTELLIGENT ADVERSARIES: ROBUST LEARNING FRAMEWORKS FOR ADVERSARIAL, POISONING, AND MODEL EXTRACTION ATTACKS

Authors

  • Asif Ahmad, Departamento de Informática da Escola de Ciências e Tecnologia, Universidade de Évora (https://orcid.org/0009-0009-5538-1633)
  • Khair Muhammad Saraz, Mehran University of Engineering and Technology
  • Imran Khan, Dawood University of Engineering and Technology
  • Shadia Saad Baloch, Isra University, Hyderabad, Pakistan
  • Amber Baig, Isra University, Hyderabad, Pakistan
  • Ghulam Nabi, Riphah International University, Islamabad, Pakistan

DOI:

https://doi.org/10.71146/kjmr853

Keywords:

adversarial attacks, artificial intelligence security, model extraction, poisoning attacks, robust learning framework

Abstract

The rapid proliferation of artificial intelligence (AI) systems across critical domains has heightened concerns regarding their vulnerability to intelligent adversaries. This study evaluated the robustness of machine learning models against adversarial (evasion), poisoning, and model extraction attacks and proposed a multi-layered robust learning framework to mitigate these threats. Experimental results demonstrated that baseline models experienced accuracy degradation of up to 37.6% under projected gradient descent (PGD) adversarial attacks, while poisoning contamination reduced performance by more than 23% and produced backdoor trigger success rates exceeding 92%. Model extraction fidelity reached 89.3%, indicating substantial intellectual property risk. The proposed framework integrated adversarial training, anomaly-based data sanitization, and privacy-preserving output perturbation mechanisms. Following implementation, adversarial robustness improved by up to 26.3%, poisoning attack success rates declined below 13%, and extraction fidelity decreased by 24.2%. Importantly, these improvements were achieved with less than a 3% reduction in clean-data accuracy, confirming that enhanced security did not significantly compromise predictive utility. Statistical analysis indicated that the robustness improvements were significant at p < 0.05 across all attack categories. The findings emphasized that defense-in-depth architectures provided superior resilience compared to isolated mitigation techniques. The study contributed to secure and trustworthy AI development by presenting a scalable framework capable of addressing multi-stage intelligent adversarial threats while maintaining operational performance stability.
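
As a concrete illustration of the first defense layer named above, the following minimal PyTorch sketch pairs PGD adversarial example generation with a single adversarial training step. It is an assumption-laden sketch rather than the authors' implementation: the L-infinity budget eps=8/255, step size alpha=2/255, 10 attack iterations, and cross-entropy loss are common defaults from the adversarial training literature, not values reported in the paper.

    import torch
    import torch.nn as nn

    def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
        """Craft L-infinity PGD adversarial examples around inputs x."""
        # Random start inside the eps-ball, clipped to the valid pixel range.
        x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
        for _ in range(steps):
            x_adv.requires_grad_(True)
            loss = nn.functional.cross_entropy(model(x_adv), y)
            grad = torch.autograd.grad(loss, x_adv)[0]
            # Ascend the loss, then project back into the eps-ball around x.
            x_adv = (x_adv + alpha * grad.sign()).detach()
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
        return x_adv

    def adversarial_training_step(model, optimizer, x, y):
        """One optimizer step on PGD-perturbed inputs instead of clean ones."""
        model.eval()                 # freeze BN/dropout while crafting attacks
        x_adv = pgd_attack(model, x, y)
        model.train()
        optimizer.zero_grad()
        loss = nn.functional.cross_entropy(model(x_adv), y)
        loss.backward()
        optimizer.step()
        return loss.item()

Training on the perturbed batch approximates the min-max robust optimization objective: the inner loop maximizes the loss within the perturbation budget, and the outer step minimizes it, which is the standard mechanism by which adversarial training trades a small amount of clean accuracy for attack robustness. The other two defense layers described in the abstract (anomaly-based data sanitization and privacy-preserving output perturbation) would sit before training and after inference, respectively, and are not sketched here.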


Published

2026-03-07

Issue

Vol. 3 No. 03 (2026)

Section

Engineering and Technology

How to Cite

Ahmad, A., Saraz, K. M., Khan, I., Baloch, S. S., Baig, A., & Nabi, G. (2026). Securing artificial intelligence against intelligent adversaries: Robust learning frameworks for adversarial, poisoning, and model extraction attacks. Kashf Journal of Multidisciplinary Research, 3(03), 1-16. https://doi.org/10.71146/kjmr853