SECURING ARTIFICIAL INTELLIGENCE AGAINST INTELLIGENT ADVERSARIES: ROBUST LEARNING FRAMEWORKS FOR ADVERSARIAL, POISONING, AND MODEL EXTRACTION ATTACKS

Authors

  • Asif Ahmad, Departamento de Informática da Escola de Ciências e Tecnologia, Universidade de Évora (https://orcid.org/0009-0009-5538-1633)
  • Khair Muhammad Saraz, Mehran University of Engineering and Technology
  • Imran Khan, Dawood University of Engineering and Technology
  • Shadia Saad Baloch, Isra University, Hyderabad, Pakistan
  • Amber Baig, Isra University, Hyderabad, Pakistan
  • Ghulam Nabi, Riphah International University, Islamabad, Pakistan

DOI:

https://doi.org/10.71146/kjmr853

Keywords:

adversarial attacks, artificial intelligence security, model extraction, poisoning attacks, robust learning framework

Abstract

The rapid proliferation of artificial intelligence (AI) systems across critical domains has heightened concerns regarding their vulnerability to intelligent adversaries. This study evaluated the robustness of machine learning models against adversarial (evasion), poisoning, and model extraction attacks and proposed a multi-layered robust learning framework to mitigate these threats. Experimental results demonstrated that baseline models experienced accuracy degradation of up to 37.6% under projected gradient descent (PGD) adversarial attacks, while poisoning contamination reduced performance by more than 23% and produced backdoor trigger success rates exceeding 92%. Model extraction fidelity reached 89.3%, indicating substantial intellectual property risk. The proposed framework integrated adversarial training, anomaly-based data sanitization, and privacy-preserving output perturbation mechanisms. Following implementation, adversarial robustness improved by up to 26.3%, poisoning attack success rates declined below 13%, and extraction fidelity decreased by 24.2%. Importantly, these improvements were achieved with less than a 3% reduction in clean-data accuracy, confirming that enhanced security did not significantly compromise predictive utility. Statistical analysis indicated that the robustness improvements were significant at p < 0.05 across all attack categories. The findings emphasized that defense-in-depth architectures provided superior resilience compared to isolated mitigation techniques. The study contributed to secure and trustworthy AI development by presenting a scalable framework capable of addressing multi-stage intelligent adversarial threats while maintaining operational performance stability.
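
As a concrete illustration of the first defense layer named above, the following minimal PyTorch sketch pairs PGD adversarial example generation with a single adversarial training step. It is an assumption-laden sketch rather than the authors' implementation: the L-infinity budget eps=8/255, step size alpha=2/255, 10 attack iterations, and cross-entropy loss are common defaults from the adversarial training literature, not values reported in the paper.

    import torch
    import torch.nn as nn

    def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
        """Craft L-infinity PGD adversarial examples around inputs x."""
        # Random start inside the eps-ball, clipped to the valid pixel range.
        x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
        for _ in range(steps):
            x_adv.requires_grad_(True)
            loss = nn.functional.cross_entropy(model(x_adv), y)
            grad = torch.autograd.grad(loss, x_adv)[0]
            # Ascend the loss, then project back into the eps-ball around x.
            x_adv = (x_adv + alpha * grad.sign()).detach()
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
        return x_adv

    def adversarial_training_step(model, optimizer, x, y):
        """One optimizer step on PGD-perturbed inputs instead of clean ones."""
        model.eval()                 # freeze BN/dropout while crafting attacks
        x_adv = pgd_attack(model, x, y)
        model.train()
        optimizer.zero_grad()
        loss = nn.functional.cross_entropy(model(x_adv), y)
        loss.backward()
        optimizer.step()
        return loss.item()

Training on the perturbed batch approximates the min-max robust optimization objective: the inner loop maximizes the loss within the perturbation budget, and the outer step minimizes it, which is the standard mechanism by which adversarial training trades a small amount of clean accuracy for attack robustness. The other two defense layers described in the abstract (anomaly-based data sanitization and privacy-preserving output perturbation) would sit before training and after inference, respectively, and are not sketched here.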


Published

2026-03-07

Issue

Vol. 3 No. 03 (2026)

Section

Engineering and Technology

How to Cite

Ahmad, A., Saraz, K. M., Khan, I., Baloch, S. S., Baig, A., & Nabi, G. (2026). Securing artificial intelligence against intelligent adversaries: Robust learning frameworks for adversarial, poisoning, and model extraction attacks. Kashf Journal of Multidisciplinary Research, 3(03), 1-16. https://doi.org/10.71146/kjmr853