REAL-TIME FACIAL EMOTION RECOGNITION USING MODIFIED EFFICIENTNETB0 WITH DUAL-DATASET TRAINING

Muhammad Nadeem; Hamza Rafi; Muhammad Ihsan; Muhammad Arslan; Sohail Raza Chohan; Wasif Akbar

doi:10.71146/kjmr879

Authors

Muhammad Nadeem, Department of Computing & Emerging Technologies, Emerson University, Multan, Pakistan
Hamza Rafi, Department of Computing & Emerging Technologies, Emerson University, Multan, Pakistan
Muhammad Ihsan, Department of Computing & Emerging Technologies, Emerson University, Multan, Pakistan
Muhammad Arslan, Department of Computing & Emerging Technologies, Emerson University, Multan, Pakistan
Sohail Raza Chohan, Department of Computing & Emerging Technologies, Emerson University, Multan, Pakistan
Wasif Akbar, Department of Computing & Emerging Technologies, Emerson University, Multan, Pakistan

Keywords:

EfficientNetB0, facial emotion recognition, deep learning, real-time detection, FER2013, CK+, transfer learning, MBConv, squeeze-and-excitation, human-computer interaction, cosine annealing, confusion matrix

Abstract

Recent advances in emotionally aware computing have sparked growing interest in automated facial emotion recognition (FER), a field with far-reaching implications for human-computer interaction. Custom convolutional neural networks (CNNs) showed early promise in this domain, but achieving consistently high performance under real-world conditions remains a persistent challenge. This paper introduces an improved FER system built on a modified EfficientNetB0 architecture, pretrained on ImageNet and fine-tuned on a combined FER2013/CK+ dataset, designed to close the gap between laboratory accuracy and practical deployment. The proposed end-to-end pipeline integrates face detection, pre-processing, deep inference, and temporal smoothing into a single real-time workflow. Training for 40 epochs with a cosine-annealing learning-rate schedule yielded a peak validation accuracy of 88.4% on FER2013, a 14.1-percentage-point improvement over the CNN baseline of 74.3%. Per-class recognition rates derived from the normalized confusion matrix are: Disgust (96%), Surprise (95%), Angry (92%), Fear (91%), Happy (89%), Neutral (85%), and Sad (81%). The system runs inference at 22–28 frames per second on a standard laptop without GPU acceleration, which supports its viability for real-world deployment in emotionally intelligent interaction applications.
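Two quantities named in the abstract can be sketched in a few lines of Python: the cosine-annealing learning-rate schedule over 40 epochs, and per-class recognition rates read off a row-normalized confusion matrix. This is a minimal sketch and not the authors' implementation. The learning-rate bounds (`lr_max`, `lr_min`), the absence of warm restarts, the function names, and the three-class toy matrix are all assumptions made for illustration. The abstract states none of them, and the toy counts are not results from the paper.

```python
import math


def cosine_annealing_lr(epoch, total_epochs=40, lr_max=1e-3, lr_min=1e-6):
    """Learning rate at `epoch` under a single-cycle cosine-annealing schedule.

    total_epochs=40 matches the abstract; lr_max and lr_min are
    hypothetical values, not taken from the paper.
    """
    cos_term = 1.0 + math.cos(math.pi * epoch / total_epochs)
    return lr_min + 0.5 * (lr_max - lr_min) * cos_term


def per_class_recall(confusion, labels):
    """Per-class recognition rate from a confusion matrix.

    Rows are assumed to be true classes and columns predicted classes,
    so row-normalizing and reading the diagonal gives each class's recall.
    """
    rates = {}
    for i, (label, row) in enumerate(zip(labels, confusion)):
        total = sum(row)
        rates[label] = row[i] / total if total else float("nan")
    return rates


if __name__ == "__main__":
    # The schedule starts at lr_max and decays smoothly to lr_min.
    for epoch in (0, 10, 20, 30, 40):
        print(epoch, cosine_annealing_lr(epoch))

    # Toy three-class counts for illustration only (not data from the paper).
    toy_labels = ["Happy", "Sad", "Neutral"]
    toy_confusion = [
        [8, 1, 1],
        [2, 6, 2],
        [0, 1, 9],
    ]
    print(per_class_recall(toy_confusion, toy_labels))
```

Under this convention each reported per-class figure is a recall, so the 81% for Sad would mean that 81% of true Sad samples were labeled Sad. The overall accuracy additionally depends on how many samples each class contributes.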
