Multimodal Sensor Fusion in Autonomous Driving: A Deep Learning-Based Visual Perception Framework
DOI:
https://doi.org/10.71146/kjmr490
Keywords:
Multimodal Sensor Fusion, Autonomous Driving, Deep Learning, Transformer Architecture, Visual Perception, LiDAR, Radar, RGB Camera, Object Detection, Real-Time Systems
Abstract
Autonomous driving demands safety, reliability, and real-time environmental awareness, which has driven the development of multimodal sensor fusion systems. This study proposes FusionNet, a deep learning-based visual perception framework that uses a transformer-based intermediate fusion approach to combine RGB camera, LiDAR, and radar data. In contrast to classic early or late fusion techniques, FusionNet uses modality-specific encoders and cross-attention layers to dynamically align and merge semantic and geometric features. Extensive experiments on the KITTI and nuScenes datasets show that FusionNet achieves higher mean Average Precision (mAP) than unimodal systems, and that the improvement extends to adverse scenarios such as fog, low light, and occlusion, where unimodal systems perform poorly. The model runs in real time at 59 milliseconds per frame and remains robust under varying weather conditions and when sensors are degraded or faulty. FusionNet also localizes objects more accurately at high IoU thresholds and, with modality dropout training, tolerates missing modalities. These findings point to deep multimodal fusion as a core building block of future autonomous vehicle perception systems that can be deployed reliably across a wide range of urban and environmental contexts.
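To make the intermediate fusion idea concrete, the following is a minimal sketch of cross-attention between modality token sets, with modality dropout applied to the context. It is not FusionNet's implementation: the function names (`cross_attention`, `fuse`), the choice of camera tokens as queries, the residual update, and the omission of learned query/key/value projections are all assumptions made for illustration, and the token vectors stand in for the outputs of the modality-specific encoders.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross_attention(queries, context):
    """Single-head scaled dot-product cross-attention.

    Assumption: no learned projections; the context tokens serve as both
    keys and values. Each query is updated residually with a weighted
    average of the context tokens.
    """
    if not context:
        # Nothing to attend to: pass the query tokens through unchanged.
        return [list(q) for q in queries]
    scale = math.sqrt(len(queries[0]))
    fused = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in context])
        attended = [
            sum(w * k[i] for w, k in zip(weights, context))
            for i in range(len(q))
        ]
        fused.append([qi + ai for qi, ai in zip(q, attended)])
    return fused


def fuse(camera, lidar, radar, drop_prob=0.0, rng=None):
    """Fuse camera tokens with LiDAR and radar tokens (hypothetical helper).

    Modality dropout: each non-camera modality is removed from the
    context with probability drop_prob, so the fused output cannot rely
    on any single auxiliary sensor being present.
    """
    rng = rng or random.Random(0)
    context = []
    for tokens in (lidar, radar):
        if rng.random() >= drop_prob:
            context.extend(tokens)
    return cross_attention(camera, context)


# Toy 2-dimensional tokens standing in for encoder outputs.
camera_tokens = [[1.0, 0.0], [0.0, 1.0]]
lidar_tokens = [[1.0, 0.0]]
radar_tokens = [[0.0, 1.0]]
fused_tokens = fuse(camera_tokens, lidar_tokens, radar_tokens)
```

In this sketch, each camera token attends most strongly to the LiDAR or radar token it is most similar to, and when every auxiliary modality is dropped the camera features pass through unchanged, which is the behavior modality dropout training is meant to make the downstream detector tolerate.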
License
Copyright (c) 2025 Hadi Abdullah, Majeed Ali, Ijaz khan, Abdullah Faiz, Syed Haider Abbas Naqvi, Ali Majid (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.