Document Type

Honors Paper


Ozgur Izmirli

Publication Date



In this thesis, we re-examine the problem of 3D object detection in the context of self-driving cars using the View-of-Delft (VoD) dataset [1], the first publicly released dataset containing 4D radar sensor data. 4D radar is a novel sensor that provides velocity and Radar Cross Section (RCS) information, in addition to position, for each point in its point cloud. State-of-the-art architectures such as 3DETR [2] and IA-SSD [3] were used as baselines. Several attention-free methods, including point cloud concatenation, feature propagation, and feature fusion with an MLP, as well as attentional methods utilizing cross-attention, were tested to determine how best to combine LiDAR and radar into a multimodal detection architecture that outperforms the baseline architectures trained on either modality alone. Our findings indicate that while attention-free methods did not consistently surpass the baseline performance across all classes, they did yield notable performance gains for specific classes. Furthermore, we found that attentional methods faced challenges due to the sparsity of radar point clouds and duplicated features, which limited the efficacy of the cross-attention mechanism. These findings highlight potential avenues for future research to refine and improve attentional methods for 3D object detection.
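To make the cross-attention fusion described above concrete, the following is a minimal, hypothetical sketch (not the thesis's actual architecture) of LiDAR features attending to radar features, assuming per-point feature vectors have already been produced by some backbone encoder. All names, dimensions, and the residual design are illustrative assumptions.

```python
# Hypothetical cross-attention fusion of LiDAR and radar point features.
# Assumes per-point features already extracted by a backbone; all names,
# dimensions, and design choices here are illustrative, not the thesis's method.
import torch
import torch.nn as nn

class LidarRadarCrossAttention(nn.Module):
    """LiDAR points query radar points (query = LiDAR, key/value = radar)."""

    def __init__(self, dim: int = 64, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, lidar_feats: torch.Tensor, radar_feats: torch.Tensor):
        # lidar_feats: (B, N_lidar, dim); radar_feats: (B, N_radar, dim)
        fused, _ = self.attn(lidar_feats, radar_feats, radar_feats)
        # Residual connection preserves LiDAR information where the sparse
        # radar cloud contributes little signal.
        return self.norm(lidar_feats + fused)

# Toy shapes reflecting the modality imbalance the abstract mentions:
# thousands of LiDAR points vs. a much sparser radar cloud.
B, dim = 2, 64
lidar = torch.randn(B, 2048, dim)
radar = torch.randn(B, 128, dim)
fused = LidarRadarCrossAttention(dim)(lidar, radar)
```

Note that because every LiDAR query attends over only a small set of radar keys, a very sparse radar cloud gives the attention mechanism little to discriminate between, which is consistent with the difficulty the abstract reports.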

Available for download on Monday, May 22, 2023



The views expressed in this paper are solely those of the author.