A team from Google Research has published a new blog post on fusing LiDAR and camera data for 3D object detection. The motivating problem is the misalignment between features derived from 3D LiDAR point clouds and those derived from 2D camera images, which makes it difficult to fuse the two modalities effectively.
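To see where the alignment problem comes from: LiDAR points live in a 3D sensor frame, while camera features live on a 2D image plane, so associating them requires projecting points through the camera's calibration. The sketch below shows a generic pinhole projection; the function name, matrix conventions, and calibration layout are illustrative assumptions, not the paper's code (datasets such as Waymo ship their own calibration conventions).

```python
import numpy as np

def project_lidar_to_image(points_lidar, extrinsic, intrinsic):
    """Project 3D LiDAR points (N, 3) into 2D pixel coordinates.

    extrinsic: 4x4 LiDAR-to-camera transform; intrinsic: 3x3 camera matrix.
    Illustrative sketch only, not the DeepFusion implementation.
    """
    n = points_lidar.shape[0]
    # Homogeneous coordinates, then transform into the camera frame.
    pts_h = np.hstack([points_lidar, np.ones((n, 1))])
    pts_cam = (extrinsic @ pts_h.T)[:3]          # shape (3, N)
    # Keep only points in front of the camera (positive depth).
    in_front = pts_cam[2] > 0
    pts_cam = pts_cam[:, in_front]
    # Perspective projection through the intrinsic matrix.
    pix = intrinsic @ pts_cam
    pix = pix[:2] / pix[2]                       # shape (2, M)
    return pix.T, in_front
```

The fragility of this mapping under data augmentation (e.g. rotating the point cloud during training breaks the correspondence with the unrotated image) is exactly the kind of misalignment the paper sets out to address.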
The blog discusses the team’s forthcoming paper titled “DeepFusion: Lidar-Camera Deep Fusion for Multi-Modal 3D Object Detection,” which will be presented at the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) in June 2022. A preprint of the paper is available here.
Some excerpts from the blog and the associated paper:
We evaluate DeepFusion on the Waymo Open Dataset, one of the largest 3D detection challenges for autonomous cars, using the Average Precision with Heading (APH) metric under difficulty level 2, the default metric to rank a model’s performance on the leaderboard. Among the 70 participating teams all over the world, the DeepFusion single and ensemble models achieve state-of-the-art performance in their corresponding categories.
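For readers unfamiliar with the APH metric mentioned in the excerpt: it extends standard Average Precision by weighting each true positive by how accurately the model predicted the object's heading (yaw angle). The sketch below is a simplified reading of that per-detection weighting, not Waymo's official evaluation code; the function name is our own.

```python
import math

def heading_accuracy(theta_pred, theta_gt):
    """Heading weight in [0, 1]: 1 for an exact heading match,
    0 for a 180-degree error.

    Simplified illustration of the per-true-positive weighting used
    in Waymo's Average Precision with Heading (APH) metric.
    """
    # Wrap the angular error into [0, pi].
    delta = abs(theta_pred - theta_gt) % (2 * math.pi)
    delta = min(delta, 2 * math.pi - delta)
    return 1.0 - delta / math.pi
```

A detection with a perfect box but a reversed heading thus contributes nothing to APH, which is why the metric is a stricter test for autonomous-driving perception than plain AP.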