MDANet: Multi-Modal Deep Aggregation Network for Depth Completion

Yanjie Ke; Kun Li; Wei Yang; Zhenbo Xu; Dayang Hao; Liusheng Huang; Gang Wang

ICRA 2021

Conference Paper · Accepted Paper · Artificial Intelligence · Robotics

Details

Abstract

Depth completion aims to recover a dense depth map from sparse depth data and an RGB image. However, because the two input modalities differ greatly, vanilla convolutional neural networks and simple fusion strategies cannot effectively extract features from sparse data or aggregate multi-modal information. To tackle this problem, we design a novel network architecture that takes full advantage of multi-modal features for depth completion. An effective pre-completion algorithm is first put forward to increase the density of the input depth map and to provide distribution priors. Moreover, to effectively fuse the image features and the depth features, we propose a multi-modal deep aggregation block that consists of multiple connection and aggregation pathways for deeper fusion. Furthermore, based on the intuition that semantic image features are beneficial for accurate contours, we introduce the deformable guided fusion layer to guide the generation of the dense depth map. The resulting architecture, called MDANet, outperforms all state-of-the-art methods on the popular KITTI Depth Completion Benchmark while using fewer parameters than recent methods. The code of this work will be available at https://github.com/USTC-Keyanjie/MDANet_ICRA2021.
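
The abstract does not spell out the pre-completion algorithm, so the following is only a minimal illustrative sketch of what "increasing the density of the input depth map" can look like, not the authors' method. It assumes that missing depth is encoded as 0 and fills each empty pixel with the mean of the valid measurements in a small window around it. The function name `precomplete_depth` and its parameters `kernel` and `iterations` are hypothetical.

```python
import numpy as np


def precomplete_depth(sparse, kernel=3, iterations=1):
    """Hypothetical densification step (NOT the MDANet algorithm).

    Assumes missing depth is encoded as 0. Each empty pixel is filled with
    the mean of the valid depths inside a kernel x kernel window; pixels
    that already carry a measurement are left unchanged.
    """
    if kernel % 2 != 1:
        raise ValueError("kernel must be odd")
    depth = np.asarray(sparse, dtype=np.float32).copy()
    h, w = depth.shape
    pad = kernel // 2
    for _ in range(iterations):
        valid = (depth > 0).astype(np.float32)
        padded_depth = np.pad(depth, pad)  # zero padding at the borders
        padded_valid = np.pad(valid, pad)
        total = np.zeros_like(depth)       # sum of valid depths per window
        count = np.zeros_like(depth)       # number of valid depths per window
        for dy in range(kernel):
            for dx in range(kernel):
                total += padded_depth[dy:dy + h, dx:dx + w]
                count += padded_valid[dy:dy + h, dx:dx + w]
        # Fill only empty pixels that have at least one valid neighbour.
        fill = (valid == 0) & (count > 0)
        depth[fill] = total[fill] / count[fill]
    return depth


# Usage: a 5x5 map with a single measurement at the centre.
sparse = np.zeros((5, 5), dtype=np.float32)
sparse[2, 2] = 10.0
dense = precomplete_depth(sparse, kernel=3, iterations=1)
```

In this toy example one pass spreads the single measurement to its 3x3 neighbourhood, so the number of valid pixels grows from 1 to 9 while the original measurement is preserved. A learned network would then refine such a densified map with guidance from the RGB image.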

Keywords

Image resolution
Fuses
Semantics
Network architecture
Data aggregation
Feature extraction
Data mining
Depth Completion
Convolutional Network
Convolutional Neural Network
Image Features
Sparse Data
RGB Images
Depth Map
Fusion Strategy
Multimodal Information
Depth Features
Multimodal Features
Dense Depth
Contralateral
Mean Square Error
Spatial Information
Feature Maps
Mean Absolute Error
Image Information
Semantic Information
Addition Operations
Accurate Depth
Stereo Matching
Aggregation Scheme
Input Density
Deformable Convolution
Depth Perception
Edges Of Objects
Multimodal Representation
Depth Values
Down-sampling Operation

Context

Venue: IEEE International Conference on Robotics and Automation
Archive span: 1984-2025
Indexed papers: 30179
Paper id: 212143371363551837