ICRA 2025 Conference Paper
MonoLDP: LED Assisted Indoor Mobile Bot Monocular Depth Prediction and Pose Estimation System
- Chenxin Liang
- Jingyang Wang
- Shoujie Li
- Kit Wa Sou
- Xinyu Luo
- Wenbo Ding
Multi-robot clusters are increasingly deployed in indoor environments, where effective communication and 3D perception are critical for coordinated operations. Monocular cameras, known for their lightweight design, cost-effectiveness, and versatility, present a promising solution for these tasks. However, relying solely on monocular cameras for comprehensive perception and communication presents significant challenges. To address this, we introduce MonoLDP, a novel system that leverages monocular cameras for depth estimation, mutual pose estimation, and visible light communication in indoor environments, providing an integrated framework to overcome these limitations. MonoLDP features a two-stage network: (1) a depth estimation module that infers depth from monocular images, and (2) a depth-guided 3D object recognition network for agent-relative localization and pose estimation. We created a custom dataset to validate the accuracy of MonoLDP. On our indoor dataset, MonoLDP outperforms the baseline by 43.39% in 3D detection and 42.39% in bird's-eye view detection, with an average localization error of 0.104 m and an orientation error of 1.66 degrees. Moreover, the depth estimation network demonstrates excellent performance on the NYU v2 dataset. Additionally, the system achieves a communication rate of 1.2 Kbps with a bit error rate below 10⁻² at a distance of up to 4 m using LED arrays. Our code will be released at https://github.com/RavenLiang1005/MonoLDP.git.
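The two-stage design described above can be sketched as a simple inference pipeline. This is a minimal illustrative skeleton, not the authors' implementation: the function names, the constant depth map, and the single centre detection are all placeholder assumptions standing in for the real networks.

```python
def estimate_depth(image):
    """Stage 1 (placeholder): monocular depth estimation.
    Stand-in that returns a constant 1.0 m depth map the size of the image;
    the real module would be a learned depth network."""
    h, w = len(image), len(image[0])
    return [[1.0 for _ in range(w)] for _ in range(h)]

def detect_objects_3d(image, depth):
    """Stage 2 (placeholder): depth-guided 3D recognition and pose estimation.
    Stand-in that reports one detection at the image centre, taking its
    range from the stage-1 depth map at that pixel."""
    h, w = len(depth), len(depth[0])
    cy, cx = h // 2, w // 2
    return [{"center_px": (cx, cy),
             "range_m": depth[cy][cx],
             "yaw_deg": 0.0}]

def monoldp_infer(image):
    # Stage 1: dense depth from a single monocular frame.
    depth = estimate_depth(image)
    # Stage 2: depth map guides agent-relative localization / pose.
    return detect_objects_3d(image, depth)

# Usage with a dummy 6x8 grayscale frame.
frame = [[0] * 8 for _ in range(6)]
detections = monoldp_infer(frame)
```

The key structural point is that stage 2 consumes the stage-1 depth map rather than the raw image alone, which is what "depth-guided" refers to in the abstract.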