Open vocabulary 3D object detection (OV3D) allows precise and extensible object recognition crucial for adapting to diverse environments encountered in assistive robotics. This paper presents OpenNav, a zero-shot 3D object detection pipeline based on RGB-D images for smart wheelchairs. Our pipeline integrates an open-vocabulary 2D object detector with a mask generator for semantic segmentation, followed by depth isolation and point cloud construction to create 3D bounding boxes. The smart wheelchair exploits these 3D bounding boxes to identify potential targets and navigate safely. We demonstrate OpenNav’s performance through experiments on the Replica dataset and we report preliminary results with a real wheelchair. OpenNav improves state-of-the-art significantly on the Replica dataset at mAP25 (+9pts) and mAP50 (+5pts) with marginal improvement at mAP. The code is publicly available at this link https://github.com/EasyWalk-PRIN/OpenNav.

OpenNav: Efficient Open Vocabulary 3D Object Detection for Smart Wheelchair Navigation

Rahman, Muhammad Rameez ur;Vascon, Sebastiano
2025-01-01

Abstract

Open vocabulary 3D object detection (OV3D) allows precise and extensible object recognition crucial for adapting to diverse environments encountered in assistive robotics. This paper presents OpenNav, a zero-shot 3D object detection pipeline based on RGB-D images for smart wheelchairs. Our pipeline integrates an open-vocabulary 2D object detector with a mask generator for semantic segmentation, followed by depth isolation and point cloud construction to create 3D bounding boxes. The smart wheelchair exploits these 3D bounding boxes to identify potential targets and navigate safely. We demonstrate OpenNav’s performance through experiments on the Replica dataset and we report preliminary results with a real wheelchair. OpenNav improves state-of-the-art significantly on the Replica dataset at mAP25 (+9pts) and mAP50 (+5pts) with marginal improvement at mAP. The code is publicly available at this link https://github.com/EasyWalk-PRIN/OpenNav.
2025
Lecture Notes in Computer Science - ECCV proceedings
File in questo prodotto:
File Dimensione Formato  
2408.13936v1 (1).pdf

accesso aperto

Descrizione: Versione ArXiv
Tipologia: Documento in Pre-print
Licenza: Accesso libero (no vincoli)
Dimensione 6.26 MB
Formato Adobe PDF
6.26 MB Adobe PDF Visualizza/Apri

I documenti in ARCA sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/10278/5105908
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 0
  • ???jsp.display-item.citation.isi??? 0
social impact