The lab studies embodied agents for automated photography:
1. Perception:
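(1) Depth vision computing: using computer vision techniques to estimate depth information from images or video, where depth is the perpendicular distance from the scene point seen at each pixel to the camera's imaging plane. The lab focuses on cross-domain monocular metric depth estimation that switches seamlessly between indoor and outdoor scenes.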
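(2) Open-world perception and navigation:
① Open-world 2D or 3D detection and recognition of obstacles and unknown objects;
② Open-world visual navigation: visual navigation is the complete pipeline by which a mobile robot uses visual sensors for scene perception, path planning, and motion planning. The lab focuses on visual navigation and obstacle avoidance, including: visual odometry (VO), mapping (from VO and depth maps), relocalization (recognizing the robot's own position in a known map), loop closure detection (eliminating the loop closure error of VO), obstacle detection, road segmentation, unknown object detection, global navigation, visual obstacle avoidance, and scene tagging (automatically labeling the objects in a room).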
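2. Reasoning: thinking logically over existing knowledge and information to reach sound judgments.
(1) Digital twins of the open world;
(2) Dual drive by knowledge and data.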
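3. Decision-making: choosing the best course of action based on the results of reasoning.
4. Memory: short-term working memory and three kinds of long-term memory (episodic, semantic, and procedural).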
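The depth definition in item 1(1) and the "mapping from VO and depth maps" step in item 1(2) can be illustrated with a minimal sketch. It assumes an ideal pinhole camera without lens distortion and is not the lab's implementation; the names `Intrinsics`, `backproject`, `ray_distance_to_depth`, `camera_to_world`, and `depth_map_to_world_points`, as well as the `max_depth` cutoff, are hypothetical and chosen only for illustration. The sketch shows how a metric depth value (distance along the optical axis) differs from the distance along the viewing ray, and how a depth map plus a camera pose, such as one estimated by VO, yields 3D points in a world frame.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Intrinsics:
    """Pinhole camera intrinsics in pixels (assumed model, no distortion)."""
    fx: float
    fy: float
    cx: float
    cy: float


def backproject(u, v, depth, K):
    """Map pixel (u, v) with metric depth to a 3D point in the camera frame.

    `depth` is the perpendicular distance to the imaging plane, so it is
    exactly the z coordinate of the returned point.
    """
    x = (u - K.cx) / K.fx * depth
    y = (v - K.cy) / K.fy * depth
    return (x, y, depth)


def ray_distance_to_depth(u, v, dist, K):
    """Convert a distance measured along the viewing ray into depth (z)."""
    xn = (u - K.cx) / K.fx
    yn = (v - K.cy) / K.fy
    return dist / math.sqrt(xn * xn + yn * yn + 1.0)


def camera_to_world(p, R, t):
    """Apply a camera-to-world pose: R is a 3x3 nested list, t a 3-tuple."""
    return tuple(
        sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)
    )


def depth_map_to_world_points(depth_map, K, R, t, max_depth=10.0):
    """Turn a depth map (list of rows, metres) into world-frame 3D points.

    Pixels with non-positive depth or depth beyond `max_depth` are skipped;
    the cutoff is an illustrative assumption, not a recommended value.
    """
    points = []
    for v, row in enumerate(depth_map):
        for u, d in enumerate(row):
            if 0.0 < d <= max_depth:
                points.append(camera_to_world(backproject(u, v, d, K), R, t))
    return points
```

Accumulating the points returned by `depth_map_to_world_points` over successive frames, each with its own pose, gives a simple point-cloud map. A practical system would additionally handle lens distortion, pose drift, and outlier depths.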
Representative papers of the vRobotit Lab on "embodied agents for photography":
[1] Yihao Liu, Feng Xue, Anlong Ming*, Mingshuai Zhao, Huadong Ma, Nicu Sebe, SM4Depth: Seamless Monocular Metric Depth Estimation across Multiple Cameras and Scenes by One Model, in Proceedings of the 32nd ACM International Conference on Multimedia (MM), 2024. Note: depth vision computing
Code: arXiv:2403.08556
[2] Feng Xue, Yicong Chang, Tianxi Wang, Yu Zhou, Anlong Ming, Indoor Obstacle Discovery on Reflective Ground Using Monocular Camera, International Journal of Computer Vision (IJCV), vol. 132, pp. 987-1007, 2024. Note: small obstacle perception
Code: https://github.com/mRobotit/IndoorObstacleDiscovery-RG
[3] Fei Sheng, Feng Xue, Wenteng Liang, Yicong Chang, Anlong Ming*, Monocular Depth Distribution Alignment with Low Computation, in IEEE International Conference on Robotics and Automation (ICRA), 2022. Note: depth vision computing
Code: https://github.com/mRobotit/DANet
[4] Feng Xue, Junfeng Cao, Fei Sheng, Yankai Wang, Yu Zhou, Anlong Ming, Boundary-induced and Scene-aggregated Network for Monocular Depth Prediction, Pattern Recognition (PR), vol. 115, 2021. Note: depth vision computing
Code: https://github.com/mRobotit/BS-Net
[5] Wenteng Liang, Feng Xue, Yihao Liu, Guofeng Zhong, Anlong Ming*, Unknown Sniffer for Object Detection: Don't Turn a Blind Eye to Unknown Objects, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023. Note: open-world object perception
Code: https://github.com/mRobotit/UnSniffer
[6] Y. Chang, F. Xue, F. Sheng, W. Liang and A. Ming, Fast Road Segmentation via Uncertainty-aware Symmetric Network, in IEEE International Conference on Robotics and Automation (ICRA), 2022. Note: drivable area segmentation
Code: https://github.com/mRobotit/USNet
[7] F. Xue, A. Ming and Y. Zhou, Tiny Obstacle Discovery by Occlusion-Aware Multilayer Regression, in IEEE Transactions on Image Processing (TIP), vol. 29, pp. 9373-9386, 2020. Note: small obstacle perception
Code: https://github.com/mRobotit/Tiny-Obstacle-Discovery-ROS
[8] F. Xue, A. Ming, M. Zhou and Y. Zhou, A Novel Multilayer Framework for Tiny Obstacle Discovery, in IEEE International Conference on Robotics and Automation (ICRA), 2019. Note: small obstacle perception
Code: https://github.com/mRobotit/Tiny-Obstacle-Discovery
[9] Feng Xue, Yicong Chang, Wenzhuang Xu, Wenteng Liang, Fei Sheng, Anlong Ming*, Evidence-based Real-time Road Segmentation with RGB-D Data Augmentation, IEEE Transactions on Intelligent Transportation Systems (TITS), accepted, 2024. Note: drivable area segmentation
Code: in preparation