我们最近在《Biomimetics》期刊上发表的论文,提出了一种受人体触觉启发的软手指,结合内部视觉与动觉感知,用于估计手持物体的姿态,解决了机器人领域中充满挑战的问题,实现了高精度的姿态估计和物体分类。该论文第一作者刘小博是南方科技大学机械与能源工程系博士研究生,合作作者包括南方科技大学机械与能源工程系博士研究生韩旭东、郭宁,本文的共同通讯作者是设计学院助理教授万芳、机械与能源工程系助理教授宋超阳。

Our paper, recently published in Biomimetics, introduces a bio-inspired soft finger that combines inner vision with kinesthetic sensing to estimate in-hand object pose, addressing a challenging problem in robotics and achieving high-precision pose estimation and object classification. The first author, Xiaobo Liu, is a PhD student in the Department of Mechanical and Energy Engineering at the Southern University of Science and Technology (SUSTech); co-authors include Xudong Han and Ning Guo, also PhD students in the same department. The co-corresponding authors are Fang Wan, Assistant Professor in the School of Design, and Chaoyang Song, Assistant Professor in the Department of Mechanical and Energy Engineering.

DOI: https://doi.org/10.3390/biomimetics8060501

机器人和人机互动中最具挑战性的任务之一是手持物体姿态估计。这是一个让人和机器人都感到困惑的问题,主要是由于手和物体引起的遮挡。我们的研究深入探讨了这个复杂问题,引入了一种模仿人类灵巧性的新方法。我们提出了一个软手指,这是一种受生物启发的解决方案,将内部视觉与动觉感知相结合,以估计物体的姿态。这个软手指拥有柔韧的骨架和可适应的皮肤,确保它能够适应不同的物体。此外,在交互过程中,它的骨骼变形提供了宝贵的接触信息,由内部摄像头收集。我们的研究最终形成了一个端到端的框架,使用这些软手指的原始图像来估计手持物体的姿态。该框架包括用于动觉信息处理的编码器和用于对象姿态和类别估计的部分。我们在七个物体上对我们的方法进行了严格的测试,取得了令人印象深刻的结果,姿态估计误差为2.02毫米,方向误差为11.34度,分类准确度达到了惊人的99.05%。

One of the most challenging tasks in robotics and human-robot interaction is in-hand object pose estimation. It is a problem that challenges humans and robots alike, primarily because of the occlusion caused by the hand and the object. Our study delves into this complex issue, introducing a novel approach that seeks to mimic human-like dexterity. We propose a soft finger, a bio-inspired solution that blends inner vision with kinesthetic sensing to estimate the pose of objects. This soft finger has a flexible skeleton and adaptable skin, allowing it to conform to objects of different shapes. Moreover, the skeleton's deformation during interaction provides valuable contact information, which is captured by the inner camera. Our research culminates in an end-to-end framework that uses the raw images from these soft fingers to estimate in-hand object pose. The framework comprises an encoder for kinesthetic information processing and an estimator for object pose and category. We rigorously tested our approach on seven objects, achieving a position error of 2.02 mm, an orientation error of 11.34 degrees, and a classification accuracy of 99.05%.
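
As a concrete illustration of the evaluation, the sketch below shows one common way such metrics can be computed: mean Euclidean position error, mean geodesic rotation error between rotation matrices, and top-1 classification accuracy. This is a minimal NumPy sketch under our own assumptions about array shapes and pose representation; it is not code from the paper.

```python
# Illustrative metric computations (array names and shapes are assumptions).
import numpy as np

def position_error_mm(pred_xyz, true_xyz):
    """Mean Euclidean distance between predicted and ground-truth positions (mm)."""
    return np.linalg.norm(pred_xyz - true_xyz, axis=-1).mean()

def orientation_error_deg(pred_R, true_R):
    """Mean geodesic angle between predicted and ground-truth rotation matrices (deg)."""
    rel = np.einsum("bij,bkj->bik", pred_R, true_R)  # R_pred @ R_true^T for each sample
    cos = (np.trace(rel, axis1=-2, axis2=-1) - 1.0) / 2.0
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))).mean()

def classification_accuracy(pred_labels, true_labels):
    """Fraction of correctly classified samples."""
    return float(np.mean(np.asarray(pred_labels) == np.asarray(true_labels)))
```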

来自人类触觉的灵感|Our Research Inspiration: The Human Touch

人类手非常擅长处理物体,这主要是因为我们能够检测和理解手和物体之间的互动。虽然视觉信息提供了丰富的数据来感知物体的形状,但人们仍然可以仅依靠触觉感知来评估物体的重要属性,如大小、形状、位置和方向。人们采用两种主要的触觉感知模式:皮肤感知和动觉感知。皮肤感知依赖于皮肤与物体的物理接触,因此非常适合感知压力、振动和温度等因素。相比之下,动觉感知提供了对身体位置和运动的认知,对于区分物体的位置和方向非常有价值。受动觉感知的启发,我们的研究引入了一根软手指,配备了嵌入式摄像头和深度学习架构,用于物体识别。

Human hands are remarkably adept at handling objects, largely due to our ability to detect and understand the interactions between our hands and objects. While visual information provides rich data to perceive an object’s shape, humans can still assess essential object properties, such as size, shape, position, and orientation, relying solely on their sense of touch. Humans employ two primary modes of tactile perception: cutaneous and kinesthetic. The cutaneous sense depends on physical contact between the skin and objects, making it ideal for perceiving pressure, vibration, and temperature. In contrast, the kinesthetic sense provides awareness of the position and movement of the body, which is invaluable for discerning an object’s position and orientation. Drawing inspiration from the kinesthetic sense, our study introduces a soft finger equipped with an embedded camera and a deep learning architecture for object recognition.

物体姿态估计的挑战|Challenges in Object Pose Estimation

识别物体的姿态是机器人领域一个基本但具有挑战性的任务。为了成功操作,机器人必须深刻理解其环境以及与之互动的物体。视觉传感器已成为感知环境和识别物体的常见解决方案。已经开发了许多方法来进行物体定位和分类,充分利用了深度学习的威力。然而,遮挡问题仍然是一个持续存在的挑战,尤其是在精细操作任务中。机器人系统中的固有不确定性、容差和噪声进一步加大了物体姿态估计的复杂性。

Recognizing the pose of objects is a fundamental yet challenging task in robotics. For successful manipulation, robots must deeply understand their environment and the objects they interact with. Vision sensors have become a common solution for perceiving the environment and recognizing objects. Many approaches have been developed for object localization and classification, leveraging the power of deep learning. However, occlusion remains a persistent challenge, particularly in delicate manipulation tasks. The inherent uncertainties, tolerances, and noise in robotic systems further complicate object pose estimation.

我们的方法:软手指本体感知|Our Approach: Soft Finger Proprioception

受手-物体姿态估计(HOPE)问题的启发,我们旨在通过我们独特的方法解决手抓取物体的姿态估计问题。对于全动力手抓取器,我们从手抓取器电机获取关节角度,并从触觉和力传感器获取接触状态。利用这些信息,我们估计物体的姿态和类别。对于非全动力手抓取器,需要额外的传感器来测量额外的自由度。针对此目的提出了GelSight、DIGIT和FingerVision等传感器。然而,它们的局限性,如弹性层的厚度,使我们不得不探索替代解决方案。

Inspired by the hand-object pose estimation (HOPE) problem, we aim to solve the gripper-object pose estimation problem with our own approach. For fully actuated grippers, we can obtain joint angles from the gripper motors and contact states from tactile and force sensors, and use this information to estimate the object's pose and category. For under-actuated grippers, additional sensors are required to measure the extra degrees of freedom. Sensors such as GelSight, DIGIT, and FingerVision have been proposed for this purpose. However, their limitations, such as the thickness of the elastic layer, led us to explore an alternative solution.
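
To make the two sensing regimes concrete, here is a minimal sketch of what each observation could contain, written as simple Python data containers. The field names and shapes are illustrative assumptions on our part, not interfaces from the paper or from any of the sensors mentioned above.

```python
# Hypothetical observation structures for the two gripper cases discussed above.
from dataclasses import dataclass
import numpy as np

@dataclass
class FullyActuatedObservation:
    """Fully actuated gripper: joint angles plus tactile/force contact states."""
    joint_angles: np.ndarray    # (n_joints,) in rad, read from the gripper motors
    contact_forces: np.ndarray  # (n_contacts, 3) in N, from tactile/force sensors

@dataclass
class SoftFingerObservation:
    """Under-actuated soft fingers: extra DOFs observed via the embedded cameras."""
    finger_images: np.ndarray   # (n_fingers, H, W, 3) raw frames from the in-finger cameras
```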

我们的视触融合柔性指|Our Vision-based Soft Finger

我们的研究引入了一种柔软的自适应手指,配备了嵌入式摄像头,用于捕捉与物体互动过程中的手指变形。这根柔软手指安装在手抓取器上,增强了适应性,使其能够通过本体感知来识别处理的物体。我们的方法是使用原始图像来估计物体的姿态和类别,而不依赖于CAD模型。我们采用一种单阶段方法进行识别,简化了训练,提高了可重用性。我们的方法包括用于嵌入互动信息的特征提取器和用于进一步操作任务的后处理器。特征提取器采用ResNet块的编码器-解码器架构,后处理器使用多层感知器(MLP)进行分类和回归。

Our research introduces a soft, adaptive finger with an integrated camera that captures finger deformations during object interactions. This soft finger is mounted on a gripper, enhancing adaptability and enabling the gripper to recognize handled objects through proprioceptive sensing. Our approach uses raw images to estimate object pose and category without relying on CAD models. We adopt a one-stage methodology for recognition, which simplifies training and improves reusability. Our method consists of a feature extractor that embeds the interaction information and a post-processor for downstream manipulation tasks. The feature extractor employs an encoder-decoder architecture with ResNet blocks, while the post-processor uses a multilayer perceptron (MLP) for classification and regression.
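
For readers who think in code, the sketch below is a minimal PyTorch rendering of this pipeline: a ResNet-block encoder that embeds each raw finger image into a kinesthetic feature vector, and an MLP post-processor that outputs a pose vector and class logits. Layer sizes, the number of fingers, the 6-D pose parameterization, and the omission of the decoder branch are all simplifying assumptions on our part, not the paper's exact architecture.

```python
# A minimal, illustrative sketch of an encoder + MLP post-processor (not the paper's code).
import torch
import torch.nn as nn


class ResBlock(nn.Module):
    """Basic residual block: two 3x3 convolutions with a skip connection."""

    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        return torch.relu(x + self.body(x))


class KinestheticEncoder(nn.Module):
    """Embeds one raw in-finger image into a compact kinesthetic feature vector."""

    def __init__(self, feat_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 7, stride=2, padding=3),
            nn.ReLU(inplace=True),
            ResBlock(32),
            nn.Conv2d(32, 64, 3, stride=2, padding=1),
            ResBlock(64),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(64, feat_dim),
        )

    def forward(self, img):
        return self.net(img)


class PosePostProcessor(nn.Module):
    """MLP head that fuses per-finger features and outputs a pose vector and class logits."""

    def __init__(self, feat_dim: int = 128, n_fingers: int = 2, n_classes: int = 7):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim * n_fingers, 256),
            nn.ReLU(inplace=True),
            nn.Linear(256, 256),
            nn.ReLU(inplace=True),
        )
        self.pose_head = nn.Linear(256, 6)          # e.g. xyz + axis-angle orientation
        self.class_head = nn.Linear(256, n_classes)

    def forward(self, finger_feats):
        h = self.mlp(torch.cat(finger_feats, dim=-1))
        return self.pose_head(h), self.class_head(h)


if __name__ == "__main__":
    encoder = KinestheticEncoder()
    head = PosePostProcessor()
    imgs = [torch.randn(1, 3, 128, 128) for _ in range(2)]   # one image per finger
    pose, logits = head([encoder(im) for im in imgs])
    print(pose.shape, logits.shape)  # torch.Size([1, 6]) torch.Size([1, 7])
```

A design like this reflects the reusability argument above: the image encoder can be trained once and reused, while only the lightweight MLP heads need retraining for a new set of objects or a new manipulation task.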

主要贡献|Key Contributions

我们的研究为该领域带来了几项重要的贡献:

Our research brings several significant contributions to the field:

  1. 我们设计并制造了一个配备嵌入式摄像头的柔软手指,用于本体感知。
    We design and fabricate a soft finger with an integrated camera for proprioception.
  2. 我们提出了一个框架,用于提取和融合手指数据,以确定手抓取器内物体的状态。
    We propose a framework to extract and fuse finger data to determine objects’ states within a gripper.
  3. 我们通过姿态估计和分类的高准确性证明了我们方法的有效性。
    We demonstrate the effectiveness of our method through high accuracy in pose estimation and classification.

未来方向|Future Directions

展望未来,我们的研究为各种令人兴奋的可能性敞开了大门。我们计划探索将我们的方法应用于不同的触觉传感器和手抓取器,使其成为各种应用的多功能解决方案。此外,我们打算扩大可以受益于我们方法的操作任务的范围。提高物体姿态估计的进程还在继续,我们致力于推动机器人领域的可能性边界。

As we look ahead, our study opens doors to exciting possibilities. We plan to explore the transferability of our method to different tactile sensors and grippers, making it a versatile solution for various applications. Additionally, we intend to expand the range of manipulation tasks that can benefit from our approach. The journey of enhancing object pose estimation continues, and we are committed to pushing the boundaries of what’s possible in robotics.

Xiaobo Liu, Xudong Han, Ning Guo, Fang Wan*, and Chaoyang Song* (2023). “Bio-inspired Proprioceptive Touch of a Soft Finger with Inner-Finger Kinesthetic Perception.” Biomimetics, 8(6), 501.

DOI: https://doi.org/10.3390/biomimetics8060501