- Researchers develop a robotic arm that uses touch and vision to learn the game of Jenga.
- The machine-learning method developed in this work could help robots assemble consumer products and perform other tasks that require careful physical interaction.
Jenga is a complex game that requires precise hand-eye coordination and strategy. As humans, we seamlessly integrate our senses of sight and touch to master this game. Robots, on the other hand, do not yet possess this level of sophistication.
Most robotic learning systems rely only on visual data. Without a sense of touch, their ability to learn about the external world is limited. Existing learning algorithms based on model-free reinforcement learning have little to no ability to exploit knowledge about physical objects, contacts, or forces.
Recently, researchers at MIT’s MCube Lab developed an algorithm that lets a robot combine sight and touch in a similar way. Unlike conventional machine-learning techniques that use massive datasets to evaluate the next best move, this robot learns and exploits a hierarchical model that enables gentle and precise extraction of pieces.
The robot is equipped with an external RGB camera, a soft-pronged gripper, and a force-sensing wrist cuff. Together, these components allow the robot to observe and feel the Jenga tower and its individual blocks.
The researchers customized an industrial ABB IRB 120 robotic arm and set up a Jenga tower within its reach. As the arm gently pushes against a block, a computer captures tactile and visual feedback from its cuff and camera, and compares these measurements with the robot’s previous moves.
The hierarchical model enables the robot to accurately estimate the state of a piece, simulate possible next moves, and decide on a favorable one. In real time, the machine learns whether to keep pushing a block or move to a new one in order to keep the structure from collapsing.
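The estimate, simulate, and decide loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `BlockState`, `predicted_risk`, `choose_action`, the candidate push lengths, and the risk threshold are all hypothetical names and values, and the hand-written risk formula is a stand-in for the model the robot actually learns from experience.

```python
from dataclasses import dataclass


@dataclass
class BlockState:
    """Estimated state of one block (hypothetical fields and units)."""
    force_n: float          # resistance felt at the wrist cuff, in newtons
    displacement_mm: float  # how far the block has moved so far


def predicted_risk(state: BlockState, push_mm: float) -> float:
    """Stand-in for a learned model: risk grows with resistance and push length.

    The formula is made up for illustration; the article's robot learns
    this relationship from its own attempts.
    """
    return min(1.0, state.force_n * push_mm / 10.0)


def choose_action(state, candidate_pushes=(1.0, 2.0, 4.0), max_risk=0.3):
    """Simulate each candidate push and keep the longest one that looks safe."""
    safe = [p for p in candidate_pushes if predicted_risk(state, p) <= max_risk]
    if not safe:
        # Every simulated push looks too risky: leave this block alone.
        return ("switch_block", 0.0)
    return ("push", max(safe))


free_block = BlockState(force_n=0.2, displacement_mm=3.0)
stiff_block = BlockState(force_n=2.0, displacement_mm=0.5)
stuck_block = BlockState(force_n=5.0, displacement_mm=0.1)

for block in (free_block, stiff_block, stuck_block):
    print(block, "->", choose_action(block))
```

A block that offers little resistance gets the longest push, a stiffer block gets a cautious one, and a block that resists strongly is abandoned in favor of another.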
In some respects this is more challenging than developing AI for chess or Go, because Jenga also requires basic physical skills such as pulling, pushing, placing, and aligning individual blocks.
The robot developed in this work efficiently identifies when a block feels stuck or free and decides how to extract it, using far less data than conventional approaches. It was trained on nearly 300 attempts rather than tens of thousands. Attempts with similar outcomes and measurements are grouped into clusters, and each cluster represents a specific block behavior.
For each cluster, the robot develops a model that estimates the behavior of a block given its current tactile and visual measurements. This clustering strategy, inspired by the natural way humans learn, significantly increases the efficiency with which the robot learns to play the game.
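To make the idea of grouping attempts and fitting one model per cluster concrete, here is a minimal Python sketch. It rests on several assumptions: the article does not say which clustering algorithm the researchers used, so plain k-means is swapped in as a stand-in; the eight attempts, their (force, displacement) features, and their outcomes are invented toy data; and the per-cluster "model" is reduced to an observed success rate. All function and variable names are hypothetical.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain k-means on equal-length tuples; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)

    def nearest(p):
        return min(range(k), key=lambda c: math.dist(p, centroids[c]))

    for _ in range(iters):
        labels = [nearest(p) for p in points]
        # Move each centroid to the mean of its members (keep it if empty).
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    # Final assignment against the final centroids.
    return centroids, [nearest(p) for p in points]


def fit_cluster_models(labels, outcomes, k):
    """Per-cluster 'model': fraction of attempts in the cluster that succeeded."""
    models = []
    for c in range(k):
        results = [o for l, o in zip(labels, outcomes) if l == c]
        models.append(sum(results) / len(results) if results else 0.0)
    return models


def predict_success(measurement, centroids, models):
    """Look up the success rate of the cluster nearest to a new measurement."""
    c = min(range(len(centroids)), key=lambda i: math.dist(measurement, centroids[i]))
    return models[c]


# Invented toy data: (push force in N, block displacement in mm).
# A real system would normalize features that use different units.
attempts = [
    (0.20, 9.0), (0.30, 10.0), (0.25, 9.5), (0.35, 11.0),  # block slides freely
    (2.00, 0.5), (2.20, 0.4), (1.90, 0.6), (2.10, 0.3),    # block feels stuck
]
outcomes = [1, 1, 1, 0, 0, 0, 0, 1]  # 1 = block extracted, 0 = attempt failed

centroids, labels = kmeans(attempts, k=2)
models = fit_cluster_models(labels, outcomes, k=2)

print(predict_success((0.3, 9.8), centroids, models))
print(predict_success((2.0, 0.5), centroids, models))
```

Because each new measurement only has to be matched to a cluster, a few hundred attempts can be enough to populate a handful of behaviors, which fits the article's point about needing far less data.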
This method is a successful example of artificial intelligence moving into the physical world. As the robot interacts with its surroundings, it learns some of the basic skills that define human manipulation.
This tactile learning system can be applied to tasks beyond Jenga, especially those that require careful physical interaction, such as assembling consumer products and separating recyclable materials from landfill trash.
On a smartphone or laptop assembly line, for instance, most steps rely on touch and force rather than vision alone, and this technology could drastically improve such lines.