Humanoid robots hold great promise in assisting humans in diverse environments and tasks, because their human-like morphology makes them flexible and adaptable. However, research on humanoid robots is often bottlenecked by costly and fragile hardware setups. To accelerate algorithmic research on humanoid robots, we present a high-dimensional, simulated robot learning benchmark, HumanoidBench, featuring a humanoid robot equipped with dexterous hands and a variety of challenging whole-body manipulation and locomotion tasks. Our findings reveal that state-of-the-art reinforcement learning algorithms struggle with most tasks, whereas a hierarchical learning baseline achieves superior performance when supported by robust low-level policies, such as walking or reaching. With HumanoidBench, we provide the robotics community with a platform to identify the challenges that arise when solving diverse tasks with humanoid robots, facilitating prompt verification of algorithms and ideas.
We benchmark a variety of state-of-the-art reinforcement learning algorithms on all tasks. Our results show that these end-to-end (flat) algorithms struggle to control the complex humanoid robot dynamics and to solve the most challenging tasks. In fact, many of these tasks require long-horizon planning and a diverse set of skills (e.g., balancing, walking, reaching) to achieve the desired objective.
We argue that these issues can be mitigated by introducing additional structure into the learning problem. In particular, we explore a hierarchical learning paradigm, where one or more low-level skill policies are provided to a high-level planning policy, which sends setpoints to these low-level policies.
As an example, in the push task we use a one-hand reaching policy (trained with massively parallelized PPO using MuJoCo MJX) as a low-level skill, which allows the robot to reach a 3D point in space with its left hand.
Low-level reaching policy (left hand)
Low-level reaching policy (two hands)
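To make the hierarchical setup concrete, the following is a minimal sketch of a two-level control loop in which a high-level policy emits a 3D setpoint and a low-level reaching skill tracks it. It is not the HumanoidBench implementation: the class names `HighLevelPolicy` and `ReachingSkill`, the `replan_every` interval, the proportional controller, and the point-mass "hand" are all assumptions chosen for illustration. In the benchmark, the low-level skill is a learned policy acting on the full humanoid.

```python
import numpy as np


class ReachingSkill:
    """Hypothetical stand-in for a pretrained low-level reaching policy."""

    def __init__(self, gain=0.5):
        self.gain = gain

    def act(self, hand_pos, setpoint):
        # Proportional step toward the setpoint. A learned skill would
        # instead output joint-level commands for the whole body.
        return self.gain * (setpoint - hand_pos)


class HighLevelPolicy:
    """Hypothetical stand-in for the high-level policy."""

    def act(self, hand_pos, object_pos):
        # Trivial choice for illustration: aim the hand at the object.
        return object_pos.copy()


def run_episode(high, low, hand_pos, object_pos, steps=50, replan_every=10):
    """Run the two-level loop on a toy point-mass hand (assumed dynamics)."""
    hand_pos = np.asarray(hand_pos, dtype=float)
    object_pos = np.asarray(object_pos, dtype=float)
    setpoint = hand_pos.copy()
    for t in range(steps):
        # The high level picks a new setpoint only every `replan_every` steps.
        if t % replan_every == 0:
            setpoint = high.act(hand_pos, object_pos)
        # The low level acts at every step to track the current setpoint.
        hand_pos = hand_pos + low.act(hand_pos, setpoint)
    return hand_pos


if __name__ == "__main__":
    target = np.array([0.3, -0.2, 1.0])
    final = run_episode(HighLevelPolicy(), ReachingSkill(), np.zeros(3), target)
    print("distance to target:", np.linalg.norm(final - target))
```

The point of the split is that the high-level policy only has to choose where the hand should go, a three-dimensional decision, while the low-level skill handles the high-dimensional problem of moving the body there.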
With HumanoidBench, we set a high bar with complex everyday tasks, in the hope of stimulating the community to accelerate the development of whole-body algorithms for humanoid robots with high-dimensional observation and action spaces.
Many of the tasks are still unsolved. The videos below show a non-exhaustive collection of failure modes.
@misc{sferrazza2024humanoidbench,
title={HumanoidBench: Simulated Humanoid Benchmark for Whole-Body Locomotion and Manipulation},
author={Carmelo Sferrazza and Dun-Ming Huang and Xingyu Lin and Youngwoon Lee and Pieter Abbeel},
year={2024},
}