HumanoidBench: Simulated Humanoid Benchmark for Whole-Body Locomotion and Manipulation

1University of California, Berkeley, 2Yonsei University


Humanoid robots hold great promise in assisting humans across diverse environments and tasks, thanks to the flexibility and adaptability afforded by their human-like morphology. However, research on humanoid robots is often bottlenecked by costly and fragile hardware setups. To accelerate algorithmic research on humanoid robots, we present a high-dimensional, simulated robot learning benchmark, HumanoidBench, featuring a humanoid robot equipped with dexterous hands and a variety of challenging whole-body manipulation and locomotion tasks. Our findings reveal that state-of-the-art reinforcement learning algorithms struggle with most tasks, whereas a hierarchical learning baseline achieves superior performance when supported by robust low-level policies, such as walking or reaching. With HumanoidBench, we provide the robotics community with a platform to identify the challenges that arise when solving diverse tasks with humanoid robots, facilitating prompt verification of algorithms and ideas.


HumanoidBench is a first-of-its-kind simulated humanoid robot benchmark comprising 27 distinct whole-body control tasks, each presenting unique challenges, such as intricate long-horizon control and sophisticated coordination.



"Static" Manipulation


"Dynamic" Manipulation


Simulation Environment

The simulation environment of HumanoidBench uses the MuJoCo physics engine and offers a choice of robot models (e.g., Unitree H1, Agility Robotics Digit) and end effectors (e.g., Shadow Hand, Robotiq 2F-85 gripper). In our experiments, we opt for the Unitree H1 humanoid robot with two dexterous Shadow Hands attached to its arms.
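As a rough sketch of how such a simulated environment is typically driven, the loop below uses a minimal stand-in environment class: the class name, observation/action sizes, and reward are placeholders for illustration, not the benchmark's actual API or dimensions.

```python
import numpy as np

class StubHumanoidEnv:
    """Hypothetical stand-in for a MuJoCo-backed humanoid environment.

    The real benchmark exposes a similar reset/step interface on top of
    the physics engine; the sizes below are placeholders, not the true
    observation/action dimensions.
    """
    OBS_DIM = 64   # placeholder observation size
    ACT_DIM = 32   # placeholder action (actuator) size

    def __init__(self, seed=0):
        self.rng = np.random.default_rng(seed)
        self.t = 0

    def reset(self):
        self.t = 0
        return self.rng.standard_normal(self.OBS_DIM)

    def step(self, action):
        assert action.shape == (self.ACT_DIM,)
        self.t += 1
        obs = self.rng.standard_normal(self.OBS_DIM)
        reward = -float(np.sum(action ** 2))  # toy actuation cost
        done = self.t >= 100                  # fixed-length toy episode
        return obs, reward, done

env = StubHumanoidEnv()
obs = env.reset()
done = False
episode_return = 0.0
while not done:
    action = np.zeros(env.ACT_DIM)  # a learned policy's output goes here
    obs, reward, done = env.step(action)
    episode_return += reward
print(env.t, episode_return)  # → 100 0.0 (zero actions incur zero cost)
```

The same reset/step pattern applies regardless of which robot model or end effector is selected; only the observation and action dimensions change.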

Observation Space

Our simulated environment supports the following observations:

Hierarchical Reinforcement Learning

We benchmark a variety of state-of-the-art reinforcement learning algorithms on all tasks. Our results show that these end-to-end (flat) algorithms struggle to control the complex humanoid robot dynamics and to solve the most challenging tasks. In fact, many such tasks require long-horizon planning and necessitate acquiring a diverse set of skills (e.g., balancing, walking, reaching) to successfully achieve the desired objective.

We argue that these issues can be mitigated by introducing additional structure into the learning problem. In particular, we explore a hierarchical learning paradigm, where one or more low-level skill policies are provided to a high-level planning policy that sends setpoints to the low-level policies.
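The hierarchical setup described above can be sketched in a few lines: a high-level policy emits a setpoint (here, a 3D hand target), and a pretrained low-level skill tracks it. Both policies below are toy stand-ins (a trivial planner and a proportional controller), not the benchmark's trained networks.

```python
import numpy as np

def high_level_policy(object_pos):
    """Toy planner: the setpoint is simply the object's position."""
    return object_pos

def low_level_reach(hand_pos, setpoint, gain=0.2):
    """Toy reaching skill: take a proportional step toward the setpoint.

    In the actual hierarchy this would be a learned policy that outputs
    joint-level actions to move the hand toward the commanded 3D point.
    """
    return hand_pos + gain * (setpoint - hand_pos)

hand = np.zeros(3)                     # initial hand position
obj = np.array([0.4, 0.1, 0.9])        # object to reach
for _ in range(50):                    # high level replans every step here
    setpoint = high_level_policy(obj)
    hand = low_level_reach(hand, setpoint)

print(np.linalg.norm(hand - obj) < 1e-3)  # → True: hand converged to target
```

The key design point is the division of labor: the high level reasons over task-relevant setpoints at a coarse timescale, while the low level handles the high-dimensional motor control needed to realize them.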

As an example, in the push task we use a one-hand reaching policy (trained with massively parallelized PPO using MuJoCo MJX) as a low-level skill, which allows the robot to reach a 3D point in space with its left hand.




Low-level reaching policy (left hand)

Low-level reaching policy (two hands)

Vast Opportunities for Future Research!

With HumanoidBench, we set a high bar with complex everyday tasks, in the hope of stimulating the community to accelerate the development of whole-body control algorithms for humanoid robots with high-dimensional observation and action spaces.

Many of the tasks are still unsolved — the videos below show a non-exhaustive collection of failure modes.


    title={HumanoidBench: Simulated Humanoid Benchmark for Whole-Body Locomotion and Manipulation},
    author={Carmelo Sferrazza and Dun-Ming Huang and Xingyu Lin and Youngwoon Lee and Pieter Abbeel},