Stage 3: Data Generation with WBCMimicGen

Overview

Data generation for robotic manipulation encompasses multiple approaches, each with distinct advantages and limitations. Motion planning excels at atomic task execution but requires extensive manual implementation for complex sequences. Reinforcement learning addresses highly dynamic tasks with fine-grained control requirements, though reward engineering remains challenging. MimicGen-based methods bridge this gap: they leverage human demonstrations for contact-rich manipulation while reaching the scale needed for effective learning, particularly in long-horizon tasks. However, direct teleoperation in simulation often produces discontinuous trajectories. To address this limitation, we integrate Whole Body Control (WBC) into the MimicGen framework. The resulting method, WBCMimicGen, generates smooth trajectories while enabling coordinated control of multi-arm mobile manipulators. Implementation details are provided below.

Our approach demonstrates significant data amplification: from 3-5 human demonstrations per task, we generate thousands of trajectory variants with diverse object configurations, as exemplified in the kitchen and canteen scenarios.

Data Collection

WBCMimicGen begins with human demonstration collection via a VR-based teleoperation system, as illustrated below:

Representative demonstrations from both task domains:

Task Decomposition and Annotation

WBCMimicGen decomposes trajectories into object-centric subtasks and applies spatial transformations to generate diverse training scenarios. This hierarchical decomposition follows task semantics and requires either manual or automated annotation of trajectory segments to encode subtask boundaries and objectives.
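The object-centric transformation described above can be sketched as follows. This is a minimal illustration, not the WBCMimicGen implementation: it assumes planar SE(2) poses rather than full 3D poses, and the helper names (`se2`, `compose`, `invert`, `retarget`) are hypothetical. The idea is to express a demonstrated end-effector pose in the object's frame and then replay it at the object's new pose.

```python
import math


def se2(x, y, theta):
    """Homogeneous 3x3 matrix for a planar pose (x, y, heading theta)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, x], [s, c, y], [0.0, 0.0, 1.0]]


def compose(a, b):
    """Matrix product a @ b for 3x3 nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def invert(t):
    """Inverse of a rigid planar transform: rotation R^T, translation -R^T p."""
    c, s, x, y = t[0][0], t[1][0], t[0][2], t[1][2]
    return [[c, s, -(c * x + s * y)],
            [-s, c, s * x - c * y],
            [0.0, 0.0, 1.0]]


def retarget(eef_src, obj_src, obj_new):
    """Hypothetical helper: T_eef_new = T_obj_new * inv(T_obj_src) * T_eef_src."""
    eef_in_obj = compose(invert(obj_src), eef_src)  # pose relative to the object
    return compose(obj_new, eef_in_obj)             # replay at the new object pose


# Demonstration: the gripper sits 0.2 m to the side of the object.
obj_src = se2(1.0, 0.0, 0.0)
eef_src = se2(1.0, 0.2, 0.0)
# New scene: the object is moved and rotated by 90 degrees.
obj_new = se2(2.0, 1.0, math.pi / 2)
eef_new = retarget(eef_src, obj_src, obj_new)
print(round(eef_new[0][2], 3), round(eef_new[1][2], 3))
```

Because the pose is stored relative to the object, the same 0.2 m offset is preserved after the object is moved and rotated, which is what lets a single demonstrated segment be reused across many object configurations.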

We illustrate the decomposition strategy through two representative examples:

Kitchen task example

Kitchen task decomposition:

Stage 1: grasp bowl

Start: Robot initial position
Goal: Bowl stably grasped

Stage 2: place in microwave

Start: Bowl grasped
Goal: Bowl placed inside microwave

Stage 3: close microwave door

Start: Bowl inside microwave
Goal: Microwave door closed

Grasp bowl

Place in microwave

Close microwave door
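The kitchen decomposition above could be encoded as an ordered list of annotated subtasks and used to cut a recorded demonstration into segments. The sketch below is an assumption about one possible representation: the `Subtask` fields, the `split_trajectory` helper, the object names, and the boundary indices are all hypothetical and do not come from the annotation script.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subtask:
    name: str        # stage label from the decomposition
    object_ref: str  # object the segment is expressed relative to (hypothetical)
    end_step: int    # exclusive end index of the stage in the demo (hypothetical)


def split_trajectory(trajectory, subtasks):
    """Cut a demonstration into consecutive segments, one per annotated subtask."""
    segments = {}
    start = 0
    for st in subtasks:
        if not start < st.end_step <= len(trajectory):
            raise ValueError(f"invalid boundary for subtask {st.name!r}")
        segments[st.name] = trajectory[start:st.end_step]
        start = st.end_step
    if start != len(trajectory):
        raise ValueError("subtask boundaries do not cover the whole trajectory")
    return segments


# Illustrative boundaries only; real values come from annotating a demonstration.
KITCHEN = [
    Subtask("grasp bowl", "bowl", 40),
    Subtask("place in microwave", "microwave", 90),
    Subtask("close microwave door", "microwave", 120),
]

demo = list(range(120))  # stand-in for 120 recorded timesteps
segments = split_trajectory(demo, KITCHEN)
print({name: len(seg) for name, seg in segments.items()})
```

Keeping the boundaries as exclusive end indices makes each stage's goal state (the last step of its segment) coincide with the start state of the next stage, mirroring the Start/Goal pairs listed above.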

Canteen task example

Canteen task decomposition:

Stage 1: grasp fork

Start: Robot initial position
Goal: Fork stably grasped

Stage 2: place fork

Start: Fork stably grasped
Goal: Fork placed in the blanket

Stage 3: grasp plate

Start: Fork placed in the blanket
Goal: Plate stably grasped

Stage 4: drop plate

Start: Plate stably grasped
Goal: Drop above the left blanket

Stage 5: place plate

Start: Drop above the left blanket
Goal: Plate placed in the blanket

Grasp fork

Place fork

Grasp plate

Drop plate

Place plate

Large-scale Data Generation

Following task decomposition and annotation, WBCMimicGen generates large-scale training data with systematic variations:

Details of WBCMimicGen

WBCMimicGen extends the MimicGen framework through whole-body control optimization:

Unlike standard MimicGen, which linearly interpolates end-effector poses, WBCMimicGen formulates trajectory generation as a quadratic programming (QP) problem that optimizes joint velocities while respecting manipulability constraints, joint limits, and velocity bounds. This approach yields smoother, dynamically feasible demonstrations. Formally, the trajectory generation problem seeks a decision vector $x$ of joint velocities that tracks the desired end-effector velocities, formulated as [1]:
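For reference, the QP in [1] has the general form sketched below. The symbols $Q$, $c$, $\mathcal{J}$, $\nu$, $A$, and $b$ follow that reference and are assumptions here; the exact cost and constraint terms used by WBCMimicGen may differ. In [1], $Q$ weights joint-velocity effort and slack usage, $c$ carries the manipulability-maximizing term, $\mathcal{J}$ augments the end-effector Jacobian with identity columns for the slack variables so that the equality reads "Jacobian times joint velocity plus slack equals the desired end-effector velocity $\nu$", and $A$, $b$ implement velocity dampers that keep joints away from their limits:

$$
\begin{aligned}
\min_{x} \quad & \tfrac{1}{2}\, x^\top Q\, x + c^\top x \\
\text{subject to} \quad & \mathcal{J} x = \nu, \\
& A x \le b, \\
& \mathcal{X}^{-} \le x \le \mathcal{X}^{+},
\end{aligned}
$$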

where $x = (a_\text{base}, \dot{q}_\text{active}, \delta_1, \delta_2, \cdots, \delta_i)$, and $\mathcal{X}^{-}$ and $\mathcal{X}^{+}$ are the lower and upper limits on $x$; $a_\text{base}$ contains the velocities of the robot base; $\dot{q}_\text{active}$ is the velocity of the joints that drive the end-effectors in the QP (the so-called active joints); and $\delta_i$ are slack variables that help keep the QP solvable. Without loss of generality, suppose there are $k$ end-effectors and $n$ joints; these variables can then be expressed as:

Comparative analysis demonstrates WBCMimicGen's superior trajectory smoothness (right) versus standard MimicGen (left):

Kitchen task's data generated by MimicGen

Kitchen task's data generated by WBCMimicGen

Canteen task's data generated by MimicGen

Canteen task's data generated by WBCMimicGen

Data Generation of Mobile Manipulator

A key innovation of WBCMimicGen is its extension to mobile manipulation. The following demonstration shows coordinated base-arm control on the ARX7 platform executing a multi-stage task: grasping toothpaste, placing it in a cup, and repositioning the cup:

Camera on the robot's head.

Camera in the environment.

Code Reference

Code related to data generation:

Teleoperation System:

Quest VR control: https://github.com/ByteDance-Seed/manip-as-in-sim-suite/blob/main/wbcmimic/scripts/basic/record_demos_ur5_quest.py

Data Annotation Scripts:

https://github.com/ByteDance-Seed/manip-as-in-sim-suite/blob/main/wbcmimic/scripts/mimicgen/annotate_demos.py

Data Generation Scripts:

Generation configuration: https://github.com/ByteDance-Seed/manip-as-in-sim-suite/blob/main/wbcmimic/config/mimicgen/generate_data.yaml
Generation script: https://github.com/ByteDance-Seed/manip-as-in-sim-suite/blob/main/wbcmimic/scripts/mimicgen/run_hydra.py

Next Steps

The generated dataset enables subsequent imitation learning for policy training and real-world deployment, detailed in Stage 4.

References

[1] Jesse Haviland, Niko Sünderhauf, and Peter Corke. A holistic approach to reactive mobile manipulation. IEEE Robotics and Automation Letters, 7(2):3122–3129, 2022.