Stage 3: Data Generation with WBCMimicGen

Overview

ManipAsInSim Pipeline Overview

Data generation for robotic manipulation encompasses multiple approaches, each with distinct advantages and limitations. Motion planning excels at atomic task execution but requires extensive manual implementation for complex sequences. Reinforcement learning addresses highly dynamic tasks with fine-grained control requirements, though reward engineering remains challenging. MimicGen-based methods bridge this gap by leveraging human demonstrations for contact-rich manipulation while achieving the requisite scale for effective learning, particularly in long-horizon tasks. However, direct teleoperation in simulation often produces discontinuous trajectories. To address this limitation, we integrate Whole Body Control (WBC) into the MimicGen framework, yielding WBCMimicGen—a method that generates smooth trajectories while enabling coordinated control of multi-arm mobile manipulators. Implementation details are provided below.

Our approach demonstrates significant data amplification: from 3-5 human demonstrations per task, we generate thousands of trajectory variants with diverse object configurations, as exemplified in the kitchen and canteen scenarios.

Data Collection

WBCMimicGen begins with human demonstration collection via a VR-based teleoperation system, as illustrated below:

Representative demonstrations from both task domains:

Task Decomposition and Annotation

WBCMimicGen decomposes trajectories into object-centric subtasks and applies spatial transformations to generate diverse training scenarios. This hierarchical decomposition follows task semantics and requires either manual or automated annotation of trajectory segments to encode subtask boundaries and objectives.

We illustrate the decomposition strategy through two representative examples:

Kitchen task example

Kitchen task decomposition:

Stage 1: grasp bowl

  • Start: Robot initial position
  • Goal: Bowl stably grasped

Stage 2: place in microwave

  • Start: Bowl grasped
  • Goal: Bowl placed inside microwave

Stage 3: close microwave door

  • Start: Bowl inside microwave
  • Goal: Microwave door closed

Canteen task example

Canteen task decomposition:

Stage 1: grasp fork

  • Start: Robot initial position
  • Goal: Fork stably grasped

Stage 2: place fork

  • Start: Fork stably grasped
  • Goal: Fork placed in the blanket

Stage 3: grasp plate

  • Start: Fork placed in the blanket
  • Goal: Plate stably grasped

Stage 4: Drop plate

  • Start: Plate stably grasped
  • Goal: Drop above the left blanket

Stage 5: Place plate

  • Start: Drop above the left blanket
  • Goal: Plate placed in the blanket

Large-scale Data Generation

Following task decomposition and annotation, WBCMimicGen generates large-scale training data with systematic variations:

Detail of WBCMimicGen

WBCMimicGen extends the MimicGen framework through whole-body control optimization:

Unlike standard MimicGen's linear interpolation of end-effector poses, WBCMimicGen formulates trajectory generation as a quadratic programming (QP) problem that optimizes joint velocities while respecting manipulability constraints, joint limits, and velocity bounds. This approach yields smoother, dynamically feasible demonstrations. Formally, the trajectory generation problem seeks joint velocities $\boldsymbol x$ that track desired end-effector velocities, formulated as[1]:

WBC optimization problem formulation
WBC optimization problem formulation

where $x = (a_\text{base}, \dot{q}_\text{active}, \delta_1, \delta_2, \cdots, \delta_i)$ and $\mathcal{X}^{+, -}$ is the limits; $a_\text{base}$ is the velocities of the robot base; $\dot{q}_\text{active}$ is the velocity of the joints related to the end-effectors in the QP (so called the active joints); $\delta_i$ are slack variables that can help construct a solvable QP. Without loss of generality, suppose there are $k$ end-effectors and $n$ joints, these variables can be expressed as:

WBC optimization problem formulation

Comparative analysis demonstrates WBCMimicGen's superior trajectory smoothness (right) versus standard MimicGen (left):

Kitchen task's data generated by MimicGen

Kitchen task's data generated by WBCMimicGen

Canteen task's data generated by MimicGen

Canteen task's data generated by WBCMimicGen

Data Generation of Mobile Manipulator

A key innovation of WBCMimicGen is its extension to mobile manipulation. The following demonstration shows coordinated base-arm control on the ARX7 platform executing a multi-stage task: grasping toothpaste, placing it in a cup, and repositioning the cup:

Camera on the robot's head.

Camera in the environment.

Code Reference

Data generation related code:

Teleoperation System:

Data Annotation Scripts:

Data Generation Scripts:

Next Steps

The generated dataset enables subsequent imitation learning for policy training and real-world deployment, detailed in Stage 4.

References

[1] Jesse Haviland, Niko Sünderhauf, and Peter Corke. A holistic approach to reactive mobile manipulation. IEEE Robotics and Automation Letters, 7(2):3122–3129, 2022.

results matching ""

    No results matching ""