Stage 2: Camera Calibration

Overview

ManipAsInSim Pipeline Overview

Camera calibration establishes geometric correspondence between simulated and physical visual observations—a prerequisite for vision-based policy transfer. Our approach employs differentiable rendering within an optimization framework to automatically determine camera parameters that minimize simulation-to-reality discrepancies. The methodology encompasses multi-view data acquisition across diverse robot configurations, segmentation via Segment Anything Model (SAM), and gradient-based optimization of camera extrinsics to achieve pixel-level alignment between rendered and observed masks.

Calibration Workflow

Step 1: Data Collection

Calibration data acquisition protocol:

  1. Trajectory Design

    • Define joint configurations spanning the operational workspace
    • Ensure mechanical equilibrium before data capture
    • Log kinematic state with sub-degree precision
  2. Synchronized Acquisition

    Per configuration:

    • Command target joint state
    • Allow settling time (≥0.5s)
    • Capture synchronized data:
      • Visual observations
      • Joint encoders
      • Temporal stamps
  3. Quality Assurance

    • Validate image sharpness and absence of motion artifacts
    • Verify encoder precision
    • Confirm temporal alignment
RGB Frame 0 - Initial Position
RGB Frame 1 - Movement Phase
RGB Frame 3 - Intermediate Position
RGB Frame 4 - Final Position

Data Collection - RGB Camera Calibration Frames

Step 2: Mask Generation

Automated segmentation via SAM:

RGB Frame 0 - Initial Position
RGB Frame 1 - Movement Phase
RGB Frame 3 - Intermediate Position
RGB Frame 4 - Final Position

Data Collection - Masks of real images

Step 3: Optimization Solving

Parameter optimization procedure:

  1. Initialization

    • Extrinsic position: Coarse measurements
    • Orientation: Workspace-centered
    • Intrinsics: Manufacturer specifications or pre-calibrated
  2. Gradient-Based Optimization

    1. Update simulation camera parameters
    2. Render masks across all configurations
    3. Compute L1 norm between rendered and observed masks
    4. Backpropagate gradients through differentiable renderer
    5. Update parameters via gradient descent
    6. Iterate to convergence
  3. Termination Conditions

    • Loss plateau (Δ < ε)
    • Parameter convergence (||∇θ|| → 0)
    • Iteration limit
RGB Frame 0 - Initial Position
RGB Frame 1 - Movement Phase
RGB Frame 3 - Intermediate Position
RGB Frame 4 - Final Position

Data Collection - Error of masks after optimization

Advanced Techniques

Improving Calibration Accuracy

  • Dense Sampling: Increase configuration density in task-critical regions
  • Hierarchical Refinement: Initialize fine-grained optimization with coarse calibration results

Code Reference

Implementation references:

Data Collection Tools:

Calibration Optimization Script:

Calibration Data Management

Optimal calibration requires configuration sets that maximize pose diversity while maintaining visibility across the camera's field of view. Joint configurations should exhibit maximal variance to ensure robust parameter estimation.

Store selected configurations in tools/calibration/calibration_data/{setup_name}/joints.txt. Example structure for ur5_d435:

ManiUniCon/
  ├── tools/
  │   └── calibration/
  │       └── ur5_d435/
  │           └── joints.txt

Joint configuration format (radians):

[-3.124601, -1.630071, -1.790162, -1.297886, 1.560076, -0.423495]
[-3.258589, -2.106229, -1.353434, 0.375175, 1.814344, -0.502429]
[-2.698257, -2.036311, -1.348458, 0.769870, 1.861926, -0.439289]
[-2.950685, -2.074254, -1.298800, -1.055511, 0.241022, -0.498035]
[-2.992676, -1.901685, -0.997740, -0.726964, 1.578677, -0.399332]
[-2.629278, -1.849604, -1.016632, -1.474060, 2.517711, -0.396734]
[-2.941660, -1.930229, -1.653374, -0.604498, 0.486982, -0.462255]
[-2.450230, -2.012307, -1.765171, 0.477528, 0.984876, 0.714768]
[-2.872846, -2.243410, -1.031999, 1.257755, 1.437975, 0.839455]

Post-acquisition data structure:

ManiUniCon/
├── tools/
│   └── calibration/
│       └── ur5_d435/
│           └── joints.txt
│           └── multi_frame_calibration_data.npz  # key: "rgb_images", "captured_joint_values", "target_joint_values"

Optimized parameters are output to terminal upon convergence.

Next Steps

Calibrated camera parameters enable accurate visual correspondence for Stage 3: leveraging the aligned simulation environment for large-scale data generation via teleoperation and augmentation strategies.

results matching ""

    No results matching ""