A label-efficient framework for 3D detection using expert predictions
The expert agent shares predicted bounding boxes — no raw data or model weights required.
Can we reuse existing expert perception sources — like robotaxis — to train ego vehicles for label-efficient learning?
We study a new scenario in which an ego vehicle learns from the 3D bounding box predictions of a nearby expert agent, without ever accessing the expert's raw sensor data or model weights. This setup is label-efficient, sensor-agnostic, and communication-light. But naively treating the received predictions as ground-truth labels yields poor performance (22.0 AP), revealing two fundamental challenges: the shared boxes are often misaligned from the ego's viewpoint, and objects the expert fails to detect are never labeled at all.
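Concretely, all that crosses the wire is a list of boxes plus a relative pose. Below is a minimal sketch of the receiving side, assuming 7-DoF boxes `(x, y, z, l, w, h, yaw)` and an approximately planar relative pose; the box layout and function name are my assumptions, not the paper's interface.

```python
import numpy as np

def transform_boxes_to_ego(boxes_expert: np.ndarray,
                           T_ego_from_expert: np.ndarray) -> np.ndarray:
    """Map (N, 7) boxes [x, y, z, l, w, h, yaw] from the expert frame to the
    ego frame. T_ego_from_expert is a (4, 4) homogeneous transform, e.g. built
    from the GPS/IMU poses both vehicles already broadcast."""
    boxes_ego = boxes_expert.copy()
    # Rigidly move the box centers.
    centers = np.hstack([boxes_expert[:, :3], np.ones((len(boxes_expert), 1))])
    boxes_ego[:, :3] = (centers @ T_ego_from_expert.T)[:, :3]
    # Sizes are frame-invariant; only the heading rotates (planar-pose assumption).
    d_yaw = np.arctan2(T_ego_from_expert[1, 0], T_ego_from_expert[0, 0])
    boxes_ego[:, 6] = boxes_expert[:, 6] + d_yaw
    return boxes_ego
```

At 7 floats per box, a frame with a few dozen objects fits in well under a kilobyte, which is what makes the setup communication-light.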
R&B-POP: Refining & Discovering Boxes for 3D Perception from Others' Predictions. Two complementary components address these challenges (sketched below):
- Refining: a learned ranking model scores candidate corrections of each received box against the ego's own point cloud and keeps the best one, fixing the misalignment.
- Discovering: distance-based curriculum self-training lets the ego detector find objects the expert missed, expanding its trusted range from near to far.
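A minimal sketch of how the two components could compose, assuming a `ranker(points, box)` quality scorer and a detector exposing `predict`/`fit`; these names are hypothetical stand-ins, not the paper's actual interfaces.

```python
import numpy as np

def perturb_box(box, n=32, pos_sigma=0.5, yaw_sigma=0.1, rng=np.random):
    """Generate local candidates by jittering center and yaw (sizes kept)."""
    cands = np.repeat(box[None, :], n, axis=0)
    cands[1:, :3] += rng.normal(0.0, pos_sigma, size=(n - 1, 3))
    cands[1:, 6] += rng.normal(0.0, yaw_sigma, size=n - 1)
    return cands

def refine_with_ranker(boxes, points, ranker, keep_thresh=0.5):
    """Component 1 (Refining): for each received box, score local candidates
    against the ego point cloud and keep the best one if it clears a
    quality threshold."""
    refined = []
    for box in boxes:
        candidates = perturb_box(box)
        scores = np.array([ranker(points, c) for c in candidates])
        if scores.max() > keep_thresh:
            refined.append(candidates[scores.argmax()])
    return np.array(refined)

def distance_curriculum(detector, frames, max_ranges=(30.0, 50.0, 80.0)):
    """Component 2 (Discovering): self-train near-to-far. Each round trusts
    the detector's own predictions only up to the currently unlocked
    distance, retrains on them, then widens the range."""
    for r_max in max_ranges:
        pseudo_labels = []
        for frame in frames:
            preds = detector.predict(frame)              # (N, 7) boxes
            dist = np.linalg.norm(preds[:, :2], axis=1)  # BEV distance to ego
            pseudo_labels.append(preds[dist < r_max])
        detector.fit(frames, pseudo_labels)              # assumed trainer API
    return detector
```

The near-to-far schedule mirrors the table below: nearby pseudo-labels are the most reliable, so each round seeds the next with cleaner supervision before pushing farther out.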
The pipeline is sensor- and detector-agnostic: it works with 8-, 16-, and 32-beam LiDARs and with either PointPillars or SECOND as the detector, and it generalizes to sim-to-real domain adaptation.
R&B-POP closes nearly all of the gap to the supervised upper bound (56.5 vs. 58.4 AP@IoU 0.5) while using no human labels to train the detector; only 40 labeled frames are needed to train the ranker.
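With so few labeled frames available, a natural (assumed) way to train the ranker is to regress each candidate box's 3D IoU with its matched ground-truth box from the points it encloses; the architecture and loss below are illustrative choices, not the paper's exact design.

```python
import torch
import torch.nn as nn

class BoxRanker(nn.Module):
    """Scores a candidate box from the points cropped inside it (PointNet-style)."""
    def __init__(self, in_dim=3, hidden=128):
        super().__init__()
        self.point_mlp = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                       nn.Linear(hidden, hidden), nn.ReLU())
        self.head = nn.Linear(hidden, 1)

    def forward(self, pts):                              # pts: (B, N, 3)
        feat = self.point_mlp(pts).max(dim=1).values     # per-box pooled feature
        return self.head(feat).squeeze(-1)               # (B,) quality scores

def ranker_loss(scores, ious):
    """Regress IoU-with-GT as the quality target; a pairwise margin ranking
    loss over candidate pairs would fit the 'ranker' role equally well."""
    return nn.functional.smooth_l1_loss(torch.sigmoid(scores), ious)
```

Because the target is a scalar box-quality score rather than a full annotation, 40 labeled frames can supply thousands of (candidate, IoU) training pairs.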
| Label source | Refinement | Self-training | 0–30m | 30–50m | 50–80m | 0–80m |
|---|---|---|---|---|---|---|
| R's pred | — | — | 34.7 | 13.5 | 8.6 | 22.0 |
| R's GT | — | — | 29.7 | 14.1 | 7.3 | 19.6 |
| R's pred | heuristic | — | 53.2 | 22.0 | 16.9 | 37.8 |
| R's pred | ranker | — | 50.3 | 24.7 | 18.2 | 38.0 |
| R's pred | — | naive ST | 45.9 | 18.7 | 16.5 | 32.4 |
| R's pred | heuristic | naive ST | 50.4 | 19.6 | 15.4 | 35.4 |
| R's pred | ranker | naive ST | 60.6 | 29.7 | 19.2 | 45.0 |
| R's pred | — | dist. curriculum | 57.3 | 29.6 | 21.0 | 42.5 |
| R's pred | heuristic | dist. curriculum | 60.5 | 25.5 | 17.0 | 43.2 |
| R's pred | ranker | dist. curriculum | 73.3 | 43.3 | 23.3 | 56.5 |
| E's GT ⋆ | — | — | 75.2 | 45.9 | 28.8 | 58.4 |
V2V4Real, PointPillars detector, 32-beam LiDAR; all numbers are AP@IoU 0.5 over the given distance ranges (in meters). R: the nearby expert agent (its labels transformed into the ego frame); E: the ego vehicle. ⋆ Supervised upper bound trained on the ego's own ground truth.
```bibtex
@inproceedings{yoo2025rnbpop,
  title={Learning 3D Perception from Others' Predictions},
  author={Yoo, Jinsu and Feng, Zhenyang and Pan, Tai-Yu and Sun, Yihong and
          Phoo, Cheng Perng and Chen, Xiangyu and Campbell, Mark and
          Weinberger, Kilian Q. and Hariharan, Bharath and Chao, Wei-Lun},
  booktitle={International Conference on Learning Representations (ICLR)},
  year={2025}
}
```