Capture site
A real task pack.
Use captured real-site tasks to see what works.

Policy score
Winner
100 or 500 episodes.
Know what to test next.
Compare 1-3 policies on one captured task pack before using robot time.
What broke?
What changed?
Where to try?
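As a rough illustration of the comparison described above, the sketch below scores each policy by its success rate on the same set of episodes and picks a winner. It is only a sketch under stated assumptions: the policy names, the pass/fail outcome format, and the `score_policies` helper are hypothetical and are not part of any real product API, and the episode outcomes are made-up placeholder data, not measured results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyScore:
    name: str
    successes: int
    episodes: int

    @property
    def success_rate(self) -> float:
        return self.successes / self.episodes


def score_policies(results: dict[str, list[bool]]) -> list[PolicyScore]:
    """Rank 1-3 policies by success rate on the same task pack, best first.

    `results` maps a policy name to one pass/fail outcome per episode.
    This input format is an assumption made for illustration.
    """
    if not 1 <= len(results) <= 3:
        raise ValueError("expected 1-3 policies")
    lengths = {len(outcomes) for outcomes in results.values()}
    if len(lengths) != 1 or 0 in lengths:
        raise ValueError("every policy needs the same non-zero episode count")
    scores = [
        PolicyScore(name, sum(outcomes), len(outcomes))
        for name, outcomes in results.items()
    ]
    # sorted() is stable, so tied policies keep their input order.
    return sorted(scores, key=lambda s: s.success_rate, reverse=True)


# Hypothetical placeholder outcomes for a 100-episode run (not real data).
results = {
    "policy_a": [True] * 62 + [False] * 38,
    "policy_b": [True] * 71 + [False] * 29,
}

ranking = score_policies(results)
winner = ranking[0]
for score in ranking:
    print(f"{score.name}: {score.successes}/{score.episodes} episodes passed")
print(f"Winner: {winner.name}")
```

A ranking like this only says which policy did better on the captured task pack. A small gap between two policies at 100 episodes may not be meaningful, which is one reason a larger 500-episode run can be useful before choosing what to test on a robot.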
Generated clips help review results. They are not real-world proof.
Boundary: virtual results guide what to test next. They do not approve deployment, certify safety, or guarantee real-world success.