@@ -182,8 +182,7 @@ <h2 class="title is-3 has-text-centered"> How Does UWM Learn from Both Actions a
182182 < div class ="column is-full-width has-text-justified ">
183183 < h3 class ="title is-5 "> </ h3 >
184184 < div class ="content has-text-justified ">
185- < img src ="./static/images/training_policy.png " class ="inline-figure " style ="width:100% "
186- alt ="Learning DiSPOs. " />
185+ < img src ="./static/images/training.png " class ="inline-figure " style ="width:100% " alt ="UWMs " />
187186 </ div >
188187 < div class ="content has-text-justified ">
189188 < p >
@@ -200,7 +199,7 @@ <h3 class="title is-5"></h3>
200199 </ div >
201200 < h3 class ="title is-5 "> Unified Training With & Without Actions</ h3 >
202201 < div class ="content has-text-centered ">
203- < img src ="./static/images/dataset.png " class ="inline-figure " style ="width:100% " alt ="Learning DiSPOs. " />
202+ < img src ="./static/images/droid.png " class ="inline-figure " style ="width:70% " alt ="" />
204203 </ div >
205204 < div class ="content has-text-justified ">
206205 < p >
@@ -224,10 +223,24 @@ <h3 class="title is-5">Unified Training With & Without Actions</h3>
224223 a large and diverse set of robotic trajectories that offers multi-environment coverage
225224 and varied manipulation tasks.
226225 < em > See the paper for more details on dataset composition and preprocessing.</ em >
227-
228226 </ p >
229227 </ div >
230- </ div >
228+ < h3 class ="title is-5 "> Forward Dynamics via Video Diffusion</ h3 >
229+ < div class ="content has-text-justified ">
230+ < p >
231+ One of the core capabilities of UWM is modeling < strong > forward dynamics</ strong > , that is, predicting how the environment changes given the current observation and action.
232+ We achieve this by setting the < em > action diffusion timestep to zero</ em > , effectively conditioning on the true action,
233+ and allowing the < em > video diffusion</ em > process to denoise the future frame.
234+ This enables UWM to forecast plausible next observations, grounded in both action and visual context.
235+ </ p >
236+ </ div >
237+ < div class ="content has-text-centered ">
238+ < img src ="./static/images/forward.png " class ="inline-figure " style ="width:60% " alt ="" />
239+ </ div >
240+ < p >
241+ The figure above shows example forward predictions from UWM: simulation in the top row, the real world in the bottom row.
242+ </ p >
243+
231244 </ div >
232245 </ div >
233246 </ section >
@@ -245,26 +258,26 @@ <h3 class="title is-5 mt-4">Real Robot Experiments</h3>
245258 < p >
246259 We evaluate < strong > Unified World Models (UWM)</ strong > on diverse real-world tasks, each demanding
247260 precise control and robust generalization. Training combines demonstrations with action-free
248- videos, allowing UWM to learn both policy and environment dynamics.
261+ videos from DROID, allowing UWM to learn both policy and environment dynamics.
249262 </ p >
250263 </ div >
251264
252265 <!-- Buttons for Real Robot tasks -->
253266 < div class ="content has-text-centered ">
254267 < button class ="button custom-btn-gradient " onclick ="showExperiment('rice') ">
255- Cook-Rice
268+ Rice-Cooker
256269 </ button >
257270 < button class ="button custom-btn-gradient " onclick ="showExperiment('towel') ">
258271 Hang-Towel
259272 </ button >
260273 < button class ="button custom-btn-gradient " onclick ="showExperiment('paper') ">
261- Refill-Paper-Towels
274+ Paper-Towel
262275 </ button >
263276 < button class ="button custom-btn-gradient " onclick ="showExperiment('bowl') ">
264277 Stack-Bowls
265278 </ button >
266279 < button class ="button custom-btn-gradient " onclick ="showExperiment('block') ">
267- Place-Block
280+ Block-Cabinet
268281 </ button >
269282 </ div >
270283
@@ -359,7 +372,7 @@ <h4 class="title is-5" style="margin-top:2rem;">Towel (Out-of-Distribution)</h4>
359372 </ div >
360373 </ div >
361374 </ div >
362-
375+
363376
364377 <!-- PAPER Container (10 videos: 5 in-dist, 5 ood-dist) -->
365378 < div id ="paper " style ="display: none; text-align:center; margin-top: 1rem; ">
@@ -543,7 +556,7 @@ <h4 class="title is-5" style="margin-top:2rem;">Bowl (Out-of-Distribution)</h4>
543556
544557 <!-- Optional table or figure for Real Robot results -->
545558 < p class ="has-text-centered ">
546- < img src ="./static/images/results.png " style ="width:100%; " alt ="Real Robot Results Table ">
559+ < img src ="./static/images/results.png " style ="width:110%; " alt ="Real Robot Results Table ">
547560 </ p >
548561
549562 <!-- Subsection: Simulation Experiments -->
@@ -561,32 +574,72 @@ <h3 class="title is-5 mt-5">Simulation Experiments</h3>
561574 </ p >
562575 </ div >
563576
564- < div class ="columns is-mobile is-centered ">
565- < div class ="column is-one-fifth " style ="text-align:center; ">
566- < video autoplay muted loop playsinline style ="width:100%; ">
567- < source src ="./static/videos/libero/1.mp4 " type ="video/mp4 ">
568- </ video >
569- </ div >
570- < div class ="column is-one-fifth " style ="text-align:center; ">
571- < video autoplay muted loop playsinline style ="width:100%; ">
572- < source src ="./static/videos/libero/2.mp4 " type ="video/mp4 ">
573- </ video >
574- </ div >
575- < div class ="column is-one-fifth " style ="text-align:center; ">
576- < video autoplay muted loop playsinline style ="width:100%; ">
577- < source src ="./static/videos/libero/3.mp4 " type ="video/mp4 ">
578- </ video >
579- </ div >
580- < div class ="column is-one-fifth " style ="text-align:center; ">
581- < video autoplay muted loop playsinline style ="width:100%; ">
582- < source src ="./static/videos/libero/4.mp4 " type ="video/mp4 ">
583- </ video >
577+ < div class ="container is-max-desktop " style ="margin-top: 2rem; ">
578+ < h4 class ="title is-5 has-text-centered "> LIBERO (Simulation)</ h4 >
579+
580+ <!-- Top row with labels -->
581+ < div class ="columns is-mobile is-centered has-text-centered ">
582+ < div class ="column is-one-fifth ">
583+ < p > < strong > Soup-Cheese</ strong > </ p >
584+ < video autoplay muted loop playsinline class ="video-hover ">
585+ < source src ="./static/videos/libero/1.mp4 " type ="video/mp4 ">
586+ </ video >
587+ </ div >
588+ < div class ="column is-one-fifth ">
589+ < p > < strong > Book-Caddy</ strong > </ p >
590+ < video autoplay muted loop playsinline class ="video-hover ">
591+ < source src ="./static/videos/libero/2.mp4 " type ="video/mp4 ">
592+ </ video >
593+ </ div >
594+ < div class ="column is-one-fifth ">
595+ < p > < strong > Bowl-Drawer</ strong > </ p >
596+ < video autoplay muted loop playsinline class ="video-hover ">
597+ < source src ="./static/videos/libero/3.mp4 " type ="video/mp4 ">
598+ </ video >
599+ </ div >
600+ < div class ="column is-one-fifth ">
601+ < p > < strong > Mug-Mug</ strong > </ p >
602+ < video autoplay muted loop playsinline class ="video-hover ">
603+ < source src ="./static/videos/libero/4.mp4 " type ="video/mp4 ">
604+ </ video >
605+ </ div >
606+ < div class ="column is-one-fifth ">
607+ < p > < strong > Moka-Moka</ strong > </ p >
608+ < video autoplay muted loop playsinline class ="video-hover ">
609+ < source src ="./static/videos/libero/5.mp4 " type ="video/mp4 ">
610+ </ video >
611+ </ div >
584612 </ div >
585- < div class ="column is-one-fifth " style ="text-align:center; ">
586- < video autoplay muted loop playsinline style ="width:100%; ">
587- < source src ="./static/videos/libero/5.mp4 " type ="video/mp4 ">
588- </ video >
613+ < div class ="columns is-mobile is-centered ">
614+ < div class ="column is-one-fifth " style ="text-align:center; ">
615+ < video autoplay muted loop playsinline class ="video-hover ">
616+ < source src ="./static/videos/libero/11.mp4 " type ="video/mp4 ">
617+ </ video >
618+ </ div >
619+ < div class ="column is-one-fifth " style ="text-align:center; ">
620+ < video autoplay muted loop playsinline class ="video-hover ">
621+ < source src ="./static/videos/libero/22.mp4 " type ="video/mp4 ">
622+ </ video >
623+ </ div >
624+ < div class ="column is-one-fifth " style ="text-align:center; ">
625+ < video autoplay muted loop playsinline class ="video-hover ">
626+ < source src ="./static/videos/libero/33.mp4 " type ="video/mp4 ">
627+ </ video >
628+ </ div >
629+ < div class ="column is-one-fifth " style ="text-align:center; ">
630+ < video autoplay muted loop playsinline class ="video-hover ">
631+ < source src ="./static/videos/libero/44.mp4 " type ="video/mp4 ">
632+ </ video >
633+ </ div >
634+ < div class ="column is-one-fifth " style ="text-align:center; ">
635+ < video autoplay muted loop playsinline class ="video-hover ">
636+ < source src ="./static/videos/libero/55.mp4 " type ="video/mp4 ">
637+ </ video >
638+ </ div >
589639 </ div >
640+ < p class ="has-text-centered ">
641+ < img src ="./static/images/libero_table.png " style ="width:90%; " alt ="Sim Results Table ">
642+ </ p >
590643 </ div >
591644
592645 </ section >
@@ -622,98 +675,87 @@ <h2 class="title is-3 has-text-centered">Team</h2>
622675
623676 <!-- Team Member 1: Chuning Zhu -->
624677 < div class ="column is-one-third ">
625- < figure class ="image " style ="margin: 0 auto; ">
626- < a href ="https://homes.cs.washington.edu/~zchuning/ " target ="_blank " rel ="noopener ">
627- < img src ="./static/images/chuning.jpg " alt ="Chuning Zhu " class ="team-photo ">
628- </ a >
629- </ figure >
630- < p class ="mt-2 "> < strong > Chuning Zhu</ strong > </ p >
631- < p class ="is-size-7 "> University of Washington</ p >
678+ < div class ="has-text-centered ">
679+ < figure class ="image is-inline-block ">
680+ < a href ="https://homes.cs.washington.edu/~zchuning/ " target ="_blank " rel ="noopener ">
681+ < img src ="./static/images/chuning.jpg " alt ="Chuning Zhu " class ="team-photo ">
682+ </ a >
683+ </ figure >
684+ < p class ="mt-2 "> < strong > Chuning Zhu</ strong > </ p >
685+ < p class ="is-size-7 "> University of Washington</ p >
686+ </ div >
632687 </ div >
633688
634689 <!-- Team Member 2: Raymond Yu -->
635690 < div class ="column is-one-third ">
636- < figure class ="image " style ="margin: 0 auto; ">
637- < a href ="https://raymondyu5.github.io/ " target ="_blank " rel ="noopener ">
638- < img src ="./static/images/raymond.png " alt ="Raymond Yu " class ="team-photo ">
639- </ a >
640- </ figure >
641- < p class ="mt-2 "> < strong > Raymond Yu</ strong > </ p >
642- < p class ="is-size-7 "> University of Washington</ p >
691+ < div class ="has-text-centered ">
692+ < figure class ="image is-inline-block ">
693+ < a href ="https://raymondyu5.github.io/ " target ="_blank " rel ="noopener ">
694+ < img src ="./static/images/raymond.png " alt ="Raymond Yu " class ="team-photo ">
695+ </ a >
696+ </ figure >
697+ < p class ="mt-2 "> < strong > Raymond Yu</ strong > </ p >
698+ < p class ="is-size-7 "> University of Washington</ p >
699+ </ div >
643700 </ div >
644701
645702 <!-- Team Member 3: Siyuan Feng -->
646703 < div class ="column is-one-third ">
647- < figure class ="image " style ="margin: 0 auto; ">
648- < a href ="https://www.cs.cmu.edu/~sfeng/ " target ="_blank " rel ="noopener ">
649- < img src ="./static/images/sfeng.jpg " alt ="Siyuan Feng " class ="team-photo ">
650- </ a >
651- </ figure >
652- < p class ="mt-2 "> < strong > Siyuan Feng</ strong > </ p >
653- < p class ="is-size-7 "> Toyota Research Institute</ p >
704+ < div class ="has-text-centered ">
705+ < figure class ="image is-inline-block ">
706+ < a href ="https://www.cs.cmu.edu/~sfeng/ " target ="_blank " rel ="noopener ">
707+ < img src ="./static/images/sfeng.jpg " alt ="Siyuan Feng " class ="team-photo ">
708+ </ a >
709+ </ figure >
710+ < p class ="mt-2 "> < strong > Siyuan Feng</ strong > </ p >
711+ < p class ="is-size-7 "> Toyota Research Institute</ p >
712+ </ div >
654713 </ div >
655714
656715 <!-- Team Member 4: Benjamin Burchfiel -->
657716 < div class ="column is-one-third ">
658- < figure class ="image " style ="margin: 0 auto; ">
659- < a href ="https://scholar.google.com/citations?user=eGoTK1YAAAAJ&hl=en " target ="_blank " rel ="noopener ">
660- < img src ="./static/images/benjamin.jpg " alt ="Benjamin Burchfiel " class ="team-photo ">
661- </ a >
662- </ figure >
663- < p class ="mt-2 "> < strong > Benjamin Burchfiel</ strong > </ p >
664- < p class ="is-size-7 "> Toyota Research Institute</ p >
717+ < div class ="has-text-centered ">
718+ < figure class ="image is-inline-block ">
719+ < a href ="https://scholar.google.com/citations?user=eGoTK1YAAAAJ&hl=en " target ="_blank " rel ="noopener ">
720+ < img src ="./static/images/benjamin.jpg " alt ="Benjamin Burchfiel " class ="team-photo ">
721+ </ a >
722+ </ figure >
723+ < p class ="mt-2 "> < strong > Benjamin Burchfiel</ strong > </ p >
724+ < p class ="is-size-7 "> Toyota Research Institute</ p >
725+ </ div >
665726 </ div >
666727
667728 <!-- Team Member 5: Paarth Shah -->
668729 < div class ="column is-one-third ">
669- < figure class ="image " style ="margin: 0 auto; ">
670- < a href ="https://www.paarthshah.me/about " target ="_blank " rel ="noopener ">
671- < img src ="./static/images/paarth.jpeg " alt ="Paarth Shah " class ="team-photo ">
672- </ a >
673- </ figure >
674- < p class ="mt-2 "> < strong > Paarth Shah</ strong > </ p >
675- < p class ="is-size-7 "> Toyota Research Institute</ p >
730+ < div class ="has-text-centered ">
731+ < figure class ="image is-inline-block ">
732+ < a href ="https://www.paarthshah.me/about " target ="_blank " rel ="noopener ">
733+ < img src ="./static/images/paarth.jpeg " alt ="Paarth Shah " class ="team-photo ">
734+ </ a >
735+ </ figure >
736+ < p class ="mt-2 "> < strong > Paarth Shah</ strong > </ p >
737+ < p class ="is-size-7 "> Toyota Research Institute</ p >
738+ </ div >
676739 </ div >
677740
678741 <!-- Team Member 6: Abhishek Gupta -->
679742 < div class ="column is-one-third ">
680- < figure class ="image " style ="margin: 0 auto; ">
681- < a href ="https://homes.cs.washington.edu/~abhgupta/ " target ="_blank " rel ="noopener ">
682- < img src ="./static/images/abhgupta.jpeg " alt ="Abhishek Gupta " class ="team-photo ">
683- </ a >
684- </ figure >
685- < p class ="mt-2 "> < strong > Abhishek Gupta</ strong > </ p >
686- < p class ="is-size-7 "> University of Washington</ p >
743+ < div class ="has-text-centered ">
744+ < figure class ="image is-inline-block ">
745+ < a href ="https://homes.cs.washington.edu/~abhgupta/ " target ="_blank " rel ="noopener ">
746+ < img src ="./static/images/abhgupta.jpeg " alt ="Abhishek Gupta " class ="team-photo ">
747+ </ a >
748+ </ figure >
749+ < p class ="mt-2 "> < strong > Abhishek Gupta</ strong > </ p >
750+ < p class ="is-size-7 "> University of Washington</ p >
751+ </ div >
687752 </ div >
688753
689754 </ div >
690755 </ div >
691756 </ section >
692757
693758
694- < style >
695- .team-photo {
696- width : 128px !important ;
697- height : 128px !important ;
698- border-radius : 50% !important ;
699- object-fit : cover !important ;
700- display : block !important ;
701- }
702-
703- .team-photo {
704- width : 128px ;
705- height : 128px ;
706- border-radius : 50% ;
707- object-fit : cover;
708- /* crops to the center */
709- }
710-
711- .mt-2 {
712- margin-top : 0.5rem ;
713- /* minor top spacing for the name */
714- }
715- </ style >
716-
717759
718760 < section class ="section " id ="BibTeX ">
719761 < div class ="container is-max-desktop content ">
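Note on the "Forward Dynamics via Video Diffusion" paragraph added in this diff: the mechanism it describes (fix the action diffusion timestep at zero so the action is treated as clean, then denoise only the future frame) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the UWM code: `toy_denoiser` is a hypothetical stand-in for the trained network, the dynamics `next = obs + action` are invented, and the deterministic update rule is one simple sampler choice, not necessarily the one the authors use.

```python
import random

NUM_STEPS = 10  # assumed number of video denoising steps


def toy_denoiser(obs, action, noisy_next, t_action, t_video):
    """Hypothetical stand-in for the trained network.

    Returns an estimate of the clean next observation. The toy dynamics
    are next = obs + action. This toy ignores noisy_next and t_video,
    and trusts the action only when t_action == 0 (action marked clean).
    """
    if t_action != 0:
        # A noisy action carries no usable information in this toy model.
        return list(obs)
    return [o + a for o, a in zip(obs, action)]


def predict_next_obs(obs, action, num_steps=NUM_STEPS, seed=0):
    """Forward dynamics: hold the action timestep at 0, denoise the next frame."""
    rng = random.Random(seed)
    # Start the future observation from pure noise (noise level sigma = 1).
    x = [rng.gauss(0.0, 1.0) for _ in obs]
    for step in range(num_steps, 0, -1):
        sigma = step / num_steps
        sigma_prev = (step - 1) / num_steps
        # t_action=0 tells the model that the action is the true, clean action.
        x0_hat = toy_denoiser(obs, action, x, t_action=0, t_video=step)
        # Deterministic update: keep the estimated noise direction, shrink its scale.
        x = [x0 + (sigma_prev / sigma) * (xi - x0) for xi, x0 in zip(x, x0_hat)]
    return x


if __name__ == "__main__":
    print(predict_next_obs([1.0, 2.0], [0.5, -1.0]))
```

In this sketch only the future observation is ever noised and denoised; the action enters purely as conditioning, which is the role the page assigns to a zero action timestep.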