Geometry Forcing (GF) Overview. (a) Our proposed GF paradigm enhances video diffusion models by aligning with geometric features from VGGT. (b) Compared to DFoT, our method generates more temporally ...
Abstract: We present a self-supervised learning algorithm for 3D human pose estimation of a single person based on a multiple-view camera system and 2D body pose estimates for each view. To train our ...
Abstract: Open-Vocabulary 3D object affordance grounding aims to anticipate "action possibility" regions on 3D objects from arbitrary instructions, which is crucial for robots to generically ...