Zhengbo Zhang

Ph.D. Student, Singapore University of Technology and Design

My research aims to understand and reason about the physical world as represented in video, and to controllably manipulate its visual representations. I focus on scene structure, object motion, and long-horizon event understanding across space and time, leveraging diffusion models and their vision-language variants.

Selected Publications

† indicates corresponding author

DART: Difficulty-Adaptive Routing for Zero-Shot Video Temporal Grounding
Zhengbo Zhang, Mark He Huang, Zhigang Tu, Ming-Hsuan Yang
European Conference on Computer Vision (ECCV), 2026

Unleashing the Power of Text-to-Image Diffusion Models for Category-Agnostic Pose Estimation
Duo Peng, Zhengbo Zhang, Ping Hu, Qiuhong Ke, De Wen Soh, Mohammed Bennamoun, Jun Liu
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2026

InstaInpaint: Instant 3D-Scene Inpainting with Masked Large Reconstruction Model
Junqi You, Chieh Hubert Lin, Weijie Lyu, Zhengbo Zhang, Ming-Hsuan Yang
Advances in Neural Information Processing Systems (NeurIPS), 2025

Performing Defocus Deblurring by Modeling its Formation Process
Zhengbo Zhang, Lin Geng Foo, Hossein Rahmani, Jun Liu, De Wen Soh
IEEE International Conference on Computer Vision (ICCV), 2025

Visual Prompting for One-shot Controllable Video Editing without Inversion
Zhengbo Zhang, Yuxi Zhou, Duo Peng, Joo Hwee Lim, Zhigang Tu, De Wen Soh, Lin Geng Foo
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2025

Informative Sample Selection Model for Skeleton-based Action Recognition with Limited Training Samples
Zhigang Tu, Zhengbo Zhang†, Jia Gong, Junsong Yuan, Bo Du
IEEE Transactions on Image Processing (TIP), 2025

FADE: A Dataset for Detecting Falling Objects around Buildings in Video
Zhigang Tu, Zhengbo Zhang†, Zitao Gao, Chunluan Zhou, Junsong Yuan, Bo Du
IEEE Transactions on Information Forensics and Security (TIFS), 2025

Harnessing Text-to-Image Diffusion Models for Category-Agnostic Pose Estimation
Duo Peng, Zhengbo Zhang, Ping Hu, Qiuhong Ke, David Yau, Jun Liu
European Conference on Computer Vision (ECCV), 2024 (Oral)

Diff-Tracker: Text-to-Image Diffusion Models are Unsupervised Trackers
Zhengbo Zhang, Li Xu, Duo Peng, Hossein Rahmani, Jun Liu
European Conference on Computer Vision (ECCV), 2024

Distilling Inter-Class Distance for Semantic Segmentation
Zhengbo Zhang, Chunluan Zhou, Zhigang Tu
International Joint Conference on Artificial Intelligence (IJCAI), 2022 (Oral)