Chao Huang
Publications
Full list of papers.
Bold = first / co-first author.
2026
Video-R4: Reinforcing Text-Rich Video Reasoning with Visual Rumination
Yunlong Tang, Daiki Shimada, Hang Hua, Chao Huang, Jing Bi, Rogerio Feris, Chenliang Xu
CVPR Findings, 2026
Code
Asynchronous Temporal Modeling with Two-Agent Framework for Streaming Dense Video Captioning
Yunlong Tang, Chao Huang, Susan Liang, Jing Bi, Yicheng Wang, Daiki Shimada, Chenliang Xu
CVPR, 2026
When to Think and When to Look: Uncertainty-Guided Lookback
Jing Bi, Filippos Bellos, Junjia Guo, Yayuan Li, Chao Huang, Yunlong Tang, Luchuan Song, Susan Liang, Zhongfei Mark Zhang, Jason J. Corso, Chenliang Xu
CVPR, 2026
Caption Anything in Video: Fine-grained Object-centric Captioning via Spatiotemporal Multimodal Prompting
Yunlong Tang, Jing Bi, Chao Huang, Susan Liang, Daiki Shimada, Hang Hua, Yunzhong Xiao, Yizhi Song, Pinxin Liu, Mingqian Feng, Junjia Guo, Zhuo Liu, Luchuan Song, Ali Vosoughi, Jinxi He, Liu He, Zeliang Zhang, Jiebo Luo, Chenliang Xu
AAAI Demo, 2026
Best Demo Runner-up
Paper
Video
Code
XModBench: Benchmarking Cross-Modal Capabilities and Consistency in Omni-Language Models
Xingrui Wang, Jiang Liu, Chao Huang, Xiaodong Yu, Ze Wang, Ximeng Sun, Jialian Wu, Alan Yuille, Emad Barsoum, Zicheng Liu
ICLR, 2026
Paper
Project
Code
Data
DRIFT: Directional Reasoning Injection for Fine-Tuning MLLMs
Chao Huang, Zeliang Zhang, Jiang Liu, Ximeng Sun, Jialian Wu, Xiaodong Yu, Ze Wang, Chenliang Xu, Emad Barsoum, Zicheng Liu
ACL Findings, 2026
Paper
Project
Code
2025
ZeroSep: Separate Anything in Audio with Zero Training
Chao Huang, Yuesheng Ma, Junxuan Huang, Susan Liang, Yunlong Tang, Jing Bi, Wenqiang Liu, Nima Mesgarani, Chenliang Xu
NeurIPS, 2025
Paper
Project
Code
Harnessing the Computation Redundancy in ViTs to Boost Adversarial Transferability
Jiani Liu*, Zhiyuan Wang*, Zeliang Zhang*, Chao Huang, Susan Liang, Yunlong Tang, Chenliang Xu
NeurIPS, 2025
Paper
MMPerspective: Do MLLMs Understand Perspective? A Comprehensive Benchmark
Yunlong Tang, Pinxin Liu, Mingqian Feng, Zhangyun Tan, Rui Mao, Chao Huang, Jing Bi, Yunzhong Xiao, Susan Liang, Hang Hua, Ali Vosoughi, Luchuan Song, Zeliang Zhang, Chenliang Xu
NeurIPS D&B, 2025
Paper
Project
Code
Learning to Highlight Audio by Watching Movies
Chao Huang, Ruohan Gao, J. M. F. Tsang, Jan Kurcius, Cagdas Bilen, Chenliang Xu, Anurag Kumar, Sanjeel Parekh
CVPR, 2025
Paper
Project
Code
Dataset
VidComposition: Can MLLMs Analyze Compositions in Compiled Videos?
Yunlong Tang*, Junjia Guo*, Hang Hua, Susan Liang, Mingqian Feng, Xinyang Li, Rui Mao, Chao Huang, Jing Bi, Zeliang Zhang, Pooyan Fazli, Chenliang Xu
CVPR, 2025
Paper
Project
Code
FreSca: Scaling in Frequency Space Enhances Diffusion Models
Chao Huang, Susan Liang, Yunlong Tang, Li Ma, Yapeng Tian, Chenliang Xu
CVPR Workshop (GMCV), 2025
Paper
Project
Code
π-AVAS: Can Physics-Integrated Audio-Visual Modeling Boost Neural Acoustic Synthesis?
Susan Liang, Chao Huang, Yunlong Tang, Zeliang Zhang, Chenliang Xu
ICCV, 2025
High-Quality Sound Separation Across Diverse Categories via Visually-Guided Generative Modeling
Chao Huang, Susan Liang, Yapeng Tian, Anurag Kumar, Chenliang Xu
IJCV, 2025
Paper
Project
Code
Video Understanding with Large Language Models: A Survey
Yunlong Tang*, ..., Chao Huang, ..., Ping Luo, Jiebo Luo, Chenliang Xu
IEEE TCSVT, 2025
Paper
Project
2024
DAVIS: High-Quality Audio-Visual Separation with Generative Diffusion Models
Chao Huang, Susan Liang, Yapeng Tian, Anurag Kumar, Chenliang Xu
ACCV, 2024
Best Paper Award
Paper
Project
Code
Modeling and Driving Human Body Soundfields through Acoustic Primitives
Chao Huang, Dejan Markovic, Chenliang Xu, Alexander Richard
ECCV, 2024
Paper
Project
Scaling Concept with Text-Guided Diffusion Models
Chao Huang, Susan Liang, Yunlong Tang, Yapeng Tian, Anurag Kumar, Chenliang Xu
arXiv, 2024
Paper
Project
Code
Language-Guided Joint Audio-Visual Editing Via One-Shot Adaptation
Susan Liang, Chao Huang, Yapeng Tian, Anurag Kumar, Chenliang Xu
ACCV, 2024
Paper
Project
Dataset
2023
AV-NeRF: Learning Neural Fields for Real-World Audio-Visual Scene Synthesis
Susan Liang, Chao Huang, Yapeng Tian, Anurag Kumar, Chenliang Xu
NeurIPS, 2023
Paper
Project
Code
Egocentric Audio-Visual Object Localization
Chao Huang, Yapeng Tian, Anurag Kumar, Chenliang Xu
CVPR, 2023
Paper
Code
2020 & Earlier
Non-Local Part-Aware Point Cloud Denoising
Chao Huang*, Ruihui Li*, Xianzhi Li, Chi-Wing Fu
arXiv, 2020
Extreme Image Compression via Multiscale Autoencoders With Generative Adversarial Optimization
Chao Huang, Haojie Liu, Tong Chen, Qiu Shen, Zhan Ma
IEEE VCIP, 2019