Wenqi Shao (邵文琪)

Biography

I am a Research Scientist at Shanghai AI Lab. I received my Ph.D. degree from the Multimedia Lab at the Chinese University of Hong Kong (CUHK) in 2022. During my Ph.D., I was supervised by Prof. Xiaogang Wang, Prof. Ping Luo, and Prof. Hongsheng Li. Before that, I received my bachelor's degree from the School of Mathematics at the University of Electronic Science and Technology of China (UESTC) in 2017, ranking 1/40. I was fortunate to have several internships in industry, including at Tencent ARC, Huawei Noah AI Foundation Group, and Sensetime Research.

My research interests lie in the evaluation and reasoning of multimodal foundation models, as well as compression techniques and hardware codesign for large models.

I am happy to work with self-motivated research interns at Shanghai AI Lab. Feel free to send me an email if you are interested in the above topics.

News

[01/2025] Five papers including VLB dynamic multimodal evaluation, MMIU multi-image evaluation, EMOS embodied multi-agents, SAMRefiner mask refinement with SAM, and Lumina-T2X text-to-X generation were accepted to ICLR'25.
[09/2024] Four papers including ConvBench (Spotlight) multi-modal multi-turn conversation evaluation, MM-NIAH multi-modal long-context evaluation, SearchLVLMs LVLMs search with RAG, and T2VHE T2V evaluation protocol were accepted to NeurIPS'24.
[05/2024] One paper ChartAssistant in pre-training LVLMs on chart-related tasks was accepted to ACL'24 Findings.
[05/2024] Four papers including MMT-Bench in evaluating LVLMs on a task map, ImplicitBench in evaluating safety of T2I models, Sphinx-X in pre-training powerful LVLMs, and RoboCodeX in pre-training embodied foundation models were accepted to ICML'24.
[03/2024] Two papers including OmniMedVQA in evaluating LVLMs in the medical domain and DiffAgent in T2I model selection with LLMs were accepted to CVPR'24.
[01/2024] Three papers including OmniQuant (Spotlight) in quantizing LLMs, BESA in sparsifying LLMs, and Tree-Planner in planning with LLMs were accepted to ICLR'24.
[08/2023] One paper EMMS in multimodal multitask model selection was accepted to NeurIPS'23.
[06/2023] Two papers including DMMI (Oral) in generalized referring segmentation and DiffRate in ViT token compression were accepted to ICCV'23.
[03/2023] A whitening approach RCD in real-time denoising for image and video was accepted to CVPR'23.
[01/2023] A self-supervised framework for outdoor point cloud in autonomous driving CO3 was accepted to ICLR'23.
[08/2022] Congrats! I have passed my Ph.D. Oral Defense.
[06/2022] An effective model selection method SFDA was accepted to ECCV 2022.
[06/2022] A self-supervised framework for point cloud in autonomous driving CO3 was released.
[01/2022] A normalization method DTN for ViTs was accepted to ICLR 2022.
[08/2021] Theoretical analysis for channel pruning CWDA was accepted to NeurIPS 2021.

Publications

2022

CO^3: Cooperative Unsupervised 3D Representation Learning for Autonomous Driving,
R. Chen, Y. Mu, R. Xu, W. Shao, C. Jiang, H. Xu, Y. Qiao, Z. Li, P. Luo
arXiv preprint (ICLR), 2023. [Code], [Paper]
Not All Models Are Equal: Predicting Model Transferability in a Self-challenging Fisher Space,
W. Shao, X. Zhao, Y. Ge, Z. Zhang, L. Yang, X. Wang, Y. Shan, P. Luo
European Conference on Computer Vision (ECCV), 2022. [Code], [Paper]
Dynamic Token Normalization Improves Vision Transformer,
W. Shao, Y. Ge, Z. Zhang, X. Xu, X. Wang, Y. Shan, P. Luo
International Conference on Learning Representations (ICLR), 2022. [Code], [Paper]

2021

Rethinking the pruning criteria for convolutional neural network,
Z. Huang*, W. Shao*, X. Wang, L. Lin, P. Luo
Advances in Neural Information Processing Systems (NeurIPS), 2021. [Paper], [Supp]

What makes for end-to-end object detection?
P. Sun, Y. Jiang, E. Xie, W. Shao, Z. Yuan, C. Wang, P. Luo
International Conference on Machine Learning (ICML), 2021. [Paper]

Differentiable Dynamic Quantization with Mixed Precision and Adaptive Resolution,
Z. Zhang, W. Shao, J. Gu, X. Wang, P. Luo
International Conference on Machine Learning (ICML), 2021. [Paper]

BWCP: Probabilistic Learning-to-Prune Channels for ConvNets via Batch Whitening,
W. Shao, H. Yu, Z. Zhang, H. Xu, Z. Li, P. Luo
arXiv preprint, Technical Report 2021. [Paper]

2020

Channel equilibrium networks for learning deep representation,
W. Shao, S. Tang, X. Pan, P. Tan, X. Wang, P. Luo
International Conference on Machine Learning (ICML), 2020. [Paper]

2019

SSN: Learning Sparse Switchable Normalization via SparsestMax,
W. Shao, J. Li, J. Ren, R. Zhang, X. Wang, P. Luo
International Journal of Computer Vision (IJCV), Volume 128, 2019. [Paper], [Code]

Towards understanding regularization in batch normalization,
P. Luo*, X. Wang*, W. Shao*, Z. Peng
International Conference on Learning Representations (ICLR), 2019. [Paper]

SSN: Learning Sparse Switchable Normalization via SparsestMax,
W. Shao, T. Meng, J. Li, Y. Li, R. Zhang, X. Wang, P. Luo
Conference on Computer Vision and Pattern Recognition (CVPR), 2019. [Paper], [Code]

Differentiable Learning-to-Group Channels via Groupable Convolutional Neural Networks,
Z. Zhang, J. Li, W. Shao, Z. Peng, R. Zhang, X. Wang, P. Luo
International Conference on Computer Vision (ICCV), 2019. [Paper]

Differentiable Dynamic Normalization for Learning Deep Representation,
P. Luo*, Z. Peng*, W. Shao, R. Zhang, J. Ren, P. Luo
International Conference on Machine Learning (ICML), 2019. [Paper]

Academic Activities

Co-organizer of Statistical Deep Learning Workshop on Computer Vision, ICCV 2019.
Conference Reviewer for CVPR 2020, 2021; ICLR 2020, 2021, 2022; NeurIPS 2019, 2020, 2021, 2022; ICCV 2021.
Invited Speaker at the VALSE Workshop on Normalization Methods, VALSE 2021.

Teaching

Teaching Assistant for ELEG2401 Introduction to Embedded System, Fall 2018, 2019, 2020, and 2021.
Teaching Assistant for ELEG1410 Linear Algebra and Vector Calculus for Engineers, Spring 2019, 2021, and 2022.
Teaching Assistant for ELEG2450 Probability and Statistics for Engineers, Spring 2020.