Biography

I am a Research Scientist at Shanghai AI Lab. I received my Ph.D. degree from the Multimedia Lab, The Chinese University of Hong Kong (CUHK), in 2022, where I was supervised by Prof. Xiaogang Wang, Prof. Ping Luo, and Prof. Hongsheng Li. Before that, I received my bachelor's degree from the School of Mathematics at the University of Electronic Science and Technology of China (UESTC) in 2017, ranking 1/40. I was fortunate to complete several industry internships, including at Tencent ARC, Huawei Noah's Ark Lab (AI Foundation Group), and SenseTime Research.

My research interests lie in the pre-training, evaluation, and application of multimodal foundation models, as well as compression techniques and hardware co-design for large models.

I am happy to work with self-motivated research interns at Shanghai AI Lab. Feel free to send me an email if you are interested in the topics above.

News

  • [05/2024] One paper, ChartAssistant, on pre-training LVLMs for chart-related tasks was accepted to ACL'24 Findings.

  • [05/2024] Four papers were accepted to ICML'24: MMT-Bench on evaluating LVLMs across a task map, ImplicitBench on evaluating the safety of T2I models, Sphinx-X on pre-training powerful LVLMs, and RoboCodeX on pre-training embodied foundation models.

  • [03/2024] Two papers were accepted to CVPR'24: OmniMedVQA on evaluating LVLMs in the medical domain, and DiffAgent on T2I model selection with LLMs.

  • [01/2024] Three papers were accepted to ICLR'24: OmniQuant (Spotlight) on quantizing LLMs, BESA on sparsifying LLMs, and Tree-Planner on planning with LLMs.

  • [08/2023] One paper, EMMS, on multimodal multi-task model selection was accepted to NeurIPS'23.

  • [06/2023] Two papers were accepted to ICCV'23: DMMI (Oral) on generalized referring segmentation, and DiffRate on ViT token compression.

  • [03/2023] A whitening approach, RCD, for real-time image and video denoising was accepted to CVPR'23.

  • [01/2023] CO3, a self-supervised framework for outdoor point clouds in autonomous driving, was accepted to ICLR'23.

  • [08/2022] I passed my Ph.D. oral defense.

  • [06/2022] An effective model selection method, SFDA, was accepted to ECCV 2022.

  • [06/2022] CO3, a self-supervised framework for point clouds in autonomous driving, was released.

  • [01/2022] DTN, a normalization method for ViTs, was accepted to ICLR 2022.

  • [08/2021] CWDA, a theoretical analysis of channel pruning criteria, was accepted to NeurIPS 2021.

Publications

2022

  • CO^3: Cooperative Unsupervised 3D Representation Learning for Autonomous Driving,
    R. Chen, Y. Mu, R. Xu, W. Shao, C. Jiang, H. Xu, Y. Qiao, Z. Li, P. Luo
    arXiv preprint, 2022 (accepted to ICLR 2023). [Code], [Paper]
  • Not All Models Are Equal: Predicting Model Transferability in a Self-challenging Fisher Space,
    W. Shao, X. Zhao, Y. Ge, Z. Zhang, L. Yang, X. Wang, Y. Shan, P. Luo
    European Conference on Computer Vision (ECCV), 2022. [Code], [Paper]
  • Dynamic Token Normalization Improves Vision Transformer,
    W. Shao, Y. Ge, Z. Zhang, X. Xu, X. Wang, Y. Shan, P. Luo
    International Conference on Learning Representations (ICLR), 2022. [Code], [Paper]

2021

  • Rethinking the Pruning Criteria for Convolutional Neural Network,
    Z. Huang*, W. Shao*, X. Wang, L. Lin, P. Luo
    Advances in Neural Information Processing Systems (NeurIPS), 2021. [Paper], [Supp]

  • What Makes for End-to-End Object Detection?
    P. Sun, Y. Jiang, E. Xie, W. Shao, Z. Yuan, C. Wang, P. Luo
    International Conference on Machine Learning (ICML), 2021. [Paper]

  • Differentiable Dynamic Quantization with Mixed Precision and Adaptive Resolution,
    Z. Zhang, W. Shao, J. Gu, X. Wang, P. Luo
    International Conference on Machine Learning (ICML), 2021. [Paper]

  • BWCP: Probabilistic Learning-to-Prune Channels for ConvNets via Batch Whitening,
    W. Shao, H. Yu, Z. Zhang, H. Xu, Z. Li, P. Luo
    arXiv preprint, Technical Report 2021. [Paper]

2020

  • Channel Equilibrium Networks for Learning Deep Representation,
    W. Shao, S. Tang, X. Pan, P. Tan, X. Wang, P. Luo
    International Conference on Machine Learning (ICML), 2020. [Paper]

2019

  • SSN: Learning Sparse Switchable Normalization via SparsestMax,
    W. Shao, J. Li, J. Ren, R. Zhang, X. Wang, P. Luo
    International Journal of Computer Vision (IJCV), Volume 128, 2019. [Paper], [Code]

  • Towards Understanding Regularization in Batch Normalization,
    P. Luo*, X. Wang*, W. Shao*, Z. Peng
    International Conference on Learning Representations (ICLR), 2019. [Paper]

  • SSN: Learning Sparse Switchable Normalization via SparsestMax,
    W. Shao, T. Meng, J. Li, Y. Li, R. Zhang, X. Wang, P. Luo
    Conference on Computer Vision and Pattern Recognition (CVPR), 2019. [Paper], [Code]

  • Differentiable Learning-to-Group Channels via Groupable Convolutional Neural Networks,
    Z. Zhang, J. Li, W. Shao, Z. Peng, R. Zhang, X. Wang, P. Luo
    International Conference on Computer Vision (ICCV), 2019. [Paper]

  • Differentiable Dynamic Normalization for Learning Deep Representation,
    P. Luo*, Z. Peng*, W. Shao, R. Zhang, J. Ren
    International Conference on Machine Learning (ICML), 2019. [Paper]

Academic Activities

  • Co-organizer of the Workshop on Statistical Deep Learning in Computer Vision, ICCV 2019.

  • Conference Reviewer for CVPR 2020, 2021; ICLR 2020, 2021, 2022; NeurIPS 2019, 2020, 2021, 2022; ICCV 2021.

  • Invited Speaker at the VALSE Workshop on Normalization Methods, 2021.

Teaching

  • Teaching Assistant for ELEG2401 Introduction to Embedded System, Fall 2018, 2019, 2020, and 2021.

  • Teaching Assistant for ELEG1410 Linear Algebra and Vector Calculus for Engineers, Spring 2019, 2021, and 2022.

  • Teaching Assistant for ELEG2450 Probability and Statistics for Engineers, Spring 2020.