I am a third-year Ph.D. candidate at Shanghai Jiao Tong University and Shanghai AI Laboratory, supervised by Prof. Jifeng Dai. I obtained my bachelor’s degree from Beihang University in 2022, where I worked with Prof. Si Liu. I also hold a second bachelor’s degree (double major) in economics from Peking University. Currently, I am an intern at OpenGVLab of Shanghai AI Laboratory. Previously, I interned at SenseTime and Sea AI Lab.
Ph.D. (Joint Program with Shanghai AI Lab), 2022-2027 (expected)
Department of EE, Shanghai Jiao Tong University
B.A. in Economics (Double Major), 2019-2022
National School of Development, Peking University
B.Eng. in Computer Science, 2018-2022
Shenyuan Honors College, Beihang University
2025.8: 🚀 We release InternVL3.5, a leading multimodal large language model with advanced versatility, reasoning, and efficiency.
2025.8: 🏆 Our paper Sparkle on VLM spatial reasoning is accepted by EMNLP 2025 Findings and awarded the Best Paper Award at IJCAI MKLM Workshop 2025.
2025.7: ⭐️ Our paper PIIP on efficient multimodal understanding is accepted by TPAMI.
2025.7: 🏆 Our paper Limit of RLVR on reinforcement learning for LLMs is awarded the Best Paper Award (2/172) of ICML AI4MATH Workshop 2025!
2025.6: ⭐️ Our paper V2M Survey on vision-to-music generation is accepted by ISMIR 2025.
2025.4: 🎤 Talk on Mono-InternVL at Open Multimodal Gathering Workshop hosted by NUS ShowLab. [Slides]
2025.2: ⭐️ Our papers Mono-InternVL and SynerGen-VL on encoder-free MLLMs are accepted by CVPR 2025.
2024.10: ⭐️ Our paper ItiNera on LLM for urban itinerary generation is accepted by EMNLP 2024. It is also awarded the Best Paper Award of KDD Urban Computing Workshop (UrbComp) 2024.
2024.9: ⭐️ Our paper PIIP on efficient vision backbones is accepted by NeurIPS 2024 as a Spotlight, ranking in the top 10 of NeurIPS 2024 (among 15,671 submissions) and top 2 in the computer vision area.
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
Weiyun Wang*, Zhangwei Gao*, Lixin Gu*, Hengjun Pu*, Long Cui*, Xingguang Wei*, Zhaoyang Liu*, Linglin Jing*, Shenglong Ye*, Jie Shao*, Zhaokai Wang*, Zhe Chen*, Hongjie Zhang, Ganlin Yang, Haomin Wang, Qi Wei, Jinhui Yin, Wenhao Li, Erfei Cui, Guanzhou Chen, Zichen Ding, Changyao Tian, Zhenyu Wu, Jingjing Xie, Zehao Li, Bowen Yang, Yuchen Duan, Xuehui Wang, Songze Li, Xiangyu Zhao, Haodong Duan, Nianchen Deng, Bin Fu, Yinan He, Yi Wang, Conghui He, Botian Shi, Junjun He, Yingtong Xiong, Han Lv, Lijun Wu, Wenqi Shao, Kaipeng Zhang, Huipeng Deng, Biqing Qi, Jiaye Ge, Qipeng Guo, Wenwei Zhang, Wanli Ouyang, Limin Wang, Min Dou, Xizhou Zhu, Tong Lu, Dahua Lin, Jifeng Dai, Weijie Su, Bowen Zhou, Kai Chen, Yu Qiao, Wenhai Wang, Gen Luo
Technical Report
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?
Yang Yue, Zhiqi Chen, Rui Lu, Andrew Zhao, Zhaokai Wang, Yang Yue, Shiji Song, Gao Huang
ICML 2025 AI4MATH Workshop Best Paper Award (2/172)
Sparkle: Mastering Basic Spatial Capabilities in Vision Language Models Elicits Generalization to Composite Spatial Reasoning
Yihong Tang*, Ao Qu*, Zhaokai Wang*, Dingyi Zhuang*, Zhaofeng Wu, Wei Ma, Shenhao Wang, Yunhan Zheng, Zhan Zhao, Jinhua Zhao
EMNLP 2025 Findings & IJCAI 2025 MKLM Workshop Best Paper Award
SynerGen-VL: Towards Synergistic Image Understanding and Generation with Vision Experts and Token Folding
Hao Li, Changyao Tian, Jie Shao, Xizhou Zhu, Zhaokai Wang, Jinguo Zhu, Wenhan Dou, Xiaogang Wang, Hongsheng Li, Lewei Lu, Jifeng Dai
CVPR 2025
ITINERA: Integrating Spatial Optimization with Large Language Models for Open-domain Urban Itinerary Planning
Yihong Tang*, Zhaokai Wang*, Ao Qu*, Yihao Yan*, Zhaofeng Wu, Dingyi Zhuang, Jushi Kai, Kebing Hou, Xiaotong Guo, Han Zheng, Tiange Luo, Jinhua Zhao, Zhan Zhao, Wei Ma
EMNLP 2024 Industry Track & KDD 2024 UrbComp Workshop Best Paper Award
Auto MC-Reward: Automated Dense Reward Design with Large Language Models for Minecraft
Hao Li*, Xue Yang*, Zhaokai Wang*, Xizhou Zhu, Jie Zhou, Yu Qiao, Xiaogang Wang, Hongsheng Li, Lewei Lu, Jifeng Dai
CVPR 2024
Talks:
Conference Reviewer:
Teaching Assistant