I am a Ph.D. student in Computer Science at the University of California, San Diego, advised by Prof. Julian McAuley. My research focuses on multimodal large language models, reinforcement learning for LLM reasoning, and recommender systems. I received my B.E. from the Turing Class, Chu Kochen Honors College at Zhejiang University.

🔥 News

๐Ÿ“ Publications

Published

ICML 2026

WS-GRPO: Weakly-Supervised Group-Relative Policy Optimization for Rollout-Efficient Reasoning

Gagan Mundada, Zihan Huang, Rohan Surana, Sheldon Yu, Jennifer Yuntong Zhang, Xintong Li, Tong Yu, Lina Yao, Jingbo Shang, Julian McAuley, Junda Wu

[Paper] [OpenReview]

EMNLP 2025

Image Difference Captioning via Adversarial Preference Optimization

Zihan Huang, Junda Wu, Rohan Surana, Tong Yu, David Arbour, Raghav Sinha, Julian McAuley

[Paper]

COLM 2025

Traceable and Explainable Multimodal Large Language Models: An Information-Theoretic View

Zihan Huang, Junda Wu, Rohan Surana, Raghav Jain, Tong Yu, Raghavendra Addanki, David Arbour, Sungchul Kim, Julian McAuley

[Paper]

IEEE TMM 2025

AutoGeo: Automating Geometric Image Dataset Creation for Enhanced Geometry Understanding

Zihan Huang, Tao Wu, Wang Lin, Shengyu Zhang, Jingyuan Chen, Fei Wu

[Paper] [Project]

BIB 2024

Global-local aware Heterogeneous Graph Contrastive Learning for multifaceted association prediction in miRNA–gene–disease networks

Yang Si, Zihan Huang, Zhengqing Fang, Zhen Yuan, Zhen Huang, Yuxuan Li, Yuxuan Wei, Fei Wu, Yong-Fang Yao

[Paper]

Preprints

arXiv

Generate, Filter, Control, Replay: A Comprehensive Survey of Rollout Strategies for LLM Reinforcement Learning

Rohan Surana, Gagan Mundada, Xunyi Jiang, Chen Wang, Zifan Tang, Dongxu Jiao, Zihan Huang, et al.

[Paper]

arXiv

AMPS: Adaptive Modality Preference Steering via Functional Entropy

Zihan Huang, Xintong Li, Rohan Surana, Tong Yu, Rui Wang, Julian McAuley, Jingbo Shang, Junda Wu

[Paper]

arXiv

Skill-R1: Agent Skill Evolution via Reinforcement Learning

Yash Vishe, Rohan Surana, Xunyi Jiang, Zihan Huang, Xintong Li, Nikki Lijing Kuang, Tong Yu, Ryan A. Rossi, Jingbo Shang, Julian McAuley, Junda Wu

[Paper]

arXiv

Skill-CMIB: Multimodal Agent Skill for Consistent Action via Conditional Multimodal Information Bottleneck

Zihan Huang, Junda Wu, Tong Yu, Qianqi Yan, Rohan Surana, Uttaran Bhattacharya, Lina Yao, Xin Eric Wang, Julian McAuley

[Paper]

arXiv

Evaluation on Entity Matching in Recommender Systems

Zihan Huang, Rohan Surana, Zhouhang Xie, Junda Wu, Yu Xia, Julian McAuley

[Paper] [Code]

📖 Education

  • 2024.09 – Present, Ph.D. in Computer Science, University of California San Diego, CA, United States (Advisor: Julian McAuley)
  • 2020.09 – 2024.06, B.E. in Artificial Intelligence, Turing Class, Chu Kochen Honors College, Zhejiang University, Hangzhou, China

💼 Research Experience

  • UC San Diego, McAuley Lab, 2024.09 – Present, Advisor: Prof. Julian McAuley
  • Alibaba, Tmall Meta Research Group, 2024.06 – 2024.09
  • Shanghai Institute for Advanced Study of Zhejiang University, 2024.01 – 2024.06, Supervisor: Prof. Jingyuan Chen
  • ZJU DCD Lab, 2022.02 – 2023.06, Supervisor: Prof. Fei Wu

🎖 Honors and Awards

  • Project leader, National Student Research Training Program (SRTP), rated Outstanding, 2022–2023
  • Honorable Mention, Mathematical Contest in Modeling (MCM), team leader, 2022, COMAP
  • Academic Excellence Award, 2020–2022, Chu Kochen Honors College of ZJU
  • Second-Class Scholarship for Elite Student in Basic Sciences, 2021–2022, ZJU
  • NeurIPS Emergency Reviewer, 2023
  • 2nd Prize of 3D Printing Competition, 2020, ZJU
  • 2nd Prize of Zhejiang College Student Physics Competition, 2021
  • Half Marathon 1h53m, ranked 84/1000+, 2020, Hangzhou, Zhejiang
  • 1st Prize of Search & Rescue, Asia-Pacific Robotics Championship (APRC), 2013
  • 1st Prize of Triathlon, Asia-Pacific Robotics Championship (APRC), 2013