Xue Yang

Assistant Professor, Ph.D. Supervisor

School of Automation and Intelligent Sensing, Shanghai Jiao Tong University

800 Dongchuan Road, Shanghai, 200240, China

📧 [email protected], [email protected], [email protected]


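I am looking for highly self-motivated Master's/Ph.D. students (2027 recommended-admission candidates, or holders of offers for 2026 or later from national AI academies such as Chuangzhi, Zhongguancun, and Hetao) and interns, to be co-supervised with Prof. Junchi Yan. The goal is to do impactful work on topics such as agentic AI, multimodal large models, spatial intelligence, and remote sensing image interpretation. Feel free to contact me by email.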

Looking for self-motivated students (Master's/Ph.D., 2027 spring and fall) and interns to join us, co-supervised by Prof. Junchi Yan, with the goal of doing impactful work on topics such as Agentic AI, Multimodal Large Language Models, Spatial Intelligence, and Remote Sensing Image Interpretation. Please do not hesitate to contact me via email.

🔑 Research Interests

My research interests include Agentic AI, Multimodal Large Language Models, Spatial Intelligence, Remote Sensing Image Interpretation, etc.

Citations: 12273

🔥 Latest News

2026-05

One paper related to Spatial Intelligence (Holi-Spatial, Oral, 168/23918=0.7%) is accepted by ICML 2026. Congratulations to Prof. Zhihang Zhong. 🎉🎉🎉

2026-05

CitationClaw-v2 is released. Cheaper and more accurate.

2026-05

Became the sponsor chair of the 4th SCS-CV. 🧨🧨🧨

2026-05

One paper related to RS-VLM survey (GeoChef) is accepted by GRSM. Congratulations to Prof. Yue Zhou. 🧨🧨🧨

2026-05

GeoViS has been selected as a Candidate for the CVPR 2026 Best Paper Award. 🧨🧨🧨

2026-05

One paper related to AI4SCI (Molecular Detoxification, Oral) is accepted by KDD 2026, AI for Sciences Track. Congratulations to Fei Lin and Ziyang Gong. 🎉🎉🎉

2026-05

Two papers related to Streaming Video (PhoStream) and Spatial Intelligence (Holi-Spatial, Spotlight, 536/23918=2.2%) are accepted by ICML 2026. Congratulations to Xudong Lu and Prof. Zhihang Zhong. 🎉🎉🎉

2026-04

One paper related to object detection in remote sensing images is accepted by TCSVT.

2026-02

Five papers related to Safety of LLMs are accepted by ACL 2026 (two Main Conference, three Findings). Congratulations to Yu Tian. 🎉🎉🎉

2026-03

CitationClaw is released. Turning Every Citation into Explainable Impact.

2026-02

Received 2025 Reviewer Certificate from IEEE TPAMI

2026-02

Five papers related to Video Tokenization (AdapTok), PEFT (CrossEarth-Gate), Visual Grounding (GeoVis, Oral, 74/16092=0.46%), OBB (PWOOD), and AD (SpatialRetrievalAD) are accepted by CVPR 2026, and one paper related to RGB-T&OBB (CrossWeaver) is accepted to Findings. Congratulations to Yan Li, Ziyang Gong, Zonghao Guo, Mingxin Liu, Xiaosong Jia, and Qi Ming. 🎉🎉🎉

2026-02

One paper related to weakly-supervised segmentation (SAPNet++) is accepted by TPAMI. Congratulations to Zhaoyang Wei. 🎉🎉🎉

2026-01

Shortlisted for Elsevier's 2025 Most Cited Chinese Researchers

2026-01

Six papers related to VLM (MM-Helix, SpaCE-10), OBB (SPWOOD, Point2RBox-v3), VLA (InterleaveVLA), Gen (OF-Diff) are accepted by ICLR 2026. Congratulations to Xiangyu Zhao, Ziyang Gong, Wei Zhang, Teng Zhang, Cunxin Fan, Ziqi Ye, etc. 🎉🎉🎉

🔥 Recent Works
【ToxiMol】Breaking Bad Molecules: Are MLLMs Ready for Structure-Level Molecular Detoxification? (KDD, 2026, Oral) Citation: 4
【Holi-Spatial】Evolving Video Streams into Holistic 3D Spatial Intelligence (ICML, 2026, Oral) Citation: 2
【PhoStream】Benchmarking Real-World Streaming for Omnimodal Assistants in Mobile Scenarios (ICML, 2026) Citation: 2
【Video-MME-v2】Towards the Next Stage in Video Understanding Evaluation (Tech. Report, 2026) Citation: 0
【Rt-LRM】Red Teaming Large Reasoning Models (ACL, 2026) Citation: 0
【Safe-FedLLM】Delving into the Safety of Federated Large Language Models (ACL, 2026) Citation: 1
【BizGenEval】A Systematic Benchmark for Commercial Visual Content Generation (arXiv, 2026) Citation: 1
【CrossEarth-SAR】A SAR-Centric and Billion-Scale Geospatial Foundation Model for Domain Generalizable Semantic Segmentation (arXiv, 2026) Citation: 0
【EvoTok】A Unified Image Tokenizer via Residual Latent Evolution for Visual Understanding and Generation (arXiv, 2026) Citation: 0
【FIRM】Trust Your Critic: Robust Reward Modeling and Reinforcement Learning for Faithful Image Editing and Generation (arXiv, 2026) Citation: 2
【GRADE】Benchmarking Discipline-Informed Reasoning in Image Editing (arXiv, 2026) Citation: 0
【CourtSI】Stepping VLMs onto the Court: Benchmarking Spatial Intelligence in Sports (arXiv, 2026) Citation: 0
【InternVL-U】Democratizing Unified Multimodal Models for Understanding, Reasoning, Generation and Editing (Tech. Report, 2026) Citation: 9
【ACE-Brain-0】Spatial Intelligence as a Shared Scaffold for Universal Embodiments (Tech. Report, 2026) Citation: 0
【AdapTok】Learning Adaptive and Temporally Causal Video Tokenization in a 1D Latent Space (CVPR, 2026) Citation: 5