Xue Yang

Assistant Professor, Ph.D. Supervisor

School of Automation and Intelligent Sensing, Shanghai Jiao Tong University

800 Dongchuan Road, Shanghai, 200240, China

📧 [email protected], [email protected], [email protected]


I am looking for self-motivated Master's and Ph.D. students (2027 spring and fall intake, including 2027 recommended-admission (保研) candidates and holders of 2026-or-later offers from national AI institutes such as Chuangzhi, Zhongguancun, and Hetao) and interns to join us, co-supervised with Prof. Junchi Yan. The goal is to do impactful work on Agentic AI, Multimodal Large Language Models, Spatial Intelligence, Remote Sensing Image Interpretation, and related topics. Please do not hesitate to contact me via email.

🔑 Research Interests

My research interests include Agentic AI, Multimodal Large Language Models, Spatial Intelligence, Remote Sensing Image Interpretation, etc.

Citations: 12490

🔥 Latest News

2026-05

We have open-sourced SkillOpt jointly with Microsoft Research Asia 🔥🔥🔥

2026-05

Congratulations to doctoral student Ziyang Gong on being selected for the first CCF doctoral student funding program. 🧨🧨🧨

2026-05

Serving as an Associate Editor (AE) for Visual Intelligence.

2026-05

One paper related to Spatial Intelligence (Holi-Spatial, Oral, 168/23918=0.7%) is accepted by ICML 2026. Congratulations to Prof. Zhihang Zhong. 🎉🎉🎉

2026-05

CitationClaw-v2 is released. Cheaper and more accurate.

2026-05

Serving as the Sponsor Chair for the 4th SCS-CV. 🧨🧨🧨

2026-05

One paper related to RS-VLM survey (GeoChef) is accepted by GRSM. Congratulations to Prof. Yue Zhou. 🧨🧨🧨

2026-05

GeoViS has been selected as a Candidate for the CVPR 2026 Best Paper Award. 🧨🧨🧨

2026-05

One paper related to AI4SCI (Molecular Detoxification, Oral, 6.1%) is accepted by KDD 2026, AI for Sciences Track. Congratulations to Fei Lin and Ziyang Gong. 🎉🎉🎉

2026-05

Two papers related to Streaming Video (PhoStream) and Spatial Intelligence (Holi-Spatial, Spotlight, 536/23918=2.2%) are accepted by ICML 2026. Congratulations to Xudong Lu and Prof. Zhihang Zhong. 🎉🎉🎉

2026-04

One paper related to object detection in remote sensing images is accepted by TCSVT.

2026-02

Five papers related to Safety of LLMs are accepted by ACL 2026 (two Main Conference, three Findings). Congratulations to Yu Tian. 🎉🎉🎉

2026-03

CitationClaw is released. Turning Every Citation into Explainable Impact.

2026-02

Received the 2025 Reviewer Certificate from IEEE TPAMI.

2026-02

Five papers related to Video Tokenization (AdapTok), PEFT (CrossEarth-Gate), Visual Grounding (GeoVis, Oral, 74/16092=0.46%), OBB (PWOOD), and AD (SpatialRetrievalAD) are accepted by CVPR 2026, and one paper related to RGB-T&OBB (CrossWeaver) is accepted to Findings. Congratulations to Yan Li, Ziyang Gong, Zonghao Guo, Mingxin Liu, Xiaosong Jia, and Qi Ming. 🎉🎉🎉

🔥 Recent Works
Equal contribution
Corresponding author
Project Leader
【PhotoFlow】Agentic 3D Virtual Photography Missions (arXiv, 2026) Citation: 0
【SkillLens】From Raw Experience to Skill Consumption: A Systematic Study of Model-Generated Agent Skills (arXiv, 2026) Citation: 0
【SkillOpt】Executive Strategy for Self-Evolving Agent Skills (arXiv, 2026) Citation: 0
【SpaceDG】Benchmarking Spatial Intelligence under Visual Degradation (arXiv, 2026) Citation: 0
【ToxiMol】Breaking Bad Molecules: Are MLLMs Ready for Structure-Level Molecular Detoxification? (KDD, 2026, Oral) Citation: 4
【SafeSteer】A Decoding-level Defense Mechanism for Multimodal Large Language Models (ACL, 2026) Citation: 0
【CTAOD-RS】Continual Test-Time Adaptive Object Detection in Remote Sensing Images via Spectral Routing and Historical Reconstruction (TCSVT, 2026) Citation: 0
【Holi-Spatial】Evolving Video Streams into Holistic 3D Spatial Intelligence (ICML, 2026, Oral) Citation: 3
【PhoStream】Benchmarking Real-World Streaming for Omnimodal Assistants in Mobile Scenarios (ICML, 2026) Citation: 2
【DPN-LE】Dual Personality Neuron Localization and Editing for Large Language Models (ACL, 2026) Citation: 0
【MM-WebAgent】A Hierarchical Multimodal Web Agent for Webpage Generation (arXiv, 2026) Citation: 0
【HiProto】Hierarchical Prototype Learning for Interpretable Object Detection Under Low-quality Conditions (arXiv, 2026) Citation: 0
【Video-MME-v2】Towards the Next Stage in Benchmarks for Comprehensive Video Understanding (Tech. Report, 2026) Citation: 1
【Rt-LRM】Red Teaming Large Reasoning Models (ACL, 2026) Citation: 2
【Safe-FedLLM】Delving into the Safety of Federated Large Language Models (ACL, 2026) Citation: 1