About Me

I am a second-year Master’s student at Tsinghua University and a member of the CVML Lab, advised by Prof. Chun Yuan.

Before joining Tsinghua, I received my B.S. in Computer Science and Technology from the Central University of Finance and Economics in 2025. My recent research focuses on training stronger foundation multimodal large language models.

Research Interests

Multimodal large language models
Controllable video generation and world models
Multimodal image fusion
Remote sensing understanding and reasoning

Education

Tsinghua University, M.S. in Computer Technology, 2025 - Present
Central University of Finance and Economics, B.S. in Computer Science and Technology, 2021 - 2025

Publications

Image Generation

Towards Unified Semantic and Controllable Image Fusion: A Diffusion Transformer Approach

Jiayang Li*, Chengjie Jiang*, Junjun Jiang†, Pengwei Liang, Jiayi Ma, Liqiang Nie

TPAMI 2026 IF: 18.6 Diffusion Transformer Text-Controlled Image Fusion Multimodal Segmentation

Project PDF Code Model

RIS-FUSION: Rethinking Text-Driven Infrared and Visible Image Fusion From The Perspective of Referring Image Segmentation

Siju Ma, Changxiyu Gong, Xiaofeng Fan, Yong Ma, Chengjie Jiang†

ICASSP 2026 Oral Infrared-Visible Fusion Referring Image Segmentation Text-Driven Fusion

PDF Code Dataset

Two in One: Robust Fusion of Infrared and Visible Images in Rainy Condition

Jing Li, Jiafeng Yan, Chengjie Jiang, Bin Yang†

JAS 2026 IF: 19.2 Infrared-Visible Fusion Rain Removal Robust Perception Coupled Restoration

Where Fusion Meets Dehazing: A Coupled Framework for Robust Visible-Infrared Image Fusion in Haze

Jing Li, Jiafeng Yan, Chengjie Jiang, Bin Yang, Yu Liu†

TIP Under Review Infrared-Visible Fusion Image Dehazing Adverse Weather Coupled Restoration

Multimodal Understanding

FOVIS: Foveated Vision for Ultra-High-Resolution Remote Sensing Reasoning

Y. Zhou*, Chengjie Jiang*, H. Zheng, X. Wang, S. Xu, Z. Long, L. Shi, X. Fan, C. Yuan†

Under Review Remote Sensing Reasoning Ultra-High Resolution Foveated Attention

Look Where It Matters: Training-Free Ultra-HR Remote Sensing VQA via Adaptive Zoom Search

Yunqi Zhou*, Chengjie Jiang*, Chun Yuan, Jing Li†

arXiv 2025 Remote Sensing VQA Ultra-High Resolution Training-Free Plug-and-Play

Project PDF Code

GRASP: Geospatial Pixel Reasoning via Structured Policy Learning

Chengjie Jiang, Y. Zhou, J. Yan, J. Li†, J. Li, Y. Zhou, H. He, J. Li

arXiv 2025 Remote Sensing Geospatial Pixel Reasoning Structured Policy Learning

PDF

EmbRACE-3K: Embodied Reasoning and Action in Complex Environments

M. Lin*, W. Huang*, Y. Li, Chengjie Jiang, K. Wu, F. Zhong, S. Qian†, X. Wang, X. Qi†

arXiv 2025 Embodied AI VLA Benchmark Dynamic Spatial Reasoning Multi-Step Action

PDF

Internship

XPENG Motors

Embodied AI Research Intern · Topic: VLM Pre-training · Mentor: TBD.

July 2026 - Present

Tencent LIGHTSPEED STUDIOS

Research Intern · Topic: Multimodal Large Language Models · Mentor: Shengju Qian.

July 2024 - June 2025

Awards

M Award, Mathematical Contest in Modeling 2024

Huawei Scholarship, Central University of Finance and Economics 2023

First-Class Scholarship for Comprehensive Development, Central University of Finance and Economics 2022 and 2023

Outstanding Academic Scholarship, Central University of Finance and Economics 2023

Chengjie Jiang（蒋铖杰）