World Model · Generative AI · Multimodal LLM

Runyi Li 李 潤一

I am an incoming Ph.D. student in the Department of Information Science and Technology at the University of Tokyo, supervised by Prof. Toshihiko Yamasaki. I received my M.S. degree from Peking University, supervised by Prof. Jian Zhang, and my B.E. degree from Sichuan University, supervised by Prof. Lei Zhang.

My research focuses on world models, generative AI, multimodal large language models, and embodied AI, with publications at CVPR, ECCV (Oral), ICLR, NeurIPS, and ACM MM.

I am looking for internship opportunities in Tokyo or Yokohama, Japan, especially around multimodal generation, world models, and interpretability.

ECCV: Oral presentation for OmniSSR in 2024
700+: Citations across publications
2026: Outstanding Graduate, Peking University

Education & Experience

Academic training first, then research visits and internships.

Education

Research Visits & Internships

  • ByteDance: Research intern at TikTok Group, working on unified vision understanding and generation with multimodal LLMs and diffusion models.
  • Würzburg: Visiting student at the University of Würzburg, working with Prof. Radu Timofte on generative AI for low-level vision.
  • RabbitPre: Internship on trustworthy AI and AIGC protection.

Research

A compact map of the themes that connect my recent projects.

Current Direction

I work on generative models that can understand, synthesize, and protect visual content. Recently, my projects have covered autoregressive visual generation, diffusion-based image restoration, multimodal forgery analysis, watermarking, and 3D Gaussian representations.

I am especially interested in models that plan before they generate, preserve identity and provenance, and remain useful under realistic degradation or manipulation.

Recent News

  • 2026 Video-Mirai released as an arXiv preprint.
  • 2026 Mirai accepted to CVPR 2026.
  • 2025 FakeShield accepted to ICLR 2025.
  • 2025 OmniGuard accepted to CVPR 2025.
  • 2024 OmniSSR selected as an ECCV 2024 Oral.

Autoregressive Generation

Visual generation models with planning, foresight, and stronger temporal structure.

Low-Level Vision

Diffusion-based super-resolution, restoration, and controllable fidelity-realness trade-offs.

Trustworthy AI

Forgery localization, copyright protection, watermarking, and source attribution.

3D & Multimodal AI

3D Gaussian splatting, multimodal LLM reasoning, and visual-audio protection.

Publications

Selected papers grouped by research theme. Equal contribution is marked with *.

Generative AI and Autoregressive Models

World Models and Agentic AI

Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond

Meng Chu, Xuan Billy Zhang, Kevin Qinghong Lin, ..., Runyi Li, ..., Jiaya Jia

arXiv 2026

We synthesize foundations, capability levels, governing laws, evaluation principles, and open problems for agentic world modeling.

Generative AI for Low-Level Vision

RealOSR: Latent Guidance Boosts Diffusion-based Real-world Omnidirectional Image Super-Resolutions

IEEE JCSVT submission

We introduce RealOSR, an efficient diffusion-based framework for real-world omnidirectional image super-resolution with latent guidance.

CTSR: Controllable Fidelity-Realness Trade-off Distillation for Real-World Image Super Resolution

arXiv 2025

We enable controllable real-world image super-resolution with an adjustable trade-off between fidelity and realness.

LAFR: Efficient Diffusion-based Blind Face Restoration via Latent Codebook Alignment Adapter

arXiv 2025

We achieve efficient real-world face restoration via latent-space alignment, using only 600 training images from FFHQ.

Generative AI for Trustworthiness and Safety

GS-Hider: Hiding Messages into 3D Gaussian Splatting

Xuanyu Zhang, Jiarui Meng, Runyi Li, Zhipei Xu, Yongbing Zhang, Jian Zhang

NeurIPS 2024

We propose a 3DGS steganography framework that hides a 3D scene or image in the original 3D scene and decodes it from 3D Gaussians.

EditGuard: Versatile Image Watermarking for Tamper Localization and Copyright Protection

CVPR 2024

We propose a versatile deep forensic watermark against AIGC editing methods such as Stable Diffusion inpainting, ControlNet, and SDXL.

V2A-Mark: Versatile Deep Visual-Audio Watermarking for Manipulation Localization and Copyright Protection

ACM MM 2024

We propose a versatile deep forensic watermark against AIGC editing methods for video and audio.

GaussianSeal: Rooting Adaptive Watermarks for 3D Gaussian Generation Model

Machine Intelligence Research

We propose an adaptive watermarking framework for 3D Gaussian generation models.

Protect-Your-IP: Scalable Source-Tracing and Attribution against Personalized Generation

Runyi Li, Xuanyu Zhang, Zhipei Xu, Yongbing Zhang, Jian Zhang

IEEE TIFS submission

We propose an IP protection framework against personalized generation, jointly supporting source tracing and scalable attribution.

Publications Before 2023

Effectiveness of eHealth Self-management Interventions in Patients with Heart Failure: Systematic Review and Meta-analysis

Siru Liu, Jili Li, Dingyuan Wan, Runyi Li, Zhan Qu, Yundi Hu, Jialin Liu

Journal of Medical Internet Research, 2022

This study systematically reviews evidence for the effectiveness of eHealth self-management in heart failure patients.

Breast Cancer X-ray Image Staging Based on EfficientNet with Multi-scale Fusion and CBAM Attention

Runyi Li, Sen Wang, Zizhou Wang, Lei Zhang

Journal of Physics, 2021

We applied EfficientNet with CBAM attention to breast cancer medical image classification.

Image Caption and Medical Report Generation Based on Deep Learning: A Review and Algorithm Analysis

Runyi Li, Zizhou Wang, Lei Zhang

IEEE CISAI, 2021

A concise review of image captioning methods and medical report generation.

Academic Service

  • Reviewer for ICLR 2025 & 2026, AAAI 2026, CVPR 2025 & 2026, NeurIPS 2025 & 2026, ICCV 2025, ACM MM 2025, IEEE JSTSP, and IEEE TIFS.

Selected Honours

  • 2026 Outstanding Graduate, Peking University
  • 2025 Merit Student, Peking University
  • 2025 Scholarship of Ping'An Bank, Peking University & Ping'An Bank
  • 2024 Merit Student, Peking University
  • 2024 Scholarship of Public Welfare, Peking University & Guangdong Province
  • 2023 Outstanding Graduate, Sichuan Province
  • 2023 Outstanding Graduate, Sichuan University
  • 2023 National Scholarship, Ministry of Education of China
  • 2021 Bronze Award, Kaggle NFL Contest 2021
  • 2021 Scholarship of Soochow-Singapore Industrial Park, Soochow City & Sichuan University