Runyi Li (李润一)

I am a second-year master's student at the School of Electronic and Computer Engineering, Peking University, supervised by Prof. Jian Zhang. I received my B.E. degree from Sichuan University in 2023 as an Honour Graduate.

My research interests include Generative AI, Multi-modal LLMs, Low-level Vision, and 3D Gaussian Splatting.

I am currently an intern at the ByteDance TikTok Group, focusing on unified vision understanding and generation via MLLMs and diffusion models.

Before that, I was a visiting student at the University of Würzburg, supervised by Prof. Radu Timofte, focusing on generative AI for low-level vision, such as image super-resolution and face restoration.

I have also done research in (1) trustworthy AI during an internship at RabbitPre, a unicorn startup, and (2) AI for chemistry and materials research at the School of Physics, Peking University, supervised by Prof. Lixin Xiao.

I have published several papers at top-tier conferences, such as CVPR, ECCV (oral), ICLR, NeurIPS, and ACMMM, and my work has received 200+ citations.

I am looking for a Ph.D. position starting in Fall 2026, working on Multi-modal LLMs, Generative AI, and other topics in computer vision and machine learning. I am seeking to contribute to foundational research in multimodal generation and interpretability. If you are interested in my research, please feel free to contact me.

Publications 2023-Now

Generative AI for Low-level Vision

OmniSSR: Zero-shot Omnidirectional Image Super-Resolution using Stable Diffusion Model

(* equal contribution)

ECCV 2024 Oral

We propose OmniSSR, a zero-shot omnidirectional image super-resolution method that achieves both fidelity and realness.

RealOSR: Latent Unfolding Boosting Diffusion-based Real-world Omnidirectional Image Super-Resolution

(* equal contribution)

NeurIPS 2025 submission

We propose RealOSR, the first method for real-world omnidirectional image super-resolution.

CTSR: Controllable Fidelity-Realness Trade-off Distillation for Real-World Image Super Resolution

AAAI 2026 submission

We achieve controllable real-world image SR, striking a trade-off between fidelity and realness.

LAFR: Efficient Diffusion-based Blind Face Restoration via Latent Codebook Alignment Adapter

NeurIPS 2025 submission

We achieve efficient real-world face restoration via latent-space alignment, using only 600 (0.9%) training images from FFHQ.

Generative AI for Trustworthiness and Safety

FakeShield: Explainable Image Forgery Detection and Localization via Multi-modal Large Language Models

ICLR 2025

We propose FakeShield, the first explainable framework for image manipulation localization via MLLMs.

OmniGuard: Hybrid Manipulation Localization via Augmented Versatile Deep Image Watermarking

Xuanyu Zhang, Zecheng Tang, Zhipei Xu, Runyi Li, Youmin Xu, Bin Chen, Feng Gao, Jian Zhang

CVPR 2025

We propose the first hybrid manipulation localization framework OmniGuard, which can localize and detect various manipulations in images.

GS-Hider: Hiding Messages into 3D Gaussian Splatting

Xuanyu Zhang, Jiarui Meng, Runyi Li, Zhipei Xu, Yongbing Zhang, Jian Zhang

NeurIPS 2024

We propose the first 3DGS steganography framework GS-Hider, which can hide an entire 3D scene or an image into the original 3D scene and accurately decode it from 3D Gaussians.

EditGuard: Versatile Image Watermarking for Tamper Localization and Copyright Protection

Xuanyu Zhang, Runyi Li, Jiwen Yu, Youmin Xu, Weiqi Li, Jian Zhang

CVPR 2024

We propose a versatile deep forensic watermark against AIGC editing methods, such as Stable Diffusion inpainting, ControlNet, and SDXL.

V2A-Mark: Versatile Deep Visual-Audio Watermarking for Manipulation Localization and Copyright Protection

Xuanyu Zhang, Youmin Xu, Runyi Li, Jiwen Yu, Weiqi Li, Jian Zhang

ACMMM 2024

We propose a versatile deep forensic watermark against AIGC editing methods for video and audio.

GaussianSeal: Rooting Adaptive Watermarks for 3D Gaussian Generation Model

Machine Intelligence Research (Springer JCR Q1, IF = 8.7)

We propose GaussianSeal, the first watermarking framework for 3DGS generation models.

Protect-Your-IP: Scalable Source-Tracing and Attribution against Personalized Generation

Runyi Li, Xuanyu Zhang, Zhipei Xu, Yongbing Zhang, Jian Zhang

IEEE TIFS, in submission (SCI JCR Q1, IF = 6.8)

We propose an IP protector against personalized generation that jointly accomplishes source tracing and scalable attribution.

Publications before 2023

A Simple Visual-Textual Baseline for Pedestrian Attribute Recognition

Effectiveness of eHealth self-management interventions in patients with heart failure: systematic review and meta-analysis

Journal of Medical Internet Research (SCI JCR Q1, IF = 7.4), 2022

This study aimed to systematically review the evidence for the effectiveness of eHealth self-management interventions in patients with heart failure (HF).

Breast cancer X-ray image staging: based on efficient net with multi-scale fusion and cbam attention

Journal of Physics, 2021

We applied the EfficientNet model with CBAM attention to a breast cancer medical image dataset and completed the breast cancer classification task with high accuracy.

Image caption and medical report generation based on deep learning: a review and algorithm analysis

Runyi Li, Zizhou Wang, Lei Zhang

IEEE Conference CISAI, 2021

This study is a concise review of image captioning methods and medical report generation.

Academic services

I have served as a reviewer for ICLR 2025, CVPR 2025, ICCV 2025, ACMMM 2025, and IEEE TIFS.

Selected Honours