Xintao Wang

I am currently a senior staff researcher at KwaiVGI, Kuaishou Technology, leading an effort on visual content generation, especially on video generation.
We are actively looking for research interns and full-time researchers to work on related research topics, including but not limited to image and video generation/editing. If you are interested, please feel free to email me at xintao.wang@outlook.com.

Previously, I was a senior staff researcher at Tencent ARC Lab and Tencent AI Lab, where I led an effort on visual content generation (AIGC).
I got my Ph.D. degree from Multimedia Lab (MMLab), the Chinese University of Hong Kong, advised by Prof. Chen Change Loy and Prof. Xiaoou Tang. I also work closely with Prof. Chao Dong. Earlier, I obtained my bachelor's degree from Zhejiang University.

I am currently immersed in the exhilarating field of generative AI, which has been an exciting journey.
● 2D (Image/Video) Generation

Controllable Image Generation: T2I-Adapter, PhotoMaker, CustomNet, MasaCtrl, DragonDiffusion, SmartEdit
Controllable Video Generation: MotionCtrl, Tune-A-Video
Video Foundation Models: VideoCrafter Series (VideoCrafter1, DynamiCrafter, EvalCrafter, StyleCrafter, etc.)

● 3D Generation

Dream3D, GET3D

● Previously, I worked on Restoration

General Image Restoration: Real-ESRGAN, ESRGAN
Face Restoration: GFPGAN, VQFR, GLEAN
Video Restoration: EDVR, BasicVSR
Training Frameworks and Others: BasicSR, SFTGAN

News

[3/2024] Nine papers are accepted to CVPR 2024.
[1/2024] Three papers are accepted to ICLR 2024.
[12/2023] Three papers are accepted to AAAI 2024.
[10/2023] Ranked as Top 2% Scientists Worldwide 2023 (Single Year) by Stanford University.
[10/2023] Two papers are accepted to NeurIPS 2023.
[09/2023] Released T2I-Adapter for SDXL, the most efficient control models, in collaboration with Hugging Face.
[07/2023] Three papers are accepted to ICCV 2023.
[04/2023] One paper is accepted to ICML 2023.
[03/2023] We are holding the 360° Super-Resolution Challenge as a part of the NTIRE workshop in conjunction with CVPR 2023.
[02/2023] Three papers to appear in CVPR 2023.
[11/2022] Two papers to appear in AAAI 2023.
[09/2022] Ranked as Top 2% Scientists Worldwide 2022 (Single Year) by Stanford University.
[09/2022] Two papers to appear in NeurIPS 2022.
[07/2022] Two papers to appear in ECCV 2022. VQFR is accepted as oral (2.7%).
[06/2022] Two papers to appear in ACM MM 2022.
[05/2022] BasicSR joins the XPixel Group!
[04/2022] We release a high-quality face video dataset (VFHQ). Please refer to the project page and our paper.
[12/2021] One paper to appear in NeurIPS 2021 as spotlight (2.85%): FAIG: Finding Discriminative Filters for Specific Degradations in Blind Super-Resolution. Codes are released in TencentARC/FAIG.
[10/2021] Real-ESRGAN is accepted by ICCV 2021 AIM workshop with Honorary Nomination Paper Award.

[07/2021] One paper to appear in ICCV 2021: Towards Vivid and Diverse Image Colorization with Generative Color Prior

[07/2021] The code for practical image restoration, Real-ESRGAN, is released on GitHub.

[06/2021] The training and testing codes of GFPGAN are released on TencentARC.

[03/2021] 5 papers to appear in CVPR 2021.

[03/2021] A brand-new HandyView is online!

[08/2020] A brand-new BasicSR v1.0.0 is online!

[06/2019] We have released the EDVR training and testing codes and also updated BasicSR codes!

[06/2019] Got my first outstanding reviewer recognition from CVPR 2019!

[05/2019] Our video restoration method, EDVR, won all four tracks in the NTIRE 2019 video restoration and enhancement challenges. Check our paper for more details.

[03/2019] Our paper Deep Network Interpolation for Continuous Imagery Effect Transition to appear in CVPR 2019.

[08/2018] Our SuperSR team won the third track of the 2018 PIRM Challenge on Perceptual Super-Resolution. Check the report ESRGAN for more details.

[06/2018] We won the NTIRE 2018 Challenge on Single Image Super-Resolution as first runner-up and ranked first in the Realistic Wild ×4 conditions track.

[02/2018] Our paper Recovering Realistic Texture in Image Super-resolution by Deep Spatial Feature Transform to appear in CVPR 2018.

[07/2017] Our HelloSR team won the NTIRE 2017 Challenge on Single Image Super-Resolution as first runner-up.

T2I-Adapter

Dig out controllable ability for text-to-image diffusion models

VideoCrafter

Open-source large models for video generation

Real-ESRGAN

Practical algorithms for image restoration

GFPGAN

Practical face restoration

BasicSR

Open source image and video restoration toolbox

HandyView

Handy image viewer

Publications [Full List]

(* equal contribution, ^# corresponding author)
Selected Preprints

T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models

Chong Mou, Xintao Wang^#, Liangbin Xie, Yanze Wu, Jian Zhang^#, Zhongang Qi, Ying Shan, Xiaohu Qie

arXiv preprint, 2023. Paper (arXiv) Codes

DreamDiffusion: Generating High-Quality Images from Brain EEG Signals

Yunpeng Bai, Xintao Wang^#, Yan-Pei Cao, Yixiao Ge, Chun Yuan^#, Ying Shan

arXiv preprint, 2023. Paper (arXiv) Codes

Follow Your Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos

Yue Ma, Yingqing He, Xiaodong Cun, Xintao Wang, Ying Shan, Xiu Li, Qifeng Chen

arXiv preprint, 2023. Project Page Paper (arXiv) Codes

DragonDiffusion: Enabling Drag-style Manipulation on Diffusion Models

Chong Mou, Xintao Wang, Jiechong Song, Ying Shan, Jian Zhang

arXiv preprint, 2023. Project Page Paper (arXiv) Codes

2023

MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing

Mingdeng Cao, Xintao Wang^#, Zhongang Qi, Ying Shan, Xiaohu Qie, Yinqiang Zheng^#

ICCV, 2023. Project Page Paper (arXiv) Codes

Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation

Jay Zhangjie Wu, Yixiao Ge, Xintao Wang, Weixian Lei, Yuchao Gu, Yufei Shi, Wynne Hsu, Ying Shan, Xiaohu Qie, Mike Zheng Shou

ICCV, 2023. Project Page Paper (arXiv) Codes

Fate/Zero: Fusing Attentions for Zero-shot Text-based Video Editing

Chenyang Qi, Xiaodong Cun, Yong Zhang, Chenyang Lei, Xintao Wang, Ying Shan, Qifeng Chen

ICCV, 2023. Project Page Paper (arXiv) Codes

DeSRA: Detect and Delete the Artifacts of GAN-based Real-World Super-Resolution Models

Liangbin Xie*, Xintao Wang*, Xiangyu Chen*, Gen Li, Ying Shan, Jiantao Zhou, Chao Dong

ICML, 2023. Paper (arXiv) Codes

Dream3D: Zero-Shot Text-to-3D Synthesis Using 3D Shape Prior and Text-to-Image Diffusion Models

Jiale Xu, Xintao Wang^#, Weihao Cheng, Yan-Pei Cao, Ying Shan, Xiaohu Qie, Shenghua Gao^#

CVPR, 2023. Project Page Paper (arXiv) Codes (Coming Soon)

OSRT: Omnidirectional Image Super-Resolution with Distortion-aware Transformer

Fanghua Yu*, Xintao Wang*, Mingdeng Cao, Gen Li, Ying Shan, Chao Dong^#

CVPR, 2023. Paper (arXiv) Codes

HAT: Activating More Pixels in Image Super-Resolution Transformer

Xiangyu Chen, Xintao Wang, Jiantao Zhou, Chao Dong

CVPR, 2023. Paper (arXiv) Codes

Mitigating Artifacts in Real-World Video Super-Resolution Models

Liangbin Xie, Xintao Wang, Shuwei Shi, Jinjin Gu, Chao Dong, Ying Shan

AAAI, 2023. Paper (arXiv) Codes

Accelerating the Training of Video Super-resolution Models

Lijian Lin, Xintao Wang^#, Zhongang Qi, Ying Shan

AAAI, 2023. Paper (arXiv) Codes

2022

AnimeSR: Learning Real-World Super-Resolution Models for Animation Videos

Yanze Wu*, Xintao Wang*, Gen Li, Ying Shan

NeurIPS, 2022. Paper (arXiv) Codes

Rethinking Alignment in Video Super-Resolution Transformers

Shuwei Shi, Jinjin Gu, Liangbin Xie, Xintao Wang, Yujiu Yang, Chao Dong

NeurIPS, 2022. Paper (arXiv) Codes

VQFR: Blind Face Restoration with Vector-Quantized Dictionary and Parallel Decoder

Yuchao Gu, Xintao Wang, Liangbin Xie, Chao Dong, Gen Li, Ying Shan, Ming-Ming Cheng

Selected as Oral (2.7%)
ECCV, 2022. Paper (arXiv) Codes

MM-RealSR: Metric Learning based Interactive Modulation for Real-World Super-Resolution

Chong Mou, Yanze Wu, Xintao Wang, Chao Dong, Jian Zhang, Ying Shan

ECCV, 2022. Paper (arXiv) Codes

2021

Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data

Xintao Wang, Liangbin Xie, Chao Dong, Ying Shan

ICCVW, 2021. Paper (arXiv) Codes

GFPGAN: Towards Real-World Blind Face Restoration with Generative Facial Prior

Xintao Wang, Yu Li, Honglun Zhang, Ying Shan

CVPR, 2021. Project Page Paper (arXiv) Codes

To be updated