Vision Transformers Enable Fast and Robust Accelerated MRI

Kang Lin, Reinhard Heckel
Proceedings of The 5th International Conference on Medical Imaging with Deep Learning, PMLR 172:774-795, 2022.

Abstract

The Vision Transformer, when trained or pre-trained on datasets consisting of millions of images, gives excellent accuracy for image classification tasks and offers computational savings relative to convolutional neural networks. Motivated by potential accuracy gains and computational savings, we study Vision Transformers for accelerated magnetic resonance image reconstruction. We show that, when trained on the fastMRI dataset, a popular dataset for accelerated MRI consisting of only thousands of images, a Vision Transformer tailored to image reconstruction yields reconstruction accuracy on par with the U-net while enjoying higher throughput and lower memory consumption. Furthermore, as Transformers are known to perform best with large-scale pre-training, but MRI data is costly to obtain, we propose a simple yet effective pre-training, which relies solely on big natural image datasets, such as ImageNet. We show that pre-training the Vision Transformer drastically improves training data efficiency for accelerated MRI, and increases robustness towards anatomy shifts. In the regime where only 100 MRI training images are available, the pre-trained Vision Transformer achieves significantly better image quality than pre-trained convolutional networks and the current state-of-the-art. Our code is available at \url{https://github.com/MLI-lab/transformers_for_imaging}.

Cite this Paper


BibTeX
@InProceedings{pmlr-v172-lin22a,
  title     = {Vision Transformers Enable Fast and Robust Accelerated {MRI}},
  author    = {Lin, Kang and Heckel, Reinhard},
  booktitle = {Proceedings of The 5th International Conference on Medical Imaging with Deep Learning},
  pages     = {774--795},
  year      = {2022},
  editor    = {Konukoglu, Ender and Menze, Bjoern and Venkataraman, Archana and Baumgartner, Christian and Dou, Qi and Albarqouni, Shadi},
  volume    = {172},
  series    = {Proceedings of Machine Learning Research},
  month     = {06--08 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v172/lin22a/lin22a.pdf},
  url       = {https://proceedings.mlr.press/v172/lin22a.html},
  abstract  = {The Vision Transformer, when trained or pre-trained on datasets consisting of millions of images, gives excellent accuracy for image classification tasks and offers computational savings relative to convolutional neural networks. Motivated by potential accuracy gains and computational savings, we study Vision Transformers for accelerated magnetic resonance image reconstruction. We show that, when trained on the fastMRI dataset, a popular dataset for accelerated MRI only consisting of thousands of images, a Vision Transformer tailored to image reconstruction yields on par reconstruction accuracy with the U-net while enjoying higher throughput and less memory consumption. Furthermore, as Transformers are known to perform best with large-scale pre-training, but MRI data is costly to obtain, we propose a simple yet effective pre-training, which solely relies on big natural image datasets, such as ImageNet. We show that pre-training the Vision Transformer drastically improves training data efficiency for accelerated MRI, and increases robustness towards anatomy shifts. In the regime where only 100 MRI training images are available, the pre-trained Vision Transformer achieves significantly better image quality than pre-trained convolutional networks and the current state-of-the-art. Our code is available at \url{https://github.com/MLI-lab/transformers_for_imaging}.}
}
Endnote
%0 Conference Paper
%T Vision Transformers Enable Fast and Robust Accelerated MRI
%A Kang Lin
%A Reinhard Heckel
%B Proceedings of The 5th International Conference on Medical Imaging with Deep Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Ender Konukoglu
%E Bjoern Menze
%E Archana Venkataraman
%E Christian Baumgartner
%E Qi Dou
%E Shadi Albarqouni
%F pmlr-v172-lin22a
%I PMLR
%P 774--795
%U https://proceedings.mlr.press/v172/lin22a.html
%V 172
%X The Vision Transformer, when trained or pre-trained on datasets consisting of millions of images, gives excellent accuracy for image classification tasks and offers computational savings relative to convolutional neural networks. Motivated by potential accuracy gains and computational savings, we study Vision Transformers for accelerated magnetic resonance image reconstruction. We show that, when trained on the fastMRI dataset, a popular dataset for accelerated MRI only consisting of thousands of images, a Vision Transformer tailored to image reconstruction yields on par reconstruction accuracy with the U-net while enjoying higher throughput and less memory consumption. Furthermore, as Transformers are known to perform best with large-scale pre-training, but MRI data is costly to obtain, we propose a simple yet effective pre-training, which solely relies on big natural image datasets, such as ImageNet. We show that pre-training the Vision Transformer drastically improves training data efficiency for accelerated MRI, and increases robustness towards anatomy shifts. In the regime where only 100 MRI training images are available, the pre-trained Vision Transformer achieves significantly better image quality than pre-trained convolutional networks and the current state-of-the-art. Our code is available at \url{https://github.com/MLI-lab/transformers_for_imaging}.
APA
Lin, K. & Heckel, R. (2022). Vision Transformers Enable Fast and Robust Accelerated MRI. Proceedings of The 5th International Conference on Medical Imaging with Deep Learning, in Proceedings of Machine Learning Research 172:774-795. Available from https://proceedings.mlr.press/v172/lin22a.html.