- C. Wang, H. Zhang, Q. Li, K. Shang, Y. Lyu, B. Dong, and S. Kevin Zhou, “Generalizable limited-angle CT reconstruction via sinogram extrapolation,” International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), Strasbourg, France, September 2021.
Abstract: Computed tomography (CT) reconstruction from X-ray projections acquired within a limited angle range is challenging, especially when the angle range is extremely small. Both analytical and iterative models need more projections for effective modeling. Deep learning methods have gained prevalence due to their excellent reconstruction performance, but such success is mainly limited to the same dataset and does not generalize across datasets with different distributions. Hereby we propose ExtraPolationNetwork for limited-angle CT reconstruction via the introduction of a sinogram extrapolation module, which is theoretically justified. The module complements extra sinogram information and boosts model generalizability. Extensive experimental results show that our reconstruction model achieves state-of-the-art performance on the NIH-AAPM dataset, similar to existing approaches. More importantly, we show that using such a sinogram extrapolation module significantly improves the generalization capability of the model on unseen datasets (e.g., the COVID-19 and LIDC datasets) when compared to existing approaches.
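
The extrapolation idea can be pictured with a minimal sketch: a small network fills in the missing angular band of the measured sinogram before reconstruction. This is an illustration under our own assumptions (layer choices, shapes, and the crude row-repeat initialization are ours), not the paper's ExtraPolationNetwork.

```python
# Minimal sketch of sinogram extrapolation (assumed PyTorch setting).
import torch
import torch.nn as nn

class SinogramExtrapolator(nn.Module):
    """Pads a limited-angle sinogram with predicted rows for the missing angles."""
    def __init__(self, extra_angles=30):
        super().__init__()
        self.extra_angles = extra_angles
        self.refine = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1),
        )

    def forward(self, sino):
        # sino: (B, 1, angles, detectors); crudely initialize the missing band
        # by repeating the last measured row, then refine the full sinogram.
        pad = sino[:, :, -1:, :].expand(-1, -1, self.extra_angles, -1)
        full = torch.cat([sino, pad], dim=2)
        return full + self.refine(full)

sino = torch.randn(1, 1, 90, 128)          # 90 measured angles, 128 detector bins
extended = SinogramExtrapolator(30)(sino)  # -> (1, 1, 120, 128)
```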

- X. Liu, J. Wang, F. Liu, and S. Kevin Zhou, “Universal Undersampled MRI Reconstruction,” International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), Strasbourg, France, September 2021.
Abstract: Deep neural networks have been extensively studied for undersampled MRI reconstruction. While achieving state-of-the-art performance, they are trained and deployed specifically for one anatomy with limited generalization ability to another anatomy. Rather than building multiple models, a universal model that reconstructs images across different anatomies is highly desirable for efficient deployment and better generalization. Simply mixing images from multiple anatomies for training a single network does not lead to an ideal universal model due to the statistical shift among datasets of various anatomies, the need to retrain from scratch on all datasets with the addition of a new dataset, and the difficulty in dealing with imbalanced sampling when the new dataset is also smaller in size. In this paper, for the first time, we propose a framework to learn a universal deep neural network for undersampled MRI reconstruction. Specifically, anatomy-specific instance normalization is proposed to compensate for statistical shift and allow easy generalization to new datasets. Moreover, the universal model is trained by distilling knowledge from available independent models to further exploit representations across anatomies. Experimental results show the proposed universal model can reconstruct both brain and knee images with high image quality. Also, it is easy to adapt the trained model to new datasets of smaller size, i.e., abdomen, cardiac, and prostate, with little effort and superior performance.
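
The anatomy-specific instance normalization named above can be sketched minimally: shared convolutional weights live elsewhere, while each anatomy routes through its own normalization statistics. Module and variable names below are illustrative assumptions, not the authors' code.

```python
# Minimal sketch of anatomy-specific instance normalization (assumed PyTorch).
import torch
import torch.nn as nn

class AnatomySpecificIN(nn.Module):
    """One InstanceNorm2d per anatomy; convolutional weights are shared outside."""
    def __init__(self, num_features, anatomies=("brain", "knee")):
        super().__init__()
        self.norms = nn.ModuleDict({
            a: nn.InstanceNorm2d(num_features, affine=True) for a in anatomies
        })

    def forward(self, x, anatomy):
        # Route a (single-anatomy) batch through its own normalization to
        # compensate for the statistical shift between anatomies.
        return self.norms[anatomy](x)

feats = torch.randn(4, 32, 64, 64)
out = AnatomySpecificIN(32)(feats, "knee")
```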

- J. Wei, Y. Hu, R. Zhang, Z. Li, S. Kevin Zhou, and S. Cui, “Shallow attention network for polyp segmentation,” International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), Strasbourg, France, September 2021.
Abstract: Accurate polyp segmentation is of great importance for colorectal cancer diagnosis. However, even with a powerful deep neural network, there still exist three big challenges that impede the development of polyp segmentation. (i) Samples collected under different conditions show inconsistent colors, causing a distribution gap and an overfitting issue. (ii) Due to repeated feature downsampling, small polyps are easily degraded. (iii) Foreground and background pixels are imbalanced, leading to biased training. To address the above issues, we propose the shallow attention network (SANet) for polyp segmentation. Specifically, to eliminate the effects of color, we design a color exchange operation to decouple image contents and colors, forcing the model to focus more on the target shape and structure. Furthermore, to enhance the segmentation quality of small polyps, we propose the shallow attention module to filter out the background noise of shallow features. Thanks to the high resolution of shallow features, small polyps can be preserved correctly. In addition, to ease the severe pixel imbalance for small polyps, we propose a probability correction strategy (PCS) during the inference phase. Note that even though PCS is not involved in the training phase, it can still work well on a biased model and consistently improve the segmentation performance. We verify the effectiveness of SANet on five datasets.
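
One plausible reading of a color exchange operation is simple per-channel statistics matching: keep one image's content but re-color it with another image's channel means and standard deviations. The sketch below (Reinhard-style matching in RGB) is our assumption; the paper's exact operation may differ.

```python
# Hedged sketch of a color-exchange style augmentation (NumPy).
import numpy as np

def color_exchange(content, reference):
    """content, reference: float RGB arrays in [0, 1] of shape (H, W, 3)."""
    out = np.empty_like(content)
    for c in range(3):
        mu_c, std_c = content[..., c].mean(), content[..., c].std() + 1e-6
        mu_r, std_r = reference[..., c].mean(), reference[..., c].std() + 1e-6
        # Normalize the content channel, then re-color it with reference stats.
        out[..., c] = (content[..., c] - mu_c) / std_c * std_r + mu_r
    return np.clip(out, 0.0, 1.0)
```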

- H. Zhu, Q. Yao, L. Xiao, and S. Kevin Zhou, “You only learn once: Universal anatomical landmark detection,” International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), Strasbourg, France, September 2021.
Abstract: Detecting anatomical landmarks in medical images plays an essential role in understanding the anatomy and planning automated processing. In recent years, a variety of deep neural network methods have been developed to detect landmarks automatically. However, all of those methods are unary in the sense that a highly specialized network is trained for a single task, say, one associated with a particular anatomical region. In this work, for the first time, we investigate the idea of “You Only Learn Once” (YOLO) and develop a universal anatomical landmark detection model that realizes multiple landmark detection tasks with end-to-end training based on mixed datasets. The model consists of a local network and a global network: the local network is built upon the idea of the universal U-Net to learn multi-domain local features, and the global network is a parallelly duplicated sequence of dilated convolutions that extracts global features to further disambiguate the landmark locations. It is worth mentioning that the new model design requires many fewer parameters to train than models with standard convolutions. We evaluate our YOLO model on three X-ray datasets of 1,588 images of the head, hand, and chest, collectively contributing 62 landmarks. The experimental results show that our proposed universal model performs largely better than any previous model trained on multiple datasets. It even beats the performance of models trained separately on every single dataset.

- Y. Lyu, J. Fu, C. Peng, and S. Kevin Zhou, “U-DuDoNet: Unpaired dual-domain network for CT metal artifact reduction,” International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), Strasbourg, France, September 2021.
Abstract: Recently, both supervised and unsupervised deep learning methods have been widely applied to the CT metal artifact reduction (MAR) task. Supervised methods such as Dual Domain Network (DuDoNet) work well on simulation data; however, their performance on clinical data is limited due to the domain gap. Unsupervised methods are more generalizable but do not eliminate artifacts completely through processing in the image domain alone. To combine the advantages of both types of MAR methods, we propose an unpaired dual-domain network (U-DuDoNet) trained using unpaired data. Unlike the artifact disentanglement network (ADN), which utilizes multiple encoders and decoders for disentangling content from artifact, our U-DuDoNet directly models the artifact generation process through additions in both the sinogram and image domains, which is theoretically justified by an additive property associated with metal artifacts. Our design includes a self-learned sinogram prior net, which provides guidance for restoring the information in the sinogram domain, and cyclic constraints for artifact reduction and addition on unpaired data. Extensive experiments on simulation data and clinical images demonstrate that our novel framework outperforms the state-of-the-art unpaired approaches.
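
The additive property the abstract appeals to can be stated compactly (notation ours): filtered back-projection (FBP) is linear, so an additive split of the metal-affected sinogram carries over to the image domain.

```latex
S_{ma} = S_{clean} + S_{art}, \qquad
X_{ma} = \mathrm{FBP}(S_{ma})
       = \mathrm{FBP}(S_{clean}) + \mathrm{FBP}(S_{art})
       = X_{clean} + X_{art}
```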

- H. Li, L. Chen, H. Han, and S. Kevin Zhou, “Conditional training with bounding map for universal lesion detection,” International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), Strasbourg, France, September 2021.
Abstract: Universal Lesion Detection (ULD) in computed tomography plays an essential role in computer-aided diagnosis. Promising ULD results have been reported by coarse-to-fine two-stage detection approaches, but such two-stage ULD methods still suffer from issues like the imbalance of positive vs. negative anchors during object proposal and insufficient supervision during localization regression and classification of region-of-interest (RoI) proposals. While leveraging pseudo segmentation masks such as a bounding map (BM) can reduce the above issues to some degree, it is still an open problem to effectively handle the diverse lesion shapes and sizes in ULD. In this paper we propose a BM-based conditional training for two-stage ULD, which can (i) reduce the positive vs. negative anchor imbalance via a BM-based conditioning (BMC) mechanism for anchor sampling instead of the traditional IoU-based rule; and (ii) adaptively compute a size-adaptive BM (ABM) from the lesion bounding box, which is used for improving lesion localization accuracy via ABM-supervised segmentation. Experiments with four state-of-the-art methods show that the proposed approach can bring an almost free detection accuracy improvement without requiring expensive lesion mask annotations.
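
A bounding map, in the spirit described above, can be sketched as a soft mask that peaks at the lesion box center and decays toward the box border, usable for weighting anchor sampling or as a segmentation pseudo-label. The functional form below is an illustrative assumption, not the paper's size-adaptive ABM.

```python
# Hedged sketch of a bounding map computed from a lesion bounding box (NumPy).
import numpy as np

def bounding_map(h, w, box):
    """box = (x0, y0, x1, y1); returns an (h, w) map in [0, 1]."""
    x0, y0, x1, y1 = box
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float32)
    cy, cx = (y0 + y1) / 2.0, (x0 + x1) / 2.0
    # Normalized distance to the box center along each axis.
    dy = np.abs(ys - cy) / max((y1 - y0) / 2.0, 1.0)
    dx = np.abs(xs - cx) / max((x1 - x0) / 2.0, 1.0)
    bm = np.clip(1.0 - np.maximum(dy, dx), 0.0, 1.0)
    inside = (ys >= y0) & (ys <= y1) & (xs >= x0) & (xs <= x1)
    return bm * inside  # 1 at the center, fading to 0 at the box border
```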

- M. Guan, Y. Lyu, W. Cao, X. Wu, J. Lu, and S. Kevin Zhou, “Perceptual quality assessment of chest radiograph,” International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), Strasbourg, France, September 2021.
Abstract: The quality of a chest X-ray image, or radiograph, which is widely used in clinics, is a very important factor affecting doctors’ clinical decision making. Since there is no chest X-ray image quality database so far, we conduct the first study of perceptual quality assessment of chest X-ray images by introducing a Chest X-ray Image Quality Database, which contains 2,160 chest X-ray images obtained from 60 reference images. To simulate the real noise of X-ray images, we add different levels of Gaussian noise and Poisson noise, the two types most commonly found in X-ray images. Mean opinion scores (MOS) have been collected by performing user experiments with 74 subjects (25 professional doctors and 49 non-doctors). The availability of MOS allows us to design more effective image quality metrics. We use the database to train a blind image quality assessment model based on deep neural networks, which attains better performance than conventional approaches in terms of the Spearman rank-order correlation coefficient and the Pearson linear correlation coefficient. We plan to open-source the entire study to benefit the whole chest X-ray imaging community.
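
The distortion simulation described above reduces to adding Gaussian and signal-dependent Poisson noise at chosen levels; a minimal sketch follows, with parameter values as illustrative assumptions.

```python
# Minimal sketch of Gaussian and Poisson degradations for a clean radiograph.
import numpy as np

rng = np.random.default_rng(0)

def add_gaussian_noise(img, sigma=0.02):
    """img: float array in [0, 1]."""
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)

def add_poisson_noise(img, photons=1000.0):
    # Scale to expected photon counts, sample counts, and rescale back;
    # lower photon counts give stronger signal-dependent noise.
    return np.clip(rng.poisson(img * photons) / photons, 0.0, 1.0)
```
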
- P. Cheng, S. Kevin Zhou, and R. Chellappa, “DA-VSR: Domain adaptable volumetric super-resolution for medical images,” International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), Strasbourg, France, September 2021.
Abstract: Medical image super-resolution (SR) is an active research area with many potential applications, including reducing scan time, improving visual understanding, and increasing robustness in downstream tasks. However, applying deep-learning-based SR approaches in clinical applications often encounters issues of domain inconsistency, as the test data may be acquired by different machines or on different organs. In this work, we present a novel algorithm called domain adaptable volumetric super-resolution (DA-VSR) to better bridge the domain inconsistency gap. DA-VSR uses a unified feature extraction backbone and a series of network heads to improve image quality over different planes. Furthermore, DA-VSR leverages the in-plane and through-plane resolution differences on the test data to achieve self-learned domain adaptation. As such, DA-VSR combines the advantages of a strong feature generator learned through supervised training and the ability to tune to the idiosyncrasies of the test volumes through unsupervised learning. Through experiments, we demonstrate that DA-VSR significantly improves super-resolution quality across numerous datasets of different domains, thereby taking a further step toward real clinical applications.

- Q. Yao, Z. He, Y. Lin, K. Ma, Y. Zheng, and S. Kevin Zhou, “Hierarchical feature constraint to camouflage medical adversarial attacks,” International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), Strasbourg, France, September 2021.
Abstract: Deep neural networks for medical images are extremely vulnerable to adversarial examples (AEs), which poses security concerns on clinical decision-making. Recent findings have shown that existing medical AEs are easy to detect in feature space. To better understand this phenomenon, we thoroughly investigate the characteristics of traditional medical AEs in feature space. Specifically, we first perform a stress test to reveal the vulnerability of medical images and compare them to natural images. Then, we theoretically prove that the existing adversarial attacks manipulate the prediction by continuously optimizing the vulnerable representations in a fixed direction, leading to outlier representations in feature space. Interestingly, we find this vulnerability is a double-edged sword that can be exploited to help hide AEs in the feature space. We propose a novel hierarchical feature constraint (HFC) as an add-on to existing white-box attacks, which encourages hiding the adversarial representation in the normal feature distribution. We evaluate the proposed method on two public medical image datasets, namely Fundoscopy and Chest X-Ray. Experimental results demonstrate the superiority of our HFC as it bypasses an array of state-of-the-art adversarial medical AE detectors more efficiently than competing adaptive attacks.
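
The hierarchical feature constraint can be pictured as an extra term in the attack objective that pulls each layer's adversarial features toward the normal feature distribution; modeling that distribution as a per-layer Gaussian (a simplifying assumption of ours, not necessarily the paper's choice) gives the sketch below.

```python
# Hedged sketch of an HFC-style penalty (PyTorch): sum of per-layer
# Mahalanobis distances between adversarial features and clean-feature stats.
import torch

def hfc_penalty(feats, mus, inv_covs):
    """feats: per-layer feature vectors; mus/inv_covs: clean-feature stats."""
    loss = torch.zeros(())
    for f, mu, ic in zip(feats, mus, inv_covs):
        d = (f - mu).unsqueeze(0)               # (1, D)
        loss = loss + (d @ ic @ d.T).squeeze()  # Mahalanobis distance
    return loss  # added (with a weight) to the attack loss being minimized
```
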
- Q. Yao, Q. Quan, L. Xiao, and S. Kevin Zhou, “One-shot medical landmark detection,” International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), Strasbourg, France, September 2021.
Abstract: The success of deep learning methods relies on the availability of a large number of datasets with annotations; however, curating such datasets is burdensome, especially for medical images. To relieve such a burden for a landmark detection task, we explore the feasibility of using only a single annotated image and propose a novel framework named Cascade Comparing to Detect (CC2D) for one-shot landmark detection. CC2D consists of two stages: 1) self-supervised learning (CC2D-SSL) and 2) training with pseudo-labels (CC2D-TPL). CC2D-SSL captures the consistent anatomical information in a coarse-to-fine fashion by comparing the cascade feature representations and generates predictions on the training set. CC2D-TPL further improves the performance by training a new landmark detector with those predictions. The effectiveness of CC2D is evaluated on a widely-used public dataset of cephalometric landmark detection, on which it achieves a competitive detection accuracy of 81.01% within 4.0 mm, comparable to the state-of-the-art fully-supervised methods using far more than one training image.
