The digitalisation of histopathology slides is opening new opportunities in pathology, particularly through the integration of artificial intelligence into routine histopathological diagnosis [1]. However, deep learning (DL) algorithms are susceptible to biases caused by unintended distortions or alterations in tissue images, commonly referred to as imaging artefacts [2-4]. Understanding and mitigating these artefacts is essential for reliable diagnoses. In this blog post, we discuss the causes of frequently occurring artefacts in whole slide images (WSIs), their impact on DL algorithms and various approaches to mitigate them.
1. Blurriness: WSIs are often completely or partially blurry because the scanner loses focus or because of motion during scanning. Focus points are sometimes not evenly distributed across the tissue surface, and if a focus point lands on debris on the coverslip, the scanner can focus on the wrong imaging plane. The resulting blur can obscure cellular details such as cell boundaries and nuclear features, hindering DL algorithms from accurately identifying pathological features in tissues.
Figure 1: Blurriness in WSIs. Black arrow indicates blurry regions, and red arrow indicates focussed regions in the image.
2. Tissue folds: During the tissue flotation step of slide preparation, tissue sections may fold over themselves and create overlapping regions on the glass slide. DL algorithms can misinterpret these overlapping regions as abnormal structures, which can lead to false positives or false negatives, especially if the folds cover a significant area of the WSI.
Figure 2: Tissue folds in WSIs. Black arrow indicates folded regions, and red arrow indicates flat regions in the image.
3. Bubbles: Air can get trapped when the coverslip is mounted over the tissue, leading to air bubbles, which appear as clear round voids masking the underlying tissue. If the slides are not dehydrated adequately before mounting, residual water forms droplets on the slides. These air bubbles and water droplets can be misidentified as abnormal tissue structures and also hide the actual cellular details of the tissue.
Figure 3: Bubbles in WSIs. Black arrow indicates regions with bubbles, and red arrow indicates clear regions in the image.
4. Stain variability: Stain variability arises from inconsistencies in staining protocols and reagent concentrations. The resulting colour variations, such as differences in intensity and hue, can confuse colour-based feature extraction algorithms, leading to inaccurate segmentation and classification.
5. Debris and contaminants: Foreign material on the slides, such as dust, hair or microorganisms, can occlude tumour features and be misinterpreted as cellular components, introducing noise that reduces the accuracy of feature detection.
Figure 4: Dust and contaminants in WSIs. Black arrow indicates regions with dust and contaminants in the image.
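Of the artefacts listed above, blur is the easiest to screen for with a classical heuristic. The sketch below scores each tile of a greyscale image by the variance of its Laplacian: sharp tissue has many strong edges and a high score, while out-of-focus regions score low. This is a simple stand-in for illustration, not one of the DL-based detectors cited in this post. The function names, the tile size and the threshold are assumptions, and a real threshold would have to be tuned per scanner and stain.

```python
import numpy as np


def laplacian_variance(gray: np.ndarray) -> float:
    """Variance of a 4-neighbour Laplacian; low values suggest blur."""
    g = gray.astype(np.float64)
    # Sum of the four neighbours minus four times the centre pixel.
    lap = (
        g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
        - 4.0 * g[1:-1, 1:-1]
    )
    return float(lap.var())


def flag_blurry_tiles(gray: np.ndarray, tile: int = 64, threshold: float = 50.0):
    """Return (row, col) indices of tiles whose sharpness score is below threshold.

    `tile` and `threshold` are illustrative values, not recommendations.
    """
    flagged = []
    h, w = gray.shape
    for r in range(0, h - tile + 1, tile):
        for c in range(0, w - tile + 1, tile):
            if laplacian_variance(gray[r:r + tile, c:c + tile]) < threshold:
                flagged.append((r // tile, c // tile))
    return flagged


# Toy image: noisy (edge-rich) left half, featureless right half.
rng = np.random.default_rng(0)
demo = np.full((128, 128), 128.0)
demo[:, :64] = rng.integers(0, 256, size=(128, 64))
print(flag_blurry_tiles(demo))  # the two flat right-hand tiles: [(0, 1), (1, 1)]
```

A score like this cannot tell blur apart from genuinely featureless background, so in practice it would be applied only to tiles already known to contain tissue.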
1. Preprocessing strategies: Applying image enhancement methods, such as stain normalisation, thresholding and contrast adjustment, can improve the clarity of histopathological images, facilitating better feature extraction by DL models [5-6].
2. Artefact detection models: Developing models trained to identify and exclude artefacts can prevent corrupted regions from influencing analysis [7-9]. For instance, a mixture of experts (MoE) scheme has been proposed for detecting artefacts such as damaged tissue, blur, folded tissue, air bubbles and histologically irrelevant blood in whole slide images [10]. The MoE approach combines multiple deep learning networks and vision transformers, each trained to identify one artefact, followed by a gating mechanism for the final predictions.
3. Data augmentation: Enhancing training datasets with synthetic images can improve model robustness [11-12]. Since manual annotation of artefacts is expensive and time-consuming, augmenting WSIs with synthetic artefacts can help generate training datasets for artefact detection. Techniques like Sharp-GAN have been developed to generate realistic histopathology images with clear nuclei contours, aiding in training models that are resilient to variations in image quality [13].
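The idea of augmenting images with artefacts can be sketched with a deliberately simple stand-in: blur a random patch of a clean tile and record where the blur was applied, which gives an image together with a pixel-level label and needs no manual annotation. This is not Sharp-GAN or any of the cited methods. The function names, patch size and kernel size are illustrative assumptions, and a mean filter is only a crude approximation of real out-of-focus blur.

```python
import numpy as np


def box_blur(img: np.ndarray, k: int = 5) -> np.ndarray:
    """Blur an H x W x C image with a k x k mean filter (k odd, edge-padded)."""
    pad = k // 2
    padded = np.pad(
        img.astype(np.float64), ((pad, pad), (pad, pad), (0, 0)), mode="edge"
    )
    out = np.zeros(img.shape, dtype=np.float64)
    # Accumulate the k*k shifted copies, then divide to get the local mean.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)


def add_synthetic_blur(img: np.ndarray, rng: np.random.Generator,
                       size: int = 32, k: int = 5):
    """Blur one random square patch; return the augmented image and its mask."""
    h, w = img.shape[:2]
    top = int(rng.integers(0, h - size + 1))
    left = int(rng.integers(0, w - size + 1))
    augmented = img.astype(np.float64)  # astype returns a copy
    patch = augmented[top:top + size, left:left + size]
    augmented[top:top + size, left:left + size] = box_blur(patch, k)
    mask = np.zeros((h, w), dtype=bool)
    mask[top:top + size, left:left + size] = True
    # Casting back to an integer dtype truncates the blurred values.
    return augmented.astype(img.dtype), mask


# Toy usage on a random RGB tile standing in for a clean WSI tile.
rng = np.random.default_rng(0)
tile = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
blurred_tile, blur_mask = add_synthetic_blur(tile, rng, size=16)
print(blurred_tile.shape, int(blur_mask.sum()))
```

The returned mask can serve as the training target for an artefact detection model, and the same pattern extends to other synthetic artefacts such as overlaid bubbles or darkened fold-like bands.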
By understanding and addressing these artefacts through rigorous quality controls, we make our histopathological image analysis pipeline at PAICON artefact-aware, which makes our AI-based diagnosis more reliable, unbiased and accurate. Curious about our image data harmonisation pipelines? Contact us today on our website www.paicon.com.