Multimodal Integration of Transcriptomics and Histopathology: Addressing Population Variation in AI-Based Cancer Diagnostics

  • Dr. John KL Wong

  • March 26, 2025

  • 4 min read

In the era of precision medicine, the convergence of genomics and histopathology represents a transformative opportunity for oncology diagnostics. This whitepaper explores the integration of whole transcriptome sequencing with histopathological analysis and its crucial role in developing unbiased AI models. By combining whole transcriptome sequencing data with our extensive whole slide image (WSI) archives, PAICON is developing AI-powered diagnostic tools that leverage both genetic and phenotypic diversity to help clinicians make more accurate cancer diagnoses.

Introduction

Genomics, particularly whole transcriptome sequencing, has emerged as a crucial tool for characterizing gene expression profiles in cancer diagnosis and treatment. When integrated with histopathology, whole transcriptome sequencing provides deeper insight into tumor biology by correlating molecular data with tissue morphology. At PAICON, we recognize that handling this data complexity requires a shift towards unbiased, inclusive AI. Our archive of 1.4 million whole slide images, combined with our expanding whole transcriptome sequencing database, provides the statistical power and molecular depth needed to develop AI models that capture population-level variation.

The Power of Whole Transcriptome Sequencing in Oncology

Whole transcriptome sequencing offers several key advantages in cancer diagnostics:

  • High resolution and sensitivity in capturing both abundant and rare transcripts
  • Hypothesis-free detection approach, enabling discovery of novel transcripts and fusion genes
  • Broad dynamic range for precise quantification of the expression levels of tumor biomarkers, which can be combined with histopathological image analyses. For example, clinicians can correlate molecular alterations with morphological features, enhancing diagnostic accuracy and our understanding of tumor heterogeneity.
  • A basis for training AI models that assist clinicians in decision-making, especially for features that are difficult to extract.
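
To make the idea of correlating molecular data with morphology concrete, the following is a minimal sketch rather than PAICON's actual pipeline. It computes a Spearman rank correlation between the expression of one biomarker and one image-derived feature across samples. The function names, the choice of Spearman correlation, and all sample values are assumptions made for illustration.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the window while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation computed on ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical per-sample values, invented for illustration only.
expression_tpm = [2.1, 15.4, 8.7, 30.2, 5.0, 22.9]       # biomarker expression
nuclear_area_um2 = [41.0, 55.5, 58.0, 70.1, 47.3, 66.0]  # WSI-derived feature

rho = spearman(expression_tpm, nuclear_area_um2)
print(f"Spearman rho = {rho:.3f}")
```

A rank-based measure is used here because expression values are often heavily skewed. A real analysis would also need multiple-testing correction across genes and features, which this sketch omits.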

Data Complexity and the Imperative for Unbiased AI

The integration of diverse data types, such as WSI and whole transcriptome sequencing, introduces challenges and opportunities that mirror the population stratification issues seen in genomic studies. Population stratification, defined by systematic differences in allele frequencies among subgroups, has long been a key consideration in genomics, as these differences can lead to false associations and biased conclusions if not adequately managed.
AI-driven diagnostics face a similar problem. Historical genomic research has relied on datasets that primarily represent certain ethnic groups, so models built on such data may not perform consistently across diverse populations. Key challenges include:

  • Overfitting: Models may become too tailored to homogeneous training data, similar to how population-specific variants can skew genomic studies.
  • Underrepresentation: Certain patient subgroups may be inadequately represented, potentially concealing critical, population-specific molecular signatures.
  • Perpetuating Disparities: There is a risk that AI models trained on non-representative data could reinforce existing healthcare inequalities.
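
One way to surface the challenges listed above is to report model performance per patient subgroup instead of as a single pooled number. The sketch below is an illustrative assumption, not a description of PAICON's evaluation process. The function names, the thresholds `max_gap` and `min_samples`, and the example records are all hypothetical.

```python
from collections import defaultdict


def accuracy_by_subgroup(records):
    """Return {subgroup: (accuracy, n_samples)} from (subgroup, label, prediction) rows."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for subgroup, label, prediction in records:
        total[subgroup] += 1
        correct[subgroup] += int(label == prediction)
    return {g: (correct[g] / total[g], total[g]) for g in total}


def flag_gaps(per_group, max_gap=0.10, min_samples=5):
    """Flag subgroups that are too small to assess or that lag the best subgroup."""
    assessable = [acc for acc, n in per_group.values() if n >= min_samples]
    best = max(assessable) if assessable else None
    flags = {}
    for group, (acc, n) in per_group.items():
        if n < min_samples:
            flags[group] = "underrepresented"
        elif best - acc > max_gap:
            flags[group] = "performance gap"
    return flags


# Hypothetical evaluation rows: (subgroup, true label, predicted label).
records = (
    [("A", 1, 1)] * 7 + [("A", 1, 0)]        # 8 samples, 7 correct
    + [("B", 0, 0)] * 4 + [("B", 1, 0)] * 2  # 6 samples, 4 correct
    + [("C", 1, 1)] * 2                      # only 2 samples
)

per_group = accuracy_by_subgroup(records)
flags = flag_gaps(per_group)
print(per_group)
print(flags)
```

A pooled accuracy over these rows would hide both the weaker result for subgroup B and the fact that subgroup C has too few samples to say anything reliable.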

 

PAICON’s commitment to unbiased AI drives us to incorporate a wide spectrum of genomic data, ensuring our models learn from genetically and phenotypically diverse samples. This approach is critical for addressing health disparities and developing diagnostics that work effectively for all patients.

PAICON's Innovative Approach

Our strategy leverages advanced technical solutions to ensure robust and unbiased AI development:

  • Data Harmonization: Standardizing both imaging and genomic data to ensure consistency across diverse datasets.
  • AI-Driven Quality Assurance: We use AI to continuously monitor and evaluate data quality, automatically identifying and flagging anomalies, inconsistencies, or errors that could compromise model performance.
  • Models Trained on Diverse Data: We have sourced data from 33 countries, encompassing a broad spectrum of ethnicities, genetic backgrounds, and healthcare environments. This diversity makes our training data more globally representative, which is intended to improve the accuracy and generalizability of our AI models across different populations.
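
As a simple illustration of what data harmonization can mean for expression data, the sketch below log-transforms values for a single gene and then standardizes them within each contributing site. This is a deliberately basic stand-in for dedicated batch-correction methods and is not PAICON's method. The function name, the tuple layout, and the example values are assumptions.

```python
import math
from collections import defaultdict
from statistics import mean, pstdev


def harmonize_by_site(samples):
    """Log-transform expression, then z-score it within each contributing site.

    `samples` is a list of (sample_id, site, tpm) tuples for a single gene.
    Returns {sample_id: z_score}.
    """
    by_site = defaultdict(list)
    for sample_id, site, tpm in samples:
        by_site[site].append((sample_id, math.log2(tpm + 1.0)))

    harmonized = {}
    for site, rows in by_site.items():
        values = [v for _, v in rows]
        center, spread = mean(values), pstdev(values)
        for sample_id, value in rows:
            # A site with no spread carries no within-site signal; map it to 0.
            harmonized[sample_id] = 0.0 if spread == 0 else (value - center) / spread
    return harmonized


# Hypothetical values: site Y reports systematically higher TPM than site X.
samples = [
    ("x1", "X", 3.0), ("x2", "X", 7.0), ("x3", "X", 15.0),
    ("y1", "Y", 31.0), ("y2", "Y", 63.0), ("y3", "Y", 127.0),
]

z = harmonize_by_site(samples)
print(z)
```

Within-site standardization removes a site-level offset, but it also removes any genuine biological difference between sites, so it is only appropriate when the sites are expected to sample comparable patient populations.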

Conclusion

The integration of whole transcriptome sequencing with histopathology represents a significant methodological advance in cancer diagnostics. This multimodal approach enables the correlation of transcriptional profiles with morphological features and clinical outcomes, providing deeper insights into tumor subtypes. PAICON is committed to combining our repository of whole slide images with our growing whole transcriptome sequencing database to develop robust AI models that account for both molecular and histological variation across diverse populations, with the aim of improving diagnostic accuracy.
