Clinical applications of multimodal AI in OC pathology
MMAI has significantly advanced the clinical management of OC, offering enhanced capabilities in diagnosis, treatment response prediction, prognostic stratification and molecular feature inference. By integrating diverse data types, such as histopathological images, genomic and proteomic profiles, radiological imaging and clinical variables, MMAI provides a more comprehensive and accurate toolset for addressing the heterogeneity and complexity of OC. Table 3 summarises the MMAI approaches that have been applied in pathological research on OC. This section highlights key clinical applications supported by empirical evidence from recent studies.
Table 3. Summary of the application of multimodal AI in pathological research of ovarian cancer
Tumour diagnosis and subtyping
Accurate distinction between benign and malignant ovarian tumours and precise classification of histological subtypes are crucial for determining appropriate treatment strategies. Traditional diagnostic methods, which rely heavily on histopathological examination and serum biomarkers, often suffer from inter-observer variability and limited specificity. MMAI approaches have markedly improved diagnostic accuracy by combining morphological, molecular and clinical data.
For benign-malignant differentiation, Vijayarajan et al integrated 49 clinical-imaging features, achieving 99.47% accuracy on 349 patients.41 The model captured complementary information from the two sources: clinical markers provided quantitative risk assessment, while imaging identified morphological abnormalities such as irregular tumour borders and nuclear pleomorphism.
In addressing rare and challenging subtypes such as peritoneal serous papillary carcinoma (PSPC), which is frequently misclassified as epithelial ovarian cancer (EOC), Wang et al used DL models (Improved_InceptionV3_MS and Improved_MIL_RNN) trained on H&E WSIs and mismatch repair (MMR) protein IHC. Their models achieved an accuracy of 97.2% in distinguishing PSPC from EOC, with nearly perfect sensitivity and specificity using MSH2 and MSH6 markers.22 This high performance is clinically significant given the distinct surgical and management strategies required for PSPC. The differential diagnosis between LGSOC and HGSOC, and of other rare subtypes, remains similarly challenging. MMAI combined with multi-omics data is expected to substantially improve the accurate diagnosis of such rare subtypes, but relevant studies remain few, making this an important direction for future research.
For pathological subtype classification, several MMAI frameworks have been developed to categorise OC into its major subtypes (eg, HGSOC, clear cell, endometrioid, mucinous). Udeda et al introduced a pipeline combining NASNet-A-Large-based tile-level pattern recognition with decision-tree aggregation to classify HGSOC into four subtypes: mesenchymal transition (MT), immune reactive (IR), papilloglandular (PG) and solid proliferative (SP). The model demonstrated high accuracy (mean 0.910–0.933 across cohorts) and identified the MT subtype as an independent prognostic factor for poor OS.34 In another study, Klein et al used matrix-assisted laser desorption/ionization (MALDI) imaging mass spectrometry to extract proteomic features from tissue microarrays and trained a convolutional neural network (CNN) to discriminate between five histotypes, achieving an overall accuracy of 85%, surpassing conventional histopathological assessment.36 As an intrinsically multimodal technology, MALDI imaging mass spectrometry (MALDI-IMS) captures molecular spectra while preserving the spatial structure of tissue, enabling in situ fusion of morphology and proteomics and providing a new perspective for histological typing.
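The tile-to-slide step in such pipelines can be illustrated with a minimal sketch. It assumes that each tumour tile has already been assigned one of the four pattern labels by an upstream classifier, and it uses a simple majority rule as a stand-in for the slide-level aggregation; the decision-tree rules of the cited study are not reproduced here, and the function names are hypothetical.

```python
from collections import Counter

# The four HGSOC patterns named in the text.
SUBTYPES = ("MT", "IR", "PG", "SP")

def tile_proportions(tile_labels):
    """Fraction of recognised tumour tiles assigned to each pattern."""
    counts = Counter(tile_labels)
    total = sum(counts[s] for s in SUBTYPES)  # ignore any other labels
    if total == 0:
        raise ValueError("no tiles with a recognised pattern")
    return {s: counts[s] / total for s in SUBTYPES}

def slide_label(tile_labels):
    """Majority-pattern stand-in for slide-level aggregation (assumption).

    A fitted decision tree would take the proportions as input features;
    here the most frequent pattern is returned instead.
    """
    props = tile_proportions(tile_labels)
    return max(SUBTYPES, key=lambda s: props[s])
```

In practice, the proportion vector returned by `tile_proportions` is the kind of slide-level feature that a decision tree or another classifier could be trained on.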
Beyond accuracy, the clinical adoption of MMAI for diagnosis relies on interpretability. Attention maps and feature importance analyses (eg, SHAP) help pathologists understand why a tumour is classified as malignant or a specific subtype, fostering trust and facilitating integration into diagnostic workflows.57 58
Treatment response prediction
Predicting response to platinum-based chemotherapy and PARP inhibitors (PARPis) is essential. MMAI improves on traditional biomarkers by integrating multimodal data capturing morphological and molecular determinants of sensitivity.
In platinum response prediction, Kilim et al combined H&E WSIs and proteomic data using pathway-guided attention (SurvPath), achieving AUCs up to 0.835.26 Ahn et al designed PathoRiCH, which significantly stratified patients by PFI.28 For HRD status, the DeepHRD model based on H&E images was slightly less accurate (AUC 0.81) than the standard genomic HRD assay (AUC >0.9), but it reduced cost by approximately 80% and shortened turnaround time from weeks to minutes. This makes it an attractive prescreening tool for optimising the allocation of healthcare resources by narrowing the pool of patients who require expensive genomic testing.32 Similarly, the model of Nero et al showed moderate performance (AUC ~0.7) but provided critical information as a supplementary diagnostic tool when tissue samples were insufficient or sequencing failed.33
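The prescreening argument can be made concrete with a short calculation. The sketch below is illustrative only: the function name is hypothetical, and the prevalence, sensitivity and specificity passed to it are assumed inputs, not values reported by the cited studies. It estimates what fraction of a cohort would be referred for genomic testing after an image-based screen and what fraction of true positives would be missed.

```python
def prescreen_summary(n_patients, prevalence, sensitivity, specificity):
    """Expected outcome of using an image-based model as a first-line screen.

    All inputs are assumptions supplied by the caller:
    prevalence  : fraction of patients who are truly HRD-positive
    sensitivity : fraction of true positives flagged by the image model
    specificity : fraction of true negatives cleared by the image model
    """
    positives = n_patients * prevalence
    negatives = n_patients - positives
    if positives <= 0:
        raise ValueError("prevalence must be greater than zero")
    true_pos = positives * sensitivity
    false_pos = negatives * (1.0 - specificity)
    referred = true_pos + false_pos  # patients sent on to genomic testing
    return {
        "referred_fraction": referred / n_patients,
        "missed_fraction_of_positives": (positives - true_pos) / positives,
        "ppv": true_pos / referred if referred > 0 else float("nan"),
    }
```

With assumed inputs of 1000 patients, 50% prevalence, 90% sensitivity and 60% specificity, 65% of patients would be referred and 10% of true positives would be missed, which shows the trade-off between saved assays and missed cases that any prescreening threshold has to balance.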
For predicting PARPi efficacy, Xiong et al built a LightGBM model incorporating clinical (eg, BRCA status, PARPi type), pathological (IHC markers: Ki67, p53) and biochemical data (eg, CA19-9, total bile acids). The model achieved AUCs of 0.79 (primary) and 0.72 (recurrent) in internal validation, with favourable generalisability in external cohorts. SHAP analysis identified BRCA/HRD status and bile acids as top contributors, aligning with known clinical predictors.40 Wang et al also demonstrated that deep learning models trained on H&E and MMR IHC could predict bevacizumab efficacy with high accuracy (100% mean sensitivity/specificity with MutS homolog 2 (MSH2)), suggesting potential applicability to combination therapies.22
Prognostic stratification
MMAI enhances prognostic stratification by integrating diverse data modalities to predict OS and progression-free survival (PFS).
In OS prediction, Bi et al developed FoMu, a foundation model-driven approach integrating clinical, MRI and pathology data.38 Its C-index reached 0.836 in external validation cohort A but dropped to 0.78–0.82 in cohorts B and C, where pathological modalities were unavailable. This highlights the model's strong dependence on the completeness of input modalities and the risk that missing data will degrade generalisation performance in real-world clinical settings. Similarly, Yang et al developed the ovarian cancer digital pathology index (OCDPI) using graph deep learning, with HRs of 1.916–2.796.59 Notably, transcriptome analysis in this study further revealed biological pathways (eg, angiogenesis, epithelial-mesenchymal transition) enriched in the high-score regions of OCDPI, linking the morphological phenotype to underlying molecular mechanisms and providing a basis for biological interpretation of the model. In external validation, the AUC of the model decreased from 0.93 in internal validation to 0.70 in The Cancer Genome Atlas dataset, suggesting overfitting to the source domain due to differences in slide preparation, scanners and staining protocols. Other approaches combining WSIs and RNA-seq data identified high-risk pathways, providing mechanistic insights.25
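Because the C-index is the main metric compared across these survival models, a minimal sketch of Harrell's concordance index for right-censored data is given here for orientation. It is a simplified reference implementation, not the evaluation code of any cited study: pairs with tied follow-up times are skipped, and the function and argument names are assumptions.

```python
from itertools import combinations

def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    times  : observed follow-up times
    events : 1 if the event (eg, death) was observed, 0 if censored
    risks  : model risk scores (higher = worse predicted prognosis)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter time is an observed event.
        # Tied times are skipped here for simplicity (assumption).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # earlier event had the higher risk score
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ranking, which gives context for the reported range of 0.78 to 0.836.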
For PFS prediction, Wu et al used an attention-based deep survival network leveraging H&E image features and clinical data to stratify patients into risk groups with significantly different PFS (log-rank p=0.00845). The model also revealed associations between risk scores and drug sensitivity, suggesting potential therapeutic alternatives.35 Desbois et al integrated digital pathology (CD8+ T cell density) and transcriptomics to classify tumour immune microenvironments into ‘infiltrated,’ ‘excluded,’ and ‘desert’ phenotypes, finding that ‘excluded’ tumours had the worst PFS due to stromal activation and impaired antigen presentation.21
The choice of fusion strategy (early, late, hybrid) significantly impacts prognostic performance. In general, early and hybrid fusion tend to achieve higher C-indices by modelling inter-modal interactions, as seen in the FoMu model.38 Late fusion, while simpler, may suffice when modalities provide independent predictive signals or when computational constraints are a priority.
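The difference between early and late fusion can be sketched in a few lines. The example below is a toy illustration under stated assumptions: a linear scorer with a sigmoid stands in for the learned models, all weights are hypothetical, and none of this reflects the architecture of FoMu or any other cited model. A linear scorer cannot itself learn cross-modal interactions; in early fusion that capacity comes from training a nonlinear joint model on the concatenated vector.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def linear_score(features, weights, bias=0.0):
    return sum(f * w for f, w in zip(features, weights)) + bias

def early_fusion(modalities, weights, bias=0.0):
    """Concatenate all modality features, then apply one joint model."""
    fused = [x for feats in modalities for x in feats]
    return sigmoid(linear_score(fused, weights, bias))

def late_fusion(modalities, per_modality_weights):
    """Score each modality separately, then average the predictions.

    A modality given as None is treated as missing and skipped, so the
    remaining modalities still yield a prediction.
    """
    probs = [
        sigmoid(linear_score(feats, w))
        for feats, w in zip(modalities, per_modality_weights)
        if feats is not None
    ]
    if not probs:
        raise ValueError("at least one modality is required")
    return sum(probs) / len(probs)
```

The sketch also shows one practical trade-off: the late-fusion function degrades gracefully when a modality is absent, whereas the early-fusion function expects a complete concatenated vector and would need imputation or retraining to handle a missing input.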
Molecular feature prediction
MMAI models predict molecular alterations directly from H&E slides, reducing reliance on costly assays.
For BRCA mutation prediction, Zeng et al integrated H&E WSIs with multi-omics, achieving AUCs of 0.952 and 0.912 for BRCA1/2.60 Nero et al used CLAM-based image-only models with modest performance but low-cost screening potential.33 Despite their limited performance, such models are valuable as pre-screening tools that identify, from large cohorts, BRCA wild-type patients who are likely to carry HRD-related phenotypes, thereby optimising the use of limited genomic testing resources and reflecting the concept of ‘image-first’ hierarchical diagnosis in multimodal strategies. This provides further evidence that image-based models can be used as complementary tools for genetic testing, especially in resource-limited settings. For microsatellite instability (MSI) status, Wang et al trained deep learning models on H&E and MMR IHC data, achieving accuracies up to 96% in external validation.22 Multi-omics approaches have also been applied to molecular subtyping, with one study reporting AUCs >0.91 for all four HGSOC subtypes and demonstrating improved prognostic stratification compared with single-omics models.60
Dynamic treatment monitoring and combination therapy optimisation
Beyond static predictions at baseline, emerging MMAI applications are leveraging longitudinal data to monitor therapeutic response dynamically and guide adaptive treatment strategies. For instance, Xiong et al explored the use of multimodal data (clinical, pathological, biochemical) to predict PARPi efficacy over time, hinting at the potential for dynamic risk stratification.40 Integrating serial imaging (eg, CT or MRI scans during neoadjuvant chemotherapy) with repeated biopsies or circulating tumour DNA (ctDNA) analysis could enable real-time prediction of emerging platinum resistance, allowing for timely adjustment of treatment regimens. Furthermore, MMAI models can be extended to optimise combination therapies, such as identifying patients most likely to benefit from PARPi plus anti-angiogenic agents based on baseline and on-treatment histomolecular features. Future models incorporating time-series data will be crucial for realising the full potential of precision oncology in OC.