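Decoding pathological morphologies and molecular profiles with a unified multimodal embedding system
A Tsinghua team has published in Nature Methods a framework that embeds pathology slides and molecular omics data in a unified space, validated across 12 cancer types. Colleagues working in computational pathology should read it closely, but for general AI practitioners it is still far from practical deployment.
Multi-Embed is a unified and interpretable multimodal learning framework for integrating multilevel pathological morphologies with multilayer molecular profiles. On datasets covering 12 cancer types, the framework shows superior performance across several benchmark tasks, including morphology–molecule inference and integration, fine-grained tissue architecture identification and spatiotemporal trajectory modeling. Its cross-modality learning improves the understanding of disease biology and offers a new tool for studying disease pathogenesis. The code and an online interactive platform are publicly available to support precision medicine applications.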
Systematically decoding pathological morphologies and molecular profiles with unified multimodal embedding | Nature Methods
Abstract
Systematic cross-modality inference and integration of pathological morphologies and multilayer molecular profiles have advanced disease biology; however, methodological challenges remain in multimodal learning. Here, we present Multi-Embed, a unified and interpretable framework for multimodal learning between multilevel morphologies and multilayer molecular profiles. Multi-Embed achieves superior performance in morphology–molecule inference and integration, fine-grained tissue architecture identification and spatiotemporal trajectory modeling across diverse benchmark tasks, underscoring its utility for enhancing our understanding of disease pathogenesis.
Fig. 1: Methodological framework and benchmark analysis of Multi-Embed.
Fig. 2: Representative applications of multimodal integration with Multi-Embed.
Data availability
The pathology images paired with RNA-seq data across 12 cancer types used in this study can be accessed from TCGA (https://portal.gdc.cancer.gov/). The CPTAC data used in this study are available at https://portal.gdc.cancer.gov/projects/CPTAC-3/, with corresponding pathology images available at The Cancer Imaging Archive (https://www.cancerimagingarchive.net/). The SurGen (ref. 36) dataset for prognosis prediction is available at https://www.ebi.ac.uk/biostudies/bioimages/studies/S-BIAD1285/. The HER2ST (ref. 38) and TNBC (ref. 28) ST data and the ORION-CRC (ref. 39) spatial proteomics data used in this study are available at Zenodo via https://doi.org/10.5281/zenodo.3957256 (ref. 44), https://doi.org/10.5281/zenodo.14204217 (ref. 45) and https://doi.org/10.5281/zenodo.7637988 (ref. 46), respectively. The 10x Visium ST dataset of human stomach cancer used in this study is available on the Gene Expression Omnibus platform under accession code GSE287979 (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE287979) (ref. 30). The 10x Visium ST dataset of human breast cancer used in this study is available at 10x Genomics (https://www.10xgenomics.com/datasets/human-breast-cancer-block-a-section-1-1-standard-1-1-0/). The spatial multi-omics data of human tonsil are available at the 10x Genomics CytAssist platform (https://www.10xgenomics.com/datasets/gene-protein-expression-library-of-human-tonsil-cytassist-ffpe-2-standard). The 10x Xenium ST datasets of human breast cancer (https://www.10xgenomics.com/products/xenium-in-situ/preview-dataset-human-breast/) and lung cancer samples (https://www.10xgenomics.com/datasets/xenium-human-lung-cancer-post-xenium-technote/, https://www.10xgenomics.com/datasets/preview-data-ffpe-human-lung-cancer-with-xenium-multimodal-cell-segmentation-1-standard/) are available at 10x Genomics.
The 10x Visium HD datasets of human colorectal cancer (https://www.10xgenomics.com/datasets/visium-hd-cytassist-gene-expression-libraries-of-human-crc/) and lung cancer (https://www.10xgenomics.com/datasets/visium-hd-cytassist-gene-expression-human-lung-cancer-post-xenium-expt/) are available at 10x Genomics. The number of patients or samples in each dataset involved in our study is provided in Supplementary Table 1. Source data are provided with this paper.
Code availability
The code of Multi-Embed is available on GitHub (https://github.com/Epoch1128/Multi-Embed/). The online interactive platform (https://multiembed.qhdyr.net/) has been developed for users to conveniently explore Multi-Embed’s functionalities.
References
1. Lipkova, J. et al. Artificial intelligence for multimodal data integration in oncology. Cancer Cell 40, 1095–1110 (2022).
2. Boehm, K. M., Khosravi, P., Vanguri, R., Gao, J. & Shah, S. P. Harnessing multimodal data integration to advance precision oncology. Nat. Rev. Cancer 22, 114–126 (2022).
3. Steyaert, S. et al. Multimodal data fusion for cancer biomarker discovery with deep learning. Nat. Mach. Intell. 5, 351–362 (2023).
4. Boehm, K. M. et al. Multimodal data integration using machine learning improves risk stratification of high-grade serous ovarian cancer. Nat. Cancer 3, 723–733 (2022).
5. Chen, R. J. et al. Pan-cancer integrative histology-genomic analysis via multimodal deep learning. Cancer Cell 40, 865–878 (2022).
6. Chen, V. et al. Applying interpretable machine learning in computational biology—pitfalls, recommendations and opportunities for new developments. Nat. Methods 21, 1454–1461 (2024).
7. Bao, F. et al. Integrative spatial analysis of cell morphologies and transcriptional states with MUSE. Nat. Biotechnol. 40, 1200–1209 (2022).
8. Argelaguet, R. et al. MOFA+: a statistical framework for comprehensive integration of multi-modal single-cell data. Genome Biol. 21, 111 (2020).
9. Chen, W. et al. A visual–omics foundation model to bridge histopathology with spatial transcriptomics. Nat. Methods 22, 1568–1582 (2025).
10. Chen, R. J. et al. Towards a general-purpose foundation model for computational pathology. Nat. Med. 30, 850–862 (2024).
11. Schmauch, B. et al. A deep learning model to predict RNA-seq expression of tumours from whole slide images. Nat. Commun. 11, 3877 (2020).
12. Hoang, D.-T. et al. A deep-learning framework to predict cancer treatment response from histopathology images through imputed transcriptomics. Nat. Cancer 5, 1305–1317 (2024).
13. Ellis, M. J. et al. Connecting genomic alterations to cancer biology with proteomics: the NCI Clinical Proteomic Tumor Analysis Consortium. Cancer Discov. 3, 1108–1112 (2013).
14. Zhang, D. et al. Inferring super-resolution tissue architecture by integrating spatial transcriptomics with histology. Nat. Biotechnol. 42, 1372–1377 (2024).
15. Janesick, A. et al. High resolution mapping of the tumor microenvironment using integrated single-cell, spatial and in situ analysis. Nat. Commun. 14, 8353 (2023).
16. Oliveira, M. F. D. et al. High-definition spatial transcriptomic profiling of immune cell populations in colorectal cancer. Nat. Genet. 57, 1512–1523 (2025).
17. Xiang, J. et al. A vision–language foundation model for precision oncology. Nature 638, 769–778 (2025).
18. Wang, X. et al. A pathology foundation model for cancer diagnosis and prognosis prediction. Nature 634, 970–978 (2024).
19. Liu, H., Xie, X. & Wang, B. Deep learning infers clinically relevant protein levels and drug response in breast cancer from unannotated pathology images. NPJ Breast Cancer 10, 18 (2024).
20. Wu, E. et al. ROSIE: AI generation of multiplex immunofluorescence staining from histopathology images. Nat. Commun. 16, 7633 (2025).
21. Hoang, D.-T. et al. Prediction of DNA methylation-based tumor types from histopathology in central nervous system tumors with deep learning. Nat. Med. 30, 1952–1961 (2024).
22. Coleman, K. et al. Resolving tissue complexity by multimodal spatial omics modeling with MISO. Nat. Methods 22, 530–538 (2025).
23. Hu, J. et al. SpaGCN: Integrating gene expression, spatial location and histology to identify spatial domains and spatially variable genes by graph convolutional network. Nat. Methods 18, 1342–1351 (2021).
24. Long, Y. et al. Spatially informed clustering, integration, and deconvolution of spatial transcriptomics with GraphST. Nat. Commun. 14, 1155 (2023).
25. Dong, K. & Zhang, S. Deciphering spatial domains from spatially resolved transcriptomics with an adaptive graph attention auto-encoder. Nat. Commun. 13, 1739 (2022).
26. Zhao, E. et al. Spatial transcriptomics at subspot resolution with BayesSpace. Nat. Biotechnol. 39, 1375–1384 (2021).
27. Vanguri, R. S. et al. Multimodal integration of radiology, pathology and genomics for prediction of response to PD-(L)1 blockade in patients with non-small cell lung cancer. Nat. Cancer 3, 1151–1164 (2022).
28. Wang, X. et al. Spatial transcriptomics reveals substantial heterogeneity in triple-negative breast cancer with potential clinical implications. Nat. Commun. 15, 10232 (2024).
29. Teillaud, J.-L., Houel, A., Panouillot, M., Riffard, C. & Dieu-Nosjean, M.-C. Tertiary lymphoid structures in anticancer immunity. Nat. Rev. Cancer 24, 629–646 (2024).
30. Zhang, P. et al. Systematic inference of super-resolution cell spatial profiles from histology images. Nat. Commun. 16, 1838 (2025).
31. 10x Genomics. In Visium Spatial Gene Expression, Human Breast Cancer (Block A Section 1) dataset by Space Ranger 1.1.0 (2020).
32. Erickson, A. et al. Spatially resolved clonal copy number alterations in benign and malignant tissue. Nature 608, 360–367 (2022).
33. Cao, J. et al. The single-cell transcriptional landscape of mammalian organogenesis. Nature 566, 496–502 (2019).
34. Zhang, P., Wang, B. & Li, S. Network-based cancer precision prevention with artificial intelligence and multi-omics. Sci. Bull. 68, 1219–1222 (2023).
35. Zhang, P. et al. Network pharmacology: towards the artificial intelligence-based precision traditional Chinese medicine. Brief. Bioinform. 25, bbad518 (2024).
36. Myles, C., Um, I. H., Marshall, C., Harris-Birtill, D. & Harrison, D. J. SurGen: 1020 H&E-stained whole-slide images with survival and genetic markers. GigaScience 14, giaf086 (2025).
37. Zhang, P. et al. Dissecting the single-cell transcriptome network underlying gastric premalignant lesions and early gastric cancer. Cell Rep. 27, 1934–1947 (2019).
38. Andersson, A. et al. Spatial deconvolution of HER2-positive breast cancer delineates tumor-associated cell type interactions. Nat. Commun. 12, 6012 (2021).
39. Lin, J.-R. et al. High-plex immunofluorescence imaging and traditional histology of the same tissue section for discovering image-based biomarkers. Nat. Cancer 4, 1036–1052 (2023).
40. Kim, D., Kim, N. & Kwak, S. Improving cross-modal retrieval with set of diverse embeddings. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition 23422–23431 (IEEE/CVF, 2023).
41. Radford, A. et al. Learning transferable visual models from natural language supervision. In International Conference on Machine Learning 8748–8763 (PMLR, 2021).
42. Luca, B. A. et al. Atlas of clinically distinct cell states and ecosystems across human solid tumors. Cell 184, 5482–5496 (2021).
43. Zhou, Y. et al. Metascape provides a biologist-oriented resource for the analysis of systems-level datasets. Nat. Commun. 10, 1523 (2019).
44. Andersson, A. Spatial deconvolution of HER2-positive breast cancer delineates tumor-associated cell type interactions. Zenodo https://doi.org/10.5281/zenodo.4751624 (2021).
45. Venet, D. ST TNBC. Zenodo https://doi.org/10.5281/zenodo.14204217 (2024).
46. Lin, J. et al. labsyspharm/ORION-CRC. Zenodo https://doi.org/10.5281/zenodo.7637988 (2023).
Acknowledgements
We thank experienced pathologist W. Zhou for pathological diagnosis and annotations, and Z. Yuan for helpful discussion. S.L. acknowledges support from the National Natural Science Foundation of China (grant no. T2341008) and Innovation Team and Talents Cultivation Program of National Administration of Traditional Chinese Medicine (ZYYCXTD-D-202405). P.Z. acknowledges support from the Fundamental and Interdisciplinary Disciplines Breakthrough Plan of the Ministry of Education of China (JYB2025XDXM612), the National Natural Science Foundation of China (grant no. 82305047) and the National Key Research and Development Program for Young Scientists of the Ministry of Science and Technology of China (2023YFC3504700).
Author information
Author notes
- These authors contributed equally: Peng Zhang, Chaofei Gao.
Authors and Affiliations
- Institute of TCM-X/MOE Key Laboratory of Bioinformatics, Bioinformatics Division, BNRist/Department of Automation, Tsinghua University, Beijing, China
Peng Zhang, Chaofei Gao, Zhuoyu Zhang & Shao Li
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK
Kui Hua
Authors
- Peng Zhang
- Chaofei Gao
- Kui Hua
- Zhuoyu Zhang
- Shao Li
Contributions
C.G. and P.Z. developed the approach and C.G. conducted simulation experiments. C.G. and Z.Z. performed the data analysis and created the figures. P.Z. was responsible for drafting and writing of the paper. C.G., Z.Z. and K.H. contributed to the discussion of the paper and response to the reviewers' comments. P.Z. and S.L. were responsible for the overall conception of the approach and for revising the paper. S.L. was responsible for the overall design of the study. All authors read and approved the paper.
Corresponding author
Correspondence to Shao Li.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Methods thanks Xiaoyu Song and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available. Primary Handling Editor: Madhura Mukhopadhyay, in collaboration with the Nature Methods team.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Morphology-based gene expression prediction across 12 cancer types.
a. Five-fold cross-validation of morphology-related genome-wide expression profile prediction was conducted for Multi-Embed and two benchmark models (HE2RNA and DeepPT), separately for each of the 12 cancer types on TCGA data. On average, approximately 18,000 genes were selected for each tumor type. The Pearson correlation coefficients (PCCs) between predicted and original profiles of each gene were averaged over cross-validation folds, and the per-gene averaged PCCs for all genes are shown. In these violin plots, the center line denotes the median and box limits denote the upper and lower quartiles. p values were calculated using a two-sided t-test. b. The number of morphologically predictable genes (MPGs) identified by Multi-Embed across cancer types. For each cancer type, MPGs were defined as genes with a PCC over 0.5 in all five folds. c. Heatmap of representative MPGs with their morphological predictabilities (that is, PCC values) across different cancer types. d. Pathway enrichment of the MPGs shared across all 12 tumor types. p values were calculated using the hypergeometric test with Benjamini-Hochberg correction. e. Jaccard heatmap showing the overlap of MPGs across different cancer types.
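The selection rule in panels b and e can be illustrated with a minimal sketch. It assumes per-fold PCC values are already available for each gene; the gene names, the toy numbers and the helper names (`pearson`, `predictable_genes`, `jaccard`) are hypothetical and are not taken from the Multi-Embed code base.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # assumption: a constant profile is treated as uncorrelated
    return cov / math.sqrt(vx * vy)

def predictable_genes(fold_pccs, threshold=0.5):
    """Genes whose PCC exceeds the threshold in every cross-validation fold.

    fold_pccs maps a gene name to its list of per-fold PCC values.
    """
    return {g for g, pccs in fold_pccs.items() if all(p > threshold for p in pccs)}

def jaccard(a, b):
    """Jaccard index between two gene sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Toy per-fold PCCs for two hypothetical cancer types (five folds each).
toy_type_1 = {
    "GENE_A": [0.62, 0.58, 0.71, 0.66, 0.60],
    "GENE_B": [0.55, 0.48, 0.61, 0.59, 0.57],  # one fold below 0.5, so excluded
    "GENE_C": [0.81, 0.77, 0.79, 0.83, 0.80],
}
toy_type_2 = {
    "GENE_A": [0.52, 0.51, 0.56, 0.60, 0.58],
    "GENE_B": [0.70, 0.68, 0.72, 0.66, 0.71],
    "GENE_C": [0.30, 0.35, 0.28, 0.41, 0.33],
}

mpg_1 = predictable_genes(toy_type_1)
mpg_2 = predictable_genes(toy_type_2)
overlap = jaccard(mpg_1, mpg_2)
print(sorted(mpg_1), sorted(mpg_2), round(overlap, 3))
```

Requiring the threshold in every fold, rather than on the fold average, keeps a gene out of the predictable set when a single fold performs poorly, which is why `GENE_B` is dropped for the first toy cancer type.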
Extended Data Fig. 2 External validation of Multi-Embed in predicting morphology-related gene expression.
a. External validation of Multi-Embed and the two benchmark models in morphology-related gene expression prediction for seven individual cancer types from the independent CPTAC database, with training data from the TCGA database. The number of samples for each cancer type is indicated on the panel. b-d. External validation of ST gene expression predictions for highly variable genes (HVGs) on the independent 10x Visium dataset (TNBC, b, n = 785 genes), 10x Xenium dataset (c, n = 156, https://www.10xgenomics.com/datasets/preview-data-ffpe-human-lung-cancer-with-xenium-multimodal-cell-segmentation-1-standard) and 10x Visium HD dataset (d, n = 1500, https://www.10xgenomics.com/datasets/visium-hd-cytassist-gene-expression-human-lung-cancer-post-xenium-expt). In panel b, the average PCC across all ST samples in the TNBC dataset (n = 270 samples) is shown for each gene, with the models pre-trained on the independent 10x Visium dataset HER2ST. p value: two-sided t-test. In all boxplots and violin plots, the center line denotes the median and box limits denote the upper and lower quartiles.
Extended Data Fig. 3 Generalizability of Multi-Embed to multi-layer molecular profiles for cross-modality inference.
a-c. Cross-modality inference analysis in predicting morphology-related multi-layer molecular profiles, including methylation (a, DEPLOY as the benchmark model, n = 4096), protein abundance (b, WSI2RPPA as the benchmark model, n = 487) and genomic mutation (c, CHIEF and MUSK as the benchmark models, n = 7), across different cancer types in TCGA. d. Morphology-related abundance prediction for 16 proteins (n = 16) with Multi-Embed and comparison models (OmiCLIP and ROSIE (ref. 20)) on the spatial proteomics dataset ORION-CRC. n = 41 samples. S1-S41 denote sample 1 to sample 41. e-h. Synchronous inference of normalized mutation probability (f), gene expression (g) and protein abundance (h) profiles for TP53 with Multi-Embed, based on the same pathology image (e) from patient TCGA-CM-6164 in TCGA. The representative regions with relatively high putative multi-layer molecular values are marked by squares. In all boxplots and violin plots, the center line denotes the median and box limits denote the upper and lower quartiles.
Extended Data Fig. 4 Gene-to-image retrieval analysis with Multi-Embed.
The recall performance of Multi-Embed in gene-to-image retrieval across TCGA-BRCA (n = 1098 patients, a), TCGA-HNSC (n = 523 patients, b), TCGA-KIRC (n = 537 patients, c) and TCGA-STAD (n = 443 patients, d). Here, the pathology image associated with a queried gene or gene list was counted as retrieved if its morphological feature was among the top-k by cosine similarity to the corresponding molecular feature of the queried gene or gene list in the Multi-Embed space. Recall@K is reported for K = 1, 5 and 10. Error bars indicate 95% confidence intervals.
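As a rough illustration of the retrieval metric described in this caption, the sketch below ranks image embeddings by cosine similarity to each molecular query and counts how often the paired image lands in the top K. The three-dimensional vectors and the pairing-by-index convention are assumptions made for the example; the real embeddings come from the trained model.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def recall_at_k(query_embs, image_embs, k):
    """Fraction of queries whose paired image (same index) ranks in the top k."""
    hits = 0
    for i, q in enumerate(query_embs):
        ranked = sorted(
            range(len(image_embs)),
            key=lambda j: cosine(q, image_embs[j]),
            reverse=True,
        )
        if i in ranked[:k]:
            hits += 1
    return hits / len(query_embs)

# Toy shared-space embeddings: query i is paired with image i.
molecular = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
images = [[0.9, 0.1, 0.0], [0.2, 0.3, 0.9], [0.5, 0.2, 0.6]]

r1 = recall_at_k(molecular, images, k=1)
r2 = recall_at_k(molecular, images, k=2)
print(round(r1, 3), round(r2, 3))
```

In this toy case the third query ranks the second image above its own pair, so it is a miss at K = 1 and a hit at K = 2.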
Extended Data Fig. 5 Cross-modality integration analysis on spatial datasets.
The schematic framework for benchmarking cross-modality integration with Multi-Embed through unsupervised spatial clustering analysis on the HER2ST dataset (a, n = 8 samples) and TNBC dataset (b, n = 94 samples). c. Cross-modality integration performance of Multi-Embed, the comparison multimodal models (OmiCLIP, MISO, MUSE and SpaGCN) and unimodal models (GraphST, STAGATE and BayesSpace) for spatial clustering on the HER2ST dataset (n = 8). The adjusted Rand index (ARI) was used as the evaluation metric. d. Cross-modality integration performance of Multi-Embed and the comparison unimodal models for spatial clustering on the TNBC dataset (n = 94). e-f. Spatial clustering on the spatial multi-omics dataset of human tonsil (https://www.10xgenomics.com/datasets/gene-protein-expression-library-of-human-tonsil-cytassist-ffpe-2-standard), which was generated by the 10x Genomics CytAssist Visium platform and co-profiled with spatial transcriptomics and proteomics. e. F1 scores of germinal center identification based on the spatial clustering results under six different conditions: Multi-Embed with integration of morphological, transcriptomic and proteomic features (‘Multi-Embed image+RNA+protein’), Multi-Embed with integration of morphological and transcriptomic features only (‘Multi-Embed image+RNA’), Multi-Embed with integration of morphological and proteomic features only (‘Multi-Embed image+protein’), MISO with integration of morphological, transcriptomic and proteomic features (‘MISO image+RNA+protein’), MISO with integration of morphological and transcriptomic features only (‘MISO image+RNA’), and MISO with integration of morphological and proteomic features only (‘MISO image+protein’). Error bars show 95% confidence intervals (CIs), centred on the F1 score computed from all samples involved. The 95% CIs for the F1 scores were determined by bootstrapping (n = 1000). f. The germinal centers annotated by experts (left) and identified by ‘Multi-Embed image+RNA+protein’ (right, F1 score = 0.786). p value, paired t-test. In the boxplots of panels c-d, the center line denotes the median and box limits denote the upper and lower quartiles.
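For readers unfamiliar with the clustering metric used in panels c and d, the following is a minimal sketch of the standard adjusted Rand index computed from a contingency table of two label assignments. It is a textbook formulation, not code from the paper, and the toy labels are invented.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two cluster assignments of the same items."""
    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # assumption: fewer than two items counts as perfect agreement
    # Pairs that fall in the same cell of the contingency table.
    same_both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    # Pairs grouped together by each partition on its own.
    same_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    same_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = same_true * same_pred / total_pairs
    max_index = (same_true + same_pred) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both partitions put everything in one cluster
    return (same_both - expected) / (max_index - expected)

# Identical partitions with swapped label names score 1.0.
ari_same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# Partitions that split every true cluster evenly score below zero.
ari_crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
print(ari_same, ari_crossed)
```

Because the index is corrected for chance, it is invariant to how cluster labels are named and can go negative when agreement is worse than random.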
Extended Data Fig. 6 Clinical outcome prediction with Multi-Embed.
a. The significance of Multi-Embed-based multimodal prognosis models in risk stratification for the internal validation cohort during 10-fold cross-validation (n = 10), across 12 cancer types. Points in the boxplot represent the negative log-transformed p values derived with the log-rank test for individual 10-fold cross-validation rounds. The red dotted line indicates a p value of 0.05. b-g. The prognosis performance of Multi-Embed-based prognosis models on six external datasets, including the TNBC dataset for breast cancer (TNBC-BRCA, b), the SurGen dataset for colorectal cancer (SurGen-COAD, c), and the CPTAC datasets for pancreatic ductal adenocarcinoma (CPTAC-PAAD, d), head and neck squamous cell carcinoma (CPTAC-HNSC, e), lung adenocarcinoma (CPTAC-LUAD, f) and uterine corpus endometrial carcinoma (CPTAC-UCEC, g). The Kaplan-Meier curves of high- and low-risk patient subgroups stratified by Multi-Embed-based (left) and PORPOISE-based (right) multimodal prognosis models, respectively, with the thresholds predetermined on the training data. p value, log-rank test. N.S. denotes no significance. h-j. The performance (h, 5-fold cross-validation) and image-omics biomarker discovery of Multi-Embed on cyclophosphamide treatment response prediction in the TCGA-BRCA dataset (n = 164 patients). i. Representative Multi-Embed- and PORPOISE-inferred response indicators (for example, attention scores, left) alongside high-attention tiles (right) uniquely identified by Multi-Embed (n > 3 replicates). j. Corresponding Multi-Embed-inferred expression profiles associated with the identified high-attention tiles. In the boxplots of panels a and h, the center line denotes the median and box limits denote the upper and lower quartiles.
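The log-rank comparison behind the Kaplan-Meier panels can be sketched as follows. This is a generic two-group log-rank test written from the standard formula, assuming right-censored data given as (time, event) pairs; the follow-up times are invented and the function name `logrank_test` is hypothetical.

```python
import math

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p value).

    events_* hold 1 for an observed event and 0 for a censored observation.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        at_risk_a = sum(1 for ti, _, g in data if ti >= t and g == 0)
        at_risk_b = sum(1 for ti, _, g in data if ti >= t and g == 1)
        deaths_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        deaths_b = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        n = at_risk_a + at_risk_b
        d = deaths_a + deaths_b
        # Observed minus expected events in group A under the null hypothesis.
        obs_minus_exp += deaths_a - d * at_risk_a / n
        if n > 1:
            variance += at_risk_a * at_risk_b * d * (n - d) / (n * n * (n - 1))
    if variance == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square distribution with one degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Toy cohorts: a high-risk group with early events and a low-risk group with late events.
high_risk = ([1, 2, 3, 4, 5], [1, 1, 1, 1, 1])
low_risk = ([6, 7, 8, 9, 10], [1, 1, 1, 1, 1])
chi2_sep, p_sep = logrank_test(*high_risk, *low_risk)
chi2_same, p_same = logrank_test([1, 2, 3], [1, 1, 1], [1, 2, 3], [1, 1, 1])
print(round(chi2_sep, 2), p_sep < 0.05, chi2_same, p_same)
```

In the setting the caption describes, the two groups would be formed by applying a risk threshold fixed on the training data to held-out patients before the test is run.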
Extended Data Fig. 7 TLS identification through multimodal spatial clustering on the TNBC ST dataset.
a. The schematic framework of tertiary lymphoid structure (TLS) identification with Multi-Embed, where Multi-Embed was pre-trained on a single ST sample with manual TLS annotation (CN15_D2) and applied to the other 269 samples, including 14 samples with TLS annotations. b-c. Representative false negative (b, blue box) and false positive (c, red box) cases of TLS identification (coloured in orange) with the comparison methods, OmiCLIP and MISO, where both cases were correctly identified with Multi-Embed. The original annotations, together with the TLS regions identified by Multi-Embed, OmiCLIP and MISO, respectively, are shown from top to bottom. c. The new TLS region (marked by the green rectangle) identified by Multi-Embed and confirmed by pathologists. d-e. Prognostic significance of TLS-like morphological biomarkers identified by Multi-Embed. d. Kaplan-Meier curves illustrating the association between the putative ratio of TLS regions and overall survival. p value, log-rank test. e. Representative cases with high and low prognostic risk with the identified TLS area.
Extended Data Fig. 8 Malignant-related spatiotemporal trajectory inference on the published ST samples of early gastric cancer and breast cancer.
a. The UMAP plots for the spatial clustering with MISO (left), OmiCLIP (middle) and Multi-Embed (right). The dots are coloured according to the pathological lesions. EGC, early gastric cancer; CAG, chronic atrophic gastritis; LGD, low-grade dysplasia; IM, intestinal metaplasia. b-d. Clone inference and spatial clustering for the epithelial spots (excluding stromal spots) of the EGC ST sample. The clones of the epithelial spots (b) were determined by SpatialInferCNV. These spots were re-clustered with different methods (c, MISO, left; OmiCLIP, middle; Multi-Embed, right), and the barplot of ARI values for the clustering performance of these methods is shown in panel d. e. The identified spatiotemporal trajectories with two branches. f. The PCCs between the pseudo-time values along branch 2 and the CNV scores (upper) inferred by SpatialInferCNV, as well as the average expression profiles (bottom) of the previously reported EGC-related signature consisting of six marker genes (KLK10, SLC11A2, SULT2B1, KLK7, ECM1 and LMTK3), across ST spots (right). g. The pathological lesion annotations for the breast cancer sample. IDC, invasive ductal carcinoma. DCIS, ductal carcinoma in situ. h. The UMAP plots for the spatial clustering with different methods (MISO, left; OmiCLIP, middle; Multi-Embed, right), and the barplot of ARI values for the clustering performance of these methods is shown in panel i. The dots in the panel g are coloured according to the pathological lesions. j-l. Clone inference and spatial clustering for the epithelial spots (excluding stromal spots) of the breast cancer ST sample. The clones of the epithelial spots (j) were determined by SpatialInferCNV. These spots were re-clustered with different methods (k, MISO, left; OmiCLIP, middle; Multi-Embed, right), and the barplot of ARI values for the clustering performance of these methods is shown in panel l. m-n. The inferred pseudo-trajectory (m) and its spatial distribution (n) for the epithelial spots of the breast cancer sample with Multi-Embed, where three different branches were identified. o. The PCCs between the pseudo-time values along the trajectory and the CNV scores (upper) inferred by SpatialInferCNV, as well as the averaged expression profiles (bottom) of the previously reported gene signatures related to different breast cancer subtypes (10.1038/s41588-021-00911-1), across ST spots of these three different branches (Branch 1, left; Branch 2, middle; Branch 3, right). In panels d, i and l, p values were calculated by paired t-test and error bars show 95% confidence intervals (CIs), centred on the ARI value computed from all samples involved. The 95% CIs for the ARI values were determined by bootstrapping (n = 1000) (Supplementary Table 3).
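The bootstrapped confidence intervals mentioned in this caption can be approximated with a percentile bootstrap, sketched below. The caption does not say which units are resampled, so resampling individual spots with replacement is an assumption here. The demo uses a simple F1 score as the metric; any function of (true labels, predicted labels), such as an ARI implementation, could be passed instead. All names and data are hypothetical.

```python
import random

def f1_score(y_true, y_pred):
    """Binary F1 score; returns 0.0 when there are no positives at all."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def bootstrap_ci(y_true, y_pred, metric, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for metric(y_true, y_pred), resampling items."""
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper

# Toy annotations (1 = spot inside the structure of interest) and predictions.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0]
preds = [1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0]

point = f1_score(truth, preds)
ci_low, ci_high = bootstrap_ci(truth, preds, f1_score, n_boot=1000)
print(round(point, 3), round(ci_low, 3), round(ci_high, 3))
```

The point estimate is computed once on the full data and the interval comes from the resampled statistics, which matches the caption's description of error bars centred on the value from all samples.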
Supplementary information
Supplementary Information
Supplementary Notes 1–7 and Figs. 1–7.
Reporting Summary
Peer Review File
Supplementary Table 1
The pathology images and molecular profile data involved in this study.
Supplementary Table 2
Predictability of morphologically predictable genes across 12 cancer types based on the TCGA database.
Supplementary Table 3
The standard variations and CIs involved in this study.
Source data
Source Data Fig. 1
Source data for Fig. 1.
Source Data Fig. 2
Source data for Fig. 2.
Source Data Extended Data Fig. 1
Source data for Extended Data Fig. 1.
Source Data Extended Data Fig. 2
Source data for Extended Data Fig. 2.
Source Data Extended Data Fig. 3
Source data for Extended Data Fig. 3.
Source Data Extended Data Fig. 4
Source data for Extended Data Fig. 4.
Source Data Extended Data Fig. 5
Source data for Extended Data Fig. 5.
Source Data Extended Data Fig. 6
Source data for Extended Data Fig. 6.
Source Data Extended Data Fig. 7
Source data for Extended Data Fig. 7.
Source Data Extended Data Fig. 8
Source data for Extended Data Fig. 8.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhang, P., Gao, C., Hua, K. et al. Systematically decoding pathological morphologies and molecular profiles with unified multimodal embedding. Nat. Methods 23, 903–908 (2026). https://doi.org/10.1038/s41592-026-03070-5
Received: 18 March 2025
Accepted: 23 March 2026
Published: 24 April 2026
Version of record: 24 April 2026
Issue date: May 2026