Primate Monocytes - CD14, CD16 - Ziegler-Heitbrock

Predicting and interpreting protein and phosphoprotein abundance from pan-cancer and single-cell transcriptomes.

Abstract

Proteins that impact phenotype and disease are often approximated by RNA expression, which poorly infers protein abundance. We developed DeepGxP, a deep-learning model trained on The Cancer Genome Atlas pan-cancer data, to predict protein abundance from transcriptome profiles. DeepGxP outperformed conventional models, achieving median Pearson's correlation of 0.68 (n = 187) and predictive performance of 0.74 and 0.64 for proteins with high (>=0.31) and low (<0.31) self-gene/protein correlation, respectively. We also developed DeepEnrich, an integrated gradient-based interpretation framework that identifies predictor genes and enriched functions. For example, predictors of cyclin B1 and E2 are enriched in mitotic chromatid segregation and G2/M transition, respectively. In lung adenocarcinoma, we uncovered distinct EGFR/HER2 phosphorylation patterns in alveolar cells. In breast cancer, p53 protein, but not TP53 mRNA, correlated with survival. DeepGxP also accurately predicted the abundance of single-cell surface proteins, confirming cell identification. Our findings underscore DeepGxP's potential in decoding gene-to-protein relationships for cancer biomarker discovery.
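The evaluation summarized in the abstract (a median per-protein Pearson correlation, reported separately for proteins with high and low self-gene/protein correlation at the 0.31 cutoff) can be illustrated with a minimal sketch. This is not the DeepGxP code: the function names `columnwise_pearson` and `summarize_by_self_correlation`, the input layout (samples in rows, proteins in columns), and the assumption that each protein has one matched self-gene RNA column are all hypothetical choices made for illustration.

```python
import numpy as np


def columnwise_pearson(a, b):
    """Pearson correlation between matching columns of a and b.

    Both inputs are (samples x proteins) arrays; columns must not be constant.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a_c = a - a.mean(axis=0)  # center each column
    b_c = b - b.mean(axis=0)
    num = (a_c * b_c).sum(axis=0)
    den = np.sqrt((a_c ** 2).sum(axis=0) * (b_c ** 2).sum(axis=0))
    return num / den


def summarize_by_self_correlation(predicted, measured, self_rna, threshold=0.31):
    """Median prediction accuracy overall and by self-gene/protein correlation.

    predicted: model-predicted protein abundance (hypothetical input)
    measured:  measured protein abundance
    self_rna:  RNA expression of each protein's own gene (assumed 1:1 mapping)
    threshold: cutoff separating "high" (>= threshold) from "low" proteins
    """
    r_model = columnwise_pearson(predicted, measured)
    r_self = columnwise_pearson(self_rna, measured)
    high = r_self >= threshold

    def median_or_nan(values):
        # An empty group has no median; report NaN instead of raising.
        return float(np.median(values)) if values.size else float("nan")

    return {
        "median_all": median_or_nan(r_model),
        "median_high": median_or_nan(r_model[high]),
        "median_low": median_or_nan(r_model[~high]),
        "n_high": int(high.sum()),
        "n_low": int((~high).sum()),
    }
```

Called with three arrays of identical shape, the function returns the overall median correlation together with the medians and protein counts for the two groups, mirroring the three figures quoted in the abstract.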

Authors: Tsai HM, Hsiao TH, Chiu YC, Huang Y, Chuang EY, Chen Y
Journal: iScience; 2026 Mar 20; 29(3): 114815. doi:10.1016/j.isci.2026.114815
Year: 2026
PubMed: PMID 41816284