• CCNA2 and PLK1 as Prognostic Biomarkers in Prostate Cancer: The Cancer Genome Atlas Transcriptomic Analysis
  • Saeid Latifi-Navid,1,* Fatemeh Hedayat,2 Seyedeh Azin Azad Abkenar,3 MohammadAli Shahmohammadi4
    1. University of Mohaghegh Ardabili
    2. University of Mohaghegh Ardabili
    3. University of Mohaghegh Ardabili
    4. University of Mohaghegh Ardabili


  • Introduction: Prostate cancer (PCa) is the second most common cancer in men after lung cancer [1, 2]. It is a major cause of death among men worldwide and poses a significant health burden. Despite advances in diagnosis and treatment, the underlying molecular mechanisms of the disease remain unclear, which makes it challenging to manage. This study identified genes associated with PCa survival that could serve as prognostic biomarkers.
  • Methods: Raw RNA-seq count data (STAR-Counts) and clinical information for prostate adenocarcinoma (PRAD) were obtained from The Cancer Genome Atlas (TCGA) data portal using the “TCGAbiolinks” package in R. Protein-coding genes were selected from the count expression matrix using the “biomaRt” package. During preprocessing, genes with zero or near-zero expression were excluded based on counts per million (CPM < 10 in 70% of samples), and the remaining expression matrix was normalized with the “edgeR” package using the TMM (trimmed mean of M-values) method. Using the “limma” package, the data were converted to the log2 scale, and differentially expressed genes (DEGs) between normal and cancer samples were identified using the criteria |log2FC| > 1 and adjusted P-value < 0.01. The Database for Annotation, Visualization and Integrated Discovery (DAVID) was then used to identify the Gene Ontology (GO) functions and KEGG signaling pathways enriched among the DEGs. Univariate Cox regression analysis was applied to identify DEGs associated with survival outcomes (significance criterion P < 0.05). A protein-protein interaction (PPI) network was then constructed with STRING and analyzed in Cytoscape, where the cytoHubba plugin was used to rank all nodes by degree. Finally, the GEPIA2 database was used to draw Kaplan-Meier curves for the hub genes.
  • Results: The count expression matrix comprised 502 PCa patients and 52 normal samples, with 60,660 genes. Removing genes with low expression values reduced the dataset to 14,565 genes, and 12,279 genes remained after normalization. In total, 1,899 DEGs (559 up-regulated and 1,340 down-regulated) were identified between PCa tissue samples and normal prostate samples. The GO results were classified into molecular function (MF), biological process (BP) and cellular component (CC) categories. In the MF category, DEGs were mainly associated with calcium ion binding (GO:0005509); in the CC category, they were enriched in the plasma membrane (GO:0005886); and in the BP category, they were enriched in cell differentiation (GO:0030154). KEGG analysis indicated that DEGs were mainly enriched in the neuroactive ligand-receptor interaction pathway. Univariate Cox regression analysis of the DEGs identified 99 genes that were significantly associated with prognosis (P<0.01 and HR>1). Of the top 10 genes ranked by degree with cytoHubba, only two were significantly associated with poor prognosis based on Kaplan-Meier curves and the log-rank test (P<0.05): cyclin A2 (CCNA2) and polo-like kinase 1 (PLK1).
  • Conclusion: This study uncovered a number of key genes and pathways in prostate cancer. The hub genes CCNA2 and PLK1 may serve as biomarkers of poor prognosis and as possible treatment targets.
  • Keywords: Prostate cancer, Biomarkers, Bioinformatics, Gene expression analysis, TCGA
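
The low-expression filter and the DEG thresholds described in the Methods can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which used edgeR and limma in R. The function names, the toy count matrix, and the reading of the CPM rule (a gene is dropped when its CPM is below 10 in at least 70% of samples) are assumptions made for illustration only, and TMM normalization and the limma model fit are omitted.

```python
def counts_per_million(counts):
    """Convert a genes x samples count matrix (list of rows) to CPM per sample."""
    n_samples = len(counts[0])
    # Library size = total counts in each sample (column sums).
    lib_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    return [[row[j] / lib_sizes[j] * 1e6 for j in range(n_samples)]
            for row in counts]


def keep_expressed(counts, cpm_cutoff=10.0, max_low_fraction=0.7):
    """Return row indices of genes that pass the low-expression filter.

    Assumed rule: drop a gene when CPM < cpm_cutoff in at least
    max_low_fraction of the samples.
    """
    kept = []
    for i, row in enumerate(counts_per_million(counts)):
        low = sum(1 for value in row if value < cpm_cutoff)
        if low / len(row) < max_low_fraction:
            kept.append(i)
    return kept


def classify_deg(log2_fc, adj_p, fc_cutoff=1.0, p_cutoff=0.01):
    """Label a gene using |log2FC| > fc_cutoff and adjusted P < p_cutoff."""
    if adj_p < p_cutoff and log2_fc > fc_cutoff:
        return "up"
    if adj_p < p_cutoff and log2_fc < -fc_cutoff:
        return "down"
    return "ns"  # not significant under these criteria


# Hypothetical toy data: 3 genes x 4 samples, each sample totals 1,000,000 reads.
toy_counts = [
    [999980, 999980, 999980, 999980],  # highly expressed gene
    [15, 15, 15, 5],                   # low in 1 of 4 samples, kept
    [5, 5, 5, 15],                     # low in 3 of 4 samples, dropped
]
kept_genes = keep_expressed(toy_counts)
```

In practice the log2 fold changes and adjusted P-values passed to `classify_deg` would come from the differential expression fit, so the sketch covers only the thresholding step.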