
24 March 2021: Database Analysis  

Genome-Scale Analysis Identified NID2, SPARC, and MFAP2 as Prognosis Markers of Overall Survival in Gastric Cancer

Zexing Shan1ACDE, Wentao Wang1B, Yilin Tong1B, Jianjun Zhang1B*

DOI: 10.12659/MSM.929558

Med Sci Monit 2021; 27:e929558



BACKGROUND: Gastric cancer is the most common gastrointestinal tumor, and the rates of recurrence and metastasis are high. Research results on molecular biomarkers used for prognosis of gastric cancer remain inconclusive. This study aimed to explore the gene expression module of gastric cancer and to determine potential prognostic biomarkers.

MATERIAL AND METHODS: Three microarray datasets (GSE13911, GSE79973, and GSE29272) from Gene Expression Omnibus (GEO), including 206 pairs of gastric tumors and adjacent normal samples, were used for analysis of differentially expressed genes (DEGs). The 3 microarray datasets yielded 144 genes associated with the progression and prognosis of gastric cancer. After this, a risk score model was developed for result validation using an independent dataset from The Cancer Genome Atlas.

RESULTS: Validation in the independent dataset showed significantly increased NID2, SPARC, and MFAP2 expression in gastric tumor tissues, which was associated with poor outcomes in gastric cancer patients. Moreover, a high risk score was associated with poor overall survival (HR: 1.787; 95% CI: 1.069-2.986; P=0.027). Subgroup analyses revealed that this prognostic value was significant in patients aged <65.0 years and in those with tumors in the antrum/distal colon, grade 3 tumors, or TNM-M0 stage cancer.

CONCLUSIONS: The findings of this study show that NID2, SPARC, and MFAP2 are upregulated in gastric tumor tissues and are significantly associated with poor overall survival. Therefore, the predictive values of the risk score model employed for the prognosis of gastric cancer could be improved by using these 3 upregulated DEGs.

Keywords: Biological Markers, Models, Genetic, Biomarkers, Tumor, Calcium-Binding Proteins, Cell Adhesion Molecules, Databases, Genetic, Disease Progression, Gastric Mucosa, Gene Expression, Gene Expression Profiling, Gene Expression Regulation, Neoplastic, Gene Regulatory Networks, Microarray Analysis, Osteonectin, Prognosis, RNA Splicing Factors, Survival Analysis


Background

Gastric cancer is the third leading cause of cancer-related deaths in the world, accounting for 8.2% of all cancer-related deaths [1,2]. Effective preventive and treatment strategies are required to improve the treatment and prognosis of gastric cancer, especially in Asian countries. The overall survival of patients with gastric cancer has improved owing to diagnosis of the disease at an early stage and the timely application of adjuvant chemotherapy [3–5]. Despite advances in multidisciplinary approaches and combination treatment regimens, the prognosis of advanced gastric cancer remains dismal. Moreover, the heterogeneity of somatic or germline changes in patients is associated with the prognosis of gastric cancer.

Earlier studies have identified the potential value of genetic and epigenetic alterations for gastric cancer prognosis. These alterations affect cell cycle regulation, cell adhesion, angiogenesis, and tumor carcinogenesis, and they have a significant prognostic role in the survival outcome of gastric cancer patients [6–9]. Moreover, investigations have evaluated the gene expression profile of gastric cancer based on DNA microarray data and explored the potential role of differentially expressed genes (DEGs) in the prognosis of gastric cancer [10–12]. However, the results of these studies are limited by their small sample sizes and the lack of validation datasets established in clinical practice. Hence, the use of the identified DEGs for prognosis of gastric cancer has been limited. Therefore, novel DEGs should be identified and their role in the overall survival of gastric cancer patients should be assessed.

The potential role of genes in the progression and prognosis of gastric cancer can be revealed through microarray analysis [13,14]. Three microarray datasets (GSE13911, GSE79973, and GSE29272) were integrated and 144 DEGs were identified. After validation of the DEGs in The Cancer Genome Atlas (TCGA), we noted that NID2, SPARC, and MFAP2 were significantly upregulated in gastric tumor tissues compared with adjacent normal tissues. Therefore, high expression of NID2, SPARC, and MFAP2 might affect the prognosis of gastric cancer (GC), and we explored the association between a risk score based on NID2, SPARC, and MFAP2 expression and overall survival (OS) in patients with GC after adjustment for potential confounders.

Material and Methods


The Gene Expression Omnibus (GEO, https://www.ncbi.nlm.nih.gov/geo/) was used to obtain the discovery and validation datasets. Three independent gastric cancer microarray datasets (GSE13911, GSE79973, and GSE29272), comprising 206 pairs of gastric tumors and adjacent normal samples, were used to identify the DEGs. These datasets were generated on the GPL570 platform (Affymetrix Human Genome U133 Plus 2.0 Array) and the GPL6947 platform (Illumina HumanHT-12 V3.0 expression beadchip). The GSE13911 and GSE79973 datasets comprised 62 and 10 pairs of matched primary gastric tumors and adjacent normal tissues, respectively. The GSE29272 dataset contained 62 pairs of cardia and 72 pairs of non-cardia gastric cancer samples. Gene expression and clinical data for gastric cancer were obtained from The Cancer Genome Atlas-Stomach Adenocarcinoma (TCGA-STAD) project. A total of 761 samples were identified, and 333 cases were included in the survival analysis after removing the normal sample data and the data from patients with insufficient follow-up.


The raw probe-level data in this study were downloaded as CEL files and processed with the robust multi-array average (RMA) algorithm from the Affy package of R software [15]. Processing consisted of background correction, quantile normalization, and summarization of the probe set values into 1 expression measure. The annotations for the probe arrays were obtained from GEO, and when multiple probe sets mapped to the same gene, the mean of the probe set values was taken as the expression value of that gene [16]. In our study, the log FC values in the datasets met the criteria for normal distribution.
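The probe-to-gene summarization step described above can be illustrated with a minimal Python sketch. It is not the authors' R code; the function name, probe identifiers, and expression values are hypothetical and serve only to show how probe sets mapped to the same gene are averaged.

```python
from collections import defaultdict
from statistics import mean


def collapse_probes_to_genes(probe_values, probe_to_gene):
    """Average the expression of all probe sets annotated to the same gene."""
    by_gene = defaultdict(list)
    for probe_id, value in probe_values.items():
        gene = probe_to_gene.get(probe_id)
        if gene is None:
            # Probe sets without a gene annotation are skipped (an assumption).
            continue
        by_gene[gene].append(value)
    return {gene: mean(values) for gene, values in by_gene.items()}


# Hypothetical probe IDs and log2 expression values for a single sample.
probe_values = {"p1": 8.0, "p2": 10.0, "p3": 6.5, "p4": 7.0}
probe_to_gene = {"p1": "SPARC", "p2": "SPARC", "p3": "NID2"}

gene_values = collapse_probes_to_genes(probe_values, probe_to_gene)
print(gene_values)
```

Here the two hypothetical SPARC probe sets are averaged into one value, while the unannotated probe set p4 is dropped.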


DEGs were identified using the LIMMA package, with Bayesian adjusted t-statistics from the linear models for microarray data [17]. Genes with |log2 fold change (FC)| >1 and P<0.05 were regarded as DEGs between tumors and normal tissues. Volcano plots and Venn diagrams, constructed using the ggplot2 and VennDiagram packages of R software, were used to visualize the identified DEGs.
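The selection criteria stated above (|log2 FC| >1 and P<0.05) can be expressed as a short filter. The following Python sketch assumes that per-gene log2 fold changes and P values have already been computed, for example by LIMMA; the gene labels and numbers are hypothetical, and whether the P value is adjusted for multiple testing is not specified in the text, so the sketch simply takes the P value as given.

```python
def is_deg(log2_fc, p_value, fc_cutoff=1.0, p_cutoff=0.05):
    """Apply the stated thresholds: |log2 FC| > 1 and P < 0.05."""
    return abs(log2_fc) > fc_cutoff and p_value < p_cutoff


def select_degs(results):
    """Split genes passing the thresholds into up- and downregulated lists.

    results maps a gene label to a (log2_fc, p_value) tuple.
    """
    up = sorted(g for g, (fc, p) in results.items() if is_deg(fc, p) and fc > 0)
    down = sorted(g for g, (fc, p) in results.items() if is_deg(fc, p) and fc < 0)
    return up, down


# Hypothetical tumor-versus-normal statistics, for illustration only.
results = {
    "GENE_A": (2.3, 0.001),   # passes both thresholds, upregulated
    "GENE_B": (-1.8, 0.010),  # passes both thresholds, downregulated
    "GENE_C": (0.6, 0.0001),  # fold change too small
    "GENE_D": (1.5, 0.20),    # not significant
}

upregulated, downregulated = select_degs(results)
print(upregulated, downregulated)
```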

GO and KEGG pathway functional enrichment analyses of the 144 common DEGs were conducted using the online Database for Annotation, Visualization, and Integrated Discovery (DAVID, https://david.ncifcrf.gov/). All P values were 2-sided, and P<0.05 was considered to indicate statistically significant enrichment.

SPSS software (version 22.0, SPSS, Chicago, IL, USA) was used for statistical analysis. The risk score model was based on the expression of genes that could be validated in the TCGA database. The model was constructed in TCGA-STAD, and the risk score of each individual patient was calculated. The risk score was then categorized as high or low, with the median of the risk score as the cutoff value. The baseline characteristics of the groups were compared using Kruskal-Wallis and chi-square tests, depending on the type of data. Propensity score analysis was used to adjust for imbalance in the baseline characteristics and to avoid undue influence of confounding factors; it was performed using the MatchIt package of R software, and the standardized mean difference for matching variables was required to be <20% between the groups. Kaplan-Meier analysis and log-rank tests were employed for survival analysis. Subgroup analyses were also performed according to age, race, anatomic tumor site, grade, TNM-T, TNM-N, TNM-M, and stage. P<0.05 was considered to indicate statistical significance.
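For readers unfamiliar with the Kaplan-Meier method named above, the product-limit estimate can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the SPSS procedure used in the study; the function name and the follow-up times and event indicators are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct death time.

    times:  follow-up durations (any consistent unit)
    events: 1 if death was observed at that time, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        n_at_risk -= removed
    return curve


# Hypothetical follow-up data for five patients.
times = [5, 8, 8, 12, 15]
events = [1, 1, 0, 1, 0]

km_curve = kaplan_meier(times, events)
for t, s in km_curve:
    print(t, round(s, 3))
```

In the study, one such curve would be estimated for the high risk score group and one for the low risk score group, and the two would be compared with the log-rank test.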



Results

GSE13911, GSE79973, and GSE29272 were employed as the discovery datasets to identify DEGs between gastric tumors and their adjacent normal tissues. These 3 datasets included 206 pairs of gastric tumors and adjacent normal samples. The DEGs were explored to evaluate the association between altered gene expression and gastric cancer progression. The details regarding the expression data from primary gastric tumors and adjacent normal samples are shown in Figure 1. A total of 144 DEGs were shared by all 3 datasets; these genes were potentially associated with the progression and prognosis of gastric cancer (Figures 2, 3). Detailed information on the 144 DEGs is presented in Table 1.
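The intersection step behind the Venn diagram can be illustrated as follows. This Python sketch assumes that a DEG list has already been derived for each dataset; the gene labels are placeholders and do not reproduce the 144 genes in Table 1.

```python
def common_degs(*deg_lists):
    """Return the genes that appear in every per-dataset DEG list."""
    return set.intersection(*(set(genes) for genes in deg_lists))


# Placeholder DEG lists, one per discovery dataset (illustrative only).
degs_gse13911 = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
degs_gse79973 = ["GENE_A", "GENE_C", "GENE_E"]
degs_gse29272 = ["GENE_A", "GENE_C", "GENE_D", "GENE_F"]

shared = common_degs(degs_gse13911, degs_gse79973, degs_gse29272)
print(sorted(shared))
```

Only genes called differentially expressed in all 3 datasets are retained, which is the criterion that yielded the common DEG set in this study.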


GO and KEGG pathway enrichment analyses were performed to investigate the biological roles of DEGs in gastric cancer progression, including cell cycle and cell adhesion. The enriched GO terms were mainly associated with the extracellular matrix of the cellular component, and the KEGG pathway analysis results showed that the most highly enriched pathway was ECM-receptor interaction. The results of the GO and KEGG pathway enrichment analyses are summarized and displayed in Tables 2 and 3.


TCGA-STAD included 333 GC patients, who served as a validation cohort to verify the expression of the DEGs. The results indicated that NID2, SPARC, and MFAP2 were the 3 top-ranked upregulated genes for the risk of GC. Further, we developed a risk score model described by the following formula: risk score = 0.005974532 × ExpNID2 + 0.004623909 × ExpSPARC + 0.054586198 × ExpMFAP2. Patients were categorized into high and low risk score groups based on the median value of the risk scores.
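The risk score formula and the median split can be written out as a short Python sketch. The coefficients are those reported above; the patient identifiers and expression values are hypothetical, the expression unit is not specified in the text, and assigning patients at the median to the low group is an assumption (it is consistent with the reported group sizes of 166 and 167 but is not stated explicitly).

```python
from statistics import median

# Coefficients taken from the risk score formula reported in the text.
COEFFICIENTS = {
    "NID2": 0.005974532,
    "SPARC": 0.004623909,
    "MFAP2": 0.054586198,
}


def risk_score(expression):
    """Weighted sum of NID2, SPARC, and MFAP2 expression for one patient."""
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


def split_by_median(scores):
    """Label each patient 'high' if the score exceeds the median, else 'low'."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical expression values for three patients (illustration only).
patients = {
    "P1": {"NID2": 100.0, "SPARC": 1000.0, "MFAP2": 10.0},
    "P2": {"NID2": 200.0, "SPARC": 2000.0, "MFAP2": 50.0},
    "P3": {"NID2": 50.0, "SPARC": 500.0, "MFAP2": 5.0},
}

scores = {pid: risk_score(expr) for pid, expr in patients.items()}
groups = split_by_median(scores)
print(scores)
print(groups)
```

Because the MFAP2 coefficient is roughly 10 times larger than the other two, a given change in MFAP2 expression shifts the score more than the same change in NID2 or SPARC.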


The baseline characteristics of the high (n=166) and low (n=167) risk score groups are presented in Table 4. Significant differences were observed between groups in terms of race and tumor stage, whereas no significant differences were found for age, sex, anatomic tumor site, tumor grade, TNM-T, TNM-N, and TNM-M. Overall, a high risk score was significantly associated with poor overall survival (HR: 2.041; 95% CI: 1.272–3.274; P=0.003; Figure 4). Significant associations were observed mainly in the following patients: those aged <65.0 years, with a tumor in the antrum/distal colon, with a grade 3 tumor, irrespective of the TNM-T stage, with TNM-N2-3, with TNM-M0, and with stage III and IV disease (Table 2). After propensity score analysis, higher risk scores remained associated with poorer overall survival (HR: 1.787; 95% CI: 1.069–2.986; P=0.027; Figure 5). Subgroup analysis showed that high risk scores were associated with poor overall survival in patients aged <65.0 years and in those with a tumor in the antrum/distal colon, a grade 3 tumor, or TNM-M0 stage gastric cancer (Table 5).


Discussion

In this study, we investigated gene expression modules at the genome-wide scale in gastric cancer by integrating multiple gastric cancer transcriptome microarray datasets. Our findings provide information on alterations at the molecular level, and integrating several datasets gave more robust results than a single dataset would. We screened 144 DEGs between gastric tumors and adjacent normal samples and found that NID2, SPARC, and MFAP2 were the 3 top-ranked upregulated genes. Next, a risk score model based on these 3 DEGs was constructed (risk score = 0.005974532 × ExpNID2 + 0.004623909 × ExpSPARC + 0.054586198 × ExpMFAP2); a high risk score was significantly associated with poor overall survival in patients with GC, based on data from the TCGA database. Furthermore, using propensity score analysis, we observed these associations mainly in patients younger than 65.0 years, with a tumor in the antrum/distal colon, with a grade 3 tumor, or with TNM-M0 stage GC.

The results of this study indicated that the cell cycle, cell adhesion, and the extracellular matrix are involved in GC; these processes were found in patients with upregulated NID2, SPARC, and MFAP2. Cell adhesion dysfunction is significantly associated with gastric cancer metastasis and may reflect the activation of multiple signaling pathways in the malignancy [18]. Moreover, a common characteristic of gastric cancer is a dense stroma with large quantities of extracellular matrix in the surrounding area [19,20]. The gene annotation analysis results support our findings on the enriched extracellular matrix cellular component and the ECM-receptor interaction pathway.

We noted that the expression of NID2, SPARC, and MFAP2 in gastric tumors was upregulated compared with adjacent normal tissue samples. The role of abnormal NID2 methylation in cancer prognosis at various sites has been highlighted in previous research [21–25]. NID2, a member of the nidogen protein family, has been reported to maintain the stability and integrity of the basement membrane. Moreover, the involvement of SPARC in the prognosis of gastric cancer has been confirmed in many studies [26–28]. The study conducted by Liao et al analyzed 4 microarray datasets and found that SPARC was significantly upregulated in gastric tissues, which was associated with poor prognosis [26]. Evidence has shown that cell adhesion, proliferation, migration, and tissue remodeling are regulated by SPARC during cell development and extracellular matrix turnover [29,30]. Recently, MFAP2 was found to modulate tropoelastin deposition into microfibrils, which participates in the formation of elastic fibers [31]. Moreover, it was considered a co-expressed gene of the NF-κB/Snail/YY1/RKIP circuitry, which was upregulated in tumor tissues; the extent of this upregulation was specific evidence of lymph node metastasis [32,33]. In the present study, we constructed a risk score model for predicting overall survival of gastric cancer patients, which showed that a high risk score was associated with poor overall survival. Moreover, stratified analyses of patients’ characteristics also confirmed our findings.

Several limitations to this study should be acknowledged: (1) The results should be interpreted with caution because the data were collected from different platforms; (2) Only bioinformatics analysis was applied, and the findings should be verified in further research to clarify the mechanisms of the association between these genes and poor GC survival; (3) The range of the analyses was limited due to variations in the characteristics of the patients; and (4) The association of the expression of the 3 studied genes with other survival outcomes in patients with GC, including progression-free survival, should be further explored.


Conclusions

In conclusion, the findings of this study suggest that the upregulation of NID2, SPARC, and MFAP2 is strongly associated with poor overall survival in patients with gastric cancer. Moreover, the prognostic value of the risk score for overall survival of gastric cancer patients is affected by age, the anatomic tumor site, tumor grade, and TNM-M. Further research should be conducted in laboratory settings to explore the underlying molecular mechanisms and to translate these research findings into the development of novel targeted-treatment strategies.


References

1. Ferlay J, Soerjomataram I, Ervik M: GLOBOCAN 2012 v1.0, Cancer Incidence and Mortality Worldwide: IARC CancerBase No. 11, Lyon, France, International Agency for Research on Cancer http://globocan.iarc.fr

2. Bray F, Ferlay J, Soerjomataram I, Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries: CA Cancer J Clin, 2018; 68(6); 394-424

3. Information Committee of Korean Gastric Cancer Association, Korean Gastric Cancer Association Nationwide Survey on Gastric Cancer in 2014: J Gastric Cancer, 2016; 16(3); 131-40

4. Bang YJ, Kim YW, Yang HK, Adjuvant capecitabine and oxaliplatin for gastric cancer after D2 gastrectomy (CLASSIC): A phase 3 open-label, randomised controlled trial: Lancet, 2012; 379(9813); 315-21

5. Sasako M, Sakuramoto S, Katai H, Five-year outcomes of a randomized phase III trial comparing adjuvant chemotherapy with S-1 versus surgery alone in stage II or III gastric cancer: J Clin Oncol, 2011; 29(33); 4387-93

6. Akama Y, Yasui W, Yokozaki H, Frequent amplification of the cyclin E gene in human gastric carcinomas: Jpn J Cancer Res, 1995; 86(7); 617-21

7. Graziano F, Mandolesi A, Ruzzo A, Predictive and prognostic role of E-cadherin protein expression in patients with advanced gastric carcinomas treated with palliative chemotherapy: Tumour Biol, 2004; 25(3); 106-10

8. Tanigawa N, Amaya H, Matsumura M, Correlation between expression of vascular endothelial growth factor and tumor vascularity, and patient outcome in human gastric carcinoma: J Clin Oncol, 1997; 15(2); 826-32

9. Sanz-Ortega J, Steinberg SM, Moro E, Comparative study of tumor angiogenesis and immunohistochemistry for p53, c-ErbB2, c-myc and EGFr as prognostic factors in gastric cancer: Histol Histopathol, 2000; 15(2); 455-62

10. Cho JY, Lim JY, Cheong JH, Gene expression signature-based prognostic risk score in gastric cancer: Clin Cancer Res, 2011; 17(7); 1850-57

11. Chen CN, Lin JJ, Chen JJ, Gene expression profile predicts patient survival of gastric cancer after surgical resection: J Clin Oncol, 2005; 23(29); 7286-95

12. Wang X, Liu Y, Niu Z, Prognostic value of a 25-gene assay in patients with gastric cancer after curative resection: Sci Rep, 2017; 7(1); 7515

13. Chang W, Ma L, Lin L, Identification of novel hub genes associated with liver metastasis of gastric cancer: Int J Cancer, 2009; 125(12); 2844-53

14. Zhu T, Gao YF, Chen YX, Genome-scale analysis identifies GJB2 and ERO1LB as prognosis markers in patients with pancreatic cancer: Oncotarget, 2017; 8(13); 21281-89

15. Irizarry RA, Hobbs B, Collin F, Exploration, normalization, and summaries of high density oligonucleotide array probe level data: Biostatistics, 2003; 4(2); 249-64

16. Li W, Li K, Zhao L, Bioinformatics analysis reveals disturbance mechanism of MAPK signaling pathway and cell cycle in Glioblastoma multiforme: Gene, 2014; 547(2); 346-50

17. Diboun I, Wernisch L, Orengo CA, Microarray analysis after RNA amplification can detect pronounced differences in gene expression using limma: BMC Genomics, 2006; 7; 252

18. Liu JP, Liu D, Gu JF, Shikonin inhibits the cell viability, adhesion, invasion and migration of the human gastric cancer cell line MGC-803 via the Toll-like receptor 2/nuclear factor-kappa B pathway: J Pharm Pharmacol, 2015; 67(8); 1143-55

19. Gan L, Meng J, Xu M, Extracellular matrix protein 1 promotes cell metastasis and glucose metabolism by inducing integrin beta4/FAK/SOX2/HIF-1alpha signaling pathway in gastric cancer: Oncogene, 2018; 37(6); 744-55

20. Wu Q, Li X, Yang H, Extracellular matrix protein 1 is correlated to carcinogenesis and lymphatic metastasis of human gastric cancer: World J Surg Oncol, 2014; 12; 132

21. Wang J, Zhao Y, Xu H, Silencing NID2 by DNA hypermethylation promotes lung cancer: Pathol Oncol Res, 2020; 26(2); 801-11

22. Wu Q, Zhang B, Wang Z, Integrated bioinformatics analysis reveals novel key biomarkers and potential candidate small molecule drugs in gastric cancer: Pathol Res Pract, 2019; 215(5); 1038-48

23. van der Heijden AG, Mengual L, Ingelmo-Torres M, Urine cell-based DNA methylation classifier for monitoring bladder cancer: Clin Epigenetics, 2018; 10; 71

24. Chai AWY, Cheung AKL, Dai W, Elevated levels of serum nidogen-2 in esophageal squamous cell carcinoma: Cancer Biomark, 2018; 21(3); 583-90

25. Torky HA, Sherif A, Abo-Louz A, Evaluation of serum Nidogen-2 as a screening and diagnostic tool for ovarian cancer: Gynecol Obstet Invest, 2018; 83(5); 461-65

26. Liao P, Li W, Liu R, Genome-scale analysis identifies SERPINE1 and SPARC as diagnostic and prognostic biomarkers in gastric cancer: Onco Targets Ther, 2018; 11; 6969-80

27. Li Z, Li AD, Xu L, SPARC expression in gastric cancer predicts poor prognosis: Results from a clinical cohort, pooled analysis and GSEA assay: Oncotarget, 2016; 7(43); 70211-22

28. Wang Z, Hao B, Yang Y, Prognostic role of SPARC expression in gastric cancer: A meta-analysis: Arch Med Sci, 2014; 10(5); 863-69

29. Funk SE, Sage EH, The Ca2(+)-binding glycoprotein SPARC modulates cell cycle progression in bovine aortic endothelial cells: Proc Natl Acad Sci USA, 1991; 88(7); 2648-52

30. Lane TF, Sage EH, The biology of SPARC, a protein that modulates cell-matrix interactions: FASEB J, 1994; 8(2); 163-73

31. Mecham RP, Gibson MA, The microfibril-associated glycoproteins (MAGPs) and the microfibrillar niche: Matrix Biol, 2015; 47; 13-33

32. Zaravinos A, Kanellou P, Lambrou GI, Gene set enrichment analysis of the NF-kappaB/Snail/YY1/RKIP circuitry in multiple myeloma: Tumour Biol, 2014; 35(5); 4987-5005

33. Silveira NJ, Varuzza L, Machado-Lima A, Searching for molecular markers in head and neck squamous cell carcinomas (HNSCC) by statistical and bioinformatic analysis of larynx-derived SAGE libraries: BMC Med Genomics, 2008; 1; 56

