Prediction of breast cancer prognosis using gene set statistics provides signature stability and biological context

Gad Abraham, Adam Kowalczyk, Sherene Loi, Izhak Haviv, Justin Zobel

Research output: Contribution to journal › Article › peer-review

71 Scopus citations

Abstract

Background: Different microarray studies have compiled gene lists for predicting outcomes of a range of treatments and diseases. These have produced gene lists that have little overlap, indicating that the results from any one study are unstable. It has been suggested that the underlying pathways are essentially identical, and that the expression of gene sets, rather than that of individual genes, may be more informative with respect to prognosis and understanding of the underlying biological process.

Results: We sought to examine the stability of prognostic signatures based on gene sets rather than individual genes. We classified breast cancer cases from five microarray studies according to the risk of metastasis, using features derived from predefined gene sets. The expression levels of genes in the sets are aggregated, using what we call a set statistic. The resulting prognostic gene sets were as predictive as the lists of individual genes, but displayed more consistent rankings via bootstrap replications within datasets, produced more stable classifiers across different datasets, and are potentially more interpretable in the biological context since they examine gene expression in the context of their neighbouring genes in the pathway. In addition, we performed this analysis in each breast cancer molecular subtype, based on ER/HER2 status. The prognostic gene sets found in each subtype were consistent with the biology based on previous analysis of individual genes.

Conclusions: To date, most analyses of gene expression data have focused at the level of the individual genes. We show that a complementary approach of examining the data using predefined gene sets can reduce the noise and could provide increased insight into the underlying biological pathways.
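The aggregation step described in the Results (collapsing the expression levels of the genes in each predefined set into a single "set statistic" per sample) can be illustrated with a minimal sketch. The abstract does not say which statistics were used, so the mean and median below are assumptions chosen for illustration; the function name `set_statistic`, the gene names, and the set names are likewise hypothetical and not taken from the paper.

```python
from statistics import mean, median


def set_statistic(expression, gene_sets, stat=mean):
    """Collapse a gene-level table into a gene-set-level table.

    expression: dict mapping gene name -> list of values, one per sample
    gene_sets:  dict mapping set name -> iterable of gene names
    stat:       function reducing a list of numbers to one number
                (mean/median are illustrative assumptions, not the paper's choice)
    """
    n_samples = len(next(iter(expression.values())))
    features = {}
    for set_name, genes in gene_sets.items():
        # Only genes that were actually measured can contribute.
        present = [g for g in genes if g in expression]
        if not present:
            continue  # skip sets with no measured genes
        # One aggregated value per sample for this gene set.
        features[set_name] = [
            stat([expression[g][j] for g in present])
            for j in range(n_samples)
        ]
    return features


# Toy data: three samples, made-up gene and set names.
expression = {
    "GENE_A": [1.0, 2.0, 3.0],
    "GENE_B": [3.0, 2.0, 1.0],
    "GENE_C": [0.0, 4.0, 8.0],
}
gene_sets = {
    "SET_1": ["GENE_A", "GENE_B"],
    "SET_2": ["GENE_B", "GENE_C", "GENE_MISSING"],
}

set_features = set_statistic(expression, gene_sets, stat=mean)
median_features = set_statistic(expression, gene_sets, stat=median)
```

The resulting per-set values would then replace individual genes as classifier inputs. Passing the statistic as a function keeps the sketch agnostic about which aggregation is appropriate.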

Original language: English
Article number: 277
Journal: BMC Bioinformatics
Volume: 11
State: Published - 25 May 2010
Externally published: Yes

Bibliographical note

Funding Information:
This work was supported by the Australian Research Council, and by the NICTA Victorian Research Laboratory. NICTA is funded by the Australian Government as represented by the Department of Broadband, Communications and the Digital Economy and the Australian Research Council through the ICT Center of Excellence program. SL is supported by the National Health and Medical Research Council of Australia (NHMRC) and the European Society for Medical Oncology.

Funders
NICTA Victorian Research Laboratory
Australian Research Council
National Health and Medical Research Council
Department of Broadband, Communications and the Digital Economy, Australian Government
European Society for Medical Oncology
