OpenAccess: Closed


Research issues and strategies for genomic and proteomic biomarker discovery and validation: a statistical perspective

Ziding Feng, Ross L. Prentice, Sudir Srivastava

Overfitting

Biomarker discovery

Pooling

2004

The development and validation of clinically useful biomarkers from high-dimensional genomic and proteomic information pose great research challenges. Present bottlenecks include the fact that few of the biomarkers showing promise in initial discovery were found to warrant subsequent validation, and that biomarker validation is expensive and time-consuming. Biomarker evaluation should proceed in an orderly fashion to enhance rigor and efficiency. A molecular profiling approach, although promising, has a high chance of yielding biased results and overfitted models. Specimens from cohorts or intervention trials are essential to eliminate biases. The high cost of biomarker validation motivates some novel study design features, including sequential filtering and DNA pooling. For data analysis, logistic regression (in particular, boosting logistic regression) is robust against model misspecification and resistant to model overfitting. Model assessment and cross-validation are critical components of data analysis. Having an independent test set is a vital feature of study design.


“Research issues and strategies for genomic and proteomic biomarker discovery and validation: a statistical perspective” is a paper by Ziding Feng, Ross L. Prentice, and Sudir Srivastava, published in 2004. It has an Open Access status of “closed”.