Multiple Testing for Pattern Identification, with Applications to Microarray Time-Course Experiments

Computational Biology Colloquium

In time-course experiments, it is often desirable to identify genes that exhibit a specific pattern of differential expression over time, and thereby gain insight into the mechanisms of the underlying biological processes. The pattern identification problem raises two challenging issues: (i) how to combine simultaneous inferences across multiple time points and (ii) how to control the multiplicity of Type I errors while accounting for strong dependence. We formulate a compound decision-theoretic framework for set-wise multiple testing and propose a data-driven procedure that aims to minimize the missed set rate (MSR) subject to a constraint on the false set rate (FSR). The hidden Markov model (HMM) proposed in Yuan and Kendziorski (2006) is generalized to capture the temporal correlation in the gene expression data. Both theoretical and numerical results are presented to show that our data-driven procedure controls the multiplicity, provides an optimal way of combining simultaneous inferences across multiple time points, and greatly improves on conventional combined p-value methods. We demonstrate the method in an application to a study of systemic inflammation in humans, where it is used to detect early and late response genes.

Thursday, September 5, 2013
2:00 PM to 3:30 PM
University Park Campus
Ray R. Irani Hall (RRI), Room 101
213.740.5557
  • Britta Bothe
  • University of Southern California
  • Taper Hall 353
  • 3501 Trousdale Parkway #255
  • Los Angeles, CA 90089-4353