Deep Learning Inference Using Knockoffs

Oct. 1, 2021 | Fengzhu Sun, professor of quantitative and computational biology and mathematics


New technologies generate enormous amounts of data on genetic polymorphisms, gene expression and the microbiome. Studies using these technologies usually produce thousands to hundreds of thousands of variables (features), such as abundance levels of microbial organisms or expression levels of genes. Linking these features to public health is a highly important problem.

Machine learning, and particularly deep learning, has been successfully used to predict health outcomes. However, selecting the variables that are truly associated with health status while controlling the false discovery rate has remained an unsolved problem.

In our paper, “DeepLINK: Deep learning inference using knockoffs with applications to genomics,” published in the Proceedings of the National Academy of Sciences, we developed a method to select features from high-dimensional data using deep learning. Study authors include myself; Ph.D. student Zifan Zhu; Yingying Fan, professor of data sciences and operations, and Jinchi Lv, Kenneth King Stonier Chair in Business Administration and professor of data sciences and operations, both at USC Marshall School of Business; and Yinfei Kong, associate professor of information systems and decision sciences at California State University, Fullerton.
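For readers curious how knockoff-based selection keeps the false discovery rate in check, here is a minimal sketch of the generic "knockoff+" thresholding rule from the knockoff filter literature. It is not the DeepLINK code. It assumes each feature already has a score `w[j]` that is large and positive when the real feature looks more important than its synthetic knockoff copy; how those scores are computed (in DeepLINK, with a deep learning model) is outside this sketch. The function names and the example scores are hypothetical.

```python
def knockoff_plus_threshold(w, q):
    """Smallest t among the nonzero |w[j]| whose estimated false
    discovery proportion is at most q; infinity if none qualifies."""
    candidates = sorted({abs(x) for x in w if x != 0})
    for t in candidates:
        # Features whose knockoff beat the original by at least t
        negatives = sum(1 for x in w if x <= -t)
        # Features whose original beat the knockoff by at least t
        positives = sum(1 for x in w if x >= t)
        # The "+1" makes the estimate conservative (knockoff+ variant)
        if (1 + negatives) / max(1, positives) <= q:
            return t
    return float("inf")


def select_features(w, q):
    """Indices of features whose score reaches the knockoff+ threshold."""
    t = knockoff_plus_threshold(w, q)
    return [j for j, x in enumerate(w) if x >= t]


# Hypothetical scores for ten features, target FDR of 20 percent
w = [5.0, 4.0, 3.5, 3.0, 2.5, -0.5, 0.3, -0.2, 2.0, 0.1]
print(select_features(w, q=0.2))  # [0, 1, 2, 3, 4, 8]
```

The idea is that negative scores, where a knockoff outperforms the real feature, act as a built-in estimate of how many false positives sit among the positive scores at the same cutoff.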

Our USC Dornsife group, together with Jed Fuhrman, McCulloch-Crosby Chair in Marine Biology and professor of biological sciences; Emily Zakem, Simons Foundation Postdoctoral Fellow in marine and environmental biology; Yan Liu, professor of computer science at the USC Viterbi School of Engineering; and Dr. Lv, received a $2.5 million National Science Foundation Understanding the Rules of Life grant to study microbial interactions in marine environments.

It is also great to know that Eric Webb, associate professor of biological sciences and environmental studies, et al. received a grant from the same program. It is probably rare for two groups from the same university to do so.