About
I am an assistant professor in the Department of Biostatistics at the University of Pittsburgh. I received my Ph.D. in Biostatistics from the University of Michigan. Before that, I earned my B.A. in Mathematics and M.S. in Statistics from the University of Virginia. I was once an undergraduate student at Sun Yat-sen University.
My research lies at the intersection of biostatistics and machine learning, with the broad goal of advancing health data science. I am particularly interested in developing statistical methods for integrative data analysis, which combines data sets from multiple sources or knowledge of different types to achieve higher precision and power. With this in mind, my current research program focuses on methods that support regression, prediction, and decision making based on large-scale (distributed) data sets. I also develop data processing tools for analyzing high-dimensional data. Most of my work is inspired by and closely related to applications in bioinformatics, clinical trials, electronic health records, environmental health sciences, fairness and disparity, and health policy.
- Data integration and meta-analysis
- Causal inference and precision medicine
- Longitudinal data analysis
- Subgroup analysis
- High-dimensional data analysis
Major grant support
PI (2023-2025) NIH R21DA055672 Federated learning methods for heterogeneous and distributed Medicaid data
PI (2023-2026) NSF DMS 2310217 Fusion pursuit for pattern-mixture models with application to longitudinal studies with nonignorable missing data
Co-I (2023-2026) NIH R01LM014142 Disease subtyping guided by clinical phenotype for precision medicine
Co-I (2022-2026) NIH R01DA055585 Improving racial equity in opioid use disorder treatment in Medicaid
Co-I (2021-2025) NIH R01GM141081 Precision medicine approach to glucocorticosteroids in sepsis