Lu Tang | Home

About

I am an associate professor in the Department of Biostatistics & Health Data Science at the University of Pittsburgh. I received my Ph.D. in Biostatistics from the University of Michigan. Before that, I earned my B.A. in Mathematics and M.S. in Statistics from the University of Virginia. I was once an undergraduate student at Sun Yat-sen University.

My research lies at the intersection of biostatistics and machine learning, with the broad goal of advancing health data science. I am particularly interested in developing statistical methods for integrative data analysis that combine data sets from multiple sources or knowledge of different types to achieve higher precision and power. With this in mind, my current research program focuses on developing methods that support regression, prediction, and decision making based on large-scale (distributed) data sets. I also develop data processing tools for analyzing high-dimensional data. Most of my work is inspired by and closely related to applications in bioinformatics, clinical trials, electronic health records, environmental health sciences, and health policy.

Data integration and transfer learning
Causal inference and precision medicine
High-dimensional data and subgroup analysis
Longitudinal data analysis

Major grant support
PI (2025-2027) NIH R56LM014522 Improving safety and trustworthiness in data-driven decision learning for sepsis
PI (2023-2026) NSF DMS 2310217 Fusion pursuit for pattern-mixture models with application to longitudinal studies with nonignorable missing data
PI (2023-2025) NIH R21DA055672 Federated learning methods for heterogeneous and distributed Medicaid data
Co-I (2023-2026) NIH R01LM014142 Disease subtyping guided by clinical phenotype for precision medicine
Co-I (2022-2026) NIH R01DA055585 Improving racial equity in opioid use disorder treatment in Medicaid
Co-I (2021-2025) NIH R01GM141081 Precision medicine approach to glucocorticosteroids in sepsis

News

I presented at the Pitt Senior Vice Chancellor’s Research Seminar 2025. See the recording here.
I am honored to receive an NIH award to develop safe and trustworthy decision learning methods for sepsis.
I am honored to become an Elected Member of the International Statistical Institute (ISI).
Crystal Zang received a Poster Award (Section on Text Analysis) at the 2025 Joint Statistical Meetings.
Xinlei Chen received an ASA 2025 Student Paper Award (Statistical Computing and Statistical Graphics Sections).

Selected Publications

See my Google Scholar page for the complete list and citation metrics.
Underline: student as first/second author; *: corresponding author

Methods

Distributed fusion R-Learner of heterogeneous treatment effect using distributed Medicaid data
[Link] -- Li, J., Donohue, J.M., and Tang, L.*
2026 -- Biometrics (Accepted)
Harmony-based data integration for distributed single-cell multi-omics data
[Link] -- Yuan, R., Rong, Z., Hu, H., Liu, T., Tao, S., Chen, W.*, and Tang, L.*
2025 -- PLOS Computational Biology
Robust transfer learning for individualized treatment rules in the presence of missing data
[Link] -- Sui, Z., Ding, Y., and Tang, L.*
2025 -- Biostatistics
Federated learning of robust individualized decision rules with application to heterogeneous multi-hospital sepsis population
[Link] -- Chen, X., Talisa, V.B., Tan, X., Qi, Z., Kennedy, J.N., Chang, C.H., Seymour, C.W., and Tang, L.*
2025 -- Annals of Applied Statistics
RISE: robust individualized decision learning with sensitive variables
[Link] -- Tan, X., Qi, Z., Seymour, C.W., and Tang, L.*
2022 -- Neural Information Processing Systems (NeurIPS)

Applications

Development and evaluation of a machine learning model to predict acute care for opioid use disorder among Medicaid enrollees engaged in a community‐based treatment program
[Link] -- Xue, L., Yin, R., Cole, E.S., Lo-Ciganic, W.H., Gellad, W.F., Donohue, J.M., and Tang, L.*
2025 -- Addiction
Heterogeneity in the effect of early goal-directed therapy for septic shock: A secondary analysis of two multicenter international trials
[Link] -- Shah, F.A., Talisa, V.B., Chang, C.H., Triantafyllou, S., Tang, L., Mayr, F.B., Higgins, A.M., Peake, S.L., Mouncey, P., Harrison, D.A., DeMerle, K.M., Kennedy, J.N., Cooper, G.F., Bellomo, R., Rowan, K., Yealy, D.M., Seymour, C.W., Angus, D.C., and Yende, S.P.
2025 -- Critical Care Medicine
Development and validation of an overdose risk prediction tool using prescription drug monitoring program data
[Link] -- Gellad, W.F., Yang, Q., Adamson, K.M., Kuza, C.C., Buchanich, J.M., Bolton, A.L., Murzynski, S.M., Thomas Goetz, C., Washington, T., Lann, M.F., Chang, C.H., Suda, K.J., and Tang, L.
2023 -- Drug and Alcohol Dependence
Duration of medication treatment for opioid-use disorder and risk of overdose among Medicaid enrollees in eleven states: A retrospective cohort study
[Link] -- Burns, M., Tang, L., Chang, C.H., Kim, J.Y., Ahrens, K., Lindsay, A., Cunningham, P., Gordon, A., Jarlenski, M.P., Lanier, P., Mauk, R., McDuffie, M.J., Mohamoud, S., Talbert, J., Zivin, K., and Donohue, J.
2022 -- Addiction
Use of medications for treatment of opioid use disorder among US Medicaid enrollees in 11 states, 2014-2018
[Link] -- Donohue, J.M., Jarlenski, M., Kim, J.Y., Tang, L., Ahrens, K., Allen, L., Austin, A., Barnes, A.J., Burns, M., Chang, C.H., Clark, S., Cole, E., Crane, D., Cunningham, P., Idala, D., Junker, S., Lanier, P., Mauk, R., McDuffie, M.J., Mohamoud, S., Pauly, N., Sheets, L., Talbert, J., Zivin, K., Gordon, A.J., and Kennedy, S.
2021 -- Journal of the American Medical Association

Students

Current Students

[PhD] Xinlei Chen (co-advised with Victor Talisa)
[PhD] Crystal (Ziwei) Zang (co-advised with Rebecca Deek)
[PhD] Ruizhi Yuan (co-advised with Wei Chen)
[MS] Junlin Liu

Past Students

[PhD] Jinhong Li (co-advised with Guan Yu) @Eli Lilly
[PhD] Zhiyu Sui (co-advised with Ying Ding) @Novartis
[PhD] Haoyi Fu (co-advised with Robert Krafty) @Novartis
[PhD] Xiaoqing (Ellen) Tan (co-advised with Gong Tang) @Meta
[PhD] Peng Liu (co-advised with George Tseng) @Merck
[MS] Tyler J. Kelly @PANTHERx
[MS] Yaxin Lin @Health Services Advisory Group
[MS] Liling Lu @UPMC
[MS] Jason N. Kennedy (co-advised with Jeanine Buchanich) @Pitt
[MS] Zhuxuan Fu @St. Luke's Health System
[MS] Ruishen Lyu @Cleveland Clinic

Software

Federated Harmony: the classic Harmony but for distributed data [GitHub]

Python package for privacy-preserving batch-effect correction of single-cell expression matrices. It simulates the collaboration between multiple data-holding institutions and a coordinating center to harmonize latent representations without centralizing raw data. Yuan et al, 2025.
RTL: transfer learning of individualized treatment rules with missing data [GitHub]

Python code for robust transfer learning of individualized treatment rules in the presence of missing data. It implements a quantile-based optimization framework to handle covariate shift and missing covariates. Sui et al, 2025.
FLoRI: federated learning of robust individualized treatment rules [GitHub]

Python code for federated learning of robust individualized decision rules across heterogeneous multi-hospital networks without sharing patient-level data. It implements a privacy-preserving training method while accounting for cross-site heterogeneity. Chen et al, 2025.
RISE: learning robust individualized treatment rules with sensitive variables [GitHub]

Python package to learn robust individualized decisions to improve the worst-case outcomes of individuals caused by sensitive variables that are unavailable at the time of decision. Tan et al, 2022.
DFRlearner: joint approach for learning linear conditional average treatment effects [GitHub]

R package for harmonizing the conditional average treatment effects across studies while keeping the individual-participant data secure within each study site. Li et al, 2026.
ifedtree: joint approach for learning heterogeneous individualized treatment effects [GitHub]

R package for harmonizing individualized treatment rules derived from heterogeneous data sources to boost the power of a target study, without the need for individual-level data from the other sources. It provides visualization of the heterogeneous association structure across studies. Tan et al, 2022.
GEEfuse: joint approach for fitting heterogeneous GEEs [Paper]

R code for the detection of heterogeneous effects across multiple independent datasets when fitting GEE models. Tang and Song, 2021.
metafuse: joint approach for fitting heterogeneous GLMs [CRAN]

R package allowing detection of heterogeneous effects across multiple independent datasets when analyzed jointly. It provides visualization of covariate-specific effect subgrouping via dendrograms, and enables variable selection. Tang and Song, 2016.
modac: divide-and-conquer for fitting penalized GLMs [GitHub]

Python map-reduce functions for fitting GLMs when a dataset is large and stored on distributed Hadoop clusters. The method provides stable inference. Tang et al, 2020.
eSIR: an extension of the SIR infectious disease model [GitHub]

R package of an epidemiological forecast model for assessing interventions based on COVID-19 data. Wang et al, 2020.
Calculator for quick random effects meta-analysis [Link]

R Shiny app based on metafor. It provides visualizations and summary statistics to help understand between-site heterogeneity.

Miscellaneous

Outside of work, I like to swim, run, and spend time with my family.
I got into the hobby of woodworking during the pandemic. Check here for some of my work.

This page was last modified on: 01/30/2026