List of Projects by Sonish Sivarajkumar
Current Projects
Fair Patient Model: Mitigating Bias in the Patient Representation Learned from
Electronic Health Records
We propose a novel model that pre-trains fair, unbiased patient embeddings from Electronic Health Records
(EHRs). The model uses a weighted loss function in the deep representation learning model to
balance the importance of different groups of patients and features, which reduces bias and
improves fairness. We applied the proposed model, called Fair Patient Model (FPM), to a sample of
34,739 patients from the MIMIC-III dataset and learned patient representations for four clinical
outcome prediction tasks.
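To illustrate the idea of a group-weighted loss, here is a minimal sketch. It is not the FPM loss itself: the description above does not give the weighting scheme, so this example assumes inverse-frequency weights over patient groups and a binary cross-entropy objective. The function names `group_weights` and `weighted_bce` are hypothetical, and feature-level weighting is left out.

```python
import math
from collections import Counter


def group_weights(groups):
    """Assumed scheme: inverse-frequency weights, so every group
    carries the same total weight regardless of its size."""
    counts = Counter(groups)
    n, k = len(groups), len(counts)
    return {g: n / (k * c) for g, c in counts.items()}


def weighted_bce(y_true, y_prob, groups, eps=1e-12):
    """Binary cross-entropy where each patient's loss is scaled by
    the weight of that patient's group, normalized by total weight."""
    w = group_weights(groups)
    total, weight_sum = 0.0, 0.0
    for y, p, g in zip(y_true, y_prob, groups):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        loss = -(y * math.log(p) + (1 - y) * math.log(1 - p))
        total += w[g] * loss
        weight_sum += w[g]
    return total / weight_sum


# Toy data (made up): three patients in group "a", one in group "b".
groups = ["a", "a", "a", "b"]
weights = group_weights(groups)
loss = weighted_bce([1, 0, 1, 0], [0.9, 0.2, 0.8, 0.7], groups)
```

With these weights, an error on the single patient in the smaller group counts as much as the errors on all three patients in the larger group combined, which is one simple way to keep a majority group from dominating training.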
Generative deep patient
Working on foundation models and language models for generative patient representation, using soft prompting and zero-shot learning. Generative patient models create new patient data or embeddings from input data or conditions.
Lung Cancer NLP
Developing NLP and ML algorithms to predict immunotherapy response and metastases in lung adenocarcinoma patients, using structured EHR data and unstructured clinical notes.
Past Projects
HealthPrompt: A Zero-shot Learning Paradigm for Clinical Natural Language Processing
We developed a novel prompt-based clinical NLP framework called HealthPrompt and applied the paradigm of prompt-based learning to clinical texts. In this technique, rather than fine-tuning a pre-trained language model (PLM), the task definition is tuned by designing a prompt template. We performed an in-depth analysis of HealthPrompt on six different PLMs in a no-data setting. Our experiments show that prompts effectively capture the context of clinical texts and perform remarkably well without any training data.
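The prompt-based setup can be sketched roughly as follows. This is an illustration of the general technique, not HealthPrompt's code: the template wording, the label-word mapping (often called a verbalizer), and the function names are all assumptions, and `mask_scores` is a made-up stand-in for the probabilities a PLM would assign to words at the `[MASK]` position.

```python
def build_prompt(note, template="{note} This patient has [MASK]."):
    """Wrap a clinical note in a cloze-style template (hypothetical wording)."""
    return template.format(note=note)


def classify(mask_scores, verbalizer):
    """Pick the class whose label words receive the highest total
    score at the [MASK] position. No model weights are updated."""
    class_scores = {
        label: sum(mask_scores.get(word, 0.0) for word in words)
        for label, words in verbalizer.items()
    }
    best = max(class_scores, key=class_scores.get)
    return best, class_scores


# Hypothetical label words per class.
verbalizer = {
    "diabetes": ["diabetes", "hyperglycemia"],
    "asthma": ["asthma", "wheezing"],
}

prompt = build_prompt("Patient reports polyuria and elevated glucose.")

# Made-up numbers standing in for a PLM's output on the prompt above.
mask_scores = {"diabetes": 0.41, "hyperglycemia": 0.12, "asthma": 0.03}

label, scores = classify(mask_scores, verbalizer)
```

Changing the task only requires a new template and a new set of label words, which is why no labeled training data is needed in this setting.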
Predictive site recommendation system using Information Retrieval
Developed a predictive tool that recommends clinical trial sites and principal investigators (PIs) based on Roche internal data and external sources such as Citeline, AACT, and ClinicalTrials.gov. The tool uses AI and NLP techniques to build vector space representations (embeddings) of the sites and PIs, and then performs information retrieval over these embeddings. The tool also accounts for diversity and inclusion in the site selection process.
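The retrieval step over embeddings can be sketched as a nearest-neighbor ranking. The example below assumes cosine similarity as the scoring function, which the description above does not specify; the site names, vectors, and the `rank_sites` function are hypothetical placeholders, not part of the actual tool.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_sites(query_vec, site_vecs, top_k=3):
    """Return the top_k (site, score) pairs, most similar first."""
    scored = [(name, cosine(query_vec, vec)) for name, vec in site_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


# Toy 2-d embeddings (made up); real embeddings would have many more dimensions.
site_vecs = {
    "site_a": [1.0, 0.0],
    "site_b": [0.6, 0.8],
    "site_c": [0.0, 1.0],
}
trial_query = [1.0, 0.2]  # hypothetical embedding of a trial's requirements

ranking = rank_sites(trial_query, site_vecs)
```

A diversity-aware selection would need an extra re-ranking or filtering step on top of this list; that step is not shown here.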
Patient Analytics Project
Developed and deployed AI and analytics systems for clinical and patient data using RWE and EHR sources. The project covered patient analytics, clinical trials pipeline automation, HCP segmentation and targeting, and a big data scheduler built on Apache Airflow. The project used NLP and big data tools and leveraged IQVIA’s private cloud infrastructure.