Natalie Harris
Natalie is a research engineer in Google Research and a member of the Brain Health Research team. She joined Google in 2014 and has since worked in the Kirkland/Seattle, Zurich, and London offices. She has worked on multiple teams across Google and DeepMind, most recently on continuous prediction of adverse events using electronic health records. Her current focus is machine learning for healthcare, with a particular interest in fairness and ethics. She earned her BS in Computer Science from the University of Washington.
Authored Publications
Diagnosing failures of fairness transfer across distribution shift in real-world medical settings
Sanmi Koyejo
Eva Schnider
Krista Opsahl-Ong
Alex Brown
Diana Mincu
Christina Chen
Silvia Chiappa
Proceedings of Neural Information Processing Systems 2022 (2022)
Diagnosing and mitigating changes in model fairness under distribution shift is an important component of the safe deployment of machine learning in healthcare settings. Importantly, the success of any mitigation strategy strongly depends on the structure of the shift. Despite this, there has been little discussion of how to empirically assess the structure of a distribution shift that one is encountering in practice. In this work, we adopt a causal framing to motivate conditional independence tests as a key tool for characterizing distribution shifts. Using our approach in two medical applications, we show that this knowledge can help diagnose failures of fairness transfer, including cases where real-world shifts are more complex than is often assumed in the literature. Based on these results, we discuss potential remedies at each step of the machine learning pipeline.
Use of deep learning to develop continuous-risk models for adverse event prediction from electronic health records
Nenad Tomašev
Sebastien Baur
Anne Mottram
Xavier Glorot
Jack William Rae
Michal Zielinski
Harry Askham
Andre Saraiva
Valerio Magliulo
Clemens Meyer
Suman Venkatesh Ravuri
Alistair Connell
Cían Hughes
Julien Cornebise
Hugh Montgomery
Geraint Rees
Christopher Laing
Clifton R. Baker
Thomas Osborne
Ruth Reeves
Demis Hassabis
Dominic King
Mustafa Suleyman
Trevor John Back
Christopher Nielsen
Martin Gamunu Seneviratne
Shakir Mohamed
Nature Protocols (2021)
Early prediction of patient outcomes is important for targeting preventive care. This protocol describes a practical workflow for developing deep-learning risk models that can predict various clinical and operational outcomes from structured electronic health record (EHR) data. The protocol comprises five main stages: formal problem definition, data pre-processing, architecture selection, calibration and uncertainty, and generalizability evaluation. We have applied the workflow to four endpoints (acute kidney injury, mortality, length of stay and 30-day hospital readmission). The workflow can enable continuous (e.g., triggered every 6 h) and static (e.g., triggered at 24 h after admission) predictions. We also provide an open-source codebase that illustrates some key principles in EHR modeling. This protocol can be used by interdisciplinary teams with programming and clinical expertise to build deep-learning prediction models with alternate data sources and prediction tasks.
Multi-task prediction of organ dysfunction in the ICU using sequential sub-network routing
Diana Mincu
Eric Loreaux
Anne Mottram
Hugh Montgomery
Ali Connell
Nenad Tomašev
Martin Seneviratne
Journal of the American Medical Informatics Association (JAMIA) (2021)
Introduction:
Multi-task learning (MTL) using electronic health records (EHRs) allows concurrent prediction of multiple endpoints. MTL has shown promise in improving model performance and training efficiency; however, it often suffers from negative transfer, that is, impaired learning when tasks are not appropriately selected. We introduce a sequential sub-network routing (SeqSNR) architecture, which uses soft parameter sharing to find related tasks and encourage cross-learning between them.
Materials and Methods:
Using the Medical Information Mart for Intensive Care (MIMIC-III) dataset, we train deep neural network models to predict the onset of six endpoints, including specific organ dysfunctions and general clinical outcomes: acute kidney injury, continuous renal replacement therapy, mechanical ventilation, vasoactive medications, mortality, and length of stay. We compare single-task (ST) models with naive multi-task (shared bottom, SB) models and SeqSNR in terms of discriminative performance and label efficiency.
Results:
SeqSNR showed a modest yet statistically significant performance boost across at least 4 out of 6 tasks compared to SB and ST. When the size of the training dataset was reduced for a given task, SeqSNR outperformed ST in all cases, showing an average AUPRC boost of 2.1%, 2.9%, and 2.1% for tasks using 1%, 5%, and 10% of labels, respectively.
Discussion and Conclusion:
Multi-task learning has variable performance compared to single-task learning, with the possibility of negative transfer. The SeqSNR architecture outperforms SB and ST in discriminative performance and shows superior label efficiency. SeqSNR should be considered for multi-task predictive modeling using EHR data.