This project is a continuation of the EXTREME-Pilot project.
Project leader - main PI:
Co-PIs
Researchers
Project period: 2020-01-01 to 2024-05-31
Funding source: Digital Futures
Digital Futures Website: EXTREMUM: Explainable and Ethical Machine Learning for Knowledge Discovery from Medical Data Sources
Budget: 8.4M SEK
This is a continuation of the EXTREME pilot project, which ran in 2019 and 2020 with a budget of 3.85M SEK.
This project intends to build a novel data management and analytics framework, focusing on three pillars: (1) data integration and federated learning, (2) explainable machine learning, and (3) legal and ethical integrity of predictive models. The final product will be a set of methods and tools for integrating massive and heterogeneous medical data sources in a federated manner, together with a set of predictive models for learning from these data sources. The models will emphasize interpretability and explainability of the rationale behind their predictions, while maintaining ethical integrity and fairness in the underlying decision-making mechanisms that govern machine learning. The project will focus on two critical application areas: adverse drug event detection and heart failure treatment. The project is a collaborative effort between four research institutions: the Department of Computer and Systems Sciences at Stockholm University, the Department of Law at Stockholm University, RISE Research Institutes of Sweden, and KTH.
Objective 1: Unified data representation and integration. We will define novel unifying space representations, similarity measures, and methods for searching and indexing large and complex data spaces. The basic challenge is the temporal nature of the data spaces and the inherent temporal dependencies that may exist both within a single data source and across different data sources in these spaces. Particular emphasis will be placed on providing theoretical guarantees on the performance of the proposed indexing techniques in terms of retrieval accuracy and efficiency.
Objective 2: Explainable predictive models. We will develop novel predictive modeling mechanisms for combining and enhancing the aggregate knowledge from heterogeneous data sources, with particular emphasis on the temporal properties of the data. The main challenge will be how to extract and fuse meaningful static and temporal features from multiple data sources, with a focus on sequential and temporal data. The constructed models will be made interpretable to domain experts by employing explainable features and rules.
Objective 3: Legal and ethical implications of machine learning models. We will focus on the legal and ethical risks, implications, and potential harms resulting from the development and use of predictive modelling in the analysis of healthcare data. To this end, we will embed legal and ethical considerations into existing predictive modelling schemes, thereby making them more amenable to regulatory and policy demands.
The implementation of the project is organized into five implementation work packages (WPs): one for each of the three objectives (WP1, WP2, WP3), one for validation on real data sources (WP4), and one for dissemination and exploitation (WP5). Project coordination (WP6) is handled by SU-DSV.