EPFL software mines rich medical data while keeping it secure
MedCo, developed in the lab of EPFL professor Jean-Pierre Hubaux in collaboration with professor Bryan Ford’s DEDIS lab and the Lausanne University Hospital (CHUV), is the first operational system to protect sensitive patient data – including genetic information – so that it can be used collectively for crucial medical research.
Using the MedCo system, authorized researchers will be able to explore data from multiple sources, such as hospitals, based on clinical and genetic criteria without compromising patient privacy. The software combines advanced cryptographic approaches that make it possible to collectively analyze data stored at different sites – and even in different countries – while protecting it by keeping track of how information is accessed and by whom.
“The idea is to have a level of protection such that each researcher can access only the data he or she is allowed to touch,” explains Hubaux, who leads the Computer Communications and Applications Laboratory 1 in the School of Computer and Communication Sciences (IC).
The MedCo system “brings the computation to the data, instead of the data to the computation”. © 2019 MedCo/EPFL
Answer to a mathematical – and medical – problem
Hubaux says that MedCo provides a unique solution to the conundrum of modern medical research: Even as increasing quantities of rich personalized data – whether from genetics research or Fitbits – and improved machine learning techniques offer limitless possibilities for precision medicine, analyzing such vast quantities of highly sensitive data securely is becoming more and more difficult.
“There is going to be more and more data about our bodies, and this can be a treasure trove in terms of better diagnosis and therapies – especially in the fields of oncology and rare diseases,” Hubaux explains. But while genetic data is key to developing precision treatments, it is also difficult to manage securely because it can be used to identify a patient.
That’s why, Hubaux says, MedCo’s approach of “bringing the computation to the data, instead of the data to the computation”, together with its combination of cryptographic tools, is a game changer in a world where medical data are often still shared from site to site with some degree of security risk.
To achieve this level of protection, the MedCo system uses cryptographic approaches including a private blockchain and homomorphic encryption, which allows computations to be performed on encrypted data without decrypting it. The system also achieves differential privacy, which protects patient confidentiality by introducing calibrated numerical randomness to the results of computations.
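As a rough illustration of the two ideas in this paragraph, the sketch below sums patient counts from several sites while they are encrypted, then adds random noise to the decrypted total. It is not MedCo's code or protocol: a toy Paillier cryptosystem with tiny, insecure parameters stands in for "homomorphic encryption", the Laplace mechanism stands in for "differential privacy", and all names (`encrypt`, `add_encrypted`, `private_total`, the site counts, `epsilon`) are hypothetical.

```python
import math
import random
import secrets

# Toy Paillier key built from two small Mersenne primes.
# ASSUMPTION: illustration only; these parameters are far too small to be secure.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N2 = N * N
G = N + 1
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(p-1, q-1)


def _L(x):
    """Paillier's L function: L(x) = (x - 1) / n."""
    return (x - 1) // N


MU = pow(_L(pow(G, LAM, N2)), -1, N)  # modular inverse (Python 3.8+)


def encrypt(m):
    """Encrypt integer m (0 <= m < N) with fresh randomness."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (pow(G, m, N2) * pow(r, N, N2)) % N2


def add_encrypted(c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % N2


def decrypt(c):
    """Recover the plaintext using the private values LAM and MU."""
    return (_L(pow(c, LAM, N2)) * MU) % N


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample: difference of two exponential samples."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_total(site_counts, epsilon, rng):
    """Sum per-site counts under encryption, then add Laplace noise.

    A counting query changes by at most 1 when one patient is added or
    removed, so the noise scale is 1 / epsilon.
    """
    encrypted_total = encrypt(0)
    for count in site_counts:
        encrypted_total = add_encrypted(encrypted_total, encrypt(count))
    exact = decrypt(encrypted_total)
    return exact, exact + laplace_noise(1 / epsilon, rng)


if __name__ == "__main__":
    # Hypothetical counts of matching patients at three sites.
    exact, noisy = private_total([120, 45, 78], epsilon=0.5, rng=random.Random())
    print("exact total (never released):", exact)
    print("noisy total (released):", round(noisy, 1))
```

In this sketch no party handling the ciphertexts sees an individual site's count, and the released figure differs slightly from the true total on every run. A smaller `epsilon` means more noise and stronger privacy, at the cost of accuracy.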
Nicolas Rosat, Deputy Director of the CHUV’s Information Systems Department, points out that “data security is a constant priority for the CHUV. The first use of MedCo will be as part of the collaboration between Swiss University Hospitals (SPHN) and will make it possible to count the patients likely to be included in specific clinical studies. Organizational safeguards will strengthen the use of this application in a broader context, for example to avoid re-identifying patients by combining data.”
First release complete
MedCo has recently been released as open-source code, and Hubaux and his team are already in talks with companies that are interested in turning the prototype system into a product for use by hospitals, pharmaceutical companies and life sciences researchers. The interface is designed to be used by biomedical researchers who aren’t necessarily computer science experts.
Built by EPFL and CHUV researchers, MedCo will be deployed among the Lausanne, Bern and Geneva University Hospitals for pilot tests. The project was funded by the Swiss Personalized Health Network and the Personalized Health and Related Technologies strategic focus area of the ETH Domain. It is the first system to come out of the Data Protection in Personalized Health project (DPPH), which Hubaux coordinates.
The research that went into developing MedCo has also been published in the journal IEEE/ACM Transactions on Computational Biology and Bioinformatics.