Information Leakage: Types, Remedies, and Open Problems

Julia Sidorova, Speaker at Oncology Conference

Research Scientist

Instituto de Salud Carlos III (CIBER-EHD), Spain

Abstract:

Information leakage threatens, and calls into question, the use of machine learning models in real-life clinical applications. In effect, information leakage is similar to plain overfitting, yet it is more subtle and, even when detected, much harder to remove. Some recent research indicates that once overfitting is removed, deep neural networks perform systematically worse than linear regression models. This statement is not far from our own results in survival analysis. There are different types of leakage, and some are specific to deep neural networks; for example, the effects of pretraining have not been thoroughly studied. In the talk, I will review the current understanding of what information leakage is and of its subtypes. The types and examples were largely defined within different applications of machine learning. The research question is: is there anything a clinical bioinformatician should learn from the current concerns and the work done in chemoinformatics, political science, and other fields? Do our analysis protocols keep us safe, and where are the dangerous waters?

Biography:

Dr. Julia Sidorova holds a PhD from Universitat Pompeu Fabra. She is a Research Scientist at the Bioinformatics Platform of CIBER, the Spanish national consortium of hospitals, which is part of Instituto de Salud Carlos III. Her research interests lie in comparing classical data analysis with deep neural networks and in understanding where each is suitable or deficient. She serves on the Editorial Boards of Frontiers in Neurology (Biomarkers) and the International Journal of Molecular Sciences (MDPI), where she is currently organizing a Special Issue on AI in Molecular Mechanisms of Cancer.
