I am a machine learning researcher with experience in developing, enhancing and delivering novel statistical and machine learning methods tailored to healthcare analytics. In 2020 I joined Novartis’ Advanced Methodology and Data Science group, which focuses on developing new machine learning methods with the aim of improving drug development in multiple projects. I am a member of the editorial board of the Machine Learning Journal (MLJ) and vice-chair of the technical committee on Statistical Pattern Recognition Techniques of the International Association for Pattern Recognition (IAPR).
I did my PhD in statistical machine learning, in the area of hypothesis testing and feature selection in semi-supervised scenarios, at the University of Manchester’s Department of Computer Science. Afterwards, I spent many years as a post-doctoral researcher developing novel methodologies for analysing self-reported epidemiological data with Manchester’s Health e-Research Center, clinical trials data for personalised medicine with AstraZeneca, and digital healthcare data for digital biomarker development with Roche.
Disclaimer: this is my personal page; the content is my own responsibility and is not connected to or supported by any entity with which I have been, am now, or will be affiliated.
PhD in Machine Learning, 2015
University of Manchester, UK
MSc in Information Systems, 2011
Aristotle University of Thessaloniki, Greece
MSc in Communications and Signal Processing, 2009
Imperial College London, UK
MEng in Electrical and Computer Engineering, 2006
Aristotle University of Thessaloniki, Greece
Jan 2023: I have been appointed vice-chair of the technical committee on Statistical Pattern Recognition Techniques of the International Association for Pattern Recognition (IAPR).
Dec 2022: Our paper on benchmarking methods for characterising treatment effect heterogeneity in clinical trials was published in Biometrical Journal. If you are interested in simulating datasets with heterogeneous treatment effects, you can check our benchtm package.
Sept 2022: Organised the session on knockoffs and multiple testing with biomedical applications at the Multiple Comparison Procedures (MCP) conference, where Lucas Janson, Zhimei Ren, Jinzhou Li and Asher Spector presented their exciting work.
May 2022: Extremely happy to present our work at Novartis on quantifying uncertainty in machine learning-based predictive biomarker discovery to the MSc in Data and Web Science of the Aristotle University of Thessaloniki. More details here.
April 2022: This year we are organising the third edition of the PharML workshop, co-located with ECML-PKDD 2022. The call for papers is officially open: https://easychair.org/cfp/pharml2022.
Dec 2021: Lasse Hansen's work on assessing depression using speech emotion recognition systems was published in Acta Psychiatrica Scandinavica. For those interested, Lasse has a nice Twitter thread that summarizes some of the main points of the paper.
Clinical trials data: R code for the project of deriving predictive biomarkers using information theoretic methods can be found on GitHub. If you make use of the code, please cite the paper: Distinguishing Prognostic and Predictive Biomarkers: An Information Theoretic Approach.
Semi-supervised data: Matlab code for the project of semi-supervised feature selection can be found on GitHub. If you make use of the code, please cite the paper: Simple strategies for semi-supervised feature selection.
Under-reported data: Matlab code for the project of feature selection with under-reported variables can be found on GitHub. If you make use of the code, please cite the paper: Dealing with under-reported variables: An information theoretic solution.
Positive-unlabelled data: Matlab code for the project of hypothesis testing/power analysis/sample size determination in positive-unlabelled data can be found on the project’s homepage. If you make use of the code, please cite the paper: Statistical hypothesis testing in positive unlabelled data.
Multi-label data: Java code for the project of stratification for multi-label data can be found in Mulan, a Java Library for Multi-Label Learning. If you make use of the code, please cite the paper: On the Stratification of Multi-label Data. Our algorithm for iterative stratification has been implemented in various other languages, e.g. R and Matlab. In Python, several packages include our algorithm, such as scikit-multilearn and iterative-stratification.