Better cancer detection thanks to machine learning

Imane Araf, PhD student from the 2nd cohort of the 100 PhDs for Africa programme © 2025 EPFL
A doctoral student at Mohammed VI Polytechnic University (Morocco), Imane Araf explores how machine learning can help improve the diagnosis of rare diseases using complex medical data. Her work focuses in particular on cancers affecting women, with the aim of making AI models more accurate and reliable. Here, she talks about her career path, her research, and what motivates her on a daily basis.
Could you introduce yourself?
My name is Imane Araf, and I am a third-year PhD student at the Faculty of Medical Sciences (FMS) at UM6P. Before beginning my doctoral studies, I graduated with an engineering degree in computer science and AI, then worked for a few years as an AI engineer at the R&D center MAScIR, where I gained a strong practical foundation before transitioning into academic research. Currently, my work focuses on developing cost-sensitive methods for imbalanced medical data, with a particular emphasis on female cancers, where predicting rare but critical outcomes is a major challenge.
How did you hear about the Excellence in Africa 100 PhDs programme?
I first heard about the programme through colleagues, and particularly through Amal, a PhD student from the first cohort, who shared her very positive experience with me. I then did some research online and found several useful resources. I also came across additional information about the programme in the UM6P newsletter.
What motivated you to apply?
The co-supervision structure was what initially attracted me, as I knew my research would benefit from international collaboration. I also appreciated that the programme is specifically designed to strengthen research capacity in Africa and to foster long-term scientific partnerships that address real challenges on the continent.
Did you find the application process easy?
Yes, the process was clear, well-organized, and transparent. It required diligence and precision, which is expected from such a competitive programme, but the instructions were straightforward and communication with the team was smooth.
Could you describe your project? What are your research questions?
My project focuses on cost-sensitive learning for imbalanced medical data, with a particular emphasis on female cancers. The core challenge is that in medical datasets, the most critical cases are often the rarest. Standard machine learning models treat all prediction errors equally, whereas clinically, missing a high-risk patient has far more serious repercussions than incorrectly flagging a low-risk one.
My research questions center on how to build models that explicitly account for these different costs, how to ensure they generalize well despite working with highly imbalanced data where positive outcomes are underrepresented, and how to make sure they remain reliable and clinically meaningful in real-world medical settings.
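To make the idea of unequal error costs concrete, here is a minimal sketch of one textbook form of cost-sensitive decision making. It is not the method developed in this research. It assumes a model that already outputs a probability of a high-risk outcome, and the function names, the cost values (a missed case costing nine times a false alarm), and the probabilities are all hypothetical, chosen only for illustration. A patient is flagged when the expected cost of not flagging, p * cost_fn, is at least the expected cost of flagging, (1 - p) * cost_fp, which gives the threshold cost_fp / (cost_fp + cost_fn).

```python
from typing import List, Sequence


def cost_sensitive_threshold(cost_fn: float, cost_fp: float) -> float:
    """Return the probability threshold that minimises expected cost.

    Flag when p * cost_fn >= (1 - p) * cost_fp,
    i.e. when p >= cost_fp / (cost_fp + cost_fn).
    """
    if cost_fn <= 0 or cost_fp <= 0:
        raise ValueError("costs must be positive")
    return cost_fp / (cost_fp + cost_fn)


def flag_high_risk(probabilities: Sequence[float],
                   cost_fn: float, cost_fp: float) -> List[bool]:
    """Flag each patient whose predicted risk reaches the threshold."""
    threshold = cost_sensitive_threshold(cost_fn, cost_fp)
    return [p >= threshold for p in probabilities]


def total_cost(labels: Sequence[int], flags: Sequence[bool],
               cost_fn: float, cost_fp: float) -> float:
    """Sum the cost of missed cases (FN) and false alarms (FP)."""
    cost = 0.0
    for y, flagged in zip(labels, flags):
        if y == 1 and not flagged:
            cost += cost_fn   # high-risk patient missed
        elif y == 0 and flagged:
            cost += cost_fp   # low-risk patient flagged
    return cost


# Hypothetical toy data: predicted risks and true outcomes (1 = high risk).
risks = [0.05, 0.15, 0.40, 0.70]
outcomes = [0, 1, 0, 1]

# Hypothetical costs: a missed case is nine times worse than a false alarm.
COST_FN, COST_FP = 9.0, 1.0

equal_cost_flags = flag_high_risk(risks, 1.0, 1.0)        # threshold 0.5
weighted_flags = flag_high_risk(risks, COST_FN, COST_FP)  # threshold 0.1

equal_cost_total = total_cost(outcomes, equal_cost_flags, COST_FN, COST_FP)
weighted_total = total_cost(outcomes, weighted_flags, COST_FN, COST_FP)
```

On this toy data, the equal-cost rule misses the patient with a predicted risk of 0.15, while the cost-weighted rule catches that patient at the price of one extra false alarm. In practice the costs would have to be set with clinicians, and the rule is only as good as the calibration of the predicted probabilities.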
Could you give some practical examples of how your research might be applied?
My work can help improve the early detection of female cancers, in particular early-stage cases that are rare and difficult to identify; reduce the number of false negatives, which are particularly harmful in cancer diagnosis; and support clinical prioritization by identifying high-risk patients more accurately. Better prioritization allows clinicians to adjust monitoring or intervention strategies as needed, which is especially valuable in resource-constrained settings where identifying who requires urgent attention is essential. These methods can also extend to other diseases and healthcare challenges involving rare but high-risk outcomes.
What is the scientific challenge of your research topic?
The main challenge is that medical datasets are highly imbalanced; most patients do not experience the worst outcomes, yet those are precisely the cases that need to be detected. The models must account for different clinical costs without overfitting to the minority class or producing too many false alarms. In oncology, this is further complicated by high-dimensional data and limited sample sizes. Balancing sensitivity to rare cases, controlling false positives, and ensuring good generalization is particularly challenging.
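The effect of imbalance on evaluation can be shown with a short sketch. The data below is synthetic and purely illustrative (5 high-risk patients out of 100), and the helper names are hypothetical. It shows why overall accuracy can look excellent while sensitivity, the share of high-risk patients actually detected, is zero.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(labels: Sequence[int],
                     predictions: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (true positives, false positives, true negatives, false negatives)."""
    tp = fp = tn = fn = 0
    for y, p in zip(labels, predictions):
        if y == 1 and p == 1:
            tp += 1
        elif y == 0 and p == 1:
            fp += 1
        elif y == 0 and p == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def summarize(labels: Sequence[int],
              predictions: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, sensitivity, and false positive rate."""
    tp, fp, tn, fn = confusion_counts(labels, predictions)
    return {
        "accuracy": (tp + tn) / len(labels),
        # Share of truly high-risk patients that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Share of low-risk patients that were wrongly flagged.
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
    }


# Synthetic, imbalanced toy data: 5 high-risk patients (1), 95 low-risk (0).
labels = [1] * 5 + [0] * 95

# A trivial "model" that never flags anyone.
always_negative = [0] * 100

report = summarize(labels, always_negative)
```

Here the trivial model reaches 95% accuracy while detecting none of the five high-risk patients, which is why sensitivity and the false positive rate need to be tracked together rather than relying on accuracy alone.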
Could you briefly present your PhD supervisor and co-supervisor?
My supervisor is Prof. Ali Idri, who has strong expertise in machine learning and software engineering. My co-supervisor, Prof. Pascal Frossard, brings deep expertise in signal processing and machine learning, which has helped me approach my research from different technical angles. Together, they provide complementary perspectives that strengthen my work.
What are the advantages of a co-supervision between your African supervisor and your EPFL co-supervisor?
Co-supervision allows me to benefit from both worlds: a deep understanding of local medical needs and data contexts, and top-tier scientific expertise from EPFL. The African supervision keeps the work grounded and relevant, while the European co-supervision brings broader methodological perspectives and international visibility. This combination enables me to produce research that is both locally meaningful and rigorous by global standards.
How will the collaboration with EPFL help you meet the scientific challenge described in the previous question?
EPFL provides an exceptional environment, with advanced expertise, research groups working at the forefront of machine learning, and access to strong scientific resources. Working with Prof. Frossard has been particularly valuable; he has been available whenever I needed guidance, and our discussions have helped me think more carefully about the most technical aspects of the project. The lab environment, seminars, and exposure to the work of other researchers have also broadened my perspective and helped me structure my research in a more rigorous and innovative way.
Can you tell us about your stay in Switzerland? How did you prepare for your move to Switzerland?
The process from Morocco, particularly the visa procedure with the embassy, was smooth and straightforward, which certainly helped. Settling in Lausanne, however, was more difficult. The most challenging part was finding housing and handling administrative procedures, which required time and effort. Once those steps were sorted out, everything else went smoothly. The hosting lab members, including Prof. Frossard, were very welcoming and kind, which made my stay both pleasant and memorable. Kelly and Carine from EXAF were also extremely helpful and supportive; whenever I had questions or needed guidance, I knew I could count on them.
Do you have any funny or unexpected anecdotes to share with us about your stay?
One small anecdote is that I never quite managed to navigate the EPFL campus properly. There were far too many shortcuts to memorize, so I often ended up taking the longer route or walking into the wrong building. It was not all bad though, as it helped me discover new parts of the campus along the way.
What does excellence mean to you?
Excellence, to me, means producing work that is both rigorous and meaningful. It is about asking the right questions, pursuing answers with intellectual honesty even when the results challenge your initial hypotheses, and maintaining high standards and integrity throughout the process.