CYBER-DEFENCE FELLOWSHIPS: Ana-Maria Cretu

Marc Luna; Ana-Maria Cretu

To promote research and education in cyber-defence, EPFL and the Cyber-Defence (CYD) Campus will soon launch a new call for Doctoral and Distinguished Postdoctoral Fellowships – A Talent Program for Cyber-Defence Research.
This month we introduce you to Ana-Maria Cretu, a CYD Distinguished Postdoctoral Fellowship recipient in the Security and Privacy Engineering Laboratory (SPRING) at EPFL.

How did you find out about the CYD Fellowships and what motivated you to apply?

I learned about the CYD fellowship from my current advisor at EPFL, Prof. Carmela Troncoso. I reached out to her inquiring about postdoc openings and she told me about the opportunity to do one jointly with the Cyber-Defence (CYD) Campus. I was very excited about the possibility to collaborate with CYD’s researchers and to work on applied projects that have a concrete impact in cyber-defence. In my research, I study privacy and security of emerging data-driven technologies such as generative AI and synthetic data. I always find it more interesting to ground my research in real-world applications, and the CYD has plenty of those. Furthermore, at this stage in my career it is beneficial to expand my network and I thought I would benefit from exposure to CYD’s network of researchers and industry partners.

What was your CYD Fellowship project about?

The goal of my CYD fellowship project was to study the privacy and security of emerging data-driven technologies such as generative AI and synthetic data. In my main lead-author project, I study whether it is possible to prevent harmful content generation by general-purpose image generative AI models. Recently, models such as DALL-E 2 and Stable Diffusion have achieved impressive capabilities in generating photorealistic images from natural language descriptions. However, they are also used to create photorealistic AI-generated child sexual abuse material of real and fictional children on an unprecedented scale. This has led to calls for technical approaches to prevent the generation of this harmful and illegal content. In this project, I am implementing and evaluating a defense called concept filtering. This project is the result of an interdisciplinary collaboration with researchers in security, computer vision, and online abuse from Switzerland, the US and Germany, which I am leading. I found this collaboration to be a very enriching experience from a professional standpoint. Apart from this big project, I am also working on several projects relating to the privacy of synthetic data and other machine learning applications.

What were the advantages of conducting your postdoctoral project at the CYD Campus?

I see four main advantages. First, I was assigned a mentor from the CYD Campus, Dr. Raphael Meier, who became my collaborator. His presence was especially important when I arrived in Switzerland. He introduced me to other people and helped me familiarize myself with the CYD. Second, I was invited many times, through the CYD network, to present my work at events open to large audiences, including the CYD conference, where I gave a talk in front of 300 people. I met a lot of people at these events, which has opened up new collaborations within the local Swiss network. Third, I have benefited from the CYD’s generous funding for traveling to conferences and paying for other research costs. I have used this funding to, for instance, attend a Dagstuhl seminar. This seminar gathered an interdisciplinary group of privacy experts, both computer scientists and legal scholars, from all over the world to create together an independent evaluation framework for privacy enhancing technologies, with a particular focus on detecting privacy washing. Finally, I would like to acknowledge the large-scale computing resources provided by the CYD and the help of the personnel responsible for the cluster whenever I needed it. Both have been instrumental in the success of my research as part of the CYD program.

Did you as a child dream of working in cyber-defence?

No, my dream as a child was to become a mathematician. After high school I went on to study mathematics, but over time I shifted to data science and computer science for my master's studies, and later to data privacy for my doctoral studies. These shifts were driven by my desire to work on real-world applications, and later on the intersection between machine learning, privacy and security. In the end, I believe I have found a good compromise between my childhood dream of becoming a mathematician and what I do in my day-to-day research: I research privacy and security topics grounded in real-world applications, but I always end up using a formal approach in my research and I go very deep into things. These approaches are driven by my background in mathematics.

What is driving you to pursue research in cyber-defence?

What is driving me is curiosity and skepticism about big claims relating to data-driven technologies and automation more generally. With recent developments in AI, we are seeing a push by big tech and governments to adopt data-driven technologies in our day-to-day lives. This push creates a feeling of powerlessness and inevitability that automation is the only way to go. At the same time, we do not understand well enough the limits of these technologies (whether they can really do what they promise to) and the harms they create for society (what are the trade-offs that we have to accept if we really want to adopt them widely). Yet, building this understanding is very important for society, including for cyber-defence, to help governments make decisions that do not put national security at risk.

I was fortunate to study these topics, namely the limits and privacy harms of data-driven technologies, during my doctoral thesis. For instance, I developed ML-based tools for automatically discovering privacy vulnerabilities in data releases, enabling more comprehensive auditing of these systems. I also evaluated the robustness of client-side scanning solutions for detecting illegal content in end-to-end-encrypted communications. Finally, I explored new threats and attack methods against machine learning models. Working on these projects has made me aware of the importance of rigorously evaluating claims made about the capabilities and privacy of data-driven technologies, and of developing tools and frameworks for making these evaluations more accessible to practitioners and policymakers.

What is the most important lesson you have learned in your scientific career so far?

The most important lesson I learned is that collaboration is crucial for doing excellent work. As no two people are the same, in terms of expertise but also personal research taste, whenever I collaborate closely with other people we end up pushing the work in slightly different directions. This means that the common output is richer and more compelling to a broader audience than what would have been possible working alone. I am also a very curious person and I enjoy learning all the time, so I find that through collaboration I can learn more about the problem at hand through others’ perspectives and expand my research skillset by observing how others approach the same problem.

What are you most proud of in your career to date?

I am proud of the contributions I made during my PhD work.

One of these contributions is developing the first automated method, along with an open source tool, to analyze the privacy of query-based systems. Modern privacy regulations, such as the European Union’s General Data Protection Regulation (GDPR) and Switzerland’s New Federal Act on Data Protection (nFADP), impose strict limits on the sharing and use of personal data. Sharing de-identified record-level data has been shown repeatedly to not satisfy the definitions of anonymization of these laws, and has thus fallen out of favor. Query-based systems (QBSs) are a popular alternative to record-level releases, where analysts are given access to the data through a controlled interface, i.e., they can query a dataset for aggregate statistics without directly accessing individual records. Ensuring that QBSs provide adequate privacy protection is however extremely challenging: QBSs often implement complex combinations of defenses, making their privacy difficult to analyze, and manually designing and implementing attacks against complex and expressive QBSs is a difficult and time-consuming process. To address this problem, I developed QuerySnout, the first automated method for discovering privacy vulnerabilities in QBSs. QuerySnout is a general method that uses evolutionary search techniques to explore the space of queries susceptible to privacy attacks, and machine learning to combine the answers to those queries in order to infer sensitive information about individuals. My work enabled for the first time the automated search for privacy attacks against QBSs, at the click of a button, making privacy evaluations more accessible to data practitioners.
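To illustrate why aggregate-only access does not by itself protect individuals, here is a minimal sketch of a classic differencing attack against a toy query-based system. It is not QuerySnout and not code from the interview: the dataset, the `count_query` interface, and the bounded-noise defense are all hypothetical, chosen only to make the idea concrete.

```python
import random

# Hypothetical toy dataset: (name, age, has_condition). All values are made up.
RECORDS = [
    ("alice", 34, True),
    ("bob", 41, False),
    ("carol", 29, False),
    ("dave", 52, True),
]


def count_query(predicate, noise=0, rng=None):
    """Toy QBS interface: return how many records satisfy `predicate`.

    If `noise` > 0, a uniform integer in [-noise, noise] is added,
    standing in for a simple (assumed) noise-addition defense.
    """
    true_count = sum(1 for record in RECORDS if predicate(record))
    if noise == 0:
        return true_count
    rng = rng or random.Random()
    return true_count + rng.randint(-noise, noise)


def differencing_attack(target_name, noise=0, rng=None):
    """Guess the target's sensitive attribute from two aggregate answers.

    The two queries differ only in whether the target is included,
    so without noise their difference equals the target's value.
    """
    with_target = count_query(lambda r: r[2], noise, rng)
    without_target = count_query(
        lambda r: r[2] and r[0] != target_name, noise, rng
    )
    return with_target - without_target >= 1


print(differencing_attack("alice"))  # exact answers reveal alice's attribute
print(differencing_attack("bob"))
```

With exact answers the attack always recovers the target's attribute. Once noise and other defenses are layered on, finding query combinations that still leak becomes much harder to do by hand, which is the search problem that automated methods aim to address.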

Another contribution is proposing the first evaluation of robustness of perceptual hashing-based client-side scanning (PH-CSS) to adversarial evasion attacks. End-to-end-encrypted (E2EE) communications have been argued by law enforcement agencies to facilitate the sharing of Child Sexual Abuse Material (CSAM), by hiding the content of the communication. To address this concern, PH-CSS solutions have been proposed by governments, researchers, industry, and child safety organizations as the most promising solution that would not altogether remove E2EE. PH-CSS would indeed detect illegal content directly on the user's device before encryption. Given the pervasive scope of PH-CSS deployment, these proposals have been strongly criticized by privacy and security researchers. Our evaluation of PH-CSS solutions showed that they cannot reliably detect CSAM in the presence of adversaries, as bad actors can almost always imperceptibly modify an image to evade detection. The results of this evaluation have contributed critical evidence to the worldwide debate about whether CSS solutions can reliably detect illegal content such as CSAM, and are often referenced by researchers and policymakers. For instance, our paper was cited by Ofcom, the UK's communication regulator, in its report on perceptual hashing technologies, and by researchers in an open letter on the EU’s proposed Child Sexual Abuse Regulation.
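For readers unfamiliar with perceptual hashing, the matching step can be sketched as follows. This is a simplified illustration using a basic "average hash" on a tiny grayscale grid. It is an assumption made for clarity, not one of the algorithms evaluated in the paper, and the pixel values and threshold are invented.

```python
def average_hash(pixels):
    """Return a list of bits: 1 where a pixel is brighter than the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]


def hamming(a, b):
    """Count the positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def matches(pixels, known_hashes, threshold=2):
    """Flag an image if its hash is within `threshold` bits of any known hash."""
    h = average_hash(pixels)
    return any(hamming(h, known) <= threshold for known in known_hashes)


# Hypothetical 4x4 grayscale image (values 0-255) and a one-entry database.
image = [
    [10, 200, 30, 220],
    [15, 210, 25, 230],
    [12, 205, 35, 215],
    [20, 190, 40, 225],
]
database = [average_hash(image)]

# A slightly brightened copy keeps the same hash, so it is still flagged.
brighter = [[min(255, p + 5) for p in row] for row in image]
# An unrelated image (here, the inverted one) lands far from the stored hash.
inverted = [[255 - p for p in row] for row in image]

print(matches(brighter, database))
print(matches(inverted, database))
```

Detection therefore depends on the distance between hashes staying below a threshold. The evasion attacks discussed above target exactly this: changes that are imperceptible to a viewer but that push the hash past the threshold.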

Outside the lab, what do you enjoy doing most?

My favorite hobbies are reading and listening to music. For more than 10 years since I left Romania, I have been reading mostly in English and French, but more recently I have returned to my roots, reading mostly in Romanian. These days, I particularly enjoy reading Romanian and Moldavian female authors. I also spend a lot of time listening to and discovering music from a lot of genres and languages.

What were your expectations about the CYD Fellowships?

I expected the CYD fellowship to provide me with freedom to pursue my research objectives and to help me transition towards a senior academic role; to provide a platform for my research by allowing me to present my work to a broad audience of researchers, industry partners, and government actors; and to give me exposure to applied topics at the intersection between my research area and applications relevant to the CYD Campus and its broader community. These expectations were fulfilled.

Could you share some tips with future applicants who are considering applying for the CYD Fellowships?

I have two pieces of advice: to consider how your research proposal is relevant to the CYD, and to contact previous postdoctoral fellows to better understand the requirements of the CYD fellowship. To future applicants: do not hesitate to reach out if you have questions.

21.08.25
