CYBER-DEFENCE FELLOWSHIPS: Andrei Kucharavy

© 2021 Andrei Kucharavy

To promote research and education in cyber-defence, EPFL and the Cyber-Defence (CYD) Campus launched in summer 2020 the second call for the CYD Fellowships – A Talent Program for Cyber-Defence Research. In October 2020, the CYD evaluation committee announced the recipient of the second CYD Distinguished Postdoctoral Fellowship.

Andrei Kucharavy has been awarded a two-year CYD Distinguished Postdoctoral Fellowship, which runs from December 2020 to November 2022. The award includes a salary contribution, research and conference funds, and the opportunity to engage with experts from the CYD Campus. Andrei is in the third year of his postdoctoral research in the Distributed Computing Lab at EPFL.

  1. How did you find out about the CYD Fellowships and what motivated you to apply?

During my PhD, I had to manage a small secure subnet, right around the time when a wave of crypto-locker malware hit hospitals and research labs. Despite user training, the credentials used to access the subnet I was managing were compromised several times through phishing. Humans are often the weakest link in the security of cyber-physical systems, and they can be fooled even by unsophisticated, copy-paste messages. The idea of what an attack could achieve by using generative AIs to impersonate real people and learn how best to phish access credentials was terrifying.

Another terrifying prospect was what generative AI could do for large-scale social engineering attacks. The danger of social networks as a vector of cyber-attacks had gained visibility, but the damage that could be done by just a couple of humans with a simple “like-and-repeat” robot network was hard to anticipate. 5G – COVID conspiracies, spread this way, led to the burning of 5G towers across the UK and the US. US presidential election fraud conspiracies led to the January 6 attacks, which in turn allowed unauthorized access to highly confidential information from terminals that were left unlocked, as well as the exfiltration of devices containing secret information. “Normal” people who took them as souvenirs were then contacted by foreign intelligence offering to take those devices off their hands for a good sum of money. At no point did the attackers in cyber space have to expose themselves or endanger any of their assets to destroy critical infrastructure or acquire confidential information. Now imagine how much more powerful, and how much harder to detect and counter, their attacks would be if, instead of humans with imperfect English and stale memes and simple robots, they were using generative AIs capable of learning how to impersonate real people and what to say in order to achieve the desired outcome.

My research on using the theory of evolution to improve the training of generative AIs called GANs (generative adversarial networks) seemed to offer ways to address that situation, and I was looking for opportunities to investigate that further. So, when I saw the call for applications for the CYD Fellowship sent through the EPFL mailing lists, I jumped on the opportunity right away.

  1. What is your CYD Fellowship project about?

I hunt AIs that try to impersonate humans. For that I use mathematical abstractions from the theory of evolution that suggest optimal ways to train systems that compete with other systems in a constantly changing environment. The goal is to create distributed algorithms to train swarms of “simple” AIs capable of detecting generative AIs that are much more complex than themselves. The distributed aspect is critical for the venture – both from an implementation perspective and from an ethical standpoint. From an implementation perspective, when faced with attackers that can access months of compute time on specialized super-computers to train their AIs, the only way to make sure the defence against them is available to everyone is to distribute the training across many computing nodes. From an ethical perspective, it prevents the emergence of a single authority that can decide who is a robot and who is not. Instead, we can have detectors run by different institutions (or even people) coming to an agreement on whether an online persona is real or not. The fascinating part is that thanks to Byzantine-resilient computing (or in this case Byzantine-resilient machine learning), we can ensure that the agreement would still be good even if several members of the detection network are dishonest or are even trying to aggressively compromise the network.
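
As a rough illustration of why such an agreement can survive dishonest members, here is a minimal sketch that combines per-detector scores with a median instead of an average. It is not the method used in the project: the median is only a simple stand-in for a Byzantine-resilient aggregation rule, and the function name, the 0.5 threshold, and all score values are assumptions made up for this example.

```python
from statistics import median


def persona_is_bot(scores, threshold=0.5):
    """Hypothetical helper: combine detector scores (0 = human, 1 = bot).

    The median ignores extreme values as long as fewer than half of the
    detectors are dishonest, so a minority cannot drag the verdict.
    """
    if not scores:
        raise ValueError("need at least one detector score")
    return median(scores) >= threshold


# Illustrative, made-up scores: four honest detectors flag the persona,
# three compromised detectors claim it is certainly human.
honest = [0.7, 0.75, 0.8, 0.65]
dishonest = [0.0, 0.0, 0.0]
all_scores = honest + dishonest

median_verdict = persona_is_bot(all_scores)      # median-based agreement
mean_score = sum(all_scores) / len(all_scores)   # naive average, for contrast
mean_verdict = mean_score >= 0.5

print("median verdict (bot?):", median_verdict)
print("mean verdict (bot?):", mean_verdict)
```

In this toy setting the plain average is pulled below the threshold by the three dishonest detectors, while the median still reflects the honest majority. Real Byzantine-resilient machine learning has to protect the training process itself, not just a final vote, so this only conveys the intuition.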

  1. What are the advantages of conducting your post-doctoral project at the CYD Campus?

I can work on the application of my research in cyber-defence. Rather than treating it as an afterthought or a side application of academic research, I can go in depth building and testing algorithms and methods. I also have access to computational resources that are hard to obtain elsewhere. Finally, perhaps the most important part is the human environment. I have an additional supervisor interested in cyber-defence topics and their implications. I am in close contact with other people working on cyber-defence topics that potentially have a strong synergy with my own. It is a pretty awesome environment to be working in – I really have the impression of contributing to the emergence of something completely new and yet crucially important.

  1. Did you as a child dream of working in cyber-defence?

When I was a child, the concept of cyber-defence, or even cyber-security or cyber-warfare, didn’t really exist (except maybe in science fiction). However, in the mid-1990s, when I first got on the Internet, I saw it before Google, spam filters, anti-virus software and social networks made it safe and friendly. The idea that no one on the Internet was necessarily who they said they were (the proverbial “On the Internet, nobody knows you're a dog”) and the constant risk that a click could take you to a dark corner of the Internet or download a virus to your computer definitely pushed me towards an “assume cyber-space is hostile and defend yourself” mindset. Regarding cyber-defence itself, I became aware of it around the time of my PhD and realized I could work in this field at the beginning of 2020, when I started seeing how my research was related to it.

  1. What is driving you to pursue research in cyber-defence?

The most prominent reason of all is that generative AIs in their current form are an unmitigated existential risk to societies around the world. Over the last decade we have become more and more aware that large-scale social engineering attacks in cyber space are a real danger – be it by spreading anti-vaccine sentiment, instigating genocide through social media, or organizing politically motivated attacks on government buildings or crucial infrastructure. All of this can be kicked into overdrive if more intelligent AIs take the place of simpler robots. From the cyber-warfare point of view, it is like the appearance of air combat in the early twentieth century – an entirely new plane of cyber-defence that needs to be understood. The second reason is that cyber-defence is fascinating in and of itself and brings together people from widely different backgrounds. Seemingly unrelated topics come to interact and synergize, opening new vectors of attack. You are racing to discover, assess and mitigate them before a potential attacker does. I am a so-called “white-hat hacker”, and yet I don’t really “hack” anything. I study evolution and generative learning to prevent people from breaking into critical cyber-physical systems.

  1. What is the most important lesson you have learned in your scientific career so far?

This is a good question. Completing a PhD and conducting active research teach you quite a lot of valuable lessons. Perhaps the one I find the most useful in my day-to-day research and life in general is that scientific research is hard. Really hard. It’s even harder to explain to others – including your peers. You will never have the entire picture, and theories that seem to explain what is known today are likely to fall apart with the data that will emerge tomorrow. However, as you advance in your research, you see more and more possible interpretations of the same data. You also see how some of them, despite being less believable at first, keep surviving and performing well. That difficulty of research also implies that the conclusions of a lot of papers, which might look like definitive proof, are to be taken with a grain of salt. There are a lot of things going on in the background that you might not be aware of unless you have tried doing research in that domain yourself. Papers advance the field; some will stand the test of time, but others will lead you astray. The reality is that not all papers and results are equivalent, and figuring out which ones are more trustworthy is hard work and requires a lot of contextual knowledge.

  1. What are you most proud of in your career to date?

This is another very good question. During my PhD in computational systems biology I had the opportunity to contribute to several findings that are landmarks in their respective fields and have profound implications for how we understand and can treat serious diseases – such as cancer, Alzheimer’s disease, or multi-drug resistant infections.

However, from an intellectual standpoint, I am even more proud to have reached the level of deep conceptual abstractions, explaining observations and predicting things in several seemingly unrelated domains. Problems considered wide open in one field can be well-explored in another, but the difference in language, parametrization and conceptual framework will hinder the transfer of knowledge. I started my PhD by looking at the implications of byzantine-resilient machine learning for biological organisms, but I am now applying results from the theory of evolution in biology to machine learning. Being able to see such deep regularities and extract concrete applications from them is truly amazing.

  1. Outside the lab, what do you enjoy doing most?

I love sports and nature. I especially enjoy skiing in the winter, both in resorts and in the back-country, and I am also a competitive swimmer and an amateur long-distance runner.

  1. What are your expectations about the CYD Fellowships?

The CYD Fellowship provides me with an opportunity to translate the results of my research into concrete applications in cyber-defence and present them to other actors in cyber-defence in Switzerland who can turn these results into products to be deployed on a large scale. Generative learning poses a threat that is particularly marked for Switzerland. Large-scale social engineering attacks were until recently hard to perform here because there are several distinct languages and cultures that need to be attacked in a coordinated manner. Generative learning has radically changed the situation. Generative AIs are able to learn the languages and cultures all by themselves, and the regular popular votes provide a rapid feedback loop. I really hope to contribute to solutions that will make Switzerland and the rest of the world safer.

  1. Could you share some tips with future applicants who are considering applying for the CYD Fellowships?

This is yet another great question. Probably the biggest tip would be not to hesitate to apply, and to spend time explaining the cyber-security problem you are trying to address in simple terms and with enough context. Cyber-warfare is built on top of “hacker” culture, which is all about using seemingly unrelated means in innovative ways to make complex cyber-physical systems do what you want them to. Cyber-defence requires thinking outside the box and often using expertise from other domains. You might feel that since you are not a “real hacker” you don’t qualify for the CYD Fellowships, but if the problem you want to address is a neglected cyber-defence risk, you are the right person. For that, you need to clearly explain what problem you are solving, why it is a risk for the cyber-security of Switzerland, why it has not been solved before, why now is a good time to do it and what you are bringing to the table. Most people reading your proposal are not going to be experts in your domain. You need to make sure you provide enough background and use simple enough terms for them to understand.