No quick fix for targeting misinformation on the internet
Sabine Süsstrunk and Karl Aberer, two EPFL professors, emphasize the role of human action – and not technology – in curbing misinformation.
During their back-to-back presentations at the recent SACM Annual Conference in Lucerne, Sabine Süsstrunk and Karl Aberer dashed attendees’ tech-fueled hopes. “Don’t expect technology to come to the rescue,” says Süsstrunk, who heads EPFL’s Image and Visual Representation Lab. “The best way to recognize deep fakes is still with the human eye and our discerning intelligence.” Meanwhile, Aberer, who heads EPFL’s Distributed Information Systems Laboratory, was no more optimistic about a one-size-fits-all, technology-based approach to AI-generated misinformation: “We’ll have misinformation for as long as the sources of what we find online are kept anonymous.”
As researchers, Süsstrunk and Aberer are able to take a healthy distance from the hype surrounding AI and its role in generating misinformation. But they’re keenly aware of the dangers it poses to the public. Their presentations discussed the cutting edge of technology and highlighted how government oversight, education and source identification all touch on the fundamental issues of information science and public trust. We spoke with Süsstrunk and Aberer just after the conference to get their thoughts on the topic of AI-generated misinformation.
Prof. Süsstrunk, you come from signal processing, whereas Prof. Aberer, you’re a computer scientist. Are your fields now converging with the emergence of text-to-image generative AI?
Karl Aberer (KA): There’s indeed something interesting going on. Our fields are very different, yet we’re seeing some convergence, and it doesn’t have to do with field-specific knowledge. Signal processing is essentially applied mathematics, which is about understanding low-level signals. And computer science involves building representations in the form of computer models. And Sabine, your work involves turning signals into models, right?
Sabine Süsstrunk (SS): Yes, kicking and screaming, by the way.
KA: In computer science, we had increasingly realized that we could incorporate signals into our models. Now the two fields converge because the models are produced using large language models. The signals are, in a sense, going away because you don’t need signal processing anymore. All the earlier work that was done has become obsolete.
SS: Absolutely, there’s convergence in that sense. We’d always been proud of designing our algorithms “by hand.” We knew exactly how they worked and described them through mathematical equations. This was the applied mathematics part of signal and image processing. But then, neural networks and deep learning models came along. Now we have models that are used on many images and for many applications – but we don't know exactly how they work.
Do you feel like you have to constantly play catch-up to the tech giants?
SS: No, I don’t. First of all, it’s inspiring to see that our field is going somewhere exciting. And second, we’re clever enough to find opportunities that others might not be working on. For example, we developed a very successful segmentation algorithm. Today it’s no longer viable, but because I created the first one, I understood what happened when a new one came along and replaced it. What’s more, we’re always coming up with fresh ideas, saying: “Here’s an interesting direction, and I haven’t seen anybody actually doing something with it.”
KA: I agree that we’re not playing catch-up. In my view, we simply address new problems as they arise. For instance, for a long time in computer science we were highly focused on semantics. But now we have a wonderful semantics-generating engine that has eliminated this need. In fact, the engine does it much better than we did since it’s able to crunch big numbers. We therefore moved our focus back to optimization because computer models are quite expensive to develop and run, and you have to think very hard about how to make them practical for many applications. That was great for me personally because my background is in databases, which are essentially an optimization problem. Now our field is going back to data management systems. Today’s models will generate tons of data and we don’t yet know what to do with them or how to generate and process them efficiently.
The idea of not understanding exactly how generative AI works gets to the issue of how to regulate it. Prof. Süsstrunk, as you mentioned in your presentation, many big tech companies could be hiding behind the fact that AI is a black box – the engineers themselves don’t actually know how some of the systems work. Do you think that’s true, or is it a way to avoid responsibility?
SS: I think it's a matter of both. First, I do believe big tech companies are using the black box claim as an excuse because they don't want to be regulated. But it’s true that their AI models were trained on zillions of data points and have zillions of parameters. It makes sense that if someone changes a given parameter x, they can’t totally predict what the outcome will be. That said, if the big tech companies know what data sets their AI models were trained on, if they themselves programmed and trained those models, then they should have a pretty good idea and should take responsibility for the models’ output. There is something to the statement by big tech companies that they “don't know how everything works.” But to then magically conclude that they can wash their hands of the models’ output is, to me, a way of shirking responsibility.
KA: Some very new things are happening. Models for generating text went through a phase change two or three years ago – they broke through a threshold and suddenly performed radically better. This was clearly linked to the size of the models. But today we’re already way past that threshold. To extend the phase change metaphor, generative AI models are transitioning into a new state. Maybe it has to do with the sheer amount of data that are available. But we have to be careful because engineers don’t understand the details of what goes on in the models. That said, I think they do have a good understanding on a large scale.
We should also bear in mind that these models will one day be able to talk to each other. They won't need us anymore, and we basically don't know what’s going to happen when they begin to interact. Models are already starting to develop their own language. They can generate text that’s illegible to humans but not to other models. So we won't even understand what they’re saying to each other. That's quite interesting, right?
Prof. Aberer, in your presentation you stressed the role of reputation management in curbing misinformation. As you clearly illustrated, the reputation of a source can’t be separated from its identity, and it’s often more efficient to identify reliable sources than to look just at the content itself. Do you see any reliable identification systems on the horizon?
KA: Methods for source identification already exist. For example, scientific publishing is strongly rooted in establishing authors’ identities. It's very hard to get a paper published in a well-reputed journal without verifying that you’re indeed the one who wrote it. There may be the occasional cheater, but not on a large scale. News outlets are a bit similar. The New York Times has a strong reputation; if something comes from The New York Times, then you probably don’t have to fact-check. So there won’t be just one solution but rather a combination of solutions. The big problem today is that people can spread all sorts of information without verifying their identities. I think most internet users aren’t aware of the scale of the problem, or simply don't believe it exists at all.
To wrap up, if we can’t rely solely on technology to combat misinformation, do you think there are any regulatory measures that could make a difference?
SS: In my presentation, I discussed the leaps that have been made in deep-fake technology since 2017. Developments are happening so fast that if regulators want to keep on top of them, they’ve got to stay one step ahead of the developers. And I worry that our political system isn’t built for that kind of fast response.
KA: I agree. What makes things worse are the financial interests driving these developments. Even if our political system did allow for the kind of technological understanding that would enable policymakers to regulate effectively, embedded financial interests would get in the way.
SS: I believe it comes down to education more than anything. AI also has the potential to open up amazing learning opportunities. So we’ve got to teach people the right way to use the technology while also being aware of its limitations. I think that’s the most immediate threat to address.
KA: Yes, AI should be viewed as a tool. It will speed up the way we process information, much like previous technological advances. Looking ahead, we’ll probably progress in tandem with the technology since AI systems will be at least as powerful as we are – if not more so – in processing information. And we’ll probably be interacting with systems that are doing more than we even understand. One example of this is the fact that some systems have already developed their own language – it’s almost like a next step in human development. This is pure speculation on my part, but it points to the challenges inherent in regulating this kind of technology, provided there’s the political will and capacity to do so.