Metadata: the traces you didn’t know you were leaving online

Message encryption has become the de facto standard for protecting online communication. Yet most people are unaware that whom they text and when (their so-called metadata) can be monitored, and can sometimes give away information just as valuable as the actual message content. Researchers from the Department of Computer Science at Aarhus University have set out to counteract the problem by designing tools to protect people’s metadata, and are creating the first instant messaging solution with provable privacy for both message content and metadata.

Online communication platforms such as Slack, Discord, and WhatsApp all provide certain privacy guarantees to protect their users. However, most platforms only offer message confidentiality, not metadata confidentiality. This means that only what is said is confidential, but when it is said and to whom is not protected.

The potential usage of metadata seems endless: if you know who is texting whom, when, and how often, you can infer a lot of information about people and their behavior. You might for example learn which apps someone uses, who their friends are, and exactly when and how often they use their device, solely based on metadata.

“Protecting message content is good, but it isn’t always enough. For instance, if you are texting a domestic abuse hotline or a politically radical group, what is written in the message isn’t necessarily the most interesting information but rather the fact that you are in contact with these organizations,” explains Boel Nelson.

Nelson is a postdoc at the Department of Computer Science at Aarhus University and leads the MSCA-funded project Provable Privacy for Metadata. The project runs for two years and also includes associate professor Aslan Askarov. By proposing to combine their expertise from different areas within computer science, they have been selected by the European Commission to tackle metadata privacy from a new perspective.

The extent of the privacy threat

“We don’t yet have the imagination to fully grasp what metadata can be used for, but we know it can be collected,” Aslan Askarov continues. Currently, it is technically possible to collect metadata just from observing the presence of online communication. Someone who has direct access to a network, such as an internet service provider or a network administrator, could in other words easily collect metadata about the connected people or devices.

“The law dictates that metadata should be protected, but in real life, we don’t yet have the technical solutions to comply with that. Legislation might help push the tech industry to come up with suitable solutions. However, we need people to be more aware and engaged in this topic to create solutions that truly protect them. Otherwise, we might end up with a quick fix like the cookie banner that no one really benefits from,” Askarov assesses.

“What’s worrying on a societal level is that most people are unaware that their metadata can be collected, and how metadata poses a privacy threat. If people don’t know about the threat, they can’t make informed choices on how to protect their privacy; and even if they were aware, the available options out there are insufficient to protect metadata,” Nelson continues.

To eventually make metadata privacy a reality, Nelson and Askarov are focusing on creating new, improved versions of protocols used in mainstream applications. Most recently, the researchers designed a metadata-private protocol for instant messaging.

No direct communication visible

To create a metadata-private protocol suitable for instant messaging, Nelson and Askarov built on top of the Signal protocol, which is used in popular messaging services such as Signal, Facebook Messenger, and WhatsApp. Their protocol takes a new direction compared to prior research: instead of protecting the metadata of all communication, the protocol protects the metadata of some communication.

“Ideally we would want to protect all communication all of the time, but this has been a difficult problem which researchers have worked on solving since the 1980s, so in this project we’re instead exploring if it’s possible to do the next best thing,” Nelson explains. Nelson and Askarov build on the observation that certain communication may be less sensitive than other communication—for example it is unsurprising that we are communicating with our friends, family, and coworkers—which presents an opportunity to use this non-secret communication to hide the more sensitive interactions.

“Let’s say a civil servant wants to share a story with a journalist, but they don’t want anyone else to learn that they communicated with the journalist. They can’t leave a digital trace directly to the journalist, but they can be helped by the fact other people have regular contact with the press. Using our system, all messages contain both a non-secret message and a so-called stealthy message, which may be empty, and all messages need to be passed through a server before they reach the recipient. When the civil servant messages their friend, a stealthy message for the journalist can be attached to the non-secret message to make its way to the message server. When someone else then messages the journalist, the message server passes on the stealthy message which will only be visible to the intended receiver,” Nelson explains.
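
The piggybacking idea Nelson describes can be illustrated with a toy model in Python. This is only a sketch of the message flow in the civil servant example, not the researchers’ protocol: the names Envelope, RelayServer, and stealthy_to are hypothetical, and the encryption and message padding that a real system would need to keep the stealthy part hidden from the server and from observers are left out entirely.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Envelope:
    sender: str
    recipient: str                      # visible to anyone watching the network
    cover: str                          # the non-secret message
    stealthy_to: Optional[str] = None   # hypothetical field: hidden destination
    stealthy: str = ""                  # stealthy message, may be empty


class RelayServer:
    """Toy relay: every message passes through here (no cryptography)."""

    def __init__(self) -> None:
        self.pending = defaultdict(deque)  # stealthy messages waiting per recipient
        self.observed = []                 # (sender, recipient) pairs an observer sees

    def send(self, env: Envelope) -> Envelope:
        # An observer only learns the visible sender and recipient.
        self.observed.append((env.sender, env.recipient))
        # Store the stealthy part until someone writes to its destination.
        if env.stealthy_to is not None and env.stealthy:
            self.pending[env.stealthy_to].append(env.stealthy)
        # Attach a waiting stealthy message for the visible recipient, if any.
        queue = self.pending[env.recipient]
        attached = queue.popleft() if queue else ""
        return Envelope(env.sender, env.recipient, env.cover, None, attached)


server = RelayServer()

# The civil servant messages a friend; the tip for the journalist rides along.
to_friend = server.send(
    Envelope("civil_servant", "friend", "Lunch tomorrow?", "journalist", "Story tip")
)

# Later, someone else messages the journalist; the server passes the tip on.
to_journalist = server.send(
    Envelope("colleague", "journalist", "Press release attached")
)
```

In this toy run, the journalist receives the tip in to_journalist.stealthy, while the list of observed sender and recipient pairs never contains a direct link between the civil servant and the journalist.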

Some niche services for metadata privacy already exist, like Tor (The Onion Router), but an unfortunate side-effect of such niche tools is that they have small user bases, which makes them easy targets. For example, the fact that few people use a tool may in itself make the use of the tool suspicious. Making matters worse, a specialized tool, such as Tor, is easy for network administrators to block since it is easy to detect. With more users, however, it is possible to offer stronger metadata privacy. “Popular services benefit from a phenomenon called the Cute Cat Theory of Censorship, which says that when a platform has more users, it is harder to block,” Askarov remarks. In other words, blocking a mainstream service like WhatsApp or Facebook Messenger is more difficult from a social perspective than blocking a specialized tool such as Tor.

Fact box: Provable Privacy for Metadata

Provable Privacy for Metadata is a two-year project funded by the European Union under Horizon Europe’s Marie Skłodowska-Curie Actions (MSCA). The aim is to address the problem of metadata privacy and provide practically feasible solutions that give formal privacy guarantees. That is, the solutions will both be designed to work in realistic settings, e.g., on battery-constrained devices such as phones, and proven to be free from unintentional information leakage. By combining Nelson’s background in privacy and Askarov’s expertise in formal methods, the project covers comprehensive solutions ranging from conceptual design to implementations of protocols in software.

For more information, see the project’s website:

https://cordis.europa.eu/project/id/101064140

Works both in theory and in practice

The extended Signal protocol is still a few years from being implemented in real services. In the past, solutions would tend to focus on either being practical, or proven secure in theory, but rarely both. Nelson and Askarov, on the other hand, have a proof-of-concept they have shown to be both practical and theoretically sound. The next step for Nelson and Askarov’s protocol would be a real-life large-scale test.

“Businesses are beginning to realize that people want proper privacy and that it can provide a competitive edge. Hopefully, a platform like Signal will be interested in investigating the problem, so we can develop our work further; there are still more practical challenges and academic questions that will be relevant to address before our research can benefit people,” Askarov says.

The scientific community is already welcoming Nelson and Askarov’s progress. Most recently, their paper “Metadata Privacy Beyond Tunneling for Instant Messaging,” co-authored with assistant professor Elena Pagnin from Chalmers University of Technology, was accepted for publication and will be presented at the 9th IEEE European Symposium on Security and Privacy in Vienna this summer.

In the meantime, the two researchers are in full swing. There is still plenty of exciting work awaiting before they have reached their dream goal: to bring metadata privacy to everyone, making it possible for people to defend themselves against online surveillance.

 

Cybersecurity is as relevant as ever, and the scientific community has taken the challenge to heart. At MatchPoints 2024, leading cybersecurity researchers from across the world will come together for a three-day conference at Aarhus University to discuss the latest developments in the field.

The conference will be held on April 18-20, 2024. Check it out here: https://matchpoints.au.dk/matchpoints2024

Meet the researchers behind the project

Aslan Askarov

Associate Professor, Department of Computer Science

Boel Nelson

Postdoc, Department of Computer Science