Get Your Own Data, Not Your Own Facts
Privacy: A Quick Reality Check
It often feels like there are many irreconcilable viewpoints about privacy, which in turn creates doubt that we can chart a path towards a healthier digital ecosystem. But not all opinions are equally valid, and concerning ourselves only with those based in fact would lead to a much healthier debate.
There has been ample discussion in some tech circles as to just how much of a privacy war is really being waged. My personal sense is that it’s not so much a war as a reality check. It has become painfully obvious that the same old simple solutions don’t work, and some people are up in arms that reality is being inconvenient to them.
Let's take consent, to start somewhere. Consent is just one tool in the privacy toolbox. Consent is a bit like an avocado slicer: when you need to slice an avocado, it's pretty damn good. When you need to bang in a nail, it's pretty damn useless. In the overwhelming majority of everyday privacy contexts, we don't use consent because that would be absurd: is it okay that I listen while you're talking to me? Is it okay that I see you when you enter the room? Is it okay that, as your doctor, I analyse the symptoms you just described to me?
The answer to problems caused by consent isn’t more consent, automated consent, or consent that covers more processing. It’s to use consent where and when it is the right tool to increase people’s autonomy, and to stop pretending that you can bang nails in with it in all the other cases. And no matter how appealing it might feel on a priori grounds, if an approach can’t be enforced at scale and is susceptible to hard-to-litigate gaming, then it’s just a bad approach to data protection. There’s a reason why data brokers love consent so much and won’t ever stop talking about “transparency and choice”: it works in favour of whichever party has the time to invest in creating a consent funnel, and that’s never individuals.
All consent really does is offload privacy labour onto the user; only under very specific conditions does it increase the user’s autonomy. This isn’t new information. A 2015 review article in Science brings close to a hundred references to bear on the fact that digital self-determination in the face of privacy issues is highly manipulable, and that’s before dark patterns even enter the picture. There are entire books on the failure of notice regimes alone. The pathologies of digital consent are known. Lindsey Barrett summarised the situation well when she described notice and choice as “a method of privacy regulation which promises transparency and agency but delivers neither.”
Or we could look at the perennial question of pseudonymous data. To people who have no experience in data protection, it sounds pretty reasonable: it’s just a number, what do you really learn about someone? Yet only just yesterday, a priest was outed using pseudonymous data. There’s an entire industry offering deanonymisation services. Those companies can get quite sophisticated, but the basics of breaking identifiers are straightforward. Reporters (smart folks, yes, but not professional data scientists) picked up the required skills for one stunning report, and then colleagues of theirs did it again for another. In fact, Arvind Narayanan recently said that it’s so easy to do that you can’t even get a research paper published about the practice. Even with rotating keys, the protection only holds if you can defend against timing attacks and maintain sufficient k-anonymity. You’d think we would have learned, given that Netflix very publicly failed at this over ten years ago, as did AOL in 2006.
People who should know better are routinely wrong, by which I mean mathematically wrong, about the safety of identifiers; how can we expect laypeople to meaningfully consent to this?
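To make the linkage problem concrete, here is a minimal sketch in Python (the records, column names, and values are all invented for illustration) of the classic quasi-identifier attack: a “pseudonymised” dataset keeps ZIP code, birth date, and sex, and a plain join against a public auxiliary dataset (voter rolls, data broker files) puts the names right back. It also computes the release’s k-anonymity, which here is 1: every record is unique on its quasi-identifiers.

```python
from collections import Counter

# "Pseudonymised" records: names are gone, replaced by random IDs,
# but the quasi-identifiers (zip, dob, sex) remain intact.
pseudonymised = [
    {"id": "a91f", "zip": "10027", "dob": "1984-03-12", "sex": "F", "diagnosis": "asthma"},
    {"id": "7c2e", "zip": "10027", "dob": "1979-11-02", "sex": "M", "diagnosis": "diabetes"},
    {"id": "d414", "zip": "11201", "dob": "1990-06-30", "sex": "F", "diagnosis": "flu"},
]

# A public auxiliary dataset that links the same quasi-identifiers to names.
auxiliary = [
    {"name": "Alice Doe", "zip": "10027", "dob": "1984-03-12", "sex": "F"},
    {"name": "Bob Roe",   "zip": "10027", "dob": "1979-11-02", "sex": "M"},
    {"name": "Carol Poe", "zip": "11201", "dob": "1990-06-30", "sex": "F"},
]

QUASI = ("zip", "dob", "sex")

def key(record):
    return tuple(record[q] for q in QUASI)

# The "attack" is nothing more than a join on the quasi-identifiers.
lookup = {key(r): r["name"] for r in auxiliary}
for r in pseudonymised:
    name = lookup.get(key(r))
    if name:
        print(f"{r['id']} -> {name}: {r['diagnosis']}")

# k-anonymity of the release: the size of the smallest group of records
# sharing the same quasi-identifier values. k=1 means every record is
# unique, i.e. trivially re-identifiable given matching auxiliary data.
groups = Counter(key(r) for r in pseudonymised)
print("k =", min(groups.values()))
```

With real demographics the numbers are stark: Latanya Sweeney famously estimated that ZIP code, birth date, and sex alone uniquely identify the large majority of the US population.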
Another evergreen proposal is that we can somehow use a web of contracts to ensure self-regulation in the data market. We have that — we've had it since 2000. The deal struck in the negotiations that the FTC coordinated in 1999-2000 was essentially that the tracking industry would be allowed to keep operating in exchange for setting up a self-regulatory regime. I don't think it's unreasonable to consider that 20 years is more than enough of a chance for this approach to prove itself. We gave it more than a fair shot; it's time to call the deal off.
Yet another recurring contention is that, after two decades in which unfettered personal data broadcasting led to some of the most significant concentration of market power in the history of humankind, what we really need is more unfettered personal data broadcasting, because that will clearly lead to greater competition. There’s an emoji for what my face looks like when I try to process that logic, but the Unicode Consortium is afraid to standardise it. I can see that there are non-market factors at play in the current situation, but still: given the empirical evidence, the burden of proof sits squarely with those who make that claim.
From what I've read, it's not obvious that it is much more than wishful thinking. Data is inherently relational, meaning it forms a network. The value of data depends in large part on its volume and its variety. This can lead to network effects in which, when data is broadly available, having even just a little bit more volume or variety than others can lead to winner-take-all outcomes from network effects. The OECD's analysis indicated that data enables multi-sided markets which can combine with increased returns to scale and scope, leading to dominance, winner-take-all, and competition for the market rather than in the market. In a separate study, they point to non-linear returns and network effects, with obvious competition implications.
And speaking of dysfunctional markets, there’s good evidence that the claim that people will enter into a “value exchange” involving their data is a fantasy that only data marketers believe in. People don’t see a value exchange, they just hate you in silence. I don’t blame anyone for wishing such an exchange existed, but I believe that reality-based marketing works better.
Which brings us to another trope: that we somehow don’t have a definition of privacy. I mean, it’s a somewhat contested space, but not that contested. We have a pretty broadly accepted definition that works very well in the trenches (it’s what I’ve used at The Times for the past four years), has its own conference, is massively cited, and is referenced in social and data science textbooks. I covered it for a general audience. I also introduced it for use in a standards context, which hopefully TAG/PING can find some consensus around soon.
I could keep going, but this might be enough of a literature review for today. The way I see it, we have a pretty straightforward choice. We can keep loudly blustering that doing more of exactly what we’ve done for the past two decades will somehow magically lead to different outcomes, or we can bite the bullet, whether we like it or not, and find solutions that actually work.
Detractors often depict privacy work as being “ideological.” If believing that people shouldn’t live in fear of their tech betraying them is ideological, I’ll take it. But if you prefer, there’s a purely profit-driven consideration: people don’t want to be recognised across contexts, and trying to force that to happen is an arms race against users that you’ll eventually lose.
I know that these are not easy changes. We have yet to scale a business model that does not rely on advertising (subscriptions are highly reliant on it), and much of advertising has, for a while, been privacy-hostile. Changing that is a big reinvention. But we need to reform data and advertising, despite the complexity and the risk, because it's the only discernible path forward that has any sustainability to it.
So with this in mind, I’d like to suggest that we stop wasting time revisiting the failed strategies of a broken system, and instead invest in making it work. There’s no path forward in listening to privacy denialists, and there are precious few facts to back them up, so let’s stop pretending bullshit should have a seat at the table.