Robin Berjon

Putting the agent in user agent

You're Gonna Need A Bigger Browser

A Man Slaying a Monster Carp with a Sword

If we're agreed that the Web is for user agency, then in order to figure out where the Web goes next we should probably spend some time looking at user agents a.k.a. browsers.

Browsers are hugely load-bearing in the Web's architecture, and yet they haven't changed very much in quite a while. You could pick up a copy of Mozilla Navigator from twenty years ago, or even IE3 from the 90s and you wouldn't be lost. Don't get me wrong, it's not as if browsers have been abandoned: they've been seeing remarkable incremental changes and, overall, browser engines are beautifully complex feats of engineering, called upon trillions of times a day. But our idea of what a browser is and does for us hasn't moved much in a long time and if we intend to work towards greater user agency it seems plausible that we should at the very least be open to considering a major overhaul of what a Web user agent looks like. We should, if nothing else, be asking some questions. Maybe, just maybe, our ancestors were wrong about some things.

Decisions that were arbitrary when made can become so familiar that we take that design for granted and forget that things could work differently. There is some exploration of the design space carried out by alternative browsers in the wild (and some of the small browsers even break into several million monthly users) but overall I think it's fair to say that browser design has hit a local optimum and could have some of its assumptions challenged.

As with the whole rest of this series, my goal isn't to reinvent everything from the ground up. In many cases, we can obtain a different system by using the same pieces in a different configuration or by adding a novel primitive. My focus is on inviting you to think about what could use shaking up.

In this post, I'm focusing on questioning some aspects that strike me as worth exploring for change; where needed there will be a follow-up post going into greater detail.

Think Like A User

First: are we right to think of the browser as a standalone product? I think that that's primarily a figment of Web engineers' imaginations. Unquestionably, it is the case that from a software architecture viewpoint, the browser is an orthogonal component. However, from a product perspective, looking at how people understand it, the browser does not exist as a thing that is separate from a search engine (and, with in-app browsing in mobile, it is decreasingly understood as separate from a social network). The architectural view in which search, social, and browsing are distinct is a distraction that does not map to the experienced reality of most people and that is in fact conceptually arbitrary with respect to the tasks that people actually seek to accomplish on the Web. In a sense, we're missing the Web for the tabs.

Rather than trying to enforce that separation because we believe in theoretical purity, we should lean into our users' experience and consider browsers moving up the application stack as agents, eating up search and social as protocols. From a product standpoint, the browser should cover all discovery mechanisms (and keep in mind that extensions can enable innovative UI if discovery methods are exposed as a data layer). I have already covered how making search an API is an important component in lifting search out of its current distressing situation, and I will do the same for social.

Technical people often react strongly against bundling concerns that can be kept separate. But the product view beats the architectural view every time. The question isn't "should we or should we not bundle these concerns?" but rather "given that these concerns are bundled up in actual real-world use and perception anyway, what underlying design can we come up with so that the resulting architecture makes sense?"

Keeping search and browsing separate means that one gets to commoditise the other — and right now we're seeing search commoditise browsers because that's where the money is coming from. Chrome increasingly looks like the in-app browser for the Google universe and it's easy to imagine a future in which it is just replaced by the Google app and the people who keep using an actual browser will just be a small bunch of harmless weirdoes that can be safely ignored.

We should commoditise search and social instead, and make them serve user agency in ways that they currently refuse to.

Think Bigger

Second question: should the user agent just be a user client? Ok, granted, when you put it that way it's not much of a question. Back when the Web was just a document retrieval system, a client-centric approach made a lot of sense. But in a world in which the Web is some kind of everything-computer, we need a serious power-up. Almost all of the intelligence in a browser engine is dedicated to executing server-provided instructions. Apart from a number of security protections, the agent side of the Web is, let's be blunt, profoundly dumb from the user's perspective. Browsers as pure clients is an architecture that creates an asymmetry of automation that makes authors more powerful than users; it needs to be reversed.

Some cracks are appearing in this design. Multiple browsers are shipping with VPN support (and therefore a server component). Brave ships with a full-fledged IPFS node. Opera had tried including a server in the browser. (Opera tried everything before it was cool.) Chrome is moving some advertising functions from servers into the client.

I will dive more deeply into what a Personal Data Server (PDS) architecture could do in a future post, but one way to think about a transition to an agent that's more than a pure protocol client is to imagine the user agent as a perimeter formed by software on a coordinated (by standards) set of user devices. There is no reason that your browsers on different devices should be different agents (even if they have different vendors), or that that perimeter wouldn't also include your own piece of the cloud that works for you.

There is any number of typical features that would work better if they were in people's hands: recommendations, search & social filters, content and people blocking, identity, comments, shopping cart management, subscription & membership management. All of these would strongly benefit from an agent eating up the stack with a form of server component (local-first with sync). These features are well supported by Gordon Brander's list of aspects that people need to own at the protocol level for user agency to be even possible as well as by Bernhard Seefeld's proposed three inversions in computing. The convergence shouldn't be surprising: the moment you start asking what your computer should be doing for you, you start travelling a path that leads to similar conclusions. The world we currently have, in which you have to entrust your life to a company that, for the most part, you know isn't trustworthy, and in which there is very little your browser can do about it, doesn't make sense; it's just what we're used to.

Settle Your Tabs

Tabs do have something going for them: they're not windows. Back in the day, we had to carry our horse to work and back in 3 metres of snow uphill both ways, when your computer blue-screened you died in real life, and every single web page you wanted to keep open would have to be in its own independent window. It was a rough life but we had each other, we knew what was what and, well, I digress but tabs sure were an improvement when they came. But are tabs really that good?

There's room to improve the browser UI more, and perhaps more importantly there's room to become unstuck. People are tinkering at the margins with vertical tabs, a bunch of extensions that make them more usable, Chrome's grouping thing, or even that splendid zooming tab management UI that Firefox teased for months and, true to form, never shipped. But the fact is: tabs don't work. Not like this, not this hard.

I don't have numbers to back this up, but from observing people there are basically two kinds of tab users:

Not that you had any doubts but since you're asking, this is my current Firefox arrangement of 26 windows containing 884 tabs. And you haven't seen my mobile browser.

26 different browser windows shown on a desktop

Tabs are bad for apps. No one wants to run an app in a tab. No one at all. Putting applicative content inside of a document interface is such a dumb idea they wouldn't even have it on MTV Pimp My Ride. It's terrible for composability: even if you worked out the issues involved in making Web pages composable (tackled in a coming post), the tab UI would still stand in the way.

For all users, tabs are the wrong answer to something people want to do: organise their information, even if it's just a small current stack of interactions. When you think about how much the browser observes about what you do, it's scary how little it puts that to use. (Except Chrome that puts it to use sharing your entire browsing history with Google, but that's hardly helpful to anyone but them.)

Think About The Business Model

You can find some variations here and there, but by and large the business model for browsers is selling the search engine default. There's very real money in browsers and yet they're free, which creates a well-known misalignment of interests, and indeed I have already covered that this leads to significant ethical problems on top of destroying innovation in search. Whenever I see a "Privacy. That's iPhone." ad I can't help mentally adding "except when Google pays us enough for the data."

To get a sense for how much money is available, as a lower bound, in the browser space, let's consider this back of the envelope calculation:

This isn't made-up money, it's money that comes from the Web economy. Just looking at the two top browsers we've found a 90+% profit margin being extracted out of the Web and put elsewhere and something like 28 + 17 = $45bn that could go to funding Web projects but that instead pays for unrelated Google/Apple products and for profits to be handed out to their shareholders.

Do you know what we could do for the Web, as a public good, with an extra $45bn a year? I'd say hand it over to me but the truth is that there's no undemocratic entity smart enough to deploy that kind of funding intelligently. It's just not possible.

If we assume that Google loses its artificial search monopoly, search royalties would possibly drop in price significantly, but even one tenth of that would make a huge difference. It wouldn't be hard to put this money to more productive use than making a small number of geeks and their shareholders too rich. If the web is supposed to put people above profits, our browsers are doing a strikingly bad job of it.
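For a sense of scale, the arithmetic behind "one tenth of that" can be written out. This sketch only reuses the two figures quoted in the text (28 and 17, in billions of dollars a year); the variable names are mine, and the one-tenth scenario is the hypothetical from the paragraph above, not a forecast.

```python
# Figures quoted in the text, in billions of US dollars per year.
top_two_browsers = [28, 17]

# Combined amount attributed to the two top browsers.
total = sum(top_two_browsers)

# The hypothetical case where search royalties drop to one tenth.
one_tenth = total / 10

print(f"total: ${total}bn, one tenth: ${one_tenth}bn")
```

Even in the reduced scenario that is $4.5bn a year.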

And that's only looking at the money that browsers can make from a tie-in to search. If we consider other sources of revenue (such as operating a PDS, ads) then there are funding sources in more than ample quantity. We should move away from the assumption that browsers are free and therefore cannot make money — it's a lie. There's money to fund a much better Web than we have and much more powerful user agents that support user agency. That money is currently locked up because iOS and Android have the defaults set and literally nothing else matters. But there's room for a leapfrog strategy focused on offering an experience that the incumbents cannot match without undermining their revenue stream.

To conclude, the core problem isn't that browsers are bad — it's more that browsers are not browser enough. There's room for them to do more, there's ample room for product differentiation if we move beyond the ancient and uninspired tabs-and-a-bit-of-chrome model, and there's money to be seized. I see all the markers that change is possible.

Credits

Cover picture A Man Slaying a Monster Carp with a Sword, by Totoya Hokkei, sourced from Ukiyo-e Search.


This post is part of a series on reimagining parts of the Web. You can read the other entries in the series at:

  1. Building the Next Web
  2. The Web Is For User Agency
  3. You're Gonna Need A Bigger Browser
  4. Web Tiles
  5. ActivityPub Over ATProto

Acknowledgements

Many thanks to the following excellent people (in alphabetical order) for their invaluable feedback: Amy Guy, Benjamin Goering, Ben Harnett, Blaine Cook, Boris Mann, Brian Kardell, Brooklyn Zelenka, Dave Justice, Dietrich Ayala, Dominique Hazaël-Massieux, Fabrice Desré, Ian Preston, Juan Caballero, Kjetil Kjernsmo, Marcin Rataj, Margaux Vitre, Maria Farrell, and Tess O'Connor. Needless to say, anything dumb and stupid in this article is entirely mine.