Reinventing the Web II

June 16, 2014 · Posted by Greg Lloyd

Updated 19 Jun 2016

Why isn't the Web a reliable and useful long-term store for the links and content people independently create? What can we do to fix that? Who benefits from creating spaces with stable, permanently addressable content? Who pays? What incentives can make Web-scale permanent, stable content with reliable bidirectional links and other goodies as common and useful as Web search over the entire flakey, decentralized and wildly successful Web? Here's a good Twitter conversation to read:

How the Web was Won

I believe Tim Berners-Lee's original HTTP and HTML protocols succeeded beyond his vision of a globally scalable, loosely coupled network of Web pages that anyone could edit. The fact that those protocols were simple, decentralized, and free for anyone to use was essential to their success in a world of competing proprietary Internet publishing and commerce "standards" from Microsoft and others. But in my opinion, the Web won by turning permanence and stability into a decentralized economic decision.

Berners-Lee's original W3C protocols appeared at the right time to open clear-field opportunities for distributed publishing, marketing, sales, and advertising that fueled the Web's growth and evolution. Recapping the argument from my first Reinventing the Web post:

The idea that any sensible person would rely on a global hypertext system - where links on one computer pointed at locations on another computer and would break whenever the remote computer was unilaterally moved, renamed, taken offline, or abandoned - seemed absurd.

The idea that you would have no way to know what incoming links would break when editing or refactoring content seemed just as bad.

The World Wide Web protocols looked like they would work for relatively small cooperative groups like CERN, which could keep things from breaking through shared goals, peer pressure, and out-of-band communication to keep distributed content alive.

Actually that intuition was pretty good, because the World Wide Web took off in a direction based on other incentives compatible with those assumptions - and grew like crazy because, unlike the alternatives, it was simple, massively scalable, cheap, and eliminated the need for centralized control.

1) The Web became a distributed publishing medium, not the fabric for distributed editing and collaboration that Tim Berners-Lee and others envisioned. People and Web publishing engines like Amazon created content and kept it online while it had economic value, historical value (funded by organizations), or personal value. Content hosting became cheap enough for individuals or tiny groups. Advertising-supported content became "free".

2) Search engines spanned the simple Web. Keeping content addressable gained value: incoming links not only let people bookmark, and search engines index, what you had to publish (or sell), the links themselves gained economic value through PageRank. This provided even greater motivation to edit without breaking links, and to keep content online while it retained some economic, organizational, or personal value.

3) People and organizations learned how to converse and collaborate over the Web by making it easy to create addressable content others could link to. The simple blog model let people just add content and have it automatically organized by time. The Wiki model required more thought and work to name, organize, and garden content, but also created stable, addressable islands of pages based on principles that reward cooperative behavior.

4) Search engines, syndication and notification engines built over the Web's simple, scalable protocols connected the Web in ways that I don't think anyone really anticipated - and they work as independent, competing distributed systems, making rapid innovation possible.

Tim Berners-Lee made an inspired set of tradeoffs. Almost everything of value on the Web - search engines, browsers, notification services - is built over his simple, open, highly scalable architecture.

I believe it's possible to provide what TBL calls "reasonable boundaries" for sharing sensitive personal or organizational data without breaking the basic W3C addressable-content protocols that make linking and Web-scale search valuable. That should be the goal for social and business software, not siloed gardens with Web-proof walls.

Building a better Web over the Web we have

Telephone companies used to call their simplest and cheapest legacy service POTS (Plain Old Telephone Service). I believe it's possible to build a richer and more stable Web over POWS (Plain Old Web Services) without necessarily starting from scratch.

One answer to "who benefits?" and "who pays?" is the businesses that benefit from a richer and more stable Web connecting the systems they use to get work done. Stable fine-grain links and bidirectional relationships connecting systems of record and systems of engagement open the door to business systems that are more flexible, effective, simple to develop, and pleasant to use - more like the public Web than traditional line-of-business systems.

Museums, libraries, and archives - Brewster Kahle's Internet Archive, the Library of Congress, and others - have a mission to collect and curate our cultural heritage and knowledge. The Internet Archive shows how little it costs to collect and index an archive of the content of the visible Web.

Commercial publishers monetize their archives, but have weaker economic incentives to maintain stable links to content outside their own domains.

Commerce sites and providers of consumer-focused Web services may have the greatest economic incentive for deep linking with stable references and relationships spanning the devices you own, your home, your health and healthcare providers, your car, your family - and your work; see Continuity and Intertwingled Work.

If I'm right, there are economic incentives for Web content creators to make their work more linkable, visible, and usable through straightforward, decentralized, non-proprietary, upwards-compatible extensions of Plain Old Web Services.

I believe that indices spanning permalinked locations, as well as incoming and outgoing permalink references to content in "stable islands in the storm-tossed sea", can be created and maintained in near real time at Web scale, preserving the integrity of links to archival content distributed across the Web.

For example, any domain could publish an index of its permalinked content. Other domains implementing the same protocol could make incoming references to that content by permalink. This is a simple decentralized protocol, no more magical than the published external references that a link editor or dynamic linking system uses to resolve references connecting independently compiled modules of code.

Domains that agree to implement the same protocol, and use permalink (URI) references for content in other compatible domains, then have a more stable, decentralized model for permanent links. If domains also publish their own outgoing permalink references (external as well as internal), a Web-level service could build and maintain reliable inverted indices of bidirectional internal and domain-spanning links, as sketched below. The federation of such domains could be spidered by any number of independently developed services, creating a more stable and useful Web as a decentralized service without breaking the simple Web protocols that every browser and other Web service relies on.

I don't know who has suggested this before; it seems obvious, and it's a straw man, not a solution. I'm using it to argue that we can and should invent ways to improve the capabilities of the Web using the same simple, decentralized philosophy that made the Web wildly successful versus "better" hypertext systems.

See Michael Peter Edson's Dark Matter essay and my Thought Vectors - Vannevar Bush and Dark Matter response.

Related

Update 19 Jun 2016 See the Internet Archive Decentralized Web Summit, 8-9 June 2016: Locking the Web Open. See videos of the Summit and Brewster Kahle's notes: "Building a web that is decentralized — where many websites are delivered through a peer-to-peer network — would lead to the web being hosted from many places, leading to more reliable access, availability of past versions, access from more places around the world, and higher performance. It can also lead to more reader-privacy because it is harder to watch or control what one reads. Integrating a payments system into a decentralized web can help people make money by publishing on the web without the need for 3rd parties. This meeting focused on the values, technical, policy, and deployment issues of reinventing basic infrastructure like the web."

Reinventing the Web (2009) Ted Nelson, Tim Berners-Lee and the evolution of the Web. Ted Nelson wants two-way links, stable transclusion, micropayments. Tim Berners-Lee wants a new Web with open, linked data. I believe that most of what they want can be delivered using the current flakey, decentralized and wildly successful Web as the delivery medium for richer, more stable, more permanent internal models, as stable federations of islands in a storm-tossed sea.

The Internet's Original Sin by Ethan Zuckerman, The Atlantic, Aug 14, 2014. Ethan confesses his role - invention of the pop-up ad - stating "It’s obvious now that what we did was a fiasco, so let me remind you that what we wanted to do was something brave and noble." He makes a convincing case that the apple in the Web's garden is Investor storytime: "... when someone pays you to tell them how rich they’ll get when you finally put ads on your site." A darkly comic but heartfelt essay on the past and future economy of the Web: "It's not too late to ditch the ad-based business model and build a better web."

Intertwingled Work (2010) No one Web service or collection of Web servers contains everything people need, but we get along using search and creative services that link content across wildly different sources. The same principle applies when you want to link and work across wildly diverse siloed systems of record and transactional databases.

Dark Matter: The dark matter of the Internet is open, social, peer-to-peer and read/write—and it’s the future of museums by Michael Peter Edson on May 19, 2014.

Continuity and Intertwingled Work (2014) A level above an Internet of Things: seamless experience across devices for you, your family, your health and trusted service providers, at home and at work.

Reinventing the Web III (2014) followup Twitter conversation with @zeynep, @jeffsonstein, @kevinmarks, and @roundtrip.

The Web of Alexandria (2015) by Bret Victor "We, as a species, are currently putting together a universal repository of knowledge and ideas, unprecedented in scope and scale. Which information-handling technology should we model it on? The one that's worked for 4 billion years and is responsible for our existence? Or the one that's led to the greatest intellectual tragedies in history?"

And Victor's followup post "Whenever the ephemerality of the web is mentioned, two opposing responses tend to surface. Some people see the web as a conversational medium, and consider ephemerality to be a virtue. And some people see the web as a publication medium, and want to build a "permanent web" where nothing can ever disappear. Neither position is mine. If anything, I see the web as a bad medium, at least partly because it invites exactly that conflict, with disastrous effects on both sides."

Update 13 Jul 2014 Added new section headings, added the inline recap and economic benefit examples, added a link to a Jul 2014 Reinventing the Web III Twitter conversation on the same topic.

Update 23 Aug 2014 Added link and brief note on Ethan Zuckerman's fine essay on advertising as the Internet's Original Sin.

Update 29 May 2015 Added links to Web of Alexandria and followup by Bret Victor on why the Web is a bad medium.

Update 19 Jun 2016 Added link to Brewster Kahle's summary of the Internet Archive's Decentralized Web Summit of 8-9 June 2016.
