Thoughts on technology and social web

October 6, 2009

Service federation in the cloud – PushButton or Google Wave?

Filed under: Real time web, Social Networking. By Ravikant Cherukuri @ 6:54 pm

PushButton technologies, namely PubSubHubbub (PuSH), Webhooks and ATOM, provide a lighter-weight and more web-friendly alternative to the heavier-weight protocols that Google Wave uses, mainly for server-to-server federation. At a high level this seems true, as PuSH communicates with HTTP POSTs rather than the persistent connections of XMPP. It seems more web friendly, and it is easier to implement subscribers and publishers in a language of your choice.

Disclaimer: On one level, PubSubHubbub and Wave are solving different problems. But at a higher level both implement real-time federation with pub-sub. My attempt here is to evaluate Wave as a vehicle for service federation. I actually think that Wave is a great idea, but it's too heavy for the web as we have it today. The web is evolutionary; Wave is too drastic and revolutionary.

What makes wave compelling?

There are several interesting aspects. The first is the shell (the UX). The presentation is like e-mail but more real time. It takes a paradigm that everybody understands and then extends it. The second is the extensibility. With gadgets and robots you can extend the Wave system. With embedding you can extend the way waves are visualized and consumed. The third is federation. Federation makes it possible for different providers, each owning their own data, to communicate so that their user bases can work with each other. This kind of federation is not new. FriendFeed interoperates with many sites like Flickr using SUP. Many IM networks interoperate too. But Wave federation tries to standardize this and provide XMPP-like universal federation, where you don't have to know or work with each provider individually to federate with them.

What makes Wave complex?

Wave tries to solve a bunch of problems to make real-time persistent communication possible. The protocol that Wave uses to federate dictates that all federated partners maintain complete copies of objects, even though each object is owned by only one of them. Operational transformation (OT) dictates that all the operations on the document are stored along with the latest copy of the document. When a user from a federated provider joins a wave, the provider gets a copy of the wave in terms of its operations. All these features provide an e-mail-like decentralization of data. Wave builds on top of this with real-time collaborative document editing. All of this makes the protocol incredibly complex.

The Wave protocol imposes heavy requirements on what a remote partner has to implement. The protocol data model (which has to implement OT) requires a third party to represent state in terms of the operations on it. That is, you have to keep track of not only the current state of the wave but also the history of operations that made it so. This is not a pattern that services normally use. For large-scale service interop, each partner has to maintain a full copy of the wave. This doesn't sound very feasible at web scale. Imagine an enterprise Wave user joining a large public wave. Now suddenly my enterprise Wave server needs to handle this barrage of updates.
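To make the "state in terms of operations" requirement concrete, here is a minimal sketch of what a partner would have to store. The class name `WaveCopy` and the two operation kinds are hypothetical, chosen only for illustration. This is not the Wave data model, and it does not transform concurrent operations the way real OT does; it only shows the storage pattern of keeping the history next to the current state.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union

# Hypothetical operation shape, for illustration only:
#   ("insert", position, text)  or  ("delete", position, length)
Operation = Tuple[str, int, Union[str, int]]


def apply_op(text: str, op: Operation) -> str:
    """Apply a single operation to a plain-text document."""
    kind, pos, arg = op
    if kind == "insert":
        return text[:pos] + str(arg) + text[pos:]
    if kind == "delete":
        return text[:pos] + text[pos + int(arg):]
    raise ValueError(f"unknown operation kind: {kind}")


@dataclass
class WaveCopy:
    """One partner's copy: the current state plus every operation that led to it."""
    text: str = ""
    history: List[Operation] = field(default_factory=list)

    def apply(self, op: Operation) -> None:
        self.text = apply_op(self.text, op)
        self.history.append(op)  # the log grows with every edit, on every partner


def replay(history: List[Operation]) -> str:
    """Rebuild the current state from scratch, as a newly joined provider would."""
    text = ""
    for op in history:
        text = apply_op(text, op)
    return text
```

Even in this toy form, every partner's storage grows with the number of edits rather than with the size of the document, which is the cost the paragraph above is worried about.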

Could it be simpler?

Activity Streams is an effort by several people involved in social networking to standardize protocols so that different networks can interoperate. The great thing about Activity Streams is that it defines the protocol for expressing activities without defining how the providers should implement their services. This is how the web is today. We can't (and should not try to) anticipate how others will use a protocol or service. Just define the interface and leave the rest to everyone else.

PuSH with ATOM is a more web-friendly pub-sub protocol than XEP-0060, which Wave uses. Simple HTTP POSTs to make and break subscriptions and to receive notifications make the interface very natural. Can a collaborative data sync algorithm like OT be implemented on top of PuSH, and is it even necessary to achieve federation? Since the data going over PuSH is ATOM, if services could simply exchange ATOM items to synchronize, the world would be much simpler.
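As a rough illustration of how small the subscribe/unsubscribe interface is, the sketch below builds the form-encoded body a subscriber would POST to a hub. The parameter names `hub.mode`, `hub.topic` and `hub.callback` come from the PubSubHubbub spec; the function names and the URLs in the usage note are made up, nothing is sent over the network, and a real hub may require further parameters (for example verification options) that are left out here.

```python
from urllib.parse import parse_qs, urlencode


def build_subscription_body(topic_url: str, callback_url: str,
                            mode: str = "subscribe") -> str:
    """Build the form-encoded body for a PuSH (un)subscription request.

    The result would be sent as an HTTP POST with the content type
    application/x-www-form-urlencoded to the hub's URL.
    """
    if mode not in ("subscribe", "unsubscribe"):
        raise ValueError("mode must be 'subscribe' or 'unsubscribe'")
    return urlencode({
        "hub.mode": mode,              # make or break the subscription
        "hub.topic": topic_url,        # the feed the subscriber cares about
        "hub.callback": callback_url,  # where the hub should POST notifications
    })


def parse_subscription_body(body: str) -> dict:
    """Hub-side counterpart: decode the form body back into a flat dict."""
    return {key: values[0] for key, values in parse_qs(body).items()}
```

For example, `build_subscription_body("http://example.com/feed.atom", "http://subscriber.example.org/callback")` yields a single line of form data, and breaking the subscription is the same call with `mode="unsubscribe"`.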

Without OT, if the federating providers want to maintain copies of data that is actually owned by another provider, there is still the complexity of synchronization. In most cases, this level of sync is not required. It is only needed when an item of small granularity is being edited simultaneously by multiple people. If the systems can identify conflicts and either overwrite items or prompt users to correct them, that could be acceptable for most cases. The FeedSync protocol that Live Mesh uses works this way. Your content is a feed of items; items are synchronized using a simple algorithm and conflicts are flagged.
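The "simple algorithm, conflicts flagged" idea can be sketched in a few lines. The code below is loosely modeled on the approach described above, not on the FeedSync specification itself: the `Item` class, the `(endpoint, sequence)` history entries and the tie-breaking rule are all simplifying assumptions, and every item is assumed to carry at least one history entry.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed shape of one history entry: (endpoint that made the edit, its sequence number)
HistoryEntry = Tuple[str, int]


@dataclass
class Item:
    id: str
    content: str
    history: List[HistoryEntry]  # newest edit first, never empty

    @property
    def updates(self) -> int:
        """Number of edits this copy of the item has seen."""
        return len(self.history)


def merge(local: Item, incoming: Item) -> Tuple[Item, bool]:
    """Return (winner, conflict) for two copies of the same item.

    The copy with more updates wins; ties are broken by comparing the latest
    history entry so that every endpoint picks the same winner. A conflict is
    flagged when the winner never saw the loser's latest edit, which means the
    two copies were edited concurrently.
    """
    if local.history == incoming.history:
        return local, False  # identical copies, nothing to do

    def rank(item: Item):
        return (item.updates, item.history[0])

    if rank(local) >= rank(incoming):
        winner, loser = local, incoming
    else:
        winner, loser = incoming, local

    conflict = loser.history[0] not in winner.history
    return winner, conflict
```

A plain sequential edit merges silently because the newer copy's history contains the older copy's latest entry. Two providers editing the same base item independently produce a winner plus a conflict flag, which a service could use to overwrite or to prompt the user.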

Combining Activity Streams with PuSH and FeedSync can provide a much simpler, more web-like infrastructure for a universal federation model that is easy to build on and participate in. Federation is nothing but mashups with a business model. While the mashup space delights us with innovations every day, service federation moves at a snail's pace. A simpler model for federating at web scale (in size and diversity) would lead to seamless aggregation of your data from any service (that your friends might be using) into the services that you use.
