10.06.09

Service federation in the cloud – PushButton or Google Wave?

Posted in Real time web, Social Networking at 6:54 pm by Ravikant Cherukuri

PushButton technologies (PubSubHubbub (PuSH), Webhooks and ATOM) provide a lighter weight and web friendly alternative to the more heavy weight protocols that Google wave uses, mainly for server to server federation. On a high level, this seems true, as PuSH communicates over HTTP POSTs rather than the persistent connections of XMPP. It seems more web friendly, and it is easier to implement subscribers and publishers in a language of your choice.

Disclaimer: On one level, pubsubhubbub and wave are solving different problems. But at a higher level both implement real time federation with pub-sub. My attempt here is to evaluate wave as a vehicle for service federation. I actually think that wave is a great idea, but it’s too heavy for the web as we have it today. The web is evolutionary; wave is too drastic and revolutionary.

What makes wave compelling?

There are several interesting aspects. The first is the shell (the UX). The presentation is like e-mail but is more real time. It takes a paradigm that everybody understands and then extends it. The second is the extensibility. With gadgets and robots you could extend the wave system. With embedding you could extend the way waves are visualized and consumed. The third is federation. Federation makes it possible for different providers, who own their own data, to communicate, and for their user bases to work with each other. This kind of federation is not new. FriendFeed interoperates with many sites like flickr using SUP. Many IM networks interoperate too. But wave federation tries to standardize this and provide XMPP like universal federation, where you don’t have to know or work with all the providers to federate with them.

What makes Wave complex?

Wave tries to solve a bunch of problems to make real time persistent communication possible. The protocol that wave uses to federate dictates that all federated partners maintain complete copies of objects even though each object is owned by one of them. Operational transformation (OT) dictates that all the operations on the document are stored along with the latest copy of the document. When a user from a federated provider joins a wave, the provider gets a copy of the wave in terms of the operations. All these features provide for an e-mail like decentralization of data. Wave builds on top of this with real time collaborative document editing. All of this makes the protocol incredibly complex.

The wave protocol imposes heavy requirements on what a remote partner has to implement. The protocol data model (which has to implement OT) would require a third party to implement state in terms of the operations on it. That is, you will have to keep track of not only the current state of the wave but also the history of operations that made it so. This is not a pattern that services normally use. For a large scale service inter-op, each partner has to maintain a full copy of the wave. This doesn’t sound very feasible at web scale. Imagine an enterprise wave user joining a large public wave. Now suddenly my enterprise wave server needs to handle this barrage of updates.
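To get a feel for the bookkeeping that OT implies, here is a minimal sketch in Python. It assumes a toy model where the only operation is a text insert; the `Insert` type, the `transform` function and the site-id tie-break are my own illustrative names and choices, not the data model or algorithm that wave actually specifies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int   # index in the document where the text goes
    text: str
    site: int  # hypothetical provider id, used only to break ties


def apply(doc: str, op: Insert) -> str:
    """Apply a single insert operation to a document string."""
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op: Insert, against: Insert) -> Insert:
    """Rewrite `op` so that it can be applied after `against` was applied."""
    if op.pos < against.pos or (op.pos == against.pos and op.site < against.site):
        return op  # `against` landed to the right of op, nothing to shift
    return Insert(op.pos + len(against.text), op.text, op.site)


def converge(doc: str, a: Insert, b: Insert):
    """Two providers apply concurrent ops in opposite orders."""
    via_a = apply(apply(doc, a), transform(b, a))
    via_b = apply(apply(doc, b), transform(a, b))
    return via_a, via_b
```

Both providers end up with the same text only because each one transforms the remote operation against its own local one. Even in this toy form, every partner has to hold on to operations and not just the latest document, which is the burden described above.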

Could it be simpler?

Activity streams is an effort by several people involved in social networking to standardize protocols for different networks to inter-operate. The great thing about activity streams is that it defines the protocol for expressing activities without defining how the providers should implement their services. This is how the web is today. We can’t (and should not try to) anticipate how others would use a protocol/service. Just define the interface and leave the rest to the rest.

PuSH with ATOM is a more web friendly pub-sub protocol than the XEP-0060 that wave uses. Simple HTTP POSTs to make and break subscriptions and to receive notifications make the interface very natural. Can a collaborative data sync algorithm like OT be implemented on top of PuSH, or is it even necessary to achieve federation? As the format of the data going over PuSH is ATOM, if services could simply exchange ATOM items to synchronize, the world would be much simpler.
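To show how small the subscriber side is, here is a sketch of building a PuSH subscription request in Python. The `hub.mode`, `hub.topic` and `hub.callback` form fields come from the PubSubHubbub spec; the function name and all the example.com style URLs are made up for illustration, and a real hub may require further parameters depending on the spec version it implements. The request is only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_subscription_request(hub_url, topic_url, callback_url, mode="subscribe"):
    """Build (but do not send) the form-encoded POST a subscriber makes to a hub."""
    body = urlencode({
        "hub.mode": mode,              # "subscribe" or "unsubscribe"
        "hub.topic": topic_url,        # the feed we want updates for
        "hub.callback": callback_url,  # where the hub should POST new entries
    }).encode("ascii")
    return Request(
        hub_url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
```

Breaking the subscription is the same call with `mode="unsubscribe"`. Passing the returned object to `urllib.request.urlopen` would actually send it.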

Without OT, if the federating providers want to maintain copies of data that is actually owned by another provider, there is still the complexity of synchronization. In most cases, this level of sync is not required. It’s only needed when an item of small granularity is being edited simultaneously by multiple people. If the systems can identify conflicts and either overwrite items or prompt users to correct them, it could be acceptable for most cases. The FeedSync protocol that Live Mesh uses works this way. Your content is a feed of items, and items are synchronized using a simple algorithm and conflicts are flagged.
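The "simple algorithm with conflicts flagged" idea can be sketched in a few lines of Python. This is loosely inspired by the description above and is not the actual FeedSync merge algorithm: the `Item` type, the per-item `version` counter and the keep-the-local-copy rule are all simplifying assumptions of mine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    id: str
    version: int  # assumed to be bumped on every edit of the item
    content: str


def merge(local, remote):
    """Merge two copies of a feed; return (merged items, ids in conflict)."""
    merged = {item.id: item for item in local}
    conflicts = []
    for theirs in remote:
        ours = merged.get(theirs.id)
        if ours is None or theirs.version > ours.version:
            merged[theirs.id] = theirs    # new item, or a strictly newer edit
        elif theirs.version == ours.version and theirs.content != ours.content:
            conflicts.append(theirs.id)   # concurrent edits: keep ours, flag it
    return sorted(merged.values(), key=lambda item: item.id), conflicts
```

The flagged ids are what a service would either overwrite or show to the user for correction. No operation history is kept anywhere, which is the whole point of the comparison with OT.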

Combining Activity Streams with PuSH and FeedSync can provide a much simpler, more web like infrastructure for a universal federation model that is easy to build on and participate in. Federation is nothing but mashups with a business model. While the mashup space delights us with innovations every day, service federation moves at a snail’s pace. A simpler model to federate at web scale (in size and diversity) would lead to seamless aggregation of your data from any service (that your friends might be using) to the services that you use.

09.07.09

PubSubHubbub and rss-cloud : changing the way you read the web

Posted in Cloud Computing, Real time web, Social Networking at 11:39 pm by Ravikant Cherukuri

Today WordPress declared support for RSSCloud. PubSubHubbub was adopted by Blogger/Google Reader a few days back. These technologies are being adopted much faster than I expected. Much like AJAX a few years back, the buzz is building up. Anil Dash’s posts on pushbutton talk of the potential of these technologies. The basic idea is very simple: a RESTful pubsub protocol that provides real time notifications for web content.

  • Subscribers register their interest in publisher content notifications to hubs.
  • Publishers send content pings to the hub when new content is posted.
  • The hub reacts to the ping by fetching content from the publisher and posting the content to the subscribers.
  • All content is in RSS/Atom format.
  • Communication to and from the hub happens over HTTP using webhooks.
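The flow in the list above fits in a few lines. Here is an in-memory sketch in Python, assuming plain function calls stand in for the HTTP requests: the `Hub` class and its method names are hypothetical, and in a real deployment `fetch` would be an HTTP GET of the feed and each callback an HTTP POST to the subscriber's webhook URL.

```python
from collections import defaultdict


class Hub:
    """Toy hub: subscribers register callbacks, publishers ping, the hub fans out."""

    def __init__(self, fetch):
        self._fetch = fetch                  # callable: topic URL -> feed content
        self._callbacks = defaultdict(list)  # topic URL -> subscriber callbacks

    def subscribe(self, topic_url, callback):
        self._callbacks[topic_url].append(callback)

    def ping(self, topic_url):
        """Called by a publisher when new content is posted; returns deliveries made."""
        callbacks = self._callbacks.get(topic_url, [])
        if not callbacks:
            return 0                         # nobody is listening, skip the fetch
        content = self._fetch(topic_url)     # the hub pulls the new content once
        for callback in callbacks:
            callback(topic_url, content)     # and pushes it to every subscriber
        return len(callbacks)
```

The publisher gets hit once per update no matter how many subscribers there are, which is where the load advantage over everyone polling the feed comes from.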

This has the potential to change the web as we know it. It could bring Twitter-like real-time notifications to changes on all of the web’s content. As an end user, this is exciting because this could finally bring (much deserved) death to the browser refresh button by channelling all the content you are interested in into streams that can be delivered to you in real time. Pushbutton, together with DiSO/Activity-Streams, is great for social network federation. Activity stream’s atom format should work well with pushbutton.

Hubs federating and chaining subscriptions with each other provide a distributed and decentralized pub-sub infrastructure. This has some definite advantages over XMPP. XMPP’s XEP-0060 provides a similar generic pub-sub functionality. But still the REST based approach is simpler and more web savvy. Webhooks take the honors here with their REST based callbacks as opposed to XMPP’s connection based approach.

Add the bidirectional communication channels on web pages (using long poll today or web-sockets in HTML 5) to the mix and we have a means to deliver notifications to end users. When I am on facebook and I get a mail in my gmail account, the notification could be delivered to me through facebook.

Update : A few good posts comparing PubSubHubbub and rssCloud -

How to publish and receive blog posts in Real-time
PubSubHubbub vs. rssCloud
PubSubHubbub = rssCloud + ping ?
RSSCloud Vs. PubSubHubbub: Why The Fat Pings Win
There’s a Reason RSSCloud Failed to Catch On
The Reason RSS Cloud Can Work Now
The Web At a New Crossroads

08.13.09

Real time roundup [Part 2] — responsive web applications

Posted in Uncategorized at 2:22 pm by Ravikant Cherukuri

This is often a hard to appreciate aspect of the real time stack. But only till you experience it. Think about the instant delivery of notifications in Facebook or the character by character real time transmission in Google wave. On the face of it, it might look like a minor feature, but for UX responsiveness this is big. Once you use these apps, the rest of the world looks sluggish, kind of like how a dial-up internet connection feels today. This is as much of a leap as the transition from refreshing web pages to see updates to letting the RSS reader do it for you.

This pattern can be applied to many web applications. This is the key that will bridge the gap between web applications and desktop applications: the ability to react to user/data events in real time. With HTML5 web sockets, this might be more standardized, but even today long poll/comet etc are pretty scalable and general purpose. Once the common web programming environments like ASP.NET/JSP/PHP etc embrace this concept fully, there is great potential here.

This could even be in the form of a generic pub-sub provided as a secure cloud service. As a web site developer, I need a pipe to transfer data to and from the browser in real time. But I don’t want to invest in a pub-sub infrastructure which is very different from the HTTP programming model that I am used to and has entirely different scaling characteristics. If this is provided as a service on Microsoft Azure/Amazon ECS/Google AppEngine, it would be much easier to integrate into the current programming model. But what would such a reusable model include? A few things off the top of my head:

  1. The long poll connection to the server with a javascript component that is embedded in the web pages. This will enable the duplex connectivity.
  2. A pub sub infrastructure that will provide some session storage and topic based subscriptions to the sessions. For example, when a user visits a real time web page, a session will be created for it on the pub sub backend. Then the web site (instance) will register for topics that it is interested in. The web server could publish these topics to the pub-sub system and the pub-sub system would push these to the browsers.
  3. Topics can be transient entries in the pub sub system that are created as pages subscribe to them.
  4. An API exposed to the web site backend that would push arbitrary topic updates to the pub sub system.

This kind of an infrastructure component provided as a piece of code to be reused or as a cloud service would make it easy to build real time responsive web sites.
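As a rough illustration of points 2 to 4, here is an in-memory sketch in Python. The `PubSubBroker` class and all of its method names are hypothetical; a real service would sit behind the long poll connection from point 1 and would need authentication, expiry and storage that this toy leaves out.

```python
from collections import defaultdict, deque


class PubSubBroker:
    """Toy broker: per-page sessions, transient topics, a publish API for the backend."""

    def __init__(self):
        self._queues = {}                     # session id -> pending messages
        self._subscribers = defaultdict(set)  # topic -> subscribed session ids

    def open_session(self, session_id):
        self._queues[session_id] = deque()

    def subscribe(self, session_id, topic):
        if session_id not in self._queues:
            raise KeyError(f"unknown session: {session_id}")
        self._subscribers[topic].add(session_id)  # topic springs into existence here

    def publish(self, topic, payload):
        """Called by the web site backend to push an update for a topic."""
        for session_id in self._subscribers.get(topic, ()):
            self._queues[session_id].append((topic, payload))

    def drain(self, session_id):
        """What the long poll for this session would return to the browser."""
        pending = self._queues[session_id]
        messages = list(pending)
        pending.clear()
        return messages

    def close_session(self, session_id):
        self._queues.pop(session_id, None)
        for topic in list(self._subscribers):
            self._subscribers[topic].discard(session_id)
            if not self._subscribers[topic]:
                del self._subscribers[topic]  # transient topic goes away

    def topics(self):
        return sorted(self._subscribers)
```

The web site backend only ever calls `publish`, so its programming model stays plain request/response while the broker owns the connection handling.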

07.17.09

Real time roundup

Posted in Cloud Computing, NextWeb, Real time web, Social Networking at 6:50 pm by Ravikant Cherukuri

Real time web has been a favorite topic for me for a while now. I work on one of the large IM systems and am very interested in these developments. Real time web started emerging as the platform for the next web in the last year or two. The main idea here is that events happening on the web are brought to you as they happen. Think of the coverage that the Iranian election aftermath got with twitter. The more obvious scenarios are already a reality, and there are several more subtle but equally impressive usages that many are working on. As with all technologies that are driven by guy-in-the-garage start ups, it’s tough to see where this is all leading (if at all). Real time web focuses on several aspects of the web.

  • filtered web streams
  • instant delivery of web posts
  • real time collaboration
  • responsive web applications
  • S2S data filtering

The lack of full duplex connectivity has been a down side of HTTP. This gave native applications a one-up. HTML5 is out to correct that with Web Sockets. Meanwhile, technologies like COMET, BOSH etc provide near real time connectivity over HTTP. So, the technology seems to be in place for the real time web. But why does real time web matter? We already have near-real-time, polling based push technologies like RSS/Atom. In my mind the answer is user experience. Real time gives the most natural experience. Compare a walkie-talkie to a telephone. A walkie-talkie is near real time with a lot of user level sync. It’s just a quirk of technology. A telephone conversation is a full duplex real life conversation. Makes a lot of difference.

More links to the real time web :

Introduction to the Real-Time Web

The Real-Time Web – O’Reilly Broadcast

Is Real-time the Future of the Web?

Building Real Time Web Applications Using HTML 5 Web Sockets

How to Deal with the Real Time Web: Navigating the River

Real time search and filtered streams

For me the coolest part of microblogging is real time search. The significance of most information rapidly fades over time. So, getting short spurts of information as and when it happens is super useful. Twitter’s hash tags are a good example. You follow information rather than people. I find the following ways to track real-time information interesting.

  • Twitter hash tags. Send something like #iwant in the tweet and people can easily track your tweets and might contact you if they are selling what you want. This gives meta data to your tweet and makes it trackable by other users. The #iwant and #ihave tags are a good example. There are third party sites that consume the twitter firehose and build a realtime marketplace. Such mining verticals on real time data are fast emerging. Hugely popular twitter games like spymaster are another example.
  • Google’s “e-mail me when a new item is found that matches my search criteria” feature. This is super useful. If you know what you are searching for, you can be the first to find information as it comes live. Google can deliver this to you within a few hours of the information getting posted on the web. This is relatively real time (compared to others) but is still a push. When this becomes true real time, it will be awesome.
  • Aggregated real time search. There are several real time search engines out there that can get you real time feed search from twitter/facebook/identica etc in one interface. Some of these have AJAX powered interfaces and some have true real time with XMPP (collecta).
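The hash tag idea from the first bullet, following information rather than people, boils down to indexing a stream by tag. Here is a toy sketch in Python, assuming tweets are plain strings; the function name and the regular expression are my own and this does not use any Twitter API.

```python
import re
from collections import defaultdict

HASHTAG = re.compile(r"#(\w+)")


def index_by_hashtag(tweets):
    """Map each lower-cased hash tag to the tweets that mention it, in stream order."""
    index = defaultdict(list)
    for tweet in tweets:
        # A set so that a tag repeated within one tweet is only indexed once.
        for tag in {tag.lower() for tag in HASHTAG.findall(tweet)}:
            index[tag].append(tweet)
    return index
```

A marketplace style site would then pair up entries under `iwant` and `ihave`; the interesting (and hard) part is doing this continuously over the full firehose.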

The basic idea behind this is to subscribe to receive changes to data on the web, in real time, without tying in to specific web sites. Eventually, the bigger search engines like google/yahoo/bing will come up with a tighter integration of real time into all of the web that they index. With that, real time search will seamlessly integrate with the normal web search as we know it.

Instant information delivery

This category of real time applications aims to deliver your data of interest to you as soon as it’s posted. Sample scenarios here include:

  • Deliver blog posts and comments that you are interested in instantly. This builds on the current RSS based systems by bringing true real time to content delivery.
    • http://im.wordpress.com : WordPress now allows you to keep track of other WordPress blogs and their comments, and of comments on your own WordPress blog, over XMPP IM with clients like GTalk.
    • Tweet.IM delivers your twitter feed over IM in real time using XMPP
    • Friendfeed aggregates feeds from all your friends’ blogs, social streams and comments. Friendfeed has a feature that delivers these over XMPP IM as friendfeed finds them.
    • http://www.lazyfeed.com/
    • Google Wave uses XMPP IM protocol to sync wave content in real time.
    • There are a couple of startups like iNezha that provide real time updates to all the blogs that you are interested in.
  • Enable efficient content aggregation. Google PubSubHubbub (which pushes over HTTP callbacks) is a good example. There were also several experiments by companies like Gnip, FriendFeed etc to use XMPP for this purpose.
  • IM systems are integrated with email systems (by the same vendor) today. You get an email and you are presented with a toast by the IM system in real time.
  • Windows Live Messenger also supports alerts from different third party providers. This is real time events from arbitrary providers with whom you have registered your interest. You can click on the windows live alerts button on my blog and get alerts onto your messenger whenever I update my blog.

[More to come in a later post]

06.19.09

Protocols for the real-time web

Posted in NextWeb, Social Networking at 5:46 am by Ravikant Cherukuri

Today Collecta unveiled their real time search engine. It is one of several players in this fast evolving space. Others include twitter search, OneRiot, Tweetmeme, Facebook search, rumored google real-time search etc. The interesting thing about collecta is that they are true real time. You will see updates reach you within seconds (of collecta seeing them). This is because they use Jabber’s XMPP protocol to push updates to the client. This is one of many techniques used for real time communication on the web. Some were invented as people needed them, and others like XMPP are standardized. What are these techniques and how do they stack up?

As early as 10 years back, we started seeing applications that tried to bring you information as it changes or as it is created on the web. This evolved into RSS/Atom based feeds. This is a polling based pull that simulates push. Your browser or blog reader will periodically poll for changes to your feeds and update you when they change. This evolved into feed aggregation services where all your feeds are aggregated into a single feed that you can poll from the client. This makes it efficient to poll on the client, but the service that is the owner of the feed still gets the load. Consider this. For the web to be real time, the zillions of objects on the web, from product listings to blog posts to wikipedia articles, have to be able to communicate to users in real time. The feed model just doesn’t scale to this.

As the web UI evolved we needed web applications to be more responsive and so AJAX was born. AJAX (Asynchronous Javascript And Xml) enabled javascript to make XML based calls to the web server and get data back to the current page without reloading it. This made web apps more responsive but the model is still the same as javascript now used XMLHttpRequest to poll feeds from the server.

Then consider real time collaboration scenarios like instant messages, collaborative document editing etc. These need real time responses as other users are watching the screen to see them. These applications need to be more realtime, and constant polling will either overwhelm the servers (with a short polling interval) or degrade user experience (with a longer polling interval). Long polling / Comet comes to the rescue here. The browser keeps long running connections open with the server so the server can send events to the browser as they happen. The basic technique is for the browser to make a request to the server, to which the server does not respond till it has some data to send. Once the browser gets some data from the server, it makes another request to the server. Many web apps like gmail, facebook, meebo etc use this technique to bring real time functionality to the web.
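The server side of the long poll technique just described can be sketched with a blocking queue. This is a minimal Python illustration with no HTTP layer: the `LongPollChannel` class is hypothetical, and `poll` stands in for the request handler that holds the browser's request open.

```python
import queue


class LongPollChannel:
    """One browser's event channel: poll() blocks until there is something to send."""

    def __init__(self):
        self._events = queue.Queue()

    def publish(self, event):
        """Called by the server whenever something happens for this browser."""
        self._events.put(event)

    def poll(self, timeout=30.0):
        """Hold the request until an event arrives or the timeout expires."""
        try:
            first = self._events.get(timeout=timeout)
        except queue.Empty:
            return []  # empty response; the browser simply polls again
        events = [first]
        while True:    # hand over anything else that is already waiting
            try:
                events.append(self._events.get_nowait())
            except queue.Empty:
                return events
```

The browser loop is the mirror image: call poll, render whatever comes back, call poll again. The cost is one parked request per connected browser, which is why the scaling characteristics differ from ordinary short-lived HTTP requests.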

These techniques are also used by APIs that bring realtime to the web by implementing web wrappers around existing proprietary realtime protocols, like Messenger Web Toolkit, Web AIM, Yahoo messenger SDK etc. The Messenger web toolkit provides a cool feature that allows you to send non IM messages that can be used to build higher level collaboration applications.

The Jabber/XMPP protocol is an extensible protocol that is used for publish-subscribe (mainly in instant messaging). This protocol is finding its way into many real time web scenarios like -

  • real time search (Collecta), twitter (tweet.im), friendfeed,
  • aggregators like PixelPipe and Ping.fm that let you interact with your social networks via XMPP,
  • WordPress firehose, for partners like search engines and market intelligence providers who would like to ingest a real-time stream of new WordPress.com posts and comments the second they get published.
  • Twitter firehose where third parties can get the realtime stream of twitter data to mine and search.
  • Google wave extends XMPP to build a collaboration system

XMPP also has javascript APIs for the web like Strophe and xmpp4js. There is a technology similar to comet for XMPP to run on HTTP called BOSH. BOSH takes care of firewall traversal and tunnels XMPP over HTTP. Overall this is a fairly well designed and extensible protocol with a lot of good documentation and several reference implementations. This is becoming the protocol of choice for the real time web.

There is also the WebSockets API in the HTML5 specification. The HTML 5 specification introduces the Web Socket interface, which defines a full-duplex communications channel that operates over a single socket and is exposed via a JavaScript interface in HTML 5 compliant browsers. It traverses firewalls and proxies and provides bi-directional transport with streaming capability without the long poll overhead. The javascript API is also very simple. COMET can surely take advantage of this, and things become more straightforward without hidden iframes and arcane protocols. So can XMPP. Sounds like the holy grail in making the browser a two-way real-time medium.

This space is fast changing and a lot of smart folks are figuring out how to make the web more responsive and realtime. And the protocols keep evolving to accommodate that.

Update : There is a good article about XMPP progress in 2009 at http://blog.xmpp.org/index.php/2009/06/xmpp-roundup-10/.

06.18.09

HTML5 – the way ahead

Posted in NextWeb at 4:17 am by Ravikant Cherukuri

For many web developers, even though web 2.0 and AJAX were a giant leap, we were stuck in a place where HTML 4 (and its many interpretations) was restrictive, and there is still a difference between the look and feel of a well designed desktop app and a well designed web-app. Adobe flash, AIR, SilverLight etc attempt to bridge this gap. But what if HTML could do it all? Can JavaScript be powerful enough to do all the things that C#/ActionScript can do? Can HTML5 carry forward the experience and knowledge collectively gained in the last 20 odd years of UX programming? The HTML5 standardization process is going to take a few more years, but browsers are already implementing the draft standard. For developers and companies to invest in HTML5, we need a firm commitment from the browser makers. Especially since the last iteration did not go well.

Browser Alert

In the current imperfect world, HTML4 itself did not realize its potential because of the interpretations that different browsers made of the standard. The end result is all the browser specific quirks that developers are forced to learn. IE (6/7/8), Firefox, Opera, Safari and now Chrome: all of them need to get their act together. The user has choice today. So better get your act together or people will move to a different browser. This is the biggest hurdle for HTML5 and also a key test for browsers.

Flashy without flash

I started looking at HTML 5 like many others after watching the google wave demo. The fluidity and interactiveness of the interface is amazing.

A few more examples of what HTML + Javascript could do. Here is a visual studio like editor in HTML developed by firefox.

Firefox 3.5 implemented the video tag of HTML5. This is a really awesome demo.

OTOY is a company trying to take your console into the cloud. It does all the graphics processing on a server and just renders the picture onto your browser with no plugins or downloads. How cool is that?

HTML 5

What does HTML5 have that makes it powerful? There are several new features in HTML5 (list from wikipedia). It brings many patterns of usage today into the language.

  • New parsing rules oriented towards flexible parsing and compatibility
  • New elements – section, article, footer, audio, video, progress, nav, meter, time, aside, canvas, datagrid
  • New types of form controls – dates and times, email, url, search
  • New attributes – ping (on a and area), charset (on meta), async (on script)
  • Global attributes (that can be applied for every element) – id, tabindex, hidden
  • Deprecated elements dropped – center, font, strike, frames

The canvas tag makes it possible to get flexible drawing into the realm of HTML. I have used other javascript apis that simulate a canvas. These work by drawing 1 pixel divs that are absolutely positioned. The canvas tag is way more economical and elegant.

At last there are combo boxes that are native in HTML, using the datalist tag.

The input tag now has many new types – datetime, datetime-local, date, month, week, time, number, range, email, url, search, color. The progress tag will display a progress bar. It was so dumb that there were 100 different implementations for these. Many other new tags provide HTML native versions of elements that were not natively supported and that resourceful javascript junkies had to hand code (figure, footer, header, meter etc).

The video/audio tag (as seen in the firefox demo above), provides rich video/audio embedding and manipulation capabilities natively in HTML.

The WebSockets API provides in-built support for two way communication. This is a good idea given the proliferation of real-time web applications built with long-poll/COMET etc. This might take us to true real-time duplex connectivity. An interesting comparison of AJAX/COMET to HTML5 can be found here.

The <script> tag has an async attribute that lets the page load continue while the script is loading. This will speed up page loading and improve user experience. With the complexity and size of the scripts on the rise, this is very handy.

The browsing context defines the algorithm that browsers use for navigation and the related sessions history traversal. A browsing context is an environment in which Document objects are presented to the user.

Session history and navigation using javascript APIs.

Cross domain communication using Window.postMessage. This is already in most browsers, and what a relief. The earlier cross domain communication techniques (hacks) relied on changing the hashes (the text that follows #) in a URL and a background timer. eeeks!

Offline web application caches. Google gears/Windows live mesh etc have built this with no browser support, in a non standard way. Now HTML5 defines this as a standard. The ononline and onoffline events give javascript the ability to track connectivity and will enable the AJAX application to behave accordingly (detailed article).

DOM Storage as a way to store meaningful amounts of client-side data in a persistent and secure manner. John Resig of the jQuery fame wrote this article that explains this in detail. Looks very powerful and can be used to enable and optimize many desktop app like scenarios.

Editing API in combination with a new global contenteditable attribute and something called an UndoManager that will support undo/redo for content edits. Sounds good for totally in-browser content editors and development environments.

Support for spelling and grammar checking. Some drag and drop support. Many link types to give meaning to links. This looks like part of micro-format and semantic web support.

Frankly I thought there would be more to support semantic web, linked data and such. But maybe there is. This spec is a work in progress (a mess). Somewhere it mentions that the HTML5 spec has a lot of things that should have their own specs, but due to the lack of volunteers to own the specs, all the stuff got dumped into one. Sad for such an important piece of work.

With all these features and many more that I could not get to, HTML5 is aiming to bring the flexibility and performance of desktop applications to the web. I am sure over time, these features will get better defined so that they can be interpreted without ambiguity. This is important to make web applications work cross browser. The more I read about this stuff, the more I am convinced that this will succeed and take over many of the scenarios that are served by Flash, AIR and SilverLight. There might still be some applications where these will stay relevant. But if you are looking for state of the art presentation and interactivity, HTML5 will do it.

Developer’s Woes

jQuery and prototype are two JavaScript libraries with a lot of promise. Take a look at jQuery Tools and what HTML4 can do. With HTML5, and all the new functionality, javascript now is a very complicated (and powerful) language. How do you handle your application complexity and size as your javascript, HTML and CSS grow in size? There are many solutions for managing HTML like ASP.NET, JSP, PHP etc. How about your javascript? How would you unit test your javascript code? How would you make it reusable?

One option is to go with libraries like prototype and jQuery that make it very easy to do common tasks, and if you buy into their design philosophy, it’s a very compelling way to design your code as well. JavaScript is powerful and elegant. If you know how to write good code, that is. It’s not very difficult to get the hang of it. The advantage of this approach is that you stay true and native to JavaScript and can take advantage of the fast pace of innovation here. The prototype library has unit test support too. Never used it myself though.

The other option is to use the “more evolved” server side languages like java or C# to write your code and use a cross compiler to generate the javascript for you. GWT (Java -> JavaScript) and Script# (C# to JavaScript) are good examples here. There are also other cross compilers like pyjamas (Python -> JavaScript) and Objective-J (Objective-C to JavaScript). The advantage of using one of these approaches is that you could use your native language (if it’s not already JavaScript). You can use a development environment that is built to handle size and complexity (like Visual Studio/Eclipse etc) and take advantage of features like intellisense and refactoring tools. You could also leverage existing unit testing frameworks. Quite a few complex AJAX applications on the web today use this model.

If HTML5 is to replace the other desktop application development technologies, it would need more powerful and integrated tools so that you don’t have to be an expert to develop these apps. The HTML WYSIWYG tools have to evolve to support not only the language features but also the common patterns, so that well designed software will be easy to build.

Will update this blog with more exciting HTML5 thingies as I find them.

06.14.09

Ubiquitous, rich and real-time – the future of communication

Posted in Social Networking at 11:47 pm by Ravikant Cherukuri

Communication models on the internet have been changing at a good pace in the last 20 years. This pace has picked up quite a bit lately and a new model is emerging. Based on federation, content models and the real time web, the near future promises great innovations in this space. Faster machines, unlimited bandwidth, cloud computing and the faster more agile mindset of developers will hasten these changes and the gold rush is on to get there. Social streams, Google wave, web integration of instant messaging, Open ID/OAuth are all driving this change. Google Wave in particular is ambitious in getting to the next paradigm. Some of what I discuss below is already in the realm of what google is going to release this fall.

Ubiquitous

Ranging from BBS, news groups, e-mail, instant messaging, SMS and lately social streams (from myriad social networking sites), each of these services provides some niche service and has its entrenched users. More users are using more than one of these services, and the integration between them is minimal and piecemeal. Aggregation services (like friendfeed) are going to play a big role going forward. Unifying these means of communication so they seamlessly inter-operate is a challenge not yet solved, though many companies worked on this for decades. With the web (HTML / HTTP / javascript) being the point where different technologies and platforms merge and gel together, this is more possible than ever. The race to be the front-end where the users look to consume all this data is on. Facebook, friendfeed, google, microsoft are all looking to retain their users by bringing all the users’ data into their presentation realm. Federation with other services is imperative in this more nuanced version of walled gardens.

Being ubiquitous is being able to roam your identity across several technologies (this is becoming a reality with the widespread acceptance of OpenID/OAuth). But perhaps more subtly, it also means that your conversations roam with you. For example, you should be able to continue on Twitter a conversation you started on a blog. You should be able to reply to a comment on your Facebook wall from an instant messaging conversation window. Taking this one step further, you should be able to take your conversation into the context of any web resource that you are browsing. Being ubiquitous also means that on my cell phone I have just one app that gets me all my communications. One push interface that gets me my information from all my channels. One UX for me to look at everything.

Rich

What else can make this communication bus more effective? Most information shared and exchanged in the older models (like e-mail) is text with some URLs and images. The tools to express yourself are evolving. I can share much more than text and links in my Facebook feed. I can share richer, application-specific posts with which my friends can interact easily. Still, what I can share and how easily I can do it depends on a lot of factors. Being able to share from desktop applications and web sites to any of my streams at different venues (Hotmail in-box, Facebook news feed, Twitter feed etc.) easily and uniformly is the next big step. What I am sharing has to be decoupled from how I am sharing it. That is, the technology that knows about what I am sharing needs to inter-operate with the technology that is the transport that gets the shared information from one place to another. The goal is to combine the complexity and richness of today’s (and tomorrow’s) apps with the simplicity of SMTP. Being able to embed objects into conversations gives context and enhances the value of the communique to the users.

Another aspect of richness of communication is persistence and the conversation object model. Let me explain a bit here. A lot of today’s technologies are basic transports. They just get the information that you share from you to your buddy. Most of the semantics of the conversation is lost. Today, in the case of e-mail, you can manually organize your mail into folders to make it easier on yourself. Gmail broke this paradigm by making mail searchable (which works very well for me). Google Wave promises to bring a wiki-like collaboration model to the mix. IMHO, this is a big step. Being able to have the information exchanged in a group conversation automatically available and organized for future reference is big. Today, most technologies model the user and the buddies as entities that you can interact with. But conversation content is not treated that well. It’s mostly treated as blobs passed between users. If the semantics of the conversation are understood and included in the object model of the application, a lot of possibilities open up. The semantic web itself is progressing (with micro-formats and now common-tags) in the direction of empowering users to define semantic concepts within the context of their content and link them to the greater web. The same could be done with conversations too. Tools for the participants to organize the ideas and concepts under discussion would help.

Real-time

In most of today’s communication applications, real time is integrated as an afterthought, like integrating instant messaging into e-mail. You still have two distinct apps here; it’s just that you can access them from one UX. This is often overlooked because a few minutes’ delay (as in e-mail) doesn’t sound that bad for most conversations. Also, we are used to technologies that poll and pull data for us. But real-time protocols like XMPP are the new gold standard for responsive, interactive collaboration. Being able to collaboratively edit a document (floor plans, blueprints, health records) and see each other’s changes in real time, reliably, is vital. As important is the ability to preserve this data in a rich conversation repository that can be edited and enhanced later.

There are several products today that provide a complete stack for real-time collaboration. The problem with these is that they are constrained to their domain and are not built for building upon. They are not built as a platform. By far, XMPP has evolved into the most extensible and standards-based real-time transport. It should be treated as TCP is for the network and as HTTP is for the web. Application-level protocols are being built on top of XMPP as extensions (Jingle, for example, is the XMPP extension that Google Talk uses for voice). Rich and real-time should be built on such a standards-based stack to be able to scale and federate with the myriad technologies and social networks.

Real-time takes some of the effort out of following all the information you are interested in on the web. You can rest assured that if something happens you will get to know, and you will get to know as soon as it happens. You can put the tired F5 key to rest. Real-time also takes some effort out of what the services need to do to keep you updated. Polling takes a lot of resources, especially at web scale. Imagine 50 million people each following 100 objects on the web, each wanting to know the moment an object changes. Here is an interesting presentation on how FriendFeed crawled Flickr 3 million times for 45,000 users, only 6K of whom were logged in.
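
To put rough numbers on the polling cost, here is a back-of-the-envelope sketch in Python. The 50 million users, the 100 objects each and the FriendFeed/Flickr figures come from the paragraph above; the 60-second poll interval is an assumption picked only for illustration, and `polls_per_second` is a hypothetical helper name.

```python
def polls_per_second(users: int, objects_per_user: int, poll_interval_s: float) -> float:
    """Requests per second if every followed object is polled once per interval."""
    return users * objects_per_user / poll_interval_s


# 50 million users following 100 objects each (from the text above).
# The 60-second poll interval is an assumption made for this sketch.
rate = polls_per_second(50_000_000, 100, 60.0)
print(f"{rate:,.0f} polls per second")

# FriendFeed/Flickr figures quoted above.
crawls, users, logged_in = 3_000_000, 45_000, 6_000
print(f"{crawls / users:.1f} crawls per user")
print(f"{crawls / logged_in:.0f} crawls per logged-in user")
```

Under those assumptions the polling load works out to roughly 83 million requests per second, most of which would come back with nothing new. The FriendFeed figures come to about 67 crawls per user, or 500 per logged-in user. A push model only sends a message when an object actually changes.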

05.29.09

Google Wave and Live Mesh

Posted in Uncategorized at 4:49 am by Ravikant Cherukuri

Google announced its new communication and collaboration tool called Google Wave today. It’s an exciting piece of technology that could lead the web in the way we consume and contribute to social networks. It brings e-mail, IM and social streams together into conversations managed and consumed from a single interface. Federating web conversations from third-party sites gets it to where FriendFeed is today and where Facebook wants to be. But the integration with e-mail makes it instantly useful.

The most interesting thing for me was the real-time nature of conversations. Characters are transferred as you type, and photo sharing looks very real-time, with thumbnails appearing before the full pictures; all of this makes the conversation seem real. These are little things that will improve the user experience. Real-time collaboration is really cool and surely looks like where things are going in Web 3.0.

FriendFeed does a good job of aggregating conversations from around the web. The AlertThingy FriendFeed edition uses the FriendFeed API to provide a cool conversation interface in a desktop app. The Wave conversation reminds me of that and then some.

Though not obvious on first look, there are many similarities between Google Wave and Microsoft’s Live Mesh. Live Mesh has a platform to collaborate and communicate. It lacks a compelling application that can demo its capabilities as well as the Wave demo. An application that is similar to the Google Wave could be built on top of Mesh. Would be interesting to see if something like that comes out.

Mesh works on top of entities called mesh objects. And the mesh platform provides a pub-sub platform for changes to the mesh object. This facilitates sharing and collaboration. Google Wave has the Wave entity and the blip entity that is a change to the wave. Both of them have some kind of pub-sub infrastructure behind them. Both of them provide a good abstraction to represent higher level entities in communication that will take social networking from being able to share blurbs and links to sharing objects with context.
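
The pub-sub idea common to both platforms can be sketched in a few lines of Python. This is only an illustration of the pattern, not the Live Mesh or Google Wave API: `ObjectHub`, `subscribe`, `publish` and the shape of the change dictionary are all hypothetical names chosen for this example.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical types: an object id is a string, a change is a plain dict.
Callback = Callable[[str, dict], None]


class ObjectHub:
    """Toy in-memory hub that fans changes to a shared object out to subscribers."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, object_id: str, callback: Callback) -> None:
        """Register interest in changes to one object."""
        self._subscribers[object_id].append(callback)

    def publish(self, object_id: str, change: dict) -> int:
        """Push a change to every subscriber of the object; return how many were notified."""
        callbacks = list(self._subscribers.get(object_id, []))
        for callback in callbacks:
            callback(object_id, change)
        return len(callbacks)


hub = ObjectHub()
received = []
hub.subscribe("shared-doc", lambda oid, change: received.append((oid, change)))

notified = hub.publish("shared-doc", {"author": "alice", "text": "hello"})
ignored = hub.publish("other-doc", {"author": "bob", "text": "nobody is listening"})
print(notified, ignored, received)
```

The point of the pattern is that the subscriber never asks whether the object changed. It gets called when a change is published, and a change to an object nobody follows costs nothing.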

05.14.09

Browser extensions as a service

Posted in Browser SOA at 12:26 am by Ravikant Cherukuri

I hate browser extensions. If you do too and you don’t need any more reasons to convince you they are evil, move on to the second part of my post.

PART I : EXTENSIONS ARE EVIL

I have been using extensions/add-ons for a long time to customize IE and Firefox and get more out of my browser. There are many good extensions out there that provide a lot of great functionality. But there are also a lot of malicious ones that have caused havoc, the result being that extensions are viewed as security holes and people are wary of them. Add to that the fact that the IE extension programming model is primitive and gives every extension that you install the keys to the kingdom. Extensions are a little better on Firefox, but either way they have a lot of downsides.

  • [IE] No sandbox in which the extensions work. They can pretty much do as they please. Brings down the trust level. Much better in Firefox.
  • [IE] Based on ActiveX. Yuck! Again much better in Firefox with XUL and a JavaScript based programming model.
  • [IE] Most browser crashes are induced by extensions. Once they crash, the user disables them and that’s the end of the story. Again much better in Firefox. Much harder to crash Firefox from a JavaScript based extension.
  • Makes the browser slow even when you are not using the extension. Applies to all browsers.
  • Extension developers have to write multiple versions of their extension for each browser.
  • Sucks to be in the old world of installed applications. Firefox again has an auto-update feature that can keep your extension up to date with new features. But still it’s not as good as being a service. IE leaves it to the extension writer.
  • Extensions don’t roam. You will have to install the extensions you need to all of your machines.
  • Extensions will not work on your mobile devices. Or even if your extension developer supports your mobile device, you will have to install each on your mobile phone.

OK, with so many headaches and low trust, people end up not installing (m)any extensions. Let’s ignore the ones that come factory-installed for the moment. That’s a rant for another post. And remember, it’s easy to disable extensions and toolbars even in IE from 7.0 onwards. So pre-installing only elevates you to spammer status.

PART II : SERVICE ORIENTED EXTENSION MODEL

Many of the same problems have been solved when desktop installed applications transitioned to web based services. If browser extensions can be provided as a service, it will be more acceptable to the end user and extensible for the extension (service) developer. Trust, distribution, ease of use, customization, roaming, discovery etc are problems that have been solved for the web.

Greasemonkey brought a new approach to extending the browser. Install one extension and add scripts that will enhance different sites. It has the potential of customizing different sites to your liking and mashing a few of them up to see what you want, the way you want it. But I always felt that the experience Greasemonkey gives you is disjointed. There are all these cool scripts that will work if you know that they exist and how to use them. But there is no common thread that brings all these experiences together.

Bookmarklets are generally used to write one click helpers that extract data from the current page and process it in some way. For example,

    • Press This : take text from the current page and post it to your blog.
    • Readability : format the current page for readability (really good)
    • Favelet suite : reverse engineer a web site using the bookmarklet tools

With a combination of JavaScript injection, HTML 5 features like DOM Storage, cross document messaging and bookmarklets, it is possible to implement “extension as a service”. What does this mean?

  • Needs no installation when used as a bookmarklet
  • Can download extension applets on demand and based on context
  • Extension code hosted as a service
  • User specific extension data stored on the service
  • Can easily roam with the user
  • Same functionality for all browsers with one code base
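
As a rough sketch of the first few points, the service could hand each user a bookmarklet that injects a hosted loader script into whatever page is open. The Python below only builds that bookmarklet link on the server side. The loader URL and the `build_bookmarklet` helper are hypothetical names made up for this example, and what the loader script does once injected is left out.

```python
import json
from urllib.parse import quote, unquote

# Hypothetical address of the hosted extension loader script.
LOADER_URL = "https://extensions.example.com/loader.js"


def build_bookmarklet(loader_url: str) -> str:
    """Return a javascript: URL that injects the hosted loader into the current page."""
    script = (
        "(function(){"
        "var s=document.createElement('script');"
        f"s.src={json.dumps(loader_url)};"
        "document.body.appendChild(s);"
        "})();"
    )
    # Percent-encode the script so it survives being stored as a bookmark URL.
    return "javascript:" + quote(script, safe="")


link = build_bookmarklet(LOADER_URL)
print(link)
print(unquote(link))  # human-readable form of the same bookmarklet
```

Because the bookmarklet only points at the hosted loader, the extension code can change on the service without the user reinstalling anything, and the same link works in any browser that supports bookmarklets.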

More on the architecture of this service in a later post.

04.30.09

Walled gardens

Posted in Social Networking at 1:58 am by Ravikant Cherukuri

Wikipedia says -

“A walled garden, with regards to media content, refers to a closed set or exclusive set of information services provided for users (a method of creating a monopoly or securing an information system).”

In the context of social networks, these are the networks that want you to stay in network by locking your information in network. Social networks are a dime a dozen. Some are old and some new. Some are popular and some not so much. With innovation on steroids, how would a successful network retain its user base when what the user is looking for changes faster and the want (need??) for new ways to communicate becomes stronger by the day? I will move from MySpace to Facebook (or from Facebook to some other network) in a minute if I am convinced that the new one gives me more.

When is a walled garden a good thing? Not after the party moves outside. No wonder many social networks are racing to share and inter-operate so their users can party inside their network. Small companies are increasingly able to out-innovate any established player (and become big themselves). At the same time, there is very little brand loyalty. People go wherever it is hip to go today or to whatever service serves them best, and rightly so. Any revenue model that depends on users staying in a walled network will not work. So, what can one do to keep users in?

Innovate. A good strategy but innovation within a company cannot consistently beat the distributed innovation that happens across the many startup garages and passionate minds. This distributed innovation will always win. It takes so little to start a service. Think youtube, facebook etc.

Acquire. Leave the innovation to the small and nimble guys and acquire them to build your service. Easier said than done. Acquisitions rarely work. The individual pieces are worth more than the whole. Sure, the founders will make money and the big company that takes over will make some news. But the probability of success, let alone guaranteed success, is low. Hotmail was a good acquisition for Microsoft and YouTube might prove to be one for Google. But it is surely hard to pull off.

Partner. This is the model that many networks today are shooting for. Provide a platform for the small guys to innovate on. The small guy who used to reverse engineer the protocols of the walled garden is the most prized asset today. Take advantage of distributed innovation and keep the users in your network with the diversity of content. But at the same time, an open API could ruin your bottom line: fewer banner ads, as people are viewing your content via a third-party client. Sure. But it’s better than extinction. When you partner to share the innovation, you partner to share the revenue too. The hope is that you will expand faster and live longer. Win with scale. This definitely looks like a workable model.

I am a big fan of The Innovator’s Dilemma. It explains why the leader in one generation of technology fails to lead the next, even if they invented the next-generation technology. Disk drive manufacturers are used as one of the examples. Today, with the faster pace of innovation and the easier path from idea to reality, this is truer than ever. The main reason for not making it in the next generation is that the leader is busy making money in the current setup and is reluctant to rock the boat.

Keep trying to disrupt yourself. Or somebody else will. Be paranoid. Keep your boat rocking. Linear plans are for suckers (or the Bell companies of the 70s). In fact, if you are making any money today, somebody is already trying to disrupt you out of existence. If you succeed in convincing your users that your old service is a joke and that they should use your new service that flips the current paradigm, good! You still have your users. If somebody else convinces your users, you are extinct.

Being open to interop is not optional. It’s necessary to survive. At the same time, openness should be coupled with a business model that can make money in the new world. An API that exposes your service for others to build upon is good. But compelling features that bring users to your service should go along with it.
