Thoughts on technology and social web

March 2, 2010

Pubsubhubbub with .NET and WCF – Part 1

Filed under: Uncategorized — Ravikant Cherukuri @ 12:42 am

Pubsubhubbub (PuSH) has been gaining momentum in the real-time space as a protocol for server-to-server notifications of changes to feeds. The Google Buzz API is based on PuSH. Several Google services have adopted it, and it is now gaining traction among other services as well. Superfeedr is my favorite startup in this space. They host hubs for your feeds, handling the complexity of hub implementation for you. The real-time web is becoming truly real-time now, and companies like Superfeedr could provide the backbone for this new model.

The PuSH protocol itself is simple to implement and is defined in an open spec. Most of the code samples available for it are in Ruby, Python, PHP and so on. I started working on a PuSH subscriber in C# with a WCF REST service. WCF provides an easy-to-use framework for building web services in .NET, but implementing PuSH on it means working around a few WCF quirks. In this post I explain how to build a WCF subscriber service.

Here are a couple of good posts about how to use and implement PuSH.

http://blog.superfeedr.com/API/pubsubhubbub/getting-started-with-pubsubhubbub/
http://josephsmarr.com/2010/03/01/implementing-pubsubhubbub-subscriber-support-a-step-by-step-guide/

We need a service that supports the following REST APIs to behave as a PuSH subscriber.

  • Verify
    This will receive a GET request to verify the subscription once we POST a subscribe to the hub.
  • Callback
    This will receive new content posted to the feed.

Note that PuSH is a server-to-server protocol, so the subscribing entity is also a server. There are two pieces of code we need to write for the subscriber. One is the client code that posts a subscribe request to the hub, and the other is the service that responds to the verification and data callbacks. The service is implemented using WCF, and the subscriber client library using HttpWebRequest. I used the WcfRestContrib library to encode and decode requests and responses. The WCF service interface looks like this.

[ServiceContract]
public interface IPubSubHubbub
{
    [OperationContract]
    [WebInvoke(Method = Verbs.Get, UriTemplate = "/callback3?hub.mode={mode}&hub.topic={topic}&hub.challenge={challenge}&hub.lease_seconds={leaseSeconds}&hub.verify_token={verifyToken}")]
    Stream Verify(SubscriptionMode mode, Uri topic, string challenge, int leaseSeconds, string verifyToken);

    [OperationContract]
    [WebInvoke(Method=Verbs.Post, UriTemplate = "/callback3")]
    [ServiceKnownType(typeof(Atom10FeedFormatter))]
    void Callback(SyndicationFeedFormatter formatter);
}

When a subscriber subscribes to a hub, it provides a callback URL. The hub calls this URL to verify the subscription. Later, when new content is published to the topic, the hub calls the same URL with the content itself (PuSH does a fat ping). This is why both methods above have the same base URI: verification is a GET request, while the data callback is a POST in Atom format. The subscription mode is defined with a DataContract, as below. Notice that the Verify method returns a Stream. This is because Verify needs to echo the challenge that the hub sends as “text/plain”.

[DataContract]
public enum SubscriptionMode
{
    [EnumMember(Value = "subscribe")]
    Subscribe,
    [EnumMember(Value = "unsubscribe")]
    Unsubscribe,
}

Below is the WCF server implementation.

[AspNetCompatibilityRequirements(RequirementsMode = AspNetCompatibilityRequirementsMode.Allowed)]
[WebDispatchFormatterConfiguration("application/x-www-form-urlencoded")]
[WebDispatchFormatterMimeType(typeof(WcfRestContrib.ServiceModel.Dispatcher.Formatters.FormUrlEncoded), "application/x-www-form-urlencoded")]
[WebDispatchFormatterMimeType(typeof(WcfRestContrib.ServiceModel.Dispatcher.Formatters.PoxDataContract), "application/atom+xml")]
public class Service1 : IPubSubHubbub
{
    public Stream Verify(SubscriptionMode mode, Uri topic, string challenge, int leaseSeconds, string verifyToken)
    {
        WebOperationContext.Current.OutgoingResponse.ContentType = "text/plain";
        return new MemoryStream(Encoding.UTF8.GetBytes(challenge));
    }

    public void Callback(SyndicationFeedFormatter formatter)
    {
        ServiceReference1.Service1Client service = new ServiceReference1.Service1Client();

        foreach (SyndicationItem item in formatter.Feed.Items)
        {
            // Process each new or updated entry here.
        }
    }
}

Things of note here.

  • The "application/x-www-form-urlencoded" encoding is handled by the WcfRestContrib library’s FormUrlEncoded formatter class.

  • The Verify method returns the challenge as a memory stream and sets the content type to "text/plain" on the WebOperationContext. Without this, the default content type will be "text/xml" and the hub would fail the subscription.

  • The Callback method receives a parsed feed. The SyndicationFeedFormatter should be able to handle both Atom and RSS feeds, though I tested it only with Atom.
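
The Verify method above echoes any challenge it is given. A stricter subscriber would first check that the topic and verify token match a subscription it actually requested. Here is a minimal sketch of that check. PendingSubscriptions is a hypothetical helper of my own, not part of WCF or WcfRestContrib, and it keeps its state in memory only.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of WCF or WcfRestContrib): remembers which
// verify token was sent for each topic so Verify can check the hub's request.
public sealed class PendingSubscriptions
{
    // Topic URL -> verify token sent in the subscribe request.
    private readonly Dictionary<string, string> tokensByTopic =
        new Dictionary<string, string>(StringComparer.Ordinal);

    // Call this just before posting the subscribe request to the hub.
    public void Expect(string topic, string verifyToken)
    {
        tokensByTopic[topic] = verifyToken;
    }

    // True only when the hub echoes a topic and token that we asked for.
    public bool ShouldConfirm(string topic, string verifyToken)
    {
        string expected;
        return topic != null
            && tokensByTopic.TryGetValue(topic, out expected)
            && string.Equals(expected, verifyToken, StringComparison.Ordinal);
    }
}
```

Inside Verify, one option is to call ShouldConfirm with the topic string and the verify token, and echo the challenge only when it returns true. Otherwise the service can set WebOperationContext.Current.OutgoingResponse.StatusCode to HttpStatusCode.NotFound, which is how the PuSH spec says a subscriber rejects a verification. Note that Verify receives the topic as a Uri, so the string comparison assumes it round-trips to the same text that was registered.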

To establish the subscription, the subscriber needs to make a REST call to the hub. The only thing the subscriber knows at the start is the URL of the feed it wants to subscribe to. The feed header has a link with rel=“hub” that points to the hub for this feed.

// Requires: using System.IO; using System.Net; using System.Xml;
String GetHubFromFeed(String feedUrl)
{
    WebRequest r = WebRequest.Create(feedUrl);

    using (WebResponse re = r.GetResponse())
    using (StreamReader sr = new StreamReader(re.GetResponseStream()))
    {
        XmlDocument doc = new XmlDocument();

        doc.LoadXml(sr.ReadToEnd());
        XmlNamespaceManager nsMgr = new XmlNamespaceManager(doc.NameTable);
        nsMgr.AddNamespace("atom", "http://www.w3.org/2005/Atom");

        // The hub is advertised as a feed-level <link rel="hub" href="..."/>.
        XmlElement hubElement = (XmlElement)doc.SelectSingleNode(
            "//atom:feed/atom:link[@rel='hub']", nsMgr);

        // Returns null when the feed does not advertise a hub.
        return hubElement == null ? null : hubElement.GetAttribute("href");
    }
}

Once we have the hub URL, we need to set up a subscription on the hub for our topic URL. The topic URL is the URL in the atom:link with rel=”self” (there has been discussion in the PuSH group about being able to subscribe with any URL that points to the feed, and in the latest version of the spec it's no longer a requirement to subscribe to the “self” URL).

// Requires: using System.Collections.Specialized; using System.IO;
// using System.Net; using System.Text; using System.Web;

// Form parameter names from the PuSH spec (assumed values for the
// identifiers used below).
const string hubCallbackParamName = "hub.callback";
const string hubMode = "hub.mode";
const string hubTopic = "hub.topic";
const string hubVerify = "hub.verify";
const string hubVerifyToken = "hub.verify_token";

public String Subscribe(string feedUrl, string hubUrl, string callbackUrl)
{
    StringBuilder contentBuilder = new StringBuilder();
    NameValueCollection collection = new NameValueCollection();

    collection.Add(hubCallbackParamName, callbackUrl);
    collection.Add(hubMode, "subscribe");
    collection.Add(hubTopic, feedUrl);
    collection.Add(hubVerify, "sync");
    collection.Add(hubVerifyToken, "test_token");

    // Build the form-encoded body: key=value pairs joined with '&'.
    for (int i = 0; i < collection.Keys.Count; i++)
    {
        contentBuilder.AppendFormat("{0}={1}&",
            HttpUtility.UrlEncode(collection.GetKey(i)),
            HttpUtility.UrlEncode(collection.Get(i)));
    }

    // Drop the trailing '&'.
    int stringLength = contentBuilder.Length;
    byte[] requestContent = Encoding.UTF8.GetBytes(
        contentBuilder.ToString(0, stringLength > 0 ? stringLength - 1 : 0));

    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(hubUrl);
    request.Method = "POST";
    request.ContentType = "application/x-www-form-urlencoded";
    request.ContentLength = requestContent.Length;

    using (Stream requestStream = request.GetRequestStream())
    {
        requestStream.Write(requestContent, 0, requestContent.Length);
    }

    // GetResponse throws a WebException for non-2xx replies. The hub may
    // answer with an empty body, so read to the end of the stream instead
    // of sizing a buffer from ContentLength.
    using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
    using (StreamReader reader = new StreamReader(response.GetResponseStream(), Encoding.UTF8))
    {
        return reader.ReadToEnd();
    }
}
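
The hub URL and the topic URL passed to Subscribe both come from feed-level atom:link elements, one with rel="hub" and one with rel="self". As a rough sketch, assuming the feed has already been downloaded into a string, both can be read with LINQ to XML. FeedLinks and FindLink are names I made up for this illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper for illustration: reads feed-level links from Atom XML.
public static class FeedLinks
{
    private static readonly XNamespace Atom = "http://www.w3.org/2005/Atom";

    // Returns the href of the first feed-level atom:link with the given rel,
    // or null when the document is not an Atom feed or has no such link.
    public static string FindLink(string atomXml, string rel)
    {
        XElement feed = XDocument.Parse(atomXml).Root;
        if (feed == null || feed.Name != Atom + "feed")
        {
            return null;
        }

        return feed.Elements(Atom + "link")
            .Where(link => (string)link.Attribute("rel") == rel)
            .Select(link => (string)link.Attribute("href"))
            .FirstOrDefault();
    }
}
```

With this, FindLink(xml, "hub") gives the hub URL and FindLink(xml, "self") gives the topic URL. Links inside individual entries are ignored because only direct children of the feed element are searched.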

This completes a PuSH client implementation. In a later post I will go over the implementation of a hub in WCF.

October 19, 2009

Workflow with activity streams

Filed under: Uncategorized — Tags: — Ravikant Cherukuri @ 11:15 pm

For a while I have been thinking about how the social stream technologies (facebook newsfeed/twitter timeline) can be used to solve other problems. Facebook and the other 100 social networks out there have familiarized millions of users with the concept of a lifestream. The activity streams effort puts a standardized structure on the lifestream, which enables activities to roam across networks. A lot of energy is being spent on how this roaming can be standardized and made to work as seamlessly as the telephone network works today. All these ideas are building a platform for the future web. It’s time to start building the applications for it.

Today we think of our lifestream as the flow of stuff we like, want to share, want to comment on and so on. It is intended for people in our network to consume, and the data is pretty much unstructured. But life includes much more than what you share with your friends. If the tasks that you need to accomplish day to day become part of your activity stream, lifestream applications can help you achieve them. These streams could be private, and you could share them with applications that you choose. For example,

  • Your calendar could talk to you via your lifestream. Something like evite could be managed via people’s lifestreams. You post the invite on your stream and make it visible to the people you want to invite. They could accept the invitation right from your stream.
  • Another example: you drop your car at the dealer and you want him to contact you when it's done. In a world where lifestreams are ubiquitous, you should be able to expose a private, time-bound stream to him that he could write to (just like you share a telephone number today). He could talk back to you on this stream.
  • You could open your stream to your travel agent, who could push itineraries to you that you could comment on or accept. Once you accept one, the stream could be handed over to the airline and so on.

The trick is to do this in a way that is simple for users, secure in that it does not expose more data than necessary to third parties, and network-independent. OAuth and activity streams look like the right technologies. Enabling the devices and services that you use day to day to interface with you through your lifestream seems like a logical step forward. These kinds of things have happened at a crude level over email. But email still does not have a standardized content format, and so does not lend itself well to programmability.

August 13, 2009

Real time roundup [Part 2] — responsive web applications

Filed under: Uncategorized — Ravikant Cherukuri @ 2:22 pm

This aspect of the real-time stack is often hard to appreciate, but only until you experience it. Think about the instant delivery of notifications in Facebook or the character-by-character real-time transmission in Google Wave. On the face of it, it might look like a minor feature, but for UX responsiveness it is big. Once you use these apps, the rest of the world looks sluggish, much like a dial-up internet connection feels today. This is as much of a leap as the transition from refreshing web pages to see updates to letting the RSS reader do it for you.

This pattern can be applied to many web applications. The ability to react to user and data events in real time is the key that will close the gap between web applications and desktop applications. With HTML5 WebSockets this might become more standardized, but even today long polling, Comet and similar techniques are pretty scalable and general purpose. Once common web programming environments like ASP.NET, JSP and PHP embrace this concept fully, there is great potential here.

This could even take the form of a generic pub-sub provided as a secure cloud service. As a web site developer, I need a pipe to transfer data to and from the browser in real time. But I don't want to invest in a pub-sub infrastructure that is very different from the HTTP programming model I am used to and has entirely different scaling characteristics. If this were provided as a service on Microsoft Azure, Amazon EC2 or Google App Engine, it would be much easier to integrate into the current programming model. But what would such a reusable model include? A few things off the top of my head:

  1. A long-poll connection to the server, driven by a JavaScript component embedded in the web pages. This enables the duplex connectivity.
  2. A pub-sub infrastructure that provides some session storage and topic-based subscriptions for the sessions. For example, when a user visits a real-time web page, a session is created for it on the pub-sub backend. The web site (instance) then registers for the topics that it is interested in. The web server publishes updates to these topics to the pub-sub system, and the pub-sub system pushes them to the browsers.
  3. Topics can be transient entries in the pub-sub system that are created as pages subscribe to them.
  4. An API exposed to the web site backend that pushes arbitrary topic updates to the pub-sub system.

This kind of an infrastructure component provided as a piece of code to be reused or as a cloud service would make it easy to build real time responsive web sites.
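
To make the list above a little more concrete, here is a small in-memory sketch of the session and topic bookkeeping. TopicBroker and all of its members are hypothetical names of my own, and a real service would add persistence, security and a long-poll endpoint that waits for data instead of returning immediately.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for the pub-sub backend described above.
public sealed class TopicBroker
{
    private readonly object gate = new object();

    // Topic -> ids of the sessions subscribed to it.
    private readonly Dictionary<string, HashSet<string>> sessionsByTopic =
        new Dictionary<string, HashSet<string>>();

    // Session id -> updates waiting for that session's next poll.
    private readonly Dictionary<string, Queue<string>> pendingBySession =
        new Dictionary<string, Queue<string>>();

    // Registers a page's session for a topic. The topic is created on demand.
    public void Subscribe(string sessionId, string topic)
    {
        lock (gate)
        {
            HashSet<string> sessions;
            if (!sessionsByTopic.TryGetValue(topic, out sessions))
            {
                sessions = new HashSet<string>();
                sessionsByTopic[topic] = sessions;
            }
            sessions.Add(sessionId);
            if (!pendingBySession.ContainsKey(sessionId))
            {
                pendingBySession[sessionId] = new Queue<string>();
            }
        }
    }

    // Drops a session, and any topic that no longer has subscribers.
    public void EndSession(string sessionId)
    {
        lock (gate)
        {
            pendingBySession.Remove(sessionId);
            foreach (string topic in new List<string>(sessionsByTopic.Keys))
            {
                HashSet<string> sessions = sessionsByTopic[topic];
                sessions.Remove(sessionId);
                if (sessions.Count == 0)
                {
                    sessionsByTopic.Remove(topic);
                }
            }
        }
    }

    // Called by the web site backend. Returns how many sessions were notified.
    public int Publish(string topic, string payload)
    {
        lock (gate)
        {
            HashSet<string> sessions;
            if (!sessionsByTopic.TryGetValue(topic, out sessions))
            {
                return 0;
            }
            foreach (string sessionId in sessions)
            {
                pendingBySession[sessionId].Enqueue(payload);
            }
            return sessions.Count;
        }
    }

    // What a poll from the browser would return: all queued updates, in order.
    public List<string> Drain(string sessionId)
    {
        lock (gate)
        {
            List<string> updates = new List<string>();
            Queue<string> queue;
            if (pendingBySession.TryGetValue(sessionId, out queue))
            {
                while (queue.Count > 0)
                {
                    updates.Add(queue.Dequeue());
                }
            }
            return updates;
        }
    }
}
```

Subscribe and EndSession cover items 2 and 3, Publish is the backend API from item 4, and Drain is the piece that the long-poll connection from item 1 would call. A single lock keeps the sketch simple at the cost of throughput.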

May 29, 2009

Google Wave and Live Mesh

Filed under: Uncategorized — Ravikant Cherukuri @ 4:49 am

Google released its new communication and collaboration tool, Google Wave, today. It's an exciting piece of technology that could lead the web in the way we consume and contribute to social networks. It brings e-mail, IM and social streams together into conversations managed and consumed from a single interface. Federating web conversations from third-party sites gets it to where friendfeed is today and where facebook wants to be. But the integration with email makes it instantly useful.

The most interesting thing for me was the real-time nature of conversations. Characters are transferred as you type, and photo sharing looks very real-time, with thumbnails appearing before the full pictures. Details like these make the conversation seem real. They are little things that improve the user experience. Real-time collaboration is really cool and surely looks like where things are going in Web 3.0.

Friendfeed does a good job of aggregating conversations from around the web. AlertThingy friendfeed edition uses the friendfeed API to provide a cool conversation interface in a desktop app. The wave conversation reminds me of that and then some.

Though not obvious on first look, there are many similarities between Google Wave and Microsoft’s Live Mesh. Live Mesh has a platform to collaborate and communicate. It lacks a compelling application that can demo its capabilities as well as the Wave demo. An application that is similar to the Google Wave could be built on top of Mesh. Would be interesting to see if something like that comes out.

Mesh works on top of entities called mesh objects. And the mesh platform provides a pub-sub platform for changes to the mesh object. This facilitates sharing and collaboration. Google Wave has the Wave entity and the blip entity that is a change to the wave. Both of them have some kind of pub-sub infrastructure behind them. Both of them provide a good abstraction to represent higher level entities in communication that will take social networking from being able to share blurbs and links to sharing objects with context.
