July 15, 2006
@ 10:25 PM

Nathan Torkington has a blog post entitled A Week in the Valley: GData on the O'Reilly Radar blog that talks about the growth in usage of GData and the Atom Publishing Protocol within Google, as well as Mark Lucovsky's take on how this compares to his time at Microsoft working on Hailstorm. Nat writes

They're building APIs to your Google-stored data via GData, and it's all very reminiscent of HailStorm. Mark, of course, was the architect of that. So why's he coming up with more strategies to the same ends? I figure he's hoping Google won't screw it up by being greedy, the way Microsoft did...The reaction to the GData APIs for Calendar have been very positive. This is in contrast to HailStorm, of course, which was distrusted and eventually morphed its way through different product names into oblivion. Noting that Mark's trying again with the idea of open APIs to your personal data, I joked that GData should really be "GStorm". Mark deadpanned, " I wanted to call it ShitStorm but it didn't fly with marketing".

Providing APIs to access and manipulate data owned by your users is a good thing. It extends the utility of the data outside that of the Web applications that may be the primary consumer of the data and it creates an ecosystem of applications that harness the data. This is beneficial to customers as can be seen by looking around today at the success of APIs such as the MetaWeblog API, Flickr API or del.icio.us API.

Five years ago, while interning at Microsoft, I saw a demo about Hailstorm in which a user visiting an online CD retailer was shown an ad for a concert they'd be interested in based on their music preferences in Hailstorm. The thinking here was that it would be win-win because (i) all the user's data is entered and stored in one place, which is convenient for the user, (ii) the CD retailer can access the user's preferences from Hailstorm and cut a deal with the concert ticket provider to show their ads based on user preferences, and (iii) the concert ticket provider gets their ads shown in a very relevant context.

The big problem with Hailstorm is that it assumed that potential Hailstorm partners such as retailers and other businesses would give up their customer data to Microsoft. As expected, most of them told Microsoft to take a long walk off a short pier.

Unfortunately, Microsoft didn't take the step of opening up these APIs to its online services such as Hotmail and MSN Messenger but instead quietly canned the project. Fast forward a few years and the company is now playing catch-up to ideas it helped foster. Amusingly, people like Mark Lucovsky and Vic Gundotra who were influential during the Hailstorm days at Microsoft are now at Google rebuilding the same thing.

I've taken a look at GData and have begun to question the wisdom of using Atom/RSS as the baseline for information interchange on the Web. Specifically, I have the same issues that Steven Ickman raised in a comment on DeWitt Clinton's blog where he wrote

From a search perspective I’d argue that the use of either format, RSS or Atom, is pretty much a hack. I think OpenSearch is awesome and I understand the motivators driving the format choices but it still feels like a hack to me.

Just like you I want to see rich structured results returned for queries but both formats basically limit you to results of a single type and contain a few known fields (i.e. link, title, subject, author, date, & enclosure) that are expected to be common across all items.

Where do we put the 100+ Outlook defined contact fields and how do we know that a result is a contact and not an appointment or auction? Vista has almost 1000 properties defined in its schema so how do we convey that much metadata in a loseless way? Embedded Microformats are a great sugestion for how to deal with richer content but it sort of feels like a hack on top of a hack to me? What’s the Microformat for an auction? Do I have to wait a year for some committee to arrive at joint aggreement on what attributes define an auction before I can return structured auction results?

When you have a hammer, everything looks like a nail. It seems Steven Ickman and I reviewed OpenSearch/GData/Atom with the same critical lens and came away with the same list of issues. The only thing I'd change in his criticism is the claim that both formats (RSS & Atom) limit you to results of a single type. That isn't the case. Nothing stops a feed from containing data of wildly varying types. For example, a typical MSN Spaces RSS feed contains items that represent blog posts, photo albums, music lists, and book lists which are all very different types.

The inability to represent hierarchical data in a natural manner is a big failing of both formats. I've seen the Atom Threading Extensions but that seems to be a very un-XML way for an XML format to represent hierarchy, especially given how complicated message threading algorithms can be for clients to implement.
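To make the hierarchy complaint concrete, here is a minimal Python sketch of what a client has to do with the Atom Threading Extensions. Replies are not nested inside the entry they answer. Each entry sits flat in the feed and points at its parent through a thr:in-reply-to element, so the client has to rebuild the tree itself. The sample feed, its ids and the function names are made up for illustration, and the sketch assumes every parent is present in the same feed.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

ATOM = "{http://www.w3.org/2005/Atom}"
THR = "{http://purl.org/syndication/thread/1.0}"

# Hypothetical feed: three flat entries, two of which are replies.
FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:thr="http://purl.org/syndication/thread/1.0">
  <entry><id>tag:example.org,2006:1</id><title>Original post</title></entry>
  <entry><id>tag:example.org,2006:2</id><title>First reply</title>
    <thr:in-reply-to ref="tag:example.org,2006:1"/></entry>
  <entry><id>tag:example.org,2006:3</id><title>Reply to the reply</title>
    <thr:in-reply-to ref="tag:example.org,2006:2"/></entry>
</feed>"""


def build_thread_tree(feed_xml):
    """Rebuild parent/child links from flat entries.

    Returns (titles, children, roots): titles maps entry id to title,
    children maps a parent id to the ids of its replies, and roots lists
    the ids of entries that are not replies to anything.
    """
    feed = ET.fromstring(feed_xml)
    titles = {}
    children = defaultdict(list)
    roots = []
    for entry in feed.findall(ATOM + "entry"):
        entry_id = entry.findtext(ATOM + "id")
        titles[entry_id] = entry.findtext(ATOM + "title")
        parent = entry.find(THR + "in-reply-to")
        if parent is None:
            roots.append(entry_id)
        else:
            children[parent.get("ref")].append(entry_id)
    return titles, children, roots


def render(titles, children, ids, depth=0):
    """Return the thread as indented lines, two spaces per reply level."""
    lines = []
    for entry_id in ids:
        lines.append("  " * depth + titles[entry_id])
        lines.extend(render(titles, children, children[entry_id], depth + 1))
    return lines


if __name__ == "__main__":
    titles, children, roots = build_thread_tree(FEED)
    print("\n".join(render(titles, children, roots)))
```

Even this toy version needs two passes and a lookup table. A real client also has to cope with replies whose parent isn't in the feed, entries arriving out of order, and replies that name more than one parent, which is where the threading algorithms get complicated.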

It'll be interesting to see how Google tackles these issues in GData.


 

July 14, 2006
@ 06:15 PM

Microsoft has stated that the recently announced interop between Yahoo! Messenger and Windows Live Messenger has created the world's largest IM network. Exactly how big is it compared to the others? Check out the table below which contains ComScore numbers from May 2006. The excerpt is from the Silicon Valley Sleuth blog post entitled Google Talk fails to find an audience

Google's instant messaging service ranks at the bottom of the overall ranking, which is dominated by MSN Messenger/Windows Live Messenger (204m subscribers), Yahoo! Messenger (78m), AIM (34m) and ICQ (33.9m).

ICQ actually grew by more than 10 per cent year-over-year, the data indicated. The network is owned by AOL and is considered the first mainstream instant messaging application.

Another interesting factoid from the data is that E-buddy (formerly known as E-messenger) rules the unified messenger category ahead of Trillian, claiming 3.9m vs. 1.3m unique visitors.

E-buddy offers on online unified messenger for MSN, AOL and Yahoo – no installation required. The great benefit is that it allows users on bolted down corporate networks to connect to instant messaging services without any intervention from the IT department.

[Image: table of ComScore IM market numbers, May 2006]

Interestingly enough, when I read geek blogs I tend to see people assume that Trillian, Meebo and AOL Instant Messenger are the dominant applications in their category. People often state anecdotally that "All my friends are using it so it must be #1". Given that IM buddy lists are really social networks, it's unsurprising when everyone you know uses the same IM application, in much the same way that it is unsurprising that everyone you know hangs out at the same bar or coffee shop. However, one doesn't extrapolate the popularity of a bar or coffee shop just because everyone you know likes it. The same applies to online hangouts whether they be instant messaging applications, social networking sites, or even photo sharing sites.


 

Categories: Social Software

The Google Adwords API team has a blog post entitled Version 3 Shutdown Today which states

Please take note… per our announcement on May 12, we will shutdown Version 3 of the API today.

Please make sure you have migrated your applications to Version 4 in order to ensure uninterrupted service. You can find more information about Version 4 (including the release notes) at http://www.google.com/apis/adwords/developer/index.html.

-- Rohit Dhawan, Product Manager

This is in compliance with the Adwords API versioning policy which states that once a new version of the WSDL for the Adwords API Web service is shipped, the old Web service end point stops being supported 2 months later. That's gangsta.

Thanks to Mark Baker for the link.


 

From the press release entitled Yahoo! and Microsoft Bridge Global Instant Messaging Communities we learn

SUNNYVALE, Calif., and REDMOND, Wash. — July 12, 2006 — Yahoo! Inc. (Nasdaq: “YHOO”) and Microsoft Corp. (Nasdaq: “MSFT”) today will begin limited public beta testing of interoperability between their instant messaging (IM) services that enable users of Windows Live® Messenger, the next generation of MSN® Messenger, and Yahoo!® Messenger with Voice to connect with each other. This interoperability — the first of its kind between two distinct, global consumer IM providers — will form the world’s largest consumer IM community, approaching 350 million accounts.1

Consumers worldwide from Microsoft and Yahoo! will be able to take advantage of IM interoperability and join the limited public beta program. They will be among the first to exchange instant messages across the free services as well as see their friends’ online presence, view personal status messages, share select emoticons, view offline messages and add new contacts from either service at no cost.2 Yahoo! and Microsoft plan to make the interoperability between their respective IM services broadly available to consumers in the coming months.

The Windows Live Messenger team also has a blog post about this on their team blog entitled Talk to your Yahoo! friends from Windows Live Messenger which points out that Windows Live Messenger users can sign up to participate in the beta at http://ideas.live.com. Once accepted in the beta, Windows Live Messenger users can add people on the Yahoo! IM network to their Windows Live Messenger buddy list simply by adding new contacts (i.e. add 'Yahoo ID' + @yahoo.com to your IM contact list). Windows Live Messenger users don't need a Yahoo! account to talk to users of Yahoo! Messenger and vice versa. That is how it should be.

Where it gets even cooler is how we handle Windows Live Messenger users that utilize an "@yahoo.com" email address as their Windows Live ID (e.g. yours truly). If you add such a user to your IM contact list, you get the following dialog

You then get two buddies added for that person: one buddy represents that contact on the Yahoo! IM network and the other is the same buddy on the Windows Live IM network. This is a lot different from what happens when Windows Live Messenger interops with a corporation that uses Microsoft Office Live Communications Server, because there people are forced to change their Windows Live ID to an @messengeruser.com address to resolve the ambiguity of using one email address on two IM networks. I much prefer the solution we use for Yahoo! IM interop.


 

Categories: Windows Live

Last month Clemens Vasters wrote a blog post entitled Autonomy isn't Autonomy - and a few words about Caching where he talks about "autonomous" services and data caching. He wrote

A question that is raised quite often in the context of "SOA" is that of how to deal with data.  Specifically, people are increasingly interested in (and concerned about) appropriate caching strategies
...
By autonomous computing principles the left shape of the service is "correct". The service is fully autonomous and protects its state. That’s a model that’s strictly following the Fiefdoms/Emissaries idea that Pat Helland formulated a few years back. Very many applications look like the shape on the right. There are a number of services sticking up that share a common backend store. That’s not following autonomous computing principles. However, if you look across the top, you'll see that the endpoints (different colors, different contracts) look precisely alike from the outside for both pillars. That’s the split: Autonomous computing talks very much about how things are supposed to look behind your service boundary (which is not and should not be anyone’s business but yours) and service orientation really talks about you being able to hide any kind of such architectural decision between a loosely coupled network edge. The two ideas compose well, but they are not the same, at all.

...
However, I digress. Coming back to the data management issue, it’s clear that a stringent autonomous computing design introduces quite a few challenges in terms of data management. Data consolidation across separate stores for the purposes of reporting requires quite a bit of special consideration and so does caching of data. When the data for a system is dispersed across a variety of stores and comes together only through service channels without the ability to freely query across the data stores and those services are potentially “far” away in terms of bandwidth and latency, data management becomes considerably more difficult than in a monolithic app with a single store. However, this added complexity is a function of choosing to make the service architecture follow autonomous computing principles, not one of how to shape the service edge and whether you use service orientation principles to implement it.
...
Generally, my advice with respect to data management in distributed systems is to handle all data explicitly as part of the application code and not hide data management in some obscure interception layer. There are a lot of approaches that attempt to hide complex caching scenarios away from application programmers by introducing caching magic on the call/message path. That is a reasonable thing to do, if the goal is to optimize message traffic and the granularity that that gives you is acceptable. I had a scenario where that was a just the right fit in one of my last newtelligence projects. Be that as it may, proper data management, caching included, is somewhat like the holy grail of distributed computing and unless people know what they’re doing, it’s dangerous to try to hide it away.

That said, I believe that it is worth a thought to make caching a first-class consideration in any distributed system where data flows across boundaries. If it’s known at the data source that a particular record or set of records won’t be updated until 1200h tomorrow (many banks, for instance, still do accounting batch runs just once or twice daily) then it is helpful to flow that information alongside the data to allow any receiver determine the caching strategy for the particular data item(s).

Service autonomy is one topic where I still have difficulty striking the right balance. In an ideal SOA world, you have a mesh of interconnected services which depend on each other to perform their set tasks. The problem with this SOA ideal is that it introduces dependencies. If you are building an online service, dependencies mean that sometimes you'll be woken up by your pager at 3 AM and it's somebody else's fault, not yours. This may encourage people who build services to shun dependencies and build self-contained web applications which reinvent the wheel instead of utilizing external services. I'm still trying to decide if this is a bad thing or not.

As for Clemens' comments on caching and services, I find it interesting how even WS-* gurus inadvertently end up articulating the virtues of HTTP's design and the REST architectural style when talking about best practices for building services. I wonder if we will one day see WS-* equivalents of ETags and If-Modified-Since. WS-Caching anyone? :)
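For anyone who hasn't looked at how HTTP handles this, below is a rough Python sketch of the server side of a conditional GET. It isn't tied to any real web framework. The function names and the choice of a SHA-1 hash of the body as the ETag are my own assumptions, and it ignores weak validators. The point is that the data source ships validators (ETag, Last-Modified) alongside the data, and a receiver that already has a fresh copy gets a 304 with no body instead of the full payload.

```python
import hashlib
from email.utils import formatdate, parsedate_to_datetime


def make_etag(body: bytes) -> str:
    """Derive an ETag from the response body (one common approach)."""
    return '"' + hashlib.sha1(body).hexdigest() + '"'


def conditional_get(request_headers, body, last_modified_ts):
    """Return (status, headers, body) for a GET, honoring cache validators.

    request_headers: dict of header name -> value sent by the client.
    last_modified_ts: POSIX timestamp of the resource's last change.
    """
    etag = make_etag(body)
    response_headers = {
        "ETag": etag,
        "Last-Modified": formatdate(last_modified_ts, usegmt=True),
    }

    if_none_match = request_headers.get("If-None-Match")
    if_modified_since = request_headers.get("If-Modified-Since")

    if if_none_match is not None:
        # When both validators are sent, If-None-Match wins.
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if etag in candidates or "*" in candidates:
            return 304, response_headers, b""
    elif if_modified_since is not None:
        since = parsedate_to_datetime(if_modified_since).timestamp()
        # HTTP dates have one-second resolution, so compare whole seconds.
        if int(last_modified_ts) <= since:
            return 304, response_headers, b""

    return 200, response_headers, body


if __name__ == "__main__":
    feed = b"<feed/>"
    status, headers, _ = conditional_get({}, feed, 1152944700)
    print(status, headers["ETag"])
    status, _, _ = conditional_get({"If-None-Match": headers["ETag"]}, feed, 1152944700)
    print(status)
```

The first request gets a 200 plus the validators. The second request echoes the ETag back and gets a 304, which is exactly the kind of "flow the caching information alongside the data" that Clemens is asking for.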


 

Categories: XML Web Services

I was chatting with Kurt Weber yesterday and asked when Windows Live Expo would be getting out of beta. He asked me to check out the team blog later in the day and when I did I saw his blog post entitled Official U.S. Launch of Windows Live Expo. It turns out that yesterday was launch day and below is an excerpt of his blog post describing some of the new features for the launch

 Some of the new features for our latest release include:
  • New Look - A brand new look & feel for the site which includes the official Windows Live look and integration, accessibility, scaling, and easier to use.
  • Comments on a listing – Similar to comments on a blog; this feature will allow users to discuss issues in the soapbox area or ask the seller for more details about an item.
  • APIs – Developers can now access all of our listings using a variety of parameters in order to create cool mash-ups (such as http://www.blockrocker.com). Full details about the API are available at http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnlive/html/winliveexpo.asp
  • Driving directions – Users can now easily get driving directions to whatever listing they are viewing (courtesy of our friends at Live Local) by simply clicking a button.

For those keeping score, Expo is the fifth Windows Live service to come out of beta.

Update: Thanks to Szajd for reminding me that there have been five Windows Live services to come out of beta; (Windows Live OneCare, Windows Live Favorites, Windows Live Messenger, Windows Live Custom Domains and Windows Live Expo).
 

Categories: Windows Live

Now that a bunch of Windows Live services are coming out of beta (e.g. Windows Live Messenger, Windows Live Favorites) and a couple more MSN properties are about to make the switch (e.g. MSN Spaces to Windows Live Spaces), there has been a bit more marketing effort around Windows Live. The marketing teams have created a number of websites that explain the value proposition of Windows Live and take you behind the scenes. Check them out

  1. discoverspaces.live.com: This website gives a preview of Windows Live Spaces including some new features such as the Friends list.

  2. inside.live.com: Interviews with members of Windows Live product teams like Leah Pearlman (Windows Live Messenger) and Reeves Little (Windows Live Mail).

  3. wire.live.com: An aggregation of news stories, blog posts and message board postings about Windows Live. Think of it as Microsoft Presspass on crack.

  4. experience.live.com: This site aggregates the above sites and has place holders for a couple of other upcoming promotional sites about Windows Live.

This is pretty hot. For once I have to say our marketing guys are kicking ass.
 

Categories: Windows Live

Tim O'Reilly has a blog post entitled Operations: The New Secret Sauce where he summarizes an interview he had with Debra Chrapaty, the VP of Operations for Windows Live. He writes

People talk about "cloud storage" but Debra points out that that means servers somewhere, hundreds of thousands of them, with good access to power, cooling, and bandwidth. She describes how her "strategic locations group" has a "heatmap" rating locations by their access to all these key limiting factors, and how they are locking up key locations and favorable power and bandwidth deals. And as in other areas of real estate, getting the good locations first can matter a lot. She points out, for example, that her cost of power at her Quincy, WA data center, soon to go online, is 1.9 cents per kwh, versus about 8 cents in CA. And she says, "I've learned that when you multiply a small number by a big number, the small number turns into a big number." Once Web 2.0 becomes the norm, the current demands are only a small foretaste of what's to come. For that matter, even server procurement is "not pretty" and there will be economies of scale that accrue to the big players. Her belief is that there's going to be a tipping point in Web 2.0 where the operational environment will be a key differentiator
...
Internet-scale applications are really the ones that push the envelope with regard not only to performance but also to deployment and management tools. And the Windows Live team works closely with the Windows Server group to take their bleeding edge learning back into the enterprise products. By contrast, one might ask, where is the similar feedback loop from sites like Google and Yahoo! back into Linux or FreeBSD?

This is one of those topics I've been wanting to blog about for a while. I think somewhere along the line at Windows Live we realized there was more bang for the buck in optimizing some of our operations characteristics such as power consumption per server, increasing the number of servers per data center, reducing cost per server, etc. than in whatever improvements we could make in code or via database optimizations. Additionally, it's also been quite eye opening how much stuff we had to roll on our own which isn't a standard part of a "platform". I remember talking to a coworker about all the changes we were making so that MSN Spaces could be deployed in multiple data centers and he asked why we didn't get this for free from "the platform". I jokingly responded "It isn't like the .NET Framework has a RouteThisUserToTheRightDataCenterBasedOnTheirGeographicalLocation() API does it?".
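To give a flavor of what that joke API would have to do, here is a toy Python sketch. It is not how MSN Spaces routes anything. The data center list and its coordinates are made up (apart from Quincy, WA, which Tim's post mentions, with approximate coordinates), and picking the nearest site by great-circle distance is just the simplest policy I could think of. A real system would also have to weigh capacity, where the user's data already lives, and failover.

```python
from math import asin, cos, radians, sin, sqrt

# Hypothetical data center sites as (latitude, longitude) in degrees.
DATA_CENTERS = {
    "quincy-wa": (47.23, -119.85),
    "dublin-ie": (53.35, -6.26),
    "singapore": (1.35, 103.82),
}

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def route_user_to_data_center(user_location, data_centers=DATA_CENTERS):
    """Return the name of the data center closest to the user's location."""
    return min(
        data_centers,
        key=lambda name: haversine_km(user_location, data_centers[name]),
    )


if __name__ == "__main__":
    seattle = (47.61, -122.33)
    print(route_user_to_data_center(seattle))
```

The distance math is the easy part. Everything around it (knowing where the user is, keeping their data in the right place, moving it when they move) is the stuff you don't get for free from the platform.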

I now also give mad props to some of our competitors for what used to seem like quirkiness that now is clearly a great deal of operational savviness. There is a reason why Google builds their own servers. When I read things like "One-third of the electricity running through a typical power supply leaks out as heat" I get quite upset and now see it as totally reasonable to build your own power supplies to get around such waste. Unfortunately, there doesn't seem to be a lot of knowledge out there about building and managing a large-scale, globally distributed server infrastructure. However, as Debra states, we are feeding a lot of our learnings back to the folks building enterprise products at Microsoft (e.g. our team now collaborates a lot with the Windows Communication Foundation team), which is great for developers building on Microsoft platforms.
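Debra's line about multiplying a small number by a big number is easy to check with back-of-the-envelope arithmetic. The per-kWh prices below are the ones quoted in Tim's post. The server count and the 250 watts per server are purely my assumptions for illustration, not Windows Live numbers, and the sketch ignores cooling and other overhead.

```python
HOURS_PER_YEAR = 24 * 365


def annual_power_cost(servers, watts_per_server, dollars_per_kwh):
    """Yearly electricity bill in dollars for servers running around the clock."""
    kwh_per_year = servers * (watts_per_server / 1000) * HOURS_PER_YEAR
    return kwh_per_year * dollars_per_kwh


SERVERS = 100_000          # assumption, for illustration only
WATTS_PER_SERVER = 250     # assumption, for illustration only

quincy = annual_power_cost(SERVERS, WATTS_PER_SERVER, 0.019)     # 1.9 cents/kWh
california = annual_power_cost(SERVERS, WATTS_PER_SERVER, 0.08)  # about 8 cents/kWh

if __name__ == "__main__":
    print(f"Quincy, WA: ${quincy:,.0f} per year")
    print(f"California: ${california:,.0f} per year")
    print(f"Difference: ${california - quincy:,.0f} per year")
```

Under those assumptions the bill is roughly $4.2 million a year in Quincy versus $17.5 million in California, a gap of over $13 million a year from a difference of about six cents per kWh.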


 

About two weeks ago, Greg Reinacker wrote about NewsGator's past, present and future in two blog posts entitled NewsGator platform roadmap - Part I (a look back) and NewsGator platform roadmap - Part II (a look forward). The blog posts are a good look at the achievements of a company that has gone from a one-man shop building an RSS reading plugin for Outlook into being the dominant syndication platform company on almost any platform from Windows & Mac to the Web & mobile phones. If you are interested in XML syndication, then Greg's posts are bookmark-worthy since they describe the future plans of a company that probably employs the best minds building RSS/Atom applications today. Below are some excerpts from his posts in my areas of interest

NewsGator Online

As I said 16 months ago, the proposed feature list is long and distinguished - and it still is.  There is so much to do here...some of the short-term planned additions range from more interactive feed discovery mechanisms (based on the larger community of users and their subscriptions), to completely different user interface paradigms (where a user could potentially select from different options, each catering to a different kind of user).

A larger initiative is around the whole paradigm. Techies aside, users don't want to think about feeds, and subscriptions, and searching for content...Given all that, we're really rethinking the way we present information to the user, and the way users discover new information.  We're designing ways for people to participate in a larger community if they wish, and get more value out of the content they consume, at the point they discover it.  While we all have our own set of feeds, and we all participate to some extent in the larger ecosystem, there is a lot of potential in linking people with similar interests to each other.  Some users will continue to use our system as they always have - and others will use it in completely different ways.  We're testing a couple of approaches on this right now - I think it's truly a game-changer.

NewsGator Inbox, FeedDemon, NetNewsWire

As I mentioned before, the enthusiasm around these products has continued to grow - people obviously see the value in a rich, synchronized, offline-capable user experience for consuming content.  Moving forward, online integration will get tighter, and more complete - ranging from the low hanging fruit like FeedDemon "News Bins" becoming Clippings (and thus synchronize with the entire platform), to more involved features like analytics-related features (recommendations, interest-based surfacing, etc.) and community-related features.
...
NewsGator core platform

This is the heart of our entire product line (with the exception of NewsGator Enterprise Server).  Moving forward, we're investing a lot in the platform.  We're building out more support for deep analytics (which we can use to deliver different kinds of user experience), and building out a much deeper metadata engine (which means if a client retrieves content from our system, they'll get much richer data than they otherwise would).  We'll have other ways to "slice" our data to get what you need, without having to subscribe to hundreds of feeds.

The API has been very successful, and we process millions of API calls per day from client applications, web services, and private label clients.  This traffic actually makes up a large percentage of our overall system traffic - which I think is a testament to the popularity and utility of the API.  Moving forward here, we're obviously very committed to the API story, and we'll continue to enhance it as we add platform capabilities.

There's lots of good stuff here. The first thing that pops out at me is that while a bunch of startups these days tend to proclaim the death of desktop software, NewsGator is actually seeing the best of both worlds and improving the quality of the desktop experience by harnessing a Web-based platform. It's not Web-based software replacing desktop software, it's desktop software becoming better by working in tandem with APIs and applications on the Web. When Ray Ozzie talks about "live software", NewsGator is the company that leaps most readily to my mind.

I like the idea of making discovery of new content more of a social experience. It'd be interesting to see what would happen if NewsGator Online had a del.icio.us-inspired interface for browsing and subscribing to people's feeds. I notice that Gordon Weakliem who works on the NewsGator API recently wrote a post entitled Needles in Haystacks where he talks about serendipitous discovery of new websites by browsing bookmarks of people with similar interests to him in del.icio.us. I'm sure it's just a matter of time before NewsGator adds these features to their platform.

I also like the idea of exposing richer metadata in the NewsGator API especially if it relates to the social features that they plan to unveil in the next couple of months. Unfortunately, I've never been able to get the NewsGator API to work quite right with RSS Bandit but I'll be revisiting that code later in the summer.


 

Since my girlfriend has kids, I spend a lot more time around kids than I expected to at this age. One of the things I've realized is that I'll probably end up as one of those dads that shows strangers his baby pictures. Since I don't have baby pictures to show y'all, you get the next best thing

  1. Scene: On Our Way To Dinner

    Kids: What Does Your Shirt Say?

    Me: I Only Date Crack Whores [see the T-shirt here]

    Kids: Mommy Isn't A Crack Whore.

    Me: I'll Go Change My Shirt

    This explains why my girlfriend made me throw out my "I don't have a girlfriend. But I do know a woman who'd be mad at me for saying that" T-shirt. I'm guessing she forgot about this one.

  2. Scene: Playing Video Games with one of Their Friends

    Me: I'm too old to play games with you guys

    Kids: You're not old, you're only 28.

    Kids' Friend: You're 28? My mom is 28 and she likes black guys. You should marry my mom.

    Me to girlfriend: Should I tell her mom she said that?

    My Girlfriend: No. Dummy!


 

Categories: Personal