December 14, 2006
@ 05:09 PM

For the past few months I've noticed problems in RSS Bandit with viewing feeds of sites hosted on TypePad. Every other post in a feed would display raw markup instead of correctly rendered HTML. I decided to look into the problem this morning and tracked down the cause. Take a look at http://blog.flickr.com/flickrblog/atom.xml. Here are the relevant excerpts from the feed


<content type="html" xml:lang="en-ca" xml:base="http://blog.flickr.com/flickrblog/">
&lt;div xmlns=&quot;http://www.w3.org/1999/xhtml&quot;&gt;&lt;p&gt;&&nbsp;&lt;/p&gt;
&lt;div style=&quot;text-align: center;&quot;&gt;&lt;a href=&quot;http://www.flickr.com/gift/&quot;&gt;&lt;img border=&quot;0&quot; src=&quot;http://us.i1.yimg.com/us.yimg.com/i/ww/news/2006/12/12/gtfof.gif&quot; style=&quot;padding-bottom: 6px;&quot; /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;p&gt;It&#39;s now easier than ever to spread joy this holiday season by giving the &lt;a href=&quot;http://www.flickr.com/gift/&quot;&gt;&lt;strong&gt;Gift of Flickr&lt;/strong&gt;&lt;/a&gt;. You can purchase a special activation code that you can give to anyone, whether or not they have an existing Flickr account. We&#39;ve even created a special Gift Certificate card that you can print out yourself, fold up and stuff in a stocking, under a tree or hidden away for after the candles are lit (of course, you can also send the gift code in an email).&lt;/p&gt;

&lt;p&gt;And it&#39;s even better to give the gift of Flickr since now your recipients will get &lt;a href=&quot;http://www.flickr.com/help/limits/#28&quot;&gt;&lt;strong&gt;unlimited uploads&lt;/strong&gt;&lt;/a&gt; — the two gigabyte monthly limit is no more (&lt;em&gt;yep, pro users have no limits on how many photos they can upload&lt;/em&gt;)! At the same time, we&#39;ve upped the limit for free account members as well, from &lt;a href=&quot;http://www.flickr.com/help/limits/#28&quot;&gt;&lt;strong&gt;20MB per month up to 100MB&lt;/strong&gt;&lt;/a&gt; (yep, five times more)!&lt;/p&gt;

&lt;p&gt;The Flickr team also wants to take this opportunity to thank you for a wonderful year and wish you and yours all the best of the season. Yay!&lt;/p&gt;&lt;/div&gt;
</content>
...
<content type="xhtml" xml:lang="en-ca" xml:base="http://blog.flickr.com/flickrblog/">
<div xmlns="http://www.w3.org/1999/xhtml"><p><a href="http://www.flickr.com/photos/eye_spied/313572883/" title="Photo Sharing"><img width="500" height="357" border="0" src="http://static.flickr.com/117/313572883_8af0cddbc7.jpg" alt="Dec 2 2006 208 copy" /></a></p>

<p><a title="Photo Sharing" href="http://www.flickr.com/photos/mrtwism/71294604/"><img width="500" height="375" border="0" alt="riding" src="http://static.flickr.com/34/71294604_b887c01815.jpg" /></a></p>

<p>See more photos in the <a href="http://www.flickr.com/photos/tags/biggame/clusters/cal-berkeley-stanford/">"Berkeley," "Stanford," "big game" cluster</a>.</p>

<p>Photos from <a href="http://www.flickr.com/photos/eye_spied/" title="Link to caryniam's photos">caryniam</a> and <a title="Link to mrtwism's photos" href="http://www.flickr.com/photos/mrtwism/">mrtwism</a>.</p></div>
</content>

So the first mystery is solved. The reason some posts look OK and some don't is that for some reason TypePad seems to alternate between escaped HTML and well-formed XHTML as the content of an entry in the feed. When the feed uses well-formed XHTML the item looks fine but when it uses escaped HTML it looks like crap. The next question is why the items aren't rendered correctly when escaped HTML is used.

So I referred to section 3.1 of the Atom 0.3 specification and saw the following

3.1.2  "mode" Attribute

Content constructs MAY have a "mode" attribute, whose value indicates the method used to encode the content. When present, this attribute's value MUST be listed below. If not present, its value MUST be considered to be "xml".

"xml":
A mode attribute with the value "xml" indicates that the element's content is inline xml (for example, namespace-qualified XHTML).
"escaped":
A mode attribute with the value "escaped" indicates that the element's content is an escaped string. Processors MUST unescape the element's content before considering it as content of the indicated media type.
"base64":
A mode attribute with the value "base64" indicates that the element's content is base64-encoded [RFC2045]. Processors MUST decode the element's content before considering it as content of the indicated media type.

To prevent aggregators from having to use their psychic powers to determine when an item contains plain text or escaped HTML, the Atom folks introduced a mode attribute that indicates whether the content should be treated as-is or should be unescaped. As you can see, the default value for this attribute is "xml", not "escaped". Since the TypePad Atom feeds do not state that their HTML content is escaped, an aggregator is not expected to unescape the content before rendering it. Second mystery solved. Buggy feeds are the culprit.

Even though these feeds are broken, it is probably faster for me to special case feeds from TypePad than to track down and convince the folks at Six Apart that this is a bug worth fixing. This issue will be fixed in the next beta of the Jubilee release of RSS Bandit.
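The workaround will likely be a heuristic along these lines; a minimal sketch (not the actual RSS Bandit fix), assuming the string value of the <content> element has already been extracted:

using System;
using System.Web; // for HttpUtility.HtmlDecode

static class TypePadWorkaround
{
    // Heuristic: if content that should be inline XML contains no raw markup,
    // only escaped entities like &lt;, it is almost certainly escaped HTML
    // that the feed forgot to label, so unescape it before rendering.
    public static string FixupContent(string content)
    {
        bool looksEscaped = (content.IndexOf('<') == -1) &&
                            (content.IndexOf("&lt;") != -1);
        return looksEscaped ? HttpUtility.HtmlDecode(content) : content;
    }
}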


 

December 13, 2006
@ 03:05 AM

Six Months Ago: 10 people who don't matter

Mark Zuckerberg
Founder, Facebook
In entrepreneurship, timing is everything. So we'll give Zuckerberg credit for launching his online social directory for college students just as the social-networking craze was getting underway. He also built it right, quickly making Facebook one of the most popular social-networking sites on the Net. But there's also something to be said for knowing when to take the money and run. Last spring, Facebook reportedly turned down a $750 million buyout offer, holding out instead for as much as $2 billion. Bad move. After selling itself to Rupert Murdoch's Fox for $580 million last year, MySpace is now the Web's second most popular website. Facebook is growing too - but given that MySpace has quickly grown into the industry's 80-million-user gorilla, it's hard to imagine who would pay billions for an also-ran.

Today: Yahoo’s “Project Fraternity” Docs Leaked

At Yahoo, the long running courtship has lasted at least as long as this year, and is internally referred to as “Project Fraternity.” Leaked documents in our possession state that an early offer was $37.5 million for 5% of the company (a $750 million valuation) back in Q1 2006. This was rejected by Facebook.

Things really heated up mid year. Yahoo proposed a $1 billion flat out acquisition price based on a model they created where they projected $608 million in Facebook revenue by 2009, growing to $969 million in 2010. By 2015 Yahoo projects that Facebook would generate nearly $1 billion in annual profit. The actual 2006 number appears to be around $50 million in revenue, or nearly $1 million per week.

These revenue projections are based on robust user growth. By 2010, Yahoo assumes Facebook would hit 48 million users, out of a total combined highschool and young adult population of 83 million.

Our sources say that Facebook flatly rejected the $1 billion offer, looking for far more. Yahoo was prepared to pay up to $1.62 billion, but negotiations broke off before the offer could be made.


 

Nick Bradbury, the author of the excellent FeedDemon RSS reader, has a blog post entitled Simplicity Ain't So Simple, Part II: Stop Showing Off where he writes

One mistake I see developers make over and over again is that we make a feature look complicated just because it was hard to create.
...
For example, the prefetching feature I blogged about last week hasn't been easy to create.  This feature prefetches (downloads) links and images in your feeds so that they're browse-able inside FeedDemon when you're working offline.  It works in the background so you can keep using FeedDemon while it does its business, and it's smart enough to skip web bugs, links to large downloads, and other items that shouldn't be cached (including items that have already been cached in a previous session).

It didn't seem like a complex feature when I started on it, but it ended up being a lot more work than I anticipated.  It could easily be an application all by itself, complete with all sorts of configurable options.

But instead of turning this feature into a mini-application, I demoted it to a lowly menu item

I've had that feeling recently when thinking about a feature I'm currently working on as part of podcasting support in RSS Bandit. The feature is quite straightforward. It is the ability for users to specify a maximum amount of space dedicated to podcasts on their computer to prevent their hard drive from filling up with dozens of gigabytes of ScobleShow and Channel 9 videos. Below is a screenshot of what the option looks like.

As I started to implement this feature, every question I asked myself led to two or three more questions and the complexity just spiralled. I started with the assumption that we'd enforce the download limit before files were downloaded. So if you have allocated 500MB as the maximum amount of space dedicated to podcasts and you attempt to download funny_video.mov (200MB), funny_song.mp3 (5MB) and scary_short_movie.mpg (300MB) in that order, then we will issue a warning or an error indicating that there won't be enough room to download the last file before attempting to download it. Here's where I got my first rude awakening: there's no guaranteed way to determine the size of a file before downloading it. There is a length attribute on the <enclosure> element but it doesn't always have a valid value in podcast feeds. Being a Web geek, I thought to myself "Ha, I can always fall back on making an HTTP HEAD request and then reading the Content-Length header". It turns out this isn't always guaranteed to be set either.
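The fallback chain ends up looking something like this; a rough sketch (names are illustrative, not the actual RSS Bandit code), assuming the length attribute has already been parsed from the feed:

using System;
using System.Net;

static class EnclosureSizeProbe
{
    // Returns the enclosure size in bytes, or -1 if it cannot be determined.
    public static long GetEnclosureSize(string url, long lengthFromFeed)
    {
        // First choice: the length attribute from the <enclosure> element,
        // if the feed bothered to supply a sane value.
        if (lengthFromFeed > 0)
            return lengthFromFeed;

        // Fallback: issue an HTTP HEAD request and read Content-Length.
        try
        {
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
            request.Method = "HEAD";
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            {
                // ContentLength is -1 when the server omits the header,
                // which happens more often than you'd hope.
                return response.ContentLength;
            }
        }
        catch (WebException)
        {
            return -1; // server rejected HEAD or is unreachable
        }
    }
}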

So now we have the possibility that the user could initiate three downloads which would exceed the 500MB she has allocated to enclosures. The next question was when to enforce the limit on the files being downloaded. Should we wait until the files have finished downloading and then fail when we attempt to move the downloaded file from the temporary folder to the user-specified podcast folder? Or should we stop downloads as soon as we hit 500MB regardless of the state of the downloaded files, which means we'll have to regularly collate the size of all pending downloads and add that to the size of all downloads in the podcast folder to ensure that we aren't over the limit? I was leaning towards the former but when I talked to Torsten he pointed out that it seems like cheating if I limit the amount of space allocated to podcasts to 500MB but they could actually be taking up over 1GB on disk because I have four 300MB files being downloaded simultaneously. Unfortunately for me, I agreed. :)
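Under the stricter interpretation Torsten argued for, the space check has to account for bytes already on disk plus bytes in flight. A minimal sketch, with illustrative names rather than the actual RSS Bandit code:

using System.IO;

class PodcastQuota
{
    readonly long maxBytes;          // e.g. 500 * 1024 * 1024
    readonly string podcastFolder;   // where completed downloads live

    public PodcastQuota(long maxBytes, string podcastFolder)
    {
        this.maxBytes = maxBytes;
        this.podcastFolder = podcastFolder;
    }

    // True if we are still within the quota. pendingBytesOnDisk is the
    // collated size of all partially downloaded files in the temp folder.
    public bool IsWithinLimit(long pendingBytesOnDisk)
    {
        long completed = 0;
        foreach (string file in Directory.GetFiles(podcastFolder))
            completed += new FileInfo(file).Length;

        return completed + pendingBytesOnDisk <= maxBytes;
    }
}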

Then there's the question of what to actually do when the limit is hit. Do we prompt the user to delete old files, and if so, what interface do we provide to make the flow sensible and not irritating? Especially since some of the files will be podcasts in the podcast folder while others will be incomplete, pending downloads in a temp folder. Yeah, and it goes on and on.

However, all our users will see is that one checkbox and a field to enter a numeric value.


 

Categories: RSS Bandit

December 12, 2006
@ 02:29 AM

I've had a number of people mention the article about Steve Berkowitz and MSN/Windows Live in the New York Times entitled Looking for a Gambit to Win at Google's Game, which contains a bunch of choice negative quotes about our products, supposedly from Steve Berkowitz. The article starts off without pulling punches, as you can see from the following excerpt

The pressure is on for Mr. Berkowitz to gain control of Microsoft’s online unit, which by most measures has drifted dangerously off course. Over the last year, its online properties have lost users in the United States. The billions of dollars the company has spent building its own search engine have yet to pay off. And amid a booming Internet market, Microsoft’s online unit is losing money.

Google, meanwhile, is growing, prospering, and moving increasingly onto Microsoft’s turf.

Microsoft lost its way, Mr. Berkowitz says, because it became too enamored with software wizardry, like its new three-dimensional map service, and failed to make a search engine people liked to use.

“A lot of decisions were driven by technology; they were not driven by the consumer,” he said. “It isn’t always the best technology that wins. It is the best experience.”
...
Mr. Berkowitz does not defend the brand choice he inherited.

“I don’t know if Live is the right name,” he said, saying he had not decided what to do about it. But before he gets around to deciding whether to change the brand, he wants to make Microsoft’s search engine itself more appealing to consumers.

What he did decide was to keep the MSN name afloat, too, as it is well known and its various services have 430 million users around the world. He promoted Joanne K. Bradford, Microsoft’s head of advertising sales, to oversee and revive the MSN portal.

Definitely some harsh words attributed to our corporate VP, which have led some Windows Live watchers to wonder whether the brand is going to be tossed. I'm going to ignore the obvious flame bait of an article claiming that one of our corporate vice presidents criticized what is probably the only best of breed online service we provide (i.e. http://maps.live.com) and instead focus on an implicit yet incorrect assumption carried throughout the article. The assumption is that Steve Berkowitz runs Windows Live.

I've commented on our org chart before but here is a refresher course for the reporters and bloggers out there who feel compelled to write about Windows Live and MSN. If you go back to the press release after our last major reorg, Microsoft Realigns Platforms & Services Division for Greater Growth and Agility, you'll notice that it breaks out Microsoft's Internet business into the following three pieces

Windows and Windows Live Group
With Sinofsky in charge, the Windows and Windows Live Group will have engineering teams focused on delivering Windows and engineering teams focused on delivering the Windows Live experiences. Sinofsky will work closely with Microsoft CTO Ray Ozzie and Blake Irving to support Microsoft’s services strategy across the division and company.
Windows Live Platform Group
Blake Irving will lead the newly formed Windows Live Platform Group, which unites a number of MSN teams that have been building platform services and capabilities for Microsoft’s online offerings. This group provides the back-end infrastructure services, platform capabilities and global operational support for services being created in Windows Live, Office Live, and other Microsoft and third-party applications that use the Live platform. This includes the advertising and monetization platforms that support all Live service offerings.
Online Business Group
The new Online Business Group includes advertising sales, business development and marketing for Live Platforms, Windows Live and MSN — including MSN.com, MSNTV and MSN Internet Access. David Cole, senior vice president, will lead this group until his successor is named before his leave of absence at the end of April. [Dare - Steve Berkowitz is the replacement]

As you can see from the above press release, Steve Berkowitz owns the sales, marketing and business aspects of Windows Live but not the products themselves. Steven Sinofsky and his subordinates, specifically Chris Jones and Christopher Payne, are responsible for Windows Live. Although Steve Berkowitz is probably the right guy to talk to about the marketing and branding of Windows Live, he probably isn't the right person to talk to about the future of Windows Live products like search (holla at Christopher Payne) or email/IM/blogging (talk to Chris Jones).

I find it interesting to see articles like NY Times: Will Berkowitz keep Windows Live? because although things are confusing now with two poorly differentiated and overlapping brands, it would send the wrong signal to the market, our competitors and our customers if we decided to go back to the MSN brand for all our online services. What do you think?


 

Categories: MSN | Windows Live

Keith Teare of Edgeio has a blog post entitled De-portalization and Internet revenues where he writes

7. Publisher driven revenue models will increasingly replace middlemen. There will be no successful advertiser driven models in the foothills, only publisher centric models. Successful platform vendors will put the publisher at the center of the world in a sellers market for eyeballs. There will be more publishers able to make $180,000 a month.
8. Portals will need to evolve into platform companies in order to participate in a huge growth of Internet revenues. Service to publishers will be a huge part of this. Otherwise they will end up like Infospace, or maybe Infoseek. Relics of the past.
9. Search however will become more important as content becomes more distributed. Yet it will command less and less a proportion of the growing Internet traffic.
10. Smart companies will (a) help content find traffic by enabling its distribution. (b) help users find content that is widely dispersed by providing great search. (c) help the publishers in the rising foothills maximize the value of their publications.

I find Keith's post interesting especially when juxtaposed against Fred Wilson's take on how the big Web companies like Yahoo! can relate to this trend in his blog post The De-Portalization of the Internet (aka What I Would Do If I Were Running Yahoo!) where he writes

Today, we shop directly with the Internet merchants we like or we use a shopping search engine to find what we want. We can look for jobs on Indeed, meet people on MySpace or Facebook, find roommates on Craigslist, and use Meebo for instant messaging. It's rarely true that the best of breed service exists on a "portal". The portals continue to buy best of breed services like Flickr, but now they let the service continue to exist on the web with its own look and feel and URL structure.
...
So if you buy that the web has been de-portalized, what do you do if you run the largest portal in the world? I think its pretty simple actually. Yahoo! needs to offer its users and customers (advertisers) the ability to get the same experience they get on Yahoo! all over the web. They need to stop thinking about keeping their audience on Yahoo.com and start thinking about serving their audience wherever they are on the web. They need to stop thinking about selling ads on Yahoo.com and start thinking about selling ads all over the web.
...
So what are some concrete things they need to do? Well first, they need to improve their search service. On a de-portalized web, it all starts with search. I never hear of companies that have 80 percent of their traffic coming from Yahoo! I hear of companies all the time that have 80 percent of their traffic coming from Google. Yahoo! may have 28% of all Internet searches, but for some reason that I am not sure I completely understand, Yahoo! does not generate 28% of Internet traffic.
...
And Yahoo! needs to get its YPN (Yahoo! Publisher Network) service in gear. They need to offer advertisers the ability to reach people when they are not on Yahoo! They've done some things recently, like the eBay partnership, that suggest they are headed in that direction. But I would urge them to move faster in this direction than they are moving now. It might mean buying some ad networks instead of just investing in them.

This is probably the best advice I've seen on this topic and I'm sure a lot of folks over here at MSN Windows Live would nod their heads in agreement as they read it. The one thing missing from Fred's advice is how exactly Yahoo! should "offer its users and customers (advertisers) the ability to get the same experience they get on Yahoo! all over the web". I'm not sure Fred realizes it but Yahoo! is already halfway there if you look at a number of their initiatives. For one, there are the numerous APIs for Yahoo! services which enable websites and Yahoo! users to incorporate Yahoo! content and services wherever they want on the Web. More importantly, there is now Yahoo! Browser Based Authentication (BBAuth), which is a low cost way for any site on the Web to appear to end users as a member of the Y! network of services since it accepts Yahoo! credentials. Yahoo! is making a lot of the right moves; their big problem now seems to be whether they can evangelize and market these initiatives to their customers and other websites in a way that increases adoption. Ideally, they need to show websites how to make that $$$ by partnering with Yahoo!. Google has the advantage in that they have led with providing $$$ to websites outside their network and now have a lead that is difficult to beat when it comes to "giving users the Google experience wherever they are on the Web". One could argue that Google Custom Search Engine is a good example of Google embracing the de-portalization trend in the only Google service that end users actually care about.

When it comes to the importance of search, one thing to note is how delicate a position the major commercial sites such as Amazon and eBay are in. The pattern with the major portals and search engines is that they look for what customers are searching for a lot and then provide that as a service. Witness Google's integration of Google Video into the main search page when they realized how much traffic they were sending to YouTube. However, the YouTube brand was too strong to be defeated by such tactics and eventually Google purchased the site instead of competing with it. Thus far, Google has embraced de-portalization by providing ads for commercial sites like Amazon, but what happens when they realize that they send a ton of traffic to Amazon and could be getting a cut of the referral fees? I'd keep an eye on Google Checkout if I worked at Amazon or eBay. I suspect that it is just a matter of time before paying the Google tax becomes part of the cost of doing business on the Web, in the same way giving Google a cut of your advertising revenues (i.e. being a Google AdSense customer) is almost a given when venturing into the content business on the Web today.

Embracing de-portalization means becoming the ultimate middle man. I remember when erecting Internet Toll Booths was a bad thing. ;) 


 

December 11, 2006
@ 02:03 PM

Edd Dumbill has a blog post entitled Afraid of the POX? where he writes

The other day I was tinkering with that cute little poster child of Web 2.0, Flickr. Looking for a lightweight way to incorporate some photos into a web site, I headed to their feeds page to find some XML to use.
...
The result was interesting. Flickr have a variety of outputs in RSS dialects, but you just can't get at the raw data using XML. The bookmarking service del.icio.us is another case in point. My friend Matt Biddulph recently had to resort to screenscraping in order to write his tag stemmer, until some kind soul pointed out there's a JSON feed.

Both of these services support XML output, but only with the semantics crammed awkwardly into RSS or Atom. Neither have plain XML, but do support serialization via other formats. We don't really have "XML on the Web". We have RSS on the web, plus a bunch of mostly JSON and YAML for those who didn't care for pointy brackets.

Interesting set of conclusions, but unfortunately based on faulty data. Flickr provides custom XML output from its Plain Old XML over HTTP API at http://www.flickr.com/services/api, as does del.icio.us from its API at http://del.icio.us/help/api. If anything, this seems to indicate that old school XML heads like Edd have a different vocabulary from the Web developer crowd. It seems Edd searched for "XML feeds" on these sites and came away irritated that the data was in RSS/Atom rather than custom XML formats. However, once you search for "API" along with the appropriate service name, you find their POX/HTTP APIs which provide custom XML output.
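For instance, here's a hedged sketch of pulling plain XML out of Flickr's REST API using the flickr.photos.search method (YOUR_API_KEY is a placeholder for a real key obtained from the page above):

using System;
using System.Net;
using System.Xml;

class FlickrPoxExample
{
    static void Main()
    {
        // Flickr's POX-over-HTTP endpoint; no RSS/Atom envelope in sight.
        string url = "http://www.flickr.com/services/rest/" +
                     "?method=flickr.photos.search&api_key=YOUR_API_KEY&tags=holiday";

        using (WebClient client = new WebClient())
        {
            XmlDocument doc = new XmlDocument();
            doc.LoadXml(client.DownloadString(url));

            // The response is custom XML: <rsp stat="ok"><photos>...</photos></rsp>
            foreach (XmlNode photo in doc.SelectNodes("/rsp/photos/photo"))
                Console.WriteLine(photo.Attributes["title"].Value);
        }
    }
}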

The moral of this story is that "XML feeds" pretty much means RSS/Atom feeds these days; it is not a generic term for any XML provided by a website.

PS: This should really be a comment on Edd's blog but it doesn't look like his blog supports comments.
 


Categories: XML

One of the interesting things about Microsoft is that the company is so big that it is quite possible to be working on ideas similar to another group's without significantly exchanging information or cross-pollinating ideas. Earlier this week, I was at a cross-divisional information sharing event where I got to see where a lot of products are headed with integrating ideas from social software trends on the Web.

One of the presentations I was most impressed with was the one for the Knowledge Network for Microsoft Office SharePoint Server 2007. This is a product that integrates with SharePoint and enables people at a company to

  • Discover who knows what and who knows whom within an organization. Quickly and easily locate people by subject expertise or social relationships with key contacts or companies.
  • Simplify creating automated user profiles for each member of the network. Knowledge Network automates the discovery and sharing of undocumented knowledge and relationships for each member in the network. The user-customizable automated profile is secure and requires member approval before it is shared.
  • Effectively search and pinpoint individuals. Knowledge Network provides the ability to connect with internal and external contacts, and calculates the shortest social distance between any two people in the network.

The problem of discovering people with subject matter expertise is a big one at a company like Microsoft with over 70,000 employees. How do you track down the best person to send feedback about Windows Live Spaces or ask a question about some of the idiosyncrasies of C#? Knowledge Network attempts to address this in two ways. Recently I was on a mail thread where some folks suggested building a database of employees annotated with tags that identified certain attributes or skills of these employees, such as the products they worked on, the technologies they were experts at and so on. People quickly pointed out that asking people to create a profile of themselves on an internal site and then tag themselves is a hassle that few would undertake. What many people on the mail thread [including myself] didn't realize is that Knowledge Network is targeted at exactly this scenario. To get over the bootstrapping problem, the Knowledge Network client application indexes your email inbox and extracts two sets of information from it: (a) a graph of your professional relationships based on who you exchange mail with regularly and (b) a set of keywords that describes the subject matter you regularly communicate about. This information can then be uploaded to your company intranet's "People Search" feature, where people can search for you by keyword and then, once they find you, ask "Show Me How I Am Connected to this Person", which uses information gleaned from the org chart and email threads to figure out how your social networks overlap. This is seriously cool stuff.
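To make the bootstrapping idea concrete, here's a rough sketch of what that kind of inbox mining could look like. This is my guess at the general shape of such an indexer, not the actual Knowledge Network implementation, and all the names are illustrative:

using System;
using System.Collections.Generic;

// A mail message reduced to the metadata the indexer cares about.
class MailMessage
{
    public string Sender;
    public string[] Recipients;
    public string Subject;
}

class InboxMiner
{
    // (a) Relationship graph: how often you exchange mail with each person.
    public Dictionary<string, int> ContactWeights = new Dictionary<string, int>();

    // (b) Keyword profile: how often each subject term shows up.
    public Dictionary<string, int> KeywordCounts = new Dictionary<string, int>();

    public void Index(IEnumerable<MailMessage> inbox)
    {
        foreach (MailMessage msg in inbox)
        {
            Increment(ContactWeights, msg.Sender);
            foreach (string recipient in msg.Recipients)
                Increment(ContactWeights, recipient);

            foreach (string word in msg.Subject.ToLower().Split(' '))
                if (word.Length > 3) // crude stand-in for real stopword filtering
                    Increment(KeywordCounts, word);
        }
    }

    static void Increment(Dictionary<string, int> table, string key)
    {
        int count;
        table.TryGetValue(key, out count);
        table[key] = count + 1;
    }
}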

Although I had heard of the Knowledge Network product, I wasn't deeply familiar with it, which seems really unfortunate given that a lot of the kinds of social networking features I've been thinking about for Windows Live would benefit from the ideas I've seen implemented by the Knowledge Network team and SharePoint. If only there were a way I could search for and browse people working on "social networking" technologies at Microsoft so I don't miss information like this in the future. :) I wonder if I can subscribe to an RSS feed of "People Search" results so I can keep track of when new people tagged as "social networking" enter the system (i.e. join the company or start working on a new product). I need to investigate or propose this as a feature if it isn't already there.

By the way, the Knowledge Network folks have a team blog at http://blogs.msdn.com/kn which has a lot of informative posts about their product such as What is Knowledge Network and Why Should You Care? and How KN Integrates with SharePoint. Definitely add their blog to your news reader if you are interested in social networking within the enterprise.


 

December 8, 2006
@ 04:59 PM

In his blog post entitled A conversation with Jon Udell about his new job with Microsoft, Jon Udell writes

Q: Your new job is with Microsoft?

A: That's right. My last day at InfoWorld will be Friday Dec 15. On Jan 15, after a month-long sabbatical, I'll become a Microsoft employee. My official title will be Evangelist, and I'll report to Jeff Sandquist. He's the leader of the team that creates Channel 9 and Channel 10, websites that feature blogs, videos, screencasts, and podcasts for Microsoft-oriented developers.

Q: What will your role be?

A: The details aren't nailed down, but in broad terms I've proposed to Microsoft that I continue to function pretty much as I do now. That means blogging, podcasting, and screencasting on topics that I think are interesting and important; it means doing the kinds of lightweight and agile R&D that I've always done; and it means brokering connections among people, software, information, and ideas -- again, as I've always done.

Q: Why are you doing this?

A: I'm often described as a leading-edge alpha geek, and that's fair. I am, and probably always will be, a member of that club. But I'm also increasingly interested in reaching out to the mainstream of society.

For those of us in the club, it's a golden age. With computers and networks and information systems we can invent new things almost as fast as we can think them up. But we're leaving a lot of folks behind. And I'm not just talking about the digital divide that separates the Internet haves from the have-nots. Even among the haves, the ideas and tools and methods that some of us take for granted haven't really put down roots in the mainstream.

I had dinner with Jon a couple of weeks ago when he came up to Microsoft for interviews and I was impressed with the plan he described for the future of his career. I was pretty sure that once anyone interviewing him spent even a few minutes talking to him they'd be convinced they'd found the right person for the job, even though the job was Jon's idea. I was honored that Jon contacted me to talk about his plans and have been on pins & needles wondering whether the folks at Microsoft would hire him or not.

Congrats to Jeff Sandquist. First Rory, now Jon Udell. You're hiring all the right folks.


 

Categories: Life in the B0rg Cube

December 7, 2006
@ 01:27 AM

Via Sam Ruby's blog post entitled Equal Time I noticed that there has been an interesting conversation brewing about message security and RESTful Web services between Pete Lacey and Gunnar Peterson. However, they both seem to be cherry picking parts of each other's arguments to dispute, which reduces some of the educational value of their posts.

Gunnar Peterson started the discussion going with his post REST Security (or lack thereof) where he writes

So the whole REST security thing just gets funnier, the S for Simple folks forget that S also stands for security. Here was a response to my post on the fact that people who say REST is simpler than SOAP with WS-Security conveniently ignore things like, oh message level security:

HTTP Basic or HTTP Digest or SSL (certificate-based) for authentication. SSL for encryption and digital signatures. You know, the way we've been doing things since 1995.

Where to start? Right, it was state of the art in 1995. no bout a doubt it. The world has moved on slightly since then. You know a couple 97 million stolen identities, endless phishing/pharming (growing double digit pct each month), malware taking 60% cpu utilization on consumer desktops. You know little stuff like that
...
Now if you are at all serious about putting some security mechanisms into your REST there are some good examples. One being Amazon's developer tokens using HMAC for authentication at the message level (you know where the data is). But if you are going to say that REST is so much simpler than SOAP then you should compare REST with HMAC, et al. to the sorts of encryption and signature services WS-Security gives you and then see how much simpler it is. And, you know, maybe even see, oh golly gee I don't know, which one protects your customers' data better? Until then, we'll just continue (as Gene Spafford said) using an armored car to deliver between someone living in a cardboard box and someone living on a park bench.

Gunnar has a good point, which he ruins with some of his examples. The point being that HTTP authentication and SSL aren't the be all and end all of securely communicating on the Web. However, his examples of malware and phishing are unrelated to his point and end up harming his argument. For one, there's nothing one can do at the service security layer to protect against a user that has malware running on their computer. Once the user's machine has been compromised, it is over. As for phishing, that is a problem that relies on the unique combination of social engineering and the unfortunate characteristics of email readers and Web browsers. Phishing is not really an architectural problem that affects machine to machine interaction via Web services. It is an end user problem of the HTML Web.

In Pete Lacey's response entitled RESTful Security he writes

Gunnar notes that the world has moved past SSL etc., and cites as examples identity theft, phishing/pharming, and malware. But these security threats are completely orthogonal to the security concerns SSL addresses. Ditto, I might add, WS-Security. Both of these standards address identity propagation, message encryption, and message integrity only, and neither will protect you from the threats just mentioned. Security is a BIG subject and the areas covered by SSL and WS-Security are just one small part of it. We also need good practices around securing persisted data (and what data to persist); education to prevent social engineering attacks; properly designed operating systems that won’t run anything with a .exe extension or run needless services; developers who are cognizant of buffer overflows, SQL injection, and cross-site scripting attacks; properly managed perimeter defenses; and so on and so on.
...
With all of that behind us, I can get on to what seems to be Gunnar’s main point and the only significant difference (outside of the whole simplicity and interoperability thing) between SSL and WS-Security. And that is that SSL provides transport level, point-to-point security while WSS provides message level, end-to-end security. That’s true, but that doesn’t provide WSS with magical security powers, it just solves a different problem. Nor does it relegate SSL to the scrap heap of history. SSL is not a security panacea–nothing is, but it does what it is does very well. Regardless, there is nothing in REST that prohibits the use of message-level encryption, though the mechanism–should it be needed–would need to be spec’d out.

I’m not dismissing WSS, it’s a perfectly adequate specification for what it does (though it requires the WS-I Security Profile to introduce enough constraints to provide a reasonable chance of interoperability). But the value of message level security should still be questioned. For one thing, what’s the business case? If message-level encryption is so important, why isn’t anyone using it? When Burton Group queried its clients as to their use of WSS, it was found that the only use was to pass identity tokens over HTTPS. When I was working at Systinet (now HP) I vividly recall the WASP (not Systinet Server for Java) product manager spitting nails because his team had just spent six months implementing WSS at our customer’s request and no one–not even those who requested the feature–was using it. Also, this is not the first time message level security has been proposed. When I was working at Netscape back in 1997 I spent a fair amount of my time advocating for S/MIME. Now, nearly ten years later, how many people are using S/MIME to secure their email? And how many are using SSL? Exactly.

I tend to agree with Pete Lacey that a lot of the people who claim they need message level security are actually fine with the transport level security provided by SSL. Message level security is primarily needed if the message will be passing through hostile intermediaries without secure point-to-point communications between the sender and receiver. But how often does that really happen on the Web? One could argue that the example vaunted by Gunnar Peterson, Amazon Web Services, which utilizes HMAC-SHA1 hashes of a developer's secret key for authentication, could just as easily have been implemented using SSL. After all, man-in-the-middle attacks are prevented in both examples. If the issue is what happens if the sender's machine has been compromised (e.g. by malware) then both approaches fall down flat.
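For reference, here is roughly what that style of request signing looks like in .NET; a minimal sketch where the string-to-sign format is simplified for illustration rather than taken from any particular Amazon specification:

using System;
using System.Security.Cryptography;
using System.Text;

static class RequestSigner
{
    // Signs the canonical request data with the developer's secret key.
    // The caller sends the signature (plus a public access key id) along
    // with the request; the service recomputes the HMAC and compares.
    public static string Sign(string stringToSign, string secretKey)
    {
        using (HMACSHA1 hmac = new HMACSHA1(Encoding.UTF8.GetBytes(secretKey)))
        {
            byte[] signature = hmac.ComputeHash(Encoding.UTF8.GetBytes(stringToSign));
            return Convert.ToBase64String(signature);
        }
    }
}

// Example: Sign("GET\n/onca/xml\nTimestamp=2006-12-07T01:27:00Z", secretKey)
// Because the signature covers the message itself, it survives passing
// through intermediaries, unlike transport-only protection from SSL.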

That said, there are times when one has to author an application where the message has to pass through potentially hostile intermediaries and message level security is needed. I've actually had to deal with one such situation in my day job so I know that they are real although I doubt that there are many that will encounter the exact same problem that we did at work.

Once you get to that point, the tough problems are usually around key exchange, key protection and key revocation, not around the niceties of whether you should roll your own usage of XML Signatures or go with a fully featured yet inconsistently implemented protocol like WS-Security. Using the Amazon Web Services as an example, I couldn't find any information on how to protect my secret key beyond admonitions "not to send it around in email", nor did I find any mechanism to revoke or reissue my secret key if it became compromised. As a Web service developer, you'll likely spend more time worrying about those issues than you will figuring out how to integrate signing or encryption of XML documents into your RESTful Web service.


 

Categories: XML Web Services

Linking to Niall Kennedy's blog reminded me that I owed him an email response to a question he asked about a month ago. He asked what I thought about the diversity of speakers at the Widgets Live conference, given my comments on the topic in my blog post entitled Who Attends 'Web 2.0' Conferences.

After thinking about it off and on for a month, I realize that I liked the conference primarily because of its content and focus. The speakers weren't the usual suspects you see at Web conferences, nor were they homogeneous in gender and ethnic background. I assume the latter is a consequence of the conference being about concrete technical topics as opposed to a gathering to gab with the hip Web 2.0 crowd, which meant that the people who actually build stuff were there...and guess what, they aren't all Caucasian males in their 20s to 30s, regardless of how much conferences like The Future of Web Apps and Office 2.0 pretend otherwise.

This is one of the reasons I decided to pass on the Web 2.0 conference this year. It seems I may have made the right choice given John Battelle's comments on the fact that a bunch of the corporate VP types that spoke at the conference ended up losing their jobs the next week. ;)


 

Categories: Trip Report