January 13, 2004
@ 04:06 PM

Mark Pilgrim's recent post entitled There are no exceptions to Postel's Law implies, among other things, that news aggregators should process ill-formed XML feeds because doing so is better for end users: they don't care about esoteric rules defined in the XML 1.0 recommendation, but they do care if they can't read the news from their favorite sites.

This has unleashed some feedback from XML standards folks such as Tim Bray's On Postel, Again and Norman Walsh's On Atom and Postel's Law, who argue that if a feed isn't well-formed XML then it is a fatal error. Aggregator authors have also gotten in the mix. Brent Simmons has posted a number of entries on the topic where he mentions that NetNewsWire currently doesn't error on RSS feeds that are ill-formed XML if it can work around the error, but that he plans to change this for ATOM so that it errors on ill-formed feeds. Nick Bradbury has posted similar thoughts with regards to how FeedDemon has behaved in the past and will behave in future. On the other end of the spectrum is Greg Reinacker, the author of NewsGator, who has stated that NewsGator will process ill-formed RSS or ATOM feeds because he feels this is the best choice for his customers.

My thoughts on this matter are the same as Dave Winer's in his post Postel's Law has two parts 

Personally I disagree with the first half of the law when applied to XML -- the idea that aggregators should bend over backwards to accept poorly formed XML. I always understood that XML was trying to do something different, as a response to the awful mess that HTML became because browser vendors adopted the first half of Postel's philosophy.

When I adopted XML, in 1997, as I understood it -- I signed onto the idea of rejecting invalid XML. It was considered a bug if you accepted invalid XML, not a bug if you didn't.

Brent Simmons, an early player in this market, says users are better served if he reads bad feeds, but when he does that, he's raising the barrier to entry, in undocumented ways that are hard to reproduce.

His interests are served by high barriers to entry, but the users do better if they have more choice.

Now, the users are happy as long as Brent is around to keep updating his aggregator to work around feed bugs, but he might move on, it happens for all kinds of reasons. It's better to insist on tight standards, so users can switch if they want to, for any reason; so that next year's feed will likely work with this year's aggregator, even if it doesn't dominate the market.

I yearn for just one market with low barriers to entry, so that products are differentiated by features, performance and price; not compatibility.

I work on the XML team at Microsoft and one of the things I have to do is coordinate with all the other teams using XML at Microsoft. The ability to consume and produce XML is or will be baked into a wide range of products including BizTalk, SQL Server, Word, Excel, InfoPath, Windows, and Visual Studio. This is in addition to the number of developer technologies for processing XML, from XQuery and XSLT to databinding XML documents to GUI components. In a previous post I mentioned my XML Litmus Test for deciding whether XML would benefit your project

Using XML for a software development project buys you two things (a) the ability to interoperate better with others and (b) a number of off-the-shelf tools for dealing with format.

Encouraging the production and consumption of ill-formed XML damages both these benefits of using XML since interoperability is lost when different tools treat the same XML document differently and off-the-shelf tools can no longer be reliably used to process the documents of that format. This poisons the well for the entire community of developers and users.
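To make the point concrete, here is a minimal sketch of what an off-the-shelf XML tool does with an ill-formed feed. It uses Python's standard xml.etree.ElementTree parser; the two feed snippets and the try_parse helper are made up for illustration and are not taken from any real aggregator.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed snippets, invented for this illustration.
WELL_FORMED = "<rss version='2.0'><channel><title>News</title></channel></rss>"
# The unescaped ampersand makes this document ill-formed XML.
ILL_FORMED = "<rss version='2.0'><channel><title>News & Views</title></channel></rss>"


def try_parse(feed_text):
    """Return the channel title, or None if the feed is not well-formed XML."""
    try:
        root = ET.fromstring(feed_text)
    except ET.ParseError:
        # A conforming XML parser treats a well-formedness error as fatal.
        return None
    return root.findtext("channel/title")


if __name__ == "__main__":
    print(try_parse(WELL_FORMED))
    print(try_parse(ILL_FORMED))
```

A standard parser gives up on the second feed, so anything built on standard XML tooling never sees its content, while an aggregator with its own workaround would happily display it.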

Developers and users of RSS or ATOM can't reap the benefits of the various Microsoft technologies and products (e.g. querying feeds using XQuery or storing feeds in SQL Server) if there is a proliferation of ill-formed feeds. So far this is not the case (ill-formed feeds are a minority), but every time an aggregator vendor decides to encourage content producers to generate ill-formed XML by working around it and displaying the feed to the user with no visible problems, that is one more drop of cyanide in the well.


Categories: XML

Robert Scoble writes

 I was just over at Slashdot and found that a Microsoft general manager is not happy that HP has partnered with Apple on its iPod and iTunes service.

You know, I've been darn supportive of Microsoft's strategies lately. But, not this time. This strategy of "whine about lack of choice" isn't a winning one.

When someone is beating you in the marketplace, the thing to do isn't to whine about choice (and, if anyone says Apple isn't winning in the marketplace with its iPod then they are drinking far better Merlot than the $5.49 Columbia Crest stuff I can afford). A winning strategy, instead, would be to give consumers a better product and if you believe you have one, tell its story and don't knock the competition!

Shortly after joining Microsoft I attended a class on interacting with customers and competitiveness, where the presenter emphatically pointed out that Microsoft has zero credibility when it tries to attack other companies for being the 800 lb. gorilla in a particular market. At the time I thought this was mind-numbingly obvious and wondered why she was wasting our time telling us what everyone should know. After two years at Microsoft I now realize I was mistaken and many worker drones in the B0rg cube have no idea what the external perception of the company is actually like.

If it wasn't so sad given that I work here I'd find it hilarious that a Microsoft executive is actually trying to pull a “freedom of choice” argument given the company's history. Of course, the folks on Slashdot had a field day with that one.


Categories: Life in the B0rg Cube

January 13, 2004
@ 03:02 PM

The MiddleWare Company  announces 

TMC today launched TheServerSide.NET, an enterprise .NET architecture and development community. The launch is part of a vision aimed at building communities (online sites, conferences, user groups, etc) to serve all technology practitioners in the middleware industry. TSS.NET will be similar to TSS.com in style and quality, but both communities will be operated independently.

It looks like Ted Neward will be the editor-in-chief of the site. It looks like the site will be top-heavy in its focus on XML Web Services, so I doubt I'll be subscribing to their RSS feed, but if that's your bag it looks like a good site to check out. I've been reading TheServerSide.com for about two years and I've found it useful for getting insight into what's going on in the Java world.

Speaking of the Java world, which community blog site has better signal to noise ratio between Weblogs @ Java.net and Java.Blogs? I've been considering subscribing to one of them but I'm already swamped with lots of content of dubious quality from Weblogs @ ASP.NET and don't want to repeat the experience.


January 13, 2004
@ 06:55 AM

Scott Hanselman writes


The MSN Direct (Wrist.NET) watch is the best PDA-Watch I've seen so far.  As a Palm fan, I was stoked about the Palm PDA Watch, but it was WAY too big, and tried to be too much.  I don't want to use a freakin' stylus on my watch.  At the same time, I have been one of the 'bat-belt' people with a cell-phone, pager, PDA, digital camera and laptop (not to mention a Glucose Meter and Insulin Pump).  I really don't need another battery to charge! 

I don't expect a watch to replace my current ONE device - a Blackberry Phone (that handles email, calendar, web and cell phone on a single device) - but I would like something to provide me with a little more information than just the time, without making me feel overloaded with information.  Plus, it has to look good and not make my one arm 5 lbs heavier than the other.

[Via Rory Blyth]

I have to agree with Rory and Scott, this is the dork watch. So far I haven't seen anyone at work rocking one of these but I'd definitely like to see one up close to see if it confirms some suspicions I have about their trendiness and utility factor.


January 13, 2004
@ 06:39 AM

C|Net reports

Yahoo plans to test RSS technology for its personalization tools, giving people the ability to automatically receive news and information feeds from third parties onto MyYahoo pages.

The Sunnyvale, Calif.-based company has been experimenting with technology called Really Simple Syndication (RSS), a format that is widely used to syndicate blogs, discussion threads and other Web content. Yahoo already started using RSS for its Yahoo News service, allowing other sites to automatically "scrape" Yahoo's top stories daily.

Last week, the company started beta testing RSS for MyYahoo, but soon pulled the experiment shortly after.

I've been using My Yahoo! almost from the beginning and have always wanted a way to plug in to their syndication architecture. Being able to syndicate RSS feeds directly into My Yahoo! content is a killer feature. I wonder if they'll go the Slashbox route or allow users to subscribe to feeds directly, thus using My Yahoo! as an RSS news aggregator?

Either way it's definitely a step in the right direction.   


A couple of months ago I read How to Ignore Your Best Customers, the TiVo Way (Part 1)  which begins

We’re big TiVo fans, and have been for three years.

There’s tens of thousands of us who evangelize the company’s precedent-setting digital video recorder and how it has changed our lives. Online, 40,000 of TiVo’s customers have self-organized the TiVo Community forum, which we joined a year ago. The group is Beyond Thunderdome-loyal.

Browse the forums and you will find spirited discussions on topics as varied as these:

  • Why TiVo customers often take over for a hapless retail store salesperson

  • How-to guides on the best ways to convince a loved one to buy and keep a TiVo

  • The May 2004 conference in Las Vegas for TiVo enthusiasts that forum members are organizing

For most companies, a self-organized community of 40,000 passionate fans is unfathomable—a Holy Grail and marketing nirvana that many wish for but few attain.

The interesting thing is that I find myself to be one of these people. Whenever I start talking to someone who doesn't have a TiVo about owning one, the conversation eventually becomes a sales pitch. I've found talking to people about the iPod to be the same way. Halfway through the conversation there's the frustration that washes over me because I can't seem to find the words to truly express to the person I'm talking to how much the iPod or TiVo would change that aspect of their lives.

Watching TV hasn't been the same since I bought the TiVo and I can't imagine ever going back to not having one. Now that I have my iPod I can't imagine what would possess me to buy a CD ever again, yet I can listen to almost any song I've ever liked, from James Brown to Metallica to 50 Cent, anywhere I want, whenever I want.

I can't remember any technology ever affecting me this significantly. I believe when I first got a broadband connection it was the same thing and before that probably the first time I got on the World Wide Web. Before that nothing...


Categories: Ramblings

Mark Pilgrim has a fairly interesting post entitled There are no exceptions to Postel’s Law which contains the following gem

There have been a number of unhelpful suggestions recently on the Atom mailing list...

Another suggestion was that we do away with the Atom autodiscovery <link> element and “just” use an HTTP header, because parsing HTML is perceived as being hard and parsing HTTP headers is perceived as being simple. This does not work for Bob either, because he has no way to set arbitrary HTTP headers. It also ignores the fact that the HTML specification explicitly states that all HTTP headers can be replicated at the document level with the <meta http-equiv="..."> element. So instead of requiring clients to parse HTML, we should “just” require them to parse HTTP headers... and HTML.

Given that I am the one that made this unhelpful suggestion on the ATOM list it only seems fair that I clarify my suggestion. The current proposal for how an ATOM client (for example, a future version of RSS Bandit) determines how to locate the ATOM feed for a website or post a blog entry or comment is via Mark Pilgrim's ATOM autodiscovery RFC, which basically boils down to parsing the webpage for <link> tags that point to the ATOM feed or web service endpoints. This is very similar to RSS autodiscovery, which has been a feature of RSS Bandit for several months.
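As a rough illustration of what <link>-based autodiscovery asks of a client, here is a minimal sketch using the lenient HTMLParser class from Python's standard library. It is not how RSS Bandit does it (that uses SgmlReader in C#); the FeedLinkFinder class, the discover_feeds helper, the sample page and the example.org URL are all assumptions made up for the example.

```python
from html.parser import HTMLParser

# MIME types treated as feeds in this sketch (an assumption, not from the RFC text).
FEED_TYPES = {"application/atom+xml", "application/rss+xml"}


class FeedLinkFinder(HTMLParser):
    """Collect href values of <link rel="alternate"> tags that point at feeds."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag != "link":
            return
        a = dict(attrs)
        rel = (a.get("rel") or "").lower().split()
        mime = (a.get("type") or "").lower()
        if "alternate" in rel and mime in FEED_TYPES and a.get("href"):
            self.feeds.append(a["href"])


def discover_feeds(html_text):
    finder = FeedLinkFinder()
    finder.feed(html_text)
    return finder.feeds


# Hypothetical page: uppercase tags, unclosed elements, not well-formed XML.
PAGE = """<HTML><HEAD>
<LINK REL="alternate" TYPE="application/atom+xml" HREF="http://example.org/atom.xml">
<link rel="stylesheet" href="style.css">
</HEAD><BODY><P>Unclosed paragraph<BR>more text</BODY>
"""

if __name__ == "__main__":
    print(discover_feeds(PAGE))
```

The logic itself is short; the hard part is that it only works if the underlying parser survives whatever markup the page throws at it.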

The problem with this approach is that it means that an ATOM client has to know how to parse HTML on the Web in all its screwed up glory, including broken XHTML documents that aren't even well-formed XML, documents that use incorrect encodings and other forms of tag soup. Thankfully on major platforms developers don't have to worry about figuring out how to rewrite the equivalent of the Internet Explorer or Mozilla parser themselves because others have done so and made the libraries freely available. For Java there's John Cowan's TagSoup parser while for C# there's Chris Lovett's SgmlReader (speaking of which it looks like he just updated it a few days ago, meaning I need to upgrade the version used by RSS Bandit). In RSS Bandit I use SgmlReader, which in general works fine until confronted with weirdness such as the completely broken HTML produced by old versions of Microsoft Word, including tags such as 

<?xml:namespace prefix="o" ns="urn:schemas-microsoft-com:office:office" />

Over time I've figured out how to work past the markup that SgmlReader can't handle, but it's been a pain to track down what the problem cases were and I often ended up finding out about them via bug reports from frustrated users. Now Mark Pilgrim is proposing that ATOM clients must go through the same problems faced by folks like me who've had to deal with RSS autodiscovery.

So I proposed an alternative: instead of every ATOM client requiring an HTML parser, this information would be provided in a custom HTTP header returned by the website. Custom HTTP headers are commonplace on the World Wide Web and are widely supported by most web development technologies. The most popular extension header I've seen is the X-Powered-By header, although I'd say the most entertaining is the X-Bender header returned by Slashdot, which contains a quote from Futurama's Bender. You can test for yourself which sites return custom HTTP headers by trying out Rex Swain's HTTP Viewer. Not only is generating custom headers widely supported by web development technologies like PHP and ASP.NET, but extracting them from an HTTP response is also fairly trivial on most platforms, since practically every HTTP library gives you a handy way to extract the headers from a response in a collection or similar data structure.
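To show how little code the header approach needs on the client side, here is a small sketch that parses a raw block of response headers with Python's standard http.client.parse_headers and looks one up by name. The header name X-Atom-Feed, the find_feed_header helper and the example.org URL are purely hypothetical; no actual header name is specified here.

```python
import http.client
import io

# Hypothetical raw response headers; "X-Atom-Feed" is an invented header name.
RAW_HEADERS = (
    b"Content-Type: text/html\r\n"
    b"X-Powered-By: PHP/4.3.4\r\n"
    b"X-Atom-Feed: http://example.org/atom.xml\r\n"
    b"\r\n"
)


def find_feed_header(raw_headers, name="X-Atom-Feed"):
    """Return the value of the named header, or None if it is absent."""
    headers = http.client.parse_headers(io.BytesIO(raw_headers))
    # Header lookup is case-insensitive, as HTTP requires.
    return headers.get(name)


if __name__ == "__main__":
    print(find_feed_header(RAW_HEADERS))
```

In a real client the HTTP library would already hand back the parsed headers from the response, so the lookup is typically a single call with no HTML parsing involved.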

If ATOM autodiscovery used a custom header as opposed to requiring clients to use an HTML parser, it would make the process more reliable (no more worry about malformed [X]HTML borking the process), which is good for users as I can attest from my experiences with RSS Bandit, and it would reduce the complexity of client applications (no dependence on a tag soup parsing library).

Reading Mark Pilgrim's post, the only major objection he raises seems to be that the average user (Bob) doesn't know how to add custom HTTP headers to their site, which is a fallacious argument given that the average user similarly doesn't know how to generate an XML feed from their weblog either. However the expectation is that Bob's blogging software should do this, not that Bob will be generating this stuff by hand.

Mark also incorrectly states that the HTML spec says that “all HTTP headers can be replicated at the document level with the <meta http-equiv="..."> element”. The HTML specification actually states

META and HTTP headers

The http-equiv attribute can be used in place of the name attribute and has a special significance when documents are retrieved via the Hypertext Transfer Protocol (HTTP). HTTP servers may use the property name specified by the http-equiv attribute to create an [RFC822]-style header in the HTTP response. Please see the HTTP specification ([RFC2616]) for details on valid HTTP headers.

The following sample META declaration:

<META http-equiv="Expires" content="Tue, 20 Aug 1996 14:25:27 GMT">

will result in the HTTP header:

Expires: Tue, 20 Aug 1996 14:25:27 GMT

That's right, the HTML spec says that authors can put <meta http-equiv="..."> in their HTML documents and that when a web server gets a request for a document it should parse out these tags and use them to add HTTP headers to the response. In reality this turned out to be infeasible because it would be highly inefficient and require web servers to run a tag soup parser over a file each time they served it up to determine which headers to send in the response. So what ended up happening is that certain browsers support a limited subset of the HTTP headers if they appear as <meta http-equiv="..."> in the document.

It is unsurprising that Mark mistakes what ended up being implemented by the major browsers and web servers for what was in the spec; after all, he who writes the code makes the rules.
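For a sense of what the spec's wording would ask of a web server, here is a hedged sketch of the scan it would need to run over a document before it could send a single header. It uses Python's standard HTMLParser; the MetaHeaderScanner class, the headers_from_document helper and the sample document are made up for illustration and do not describe any real server.

```python
from html.parser import HTMLParser


class MetaHeaderScanner(HTMLParser):
    """Collect (name, value) pairs from <meta http-equiv="..." content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.headers = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased, so <META> matches too.
        if tag != "meta":
            return
        a = dict(attrs)
        name, value = a.get("http-equiv"), a.get("content")
        if name and value:
            self.headers.append((name, value))


def headers_from_document(html_text):
    scanner = MetaHeaderScanner()
    scanner.feed(html_text)
    return scanner.headers


# Hypothetical document reusing the META declaration from the spec excerpt.
DOC = (
    '<HTML><HEAD><META http-equiv="Expires" '
    'content="Tue, 20 Aug 1996 14:25:27 GMT"></HEAD><BODY>Hello'
)

if __name__ == "__main__":
    for name, value in headers_from_document(DOC):
        print(f"{name}: {value}")
```

Running something like this over every file on every request is the cost that made the server-side reading of the spec impractical.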

At this point I'd definitely like to see an answer to the questions Dave Winer asked on the atom-syntax list about its decision making process. So far it's seemed like there's a bunch of discussion on the mailing list or on the Wiki which afterwards may be ignored by the powers that be who end up writing the specs (he who writes the spec makes the rules). The choice of <link> tags over using RSD for ATOM autodiscovery is just one of many examples of this occurrence. It'd be nice to see some documentation of the actual process as opposed to the anarchy and “might is right” approach that currently exists.


Categories: XML

January 10, 2004
@ 07:58 PM

The National Pork Board would like to remind you self-righteous, holier-than-thou beef-eatin' snobs there's never been a single case of "mad pig" disease.

Pork: The other white meat, Bee-yotch!

The Boondocks comic is consistently funny unlike other online comics that have recently started falling off *cough*Sluggy*cough*. Also it is the only other regular newspaper comic that decries the insanity of the current situation in the US.


Slashdot ran yet another article on outsourcing today, this one about how Tech Firms Defend Moving Jobs Overseas. It had the usual comments one's come to expect from such stories. It's been quite interesting watching the attitudes of the folks on Slashdot over the past few years. I started reading the site around the time of the RedHat IPO when everyone was cocky and folks used to brag about getting cars as signing bonuses. Then came the beginning of the downturn, when the general sentiment was that only those who couldn't hack it were getting fired. Then the feeling that the job loss was more commonplace started to spread and the xenophobic phase began with railings against H1Bs. Now it seems every other poster is either out of work or just got a job after being out of work for a couple of months. The same folks who used to laugh at the problems the RIAA had dealing with the fact that "their business model was obsolete in a digital world" now seek protectionist government policies to deal with the fact that their IT careers are obsolete in a global economy.

Anyway, I digress. I found an interesting link in one of the posts to an article on FastCompany entitled The Wal-Mart You Don't Know. It begins

A gallon-sized jar of whole pickles is something to behold. The jar is the size of a small aquarium. The fat green pickles, floating in swampy juice, look reptilian, their shapes exaggerated by the glass. It weighs 12 pounds, too big to carry with one hand. The gallon jar of pickles is a display of abundance and excess; it is entrancing, and also vaguely unsettling. This is the product that Wal-Mart fell in love with: Vlasic's gallon jar of pickles.

Wal-Mart priced it at $2.97--a year's supply of pickles for less than $3! "They were using it as a 'statement' item," says Pat Hunn, who calls himself the "mad scientist" of Vlasic's gallon jar. "Wal-Mart was putting it before consumers, saying, This represents what Wal-Mart's about. You can buy a stinkin' gallon of pickles for $2.97. And it's the nation's number-one brand."

Therein lies the basic conundrum of doing business with the world's largest retailer. By selling a gallon of kosher dills for less than most grocers sell a quart, Wal-Mart may have provided a service for its customers. But what did it do for Vlasic? The pickle maker had spent decades convincing customers that they should pay a premium for its brand. Now Wal-Mart was practically giving them away. And the fevered buying spree that resulted distorted every aspect of Vlasic's operations, from farm field to factory to financial statement.

and has this somewhere in the middle

Wal-Mart has also lulled shoppers into ignoring the difference between the price of something and the cost. Its unending focus on price underscores something that Americans are only starting to realize about globalization: Ever-cheaper prices have consequences. Says Steve Dobbins, president of thread maker Carolina Mills: "We want clean air, clear water, good living conditions, the best health care in the world--yet we aren't willing to pay for anything manufactured under those restrictions."

which is particularly interesting given the various points I've seen raised about outsourcing in the IT field. The US is definitely headed for interesting times.


January 9, 2004
@ 05:27 AM

For the last couple of months I've noticed a rather annoying bug with my cellphone, an LG 5350. Whenever I enter a new contact it also copies the person's number over the number of an existing contact. If I later delete the new entry it also deletes the copied-over number from the other contact. I've lost a couple of folks' phone numbers due to this annoyance. I'm now in the market for a new phone.

The main features I want besides the standard cell phone features (makes calls, address book) are the ability to sync with my calendar in Outlook and perhaps the ability to get information on traffic conditions as well.

I'm currently torn between getting a SmartPhone and a Pocket PC Phone Edition. Too bad stores don't let you test drive cellphones like they do cars.


Categories: Ramblings