Matevz Gacnik points out a serious bug in System.Xml.XmlValidatingReader. He writes:

The schema spec and especially RFC 2396 state that xs:anyURI instance can be empty, but System.Xml.XmlValidatingReader keeps failing on such an instance.

To reproduce the error use the following schema:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="AnyURI" type="xs:anyURI">
  </xs:element>
</xs:schema>

And this instance document:

<?xml version="1.0" encoding="UTF-8"?>
<AnyURI/>

There is currently no workaround for .NET FX 1.0/1.1. Actually Whidbey is the only patch that fixes this. :)

The schema validation engine in the .NET Framework uses the System.Uri class for parsing URIs. This class doesn't consider an empty string to be a valid URI, which is why our schema validation considers the above instance to be invalid according to its schema. However, it isn't clear-cut in the specs whether this is valid or not, at least not without a bunch of sleuthing. As Michael Kay (XSLT working group member) and C. M. Sperberg-McQueen (chairman of the XML Schema working group) wrote on XML-DEV:

To: Michael Kay <michael.h.kay@ntlworld.com>
Subject: RE: [xml-dev] Can anyURI be empty?
From: "C. M. Sperberg-McQueen" <cmsmcq@acm.org>
Date: 07 Apr 2004 10:49:51 -0600
Cc: xml-dev@lists.xml.org

On Wed, 2004-04-07 at 03:47, Michael Kay wrote:
> > If it couldn't, it would be wrong. An empty string is a valid URI.
>
> On this, like so many other things, RFC 2396 is a total disaster. An empty
> string is not valid according to the BNF syntax, but the RFC gives detailed
> semantics for what it means (detailed semantics, though very imprecise
> semantics).
>
> And the schema REC doesn't help. It has the famous note saying that the
> definition places "only very modest obligations" on an implementation, and
> it doesn't say what those obligations are.

Yes.  This is a direct result of our realization that
we have as much trouble understanding RFC 2396 as anyone
else.  The anyURI type imposes the obligations of
RFC 2396, whatever those are.  Any attempt to paraphrase
them on our part would lead, I fear, to an unsatisfactory
result: either we would make some mistake (like believing
that since the BNF does not accept the empty string,
it must not be legal)
or we would make no mistakes.  In
the one case, we'd be misleading our readers, and in
either case, we'd find ourselves mired in a never-ending
effort to prove that our paraphrase was, or was not,
correct. 
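
To make the System.Uri behavior described above concrete, here is a minimal repro sketch, assuming the .NET 1.x APIs. The file names are placeholders for the schema and instance documents quoted earlier.

using System;
using System.Xml;
using System.Xml.Schema;

class AnyUriRepro
{
    static void Main()
    {
        // "AnyURI.xml" and "AnyURI.xsd" are hypothetical file names for the
        // instance document and schema shown above.
        XmlValidatingReader reader = new XmlValidatingReader(new XmlTextReader("AnyURI.xml"));
        reader.ValidationType = ValidationType.Schema;
        reader.Schemas.Add(null, "AnyURI.xsd");
        reader.ValidationEventHandler += new ValidationEventHandler(OnValidationError);

        while (reader.Read()) { }  // the empty xs:anyURI value triggers the validation error
        reader.Close();
    }

    static void OnValidationError(object sender, ValidationEventArgs e)
    {
        Console.WriteLine("Validation error: " + e.Message);
    }
}

The validation event fires because the empty string fails System.Uri parsing, even though one can argue the instance document should be considered valid.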

RFC 2396 is one of the fundamental specifications of the World Wide Web, yet it is vague and contradictory in a number of key places. Those of us implementing standards often have to go on gut feel or try to track down the spec authors whenever we run across issues like this, but sometimes we miss them.

All I can do is apologize to people like Matevz Gacnik, who have to bear the brunt of the lack of interoperability caused by vaguely written specifications implemented on our platform, and for the fact that a fix for this problem won't be available until Whidbey.


 

Categories: XML

This week was the 2004 MVP Summit, where several hundred MVPs for various Microsoft technologies and products descended on Redmond to interact with the various product teams at Microsoft. It was a little hectic organizing things to ensure that the XML MVPs got enough face time, but I think everyone was happy with the way things turned out.

On Monday, a number of Microsoft folks and MVPs had dinner at Rikki Rikki, a rather nice sushi restaurant in Kirkland, WA. The geeks in attendance included Tim Ewald, Ted Neward, Don Box, Rory Blyth, Kirk Allen Evans, Drew Marsh, Julia Lerman, Jeff Julian, Sam Gentile, Joshua Allen, Chris Anderson, Arpan Desai, Mark Fussell, DonXML Demsak, Daniel Cazzulino, Aaron Skonnard, Christoph Schittko, Rick Strahl, Joe Fawcett, Peter Provost, Cathi Gero, Michael Rys, Bryant Likes, Jeffrey Richter and a number of others. The dinner was Kirk's idea and Don suggested the place; it was definitely a pleasant evening of talking XML geekery. One of the things we talked about was why some of the APIs that were in the PDC preview won't make it into Whidbey, such as the XmlAdapter and the XPathChangeNavigator. In retrospect, the functionality provided by both APIs was complex to implement yet could be satisfied through other mechanisms.

At the end of the dinner Kirk took a group photograph. Afterwards a couple of us stragglers saw Hellboy, which was an entertaining superhero movie although the ending could have been better.

On Wednesday, eight XML MVPs (DonXML Demsak, Joe Fawcett, Daniel Cazzulino, Jeff Julian, Matevz Gacnik, Rolandas Gricius, Bryant Likes, and J. Michael Palermo IV) got to spend a day with the WebData XML team. Shortly after 9 AM there was an hour-long open panel discussion with the MVPs on one side and a few dozen members of the WebData XML team on the other, with questions flying back and forth. For many members of the team, getting candid feedback from an array of customers with different backgrounds was very illuminating. The rest of the day was filled with presentations and Q & A sessions with the MVPs. They got a preview of what we'll be doing in Whidbey and maybe Orcas in the areas of XML tools, XML<->Relational mapping technologies, XQuery and core XML APIs. Since one of the complaints we'd heard was that a number of sessions they'd seen earlier in the week were just rehashed PDC slides, I endeavored to ensure that the MVPs would see newer content or at least get more in-depth information about what we plan to do. Based on the feedback I got, they were pleased with the experience.

On Friday, Robert Scoble and Charles Torre swung by my office and interviewed me for Channel 9. I gave them a tour of my office and showed them my budding collection of Spawn action figures and my demotivators hanging on the wall. I'm not sure if I gave a good interview or not but I guess that's part of the charm of Channel 9. I'll post a link to the interview whenever it shows up online.


 

Categories: Life in the B0rg Cube

April 6, 2004
@ 05:15 PM

I'd like to congratulate Robert Scoble, Jeff Sandquist and all the others involved on the launch of Channel 9. The doctrine of Channel 9 positions it as an avenue for Microsoft employees and their customers to interact in an honest manner. The Who We Are page states that it is an "attempt to move beyond the newsgroup, the blog, and the press release to talk with each other, human to human".

My personal take on Channel 9 is that it reminds me a lot of VBTV: people either really liked it or really hated it. Already I've begun to see posts from both ends of the spectrum; there are posts like Channel 9 - a very commendable effort and then those like Why Channel 9 is stupid. I tend to agree with the latter post but do think it is an interesting experiment.

I think Microsoft has been doing a good job of providing avenues for its employees to interact directly with their customers, from newsgroups to blogs to the various developer websites such as MSDN and ASP.NET. If anything, I feel like there are probably too many options rather than too few. I daily check the microsoft.public.dotnet.xml newsgroup, the microsoft.public.xml newsgroup, the Extreme XML message board on MSDN, various blogs I'm subscribed to, the comments in my work blog as well as various internal mailing lists for feedback on the technologies I am responsible for. Then there are days like yesterday when I got to hang out, drink beer and eat sushi with Tim Ewald, Ted Neward, Don Box, Rory Blyth, Kirk Allen Evans, Drew Marsh, Julia Lerman, Jeff Julian, Sam Gentile, Joshua Allen, Chris Anderson, Arpan Desai, Mark Fussell, DonXML Demsak, Daniel Cazzulino, Aaron Skonnard, Christoph Schittko, Rick Strahl, Joe Fawcett and a bunch of others.

The thought that Microsoft needs to move “beyond the newsgroup, the blog, and the press release” doesn't jibe with my daily experience interacting with our customers. In fact, I know a number of our customers dislike the fact that there's a decision tree that needs to be traversed to figure out how to get information from Microsoft (do they go to newsgroups? MSDN? find the relevant blog? go to GotDotNet? call PSS? etc.).

However as I mentioned earlier, I think it is an interesting experiment which means I will participate to some degree. I'm already scheduled to do an interview with Scoble this Friday so I'll probably be hamming it up in one of those streaming videos in the next few weeks.

Work calls.


 

Categories: Life in the B0rg Cube

Joshua Allen has a post entitled RSS Last Mile where he complains about the lack of a clear story with regards to one click subscription to RSS/ATOM feeds. I wrote about the various approaches to achieving one click subscription to ATOM and RSS feeds a few months ago, which led to the drafting of the feed URI scheme. Three months later, one click subscription to syndication feeds is still as confused as it's always been. A lot of the major aggregators support the feed URI scheme but none of the major blogging tools has decided to support it yet. Instead a lot of folks still use the 127.0.0.1 hack popularized by Radio Userland, which is now utilized by a wide number of aggregators. However most websites do nothing with regards to one click subscription and just have a hyperlinked image that points to the RSS feed.

The only new thing I've seen is that yet another person has cooked up their own one click subscription scheme that is incompatible with all the others. Thanks to Joshua's post I found an RFC for one click subscription to syndication feeds which seems to me to be the least advantageous of the approaches that have shown themselves thus far.

The author of the RFC wrote the following about existing approaches; I've annotated his comments with mine in red text:

Current solutions:

  • have the aggregator clients register with some mime-type (for either RSS or OPML). I don't believe anyone's actually implemented this since most aggregator authors know it doesn't work for a variety of reasons listed in my post on one click subscription to ATOM and RSS feeds
  • have a new protocol (feed:). Actually this is a URI scheme, not a protocol, and the process is the same as the above: have the aggregator clients register as the handler for some URI scheme (a registration sketch follows this list)
  • support as many clients as possible via javascript (see QuickSub),
  • transform the RSS with XSL in the browser to help newbies (not really a one-click subscription solution though). This could be a one click subscription option if the prettied up RSS feed shown to the user also displays a link that uses one of the other 3 techniques mentioned above. So this approach is really orthogonal to the others and in fact can be considered complementary
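
As a rough illustration of the second approach, here is a sketch of how a Windows aggregator might register itself as the handler for the feed: URI scheme. The registry layout follows the standard Windows URL protocol convention; the aggregator name and executable path are hypothetical placeholders.

using Microsoft.Win32;

class FeedSchemeRegistration
{
    static void Main()
    {
        // Register the feed: scheme under HKEY_CLASSES_ROOT using the usual
        // URL protocol convention.
        RegistryKey feedKey = Registry.ClassesRoot.CreateSubKey("feed");
        feedKey.SetValue("", "URL:feed Protocol");
        feedKey.SetValue("URL Protocol", "");

        // The browser hands the full feed: URI to the registered application as %1.
        RegistryKey commandKey = feedKey.CreateSubKey(@"shell\open\command");
        commandKey.SetValue("", "\"C:\\Program Files\\MyAggregator\\MyAggregator.exe\" \"%1\"");

        commandKey.Close();
        feedKey.Close();
    }
}

Once registered, clicking a feed: link in any browser launches the aggregator with the feed URI as its argument, which is what makes this approach browser independent.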

The author of the RFC post then goes on to suggest an Internet Explorer-specific solution, namely:

Replace the orange Feed button:
The orange feed button needs to be wrapped with an object tag:

<object classid="clsid:0123456789ABCDEF [1]">
  <param name="feedurl" value="http://feedurl [2]">
  <param name="description" value="blah blah [3]">
  <param name="imageurl" value="http://buttonimageurl [4]">

  <a href="http://feedurl [2]"><img src="http://buttonimageurl [4]" /></a>
</object>

If the ActiveX control with class ID [1] is installed, it displays a custom "subscribe" button. When you click on it, it uses the feedurl parameter [2] to subscribe.

Besides the fact that this approach is Internet Explorer-specific since it requires an ActiveX object, it doesn't offer anything that the other approaches don't. I don't see why Joshua thinks it's a good idea, considering that all 3 of the other approaches work in a variety of browsers on a variety of platforms.

 

Categories: RSS Bandit

I finally got to take a look at the WS-MetadataExchange specification while hanging out in Don's office last week. The spec is fairly straightforward: it defines a mechanism for one to request the WSDL, Policy or XML Schema of a target namespace (i.e. a URI) from an XML Web Service endpoint. Basically one can ask what services an endpoint supports and what the messages the endpoint accepts should look like.

Both Don and Omri have suggested that WS-MetadataExchange can solve a problem I had with the SOAP-based version of the ATOM API. The problem is how an ATOM client is supposed to know what services an ATOM endpoint supports. Here are three descriptions of ATOM-enabled sites that I might want to interact with as an RSS Bandit user.

  1. A weblog that supports user comments posted anonymously and provides the ability to search the weblog archives. The user comments must use a subset of HTML. For example, Sam Ruby's weblog.

  2. A weblog that doesn't have comments enabled but does provide the ability to search the weblog archives. For example, Mark Pilgrim's weblog

  3. A weblog that only supports comments that have been authenticated with TypeKey and doesn't support search. Again user comments must use a subset of HTML. Any Movable Type blog that supports TypeKey is an example.

All three would require a smart client to give the user visual hints and clues as to how they can interact with the site. At the very minimum a search box that is grayed out when the target weblog doesn't support search.

So far the only mechanism I've seen proposed for solving this problem in the case of the ATOM API is the link element used for locating service endpoints. This allows you to get the URIs of service endpoints, such as where to post comments or where to send search requests if they exist, but it does not answer finer grained questions. Questions such as “What subset of HTML can I use in comments?” or “Do I need to be authenticated before I post comments?” are currently not answered by any of the draft ATOM specs.
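
As a rough sketch of what that link-based discovery looks like from the client's side, the snippet below scans a feed for service endpoint links. The feed URL and the rel values are illustrative placeholders, since the draft specs were still in flux at the time.

using System;
using System.Xml;

class AtomEndpointDiscovery
{
    static void Main()
    {
        XmlDocument feed = new XmlDocument();
        feed.Load("http://www.example.com/atom.xml");  // hypothetical feed URL

        foreach (XmlElement link in feed.GetElementsByTagName("link"))
        {
            string rel = link.GetAttribute("rel");
            // Hypothetical rel values for "where do I post comments" and
            // "where do I send search requests".
            if (rel == "service.post" || rel == "service.search")
            {
                Console.WriteLine(rel + " -> " + link.GetAttribute("href"));
            }
        }
    }
}

This tells the client where the endpoints are but, as noted above, nothing about what those endpoints will accept.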

So far WS-MetadataExchange, or something like it, looks like the best way to support such scenarios for SOAP-enabled ATOM endpoints in a way that is consistent with the Global XML Web Services architecture. I would be interested in seeing an ATOM-specific solution evolve as well since some of these issues hurt the usability of weblogs. I've lost count of the number of times I've posted a comment, or seen someone else post a comment, only to complain about the fact that the weblog doesn't support HTML or mangled some text. Having a way to inquire about this in a standard way would definitely improve the user experience.


 

Categories: XML

This week Torsten figured out how to add the equivalent of a “Subscribe in RSS Bandit” entry to the context menu in Internet Explorer and Firefox when you right-click on a link. Click below for a screenshot of what it looks like in Internet Explorer.
 

Categories: RSS Bandit

April 3, 2004
@ 06:53 PM

It seems the more popular hip hop gets the more I hate the stuff that gets played on the hip hop radio stations. I particularly cringe whenever I hear J-Kwon's “Tipsy” or Kanye West's “Through the Wire”. It seems I've begun to retreat into the past or listen exclusively to mix tapes. Select tracks from the following albums have been playing semi-regularly on my iPod in the past few weeks

A friend of mine suggested picking up a Linkin Park album but I'm not sure where to start. I have heard their collaboration with the X-Ecutioners on It's Goin' Down and I liked it. So the question is whether to go with their last album or their first album.


 

Categories: Ramblings

I am subscribed to the Amazon Hip Hop Music RSS feed, which provides information about hip hop CDs on sale at Amazon. I was just thinking that it'd be really cool if, once a CD I liked showed up in my RSS feed, I could just click a [Buy This] button and initiate the process of purchasing the CD. Combined with Amazon's one click shopping, it's conceivable that this could be done in a single click.

At the minimum, to implement something like this you'd need an annotation in the RSS feed containing the endpoint where the aggregator should securely submit the purchase information, and a specified format for what the submitted purchase information should look like. In the case of Amazon it might just be a cookie, while for others it might be all the required information like credit card number and shipping address.
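
Purely as a thought experiment, here is a sketch of what the aggregator side could look like. Everything in it is hypothetical: the purchase extension namespace, the buy:endpoint element and the payload format are made up for illustration.

using System;
using System.IO;
using System.Net;
using System.Text;
using System.Xml;

class OneClickPurchase
{
    static void BuyItem(XmlElement rssItem)
    {
        XmlNamespaceManager nsmgr = new XmlNamespaceManager(rssItem.OwnerDocument.NameTable);
        nsmgr.AddNamespace("buy", "http://example.com/hypothetical/purchase");

        // Hypothetical annotation: <buy:endpoint> holds the URL the purchase
        // information should be submitted to.
        XmlNode endpointNode = rssItem.SelectSingleNode("buy:endpoint", nsmgr);
        if (endpointNode == null)
            return;  // this feed doesn't support one click purchasing

        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(endpointNode.InnerText);
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";

        // In Amazon's case this might just be a cookie; here we post the item link
        // as a stand-in for whatever purchase information the format specifies.
        byte[] body = Encoding.UTF8.GetBytes("item=" + rssItem.SelectSingleNode("link").InnerText);
        request.ContentLength = body.Length;
        using (Stream requestStream = request.GetRequestStream())
        {
            requestStream.Write(body, 0, body.Length);
        }
        request.GetResponse().Close();
    }
}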

Now that would definitely be cool. A subscription to a list of things I might be interested in buying and the ability to buy one of them when something caught my eye.   

 

Categories: RSS Bandit

I heard D12's My Band on the radio and was wondering if this is the harbinger of a new album by Em or D12. The song is the typical kind of over-the-top rap you get from D12; the premise is that the rest of D12 is jealous of Em because he gets more money and fame than they do.

I guess I need to swing by a CD store and see if I can get any news that way.


 

Categories: Ramblings

A little while ago I noticed that the SAX dot NET project was announced on the XML-DEV mailing list. From the description on the project page:

SAX dot NET is a C# port of the original Java based SAX API specifications. When compiled into a .NET assembly it becomes available to the other .NET languages as well.

The .NET Framework doesn't ship with an implementation of a SAX push-model XML parser but instead ships with the pull-model parser in the form of the System.Xml.XmlReader class. The primary reasons for this can be gleaned from my article A Survey of APIs and Techniques for Processing XML where I list the pros and cons of various approaches for processing XML. The main advantages a pull-model XML parser like the XmlReader has over a push-model XML parser like SAX are:

Pull model parsers typically do not require a specialized class for handling XML processing since there is no requirement to implement specific interfaces or subclass certain classes for the purpose of registering callbacks. Also, the need to explicitly track application state using boolean flags and similar variables is significantly reduced when using a pull model parser.
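
For comparison, here is a small sketch of the pull model in action: the caller drives the parse with an XmlTextReader instead of registering SAX-style callbacks, so no handler class or state-tracking flags are needed. The file and element names are placeholders.

using System;
using System.Xml;

class PullModelExample
{
    static void Main()
    {
        XmlTextReader reader = new XmlTextReader("books.xml");
        while (reader.Read())
        {
            // Pull the pieces we care about directly instead of waiting for
            // a startElement/characters callback pair to fire.
            if (reader.NodeType == XmlNodeType.Element && reader.Name == "title")
            {
                Console.WriteLine(reader.ReadString());
            }
        }
        reader.Close();
    }
}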

I can understand that developers migrating to the .NET Framework from Java platforms or MSXML would like to have the familiar feel of the SAX API, so I definitely welcome such projects. However I have seen some criticism of the project from Daniel Cazzulino, a Microsoft XML MVP. In his post Do we need SAX for .NET? (or does Java ports to C# make sense?) he points out some of the disadvantages of blindly porting an API from one platform to another, as well as some inconsistencies and redundancies between SAX dot NET and the .NET Framework, such as:

  • There is an XmlNamespaces class that does the same thing as the System.Xml.XmlNamespaceManager class.

  • There are IAttributes AND IAttributes2, and the corresponding implementations called AttributesImpl and AttributesImpl2, which seem to imply interface versioning problems and legacy issues in a brand new project.

  • The existence of non-standard delegates such as OnPropertyChange(IProperty property, object newValue) instead of the typical pattern in the .NET world, which would be OnPropertyChange(object sender, PropertyChangeEventArgs e).

I think Daniel raises good points and I encourage any developer porting an API to the .NET Framework to endeavor to make it consistent with the patterns and naming conventions of the .NET Framework. Doing so makes it easier for developers to understand how to use the API since it will be familiar and contain few surprises.
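
For anyone unfamiliar with the convention Daniel is referring to, here is a sketch of the standard .NET event pattern: the delegate takes the sender plus an EventArgs-derived class rather than the changed values directly. The type and member names are illustrative.

using System;

public class PropertyChangeEventArgs : EventArgs
{
    public readonly string PropertyName;
    public readonly object NewValue;

    public PropertyChangeEventArgs(string propertyName, object newValue)
    {
        PropertyName = propertyName;
        NewValue = newValue;
    }
}

public delegate void PropertyChangeEventHandler(object sender, PropertyChangeEventArgs e);

public class Widget
{
    public event PropertyChangeEventHandler PropertyChange;

    protected virtual void OnPropertyChange(PropertyChangeEventArgs e)
    {
        if (PropertyChange != null)
            PropertyChange(this, e);  // sender first, EventArgs second
    }
}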


 

Categories: XML