In his blog post entitled Namespaces in Xml - the battle to explain, Steven Livingstone wrote

It seems that Namespaces is quickly displacing Xml Schema as the thing people "like to hate" - well at least those that are contacting me now seem to accept Schema as "good".

Now, the concept of namespaces is pretty simple, but because it happens to be used explicitly (and is a more manual process) in Xml people just don't seem to get it. There were two core worries put to me - one calling it "a mess" and the other "a failing". The whole thing centered around having to know what namespaces you were actually using (or were in scope) when selecting given nodes. So in the case of SelectNodes(), you need to have a namespace manager populated with the namespaces you intend to use. In the case of Schema, you generally need to know the targetNamespace of the Schema when working with the XmlValidatingReader. What the guys I spoke with seemed to dislike is that you actually have to know what these namespaces are. Why bother? Don't use namespaces and just do your selects or validation.

Given that I am to some degree responsible for both classes mentioned in the above post, XmlNode (where SelectNodes() comes from) and XmlValidatingReader, I feel compelled to respond.

The SelectNodes() problem is that people would like to evaluate XPath expressions over nodes without having to worry about namespaces. For example, given XML such as

<root xmlns="http://www.example.com">
 <child />
</root>

to perform a SelectNodes() or SelectSingleNode() that returns the <child> element requires the following code

  XmlDocument doc = new XmlDocument(); 
  doc.LoadXml("<root xmlns='http://www.example.com'><child /></root>"); 
  XmlNamespaceManager nsmgr = new XmlNamespaceManager(doc.NameTable); 
  nsmgr.AddNamespace("foo", "http://www.example.com");  //this is the tricky bit 
  Console.WriteLine(doc.SelectSingleNode("/foo:root/foo:child", nsmgr).OuterXml);   

whereas developers don't see why the code isn't something more along the lines of

  XmlDocument doc = new XmlDocument(); 
  doc.LoadXml("<root xmlns='http://www.example.com'><child /></root>"); 
  Console.WriteLine(doc.SelectSingleNode("/root/child").OuterXml);   

which would be the case if there were no namespaces in the document.

The reason the latter code sample does not work is that the select methods on the XmlDocument class conform to the W3C XPath 1.0 recommendation, which is namespace aware. In XPath, path expressions that match nodes based on their names are called node tests. A node test is a qualified name, or QName for short. A QName is syntactically an optional prefix and a local name separated by a colon. The prefix is supposed to be mapped to a namespace and is not used literally in matching the expression. Specifically, the spec states

A QName in the node test is expanded into an expanded-name using the namespace declarations from the expression context. This is the same way expansion is done for element type names in start and end-tags except that the default namespace declared with xmlns is not used: if the QName does not have a prefix, then the namespace URI is null (this is the same way attribute names are expanded). It is an error if the QName has a prefix for which there is no namespace declaration in the expression context.
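This rule is easy to see in the .NET implementation. In the following sketch, the unprefixed names in the path expression expand with a null namespace URI, so they fail to match elements that are in the document's default namespace:

```csharp
using System;
using System.Xml;

class DefaultNamespaceDemo
{
    static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<root xmlns='http://www.example.com'><child /></root>");

        // 'root' and 'child' have no prefix, so per XPath 1.0 they expand
        // with a null namespace URI and never match elements bound to the
        // default namespace http://www.example.com.
        Console.WriteLine(doc.SelectSingleNode("/root/child") == null); // prints True
    }
}
```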

There are a number of reasons why this is the case which are best illustrated with an example. Consider the following two XML documents

<root xmlns="urn:made-up-example">
 <child xmlns="http://www.example.com" />
</root>

<root>
 <child />
</root>

Should the query /root/child also match the <child> element in the above two documents as it does in the original document in this example? The three documents shown [including the first example] are completely different documents, and there is no consistent, standards-compliant way to match against them using QNames in path expressions without explicitly pairing prefixes with namespaces.

The only way to give people what they want in this case would be to come up with a proprietary version of XPath which was namespace agnostic. We do not plan to do this. However, I do have a tip for developers on how to reduce the amount of code it takes to write such queries. The following code matches the <child> element in all three documents and is fully conformant with the XPath 1.0 recommendation

XmlDocument doc = new XmlDocument(); 
doc.LoadXml("<root xmlns='http://www.example.com'><child /></root>"); 
Console.WriteLine(doc.SelectSingleNode("/*[local-name()='root']/*[local-name()='child']").OuterXml);  
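Note that local-name() deliberately ignores namespaces, so the query above would also match a <child> element from an entirely unrelated vocabulary. If you want to avoid an XmlNamespaceManager but still match namespaces correctly, you can pair local-name() with namespace-uri(), which is also plain XPath 1.0:

```csharp
using System;
using System.Xml;

class LocalNameWithNamespace
{
    static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<root xmlns='http://www.example.com'><child /></root>");

        // Test both the local name and the namespace URI so the query only
        // matches elements from the http://www.example.com vocabulary,
        // without registering any prefixes.
        string xpath =
            "/*[local-name()='root' and namespace-uri()='http://www.example.com']" +
            "/*[local-name()='child' and namespace-uri()='http://www.example.com']";
        Console.WriteLine(doc.SelectSingleNode(xpath).OuterXml);
    }
}
```

The tradeoff is verbosity; for documents that use several namespaces the XmlNamespaceManager approach is usually less error-prone.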

Now on to the XmlValidatingReader issue. Assume we are given the following XML instance and schema

<root xmlns="http://www.example.com">
 <child />
</root>

<xs:schema targetNamespace="http://www.example.com"
            xmlns:xs="http://www.w3.org/2001/XMLSchema"
            elementFormDefault="qualified">
       
  <xs:element name="root">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="child" type="xs:string" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>

The instance document can be validated against the schema using the following code

XmlTextReader tr = new XmlTextReader("example.xml");
XmlValidatingReader vr = new XmlValidatingReader(tr);
vr.Schemas.Add(null, "example.xsd");

vr.ValidationType = ValidationType.Schema;
vr.ValidationEventHandler += new ValidationEventHandler (ValidationHandler);

while(vr.Read()){ /* do stuff or do nothing */ }

As you can see, you do not need to know the target namespace of the schema to perform schema validation using the XmlValidatingReader. However, many code samples in our SDK specify the target namespace where I specified null above when adding schemas to the Schemas property of the XmlValidatingReader. When null is specified it indicates that the target namespace should be obtained from the schema. This would have been clearer if we'd had an overload of the Add() method which took only the schema, but we didn't. Hindsight is 20/20.
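For comparison, here is the same validation with the target namespace spelled out in the Add() call. Passing "http://www.example.com" is equivalent to passing null and letting the reader pick it up from the schema's targetNamespace attribute (this sketch assumes the example.xml and example.xsd files shown above):

```csharp
using System;
using System.Xml;
using System.Xml.Schema;

class ExplicitNamespaceValidation
{
    static void Main()
    {
        XmlTextReader tr = new XmlTextReader("example.xml");
        XmlValidatingReader vr = new XmlValidatingReader(tr);

        // Spelling out the target namespace; Add(null, "example.xsd") would
        // behave identically because null means "read it from the schema".
        vr.Schemas.Add("http://www.example.com", "example.xsd");

        vr.ValidationType = ValidationType.Schema;
        vr.ValidationEventHandler += new ValidationEventHandler(ValidationHandler);

        while (vr.Read()) { /* do stuff or do nothing */ }
    }

    static void ValidationHandler(object sender, ValidationEventArgs args)
    {
        Console.WriteLine("Validation error: " + args.Message);
    }
}
```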


 

Categories: XML

February 8, 2004
@ 10:15 PM

I noticed Gordon Weakliem reviewed ATOM.NET, an API for parsing and generating ATOM feeds. I went to the ATOM.NET website and decided to take a look at the ATOM.NET documentation. The following comments come from two perspectives, the first is as a developer who'll most likely have to implement something akin to ATOM.NET for RSS Bandit's internal workings and the other is from the perspective of being one of the folks at Microsoft whose job it is to design and critique XML-based APIs.

  • The AtomWriter class is superfluous. The class has only one method, Write(AtomFeed), which makes more sense on the AtomFeed class since an object should know how to persist itself. This is the model we followed with the XmlDocument class in the .NET Framework, which has an overloaded Save() method. The AtomWriter class would be quite useful if it allowed you to perform schema-driven generation of an AtomFeed, the same way the XmlWriter class in the .NET Framework is aimed at providing a convenient way to programmatically generate well-formed XML [although it comes close, it doesn't fully do this in v1.0 & v1.1 of the .NET Framework]

  • I have the same feelings about the AtomReader class; it also seems superfluous. The functionality it provides is akin to the overloaded Load() method we have on the XmlDocument class in the .NET Framework. I'd say it would make more sense and be more usable if this functionality were provided as a Load() method on the AtomFeed class rather than as a separate class, unless the AtomReader class actually gains some more functionality.

  • There's no easy way to serialize an AtomEntry class as XML, which means it'll be cumbersome using ATOM.NET for the ATOM API since it requires sending elements as XML over the wire. I use this functionality all the time in RSS Bandit internally, from passing entries as XML for XSLT themes to the CommentAPI to IBlogExtension.

  • There is no consideration for how to expose extension elements and attributes in ATOM.NET. As far as I'm concerned this is a deal breaker that makes ATOM.NET useless for aggregator authors, since it means they can't handle extensions in ATOM feeds even though they may exist and have already started popping up in various feeds.
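To make the first two suggestions concrete, here is a rough sketch of the API shape being argued for. These types and members are hypothetical illustrations of the XmlDocument-style Load()/Save() pattern, not the actual ATOM.NET API:

```csharp
using System.IO;
using System.Xml;

// Hypothetical sketch only: none of these members exist in ATOM.NET as
// reviewed; they illustrate how a feed object could persist itself.
public class AtomEntry
{
    // Lets callers serialize a single entry, e.g. for the ATOM API,
    // the CommentAPI, or an XSLT theme.
    public void WriteTo(XmlWriter writer)
    {
        /* emit <entry>...</entry> */
    }
}

public class AtomFeed
{
    // Mirrors XmlDocument.Load(): parse a feed from any reader,
    // replacing a separate AtomReader class.
    public void Load(TextReader reader)
    {
        /* parse the feed */
    }

    // Mirrors XmlDocument.Save(): the feed persists itself,
    // replacing a separate AtomWriter class.
    public void Save(TextWriter writer)
    {
        /* write the feed */
    }
}
```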


 

Categories: XML

February 8, 2004
@ 09:37 PM

Lots of people seem to like the newest version of RSS Bandit. The most recent praise was the following post by Matt Griffith

I've been a Bloglines user for almost a year. I needed a portable aggregator because I use several different computers. Then a few months ago I got a TabletPC. Now portability isn't as critical since I always have my Tablet with me. I stayed with Bloglines though because none of the client-side aggregators I tried before worked for me.

I just downloaded the latest version of RSS Bandit. I love it. It is much more polished than it was the last time I tried it. Combine that with the dasBlog integration and the upcoming SIAM support and I'm in hog heaven. Thanks Dare, Torsten, and everyone else that helped make RssBandit what it is.

Also it seems that at least one user liked RSS Bandit so much that he [or she] was motivated to write an article on Getting Started with RSS Bandit. Definitely a good starting point and something I wouldn't mind seeing become part of the official documentation once it's been edited and more details fleshed out.

Sweet.


 

Categories: RSS Bandit

A few weeks ago during the follow up to the WinFX review of the System.Xml namespace of the .NET Framework it was pointed out that our team hadn't provided guidelines for exposing and manipulating XML data in applications. At first, I thought the person who brought this up was mistaken but after a cursory search I realized the closest thing that comes to such a set of guidelines is Don Box's MSDN TV episode entitled Passing XML Data Inside the CLR. As good as Don's discussion is, a video stream isn't as accessible as a written article. In tandem with coming up with some of the guidelines for utilizing XML in the .NET Framework for internal purposes I'll put together an article based on Don's MSDN TV episode with an eye towards the next version of the .NET Framework.

If you watched Don's talk and had any questions about it or require any clarifications respond below so I can clarify them in the article I plan to write.


 

Categories: XML

February 8, 2004
@ 08:59 PM

Dave Winer is going to be giving a talk at Microsoft Research tomorrow. Robert Scoble is organizing a lunch before the talk with Dave and some folks at MSFT. I may or may not make it since my mom's visiting from Nigeria and I was planning to take most of the week off. Just in case I miss it, there is one thing I'd like Dave to know; most of the problems in the XML-based website syndication space could have been solved if he didn't act as if, once he wrote a spec or code for the Radio Userland aggregator, it was impossible to change. Most of the supposed “problems” with RSS would take 30 minutes to fix in the spec and about a day to fix in the Radio Userland codebase (I'm making assumptions here based on how long it would take in the RSS Bandit codebase). Instead he stonewalled, and now we have the ATOM mess. Of course, we'd still need something like the ATOM effort to bring the blogging APIs into the 21st century, but we wouldn't have to deal with incompatibilities at the website syndication level as well.

 

In a recent blog post Dave mentions that his MSR talk will mainly be about the themes from his article Howard Dean is not a soap bar. I don't really have an opinion on the content one way or the other, but I did dislike the way he applies selective memory to prove a point, specifically

In the lead-up to the war in Iraq, for some reason, people who were against the war didn't speak.

Maybe they didn't speak on the East Coast, but there was a very active anti-war movement on the West Coast, especially in the Seattle area. Actually, they did speak out on the East Coast as well; in fact, hundreds of thousands of voices all over the US and all over the world spoke out.

It makes me view the “blogs are the second coming” hype with suspicion when its boosters play fast and loose with the facts to sell their vision.


 

Categories: Life in the B0rg Cube

I've seen a lot of the hubbub about Janet Jackson's "costume reveal" at the Superbowl and tend to agree with Dan Gillmor that it was just one in a series of classless and crass things about the Superbowl. However, I've learned something new about Janet Jackson that I didn't know before this incident: she's dating Jermaine Dupri. Now I'm not one to knock someone's lifestyle choices, but come on, Jermaine Dupri? That's almost as stunning as finding out that Whitney Houston ended up with Bobby “3 baby mamas” Brown.


 

Categories: Ramblings

February 8, 2004
@ 06:02 PM

I stumbled on a link to the MSN Search beta site while reading about Google cancelling its Spring IPO. What I find particularly interesting is that it seems to use algorithms very similar to Google's, given that it falls for Google bombs as well. For example, check out the results for miserable failure and litigious bastards. In fact, the results are so similar that at first I thought it was a spoof site that was just calling out to Google in the background.


 

February 7, 2004
@ 11:28 PM

Yesterday I wrote an entry about the fact that, given Microsoft is such a big player in the software market, there is the perception [whether correct or incorrect] that once Microsoft enters a technology or product space, smaller players in the market will lose out as customers migrate or Microsoft outmarkets/outspends/outcompetes them. The post also dwelled on a related topic: the perception that Microsoft is fond of vaporware announcements to hinder progress in prospective markets.

I deleted the post after it had been on my blog for about 5 minutes. I wasn't happy with the quality of the writing and didn't feel I properly expressed my thoughts. However, just the process of writing stuff down made me feel better. Having seen the effects of Microsoft entering smaller markets on existing participants at a personal level (at least one person has claimed that the fact that I created EXSLT.NET lost him business) as well as at the product unit level (the various technologies the SQL Server Product Unit comes up with, from ADO.NET to SQL Server to core XML technologies), there were various thoughts bubbling within me and writing them down helped me understand and then come to grips with them.

I definitely need to get a personal journal. I'd have loved to read that post one, five and ten years from now. However it wasn't really for public consumption.


 

Categories: Life in the B0rg Cube

February 7, 2004
@ 11:10 PM
  1. Autodiscovering feeds as you browse the Web. Every link to a feed found on the web pages you browse to from RSS Bandit is available in a handy drop down


  2. Unread items folder

     

Categories: RSS Bandit

February 7, 2004
@ 11:00 PM

This is a bugfix release. Differences between v1.2.0.89 and v1.2.0.90 below.

  • FIXED: RSS Bandit icons don't show up in shortcuts on desktop or in start menu.

  • FIXED: Search Folders can now also be saved without specifying a search expression (e.g. unread items only).

  • FIXED: Posts in Unread Items folders look like gibberish.

  • FIXED: RSS Bandit no longer tries to convert HTML in feeds to XHTML. This means a large number of feed errors about undeclared namespaces and the like should no longer appear.

  • FIXED: The [Next Unread Item] button now iterates through posts in the selected Search Folder, if there are unread items.

  • FIXED: Sometimes an exception is thrown if the [Next Unread Item] button is pressed while comments for an item are being downloaded.

  • FIXED: Tree view flickers when the application is loaded.


Categories: RSS Bandit