December 31, 2003
@ 04:46 PM

From ThinkGeek

Skillset Exportable
Insufficient ROI
Office of Employee Termination and Overseas Outsourcing

Definitely wouldn't mind rocking this around the B0rg cube.


December 31, 2003
@ 04:41 PM

Joi Ito recently added a link to a CSS stylesheet to the content in his RSS feed. This broke a number of news aggregators because his stylesheet clashed with whatever styles were being used by the various client aggregators. As Sam Ruby points out, RSS Bandit strips out such tags completely so we don't have this problem.

We started stripping certain [X]HTML tags for security reasons after I read Mark Pilgrim's article on "How To Consume RSS Safely". Since then I've recanted on stripping certain tags now that we use the browser's security settings to decide whether to load ActiveX controls, execute JavaScript or even load external images. However I still plan to strip style tags because RSS Bandit's XSLT themes would render quite hideously if we loaded CSS stylesheets defined in the feed in combination with them. Just imagine what would happen if I combined the style definitions in random feeds with RSS Bandit's Outlook 2003 theme, Halloween theme, or Unwise Terminal theme. Ugh.
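RSS Bandit itself is written in C#, but the idea is simple enough to sketch in a few lines of Python. The regular expressions below are my own illustration of the approach, not RSS Bandit's actual implementation:

```python
import re

# Remove <style> blocks and inline style attributes from feed content
# before handing it to the aggregator's own XSLT theme.
STYLE_BLOCK = re.compile(r"<style\b[^>]*>.*?</style>", re.IGNORECASE | re.DOTALL)
STYLE_ATTR = re.compile(r'\sstyle\s*=\s*"[^"]*"', re.IGNORECASE)

def strip_styles(html):
    """Strip <style> elements and style="" attributes from a chunk of HTML."""
    html = STYLE_BLOCK.sub("", html)
    return STYLE_ATTR.sub("", html)

print(strip_styles('<p style="color:red"><style>p { color: blue }</style>Hello</p>'))
# <p>Hello</p>
```

A real implementation would work on the parsed HTML rather than raw text, but the effect is the same: whatever styles the feed author ships never reach the reader's chosen theme.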



Categories: RSS Bandit

Oleg Tkachenko writes

The goals of exposing comments are: enabling arbitrary RSS reader applications to see comments made to blog items and to post new comments. There are several facilities developed by the RSS community which allow achieving these goals:

  1. <slash:comments> RSS 2.0 extension element, which merely contains the number of comments made to the specified blog item.
  2. RSS 2.0 <comments> element, which provides the URI of the page where comments can be viewed and added (it's usually something like http://yourblog/cgi-bin/mt-comments.cgi?entry_id=blog-item-id in MT blogs).
  3. <wfw:commentRss> RSS 2.0 extension element, which provides the URI of a comment feed per blog item (to put it another way - returns comments made to the specified blog item as an RSS feed).
  4. <wfw:comment> RSS 2.0 extension element, which provides the URI for posting comments via the CommentAPI.

It works like a charm. Now users of SharpReader and RSS Bandit can view the comments to posts in Oleg's MovableType blog directly in their aggregator. Posting comments from RSS Bandit works as well. Hopefully, this will catch on and folks will no longer have to choose between .TEXT and dasBlog (i.e. ASP.NET/Windows based blogging tools) when they want a blog tool that supports exposing comments in their RSS feed. The more the merrier.
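To make the four facilities concrete, here is what a hypothetical RSS 2.0 <item> using all of them might look like (the URLs and comment count are invented for illustration):

```xml
<item>
  <title>Exposing comments in RSS</title>
  <link>http://example.com/blog/42</link>
  <guid>http://example.com/blog/42</guid>
  <!-- 1. number of comments made to this item -->
  <slash:comments>3</slash:comments>
  <!-- 2. page where comments can be viewed and added -->
  <comments>http://example.com/cgi-bin/mt-comments.cgi?entry_id=42</comments>
  <!-- 3. the comments themselves, exposed as an RSS feed -->
  <wfw:commentRss>http://example.com/blog/42/comments.rss</wfw:commentRss>
  <!-- 4. endpoint for posting comments via the CommentAPI -->
  <wfw:comment>http://example.com/blog/42/comments</wfw:comment>
</item>
```

The slash: and wfw: prefixes would of course need the usual namespace declarations on the feed's root element.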


Categories: Technology

I've written the first draft of the specification for the "feed" URI scheme. From the abstract

This document specifies the "feed" URI (Uniform Resource Identifier) scheme for identifying data feeds used for syndicating news or other content from an information source such as a weblog or news website. In practice, such data feeds will most likely be XML documents containing a series of news items representing updated information from a particular news source.

The purpose of this scheme is to enable one click subscription to syndication feeds in a straightforward, easy to implement and cross platform manner. One click subscription using the "feed" URI scheme is currently supported by NetNewsWire, Shrook, SharpReader and RSS Bandit. The author of NewsGator has indicated that support for one click subscription using the "feed" URI scheme will exist in the next version.

Any feedback on the draft specification would be appreciated.
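As a sketch of what one click subscription looks like from the aggregator's side, here is how a client might map a "feed" URI back to the URL it should actually fetch. Both forms handled below are assumptions based on usage in the wild, not normative text from the draft:

```python
def feed_uri_to_url(uri):
    """Map a 'feed' URI to the URL an aggregator should fetch.

    Handles the two forms seen in the wild:
      feed://example.com/rss.xml
      feed:http://example.com/rss.xml
    This is a sketch of one plausible interpretation, not the spec.
    """
    if uri.startswith("feed://"):
        return "http://" + uri[len("feed://"):]
    if uri.startswith("feed:"):
        return uri[len("feed:"):]
    raise ValueError("not a 'feed' URI: %r" % uri)

print(feed_uri_to_url("feed://example.com/rss.xml"))
# http://example.com/rss.xml
```

An aggregator registers itself as the handler for the "feed" scheme with the operating system, so clicking such a link in the browser hands the URI straight to the news reader.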

Update: Graham Parks has pointed out in the comments to this post that URIs of the form "feed://" are not compliant with RFC 2396. This will be folded into the next draft of the spec.


Categories: Technology

Chris Sells recently complained that an interview of Don Box by Mary Jo Foley is "a relatively boring interview" because "Mary Jo doesn't dig for any dirt and Don doesn't volunteer any". He's decided to fix this by proposing an alternate interview where folks send in their favorite questions and he picks the 10 best and forwards them to Don (kinda like Slashdot interviews). Chris offers some seed questions but they are actually much lamer than any of the ones Mary Jo asked, so I suspect his idea of questions that dig for dirt is different from mine.

I drafted 10 questions and picked the 3 least controversial for my submissions to the Don Box interview pool.

  1. People often come up with euphemisms for an existing word or phrase that has become "unpleasant" which, although technically meaning something different from the previous terminology, are used interchangeably with it. A recent example of this is the replacement of "black" with "African American" in the modern American lexicon when describing people of African descent.

    I suspect something similar has happened with XML Web Services and Service Oriented Architecture. Many seem to think that the phrases are interchangeable when on the surface it seems the former is just one instance of the latter. To you what is the difference between XML Web Services and Service Oriented Architectures?

  2. For a short while you were active in the world of weblogging technologies, you tried to come up with an RSS profile and were working on a blogging tool with Yasser Shohoud and Martin Gudgin. In recent times, you have been silent about these past activities. What sparked your interest in weblogging technologies and why does that interest seem to have waned?

  3. What team would you not want to work for at Microsoft and why?

These were my tame questions but I get to hang with Don sometime this week so I'll ask him some of the others in person. I hope one of my questions gets picked by Chris Sells.


Categories: Life in the B0rg Cube | XML

Where else do you get to see movie clips of illustrious American celebrities in ads for household products they wouldn't be caught doing in the United States? Japander, of course. The front page of the website reads

Pander:n., & v.t. 1. go-between in clandestine amours, procurer; one who ministers to evil designs. 2 v.i. minister (to base passions or evil designs, or person having these)

Japander:n.,& v.t. 1. a western star who uses his or her fame to make large sums of money in a short time by advertising products in Japan that they would probably never use. ~er (see synecure, prostitute) 2. to make an ass of oneself in Japanese media.

The clips are all crazy weird, from Arnold Schwarzenegger pimping energy drinks and cup o' noodles to Mel Gibson, Antonio Banderas & Kevin Costner as Subaru pitchmen. I probably spent 30 minutes marvelling at the ads on the site; I definitely never thought I'd ever see Harrison Ford doing beer commercials. Definitely entertaining stuff.


Just stumbled on the following article entitled So, Scrooge was right after all

Conventional economics teaches that gift giving is irrational. The satisfaction or "utility" a person derives from consumption is determined by their personal preferences. But no one understands your preferences as well as you do.

So when I give up $50 worth of utility to buy a present for you, the chances are high that you'll value it at less than $50. If so, there's been a mutual loss of utility. The transaction has been inefficient and "welfare reducing", thus making it irrational. As an economist would put it, "unless a gift that costs the giver p dollars exactly matches the way in which the recipient would have spent the p dollars, the gift is suboptimal".

The big problem I've always had with economics as it was taught in school is that the fundamental assumption underlying it is that humans make rational decisions when buying and selling goods and services. This is simply not true. The above example is a good one; it would make more sense for everyone involved in the annual gift exchange that is Christmas if people just gave checks and gift certificates instead of buying gifts that the recipients don't want or don't need. Yet this isn't how Christmas gift giving is done in most cases. Then there's the entire field of advertising with its concept of lifestyle ads, which are highly successful and are yet another example that human buying decisions aren't steeped in rationality.

What a crock...


December 27, 2003
@ 09:55 PM

An article in the Economist lets us know that research has confirmed that men lose their fiscal prudence in the presence of attractive women

Over 200 young men and women participated in the study, which was divided into three parts. In the first, the participants were asked to respond to nine specific choices regarding potentially real monetary rewards. (At the end of the session, they could roll dice to try to win one of their choices, which would be paid by an appropriately post-dated cheque issued by the university.) In each case, a low sum to be paid out the next day was offered against a higher sum to be paid at a specified future date. Individual responses were surprisingly consistent, according to Dr Wilson, so the “pre-experiment” threshold of each participant was easy to establish.

The volunteers were then asked to score one of four sets of pictures for their appeal: 12 attractive members of the opposite sex; 12 non-lookers; 12 beautiful cars; or 12 unimpressive cars. Immediately after they had seen these images, they were given a new round of monetary reward choices.

As predicted, men who had seen pictures of pretty women discounted the future more steeply than they had done before—in other words, they were more likely to take the lesser sum tomorrow. As Dr Wilson puts it, it was as though a special “I-want-that-now” pathway had been activated in their brains. After all, the money might come in handy immediately. No one else was much affected. (Women did seem to be revved up by nice cars, a result the researchers still find mystifying. But the statistical significance of this finding disappeared after some routine adjustments, and in any case previous work has suggested that women are more susceptible to displays of wealth than men are.)

I guess this explains Abercrombie & Fitch's "alleged" hiring practices. It's always interesting to see stuff you've long taken for granted backed up by research, especially observing how the experiments are conducted.


December 27, 2003
@ 07:37 PM

Slashdot has posted a link to Eric Sink's "Make More Mistakes" article on MSDN. One of the anecdotes from the article reads as follows

Circa 1998, betting on Java for a graphical user interface (GUI) application was suicidal. The platform simply didn't have the maturity necessary for building quality user interfaces. I chose Java because I was "head over heels" in love with it. I adored the concept of a cross-platform C-like language with garbage collection. We were hoping to build our Java expertise and make this exciting new technology our specialty.

But Java turned out to be a terrible frustration. The ScrollPane widget did a lousy job of scrolling. Printing support routinely crashed. The memory usage was unbelievably high.

I should have gotten over my religious devotion to semicolons and done this app in Visual Basic.

Lesson learned: Be careful about using bleeding-edge technologies.

There are some on Slashdot who think that Eric learned the wrong lesson from that experience. This post entitled Alteration of rule is from a developer who was in a similar circumstance as Eric but had a different outcome

I built a Java/Swing app around the same time. It was a pretty complex user app, not just a simple program - and we managed to completely satisfy the clients and make the program perform acceptably on a very low-end target platform (PII-133 with 32 MB of memory). For what he described (replacing a complex spreadsheet) he should have been able to complete the task.

Why did our app work and his fail? Because we knew Java and Swing well by that point, and knew what was possible with some time spent optimizing. We had a plan in our head for how to reach a target level of performance that would be accepted and more than met that goal.

The lesson he should have learned was "Know your technology well before you embark on a project". The reason why it's so important to learn THAT lesson is that it applies to any project, not just ones using "bleeding edge" technologies. The only difference between an established and bleeding edge technology is the level of support you MIGHT be able to find. And that is not enough of a difference to totally affect either failure or success.

I tend to agree with the Slashdot post. Learning on the job is fine and all but when big bucks are on the line it's best to go with what you are familiar with, especially if it is tried and tested.


Categories: Ramblings

December 27, 2003
@ 07:29 PM

I was talking to some friends of mine over Christmas who happen to still be in college and I learned about a drinking game they played in their sorority house that actually requires sports equipment and hand-eye coordination. I am speaking of Beer Pong, whose description reads

The materials you need for this game are some cups of beer, a ping pong ball, and a table. The game is best played with either two people or two teams of two.

Arrange the cups of beer on either side of the table like you are setting up bowling pins. You should have at least six cups on both sides. Each team takes a turn by trying to get the ping pong ball into the other team's cups. If they succeed the other teams must drink that cup. The cup is then removed and the rest of the cups are rearranged so that they are close to each other. Each team alternates turns like this.

When all the cups on one side have been cleared, the team that cleared them wins and the other team must finish any cups remaining on the winning team's side.

My friends and I would have loved this game. I wonder what else I missed out on by going to a geeky technical college and shunning the few social activities that occurred on campus.

Categories: Ramblings

December 26, 2003
@ 04:07 PM

Mark Pilgrim's most recent entry in his RSS feed contains the following text

The best things in life are not things. (11 words)

Note: The "dive into mark" feed you are currently subscribed to is deprecated. If your aggregator supports it, you should upgrade to my Atom feed, which includes both summaries and full content.

A lot of the ATOM vs. RSS discussion has been mired in childishness and personality conflicts, with the main proponents of ATOM claiming that the creation of the ATOM syndication format will be a good thing for users of syndication and blogging software. Now let's pretend this is true and that the only people who have to bear the burden are aggregator authors like me, who now have to add support for yet another syndication format. Let's see what my users get out of ATOM feeds compared to RSS feeds.

  1. Mark Pilgrim's ATOM feed: As I write this his feed contains the following elements per entry: id, created, issued, modified, link, summary, title, dc:subject and content. The aforementioned elements are equivalent to guid, pubDate, issued, modified, link, description, title, dc:subject and content:encoded/xhtml:body that exist in RSS feeds today. In fact an RSS feed with those elements and Mark Pilgrim's feed will be treated identically by RSS Bandit. The only problematic piece is that his feed contains three dates that express when the entry was issued, when it was modified and when it was created. Most puzzling is that the issued date is before its created date. I have no idea what this distinction means and quite frankly I doubt many people will care.

    Basically, it looks like Mark Pilgrim's ATOM feed doesn't give users anything they couldn't get from an equivalent RSS feed, except the fact that they have to upgrade their news aggregators and deal with potential bugs in the implementations of these features [because there are always bugs in new features].
  2. LiveJournal's ATOM feeds: As I write this a sample feed from LiveJournal (in this case Jamie Zawinski's) contains the following elements per entry: id, modified, issued, link, title, author and content. The aforementioned elements are equivalent to guid, modified, issued, link, title, author/dc:author and content:encoded/xhtml:body. Comparing this feed to Mark Pilgrim's I already see a bunch of ambiguity, which supposedly is not supposed to exist since what ATOM supposedly gives consumers over RSS is that it will be better defined and less ambiguous. How are news aggregators supposed to treat the three date types defined in ATOM? In RSS I could always use the pubDate or dc:date; now I have to figure out which of <modified>, <issued> or <created> is the most relevant one to show the user. Another point: what do I do if a feed contains both <content rel="fragment"> and a <summary>? Which one do I show the user?
  3. Movable Type's ATOM feeds: As I write this the MovableType ATOM template contains the following elements: id, modified, issued, link, title, author, dc:subject, summary and content. The aforementioned elements are equivalent to guid, modified, issued, link, title, author/dc:author, dc:subject, description and content:encoded/xhtml:body. Again, besides the weirdness with dates (and I suspect RSS Bandit will end up treating <modified> as equivalent to <pubDate>) there isn't anything users get from the ATOM feed that they don't get from the equivalent RSS feed. Interesting; I'd expected that at least one of the first 3 sample ATOM feeds I took a look at would show me why it was worth spending a weekend or more implementing ATOM support in RSS Bandit.
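The comparisons above boil down to a mapping table. This is a sketch of the normalization an aggregator might perform when consuming these feeds; the date handling in particular is my own guess, since (as discussed above) ATOM itself doesn't say which of the three dates matters:

```python
# Sketch: normalizing ATOM 0.3 entry element names onto their rough RSS
# equivalents, following the feed comparisons above. Not RSS Bandit's code.
ATOM_TO_RSS = {
    "id": "guid",
    "created": "pubDate",    # guess: treat <created> like <pubDate>...
    "modified": "pubDate",   # ...or <modified>; the spec leaves this ambiguous
    "link": "link",
    "title": "title",
    "author": "dc:author",
    "summary": "description",
    "content": "content:encoded",
    "dc:subject": "dc:subject",
}

def normalize_entry(atom_entry):
    """Map an ATOM entry (dict of element name -> text) to RSS-style names."""
    return {ATOM_TO_RSS.get(name, name): value
            for name, value in atom_entry.items()}

print(normalize_entry({"id": "tag:example,2003:1", "summary": "hi"}))
# {'guid': 'tag:example,2003:1', 'description': 'hi'}
```

The fact that such a near-trivial mapping exists is exactly the point: the user sees nothing new.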

The fundamental conceit of the ATOM effort is that its proponents think writing specifications is easy. Many of them deride RSS for being ambiguous and not well defined, yet they are producing a more complex specification with more significant ambiguities in it than I've seen in RSS. I actually have a mental list of significant issues with ATOM that I haven't even posted yet; the ones I mentioned above were just from glancing at the aforementioned feeds. My day job involves reading or writing specs all day. Most of the specs I read were produced either by the W3C or by folks within Microsoft. Every one of them contains contradictions and ambiguities, and lacks crucial information for determining behavior in edge cases. Some are better than others but none of them are ever well-defined enough. Every spec has errata.

The ATOM people seem to think that if a simple spec like RSS can have ambiguities, they can fix them with a more complex spec. Anyone who actually does this stuff for a living will tell you that just leads to more complex ambiguities to deal with, not less.

I wish them luck. As I implement their spec I at least hope that some of these ATOM supporters get a clue and actually use some of the features of ATOM that RSS users have enjoyed for a while but that are lacking in all of the feeds I linked to above, such as the ATOM equivalent to wfw:commentRss. It's quite irritating to be able to read the comments to any .TEXT or dasBlog weblog in my news aggregator but then have to navigate to the website when I'm reading a Movable Type or LiveJournal feed to see the comments.


Categories: XML

In a post entitled A Plea to Microsoft Architects, Michael Earls writes

This post is in response to a post by Harry Pierson over at DevHawk...

It is abundantly frustrating to be keeping up with you guys right now.  We out here in the real world do not use Longhorn, do not have access to Longhorn (not in a way we can trust for production), and we cannot even begin to test out these great new technologies until version 1.0 (or 2.0 for those that wish to stay sane)...My job is to work on the architecture team as well as implement solutions for a large-scale commercial website using .NET.  I use this stuff all day every day, but I use the  1.1 release bits.

Here's my point, enough with the "this Whidbey, Longhorn, XAML is so cool you should stop whatever it is you are doing and use it".  Small problem, we can't.  Please help us by remembering that we're still using the release bits, not the latest technology... Oh yeah, we need more samples of current bits and less of XAML.

Remember, we're your customers and we love this new technology, but we need more of you to focus CURRENT topics on CURRENT RELEASE bits.  I don't want to read about how you used XAML and SOA to write a new version of the RSS wheel.  The RSS I have now is fine (short of the namespace that Harry mentions).  Leave it alone.

The only folks at Microsoft with Architect in their job title that blog that I can think of are Don, Chris Anderson and Chris Brumme, so I assume Michael is complaining about one or more of these three, although there may be other software architect bloggers at Microsoft that I am unaware of. The first point I'd note is that most people who blog at Microsoft do so without any official direction, so they blog about what interests them and what they are working on, not what MSDN, PSS or our documentation folks think we need more public documentation and guidance around. That said, architects at Microsoft usually work on next generation technologies since their job is to guide and supervise their design, so it is to be expected that when they blog about what they are working on it will be about next generation stuff. The people who work on current technologies and are most knowledgeable about them are the Program Managers, Developers and Testers responsible for the technology, not the architects who oversee and advise their design.

My advice to Michael would be that he should broaden his blog horizons and consider reading some of the hundreds of other Microsoft bloggers, many of whom blog about current technologies, instead of focusing on the folks who are designing stuff that'll be shipping in two or more years and complaining when they blog about said technologies.

This isn't to say I disagree with Michael's feedback; in fact, being a firm believer in Joel Spolsky's Mouth Wide Shut principle, I agree with most of it (except for the weird bit about blogging about next generation stuff increasing the perception that Microsoft is a monopoly). However, he and others like him should remember that most of us blogging are just talking about what we're working on, not trying to give people "version envy" because we get to run the next version of the .NET Framework or Windows years before they ship.

I have no idea how Chris Anderson, Don Box and other Microsoft architect bloggers will react to Michael's feedback but I hope they take some of it to heart.

[Update: Just noticed another Microsoft blogger with "architect" in his job title, Herb Sutter. Unsurprisingly he also blogs about the next release of the product he works on not current technology.]

Categories: Life in the B0rg Cube

December 24, 2003
@ 05:09 AM

Joshua Allen writes

Before discussing qnames in content, let's discuss a general issue with qnames that you might not have known about. Take the following XML:

<?xml version="1.0" ?>
<root xmlns:p="urn:a">
  <p:elem att1="" att2="" ... />
  <p:elem att1="" att2="" ... xmlns:p="urn:b" />
  <x:elem att1="" att2="" xmlns:x="urn:a" />
</root>

Notice the first two elements, both ostensibly named "p:elem", but if we treat the element names as opaque strings, we'll get confused and think the elements are the same. Luckily, we have this magical thing called a qname that uses the namespace instead of the prefix, and so we can note that the two element names are actually "{urn:a}elem" and "{urn:b}elem" -- different. By the same token, if we compare the first and third elements using opaque strings, we think that they are different ("p:elem" and "x:elem"). But if we look at the qnames, we see they are both "{urn:a}elem".

So what is the big deal for qnames in content? Look at the following XML:

<?xml version="1.0" ?>
<root xmlns:x="urn:x" xmlns:p="urn:p">
  <p:elem>here is some data: with a colon for no good reason</p:elem>
  <p:elem>x:address</p:elem>
  <p:elem xmlns:x="urn:y">x:address</p:elem>
</root>

Now, do the last two "p:elem" elements contain the same text, or different text?  If you compared using XSLT or XPath, what would be the result?  How about if you used the values in XSD key/keyref?  The answer is that XSLT and XPath have no way of knowing that you intend those last two elements to be qnames, so they will treat them as opaque strings.  With XSD, you could type the node as qname... Most APIs are smart enough to inject namespace declarations if necessary, so the first node would write correctly as:

<p:elem xmlns:p="urn:p">here is some data: with a colon for no good reason</p:elem>

But, since the DOM has no idea that you stuffed a qname in the element content, it's got no way to know that you want to preserve the namespace for x:

<p:elem xmlns:p="urn:p">x:address</p:elem>

There is really only one way to get around this, and this is for any API which writes XML to always emit namespace declarations for all namespaces in scope, whether they are used or not (or else understand enough about the XSD and make some guesses).  Some APIs do this, but it is not something that all APIs can be trusted to do, and it yields horribly cluttered XML output and other problems.
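The element-name half of Joshua's point is easy to verify with any namespace-aware XML API. Here is a small sketch in Python (the namespace URIs are invented) showing that a parser expands element names into real qnames, so prefixes stop mattering:

```python
import xml.etree.ElementTree as ET

doc = """<root xmlns:p="urn:a">
  <p:elem/>
  <p:elem xmlns:p="urn:b"/>
  <x:elem xmlns:x="urn:a"/>
</root>"""

first, second, third = ET.fromstring(doc)
print(first.tag)   # {urn:a}elem
print(second.tag)  # {urn:b}elem -- same prefix as the first, different qname
print(third.tag)   # {urn:a}elem -- different prefix, same qname as the first
```

The parser does this for element and attribute names only; as the quote goes on to show, it cannot do the same for qnames hiding in text content.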

Joshua has only scratched the surface of the real problem, which is that there is no standard way to write out an XML infoset with the PSVI contributions added during validation. In plain English, there is no standard way to write out an XML document that has been validated using W3C XML Schema containing all the relevant type annotations plus other infoset augmentations. In the above example, the fact that the namespace declaration that uses the "x" prefix is not included in the output is not as significant as the fact that there is no way to tell that the type of p:elem's content is the xs:QName type.

However this doesn't change the fact that using QNames in content in an XML vocabulary is a bad idea. Specifically I am talking about using the xs:QName type in your vocabulary.  The semantics of this type are so absurd it boggles the mind. Below is the definition from the W3C XML Schema recommendation

[Definition:]   QName represents XML qualified names. The ·value space· of QName is the set of tuples {namespace name, local part}, where namespace name is an anyURI and local part is an NCName. The ·lexical space· of QName is the set of strings that ·match· the QName production of [Namespaces in XML].

This basically says that text content of type xs:QName in an XML document, such as "x:address", actually is a namespace name/local name pair such as "{urn:x}address". This instantly means that you cannot interpret this type without carrying around some sort of context (i.e. a list of namespace name<->prefix bindings), which makes it different from most other types defined in the W3C XML Schema recommendation because it has no canonical lexical representation. A value such as "x:address" is meaningless without knowing what XML document it came from and specifically what the namespace binding for the "x" prefix was at that particular scope.
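The context dependence is easy to demonstrate with a generic XML API. In this Python sketch (namespace URIs invented), the element name comes back as a full qname, but the qname stuffed into the content comes back as an opaque string with its prefix binding lost:

```python
import xml.etree.ElementTree as ET

doc = """<root xmlns:p="urn:p">
  <p:elem xmlns:x="urn:y">x:address</p:elem>
</root>"""

elem = ET.fromstring(doc)[0]
print(elem.tag)   # {urn:p}elem -- the element *name* is expanded for us
print(elem.text)  # x:address  -- the content is just a string; urn:y is gone
```

Nothing in the parsed result records that "x" was bound to "urn:y", which is exactly why the value cannot be interpreted once it leaves the document.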

Of course, the existence of the QName type means you can do interesting things like use a different prefix for a particular namespace in the schema than you use in the XML instance. You can specify that the content of the <p:elem> element should be one of a:address or a:location but have x:address in the instance, which would be fine if the "a" prefix is bound to a given namespace in the schema and "x" is bound to the same namespace in the instance document. You can also ask interesting questions such as: what happens if I have a default value that is of type xs:QName but there is no namespace declaration for the namespace name at that scope? Does this mean that not only should a default value be inserted as the content of an element or attribute but also that a namespace declaration should be created at the same scope if one does not exist?

Fun stuff, not.


Categories: XML

Shannon J Hager writes

Jeff Key wants to end default buttons on Focus-Stealing Dialogs but I think the problem is bigger than that. I don't think ANYTHING should be able to steal my focus while I'm typing. I have ranted about this before, both in places where it could help (emails with MS employees) and in places where it can't (certain blogs). Not only is it annoying to suddenly find myself typing in an IM conversation with someone on AOL when less than half a word ago I was typing an invoice for a client, it is DANGEROUS for programs to be able to steal focus like this

I agree. I didn't realize how much applications that steal focus irritate me until I used a friend's iBook running Mac OS X which, instead of having applications steal your focus, has them try to get your attention by hopping around at the bottom of the screen. I thought it was cute and a lot less intrusive than finding myself typing in a different window because some application decided that it was so important that it was going to interrupt whatever I was doing.

An operating system that enforces application politeness. Sweet.


Choosing a name for a product or software component that can stand the test of time is often difficult, and can be a source of confusion for users of the software if its usage outgrows that implied by its name. I have examples from both my personal life and my professional life.

RSS Bandit

When I chose this name I never considered that there might one day be another popular syndication format (i.e. ATOM) which I'd end up supporting. Given that Blogger, Movable Type, and LiveJournal are going to provide ATOM feeds and utilize the ATOM API for weblog editing/management, it is a foregone conclusion that RSS Bandit will support ATOM once the specifications are in slightly less flux, which should be in the next few months.

Once that happens the name "RSS Bandit" will be an anachronism given that RSS will no longer be the only format supported by the application. In fact, the name may become a handicap in the future once ATOM becomes popular because there is the implicit assumption that I support the "old" and "outdated" syndication format, not the "shiny" and "new" syndication format.


In version 1.0 of the .NET Framework we shipped three classes that acted as in-memory representations of an XML document

  1. XmlDocument - an implementation of the W3C Document Object Model (DOM) with a few .NET specific extensions [whose functionality eventually made it into later revisions of the spec]
  2. XmlDataDocument - a subclass of the XmlDocument which acts as an XML view of a DataSet
  3. XPathDocument - a read-only in-memory representation of an XML document which conforms to the XPath data model as opposed to the DOM data model upon which the XmlDocument is based. This class primarily existed as  a more performant data source for performing XSLT transformations and XPath queries

Going forward, various limitations of all of the above classes meant that we came up with a fourth class which we planned to introduce in Whidbey. After an internal review we decided that it would be too confusing to add yet another in-memory representation of an XML document to the mix and decided to instead improve on the ones we had. The XmlDataDocument is really a DataSet specific class so it doesn't really fall into this discussion. We were left with the XmlDocument and the XPathDocument. Various aspects of the XmlDocument made it unpalatable for a number of the plans we had in mind, such as acting as a strongly typed XML data source and moving away from a tree based DOM model for interacting with XML.

Instead we decided to go forward with the XPathDocument and add a bunch of functionality to it, such as the ability to bind it to a store, retrieve strongly typed values via integrated support for W3C XML Schema datatyping, track changes, and write data to it using the XmlWriter.

The primary feedback we've gotten about the new improved XPathDocument from usability studies and WinFX reviews is that there is little chance that anyone who hasn't read our documentation would realize that the XPathDocument, not the XmlDocument, is the preferred in-memory representation of an XML document for certain scenarios. In v1.0 we could argue that the class was only of interest to people doing advanced stuff with XPath (or XSLT, which is significantly about XPath) but now the name doesn't jibe with its purpose as much. The same goes for the primary mechanism for interacting with the XPathDocument (i.e. the XPathNavigator), which should be the preferred mechanism for representing and passing data as XML in the .NET Framework going forward.

If only I had a time machine and could go back and rename the classes XmlDocument2 and XmlNavigator. :(


Categories: Life in the B0rg Cube | XML

December 23, 2003
@ 07:29 PM

I'm kind of embarrassed to write this but last week was the first time I'd installed a build of Whidbey (the next version of the .NET Framework) in about 6 months. I used to download builds on a daily basis at the beginning of the year when I was a tester working on XQuery but fell off once I became a PM. Given that certain bits were in flux I decided to wait until things were stable before installing Whidbey on my machine and writing a number of sample/test applications.

Over the next couple of weeks I'll be driving the refinement of some of the stuff we've been designing for the next version of System.Xml and will most likely be blogging about various design issues we've had to contend with, as well as perhaps giving a sneak preview of some of our end user documentation, which will include answers to questions raised by some of the stuff that was shown at PDC, such as whether there is any truth to the claims that XmlDocument is dead.


Categories: Life in the B0rg Cube

Torsten and I (mostly Torsten) have been working on a feature which we hope will satisfy multiple feature requests at one shot. Screenshot and details available by clicking the link below.

Categories: RSS Bandit

I just spotted the following on the wiki Ward Cunningham set up requesting advice as a new hire to Microsoft.

Take a running start and don't look back

  1. Recognize that your wonderful inventiveness is the most valuable thing you will own in a culture that values its employees solely by their latest contributions. In a spartan culture like this, you will rise quickly.

  2. Keep spewing ideas, even when those ideas are repeatedly misunderstood, implemented poorly, and excised from products for reasons that have nothing to do with the quality of the idea. When you give up on communicating your new ideas, you will just go insane waiting to vest.

  3. Be patient, or better yet, don't even look back. Don't try to track and control what people do with your ideas. It will just make you jaded and cynical. (Like many of us who have gone before :)

  4. Communicate by writing things down in compact and considered form. The most senior people, who can take your ideas the furthest fastest, are very busy. As an added side-benefit, when random program managers who just don't get it come around for the fortieth time, begging for explanations, you can provide them references to your wiki, blog, or papers for the thirty-seventh time.

  5. Don't count on the research division for anything but entertaining politics.

Have a good time, and as Don said, plan for the long-haul!

I've been in the B0rg Cube just shy of two years but the above advice rings true in more ways than one. It is a very interesting culture and with the wrong attitude one could end up being very cynical. However as with all things, the best thing to do is learn how the system works and learn how to work it. The five points above are a good starting point.   

Categories: Life in the B0rg Cube

There were a number of sessions I found particularly interesting either because they presented novel ways to utilize and process XML or because they gave an insightful glance at how others view the XML family of technologies. 

Imperative Programming with Rectangles, Triangles, and Circles - Erik Meijer
This was a presentation about a research language called Xen that experiments with various ways to reduce the Relational<->Objects<->XML (ROX) impedance mismatch by adding concepts and operators from the relational and XML (specifically W3C XML Schema) worlds into an object oriented programming language. The main thesis of the paper was that heavily used APIs and programming idioms eventually tend to be likely candidates for inclusion in the language. An example was given with the foreach operator in the C# language which transformed the following regularly used idiom

IEnumerator e = ((IEnumerable)ts).GetEnumerator();
  try {
     while(e.MoveNext()) {  T t = (T)e.Current; t.DoStuff(); }
  } finally {
     IDisposable d = e as System.IDisposable;
     if(d != null) d.Dispose();
  }

into the more concise

foreach(T t in ts){ t.DoStuff(); }
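The same "bake a common idiom into the language" move exists elsewhere; as a rough illustration, Python's for statement is sugar for essentially the same iterator dance (Widget and do_stuff are made-up names for this sketch):

```python
class Widget:
    def __init__(self):
        self.done = False

    def do_stuff(self):
        self.done = True

def process_all(ts):
    """Hand-expanded equivalent of: for t in ts: t.do_stuff()"""
    it = iter(ts)            # calls ts.__iter__()
    while True:
        try:
            t = next(it)     # calls it.__next__()
        except StopIteration:
            break
        t.do_stuff()

widgets = [Widget(), Widget()]
process_all(widgets)
print(all(w.done for w in widgets))  # True
```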

The majority of the presentation was about XML integration. Erik spent some time talking about the XML-to-object impedance mismatch and how cumbersome programming with XML could be. Either you wrote a bunch of code for walking trees manually or you queried nodes with XPath, but then you are embedding one language into another and don't get type safety, etc. (if there is an error in my XPath query I can't tell until runtime). He pointed out that various XML<->object mapping technologies fall short because they don't map a rich enough set of W3C XML Schema constructs to relevant object structures, and even if they did, one then loses the power of being able to do rich XPath queries or XSLT/XQuery transformations. The XML integration in Xen basically came in 3 flavors: the ability to initialize classes from XML strings, support for W3C XML Schema constructs like union types and sequences in the language, and the ability to do XPath-like queries over the fields and properties of a class.
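The "embedded language" problem Erik described is easy to demonstrate in any language that queries XML with path strings; here's a small Python sketch using the standard library (the document is made up) showing that a typo in the query isn't a compile-time error and fails silently at that:

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("<order><item>glue</item><item>tape</item></order>")

# The path is just a string: the host language's compiler can't check it.
items = doc.findall(".//item")
print([e.text for e in items])   # ['glue', 'tape']

# A typo in the embedded query isn't caught at compile time; it silently
# matches nothing, which is exactly the type-safety gap Xen targets.
misspelled = doc.findall(".//itme")
print(misspelled)                # []
```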

There were also a few other things like adding the constraint "not null" into the language (which would be a handy modifier for parameter names in any language given how often one must check parameters for null in method bodies) and the ability to apply the same method to all the members of a collection, which seemed like valuable additions to a programming language independent of XML integration.

Thinking about it I am unsure of the practicality of some features such as being able to initialize objects from an XML literal in the code especially since Xen only supported XML documents with schemas although in some cases I could imagine such an approach being more palatable than using XQuery or XSLT 2.0 for constructing or querying strongly typed XML documents. Also I was suspicious of the usefulness of being able to do wildcard queries (i.e. give me all the fields in class Foo) although this could potentially be used to get the string value of an XML element with mixed content.

The language also had integrated SQL like querying with a "select" operator but I didn't pay much attention to this since I was only really interested in XML.

The meat of this presentation is available online in the paper entitled Programming with Circles, Triangles and Rectangles. The presentation was well received although sparsely attended (about two or three dozen people) and the most noteworthy feedback was that from James Clark, who was so impressed he kept saying "I'm speechless" in between asking questions about the language. Sam Ruby was also impressed by the fact that not only was there a presentation but the demo, which involved compiling and running various samples, showed that you could implement such a language on the CLR and even integrate it into Visual Studio.

Namespace Routing Language (NRL) - James Clark
This was a presentation of a language for validating a single XML document with multiple schemas simultaneously. It was specifically aimed at validating documents that contain XML from multiple vocabularies (e.g. XML content embedded in a SOAP envelope, RDF embedded in HTML, etc.).

The core processing model of NRL is that it divides an XML document into sections, each containing elements from a single namespace, then each section can be validated using the schema for its namespace. There is no requirement that the same schema language be used, so one could validate one part of the document using RELAX NG and use W3C XML Schema for another. There was also the ability to specify named modes like XSLT, which allowed you to match element names against a particular schema instead of just keying off the namespace name. This functionality could be used to validate interleaved documents (such as XHTML within an XSLT stylesheet) but I suspect that this will be easier said than done in practice.
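To make the core NRL processing model concrete, here's a rough Python sketch of the two-step idea: partition the elements by namespace, then dispatch each section to whatever validator is registered for that namespace. The namespace URIs and validator stubs here are mine, not NRL syntax:

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# A compound document: an envelope wrapping payload from another
# namespace (the namespace URIs are illustrative).
doc = ET.fromstring(
    '<env:Envelope xmlns:env="urn:example:envelope" '
    'xmlns:po="urn:example:purchase">'
    '<env:Body><po:order><po:item>glue</po:item></po:order></env:Body>'
    '</env:Envelope>')

def namespace_of(elem):
    # ElementTree encodes names as "{namespace-uri}localname"
    return elem.tag[1:].split("}")[0] if elem.tag.startswith("{") else ""

# Step 1 of the NRL model: divide the tree into per-namespace sections.
sections = defaultdict(list)
for elem in doc.iter():
    sections[namespace_of(elem)].append(elem)

# Step 2: hand each section to the validator for its namespace; nothing
# forces the validators to use the same schema language.
validators = {
    "urn:example:envelope": lambda elems: True,  # e.g. backed by RELAX NG
    "urn:example:purchase": lambda elems: True,  # e.g. backed by W3C XML Schema
}
results = {ns: validators[ns](elems) for ns, elems in sections.items()}
print(sorted(sections))  # ['urn:example:envelope', 'urn:example:purchase']
```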

All in all this was a very interesting talk and introduced some ideas I'd never have considered on my own.  

There is a spec for the Namespace Routing Language available online.


Categories: XML

December 16, 2003
@ 05:33 PM

The XML 2003 conference was a very interesting experience. Compared to the talks at XML 2002 I found the talks at XML 2003 to be of more interest and relevance to me as a developer building applications that utilize XML. The various hallway and lunchtime conversations I had with various people were particularly valuable. Below are the highlights from the various conversations I had with some XML luminaries at lunch and over drinks. Tomorrow I'll post about the various talks I attended.

James Clark: He gave two excellent presentations, one on his Namespace Routing Language (NRL) and the other about some of the implementation techniques used in his nxml-mode for Emacs. I asked whether the fact that he gave no talks about RELAX NG meant that he was no longer interested in the technology. He responded that there wasn't really anything more to do with the language besides shepherding it through the standardization process and evangelization. However, given how entrenched support for W3C XML Schema was with major vendors, evangelization was an uphill battle.

I pointed out that at Microsoft we use XML schema language technologies for two things:

    1. Describing and enforcing the contract between producers and consumers of XML documents.
    2. Creating the basis for processing and storing typed data represented as XML documents.

The only widely used XML schema language that fits the bill for both tasks is W3C XML Schema. However, W3C XML Schema is too complex yet doesn't have enough features for the former, and has too many features which introduce complexity for the latter case. In my ideal world, people would use something like RELAX NG for the former and XML-Data Reduced (XDR) for the latter. James asked if I saw value in creating a subset of RELAX NG which also satisfied the latter case, but I didn't think there would be a compelling argument for people who've already baked W3C XML Schema into the core of their being (e.g. XQuery, XML Web Services, etc.) to find interest in such a subset.

In fact, I pointed out that in designing for Whidbey (the next version of the .NET Framework) we originally had designed the architecture to have a pluggable XML type system so that one could potentially generate Post Schema Validation Infosets (PSVI), but realized that this was a case of YAGNI. First of all, only one XML schema language exists that can generate PSVIs, so creating a generic architecture makes no sense if there is no other XML schema language that could be plugged in to replace W3C XML Schema. Secondly, one of the major benefits of this approach I had envisioned was that one would be able to plug one's own type system into XQuery. This turned out to be more complicated than I thought because XQuery has W3C XML Schema deeply baked into it and it would take more than genericizing at the PSVI level to make it work (we'd also have to genericize operators, type promotion rules, etc.), and once all that effort had been expended, any language that could be plugged in would have to act a lot like W3C XML Schema anyway. Basically, if some RELAX NG subset suddenly came into existence, it wouldn't add much that we don't already get from W3C XML Schema (except less complexity, but you could get the same from coming up with a subset of W3C XML Schema or following my various W3C XML Schema best practices articles).

I did think that there would be some value, to developers building applications on Microsoft platforms who needed more document validation features than W3C XML Schema provides, in having access to RELAX NG tools. This would be nice to have but isn't a showstopper preventing development of XML applications on Microsoft platforms (translation: Microsoft won't be building such tools in the foreseeable future). However, if such tools existed I definitely would evangelize them to our users who needed more features than W3C XML Schema provides for their document validation needs.

Sam Ruby: I learned that Sam is in one of the "emerging technologies" groups at IBM. Basically he works on stuff that's about to become mainstream in a big way and helps it along. In the past this has included PHP, Open Source and Java (i.e. the Apache project), XML Web Services and now weblogging technologies. Given his track record I asked him to give me a buzz whenever he finds some new technology to work on. :)

I told him that I felt syndication formats weren't the problem with weblogging technologies and he seemed to agree, but he pointed out that some of the problems they are trying to solve with ATOM make more sense in the context of using the same format for your blog editing/management API and archival format. There were also various interpersonal conflicts & psychological baggage which need to be discarded to move the technology forward, and a clean break seems to be the best way to do that. On reflection, I agreed with him.

I did point out that the top 3 problems I'd like to fix in syndication were one-click subscription, subscription harmonization and adding calendar events to feeds. I mentioned that I should have RFCs for the first two written up over the holidays, but the third is something I haven't thought hard about. Sam pointed out that instead of going the route of coming up with a namespaced extension element to describe calendar events in an RSS feed, perhaps a better option is the ATOM approach that uses link tags. Something like

   <link type="text/calendar" href="...">

In fact he seemed to have liked this idea so much it ended up in his presentation.
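Consuming such a typed link from an aggregator would be straightforward; a quick Python sketch of picking calendar links out of an entry (the Atom namespace URI is real, the URLs are invented for illustration):

```python
import xml.etree.ElementTree as ET

# A feed entry carrying a calendar event as a typed link, in the spirit
# of Sam's suggestion.
entry = ET.fromstring(
    '<entry xmlns="">'
    '<title>XML 2003</title>'
    '<link rel="alternate" type="text/html" href=""/>'
    '<link rel="alternate" type="text/calendar" href=""/>'
    '</entry>')

ATOM = "{}"
calendars = [link.get("href") for link in entry.findall(ATOM + "link")
             if link.get("type") == "text/calendar"]
print(calendars)  # ['']
```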

As Sam and I were finishing our meals, Sam talked about the fact that the effect blogging has had on his visibility is immense. Before blogging he was well known in tight-knit technical circles such as amongst the members of the Apache project, but now he knows people from all over the world working at diverse companies and regularly has people go "Wow, you're Sam Ruby, I read your blog". As he said this, the guy sitting across from us at the table said "Wow, you're Sam Ruby, I read your blog". Sam turned to me and said "See what I mean?"

The power of blogging...

Eve Maler: I spoke to her about a talk I'd seen on UBL given by Eduardo Gutentag and Arofan Gregory where they talked about putting the polymorphic features of W3C XML Schema to good use in business applications. The specific scenario they described was the following:

Imagine a small glue supplier that provides glue to various diverse companies such as a shoe manufacturer, an automobile manufacturer and an office supplies company. This company uses UBL to talk to each of its customers, who also use UBL, but since the types for describing purchase orders and the like are not specific enough for them, they use the type derivation features of W3C XML Schema to create specific types (e.g. a hypothetical LineItem type from UBL is derived to AutomobilePart or ShoeComponent by the various companies). However, the small glue company can handle all the new types with the same code if they use type-aware processing, such as the following XPath 2.0 or XQuery expression which matches all instances of the LineItem type

element(*, LineItem)

The presenters then pointed out that there could be data loss if one of the customers extended the LineItem type by adding information that was pertinent to their business (e.g. priority, pricing information, preferred delivery options, etc.) since such code would not know about the extensions.

This seems like a horrible idea and yet another reason why I view all the "object oriented" features of W3C XML Schema with suspicion.

Eve agreed that it probably was a bad idea to recommend that people process XML documents this way, then stated that she felt that calling such processing "polymorphic" didn't sit right with her, since true polymorphism doesn't require subtype relationships. I agreed and disagreed with her. There are at least four types of polymorphism in programming language parlance and the kind used above is subtype polymorphism. This is just one of the four types of polymorphism (the others being coercion, overloading and parametric polymorphism) but the behavior above is polymorphism nonetheless. From talking to Eve it seemed that she was more interested in parametric polymorphism because subtype polymorphism is not a loosely coupled approach. I pointed out that just using XPath expressions to match on predicates could be considered to be parametric polymorphism since you are treating instances similarly even though they are of different types but satisfy the same constraints. I'm not sure she agreed with me. :)

Jon Udell: We discussed the online exchange we had about WinFS types and W3C XML Schema types. He apologized if he seemed to be coming on too strong in his posts and I responded that of the hundreds of articles and blog posts I'd read about the technologies unveiled at the recent Microsoft Professional Developer's Conference (PDC) that I'd only seen two people provide insightful feedback; his was the first and Miguel de Icaza's PDC writeup was the second. 

Jon felt that WinFS would be more valuable as an XML database as opposed to an object oriented database (I think the terms he used were "XML store" and "CLR store"), especially given his belief that XML enables the "Universal Canvas". I agreed with him but pointed out that Microsoft isn't a single entity, and even though some parts may think that XML is one step closer to giving us a universal data interchange format and thus universal data access, there are others who see XML as "that format you use for config files" and express incredulity when they hear about things like XQuery, because they wonder why anyone would need a query language for their config files. :)

Reading Jon's blog post about Word 11, XML and the Universal Canvas it seems he's been anticipating a unified XML storage model for a while which explains his disappointment that the WinFS unveiled at PDC was not it.

He also thought that the fact that so many people at Microsoft were blogging was fantastic. 


Categories: XML

December 16, 2003
@ 06:52 AM

Robert Scoble writes

Here's what I'd do if I were at Harvard and in charge of the RSS spec:

1) Announce there will be an RSS 3.0 and that it will be the most thought-out syndication specification ever.

2) Announce that RSS 3.0 will ship on July 1, 2005. That date is important. For one, 18 months is long enough to really do some serious work. For two, RSS 3.0 should be positioned as "the best way to do syndication on Microsoft's Longhorn." ...

3) Open up a mailing list, a wiki, and a weblog to track progress on RSS 3.0 and encourage community inclusion.

4) Work with Microsoft to ensure that RSS 3.0 will be able to take advantage of Longhorn's new capabilities (in specific, focus on learning Indigo and WinFS)...

5) Make sure RSS 3.0 is simply the best-of-breed syndication protocol. Translation: don't let Microsoft or Google come up with a better spec that has more features.

I'm terribly amused by the fact that Robert Scoble likes to claim that he doesn't represent Microsoft in his blog then posts items where he basically acts like he does. An RSS 3.0 that competes with Atom is probably the worst possible proposal to resolve the current conflict in the website syndication space and a very clear indication that this is all about personality conflicts. The problem with the Atom syndication format is that it is an incompatible alternative to RSS 1.0/RSS 2.0 which provides little if any benefit to content producers or news aggregator consumers. Coming up with another version of RSS doesn't change this fact unless it is backwards compatible, and even then, besides clarifications to the original spec, I'm unsure what could be added to the core, although I can think of a number of potential candidates. Even so, this still would be a solution looking for a problem.

While talking to Tim Bray and Sam Ruby at XML 2003 last week I stated that a number of the problems with syndication have little to do with the core spec, and most aggregator authors wouldn't consider any of the problems harped upon on the Atom lists a big deal. The major problems with syndication today have little to do with the syndication format and more to do with its associated technologies.

As little interest as I have in an Atom syndication format, I have an order of magnitude less interest in a new version of RSS that exists solely to compete with Atom.

PS: Am I the only one who caught the trademark Microsoft arrogance (which really comes from working on Windows[0]) in Scoble's post? I especially liked

"Here's what I'd do if I were at Harvard and in charge of the RSS spec...Work with Microsoft to ensure that RSS 3.0 will be able to take advantage of Longhorn's new capabilities (in specific, focus on learning Indigo and WinFS). Build a prototype (er, have MSN build one) that would demonstrate some of the features of RSS 3.0 -- make this prototype so killer that it gets used on stage at the Longhorn launch"

I literally guffawed out loud. So if Harvard doesn't tie RSS to Windows then all is lost? I guess this means that NetNewsWire and Straw should get ready to be left behind in the new Microsoft-controlled RSS future. Hilarious. 

[0] When you work on the most popular piece of software in the world you tend to have a different perspective from most other software developers in the world including within Microsoft.


Categories: Ramblings

December 15, 2003
@ 05:04 PM

James Robertson writes

Ed Foster points out that MS - like many other vendors - is forbidding benchmarks as part of their standard contracts:

Is it possible Microsoft has something to hide about the performance of its server and developer products? It's hard to escape that conclusion when you see how many of its license agreements now contain language forbidding customers of those products from disclosing benchmark results.

So what are MS and the other vendors afraid of?

I'm not sure what the official line is on these contracts but I've come to realize why the practice is popular among software vendors. A lot of the time people who perform benchmarks are familiar with one or two of the products they are testing and know how to tune those for optimal performance but not the others which leads to skewed results. I know that at least on the XML team at Microsoft we don't block people from publishing benchmarks if they come to us, we just ensure that their tests are apples-to-apples comparisons and not unfairly skewed to favor the other product(s) being tested.

Just a few days ago I attended a session at XML 2003 entitled A Comparison of XML Processing in .NET and J2EE where the speaker stated that push-based XML parsers like SAX were more performant than pull-based XML parsers like the .NET Framework's XmlReader when dealing with large documents. He didn't give any details and implied that they were lacking because of the aforementioned EULA clauses. Without any details, sample code or a definition of what document size is considered "large" (1MB, 10MB, 100MB, 1GB?) it's difficult to agree or disagree with his statement. Off the top of my head there aren't any inherent limitations of pull-based XML parsing that come to mind that should make it perform worse than push-based parsing of XML documents, although differences in implementations make all the difference. I suspect that occurrences like this are why many software vendors tend to have clauses that limit the disclosure of benchmark information in their EULAs.
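For anyone unfamiliar with the distinction, here's a quick Python sketch of the two parsing styles using the standard library: in the push (SAX) model the parser drives and calls back into your handler, while in the pull model (iterparse here, standing in loosely for something like XmlReader) your code asks for the next event. This illustrates the styles only, not the speaker's performance claim:

```python
import io
import xml.sax
import xml.etree.ElementTree as ET

XML = "<root>" + "<item/>" * 1000 + "</root>"

# Push model: the parser drives, calling back into our handler.
class Counter(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.items = 0
    def startElement(self, name, attrs):
        if name == "item":
            self.items += 1

handler = Counter()
xml.sax.parseString(XML.encode(), handler)

# Pull model: we drive, asking for the next event as we need it.
pulled = 0
for event, elem in ET.iterparse(io.StringIO(XML), events=("start",)):
    if elem.tag == "item":
        pulled += 1

print(handler.items, pulled)  # 1000 1000
```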

Disclaimer: The above is my personal opinion and is in no way, shape or form an official statement of the position of my employer.


Categories: Life in the B0rg Cube

I'm now experimenting with various Windows CVS clients to see which best suits my needs for RSS Bandit development. So far I have tried WinCVS which seems OK and I'm about to evaluate Tortoise CVS which Torsten seems very happy with.

Later on I'll experiment with CVS plugins that are integrated into Visual Studio, such as Jalindi Igloo or the SCC Plugin for Tortoise CVS. I never got to use the Visual Studio plugin for GotDotNet Workspaces when RSS Bandit was hosted there because the IDE I originally developed RSS Bandit with (Visual C# 2002 Standard Edition) did not support said plugin, so I am curious as to what development with source repository access as part of the IDE feels like.


Categories: RSS Bandit

December 15, 2003
@ 04:32 PM

It seems my recent post about moving RSS Bandit from GotDotNet Workspaces to SourceForge has led to some discussion about the motivations for the move. I've seen this question asked on Daniel Cazzulino's weblog and on the RSS Bandit message board on GotDotNet. Below is my answer to the question, phrased in the form of a top 10 list, which I posted in response to the question on the GotDotNet message board and also sent to Andy Oakley.

Top 10 reasons why we moved to SourceForge

1. Doesn't require people have Passport accounts to download the RSS Bandit installer.

2. We get download and page load statistics.

3. Bug reports can have file attachments. This is great since a lot of the time we end up wishing people would attach their error.log or feedlist.xml file with their bug reports.

4. We can get a mailing list if we want.

5. Separate databases for features vs. bugs.

6. Source code can be browsed over HTTP via ViewCVS without having to install any software

7. Larger quotas on how much you can store on their servers.

8. Bug tracker remembers your queries and the default query is more useful to me (all open bugs) than GDN's (all bugs assigned to me even closed ones).

9. Activity score more accurately reflects activity of the project (on GDN, BlogX is scored at having 99% activity score even though the project has been dead for all intents and purposes for several months).

10. With SourceForge we get to use the BSD licence.

I hope this satisfies the curiosity of those wondering why RSS Bandit moved to SourceForge. I've been using it for a few days and I'm already much happier with it despite some initial teething problems adding modules to CVS.


Categories: RSS Bandit

According to Reuters

WASHINGTON (Reuters) - A Pentagon (news - web sites) audit of Halliburton, the oil services firm once run by Vice President Dick Cheney (news - web sites), found the company may have overbilled the U.S. government by more than $120 million on Iraq (news - web sites) contracts, U.S. defense officials said on Thursday.

Why am I not surprised? This entire Iraq war fiasco will be the subject of much consternation and entertainment to future generations.


December 12, 2003
@ 12:19 PM

Today is the last day of the XML 2003 conference. So far it's been a pleasant experience.


Attendance at the conference was much lower than last year. Considering that last year Microsoft announced Office 2003 at the conference while this year there was no such major event, this is no surprise. I suspect another reason is that XML is no longer new and is now so low down in the stack that a conference dedicated to just XML is no longer that interesting. Of course, this is only my second conference so this level of attendance may be typical from previous years and I may have just witnessed an abnormality last year.

Like last year, the conference seemed targeted mostly at the ex-SGML crowd (or document-centric XML users), although this time there wasn't the significant focus on Semantic Web technologies such as topic maps that I saw last year. I did learn a new buzzword around Semantic Web technologies, Semantic Integration, and found out that there are companies selling products that claim to do what until this point I'd assumed was mostly theoretical. I tried to ask one such vendor how they deal with some of the issues with non-trivial transformations, such as the pubDate vs. dc:date example from a previous post, but he glossed over the details, implying that besides using ontologies to map between vocabularies they allowed people to inject code where it was needed. This seems to confirm my suspicions that in the real world you end up either using XSLT or reinventing XSLT to perform transformations between XML vocabularies.
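The pubDate vs. dc:date case illustrates why mapping element names via an ontology isn't enough: the two elements also use different date formats (RFC 822 for RSS 2.0's pubDate, ISO 8601/W3C-DTF for dc:date), so some code has to run somewhere. A minimal Python sketch of the value transformation:

```python
from email.utils import parsedate_to_datetime

# RSS 2.0 <pubDate> uses RFC 822 dates; <dc:date> uses ISO 8601 (W3C-DTF).
# An ontology can say pubDate maps to dc:date, but converting the value
# is where you end up writing code (or XSLT).
pub_date = "Fri, 12 Dec 2003 12:19:00 GMT"

dc_date = parsedate_to_datetime(pub_date).isoformat()
print(dc_date)  # 2003-12-12T12:19:00+00:00
```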

From looking at the conference schedule, it is interesting to note that some XML technologies got a lot less coverage at the conference relative to how much discussion they generate in the news or blogosphere. For example, I didn't see any sessions on RSS, although there is one by Sam Ruby on Atom scheduled for later this morning. Also there didn't seem to be much about the XML Web Service technologies being produced by the major vendors such as IBM, BEA or Microsoft. I can't tell if this is because there was no interest in submitting such sessions or whether the folks who picked the sessions didn't find these technologies interesting. Based on the fact that a number of the folks who had "Reviewer" on their conference badge were from the old-school SGML crowd, I suspect the latter. There definitely seemed to be a disconnect between the technologies covered during the conference and how XML is used in the real world in a number of cases.


I've gotten to chat with a number of people I've exchanged mail with but never met, including Tim Bray, Jon Udell, Sean McGrath, Norm Walsh and Betty Harvey. I also got to talk to a couple of folks I met last year like Rick Jelliffe, Sam Ruby, Simon St. Laurent, Mike Champion and James Clark. Most of the hanging out occurred at the soiree at Tim and Lauren's. As Tim mentions in his blog post there were a couple of "Wow, you're Dare?" or "Wow, you're Sean McGrath?" moments throughout the evening. The coolest part of that evening was that I got to meet Eve Maler, who I was all starstruck about meeting since I'd been seeing her name crop up as one of the Über-XML geeks at Sun Microsystems since I was a programming whelp back in college, and I'm there gushing "Wow, you're Eve Maler" and she was like "Oh, you're Dare? I read your articles, they're pretty good". Sweet. Since Eve worked at Sun I intended to give her some light-hearted flack over a presentation entitled UBL and Object-Oriented XML: Making Type-Aware Systems Work, which was spreading the notion that relying on the "object oriented" features of W3C XML Schema was a good idea, but it turned out that she agreed with me. Methinks another W3C XML Schema article could be spawned from this. Hmmmm.


Categories: XML

December 11, 2003
@ 03:42 PM

The new home of the RSS Bandit project is on SourceForge. Various things precipitated this move, with the most recent being the fact that a Passport account was needed to download RSS Bandit from GotDotNet. I'd like to thank Andy Oakley for all his help with GotDotNet Workspaces while RSS Bandit was hosted there.

The most current release of RSS Bandit is still v1.2.0.61, you can now download it from sourceforge here. The source code is still available, and you can now browse the RSS Bandit CVS repository if interested in such things.


Categories: RSS Bandit

Jeremy Zawodny writes

The News RSS Feeds are great if you want to follow a particular category of news. For example, you might want to read the latest Sports (RSS) or Entertainment (RSS) news in your aggregator. But what if you'd like an RSS News feed generated just for you? One based on a word or phrase that you could supply?
 For example, if you'd like to follow all the news that mentions Microsoft, you can do that. Just subscribe to this url. And if you want to find news that mentions Microsoft in a financial context, use Microsoft's stock ticker (MSFT) as the search parameter like this.

Compare this to how you'd programmatically do the same thing with Google using the Google Web API, which utilizes SOAP & WSDL. Depending on whether you have the right toolkit, the Google Web API is either much simpler or much harder to program against than the Yahoo RSS-based search. With the Yahoo RSS-based search a programmer has to deal with HTTP and XML directly, while with the Google API and the appropriate XML Web Service tools this is all hidden behind the scenes; for the most part the developer programs directly against objects that represent the Google API without dealing with XML or HTTP. For example, see this example of talking to the Google API from PHP. Without appropriate XML Web Service tools, the Google API is more complex to program against than the Yahoo RSS search because one then has to deal with sending and receiving SOAP requests, not just regular HTTP GETs. However, there are a number of freely available XML Web Service toolkits, so there should be no reason to program against the Google API directly.
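To make the contrast concrete, here's a minimal sketch of what "dealing with HTTP and XML directly" amounts to for an RSS-based search feed. The parsing is the whole job; the sample document below stands in for an actual Yahoo search result feed (the real search URL isn't reproduced here), and in practice the XML would come from a plain HTTP GET of that URL.

```python
import xml.etree.ElementTree as ET

def parse_rss_items(rss_xml):
    """Extract (title, link) pairs from an RSS 2.0 document."""
    channel = ET.fromstring(rss_xml).find("channel")
    return [(item.findtext("title"), item.findtext("link"))
            for item in channel.findall("item")]

# Stand-in for the body returned by an HTTP GET of the search feed URL,
# e.g. urllib.request.urlopen(search_url).read()
sample = """<rss version="2.0"><channel>
  <title>News Search Results</title>
  <item><title>Microsoft earnings</title><link>http://example.com/1</link></item>
  <item><title>MSFT ticker news</title><link>http://example.com/2</link></item>
</channel></rss>"""

for title, link in parse_rss_items(sample):
    print(title, link)
```

That's the entire client: one GET, one XML parse, no toolkit required.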

This being said, there are a number of benefits to the URI-based (i.e. RESTful) search that Yahoo provides, which come from being a part of the Web architecture.

  1. I can bookmark a Yahoo RSS search or send a link to it in an email. I can't do the same with an RPC-style SOAP API.
  2. Intermediaries between my machine and Google are unlikely to cache the results of a search made via the Google API since it uses HTTP POST, but could cache requests that use the Yahoo RSS-based search since it uses HTTP GET. This improves the scalability of the Yahoo RSS-based search without any explicit work from me or Yahoo; it comes simply from utilizing the benefits of the Web architecture.
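The caching point extends to the client side as well: because the feed is fetched with a plain GET, an aggregator (or any intermediary) can revalidate it cheaply with HTTP validators. Here's a sketch of that pattern; `fetch(url, headers)` is a stand-in for a real HTTP client returning `(status, headers, body)`, not any particular library's API.

```python
def poll_feed(url, cache, fetch):
    """Poll a feed with a conditional GET. If the server (or an
    intermediary cache) answers 304 Not Modified, the full feed body
    never moves over the wire again; we reuse our cached copy."""
    headers = {}
    if url in cache:
        etag, _body = cache[url]
        if etag:
            headers["If-None-Match"] = etag
    status, resp_headers, body = fetch(url, headers)
    if status == 304:
        return cache[url][1]  # unchanged; reuse cached body
    cache[url] = (resp_headers.get("ETag"), body)
    return body
```

None of this is possible with an RPC-style POST, since caches treat POST responses as uncacheable by default.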

The above contrast of the differing techniques for returning search results as XML used by Yahoo and Google is a good way to compare and contrast RESTful XML Web Services to RPC-based XML Web Services and understand why some people believe strongly [perhaps too strongly] that XML Web Services should be RESTful not RPC-based.

By the way, I noticed that Adam Bosworth is trying to come to grips with REST which should lead to some interesting discussion for those who are interested in the RESTful XML Web Services vs. RPC-based XML Web Services debate.




Categories: XML

In the most recent release of RSS Bandit we started down the path of making it look like Outlook 2003 by using Tim Dawson's WinForms controls. The primary change we made was to the toolbar. This change wasn't good enough for Thomas Freudenberg, who made a couple of other changes to RSS Bandit that make it look even more like Outlook 2003. He wrote

Anyway, contrary to SharpReader, you can get the bandit's source code. Because I really like the UI of Outlook 2003 (and FeedDemon), I patched RSS Bandit:

It took about 15 minutes. Mainly I docked the feed item list and the splitter to the left. Additionally, I've created a custom feed item formatter, which bases on Julien Cheyssial's MyBlogroll template. You can download my XSLT file here.

You can find a screenshot on his website. Torsten's already made similar changes to the official RSS Bandit source code after seeing Thomas's feedback.


Categories: RSS Bandit

I accidentally caught Al Sharpton on Saturday Night Live last night and it was a horrible experience. Not only was the show as unfunny as getting needles shoved in your eyeballs (why the fuck do good shows like Futurama and Family Guy get cancelled but this turd continues to stink up the airwaves?) but our pal Al kept fumbling his lines like he'd forgotten them and kept having to surreptitiously read them from the teleprompter. What a joke.

Definitely a horrible way to end a Saturday night.


Categories: Ramblings

Shelley Powers writes

For instance, The W3C TAG team -- that's the team that's defining the architecture of the web, not a new wrestling group -- has been talking about defining a new URI scheme just for RSS, as brought up today by Tim Bray. With a new scheme, instead of accessing a feed with:

You would access the feed as:


I've been trying to avoid blogging about this discussion since I'll be leaving for Philly to attend the XML 2003 conference in a few hours and won't be able to participate in any debate. However, since some folks have started blogging about this topic and there are some misconceptions in their posts, I've thrown my hat in the ring.

The first thing I want to point out is that although Shelley is correct that some discussion about this has happened on the W3C Technical Architecture Group's mailing list, they are not proposing a new URI scheme. Tim Bray was simply reporting on current practices in the RSS world that I mentioned in passing on the atom-syntax list.

The problem statement is: "How does a user who visits a website and would like to subscribe to its information in a client-side news aggregator do so in a quick and painless manner?" The current process is to click on an icon (most likely an orange button with the white letters 'XML' on it) that represents an RSS feed, copy the URL from the browser address bar, fire up your RSS client and click on the subscribe dialog (if it has one).

That's a lot of steps, and many attempts have been made to collapse them into one (click a link and the right dialog pops up).

The oldest one I am aware of was pioneered by Dave Winer and involved client-side aggregators listening on a port on the local machine, with a hyperlink on the website pointing at a URL on that local port. This technique is used by every Radio Userland weblog and is even used by dasBlog, which is my blogging tool of choice, as is evidenced by clicking on the icon with a picture of a coffee mug and the letters "XML" on it at the bottom of my weblog.

There are two problems with this approach. The first is the security issue brought on by the fact that you have a news aggregator listening on a socket on your local machine, which could lead to hack attempts if a security exploit is found in your news aggregator of choice; this can be mitigated by firewalls and so far hasn't been a problem. The second is that if one has multiple aggregators installed, there is contention over which one should listen on that port. For this reason different aggregators listen on different local ports; Radio listens on port 5335, AmphetaDesk listens on port 8888, Awasu listens on port 2604, nntp//rss listens on port 7810 and so on.
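The mechanism itself is simple; the port numbers above are the real point of contention. As a rough sketch (in Python, not any aggregator's actual code), the listener boils down to a tiny local HTTP server that pulls the feed URL out of the query string:

```python
# Sketch of the "aggregator listens on a local port" approach: the
# website links to http://127.0.0.1:<port>/?url=<feed-url>, and the
# listener treats the request as a subscription command.
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

subscriptions = []

class SubscribeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        query = parse_qs(urlparse(self.path).query)
        feed_url = query.get("url", [None])[0]
        if feed_url:
            subscriptions.append(feed_url)  # a real app would pop a subscribe dialog
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"Subscribed")

    def log_message(self, *args):  # keep the demo quiet
        pass

# Port 0 picks any free port here; a real aggregator would claim a fixed
# well-known port, which is exactly where the multi-aggregator conflict arises.
server = HTTPServer(("127.0.0.1", 0), SubscribeHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
```

Two aggregators can't both bind the same fixed port, which is why each one picked its own and websites ended up needing a link per aggregator.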

An alternative solution was chosen by various other aggregator authors, in which hyperlinks point to the URLs of RSS feeds with the crucial distinction that the http:// part of the URL is substituted with a custom URI scheme. Since most modern browsers have a mechanism for handing off unknown URI schemes to other client applications, this also allows "one-click feed subscription". Here also there is variance amongst news aggregators; Vox Lite, RSS Bandit & SharpReader support the feed:// URI scheme, WinRSS supports the rss:// URI scheme while NewsMonster supports the news-monster:// scheme.
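On the client side, handling such a scheme is mostly string surgery: the aggregator registered as the handler receives the URI and maps it back to something fetchable. A sketch (not taken from any of the named aggregators) for the feed: scheme:

```python
def feed_uri_to_http(uri):
    """Translate a feed: URI into a fetchable http: URL.
    Handles both spellings seen in the wild:
      feed://example.com/rss.xml  and  feed:http://example.com/rss.xml.
    Anything else passes through unchanged."""
    prefix = "feed:"
    if not uri.startswith(prefix):
        return uri
    rest = uri[len(prefix):]
    if rest.startswith("//"):
        return "http:" + rest  # feed://host/path form
    return rest                # feed:http://... wrapper form
```

The operating-system side (registering the application as the handler for the scheme) is platform-specific and not shown here.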

With all these varying approaches, any website that wants to provide a link that allows one-click subscription to an RSS feed needs to support almost a dozen different techniques and thus create a dozen different hyperlinks on its site. This isn't an exaggeration; this is exactly what Feedster does when one wants to subscribe to the results of a search. If memory serves correctly, Feedster uses the QuickSub javascript module to present these dozen links in a drop-down list.

The recent debate on both the atom-syntax and www-tag mailing lists focuses on the feed:// URI proposal and its lack of adherence to guidelines set in the current draft of the Architecture of the World Wide Web document being produced by the W3C Technical Architecture Group. This document is an attempt to document the architecture of the World Wide Web ex post facto.

Specifically the debate hinges on the guideline that states

Authors of specifications SHOULD NOT introduce a new URI scheme when an existing scheme provides the desired properties of identifiers and their relation to resources.
If the motivation behind registering a new scheme is to allow an agent to launch a particular application when retrieving a representation, such dispatching can be accomplished at lower expense by registering a new Internet Media Type instead. Deployed software is more likely to handle the introduction of a new media type than the introduction of a new URI scheme.

The use of unregistered URI schemes is discouraged for a number of reasons:

  • There is no generally accepted way to locate the scheme specification.
  • Someone else may be using the scheme for other purposes.
  • One should not expect that general-purpose software will do anything useful with URIs of this scheme; the network effect is lost.

The above excerpt assumes that web browsers on the World Wide Web are more likely to know how to deal with unknown Internet Media Types than unknown URI schemes, which is in fact the case. For example, Internet Explorer will fall back to using the file extension if it doesn't know how to deal with the provided MIME type (see MIME Type Detection in Internet Explorer for more details). However, there are several problems with using MIME types for one-click feed subscription that do not exist in the previously highlighted approaches.

Greg Reinacker detailed them in his post RSS and MIME types a few months ago.

Problem 1: [severity: deal-breaker] In order to serve up a file with a specific MIME type, you need to make some changes in your web server configuration. There are a LOT of people out there (shared hosting, anyone?) who don't have this capability. We have to cater to the masses, people - we're trying to drive adoption of this technology.

Problem 1a: [severity: annoyance] There are even more people who wouldn't know a MIME type from a hole in the head. If Joe user figures out that he can build a XML file with notepad that contains his RSS data (and it's being done more often than you think), and upload it to his web site, you'd think that'd be enough. Sorry, Joe, you need to change the MIME type too. The what?

Problem 2: [severity: deal-breaker] If you register a handler for a MIME type, the handler gets the contents of the file, rather than the URL. This is great if you're a media player or whatever. However, with RSS, the client tool needs the URL of the RSS file, not the actual contents of the RSS file. Well, it needs the contents too, but it needs the URL so it can poll the file for changes later. This means that the file that's actually registered with a new MIME type would have to be some kind of intermediate file, a "discovery" file if you will. So now, not only would Joe user have to learn about MIME types, but he'd have to create another discovery file as well.

Many people in the MIME type camp have pointed out that problem two can be circumvented by having the feed file contain its own location. Although this seems a tad redundant and may be prone to breakage if the website is reorganized, it probably should work for the most part. However, there is at least one other problem with using MIME types which people have glossed over.
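That workaround is easy to sketch, and the sketch also shows why it's redundant: the aggregator has to trust a URL embedded inside the very document it just downloaded. Note that `feedUrl` below is a purely hypothetical element name; as the discussion above implies, there was no standard channel-level element for this at the time.

```python
import xml.etree.ElementTree as ET

def feed_self_url(rss_xml, element="feedUrl"):
    """Return the URL a feed claims to live at, or None if absent.
    `feedUrl` is a hypothetical channel-level element used for
    illustration, not part of any RSS specification."""
    channel = ET.fromstring(rss_xml).find("channel")
    if channel is None:
        return None
    return channel.findtext(element)

sample = """<rss version="2.0"><channel>
  <title>Example</title>
  <feedUrl>http://example.com/rss.xml</feedUrl>
</channel></rss>"""
```

If the site is reorganized and the embedded URL goes stale, the aggregator ends up polling the wrong address, which is the breakage risk mentioned above.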

Problem 3: If clicking on a link to an RSS feed in your browser always invokes your aggregator's feed subscription dialog, then you can't view an RSS feed in your browser if you have a client aggregator installed, and you may not be able to view it even if you don't, because your browser of choice may not know how to handle the MIME type if it isn't something like text/xml.

At least one person, Tim Bray, doesn't see this as a big deal and in fact stated, "why not? Every time I click on a link to a PDF doc I get a PDF reader. Every time I click on a .mov file I get Quicktime. Seems like a sensible default behavior".

Using MIME types to solve the one-click subscription problem is more difficult for weblog tools to implement than the other two approaches favored by news aggregators, and it requires changing web server configurations while the other approaches do not. Although the architecture astronauts will rail against the URI scheme based approach, it is unlikely that anyone who looks dispassionately at all three approaches will choose to use MIME types to solve this problem.

Of course, since one of the main forces behind the ATOM movement has stated that MIME types will be the mechanism used for performing one click subscription to ATOM feeds this just seems like one more reason for me to be skeptical about the benefits of adopting the ATOM syndication format.


Categories: XML

The BBC has a contest to vote for the Ten most embarrassing political moments. Funnily enough, they don't have my favorite: the projectile vomiting incident involving George Bush Senior and Kiichi Miyazawa of Japan.

When Mr Bush's father attended a state visit in Japan in January 1992, he responded to the arrival of Japanese beef steak (French-style) with a projectile vomit into the lap of Prime Minister Kiichi Miyazawa.

Suffering from flu at the time, George Bush Senior then slumped under the table before getting up a few minutes later and announcing he felt great. 

Too bad there isn't an option for a write-in vote.


Categories: Ramblings

I've gotten a complaint that in some cases it seems you need a Microsoft Passport account to download the latest version of RSS Bandit. For this reason I've set up an alternate download location for anyone who's having this problem and doesn't have a Microsoft Passport account.



Categories: RSS Bandit

December 5, 2003
@ 07:38 PM

Download it here. New features and bug fixes described below.

Differences between v1.2.0.43 and v1.1.0.61 below.

  • FEATURE: One can now search for feeds by keyword as well as by URL. This means one can search for 'CNN', 'RSS Bandit' or 'XML' and get back up to a hundred feeds from the Syndic8 database that contain the requested text in their description. Very useful for browsing for new feeds to subscribe to.
  • FEATURE: Added an [Apply] button to the Options dialog so one can test features such as the various XSLT themes for displaying feeds (the DOS theme is still my favorite) without having to close the dialog.
  • FEATURE: Now provides visual indication when downloading the comments for a particular feed that supports wfw:commentRss.
  • FEATURE: Added a tab in the options dialog for configuring the web search engines available from the Search toolbar (still have MSN Search, Feedster and Google by default)
  • FEATURE: Option added to network settings dialog to take over proxy settings from Internet Explorer.
  • FEATURE: Support for the feed:// URI scheme proposed by Greg Reinacker
  • FEATURE: New user interface with an Office 2003 look & feel courtesy of Tim Dawson's SandBar controls
  • FEATURE: Ability to search subscribed feeds for items that contain a particular keyword (via View->Rss Search). For example, searching for all posts with "RSS Bandit" in their content.
  • FEATURE: now you can restrict the security settings of the embedded web browser (via Tools->Options->Web Browser). By default now only allows download of images, but no Javascript, ActiveX or Java applets.
  • FIXED: Problem that occurred infrequently where moving, renaming or deleting category nodes led to corruption of the feedlist.xml file.
  • FIXED: Clicking on a category node in the tree view locked up the application, making it unusable.
  • FIXED: Certain websites which use "deflate" compression without headers, such as the MSDN or dasBlog feeds, caused exceptions on attempts to decompress them.
  • FIXED: The maximum age to keep feed items was sometimes inconsistent between the value set in the options dialog and the value used for particular feeds.
  • FIXED: Too many spurious XML-related errors showing up in 'Feed Errors' special folder.


Categories: RSS Bandit

Check out the screenshots of the two newest features added to RSS Bandit; filtered search and locating RSS feeds by keyword.

Categories: RSS Bandit

My latest column is up on MSDN, Extreme XML: EXSLT Meets XPath.


Categories: XML

Robert Scoble wrote

I see over on Evan Williams site that it looks like Google (er, Blogger, which is owned by Google) is going to support Atom. So far Microsoft has been supporting RSS 2.0 (we've spit out RSS 2.0 on MSDN, on the PDC app, on MyWallop, and in a few other places). Atom is a syndication format that's similar, but slightly different from RSS. I wonder how the market will shake out now.

Evan: can you explain, in layman's terms, why you support Atom and not RSS?

This question is misleading. There are two parts to ATOM that are being discussed by Google, the ATOM API and the ATOM syndication format. The ATOM API is competitive with technologies like the Blogger API, MetaWeblog API and the LiveJournal API while the ATOM syndication format competes with technologies like RSS 1.0 and RSS 2.0.

There has been enough written about the history of feed syndication formats named "RSS" so I'll skip that discussion and move directly to discussing the history of weblog posting APIs.

The Blogger API was originally developed by Blogger (now owned by Google) as a way of allowing client applications such as w.bloggar to talk to Blogger weblogs. This API was later adopted by other blogging tools such as Radio Userland. However, Dave Winer decided he didn't like some of the perceived deficiencies in the Blogger API and forked it, thus creating the MetaWeblog API. Later on the Blogger folks came out with version 2.0 of the Blogger API, which led to an online war of words with Dave Winer because he felt they should use his forked version instead, even though his version removed functionality that was crucial to Blogger. Eventually Blogger backed off from implementing v2.0 of their API and has been waiting for an alternative, which presented itself in the ATOM API. Most of this history is available from Evan Williams's blog post entitled the Tragedy of the API.

<update source="Dave Winer" >

  1. ManilaRPC came first, way before all the others you mention. It was an XML-RPC then SOAP-based API for driving Manila, and is still in use today, and is much deeper than any of the other APIs.
  2. The MetaWeblog API addressed a very well-known deficiency in the Blogger API, no support for titles. You neglected to mention that it was broadly supported by tools and blogging systems, by everyone except Blogger.

The ATOM effort is aimed at replacing both the popular syndication formats and the popular weblog publishing APIs, both of which have been burdened with histories full of turbulent turf battles and personal recriminations.

Based on my experiences working with syndication software as a hobbyist developer for the past year, the ATOM syndication format does not offer much (if anything) over RSS 2.0, but the ATOM API looks to be a significant step forward compared to previous attempts at weblog editing/management APIs, especially with regard to extensibility, support for modern practices around service oriented architecture, and security. The problem is that since the ATOM API uses the feed syndication format as the payload of the messages sent between the client and the server, anyone who implements the API has good reason to implement the ATOM syndication format as well. This is probably why Blogger/Google will support both the ATOM API and the ATOM syndication format.

I personally tend to agree with Don Park's proposal

IMHO, the most practical path out of this mess is for the Atom initiative to hi-jack RSS 2.0 and build on it without breaking backward compatibility.  A new spec will obviously have to be written to avoid copyright problems with Dave's version of the RSS 2.0 spec, but people were complaining about the old spec anyway.

As to the Atom API, I won't bitch about it any more if RSS 2.0 is adopted as the core Atom feed format because the feed format is far more important than the API.  This should satisfy Evan Williams since his real beef is with the API.  Yes, there are some issues people have with RSS 2.0 but they can be ignored or worked-around with extensions until later, hopefully much later.

This compromise would give the best of all worlds to users. There is no discontinuity in syndication formats, yet blog editing APIs are improved and brought in line with 21st century practices. I've mentioned this on the atom-syntax mailing list in the past but the idea seemed to receive a cold reception.

Regardless of what ends up happening, the ATOM API is best poised to be the future of weblog editing APIs. The ATOM syndication format on the other hand...


Categories: XML

I've begun to dread every time I see a blog entry in my aggregator with "XAML" in the title. It usually means I am either going to read a lot of inane fanboy gushing about the latest and greatest from Microsoft or some FUD from a contingent that either misunderstands the technology or has an axe to grind with Microsoft. So much so that I've been contemplating adding a "hide entry if it contains keyword" feature to RSS Bandit so I never have to read another post about XAML. Anyway, back to the point of my post.

Diego Doval has an entry entitled XAML and... Swing which contained a number of opinions that completely perplexed me. I'll go over each one in turn.

XAML will be Windows-only, so in that sense the comparison is stretched. But this is a matter of practice, in theory an XML-based language could be made portable (when there's a will there's a way). XAML was compared a lot to Mozilla's XUL, and rightly so, but I think there are some parallels between it and Swing as well.

In theory, every language targeted at a computer is portable to other platforms. However, if I wrap XML tags around C++ code that uses Win32 API calls, how portable is that in practice? As for parallels between XAML and Swing, I thought this was extremely obvious. XAML is the XML-ized way to write what one could consider to be the next generation of WinForms (managed APIs for interacting with Windows GUI components) applications. In fact, someone has already implemented XAML for WinForms, called Xamlon. Considering that Swing (Java APIs for interacting with operating system GUI components) is analogous to WinForms, it isn't a leap to see a parallel between XAML and Swing.

One big difference that XAML will have, for sure, is that it will have a nice UI designer, something that Swing still lacks. On the other hand, I think that whatever code an automated designer generates will be horribly bloated. And who will be able to write XAML by hand?

One of the chief points of XAML being an XML-based markup language is so that people can actually author it. My personal opinion is that this is more of a bad thing than a good thing; I prefer using GUI tools to design a user interface to dicking around with markup files. I've always disliked technologies like CSS and ASP.NET, and moving GUI programming in that direction seems to me like a step backwards, but based on the enthusiasm about XAML shown by various people in the developer community it seems I am a Luddite.

The main thing I want to point out about Diego's statements so far is that they are FUD; no designer has been demoed for XAML, let alone one that generates bloated code. This is all just negative speculation, but let's go on...

And: the problem of "bytecode protection" in Java comes back with XAML, but with a vengeance. How will the code be protected? Obfuscation of XML code? Really? How would it be validated then? And why hasn't anyone talked about this.

XAML is compiled to IL. XAML obfuscation questions are IL obfuscation questions. If you're gung ho about protecting your "valuable IP" with IL obfuscation technologies then grab a copy of Dotfuscator or Spices.NET.

On a related note, Robert says this regarding XAML:

[...] you will see some business build two sites: one in HTML and one in XAML. Why? Because they'll be able to offer their customers experiences that are impossible to deliver in HTML.

Come on, Robert, these days, when everyone's resources are stretched to the limit, when CIOs want to squeeze every possible drop of code from their people, when everyone works 60-hour weeks as a matter of common practice, are you seriously saying that companies will have two teams to develop a single website? Is this Microsoft's selling point? "Here, just retrain all of your people, and double the size and expense of your development team, and you'll be fine."

I tend to agree with Diego here. Having a XAML-based website on the Internet will most likely be cost ineffective for quite a while. On the other hand, it is quite likely that using XAML on an intranet will not be. Corporate intranets are all about homogeneous environments, which is why you tend to see more Java applets, IE-specific pages and ActiveX controls used within intranets than on the global World Wide Web. If I were a member of the Longhorn evangelization team or any of the other public faces of Longhorn or Avalon, I wouldn't focus much on XAML on the World Wide Web, but that's just my opinion.

That leaves two possibilities: 1) XAML will be niche and never really used a lot (think ActiveX, or, hey, even Java Applets!) or 2) XAML will kill HTML.

Talk about completely missing the point. XAML is primarily for writing Windows client applications, y'know, like RSS Bandit or SharpReader, not for delivering stuff on the Web. I don't think anyone at Microsoft is silly enough to think that XAML will replace HTML. The idea is completely laughable.

It is always amazing seeing how stupid and arrogant people tend to think Microsoft is.

Categories: Life in the B0rg Cube

Halley Suitt writes

Employers are about to lose a lot of "loyal" employees who have been sticking around through the bad economy, but are more than ready to jump ship as the job market snaps back.

Business Week wrote about this in October, but I think it's coming on even stronger now. BW suggests employers are in for a rude awakening:

I get the same feeling while walking the hallways of the B0rg cube. I suspect that if this becomes a problem in the near future the B0rg will try to adapt as it always has.

Categories: Ramblings

"Clairol haircolor transformed me from a college graduate to a successful financial advisor. Clairol gives my hair the shine I need to brighten my face and my spirit, which pumps up my confidence. I'm energized! Today I manage millions — next year I'll manage tens of millions! People trust me. That's inspiring!"

"My blonde hair had gone gray — I felt depressed. I went blonde one day for a big party and it was quite a hit! I felt GREAT! I started a diet, lost 72 lbs., began belly dance lessons to keep the weight off and still color my beautiful long blonde hair!"

It's quite impressive what a catalyst for self-improvement a simple change like dyeing your hair can be. There were a number of similar testimonials submitted by the New Year New You! Contest Winners. Read them, be inspired, dye your hair.


December 1, 2003
@ 12:38 PM

Doc Searls wrote

Britt Blaser is a techblogger who will never be a warblogger because he's been there, done that, and collected a lot more than a t-shirt: namely, three Distinguished Flying Crosses, including one for the legendary Fire Flight at Katum.
  His latest post is Voice of Experience:
  This post will make the most sense for those who have witnessed war and are not freaked out by the cold calculus of accepting death as a constant and the loss of buddies as gut-stirring but as inevitable as taxes. Most of the rest of the world has been forced to experience war first hand. Perhaps that's why the rest of the world is unimpressed with this administration's gung-ho attitude, so typical of raw recruits and so uncharacteristic of adults who've peered into the abyss and lived to describe it.
  I hate to diss fellow bloggers, but the warbloggers seem to have a paucity of combat experience. We would never entertain the views of programmers who've never hacked code, or historians who've never read history. Why would we listen carefully to warbloggers who've never watched tracers arcing toward their position?
  Every warrior knows that perfect safety is a fool's paradise. The premise of the current war on terror is that we can entertain our way out of the terrorist threat. It's entertainment to feel an illusory omnipotence that will hunt down every evil-doer and infidel, a kind of adolescent road rage, really. The old heads in your squadron know to protect such greenhorns from their enthusiasms, at least until they learn or die. "There are old pilots and bold pilots. There are no old, bold pilots."

The more I think about it, the more I tend to feel that GW Bush's reelection is in the bag. Posts like the one linked above from Britt Blaser cement this feeling. I deeply suspect that, from the perspective of the average "man on the street" in the US who felt rage at the events of September 11th 2001, the US government has delivered in spades; retribution has been wreaked across two continents with minimal losses to US forces, the message has clearly been sent that if you screw with the US you get burned, and there have been no significant terror attacks on US soil despite several threats from terrorist organizations. This opinion is based on the general sentiments I get from reading open forums where people from diverse backgrounds discuss current affairs, such as the Yahoo! Message Boards.

The position of this mythical "man on the street" is very difficult to assail, even with well-written posts such as Britt Blaser's. No matter how much one disagrees with the decisions the current US administration has made as part of its "War on Terror", it is hard to argue with the fact that so far it has seemed relatively successful in the ways that are immediately noticeable. The various counterarguments to this position I have seen online usually sound like Britt Blaser's; they tend to argue that the current course of action is wrong but do not provide alternatives, or they claim that there will be negative consequences for the current course of action but none of the consequences are immediate. These arguments don't hold up well compared to the aforementioned successes of the "War on Terror". If people feel safer, regardless of whether they are actually safer or not, then it is hard to convince them otherwise, especially when there isn't any concrete way to justify that position one way or the other.

Categories: Ramblings

December 1, 2003
@ 11:31 AM

From the BBC

Red-faced officials at General Motors in Canada have been forced to think of a new name for their latest model after discovering it was a slang word for masturbation.

GM officials said they had been unaware that LaCrosse was a term for self-gratification among teenagers in French-speaking Quebec.

The article describes a couple of similar issues with product names as they cross the language barrier. The most amusing story was the poor reception of the Ford Pinto in Brazil, which was attributed to the fact that in Brazilian Portuguese slang, pinto means "small penis".