May 22, 2004
@ 06:02 PM

Joshua Allen has a post entitled RSS Politics which does a good job of properly framing the growing Microsoft and RSS vs. Google and Atom silliness spurred by Joi Ito that I've been seeing in the comments on Robert Scoble's weblog. Joshua writes

First, be very clear.  The “debate” over Atom vs. RSS is a complete non-issue for Microsoft.  We use RSS to serve thousands of customers right now, and most of the people setting up RSS feeds have never heard of the political “debates”.  RSS works for them, and that's all they care about.  On the other hand, if Atom ever reaches v1.0 and we had a business incentive to use it, we would use it.  No need for debate.

Now, of the three or four people at Microsoft who know enough about Atom to have said anything about it, I wouldn't say that anyone has trashed the format.  I and others have pointed out that it's just fine for what it does; just like RSS.  If anything, I have asked hard questions about why I or any business decision maker should be spending resources on the whole debate right now.  If a business has deployed using RSS, what financial motive would they have to switch to a new, nearly identical, format once it ships?  I've got nothing against the Atom people inventing new syndication formats, but I just don't see why *I* should be involved right now.  There's no good reason.

The other comment I've made before is that the Atom community is not being served by the polarizing attitudes of some participants.  The “us vs. them” comments are not helpful, especially when untrue, and the constant personalization (“Support Atom because I hate Dave Winer!”) just damages the credibility of the whole group (many of whom might have good motives for being involved).

I totally echo his sentiments. In the past couple of months more and more folks at Microsoft have pinged me about syndication and blogging technologies once they learn I wrote RSS Bandit. Every single time I've given them the same advice I gave in my post, Mr. Safe's Guide to the RSS vs. ATOM debate. If you are a feed consumer you'll need to support the various flavors of RSS and the various flavors of ATOM (of which there'll be at least two, ATOM 0.3 and whatever is produced from the IETF/W3C process). If you are a feed producer, you should stick with RSS 0.91/2.0 since it is the most widely supported format and the most straightforward.
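
To make the feed consumer advice concrete, here is a minimal sketch in Python of what supporting several flavors at once can look like. It is only an illustration, not how RSS Bandit does it: the function name `entry_titles` and the two sample feeds are made up for this example, and it only handles plain RSS 0.9x/2.0, RSS 1.0 and ATOM 0.3.

```python
import xml.etree.ElementTree as ET

# Namespace URIs of the namespaced feed flavors handled in this sketch.
ATOM_03_NS = "http://purl.org/atom/ns#"
RSS_10_NS = "http://purl.org/rss/1.0/"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical sample feeds, trimmed to the bare minimum.
RSS_SAMPLE = (
    '<rss version="2.0"><channel><title>Example</title>'
    "<item><title>Hello RSS</title></item></channel></rss>"
)
ATOM_SAMPLE = (
    '<feed version="0.3" xmlns="http://purl.org/atom/ns#"><title>Example</title>'
    "<entry><title>Hello Atom</title></entry></feed>"
)


def entry_titles(feed_xml):
    """Return (flavor, titles), dispatching on the root element of the feed."""
    root = ET.fromstring(feed_xml)
    if root.tag == "rss":
        # RSS 0.91/0.92/2.0: un-namespaced rss/channel/item/title
        flavor = "rss " + root.get("version", "unknown")
        nodes = root.findall("./channel/item/title")
    elif root.tag == "{%s}RDF" % RDF_NS:
        # RSS 1.0: items are siblings of the channel under rdf:RDF
        flavor = "rss 1.0"
        nodes = root.findall("./{%s}item/{%s}title" % (RSS_10_NS, RSS_10_NS))
    elif root.tag == "{%s}feed" % ATOM_03_NS:
        # ATOM 0.3: feed/entry/title in the ATOM 0.3 namespace
        flavor = "atom " + root.get("version", "unknown")
        nodes = root.findall("./{%s}entry/{%s}title" % (ATOM_03_NS, ATOM_03_NS))
    else:
        raise ValueError("unrecognized feed root element: %s" % root.tag)
    return flavor, [(node.text or "").strip() for node in nodes]


print(entry_titles(RSS_SAMPLE))
print(entry_titles(ATOM_SAMPLE))
```

A real aggregator has to cope with far more than titles (dates, content encodings, malformed XML), but the shape is the same: sniff the flavor, then map each one onto a common internal model.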

Although no one has asked yet, I'm also going to give my advice on whether Mr. Safe at Microsoft should consider adopting the ATOM API. In my personal opinion, the current draft of the ATOM API seems better designed and falls more in line with Microsoft's technologies than the existing alternatives (Blogger API, MetaWeblog API, LiveJournal API, etc.). However, the API lacks a lot of functionality, and extensions to the ATOM API are already showing up in the wild. Currently, these "innovations" are being lauded, but given the personalities behind ATOM it is likely that if Microsoft products supported the API and extended it there could be a negative backlash. In that case, going with a product-specific API may be the best option if there is sensitivity to such feedback or if the ATOM API has to be significantly extended to fit the product's needs.


 

Categories: Life in the B0rg Cube | XML

I've posted a few entries in the past questioning the value of the Semantic Web as currently envisioned by the W3C along with its associated technologies like RDF and OWL. My most recent post about this was On Semantic Integration and XML. It seems I'm not the only XML geek who's been asking the same questions after taking a look at the Semantic Web landscape. Elliotte Rusty Harold is at WWW2004 and wrote the following opinions of the Semantic Web on Day 4 of WWW2004

This conference is making me think a lot about the semantic web. I'm certainly learning more about the details (RDF, OWL etc.). However, I still don't see the point. For instance what does RDF bring to the party? The basic idea of RDF is that a collection of URIs forms a vocabulary. Different organizations and people define different vocabularies, and the URIs sort out whose name, date, title, etc. property you're using at any given time. Remind you of anything? It reminds me a lot of XML + namespaces. What exactly does RDF bring to the party? OWL (if I understand it) lets you connect different vocabularies. But so does XSLT. I guess the RDF model is a little simpler. It's all just triples, that can be automatically combined with other triples, and thereby inferences can be drawn. Does this actually produce anything useful, though? I don't see the killer app. Theoretically a lot of people are talking about combining RDF and ontologies from multiple sources to find knowledge that isn't obvious from any one source. However, no one's actually publishing their RDF. They're all transforming to HTML and publishing that.

I've written variations of the same theme over the past couple of months. It's just hard to point at any practical value that RDF/OWL/etc provide over XML/XSLT/etc for semantic integration.
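
For readers who haven't looked at the triple model being discussed, here is a toy sketch in Python of what "combining triples and drawing inferences" amounts to. Everything in it is an assumption made for illustration: the example.org and example.net vocabularies and the data are invented, the function names are mine, and the only real identifier is the owl:equivalentProperty URI. It applies the equivalences in a single pass and is nowhere near a real OWL reasoner.

```python
# A "graph" is just a set of (subject, predicate, object) triples.
EQUIVALENT = "http://www.w3.org/2002/07/owl#equivalentProperty"

# Two hypothetical sources describing the same book with different vocabularies.
source_a = {
    ("http://example.org/book/1", "http://example.org/vocab-a#title", "Example Book"),
}
source_b = {
    ("http://example.org/book/1", "http://example.net/vocab-b#creator", "A. Author"),
    # Mapping statement: vocab-b#name means the same thing as vocab-a#title.
    ("http://example.net/vocab-b#name", EQUIVALENT, "http://example.org/vocab-a#title"),
}


def merge(*graphs):
    """Combining graphs is plain set union, since every statement is a triple."""
    return set().union(*graphs)


def apply_equivalences(graph):
    """Copy each statement across predicates declared equivalent (one pass only)."""
    pairs = [(s, o) for (s, p, o) in graph if p == EQUIVALENT]
    inferred = set(graph)
    for a, b in pairs:
        for (s, p, o) in graph:
            if p == a:
                inferred.add((s, b, o))
            elif p == b:
                inferred.add((s, a, o))
    return inferred


combined = apply_equivalences(merge(source_a, source_b))
for triple in sorted(combined):
    print(triple)
```

The merged graph gains one statement that neither source contained: the book's vocab-b#name. That is the whole trick, and it is also roughly what a small XSLT stylesheet mapping one vocabulary onto another would give you, which is the point of the question above.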


 

Categories: XML

Every couple of months someone asks me why I haven't written up my thoughts about the current and future trends in social software, blogging and syndication as part of a Bill Gates "Think Week" paper. I recently was asked this again and I'm now considering whether to spend some time doing so or not. If you are unfamiliar with a "Think Week", below is a description of one taken from an interview with Bill Gates

I actually do this thing where I take a week and I call it "Think Week" where I just get to go off and read the latest Ph.D. theses, try out new technologies, and try and write down my thoughts about where the market is going. Things are going fast enough that instead of doing one think a year, last year I started doing two a year. And that’s one of the most fun parts of my job. So, you know, not only trying things out, but seeing how the pieces fit together and thinking ahead what kind of software will that require, that’s a big part of my job. And I get lots of great ideas coming from the people inside Microsoft, whether it’s sending e-mail, or meeting with me, and it’s important for me to synthesize that and so there’s a lot of thinking that I’ve got to do. And, you know, that’s fun.

I have been balking at writing one for a few reasons. The first was that it seems like a bunch of effort for relatively small return (the people I know first hand who've written one got the equivalent of a "virtual pat on the back"). The second was that I didn't think this topic would be interesting enough to get past the layer of VPs and technical assistants that probably screen these papers before Bill Gates reads them.

After thinking about this some more it seems that I was wrong about whether BillG would be interested in this topic, given his recent endorsement of blogging and syndication. I still don't think much would come out of it, but I now find myself bursting with so many ideas about the current and future landscape of blogging and syndication technologies that I definitely want to write something down regardless of who reads it. If I write this paper I plan to make it available online along with my other writings. The question is whether there are any folks out there interested in reading such a paper. If not, it is easier for me to just keep notes on the various ideas and blog bits & pieces of them as I have been doing thus far.

So what do you guys think?


 

Categories: Ramblings | RSS Bandit

Scoble has a misleading post entitled Microsoft attending Atom meeting

Microsoft attending Atom meeting

Some people have already tried to paint me into a corner when it comes to RSS vs. Atom. Just to be clear. Microsoft's Chris Sells and George Bullock, of Microsoft, are attending the June 2 Atom group meeting.

This post reads as though official representatives of Microsoft are attending the Atom meeting. Considering that Chris Sells and George Bullock are MSDN folks, it is highly unlikely that they will be representing all of Microsoft or the major parts of Microsoft that would be interested in Atom. I work with standards and product groups every day and I always try to make the distinction between official Microsoft positions and personal positions. Even then, official Microsoft opinion may vary from product group to product group (it is really a bunch of small companies in here).


 

Categories: Ramblings

I've been reading the various pieces of feedback on my recent blog post on Why You Won't See XSLT 2.0 or XPath 2.0 in the Next Version of the .NET Framework, including the 40 comments in response to the post and the "Microsoft is killing XSLT" thread on xsl-list. Most of it has been flames with little useful feedback, but there was an interesting response by Norm Walsh entitled XQuery 1.0 or XSLT 2.0? which I've been drawn to respond to. Norm writes

Dare Obasanjo argues that “XQuery is strongly and statically typed while XPath 2.0 is weakly and dynamically typed.” What’s not clear from his post is that he is comparing XQuery 1.0 to XPath 2.0 in backwards compatibility mode (Michael Rys did provide a clarification). That’s an odd comparison to make. XPath 2.0 needs a backwards compatibility mode so that it stands some chance of doing the right thing when used in the context of an XSLT 1.0 stylesheet, but that’s not the expected mode for long-term use.

I thought my point was self-evident here, but if Norm missed it then most of the people who read my original blog post probably did as well. XPath 2.0 is a subset of XQuery 1.0. The parts of XQuery it is missing are XML construction, the query prolog, the let-where-orderby parts of the FLWOR expression, typeswitch and a few other things. XPath 2.0 has a backwards compatibility mode which has different semantics from regular XPath 2.0 and XQuery. When I talked about Microsoft not implementing XPath 2.0 I meant XPath 2.0 in backwards compatibility mode, since implementing XQuery means you already have regular XPath 2.0. After all, everything you can do in XPath 2.0 you can do in XQuery.

Norm also writes

The funniest arguments are the ones that imply that XQuery is a competitor in the same problem space as XSLT, that users will use XQuery instead of XSLT. I say that’s funny because there are so many problems that you simply cannot solve with XQuery. If your data is regular and especially if it’s all stored in a database already so that your XQuery implementation can run really fast, then XQuery absolutely makes sense, but didn’t the database folks already have a query language? Nevermind. If your customers don’t need to solve the kinds of problems for which XSLT was designed, or if you want to sell them some sort of proprietary system to solve them, then implementing XSLT 2.0 probably doesn’t make sense.

I've seen variations of the above theme (XSLT is for transformation, XQuery is for query) in various responses to my original post. Taking the words query and transformation out of the picture, both XQuery and XSLT are designed to reshape XML data. SQL is primarily a query language but you can use it to reshape relational data; this is exactly how SQL views work. For most people, the transformations they want to perform using XSLT can also be expressed using XQuery. Per Bothner wrote an article over a year ago on XML.com about Generating XML and HTML using XQuery showing how you could use XQuery to transform an XML document to another XML format or HTML. There are a few niceties in XSLT 2.0 that don't exist in XQuery, such as the ability to write to multiple output streams, but in general most of the things you can do in XSLT 2.0 can also be done in XQuery. In fact this leads me to something else Norm wrote

If you want to transform documents that aren’t regular, especially documents that have a lot of mixed content, XSLT is clearly the right answer. I’ll wager dinner at your favorite restaurant that XQuery cannot be used to implement the functionality of the DocBook XSLT Stylesheets. (You produce the XQuery that does the job, I buy you dinner.)

First of all, XSLT is actually very bad at dealing with XML that isn't regular and has lots of mixed content. This is why a number of XSLT gurus got together to create EXSLT and why I started the EXSLT.NET project (grab the latest version from the Microsoft.com download servers here). As for transforming DocBook with XQuery, as I mentioned before Per Bothner wrote an article about using XQuery for transformations. In fact, he specifically writes about Transforming DocBook to HTML using XQuery.

The bottom line is that XQuery is as much a "transformation language" as XSLT. XSLT may have some functionality that XQuery does not have but there isn't much I've seen that couldn't be implemented using extension functions. Perhaps I should start an EXQuery.NET project? :)
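
The SQL views analogy from earlier in this post is easy to show in a few lines. The sketch below uses Python's built-in sqlite3 module purely as a convenient way to run SQL; the `posts` table, its rows and the `category_summary` view are invented for the example. The point is only that a language we all call a query language is routinely used to reshape data into a different structure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, category TEXT);
    INSERT INTO posts (title, category) VALUES
        ('RSS Politics', 'XML'),
        ('Think Week', 'Ramblings'),
        ('XQuery vs. XSLT', 'XML');
    -- The view reshapes one-row-per-post data into one-row-per-category data.
    CREATE VIEW category_summary AS
        SELECT category, COUNT(*) AS post_count
        FROM posts
        GROUP BY category;
""")

# Consumers query the view as if it were a table with the new shape.
rows = conn.execute(
    "SELECT category, post_count FROM category_summary ORDER BY category"
).fetchall()
print(rows)  # [('Ramblings', 1), ('XML', 2)]
```

Nobody calls a view a "transformation", yet that is what it does to the underlying rows, which is the same blurring of query and transformation I'm arguing for with XQuery.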

 


 

Categories: XML

A few months ago Mark Fussell wrote an article entitled What's New in System.Xml for Visual Studio 2005 and the .NET Framework 2.0 Release. Mark Ihimoyan has a follow-up series of blog posts describing which of the new System.Xml features mentioned in Mark Fussell's article will actually be in the .NET Compact Framework. The blog posts are listed below

  1. System.Xml in NETCF v2.0 part I
  2. System.Xml in NETCF v2.0 part II
  3. System.Xml in NETCF v2.0 part III

 


 

Categories: XML

I recently stumbled on Mike Padula's website. Mike is a Cornell student who's analyzing blogging at Microsoft as part of his course work. We've exchanged some email, and I've offered to answer some questions he raised and clarify some points in my blog. Below are links to Mike's writings thus far

  • Expanding on Interesting Developments
  • Interesting Developments
  • Typical Features of a Blog and their Use
  • Blog What? That's What.
  • Blog What? Part 2
  • Blog What? Part 1
  • Let's start things off: A little intro
  • Proposal

One of Mike's main questions is whether blogging at Microsoft is a concerted effort or not. To answer this I'll give a brief history lesson. A couple of years ago, there was one visible Microsoft blogger, Joshua Allen. Back in 2001, Joshua was blogging about life at Microsoft, XML and current affairs, way before blogging became a hip buzzword that every new-age marketing department tries to adopt. Joshua took a lot of the early heat for being a Microsoft blogger, from arguments with Open Source advocates such as Eric Raymond, to having coworkers try to get him fired for revealing too much about working at Microsoft in his blog, to having to deal with PR and HR folks concerned about blogging. Through all this Joshua stuck it out and was an example to other Microsoft folks who later showed up with an interest in blogging and wondered if it was kosher to do so. I was one of them.

The catalysts for much of the growth of blogging at Microsoft today are twofold. The first was Chris Anderson starting to blog. Chris was gung ho about blogging and not only wrote his own software but created an internal server for hosting blogs which became moderately popular. Many Microsoft folks who blog today started with internal blogs hosted on Chris's machine. The other catalyst is the changing climate internally towards interacting with customers and being 'community' focused. I believe this was directly pushed by execs like Eric Rudder and Steve Ballmer. Once Microsoft people saw highly internally and externally visible people like Don Box and Robert Scoble blogging, the rest was inevitable, and now we have the current situation where there are almost six hundred Microsoft folks blogging at http://blogs.msdn.com.

Mike assumes that there is a concerted effort at Microsoft to blog. This isn't the case as far as I've seen. I know that many product teams now require that people engage in some form of customer interaction, and blogs are one way of doing so, but there's never been a formal edict. In many cases, coworkers see a colleague's blog, think blogging is a neat idea, then start blogging themselves.

Since we're going down memory lane I should point out that http://blogs.msdn.com kinda happened by accident. About a year ago most Microsoft bloggers were hosted in disparate locations until blogs.gotdotnet.com was launched with Betsy Aoki in charge. Around the same time Scott Watermasysk had launched .NET Weblogs, which was aimed at being a community for developers interested in the .NET Framework. After a while, the bandwidth costs for .NET Weblogs got too high and Scott needed help. The Microsoft ASP.NET team came to the rescue and Weblogs@ASP.NET was born. Eventually, Blogs@GotDotNet.com also couldn't take the traffic, and Betsy was getting overloaded with feature requests and bug reports since Blogs@GotDotNet was running Chris Anderson's BlogX software, which he'd stopped maintaining and which I'd offered to take over but never actually did. Again the ASP.NET team came to the rescue, and the plan was to migrate Blogs@GotDotNet to Weblogs@ASP.NET. The original plan was simply to have all Microsoft blogs merge with those on the Weblogs@ASP.NET site without demarcating who worked for Microsoft and who didn't. This was met with some resistance by the existing users of Weblogs@ASP.NET as well as criticism from folks like myself and Josh Ledgard on internal mailing lists. Eventually the folks at MSDN relented and decided to go with a plan where the Microsoft bloggers were merged with Weblogs@ASP.NET but one could filter for Microsoft employees by going to http://blogs.msdn.com. Folks like Sara Williams, Betsy, Scott and the ASP.NET folks deserve the praise for getting this done. This solution seemed to satisfy everyone involved. Now http://blogs.msdn.com is basically where most Microsoft people (not just employees of MSDN) who want to blog in the context of their jobs have their blogs hosted.

This should answer a bunch of Mike's open questions about blogging at Microsoft. There are two questions specific to my blog I should answer as well. Mike wonders what kind of traffic I get. Based on tracking unique IPs, I'd say I have about a thousand or so regular readers of my personal weblog. Since my work weblog is syndicated on the MSDN XML Developer Center its readership is in the tens of thousands. Mike also wondered how I noticed his project. Three words, Technorati Link Cosmos.


     

    Categories: Life in the B0rg Cube

    In his blog posting entitled On probation Eric Gunnerson writes

    I got an email today from the owner of all the MSDN columns, telling me that if I wasn't able to produce a column every other month, my column would be put on probation, and then cancelled.

    I'm frankly surprised it took this long - the whole essence of a column or any other periodical is that it is just that - periodical. The combination of me writing a blog and spending a lot of time doing PM stuff has meant that my column has been neglected. June, September, and February does not a periodical column make.

    I got the same email that Eric got since I'm the author of the Extreme XML column on MSDN. Looking at the history of the column since I took it over about two years ago, I have averaged an article every two months, so the new schedule accurately reflects my rate of output and I no longer have to make excuses about missing an article every other month or so.

    Eric asks whether he should keep blogging or keep the column. I believe this is a false dichotomy and shouldn't be an EITHER-OR choice. Comparing an MSDN column to the typical Microsoft blog at blogs.msdn.com I see a number of key differences. A column on MSDN meets different needs and reaches different audiences from a blog. A column is read by tens of thousands of developers online and several thousand more who get it as part of the MSDN library CDs/DVDs. A blog is read by a couple of hundred to a few thousand people. A blog posting typically contains quick tips, interesting tidbits about future technological directions or answers to frequently asked questions. A column provides insight into a particular technology or feature and usually provides significant code samples that developers can build upon, which helps them significantly in their daily lives. Up until I started RSS Bandit I had never written a GUI application, and I learned most of what I know about building multithreaded GUI apps in Winforms from Chris Sells' Wonders of Windows Forms column. Then there is the information on how system tray applications work that I learned from Eric Gunnerson's article Creating a System Tray Application. Oleg Tkachenko was inspired to build XInclude.NET by Chris Lovett's article XInclude, Anyone?. The list goes on...

    I think Microsoft's developer customers should get both and don't think one necessarily replaces the other. If given the choice between fewer blog postings from Eric about things he wishes were different in C# and why C# doesn't have const methods, or fewer articles about how to get Exchange to talk to Excel using C# or how to build a system tray application complete with code samples, I could definitely do with fewer blog postings and more articles.

    As blogging became the vogue at Microsoft I've worried that people will think blogging should replace traditional, more useful means of providing information to our customers. It's one thing that blogging lets out the voices that couldn't surmount the high barriers to producing official content (MSDN articles, product documentation, KB articles, etc) but it's another when it is expected to replace official documentation. By the way, don't get me started on the email discussion I saw where some product team wanted to use a link to some blog postings in lieu of product documentation. *sigh*


     

    Categories: Life in the B0rg Cube

    The Microsoft Patterns and Practices folks have produced an excellent guide to Improving .NET Application Performance and Scalability with a chapter on Improving XML Performance. If you build .NET Framework applications that utilize XML then you owe it to yourself to take a look at the guidelines in that document. There is also a handy, easily printable XML Performance checklist which can be used as a quick way to check that your application is doing the right things to get the best XML performance.

    On a similar note, Mark Fussell has posted XmlNameTable: The Shiftstick of System.Xml and XmlNameTable Revisited which provide some tips about how to use the XmlNameTable class to improve processing speed by up to 10% when processing XML documents.
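
The idea behind a name table is string atomization: every distinct element or attribute name is stored once, so later comparisons can check object identity instead of comparing characters. The sketch below illustrates that idea in Python. It is not the System.Xml API; the `NameTable` class, its `add` method and `count_items` are hypothetical names made up for this illustration, and Python's own string handling means you would not see the same speedup there.

```python
class NameTable:
    """Hypothetical atomizing table: one canonical string object per distinct name."""

    def __init__(self):
        self._names = {}

    def add(self, name):
        # Return the stored object if an equal name was seen before,
        # otherwise remember this one as the canonical copy.
        return self._names.setdefault(name, name)


def count_items(raw_names, table, target):
    """Count names matching the atomized target using identity comparison only."""
    return sum(1 for raw in raw_names if table.add(raw) is target)


table = NameTable()
ITEM = table.add("item")

# Names as a parser might hand them over: equal text, freshly built objects.
names = ["item", "title", "".join(["it", "em"]), "link"]
print(count_items(names, table, ITEM))  # 2
```

The pattern is the same one the articles describe: atomize the names you care about up front, have the parser atomize names through the same table, and then compare references in the inner loop.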


     

    Categories: XML

    I collect about half a dozen comic book titles and I've noticed a growing trend of blurring the line between a super hero's secret identity and their super hero identity. In the old days, a super hero had a regular day job with coworkers, girlfriends and bills to pay, and put on his or her tights at night to fight crime. Nowadays I read comics about the same characters I read about as a child, who now either no longer keep their civilian identity secret or have had their secret identity revealed to so many people that it might as well not be a secret.

    This trend is particularly true in Marvel's Ultimate universe. If you are unfamiliar with the Marvel Ultimate universe here is a brief description of it from the Wikipedia entry for Marvel Universe

    A greater attempt has been made with the Ultimate titles; this series of titles is in a universe unrelated to the main Marvel continuity, and essentially is starting the entire Marvel Universe over again, from scratch. Ultimate comics now exist for the X-Men, the Avengers, Spider-Man, and the Fantastic Four. Sales of these titles are strong, and indications are that Marvel will continue to expand the line, effectively creating two Marvel Universes existing concurrently. (Some rumors exist that if sales continue to increase and more titles are added, Marvel may consider making the Ultimate universe its main universe.)

    In the Marvel Ultimate universe the Avengers (now known as the Ultimates) are government agents who are treated as celebrities by the tabloids and whose non-super hero identities are known to the public. The Ultimate X-Men appear on the cover of Time magazine and have met with the president several times. The anti-mutant hysteria that is a mainstay of the regular Marvel universe is much more muted in the Ultimate Marvel universe (thank God for that, they had gone overboard with it, although classics like God Loves, Man Kills will always have a special place in my heart). The identity of Ultimate Spider-Man isn't known to the general public but it is known by his girlfriend (Mary Jane), an orphan adopted by his aunt (Gwen Stacy), the Ultimates, all of the major villains Spidey has met (Doc Ock, Green Goblin, Kraven the Hunter, Sandman & Electro) as well as most of the staff of S.H.I.E.L.D.

    This has also spread to the regular Marvel universe, most noticeably with Daredevil. His secret identity was known by the Kingpin for a long time and eventually was an open secret to most of the Kingpin's criminal organization. In recent issues, Daredevil has been outed as Matt Murdock in the tabloids and has to deal with assassination attempts on him in his regular life as well as when he is Daredevil.

    DC Comics is also playing along somewhat with Batman. Although it isn't common knowledge that Batman is Bruce Wayne, there are now so many heroes (the entire Justice League, Robin, Nightwing, Spoiler, Batgirl, Huntress, Oracle) and villains (the Riddler, Hush, Bane, Ra's al Ghul) who know that it might as well be public.

    I suspect that one of the reasons for this trend is a point that the character Bill makes in Kill Bill vol.2 towards the end of the movie. He points out that most super heroes are regular people with regular lives that have a secret identity as a super hero while Superman was actually a super hero who had a secret identity as a regular person. Getting rid of the artificial division between super hero and alter ego makes sense because we tend to look at them as different people (Bruce Wayne is nothing like Batman) when in truth they are different facets of the same character. The increased connectedness of society as a whole has also made it easier to blur the lines between various aspects of one's character that used to be kept separate. I think comic book authors are just reflecting this trend.

    Speaking of reflecting current trends in comics, I was recently disappointed and then impressed by statements made by the Ultimate version of Captain America. In Ultimates #12, Cap is fighting the apparently indestructible leader of the alien invasion army, who's just survived getting half his head blown off by an assault rifle, when this exchange takes place

    Alien Leader: Now let's get back to business, eh, Captain? The world was about to go up and you were about to surrender in these few brief moments we've got left. Let me hear you say it. “I surrender Herr Kleiser! Make it quick!”.

    Captain America: *head butts and then starts to beat up the alien leader while saying* - Surrender? Surrender? You think the letter A on my head stands for France?

    This issue came out when the "freedom fries" nonsense was still somewhat fresh in people's minds and I was very disappointed to read this in a comic book coming from a character I liked. However he recently redeemed himself with his line from a conversation with Nick Fury in Ultimate Six #7

    Captain America: You know, being a veteran of war it occurred to me, that really it's men of influence and power that decide what these wars will be about. They decide who we are going to fight and how we will fight them. And then they go about planning the fight. In a sense, really, these people will the war into existence.

    I remember thinking the same thoughts as a preteen in military school trying to decide whether to follow in my dad's footsteps and join the military or not. I fucking love comics.


     

    Categories: Ramblings