Query and Transformation: 2 Sides of the Same Coin?

May 21, 2004

@ 06:56 PM

I've been reading the various pieces of feedback on my recent blog post, Why You Won't See XSLT 2.0 or XPath 2.0 in the Next Version of the .NET Framework, including the 40 comments in response to the post and the "Microsoft is killing XSLT" thread on xsl-list. Most of it has been flames with little useful feedback, but there was an interesting response by Norm Walsh entitled XQuery 1.0 or XSLT 2.0? which I've been drawn to respond to. Norm writes

Dare Obasanjo argues that “XQuery is strongly and statically typed while XPath 2.0 is weakly and dynamically typed.” What’s not clear from his post is that he is comparing XQuery 1.0 to XPath 2.0 in backwards compatibility mode (Michael Rys did provide a clarification). That’s an odd comparison to make. XPath 2.0 needs a backwards compatibility mode so that it stands some chance of doing the right thing when used in the context of an XSLT 1.0 stylesheet, but that’s not the expected mode for long-term use.

I thought my point was self-evident here, but if Norm missed it then most of the people who read my original blog post probably did as well. XPath 2.0 is a subset of XQuery 1.0; the parts of XQuery missing from it are XML construction, the query prolog, the let-where-orderby parts of the FLWOR expression, typeswitch and a few other things. XPath 2.0 also has a backwards compatibility mode, which has different semantics from regular XPath 2.0 and XQuery. When I talked about Microsoft not implementing XPath 2.0, I meant XPath 2.0 in backwards compatibility mode, since implementing XQuery means you already have regular XPath 2.0. After all, everything you can do in XPath 2.0 you can do in XQuery.

Norm also writes

The funniest arguments are the ones that imply that XQuery is a competitor in the same problem space as XSLT, that users will use XQuery instead of XSLT. I say that’s funny because there are so many problems that you simply cannot solve with XQuery. If your data is regular and especially if it’s all stored in a database already so that your XQuery implementation can run really fast, then XQuery absolutely makes sense, but didn’t the database folks already have a query language? Nevermind. If your customers don’t need to solve the kinds of problems for which XSLT was designed, or if you want to sell them some sort of proprietary system to solve them, then implementing XSLT 2.0 probably doesn’t make sense.

I've seen variations of the above theme (XSLT is for transformation, XQuery is for query) in various responses to my original post. Leaving the words "query" and "transformation" out of the picture, both XQuery and XSLT are designed to reshape XML data. SQL is primarily a query language, but you can use it to reshape relational data; this is exactly how SQL views work. For most people, the transformations they want to perform using XSLT can also be expressed using XQuery. Per Bothner wrote an article over a year ago on XML.com, Generating XML and HTML using XQuery, showing how you could use XQuery to transform an XML document to another XML format or to HTML. There are a few niceties in XSLT 2.0 that don't exist in XQuery, such as the ability to write to multiple output streams, but in general most of the things you can do in XSLT 2.0 can also be done in XQuery. In fact, this leads me to something else Norm wrote

If you want to transform documents that aren’t regular, especially documents that have a lot of mixed content, XSLT is clearly the right answer. I’ll wager dinner at your favorite restaurant that XQuery cannot be used to implement the functionality of the DocBook XSLT Stylesheets. (You produce the XQuery that does the job, I buy you dinner.)

First of all, XSLT is actually very bad at dealing with XML that isn't regular and has lots of mixed content. This is why a number of XSLT gurus got together to create EXSLT and why I started the EXSLT.NET project (grab the latest version from the Microsoft.com download servers here). As for transforming DocBook with XQuery, as I mentioned before, Per Bothner wrote an article about using XQuery for transformations. In fact, he specifically writes about Transforming DocBook to HTML using XQuery.

The bottom line is that XQuery is as much a "transformation language" as XSLT. XSLT may have some functionality that XQuery does not have but there isn't much I've seen that couldn't be implemented using extension functions. Perhaps I should start an EXQuery.NET project? :)
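To make the reshaping point a little more concrete, here is a minimal XQuery 1.0 sketch that turns a list of book records into an HTML table. The input file books.xml and its element names (catalog, book, title, author) are made up for this example; they are not taken from Per Bothner's article or from any real schema.

```xquery
(: Hypothetical input, assumed for illustration only:
   books.xml containing
   <catalog>
     <book><title>...</title><author>...</author></book>
     ...
   </catalog> :)
<html>
  <body>
    <table>
      <tr><th>Title</th><th>Author</th></tr>
      {
        (: One table row per book, sorted by title :)
        for $book in doc("books.xml")/catalog/book
        order by $book/title
        return
          <tr>
            <td>{ string($book/title) }</td>
            <td>{ string($book/author) }</td>
          </tr>
      }
    </table>
  </body>
</html>
```

The query selects nodes and builds a differently shaped result in the same expression, which is the sense in which I'm calling XQuery a transformation language. It is only a sketch for regular, record-like data and says nothing about how well the approach scales to something like DocBook.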

Categories: XML

Friday, 21 May 2004 20:57:58 (GMT Daylight Time, UTC+01:00)

I'm a little baffled by this response, but I'll give it some more thought and craft some sort of reply over the next few days.

Norman Walsh

Friday, 21 May 2004 21:12:54 (GMT Daylight Time, UTC+01:00)

Hi Dare,

I was following you fine until I came across this statement:

"First of all XSLT is actually very bad at dealing with XML that isn't regular and has lots of mixed content."

I assumed that you had an excellent understanding of both XSLT and XQuery and that you were simply clarifying your position and/or expressing your preferences. But this statement makes me question whether you in fact do understand XSLT (because mixed content and irregular structures are where XSLT shines). Am I misunderstanding you?

Evan Lenz

Saturday, 22 May 2004 02:02:58 (GMT Daylight Time, UTC+01:00)

Evan,
XSLT shines compared to what? When I think of mixed content processing I think of text processing. XSLT 1.0 sucked at this, and XSLT 2.0 has the same text processing functions as XQuery. As for irregular XML, I should take that back, given that XSLT is as good as or better than the alternatives.

Dare Obasanjo

Sunday, 23 May 2004 04:53:53 (GMT Daylight Time, UTC+01:00)

Okay, I agree that XSLT 1.0 sucks for text processing, but I view that as a different category from mixed content processing. I think mixed content processing is more like what the DocBook stylesheets do: translate document-oriented markup into other document-oriented markup, especially recursive structures, e.g.:

<xsl:template match="emphasis">
  <em>
    <xsl:apply-templates/>
  </em>
</xsl:template>

The text itself doesn't have to be processed (parsed, regexp-matched, etc.); instead, it's just copied through. Comparing the equivalent set of XSLT template rules to the "rules" in Per Bothner's XQuery-for-DocBook example drives this point home. If you look at the query itself and see how quickly it gets unwieldy (not to mention not expressive enough), you might start to realize the necessity of template rules for processing document-oriented markup. It seems like a perfect citation to support exactly the opposite of what you're arguing. In other words, Per is nowhere close to getting a free dinner (not that it's his fault) ;-)

Evan Lenz

Tuesday, 25 May 2004 20:13:33 (GMT Daylight Time, UTC+01:00)

Evan,
Agreed. For XQuery to be a proper replacement for XSLT for processing XML documents it should have something equivalent to or as powerful as XSLT's template matching functionality.
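
For anyone following along, the closest thing XQuery 1.0 offers today is a recursive function built around typeswitch. The sketch below is my own minimal illustration, not Per Bothner's code; the function name local:transform and the element names para and emphasis are assumptions made for the example.

```xquery
(: Emulates a handful of template rules with one recursive function.
   Element names (para, emphasis) are hypothetical. :)
declare function local:transform($nodes as node()*) as node()* {
  for $node in $nodes
  return
    typeswitch ($node)
      (: Each case plays the role of one template rule :)
      case element(emphasis) return <em>{ local:transform($node/node()) }</em>
      case element(para)     return <p>{ local:transform($node/node()) }</p>
      (: Unknown elements: drop the wrapper, keep processing the children :)
      case element()         return local:transform($node/node())
      (: Text is copied through unchanged :)
      case text()            return $node
      (: Comments and processing instructions are discarded :)
      default                return ()
};

local:transform(<para>Some <emphasis>mixed</emphasis> content.</para>)
```

This handles mixed content, but every new element type means another case in the one central function, whereas an XSLT template rule stands alone and can be overridden by an importing stylesheet. That is the gap I'm conceding.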

Dare Obasanjo

Friday, 28 May 2004 13:07:02 (GMT Daylight Time, UTC+01:00)

Dare,

XSLT is significantly more powerful than XQuery.

XQuery is missing the xsl:apply-templates feature.

For example, FXSL couldn't be developed with XQuery.

As for "XSLT 1.0 text processing sucks", do have a look at the text-processing templates of FXSL.

Cheers,

Dimitre Novatchev [XML MVP],
FXSL developer, XML Insider,

http://fxsl.sourceforge.net/ -- the home of FXSL

Dimitre Novatchev

Tuesday, 03 August 2004 00:22:25 (GMT Daylight Time, UTC+01:00)

Apart from template rules, here are just a few of the things that are in XSLT 2.0 that have no equivalent, or only very cumbersome equivalents, in XQuery 1.0:

for-each-group (grouping by common values, by position in a sequence, or both)

analyze-string (ability to create elements to replace text matched using a regex)

format-number

format-date

unparsed-text (ability to read and process text files)

generate-id (ability to generate unique hyperlink anchors)

xsl:number (generating sequence numbers)

xsl:import (ability for one module to override definitions in another)

ability for parameters to be optional or required (aids stylesheet evolution)

tunnel parameters (ability to set "almost global" variables that affect a phase of processing, greatly increasing the ability to re-use existing code with local modifications)

xsl:key - provides user control over indexing and performance

xsl:output - provides serialization control from within the stylesheet

xsl:result-document - multiple documents written by one stylesheet, each with a URI so they can be hyperlinked together.

XQuery makes it easier to write many 10-20 line queries, but for the typical production XSLT stylesheet of 500 - 10000 lines, it's not a serious alternative to XSLT 2.0.

Michael Kay

Dare Obasanjo's weblog