Tim Ewald on Versioning XML Web Services with XSD

May 15, 2006

@ 01:48 AM

Tim Ewald has been blogging about ways to add versioning to web services which work around the various limitations of the W3C XML Schema Definition Language (XSD). One bit of insight I always like to share when talking about XSD is that there are two primary usage scenarios that have developed around XML document validation and XML schemas. My article "XML Schema Design Patterns: Is Complex Type Derivation Unnecessary?" describes them as:

1. Describing and enforcing the contract between producers and consumers of XML documents: An XML schema ordinarily serves as a means for consumers and producers of XML to understand the structure of the document being consumed or produced. Schemas are a fairly terse and machine readable way to describe what constitutes a valid XML document according to a particular XML vocabulary. Thus a schema can be thought of as contract between the producer and consumer of an XML document. Typically the consumer ensures that the XML document being received from the producer conforms to the contract by validating the received document against the schema.
2. Creating the basis for processing and storing typed data represented as XML documents: XSD describes the creation of a type annotated infoset as a consequence of document validation against a schema. During validation against an XSD, an input XML infoset is converted into a post schema validation infoset (PSVI), which among other things contains type annotations. However practical experience has shown that one does not need to perform full document validation to create type annotated infosets; in general many applications that use XML schemas to create strongly typed XML such as XML<->object mapping technologies do not perform full document validation, since a number of XSD features do not map to concepts in the target domain.
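
To make the second scenario concrete, here is a rough Python sketch of XML-to-object mapping that produces typed data without validating the whole document. The `Customer` class, the element names and the sample document are all hypothetical, made up for illustration; this is not how any particular vendor toolkit is implemented.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional


@dataclass
class Customer:
    # Hypothetical target type; field names are assumptions for illustration.
    name: Optional[str] = None
    age: Optional[int] = None


def map_customer(xml_text: str) -> Customer:
    """Map the elements we know about to typed fields and ignore the rest."""
    root = ET.fromstring(xml_text)
    age_text = root.findtext("age")
    return Customer(
        name=root.findtext("name"),
        # The type conversion happens here; no schema validation is performed.
        age=int(age_text) if age_text is not None else None,
    )


# The unknown <nickname> element is skipped rather than rejected.
doc = "<customer><name>Ada</name><age>36</age><nickname>al</nickname></customer>"
customer = map_customer(doc)
```

The mapper yields a typed `age` even though nothing checked the document as a whole, which is roughly the behavior described in the second scenario.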

If you are building a SOAP-based XML Web service using the toolkits provided by the major vendors like IBM, Microsoft or BEA then it is most likely that your usage pattern aligns with scenario #2 above. This means that your Web service toolkit isn't completely enforcing that documents being consumed or generated by the service are actually 100% valid against the schema. This seems bad until you realize that XSD is so limited in the constraints it can describe that any XSD validation done would still need to be backed by a further business logic validation phase in your code. In his post "Making everything optional", Tim Ewald writes:

DJ commented on my post addressing the problem Raimond raised with my versioning strategy. He wondered if he'd missed an earlier post where I argued that you not use XSD to validate your data because if you make content optional, you can't use it to check what has to be there. Since I haven't written about that yet, I figured I'd start to address it now.

When people build a schema for a single service, they tend to make it reflect the precise requirements of that system at that moment in time. Then, when those requirements change, they revise the schema. The result is a system that tends to be very brittle. If you take the same approach when you design a schema for use by multiple systems, describing a corporate level model for customer data for instance, things are even worse. Some systems won't have all the required data. They have to decide whether to (a) collect the data, (b) make up bogus data, or (c) not adopt the common model. None of these are good approaches.

To solve both these problems, I've started thinking about my schema not as the definition of what this system needs right now but as the definition of what the data should look like if it's present instead. I move the actual checking for what has to be present inside the system (either client or service) and implement it using either code or a narrowed schema that is duplicate of the contract schema with more constraints in place.

There are important lessons in Tim's posts which are unfortunately often learned the hard way. A document or message can have different required/optional fields depending on what part of the process you are in or even whether it is being used as input vs. output. It's hard to come up with one single schema definition for a common type across a system without resorting to "everything is optional" and then relying on code to do the specific business logic validation for whichever phase of the process you are in.
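
Here is a minimal sketch of what that code-level check might look like, assuming a shared schema where everything is optional. The phase names, the required fields and the `missing_fields` helper are hypothetical and exist only for illustration; a real system would define its own rules.

```python
import xml.etree.ElementTree as ET

# Hypothetical phases and their required elements (assumptions, not from Tim's posts).
REQUIRED_BY_PHASE = {
    "signup": {"email"},
    "checkout": {"email", "shippingAddress"},
}


def missing_fields(xml_text: str, phase: str) -> set:
    """Return the elements required in this phase that are absent or empty."""
    root = ET.fromstring(xml_text)
    # An element counts as present if it has child elements or non-blank text.
    present = {
        child.tag
        for child in root
        if len(child) or (child.text or "").strip()
    }
    return REQUIRED_BY_PHASE[phase] - present


doc = "<customer><email>ada@example.com</email></customer>"
signup_missing = missing_fields(doc, "signup")
checkout_missing = missing_fields(doc, "checkout")
```

The same document passes the signup check and fails the checkout check, so the shared schema never has to change when the requirements of a single phase do.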

There's another great comment in Tim's follow-up post "More on making everything optional":

I think it's important not to confuse your schema with your contract. A client and a service have to agree on all sorts of things, only some of which are captured in your WSDL/XSD(/Policy). My goal in proposing that almost everything in your XSD be optional is to find the sweet-spot between easy coding and flexibility for evolution.

Amen! Preach on brother.

Categories: XML Web Services


Dare Obasanjo's weblog