Disclaimer: This may sound like a rant but it isn't meant to be. In the wise words of Raymond Chen, this is meant to highlight problems that are harming the productivity of developers and knowledge workers in today's world. No companies or programs will be named because the intent is not to mock or ridicule.

This morning I had to rush into work early instead of going to the gym because of two limitations in the software around us.

Problem #1: Collaborative Document Editing

So a bunch of us are working on a document that is due today. Yesterday I wanted to edit the document but found out I could not because the software claimed someone else was currently editing it. So I opened it in read-only mode, copied out some data, edited it and then sent my changes in an email to the person who was in charge of the document. As if that wasn’t bad enough…

This morning, as I'm driving to the gym for my morning workout, I glance at my phone to see that I've received mail from several co-workers because I've "locked" the document and no one can make their changes. When I get to work, I find out that I didn’t close the document within the application and this was the reason none of my co-workers could edit it. Wow.

The notion that only one person at a time can edit a document or that if one is viewing a document, it cannot be edited seems archaic in today’s globally networked world. Why is software taking so long to catch up?

Problem #2: Loosely Coupled XML Web Services

While I was driving to the office I noticed another email from one of the services that integrates with ours via a SOAP-based XML Web Service. As part of the design to handle a news scenario we added a new type that was going to be returned by one of our methods (e.g. imagine that there was a GetFruit() method which used to return apples and oranges and which now returns apples, oranges and bananas). This change was crashing the applications that were invoking our service because they weren’t expecting us to return bananas.

However, the insidious thing is that the failure wasn’t because their application was improperly coded to fail if it saw a fruit it didn’t know; it was because the platform they built on was statically typed. Specifically, the Web Services platform automatically converted the XML to objects by looking at our WSDL file (i.e. the interface definition language which stated up front which types are returned by our service). This meant that any time new types were added to our service, our WSDL file would be updated, and any application invoking our service would need to be recompiled if it was built on a statically typed Web services platform that performed such XML<->object mapping. Yes, recompiled.
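
To make the failure mode concrete, here is a minimal Python sketch. It is not the actual platform code: the element names, the `KNOWN_TYPES` table and both reader functions are hypothetical stand-ins for what a generated proxy might do. The strict reader only knows the types that were in the WSDL when the client was built, while the tolerant reader skips anything it doesn't recognize.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Apple:
    variety: str


@dataclass
class Orange:
    variety: str


# Hypothetical stand-in for the classes a proxy generator would emit
# from the WSDL as it looked when the client was compiled.
KNOWN_TYPES = {"Apple": Apple, "Orange": Orange}

# Hypothetical GetFruit() response after the service started returning bananas.
RESPONSE = """
<GetFruitResponse>
  <Apple variety="Fuji"/>
  <Banana variety="Cavendish"/>
  <Orange variety="Navel"/>
</GetFruitResponse>
""".strip()


def read_strict(xml_text):
    """Mimics a statically typed mapper: an unknown element is a hard error."""
    fruits = []
    for element in ET.fromstring(xml_text):
        if element.tag not in KNOWN_TYPES:
            raise TypeError(f"unexpected element <{element.tag}>")
        fruits.append(KNOWN_TYPES[element.tag](element.get("variety")))
    return fruits


def read_tolerant(xml_text):
    """Maps the elements it knows and reports the rest instead of failing."""
    fruits, skipped = [], []
    for element in ET.fromstring(xml_text):
        cls = KNOWN_TYPES.get(element.tag)
        if cls is None:
            skipped.append(element.tag)
            continue
        fruits.append(cls(element.get("variety")))
    return fruits, skipped
```

With this response, `read_strict(RESPONSE)` raises on the banana, while `read_tolerant(RESPONSE)` still returns the apple and the orange and lists `Banana` as skipped. Whether skipping is actually safe depends on what the client does with the data.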

Now, consider how many potentially different applications could be accessing our service. What are our choices? Come up with GetFruitEx() or GetFruit2() methods so we don’t break old clients? Go over our web server logs and try to track down every application that has accessed our service? Never introduce new types?

It’s sad that as an industry we built a technology on an eXtensible Markup Language (XML) and our first instinct was to make it as inflexible as technology that is two decades old which was never meant to scale to a global network like the World Wide Web. 

Software should solve problems, not create new ones which require more technology to fix.

Now playing: Young Jeezy - Bang (feat. T.I. & Lil Scrappy)


 

Tuesday, 17 July 2007 20:18:39 (GMT Daylight Time, UTC+01:00)
I also wrote a feed consumer about a year ago that mapped the incoming XML to an object (and therefore would break in problem #2). Actually I was inspired to do this after reading your article "XML Serialization in the .NET Framework". Any comments regarding that article? I remember thinking: this feels wrong, but apparently Dare is doing it and I for sure like typed datasets... Did I misunderstand any important concepts in that article?
Peter Sunna
Tuesday, 17 July 2007 22:23:37 (GMT Daylight Time, UTC+01:00)
Using XML Serialization in the .NET Framework works if you have (i) an XML schema and (ii) knowledge that the XML format will not change in a way that violates the schema.

This generally means that this technology is primarily beneficial in tightly coupled scenarios where you control both ends of the wire (i.e. the consumer and producer of the XML).
Tuesday, 17 July 2007 23:16:00 (GMT Daylight Time, UTC+01:00)
Re #1: OneNote (from that plucky indie developer, Microsoft) handles this amazingly well. Letting multiple people make unlocked changes to a document on a shared drive is just brilliant.
Tuesday, 17 July 2007 23:31:42 (GMT Daylight Time, UTC+01:00)
Insidious might be exactly the right word, but it is just as insidious to imagine even having a 'b' of booting any of those laptops, desktops, PDAs, routers that get you to the 'end result' and plenty more without a static and typed interface.

Let's say an interface to a simple necessity: power supply.

It seems to me what you have hit on there is 'tangling', and that is a design issue (easy trap thanks to tools on top of 'live' code).

I am not stating we should map between objects and xml, or relational data and objects, or even use XML at runtime. No way, and never blindly at least. I am more interested in what constitutes an environment that handles some forms of static expression while retaining all the goodness of a mighty compilation process, because without it none (and I mean even this web-page) renders.

So I agree XML can stink :) Best use I see to it is the actual text 'phase' and we all have one, no matter what language you work in unless you input your operands in hexadecimal (still text:-). So even DCE had *it*, and btw people still do a form of DCE in all those Web APIs (Google, Amazon, MSFT and more) just on a 'text level'.

Hence Dare, in your mind and with your background from the Data team, how do we fix the 'Web as platform' and/or 'software build' problem?

I doubt that Google has no versioning or typing issues either, and it seems 'fixing' can only come from the source. I never came across self-healing software either, so my guess is you need to fix the client, not the typed server end.

Analogy: attempting to charge a 9V battery from a *direct* (adj:indirect), main supply.
Tuesday, 17 July 2007 23:48:07 (GMT Daylight Time, UTC+01:00)
And talking of typing and compilation, its results, even when the server is right:

The DNS state of hack, makes it pretty obvious that every time it has a glitch and I type in:

www.live.com

DNS awakes, IE rises and voila:

I get to www.live.com with a text entry www.live.com, and service stating it cannot find www.live.com.

Great, both the client and server interface are working, oh mighty typeless programming out of the blues TM. Fixing the client or something above server end is the only way I see it helping. As you cannot stop the server receiving broken semantics at runtime, or taking away its simple but cool interface, you have to compile somewhere (even if that somewhere is your router or OS piece or on the opposite side of spectrum/extreme the client that filters).
Wednesday, 18 July 2007 00:00:14 (GMT Daylight Time, UTC+01:00)
Being aware you are more fluent on that topic and plenty more than me, heck Bray's XML never had it, compiler people figured out that separate namespaces are a good solution.

Still I am against XML at runtime, for good (prefer to be isolated from that subject for now, what was the word, 'keep that sht off the Web':-)
Wednesday, 18 July 2007 02:23:12 (GMT Daylight Time, UTC+01:00)
You know the only way to fix it, Dare - find the internal bug database and FILE A BUG. Or get someone else to file a bug. No bug == no change. So, go get someone started on fixing the problem.
SomeGuy
Wednesday, 18 July 2007 09:58:43 (GMT Daylight Time, UTC+01:00)
Welcome to the wonderful world of contract versioning. Unfortunately, this has nothing to do with the fact that you're using xml. Just because the format is loosely coupled enough to allow me to change the contract easily doesn't mean I can do it with impunity.

I could write a new contract that explicitly supports extensibility and versioning. I can now add new data to the message without changing its syntactical contract. But I cannot add new data to a service without changing its semantic contract (you've just returned me some order information about bananas. I neither ship nor process bananas. How do I handle this order? And now you're talking about pistachios?)

(JSON might very well have worked syntactically in this situation, but you've no idea how the client will semantically handle the new, unexpected data).
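
A rough Python sketch of that distinction, with every name in it (`ORDERS`, `HANDLERS`, `process_orders`) made up for illustration: the message parses fine, so the syntactic contract held, but the client still has to decide what to do with a product it has no business rule for.

```python
# Hypothetical orders that have already been parsed successfully;
# the syntactic contract held even though "banana" is new.
ORDERS = [
    {"product": "apple", "quantity": 3},
    {"product": "banana", "quantity": 12},
    {"product": "orange", "quantity": 5},
]

# What this client actually knows how to ship (an assumption for illustration).
HANDLERS = {
    "apple": lambda order: f"ship {order['quantity']} apples",
    "orange": lambda order: f"ship {order['quantity']} oranges",
}


def process_orders(orders):
    """Handles known products and sets aside the ones with no handler."""
    shipped, needs_review = [], []
    for order in orders:
        handler = HANDLERS.get(order["product"])
        if handler is None:
            # Parsing worked, but there is no business rule for this
            # product, so it is set aside rather than guessed at.
            needs_review.append(order)
        else:
            shipped.append(handler(order))
    return shipped, needs_review
```

Setting the banana order aside is only one possible policy. Rejecting the whole batch would be just as defensible, which is the point: the wire format can't make that choice for the client.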

Cheers
Matt
Thursday, 19 July 2007 16:59:20 (GMT Daylight Time, UTC+01:00)
So isn't it a limitation of the XML-to-Object deserialization layer? As far as I understand, the code gets regenerated on seeing new types in the WSDL and then expects the new fruit to be available.
Can this code-generated XML-to-Object layer get a bit smarter and not take the branch expecting a new fruit if it's actually not there in the incoming message?
Thursday, 19 July 2007 18:49:10 (GMT Daylight Time, UTC+01:00)
Dare,

For prob #1 we use an application called Telelogic DOORS (http://www.telelogic.com/Products/doors/doors/index.cfm) at work. Once the document is ready it can eventually be exported to MS Word etc., so it really isn't the main document itself that is being worked on.
Ahmad Mageed
Saturday, 21 July 2007 04:00:27 (GMT Daylight Time, UTC+01:00)
Dare

I have a hard time seeing how clients can be resilient to changes in the data that the service sends back. If the client does not understand a field in the data sent back by the service, how can it blissfully process the part it understands - it may completely misinterpret the semantics of the data.

Are you suggesting that in the real world, it's common for the service to send data in which some elements are "must understand" and other elements are hints/optimizations that the client does not need to understand (and hence be resilient to changes)?
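
To sketch what such a split could look like: SOAP defines a `mustUnderstand` attribute for header blocks, and the Python below borrows that idea for body elements purely as an illustration. The message format, the `NotUnderstood` exception and `read_order` are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical message: the mustUnderstand attribute on body elements is an
# assumption for this sketch, not part of any real service's format.
MESSAGE = """
<Order>
  <Item>apple</Item>
  <Currency mustUnderstand="1">EUR</Currency>
  <GiftNote>Enjoy!</GiftNote>
</Order>
""".strip()


class NotUnderstood(Exception):
    """Raised when a mandatory element is unknown to the client."""


def read_order(xml_text, understood):
    """Returns the fields the client understands.

    Unknown elements are skipped unless they are flagged mustUnderstand="1",
    in which case the whole message is rejected.
    """
    fields = {}
    for element in ET.fromstring(xml_text):
        if element.tag in understood:
            fields[element.tag] = element.text
        elif element.get("mustUnderstand") == "1":
            raise NotUnderstood(element.tag)
    return fields
```

A client that understands `Item` and `Currency` can read this message and ignore `GiftNote`. A client that only understands `Item` has to reject it, because misreading the currency would change what the order means.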

I would love to get a better understanding of this aspect of the REST-SOAP debate.

Thanks
Vish
Vish
Thursday, 26 July 2007 15:05:11 (GMT Daylight Time, UTC+01:00)
There are a lot of version-control systems out there that solve Problem #1 correctly: Subversion (subversion.tigris.org) being perhaps the best known and most widely used; it has a spiffy Windows interface too.

Beyond simple "check out, work independently, merge changes" systems without global locks, you enter the world of true distributed change control, where the One True Version is a matter of convention, and anyone can maintain their own repository. Darcs is probably the leading contender here, though that claim will get plenty of opposition.
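
As a rough illustration of the merge step in such systems, here is a simplified Python sketch of a three-way merge. It is not how Subversion or Darcs actually implement it: a document is modeled as a dict of section name to text, and the `merge` function and section names are hypothetical.

```python
def merge(base, mine, theirs):
    """Three-way merge of documents modeled as {section: text} dicts.

    Returns (merged, conflicts). A section missing from a dict counts as
    deleted or not yet created on that side.
    """
    merged, conflicts = {}, []
    for section in sorted(set(base) | set(mine) | set(theirs)):
        b, m, t = base.get(section), mine.get(section), theirs.get(section)
        if m == t:
            result = m  # both sides agree (or neither changed it)
        elif m == b:
            result = t  # only they changed it
        elif t == b:
            result = m  # only I changed it
        else:
            conflicts.append(section)  # both changed it differently
            continue
        if result is not None:
            merged[section] = result
    return merged, conflicts


base = {"intro": "a", "body": "b", "end": "c"}
mine = {"intro": "a2", "body": "b", "end": "c1"}
theirs = {"intro": "a", "body": "b2", "end": "c2"}
merged, conflicts = merge(base, mine, theirs)
```

Here both edits that touch different sections are kept without anyone holding a lock, and only the section that both people changed differently is reported as a conflict for a human to resolve.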
Comments are closed.