Tim Ewald has a post entitled My 3 Web Services Stacks where he writes

The point Chris was making was that when you created a COM class, your choice of data type determined who could use your component. In other words, your choice affected how far your code could reach. There will be a similar split in the Web services world, hence the title of this post. This time though the spin will be a little different. It won't focus on data format so much (it's all just XML), but on behavior. To wit, “My 3 Web Service Stacks”:

First, there is basic SOAP over HTTP described with WSDL and XSD. It hits a sweet spot for developers who want tools to hide XML details but is still simple enough to implement everywhere. Then there is WS-*, which offers richer functionality but is harder to implement and will not be supported by all products, especially legacy systems. Finally, there are toolkit specific features like BEA's idiom for async messaging over HTTP using a header to convey a callback URL, Iona's Artix support for a CORBA binding, or Microsoft's Indigo support for binary messaging.

In this world, your choice of feature will decide how far you can reach: to everyone, to everyone using WS-* enabled tools and infrastructure, or only to others using the same toolkit you are.

I completely agree with this. In fact, back when I was on the XML team I used to be frustrated that, given all the reams of text coming out of the XML Web Services Developer Center on MSDN, Microsoft never did a good job of explaining this to developers. Of course, doing so would require admitting that many developers don't need the functionality of the WS-* specs, and that in certain cases their complexity actually hampers interoperability compared to plain old SOAP (aka POS). To some this may seem to contradict Bill Gates's statements on interoperability in his recent executive memo Building Software That Is Interoperable By Design, where he wrote

The XML-based architecture for Web services, known as WS-* ("WS-Star"), is being developed in close collaboration with dozens of other companies in the industry including IBM, Sun, Oracle and BEA. This standard set of protocols significantly reduces the cost and complexity of connecting disparate systems, and it enables interoperability not just within the four walls of an organization, but also across the globe. In mid-2003, Forrester Research said that up to a "ten-fold improvement in integration costs will come from service-oriented architectures that use standard software plumbing." 

However context is everything. Replacing a world of distributed applications written with DCOM, CORBA, Java RMI, etc with one where they are written using the WS-* protocols is a step in the right direction. I completely agree that it leads to better interoperability across technologies produced by the big vendors who are using incompatible technologies today.

But when the plan is to reach as many parties as possible, one should favor simpler Web services technologies like plain old SOAP or just plain old XML (aka POX). Plain old XML doesn't necessarily mean following the principles of REST either. After all, a number of successful plain old XML services on the Web don't follow those principles to the letter. For example, Joe Gregorio pointed out in his article Amazon's Simple Queue Service that Amazon's queueing service violated these principles. In his post entitled Vacant Space, Sam Ruby points out that the plain old XML web services exposed by Bloglines and del.icio.us aren't RESTful either.
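The appeal of POX is how little machinery it demands: fetch a URL, parse the response with whatever XML API you have, no WSDL or SOAP envelope required. A minimal sketch in Python, using a hypothetical feed-like payload (in practice the bytes would come from an HTTP GET of the service's URL):

```python
import xml.etree.ElementTree as ET

# A hypothetical plain-old-XML response body, as a feed-like
# service might return it over HTTP.
POX_RESPONSE = """\
<channel>
  <item><title>First post</title></item>
  <item><title>Second post</title></item>
</channel>"""

def extract_titles(xml_text):
    """Parse a plain-old-XML payload and pull out the item titles.
    No envelope, no generated proxy class -- just the document."""
    root = ET.fromstring(xml_text)
    return [item.findtext("title") for item in root.iter("item")]

print(extract_titles(POX_RESPONSE))  # ['First post', 'Second post']
```

Any client that can speak HTTP and parse XML can consume such a service, which is exactly why it reaches the widest audience of the three stacks.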

When building distributed applications using XML, one should always keep in mind the 3 web service stacks. My day job now involves designing XML web services for both internal and external consumption so I'll probably start writing more about web services in the coming months.


Categories: XML Web Services

February 10, 2005
@ 02:01 PM

Just saw Iran Promises 'Burning Hell' for Any Aggressor and N.Korea Says It Has Nuclear Arms, Spurns Talks. Looks like the world woke up on the wrong side of bed.


February 9, 2005
@ 03:05 PM

David Megginson (the creator of SAX) has a post entitled The complexity of XML parsing APIs where he writes

Dare Obasanjo recently posted a message to the xml-dev mailing list as part of the ancient and venerable binary XML permathread (just a bit down the list from attributes vs. elements, DOM vs. SAX, and why use CDATA?). His message included the following:

I don’t understand this obsession with SAX and DOM. As APIs go they both suck[0,1]. Why would anyone come up with a simplified binary format then decide to cruft it up by layering a crufty XML API on it is beyond me.

[0] http://www.megginson.com/blogs/quoderat/archives/2005/01/31/sax-the-bad-the-good-and-the-controversial/

[1] http://www.artima.com/intv/dom.html

I suppose that I should rush to SAX’s defense. I can at least point to my related posting about SAX’s good points, but to be fair, I have to admit that Dare is absolutely right – building complex applications that use SAX and DOM is very difficult and usually results in messy, hard-to-maintain code.

I think this is a pivotal part of the binary XML debate. The primary argument for binary serializations of XML is that certain parties want to get the benefit of the wide array of technologies for processing XML yet retain the benefits of a binary format such as reduced size on the wire and processing time. Basically having one's cake and eating it too.

For me, the problem is that XML is already being pulled in too many directions as it is. In retrospect I realize it was foolish for me to think that the XML team could come up with a single API that would satisfy people processing business documents written in WordprocessingML, people building distributed computing applications using SOAP, and developers reading & writing application configuration files. All of these scenarios use intersecting subsets of the full functionality of the XML specification. The SOAP specs go as far as banning some features of XML, while others are simply frowned upon because the average SOAP toolkit doesn't know what to do with them. One man's meat (e.g. mixed content) is another man's poison.

What has ended up happening is that we have all these XML APIs that expose a lot of XML's cruft that most developers don't need, or even worse make things difficult in the common scenarios, because they want to support all the functionality of XML. This is the major failing of APIs such as the .NET Framework's pull model parser class, System.Xml.XmlReader, DOM and SAX. The DOM also has issues with the fact that it tries to support conflicting data models (DOM vs. XPath) and serialization formats (XML 1.0 vs. XML 1.0 + XML namespaces). At the other extreme we have APIs that try to simplify XML by only supporting specific subsets of its expressivity, such as the System.Data.DataSet and System.Xml.XmlSerializer classes in the .NET Framework. The problem with these APIs is that developers are dropped off a cliff once they reach the limits of the API's XML support and have to either use a different API or resort to gross hacks to get what they need done.
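The same tension shows up outside the .NET Framework. Python's ElementTree, for instance, is a pleasant "XML as data" API right up until a document uses mixed content, at which point the character data is scattered across `.text` and `.tail` attributes and the simple view falls apart (a small illustration; the sample documents are invented):

```python
import xml.etree.ElementTree as ET

# Record-like XML: the simplified "data" view works beautifully.
record = ET.fromstring("<person><name>Ada</name><age>36</age></person>")
print(record.findtext("name"))  # Ada

# Mixed content: the same tree API splits the prose between each
# element's .text and the .tail hanging off its preceding sibling.
mixed = ET.fromstring("<p>Hello <b>brave</b> new <i>world</i>!</p>")
print(mixed.text)     # 'Hello '
print(mixed[0].tail)  # ' new '

# Recovering the original prose means manually re-weaving the pieces.
parts = [mixed.text] + [t for e in mixed for t in (e.text, e.tail)]
print("".join(p for p in parts if p))  # Hello brave new world!
```

An API tuned for document-centric XML makes the second case natural and the first verbose; an API tuned for data-centric XML does the reverse. No single API has managed both gracefully.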

Unfortunately, one of the problems we had to deal with when I was on the XML team was that we already had too many XML APIs as it was. Introducing more would create developer confusion, but trying to change the existing ones would break backwards compatibility. Personally, I'd rather see efforts being made to create better toolkits and APIs for the various factions that use XML, to make it easier for them to get work done, than constantly churning the underlying format and thus fragmenting it.


Categories: XML

I got an email from Shelly Farnham announcing a Social Computing Symposium, sponsored by Microsoft Research which will be held at Redmond Town Center on April 25-26. Below is an excerpt from the announcement

In the spring of 2004 a small two-day Social Computing Symposium, sponsored by Microsoft Research, brought together researchers, builders of social software systems, and influential commentators on the social technology scene...A second symposium at Microsoft Research is planned for April 25th-26th, 2005...

If you are interested in attending the symposium, please send a brief, 300-500 word position paper. The symposium is limited to 75 people, and participants will be selected on the basis of submitted position papers.

Position papers should not be narrowly focused on either academic study or industry practice. Rather, submissions should do one or more of the following: address theoretical and methodological issues in the design and development of social computing technologies; reflect on concrete experiences with mobile and online settings; offer glimpses of novel systems; discuss current and evolving practices; offer views as to where research is needed.

We are particularly interested in position papers that explore any of the following areas. However, given the symposium’s focus on new innovation in social technologies, we are open to other topics as well.

a) The digitization of identity and social networks.

b) Proliferation and use of social metadata.

c) Mobile, ubiquitous social technologies changing the way we socialize.

d) Micropublishing of personal content (e.g. blogs), and the democratization of information exchange and knowledge development.

e) Social software on the global scale: the impact of cross-cultural differences in experiences of identity and community.

Please send your symposium applications to scspaper@microsoft.com by February 28th.

I would like to attend, which means I have to cough up a position paper. I have 3 ideas for position papers: (i) Harnessing Latent Social Networks: Building a Better Blogroll with XFN and FOAF, (ii) Blurring the Edges of Online Communication Forms by Integrating Email, Blogging and Instant Messaging or (iii) Can Folksonomies Scale to Meet the Challenges of the Global Web?

So far I've shared these ideas with one person and he thought the first idea was the best. I assume some of the readers of my blog will be at this symposium. What would you guys like to get a presentation or panel discussion on?


February 9, 2005
@ 01:43 PM

The team I work for at MSN has open developer and test positions. Our team is responsible for the underlying infrastructure of services like MSN Messenger, MSN Spaces, MSN Groups and some of Hotmail. We frequently collaborate with other properties across MSN such as MyMSN and MSN Search, as well as with other product units across Microsoft, such as in the case of Outlook Live. If you are interested in building world class software that is used by hundreds of millions of people, and the following job descriptions interest you, then send me your resume.

Software Design Engineer (Developer)

The team builds key platforms that facilitate storing and sharing photo albums, files, blogs, calendars, buddy lists, favorites, etc. We are looking for a strong software developer with good problem solving skills and at least 2-3 years of experience with C++ or other programming languages. Experience in relational databases especially data modeling, schema design, developing stored procedures and programming against the databases is required. Candidates with good SQL knowledge and performance tuning databases will be preferred.

Software Design Engineer/Test (Tester)

The MSN Communication and Platform Services team is looking for a passionate and experienced SDET candidate with a strong programming, design, and server background to help us test next generation internet sharing and communication scenarios at MSN. You will be responsible for designing and creating the test automation and tools for the MSN Contacts service, which stores over 9 billion live MSN Hotmail and MSN Messenger contacts for over 400 million users. You should be able to demonstrate thorough understanding of testing methodologies and processes. Requirements for this position include strong skills in C++ or C#, design, problem solving, and troubleshooting; proven experience with test case generation techniques or model based testing methodologies; communication skills (written & verbal); and a strong passion for quality. Your position will include key tasks such as writing test plans and test cases, working with PM and Development to provide them with integration and architecture feedback, working across teams with major partners such as Hotmail, Office, and Windows, and driving quality and performance related issues to resolution.

Email your resume to dareo@msft.com (replace msft with microsoft) if the above job descriptions sound like they are a good fit for you. If you have any questions about what working here is like, you can send me an email and I'll either follow up via email or my blog to answer any questions of general interest [within reason].


Categories: Life in the B0rg Cube | MSN

February 9, 2005
@ 12:27 PM

Abbie, who is one of the coolest MSN Spaces users around, has posted a collection of links to various posts showing how to get extra mileage out of MSN Spaces. Check out her post MSN Spaces Tips, Tricks, Gods and More. Some of my favorite links from her page include

Alerts For Your Space - want to set up alerts or learn how they work? Read here!

Edit It! Button - Scott's trick for obtaining additional blog editing features.

Guide to Trackbacks - What are trackbacks and how do you use them?

Edit! Help for FireFox Users - Some editing perks for you Firefox users.

Understanding Layout Customization - Learn your way around customizing your Space.

Minimizing Content Spam - Great post by Mike regarding spam in your Space.

Podcasting Your Space - Great information on how to set up your own podcast of your Space

Deleting Your Space - What really happens when you delete your Space?

Take Too Long and Lose It - Did you know you could lose your post if you take too long to type it out?  Read and learn how to prevent it.

Add A GuestBook - a unique way to add a guestbook to your Space

Give Your FeedBack and Ideas for Spaces - Have an idea for Spaces? Get it heard here!

Who Owns Your Spaces Content - a small but great FYI post regarding your content

There is at least one neat trick that Abbie missed from her list. Jim Horne shows how to embed audio and video into a blog post on a Space. Excellent.


Categories: Mindless Link Propagation | MSN

February 7, 2005
@ 02:24 PM
"If you're Iraqi, you gotta look at America a little bit different. You gotta look at America like the uncle who paid for you to go to college ... but molested you."

The above sentence, slightly modified from a quote in Chris Rock's Never Scared, sums up my thoughts on the elections in Iraq.


Categories: Ramblings

We are now feature complete for the next release of RSS Bandit. Interested users can download it at RssBandit. All bugs specific to the previous alpha release have been fixed. The next steps for us are to finish the documentation and translations and fix any bugs reported during the beta, with the intention of shipping a final release in the next two weeks.

Check out a screenshot of the Wolverine Beta which shows some of the new features, such as newspaper styles and the additional columns in the listview. The major new features are listed below, but there are a couple of minor ones, such as proxy exclusion lists and better handling of SSL certificate issues, which aren't listed. There will be a comprehensive list of bug fixes and new features in the announcement for the final release.

NEW FEATURES (compared to v1.2.0.117)

  • Newspaper styles: Ability to view all unread posts in feeds and categories or all posts in a search folder in a Newspaper view. This view uses templates that are in the same format as those used by FeedDemon so one can use RSS Bandit newspaper styles in FeedDemon and vice versa. One can choose to specify a particular stylesheet for a given feed or category. For example, one could use the slashdot.fdxsl stylesheet for viewing Slashdot, headlines.fdxsl for viewing news sites in the News category and outlook-express.fdxsl for viewing blog posts.  

  • Column Chooser: Ability to specify which columns are shown in the list view from a choice of Headline, Topic, Date, Author, Comment Count, Enclosures, Feed Title and Flag. One can choose to specify a column layout for a particular feed or category. For example, one could use the layout {Headline, Author, Topic, Date, Comment Count} for a site like Slashdot but use one like {Headline, Topic, Date} for a traditional blog that doesn't have comments.

  • Category Properties: It is now possible to specify certain properties for all feeds within a category such as how often they should be updated or how long posts should be kept before being removed. 

  • Identities: One can create multiple personas with associated information (homepage, name, email address, signature, etc) for use when posting comments to weblogs that support the CommentAPI.

  • del.icio.us Integration: Users who have accounts on the del.icio.us service can upload links to items of interest directly from RSS Bandit.  

  • Skim Mode: Added option to 'Mark All Items As Read on Exiting a Feed' 

  • Search Folder Improvements: Made the following additions to the context menu for search folders: 'New Search Folder', 'Refresh Search', 'Mark All Items As Read' and 'Rename Search Folder'. Also, deletion of a search folder now prompts the user, to prevent accidental deletion.

  • Item Deletion: News items can be deleted by either using the [Del] key or using the context menu. Deleted items go to the "Deleted Items" special folder and can be deleted permanently by emptying the folder or restored to the original feed at a later date.

  • UI Improvements: Tabbed browsers now use a multicolored border reminiscent of Microsoft OneNote.

POSTPONED FEATURES (will be in the NightCrawler release)

  • NNTP Support
  • Synchronizing state should happen automatically on startup/shutdown
  • Applying search filters to the list view 
  • Provide a way to export feeds from a category


Categories: RSS Bandit

When I used to work on the XML team at Microsoft, there were a number of people I interacted with who were so smart I used to feel like I learned something new every time I walked into their office. These folks include

  • Michael Brundage - social software geek before it was a hip buzzword, XML query engine expert and now working on the next generation of XBox

  • Joshua Allen - semantic web and RDF enthusiast, if not for him I'd dismiss the semantic web as a pipe dream evangelized by a bunch of theoreticians who wouldn't know real code if it jumped them in the parking lot and defecated on their shoes, now works at MSN but not on anything related to what I work on

  • Erik Meijer - programming language god and leading mind behind X#, Xen and Cω; he is a co-inventor on all my patent applications, most of which started off with me coming into his office to geek out about something I was going to blog about

  • Derek Denny-Brown - XML expert from back when Tim Bray and co. were still trying to figure out what to name it, one heckuva cool cat

Anyway, that was a bit of a digression before posting the link mentioned in the title of this post. Michael Brundage has an essay entitled Working at Microsoft where he provides some of his opinions on the good, the bad, and the in-between of working at Microsoft. One key insight is that Microsoft tends to have good upper management and poor middle management. This insight strikes very close to home, but I know better than to give examples of the latter in a public blog post. Rest assured it is very true, and its effects have cost the company millions, if not billions, of dollars.


Categories: Life in the B0rg Cube

One of the biggest assumptions I had about software development was shattered when I started working on the XML team at Microsoft. This assumption was that standards bodies know what they are doing and produce specifications that are indisputable. However, I've come to realize that the problems of design by committee affect illustrious names such as the W3C and IETF just like everyone else. These problems become even more pernicious when trying to combine technologies defined in multiple specifications to produce a coherent end to end application.

An example of the problem caused by contradictions in core specifications of the World Wide Web is summarized in Mark Pilgrim's article, XML on the Web Has Failed. The issue raised in his article is that determining the encoding to use when processing an XML document retrieved off the Web via HTTP, such as an RSS feed, is defined in at least three specifications which contradict each other somewhat: XML 1.0, HTTP 1.0/1.1 and RFC 3023. The bottom line is that most XML processors, including those produced by Microsoft, ignore one or more of these specifications. In fact, if applications suddenly started following all these specifications to the letter, a large number of the XML documents on the Web would be considered invalid. In Mark Pilgrim's tests, 40% of 5,000 RSS feeds chosen at random would be considered invalid, even though they'd work in almost all RSS aggregators and be considered well-formed by most XML parsers, including the System.Xml.XmlTextReader class in the .NET Framework and MSXML.
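To see why the contradiction bites, consider a rough sketch of the precedence the specs imply for an XML document served over HTTP: the HTTP charset parameter wins, then a byte order mark, then the encoding pseudo-attribute in the XML declaration, then the UTF-8 default. Many real-world processors skip straight to the XML declaration, which is exactly how feeds that violate RFC 3023 still "work" everywhere. This Python function is an illustrative simplification, not a conforming implementation:

```python
import re

def sniff_encoding(body: bytes, http_charset=None) -> str:
    """Sketch of the layered encoding rules for XML over HTTP.
    Precedence: HTTP Content-Type charset, then BOM, then the
    XML declaration, then the XML 1.0 default of UTF-8."""
    if http_charset:                      # 1. HTTP charset parameter
        return http_charset.lower()
    if body.startswith(b"\xef\xbb\xbf"):  # 2. byte order mark
        return "utf-8"
    if body.startswith((b"\xff\xfe", b"\xfe\xff")):
        return "utf-16"
    m = re.match(rb'<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']', body)
    if m:                                 # 3. XML declaration
        return m.group(1).decode("ascii").lower()
    return "utf-8"                        # 4. default per XML 1.0

doc = b'<?xml version="1.0" encoding="ISO-8859-1"?><feed/>'
print(sniff_encoding(doc, None))        # iso-8859-1
print(sniff_encoding(doc, "us-ascii"))  # us-ascii
```

The second call shows the trap: per RFC 3023 the HTTP header silently overrides the document's own declaration, so the same bytes decode differently depending on which spec the processor honors.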

The newest example of XML specifications that should work together but instead become a case of putting square pegs in round holes is Daniel Cazzulino's article, W3C XML Schema and XInclude: impossible to use together???, which points out

The problem stems from the fact that XInclude (as per the spec) adds the xml:base attribute to included elements to signal their origin, and the same can potentially happen with xml:lang. Now, the W3C XML Schema spec says:

3.4.4 Complex Type Definition Validation Rules

Validation Rule: Element Locally Valid (Complex Type)

3 For each attribute information item in the element information item's [attributes] excepting those whose [namespace name] is identical to http://www.w3.org/2001/XMLSchema-instance and whose [local name] is one of type, nil, schemaLocation or noNamespaceSchemaLocation, the appropriate case among the following must be true:

And then goes on to detailing that everything else needs to be declared explicitly in your schema, including xml:lang and xml:base, therefore :S:S:S.

So, either you modify all your schemas so that each and every element includes those attributes (either by inheriting from a base type or using an attribute group reference), or your validation is bound to fail if someone decides to include something. Note that even if you could modify all your schemas, sometimes it means you will also have to modify the semantics, as a simple-typed element which you may have (with the type inheriting from xs:string for example) now has to become a complex type with simple content model only to accommodate the attributes. Ouch!!! And what's worse, if you're generating your API from the schema using tools such as xsd.exe or the much better XsdCodeGen custom tool, the new API will look very different, and you may have to make substantial changes to your application code.

This is an important issue that should be solved in .NET v2, or XInclude will be condemned to poor adoption in .NET. I don't know how other platforms will solve the W3C inconsistency, but I've logged this as a bug and I'm proposing that a property is added to the XmlReaderSettings class to specify that XML Core attributes should be ignored for validation, such as XmlReaderSettings.IgnoreXmlCoreAttributes = true. Note that there are a lot of Ignore* properties already so it would be quite natural.

I believe it is a significant bug in W3C XML Schema that it requires schema authors to declare up front in their schemas where xml:lang or xml:base may occur in their documents. Since I used to be the program manager for XML Schema technologies in the .NET Framework, this issue would have fallen on my plate. I spoke to Dave Remy, who took over my old responsibilities, and he's posted his opinion about the issue in his post XML Attributes and XML Schema. Based on the discussion in the comments to his post, it seems the members of my old team are torn on whether to go with a flag or try to push an errata through the W3C. My opinion is that they should do both. Pushing an errata through the W3C is a time consuming process, and in the meantime using XInclude in combination with XML Schema is significantly crippled on the .NET Framework (or on any other platform that supports both technologies). Sometimes you have to do the right thing for customers instead of being ruled by the letter of standards organizations, especially when it is clear they have made a mistake.
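In the meantime, the flag Daniel proposes amounts to stripping attributes in the http://www.w3.org/XML/1998/namespace namespace from the post-XInclude tree before handing it to the validator. Here is a sketch of that workaround in Python with ElementTree; the sample document and the idea of doing this as a pre-validation pass are illustrative, and in .NET the equivalent would be a similar walk over the infoset before invoking the validating reader:

```python
import xml.etree.ElementTree as ET

XML_NS = "http://www.w3.org/XML/1998/namespace"

def strip_xml_core_attributes(root):
    """Remove xml:base, xml:lang, etc. from every element so a schema
    that never declared them can still validate the document. This
    mirrors the proposed IgnoreXmlCoreAttributes behavior."""
    for elem in root.iter():
        doomed = [a for a in elem.attrib if a.startswith("{%s}" % XML_NS)]
        for name in doomed:
            del elem.attrib[name]
    return root

# XInclude processing would have stamped xml:base on the included element:
doc = ET.fromstring(
    '<doc><section xml:base="http://example.org/chap1.xml">text</section></doc>')
section = doc.find("section")
print(section.attrib)  # {'{http://www.w3.org/XML/1998/namespace}base': 'http://example.org/chap1.xml'}

strip_xml_core_attributes(doc)
print(section.attrib)  # {}
```

The obvious cost is that you lose the base-URI information the XInclude processor recorded, so any later relative-URI resolution against the tree has to happen before the stripping pass.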

Please vote for this bug on the MSDN Product Feedback Center


Categories: XML