Monday, 26 April 2004 - Dare Obasanjo's weblog

April 26, 2004

@ 05:46 PM

Joel Spolsky on Scoble as Microsoft's Fifth Column

In response to a recent post by Joel Spolsky, Robert Scoble asks if Does Joel believe that blogging is a waste of time?. I found Joel's response in the comments to Robert's post interesting. Joel writes

For every brilliant, useful post by Raymond Chen, there are ten "So, I think I'll try this blogging thing" posts.

What's unusual is for a small company to manage to force a large company to react, and that's what Dave effectively did. It took a lot of work to get the train rolling but now Microsoft is devoting enormous resources to blogging. Even if there are only 400 bloggers, and lets say that each blogger only spends 10% of their time blogging, that's the equivalent of 40 full time employees soaked up... about 10 times the staff of UserLand. If Microsoft gets to the point of 100% blogging which I'm sure is a reasonable goal, we're talking about the equivalent of 5000 employees getting paid to post about what they had for breakfast. That's how many employees Microsoft had TOTAL when I worked there. Dave Winer's idea could conceivably do more to soak up employee resources at Microsoft than Linus Torvald's idea, and that's why it's brilliant. In fact it could even surpass Wes Cherry's idea in sheer time-soaking-up ability. That would be something.

I find Joel's perspective interesting. First of all, I seriously doubt that Microsoft could ever get to the point where 100% of its employees were blogging. Secondly, he makes it seem that blogging doesn't provide value to Microsoft and it is a mere waste of time. This is very, very far from the truth. Besides the abstract benefits such as the fact that it “humanizes” Microsoft to developers there are many practical benefits which we provide to our customers.

Up until blogging, the only channels for getting technical information to our customers were press releases, articles on MSDN or Knowledge Base articles. That is a fairly high barrier to getting information that people working with Microsoft software need to get their jobs done. It isn't like this information doesn't exist. Daily there are hundreds of emails full of information about the inner workings of some component, the roadmap for some technology or the work around flying around the internal aliases at Microsoft that our customers never get to see but would be amazingly useful to them. Take Raymond Chen's blog as an example of this. About 3 years ago when I interned at Microsoft I stumbled on his internal web page that contained all sorts of interesting, useful and technical anecdotes about the history of Windows programming. Up until blogging, useful information such as What order do programs in the startup group execute?, Why are HANDLE return values so inconsistent? , or Why can't the system hibernate just one process? simply languished as information that was only privy to the folks who happened to be on the right email distribution lists at Microsoft or stumbled on the right internal website. Raymond's blog isn't the only one like this, just today I've seen Don Box post about the roadmap for .NET Remoting, Omar Shahine has a post on the issues with building .NET Framework components as addins to Outlook and Philo Janus on implementing context sensitive Help in InfoPath.

Would our customers have access to this wealth of information if we restricted ourselves to traditional channels of communication (press releases, white papers, KB articles, etc)? I don't think so. I do agree that like most things there are high quality blogs from Microsoft employees and others that aren't as useful. But that's life, 99% of everything is crap.

Categories: Life in the B0rg Cube

April 26, 2004

@ 04:31 PM

Comments [3]

XInclude for .NET, Sequential XPath and XPathNavigatorReader

It seems April is the month of custom implementations of the XmlReader. The first entry was Daniel Cazzulino's XPathNavigatorReader. As Daniel writes

There are many reasons why developers don't use the XPathDocument and XPathNavigator APIs and resort to XmlDocument instead... XPathNavigator is a far superior way of accessing and querying data because it offers built-in support for XPath querying independently of the store, which automatically gain the feature and more importantly, because it abstracts the underlying store

There are some problems with the XPathNavigator as implemented in the .NET Framework in v1.0 and v1.1. For the most part the APIs in the .NET Framework that work with XML mostly operate on instances of XmlReader or to a lesser extent XmlNode not XPathNavigator. Also there are some basic features one would expect from an XML API such as the ability to get the XML as a string which don't exist on the class. Daniel solves a number of these problems by implementing the XPathNavigatorReader which is a subclass of XmlTextReader implemented of over an XPathNavigator. This way you can pass an XPathNavigator to APIs expecting an XmlReader or XmlTextReader and get some user friendly functions like ReadInnerXml().

The second custom XmlReader I've seen this month is Oleg Tkachenko's XIncludingReader which is featured as part of his article on the MSDN entitled Combining XML documents with XInclude which provides a brief overview of XInclude and shows how to use the XIncludingReader which implements the XInclude 1.0 Last Call Working Draft from November 10, 2003. From the article

The key class within XInclude.NET is the XIncludingReader, found in the GotDotNet.XInclude namespace. The primary design goal was to build pluggable, streaming pipeline for XML processing. To meet that goal, XIncludingReader is implemented as an XmlReader, which can be wrapped around another XmlReader. This architecture allows easy plugging of XInclude processing layer into a variety of applications without any major modifications.

XML Inclusion process is orthogonal to XML parsing, validation, or transformation. That effectively means it's up to you when to allow XML Inclusion happen: after parsing, but before validation; or after validation, but before transformation, or even after transformation

The design of the XIncludingReader highlights the composability that was our goal when we originally shipped the XmlReader. One can layer readers on top of each other augmenting their capabilities as needed. We will definitely be emphasizing this more in Whidbey.

The third custom reader I've seen this month is the XPathReader. Nothing has been published about this class so far but I'm in the process of putting together an article about it which should show up on the MSDN XML Developer Center at the end of this week or early next week. The whet your appetite imagine an XmlReader that allows you to read XML in a forward-only, streaming manner but allows you to match XPath expressions based on the subset put forward by Arpan Desai in his paper Introduction to Sequential XPath. The following is a sample of how the XPathReader can be used

XPathCollection xc = new XPathCollection(); int query1 = xc.Add('//book/title'); XmlTextReader reader = new XmlTextReader("books.xml"); XPathReader xpathReader = new XPathReader(reader, xc); while (xpathReader.ReadUntilMatch()){ Console.WriteLine("Title={0}", xpathReader.ReadString()); }

I should be done with getting the article reviewed and the like in the next few days. April's definitely been the month of the XmlReader.

Categories: XML

April 24, 2004

@ 04:52 AM

Comments [2]

On Reinventing Terminology

In response to a post about tunneling variables in XSLT 2.0 on the Lambda the Ultimate weblog, Frank Atanassow writes

The markup language community is notorious for reinventing and duplicating concepts and terminology, sometimes even their own. Thus they have "minimum literals" rather than "string literals", "parameter entities" rather than "macros", "templates" rather than "procedures" or "functions", "validate" rather "type-check", "data binding" rather than "translation", "unmarshal" rather than "parse" et cetera.

I suspect that a lot of this is due to sins of omission instead of sins of commision. People in the markup world just aren't that aware of what's going on in the world of programming languages [or databases] and vice versa. I have to deal with this a lot at work.

Thanks to Joshua Allen for pointing out this comment to me.

Categories: XML

April 22, 2004

@ 05:00 PM

Comments [4]

Serializing an Object's State != Serializing an Object

In his post Why not standardize an Object Schema? Jason Mauss writes

I was listening to the latest .NET Rocks! episode; the part where they were discussing Service-Oriented systems. I don't remember exactly who-said-what but I do remember what was said. There was mention of something like, “You only want to pass XML messages back and forth, not objects.” The reasoning behind this (IIRC) had to do with interoperability. Let's say you have a .NET caller and a J2EE caller. Since they both define objects differently (and perhaps create and expect different serialized representations of objects) it's not gonna work. This got me thinking, why not have someone (like say, the W3C w/ the help of people at Sun, IBM, MS, etc.) develop a standard “object” schema for Web Services (and SO systems) to pass back and forth?

For example (this is just off the top of my head and not thought through well):
<object type=““ basetype=““>
   <property name=““ value=““ />
   <method name=““ accesstype=”” address="">
     <parameters>
        <parameter name="" type="" required="" />
     </parameters>
   </method>
</object>
I realize this is a huge simplification of what the schema might actually look like, but perhaps someone could provide me with some insight as to why this would or wouldn't be a good idea.

There are a number of points to tackle in this one post. The first is the misconception that XML and service orientation are somehow linked. Service orientation is simply a state of mind, go back and read Don's Don's four fundamentals of service orientation;

Boundaries are explicit
Services are autonomous
Services share schema and contract, not class
Service compatibility is determined based on policy

None of these explicitly rely on XML, except for the part about services sharing schemas and contracts not classes but XML isn't the only data format with a schema language. Some people such as Sun Microsoystems like to point out that ASN.1 schemas and binary encodings fit this bill as well. The key point is that you should be passing around messages with state not executable code. The fundamental genius of the SOAP 1.1 specification is that it brought this idea into the mainstream and built this concept into its very core. The original spec has this written into its design goals

A major design goal for SOAP is simplicity and extensibility. This means that there are several features from traditional messaging systems and distributed object systems that are not part of the core SOAP specification. Such features include

Distributed garbage collection
Boxcarring or batching of messages
Objects-by-reference (which requires distributed garbage collection)
Activation (which requires objects-by-reference)

Once you start talking about passing around objects and executable code the system becomes much more complex and much more tightly coupled. However experience from enterprise messaging systems and global distributed systems such as the World Wide Web show that you can build scalable, loosely coupled yet powerful applications in an architecture based on passing around messages and defining a couple of operations that can be performed on these messages. Would the Web be as successful if to make web requests you had to fire up Java RMI, DCOM, CORBA or some equivalent instead of making HTTP GET & HTTP POST requests over network sockets with text payloads?

Now as for Jason's schema, besides the fact that doing what he requests defeats the entire purpose of claiming to have built a service oriented application (even though the term is mostly meaningless anyway) the schema is missing the most important part. An object has state (fields & properties) as well as behavior (methods). Service oriented architectures dictate that you pass around state while the methods exist at the service end point, (e.g. an HTTP GET or HTTP POST request sends some state to the server either in the form of a payload or as HTTP headers which are then operated upon by the server which sends a result after said processing is done). Once you start wanting to send behavior over the wire you are basically asking to send executable code. The question then becomes what do you send; MSIL, Java byte codes, x86 instructions or some new fangled binary format? When you finally decide that all you would have done is reinvent Java RMI, CORBA, DCOM and every other distributed object system but this time it uses the XML magic pixie dust.

Categories: XML

April 21, 2004

@ 06:42 PM

Comments [4]

The W3C Trudges Along Ever So Slowly...

Am I the only one saddened by the fact that it's been over four years since Microsoft and IBM co-submitted the XInclude NOTE and the spec is still just a Candidate Recommendation? How about the fact that the W3C Query Languages workshop which led to the creation of the XQuery working group was held almost six years ago and the XQuery specification is still a Working Draft which means it is still a year or two from being done.

This lateness in delivering specs in combination with the unnecessary complexity yet lack of features of other W3C technologies such as XML Schema makes me feel more and more that the W3C is more of a hinderance to the world of XML development than a boon at this point.

Many feel that there isn't any alternative but grinning and bearing it. I wonder if that is truly the case and whether individual or community based innovation such as has happened with technologies like RSS or EXSLT isn't the way forward.

Categories: XML

April 21, 2004

@ 05:11 PM

Comments [0]

Why You Might Want to Declare a Class sealed or final.

Ted Neward, James Robertson and a few others have been arguing back and forth about the wisdom of preventing derivation of one's classes by marking them as sealed in C# or final in Java. Ted covers various reasons why he thinks preventing derivation is valid in posts such as Uh, oh; Smalltalk culture clashes with Java/.NET culture... , Unlearn Bo... and Cedric prefers unsealed classes. On the other side is James Robertson who thinks doing so amounts to excessively trying to protection developers in his post More thoughts on sealed/final.

As a library designer I definitely see why one would want to prevent subclassing of a particular type. My top 3 reasons are

Security Considerations: If I have a class that is assumed to perform certain security checks or conform to certain integrity constraints which is depended upon by other classes then it may make sense to prevent inheritance of that class.
Fundamental Classes: Certain types are so fundamental to the software system that replacing them is too risky to allow. Types such as strings or numeric types fall into this category. An API that takes a string or int parameter shouldn't have to worry whether someone has implemented their own “optimized string class“ that looks like System.String/java.lang.String but doesn't support Unicode or stores information as UTF-8 instead of UTF-16.
Focused Points of Extensibility: Sometimes a object model is designed with a certain extensibility points in which case other potential extensibility points should be blocked. For example, in System.Xml in Whidbey we will pointing people to subclass XPathNavigator as their way of exposing data as XML instead of subclassing XmlDocument or XPathDocument. In such cases it would be valid to have made the latter classes final instead of proliferating bad ideas such as XmlDataDocument.

There is a fourth reason which isn't as strong but one I think is still valid

Inheritance Testing Cost Prohibitive: A guiding principle for designing subclasses is the Liskov Substitution Principle which states

If for each object o1 of type S there is an object o2 of type T such that for all programs P defined in terms of T, the behavior of P is unchanged when o1 is substituted for o2 then S is a subtype of T.
although straightforward on the surface it is hard work testing that this is indeed the case when designing your base classes. To test that one's classes truly exhibit this principle subclasses should be created and run through a gamut of tests that the base class passes to see if the results are indistinguishable. Often it is actually the case that there is dependency on the internal workings of the class by related components. There are two examples of this in v1.0 of the APIs in System.Xml in the .NET Framework, XmlValidatingReader's constructor accepts an XmlReader as input but really only supports an XmlTextReader and XslTransform should accept an XPathNodeIterator as output from an extension function but really only supports the internal ResetableIterator. Neither of these cases is an example of where the classes should be sealed or final (in fact we explicitly designed the classes to be extensible) but they do show that it is very possible to ship a class where related classes actually have dependencies on its internals thus making subclassing it inappropriate. Testing this can be expensive and making classes final is one way to eliminate an entire category of tests that should be written as part of the QA process of the API. I know more than one team at work who've taken this attitude when designing a class or set of classes.

That's my $0.02 on the topic. I do agree that most developers don't think about the ramifications of making their classes inheritable when designing them but to me that is an argument that final/sealed should not be the default modifier on a class. Most developers won't remember to change the default modifier regardless of what it is, the question is whether it is more beneficial for there to be lots of classes that one can't derive from or that one can even if the class wasn't designed with derivation in mind. I prefer the latter.

Categories: Technology

April 21, 2004

@ 04:22 PM

Comments [2]

Adding Subscription Harmonization to RSS Bandit

Now that I've gotten Visual Studio.NET 2003 reinstalled on my machine I've been working towards fixing the few bugs I own that have prevented us from shipping a release. The most important bug left to fix is [ 930282 ] Remote storage doesn't actually synchronize state which exists because of laziness on my part.

RSS Bandit gives you the option to download and upload your feed list from a file share, an FTP server or a dasBlog weblog. However this doesn't actually do much synchronization during the import phase, basically it just adds the feeds that don't currently exist in your aggregator. It doesn't synchronize the read/unread messages, remove deleted feeds or remember which items you've flagged for follow up. I am in the process of fixing this for the next release.

I'm currently thinking that we'll break backwards compatibility with this feature. Synchronization will only work between current versions of RSS Bandit. You'll have a choice of two transfer formats ZIP and SIAM. If you select ZIP then we'll synchronize your search folders, flagged items, replied items, subscribed feeds and read/unread message state. All of these will be transferred as a ZIP file. The SIAM option will synchronize subscribed feeds and read/unread message state and will be transferred as a SIAM document. The supported data sources will be WebDAV folder, network share and FTP. I'm interested in any other options people think is interesting to support.

There are a couple of interesting problems to solve before I'm done mostly to do with how to perform the synchronization as quickly as possible. They revolve around scenarios like “What if my work machine has 3 months of posts while my home machine only has 2 weeks of posts in its cache and I synchronize between them?“ or “What happens if I've read different posts from the same feed on my work machine and on my home machine?“. The issue revolve around the fact that replacing the existing information with the incoming information while simple leads to information loss.

This should be fun, I've wanted this functionality for a while.

Categories: RSS Bandit

April 20, 2004

@ 04:37 PM

Comments [0]

XPath for the HTML DOM in Internet Explorer

Dimitri Glazkov has produced a JavaScript implementation of DOM Level 3 XPath for Microsoft Internet Explorer. Below are some examples of what using XPath from Javascript looks like with his implementation

Now counting all links on your document is just one XPath query:
var linkCount = document.evaluate( “count(//a[@href])“, document, null, XPathResult.NUMBER_TYPE, null).getNumberValue();
So is getting a list of all images without an alt tag:
var imgIterator = document.evaluate( “//img[not(@alt)]“, document, null, XPathResult.ANY_TYPE, null);
So is finding a first LI element of al UL tags:
var firstLiIterator = document.evaluate( “//ul/li[1]“, document, null, XPathResult.ANY_TYPE, null);

Excellent work. XPath is one of the most powerful XML technologies and getting support for it in clientside HTML scripting is should be a great boon to developers who do a lot of HTML processing using Javascript.

Categories: Mindless Link Propagation

April 20, 2004

@ 04:24 PM

Comments [2]

RSS Bandit 1.2.0.108 (beta) available

The march towards getting the next official release of RSS Bandit continues, until then you can download the latest version.

The biggest changes since the last beta are that two more language translations have been added, Brazilian Portuguese and Polish. This brings our number of supported languages to six (English, German, Russian, Simplified Chinese, Brazilian Portuguese and Polish). A couple of bugs with Search Folders were also fixed.

I'm currently the one holding up the release. I've finally gotten Visual Studio.NET 2002 installed on my machine and it seems that I should be able to get Visual Studio.NET 2003 installed today. Once done I'll work on the new installer and implement actual synchronization of feed state using a central server of your choice. Lack of this feature is beginning to get on my nerves.

Categories: RSS Bandit

April 20, 2004

@ 08:55 AM

Comments [0]

Living Up To Stereotypes

From the Duh! department are the following excerpts from the an interview with the author of Sister's Keeper

In her book, Robbins goes undercover at a college she calls “State U.” during the 2002-2003 school year to find out whether the stereotypes—binge drinking, drug use, eating disorders and promiscuity—are true.

NEWSWEEK: What kinds of things did you witness?
Alexandra Robbins: I really hadn’t expected to find the level of "Animal House" campiness that I did in some groups. They had a tradition called boob ranking where pledges had just a limited amount of time to strip off their shirt and bras to examine each other topless so that by the time the clock was up, they were basically lined up in order of chest size in order of the sisters to inspect. Some sororities hold what they call “naked parties,” during which after a few drinks sisters and pledges strip off their clothes and basically run around the house naked, some of them hooking up with each other before they let the boys in.

NEWSWEEK: Isn’t there a constant emphasis on boys?
Alexandra Robbins: From the mixers to the formals to the homecomings to fraternity parties—there’s frequently a race to get dates from a limited pool of acceptable fraternity guys. And white sororities are so centered on relationships with their ceremonies and rituals and songs to celebrate specific relationship milestones. By comparison, in at least one white sorority, the award for getting the highest GPA was a bag of potato chips. And you have to wonder what’s the point of a girls-only organization if it revolves around men.

NEWSWEEK: How prevalent are eating disorders?
Alexandra Robbins: I had heard urban legends about plumbers having to come clean out the pipes ever month or so in sororities because they get clogged with vomit. A lot of girls told me that was true. Eating disorders are so popular that some houses have puking contests after dinner. At State U., every single one of the 18 sororities had eating-disorder problems.

The entire premise of the book reminds me of an episode of the Maury Povich show, an excercise in voyeurism.

Categories: Mindless Link Propagation

Dare Obasanjo's weblog

"You can buy cars but you can't buy respect in the hood" - Curtis Jackson

Navigation for Monday, 26 April 2004 - Dare Obasanjo's weblog