May 2, 2004
@ 10:07 PM

I just finished fixing some bugs in the synchronization code in RSS Bandit and now the feature should work as expected. All I can say is, "Wow". I've always taken it for granted that I can open an instance of my mail client on different machines and get the same state, but until now the same hasn't been true of my news aggregator. Being able to click 'Download Feeds' on startup and have everything I read/flagged/replied to at home or at work synced up is totally sweet.

The only thing preventing a release now is that I'd like Torsten and me to come up with a way to improve the responsiveness of the GUI when loading feeds with thousands of items. On my machine, where I have 3 months of posts from blogs like Weblogs @ ASP.NET, the user interface is definitely creaking and groaning. This shouldn't take more than a few days so we should be on track for a release this week.


 

Categories: RSS Bandit

Ted Neward writes

So if I need a custom build task to do ASP.NET deployment, do I build a NAnt task knowing its lifetime is probably scoped to this Whidbey beta cycle? Or do I build a MSBuild task knowing that it'll probably be two years before it's really useful, and that it's entirely likely that I'll have to rewrite it at least once or twice in the meantime? Recommendation: Don't try to "build a better tool" originating out of the open-source community; choose instead to support the community by adopting it into the IDE or toolchain and commit resources to develop on it if it doesn't do what you want.

I tend to disagree with Ted. The big problem that Microsoft faces in this space is that developers tend to want all their tools to come from a single vendor. Ted shows this mentality himself by implying that even though a 3rd party tool that performs the tasks he wants already exists (NAnt), he'd rather wait for vaporware from Microsoft (MSBuild). Here I use the term vaporware to describe any product that has been announced and demoed but not released.

This problem means that Microsoft is constantly hit with a barrage of requests from developers to implement technologies even when existing 3rd party products can satisfy their needs. For example, our team constantly gets requests from developers who'd like the equivalent of a full-fledged XML development environment like XML Spy in Visual Studio.

My personal opinion is that Microsoft should build what it thinks the essentials are for developers into its development tools and the .NET Framework, then provide extensibility hooks for others to plug in and extend the existing functionality as needed. Contrary to the tone of Ted's post, the Visual Studio team isn't averse to shipping third party plugins in the box, as witnessed by its inclusion of Dotfuscator Community Edition with Visual Studio.NET 2003. Ironically, the first review of Visual Studio.NET I could find that mentions Dotfuscator contains this text in the conclusion

 I was kind of hoping that Microsoft was developing an obfuscator in-house that would be completely integrated into Visual Studio .NET

Yet another request for Microsoft to produce a tool when third party tools exist and one is even included in the box. Even if the Visual Studio team considered NAnt another third party tool they'd want to ship in this manner, I'm sure the fact that it is licensed under the GNU General Public License (GPL) would give them pause.

The other reason I disagree with Ted is that I believe in competition. I don't think the existence of WinAmp and QuickTime should mean Windows Media Player shouldn't exist, that the existence of Borland's IDEs means Visual Studio shouldn't exist, or that the existence of Yahoo! Mail means Hotmail shouldn't exist. I definitely don't think that a developer tool or API being produced by Microsoft or a third party precludes anyone else from developing one. The mentality that only one tool or one technology should exist in a given space is what stifles competition and community building, not the fact that Microsoft develops some technology that overlaps with something another entity is building.


 

Categories: Life in the B0rg Cube

The final beta version of RSS Bandit before the next release is available; you can download the latest version.

The major change in this version is that you can now synchronize RSS Bandit using WebDAV, FTP or a file share. If you select WebDAV, FTP or file share, we'll synchronize your search folders, flagged items, replied items, subscribed feeds and read/unread message state, all transferred as a single ZIP file. You can also select the dasBlog option, which will upload/download your feedlist as an OPML file from your weblog. The latter is functionality that has always existed and hasn't been modified to perform full synchronization.
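For the curious, the file share flavor boils down to something like the following rough sketch (using the #ziplib/SharpZipLib style of ZIP API; the file names and share path below are illustrative, not the actual RSS Bandit code):

using System.IO;
using ICSharpCode.SharpZipLib.Zip; // #ziplib

// Zip up the application state files (illustrative names)...
string[] stateFiles = { "subscriptions.xml", "flagitems.xml", "searchfolders.xml" };
string archive = Path.Combine(Path.GetTempPath(), "rssbandit-state.zip");

ZipOutputStream zip = new ZipOutputStream(File.Create(archive));
foreach (string file in stateFiles) {
    zip.PutNextEntry(new ZipEntry(file));
    FileStream fs = File.OpenRead(file);
    byte[] buffer = new byte[fs.Length];
    fs.Read(buffer, 0, buffer.Length);
    fs.Close();
    zip.Write(buffer, 0, buffer.Length);
}
zip.Finish();
zip.Close();

// ...after which synchronizing over a file share is just a copy.
File.Copy(archive, @"\\myserver\share\rssbandit-state.zip", true);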

I'll spend the weekend ironing out bugs in this feature and trying to fix some areas that cause a lack of responsiveness in the GUI. We should have a release on Monday or Tuesday at the latest.


 

Categories: RSS Bandit

I've been reading quite a bit about various opinions on standards in the software industry. The first piece I read about this was the C|Net article You call that a standard? in which Robert Glushko said

Q: Why have so many standards emerged for electronic commerce?
A: One of the issues here is what a standard is. That is one of the most abused words in the language and people like you (in the media) do not help by calling things standard that are not standards. Very few things are really standard. Standards come out of standards organizations, and there are very few of those in the world.

There is ANSI (American National Standards Institute), there is ISO (International Organization for Standardization), the United Nations. Things like OASIS and the W3C (World Wide Web Consortium) and WS-I (Web Services Interoperability Organization) are not standards organizations. They create specifications that occasionally have some amount of consensus. But it is the marketing term to call things standard these days.

I tend to agree that a lot of things the media and software industry pundits call “standards” are really specifications, not standards. However, I'd go even further and claim that simply calling something a standard because some particular organization produced it doesn't really jibe with reality. There has been lengthy discussion about this C|Net article on XML-DEV, and in one of the posts I wrote

The word "standard' when it comes to software and computer technology is usually meaningless. Is something standard if it produced by a standards body but has no conformance tests (e.g. SQL)? What if it has conformance testing requirements but is owned by a single entity (e.g. Java)? What if it is just widely supported with no formal body behind it (e.g. RSS)?
 
Whenever I hear someone say standard it's as meaningless to me as when I hear the acronym 'SOA', it means whatever the speaker wants it to mean.

Every one of the technologies mentioned in my post(s) on XML-DEV (SQL, Java, RSS, Flash) can be considered a standard by developers and their customers for some definition of the word 'standard'. In particular, I want to seize on Glushko's claim that standards are things produced by standards bodies, using the example of ANSI (American National Standards Institute) and the “SQL standard”. Coincidentally, I recently read an article entitled Is SQL a Real Standard Anymore? written by Michael Gorman, who has been the Secretary of the ANSI NCITS (National Committee on Information Technology Standards) H2 Technical Committee on Database for over 23 years. In the article he begins

What Makes a Standard a Standard?

Simple. Not implementation, but conformance. And, conformance is “known” only after conformance testing. Then and only then can users know with any degree of certainty that a vendor’s product conforms to a standard.
...
But, from the late 1980s through 1996 there was conformance testing. This was accomplished by the United States Government Department of Commerce’s National Institute of Standards and Technology (NIST). NIST conducted the tests in support of public law that was originally known as the "Brooks Act," and later under other laws that were passed in the 1990s. The force behind the testing was that no Federal agency was able to buy a DBMS unless it passed conformance tests. Conformance meant the possibility of sales.
...
The benefits derived from the NIST conformance tests were well documented. A NIST commissioned study showed that there were about $35 million in savings from a program that only cost about $600 thousand. But, in 1996, NIST started to dismantle its data management standards program. The publically stated reason was "costs." Obviously, that wasn’t true.
...
In May of 1996, I wrote an article for the Outlook section of the Washington Post. It was unpublished as it was considered too technical. The key parts of the article were:

"Because of NIST’s FY-97 and beyond plans, SQL’s conformance tests and certifications, that is, those beyond the SQL shell will be left to the ANSI/SQL vendors. They however have no motivation whatsoever to perform full and complete testing nor self policing. Only the largest buyer has that motivation, and in the case of ANSI/SQL the largest buyer is the United States Government.
...
"Simply put, without robust test development and conformance testing by NIST, DBMS will return to the days of vendor specific, conflicting features and facilities that will lock Federal agencies into one vendor, or make DBMS frightfully expensive acquire, use, and dislodge.”

This definitely hits the nail on the head. Standards are a means to an end, and in this case the goal of standards is to prevent vendor lock-in. That's it, plain and simple. The rest of Michael Gorman's article goes on to elaborate how the things he predicted in his 1996 article have come to pass and why SQL isn't much of a standard anymore, since vendors basically pay lip service to it and have no motivation to take it seriously. Going back to Glushko's article on C|Net, SQL is a standard since it is produced by a standards body, yet here we have the secretary of the committee saying that it isn't. Who are we to believe?

From my point of view, almost everything that is called a 'standard' by the technology press and pundits is really just a specification. The fact that W3C/ISO/ANSI/OASIS/WS-I/IETF/etc produced a specification doesn't make it a 'standard' by any real definition of the word except for one that exists in the minds of technology pundits. Every once in a while someone at Microsoft asks me “Is RSS a standard?” and I always ask “What does that mean?” because, as shown by the articles linked above, it is an ambiguous question. People ask the question for various reasons; they want to know about the quality of the specification, the openness of the process for modifying or extending the specification, where to seek clarifications, or whether the technology is controlled by a single vendor. All of these are valid questions but few [if any] of them are answered by the question “Is <technology foo> a standard?”


 

Categories: Technology

April 30, 2004
@ 04:33 AM

Breaking up with someone is another way of saying "I'd rather be alone than be with you".

:(


 

Categories: Ramblings

April 27, 2004
@ 03:54 PM

A few weeks ago Robert Scoble and Charles Torre came by my office and interviewed me for about an hour. The interview was impromptu and I rambled about all things XML at Microsoft. Below are the clips of the interview that have made it to Channel 9

  1. What is the biggest misperception of XML?

  2. What is the best place to learn about XML?

  3. Is Microsoft supporting XML standards?

  4. Where is Microsoft going with RSS/syndication?

  5. What would you pitch Bill Gates on?

I was unprepared and tended to ramble when answering some of the questions, but that made the answers more honest and off the cuff.


 

Categories: Life in the B0rg Cube

In response to a recent post by Joel Spolsky, Robert Scoble asks Does Joel believe that blogging is a waste of time? I found Joel's response in the comments to Robert's post interesting. Joel writes

For every brilliant, useful post by Raymond Chen, there are ten "So, I think I'll try this blogging thing" posts.

What's unusual is for a small company to manage to force a large company to react, and that's what Dave effectively did. It took a lot of work to get the train rolling but now Microsoft is devoting enormous resources to blogging. Even if there are only 400 bloggers, and lets say that each blogger only spends 10% of their time blogging, that's the equivalent of 40 full time employees soaked up... about 10 times the staff of UserLand. If Microsoft gets to the point of 100% blogging which I'm sure is a reasonable goal, we're talking about the equivalent of 5000 employees getting paid to post about what they had for breakfast. That's how many employees Microsoft had TOTAL when I worked there. Dave Winer's idea could conceivably do more to soak up employee resources at Microsoft than Linus Torvald's idea, and that's why it's brilliant. In fact it could even surpass Wes Cherry's idea in sheer time-soaking-up ability. That would be something.

I find Joel's perspective interesting. First of all, I seriously doubt that Microsoft could ever get to the point where 100% of its employees were blogging. Secondly, he makes it seem that blogging provides no value to Microsoft and is merely a waste of time. This is very, very far from the truth. Besides abstract benefits, such as the fact that it “humanizes” Microsoft to developers, there are many practical benefits it provides to our customers.

Up until blogging, the only channels for getting technical information to our customers were press releases, articles on MSDN and Knowledge Base articles. That is a fairly high barrier for information that people working with Microsoft software need to get their jobs done. It isn't as if this information doesn't exist. Daily there are hundreds of emails flying around internal aliases at Microsoft, full of information about the inner workings of some component, the roadmap for some technology or the workaround for some problem, that our customers never get to see but would find amazingly useful. Take Raymond Chen's blog as an example. About 3 years ago when I interned at Microsoft I stumbled on his internal web page that contained all sorts of interesting, useful and technical anecdotes about the history of Windows programming. Before blogging, useful information such as What order do programs in the startup group execute?, Why are HANDLE return values so inconsistent?, or Why can't the system hibernate just one process? simply languished as information available only to the folks who happened to be on the right email distribution lists at Microsoft or stumbled on the right internal website. Raymond's blog isn't the only one like this; just today I've seen Don Box post about the roadmap for .NET Remoting, Omar Shahine post on the issues with building .NET Framework components as add-ins to Outlook, and Philo Janus post on implementing context-sensitive Help in InfoPath.

Would our customers have access to this wealth of information if we restricted ourselves to traditional channels of communication (press releases, white papers, KB articles, etc.)? I don't think so. I do agree that, as with most things, there are high quality blogs from Microsoft employees and there are others that aren't as useful. But that's life: 99% of everything is crap.


 

Categories: Life in the B0rg Cube

It seems April is the month of custom implementations of the XmlReader. The first entry was Daniel Cazzulino's XPathNavigatorReader. As Daniel writes

There are many reasons why developers don't use the XPathDocument and XPathNavigator APIs and resort to XmlDocument instead... XPathNavigator is a far superior way of accessing and querying data because it offers built-in support for XPath querying independently of the store, which automatically gain the feature and more importantly, because it abstracts the underlying store

There are some problems with the XPathNavigator as implemented in v1.0 and v1.1 of the .NET Framework. The APIs in the .NET Framework that work with XML mostly operate on instances of XmlReader or, to a lesser extent, XmlNode, not XPathNavigator. Also, some basic features one would expect from an XML API, such as the ability to get a node's contents as an XML string, don't exist on the class. Daniel solves a number of these problems by implementing the XPathNavigatorReader, a subclass of XmlTextReader implemented over an XPathNavigator. This way you can pass an XPathNavigator to APIs expecting an XmlReader or XmlTextReader and get some user friendly functions like ReadInnerXml().
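Usage should look something like the following minimal sketch (I'm assuming XPathNavigatorReader exposes a constructor that takes the XPathNavigator to wrap; consult Daniel's post for the actual API):

XPathDocument doc = new XPathDocument("books.xml");
XPathNavigator nav = doc.CreateNavigator();

// Wrap the navigator so it can be passed to any API that expects an XmlReader
XmlReader reader = new XPathNavigatorReader(nav);

reader.MoveToContent();
Console.WriteLine(reader.ReadInnerXml()); // user friendly functions for free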

The second custom XmlReader I've seen this month is Oleg Tkachenko's XIncludingReader, which is featured in his MSDN article Combining XML documents with XInclude. The article provides a brief overview of XInclude and shows how to use the XIncludingReader, which implements the XInclude 1.0 Last Call Working Draft from November 10, 2003. From the article

The key class within XInclude.NET is the XIncludingReader, found in the GotDotNet.XInclude namespace. The primary design goal was to build pluggable, streaming pipeline for XML processing. To meet that goal, XIncludingReader is implemented as an XmlReader, which can be wrapped around another XmlReader. This architecture allows easy plugging of XInclude processing layer into a variety of applications without any major modifications.

XML Inclusion process is orthogonal to XML parsing, validation, or transformation. That effectively means it's up to you when to allow XML Inclusion happen: after parsing, but before validation; or after validation, but before transformation, or even after transformation

The design of the XIncludingReader highlights the composability that was our goal when we originally shipped the XmlReader. One can layer readers on top of each other, augmenting their capabilities as needed. We will definitely be emphasizing this more in Whidbey.
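As a quick sketch of what that layering looks like in practice (assuming the XmlReader-wrapping constructor described in the article):

// Layer XInclude processing over plain parsing; by the time Read() returns
// a node, xi:include elements in the underlying stream have been resolved.
XmlTextReader baseReader = new XmlTextReader("document.xml");
XIncludingReader reader = new XIncludingReader(baseReader);

while (reader.Read()) {
    // process the fully included stream like any other XmlReader
}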

The third custom reader I've seen this month is the XPathReader. Nothing has been published about this class so far but I'm in the process of putting together an article about it which should show up on the MSDN XML Developer Center at the end of this week or early next week. To whet your appetite, imagine an XmlReader that reads XML in a forward-only, streaming manner but also lets you match XPath expressions, based on the subset put forward by Arpan Desai in his paper Introduction to Sequential XPath. The following is a sample of how the XPathReader can be used

XPathCollection xc = new XPathCollection();
int query1 = xc.Add("//book/title"); // register the XPath expressions up front

XmlTextReader reader = new XmlTextReader("books.xml");
XPathReader xpathReader = new XPathReader(reader, xc);

// ReadUntilMatch() advances the stream to the next node matching a registered query
while (xpathReader.ReadUntilMatch()){
   Console.WriteLine("Title={0}", xpathReader.ReadString());
}

I should be done with getting the article reviewed and the like in the next few days. April's definitely been the month of the XmlReader.


 

Categories: XML

April 24, 2004
@ 04:52 AM

In response to a post about tunneling variables in XSLT 2.0 on the Lambda the Ultimate weblog, Frank Atanassow writes

The markup language community is notorious for reinventing and duplicating concepts and terminology, sometimes even their own. Thus they have "minimum literals" rather than "string literals", "parameter entities" rather than "macros", "templates" rather than "procedures" or "functions", "validate" rather than "type-check", "data binding" rather than "translation", "unmarshal" rather than "parse" et cetera.

I suspect that a lot of this is due to sins of omission rather than sins of commission. People in the markup world just aren't that aware of what's going on in the world of programming languages [or databases] and vice versa. I have to deal with this a lot at work.

Thanks to Joshua Allen for pointing out this comment to me.

 


 

Categories: XML

In his post Why not standardize an Object Schema? Jason Mauss writes

I was listening to the latest .NET Rocks! episode; the part where they were discussing Service-Oriented systems. I don't remember exactly who-said-what but I do remember what was said. There was mention of something like, “You only want to pass XML messages back and forth, not objects.” The reasoning behind this (IIRC) had to do with interoperability. Let's say you have a .NET caller and a J2EE caller. Since they both define objects differently (and perhaps create and expect different serialized representations of objects) it's not gonna work. This got me thinking, why not have someone (like say, the W3C w/ the help of people at Sun, IBM, MS, etc.) develop a standard “object” schema for Web Services (and SO systems) to pass back and forth?

For example (this is just off the top of my head and not thought through well):

<object type="" basetype="">
   <property name="" value="" />
   <method name="" accesstype="" address="">
     <parameters>
        <parameter name="" type="" required="" />
     </parameters>
   </method>
</object>

I realize this is a huge simplification of what the schema might actually look like, but perhaps someone could provide me with some insight as to why this would or wouldn't be a good idea.

There are a number of points to tackle in this one post. The first is the misconception that XML and service orientation are somehow linked. Service orientation is simply a state of mind; go back and read Don's four fundamentals of service orientation:

  • Boundaries are explicit
  • Services are autonomous
  • Services share schema and contract, not class
  • Service compatibility is determined based on policy

None of these explicitly relies on XML except the part about services sharing schemas and contracts, not classes, and even there XML isn't the only data format with a schema language; some people, such as the folks at Sun Microsystems, like to point out that ASN.1 schemas and binary encodings fit the bill as well. The key point is that you should be passing around messages with state, not executable code. The fundamental genius of the SOAP 1.1 specification is that it brought this idea into the mainstream and built it into its very core. The original spec has this written into its design goals

 A major design goal for SOAP is simplicity and extensibility. This means that there are several features from traditional messaging systems and distributed object systems that are not part of the core SOAP specification. Such features include

  • Distributed garbage collection
  • Boxcarring or batching of messages
  • Objects-by-reference (which requires distributed garbage collection)
  • Activation (which requires objects-by-reference)

Once you start talking about passing around objects and executable code, the system becomes much more complex and much more tightly coupled. However, experience with enterprise messaging systems and global distributed systems such as the World Wide Web shows that you can build scalable, loosely coupled yet powerful applications in an architecture based on passing around messages and defining a couple of operations that can be performed on those messages. Would the Web be as successful if, to make web requests, you had to fire up Java RMI, DCOM, CORBA or some equivalent instead of making HTTP GET and HTTP POST requests over network sockets with text payloads?

Now as for Jason's schema: besides the fact that doing what he requests defeats the entire purpose of claiming to have built a service oriented application (even though the term is mostly meaningless anyway), the schema misses the most important point. An object has state (fields & properties) as well as behavior (methods). Service oriented architectures dictate that you pass around state while the methods exist at the service endpoint (e.g. an HTTP GET or HTTP POST request sends some state to the server, either in the form of a payload or as HTTP headers, which is then operated upon by the server, which sends a result after said processing is done). Once you start wanting to send behavior over the wire you are basically asking to send executable code. The question then becomes what to send: MSIL, Java byte codes, x86 instructions or some newfangled binary format? Whichever you decide on, all you would have done is reinvent Java RMI, CORBA, DCOM and every other distributed object system, except this time with XML magic pixie dust sprinkled on top.
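To make the contrast concrete, here's a minimal sketch of the message-passing approach. The type below is hypothetical; the point is that it carries nothing but state, serialized to an XML message any platform can parse, while the behavior stays at the service endpoint:

using System;
using System.Xml.Serialization;

// Hypothetical message type: pure state, no behavior crosses the wire.
public class PurchaseOrder {
    public string ItemId;
    public int Quantity;
    public string ShipTo;
}

public class Example {
    public static void Main() {
        PurchaseOrder order = new PurchaseOrder();
        order.ItemId = "BK-1001";
        order.Quantity = 2;
        order.ShipTo = "One Microsoft Way";

        // Serialize the state into an XML message; the methods that operate
        // on this state live at the service endpoint, not in the message.
        XmlSerializer serializer = new XmlSerializer(typeof(PurchaseOrder));
        serializer.Serialize(Console.Out, order);
    }
}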


 

Categories: XML