Randy Holloway writes

This session was set up as an open conversation, with only one concrete agenda item.  That being RSS versus Atom.

Interesting, a bunch of developers get together to discuss weblogging technologies and they discuss the most irrelevant piece of the puzzle. For those not keeping track, there are two primary weblog syndication formats in popular usage; the RDF-based RSS 1.0 and Dave Winer's RSS 0.91/RSS 2.0. Developers tend to prefer Dave Winer's specs to the RSS 1.0 branch but due to various interpersonal issues with Dave Winer (unsurprising since he can be quite trying) a bunch of people decided to create a third syndication format (Atom) which duplicates the functionality of the other two primarily to get around the fact that Dave Winer controlled the spec for the most popular feed syndication format. This third format adds little to the table besides fragmenting the feed syndication world which already has to deal with RSS 1.0 vs. RSS 0.91/RSS 2.0 issues. In fact, this redundancy is currently being debated on the atom-syntax list.

There are three major technologies (i.e. XML formats) in the blogging world; feed syndication (RSS), blog editing (Blogger API & MetaWeblog API) and feed list information (OPML). Dave Winer's specs are dominant in all three areas given that he is the author of both the MetaWeblog API and OPML spec. Of the three of the them, RSS is probably the best of the specs and meets the needs of most users except for the Semantic Web folks who want an RDF-based format (i.e. RSS 1.0). On the other hand, there are significant deficiencies in both the MetaWeblog API and OPML. I have blogged about What is wrong with the MetaWeblog API as well as mentioned some of the problems with OPML as a format for storing information about subscribed feeds.

Given that RSS is the best of the weblog related technologies while the blog editing and feed list formats are actually the technologies with problems one might wonder why there is so much energy invested in fixing what isn't that broken instead of trying to tackle actual problems that affect developers and users of blogging tools? One of the answers to this question comes from a comment that Randy Holloway says someone made during the Weblogging BOF

"We don't solve problems, we just talk about them." 

Most of the people engaged in the discussions don't actually write any code or at least not any weblogging related code, so they are unaware of the real problems but instead focus on simple yet irrelevant issues that are easy to grok. This is definitely a case of bike shedding. [Hmmm, I love the term "bike shedding" so much I dug up  the original source of the phrase]

Speaking of Atom, I'm curious as to how all the XML Web Services folks at the Weblogging BOF felt about the fact that the current drafts of the ATOM API uses just HTTP and XML instead of the XML Web Services buzzword soup (SOAP, WSDL, etc) meaning they won't be using Indigo to code against it just plain old System.Xml and System.Net. If ""XML-RPC is a fantastic solution... from a while ago"  I wonder what they think of using just HTTP with no fancy object<->XML mappings, positively prehistoric :)


 

Categories: Ramblings

October 28, 2003
@ 06:58 AM

I finally got around to providing a list of the major features of RSS Bandit on the RSS Bandit wiki this evening. I finally was motivated to do this after reading a blog post by Luke Huttemann, the author of SharpReader and the ensuing comments.Of the ten comments in response to his post about an upcoming release of SharpReader,  four of them were requests for features already in RSS Bandit (actually two of them actually mentioned RSS Bandit by name). I find it amusing that the requestors would rather wait for Luke to add the features to SharpReader than use RSS Bandit. That's definitely some brand loyalty at work. :)

Anyway, in the spirit of democracy there's another vote going on in the RSS Bandit workspace. This time it's to decide what features should be in the next release of RSS Bandit. It seems like structured search (mentioned in a previous blog entry) is high on everyone's priority list and will definitely make it into the next release. Actually the release after the next one since I plan to ship a bugfix release in the next couple of days to clean up some of the sloppiness in the last release since it was rushed to be in time for PDC.

One of the things that warms the cockles of my heart is seeing RSS Bandit mentioned in posts such as  The Magic of Blogging. Having some of the stuff you work on mentioned in the same breath as the word "magic" is pretty fucking cool. It's great to see regular people actually getting some use out software I helped build instead of just building plumbing for developers which is my day job.  

 


 

Categories: RSS Bandit

October 28, 2003
@ 05:36 AM

Clemens Vasters writes

Indigo is the successor technology and the consolidation of DCOM, COM+, Enterprise Services, Remoting, ASP.NET Web Services (ASMX), WSE, and the Microsoft Message Queue. It provides services for building distributed systems all the way from simplistic cross-appdomain message passing and ORPC to cross-platform, cross-organization, vastly distributed, service-oriented architectures providing reliable, secure, transactional, scalable and fast, online or offline, synchronous and asynchronous XML messaging.

I think is truly awesome, they (folks like Don Box, Doug Purdy, Steve Swartz, Scott Gellock, Omri Gazitt , Mike Vernal , John Lambert et al) have not just cooked up a brand new distributed computing platform but have built it on open standards and open technologies meaning that probably for the first time in decades there won't be artifical, politics induced divisions limiting a distributed computing technology to particular platforms or operating systems (i.e. like CORBA, DCOM & Java RMI). The extra goodness is that these open standards are all XML based so crazy XML geeks like me can do stuff like this or people like Sam Ruby can do stuff like that.

The next generation of DCOM, just that this time it interoperates with everyone regardless of what programming language or operating system they are running.

Fucking sweet.


 

Categories: Life in the B0rg Cube | XML

Since I'm not at the Microsoft Professional Developer's conference, I decided to answer questions by attendees about stuff that I am directly or indirectly responsible for right here. So let's roll the questions out

Alan Dean writes

    • You can leverage the XML support in .NET against a data source of your choice (for example, the Registry) by implementing a new XmlReader. We were pointed to work by Mark Fussell on MSDN to do this (Writing XML Providers for Microsoft .NET).

      In Whidbey we will be encouraging people to implement custom XPathNavigator instances instead of custom XmlReaders unless their situation specifically calls for forward-only processing. The ObjectXPathNavigator is an example of a custom navigator

    • Tim indicated that the whitespace handling had been particularly useful in the field. He mentioned a gotcha with empty elements; namely that this
      <elementName></elementName>
      is actually the same as this
         <elementName>
         </elementName>

      except that the second has been pretty-printed with a CRLF and some whitespace indentation. The only way to handle this correctly in your XmlReader, however, is to use _reader.WhitespaceHandling = WhitespaceHandling.None

      This is a legacy from what I like to call our "unconformant by default" era which is how we shipped the XML parser in v1.0 and v1.1. There were a couple of nice features like not erroring on invalid characters and the above feature that people needed in some cases but shouldn't have been the default behavior since it was extremely difficult to figure out how to turn of all the features and get a conformant XML 1.0 parser. In v2.0 we're going to a "conformant by default" mode where people have to go out of their way to read in unconformant XML not the other way around.

    • The current pain suffered by not being able to CreateElement independent of an XmlDocument which led to the ImportNode hack to allow movement between documents. They were sufficiently noisy on this to lead me to think that this is resolved in Whidbey.

      To my knowledge this behavior will remain the same in Whidbey.

Kirk Allen Evans writes

There will be another, more difficult, XML parser, and you will hear Mark Fussell talk about that later this week.“ - Don Box

I would hedge bets that this revolves around XQuery or something, I am looking forward to hearing what this is.

I am extremely curious about what this means myself but don't think Don was talking about XQuery. If anything he probably was talking about APIs but even then we aren't changing much from what we provided in v1 in the area of the XML parser (i.e. the XmlReader class) although the implementation has been rewritten to be faster and more conformant (much props to Helena). I am puzzled about what Don meant by that statement since I've seen Mark's slides and there really isn't anything about another XML parser that users have to learn. Perhaps he meant another XML API which is valid given that we did a ton of work on the XPathDocument for v2.0 which means there may be a lot of people moving from using XmlDocument to using XPathDocument for a number of reasons.

 


 

Categories: Life in the B0rg Cube

October 27, 2003
@ 03:39 PM

So it looks like my boss, his boss, his boss's boss, and his boss's boss's boss are all out at the Microsoft Professional Developer's Conference 2003 (aka PDC) where folks will get a sneak peak at the next versions of Windows, SQL Server and Visual Studio. Thus it looks like won't be much whip cracking going on this week so I can spend time working on my pet projects for work.

  1. XML Developer Center on MSDN: Mark Fussel recently posted complaints about the quality of some articles on XML he'd recently read. I generally feel the same way about websites dedicated to articles about XML. Of all the developer sites devoted to XML there are only two I've seen that aren't utter crap; XML.com and IBM's XML developerWorks site. Even these are kind of hit or miss, XML.com usually publishes about 3 articles a week of which one is excellent, one is good and one is crap. Which is fine except that the excellent article is typically about something that isn't directly applicable to what I work on. The problem with IBM's DeveloperWorks is that all the code is Java-centric which doesn't help me since I work with the .NET Framework.

    After seeing some of what Tim Ewald did with producing content around Microsoft technologies and XML Web Services via the Web Services Developer Center on MSDN I talked to some of the folks at MSDN about creating something similar for XML content. This was green lighted a while ago but preparations for PDC has stopped this from taking off until next month. In the meantime, I'll be creating my content plan and coming up with a list of authors (both Microsoft employees and non-Microsoft folks) for new dev center.

    So far I've gotten a couple of folks lined up internally as well as some excellent non-Microsoft folks like Daniel Cazzulino, Christoph Schittko and Oleg Tkachenko. Definitely expect some pages to the XML Home Page on MSDN in the next few months.

  2. Sequential XPath and Pull Based XML Parsing: In 2001, Arpan Desai presented on Sequential XPath at XML 2001. Relevant bits from the paper

    This paper will provide an explanation of and the subset of XPath which we will tentatively dub: Sequential XPath, or SXPath for ease of use. SXPath allows a event-based XML parser, such as a typical SAX-compliant XML parser, to execute XPath-like expressions without the need of more memory consumption than is normally used within a sequential pull-based parser.
    ...
    By creating a streaming XML parser which utilizes Sequential XPath, one is able to reap the inherent benefits of a streaming parser with the querying power of XPath. By defining this proper subset of XPath, we enable developers and users to utilize XML in a wide array of applications thought to be too performance sensitive for traditional XML processing.
    The code for the technology outlined above has actually been gathering dust on some hard drives at work for a while. I'm currently in the process of liberating this code so that everyone can get access to the combined benefits of pull-based parsing and XPath based matching of nodes. Hopefully folks should be able to download classes similar to the ones outlined in Arpan's presentation in the next few weeks. Hopefully by Christmas, everyone will be able to write code similar to the following snippet taken from Tim Bray's XML is too Hard for Programmers
while (<STDIN>) {
  next if (X<meta>X);
  if    (X<h1>|<h2>|<h3>|<h4>X)
  { $divert = 'head'; }
  elsif (X<img src="/^(.*\.jpg)$/i>X)
  { &proc_jpeg($1); }
  # and so on...
}
Of course you'll have to substitute the Perl code above for C#, VB.NET or any one the various languages targetted at the .NET Framework.

 

Categories: XML

October 25, 2003
@ 05:47 PM

Get it here

Differences between v1.1.0.36 and v1.2.0.42 below.

  • Support for password protected feeds using either HTTPS/SSL or HTTP Authentication. This feature can be tested using Steven Garrity's test feeds.
  • The ability to store and retrieve feed list from remote locations such as a dasBlog blog, an FTP server or a network file share. This enables users utilizing RSS aggregators on multiple machines to synchronize their feed list from a single point. This feature has been called a subscription harmonizer by some.
  • Multiple feeds downloaded simultaneously instead of one at a time thus reducing download time.
  • When saving as OPML, the hierarchy of the feed list is preserved instead of writing out a flat structure.
  • Default theme for viewing items changed to resemble that of a mail reader like Outlook Express. 
  • Added support for <dc:author> and <author> elements to a number of templates including the default theme.
  • FIXED: Feed list corruption when importing an OPML file where xmlUrl="" for some feeds
  • FIXED: NullReferenceException involving streams when accessing feeds after RSS Bandit has been running for a long time.


 

Categories: RSS Bandit

"This paper proposes extending popular object-oriented programming languages such as C#, VB or Java with native support for XML. In our approach XML documents or document fragments become first class citizens. This means that XML values can be constructed, loaded, passed, transformed and updated in a type-safe manner. The type system extensions, however, are not based on XML Schemas. We show that XSDs and the XML data model do not fit well with the class-based nominal type system and object graph representation of our target languages. Instead we propose to extend the C# type system with new structural types that model XSD sequences, choices, and all-groups. We also propose a number of extensions to the language itself that incorporate a simple but expressive query language that is influenced by XPath and SQL. We demonstrate our language and type system by translating a selection of the XQuery use cases."

From Programming with Rectangles, Triangles, and Circles by Erik Meijer and Wolfram Schulte

I talk to Erik about this stuff all the time, so it's great to finally see some of the thoughts and discussions around this topic actually written down in a research paper. According to Erik's blog post from a few weeks ago he'll actually be presenting about this at XML 2003


 

Categories: XML

According to C|Net News

Amazon.com on Thursday unveiled a new service that lets bookworms search through pages of thousands of books available on its online store.

The service, dubbed "Search Inside the Book," lets people type in any keyword and receive results for all the pages and titles of various books that contain that term. In the past, Amazon customers could search only by author name, title or keyword.

I am impressed by how in one move Amazon made their search feature utterly useless. I just tried to search for "open source xml" and "java xml" books on Amazon it and it was a fucking disaster. Even the top 10 hits that were returned were polluted with books that simply had the words "Java" or "XML" somewhere in the book. In fact almost every search I tried returned Oracle9i JDeveloper Handbook  in the top 10. If ever a feature needed to be turned off by default it is this one.


 

Categories: Ramblings

October 23, 2003
@ 06:42 PM

Every once in a while I notice links from educational institutions that use my writings for their classes in my referrer logs. It gives me the warm fuzzies to know that I'm actually [indirectly] teaching a generation of CS geeks. In the past month I've seen the following referrers/references to my writings

My corrupting influence spreads...


 

Categories: Ramblings

October 23, 2003
@ 05:56 PM

I picked up a Belkin Mobile iPod FM Transmitter. on a whim last night. At first, I had issues with the amount of static and feedback that were being emitted from the speakers but once I figured out that I was supposed to turn down the volume on my iPod and turn it up on the car stereo it was heaven. Since this was an impulse buy I didn't shop around but if I had I may have decided on an iTrip instead since there are no dangling wires and batteries are not required. I'll see how I feel about the Belkin device in a week or so.

According to Slashdot, B0rg Central didn't have anything nice to say about the launch of iTunes on Windows. Looking beyond what seem like obvious sour grapes it is a bummer that iPods don't support the WMA format.

My favorite B0rg hater, Russell Beattie, has this to say about the iPod

So here's my thoughts: 1) The current iPod needs a successor and soon because consumers will start to balk at the B&W interface. 2) With the color screen and all that storage, it'd be dumb not to show multimedia like Photos and Video. 3) If Apple's going to show multimedia, they'll probably want to use Quicktime to do it... 4) If they're going that route, they'd need a Mobile OS to run it on. (Not to mention for other needs like supporting Wireless access to the iPod via WiFi or Bluetooth).

I guess I'm about the reveal myself as being a Luddite but I have no problem with the B & W iPod interface nor am I interested in taking pictures or playing videos on my music player. This annoying convergence of features has not interested me in my cell phone (which happen to have lost useful features over time like password protected address books for frivolous shit like games, web browsing and taking pictures) and I definitely don't want it in my music player especially if it keeps the price high instead of allowing it to drop to a more reasonable amount so I can pick up a few as Xmas gifts.


 

Categories: Ramblings