January 5, 2004
@ 04:26 PM

Last night I posted about bugs in RSS Bandit that others have encountered but that aren't reproducible on my machine. Ted Leung was nice enough to post some details about the problems he had, and it turns out that RSS Bandit is at fault. The two primary ways of adding feeds to your subscription list have usability bugs.

  1. If you click the "New Feed" button and specify a feed URL without providing the URI scheme (e.g. entering www.example.org instead of http://www.example.org) then RSS Bandit assumes the URI is a local URI (i.e. file://www.example.org). Actually, it's worse; sometimes it just throws an exception. (A sketch of a possible fix follows this list.)

  2. The "Locate Feed" button, which uses Mark Pilgrim's ultra-liberal RSS autodiscovery algorithm to locate a feed for a site, shows a weird error message if it can't make sense of the HTML because it is too malformed (i.e. tag soup). There are a bunch of things I could change here, from using a better error message to falling back on Syndic8 to find the feed.
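
For the first bug, the fix can be as simple as defaulting to HTTP when no scheme is present. A minimal C# sketch (the helper below is my illustration, not RSS Bandit's actual code):

    // Hypothetical helper (my sketch, not RSS Bandit's actual code): if the
    // URL the user typed has no scheme, assume HTTP instead of file://.
    static string NormalizeFeedUrl(string input)
    {
        input = input.Trim();
        if (input.IndexOf("://") < 0)
        {
            // "www.example.org" becomes "http://www.example.org"
            return "http://" + input;
        }
        return input;
    }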

I'll fix both of these bugs before heading out to work today. Hopefully this will take care of the problems various people have had (and probably never mentioned) with adding feeds to RSS Bandit.
 

Categories: RSS Bandit

January 5, 2004
@ 08:37 AM

A few hours after my recent post about Roy's problems subscribing to feeds not being reproducible on my machine, I stumbled on the following excerpt from a post by Ted Leung:

I've played with RSS Bandit and there were some recent laudatory posts about the latest versions, so this morning I downloaded a copy (after doing Windows update for Win2K on the Thinkpad, rebooting, installing a newer version of the .NET framework, and rebooting...) and installed it. Things seemed fine, at least until I started adding feeds. The first two feeds I added were hers and mine. RSS Bandit choked on both. Now we have a internal/external network setup, complete with split DNS and a whole bunch of other stuff. I figured that might be a problem, and started tweaking. The deeper I got, the more I realized it wasn't going to work. I foresaw many pleas for technical support followed by frustration -- I mean, *I* was frustrated. So I dumped that and went for Plan B, as it were.

What's truly weird about his post is that I was reading it in RSS Bandit, which means his feed works fine on my machine but for some reason didn't work on his. In fact, I just checked his wife's blog and again had no problems reading it in RSS Bandit. *sigh*

I suspect gremlins are to blame for this...


 

Categories: RSS Bandit

January 5, 2004
@ 08:22 AM

Nick Bradbury recently posted an entry entitled On Piracy which read:

Many people who use pirated products justify it by claiming they're only stealing from rich mega-corporations that screw their customers, but this conveniently overlooks the fact that the people who are hurt the most by piracy are people like me.

Shareware developers are losing enormous amounts of money to piracy, and we're mostly helpless to do anything about it. We can't afford to sue everyone who steals from us, let alone track down people in countries such as Russia who host web sites offering pirated versions of our work...Some would argue that we should just accept piracy as part of the job, but chances are the people who say this aren't aware of how widespread piracy really is. A quick look at my web server logs would be enough to startle most people, since the top referrers are invariably warez sites that link to my site (yes, not only do they steal my software, but they also suck my bandwidth).

A couple of years ago I wanted to get an idea of how many people were using pirated versions of TopStyle, so I signed up for an anonymous email account (using a "kewl" nickname, of course) and started hanging out in cracker forums. After proving my cracker creds, I created a supposedly cracked version of TopStyle and arranged to have it listed on a popular warez site....This cracked version pinged home the first time it was run, providing a way for me to find out how many people were using it. To my dismay, in just a few weeks more people had used this cracked version than had ever purchased it. I knew piracy was rampant, but I didn't realize how widespread it was until this test.

The proliferation of software piracy isn't anything new. The primary reason I'm bothering to post about it is that Aaron Swartz posted an obnoxious response to Nick's post entitled On Piracy, or, Nick Bradbury is an Amazing Idiot which, besides containing a "parody" that is part Slippery Slope and part False Analogy, ends with the following gems:

Nick has no innate right to have people pay for his software, just as I have no right to ask people to pay for use of my name.

Even if he did, most people who pirate his software probably would never use it anyway, so they aren't costing him any money and they're providing him with free advertising.

And of course it makes sense that lots of people who see some interesting new program available for free from a site they're already at will download it and try it out once, just as more people will read an article I wrote in the New York Times than on my weblog.

...

Yes, piracy probably does take some sales away from Nick, but I doubt it's very many. If Nick wants to sell more software, maybe he should start by not screaming at his potential customers. What's next? Yelling at people who use his software on friends computers? Or at the library?

Aaron's arguments are so silly they boggle the mind, but let's take them one at a time. Human beings have no innate rights. Concepts such as "unalienable rights" and documents such as the Bill of Rights have been agreed upon by some societies as the law, but this doesn't mean they are universal or would mean anything if not backed up by the law and its enforcers. Using Aaron's argument, Aaron has no innate right to live in a house he paid for, eat food he bought or use his computer if some physically superior person or armed thug decides he covets his possessions. The primary thing preventing this from being the normal state of affairs is the law, the same law that states that software piracy is illegal. Western society has decided that Capitalism is the way to go (i.e. a party provides goods or services for sale and consumers of said goods and services pay for them). So for whatever definition of "rights" Aaron is using, Nick has a right to not have his software pirated.

Secondly, Aaron claims that if people illegally using your software can't afford it then it's OK for them to do so. This argument is basically, "It's OK to steal if what you want is beyond your purchasing power." Truly, why work hard and save for what you want when you can just steal it? Note that this handy rule of Aaron's also applies to all sorts of real life situations. Why not shoplift? After all, big department store chains can afford it and in fact factor it into their prices. Why not steal cars or rob jewellery stores if you can't afford what they offer? It's all insured anyway, right? The instant gratification generation truly is running amok.

The best part of Aaron's post is that even though Nick states that there are more people using pirated versions of his software than those who paid for it, Aaron dismisses this with his personal opinion that piracy wouldn't have cost many sales, then devolves into a slippery slope argument about whether people should pay for using Nick's software on a friend's computer or at the library. Of course, the simple answer to this question is that by purchasing the software the friend or the library can let anyone use it, the same way that I can carry anyone in my car after purchasing it.

My personal opinion is that if you think software is too expensive then (a) use cheaper alternatives, (b) write your own, or (c) do without it; after all, no one needs software. Don't steal it and then try to justify your position with inane arguments that sound like the childish "information wants to be free" rants that used to litter Slashdot during the dotbomb era.


 

Categories: Ramblings

I've just finished the first draft of a specification for Synchronization of Information Aggregators using Markup (SIAM), which is the result of a couple of weeks of discussion between myself and a number of other authors of news aggregators. From the introduction:

A common problem for users of desktop information aggregators is that there is currently no way to synchronize the state of information aggregators used on different machines in the same way that can be done with email clients today. The most common occurrence of this is a user who uses an information aggregator at home and at work or at school and who'd like to keep the state of each aggregator synchronized, independent of whether the same aggregator is used on both machines.

The purpose of this specification is to define an XML format that can be used to describe the state of an information aggregator, which can then be used to synchronize another information aggregator instance to the same state. The "state" of an information aggregator includes information such as which feeds are currently subscribed to by the user and which news items have been read by the user.

This specification assumes that an information aggregator is software that consumes an XML syndication feed in one of the following formats: ATOM, [RSS0.91], [RSS1.0] or [RSS2.0]. If more syndication formats gain prominence then this specification will be updated to take them into account.

This first draft owes a lot of its polish to comments from Luke Hutteman (author of SharpReader), Brent Simmons (author of NetNewsWire) and Kevin Hemenway aka Morbus Iff (author of AmphetaDesk). There are no implementations out there yet, although once enough feedback has been gathered about the current spec I'll definitely add this to RSS Bandit and deprecate the existing mechanisms for subscription harmonization.
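
To make the idea concrete, here's a rough C# sketch (using the .NET XmlSerializer) of the kind of state a SIAM document needs to carry. The class and element names below are invented for illustration; they are not the actual SIAM vocabulary:

    using System.Xml.Serialization;

    // Purely illustrative sketch of the state SIAM needs to capture; the
    // names here are invented, not taken from the spec. A document would
    // look roughly like:
    //   <aggregatorState>
    //     <feed url="http://www.example.org/rss.xml">
    //       <readItem>urn:example:item-123</readItem>
    //     </feed>
    //   </aggregatorState>
    [XmlRoot("aggregatorState")]
    public class AggregatorState
    {
        [XmlElement("feed")]
        public FeedState[] Feeds;
    }

    public class FeedState
    {
        [XmlAttribute("url")]
        public string Url;

        // IDs of items the user has already read, so a second aggregator
        // instance can mark them read instead of presenting them as new.
        [XmlElement("readItem")]
        public string[] ReadItemIds;
    }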

Brent Simmons has a post entitled The challenges of synching which highlights some of the various issues that came up in our discussions.


 

Categories: Technology | XML

I've written what should be the final draft of the specification for the "feed" URI scheme. From the abstract:

This document specifies the "feed" URI (Uniform Resource Identifier) scheme for identifying data feeds used for syndicating news or other content from an information source such as a weblog or news website. In practice, such data feeds will most likely be XML documents containing a series of news items representing updated information from a particular news source.

The primary change from the previous version was to incorporate feedback from Graham Parks about compliance with RFC 2396. The current grammar for the "feed" URI scheme is:

feedURI = 'feed:' absoluteURI | 'feed://' hier_part

where absoluteURI and hier_part are defined in section 3 of RFC 2396. One-click subscription to syndication feeds via this URI scheme is supported in the following news aggregators: SharpReader, RSS Bandit, NewsGator (in next release), NetNewsWire, Shrook, WinRSS and Vox Lite.
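
Under this grammar, feed:http://www.example.org/rss.xml and feed://www.example.org/rss.xml would both identify the same feed. Here's a rough C# sketch of how an aggregator might map such URIs back to HTTP for fetching (my own illustration, not code from any of the readers listed above):

    // Hypothetical sketch: translate a "feed" URI into the HTTP URL to fetch.
    static string FeedUriToHttpUrl(string feedUri)
    {
        if (feedUri.StartsWith("feed://"))
        {
            // 'feed://' hier_part form, e.g. feed://www.example.org/rss.xml
            return "http://" + feedUri.Substring("feed://".Length);
        }
        if (feedUri.StartsWith("feed:"))
        {
            // 'feed:' absoluteURI form, e.g. feed:http://www.example.org/rss.xml
            return feedUri.Substring("feed:".Length);
        }
        return feedUri;
    }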

The next step will be to find somewhere more permanent to host the spec.


 

Categories: Technology

January 5, 2004
@ 05:42 AM

One of the trying parts of being a tester for me was that I'd file a bug and the devs would resolve it as "Not Repro", which loosely translates to "it worked on my machine". Half the time the devs would be right, and half the time I'd be. It was always a pain trying to figure out whose machine was at fault, who had the most recent build, which dependencies differed, and the whole shebang.

Earlier today Roy Osherove wrote:

I  didn't expect for this to happen. RSS Bandit just gave me a huge disappointment. I really thought this was it, that it works OK now, but I was wrong. Out of about 150 feeds, RSS Bandit can't parse 22 of them. And not esoteric ones either:

Jermemy,HackNot,Feedster,Dino,Brain.Save() and a good few more. These are all feeds that work very well in sharpreader. Now, the only thing I can think of is that these feeds do not conform to all the rules an RSS needs to conform to. That means that RSS bandit is somehow too '"strict" enforcing those rules (just a guess). If I can still get these feeds some other place you can be sure that's the path I'll take.

All of the aforementioned feeds work on my machine. Screenshot below. Granted, I'm using the current snapshot from SourceForge, but the RSS handling code hasn't changed much besides a change I just made to fix the fact that in some cases the cache loses items if you hit [Refresh Feeds] on startup.


 

Categories: RSS Bandit

Roy Osherove has a recent blog entry entitled Moving to RSS Bandit: A simple review where he talks about why he's switched his primary news aggregator from SharpReader to RSS Bandit. In his post he asks a couple of questions, most of which are feature requests. His questions and my answers are below.

The Feed tree can only be widened to a certain extent. Why is that?

I'm not sure about the answer to this one. Torsten writes the UI code and he's currently on vacation. I assume this is done so that you can't completely cover one panel with another.

Posting to my blog from it

You can post to your blog from RSS Bandit using the w.bloggar plugin developed by Luke Hutteman. I've assigned a feature request bug to myself to ensure that this plugin is installed along with RSS Bandit.

a "blog about this item" feature which automatically asks you what parts of the item you'd like to be inserted into the new blog post (title,author name, quoted text...)

Once the ATOM effort produces decent specs around a SOAP API for posting to blogs and the various blogging tools start to support it, this will become native functionality of RSS Bandit. There's no ETA for this feature since it is dependent on a lot of external issues.

I can't wait for the search folders!

Neither can I. This feature will definitely be in the next release.

Pressing space while reading a long blog post does not scroll the explorer pane of the post (unless it is focused), but automatically takes you to the next unread post. I wish that would behave like SR where it would scroll the post until it ends and only then take you to the next one

I'll mention this to Torsten when he gets back, although I'm sure he'll read this entry before I get to it.

I wish there was an ability to choose whether you can order the feed tree alphabetically or by a distinct order the user wants (like SR)

I've always thought this was a weird feature request. I remember Torsten didn't like having to implement it, and the main justification for the feature I've heard from users is satisfied by Search Folders.

For some reason, some of the posts are blue and some not. What does that mean?

Blue means they contain a link to your webpage [as specified by you in the preferences dialog]. It's a handy visual way to determine when posts link to your blog. Again, this functionality is probably superseded by Search Folders.

I'd like to know how far down the feed list is the updating process when I press the "update all feeds" (a simple XX feeds left to update should do)

Another feature request for Torsten. I do like the fact that we now (in current builds not yet publicly released) provide visual indication of when items are being downloaded from a feed and when an error occurs during the downloading process.

Why is there a whole panel just for search when all there is is just a small text box? Why not simply put that box on the main tool bar?

The UI work for the Search feature isn't done yet. We will use up all that space once everything is done.

While we're at it, entering text in the search box and pressing enter should automatically run search (i.e. the Search Button should be the default button when the text box is active)

Agreed. This will be fixed.

I'd like to be able to set the default update rate for a category (which will impact all feeds in it) and not just for all feeds globally using the main options dialog

This makes sense. However, there is some complexity in that categories can nest and so on. I'll think about it; one possible approach is sketched below.
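
One plausible way to handle the nesting, sketched here in C# purely as an illustration (not a committed design), is to let a feed inherit the nearest refresh interval up its category chain:

    using System;

    // Hypothetical sketch: a feed inherits the refresh interval of the
    // nearest enclosing category that specifies one, else the global default.
    class Category
    {
        public Category Parent;
        public TimeSpan? RefreshInterval; // null means "inherit"
    }

    static TimeSpan GetRefreshInterval(Category category, TimeSpan globalDefault)
    {
        for (Category c = category; c != null; c = c.Parent)
        {
            if (c.RefreshInterval.HasValue)
            {
                return c.RefreshInterval.Value;
            }
        }
        return globalDefault;
    }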

NO RSS aggregator I've seen yet has been able to do this simple task: in the main .Net weblogs feed, show the name of the post author\Blog name next to the post title. Is this information simply missing from the feed? If not, how hard would it be to implement this?

This information is shown in the Reading Pane. Would you like to see this in the list view? For most blogs this would be empty (since the dc:author & author elements are rarely used) or redundant since most feeds are produced by a single blog.

I'd like to be able to setup the viewer pane to the right, and the posts pane to the bottom left (like in outlook's 2003 default view or like FeedDemon)

This is in the current builds although the feature is hidden. You have to right-click on the 'Feed Details' tab. I plan to talk to Torsten about making this a toolbar button like in Outlook/Outlook Express.


 

Categories: RSS Bandit

January 2, 2004
@ 09:55 PM

It snowed yesterday in the Seattle area. It was nice watching the snowflakes fall and afterwards I had the first snowball fight of my life. Then I got in my car, turned on the heat and drove a few blocks to the video store. By the time I got out of the store there was a crack almost a foot long on the driver's side of the windshield.

Crap. Crap. Crap.

Categories: Ramblings

January 1, 2004
@ 10:51 AM

Sean Campbell or Scott Swigart writes:

I want this also. I want a theory that unifies objects and data. We're not there yet.

With a relational database, you have data and relationships, but no objects. If you want objects, that's your problem, and the problem isn't insignificant. There's been a parade of tools and technologies, and all of them have fallen short on the promise of bridging the gap. There's the DataSet, which seeks to be one bucket for all data. It's an object, but it doesn't give you an object view of the actual data. It leaves you doing things like ds.Tables["Customer"].Rows[0]["FirstName"].ToString(). Yuck. Then there are Typed DataSets. These give you a pseudo-object view of the data, letting you do: ds.Customer[0].FirstName. Better, but still not what I really want. And it's just code-gen on top of the DataSet. There's no real "Customer" object here.

Then, there are ObjectSpaces that let you do the XSD three-step to map classes to relational data in the database. With ObjectSpaces you get real, bona fide objects. However, this is just a bunch of goo piled on top of ADO.NET, and I question the scalability of this approach.

Then there are UDTs. In this case, you've got objects all the way into the database itself, with the object serialized as one big blob into a single column. To find specific objects, you have to index the properties that you care about, otherwise you're looking at not only a table scan, but rehydrating every row into an object to see if it's the object you're looking for.

There's always straight XML, but at this point you're essentially saying, "There are no objects". You have data, and you have schema. If you're seeing objects, it's just an optical illusion on top of the angle brackets. In fact, with Web services, it's emphatically stated that you're not transporting objects, you're transporting data. If that data happens to be the serialization of some object, that's nice, but don't assume for one second that that object will exist on the other end of the wire.

And speaking of XML, Yukon can store XML as XML. Which is to say you have semi-structured data, as XML, stored relationally, which you could probably map to an XML property of an object with ObjectSpaces.

What happens when worlds collide? Will ObjectSpaces work with Yukon UDTs and XML?

Oh, and don't forget XML Views, which let you view your relational data as XML on the client, even though it's really relational.

<snip />

So for a given scenario, do all of you know which technology to pick? I'm not too proud to admit that honestly I don't. In fact, I honestly don't know if I'll have time to stress test every one of these against a number of real problem domains and real data. And something tells me that if you pick the wrong tool for the job, and it doesn't pan out, you could be pretty hosed.

Today we have a different theory for everything. I want the Theory of Everything.
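
The DataSet complaint is easy to see in running code. A minimal, self-contained C# sketch (the trivial "Customer" data is invented for illustration):

    using System;
    using System.Data;

    class DataSetDemo
    {
        static void Main()
        {
            // Build a tiny in-memory "Customer" table the way ADO.NET sees it.
            DataSet ds = new DataSet("Store");
            DataTable customers = ds.Tables.Add("Customer");
            customers.Columns.Add("FirstName", typeof(string));
            customers.Rows.Add(new object[] { "Ada" });

            // The untyped access pattern being complained about: string indexers
            // everywhere and a ToString() at the end, none of it checked at
            // compile time.
            string name = ds.Tables["Customer"].Rows[0]["FirstName"].ToString();
            Console.WriteLine(name);   // prints "Ada"

            // A typed DataSet generated from an XSD would let you write
            // ds.Customer[0].FirstName instead, but it's still code-gen over
            // these same rows and columns, not a real Customer object.
        }
    }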

I've written about this problem in the past, although at the time I didn't have a name for the Theory of Everything; now I do. From my previous post entitled Dealing with the Data Access Impedance Mismatch, I wrote:

The team I work for deals with data access technologies (relational, object, XML aka ROX) so this impedance mismatch is something that we have to rationalize all the time.

Up until quite recently the primary impedance mismatch application developers had to deal with was the Object<->Relational impedance mismatch. Usually data was stored in a relational database but primarily accessed, manipulated and transmitted over the network as objects via some object oriented programming language. Many felt (and still feel) that this impedance mismatch is a significant problem. Attempts to reduce this impedance mismatch have led to technologies such as object oriented databases and various object relational mapping tools. These solutions take the point of view that the problems of having developers deal with two domains, or having two sets of developers (DB developers and application coders), are solved by making everything look like a single domain: objects. One could also argue that the flip side of this is to push as much data manipulation as you can to the database via technologies like stored procedures, while mainly manipulating and transmitting the data on the wire in objects that closely model the relational database, such as the .NET Framework's DataSet class.

Recently a third player has appeared on the scene: XML. It is becoming more common for data to be stored in a relational database, mainly manipulated as objects but transmitted on the wire as XML. One would then think that, given the previously stated impedance mismatch and the fact that XML is mainly just a syntactic device, the XML representations of the data being transmitted would be sent as serialized versions of objects, relational data or some subset of both. However, what seems to be happening is slightly more complicated. The software world seems to be moving more towards using XML Web Services built on standard technologies such as HTTP, XML, SOAP and WSDL to transmit data between applications. And as stated in the WSDL 1.1 W3C Note:

WSDL recognizes the need for rich type systems for describing message formats, and supports the XML Schemas specification (XSD) [11] as its canonical type system

So this introduces a third type system into the mix: W3C XML Schema structures and datatypes. W3C XML Schema has a number of concepts that do not map to concepts in either the object oriented or relational models. To properly access and manipulate XML typed using W3C XML Schema you need new data access mechanisms such as XQuery. Now application developers have to deal with three domains, or we need three sets of developers. The first instinct is to continue with the meme where you make everything look like objects, which is what a number of XML Web Services toolkits do today, including Microsoft's .NET Framework via the XML Serialization technology. This tends to be particularly lossy because traditionally object oriented systems do not have the richness to describe the constraints that are possible to create with a typical relational database, let alone the even richer constraints that are possible with W3C XML Schema. Thus such object oriented systems must evolve to not only capture the semantics of the relational model but those of the W3C XML Schema model as well. Another approach could be to make everything look like XML and use that as the primary data access mechanism. Technologies already exist to make relational databases look like XML and to make objects look like XML. Unsurprisingly to those who know me, this is the approach I favor. The relational model could also be viewed as a universal data access mechanism if one figured out how to map the constraints of the W3C XML Schema model. The .NET Framework's DataSet already does some translation of an XML structure defined in a W3C XML Schema to a relational structure.

The problem with all three approaches I just described is that they are somewhat lossy or involve hacking one model into becoming the uber-model. XML trees don't handle the graph structures of objects well, objects can't handle concepts like W3C XML Schema's derivation by restriction, and so on. There is also a fourth approach, endorsed by Erik Meijer in his paper Unifying Tables, Objects, and Documents, where one creates a new unified model which is a superset of the pertinent features of the three existing models. Of course, this involves introducing a fourth model.
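
The DataSet's XML translation mentioned above is shipping functionality today. A minimal C# sketch using real ADO.NET calls (again with trivial invented data):

    using System;
    using System.Data;

    class RoxDemo
    {
        static void Main()
        {
            DataSet ds = new DataSet("Store");
            DataTable customers = ds.Tables.Add("Customer");
            customers.Columns.Add("FirstName", typeof(string));
            customers.Rows.Add(new object[] { "Ada" });

            // The same relational data exposed as an XML document ...
            Console.WriteLine(ds.GetXml());
            // <Store>
            //   <Customer>
            //     <FirstName>Ada</FirstName>
            //   </Customer>
            // </Store>

            // ... and the W3C XML Schema the DataSet infers for it. The reverse
            // direction (ReadXmlSchema) maps a schema back to relational tables.
            Console.WriteLine(ds.GetXmlSchema());
        }
    }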

The fourth model mentioned above is the unified theory of everything that Scott or Sean is asking for. Since the last time I made this post, my friend Erik Meijer has been busy and produced another paper, Programming with Circles, Triangles and Rectangles, that shows what such a unification of the ROX triangle would look like if practically implemented as a programming language. In this paper Erik describes the research language Xen, which seems to be the nirvana Scott or Sean is looking for. However, this is a research project, not something Sean or Scott is likely to use in production in the next year.

The main problem is that Microsoft has provided .NET developers with too much choice when it comes to building apps that retrieve data from a relational store, manipulate the data in memory, then either push the updated information back to the store or send it over the wire. The one thing I have learned working as a PM on core platform technologies is that our customers HATE choice. It means having to learn multiple technologies and make decisions on which is the best, sometimes risking making the wrong choice. This is exactly the problem Scott or Sean is having with the technologies we announced at the recent Microsoft Professional Developer Conference (PDC), which should be shipping this year. What technology should I use, and when should I use it?

This is something the folks on my team (WebData, the data access technology team) know we will have to deal with when all this stuff ships later this year, and we will deal with it to the best of our ability. Our users want architectural guidance and best practices, which we'll endeavor to make available as soon as possible.

The first step in providing this information to our users is the presentations and whitepaper we made available after PDC: Data Access Design Patterns: Navigating the Data Access Maze (PowerPoint slides) and Data Access Support in Visual Studio.NET code named “Whidbey”. Hopefully these will provide Sean, Scott and the rest of our data access customers with some of the guidance needed to make the right choice. Any feedback on the slides or document would be appreciated. Follow-up documents should show up on MSDN in the next few months.


 

Categories: Technology | XML

December 31, 2003
@ 04:46 PM

From ThinkGeek

Skillset Exportable
Insufficient ROI
Office of Employee Termination and Overseas Outsourcing

Definitely wouldn't mind rocking this around the B0rg cube.