Andy Edmonds has a post over on the MSN Search blog entitled Tagging Feedback at MSN Search where he talks about the internal app he built that is used to track feature requests and bug reports about MSN Search.

When the MSN Search team gets feedback or bug reports, each one is "tagged" with multiple keywords/categories which can then be analyzed later by frequency. The example Andy shows in his post is the tag "ypResults", which is used to categorize feature requests for yellow page hits as part of web search results. With this system the search team has a simple yet effective way to keep track of their most hot button issues.
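The core idea is simple enough to sketch in a few lines of Python. This is just my illustration of tag-frequency analysis; the data and field names are made up, since the post doesn't describe the internal app's actual schema.

```python
# A toy sketch of tag-frequency analysis over feedback items.
# The data and field names here are hypothetical.
from collections import Counter

feedback = [
    {"text": "Show yellow page listings in results", "tags": ["ypResults"]},
    {"text": "Local results are wrong for my zip",   "tags": ["ypResults", "local"]},
    {"text": "Add spelling suggestions",             "tags": ["spelling"]},
]

# Count how often each tag appears across all feedback items.
tag_counts = Counter(tag for item in feedback for tag in item["tags"])

# The most frequent tags are the team's hot button issues.
for tag, count in tag_counts.most_common():
    print(f"{tag}: {count}")
```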

Andy showed this to me a few months ago and I thought it was really cool. I'd have loved to have a system like this when I used to work on the XML team to figure out what features/bugs were most often requested by users in a quantitative way.

Below is a screenshot of the feature (names changed to protect the innocent).


 

Categories: MSN

I was at the Anger Management 3 concert last night and it was quite the show. Lil' Jon & The Eastside Boyz were a welcome surprise as the opening act. They cycled through the BME clique hits from "Get Low" to "Salt Shaker" for the 30 minutes they were on stage. The problem with Lil' Jon is that most of the hits you associate with him are collaborations, so at concerts half the performance ends up not being live for songs such as "Yeah!" or "Lovers & Friends".

The next set had the entire G-Unit record label, including newly signed acts like Mobb Deep & M.O.P., performing for just over an hour. The first part of the G-Unit set sucked because we had to sit through the crap singles from the Tony Yayo, Young Buck and Lloyd Banks solo efforts as well as some of the crud from The Massacre. Halfway through it picked up with the better songs from The Massacre ("Disco Inferno", "Candy Shop"), old hits from Get Rich or Die Tryin' ("P.I.M.P.", "In Da Club", "Wanksta") and G-Unit's Beg for Mercy ("I Wanna Get To Know Ya"). M.O.P. did their hit from a few years ago, "Ante Up", and Mobb Deep hit the crowd with "Quiet Storm" without Lil' Kim. There was a momentary infusion of crap when a lot of time was devoted to a new 50 Cent & Mobb Deep song, but the show got back on track after that. The G-Unit set was OK but I'd have loved to hear some of their mix tape cuts instead of just mainstream tracks.

Eminem killed. He made the concert go from OK to fantastic with almost an hour and a half of performances from himself and D12. Even 50 Cent got in on the act when they performed "Patiently Waiting" and "Gatman & Robin". The parts of the show where Eminem riffed with the audience about tabloids, Mariah Carey and Michael Jackson were also golden.

If this show is going to hit your town you should definitely check it out.


 

Categories: Music

My buddy Erik Meijer and Peter Drayton have written a paper on programming languages entitled Static Typing Where Possible, Dynamic Typing When Needed: The End of the Cold War Between Programming Languages. The paper is meant to seek a middle ground in the constant flame wars over dynamically typed vs. statically typed programming languages. The paper is pretty rough and definitely needs a bunch of work. Take the following excerpt from the first part of the paper:

Static typing fanatics try to make us believe that “well-typed programs cannot go wrong”. While this certainly sounds impressive, it is a rather vacuous statement. Static type checking is a compile-time abstraction of the runtime behavior of your program, and hence it is necessarily only partially sound and incomplete. This means that programs can still go wrong because of properties that are not tracked by the type-checker, and that there are programs that while they cannot go wrong cannot be type-checked. The impulse for making static typing less partial and more complete causes type systems to become overly complicated and exotic as witnessed by concepts such as "phantom types" and "wobbly types"
...
In the mother of all papers on scripting, John Ousterhout argues that statically typed systems programming languages make code less reusable, more verbose, not more safe, and less expressive than dynamically typed scripting languages. This argument is parroted literally by many proponents of dynamically typed scripting languages. We argue that this is a fallacy and falls into the same category as arguing that the essence of declarative programming is eliminating assignment. Or as John Hughes says, it is a logical impossibility to make a language more powerful by omitting features. Defending the fact that delaying all type-checking to runtime is a good thing, is playing ostrich tactics with the fact that errors should be caught as early in the development process as possible.

We are interested in building data-intensive three-tiered enterprise applications. Perhaps surprisingly, dynamism is probably more important for data intensive programming than for any other area where people traditionally position dynamic languages and scripting. Currently, the vast majority of digital data is not fully structured, a common rule of thumb is less than 5 percent. In many cases, the structure of data is only statically known up to some point, for example, a comma separated file, a spreadsheet, an XML document, but lacks a schema that completely describes the instances that a program is working on. Even when the structure of data is statically known, people often generate queries dynamically based on runtime information, and thus the structure of the query results is statically unknown.
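The point about statically unknown structure is easy to illustrate. Here's a quick Python sketch of my own (not from the paper) of working with a comma separated file whose columns are only discovered at runtime:

```python
# A minimal sketch (mine, not the paper's) of data whose shape is only
# known at runtime: the column names come from the file's header row, so
# no static type can fully describe a row ahead of time.
import csv
import io

data = io.StringIO("name,city,zip\nAndy,Redmond,98052\nDare,Seattle,98101\n")

reader = csv.DictReader(data)  # columns are discovered from the header row
for row in reader:
    # 'row' is just a dict; a static type checker can't verify that
    # "city" is a valid key -- that fact only exists at runtime.
    print(row["name"], "lives in", row["city"])
```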

The comment about making programming languages more powerful by removing features being a logical impossibility seems rather bogus and out of place in an academic paper, especially when one can consider the 'removed features' to be restrictions which limit the capabilities of the programming language.

I do like the fact that the paper tries to dissect the features of statically and dynamically typed languages that developers like, instead of simply arguing dynamic vs. static as most discussions of this sort do. I assume the purpose of this dissection is to see if one could build a programming language with the best of both worlds. From personal experience, I know Erik has been interested in this topic from his days.

Their list of features runs the gamut from type inference and coercive subtyping to lazy evaluation and prototype inheritance. Although the list is interesting, I can't help thinking that Erik and Peter had already come to a conclusion and tried to fit the list of features included in the paper to that conclusion. This is mainly because a lot of the examples and features are taken from Cω instead of popular scripting languages.

This is definitely an interesting paper but I'd like to see more inclusion of dynamic languages like Ruby, Python and Smalltalk instead of a focus on C# variants like Cω. The paper currently looks like it is making an argument for Cω 2.0 as opposed to real research on what the bridge between dynamic and static programming languages should be.


 

Categories: Technology

Robert Scoble has posted a series of entries comparing the Bloglines Citations feature with Technorati.com for finding out how many sites link to a particular URL. His conclusion seems to be that Technorati sucks compared to Bloglines, which has led to an interesting back & forth discussion between him and David Berlind.

I've been frustrated by Technorati.com for quite a while and have been quietly using Bloglines Citations as an alternative when I want to get results from a web search, and PubSub for results I want to subscribe to in my favorite RSS reader. Technorati seems to lack the breadth of either service when it comes to finding actual blog posts that link to a site, and neither of them brings up unrelated crap such as blogrolls in its results.

The only problem with Bloglines is that their server can't handle the load and the citations feature is typically down several times during the day. Technorati has also had similar problems recently.

At this point all that Technorati seems to have going for it is first mover advantage. Or is there some other reason to use Technorati over competitors like Bloglines or PubSub that I've missed?


 

From Omar's post on Sender ID I see that Forbes has an article entitled Microsoft, Yahoo! Fight Spam--Sort Of. The article gives a pretty even-handed description of the various approaches both Yahoo! and MSN are taking in dealing with phishing and spam.

In the article we learn

While some e-mail services have adopted SenderID, there are still many that have not. According to Cox, the other reason for the false positives is that not all users remain on a single server. “SPF says, ‘All of my mail should come from these servers,’” says Cox. For many of EarthLink’s customers, they can be legitimately on a variety of servers, such as a corporate server, and still send and receive mail using their EarthLink address. For those users, SPF fails.

EarthLink started testing DomainKeys in the first quarter of 2005 and now signs over 70% of all outgoing mail. Other companies are also testing DomainKeys. Yahoo! Mail claims to be receiving approximately 350 million inbound DomainKeys signed messages per day.

Critics have accused Microsoft of forcing SenderID on the industry without addressing questions about perceived shortcomings. The company drew fresh criticism recently when reports claimed that its Hotmail service would delete all messages without a valid SenderID record beginning in November. While AOL uses SPF, many e-mail systems do not. If Microsoft went through with this, for example, a significant portion of valid e-mails would never reach intended Hotmail recipients.

Microsoft says that Hotmail will not junk legitimate e-mail solely because the sending domain lacks an SPF record. The company says SenderID will be weighed more heavily in filtering e-mails, but will remain one of the many factors used when evaluating incoming e-mail. The company did say that with increased adoption of Sender ID and SPF, it will eventually become a more reliable indicator.

Both SenderID and DomainKeys filter messages with spoofed e-mail addresses in which the sender has changed the "From:" field to make it look like someone else has sent the e-mail. For example, many phishing scams come from individuals posing as banks. Under the SenderID framework, if the bank has published an SPF record, the receiving server can compare the originating server against the SPF record. If they don't match, the receiving server flags it as spam. DomainKeys performs a similar comparison but uses an encrypted key in each message and the public key unique to each domain to check where the message originated.
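Stripped of the details, the SPF comparison described above is pretty simple. Here's a toy Python sketch of the core check; real SPF records live in DNS TXT records with a much richer syntax (includes, soft-fail, and so on), and the published record below is made up:

```python
# A toy sketch of the SPF-style check: compare the connecting server's
# IP against the networks the sending domain has published. Real SPF
# lives in DNS TXT records with a richer syntax; this is just the core idea.
from ipaddress import ip_address, ip_network

# Hypothetical published record: "bank.example sends mail from this network."
published_spf = {
    "bank.example": [ip_network("192.0.2.0/24")],
}

def spf_check(from_domain: str, connecting_ip: str) -> str:
    networks = published_spf.get(from_domain)
    if networks is None:
        return "none"  # domain publishes no SPF record
    if any(ip_address(connecting_ip) in net for net in networks):
        return "pass"  # server is authorized to send for this domain
    return "fail"      # likely spoofed -- flag as spam

print(spf_check("bank.example", "192.0.2.17"))   # pass
print(spf_check("bank.example", "203.0.113.5"))  # fail
```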

The amount of phony email I get per week claiming to be from Paypal & eBay and requesting that I 'confirm my account info or my account will be cancelled' is getting ridiculous. I welcome any technology that can be used to fight this flood of crap.


 

Categories: MSN

July 17, 2005
@ 05:54 AM

From Tim Bray's post entitled Atom 1.0 we learn

There are a couple of IETF process things to do, but this draft (HTML version) is essentially Atom 1.0. Now would be a good time for implementors to roll up their sleeves and go to work.

I'll add this to the list of things I need to support in the next version of RSS Bandit. The Longhorn RSS team will also need to update their implementation as well. :)

I couldn't help but notice that Tim Bray has posted an entry entitled RSS 2.0 and Atom 1.0, Compared which is somewhat misleading and inaccurate. I find it disappointing that Tim Bray couldn't simply announce the upcoming release of Atom 1.0 without posting a FUD style anti-RSS post as well.

I'm not going to comment on Tim Bray's comparison post beyond linking to other opinions such as those from Alex Bosworth on Atom Failings and Don Park on Atom Pendantics.


 

The list of PDC 2005 sessions is out. The website is rather craptacular since I can't seem to link directly to search results or directly to sessions. However, thanks to some inside information from my man Doug, I found that if you search for "POX" in the session track list, you'll find the following session abstract:

Indigo: Web Services for XML Programmers
If you love XML, you will love this session. Learn how to write services that range from Plain Old XML (POX) to WS-I Basic Profile and WS-* using Indigo. Learn best practices for transforming and manipulating XML data as well as how and when to expose strong-typed views. If you use XML, XSLT, XSD, and serialization directly in your Web services today, this session offers the information you need to successfully migrate your services to Indigo.
Session Level(s): 300
Track(s): Communications

Microsoft's next generation development platforms are looking good for web developers. AJAX support? check. RSS support? check. And now it looks like the Indigo folks will be enabling developers to build distributed applications on the Web using plain old XML (POX) over HTTP as well as SOAP. 

A number of popular services expose APIs on the Web using POX (although they mistakenly call them REST APIs). In my post Misunderstanding REST: A look at the Bloglines, del.icio.us and Flickr APIs I pointed out that the Flickr API, del.icio.us API and the Bloglines sync API are actually examples of POX web services, not REST web services. This approach to building services on the Web has grown increasingly popular over the past year and it's great that Microsoft's next generation distributed computing platform will support this approach.
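For anyone who hasn't seen one, the POX style is about as simple as distributed computing gets: an HTTP request that returns a plain XML document. A quick Python sketch against a hypothetical endpoint (the URL and element names are made up):

```python
# A minimal sketch of consuming a POX-style service: a plain HTTP GET
# that returns an XML document, with no SOAP envelope or WS-* headers.
# The endpoint URL and element names here are hypothetical.
from urllib.request import urlopen
import xml.etree.ElementTree as ET

url = "http://api.example.com/photos?tag=seattle"  # hypothetical POX endpoint

with urlopen(url) as response:
    doc = ET.parse(response)  # the payload is just XML

for photo in doc.getroot().findall("photo"):
    print(photo.get("id"), photo.get("title"))
```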

I spent a bunch of time convincing the Indigo folks to consider widening their view of Web services and thanks to open minded folks like Doug, Don & Omri it looks like I was successful.

Of course, it isn't over yet. The icing on the cake would be the ability to get full support for using REpresentational State Transfer (REST) in Indigo. Wish me luck. :)

Update: I was going to let the Indigo guys break this themselves but I've been told that it is OK to mention that there will be first class support for building REpresentational State Transfer (REST) web services using Indigo.


 

Categories: XML Web Services

July 13, 2005
@ 01:36 PM

I stumbled on Bus Monster last week and even though I don't take the bus I thought it was a pretty cool application. There's a mapping application that I've been wanting for a few years, and I instantly realized that, given the Google Maps API, I could just write it myself.

Before starting I shot a mail off to Chandu and Steve on the MSN Virtual Earth team and asked if their API would be able to support building the application I wanted. They were like "Hell Yeah" and instead of working on my review I started hacking on Virtual Earth. In an afternoon hacking session, I discovered that I could build the app I wanted and learned new words like geocoding.
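For folks who haven't run into the term, geocoding just means turning a street address into map coordinates. A toy Python illustration (the lookup table and coordinates are made up; services like Virtual Earth answer these lookups through a web API):

```python
# "Geocoding" turns a street address into latitude/longitude coordinates.
# This toy uses a hypothetical lookup table; real services like Virtual
# Earth or Google Maps answer these lookups through a web API.
places = {
    "One Microsoft Way, Redmond, WA": (47.64, -122.13),  # approximate
}

def geocode(address):
    """Return (latitude, longitude) for an address, if known."""
    return places.get(address)

print(geocode("One Microsoft Way, Redmond, WA"))  # (47.64, -122.13)
```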

My hack should be running internally on my web server at Microsoft before the end of the week. Whenever Virtual Earth goes live I'll move the app to my personal web site. I definitely learned something new with this application and will consider Hacking MSN Virtual Earth as a possible topic for a future Extreme XML column on MSDN. Would anyone be interested in that?


 

Categories: MSN | Web Development | XML

Sometime during the past week, the number of downloads of RSS Bandit from SourceForge crossed 100,000 for the most recent release and 300,000 total downloads since the project moved to SourceForge a year and a half ago. This isn't bad for a project that started off as a code sample in an MSDN article a few years ago.

However, even though Torsten and I have been improving the original code for about two years now, there is still a bunch of work to do. Some of these areas for improvement were recently pointed out by Jack Vinson in his posts RSSBandit Thoughts and More RSSBandit Experience. Below are his comments and responses from me.

"Next unread item" means the oldest unread item, rather than the youngest.  This seems to run counter to most of the aggregators, which present the newest unread item.  Interestingly, the "newspaper" view shows items in reverse chronological order.

I like to read posts in the order they were written especially when a newer post might be a follow up to an older post [as is the case with the latter post by Jack]. In general, I don't think anyone has really complained about this before.

Space bar goes to "next unread," rather than doing a "scroll" in the current reading pane window when viewing in newspaper mode.  If the reading pane has focus, it will scroll there.  When reading a single post, it does scroll as expected.

The behavior of going to the next unread item on hitting space bar predates the newspaper view. The problem we had when coming up with newspaper views was how to integrate both features in a way that was intuitive. The main issue here being that if the space bar scrolls you through the newspaper and you scroll half way down then click somewhere else, do you expect that half the posts from the newspaper view should be marked as read or stay unread? We didn't have a good idea of what the right choice would be so we punted on the problem by not scrolling in the newspaper view when you hit space but instead keeping the old "Next Unread" behavior.

RSS Bandit is much more sensitive to errors in the feeds - more accurately, it tells me that there are errors in some feeds.  They provide a "feed error" folder that lists problems as they arise.  But I see that the feeds it has trouble with work fine elsewhere.  Not good.

Some of the errors we report really aren't worth showing to end users. Things like HTTP timeouts are transient issues that are more likely due to the user's network than a problem with the feed. We need to do some filtering of these errors in future releases.
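The kind of filtering I have in mind would look something like the sketch below. RSS Bandit is written in C#, so this Python is purely illustrative, and the categories are my rough cut rather than a finished design:

```python
# An illustrative sketch of error filtering: treat network hiccups as
# transient (don't bother the user), surface real feed problems.
# RSS Bandit is C#; these categories are a rough cut, not its actual logic.
import socket
from urllib.error import HTTPError, URLError

TRANSIENT = (socket.timeout, TimeoutError, ConnectionError)

def should_report(error: Exception) -> bool:
    """Decide whether a feed error is worth showing to the user."""
    if isinstance(error, HTTPError):
        # 5xx means the server is having a bad day, not that the feed is bad.
        return error.code < 500
    if isinstance(error, URLError):
        # Timeouts and refused connections are likely the user's network.
        return not isinstance(error.reason, TRANSIENT)
    if isinstance(error, TRANSIENT):
        return False
    return True  # parse errors, bad XML, etc. are real feed problems
```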

I can't get the full text on excerpt-only feeds.  This is probably the biggest loss of moving from the old reader.

If the feed only has excerpts, how do we get the full text of the entry?

I like the newspaper view, when I select a folder (they call them "categories").  Articles are listed in descending order, but are grouped by feed.  I don't quite understand how the feeds are sorted (it's not that the feed with the most recent article is at the top).  This is a handy mode for reading unread stuff once or twice a day.

I like this feature as well. In the next version we'll be adding the ability to flag or mark items as read/unread from the newspaper view. The feeds should be sorted by the order they appear in the tree view.

RSS Bandit is a stand-alone application, but it uses the Internet Explorer engine to render HTML and XSLT.  By default, it opens links in tabs within the app.  You can also have it open links in the default browser.  I like the tabs in the application.  Now I need to find out if there are keyboard shortcuts for navigating the tabs.

Tabbed browsing is definitely cool. You can navigate between tabs by pressing the Ctrl+Tab or Shift+Ctrl+Tab keys. It's pretty sweet.

The BlogJet This plug-in works in the reading windows.  But the BlogJet This plug-in for IE does not work in the tabs that open within RSS Bandit.

Weird. I'm not sure why this is the case but can look into it.

Email this only emails the URL of the post.  I'd rather it give the entire text (HTML) of the item (along with the URL). 

I've kind of wondered about this myself but since no one has ever really complained I never changed it. Are there other RSS Bandit users out there that would prefer that "Email This" sent the body of the post and not just the URL as it does today?

I'm not quite clear on how the user interface is responding. Sometimes I will select a folder/category that has updated feeds, and I will get a view that lists just the new entries. Other times the newspaper will show both new and old entries. The topic list always shows both the new and old.

For search folders the newspaper view shows all the items in the folder while for regular feeds or folders/categories it shows the unread items.

One can create search folders to display ONLY unread messages, for example.

It seems slow, but this is my complaint with many of these apps. Maybe I just read too many feeds. Marking about 80 unread items read (when in the "unread view") took quite a while. Even 28 unread items took 10-15 seconds to "process." This seems to be a memory issue, because the next time I hit "mark all read" in the same usage session, it is much faster.

I agree that it does seem to take far too long for an operation like "Mark All Read" to be performed in a search folder. I'll work on improving the performance of this for the next version.

There seems to be no easy way to tell the software that I'm offline and to not bother downloading.

Go to the File menu and select "Work Offline". We also detect if you select this option directly from Internet Explorer as well.

When it's checking feeds, it eats a lot of resources. So much so, that I can't even scroll the current window, much less select a new feed to read. (Outlook has been doing the same thing to me lately.)

Downloading feeds is pretty CPU intensive for us. Not because of the actual downloading of the files but because we run the algorithm that infers relationships across different posts so we can show them as threaded conversations. I hacked on this code during the last release but only made it slightly less CPU intensive. I've considered just having an option to turn off this feature for the folks who'd rather have a more responsive UI than the threaded conversation feature.
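For the curious, the basic idea behind that threading feature is to treat a post that links to another post's permalink as a reply to it. A simplified Python sketch of my own (RSS Bandit's actual implementation is C# and does considerably more than this):

```python
# A simplified sketch of threading-by-links: a post whose content links
# to another post's permalink is treated as a reply to it. Illustrative
# Python only; RSS Bandit's actual (C#) implementation does more.
import re

posts = [
    {"id": "http://a.example/1", "content": "Original thought."},
    {"id": "http://b.example/9",
     "content": 'I disagree with <a href="http://a.example/1">this post</a>.'},
]

link_re = re.compile(r'href="([^"]+)"')
by_permalink = {p["id"]: p for p in posts}

# Scan every post's content for links that point at other known posts.
# Doing this across thousands of posts is why feed downloads get CPU heavy.
for post in posts:
    for url in link_re.findall(post["content"]):
        parent = by_permalink.get(url)
        if parent is not None:
            print(f'{post["id"]} is a reply to {parent["id"]}')
```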


 

Categories: RSS Bandit