Jeff Atwood has a blog post entitled Sorting for Humans : Natural Sort Order where he writes

The default sort functions in almost every programming language are poorly suited for human consumption. What do I mean by that? Well, consider the difference between sorting filenames in Windows explorer, and sorting those very same filenames via Array.Sort() code:

Implementing a natural sort is more complex than it seems, and not just for the gnarly i18n issues I've hinted at, above. But the Python implementations are impressively succinct

I tried to come up with a clever, similarly succinct C# 3.0 natural sort implementation, but I failed. I'm not interested in a one-liner contest, necessarily, but it does seem to me that a basic natural sort shouldn't require the 40+ lines of code it takes in most languages.

Since I’m still in my “learning Python by mapping it to C#” phase, I thought this should be a straightforward task. Below is the equivalent IronPython code for natural sort, slightly modified from the code posted in Jeff’s post, along with what I hoped would be a succinct version in C# 2.0. It would definitely be shorter in C# 3.0 [which I don’t plan to start using for another year or so]. The Python snippet below takes advantage of some interesting rules around comparing lists of objects which don’t exist in C#. I’m sure I could reduce the size of the C# code while maintaining readability but my procrastination time is over and I need to get to work.

Natural Sort in IronPython

import re

def sort_nicely( l ):
  """ Sort the given list in the way that humans expect. """
   
  convert = lambda x: int(x) if x.isdigit() else x  # conditional expression so that "0" also becomes an int
  alphanum = lambda key: [ convert(c) for c in re.split('([0-9]+)', key) ]
  l.sort( key=alphanum ) #serious magic happens here
  return l

print(sort_nicely(["z22.txt", "z5.txt" , "z.txt", "z10.txt", "z300.txt", "z2.txt", "z11.txt", "y.txt", "z", "z4.txt", "za.txt" ]))
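The "interesting rules around comparing lists" mentioned above can be seen in isolation. The following is a minimal sketch, written for Python 3 rather than IronPython; the helper name alphanum_key is made up for illustration. It shows that Python compares lists element by element, so splitting a filename into text and integer chunks is all the sort key needs to do.

```python
import re

def alphanum_key(name):
    """Split a name into alternating text chunks and integer chunks."""
    return [int(chunk) if chunk.isdigit() else chunk
            for chunk in re.split('([0-9]+)', name)]

# The key for a filename is a list of text and number chunks.
print(alphanum_key("z10.txt"))                            # ['z', 10, '.txt']

# Lists compare element by element, so 2 < 10 decides the order here.
print(alphanum_key("z2.txt") < alphanum_key("z10.txt"))   # True

# A list that is a prefix of another list is considered smaller.
print(alphanum_key("z") < alphanum_key("z2.txt"))         # True

# Plain string comparison gets the same pair "wrong" for humans.
print("z10.txt" < "z2.txt")                               # True
```

Because re.split with a capturing group always alternates text and digit chunks, elements at the same position in two keys have the same type, which is why the comparison never mixes integers and strings.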

Natural Sort in C# 2.0


using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;


public class Test {


   ///<summary>Compare two lists of strings using Python rules and natural order semantics</summary>
  public static int NaturalCompare(IList<string> a, IList<string> b) {
    int y, z, len = (a.Count < b.Count ? a.Count : b.Count);

    for (int i = 0; i < len; i++) {
      if (a[i].Equals(b[i])) continue;

      bool w = Int32.TryParse(a[i], out y), x = Int32.TryParse(b[i], out z);
      bool bothNumbers = w && x, bothNotNumbers = !w && !x;

      if (bothNumbers) return y.CompareTo(z);
      else if (bothNotNumbers) return a[i].CompareTo(b[i]);
      else if (w) return -1;
      else return 1; //numbers always less than words or letters
    }
    return (a.Count.CompareTo(b.Count)); //subset list is considered smaller 
  }

  public static List<string> SortNicely(List<string> list) {
    Regex re = new Regex("([0-9]+)");
    list.Sort(delegate(string x, string y) { return NaturalCompare(re.Split(x), re.Split(y)); });
    return list;
  }


  public static void Main(string[] args) {
    List<string> l = new List<string>(new string[] { "z.txt", "y.txt", "z22.txt", "z5.txt", "z10.txt", "z3.txt", "z2.txt", "za.txt", "z11.txt", "z400.txt" });
    foreach (string s in SortNicely(l)) Console.WriteLine(s);
  }
}

Now playing: Notorious B.I.G. - Real N*ggas Do Real Things


 

Categories: Programming

Yesterday I read the announcement Opening up Facebook Platform Architecture. My initial thought is that Facebook has done what Google claimed to have done, but didn't, with OpenSocial. Facebook seems to have provided detailed specs on how to build an interoperable widget platform, unlike Google, which unleashed a bunch of half-baked REST API specs with no details about the "widget" aspect of the platform unless you are building an Orkut application.

As I've thought about this over the past few weeks, I've concluded that building a widget platform that is competitive with Facebook's is hard work. Remember all those stories about OpenSocial apps being hacked in 45 minutes or less? The problem was that sites like Plaxo Pulse and Ning simply didn't think through all the ramifications of building a widget platform and bumped up against the kind of "security 101" issues that widget platforms like Netvibes, iGoogle and Live.com gadgets solved years ago. I started to wonder exactly how many of these social networking sites will be able to keep up with the capabilities and features of platforms like Facebook's and Orkut's when such development is outside their core competency.

In fact let's take a quote from the TechCrunch story First OpenSocial Application Hacked Within 45 Minutes 

theharmonyguy says he’s successfully hacked Facebook applications too, including the Superpoke app, but that it is more difficult:

Facebook apps are not quite this easy. The main issue I’ve found with Facebook apps is being able to access people’s app-related history; for instance, until recently, I could access the SuperPoke action feed for any user. (I could also SuperPoke any user; not sure if they’ve fixed that one. Finally, I can access all the SuperPoke actions - they haven’t fixed that one, but it’s more just for fun.) There are other apps where, last I checked, that was still an issue ( e.g. viewing anyone’s Graffiti posts).

But the way Facebook setup their platform, it’s tons harder to actually imitate a user and change profile info like this. I’m sure this kind of issue could be easily solved by some verification code on RockYou’s part, but it’s not inherent in the platform - unlike Facebook. I could do a lot more like this on FB if Facebook hadn’t set things up the way they did.

At that point I ask myself, how useful is it to have the specs for the platform if you aren't l337 enough to implement it yourself? [Update: It looks like Google is well aware of this problem and has launched an Apache project called Shindig which is meant to be an Open Source widget platform that implements the OpenSocial APIs. This obviously indicates that Google realizes the specs are worthless and that shipping a reusable widget platform is the way to go instead. It’s interesting to note that with this move Google is attempting to be a software vendor, advertising partner and competitor to the Web’s social networking sites. That must lead to some confusing internal meetings.]

For now, Facebook has definitely outplayed Google here. The most interesting part of the Facebook announcement to me is

Now we also want to share the benefits of our work by enabling other social sites to use our platform architecture as a model. In fact, we’ll even license the Facebook Platform methods and tags to other platforms. Of course, Facebook Platform will continue to evolve, but by enabling other social sites to use what we’ve learned, everyone wins -- users get a better experience around the web, developers get access to new audiences, and social sites get more applications.

It looks like Facebook plans to assert their intellectual property rights against anyone who clones their platform. This is one of the reasons I've found OpenSocial to be a worrisome abuse of the term "open". Like Facebook, Google shipped specs for a proprietary platform whose copyrights, patents, etc. belong to them. Any company that implements OpenSocial, or even GData which it is built upon, is using Google's intellectual property.

What's to stop Google from asserting these intellectual property rights the way Facebook is doing today? What exactly is "open" about it that makes it any less proprietary than what Facebook just announced?


 

danah boyd writes eloquently about the slippery slope we are now headed down thanks to the way Facebook is influencing the design of social software applications when it comes to privacy. She writes in her post entitled Facebook's "opt-out" precedent

I've been watching the public outcry over Facebook's Beacon (social ads) program with great interest…For all of the repentance by Facebook, what really bugs me is that this is the third time that Facebook has violated people's sense of privacy in a problematic way.

In each incident, Facebook pushed the boundaries of privacy a bit further and, when public outcry took place, retreated just a wee bit to make people feel more comfortable. In other words, this is "slippery slope" software development.

I kinda suspect that Facebook loses very little when there is public outrage. They gain a lot of free press and by taking a step back after taking 10 steps forward, they end up looking like the good guy, even when nine steps forward is still a dreadful end result. This is how "slippery slopes" work and why they are so effective in political circles. Most people will never realize how much of their data has been exposed to so many different companies and people. They will still believe that Facebook is far more private than other social network sites (even though this is patently untrue). And, unless there is a large lawsuit or new legislation introduced, I suspect that Facebook will continue to push the edges when it comes to user privacy.

Lots of companies are looking at Facebook's success and trying to figure out how to duplicate it. Bigger companies are watching to see what they can get away with so that they too can take that path.

I’ve stated before that one of my concerns about Beacon is that it legitimizes what is truly worrying behavior when it comes to companies respecting people’s privacy on the Web. As it stands now we have companies thinking it is OK to send out information about money you are loaning to your friend and that it is OK to violate federal legislation and share information about movies you have rented to watch in the privacy of your home without user consent.

This is an unprecedented degree of violation of the sanctity of the customer’s private Web experience. What I find sad is that not only are the technologically unsavvy giving up their privacy on the Web in a way that they would never accept in meat space, but even the technologically savvy who know what is going on just assume it is par for the course. For example, see comments by John Dowdell of Adobe, who implies that we were already led down this slippery slope by DoubleClick in the 90s and this is just the natural progression.

I actually worry less about Facebook and more about what happens when the Googles, DoubleClicks, Microsofts, and Yahoos of the world decide that “If Facebook can get away with it, we should do it too especially if we want to stay competitive”. In that world, your privacy and mine becomes collateral damage in the chase after the almighty dollar euro.

Now playing: Ashanti - Unfoolish (feat. Notorious B.I.G.)


 

Categories: Social Software

Things have been progressing well with RSS Bandit development recently. Some of our recent changes seem so valuable to me that I’ve been flirting with throwing out all our plans for Phoenix and shipping a new release right away. That’s how much I like the personalized meme tracking feature. I also fixed a bug where we continually fetch favicons for all your feeds if one of the sites in your subscriptions gives us an invalid image when we fetch its favicon. This bug affects me personally since a lot of RSS Bandit users are subscribed to my site and are polling my favicons several times an hour instead of once every startup or less. 

Despite those thoughts, we will continue with our plan to add the top 5 features for the next version of RSS Bandit which I blogged about last month. I need to do some fit & finish work this weekend on the meme tracking feature and then it is on to the next set of tasks. Torsten will be looking at ways to add UI for managing your pending and downloaded podcasts. I will be working on adding support for treating the Windows RSS platform and NewsGator Online as “Feed Sources”. This will mean that you can use RSS Bandit in standalone mode and as a desktop client for feeds that you are sharing with either NewsGator applications (FeedDemon, NewsGator Online, NetNewsWire, etc) or Windows RSS platform applications (Internet Explorer 7, Outlook 2007, etc).

For a long time, people have been asking me to treat services like NewsGator Online the same way an email client like Outlook treats mail servers like Exchange, instead of the arm's-length degree of integration I’ve done in the past. It’s taken a while but I’m now going to go ahead and do just that.

With that done, we’d probably have enough new features to ship an alpha and start getting initial feedback. I estimate that this will happen sometime in the first quarter of 2008. I also plan to go over our backlog of bugs during the holiday season and will knock out as many as I can before the alpha.

If you have any questions or comments, fire away. I’m all ears.

Now playing: Scarface - Diary of a Madman


 

Categories: RSS Bandit

The top story in my favorite aggregator today is the announcement on Scott Guthrie’s blog of the ASP.NET 3.5 Extensions CTP Preview. Normally, announcements related to ASP.NET would not interest me, except this time there is an interesting item in the list of technologies being released

ADO.NET Data Services: In parallel with the ASP.NET Extensions release we will also be releasing the ADO.NET Entity Framework.  This provides a modeling framework that enables developers to define a conceptual model of a database schema that closely aligns to a real world view of the information.  We will also be shipping a new set of data services (codename "Astoria") that make it easy to expose REST based API endpoints from within your ASP.NET applications.

Wow. It looks like Astoria has quickly moved from being an experimental project to see what it would look like to place RESTful interfaces on top of a SQL Server database to being very close to shipping a production version. I dug around for more posts about Astoria (now ADO.NET Data Services) so I could find out what was in the CTP and came across two posts from Mike Flasko and Andy Conrad respectively.

In his post entitled ADO.NET Data Services ("Project Astoria") CTP is Released on the ADO.NET Data Services team blog Mike Flasko writes

The following features are in this CTP:

  • Support to create ADO.NET Data Services backed by:
    • A relational database by leveraging the Entity Framework
    • Any data source (file, web service, custom store, application logic layer, etc)
  • Serialization Formats:
    • Industry standard AtomPub serialization
    • JSON serialization
  • Simple HTTP interface
    • Any platform with an HTTP stack can easily consume a data service
    • Designed to leverage HTTP semantics and infrastructure already deployed at large
  • Client libraries:
    • .NET Framework
    • ASP.NET AJAX
    • Silverlight (coming soon)

This is sick. With Astoria I can expose my relational database or even just a local XML file using a RESTful interface that utilizes the Atom Publishing Protocol or JSON. I am somewhat amused that one of the options is placing a RESTful interface over a SOAP Web Service. My, how times have changed…

It is pretty cool that Microsoft is the first major database vendor to bring the dream of the Atom store to fruition. I also like that one of the side effects of this is that there is now an AtomPub client library for .NET Framework.

Andy Conrad has a blog post entitled Linq to REST which gives an idea of what happens when you combine the Astoria client library with the Language Integrated Query (LINQ) features of C# 3.0

    [OpenObject("PropBag")]
    public class Product{
        private Dictionary<string, object> propBag = new Dictionary<string, object>();

        [Key]
        public int ProductID { get; set; }        
        public string ProductName { get; set; }        
        public int UnitsInStock { get; set; }
        public IDictionary<string, object> PropBag { get { return propBag; } }
    }

        static void Main(string[] args){
            WebDataContext context = new WebDataContext("http://localhost:18752/Northwind.svc");
            var query = from p in context.CreateQuery<Product>("Products")
                        where p.UnitsInStock > 100
                        select p;

            foreach (Product p in query){
                Console.WriteLine(p.ProductName + " , UnitsInStock= " + p.UnitsInStock);
            }

        } 

If you hover over the query variable, you will actually see the Astoria URI which the Linq query is translated into by the Astoria client library:

http://localhost:18752/Northwind.svc/Products?$filter=(UnitsInStock)%20gt%20(100)

So, there you go.  Linq to Astoria's RESTFUL API.  In other words, Linq to REST. 
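To make the translation described above concrete, here is a minimal Python sketch that rebuilds the example URI by hand. It does not use the Astoria client library, and the function name build_filter_uri and its parameters are made up for illustration. It only models the single "property, operator, value" shape seen in the URI above, with the operator name gt taken from that URI.

```python
from urllib.parse import quote

def build_filter_uri(service_root, entity_set, prop, op, value):
    """Build a URI of the shape <root>/<set>?$filter=(<prop>) <op> (<value>)."""
    expression = "({0}) {1} ({2})".format(prop, op, value)
    # Keep the parentheses literal and percent-encode the spaces as %20.
    return "{0}/{1}?$filter={2}".format(
        service_root.rstrip("/"), entity_set, quote(expression, safe="()"))

# Hypothetical hand-written equivalent of "where p.UnitsInStock > 100".
uri = build_filter_uri("http://localhost:18752/Northwind.svc",
                       "Products", "UnitsInStock", "gt", 100)
print(uri)
```

The point of the sketch is only that the where clause ends up as a query string parameter on an ordinary HTTP GET, which is why any platform with an HTTP stack can issue the same query.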

Like I said earlier, this is sick. I need to holla at Andy and see if there is a dependency on the Atom feed containing Microsoft specific extensions or whether this Linq to REST capability can be utilized over any arbitrary Atom feed.

Now playing: Jay-Z - Success (feat. Nas)


 

December 9, 2007
@ 10:30 PM

A few months ago, Jenna and I found out about the Trash the Dress blog, which features photo shoots of wedding pictures taken in non-traditional locations. The term "trash the dress" is supposed to refer to the fact that the wedding dress is usually trashed at the end of the shoot.

Yesterday we met up with Cheryl Jones from In A Frame Photography and proceeded to destroy Jenna's wedding dress while getting some good pictures out of the process. Below are a couple of pics from the shoot. Click on them to see more pics from Cheryl's blog post.

Now playing: Wyclef Jean - Sweetest Girl (feat. Akon, Lil Wayne & Niia)


 

Categories: Personal

This time last year, Erik Meijer sent me a paper about a new programming language project he was working on. I was high on the social graph at that time and didn't get around to responding to Erik's paper until this fall. The premise seemed fundamentally interesting: create an MSIL to JavaScript compiler which is conceptually similar to Google's GWT and Nikhil Kothari's Script#, then flip the traditional Web development script by allowing developers to choose whether code runs on the server or on the client by simply decorating methods with attributes. The last bit is the interesting innovation in Erik's project although it is obscured by the C#/VB/MSIL to JavaScript compiler aspects.

As an example, let's say you have a function like ValidateAddress(). Whether this logic lives on the client (i.e. JavaScript in the browser) or runs on the server is really a function of how complicated that function actually ends up being. Now imagine that when the time comes to refactor the function and move the validation logic from the Web client to the server or vice versa, instead of rewriting JavaScript code in C#/IronPython/VB.NET/IronRuby/etc or vice versa, you just add or remove a [RunAtOrigin] attribute on the function.
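As a rough analogy only, the idea of moving a function between tiers by adding or removing a marker can be sketched in Python with a decorator. Nothing here is Volta's API: run_at_origin, call_origin and tier_of are hypothetical names, and the "server call" is simulated in-process instead of going over the network.

```python
import functools

def call_origin(func, *args, **kwargs):
    """Hypothetical stand-in for a round trip to the server."""
    return {"tier": "origin", "result": func(*args, **kwargs)}

def run_at_origin(func):
    """Mark func so that every call is routed through call_origin."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return call_origin(func, *args, **kwargs)["result"]
    wrapper.tier = "origin"
    return wrapper

def tier_of(func):
    """Report where a function would run; unmarked functions stay on the client."""
    return getattr(func, "tier", "client")

@run_at_origin  # remove this line to "move" the validation back to the client
def validate_address(address):
    # Toy rule for illustration: non-empty and contains a street number.
    return bool(address.strip()) and any(ch.isdigit() for ch in address)

print(tier_of(validate_address), validate_address("1 Microsoft Way"))
```

The calling code does not change when the decorator is added or removed, which is the property that makes this kind of declarative tier splitting attractive.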

This project shipped last week as Microsoft Volta. You can learn a little more about it in Erik Meijer's post on Lambda the Ultimate entitled Democratizing the Cloud using Microsoft Live Labs Volta. Try it out, it's an interesting project that has legs. 

Now playing: Jay-Z - Pray


 

Categories: Programming

Om Malik has a blog post entitled Zuckerberg’s Mea Culpa, Not Enough where he writes

Frankly, I am myself getting sick and tired of repeating myself about the all-important “information transmission from partner sites” aspect of Beacon. That question remains unanswered in Zuckerberg’s blog post, which upon second read is rather scant on actual privacy information. Here is what he writes:

“If you select that you don’t want to share some Beacon actions or if you turn off Beacon, then Facebook won’t store those actions even when partners send them to Facebook.”

So essentially he’s saying the information transmitted won’t be stored but will perhaps be interpreted. Will this happen in real time? If that is the case, then the advertising “optimization” that results from “transmissions” is going to continue. Right!

If they were making massive changes, one would have seen options like “Don’t allow any web sites to send stories to Facebook” or “Don’t track my actions outside of Facebook” in this image below.

This is the part of Facebook's Beacon service that I consider to be unfixable, which probably needs to be stated more explicitly given comments like those by Sam Ruby in his post Little Details.

The fundamental design of Facebook Beacon is that a Web site publishes information about my transactions to Facebook without my permission and then Facebook tells me what happened after the fact. This is fundamentally Broken As Designed (B.A.D.).

I read Mark Zuckerberg's Thoughts on Beacon last week and looked at the new privacy controls. Nowhere is the fundamental problem addressed.

Nothing Mark Zuckerberg wrote changes the fact that when I rent a movie from Blockbuster Online, information about the transaction is published to Facebook regardless of whether I am a Facebook user or not. The only change Zuckerberg has announced is that I can opt out of getting nagged to have the information spammed to my friends via the News Feed. One could argue that this isn't Facebook's problem. After all, when SixApart implemented support for Facebook Beacon they didn't decide that they'd blindly publish all activities from users of TypePad to Facebook. Instead they have an opt-in model on their site which preserves their users' privacy by not revealing information to Mark Zuckerberg's company without their permission. On the flip side, Blockbuster decided to publish information about all of their customers' video rental transaction history to Mark Zuckerberg and company, without their explicit permission, even though this violates federal law. As a Blockbuster customer, the only way around this is to stop using Blockbuster's service.

So who is to blame here? Facebook, for designing a system that assumes that 3rd parties publishing private user data to them without the user's consent is OK as the default, or Facebook affiliates who care so little about their customers' privacy that they give it away to Facebook in return for "viral" references to their services (aka spam)?

Now playing: Akon - Ghetto (Green Lantern remix) (feat. Notorious B.I.G. & 2Pac)


 

I often tell people at work that turning an application into a platform is a balancing act: not only do you have to please the developers on your platform, you also have to please the users of your application.

I recently joined the This has got to stop group on Facebook. If you don't use Facebook, the front page of the group is shown in the screenshot below.

POINTLESS FACEBOOK APPLICATIONS ARE RUINING FACEBOOK (167,186 Members)

I've seen a bunch of tech folks blog about being overwhelmed by Facebook app spam like Tim Bray in his post Facebook Rules and Doc Searls in Too much face(book) time. However I assumed that the average college or high school student who used the site didn't feel that way. Looks like I was wrong.

The folks at Facebook could fix this problem easily but it would eliminate a lot of the "viralness" that has been hyped about the platform. Personally, I think applications on the site have gotten to the point where the costs have begun to outweigh the benefits. The only way to tip the balance back is to rein them in; otherwise it won't be long until the clean and minimal vs. cluttered and messy aesthetics stop working in their favor in comparisons with MySpace. When that happens there will be an opportunity for someone else to do the same thing to them.

On an unrelated note, the MoveOn.org sponsored group about Facebook Beacon has 74,000 members, which is less than half the size of the This has got to stop group. This is despite the fact that MoveOn.org has had national media attention focused on that topic. I guess it goes to show that just because a story gets a lot of hype in blogs and the press doesn't mean that it is the most important problem facing the people it actually affects.

Now playing: Jay-Z - Ignorant Shit