Two seemingly unrelated posts flew by my aggregator this morning. The first was Robert Scoble’s post The shy Mark Zuckerberg, founder of Facebook where he talks about meeting the Facebook founder. During their conversation, Zuckerberg admits they made mistakes with their implementation of Facebook Beacon and will be coming out with an improved version soon.

The second post, from the Facebook developer blog, is the announcement of the JavaScript Client Library for Facebook API, which states

This JavaScript client library allows you to make Facebook API calls from any web site and makes it easy to create Ajax Facebook applications. Since the library does not require any server-side code on your server, you can now create a Facebook application that can be hosted on any web site that serves static HTML.

Although the pundits have been going ape shit over this on Techmeme, this is an unsurprising announcement given the precedent of Facebook Beacon. With that announcement, they provided a mechanism for a limited set of partners to integrate with their feed.publishTemplatizedAction API using a JavaScript client library. Exposing the rest of their API using similar techniques was just a matter of time.

What was surprising to me when reading the developer documentation for the Facebook JavaScript client library is the following notice to developers

Before calling any other Facebook API method, you must first create an FB.ApiClient object and invoke its requireLogin() method. Once requireLogin() completes, it invokes the callback function. At that time, you can start calling other API methods. There are two way to call them: normal mode and batched mode.

So unlike the original implementation of Beacon, the Facebook developers aren’t automatically associating your Facebook account with the 3rd party site and then letting it party on your data. Instead, it looks like the user will be prompted to log in before the website can start interacting with their data on Facebook or giving Facebook data about the user.

This is a much better approach than Beacon and remedies the primary complaint from my Facebook Beacon is Unfixable post from last month.

Of course, I haven’t tested it yet to validate whether this works as advertised. If you get around to testing it before I do, let me know in the comments whether it works the way the documentation implies.


 

I’m not a regular user of Digg so I tend to ignore the usual controversy-of-the-month style storms that end up on Techmeme whenever they tweak their algorithm for promoting items from the upcoming queue to the front page. Today I decided to take a look at the current controversy and clicked on the story So Called Top Digg Users Cry About Digg Changes which led me to the following comment by ethicalh

the algorithm uses the "social networking" part of digg now. if you are a mutual friend of the submitter your vote is cancelled and wont go towards promotion to the frontpage, and if you're a mutual friend of someone who has already dugg the story, your vote is cancalled and won't go towards promotion to the frontpage. so if you don't have a friends list and don't use the social networking part of digg then you can still be a top digger. you just need to create a new account and don't be tempted to add anyone as a friend, because the new algorithm is linked upto everyones friends list now. thats the real reason digg added social networking, is so they could eventually hook it upto the algorithm, thats the secret reason digg introduced social networking and friends lists onto the digg site.

A number of people confirmed the behavior in the comments. It seems that all votes from mutual friends (i.e. people on each other’s friends lists) are treated as a single vote. So if three of us are friends who all use Digg and have each other on our friends lists, then when all three of us vote for a story it is counted as a single vote. This is intended to subvert cliques of “friends” who vote up articles and to encourage diversity, or so the Digg folks claim in a quote from a New York Times article on the topic.
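To make the described behavior concrete, here’s a small sketch of how such vote counting could work. This is just my reading of the comment above, not Digg’s actual code; the method names and the friendship check are made up for illustration.

using System;
using System.Collections.Generic;

static class DiggVoteCounter
{
    // A digg is discarded if the voter is a mutual friend of the submitter or of
    // anyone whose digg has already been counted; otherwise it counts toward promotion.
    public static int CountEffectiveDiggs(string submitter,
                                          IEnumerable<string> votersInOrder,
                                          Func<string, string, bool> areMutualFriends)
    {
        var counted = new List<string> { submitter };
        int effectiveDiggs = 0;

        foreach (string voter in votersInOrder)
        {
            bool cancelled = counted.Exists(prior => areMutualFriends(voter, prior));
            if (!cancelled)
            {
                effectiveDiggs++;
                counted.Add(voter);
            }
        }
        return effectiveDiggs;
    }
}

Under this reading, once one member of a clique has dugg a story, the diggs from the rest of the clique simply stop counting toward promotion.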

My first impression was that this change seems pretty schizophrenic on the part of the Digg developers. What is the point of adding all sorts of features that enable me to keep tabs on what other Digg users are voting up if you don’t want me to vote up the ones I like as well? Are people really supposed to trawl through http://www.digg.com/tech_news/upcoming to find stories to vote up instead of just voting up the ones their friends found that they thought were cool as well? Yeah…right.

But thinking about it more deeply, I realize that this move is even more screwed up. The Digg developers are pretty much penalizing you for using a feature of the site. You actually get less value out of the site by using the friends feature, because once you start using it, it is less likely that stories you like will be voted to the front page since your vote doesn’t count if someone on your friends list has already voted for the story.

So why would anyone want to use the friends feature once they realize this penalty exists?

Now playing: Dogg Pound - Ridin', Slipin' & Slidin'


 

Categories: Social Software

According to the blog post entitled Microsoft Joins DataPortability.org on dev.live.com, we learn

Today Microsoft is announcing that it has joined DataPortability.org, a group committed to advancing the conversation about the portability, security and privacy of individuals’ information online.  There are important security and privacy issues to solve as the internet evolves, and we are committed to being an integral part of the industry conversation on behalf of our users.

The decision to join DataPortability.org is an outgrowth of a deeper theme that technology and the internet should be deployed to help people be at the center of their online worlds, a theme that has begun to permeate our products and services over the past few years. We believe the logical evolution of the internet is to enable the removal of barriers to provide integrated, seamless experiences, but to do so in a manner that ensures that users retain full control over the security and privacy of their information.

Windows Live is focused on providing tools and a platform to enable these types of seamless experiences.  Windows Live has more than 420 million active Live IDs that work across our services and across partner sites. 

I’m sure some folks are wondering exactly what this means. Even though I was close to the decision making around this, I believe it is still too early to tell. Personally, I share Marc Canter’s skepticism about DataPortability.org given that so far there’s been a lot of hype but no real meat.

However we have real problems to solve as an industry. The lack of interoperability between various social software applications is troubling given that the Internet (especially the Web) got to be a success by embracing interoperability instead of being about walled gardens fighting over who can build the prettiest gilded cage for their prisoners, er, customers. The fact that when interoperability happens, it is in back-room deals (e.g. Google OpenSocial, Microsoft’s conversations with startups, etc.) instead of being open to all using standard and unencumbered protocols is similarly troubling. Even worse, insecure practices that expose social software users to privacy violations have become commonplace due to the lack of a common framework for interoperability.

As far as I can tell, DataPortability.org seems like a good forum for various social software vendors to start talking about how we can get to a world where there is actual interoperability between social software applications. I’d like to see real meat fall out of this effort, not fluff. One of the representatives Microsoft has chosen is the dev lead from the product team I am on (Inder Sethi), which implies we want technical discussion of protocols and technologies, not just feel-good jive. We’ll also be sending a product planning/marketing type (John Richards) to make sure the end user perspective is covered as well. You can assume that even though I am not on the working group in person, I will be there in spirit since I communicate with both John and Inder on a regular basis. :)

I’ll also be at the O’Reilly offices during Super Bowl weekend attending the O’Reilly Social Graph FOO Camp which I hope will be another avenue to sit together with technical decision makers from the various major social software vendors and talk about how we can move this issue forward as an industry.

Now playing: Bone Thugs 'N Harmony - If I Could Teach The World


 

Categories: Windows Live

The top story in my favorite RSS reader is the article MapReduce: A major step backwards by David J. DeWitt and Michael Stonebraker. This is one of those articles that is so bad you feel dumber after having read it. The primary thesis of the article is

 MapReduce may be a good idea for writing certain types of general-purpose computations, but to the database community, it is:

  1. A giant step backward in the programming paradigm for large-scale data intensive applications
  2. A sub-optimal implementation, in that it uses brute force instead of indexing
  3. Not novel at all -- it represents a specific implementation of well known techniques developed nearly 25 years ago
  4. Missing most of the features that are routinely included in current DBMS
  5. Incompatible with all of the tools DBMS users have come to depend on

One of the worst things about articles like this is that they get usually reasonable and intelligent-sounding people spouting off bogus responses as knee-jerk reactions to the article’s stupidity. A typical bogus reaction was the kind given by Rich Skrenta in his post Database gods bitch about mapreduce, which talks of "disruption" as if Google’s MapReduce is actually comparable to a relational database management system.

On the other hand, the good thing about articles like this is that you often get great responses from smart folks that further your understanding of the subject matter even though the original article was crap. For example, take the post from Google employee Mark Chu-Carroll entitled Databases are Hammers; MapReduce is a ScrewDriver where he writes eloquently that

The beauty of MapReduce is that it's easy to write. M/R programs are really as easy as parallel programming ever gets. So, getting back to the article. They criticize MapReduce for, basically, not being based on the idea of a relational database.

That's exactly what's going on here. They've got their relational databases. RDBs are absolutely brilliant things. They're amazing tools, which can be used to build amazing software. I've done a lot of work using RDBs, and without them, I wouldn't have been able to do some of the work that I'm proudest of. I don't want to cut down RDBs at all: they're truly great. But not everything is a relational database, and not everything is naturally suited towards being treated as if it were relational. The criticisms of MapReduce all come down to: "But it's not the way relational databases would do it!" - without every realizing that that's the point. RDBs don't parallelize very well: how many RDBs do you know that can efficiently split a task among 1,000 cheap computers? RDBs don't handle non-tabular data well: RDBs are notorious for doing a poor job on recursive data structures. MapReduce isn't intended to replace relational databases: it's intended to provide a lightweight way of programming things so that they can run fast by running in parallel on a lot of machines. That's all it was intended to do.

Mark’s entire post is a great read.
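For anyone who hasn’t written one, here’s a toy word count that mimics the shape of a MapReduce program. This is just a single-machine illustration of the programming model in C#, not Google’s implementation; the point of the real system is that the framework runs the same Map and Reduce functions in parallel across thousands of machines and handles the shuffling, partitioning and fault tolerance for you.

using System;
using System.Collections.Generic;
using System.Linq;

static class ToyMapReduce
{
    // Map: emit (word, 1) for every word in a document.
    static IEnumerable<KeyValuePair<string, int>> Map(string document)
    {
        foreach (string word in document.Split(new[] { ' ', '\t', '\n' },
                                               StringSplitOptions.RemoveEmptyEntries))
            yield return new KeyValuePair<string, int>(word.ToLowerInvariant(), 1);
    }

    // Reduce: sum the counts emitted for a given word.
    static KeyValuePair<string, int> Reduce(string word, IEnumerable<int> counts)
    {
        return new KeyValuePair<string, int>(word, counts.Sum());
    }

    static void Main()
    {
        string[] documents = { "the quick brown fox", "the lazy dog", "the fox" };

        // "Shuffle" phase: group the intermediate pairs by key before reducing.
        var grouped = documents.SelectMany(Map).GroupBy(pair => pair.Key);

        foreach (var group in grouped)
        {
            KeyValuePair<string, int> result = Reduce(group.Key, group.Select(pair => pair.Value));
            Console.WriteLine("{0}: {1}", result.Key, result.Value);
        }
    }
}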

Greg Jorgensen also has a good rebuttal in his post Relational Database Experts Jump The MapReduce Shark which points out that if the original article had been a critique of Web-based structured data storage systems such as Amazon’s SimpleDB or Google Base then the comparison may have been almost logical as opposed to being completely ridiculous. ;)

Now playing: Marvin Gaye - I Heard It Through the Grapevine


 

Categories: Web Development

Ari Steinberg who works on the Facebook developer platform has a blog post entitled New Rules for News Feed which states

As part of the user experience improvements we announced yesterday, we're changing the rules for how Feed stories can be published with the feed.publishTemplatizedAction API method. The new policy moving forward will be that this function should only be used to publish actions actively taken by the "actor" mentioned in the story. As an example, feed stories that say things like "John has received a present" are no longer acceptable. The product motivation behind this change is that Feed is a place to publish highly relevant stories about user activity, rather than passive stories promoting an application.

To foster this intended behavior, we are changing the way the function works: the "actor_id" parameter will be ignored. Instead the session_key used to generate the feed story will be used as the actor.


In order to ensure a high quality experience for users, starting 9am Pacific time Tuesday 22 January we may contact you, or in severe cases initiate an enforcement action, if your stories are not complying with the new policy, especially if the volume of non-complying stories is high.

If you are not a developer using the Facebook platform, it may be unclear what exactly this announcement means to end users or applications that utilize Facebook’s APIs.

To understand the impact of the Facebook announcement, it would be useful to first talk about the malicious behavior that Facebook is trying to curb. Today, an application can call feed.publishTemplatizedAction and publish a story to the user’s Mini-Feed (the list of all the user’s actions) which will also show up in the News Feeds of the user’s friends. Unfortunately some Facebook applications have been publishing stories that don’t really correspond to the user taking an action. For example, when a user installs the Flixster application, Flixster not only publishes a story to all of the user’s friends saying the user has installed the application, but also publishes a story to the friends of each of the user’s friends who also have Flixster installed. This means stories about me get sent to my friends when I wasn’t actually doing anything with the Flixster application. I don’t know about you, but this seems like a rather insidious way for an application to spread “virally”.

Facebook’s attempt to curb such application spam is to require that an application have a session key that identifies the logged-in user when publishing the story, which implies that the user is actually using the application from within Facebook when the story is published. The problem with this remedy is that it totally breaks applications that publish stories to the Facebook News Feed when the user isn’t on the site. For example, since I have the Twitter application installed on Facebook, my Facebook friends get an update sent to their News Feeds whenever I post something new on Twitter.
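For context, a server-side call to publish a story looked roughly like the sketch below. Only the method name and the actor_id/session_key behavior come from the announcement above; the endpoint URL, the remaining parameter names and the signature scheme are my recollection of that era’s REST API and should be treated as assumptions, not gospel. The point is simply that without a session_key for the acting user there is now no way to publish the story.

using System;
using System.Collections.Generic;
using System.Collections.Specialized;
using System.Linq;
using System.Net;
using System.Security.Cryptography;
using System.Text;

static class FeedPublisher
{
    // Publishes a templatized Feed story on behalf of the user identified by sessionKey.
    public static string PublishStory(string apiKey, string appSecret,
                                      string sessionKey, string titleTemplate)
    {
        var args = new SortedDictionary<string, string>
        {
            { "method", "facebook.feed.publishTemplatizedAction" },
            { "api_key", apiKey },
            { "session_key", sessionKey },  // the story's actor is now this user; actor_id is ignored
            { "title_template", titleTemplate },
            { "call_id", DateTime.UtcNow.Ticks.ToString() },
            { "v", "1.0" }
        };

        // Request signature (assumed scheme): MD5 over the sorted key=value pairs plus the app secret.
        string toSign = string.Concat(args.Select(kv => kv.Key + "=" + kv.Value).ToArray()) + appSecret;
        using (MD5 md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(Encoding.UTF8.GetBytes(toSign));
            args["sig"] = BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
        }

        var form = new NameValueCollection();
        foreach (var pair in args)
            form.Add(pair.Key, pair.Value);

        using (var client = new WebClient())
        {
            byte[] response = client.UploadValues("http://api.facebook.com/restserver.php", form);
            return Encoding.UTF8.GetString(response);
        }
    }
}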

The problem for Facebook is that by limiting a valid usage of the API, they may have closed off a spam vector but have also closed off a valuable integration point for third party developers and for their users.

PS: There might be an infinite session key loophole to the above restriction which I’m sure Facebook will close off if apps start abusing it as well.

Now playing: Silkk the Shocker - It Ain't My Fault


 

I’ve finally broken down and used the extension methods feature in C# 3.0. The feature is similar to the concept of “open classes” in Ruby, described in Neal Ford’s post Are Open Classes Evil?, which contains the following excerpt

Open classes in dynamic languages allow you to crack open a class and add your own methods to it. In Groovy, it's done with either a Category or the Expando Meta-class (which I think is a great name). JRuby allows you to do this to Java classes as well. For example, you can add methods to the ArrayList class thusly:

require "java"
include_class "java.util.ArrayList"
list = ArrayList.new
%w(Red Green Blue).each { |color| list.add(color) }

# Add "first" method to proxy of Java ArrayList class.
class ArrayList
def first
size == 0 ? nil : get(0)
end
end
puts "first item is #{list.first}"


Here, I just crack open the ArrayList class and add a first method (which probably should have been there anyway, no?).

In my case, I added a CanonicalizedUri() method to the System.Uri class because I was tired of all the special case code we needed for canonicalizing URIs, since the Uri.AbsoluteUri property would not represent http://www.example.com and http://www.example.com/ as the same URI, among a couple of other special cases. My extension method is shown below

/// <summary>
/// Helper class used to add an extension method to the System.Uri class.
/// </summary>
public static class UriHelper
{
    /// <summary>
    /// Returns the URI canonicalized in the following way: (1) if it is a UNC or file URI then only the local part is returned;
    /// (2) for Web URIs the trailing slash and preceding "www." are removed.
    /// </summary>
    /// <param name="uri">The URI to canonicalize</param>
    /// <returns>The canonicalized URI as a string</returns>
    public static string CanonicalizedUri(this Uri uri)
    {
        if (uri.IsFile || uri.IsUnc)
            return uri.LocalPath;

        UriBuilder builder = new UriBuilder(uri);
        builder.Host = (builder.Host.ToLower().StartsWith("www.") ? builder.Host.Substring(4) : builder.Host);
        builder.Path = (builder.Path.EndsWith("/") ? builder.Path.Substring(0, builder.Path.Length - 1) : builder.Path);
        return builder.ToString().Replace(":" + builder.Port + "/", "/");
    }
}

Now everywhere we used to have special case code for canonicalizing a URI, we just replace that with a call to uri.CanonicalizedUri(). It’s intoxicating how liberating it feels to be able to “fix” classes in this way.  
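For example, a quick usage sketch (assuming the UriHelper class above is in scope):

Uri withWww = new Uri("http://www.example.com");
Uri withSlash = new Uri("http://example.com/");

// Both canonicalize to the same string, so a plain string comparison
// is enough to tell that they refer to the same resource.
Console.WriteLine(withWww.CanonicalizedUri() == withSlash.CanonicalizedUri()); // True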

I have seen some complain that coupling this feature with Intellisense (i.e. method name autocomplete) leads to an overwhelming experience. Compare the following screenshots from Jessica Folser’s post using System.Linq, sometimes more isn't better which shows the difference between hitting ‘.’ on an Array with the System.Linq namespace included versus not. Note that the System.Linq namespace defines a number of extension methods for the Array class and its base classes.

[Screenshot: without System.Linq, Array shows 20 methods in the Intellisense drop-down]

[Screenshot: using System.Linq, Array shows 68 methods in the Intellisense drop-down]

I did find this disconcerting at first but I got used to it.

It should be noted that “open classes” in Ruby come with a bunch more features than extension methods in C# 3.0. For example, Ruby has remove_method and undef_method which allow developers to remove methods from a class. This is useful if there is a particularly buggy or insecure method you’d rather not be used by developers in your project. Much better than simply relying on the Obsolete attribute. :)

One problem I had with C# is that I can’t create extension properties, only extension methods (so my CanonicalizedUri() had to be a method, not a property like Uri.AbsoluteUri). I assume this is due to the difficulty of coming up with a backwards compatible extension to the syntax for properties. Either way, it sucks. You can count me as another developer who is voting for extension properties in C# 4.0 (or whatever comes next).

Now playing: Oomp Camp - Intoxicated


 

Categories: Programming

I was reading the blog post entitled The hard side of Mister Softie from Josh Quittner of Fortune magazine which ends with the following excerpt

Hall said that Microsoft’s main concern, and the reason it sent out Big Foot letters in the first place, was security. “If you look at what a number of sites are doing, they’re asking for your Hotmail login info, They’re storing your identity, which is not a best practices [approach] for anyone’s data from a security standpoint. We want to make sure our data is kept between our users and our servers.”

The thrust of the term sheets, he said, was to create a process whereby Hotmail and other Windows Live data could be shared securely with third parties. Added Hall: “There are models for federation where you can trust other services—and that’s what we’re trying to do with our partners.”

Thats what doesn’t make sense to me. If this is such a security problem, why do Google and Yahoo let their users take their contacts with them?

Besides the obvious observation that folks at Google & Yahoo! probably don’t think it’s a good idea for random fly-by-night social networking services to be collecting usernames and passwords from users of their services (see posts like Spock sign-up flow demonstrates how to scare users away... from Jeremy Zawodny of Yahoo!), I am amused by the “if the geniuses at Google and Yahoo! think it’s OK, who are the Microsoft morons to think different” sentiment exposed by that statement.

Maybe I’m getting snarky in my old age. ;)

Now playing: Red Hot Chili Peppers - Torture Me


 

Categories: Social Software | Windows Live

From danah boyd: confused by Facebook

I'm also befuddled by the slippery slope of Facebook. Today, they announced public search listings on Facebook. I'm utterly fascinated by how people talk about Facebook as being more private, more secure than MySpace. By default, people's FB profiles are only available to their network. Join a City network and your profile is far more open than you realize. Accept the default search listings and you're findable on Google. The default is far beyond friends-only and locking a FB profile down to friends-only takes dozens of clicks in numerous different locations. Plus, you never can really tell because if you join a new network, everything is by-default open to that network (including your IM and phone number).

From Caroline McCarthy: Report: Facebook threatens to ban Gawker's Denton

Facebook isn't too happy with Gawker Media founder Nick Denton over some screenshots of a member's profile that he posted on Gawker.com on Tuesday, Portfolio.com reports. The social-networking site reportedly plans to send a warning letter to the New York-based digital-media entrepreneur citing several terms-of-service violations--one more, and he's out.

Facebook representatives were not immediately available for comment.

On Tuesday, Denton--who took over as managing editor of Gawker.com this month after several staff departures--posted a bit of an expose on 25-year-old Emily Brill, daughter of New York publishing figure Steve Brill. Screenshots of the younger Brill's Facebook profile, featuring glamorous photos of a yachting trip to the British Virgin Islands, as well as excited "status" messages about an impending trip to the Caribbean luxury getaway of St. Barth's, were juxtaposed with an older photograph of the Brown graduate when she was significantly heavier.

It's not clear whether Denton and Brill are "friends" on the site, or if it was even Denton (rather than a source or another Gawker Media employee) who pulled the screenshots from Facebook. But both Denton and Brill are members of the New York regional network, so there is a chance that Denton would have been able to view Brill's profile even without being connected as friends.

It boggles my mind that someone sat down and coded “Anyone who lives in the same city as me” as a privacy control and didn’t immediately smack themselves on the head for writing something so ridiculously useless and so guaranteed to cause privacy issues.

It would have been easier to have a notion of public profiles and appropriate scary warnings or defaults that protected people’s privacy than the farce that is “regional networks”.

Now playing: Chris Brown - Say Goodbye


 

Categories: Social Software

I’ve noticed a meme that seems to have been going around in various blogs of folks who continue to indulge in the long since dead REST vs. SOAP discussion. This meme is that (i) you don’t want or need an interface definition language if you are building a RESTful Web Service and (ii) an interface definition language for RESTful Web Services has to look something like CORBA IDL or WSDL.

You can find examples of this kind of thinking in blog posts like Steve Vinoski’s IDLs vs. Human Documentation post excerpted below

Note that Patrick mostly talks about data schemas, whereas my posting talks only of interface definition languages. These are two very different things, which I’ve noted in comments on his blog. In a reply comment he said they’re both metadata, which is true, but still, they’re very separable. REST depends heavily on data definitions, but it doesn’t require specialized interface definitions because it promotes a uniform interface. For data definition REST relies on and promotes media/MIME types, and the standardization of such data definitions is critical to allowing independently-developed consumers and providers to interact correctly. I doubt Patrick and I really disagree on this last point.

and Ryan Tomayko’s Speaking of, "lying through their teeth..." also excerpted below

The WS-* folks have historically been obsessed with making things easy, usually for an imaginary business analyst who is nowhere near as technically adept as they. The REST folks, on the other hand, seem much more interested in keeping the entire stack simple, and for everyone involved.

This difference in priorities (easy vs. simple) often manifests itself in arguments about technological issues on the surface. Take the never ending debate about whether REST needs a description language like WSDL; which, incidentally, Sanjiva is largely responsible for. If building systems in your world can be made easier with the addition of a description language, then WSDL probably makes a lot of sense. If, however, building distributed systems in your world is a tediously hard pain in the ass whether you have these cockamamie description files or not, well, then you fight to keep the system as simple as possible by reducing the number of actors, dependencies, and concepts to an absolute minimum.

Let’s start with Steve Vinoski’s post. Steve is right to point out that there is a difference between data schemas and interface definition languages. When building services with WS-*, you have a WSDL to describe your methods and expected inputs/outputs, and XSD schema(s) to describe the schemas for said inputs/outputs. When building a RESTful Web Service, the need for both of these documents does not go away regardless of how often you repeat the phrase “uniform interface”.

Steve argues that instead of using an XML schema to describe your document formats, you should rely on registered MIME types. The benefit of this is that you’ve broadened your horizons from thinking that the only payload for your Web service can be XML documents. The WS-* folks had to jump through lots of mental hoops to try and get non-XML data to fit in their model, with wacky schemes like SOAP with Attachments (SwA), Direct Internet Message Encapsulation (DIME), WS-Attachments, Message Transmission Optimization Mechanism (MTOM) and XML-binary Optimized Packaging (XOP). All of this complexity existed because the fundamental design of WS-* is that all data going in and out of a SOAP Web service must either be an XML document or be modelled as one.

However this doesn’t mean everything is plain sailing if you stick to only using registered IANA MIME types as the payloads of your Web services. What happens when you have a document format that doesn’t have a registered MIME type? You have two choices, you can either co-opt an existing MIME type and use it as an envelope format as Google has done with GData or you can use your own custom XML format as Facebook has done with the Facebook REST API. In both cases, it would be useful for developers if your data schema is documented either in prose or via some XML schema language. This doesn’t require an interface definition language like WSDL nor does it require a schema definition language like XSD.  

On the other hand, how does a client application discover your application’s service end points? Today, when you point your browser to my blog at http://www.25hoursaday.com/weblog, your browser automatically detects that I have an RSS feed. When you  point Windows Live Writer to a weblog, it automatically detects how to post and edit blog posts programmatically if your weblog software supports the Atom Publishing Protocol.

In the case of the RSS feed, your browser knows I have one by looking at the link element pointing to my RSS feed. The browser knows what to do with the file via its MIME type and there is no interface to be defined because the only contract of an RSS feed is that it should support HTTP GET. On the flip side, an Atom service document, which is what Windows Live Writer reads to learn about your blog, describes the various service end points (i.e. collections) as well as the accepted inputs/outputs (either as MIME types or the hardcoded string ‘entry’ for Atom entries since they don’t have a MIME type).
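To make that concrete, here is a small sketch of a client reading an Atom Publishing Protocol service document to discover where it can post. The service document URL is made up; the app and atom namespace URIs are the ones the Atom Publishing Protocol defines.

using System;
using System.Xml.Linq;

class ServiceDocumentReader
{
    static readonly XNamespace App = "http://www.w3.org/2007/app";
    static readonly XNamespace Atom = "http://www.w3.org/2005/Atom";

    static void Main()
    {
        // Hypothetical URL; point this at a real APP service document.
        XDocument serviceDoc = XDocument.Load("http://www.example.com/blog/service");

        // Each collection element is a service end point: its href attribute is where
        // clients POST, and its app:accept children list the media types it will take.
        foreach (XElement collection in serviceDoc.Descendants(App + "collection"))
        {
            string title = (string)collection.Element(Atom + "title");
            string href = (string)collection.Attribute("href");

            Console.Write("{0} -> {1} (accepts:", title, href);
            foreach (XElement accept in collection.Elements(App + "accept"))
                Console.Write(" " + accept.Value);
            Console.WriteLine(")");
        }
    }
}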

The examples of Atom service documents and link elements in HTML highlight that there is real world value in describing the interfaces to your RESTful Web Service. In addition, Atom service documents show that you can define an interface definition language for Web services without resorting to reinventing CORBA IDL (i.e. WSDL). So I respectfully disagree with Ryan Tomayko…just because my life is made easier with a service description language doesn’t make WSDL a good idea.

Now playing: Dead Prez - Hell Yeah (Pimp the System)


 

Categories: XML Web Services

In his blog post entitled Joining Microsoft Live Labs Greg Linden writes

I am starting at Microsoft Live Labs next week.

Live Labs is an applied research group affiliated with Microsoft Research and MSN. The group has the enjoyable goal of not only trying to solve hard problems with broad impact, but also getting useful research work out the door and into products so it can help as many people as possible as quickly as possible.

Live Labs is led by Gary Flake, the former head of Yahoo Research. It is a fairly new group, formed only two years ago. Gary wrote a manifesto that has more information about Live Labs.

When I found out Greg was shutting down Findory, I thought to myself that he’d be a great hire for Microsoft, especially since he already lived in the area. It seems someone else thought the same thing and now Greg has been assimilated. Congratulations, Greg.

I seem to be bumping into more and more people who are either working for or with Live Labs. Besides Justin Rudd, whom I just referred to the team, there are Mike Deem and Erik Meijer, two people I know from my days on the XML team. I wonder what Gary Flake is cooking up in those swanky offices in Bellevue that has so many smart folks gravitating to his group?

Now playing: Kool & The Gang - Celebration