On Friday of last week, Brad Fitzpatrick posted an entry on the Google Code blog entitled URLs are People, Too where he wrote

So you've just built a totally sweet new social app and you can't wait for people to start using it, but there's a problem: when people join they don't have any friends on your site. They're lonely, and the experience isn't good because they can't use the app with people they know. You could ask them to search for and add all their friends, but you know that every other app is asking them to do the same thing and they're getting sick of it. Or they tried address book import, but that didn't totally work, because they don't even have all their friends' email addresses (especially if they only know them from another social networking site!). What's a developer to do?

One option is the new Social Graph API, which makes information about the public connections between people on the Web easily available and useful
...
Here's how it works: we crawl the Web to find publicly declared relationships between people's accounts, just like Google crawls the Web for links between pages. But instead of returning links to HTML documents, the API returns JSON data structures representing the social relationships we discovered from all the XFN and FOAF. When a user signs up for your app, you can use the API to remind them who they've said they're friends with on other sites and ask them if they want to be friends on your new site.

I talked to DeWitt Clinton, Kevin Marks and Brad Fitzpatrick about this API at the O'Reilly Social Graph FOO Camp and I think it is very interesting. Before talking about the API, I did want to comment on the fact that this is the second time I've seen a Google employee ship something which implies that any developer can just write custom code to do data analysis on top of their search index (i.e. Google's copy of the World Wide Web) and then share the results with the world. The first time was Ian Hickson's work on Web authoring statistics. That is cool.

Now back to the Google Social Graph API. An illuminating aspect of my conversations at the Social Graph FOO Camp is that for established social apps, the scenario Brad describes, where a social application bootstraps a new user's experience by showing them which of their friends already use the service, is more important than the "invite my friends to join this new social networking site" scenario. This is interesting primarily because both goals are achieved today via the anti-pattern of asking users for the username and password to their email service provider and screen scraping their address book. The Social Graph API attempts to eliminate the need for this ugly practice by crawling users' publicly articulated relationships and then providing an API that social apps can use to find a user's identities on other services as well as their relationships with other users on those services.

The API uses URIs as the primary identifier for users instead of email addresses. Of course, since there is often an intuitive way to convert a username to a URI (e.g. 'carnage4life on Twitter' => http://www.twitter.com/carnage4life), users simply need to provide a username instead of a URI.

So how would this work in the real world? Let's say I signed up for Facebook for the first time today. At this point my experience on the site would be pretty lame because I've made no friends, so my news feed would be empty and I wouldn't be connected to anyone I know on the site yet. Now instead of Facebook collecting the username and password for my email provider and screen scraping my address book (boo, hiss), it shows a list of social networking sites and asks for just my username on those sites. On obtaining my username on Twitter, it maps that to a URI and passes it to the Social Graph API. The API returns a list of the people I'm following on Twitter along with various identifiers for them, which Facebook in turn looks up in its user database before prompting me to add any of them who are Facebook users as friends on the site.
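To make that flow concrete, below is a minimal sketch in Python of the lookup step. The endpoint and parameter names (q for the URI, edo for "edges out") are my reading of the Social Graph API documentation, and the response parsing assumes the documented JSON shape of nodes and nodes_referenced, so treat this as illustrative rather than as the API's definitive contract.

import json
import urllib.request
from urllib.parse import urlencode

# Hedged sketch: map a Twitter username to a URI, then ask the Social Graph
# API which nodes that URI publicly declares relationships to. The endpoint,
# parameters and response shape are assumptions based on the API docs.
LOOKUP_ENDPOINT = "http://socialgraph.apis.google.com/lookup"

def outgoing_contacts(twitter_username):
    uri = "http://twitter.com/" + twitter_username          # username => URI
    query = urlencode({"q": uri, "edo": 1, "pretty": 1})    # edo = edges out
    with urllib.request.urlopen(LOOKUP_ENDPOINT + "?" + query) as response:
        data = json.load(response)
    contacts = []
    # Each node lists the URIs it references along with the XFN/FOAF
    # relationship types the crawler found (e.g. "contact", "friend", "me").
    for node in data.get("nodes", {}).values():
        for target_uri, info in node.get("nodes_referenced", {}).items():
            types = info.get("types", [])
            if "contact" in types or "friend" in types:
                contacts.append(target_uri)
    return contacts

print(outgoing_contacts("carnage4life"))

A site like Facebook would then match those URIs against the identities its existing users have claimed.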

This is a good idea that gets around the proliferation of applications that collect usernames and passwords from users in order to access their social graphs on other sites. However, there are lots of practical problems with relying on this as an alternative to screen scraping and the other approaches used to discover a user's social graph, including:

  • many social networking sites don't expose their friend lists as FOAF or XFN
  • many friend lists on social networking sites are actually hidden from the public Web (e.g. most friend lists on Facebook), which is by design
  • many friend lists in social apps aren't even on the Web (e.g. buddy lists from IM clients, address books in desktop clients)

That said, this is a good contribution to the space. Ideally, the major social networking sites and address book providers would also expose APIs that social applications can use to obtain a user's social graph without resorting to screen scraping. We are definitely working on that at Windows Live with the Windows Live Contacts API. I'd love to see other social software vendors step up and provide similar APIs in the coming months. That way everybody wins: our users, our applications and the entire ecosystem.

Now Playing: Playaz Circle - Duffle Bag Boy (feat. Lil Wayne)


 

Categories: Platforms | Social Software

A few days ago I got a Facebook message from David Recordon about Six Apart's release of the ActionStreams plugin. The meat of the announcement is excerpted below

Today, we're shipping the next step in our vision of openness -- the Action Streams plugin -- an amazing new plugin for Movable Type 4.1 that lets you aggregate, control, and share your actions around the web. Now of course, there are some social networking services that have similar features, but if you're using one of today's hosted services to share your actions it's quite possible that you're giving up either control over your privacy, management of your identity or profile, or support for open standards. With the Action Streams plugin you keep control over the record of your actions on the web. And of course, you also have full control over showing and hiding each of your actions, which is the kind of privacy control that we demonstrated when we were the only partners to launch a strictly opt-in version of Facebook Beacon. Right now, no one has shipped a robust and decentralized complement to services like Facebook's News Feed, FriendFeed, or Plaxo Pulse. The Action Streams plugin, by default, also publishes your stream using Atom and the Microformat hAtom so that your actions aren't trapped in any one service. Open and decentralized implementations of these technologies are important to their evolution and adoption, based on our experiences being involved in creating TrackBack, Atom, OpenID, and OAuth. And we hope others join us as partners in making this a reality.

This is a clever idea, although I wouldn't compare it to the Facebook News Feed (what my social network is doing); it is instead a self-hosted version of the Facebook Mini-Feed (what I've been doing). Although people have been doing this for a while by aggregating their various feeds and republishing them to their blogs (life streams?), I think this is the first time a full-fledged framework for doing so has shipped as an out-of-the-box solution.

Mark Paschal has a blog post entitled Building Action Streams which gives an overview of how the framework works. You define templates which contain patterns to be matched in a feed (RSS/Atom) or in an HTML document, along with instructions for converting the matched elements into a blog post. Below is the template for extracting del.icio.us links from the site's RSS feed and republishing them.

delicious:
    links:
        name: Links
        description: Your public links
        html_form: '[_1] saved the link <a href="[_2]">[_3]</a>'
        html_params:
            - url
            - title
        url: 'http://del.icio.us/rss/{{ident}}'
        identifier: url
        xpath:
            foreach: //item
            get:
                created_on: dc:date/child::text()
                title: title/child::text()
                url: link/child::text()

It reminds me a little of XSLT. I almost wondered why they didn't just use that until I saw that it also supports pattern matching HTML documents using Web::Scraper [and that XSLT is overly verbose and difficult to grok at first glance].
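To illustrate what the template is actually doing, here is a rough Python approximation of the same extraction running outside the plugin: fetch the feed named by the url key, loop over every item, pull out the fields listed under get and plug them into the html_form string. The username is a placeholder and this obviously isn't the plugin's real execution engine.

import urllib.request
import xml.etree.ElementTree as ET

IDENT = "carnage4life"                          # stands in for {{ident}}
FEED_URL = "http://del.icio.us/rss/" + IDENT    # the template's url key

def local_name(tag):
    """Strip any XML namespace so this works for RSS 1.0 or 2.0 feeds."""
    return tag.rsplit("}", 1)[-1]

def child_text(element, name):
    """Text of the first child whose local name matches, e.g. 'date' for dc:date."""
    for child in element:
        if local_name(child.tag) == name:
            return child.text
    return None

with urllib.request.urlopen(FEED_URL) as response:
    tree = ET.parse(response)

for item in (e for e in tree.iter() if local_name(e.tag) == "item"):  # foreach: //item
    created_on = child_text(item, "date")        # dc:date/child::text()
    title = child_text(item, "title")            # title/child::text()
    url = child_text(item, "link")               # link/child::text()
    # html_form: '[_1] saved the link <a href="[_2]">[_3]</a>'
    print('%s saved the link <a href="%s">%s</a>' % (IDENT, url, title))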

Although this is pretty cool, I don't find it interesting as a publishing tool. On the other hand, its potential as a new kind of aggregator is very interesting. I'd love to see someone slap more UI on it and turn it into a decentralized version of the Facebook News Feed. Specifically, if I could feed it a blogroll, have it use the Google Social Graph API to figure out which other services the people in my subscriptions use, and then build a feed reader + news feed experience on top of it, that would be cool.

Come to think of it, this would be something interesting to experiment with in future versions of RSS Bandit.

Now Playing: Birdman - Pop Bottles (remix) (feat. Jim Jones & Fabolous)


 

Categories: Platforms | Social Software

Given that I work in Microsoft's online services group and have friends at Yahoo!, I obviously won't be writing down my thoughts on Microsoft's $44.6 billion bid for Yahoo. However, I have been somewhat amused by the kind of ranting I've seen in the comments at Mini-Microsoft. Although the majority of the comments on Mini-Microsoft are critical of the bid, it is clear that the majority of the posters aren't very knowledgeable about Microsoft, its competitors or the online business in general.

There were comments from people who are so out of it they think Paul Allen is a majority shareholder of Microsoft. Or, even better, that Internet advertising will never impact newspaper, magazine or television advertising. I was also amused by the person who asked if anyone could name two or three successful acquisitions or technology purchases by Microsoft. I wonder if anyone would say the Bungie or Visio acquisitions didn't work out for the company. Or that the products that started off as NCSA Mosaic or Sybase SQL Server have been unsuccessful as Microsoft products.

My question for the armchair quarterbacks that have criticized this move in places like Mini-Microsoft is "If you ran the world's most successful software company, what would you do instead?"

PS: The ostrich strategy of "ignoring the Internet" and milking the Office + Windows cash cows doesn't count as an acceptable answer. Try harder than that.

Now Playing: Birdman - Hundred Million Dollars (feat. Rick Ross, Lil' Wayne & Young Jeezy)


 

Categories: Life in the B0rg Cube

As I'm getting ready to miss the first Super Bowl weekend of my married life to attend the O'Reilly Social Graph FOO Camp, I'm reminded that I should be careful about using wireless at the conference by this informative yet clueless post by Larry Dignan on ZDNet entitled Even SSL Gmail can get sidejacked, which states

Sidejacking is a term Graham uses to describe his session hijacking hack that can compromise nearly all Web 2.0 applications that rely on saved cookie information to seamlessly log people back in to an account without the need to reenter the password.  By listening to and storing radio signals from the airwaves with any laptop, an attacker can harvest cookies from multiple users and go in to their Web 2.0 application.  Even though the password wasn’t actually cracked or stolen, possession of the cookies acts as a temporary key to gain access to Web 2.0 applications such as Gmail, Hotmail, and Yahoo.  The attacker can even find out what books you ordered on Amazon, where you live from Google maps, acquire digital certificates with your email account in the subject line, and much more.

Gmail in SSL https mode was thought to be safe because it encrypted everything, but it turns out that Gmail’s JavaScript code will fall back to non-encrypted http mode if https isn’t available.  This is actually a very common scenario anytime a laptop connects to a hotspot before the user signs in where the laptop will attempt to connect to Gmail if the application is opened but it won’t be able to connect to anything.  At that point in time Gmail’s JavaScripts will attempt to communicate via unencrypted http mode and it’s game over if someone is capturing the data.

What’s really sad is the fact that Google Gmail is one of the “better” Web 2.0 applications out there and it still can’t get security right even when a user actually chooses to use SSL mode. 

Although the blog post is about a valid concern, namely the increased likelihood of man-in-the-middle attacks when using unsecured or shared wireless networks, it presents the issue in the most ridiculous way possible. Man-in-the-middle attacks are a problem related to using computer networks, period; they aren't limited to the Web, let alone to Web 2.0 (whatever that means).
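For what it's worth, the boring mitigation on the application side has nothing to do with Web 2.0 either: session cookies simply shouldn't be allowed to travel over plain HTTP. Here is a minimal Python sketch, with placeholder values, of what the relevant Set-Cookie header looks like.

from http.cookies import SimpleCookie

# Sketch with placeholder values: a session cookie flagged so the browser will
# only send it back over https (Secure) and won't expose it to page scripts
# (HttpOnly). Without the Secure flag, the same cookie leaks the moment the
# application falls back to plain http on an open hotspot.
cookie = SimpleCookie()
cookie["session_id"] = "deadbeef1234"               # hypothetical session token
cookie["session_id"]["domain"] = "mail.example.com" # hypothetical domain
cookie["session_id"]["secure"] = True               # https only
cookie["session_id"]["httponly"] = True             # not readable from page scripts

# Emits a Set-Cookie header carrying the Domain, Secure and HttpOnly attributes.
print(cookie.output())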

Now Playing: 50 Cent - Touch The Sky (Feat. Tony Yayo) (Prod by K Lassik)


 

Obviously, this is the top story on all the tech news sites this morning. My favorite take so far has been from a post on Slashdot entitled Implications for open source, which is excerpted below

A consolidation of the Microsoft and Yahoo networks could shift a massive amount of infrastructure from open source technologies to Microsoft platforms. Microsoft said that "eliminating redundant infrastructure and duplicative operating costs will improve the financial performance of the combined entity." Yahoo has been a major player in several open source projects. Most of Yahoo's infrastructure runs on FreeBSD, and the lead developer of PHP, Rasmus Lerdorf, works as an engineer at Yahoo. Yahoo has also been a major contributor to Hadoop, an open source technology for distributed computing. Data Center Knowledge [datacenterknowledge.com] has more on the infrastructure implications.

I listened in on the conference call, and although the highlighted quote is paraphrased, it is close to the statement that made me raise my eyebrows when I heard it over the phone, given my day job.

What a day to not be going into work...


 

Categories:

From the press release entitled Microsoft Proposes Acquisition of Yahoo! for $31 per Share we learn

REDMOND, Wash. — Feb. 1, 2008 — Microsoft Corp. (NASDAQ:MSFT) today announced that it has made a proposal to the Yahoo! Inc. (NASDAQ:YHOO) Board of Directors to acquire all the outstanding shares of Yahoo! common stock for per share consideration of $31 representing a total equity value of approximately $44.6 billion. Microsoft’s proposal would allow the Yahoo! shareholders to elect to receive cash or a fixed number of shares of Microsoft common stock, with the total consideration payable to Yahoo! shareholders consisting of one-half cash and one-half Microsoft common stock. The offer represents a 62 percent premium above the closing price of Yahoo! common stock on Jan. 31, 2008.

“We have great respect for Yahoo!, and together we can offer an increasingly exciting set of solutions for consumers, publishers and advertisers while becoming better positioned to compete in the online services market,” said Steve Ballmer, chief executive officer of Microsoft. “We believe our combination will deliver superior value to our respective shareholders and better choice and innovation to our customers and industry partners.”

“Our lives, our businesses, and even our society have been progressively transformed by the Web, and Yahoo! has played a pioneering role by building compelling, high-scale services and infrastructure,” said Ray Ozzie, chief software architect at Microsoft. “The combination of these two great teams would enable us to jointly deliver a broad range of new experiences to our customers that neither of us would have achieved on our own.”

WOW. Just...wow.

There's a conference call with Ray Ozzie, Steve Ballmer, Chris Liddell and Kevin Johnson in about half an hour to discuss this. This is the first time I've considered listening in on one of those.


 

Two seemingly unrelated posts flew by my aggregator this morning. The first was Robert Scoble’s post The shy Mark Zuckerberg, founder of Facebook where he talks about meeting the Facebook founder. During their conversation, Zuckerberg admits they made mistakes with their implementation of Facebook Beacon and will be coming out with an improved version soon.

The second post is from the Facebook developer blog and it is the announcement of the JavaScript Client Library for Facebook API which states

This JavaScript client library allows you to make Facebook API calls from any web site and makes it easy to create Ajax Facebook applications. Since the library does not require any server-side code on your server, you can now create a Facebook application that can be hosted on any web site that serves static HTML.

Although the pundits have been going ape shit over this on Techmeme, this is an unsurprising announcement given the precedent of Facebook Beacon. With that announcement Facebook provided a mechanism for a limited set of partners to integrate with its feed.publishTemplatizedAction API using a JavaScript client library. Exposing the rest of the API using similar techniques was just a matter of time.

What was surprising to me when reading the developer documentation for the Facebook JavaScript client library is the following notice to developers

Before calling any other Facebook API method, you must first create an FB.ApiClient object and invoke its requireLogin() method. Once requireLogin() completes, it invokes the callback function. At that time, you can start calling other API methods. There are two ways to call them: normal mode and batched mode.

So unlike the original implementation of Beacon, the Facebook developers aren’t automatically associating your Facebook account with the 3rd party site then letting them party on your data. Instead, it looks like the user will be prompted to login before the Website can start interacting with their data on Facebook or giving Facebook data about the user.

This is a much better approach than Beacon and remedies the primary complaint from my Facebook Beacon is Unfixable post from last month.

Of course, I haven’t tested it yet to validate whether this works as advertised. If you get around to testing it before I do, let me know in the comments whether it works the way the documentation implies.


 

I’m not a regular user of Digg, so I tend to ignore the controversy-of-the-month storms that end up on Techmeme whenever the site tweaks its algorithm for promoting items from the upcoming queue to the front page. Today I decided to take a look at the current controversy and clicked on the story So Called Top Digg Users Cry About Digg Changes, which led me to the following comment by ethicalh

the algorithm uses the "social networking" part of digg now. if you are a mutual friend of the submitter your vote is cancelled and wont go towards promotion to the frontpage, and if you're a mutual friend of someone who has already dugg the story, your vote is cancalled and won't go towards promotion to the frontpage. so if you don't have a friends list and don't use the social networking part of digg then you can still be a top digger. you just need to create a new account and don't be tempted to add anyone as a friend, because the new algorithm is linked upto everyones friends list now. thats the real reason digg added social networking, is so they could eventually hook it upto the algorithm, thats the secret reason digg introduced social networking and friends lists onto the digg site.

A number of people confirmed the behavior in the comments. It seems that all votes from mutual friends (i.e. people on each other’s friends lists) are treated as a single vote. So if three of us are friends who all use Digg and have each other on our friends lists, and all three of us vote for a story, it is counted as a single vote. This is intended to subvert cliques of “friends” who vote up each other’s articles and to encourage diversity, or so the Digg folks claim in a quote from a New York Times article on the topic.
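Here is a toy model, in Python, of the rule as the comment describes it. This is my reading of the behavior, not Digg’s actual algorithm: process diggs in order, and only count one if the digger isn’t a mutual friend of the submitter or of anyone who has already dugg the story.

def mutual(friends, a, b):
    """True if a and b have each other on their friends lists."""
    return b in friends.get(a, set()) and a in friends.get(b, set())

def effective_diggs(friends, submitter, diggers):
    """Count diggs, discarding any vote from a mutual friend of the
    submitter or of someone who has already dugg the story."""
    seen = [submitter]
    count = 0
    for digger in diggers:
        if not any(mutual(friends, digger, prior) for prior in seen):
            count += 1
        seen.append(digger)
    return count

# Three friends who all list each other: their three votes collapse into one
# effective vote, while the unconnected fourth digger still counts separately.
friends = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "carol"},
    "carol": {"alice", "bob"},
    "dave": set(),
}
print(effective_diggs(friends, "submitter", ["alice", "bob", "carol", "dave"]))  # 2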

My first impression was that this change seems pretty schizophrenic on the part of the Digg developers. What is the point of adding all sorts of features that let me keep tabs on what other Digg users are voting up if you don’t want me to vote up the ones I like as well? Are people really supposed to trawl through http://www.digg.com/tech_news/upcoming to find stories to vote up instead of just voting up the cool stories their friends have already found? Yeah…right.

But thinking about it more, I realize this move is even more screwed up. The Digg developers are effectively penalizing you for using a feature of the site. You actually get less value out of the site by using the friends feature, because once you start using it, stories you like become less likely to reach the front page: your vote doesn’t count if someone on your friends list has already voted for the story.

So why would anyone want to use the friends feature once they realize this penalty exists?

Now playing: Dogg Pound - Ridin', Slipin' & Slidin'


 

Categories: Social Software

According to the blog post entitled Microsoft Joins DataPortability.org on dev.live.com we learn

Today Microsoft is announcing that it has joined DataPortability.org, a group committed to advancing the conversation about the portability, security and privacy of individuals’ information online.  There are important security and privacy issues to solve as the internet evolves, and we are committed to being an integral part of the industry conversation on behalf of our users.

The decision to join DataPortability.org is an outgrowth of a deeper theme that technology and the internet should be deployed to help people be at the center of their online worlds, a theme that has begun to permeate our products and services over the past few years. We believe the logical evolution of the internet is to enable the removal of barriers to provide integrated, seamless experiences, but to do so in a manner that ensures that users retain full control over the security and privacy of their information.

Windows Live is focused on providing tools and a platform to enable these types of seamless experiences.  Windows Live has more than 420 million active Live IDs that work across our services and across partner sites. 

I’m sure some folks are wondering exactly what this means. Even though I was close to the decision making around this, I believe it is still too early to tell. Personally, I share Marc Canter’s skepticism about DataPortability.org given that so far there’s been a lot of hype but no real meat.

However, we have real problems to solve as an industry. The lack of interoperability between various social software applications is troubling, given that the Internet (and especially the Web) became the success it is today by embracing interoperability instead of walled gardens fighting over who can build the prettiest gilded cage for their prisoners customers. The fact that when interoperability does happen, it happens via back room deals (e.g. Google OpenSocial, Microsoft’s conversations with startups, etc.) instead of being open to all via standard and unencumbered protocols is similarly troubling. Even worse, insecure practices that expose social software users to privacy violations have become commonplace due to the lack of a common framework for interoperability.

As far as I can tell, DataPortability.org seems like a good forum for the various social software vendors to start talking about how we get to a world where there is actual interoperability between social software applications. I’d like to see real meat fall out of this effort, not fluff. One of the representatives Microsoft has chosen is the dev lead from the product team I am on (Inder Sethi), which implies we want technical discussion of protocols and technologies, not just feel-good jive. We’ll also be sending a product planning/marketing type (John Richards) to make sure the end user perspective is covered as well. You can assume that even though I am not on the working group in person, I will be there in spirit since I communicate with both John and Inder on a regular basis. Smile

I’ll also be at the O’Reilly offices during Super Bowl weekend attending the O’Reilly Social Graph FOO Camp, which I hope will be another avenue to sit down with technical decision makers from the major social software vendors and talk about how we can move this issue forward as an industry.

Now playing: Bone Thugs 'N Harmony - If I Could Teach The World


 

Categories: Windows Live

The top story in my favorite RSS reader is the article MapReduce: A major step backwards by David J. DeWitt and Michael Stonebraker. This is one of those articles that is so bad you feel dumber after having read it. The primary thesis of the article is

 MapReduce may be a good idea for writing certain types of general-purpose computations, but to the database community, it is:

  1. A giant step backward in the programming paradigm for large-scale data intensive applications
  2. A sub-optimal implementation, in that it uses brute force instead of indexing
  3. Not novel at all -- it represents a specific implementation of well known techniques developed nearly 25 years ago
  4. Missing most of the features that are routinely included in current DBMS
  5. Incompatible with all of the tools DBMS users have come to depend on

One of the worst things about articles like this is that they get usually reasonable and intelligent-sounding people spouting off bogus responses as knee-jerk reactions to the article's stupidity. A typical bogus reaction was the one from Rich Skrenta in his post Database gods bitch about mapreduce, which talks of "disruption" as if Google's MapReduce were actually comparable to a relational database management system.

On the other hand, the good thing about articles like this is that you often get great responses from smart folks that further your understanding of the subject matter even though the original article was crap. For example, take the post from Google employee Mark Chu-Carroll entitled Databases are Hammers; MapReduce is a Screwdriver where he writes eloquently that

The beauty of MapReduce is that it's easy to write. M/R programs are really as easy as parallel programming ever gets. So, getting back to the article. They criticize MapReduce for, basically, not being based on the idea of a relational database.

That's exactly what's going on here. They've got their relational databases. RDBs are absolutely brilliant things. They're amazing tools, which can be used to build amazing software. I've done a lot of work using RDBs, and without them, I wouldn't have been able to do some of the work that I'm proudest of. I don't want to cut down RDBs at all: they're truly great. But not everything is a relational database, and not everything is naturally suited towards being treated as if it were relational. The criticisms of MapReduce all come down to: "But it's not the way relational databases would do it!" - without every realizing that that's the point. RDBs don't parallelize very well: how many RDBs do you know that can efficiently split a task among 1,000 cheap computers? RDBs don't handle non-tabular data well: RDBs are notorious for doing a poor job on recursive data structures. MapReduce isn't intended to replace relational databases: it's intended to provide a lightweight way of programming things so that they can run fast by running in parallel on a lot of machines. That's all it was intended to do.

Mark’s entire post is a great read.
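For readers who haven’t written one, Mark’s point is easiest to see with the canonical word count example. What follows is a toy, single-process sketch of the programming model in Python, not Google’s implementation: the programmer supplies only the map and reduce functions, while the surrounding framework (here a few lines of driver code, at Google thousands of machines) takes care of splitting the input, grouping intermediate values by key and running both phases in parallel.

from collections import defaultdict

def map_fn(document):
    """Emit a (word, 1) pair for every word in the document."""
    for word in document.split():
        yield (word.lower(), 1)

def reduce_fn(word, counts):
    """Sum the counts emitted for a single word."""
    return (word, sum(counts))

def mapreduce(documents):
    """Toy single-process driver: map every document, group the intermediate
    pairs by key (the shuffle), then reduce each group."""
    groups = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            groups[key].append(value)
    return dict(reduce_fn(key, values) for key, values in groups.items())

docs = ["the quick brown fox", "the lazy dog", "the fox"]
print(mapreduce(docs))  # {'the': 3, 'quick': 1, 'brown': 1, 'fox': 2, 'lazy': 1, 'dog': 1}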

Greg Jorgensen also has a good rebuttal in his post Relational Database Experts Jump The MapReduce Shark, which points out that if the original article had been a critique of Web-based structured data storage systems such as Amazon’s SimpleDB or Google Base, the comparison might have been almost logical as opposed to completely ridiculous. Wink

Now playing: Marvin Gaye - I Heard It Through the Grapevine


 

Categories: Web Development