One of the core tenets we’ve had when designing social graph applications within Windows Live is that we always put users in control, which means privacy features and opt-outs galore. Manifestations of this include:

  • You can’t IM a Windows Live Messenger user unless they’ve given you permission to do so, so IM spam is pretty much nonexistent on our network. At worst, there is the potential of getting lots of IM buddy requests from spammers if you have a guessable email address, but even that problem has seemed more theoretical than real in our experience.

  • Don't like getting friend invites from Windows Live Spaces? You can opt out of getting them completely, or restrict who can send them to your first-degree social network (e.g. IM buddies) or your second-degree network (e.g. friends of friends).

  • If a non-Microsoft application wants to access your social graph (e.g. IM buddy list or Hotmail address book) using our contact APIs, not only does it need access to your log-in credentials, it also needs explicit permission from you, which can be revoked if the application becomes untrustworthy.

The last item is what I want to talk about today. Pete Cashmore over at Mashable has a blog post entitled Are You Getting Quechup Spammed? where he writes:

One controversial issue among social networks is how hard they should push for user acquisition. Most social networks these days let you to import your email address book in some way (Twitter is the latest), but most make it clear if they’re about to mail your contacts.

One site that’s catching people off guard is Quechup: we’ve got a volley of complaints about them in the mailbox this weekend, and a quick Google reveals that others were caught out too.

The issue lies with their “check for friends” form: during signup you’re asked to enter your email address and password to see whether any of your friends are already on the service. Enter the password, however, and it will proceed to mail all your contacts without asking permission. This has led to many users issuing apologies to their friends for “spamming” them inadvertently. Hopefully the bad PR on this one will force them to change the system.

In related news, ZDnet investigates social services Rapleaf and UpScoop, pointing out that they’re run by TrustFuse, a company that sells data to marketers. UpScoop lets you enter your email address and password and find all your friends on social networks. The company is not selling the email addresses you input, but those clients who already have lists of email addresses can bring those to TrustFuse and receive additional information about those people mined from public social networking profiles. The aggregation of all that data is perfectly legal and perhaps even ethically sound, but it’s a little unnerving for some.

I won’t comment on the legality of these services except to point out that a number of the practices used to obtain a user’s contact list violate the Terms of Service of the sites the list is obtained from, especially when those sites have APIs. Of course, I am not a lawyer and don’t play one on TV.

I will point out that 9 times out of 10, when you hear geeks talking about social network portability or similar buzzwords, they are really talking about sending people spam because someone they know joined some social networking site. I also wonder how many people realize that the fly-by-night social networking sites they happily hand their log-in credentials to, so they can spam their friends, also share the list of email addresses thus obtained with services that resell to spammers.

This brings me to Brad Fitzpatrick’s essay Thoughts on the Social Graph, which lists the following as one of the goals of a project he is working on while at Google:

For end-users:

  1. A user should then be able to log into a social application (e.g. dopplr.com) for the first time, ideally but not necessarily with OpenID, and be presented with a dialog like,
    "Hey, we see from public information elsewhere that you already have 28 friends already using dopplr, shown below with rationale about why we're recommending them (what usernames they are on other sites). Which do you want to be friends with here? Or click 'select-all'."
    Also every so often while you're using the site dopplr lets you know if friends that you're friends with elsewhere start using the site and prompts you to be friends with them. All without either of you re-inviting/re-adding each other on dopplr... just because you two already declared your relationship publicly somewhere else. Note: some sites have started to do things like this, in ad-hoc hacky ways (entering your LJ username to get your other LJ friends from FOAF, or entering your email username/password to get your address book), but none in a beautiful, comprehensive way.

The question that runs through my mind is this: if you are going to build a system like this, how do you prevent badly behaved applications like Quechup from taking control away from your users? At the end of the day, your users might end up thinking you sold their email addresses to spammers when in truth it was the insecure practices of the people they’d shared their email addresses with that got them into that mess. This is one of the few reasons I can understand why Facebook takes such a hypocritical approach. :)

At least Brad's design seems to assume that the only identifiers for users within his system will be the equivalent of foaf:mbox_sha1sum. However, I suspect that many of the startups expressing interest in this space are interested in sharing rich profile data and legitimate contact information, not just hashes of interesting data.
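
For anyone who hasn't run into foaf:mbox_sha1sum: in FOAF it is the SHA-1 hash of a person's mailbox URI, including the mailto: prefix. Here is a minimal Python sketch of how such an identifier is derived. The function name and the example address are made up for illustration, and the trimming and lowercasing step is my own assumption rather than something FOAF requires.

```python
import hashlib


def mbox_sha1sum(email: str) -> str:
    """Return a foaf:mbox_sha1sum-style identifier for an email address."""
    # Assumption: trim and lowercase so trivially different spellings
    # of the same address produce the same identifier.
    normalized = email.strip().lower()
    # The hash is taken over the full mailbox URI, not the bare address.
    mailbox_uri = "mailto:" + normalized
    return hashlib.sha1(mailbox_uri.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Hypothetical address, used only to show the shape of the output.
    print(mbox_sha1sum("alice@example.com"))
```

Two sites that both know your email address can compute the same identifier and match you up without either one publishing the address itself, which is presumably the appeal of the approach.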

I’ll find out if my suspicions are worth anything later this week when I’m at the Data Sharing Summit.

PS: If you really want to put your tin foil hat on, read this post on the Google Group on social network portability about Evil Third Party Graph Analysis, which speculates on all the bad things one could do with a publicly accessible social graph (e.g. find which people in your service have lots of “friends” with bad credit, low income, criminal history, history of political dissension, poor health, etc. so you can discriminate against or target them accordingly), especially if you can tie some of the hashed information back to real data, which should be quite possible for some subset of the people in the graph.
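
To make that last point concrete: a hash of an email address only hides the address from someone who cannot guess it. Below is a rough Python sketch of the obvious attack, assuming the published identifiers are foaf:mbox_sha1sum-style hashes and the attacker already holds a list of candidate addresses (a purchased marketing list, say). The function names and addresses are invented for illustration.

```python
import hashlib


def mbox_sha1sum(email: str) -> str:
    """SHA-1 of the mailto: URI, as in foaf:mbox_sha1sum (lowercasing is an assumption)."""
    return hashlib.sha1(("mailto:" + email.strip().lower()).encode("utf-8")).hexdigest()


def unmask(published_hashes, candidate_emails):
    """Map each published hash back to a candidate address that produces it."""
    # Hash every address the attacker already knows...
    lookup = {mbox_sha1sum(e): e for e in candidate_emails}
    # ...and keep the published hashes that turn up in that table.
    return {h: lookup[h] for h in published_hashes if h in lookup}


if __name__ == "__main__":
    # Hypothetical public graph containing only hashed identifiers.
    published = {mbox_sha1sum("alice@example.com"), mbox_sha1sum("carol@example.com")}
    # Hypothetical list of addresses the attacker already has.
    known = ["alice@example.com", "bob@example.com"]
    recovered = unmask(published, known)
    print(sorted(recovered.values()))
```

Anyone whose address is already on such a list is effectively not anonymous in the graph; only people whose addresses nobody has guessed stay hidden.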

Now playing: Fergie - Big Girls Don't Cry

Monday, 03 September 2007 00:45:26 (GMT Daylight Time, UTC+01:00)
In Brad's post he also says "The focus is only on public data for now, as that's all you can spray around the net freely to other parties. While focusing on public data doesn't solve 100% of the problem, it does solve, say, 90% of the problem at 10% of the complexity. Private data can be added later, perhaps at a higher layer. For now, only public data."

At first I thought it might be possible to do some kind of private data sharing using hashed identifiers or Bloom filters (http://nicklothian.com/blog/2007/08/24/preserving-privacy-while-promoting-social-network-portability/)

I thought that Bloom filters might provide some protection against "evil graph analysis" but I have since come to the conclusion that - while that might make it more computationally expensive - companies are going to do it anyway (eg Jeff Jonas: http://jeffjonas.typepad.com/jeff_jonas/2007/07/how-to-use-a-gl.html)

Now I think I just agree with Brad - public data only is the way to go. Of course, I think selling that more widely might be interesting...

(Not sure if this posted or not...)
Monday, 03 September 2007 00:45:39 (GMT Daylight Time, UTC+01:00)
For fuck's sake Dare. Fergie? Are you a twelve year old girl?
Paul Orndorff
Monday, 03 September 2007 08:10:52 (GMT Daylight Time, UTC+01:00)
Not sure giving login credentials to a third party site is a good idea - Are they throwaway credentials? (e.g. API key)

(Pardon my ignorance if they are, I don't have a Windows Live account).
nexusprime
Monday, 03 September 2007 08:21:47 (GMT Daylight Time, UTC+01:00)
Seems the MSDN article covers it quite thoroughly - "Login credentials" initially gave me visions of cross site authentication attacks given the prevalence of re-using passwords.
nexusprime