One of the core tennets we’ve had when designing social graph applications within Windows Live is that we always put users in control. Which means privacy features and opt out galore. Manifestations of this include

  • You can’t IM a Windows Live Messenger user unless they’ve given you permission to do so. So IM spam is pretty much nonexistent on our network. At worst, there is the potential of gettig lots of IM buddy requests from spammers if you have a guessable email address but even that problem has seemed more theoretical than real in our experience. 

  • Don't like getting friend invites from Windows Live Spaces? You can opt out of getting them completely or restrict who can send them to you to your first degree social networks (e.g. IM buddies) or your second degree networks (e.g. friends of friends).

  • If a non-Microsoft application wants to access your social graph (e.g. IM buddy list or Hotmail address book) using our contact APIs, not only does it need access to your log-in credentials but it also needs explicit permission from you which can be revoked if the application becomes untrustworthy.

The last item is what I want to talk about today. Pete Cashmore over at Mashable has a blog post entitled Are You Getting Quechup Spammed? where he writes

One controversial issue among social networks is how hard they should push for user acquisition. Most social networks these days let you to import your email address book in some way (Twitter is the latest), but most make it clear if they’re about to mail your contacts.

One site that’s catching people off guard is Quechup: we’ve got a volley of complaints about them in the mailbox this weekend, and a quick Google reveals that others were caught out too.

The issue lies with their “check for friends” form: during signup you’re asked to enter your email address and password to see whether any of your friends are already on the service. Enter the password, however, and it will proceed to mail all your contacts without asking permission. This has led to many users issuing apologies to their friends for “spamming” them inadvertently. Hopefully the bad PR on this one will force them to change the system.

In related news, ZDnet investigates social services Rapleaf and UpScoop, pointing out that they’re run by TrustFuse, a company that sells data to marketers. UpScoop lets you enter your email address and password and find all your friends on social networks. The company is not selling the email addresses you input, but those clients who already have lists of email addresses can bring those to TrustFuse and receive additional information about those people mined from public social networking profiles. The aggregation of all that data is perfectly legal and perhaps even ethically sound, but it’s a little unnerving for some.

I won’t comment on the legality of these services except to point out that a number of practices used to obtain a user’s contact list violate the Terms of Service of the sites they are obtained from especially when these sites have APIs. Of course, I am not a lawyer and don’t play one on TV.

I will point out that 9 times out of 10 when you hear geeks talking about social network portability or similar buzzwords they are really talking about sending people spam because someone they know joined some social networking site. I also wonder how many people realize that these fly-by-night social networking sites that they happily hand over their log-in credentials to so they can spam their friends also share the list of email addresses thus obtained with services that resell to spammers?

This brings me to Brad FitzPatrick’s essay Thoughts on the Social Graph which lists the following as one of the goals of a project he is working on while at Google

For end-users:

  1. A user should then be able to log into a social application (e.g. dopplr.com) for the first time, ideally but not necessarily with OpenID, and be presented with a dialog like,
    "Hey, we see from public information elsewhere that you already have 28 friends already using dopplr, shown below with rationale about why we're recommending them (what usernames they are on other sites). Which do you want to be friends with here? Or click 'select-all'."
    Also every so often while you're using the site dopplr lets you know if friends that you're friends with elsewhere start using the site and prompts you to be friends with them. All without either of you re-inviting/re-adding each other on dopplr... just because you two already declared your relationship publicly somewhere else. Note: some sites have started to do things like this, in ad-hoc hacky ways (entering your LJ username to get your other LJ friends from FOAF, or entering your email username/password to get your address book), but none in a beautiful, comprehensive way.

The question that runs through my mind is if you are going to build a system like this, how do you prevent badly behaved applications like Quechup from taking control away from your users? At the end of the day your users might end up thinking you sold their email addresses to spammers when in truth it was the insecure practices of the people who they’d shared their email addresses with that got them in that mess. This is one of the few reasons I can understand why Facebook takes such a hypocritical approach. :) 

At least Brad's design seems to assume that the only identifiers for users within his system will be the equivalent of foaf:mbox_sha1sum. However I suspect that many of the startups expressing interest in this space are interested in sharing rich profile data and legitimate contact information not just hashes of interesting data.

I’ll find out if my suspicions are worth anything later this week when I’m at the Data Sharing Summit.

PS: If you really want to put your tin foil hat on, read this on post on the Google Group on social network portability on Evil Third Party Graph Analysis which speculates on all the bad things one could do if you had a publicly accessible social graph (e.g. find which people in your service have lots of “friends” with bad credit, low income, criminal history, history of political dissension, poor health, etc so you can discriminate or target them accordingly) especially if you can tie some of the hashed information back to real data which should be quite possible for some subset of the people in the graph.  

Now playing: Fergie - Big Girls Don't Cry


 

Categories: Social Software

Recently someone at work asked me if I thought social networking on the Web was a fad. My response was that it depends on what you mean by social networking since lots of Web applications are lumped into that category but either way I think all of these different categories are here to stay.

I thought it would be useful to throw out a couple of definitions so that we all had a shared vocabulary when talking about the different types of Web applications that incorporate some form of social networking. A number of these terms were popularized by Mark Zuckerberg which is a testament to the way Facebook has cornered the thought leadership in this space.

  1. Social Graph: If one were to render the various ways different people in a particular community were connected into a data structure, it would be a graph.  In a social graph, each person is a vertex and each relationship connecting two people is an edge. There can be multiple edges connecting people (e.g. Mike and I work at Microsoft, Mike and I are IM buddies, Mike and I live in Washington state, etc). Edges in the social graph have a label which describes the relationship. Fun examples of social graphs are slander & libel -- the official computer scene sexchart and the Mark Foley Blame chart.

  2. Social Graph Application: An application that requires or is improved by the creation of a social graph describing the context specific relationships between it’s users is a social graph application. Examples of applications that require a social graph to actually be usable are instant messaging applications like Skype and Windows Live Messenger.  Examples of applications that are significantly improved by the existence of context specific social graphs within them are Digg, Flickr, Del.icio.us and Twitter which all don’t require a user to add themselves to the site’s social graph to utilize it’s services but get more valuable once users do. One problem with the latter category of sites is that they may require a critical mass of users to populate their social graph before they become compelling.

    Where Facebook has hit the jack pot is that they have built a platform where applications that are compelling once they have a critical mass of users can feed off of Facebook’s social graph instead of trying to build a user base from scratch. Contrast the struggling iLike website with the hugely successful iLike Facebook application.

  3. Social Networking Site: These are a subset of social graph applications. danah boyd has a great definition of social networking sites on her blog which I’ll crib in it’s entirety; A "social network site" is a category of websites with profiles, semi-persistent public commentary on the profile, and a traversable publicly articulated social network displayed in relation to the profile. Popular examples of such websites are MySpace and Bebo. You can consider these sites to be the next step in the evolution of a personal homepage which now incorporate richer media, more avenues for self expression and more interactivity than our GeoCities and Tripod pages of old.

  4. Social Operating System: These are a subset of social networking sites. In fact, the only application in this category today is Facebook.  Before you use your computer, you have to boot your operating system and every interaction with your PC goes through the OS. However instead of interacting directly with the OS, most of the time you interact with applications written on top of the OS. Similarly a Social OS is the primary application you use for interacting with your social circles on the Web. All your social interactions whether they be hanging out, chatting, playing games, watching movies, listening to music, engaging in private gossip or public conversations occurs within this context. This flexibilty is enabled by the fact that the Social OS is a platform that enables one to build various social graph applications on top of it.

By the way, on revisiting my schedule I do believe I should be able to attend the Data Sharing Summit on Friday next week. I'll only be in the area for that day but it should be fun to chat with folks from various companies working in this space and get to share ideas about how we can all work together to make the Web a better place for our users.

Now playing: Snoop Doggy Dogg - That's That Shit (feat. R. Kelly)


 

Categories: Social Software

There's an article on InfoQ entitled "Code First" Web Services Reconsidered which begins

Are you getting started on developing SOAP web services? If you are, you have two development styles you can chose between. This first is called “start-from-WSDL”, or “contract first”, and involves building a WSDL service description and associated XML schema for data exchange directly. The second is called “start-from-code”, or “code first”, and involves plugging sample service code into your framework of choice and generating the WSDL+schema from that code.

With either development style, the end goal is the same – you want a stable WSDL+schema definition of your service. This goal is especially important when you’re working in a SOA environment. SOA demands loose coupling of services, where the interface is fixed and separate from the implementation.

There are lots of problems with this article, the main one being that choosing between “contract first” and “code first” SOAP Web Services development styles is like choosing between putting a square peg in a round hole and putting a round peg in a square hole. Either way, you will have to deal with the impedance mismatch between W3C XML Schema (XSD) and objects from your typical OO system. This is because practically every SOAP Web service toolkit does some level of XML<->object mapping which ends up being lossy because W3C XML Schema contains several constructs that don’t really map well to objects and objects may contain constructs that don’t really map well to W3C XML Schema depending on your platform choice. 

The only real consideration when deciding between “code first” and “contract first” approaches is whether your service is concerned primarily with objects or whether it is concerned primarily with XML documents [preferrably with a predefined schema]. If you are just moving around data objects (i.e. your data model isn’t much more complex than JSON) then you are better off using a “code first” approach especially since most SOAP toolkits can handle the basic constructs that would result from generating WSDLs from such types. On the other hand, if you are transmitting documents that have a predefined schema (e.g. XBRL or OOXML documents) then you are better off authoring the WSDL by hand than trying to jump through hoops to get the XML<->object mapping technology in your off-the-shelf SOAP toolkit to do a great job with these schemas. Unfortunately, anyone consuming your SOAP Web service who is using an off-the-shelf SOAP toolkit will likely have interoperability issues when their toolkit tries to process schemas that fully utilize a significant number of the features W3C XML Schema. If your situation falls somewhere in the middle, then you’re probably screwed regardless of which approach you choose.

One of the interesting consequences of the adoption of RESTful Web services is that these interoperability issues have been avoided because most people who provide these services do not provide schemas in the W3C XML Schema. Thus discouraging the foolishness of XML<->object mapping based on XSD that causes interoperability problems in the SOAP world. Instead, most vendors who expose RESTful APIs that want to provide an object-centric programming model for developers who don’t want to deal with XML either create or encourage the third party creation of client libraries on target platforms which wrap their simple, RESTful Web services with a more object oriented facade. Examples of vendors who have gone with this approach for their RESTful APIs include

That way XML geeks like me get to party on the raw XML which is in a simple and straightforward format while the folks who are scared of XML can party on objects using their favorite programming language and platform. The best of both worlds.

“Contract first” vs. “Code first”? More like do you want to slam your head into a brick wall or a concrete wall?

Now playing: Big Boi - Kryptonite (I'm On It) feat. Killer Mike


 

Categories: XML Web Services

The Facebook developer blog has a post entitled Change is Coming which details some of the changes they've made to the platform to handle malicious applications including

Requests

We will be deprecating the notifications.sendRequest API method. In its place, we will provide a standard invitation tool that allows users to select which friends they would like to send a request to. We are working hard on multiple versions of this tool to fit into different contexts. The tool will not have a "select all" button, but we hope it enables us to increase the maximum number of requests that can be sent out by a user. The standardized UI will hopefully make it easier for users to understand exactly what they are doing, and will save you the trouble of building it yourself.

Notifications

Soon we will be removing email functionality from notifications.send, though the API function itself will remain active. In the future, we may provide another way to contact users who have added your app, as we know that is important. Deceptive and misleading notifications will continue to be a focus for us, and we will continue to block applications which behave badly and we will continue to iterate on our automated spam detection tools. You will also see us working on ways to automatically block deceptive notifications.

It looks like some but not all of the most egregious behavior is being targetted which is good. Specifically, I  wonder what is meant by deprecating the notifications.sendRequest API. When I think of API deprecation, I think of @deprecated in Java and Obsolete in C#, neither of which prevent the API from being used.

One of my biggest gripes with the site is the number of “friend requests” I get from applications with no way to opt out of getting these requests. However it doesn’t seem that this has been eliminated. Instead an API is being replaced with a UI component but the API isn’t even going away. I hope there is a follow up post where they describe the opt-out options they’ve added to the site so users can opt-out of getting so many unsolicited requests.

Now playing: Big Pun - Punish Me


 

August 28, 2007
@ 04:24 PM

People Who Need to Get the Fuck out of the White House
Donald Rumsfeld
Karl Rove
Alberto Gonzales
Dick Cheney
G.W. Bush

Now playing: Al Green - Tired Of Being Alone


 

Categories: Current Affairs

Robert Scoble has a blog post up entitled Why Mahalo, TechMeme, and Facebook are going to kick Google’s butt in four years where he argues that search based on social graphs (e.g. your Facebook relationships) or generated by humans (e.g. Mahalo) will eventually trump Google's algorithms. I'm not sure I'd predict the demise of Google but I do agree that the social graph can be used to improve search and other aspects of the Internet experience, in fact I agree so much that was the topic of my second ThinkWeek paper which I submitted earlier this year (Microsoft folks can find it here).

However I don’t think Google’s main threat from sites like Facebook is that they may one day build social graph powered search that beats Google’s algorithms. Instead it is that these sites are in direct conflict with Google’s mission to

organize the world's information and make it universally accessible and useful.

because they create lots of valuable content that Google can not access. Google has branched out of Web search into desktop search, enterprise search, Web-based email and enterprise application hosting all to fulfill this mission.

The problem that Google faces with Facebook is pointed out quite well in Jason Kottke’s post Facebook vs. AOL, redux where he writes

Think of it this way. Facebook is an intranet for you and your friends that just happens to be accessible without a VPN. If you're not a Facebook user, you can't do anything with the site...nearly everything published by their users is private. Google doesn't index any user-created information on Facebook.2

and in Jeff Atwood's post Avoiding Walled Gardens on the Internet which contains the following excerpt

I occasionally get requests to join private social networking sites, like LinkedIn or Facebook. I always politely decline…public services on the web, such as blogs, twitter, flickr, and so forth, are what we should invest our time in. And because it's public, we can leverage the immense power of internet search to tie it all-- and each other-- together.

What Jason and Jeff are inadvertantly pointing out is that once you join Facebook, you immediately start getting less value out of Google’s search engine. This is a problem that Google cannot let continue indefinitely if they plan to stay relevant as the Web’s #1 search engine.

What is also interesting is that thanks to efforts of Google employees like Mark Lucovsky, I can use Google search from within Facebook but without divine intervention I can’t get Facebook content from Google’s search engine. If I was an exec at Google, I’d worry a lot more about the growing trend of users creating Web content where it cannot be accessed by Google than all the “me too” efforts coming out of competitors like Microsoft and Yahoo!.

The way you get disrupted is by focusing on competitors who are just like you instead of actually watching the marketplace. I wonder how Google will react when they eventually realize how deep this problem runs?

Now playing: Metallica - Welcome Home (Sanitarium)


 

I try to avoid posting about TechMeme pile ups but this one was just too irritating to let pass. Mark Cuban has a blog post entitled The Internet is Dead and Boring which contains the following excerpts 

Some of you may not want to admit it, but that's exactly what the net has become. A utility. It has stopped evolving. Your Internet experience today is not much different than it was 5 years ago.
...
Some people have tried to make the point that Web 2.0 is proof that the Internet is evolving. Actually it is the exact opposite. Web 2.0 is proof that the Internet has stopped evolving and stabilized as a platform. Its very very difficult to develop applications on a platform that is ever changing. Things stop working in that environment. Internet 1.0 wasn't the most stable development environment. To days Internet is stable specifically because its now boring.(easy to avoid browser and script differences excluded)

Applications like Myspace, Facebook, Youtube, etc were able to explode in popularity because they worked. No one had to worry about their ISP making a change and things not working. The days of walled gardens like AOL, Prodigy and others were gone.
...
The days of the Internet creating explosively exciting ideas are dead. They are dead until bandwidth throughput to the home reaches far higher numbers than the vast majority of broadband users get today.
...
So, let me repeat, The days of the Internet creating explosively exciting ideas are dead for the foreseeable future..

I agree with Mark Cuban that the fundamental technologies that underly the Internet (DNS and TCP/IP) and the Web in particular (HTTP and HTML) are quite stable and are unlikely to undergo any radical changes anytime soon. If you are a fan of Internet infrastructure then the current world is quite boring because we aren't likely to ever see an Internet based on IPv8 or a Web based on HTTP 3.0. In addition, it is clear that the relative stability in the Web development environment and increase in the number of people with high bandwidth connections is what has led a number of the trends that are collectively grouped as "Web 2.0".

However Mark Cuban goes off the rails when he confuses his vision of the future of media as the only explosive exciting ideas that can be enabled by a global network like the Internet. Mark Cuban is an investor in HDNet which is a company that creates and distributes professionally produced content in high definition video formats. Mark would love nothing more than to distribute his content over the Internet especially since lack of interest in HDNet in the cable TV universe (I couldn't find any cable company on the Where to Watch HDNet page that actually carried the channel).

Unfortunately, Mark Cuban's vision of distributing high definition video over the Internet has two problems. The first is the fact that distributing high quality video of the Web is too expensive and the bandwidth of the average Web user is insufficient to make the user experience pleasant. The second is that people on the Web have already spoken and content trumps media quality any day of the week. Remember when pundits used to claim that consumers wouldn't choose lossy, compressed audio on the Web over lossless music formats? I guess no one brings that up anymore given the success of the MP3 format and the iPod. Mark Cuban is repeating the same mistake with his HDNet misadventure.  User generated, poor quality video on sites like YouTube and larger library of content on sites like Netflix: Instant Viewing is going to trump the limited line up on services like HDNet regardless of how much higher definition the video quality gets.

Mark Cuban has bet on a losing horse and he doesn't realize it yet. The world has changed on him and he's still trying to work within an expired paradigm. It's like a newspaper magnate blaming printer manufacturers for not making it easy to print a newspaper off of the Web instead of coming to grips with the fact that the Internet with its blogging/social media/user generated content/craig's list and all that other malarky has turned his industry on its head.

This is what it looks like when a billionairre has made a bad investment and doesn't know how to recognize the smell of failure blasting his nostrils with its pungent aroma.


 

Categories: Current Affairs | Technology

August 26, 2007
@ 11:10 PM

Over the last few months, there have been numerous articles by various bloggers and mainstream press speculating on whether use of Facebook will supplant email. I have also noticed that I use the private messaging feature in Facebook to keep in touch with more people, more often than I do with non-work related email.

I used to think that this was a welcome development from a user's point of view because spam is pretty much eliminated due to in-built white lists based on social networks. Or at least that's what I thought. However over the past few weeks I've been getting more and more unsolicited private messages on the site which aren't p3n1s enlargement or 419 scams but are still unsolicited. On taking a look at the privacy options, it turns out that there doesn't seem to be a way to opt out of being contacted by random people on the site. 

In fact, I'm surprised that regular spammers haven't yet flooded Facebook given how they seem to end up everywhere else you let people contact each other directly over the Web. One thing I find confusing is that I can swear that there was an option to opt out of messages from people who aren't on your friend's list or in one of your Networks. Or was this just my imagination?

The other kind of unsolicited mail that is totally wrecking my Facebook experience is unsolicited friend requests from Facebook applications. These aren't just regular friend requests. It seems that every application a user adds can make friend requests specific to the application. I'm getting friend requests from My Questions and Likeness on an almost daily basis with no way to permanently ignore friend requests from these applications.

I guess Clay Shirky's old saying is true, the definition of social software is stuff that gets spammed.


 

Categories: Social Software

August 24, 2007
@ 06:44 PM

Recently, my status message on Facebook was I'm now convinced microformats are a stupid idea. Shortly afterwards I got a private message from Scott Beaudreau asking me to clarify my statement. On reflection, I realize that what I find stupid is when people suggest using microformats and screen scraping techiques instead of utilizing an API when the situation calls for one. For example, the social network portability proposal on the microformats wiki states

The "How To" for social network profile sites that want to solve the above problems and achieve the above goals.

  1. Publish microformats in your user profiles:
    1. implement hCard on user profile pages. See hcard-supporting-profiles for sites that have already done this.
    2. implement hCard+XFN on the list of friends on your user profile pages. See hcard-xfn-supporting-friends-lists for sites that already do this. (e.g. [Twitter (http://twitter.com/)]).
  2. Subscribe to microformats for your user profiles:
    1. when signing up a new user:
      1. let a user fill out and "auto-sync" from one of their existing hcard-supporting-profiles, their name, their icon etc. Satisfaction Inc already supports this. (http://microformats.org/blog/2007/06/21/microformatsorg-turns-2/)
      2. let a user fill out and "auto-sync" their list of friends from one of their existing hCard+XFN supporting friends lists. Dopplr.com already supports this.

It boggles my mind to see the suggestion that applications should poll HTML pages to do data synchronization instead of utilizing an API. Instead of calling friends.get, why don't we just grab the entire friends page then parse out the handful of useful data that we actually need.

There are a number of places microformats are a good fit, especially in situations where the client application has to parse the entire HTML document anyway. Examples include using hCard to enable features like Live Clipboard or using hReview and hCalendar to help microformat search engines. However using them as a replacement for an API or an RSS feed is like using boxing gloves instead of oven mitts when baking a pie.

If all you have is a hammer, everything looks like a nail.

Now playing: D12 - American Psycho II (feat. B-Real)