Today on the Facebook blog I spotted a post entitled FQL which contains the following excerpt

Two and a half months ago, a few of us were hanging out in the Facebook TV room, laying on the Fatboys and geeking out about how to move forward with the API for the Facebook Platform. We had a beta version that was fully functional, but we kept wishing that the interface were cleaner, more concise, and more consistent. Suddenly it occurred to me – this problem had been solved over 30 years earlier by database developers who came up with SQL – the Structured Query Language. What if we could use the same time-tested interface as the way for developers to access Facebook's data?
...
This isn't a simple problem – with millions of users and billions of friend connections, photos, tags, etc., Facebook's data doesn't exactly fit into your average database. And, even if it did, we still have to carefully apply all of those complicated privacy rules. Facebook Query Language would have to take those SQL-style queries from developers, figure out what data they're actually looking for, figure out if they're allowed to actually see the data, figure out where the data is stored, and then finally go and get the data to return back to the developer. I knew building FQL would be hard, but that's why I couldn't wait to do it.

This is one of those things I used to think was a great idea when I was on the XML team at Microsoft. Instead of exposing your data using APIs, why not expose your data as XML and then allow people to perform XQuery operations over it? In reality, this often isn't feasible because you don't want people performing arbitrary queries over your data store that may request too much data (SELECT * FROM blog_posts) or are computationally expensive.
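To make the concern concrete, here is a minimal sketch of the kind of guard a service might put in front of a query interface. Everything in it is an assumption made for illustration: the `check_query` helper, the `blog_posts` table, the `MAX_ROWS` limit and the rules themselves are hypothetical and say nothing about how Facebook actually validates FQL.

```python
import re

# Hypothetical limits; a real service would tune these to its own data store.
MAX_ROWS = 100
ALLOWED_TABLES = {"blog_posts"}


def check_query(sql):
    """Return a list of reasons to reject `sql`; an empty list means it passed."""
    text = " ".join(sql.split())  # collapse runs of whitespace
    match = re.match(r"^SELECT (.+?) FROM (\w+)(.*)$", text, re.IGNORECASE)
    if not match:
        return ["not a simple SELECT ... FROM ... query"]

    columns, table, rest = match.groups()
    problems = []
    if columns.strip() == "*":
        problems.append("SELECT * is not allowed; name the columns")
    if table.lower() not in ALLOWED_TABLES:
        problems.append("unknown table: " + table)
    if not re.search(r"\bWHERE\b", rest, re.IGNORECASE):
        problems.append("a WHERE clause is required")
    limit = re.search(r"\bLIMIT (\d+)\b", rest, re.IGNORECASE)
    if limit is None or int(limit.group(1)) > MAX_ROWS:
        problems.append("a LIMIT of at most %d is required" % MAX_ROWS)
    return problems


print(check_query("SELECT * FROM blog_posts"))
print(check_query("SELECT title FROM blog_posts WHERE author_id = 7 LIMIT 20"))
```

Even this toy version shows the trade-off: each rule that protects the data store also takes away some of the flexibility that made a query language attractive in the first place.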

The FQL developers guide states that typical queries look like

SELECT name, pic FROM user WHERE uid=211031 OR uid=4801660

SELECT name, affiliations FROM user
WHERE uid IN (SELECT uid2 FROM friend WHERE uid1=211031)
AND "Facebook" IN affiliations.name AND uid < 10

SELECT src, caption, 1+2*3/4, caption, 10*(20 + 1) FROM photo
WHERE pid IN (SELECT pid FROM photo_tag WHERE subject=211031) AND
pid IN (SELECT pid FROM photo_tag WHERE subject=204686) AND
caption

and return results as XML. I've assumed that what is supported is a simple subset of SQL, perhaps written with Lex & Yacc or ANTLR, but it still seems somewhat problematic to move away from the constrained interface of an API and provide access via a query language. It is definitely a lot cooler and more consistent to work with a query language than an API though. Later on when I have some free time, I'll see if I can deduce the grammar for FQL by trying out queries in the Facebook API test console. It looks like there goes one of my evenings this week.

Nice work.


 

February 27, 2007
@ 06:15 PM

With the hubbub now settling down I decided to go back and try out Yahoo! Pipes. For a while, I've wanted a feed for articles by Chris Kelly over on Huffington Post so I decided to build that. After a couple of false starts I created the feed, which currently doesn't have any items because there aren't any posts by Chris Kelly in the Huffington Post feed.

Now that I've actually used the service I'm pretty surprised that anyone thinks that this is a service that non-geeks will use. Programming with flowcharts to process RSS feeds seems even geekier than having a Star Trek wedding which was my previous bar for geekiest thing ever.
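For comparison, the same kind of filter (keep only one author's items from a feed) can be sketched in a few lines of Python. This is only an illustration: the sample feed below is made up, the `titles_by_author` helper is hypothetical, and it assumes the feed marks authors with the Dublin Core `dc:creator` element, which may not be what the Huffington Post feed actually uses.

```python
import xml.etree.ElementTree as ET

# Dublin Core namespace, commonly used for per-item author names in RSS 2.0.
DC = "{http://purl.org/dc/elements/1.1/}"

# Made-up sample feed for illustration; not the real Huffington Post feed.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Example group blog</title>
    <item><title>Post one</title><dc:creator>Chris Kelly</dc:creator></item>
    <item><title>Post two</title><dc:creator>Someone Else</dc:creator></item>
    <item><title>Post three</title><dc:creator>Chris Kelly</dc:creator></item>
  </channel>
</rss>"""


def titles_by_author(feed_xml, author):
    """Return the titles of all items whose dc:creator matches `author`."""
    root = ET.fromstring(feed_xml)
    titles = []
    for item in root.iter("item"):
        creator = item.findtext(DC + "creator", default="")
        if creator.strip().lower() == author.lower():
            titles.append(item.findtext("title", default=""))
    return titles


print(titles_by_author(SAMPLE_FEED, "Chris Kelly"))
```

Whether this or a flowchart is the friendlier way to express the filter is debatable, but neither looks like something a non-geek would reach for.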


 

From the Microsoft press release Microsoft Demonstrates Further Commitment to Healthcare Market With Planned Acquisition of Web Search Company we learn

NEW ORLEANS — Feb. 26, 2007 — Microsoft Corp. today announced that it has agreed to acquire Medstory Inc., a privately held company based in Foster City, Calif., that develops intelligent Web search technology specifically for health information. The acquisition represents a strategic move for Microsoft in the consumer health search arena and signals a long-term commitment toward the development of a broader consumer health strategy. Medstory employees will join the Health Solutions Group, a recently formed division at Microsoft that will manage product development and delivery. Financial terms were not disclosed, as part of the agreement between the organizations.

This reminds me of the post Thoughts on health care, continued from Google's Adam Bosworth which stated

As I indicated in my post last week, I've been interested in the issue of health care and health information for a while. I just spoke at a conference about some of the challenges in the health care system that we at Google want to tackle. The conference, called Connecting Americans to Their Health Care, is a gathering focused on how consumers are transforming health care through the use of personal health technologies.

This speech will give you some insight into the problems that we believe need our attention.

It is also interesting that Adam Bosworth had been billed with the title Architect, Google Health for a while. I'd once heard that the market for medical related keywords is one of the most lucrative for search engines, which may explain the interest. However if you look at the list of most expensive AdWords it would seem that building a vertical search engine targeted at debt consolidation is the real goldmine. :)


 

I'm almost caught up on my blog reading since getting back from vacation and I've spotted a couple of items I'd have blogged responses to if I had been around. Since I don't have the time to write full blog posts on each of these items, here are links to the posts and brief outlines of what I thought about them

  • Harish Mallipeddi has a blog post entitled Measuring efficiency of tagging with Entropy which excerpts the key findings from the paper Understanding Navigability of Social Tagging Systems by Ed Chi and Todd Mytkowicz of Xerox PARC. One result of their research, which seems obvious in hindsight and shows one of the issues that social software has to deal with as its community of users grows, was

    The way he does that is to measure entropy (yup that same old same old Claude Shannon’s information theory which you learned in one of the CS courses) of entities like documents (D), users (U) and tags (T). His research group crawled the entire del.icio.us archive and then calculated the entropies. Here’s what they found:

    • H(D|T) specifies the social navigation efficiency. How efficient is it for us to specify a set of tags to find a set of specific documents? We found that in del.icio.us that it is getting less and less efficient.

    This makes sense when you think about it. Let's say the first set of users of del.icio.us came from a homogenous software development background and started applying the tag "xml" to mean items about the eXtensible Markup Language. Later on as the community grew, a number of gamers joined the site and they now use the tag "xml" to refer to items about the game X-Men Legends. Now if you are one of the original geek users of the site, the URL http://del.icio.us/tag/xml no longer is just about markup languages but also about video games. To actually find items strictly about the eXtensible Markup Language you may have to add other tags as refinements such as http://del.icio.us/tag/xml+programming.

    What this means is that to the oldest users of the site, the quality of the tagging system will seem to degrade over time even though this is a natural consequence of growth and diversifying its user base. Of course, this is only a problem if a lot of people use del.icio.us to find all items about a topic (i.e. browsing by tags) as opposed to just storing their individual bookmarks or subscribing to the bookmarks of people they know and trust.

  • It seems Google announced some sort of Microsoft Office killer last week. You can read Don Dodge's Why Microsoft will not fall into the Innovators Dilemma and Robert Scoble's Microsoft has no innovator’s dillema? for two conflicting opinions on how this affects Microsoft. Personally, I think I've overdosed on the number of times I've read the words innovator's dilemma in association with this announcement while catching up on email and blogs. What is funny about this situation is that almost everyone I've seen who throws the term around doesn't seem to have read the book. It is quite interesting to see Don Dodge write sentences like

    Microsoft will do everything possible to preserve these businesses while transitioning to the new Live strategy.
    and then follow that up with "No Innovators Dilemma here" without seeing the obvious contradiction in his words. Lots of doublethink at work it seems.

    A side effect of reading this set of blog posts is that I found Don Dodge's Innovate or Imitate...Fame or Fortune? which praises being a fast follower as being more valuable than being an innovator. I've found that a lot of people at Microsoft point to past and recent successes such as Xbox, Microsoft Office and Internet Explorer as proof that being a "fast follower" is the best strategy for Microsoft. There are three key problems with this kind of thinking

    1. It assumes your competitors are incompetent. This may have worked in the old days but with competitors like Google and Apple Inc, it isn't the case anymore.
    2. It requires that you have an ace up your sleeve that significantly one-ups the competitors when you ship your knock-off (e.g. integrating disparate applications into an Office suite and pricing it lower than competitors, integrating the product into the operating system, integrating a rich and social online experience into what was previously a solitary experience, etc).
    3. It ignores the fact that "first mover advantage" is actually real for applications that have network effects, which is definitely the case for social software, and a lot of software has become social software today.

  • The recurring "diversity in conferences" debate was kicked off again by a blog post by Jason Kottke entitled Gender Diversity at Web Conferences which prompted interesting responses from folks like Eric Meyer, Anil Dash and Shelley Powers. They are all good posts with stuff I agree and disagree with in them but I wasn't moved to write until I read the post Why are smart people still stuck on gender and skin-color blinders? by Tantek Çelik where he wrote

    Why is it that gender (and less often race, nay, skin-color, see below) are the only physical characteristics that lots of otherwise smart people appear to chime in support for diversity of?

    E.g. as long as we are trying for greater diversity in superficial physical characteristics (superficial because what do such characteristics have to do with the stated directly relevant criteria of "technical expertise, speaking skills, professional stature, brand appropriateness, and marketability" - though perhaps I can see a tenuous link with "rainbow" marketing), why not ask about other such characteristics?

    Where are all the green-eyed folks?

    Where are all the folks with facial tattoos?

    Where are all the redheads?

    Where are the speakers with non-ear facial piercings?

    Surely such speakers would help with "hipness" marketing.

    I found this post to be disingenuous and wondered how anybody could downplay the gender and racial bias in the "Web 2.0" technology conference scene by equating it to a preference for green-eyed speakers. So I decided to throw in my $0.02 on this topic...again.

    After the last ETech, I realized I was seeing the same faces and hearing the same things over and over again. More importantly, I noticed that the demographics of the speaker lists for these conferences don't match the software industry as a whole let alone the users who we are supposed to be building the software for.

    There were lots of little bits of ignorance by the speakers and audience which added up in a way that rubbed me wrong. For example, at the 2005 Web 2.0 conference a lot of people were ignorant of Skype except as 'that startup that got a bunch of money from eBay'. Given that there are a significant number of foreigners in the U.S. software industry who use Skype to keep in touch with folks back home, it was surprising to see so much ignorance about it at a supposedly leading edge technology conference. The same thing goes for how surprised people were by how teenagers used the Web and computers. Additionally, there are just as many women as men using social software such as photo sharing, instant messaging, social networking, etc, yet you rarely see their perspectives presented at any of these conferences.

    When I think of diversity, I expect diversity of perspectives. People's perspectives are often shaped by their background and experiences. When you have a conference about an industry which is filled with people of diverse backgrounds building software for people of diverse backgrounds, it is a disservice to have the conversation and perspectives be homogenous. The software industry isn't just young white males in their mid-20s to mid-30s nor is that the primary demographic of Web users.

    Personally, I've gotten tired of attending conferences where we heard more about technologies and sites that the homogenous demographic of young to middle aged, white, male computer geeks find interesting (e.g. del.icio.us and tagging) and less about what Web users actually use regularly or find interesting (hint: it isn't del.icio.us and it sure as fuck isn't tagging).
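A footnote to the tagging item above: the conditional entropy H(D|T) can be made concrete with a small calculation. The sketch below is my own illustration, not the method from the paper; it treats a list of (document, tag) bookmark events as an empirical distribution and computes H(D|T) in bits. The document names and the `conditional_entropy` helper are made up for the example. A higher value means that knowing the tag narrows down the document less, which is the "less efficient" case.

```python
import math
from collections import Counter


def conditional_entropy(pairs):
    """H(D|T) in bits for a list of (document, tag) bookmark events."""
    total = len(pairs)
    joint = Counter(pairs)                       # how often each (d, t) occurs
    tag_counts = Counter(t for _, t in pairs)    # how often each t occurs
    h = 0.0
    for (doc, tag), n in joint.items():
        p_joint = n / total                      # p(d, t)
        p_doc_given_tag = n / tag_counts[tag]    # p(d | t)
        h -= p_joint * math.log2(p_doc_given_tag)
    return h


# Early community: "xml" only ever means the markup language.
early = [("xml-spec", "xml"), ("xml-tutorial", "xml")]

# Later community: gamers use the same tag for X-Men Legends.
later = early + [("xmen-review", "xml"), ("xmen-cheats", "xml")]

print(conditional_entropy(early))
print(conditional_entropy(later))
```

With two equally likely documents behind the tag the uncertainty is one bit; doubling the number of documents behind the same tag adds another bit, which mirrors the "xml" example: the tag alone tells the original users less than it used to.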


 

Every once in a while someone asks me about software companies to work for in the Seattle area that aren't Microsoft, Amazon or Google. This is the first in a series of weekly posts about startups in the Seattle area that I often mention to people when they ask me this question.

Jott is described as voice-powered, hands-free messaging and to do lists on the front page of the site. Jott is primarily a voice to text service with two main features

  1. You can call 1 877 568 8486 and leave a voice memo that is converted to an email and sent to your email address.
  2. You can call 1 877 568 8486 and leave a voice memo that is converted to an email or SMS text message and sent to one of your contacts.

I don't have much use for the first feature but the second is quite useful for sending text messages while in traffic instead of trying to futz around typing with T9.

The founders of Jott are John Pollard and Shreedhar Madhavapeddi who are both ex-Microsoft folks. I worked with Shree briefly as part of the MSN Windows Live Messenger server team before he left Microsoft. He was a smart guy and someone I regretted not working with more before he left the company. 

Press: Seattle Times on Jott Networks

Number of Employees: 5

Location: Fremont, WA

Jobs: jobs@jott.com, current open positions are for a VP of Marketing and a Software Development Engineer


 

February 25, 2007
@ 04:37 PM

Twice this week, I've been impressed by how some rant I made in my blog turned into an implemented feature in Web software that hundreds of thousands of people use. The first incident comes from my blog post Why Feedburner Doesn't Count Outlook 2007 Subscribers where I wrote about the fact that FeedBurner doesn't track subscribers to my feed who're using Outlook 2007 because it uses the same user agent string as Internet Explorer 7. Thus I was pleasantly surprised when I logged into FeedBurner and saw the following

It seems that while I was on vacation the folks at FeedBurner decided to implement a solution to the problem I pointed out even though it isn't their fault. Nice.

The second incident comes in response to my post MSN SoapBox in Public Beta where I mentioned that neither Google Reader nor Bloglines would display videos from MSN Soapbox embedded in a blog post. Yesterday Mihai Parparita who works on Google Reader let me know that they added support for that while I was on vacation. That means if you are reading this in Google Reader you should see a video on the next line

Thanks to the Google Reader team for implementing this so quickly.


 

February 23, 2007
@ 04:36 PM

I'm back from vacation at Disneyland. My pictures are at http://www.flickr.com/photos/carnage4life

What did I miss?


 

Categories: Personal

February 19, 2007
@ 07:36 PM

I should be on my way to the airport but this was just too good to share. Below are the opening paragraphs of a LiveJournal post by chalain entitled So Beautiful, So Disturbing

I wake. For a moment, I stare at the ceiling trying to remember something. Something important. Something important happened last night, but the details escape me. Something fascinating yet sinister, like touring the CIA offices. Something exotic yet somehow familiar, like putting hot sauce on meatloaf. I wonder if I have a hangover. I wonder why I am thinking about the CIA and meatloaf. I roll onto my side.

There is a strange woman in bed with me.

A lot of things happen at once. First, I realize that this is the most beautiful woman I have ever seen, and I am a lucky, lucky man. Second, I realize that this is not my wife, and I panic. Third, I realize that she's awake, has been watching me sleep. Fourth, before I can really react to thoughts 1 and 2, she smiles at me and speaks with a lovely accent I can't quite place: "So. You like new wife, yes? Yes. Up now, I make breakfast."

She gets out of bed and stretches, perfect curves sliding under silky lingerie and momentarily making me forget about breakfast, meatloaf, and whoever it was I was married to before last night. She seems to know this, and smiles at me again, but apparently she's serious about making breakfast. She turns and strides confidently from the room. As she does, I see for the first time the large Microsoft logo splayed across her back. My stomach lurches as I suddenly remember everything.

Windows Vista. I bought a new computer yesterday... and it came with Windows Vista.

Read the entire thing here. It's pretty good stuff and it's kinda cool that software can evoke such positive and negative emotions from its users.


 

February 19, 2007
@ 04:28 AM

Mary Jo Foley has a blog post entitled Ballmer’s list: Microsoft’s CEO shares his top nine Microsoft growth picks where she writes

Ballmer's guaranteed nine growth spots:

1. Windows client revenues from OEMs (PC makers and system builders)

2. "Desktop value" revenues derived from corporations (big enough to have an IT department). This sounded like Office revenues

3. Server revenues — Windows Server, database, security products. Ballmer said he sees this as an arena where Microsoft has a good opportunity to grow its business vis-a-vis Linux

4. "Mature desktops" — i.e., add-on revenues in corporations where there's already some penetration of Windows and Office. Client-access licenses are a key growth driver here.

5. Emerging market savings — especially due to Genuine Advantage Initiative anti-piracy crackdown campaigns/mechanisms

6. Advertising — especially via adCenter, Microsoft's online ad system — and the properties fueled by it

7. Xbox, particularly in dollars derived from Xbox Live, attached hardware and attached software

8. Sales of Office to small businesses and consumers

9. Windows Mobile operating system sales to cell-phone and PDA makers
...
I was surprised that Windows Live — supposedly one of Microsoft's most important strategic efforts — didn't make either of Ballmer's lists. Ballmer did mention services, but talked about it more from a platform perspective, than as a bunch of individual point products.

Am I the only one who's wondering why Mary Jo Foley didn't realize that #6 refers to Windows Live?


 

Categories: Windows Live

While everyone else was raving about the fact that Feedburner can now count RSS subscribers coming from Google Reader, I've been noticing that there was another discrepancy in the Feedburner data that didn't seem to be accounted for. Below is a screenshot of the number of hits from Web browsers on my RSS feed

It seems pretty unlikely that people have clicked on my RSS feed over 5000 times today. At first I thought Feedburner was miscounting feeds that had been subscribed from IE 7 but a quick look in Fiddler shows that IE 7 requests feeds using Windows-RSS-Platform as the User-Agent and is correctly counted by Feedburner.

So I sent some mail to Eric Lunt who's a co-founder and the CTO of Feedburner to see if he knew what was wrong. He let me know that the problem is that Outlook 2007 doesn't identify itself in the User-Agent string and instead pretends to be Internet Explorer 7. This means there is no way to separate out accesses of your feed from Outlook 2007 from people clicking on your feed in IE 7.

This seems like a fairly rookie mistake to ship in a bigtime product like Outlook. I don't have the latest version installed so I can't confirm that this is truly the case but if it is I hope they plan to fix this soon. It's really lame to not identify your product correctly in the User-Agent string.
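To see why a stats service can't tell the two apart, here is a minimal sketch of User-Agent based bucketing. The `classify` function and its rules are hypothetical and are not how Feedburner actually works, and the sample strings are only illustrative of the general shape of these headers; exact values vary by version and operating system.

```python
# Illustrative User-Agent values; the exact strings vary by version and OS.
IE7_UA = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
RSS_PLATFORM_UA = "Windows-RSS-Platform/1.0 (MSIE 7.0; Windows NT 6.0)"
# Per the explanation above, Outlook 2007 sends the same string as IE 7.
OUTLOOK_2007_UA = IE7_UA


def classify(user_agent):
    """Bucket a feed request by its User-Agent header (hypothetical rules)."""
    if user_agent.startswith("Windows-RSS-Platform"):
        return "IE 7 feed subscription"
    if "MSIE 7.0" in user_agent:
        return "IE 7 browser"
    return "other"


for name, ua in [("IE 7 click", IE7_UA),
                 ("IE 7 feed platform", RSS_PLATFORM_UA),
                 ("Outlook 2007", OUTLOOK_2007_UA)]:
    print(name, "->", classify(ua))
```

Since the header is the only signal in this sketch, a poll from Outlook 2007 and a click in the browser land in the same bucket, which would inflate the browser hit count in the way described above.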
...
Oops. I should have done a search before sending out mail. It looks like this was already covered in a blog post entitled Outlook, RSS, & the user-agent string by Michael Affronti who was the PM for RSS in Outlook 2007. He wrote

For Outlook 2007 we will unfortunately not be able to report any custom user agent string for our RSS aggregation.  Due to the way we integrate with IE across many parts of the application (the WININET stack is the underlying infrastructure for all of Outlook’s internet communication), we cannot easily and safely change the way we broadcast ourselves when connecting to external servers.  To do so would require a fundamental change in the way the WININET stack is called from Outlook and could affect all of the Office applications.  The scope of this fix is unfortunately outside of what we can provide this release.

I guess this won't be fixed anytime soon, if ever. Anyway, I hope this post helps out other users of Feedburner who've also been curious about their weird number of hits supposedly from IE 7.