I've been spending some time over the past couple of months thinking about Web services and Web APIs. Questions like when a web site should decide to expose an API, what form the API should take and what technologies/protocols should be used are topics I've rehashed quite a lot in my head. Recently I came to the conclusion that if one was going to provide a Web service that is intended to be consumed by as many applications as possible, then one should consider exposing the API using multiple protocols. I felt that at least two protocols should be chosen: SOAP over HTTP (for the J2EE/.NET crowd) and Plain Old XML (POX) over HTTP (for the Web developer crowd).

However, I've recently started spending a bunch of time writing Javascript code for various Windows Live gadgets and I've begun to appreciate the simplicity of using JSON over parsing XML by hand in my gadgets. I've heard similar comments echoed by co-workers such as Matt who's been spending a bunch of time writing Javascript code for Live Clipboard and Yaron Goland who's one of the minds working on the Windows Live developer platform. JSON has similar goals to XML-RPC and W3C XML schema in that it provides a platform agnostic way to transfer data which is encoded as structured types consisting of name<->value pairs and collections of name<->value pairs. It differs from XML-RPC by not getting involved with defining a mechanism for remote procedure calls and from W3C XML schema by being small, simple and focused.
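To make the comparison concrete, here is a minimal sketch in Python (my gadgets are in Javascript, but the shape of the problem is the same). The sample record and its element names are made up for illustration. The JSON text maps straight onto name/value pairs and collections, while the XML version has to be walked by hand to get the same structure back.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample data: the same record in both serializations.
json_text = '{"title": "Example post", "tags": ["json", "xml"]}'
xml_text = (
    "<post><title>Example post</title>"
    "<tags><tag>json</tag><tag>xml</tag></tags></post>"
)

# JSON: one call yields dictionaries, lists and simple values.
from_json = json.loads(json_text)

# XML: the structure has to be picked apart element by element.
root = ET.fromstring(xml_text)
from_xml = {
    "title": root.findtext("title"),
    "tags": [tag.text for tag in root.findall("tags/tag")],
}

same_data = from_json == from_xml
```

The XML half is not hard, but every new field means another line of hand-written extraction code, which is the hassle I'm talking about.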

Once you start using JSON in your AJAX apps, it gets pretty addictive and it begins to seem like a hassle to parse XML even when it's just plain old XML such as RSS feeds, not complex crud like SOAP packets. However, being an XML geek, there are a couple of things I miss from XML that I'd like to see in JSON, especially if its usage will grow to become as widespread as XML is on the Web today. Yaron Goland feels the same way and has started a series of blog posts on the topic.

In his blog post entitled Adding Namespaces to JSON Yaron Goland writes

The Problem

If two groups both create a name "firstName" and each gives it a different syntax and semantics how is someone handed a JSON document supposed to know which group's syntax/semantics to apply? In some cases there might be enough context (e.g. the data was retrieved from one of the group's servers) to disambiguate the situation but it is increasingly common for distributed services to be created where the original source of some piece of information can trivially be lost somewhere down the processing chain. It therefore would be extremely useful for JSON documents to be 'self describing' in the sense that one can look at any name in a JSON document in isolation and have some reasonable hope of determining if that particular name represents the syntax and semantics one is expecting.

The Proposed Solution

It is proposed that JSON names be defined as having two parts, a namespace name and a local name. The two are combined as namespace name + "." + local name to form a fully qualified JSON name. Namespace names MAY contain the "." character. Local names MUST NOT contain the "." character. Namespace names MUST consist of the reverse listing of subdomains in a fully qualified DNS name. E.g. org.goland or com.example.bigfatorg.definition.

To enable space savings and to increase both the readability and write-ability of JSON a JSON name MAY omit its namespace name along with the "." character that concatenated it to its local name. In this case the namespace of the name is logically set to the namespace of the name's parent object. E.g.

{ "org.goland.schemas.projectFoo.specProposal" : {
"title": "JSON Extensions",
"author": { "firstName": "Yaron",
"com.example.schemas.middleName":"Y",
"org.goland.schemas.projectFoo.lastName": "Goland"
}
}
}

In the previous example the name firstName, because it lacks a namespace takes on its parent object's namespace. That parent is author which also lacks a namespace so recursively author looks to its parent specProposal which does have a namespace, org.goland.schemas.projectFoo. middleName introduces a new namespace "com.example.schemas", if the value was an object then the names in that object would inherit the com.example.schemas namespace. Because the use of the compression mechanism is optional the lastName value can be fully qualified even though it shares the same namespace as its parent.
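The inheritance rule in the quoted proposal can be sketched in a few lines of Python. This is only my reading of the proposal, not code from Yaron's post: the function names `qualify` and `resolve` are hypothetical, and letting array elements inherit the enclosing member's namespace is an assumption, since the excerpt doesn't cover arrays.

```python
import json

def qualify(name, parent_ns):
    """Split a JSON name into (namespace, local name).

    Local names must not contain ".", so everything before the last "."
    is the namespace. An unqualified name inherits parent_ns.
    """
    if "." in name:
        namespace, local = name.rsplit(".", 1)
        return namespace, local
    return parent_ns, name

def resolve(value, parent_ns=None):
    """Return a copy of value with every member name fully qualified."""
    if isinstance(value, dict):
        resolved = {}
        for name, child in value.items():
            namespace, local = qualify(name, parent_ns)
            full_name = f"{namespace}.{local}" if namespace else local
            # Children inherit the namespace of this member.
            resolved[full_name] = resolve(child, namespace)
        return resolved
    if isinstance(value, list):
        # Assumption: array elements inherit the enclosing namespace.
        return [resolve(item, parent_ns) for item in value]
    return value

doc = json.loads("""
{ "org.goland.schemas.projectFoo.specProposal": {
    "title": "JSON Extensions",
    "author": { "firstName": "Yaron",
                "com.example.schemas.middleName": "Y",
                "org.goland.schemas.projectFoo.lastName": "Goland" }
} }
""")
resolved = resolve(doc)
```

After the call, title, author and firstName are all spelled out under org.goland.schemas.projectFoo, while middleName keeps its own com.example.schemas namespace.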

My main problem with the above approach is echoed by the first comment in response to Yaron's blog post; the above defined namespace scheme isn't completely compatible with XML namespaces. This means that if I have a Web service that emits both XML and JSON, I'll have to use different namespace names for the same elements even though all that differs is the serialization format. Besides the disagreement on the syntax of the namespace names, I think this would be a worthwhile addition to JSON.

In another blog post entitled Adding Extensibility to JSON Data Formats Yaron Goland writes

The Problem

How does one process JSON messages so that they will support both backwards and forwards compatibility? That is, how does one add new content into an existing JSON message format such that those who do not understand the extended content will be able to safely ignore it?

The Proposed Solution

In the absence of additional information providing guidance on how to handle unrecognized members a JSON processor compliant with this proposal MUST ignore any members whose names are not recognized by the processor.

For example, if a processor was expecting to receive an object that contained a single member with the name "movieTitle" and instead it receives an object with multiple members including "movieTitle", "producer" and "director" then the JSON processor would, by default, act as if the "producer" and "director" members were not present.

An exception to this situation would be a member named "movie" whose value is an object where the semantics of the members of that object is "the local name of the members of this object are suitable for presenting as titles and their values as text under those titles". In that case regardless of the processor's direct knowledge of the semantics of the members of the object (e.g. the processor may actually know about movieTitle but not "producer" or "director") the processor can still process the unrecognized members because it has additional information about how to process them.

This requirement does not apply to incorrect usage of recognized names. For example, if the definition of an object only allowed a single "movieTitle" member then having two "movieTitle" members is simply an error and the ignore rule does not apply.

This specification does not require that ignored members be removed from the JSON structure. It is quite possible that other processors who will deal with the message may recognize members the current processor does not. Therefore it would make sense to let unrecognized members remain in the JSON structure so that others who process the structure may benefit from the extended information.
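A rough Python sketch of the "must ignore" rule as quoted above. The set `KNOWN_MEMBERS` and the functions `reject_duplicates` and `process` are hypothetical names for illustration. Note one simplification: the sketch treats any repeated member name as an error, whereas the proposal only talks about incorrect usage of recognized names.

```python
import json

# Hypothetical: the only member this processor understands.
KNOWN_MEMBERS = {"movieTitle"}

def reject_duplicates(pairs):
    """Build an object from (name, value) pairs, failing on repeated names."""
    members = {}
    for name, value in pairs:
        if name in members:
            raise ValueError(f"duplicate member: {name}")
        members[name] = value
    return members

def process(text):
    """Return (members this processor understands, the full message).

    Unrecognized members are ignored but left in the message so that
    processors further down the chain can still use them.
    """
    message = json.loads(text, object_pairs_hook=reject_duplicates)
    understood = {
        name: value for name, value in message.items()
        if name in KNOWN_MEMBERS
    }
    return understood, message

understood, message = process(
    '{"movieTitle": "Example", "producer": "P", "director": "D"}'
)
```

Here `understood` holds only movieTitle, while `message` still carries producer and director untouched.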

Definition: Simple value - A value of a type other than array or object.

If a JSON processor encounters an array where it had expected to encounter a simple value the processor MUST retrieve the first simple value in the array and treat that as the value it was expecting and ignore the other elements in the array.
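The array rule is simple enough to sketch as well. The helper name `first_simple_value` is hypothetical, and the sketch assumes "first simple value in the array" means the first top-level element that is neither an array nor an object; the excerpt doesn't say what to do when no such element exists, so raising an error there is my own choice.

```python
def first_simple_value(value):
    """Return value itself if it is simple, else the first simple element.

    A simple value is anything other than an array (list) or object (dict).
    Only top-level elements of the array are considered (an assumption).
    """
    if isinstance(value, list):
        for item in value:
            if not isinstance(item, (list, dict)):
                return item
        # Not covered by the quoted text; failing loudly is an assumption.
        raise ValueError("array contains no simple value")
    return value

# A processor expecting a single title still works if a later version
# of the format sends several.
title = first_simple_value(["First title", "Second title"])
```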

Again, it looks like I'm going to go ahead and parrot the same feedback as a commenter to the original blog post. Defining an extensibility model where simple types can be converted to arrays in a future version seems like overkill and unnecessary complexity. It's not like it's that hard to add another field to the type. The other thing I wondered about this blog post is that it seemed to define a problem set that doesn't really exist. It's not like there are specialized JSON parsers in widespread use that barf if they see a field that they don't understand. Requiring that the fields of various types be defined up front or else barfing when encountering undefined fields over the wire is primarily a limitation of statically typed languages and isn't really a problem for dynamic languages like JavaScript. Or am I missing something?


 

Categories: XML Web Services

August 14, 2006
@ 07:24 PM

I've been late to blog about this because I was out on vacation but better late than never. J.J. Allaire (yes, that one) has a blog post entitled Introducing Windows Live Writer which announces Microsoft's desktop blogging tool called Windows Live Writer. He writes

Introducing Windows Live Writer

Welcome to the Windows Live Writer team blog! We are excited to announce that the Beta version of Windows Live Writer is available for download today.

Windows Live Writer is a desktop application that makes it easier to compose compelling blog posts using Windows Live Spaces or your current blog service. Blogging has turned the web into a two-way communications medium. Our goal in creating Writer is to help make blogging more powerful, intuitive, and fun for everyone. Writer has lots of features which we hope make for a better blogging experience. Some of the ones we are most excited about include:

WYSIWYG Authoring

 The first thing to notice about Writer is that it enables true WYSIWYG blog authoring. You can now author your post and know exactly what it will look like before you publish it. Writer knows the styles of your blog such as headings, fonts, colors, background images, paragraph spacing, margins and block quotes and enables you to edit your post using these styles. ...

Photo Publishing

Writer makes inserting, customizing, and uploading photos to your blog a snap. You can insert a photo into your post by browsing image thumbnails through the “Insert Picture” dialog or by copying and pasting from a web page...Photos can be either uploaded directly to your weblog provider (if they support the newMediaObject API) or to an FTP server.

Writer SDK

 Already thinking of other cool stuff you want to insert into your blog? Good!

The Windows Live Writer SDK allows developers to extend the capabilities of Writer to publish additional content types. Examples of content types that can be added include:

  1. Images from online photo publishing sites
  2. Embedded video or audio players
  3. Product thumbnails and/or links from e-commerce sites
  4. Tags from tagging services

This is one project I've been dying to blog about for months. Since I was responsible for the blogging and the upcoming photo publishing APIs for Windows Live Spaces, I've spent the last couple of weeks working with the team to make sure that the user experience when using Windows Live Writer and Windows Live Spaces is great. I'd like to hear if you think we've done a good job.

PS: The application is not only chock full of features but is also very extensible. Tim Heuer has already written plugins to enable integration with Flickr and added tagging support. If you are a developer, you should also download the SDK and see what extensions you can hack into the app.


 

Categories: Windows Live

In my previous post, I talked about some of the issues I saw with the idea of doing away with operations teams and merging their responsibilities into the development team's tasks [as practised at companies like Amazon]. Justin Rudd, who is a developer at Amazon, posts his first-hand perspective of this practice in his blog post entitled Expanding on the Pain where he writes

Since I am a current employee of Amazon in the software development area, I probably shouldn’t be saying this, but…
...

First a few clarifications - there is no dedicated operations team for Amazon as a whole that is correct.  But each team is allowed to staff as they see fit.  There are teams within Amazon that have support teams that do handle quite a bit of the day to day load.  And their systems tend to be more “smooth” because this is what that team does - keep the system up and running and automate keeping the system up and running so they can sleep at night.

There are also teams dedicated to networking, box failures, etc.  So don’t think that developers have to figure out networking issues all the time (although there are sometimes where networking doesn’t see a problem but it is affecting a service).

Now for those teams that do not have a support team (and I am on one of them), at 3 in the morning you tend to do the quickest thing possible to get the problem rectified.  Do you get creative?  After being in bed for 3 hours (if you’re lucky) and having a VP yell at you on the phone that this issue is THE most important issue there is or having someone yell at you that they are going to send staff home, how creative do you think you can be?  Let me tell you, not that creative.  You’re going to solve the problem, make the VP happy (or get the factory staff back to work), and go back to bed with a little post it note to look for root cause of the problem in the AM.

Now 1 of 2 things happens.  If you have a support team, you let them know about what happened, you explain the symptoms that you saw, how you fixed it, etc.  They take your seed of an idea, plant it, nurture it, and grow it.

If you don’t have a support team and you are lucky, in the morning there won’t be another THE most important thing to work on and you can look at the problem with some sleep and some creativity.  But the reality is - a lot of teams don’t have that luxury.  So what happens?  You end up cronning your solution which may be to bounce your applications every 6 hours or run a perl script that updates a field at just the right place in the database, etc.

We all have every intention of fixing it, but remember that VP that was screaming about how this issue had to be fixed?  Well now that it isn’t an issue anymore and it’s off his radar screen, he has new features that he wants pushed into your code.  And those new features are much more important than you fixing the issue from the night before because the VP really doesn’t care if you get sleep or not at night.

Justin's account jibes with other accounts I've heard [second hand] from other ex-Amazon developers about what it means to live without an operations team. Although it sounds good on paper to have the developers responsible for writing the code also responsible when there are issues with the code on the live site, it leads to burning the candle at both ends. Remember, division of labor exists for a reason.
 

Categories: Web Development

A few weeks ago, I bookmarked a post from Sam Ruby entitled Collapsing the Stack where he wrote

Werner Vogels: Yep, the best way to completely automate operations is to have the developers be responsible for running the software they develop. It is painful at times, but also means considerable creativity gets applied to a very important aspect of the software stack. It also brings developers into direct contact with customers and a very effective feedback loop starts. There is no separate operations department at Amazon: you build it; you run it.

Sounds like a very good idea.

I don't see how this sounds like a good idea. This reminds me of a conversation I once had with someone at Microsoft who thought it would be a good idea to get rid of their test team and replace them all with developers once they moved to Test Driven Development. I used to be a tester when I first joined Microsoft and this seemed to me to be the kind of statement made by someone who assumed that the only thing testers do is write unit tests. Good test teams don't just write unit tests. They develop and maintain test tools. They perform system integration testing. They manage the team's test beds and test labs. They are the first line of defence when attempting to reproduce customer bug reports before pulling in developers who may be working on your next release. All of this can be done by the development team but it means that your developers spend less time developing and more time testing. This cost will show up either as an increase in the amount of time it takes to get to market or a reduction in quality if schedules are not adjusted to account for this randomization of the development team. Eventually you'll end up recreating your test team so there are specific people responsible for test-related activities [which is why software companies have test teams in the first place].

The same reasoning applies to the argument for folding the responsibilities of your operations team into the development team's tasks. A good operations team isn't just responsible for deployment/setup of applications on your servers and monitoring the health of the Web servers or SQL databases in your web farm. A good operations team is involved in designing your hardware SKUs and understanding your service's peak capacity so as to optimize purchase decisions. A good operations team makes the decisions around your data centers from picking locations with the best power prices and ensuring that you're making optimal use of all the physical space in your data center to making build . A good operations team is the first line of defence when your service is being hit by a Denial of Service attack. A good operations team insulates the development team from worrying about operating system, web server or database patches being made to the live site. A good operations team is involved in the configuration, purchase, deployment and [sometimes] development of load balancing, database partitioning and database replication tools. Again, you can have your development team do this but eventually it would seem to make sense that these tasks be owned by specific individuals instead of splitting them across the developers building one's core product.

PS: I've talked to a bunch of folks who know ex-Amazon developers and they tend to agree with my analysis above. I'd be interested in getting the perspective of ex-Amazon developers like Greg Linden on replacing your operations team with your core development staff.

PPS: Obviously this doesn't apply if you are a small 2 to 5 person startup. Everyone ends up doing everything anyway. :)


 

Categories: Web Development

August 8, 2006
@ 08:30 PM

This morning I got an IM from Niall Kennedy letting me know that he was Leaving Microsoft. He begins his blog post about leaving by writing

I am leaving Microsoft to start my own company. My last day at Microsoft is next Friday, August 18. It's uncertain whether Microsoft will continue the feed platform work I started, but it's some good stuff so I hope they do.

As the person who referred Niall to the company and gave him some advice when he was weighing whether to join Windows Live, I am sad to see him leave so soon. I sympathize with his reasons for leaving although some of what he wrote is inaccurate and based on speculation rather than the actual facts of the matter. That said, I found Niall to be quite knowledgeable, smart and passionate, so I expect him to do well in his endeavors.

Good luck, Niall.


 

Don Demsak has a blog post entitled Open Source Projects As A Form Of Community Service which links to a number of blog posts about the death of the NDoc project. He writes

Open source projects have been the talk of the tech blogs recently with the announcement that NDoc 2 is Offcially Dead, along with the mention that the project's sole developer was a victim of an automated mail-bomb attack because the project wasn't getting a .Net 2.0 version out fast enough for their liking.  Kevin has decided to withdraw from the community, and fears for himself and his family.  The .Net blogging community has had a wide range of reactions:

  • Phil Haack talks about his ideas behind helping/saving the open source community and laid down a challenge. 
  • Eric Wise mentions that he will not work on another FOSS project. 
  • Scott Hanselman laments that Microsoft hasn't put together an Ineta like organization to handle giving grants to open source projects, and also shows how easy it is to submit a patch/fix to a project.
  • Peter Provost worries that bringing money into the equation may spoil the cool part of community developed software, and that leadership is the key to good open source projects.
  • Derek Denny-Brown says that "Microsoft needs to understand that Community is more than just lots of vendors creating commercial components, or MVPs answering questions on newsgroups".

I've been somewhat disappointed by the Microsoft developer division's relationship with Open Source projects based on the .NET Framework and its attitude towards source code availability in general. Derek Denny-Brown's post entitled Less Rambling Venting about Developing for .Net hit the nail on the head for me. There are a number of issues with the developer community around Visual Studio and the .NET Framework that are raised in Derek's post and the others mentioned above. The first is what seems like a classic case of Not Invented Here (NIH): Microsoft has not only failed to support Open Source projects that fill useful niches in the Visual Studio ecosystem but has eventually competed with them (NAnt vs. MSBuild, NUnit vs. Visual Studio Team System and now Sandcastle vs. NDoc). My opinion is that this is a consequence of Microsoft's strategy of integrated innovation which encourages Microsoft's product teams to pursue a seamless end-to-end experience where every software application in the business process is a Microsoft product.

Another issue is Microsoft's seeming ambivalence and sometimes antipathy towards Open Source software. This is related to the fact that the ecosystem around Microsoft's software platforms (i.e. customers, ISVs, etc) is heavily tilted towards commercial software development. Or is that vice versa? Either way, commercial software developers tend to view Open Source as the bane of their existence. This is unfortunate given that practically every major software development platform that the .NET Framework and Visual Studio compete with is either Open Source (e.g. PHP, Perl, Python, Ruby) or at the very least encourages source code availability (e.g. Java). Quite frankly, I personally would love to see the .NET Framework class libraries be Open Source or at the very least have the source code available in the same way Sun has done with the JDK. I know that there is the Shared Source Common Language Infrastructure (SSCLI) which I have used on occasion when having issues during RSS Bandit development but it isn't the same.

So we have a world where the developer community around Microsoft's products is primarily interested in building and using commercial software while the company pursues an integration strategy that guarantees that it will compete with projects that add value on its platform. The questions then are whether this is a bad thing and if so, how do we fix it?


 

August 8, 2006
@ 01:29 AM

Like every other company out there, Microsoft likes to encourage employees to refer people they know who might be a good fit to join the company. It seems the HR department sent out a mass mailing last week and guess whose ugly mug was used as part of the campaign?

Front

Back

I assume this was just reusing the stock photo from my page on the Microsoft Careers site as opposed to an exhortation to Microsoft employees to hire more people like me. We only need so many paper pushing PMs. ;)

It's still pretty sweet though.


 

Categories: Life in the B0rg Cube

August 6, 2006
@ 04:47 PM

I've seen Yochai Benkler mentioned twice in the past few weeks in blogs I read semi-regularly. It seems that he recently tangled with Jason Calacanis over his attempt to pay top contributors of social bookmarking sites like Digg to use his service. Jason Calacanis documents their encounter in his post entitled Calacanis vs. Benkler Round One. Yochai Benkler also posted a comment in Nick Carr's post which was then elevated to a post by Carr entitled . Below is an excerpt from Yochai Benkler's comment.

The reason is that the power of the major sites comes from combining large-scale contributions from heterogeneous participants, with heterogeneous motivations. Pointing to the 80/20 rule on contributions misses the dynamic that comes from being part of a large community and a recognized leader or major contributors in it, for those at the top, and misses the importance of framing this as a non-priced social process. Adding money alters the overall relationship. It makes some people "professionals," and renders other participants, "suckers." It is not impossible to mix paid and unpaid participants, as we see in free and open source software and even to a very limited extent in Wikipedia. It is just hard, and requires a cultural form that is definitely not "now at long last we can tell who's worth something and pay them, while everyone else is just worthless." What Calacanis is doing now with his posts about the top contributors to Digg is trying to alter the cultural interpretation of what they are doing: from leaders in an engaged community, to suckers who are being taken for a ride by Ross. Maybe he will succeed in raining on Digg's parade, though I doubt it, but that does not mean that he will succeed in building an alternative sustained social process of peer production, or in replacing peer production with a purely paid service. Once you frame the people who aren't getting paid as poor sods being taken for a ride, for example, the best you can hope for is that some of the "leaders" elsewhere will come and become your low-paid employees (after all, what is $1,000 a month relative to the millions Calacanis would make if his plan in fact succeeds? At that point, the leaders are no longer leaders of a community, and they turn out to be suckers after all, working for pittance, comparatively speaking.)

I'm quite surprised to see Benkler mention and dismiss the example of Open Source software since what is happening between Calacanis and Digg seems to be history repeating itself. Back in the day, Open Source software like Linux was primarily built by hobbyists who worked on such projects in their free time without intention of getting financially rewarded. Later on, companies showed up that wanted to make money from Open Source software and there was a similar kind of angst to what we are seeing today about social bookmarking. If you want to take a trip down memory lane, go back and read all the comments on the various stories about the Redhat IPO on Slashdot to see the same kind of arguments and hurt feelings that you see in the arguments made by Benkler and from people such as the Backstabbed by Netscape blogger.

The fact is that since we are human, the 80/20 rule still applies when it comes to the value of the contributions by individuals. This means that it is beneficial to the 'community' if those that contribute the most value to the system are given as much incentive as possible to contribute. After all, I doubt that there is anyone who would argue that the fact that Linus Torvalds and Alan Cox are paid to work on Linux or that Miguel De Icaza is paid to work on Mono is harmful to the communities around these projects or that it makes the unpaid contributors to these projects "suckers".

I think where people are getting confused is that they are mixing up giving the most valuable contributors to the system more incentive with trying to incentivize the entire community with financial reward. They are not the same thing. Open Source projects wouldn't be successful if everyone contributing to a project did so with the expectation of being paid. On the flip side, Open Source projects benefit the most when the top contributors to the project can dedicate 100% of their efforts to the project without having to worry about a day job. That's the difference.

Also, Benkler seems to think that Whuffie (aka respect from the community) is a better incentive than money when it comes to influencing top contributors. I think that's a pipe dream that will only occur when we live in a world where money can't buy you anything such as the one in Cory Doctorow's Down and Out in the Magic Kingdom.


 

Categories: Social Software

August 4, 2006
@ 07:55 PM

Jason Fried over at the Signal vs. Noise blog has an entry entitled Don’t believe BusinessWeek’s bubble-math where he writes

This week’s BusinessWeek cover story features a beaming Kevin Rose from Digg. Across his chest it says “How this kid made $60 million in 18 months.” Wow, now that sounds like a great success story.

Too bad it’s a blatant lie. BusinessWeek knows it. They prove it themselves in the article:

So far, Digg is breaking even on an estimated $3 million annually in revenues. Nonetheless, people in the know say Digg is easily worth $200 million.

$3 million in revenues and they’re breaking even. That means no meaningful profits. That’s the first hint no one has made $60,000,000. Their gross revenues aren’t even anywhere close to that number. And let’s leave out the “people in the know say it’s easily worth” fantasy numbers. And certainly don’t use those numbers to do the math that makes the cover (we’ll get to that in a minute).

I can't believe BusinessWeek ran such a misleading cover story. I guess sensational, fact-less headlines aren't just for the supermarket tabloids these days.


 

Earlier this week I was showing off the new Windows Live Spaces to my girlfriend in an attempt to try and explain what I do at Microsoft all day. When I showed her the Friends list feature she was surprised that the site had morphed from a blogging service into a social networking service and wondered whether our users wouldn't react negatively to the change. Actually that's paraphrasing what she said. What she said was

If I was using your site to post my blog and my pictures for people I know, I'd be really annoyed if I started having to deal with strangers asking me to be their friend. If I wanted to deal with that crap I'd have just gone to MySpace.

That's valid criticism and is something the people who worked on the design of this feature (i.e. me, Mike, Neel, Matt, John and a bunch of others) took into account. One of the key themes of Windows Live is that it puts users in control of their Web experience. Getting repeated email requests to be some stranger's "friend" without a way to stop them doesn't meet that requirement. This is why there is a communications preferences feature in Windows Live Spaces which can be reached by clicking on http://[yourspacename].spaces.live.com/Settings/Communication/

Below is a screenshot of the page

Don't like getting friend requests as email? Disable this by unchecking 'Also send invitations and e-mails to my email address'. Don't want to deal with requests from total strangers wanting to be on your friends list? Then move the setting on 'These people can request to be on your friends list' to something more restrictive like 'Messenger buddies' so that you only get friend requests from people on your IM buddy list who you already know. You hang out a virtual DoNotDisturb sign and we'll honor it.

Making sure that our users have total control over who they communicate and share information with is key to how we approach building social software for Windows Live. Thanks to all the users of Windows Live Spaces who have made it the most popular blogging service in the world. 


 

Categories: Windows Live