October 19, 2008
@ 08:47 AM

Tim Bray has a thought-provoking post on embracing cloud computing entitled Get In the Cloud where he brings up the problem of vendor lock-in. He writes:

Tech Issue · But there are two problems. The small problem is that we haven’t quite figured out the architectural sweet spot for cloud platforms. Is it Amazon’s EC2/S3 “Naked virtual whitebox” model? Is it a Platform-as-a-service flavor like Google App Engine? We just don’t know yet; stay tuned.

Big Issue · I mean a really big issue: if cloud computing is going to take off, it absolutely, totally, must be lockin-free. What that means is that if I’m deploying my app on Vendor X’s platform, there have to be other vendors Y and Z such that I can pull my app and its data off X and it’ll all run with minimal tweaks on either Y or Z.

At the moment, I don’t think either the Amazon or Google offerings qualify.

Are we so deranged here in the twenty-first century that we’re going to re-enact, wide-eyed, the twin tragedies of the great desktop-suite lock-in and the great proprietary-SQL lock-in? You know, the ones where you give a platform vendor control over your IT budget? Gimme a break.

I’m simply not interested in any cloud offering at any level unless it offers zero barrier-to-exit.

Tim's post is about cloud platforms, but I think it is useful to talk about avoiding lock-in when taking a bet on cloud-based applications as well as when embracing cloud-based platforms. This is especially true when you consider that moving from one application to another is a similar but smaller-scoped problem compared to moving from one Web development platform to another.

So let's say your organization wants to move from a cloud-based office suite like Google Apps for Business to Zoho. The first question you have to ask yourself is whether it is possible to extract all of your organization's data from one service and import it into another without data loss. For business documents this should be straightforward thanks to standards like ODF and OOXML. However, there are points to consider, such as whether there is an automated way to perform such bulk imports and exports or whether individuals have to manually export and import their online documents in these standard formats. Thus the second question is how expensive it is for your organization to move the data. The cost includes everything from the potential organizational downtime while switching services to the actual IT department cost of moving all the data.

At this point, you have to weigh the impact of all the links and references to your organization's data that will be broken by your switch. I don't just mean links to documents returning 404 because you have switched from being hosted at google.com to zoho.com, but more insidious problems like the broken experience of anyone who is using the calendar or document sharing features of the service to give specific people access to their data. You also have to ensure that email sent to your organization after the switch goes to the right place. Making this aspect of the transition smooth will likely be the most difficult part of the migration, since it requires more control over application resources than application service providers typically give their customers.

Finally, you will have to evaluate which features you will lose by switching applications and ensure that none of them is mission critical to your business.

Despite all of these concerns, switching hosted application providers is mostly a tractable problem. Standard data formats make data migration feasible, although it might be unwieldy to extract the data from the service. In addition, Internet technologies like SMTP and HTTP have built-in ways to handle forwarding and redirecting references so that they aren't broken. However, although the technology makes it possible, the majority of hosted application providers fall far short of making it easy to fully migrate to or away from their service without significant effort.
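To make the HTTP half of that concrete, here is a minimal Python sketch of a server that answers requests for old document URLs with a permanent redirect to their new home. It assumes you still control the old hostname, which is exactly the control hosted providers rarely hand over. The paths, the `docs.new-provider.example` host, and the names `MOVED`, `redirect_target` and `serve` are all made up for illustration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical new home for the migrated documents.
NEW_HOST = "https://docs.new-provider.example"

# Hypothetical map from paths on the old service to paths on the new one.
MOVED = {
    "/docs/q3-plan": "/documents/q3-plan",
    "/calendar/team": "/cal/team",
}


def redirect_target(path):
    """Return the full new URL for an old path, or None if it was never moved."""
    new_path = MOVED.get(path)
    return None if new_path is None else NEW_HOST + new_path


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # self.path includes any query string; this sketch matches it literally.
        target = redirect_target(self.path)
        if target is None:
            self.send_error(404, "Not Found")
            return
        # 301 tells browsers and crawlers that the move is permanent.
        self.send_response(301)
        self.send_header("Location", target)
        self.end_headers()


def serve(port=8080):
    """Run the redirector on localhost until interrupted."""
    HTTPServer(("localhost", port), RedirectHandler).serve_forever()
```

Calling `serve()` starts the redirector locally. The hard part in practice is not this code but getting the old provider to let you run anything like it on their domain.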

When it comes to cloud computing platforms, you have all of the same problems described above and a few extra ones. The key wrinkle with cloud computing platforms is that there is no standardization of the APIs and platform technologies that underlie these services. The APIs provided by Amazon's cloud computing platform (EC2/S3/EBS/etc) are radically different from those provided by Google App Engine (Datastore API/Python runtime/Images API/etc). For zero lock-in to occur in this space, there need to be multiple providers of the same underlying APIs. Otherwise, migrating between cloud computing platforms will be more like switching your application from Ruby on Rails and MySQL to Django and PostgreSQL (i.e. a complete rewrite).
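As a rough illustration of what "multiple providers of the same underlying APIs" buys you, the Python sketch below codes a migration against one small storage contract and supplies two unrelated implementations of it. Everything here is hypothetical: `BlobStore`, `MemoryStore`, `DirStore` and `migrate` are names I made up, and neither class resembles the real S3 or App Engine Datastore APIs. The two classes simply stand in for two vendors.

```python
import tempfile
from pathlib import Path
from typing import Dict, List, Optional, Protocol


class BlobStore(Protocol):
    """Hypothetical storage contract that the application codes against."""

    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...
    def keys(self) -> List[str]: ...


class MemoryStore:
    """Stand-in for vendor X: keeps blobs in a dictionary."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        return self._blobs[key]

    def keys(self) -> List[str]:
        return list(self._blobs)


class DirStore:
    """Stand-in for vendor Y: keeps each blob as a file in one directory."""

    def __init__(self, root: Optional[str] = None) -> None:
        # Use a fresh temporary directory when no root is given.
        self._root = Path(root or tempfile.mkdtemp())
        self._root.mkdir(parents=True, exist_ok=True)

    def _path(self, key: str) -> Path:
        # Hex-encode the key so slashes and other characters are file-name safe.
        return self._root / ("k" + key.encode("utf-8").hex())

    def put(self, key: str, data: bytes) -> None:
        self._path(key).write_bytes(data)

    def get(self, key: str) -> bytes:
        return self._path(key).read_bytes()

    def keys(self) -> List[str]:
        return [bytes.fromhex(p.name[1:]).decode("utf-8") for p in self._root.iterdir()]


def migrate(src: BlobStore, dst: BlobStore) -> int:
    """Copy every blob from src to dst and return how many were copied."""
    count = 0
    for key in src.keys():
        dst.put(key, src.get(key))
        count += 1
    return count
```

Because `migrate` only touches the shared contract, moving data from one store to the other is a loop rather than a rewrite. Without such a shared contract across vendors, every call site that talks to storage is part of the migration cost.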

In response to Tim Bray's post, Dewitt Clinton of Google left a comment, which is excerpted below:

That's why I asked -- you can already do that in both the case of Amazon's services and App Engine. Sure, in the case of EC2 and S3 you'll need to find a new place to host the image and a new backend for the data, but Amazon isn't trying to stop you from doing that. (Actually not sure about the AMI format licensing, but I assumed it was supposed to be open.)

In App Engine's case people can run the open source userland stack (which exposes the API you code to) on other providers any time they want, and there are plenty of open source bigtable implementations to choose from. Granted, bulk export of data is still a bit of a manual process, but it is doable even today and we're working to make it even easier.

Are you saying that lock-in is avoided only once the alternative hosts exist?

But how does Amazon or Google facilitate that, beyond getting licensing correct and open sourcing as much code as they can? Obviously we can't be the ones setting up the alternative instances. (Though we can cheer for them, like we did when we saw the App Engine API implemented on top of EC2 and S3.)

To Doug Cutting's very good point, the way Amazon and Google (and everyone else in this space) seem to be trying to compete is by offering the best value, in terms of reliability (uptime, replication) and performance (data locality, latency, etc) and monitoring and management tools. Which is as it should be.

Although Dewitt is correct that Google and Amazon are not explicitly trying to lock customers in to their platforms, the fact is that today a customer who has invested heavily in either platform has no straightforward way to extricate themselves and switch to another vendor. In addition, there is not a competitive marketplace of vendors providing standard, interoperable platforms as there is with email hosting or Web hosting providers.

As long as these conditions remain the same, it may be that lock-in is too strong a word to describe the situation, but it is clear that the options facing adopters of cloud computing platforms aren't great when it comes to vendor choice.

Now Playing: Britney Spears - Womanizer


Sunday, October 19, 2008 6:03:53 PM (GMT Daylight Time, UTC+01:00)
Had you noticed that VMware has created an ecosystem of vendors that can run virtual machines with complete portability? You can import a physical machine or a number of VM formats including VMware and Microsoft. Numerous OSes are supported including Windows and Linux. This seems to me to be about as open and portable as possible.
Matthew Kane
Sunday, October 19, 2008 6:40:38 PM (GMT Daylight Time, UTC+01:00)
I agree that if you stick to only using VM hosting and avoid proprietary technologies (e.g. Google App Engine or Amazon's S3/SQS APIs) then migration becomes more straightforward. This is similar to the argument made in Bob Warfield's post Amazon Cloud: Lock-In or No Lock-In?.

You can definitely sidestep the issue of lock-in if you make the technology choice to only go with VM hosting for the cloud. But this is in effect admitting that lock-in exists if you blindly adopt the entire caboodle coming from cloud computing vendors like Google, Amazon and others.
Monday, October 20, 2008 7:30:17 AM (GMT Daylight Time, UTC+01:00)
>> I agree that if you stick to only using VM hosting

Why would that be the case? What's keeping me from rsync'ing / between a VM-based image such as EC2 and a dedicated hardware stack? You will want to avoid messing with the Xen kernel, obviously, but as long as you're switching from a 2.6.x kernel to a 2.6.x kernel then there really isn't going to be a problem as long as your base distro is the same.

As far as Windows is concerned, vendor lock-in is a red herring. If your application is built on top of the Windows-based stack vendor lock-in is a concern in the same way vendor lock-in is a concern on top of a Unix-based stack. That's nothing unique to cloud computing.

In fact, you could easily argue that switching from one Windows cloud computing provider to another is a lot simpler than switching from one Linux distribution to another, on the grounds that you're dealing with 2 possible base Windows Server versions at this stage, 2k3 and 2k8, whereas there are countless Unix vendors (Linux* inclusive), each of which has its own way of doing things.

That doesn't mean that switching from one Unix vendor to another is just as difficult as switching from Windows to Unix or vice versa. Just that it requires a lot more work than switching from one Windows 2k3 or 2k8 provider to another 2k3 or 2k8 provider (as long as it was 2k3 to 2k3 or 2k8 to 2k8).
Monday, October 20, 2008 8:52:04 AM (GMT Daylight Time, UTC+01:00)
M. David Peterson,
Your entire comment confuses me. It seems you are agreeing that if you stick to VM-based cloud services then lock-in is virtually non-existent since your core dependency is LAMP/WISC not a proprietary cloud platform. Yet you phrase your comment as disagreement when your content indicates that we violently agree.
Monday, October 20, 2008 8:45:05 PM (GMT Daylight Time, UTC+01:00)
As was alluded to in other comments here, the "cloud vendor lock-in" issue arises mostly with vendors that offer non-standard APIs, which is where Amazon and Google are the most 'shining' examples. Every step away from those custom APIs makes it easier to migrate away to other providers.

For example, with Amazon's introduction of EBS there is now a little bit less of an issue: You don't need to use funky S3 storage APIs for everything, you can use a 'normal' file system. Thus, while you may still use some custom API here and there, you now have already lowered the hurdle to move to another platform.

And so it goes: Every platform has a few custom things and issues to keep in mind. But the less you rely on or need to use those custom extensions and APIs, the easier it gets. It really is a sliding scale, not just a "locked in" or "not locked in" question.

To close this with an example: I'm currently using several cloud vendors, one of them being Joyent, where everything is OpenSolaris. I also use Amazon, where I run an Ubuntu AMI. Admittedly, my application is not terribly complicated, but for the most part, I can use the same config files for my apps in either place. I just custom compile the sources of the apps that I am using (squid, nginx, vsftpd, etc.) on each platform, and for the most part, this works really well. My custom code, mostly Django, works on either platform without any changes.

So, even though I am moving between Solaris and GNU/Linux, I can still get away with it without too many problems. This is made possible because I designed my app in such a way that I don't rely on custom APIs anywhere. Not everyone has the option, but as I said: It's a sliding scale. With a bit of thought and maybe an abstraction layer or two, you can go quite a long way to limit your exposure to custom APIs.
Tuesday, October 21, 2008 1:18:56 AM (GMT Daylight Time, UTC+01:00)
I had this same conversation with my CTO at ISD just last week. We use a hybrid model: full local installations, some local clients of cloud apps and pure cloud apps. The sweet spot will be when there's seamless integration between any local services and their cloud equivalents. Data portability will also be essential. In other words, the cloud is architected in such a way as to show up on your local systems as though it is part of them, not separate. I remarked to him that there presently is no company that I am aware of which offers this at an enterprise level on a turnkey basis. From my understanding of Microsoft's Live Mesh vision, this might eventually happen.
Tuesday, October 21, 2008 12:08:59 PM (GMT Daylight Time, UTC+01:00)

>> Your entire comment confuses me. It seems you are agreeing that if you stick to VM-based cloud services then lock-in is virtually non-existent since your core dependency is LAMP/WISC not a proprietary cloud platform. Yet you phrase your comment as disagreement when your content indicates that we violently agree.

Uh, hmmm.... You know, you're right. Not quite sure how that happened. I think I started writing my comment specific to using rsync to replicate images from one service provider to another, and then forgot to transition to the fact that this was really beside the point.

Sorry for the confusion! Yes, we most definitely are in violent agreement (I digg that phrase :-)
Wednesday, October 22, 2008 12:16:57 PM (GMT Daylight Time, UTC+01:00)
Wow, I just re-read my first post and what it was in reference to and I have absolutely no clue why I initially came away thinking you were arguing the exact opposite of what you were actually arguing.

Sorry about that, Dare! I most definitely agree with you on /all/ of this.