From Mike Arrington's post "Amazon: Grid Storage Web Service Launches" on TechCrunch we learn:

Amazon Web Service is launching a new web service tonight called S3 - which stands for "Simple Storage Service". It is a storage service backend for developers that offers "a highly scalable, reliable, and low-latency data storage infrastructure at very low costs".
Here are the facts: This is a web service, and so Amazon is not releasing a customer facing service. They are offering standards-based REST and SOAP web services interfaces for developers. Entire classes of companies can be built on S3 that would not have been possible before due to infrastructure costs for the developer.
Virtually any file type is allowed, up to 5 GB. Files may be set as public, shared or private and will have a unique URL. Pricing is cheaper than anything else I’ve seen: $0.15 per GB of storage per month, and $0.20 for each GB of data transferred up or downstream. This translates to $15 per month for 100 GB of storage, net of any transfer fees (to move that much data on to S3 would be a one time cost of $20). These prices are going to be significantly below the development and ongoing costs for small or medium sized storage projects - meaning a lot of the front end services I’ve previously profiled will be much better off moving their entire back end to S3.

This is game changing.

A reader of my blog asked me to give my thoughts on Amazon's S3. The first question was what I thought about this in relation to Google's upcoming GDrive service. The two offerings don't compete directly, but they are related. GDrive will likely be an ad-funded consumer service with functionality similar to that of sites like XDrive, which let users store files in the "cloud" and interact with them via a website and/or a user interface integrated into the file explorer of their operating system. S3 is a service that applications can use to store files for a set cost. One could probably build a competitor to GDrive or XDrive using Amazon's S3.

Since my team owns Storage in the "cloud" for Windows Live, building something like Amazon's S3 is something I've thought quite a bit about but never actually pitched at work due to the business ramifications. Giving programmatic access to cloud storage needs a revenue model, but no one wants to charge for stuff like that since the assumption is that Google or Yahoo! will just give it away for free. What would work better is having something like GDrive, which is ad-funded, and then giving users the ability to access their files using any application that supports the GDrive API. There is still the problem of how you prevent "abuse" (i.e. apps that only go through the GDrive API, so the user never sees ads).

Being a curious developer type, I read the online documentation for the Amazon S3 service to see how hard it would be to build something like GDrive on top of it. The tricky part seems to be that applications only get 100 buckets, which are collections of data that can each contain an unlimited number of objects. So my GDrive app wouldn't be able to map each user to a bucket. Instead it would have to either map each user's data to a single object (a tarball or ZIP file?) or come up with a custom way of partitioning buckets into subgroups, each mapping to an individual user's data.
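To make the partitioning idea concrete, here is a rough sketch of how I imagine it could work. It assumes object keys are plain strings, so a user's files can share a key prefix. The names bucket_for_user and object_key, the "mydrive" bucket prefix, and the "users/" key layout are all made up for illustration and are not part of the S3 API.

```python
import hashlib

MAX_BUCKETS = 100  # the per-application bucket limit discussed above


def bucket_for_user(user_id: str, app_prefix: str = "mydrive") -> str:
    """Deterministically pick one of MAX_BUCKETS bucket names for a user.

    Hypothetical helper: hashing the user id spreads users roughly evenly
    across a fixed set of buckets.
    """
    digest = hashlib.sha1(user_id.encode("utf-8")).hexdigest()
    shard = int(digest, 16) % MAX_BUCKETS
    return f"{app_prefix}-{shard:02d}"


def object_key(user_id: str, path: str) -> str:
    """Namespace a user's file path under a per-user key prefix.

    Hypothetical layout: users/<user id>/<normalized path>.
    """
    clean = path.replace("\\", "/").lstrip("/")
    return f"users/{user_id}/{clean}"
```

With a layout like this, every object belonging to one user lives under a single prefix in a single bucket, so the 100 bucket limit stops mattering no matter how many users sign up.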

By the way, did anyone else notice that bandwidth per GB costs more than storage per GB? The question for you viewers at home is whether this is surprising or expected. ;)
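For anyone who wants to play with the numbers, here is a back-of-the-envelope sketch using only the two rates quoted in the TechCrunch post ($0.15 per GB stored per month, $0.20 per GB transferred). The function names are my own, and I'm assuming there are no other charges.

```python
STORAGE_PER_GB_MONTH = 0.15  # USD, from the quoted launch pricing
TRANSFER_PER_GB = 0.20       # USD, same rate upstream or downstream


def monthly_cost(stored_gb: float, transferred_gb: float) -> float:
    """Estimated monthly bill: storage charge plus transfer charge."""
    return stored_gb * STORAGE_PER_GB_MONTH + transferred_gb * TRANSFER_PER_GB


def breakeven_transfer_gb(stored_gb: float) -> float:
    """Monthly transfer volume at which bandwidth charges equal storage charges."""
    return stored_gb * STORAGE_PER_GB_MONTH / TRANSFER_PER_GB
```

At these rates, an app holding 100 GB starts paying more for bandwidth than for storage once its monthly traffic passes about 75 GB, so a popular service would likely find transfer dominating the bill.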


Wednesday, March 15, 2006 11:25:05 AM (GMT Standard Time, UTC+00:00)
Yeah, the cost of transfer versus storage is interesting.

Presumably unused storage is cheap; keep it on RAID5 HDD or even back it off to secondary storage. The comms cost is effectively a usage tax, in which you pay not just for bandwidth but also for the CPU, buffer RAM and power to drive it.

Which makes sense from a marketing perspective: a nice low cost of entry but as your stuff gets used, the cost goes up (predictably) and the profits roll in.

It's potentially an interesting option for any web "2.0" application that stores user data: you get to offload the problem of maintaining the HA storage facility for end user data, instead saying "amazon handles it", which could give end users more trust in the storage. For that to work you'd need a business model that could handle the charges (but at least the entry cost is low), and the amazon service needs to be very HA, have good bandwidth, and be secure. I worry most about the last of these.
Wednesday, March 15, 2006 3:03:24 PM (GMT Standard Time, UTC+00:00)
I don't think the 100 bucket limit is really a problem. In fact, you might use only 1 bucket per application. You would then use *meta-data* to associate each file with the appropriate user. Example:

userid: 2948
folder: documents and settings\admin\my documents\
filename: super-important-stuff.txt

Then you can query the API to get the appropriate files for a given user.
Wednesday, March 15, 2006 8:19:47 PM (GMT Standard Time, UTC+00:00)
I think a lot of the analysis is missing where Amazon might go with this. If you look at S3, Simple Queue and MTurk, you see Amazon creating some pretty interesting capabilities that I do not believe have direct comparisons to what we have already seen. Simple Queue, for example, looks totally pointless until you free your mind up a bit to imagine multi-party, distributed web services in the future. I look at S3 as a simple object store with access control and APIs. I don't think it will be, or will enable, anything like GDrive or some place for consumers to back up.
Monday, September 18, 2006 8:05:11 PM (GMT Daylight Time, UTC+01:00)