A few weeks ago, I bookmarked a post from Sam Ruby entitled Collapsing the Stack where he wrote
Werner Vogels: Yep, the best way to completely automate operations is to have to developers be responsible for running the software they develop. It is painful at times, but also means considerable creativity gets applied to a very important aspect of the software stack. It also brings developers into direct contact with customers and a very effective feedback loop starts. There is no separate operations department at Amazon: you build it; you run it.
Sounds like a very good idea.
I don't see how this sounds like a good idea. This reminds me of a conversation I once had with someone at Microsoft who thought it would be a good idea to get rid of their test team and replace them all with developers once they moved to Test Driven Development. I used to be a tester when I first joined Microsoft and this seemed to me to be the kind of statement made by someone who assumed that the only thing testers do is write unit tests. Good test teams don't just write unit tests. They develop and maintain test tools. They perform system integration testing. They manage the team's test beds and test labs. They are the first line of defence when attempting to reproduce customer bug reports face before pulling in developers who may be working on your next release. All of this can be done by the development team but it means that your developers spend less time developing and more time testing. This cost will show up either as an increment in the amount of time it takes to get to market or a reduction in quality if schedules are not adjusted to account for this randomization of the development team. Eventually you'll end up recreating your test team so there are specific people responsible for test-related activities [which is why software companies have test teams in the first place].
The same reasoning applies to the argument for folding the responsibilities of your operations team into the development team's tasks. A good operations team isn't just responsible deployment/setup of applications on your servers and monitoring the health of the Web servers or SQL databases inyour web farm. A good operations team is involved in designing your hardware SKUs and understanding your service's peak capacity so as to optimize purchase decisions. A good operations team makes the decisions around your data centers from picking locations with the best power prices and ensuring that you're making optimal use of all the physical space in your data center to making build . A good operations team is the first line of defence when your service is being hit by a Denial of Service attack. A good operations team insulates the team from worrying about operating system, web server or database patches being made to the live site. A good operations team is involved in the configuration, purchase, deployment and [sometimes] development of load balancing, database partitioning and database replication tools. Again, you can have your development team do this but eventually it would seem to make sense that these tasks be owned by specific individuals instead of splitting them across the developers building one's core product.
PS: I've talked to a bunch of folks who know ex-Amazon developers and they tend to agree with my analysis above. I'd be interested in getting the perspective of ex-Amazon developers like Greg Linden on replacing your operations team with your core development staff.
PPS: Obviously this doesn't apply if you are a small 2 to 5 person startup. Everyone ends up doing everything anyway. :)