Lauren Carlson at Software Advice makes the point that the cloud reliability problems we have seen lately should not be considered in isolation. That is a very helpful point. She holds up the service level agreement as the guarantor of cloud reliability, and quotes an observation that cloud vendors have the latest technology while on-premises deployments do not. Does her comparison hold water?
In her analysis, Lauren relies on a 2008 survey of email uptime comparing Gmail to Lotus and Exchange, in which Gmail compares favorably. In this study, Gmail users experienced less unscheduled downtime and no scheduled downtime, for a net gain of about an hour of uptime per month. No explanation is provided, but assume the Gmail service makes better use of hot-site backup during scheduled and unscheduled maintenance than on-premises operations do. On that assumption, the results support the idea that cloud vendors, on average, use better technology than in-house data centers.
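To put "about an hour more uptime per month" in the percentage terms that service level agreements usually use, here is a minimal sketch. The 30-day month and the 99.9% target are illustrative assumptions, not figures from the survey, and the function names are hypothetical.

```python
# Assumption: a 30-day month, i.e. 720 hours. The survey's own period is not stated.
HOURS_PER_MONTH = 30 * 24


def availability(downtime_hours: float, period_hours: float = HOURS_PER_MONTH) -> float:
    """Return availability as a percentage of the period."""
    if not 0 <= downtime_hours <= period_hours:
        raise ValueError("downtime must be between 0 and the period length")
    return 100.0 * (1 - downtime_hours / period_hours)


def allowed_downtime(availability_pct: float, period_hours: float = HOURS_PER_MONTH) -> float:
    """Return the hours of downtime a given availability target permits."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability must be a percentage between 0 and 100")
    return period_hours * (1 - availability_pct / 100.0)


if __name__ == "__main__":
    # Difference, in percentage points, made by one extra hour of uptime per month.
    gain = availability(0.0) - availability(1.0)
    print(f"One hour per month is worth {gain:.3f} percentage points of availability")

    # Hypothetical SLA target of 99.9%, chosen only for illustration.
    minutes = allowed_downtime(99.9) * 60
    print(f"A 99.9% target permits about {minutes:.0f} minutes of downtime per month")
```

Under these assumptions an hour a month is a small fraction of a percentage point, which is worth keeping in mind when reading uptime comparisons. The calculation also says nothing about what happens to the data during the downtime.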
Uptime is not the full story of reliability. Lauren does note that the recent Google Blogger downtime event spanned 20 hours, but she does not mention the deeper problems beyond downtime. Many Blogger users lost their blog content for several days, received poor support from Google during the outage, and permanently lost their blog comments. Blogs are about conversations between author and reader, and the Blogger outage erased them. Data loss is a serious reliability problem, and one not captured by a focus on uptime and downtime.
More troubling was the cause of Google’s Blogger outage: a scheduled upgrade affecting all Blogger users. In effect, Google’s use of the latest and greatest technology caused the downtime. Data Center Management 101 says to test every upgrade and hold a full backup in case of problems. Google failed to follow good data center practice, at the expense of every single Blogger user. The service level agreement was, and is, useless in such a circumstance.
Reliability has many factors. This one event highlights several: uptime; data security, protection, and restoration; customer service quality; new technology; and data center management practices. A service level agreement may focus attention on uptime over the other factors, but that is no guarantee of reliability as a whole. What matters is how the customer and vendor work together to ensure reliability, not whether they choose to do so on-premises or in the cloud.