On the heels of TypePad’s 18-hour outage this week, there’s been (and will be) a lot of continued discussion about how to build scalable and reliable online / web-based applications. This is not a new problem (I not so fondly remember major and systemic outages in large services such as eBay and Amazon in the late 1990s), but it’s gotten new attention as some of the emerging applications have scaled up to the point of having an interesting number of regular users (i.e., enough that it sucks if their service goes down for more than 15 minutes). For example, as far as I can tell, del.icio.us has been down for the last four hours (“del.icio.us is down for emergency maintenance. we’ll be back as soon possible.”) and on 12/15/05 Bloglines acknowledged that “Bloglines performance has sucked eggs lately.”
Tim Wolters – an extremely capable CTO – has an introduction to how he is approaching this at Collective Intellect. He’s taking a page from Google’s playbook and developing a web service based on a “shared nothing architecture.” On Friday, I had two different discussions about scalable architectures (e.g. “we’re going to scale up between 10x and 100x on a meaningful base in 2006 – here’s what we are planning”) and both included elements of what Tim is describing.
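For anyone who hasn't run into the term, here is a minimal sketch of the shared-nothing idea in Python. It is an illustration only, not Tim's or Google's actual design: the `Node` and `Router` names are hypothetical, and the "nodes" are plain in-process objects standing in for separate machines. The point is that each node owns its own slice of the data, and a request is routed to exactly one node by hashing its key, so no node has to coordinate with another to answer it.

```python
import hashlib


class Node:
    """Hypothetical partition: owns its data and shares no state with peers."""

    def __init__(self, name):
        self.name = name
        self._store = {}  # private to this node; nothing else touches it

    def put(self, key, value):
        self._store[key] = value

    def get(self, key):
        return self._store.get(key)

    def size(self):
        return len(self._store)


class Router:
    """Hypothetical front end: maps each key to exactly one node."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def node_for(self, key):
        # Use a stable hash (not Python's per-process hash()) so the same
        # key always lands on the same node across restarts.
        digest = hashlib.sha1(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.nodes)
        return self.nodes[index]

    def put(self, key, value):
        self.node_for(key).put(key, value)

    def get(self, key):
        return self.node_for(key).get(key)


if __name__ == "__main__":
    router = Router([Node("node-a"), Node("node-b"), Node("node-c")])
    for user in ["alice", "bob", "carol", "dave"]:
        router.put(user, {"feeds": []})
    for node in router.nodes:
        print(node.name, node.size())
```

Adding capacity in this style means adding nodes rather than buying a bigger shared database. The sketch deliberately ignores the hard parts, such as moving keys when the node count changes and replicating data so a single node failure doesn't take a slice of users offline.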
I expect we’ll hear a lot more about this in 2006 as a small percentage of the flood of web apps created in 2005 become popular enough to have real scale issues.