<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Acceptable Downtime</title>
	<atom:link href="http://www.feld.com/wp/archives/2005/07/acceptable-downtimie.html/feed" rel="self" type="application/rss+xml" />
	<link>http://www.feld.com/wp/archives/2005/07/acceptable-downtimie.html</link>
	<description></description>
	<lastBuildDate>Mon, 13 Feb 2012 21:06:34 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
	<item>
		<title>By: Andrew Batchelor</title>
		<link>http://www.feld.com/wp/archives/2005/07/acceptable-downtimie.html/comment-page-1#comment-1338</link>
		<dc:creator>Andrew Batchelor</dc:creator>
		<pubDate>Mon, 01 Aug 2005 15:44:07 +0000</pubDate>
		<guid isPermaLink="false">http://www.feld.com/wp/?p=499#comment-1338</guid>
		<description>Two avenues for consideration.

First, in terms of the technology product adoption lifecycle, the &#039;acceptable&#039; downtime depends upon where you are on the curve. &quot;Innovators&quot; will put up with a certain number of crashes and failures. Early adopters have less tolerance. The pain comes as you cross the chasm. The early majority simply won&#039;t &#039;buy&#039; a service/product unless they believe it&#039;s stable and reliable. Downtime at this point could kill the market. So the further you are along the curve, the more you ought to invest in the resilience of the systems.

The second stream of thought is a consideration of how mission critical the &#039;service&#039; is to end users and how &#039;real time&#039; their requirements are. A system dealing with truck appointments and gate receipts at a container terminal (oneport.com) is going to need to be very resilient, as delays can cause major operational impact (and many dollars in consequential losses).

On the other hand, a T-shirt website has less need to be resilient, since failure is merely inconvenient and may cause a loss of revenue for the site, but not for its customers. The tradeoff is loss of reputation and negative word of mouth.

It&#039;s good advice to develop practical and simple workaround support in case the system fails. This isn&#039;t feasible for all online services, but if you can switch to a &#039;manual&#039; mode to maintain service then customers will prefer that to being totally down.

This might mean creating static mailto HTML forms and maintaining manual data entry and manual email responses on the backend (we spoofed an auction in Japan this way for a week until a critical bug was resolved), or it might mean failing over to a professional (outsourced) customer service hotline (with costs kept down because it is only used in emergencies).

In a bootstrapping start-up role I&#039;ve had to trade off the risk of system outage (or overload at peak times) against the desire (and need) to grow the customer base. This was a prepaid calling card business in Hong Kong. We managed some of the load by influencing customer behaviour through our pricing strategies. We didn&#039;t have the capital to build a fully resilient platform.

Even where we invested in resilience there were still problems. We hosted our services in a state-of-the-art telco facility and bought capacity from world-leading telco providers. The FM building lost all 3 redundant aircon systems in a typhoon, and their key switching systems overheated and shut down the facility. Separately, a software upgrade on a provider&#039;s network tripped a major outage on the routers, which rippled around the network and caused sporadic failures for a week. Even when you pay for the so-called best you can face downtime.

I&#039;ve also been involved in projects where a Systems Architect has gone mad spending millions of dollars on redundant infrastructure way ahead of progress on customer adoption. When customer numbers / transaction volumes are lower the workarounds are obviously easier. As the numbers grow it becomes more necessary to build greater resilience into the systems.

To assist in assessing how much to invest and when, I&#039;d recommend laying out a matrix of risk of failure, investment cost of redundancy or resilience, risk/cost to customer/business, and time to recovery.
</description>
		<content:encoded><![CDATA[<p>Two avenues for consideration.</p>
<p>First, in terms of the technology product adoption lifecycle, the &#8216;acceptable&#8217; downtime depends upon where you are on the curve. &#8220;Innovators&#8221; will put up with a certain number of crashes and failures. Early adopters have less tolerance. The pain comes as you cross the chasm. The early majority simply won&#8217;t &#8216;buy&#8217; a service/product unless they believe it&#8217;s stable and reliable. Downtime at this point could kill the market. So the further you are along the curve, the more you ought to invest in the resilience of the systems.</p>
<p>The second stream of thought is a consideration of how mission critical the &#8216;service&#8217; is to end users and how &#8216;real time&#8217; their requirements are. A system dealing with truck appointments and gate receipts at a container terminal (oneport.com) is going to need to be very resilient, as delays can cause major operational impact (and many dollars in consequential losses).</p>
<p>On the other hand, a T-shirt website has less need to be resilient, since failure is merely inconvenient and may cause a loss of revenue for the site, but not for its customers. The tradeoff is loss of reputation and negative word of mouth.</p>
<p>It&#8217;s good advice to develop practical and simple workaround support in case the system fails. This isn&#8217;t feasible for all online services, but if you can switch to a &#8216;manual&#8217; mode to maintain service then customers will prefer that to being totally down.</p>
<p>This might mean creating static mailto HTML forms and maintaining manual data entry and manual email responses on the backend (we spoofed an auction in Japan this way for a week until a critical bug was resolved), or it might mean failing over to a professional (outsourced) customer service hotline (with costs kept down because it is only used in emergencies).</p>
<p>In a bootstrapping start-up role I&#8217;ve had to trade off the risk of system outage (or overload at peak times) against the desire (and need) to grow the customer base. This was a prepaid calling card business in Hong Kong. We managed some of the load by influencing customer behaviour through our pricing strategies. We didn&#8217;t have the capital to build a fully resilient platform.</p>
<p>Even where we invested in resilience there were still problems. We hosted our services in a state-of-the-art telco facility and bought capacity from world-leading telco providers. The FM building lost all 3 redundant aircon systems in a typhoon, and their key switching systems overheated and shut down the facility. Separately, a software upgrade on a provider&#8217;s network tripped a major outage on the routers, which rippled around the network and caused sporadic failures for a week. Even when you pay for the so-called best you can face downtime.</p>
<p>I&#8217;ve also been involved in projects where a Systems Architect has gone mad spending millions of dollars on redundant infrastructure way ahead of progress on customer adoption. When customer numbers / transaction volumes are lower the workarounds are obviously easier. As the numbers grow it becomes more necessary to build greater resilience into the systems.</p>
<p>To assist in assessing how much to invest and when, I&#8217;d recommend laying out a matrix of risk of failure, investment cost of redundancy or resilience, risk/cost to customer/business, and time to recovery.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: E-Oasis Alerts</title>
		<link>http://www.feld.com/wp/archives/2005/07/acceptable-downtimie.html/comment-page-1#comment-1340</link>
		<dc:creator>E-Oasis Alerts</dc:creator>
		<pubDate>Sat, 30 Jul 2005 07:42:42 +0000</pubDate>
		<guid isPermaLink="false">http://www.feld.com/wp/?p=499#comment-1340</guid>
		<description>&lt;strong&gt;Downtime unaccepted.&lt;/strong&gt;

Brad Feld, a Colorado-based VC, wrote an insightful piece on acceptable downtime for rapidly growing companies.  It&#8217;s rarely the case, however, that executive management gets full disclosure on what root causes are responsible for the embarrasin...
</description>
		<content:encoded><![CDATA[<p><strong>Downtime unaccepted.</strong></p>
<p>Brad Feld, a Colorado-based VC, wrote an insightful piece on acceptable downtime for rapidly growing companies.  It&#8217;s rarely the case, however, that executive management gets full disclosure on what root causes are responsible for the embarrasin&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Clint Sharp's Blog an' Vlog</title>
		<link>http://www.feld.com/wp/archives/2005/07/acceptable-downtimie.html/comment-page-1#comment-1339</link>
		<dc:creator>Clint Sharp's Blog an' Vlog</dc:creator>
		<pubDate>Fri, 29 Jul 2005 22:03:20 +0000</pubDate>
		<guid isPermaLink="false">http://www.feld.com/wp/?p=499#comment-1339</guid>
		<description>&lt;strong&gt;Acceptable Downtime&lt;/strong&gt;

Brad Feld wrote a post yesterday titled &#8220;Acceptable Downtime&#8221;, where he explains that he has a position on the board of a startup which is considering adding redundancy to their web based service to mitigate the possibility of a catastrophi...
</description>
		<content:encoded><![CDATA[<p><strong>Acceptable Downtime</strong></p>
<p>Brad Feld wrote a post yesterday titled &#8220;Acceptable Downtime&#8221;, where he explains that he has a position on the board of a startup which is considering adding redundancy to their web based service to mitigate the possibility of a catastrophi&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jack Krupansky</title>
		<link>http://www.feld.com/wp/archives/2005/07/acceptable-downtimie.html/comment-page-1#comment-1337</link>
		<dc:creator>Jack Krupansky</dc:creator>
		<pubDate>Thu, 28 Jul 2005 18:21:29 +0000</pubDate>
		<guid isPermaLink="false">http://www.feld.com/wp/?p=499#comment-1337</guid>
		<description>1) *Anybody* using an &quot;emerging&quot; service simply has to be honest with themselves and admit that excessive downtime is a risk they agree to assume in exchange for gaining access to the innovative new service.  The customer needs to provision their own fallback plans, regardless of the vendor&#039;s offerings and &quot;commitments&quot;.

2) I thought hardware and open-source software were supposed to be so cheap as to be essentially free?  At least that&#039;s the hype or &quot;spin&quot;.  If so, what excuse is there for a lack of 400% redundancy?  Every business and technology plan should have a special section entitled &quot;How Scalable are we Really?&quot;.

3) If the business plan for a VC-funded venture targets major accounts which have critical uptime requirements, isn&#039;t it the responsibility of the entrepreneur to gain full funding for the appropriate level of redundancy and fallback to meet targeted customer requirements?  If so, what&#039;s the issue here?  A failure of the entrepreneur to do robust planning?  A failure of due diligence by the VCs?

4) The important thing is that the entrepreneur be up-front and spin-free with customers as to what levels of service they can expect and what levels of &quot;drama&quot; are likely.

-- Jack Krupansky
</description>
		<content:encoded><![CDATA[<p>1) *Anybody* using an &#8220;emerging&#8221; service simply has to be honest with themselves and admit that excessive downtime is a risk they agree to assume in exchange for gaining access to the innovative new service.  The customer needs to provision their own fallback plans, regardless of the vendor&#8217;s offerings and &#8220;commitments&#8221;.</p>
<p>2) I thought hardware and open-source software were supposed to be so cheap as to be essentially free?  At least that&#8217;s the hype or &#8220;spin&#8221;.  If so, what excuse is there for a lack of 400% redundancy?  Every business and technology plan should have a special section entitled &#8220;How Scalable are we Really?&#8221;.</p>
<p>3) If the business plan for a VC-funded venture targets major accounts which have critical uptime requirements, isn&#8217;t it the responsibility of the entrepreneur to gain full funding for the appropriate level of redundancy and fallback to meet targeted customer requirements?  If so, what&#8217;s the issue here?  A failure of the entrepreneur to do robust planning?  A failure of due diligence by the VCs?</p>
<p>4) The important thing is that the entrepreneur be up-front and spin-free with customers as to what levels of service they can expect and what levels of &#8220;drama&#8221; are likely.</p>
<p>&#8211; Jack Krupansky</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Colin Evans</title>
		<link>http://www.feld.com/wp/archives/2005/07/acceptable-downtimie.html/comment-page-1#comment-1336</link>
		<dc:creator>Colin Evans</dc:creator>
		<pubDate>Thu, 28 Jul 2005 18:08:37 +0000</pubDate>
		<guid isPermaLink="false">http://www.feld.com/wp/?p=499#comment-1336</guid>
		<description>My experience working at a telco infrastructure provider was that downtime was always bad, but the customers mind it less if you can make them aware of when and why it is happening and minimize unexpected side effects (billing errors, delivery errors, etc.).

Sometimes all this means is having good enough monitoring that you can gracefully take the system down when it starts to fault, and put up a polite apology on the website while this is happening.  This turns an &quot;unexpected outage&quot; into a &quot;planned outage&quot;.

Of course, if you have &quot;planned outages&quot; too often, your customers have good reason to be unhappy.

</description>
		<content:encoded><![CDATA[<p>My experience working at a telco infrastructure provider was that downtime was always bad, but the customers mind it less if you can make them aware of when and why it is happening and minimize unexpected side effects (billing errors, delivery errors, etc.).</p>
<p>Sometimes all this means is having good enough monitoring that you can gracefully take the system down when it starts to fault, and put up a polite apology on the website while this is happening.  This turns an &#8220;unexpected outage&#8221; into a &#8220;planned outage&#8221;.</p>
<p>Of course, if you have &#8220;planned outages&#8221; too often, your customers have good reason to be unhappy.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dan Cornish</title>
		<link>http://www.feld.com/wp/archives/2005/07/acceptable-downtimie.html/comment-page-1#comment-1335</link>
		<dc:creator>Dan Cornish</dc:creator>
		<pubDate>Thu, 28 Jul 2005 17:04:05 +0000</pubDate>
		<guid isPermaLink="false">http://www.feld.com/wp/?p=499#comment-1335</guid>
		<description>To the customer, uptime and five nines mean nothing. The assumption is it should just work. The end user needs to have some feedback if something goes wrong. We have learned that the best thing to do when you have a problem is to have a system that fails gracefully. This means that if the primary system is down, you have a cheap read-only version which can pick up the slack, maybe with 1-2 hour old data. This is better than being completely down, and far, far less expensive than totally realtime redundant systems. As a startup ourselves, we have struggled with this issue. Since cash is limited, we could not afford all the bells. Sometimes this helps focus your mind.

It turns out, after doing a lot of research, that our customers will accept a read-only version while we get the other system back online. While we have never been down for very long, it is better to have something up than nothing. Communication is also important. Whenever we have had a problem, we shoot an email to all our administrators and phone our major customers. The worst thing for our customers is to not be able to log into the system and not know what is going on.

To put it in VC terms, it is ALWAYS better to tell bad news right away to your customers, board, and investors than to wait and hope.
</description>
		<content:encoded><![CDATA[<p>To the customer, uptime and five nines mean nothing. The assumption is it should just work. The end user needs to have some feedback if something goes wrong. We have learned that the best thing to do when you have a problem is to have a system that fails gracefully. This means that if the primary system is down, you have a cheap read-only version which can pick up the slack, maybe with 1-2 hour old data. This is better than being completely down, and far, far less expensive than totally realtime redundant systems. As a startup ourselves, we have struggled with this issue. Since cash is limited, we could not afford all the bells. Sometimes this helps focus your mind.</p>
<p>It turns out, after doing a lot of research, that our customers will accept a read-only version while we get the other system back online. While we have never been down for very long, it is better to have something up than nothing. Communication is also important. Whenever we have had a problem, we shoot an email to all our administrators and phone our major customers. The worst thing for our customers is to not be able to log into the system and not know what is going on.</p>
<p>To put it in VC terms, it is ALWAYS better to tell bad news right away to your customers, board, and investors than to wait and hope.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: John Masterson</title>
		<link>http://www.feld.com/wp/archives/2005/07/acceptable-downtimie.html/comment-page-1#comment-1334</link>
		<dc:creator>John Masterson</dc:creator>
		<pubDate>Thu, 28 Jul 2005 16:35:57 +0000</pubDate>
		<guid isPermaLink="false">http://www.feld.com/wp/?p=499#comment-1334</guid>
		<description>My link didn&#039;t work in my earlier comment:

&lt;a href=&quot;http://www.codesta.com/knowledge/management/uptime%5Frealities/&quot; rel=&quot;nofollow&quot;&gt;http://www.codesta.com/knowledge/management/uptime%5Frealities/&lt;/a&gt;
</description>
		<content:encoded><![CDATA[<p>My link didn&#8217;t work in my earlier comment:</p>
<p><a href="http://www.codesta.com/knowledge/management/uptime%5Frealities/" rel="nofollow">http://www.codesta.com/knowledge/management/uptime%5Frealities/</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Stephen Pierzchala</title>
		<link>http://www.feld.com/wp/archives/2005/07/acceptable-downtimie.html/comment-page-1#comment-1333</link>
		<dc:creator>Stephen Pierzchala</dc:creator>
		<pubDate>Thu, 28 Jul 2005 16:12:51 +0000</pubDate>
		<guid isPermaLink="false">http://www.feld.com/wp/?p=499#comment-1333</guid>
		<description>Third Time Lucky?

Any downtime in the online channel is unacceptable. This is the equivalent of putting a CLOSED sign in your window.

Do you accept downtime in your real-world channel? If a customer is in your store, with a product in hand, and cash at the ready...do you say sorry?

We are 10 years into online commerce. Firms have to stop factoring in downtime. It indicates that companies still think like old-school mainframe folks.

Stop. The. Madness.

smp
</description>
		<content:encoded><![CDATA[<p>Third Time Lucky?</p>
<p>Any downtime in the online channel is unacceptable. This is the equivalent of putting a CLOSED sign in your window.</p>
<p>Do you accept downtime in your real-world channel? If a customer is in your store, with a product in hand, and cash at the ready&#8230;do you say sorry?</p>
<p>We are 10 years into online commerce. Firms have to stop factoring in downtime. It indicates that companies still think like old-school mainframe folks.</p>
<p>Stop. The. Madness.</p>
<p>smp</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Mike</title>
		<link>http://www.feld.com/wp/archives/2005/07/acceptable-downtimie.html/comment-page-1#comment-1332</link>
		<dc:creator>Mike</dc:creator>
		<pubDate>Thu, 28 Jul 2005 16:02:53 +0000</pubDate>
		<guid isPermaLink="false">http://www.feld.com/wp/?p=499#comment-1332</guid>
		<description>Very interesting. I can imagine that there are probably a few decision points in creating the system architecture for these types of service platforms where the trade-off between performance and manageability would benefit from this type of perspective.

It seems like systems engineers and software architects usually want to design elegant, high-performance software, and marketing people always want feature-rich applications, so who is there to advocate for reliability?

-mjr
</description>
		<content:encoded><![CDATA[<p>Very interesting. I can imagine that there are probably a few decision points in creating the system architecture for these types of service platforms where the trade-off between performance and manageability would benefit from this type of perspective.</p>
<p>It seems like systems engineers and software architects usually want to design elegant, high-performance software, and marketing people always want feature-rich applications, so who is there to advocate for reliability?</p>
<p>-mjr</p>
]]></content:encoded>
	</item>
</channel>
</rss>
