An increasing number of companies we are investors in are focused on DevOps. A year or so ago I read an early draft of a new book titled The Phoenix Project: A Novel About IT, DevOps, and Helping Your Business Win. I really enjoyed it and asked Gene Kim, one of the authors, to write a guest post on DevOps. He wrote it a while ago and it has sat in my draft queue waiting for the perfect moment to emerge. That moment is now. Following is a guest post on DevOps by Gene Kim, Multiple Award-Winning CTO, Researcher, Visible Ops Co-Author, Entrepreneur & Founder of Tripwire.
Since 1999, my passion has been studying high performing IT organizations. On this journey, we benchmarked 1,500 IT organizations to understand what differentiated the highest performing organizations and allowed them to do what the others only dreamed of. Our findings went into a book that we published in 2004 called The Visible Ops Handbook, which described how these organizations made their “good to great” transformation.
Since then, this journey has taken me straight into the heart of the DevOps movement. Although I initially dismissed DevOps as just another marketing fad, my friend John Willis corrected me, in the way that only true friends can do, saying, “Don’t be dense. DevOps finally proves how IT can be a strategic advantage that allows a business to beat the pants off the competition. This is the moment we’ve all been waiting for.”
In that moment, I saw the light. Over the years, I’ve come to believe with moral certainty that everyone needs DevOps now, especially software startups where the successful execution of Development and IT Operations preordains success or failure.
Today, we can see how DevOps patterns enable organizations like Etsy, Netflix, Facebook, Amazon, Twitter and Google to achieve levels of performance that were unthinkable even five years ago. They are doing tens, hundreds or even thousands of code deploys per day, while delivering world-class stability, reliability and security.
DevOps refers to the emerging professional movement that advocates a collaborative working relationship between Development and IT Operations, resulting in the fast flow of planned work (i.e., high deploy rates), while simultaneously increasing the reliability, stability, and resilience of the production environment.
The culture and practices that enable DevOps to happen cannot be delegated away. In a growing startup where teams start to specialize and multiply, the chaos of daily work often starts to slow down the smooth flow of work between Development and IT Operations, sometimes even resulting in outright tribal warfare.
In this blog post, I’ll describe what this downward spiral looks like, and what everyone in the company must do to break this destructive pattern and ensure that Development and IT Operations work together in a way that creates such a competitive advantage that it may almost seem unfair.
Why Everyone Needs DevOps
There is currently a core, chronic conflict that exists in almost every IT organization. It is so powerful that it practically pre-ordains horrible outcomes, if not abject failure. It happens in both large and small organizations, for-profit and non-profit, and across every type of industry.
In fact, this destructive pattern is the root cause of one of the biggest problems we face as an industry. But, if we can beat it, we’ll have the potential to generate more economic value than anything we’ve seen in the previous 30 years.
I’m going to share with you what this destructive pattern is in three acts that will surely be familiar to you. (You can get the whole story in my book, The Phoenix Project: A Novel About IT, DevOps, and Helping Your Business Win.)
Act 1 begins with IT Operations, where we’re supporting a large, complex revenue generating application. The problem is that everyone knows that the application and supporting infrastructure are… fragile.
How do we know? Because every time anyone touches it, it breaks horrifically, causing an epic amount of unplanned work for everyone.
The shameful part is how we find out about the outage: Instead of through an internal monitoring tool, it’s a salesperson calling, saying, “Hey, Gene, something strange is happening. Our revenue pipeline stopped for two hours.” Or, “the banner ads in my market are being served upside down and in Spanish.”
There are so many moving parts that it takes way too long to figure out what caused the problem du jour, which means we’re spending more and more time on unplanned work and increasingly unable to get our planned work done.
Eventually, our ability to support the most important applications and business initiatives goes down. When this happens, the organization suddenly finds itself unable to achieve the promises and commitments made to the outside world, whether it’s investors, customers, analysts or Wall Street.
Promised features aren’t delivered on time, market share isn’t going up, average order sizes are going down, specific revenue goals are being missed…And that’s when something really terrible happens.
In Act 2, everyone’s life gets worse when the business starts making even bigger promises to the people we let down, to compensate for the promises we previously broke. Often, the entire organization starts dreaming up bigger, bolder features that are sure to dazzle the marketplace, but without the best grasp on what technology can and can’t do, or fully realizing what caused us to miss our commitments in the first place.
Enter the Developers. They start seeing more and more urgent date-driven projects put in the queue, often requiring things that the organization has never done before. Because the date can’t be moved (because of all those external promises made), everyone has to start cutting corners.
Development must focus on getting the features done, so the corners that get cut are all the non-functional requirements (e.g., manageability, scalability, reliability, security, and so forth). This means that technical debt starts to increase. And that means increasingly fragile infrastructure in production.
It is called “technical debt” for a reason—because technical debt, like financial debt, compounds.
When technical debt begins to accumulate, something very insidious starts happening. Our deployments start taking longer. What used to take an hour now takes three hours, then a day, then two days, which is okay, because we can still get it done in a weekend. But then it takes three days, and then a week, then two weeks!
Our deployments become so expensive and so difficult that the business says that we have to lengthen the deployment intervals, which goes against all our instincts and training. We know that we need to shrink the batch sizes, not make them bigger, because large changes make for larger failures.
The flow of features slows to a trickle, the deployments take even longer, more things go wrong, and because of all the moving pieces, issues take even longer to diagnose. Our best Dev and Ops people are spending all their time firefighting, and blaming each other when things go wrong.
I’m guessing that most of you can relate to at least some portions of this story? As I said, this happens both in large enterprises and growing startups alike. In my fifteen years of research in this area, I’ve found almost all IT professionals have experienced this cycle.
Act 3: How DevOps Breaks Us Out Of Our Downward Spiral
We know that there must be a better way, right? DevOps is the proof that it’s possible to break the core, chronic conflict, so we can deliver a fast flow of features without causing chaos and disruption to the production environment.
When John Allspaw and Paul Hammond gave their seminal “10+ Deploys Per Day: Dev and Ops Cooperation at Flickr” presentation at the 2009 Velocity Conference, people were shocked and amazed, if not outright fainting in the aisles at the audaciousness of their achievement.
It wasn’t a fluke. Other organizations such as Facebook, Amazon, Netflix and the ever-growing DevOps community have replicated their performance, doing hundreds, and even thousands, of deployments per day. DevOps is not only for large, established companies. It’s for any company where the achievement of business goals relies upon both Development and IT Operations. These days, that means almost every company.
We all need to be putting DevOps-like practices into place. This is why Kevin Behr, George Spafford, and I wrote The Phoenix Project: A Novel About IT, DevOps, and Helping Your Business Win.
A novel, you might ask? How is a novel going to solve my problems?
As a friend once told me, “Before you can solve a complex problem, you must first have empathy for the other stakeholders. And story-telling is the most effective means of creating a shared understanding of the problem.”
Dr. Eliyahu Goldratt demonstrated the power of a novel as a teaching tool through his book, The Goal: A Process of Ongoing Improvement. It’s a novel written in the 1980s about a plant manager who has 90 days to fix his cost and due date issues or his plant will be shut down. When I read this book nearly 15 years ago, I knew that this story was important, and that there was much I needed to learn, even though I never managed or worked in a manufacturing plant.
It isn’t an overstatement to say that The Goal and Dr. Goldratt’s Theory of Constraints changed my life—in fact, it probably was one of the biggest influences on my professional thinking. For eight years, my co-authors and I wanted to write The Phoenix Project, because we all believed that IT is not a department, but a strategic capability that every business must have.
As you can imagine, I was incredibly honored and thrilled when Jez Humble, author of the award-winning book Continuous Delivery, recently told me, “This book is a gripping tale that captures brilliantly the dilemmas that face companies which depend on IT. The Phoenix Project will have a profound effect on IT, just as The Goal did for manufacturing.”
For those of you who are looking for some places to start your DevOps journey, here are my three favorite DevOps patterns:
- Make sure we have environments available early in the Development process. Enforce a policy that the code and environment are tested together, even at the earliest stages of the project. In the ideal, IT Operations is able to create an environment (that is, everything except for the application code: databases, operating system, networking, virtualization layer, etc.) with one step. That can be as simple as copying a virtual machine, or as complex as an automated build system that generates the environment from scratch (e.g., Puppet, Chef, etc.). Furthermore, use the same build mechanism to build the Production, Test and Dev environments at the same time. If we modify the Agile sprint policy so that instead of merely having shippable code, we have shippable code and the environment that it runs within, we’ll have done code deployments many, many times when it’s time for the real-life production deployment.
- “Wake developers up at 2 a.m. when they break things.” Yep, you heard me. This quote came from Patrick Lightbody, the CEO and founder of BrowserMob. He continued, “When we woke up developers, we found that defects got fixed faster than ever.” The goal is to shorten and amplify feedback loops, and to bring Development closer to the customer experience. In DevOps work streams, developers often deploy their own code, and fix forward when things go wrong. By doing this, developers can see the consequences of their decisions and actions. (Note the symmetry here: the previous pattern #1 about making environments available early is all about embedding IT Operations into Development, while this pattern is about putting Development into IT Operations.)
- Create reusable deployment procedures: When every deployment is done differently, every production environment can become different, like snowflakes. When this occurs, no mastery is ever built in the organization in procedures or configurations. As Luke Kanies said, “If your infrastructure is special, you’re doing it wrong.” To make this a reality, create reusable user stories for IT Operations, such as “Deploy app into high availability environment,” which then go on to define exactly the steps to build the environment, as well as how long it takes, what resources are required, etc. By doing this, we codify the deployment and engineering procedures required to build reliable, resilient and properly configured environments, and can then factor that into our planning processes, such as in the PMO.
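To make the first and third patterns a little more concrete, here is a minimal sketch in Python of what "one build mechanism for every environment" might look like. It is an illustration, not a real tool: `EnvironmentSpec`, `build_steps`, `build_all`, and the image and database names are all hypothetical placeholders, and in practice a system like Puppet or Chef would carry out the steps that are only listed as text here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvironmentSpec:
    """Everything except the application code (all values are placeholders)."""
    name: str
    os_image: str = "base-os-image"
    database: str = "example-database"
    app_servers: int = 1


def build_steps(spec: EnvironmentSpec) -> list[str]:
    """Return the ordered, reusable procedure for building one environment."""
    return [
        f"provision {spec.app_servers} server(s) from {spec.os_image}",
        f"install {spec.database}",
        "configure networking",
        "deploy application code",
    ]


def build_all(names=("dev", "test", "prod")) -> dict[str, list[str]]:
    """Build every environment with the same mechanism, so none is a snowflake."""
    return {name: build_steps(EnvironmentSpec(name=name)) for name in names}


if __name__ == "__main__":
    for name, steps in build_all().items():
        print(name)
        for step in steps:
            print("  -", step)
```

Because Dev, Test and Production all come out of the same `build_steps` function, the procedure that runs on the day of the production deployment has already been exercised every time a Dev or Test environment was built.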
If you enjoyed this taste of DevOps and believe it can help achieve your goals, “The Phoenix Project” is available now, or you can download a free 170-page excerpt of the book. And of course, you can always find the latest writings on DevOps at the IT Revolution blog, where you can get our free whitepaper “The Top 11 Things You Need To Know About DevOps.”
Long live DevOps!
At this year’s NVCA meeting, my partner Jason Mendelson (who was the chair of the event) interviewed Dick Costolo, the CEO of Twitter. Dick is an awesome CEO, awesome human, and awesome interviewee. Among other things, he’s hilarious, and PandoDaily wrote a fun summary of the interview in their post What CEOs could learn from comedians.
Dick had many great one liners that fit in 140 characters as you’d expect from someone who is both the CEO of Twitter and was once a standup comedian. But one really stuck in my mind.
It’s not your job to defend your team. It’s your job to improve your team.
Upon reflection, all of the great CEOs and executives that I’ve ever worked with believe this and behave this way.
Every time I make an investment I believe it is going to be an incredible success. I don’t know any VC who invests thinking “eh – this will be mediocre.” When you start the relationship you believe it’s going to be massively successful. The same is true of hiring an executive. Dick made the point that the cliche “only hire A players” is completely obvious and banal. CEOs don’t run around saying “hey – let’s hire C players – that’s what we want – C players.” Everyone you hire is someone you think will be an A player, by definition.
But, in the same way that every VC investment doesn’t become a 100x return, every person you hire won’t turn out to be an A player. After a few months, you start to really understand the strengths and weaknesses of the person. And you see how the person interacts with the rest of your team. This is normal – there’s no way you could know any of this during the interview process.
The not so amazing CEO or executive immediately falls into a mode of trying to defend the person, or the team, to the outside world (board, investors, customers) and other members of the team. I’ve heard a remarkable number of different rationalizations over the years about why a person or a team is going to work. And, when I press on this, the underlying response is often simply “give us / me / them more time.”
Instead of defending the team, the amazing CEO will respond with “yup – we need to get better – here’s what we are doing.” And then they’ll add “what else do you think we should do?” and “how can you help us improve?” This type of language – accepting reality and focusing on improving it, rather than defending it – is so much more powerful.
Of course, often the answer is that to improve a team, you have to eliminate a person or move them to a very different role. This is hard, but it’s part of the process, especially in a fast growing company. Someone who was incredible at a job when the company is 50 people might be horrible at the job when the company is 500 people. Nothing is static – including competence.
This is true of CEOs as well. We can all be better at what we do – a lot better. It’s easy to fall into the trap of defending our own behavior when someone offers us feedback or constructive criticism. The walls go up fast when someone attacks us, or we fail. But if you switch immediately from “defend” to “improve”, you can often get extraordinary feedback and help in real time. And sometimes you have to replace yourself, as Jonathan Strauss at Awe.sm did recently and explained in his tremendous post Replacing Oneself as CEO.
I loved working with Dick at FeedBurner – I learned an incredible amount from him. I treasure every minute I get with him these days and one of the biggest bummers about not being an investor in Twitter is that I don’t get to work with him on a regular basis. It was joyful to listen to him and realize that there is another wave of people at a rapidly growing and very important company that are learning from him, as he works to improve his team on a continual basis.
Now that we are in March, you should have a pretty good view of how your Q1 is likely to end up. If you are a revenue generating company, you’ve probably got a formally approved 2013 plan by now (if not, why not?) Your board is paying attention to your performance against plan, and you and your management team are executing based on the plan you had approved, which likely includes both a revenue plan and an expense plan.
If your sales and revenue are not on or ahead of plan, it’s time to take a hard look at what is going on. Q1 is the easiest quarter to make since you just created the annual plan. If you miss Q1, especially in a recurring revenue, services oriented business, or adtech business, there is almost no way you will make it up over Q2 – Q4. Sure – it’s nice to think something magic, special, and happy will happen, but it almost never does.
Step 1: Put on the brakes right now on discretionary spending, especially headcount. You are probably spending at plan. If sales / revenue / MRR are behind plan, you are just creating a bigger problem for yourself.
Step 2: Do an aggressive root cause analysis of why you missed Q1 so far (January and February). Use the five whys approach and keep digging until you actually understand what is going on. Don’t let your sales organization wave things off. Don’t assume it’s all going to come together on 3/31. Don’t assume the high level metrics you are looking at tell the story. Go deep as a management team. Get everyone on the management team in a room for the day on Saturday 3/9, and figure it out. Yeah, I know some of you are going to SXSW – figure it out. It’s important.
Step 3: Keep playing through on your plan for all of Q1 other than discretionary spending. Be surgical about what is going on. Use this as a wakeup call that you aren’t executing well yet, or at least to the plan you put out there. Do you have confidence you’ll make it up in March? If you do after you think hard about it, then you’ll know in a few weeks. But don’t wait for those weeks to pass to get your mind into the issue.
Step 4: Re-forecast Q1 and the rest of 2013 based on what you expect the actuals for Q1 to be. Again, go deep. You just created an annual plan so the process and the numbers should be fresh. Use it to re-forecast based on the new information you learned in January, February, and Step 2. Get it in shape so that after you know the score for Q1, you can quickly put it in front of the board.
Step 5: Call a board meeting for around April 15. Make this a Q1 review and Q2 – Q4 planning meeting. As part of this, get a new 2013 plan approved that takes into consideration what you learned in Q1.
Don’t panic, but don’t be caught off guard. Assume you won’t make things up and get ahead of them by figuring out what your real trajectory is.
Oh – and if you are beating your Q1 plan, then start thinking about how you can accelerate and grow even faster!
Following is a guest post from Chris Moody. Chris is president and COO of Gnip, one of the silent killers in our portfolio. Once the mainstream tech press starts noticing Gnip, they will be blown away at how big they got in such a short period of time by just executing. Chris is a huge part of this – he joined Gnip when they were 10 people and has been instrumental in working with Jud Valeski, Gnip’s founder and CEO, to build a mind blowing team, business, and market leadership position.
Following is a great email Chris sent me Friday night in advance of the Foundry Group “Scaling Your Company Conference” which we are having this week for CEOs of companies we are investors in that are on the path from 50 to 500 people.
Startups that experience success are typically built upon a strong foundation of trust among the early founders/employees. This trust has been solidified through long days/nights in small offices working on hard problems together. The amazing thing is that the founders don’t always realize that their company is even operating under an umbrella of trust or that trust is one of their core values. Instead, they just know that it feels easy to make decisions and to get shit done.
When companies try to scale, one of the biggest mistakes they make is trying to replace trust with process. This is rarely a conscious decision; it just feels necessary to add new rules in order to grow. After all, there are a lot of new people coming into the company and it isn’t clear which of the new people can be trusted yet.
A startup obviously needs to add process in order to scale, but if you replace trust with process, you’ll rip the heart right out of your company. When adding processes, ask yourself the following questions:
- Does this new process help us go faster?
- Does this new process help us be more efficient?
If the answer to these questions is “yes” you are off to a great start.
Now ask yourself “Are we adding this process because we don’t trust people to make decisions?” If the answer to this question even has a hint of “maybe” you need to stop and really consider the cost of that process.
Replacing trust with process is like a cancer that will spread quickly and silently throughout the company. One day you’ll wake up and think “this place doesn’t feel special any more” or ask yourself “why is it so hard for us to get stuff done?”
Trust could be one of your most valuable company assets. As a leader, you need to fight like hell to protect it. If you are successful protecting trust, you’ll actually grow much faster and you’ll still have a place where people love working.
I’ve seen trust work at a 700 person company. Trust can scale.
My shift from manager hours to maker hours is officially over. I’ve learned a lot the past two months about how I work and the challenges of trying to both shift to maker hours as well as be effective in a blended manager / maker world.
I started out in June with a hard shift to maker hours. I only scheduled calls between 1pm and 4pm – the rest of my time was unscheduled. I was able to maintain this rhythm for about 30 days before my scheduled time expanded to 5pm, then 6pm, then noon. Ultimately the backlog of “other stuff” started to creep in and it was hard to ignore it.
My primary maker task was writing – I finished Startup Communities: Building an Entrepreneurial Ecosystem in Your City, the second edition of Venture Deals: Be Smarter Than Your Lawyer and Venture Capitalist, and made some, but not nearly enough, progress on Startup Life: Surviving and Thriving in a Relationship with an Entrepreneur (which I’m writing with Amy). There was a lot of overhead associated with each book as I worked on the website (I’ll finally launch the Startup Revolution site later this week), some publisher stuff (which I knew about from before), and plenty of “edit cycle” stuff.
I discovered that I could only write effectively for four hours a day – any more than that and whatever I did was crap. If I did anything – even check my email – before I started writing, I got virtually nothing done that day. So – the ideal “writing day” was “get up early, have coffee, write for two hours, run, write for two more hours, switch into manager mode and deal with everything else.”
The magic lesson here is something I already knew – my best time for creative work is from 5am to 7am. This is my normal rhythm that I’ve had for a long time. Trying to change it was hard and when I reflect back on things I’m not sure I was any more productive than if I had simply decided to be incredibly disciplined for the past 60 days and just written every morning from 5am to 7am and then let the day be whatever it was.
As I shift back to manager mode, that’s the approach I’m going to take for August and see what it gets me.