You Don’t Mean Average, You Mean Median

Every quarter, without fail, a bunch of articles appear talking about the venture capital industry’s investment pace as a result of the PWC MoneyTree report.  I used to get calls from all of the Denver / Boulder area reporters about my thoughts on these – that eventually stopped when I started responding “who gives a fuck?”

A few days ago I got a note from Steve Murchie about his new blog titled Angels and Pinheads.  I’m glad Steve is blogging about this as he’s got plenty of experience and thoughts around the dynamics of angel investors – some that I agree with and some that I don’t.  Regardless, my view is that the more there is out there, the better, as long as people engage in the conversation.

In his post Mind the Gap he made an assertion that “the VC industry has effectively stopped investing in seed stage ($500K and less) and startup-stage ($2M and less) opportunities.”  As a VC who makes lots of investments between $250k and $2M, and who has plenty of good friends who happen to be VCs that also make investments in this range (such as Union Square Ventures, First Round Capital, True Ventures, SoftTech VC, FB Founders, Alsop Louis, O’Reilly Alpha Tech, and Highway 12), I thought Steve’s assertion was wrong and I told him so in the comments.  He countered with the PWC Moneytree data on Q3 VC investments.

Stage           Total ($M)   % of Total   # Deals   Avg / Deal ($M)
Later Stage           1611        33.49       168               9.6
Expansion             1610        33.48       185               8.7
Early Stage           1081        22.49       198               5.5
Startup/Seed           507        10.54        86               5.9
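
As a sanity check, the last column is just the total divided by the number of deals. Here is a minimal Python sketch that recomputes it from the figures in the table; the dictionary and function names are made up for illustration and are not anything PWC publishes.

```python
# Figures copied from the table above: stage -> (total in $M, number of deals).
stages = {
    "Later Stage": (1611, 168),
    "Expansion": (1610, 185),
    "Early Stage": (1081, 198),
    "Startup/Seed": (507, 86),
}


def average_deal_size(total_musd, deal_count):
    """Plain arithmetic mean: total dollars divided by number of deals."""
    return total_musd / deal_count


# Round to one decimal place, the same precision the table uses.
averages = {
    stage: round(average_deal_size(total, count), 1)
    for stage, (total, count) in stages.items()
}

for stage, avg in averages.items():
    print(f"{stage}: ${avg}M per deal")
```

Note that this calculation only ever sees totals and counts, so it says nothing about how the individual deals are distributed around that number.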

Steve’s response to the Startup/Seed “Average Deal Size” was “WTF??!”  While that is the correct reaction, his conclusion (that VCs aren’t investing between $250k and $2M) is incorrect for two simple reasons: (1) the data in the PWC MoneyTree Report is incorrect and incomplete and (2) the interesting number to look at, assuming the data is correct, is the Median, not the Average.  If you wonder why, Wikipedia’s explanation is pretty good: “The median can be used as a measure of location when a distribution is skewed, when end values are not known, or when one requires reduced importance to be attached to outliers, e.g. because they may be measurement errors.”
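
To make the average-versus-median point concrete, here is a small Python sketch. The deal sizes are hypothetical numbers invented for illustration (they are not MoneyTree data): a handful of small rounds plus two big ones.

```python
from statistics import mean, median

# HYPOTHETICAL deal sizes in $M, invented for illustration only.
# Five small rounds and two large ones that skew the distribution.
deal_sizes = [0.5, 0.75, 1.0, 1.5, 2.0, 12.0, 15.0]

avg = mean(deal_sizes)    # pulled upward by the two large rounds
mid = median(deal_sizes)  # the middle value once the deals are sorted

print(f"average: ${avg:.2f}M")
print(f"median:  ${mid:.2f}M")
```

In this made-up sample the median is $1.5M while the average is more than three times that, even though five of the seven deals are $2M or less. The two outliers move the average a lot and the median not at all.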

Let’s look at the underlying data in Silicon Valley (that results in the above table) to understand this better.  Going to the PWC Moneytree Startup/Seed investments in Silicon Valley for Q309, you get the following:


The first six “startup/seed” investments each raised $10M or more.  Now, I’ll accept that these might be classified as “startup rounds” (i.e. the first round of investment) but no rational person would categorize these as seed investments.  But, for purposes of this example, let’s keep them in the mix.  The average is $6.4M and the median is $5.0M.  Now, let’s toss out only the ones $10M or greater since these clearly aren’t “seed” investments.  Our average is now $3.4M and the median is now $2.0M.

I’m still feeling generous (i.e. I’ll waive reason #1 – that the data is incorrect / incomplete – for the time being).  Let’s look at the PWC Moneytree Startup/Seed investments in New England for Q309.


The average is $8.4M and the median is $5M.  Now, toss out everything above $10M.  The average is now $3.9M and the median is $4M.

But it gets better.  Let’s take all of the PWC Moneytree Startup/Seed Investments in the US for Q309.  There are 86 of them and as we know from the first table the average is $5.9M.  But the median is $4M.  Now, toss out the ones above $10M.  The average is now $4M and the median is now $3M.  This exercise – again, assuming the data is correct – shows the difference between average and median, as well as how much the numbers are skewed upward by “startup/seed” investments of $10M or more.
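
The “toss out the big ones and recompute” exercise is easy to repeat on any list of deals. A rough Python sketch follows, assuming a cutoff that drops every deal of $10M or more; the function name and the sample deal sizes are hypothetical, not the actual MoneyTree list.

```python
from statistics import mean, median


def summarize(deal_sizes, cutoff=None):
    """Return (average, median) of the deals, in $M.

    If a cutoff is given, deals at or above it are dropped first.
    Treating exactly-$10M deals as "too big" is an assumption here.
    """
    kept = [d for d in deal_sizes if cutoff is None or d < cutoff]
    return mean(kept), median(kept)


# HYPOTHETICAL deal sizes in $M, invented for illustration only.
deals = [0.5, 1.0, 2.0, 3.0, 4.0, 10.0, 20.0]

all_avg, all_med = summarize(deals)
cut_avg, cut_med = summarize(deals, cutoff=10.0)

print(f"all deals:   average ${all_avg:.1f}M, median ${all_med:.1f}M")
print(f"under $10M:  average ${cut_avg:.1f}M, median ${cut_med:.1f}M")
```

With these made-up numbers, dropping the two large rounds cuts the average by more than half while the median only moves from $3M to $2M, which is the same pattern as in the real figures above.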

I’m not going to try very hard to show that the data is incorrect, but I’ll give you two examples.  The first is FourSquare, a well-known seed investment led by Union Square Ventures and O’Reilly AlphaTech.  It was a $1.35M financing for a company with three employees, and it occurred in 9/09.  This is about as close to the definition of a seed investment as you can get.  Yet, PWC classifies it as Early Stage (plus they got the investment amount wrong as they list it as $1.15M).  For reference, Dow Jones VentureSource classifies this as a seed investment and gets the amount right.

Let’s do another one.  This time look at what PWC MoneyTree has on First Round Capital


compared to what Crunchbase has on First Round Capital for Q309.


The differences that I think are incorrect on PWC’s part are that (1) GumGum is missing, (2) CoTweet is classified as Early Stage instead of Seed, (3) BigDeal is missing, (4) DNAnexus is missing (although it looks like it might have happened in Q2 even though it was widely reported in August), (5) Continuity Engine is classified as Early Stage instead of Seed, (6) ClickEquations is missing, (7) Sofa Labs shows up twice, and (8) Sofa Labs is classified as Early Stage instead of Seed.  Now Crunchbase is missing Project Fair Bid (even though they reported on it) so they aren’t perfect, but at a minimum the misclassification between Seed and Early Stage is dramatic.  Just for grins I looked these up in Dow Jones VentureSource and their data is closer to CrunchBase’s (especially the Round Type), but there are still differences.

Ever since I started investing in 1994 I’ve heard people spouting VC investment statistics to justify different viewpoints.  I’ve always felt this was a “garbage in / garbage out” phenomenon.  While there are some academics that do rigorous work around this (and understand the difference in importance between averages, medians, and er – statistically significant results), they are few and far between.  And – most of the data people actually use and discuss is stuff like the PWC Moneytree Report.

I keep fantasizing that this madness will stop, but I doubt it will.  In the mean time, I think I’ll go for an average run at a median pace.