# The Bullshit of Government Statistics

I just got the following breaking news alert from The New York Times.

“U.S. Economy Adds 290,000 Jobs in April; Jobless Rate Rises to 9.9%”

Let’s parse this.  The first clause says “U.S. Economy Adds 290,000 Jobs in April.”  This means to me that a bunch of people found new jobs in April.  A bunch.  Yay!  Good economy.

The second clause says “Jobless Rate Rises to 9.9%.”  This means to me that the number of people in the U.S. that don’t have jobs went up in April.  A quick search showed that the March “jobless rate” (actually the unemployment rate) was 9.7%.  That’s a big relative jump, especially given that it was 9.7% for the first three months of 2010 according to the Bureau of Labor Statistics Economic News Release titled Employment Situation Summary that came out a few minutes ago.  Boo!  Bad economy.

How could this be?  The simple explanation is mid-way through the WSJ article titled U.S. Added 290,000 Jobs in April which appeared about six minutes after the NYT article:

“The two numbers are calculated by the Bureau of Labor Statistics in different ways. The payroll figure is taken from a survey of employers, while the jobless rate is calculated using a household survey.”

I just read through the BLS report and looked at a few of the tables.  Yes, there’s a ton of data here.  However, it breaks all kinds of rules about how to present data to reach a conclusion.  Our friends at the BLS need to hire Edward Tufte to get some help with their data presentation skills.

There are now two stories based on two completely different calculations munged together into one sound bite.  The explanation will likely turn into “more people are looking for jobs now.”  But why is the denominator shifting around?  Weren’t those people already jobless (unemployed), even though they weren’t looking for jobs?  Oh – wait, if we include the people not looking for jobs in the historical unemployment calculation, the unemployment rate goes up, maybe by a lot.  Eek – wouldn’t that be more scary?

It’s a simple game the government is playing with the numbers.  Occasionally I’ll run into a company that does this – usually around revenue vs. gross margin dynamics, or bookings vs. revenue, or GAAP accounting vs. actual cash flows (where what really matters is cash flows.)  Picking the better number vs. dealing with reality is disingenuous at best; presenting them in conflicting ways that obscure the message is bullshit.

Oh – and 20 minutes later the newest NYT Breaking News Alert is now “Four-Month Rise Strengthens U.S. Job Outlook.”

• Not only does the current method not count people who have given up on finding employment, it also fails to count people who are seeking a full-time position but getting by with part-time.

Worse, the method of measuring this statistic has changed considerably over time, yet we compare the resulting figure across different eras. If we measured unemployment today with the same statistical measures that were used in the 1930s, we would have a much clearer picture of just how bad it got this time around.

• Julian Gallow

So true. "Unemployment" is measured as "people claiming unemployment benefits" and leaves out those whose benefits have expired [which always overstates European unemployment relative to US unemployment because their expiration occurs much later, if at all], the under-employed, discouraged workers, people studying where they would work if they could, and people retiring where they would work if they could. It also leaves off immigrants returning home as their work has dried up – not exactly unemployment, because they are no longer here to be counted, but it is certainly job shrinkage.

The correct measure would be working-age population (236 million over-16s in 2008) less those in work (144 million), less those studying or retired, less those unable to work (injured, etc) less those choosing not to work (homemakers, lottery winners, etc) plus the factors described above. In other words, the statistic doesn't answer the question, "How many people who can and want to work at a given level are doing so?". Harder than just reading the total line from the unemployment claims, I know.

• "It’s a simple game the government is playing with the numbers."
http://en.wikipedia.org/wiki/Boskin_Commission

• Not to get on a tangent on CPI, but amazingly the *1995* Boskin commission had nothing to say on bogus hedonic fudging on gizmos (your \$499 TV really costs \$399, you see, because it has a bigger screen than a \$499 TV 3 years ago) or how owner equivalent rent doesn't make sense in a housing bubble – circumstances relevant to *2000s*. These things were only marginal or experimental in how CPI was calculated at the time of the commission.

Except briefly in 2009, CPI has been consistently and deliberately *understating* inflation, not overstating.

• Of course, how things are calculated is only half of the problem; even companies and governments that keep data normalized and methods consistent have the opportunity to game the numbers by moving capital around, deferring/accelerating investments across accounting periods, etc. Interestingly, this was the topic of Tufte's dissertation (yeah, I actually read Tufte's dissertation).

• I've always hated apples and oranges comparisons, especially when someone tries to make an "apples and apple-equivalent oranges" rationalization.

• Michael

A few more jobs are being added, but a few more people than that join the labor force to try to get those jobs; thus the increase in the UE rate, which is typical at the beginning of most expansions.
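To make that arithmetic concrete, here is a rough sketch in Python. The head counts are made up for illustration (they are not the BLS figures), and it treats both headline numbers as if they came from a single survey, which they do not. It only shows that "jobs added" and "rate up" can both be true at once when the labor force grows faster than employment.

```python
def unemployment_rate(employed, unemployed):
    """Unemployed as a percentage of the labor force (employed + unemployed)."""
    labor_force = employed + unemployed
    return 100.0 * unemployed / labor_force

# Hypothetical head counts, in thousands (illustrative only, not BLS data).
employed_before = 139_000
unemployed_before = 15_000

jobs_added = 290     # net new jobs in the month
new_entrants = 600   # people who start looking and so join the labor force

employed_after = employed_before + jobs_added
# The labor force grows by new_entrants; whoever did not land one of the
# net new jobs is now counted as unemployed.
unemployed_after = unemployed_before + new_entrants - jobs_added

rate_before = unemployment_rate(employed_before, unemployed_before)
rate_after = unemployment_rate(employed_after, unemployed_after)

print(f"before: {rate_before:.1f}%  after: {rate_after:.1f}%")
```

With these assumed inputs employment rises and the rate rises too, because 600 thousand people joined the hunt for 290 thousand net new jobs.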

• Tracy Hall

While the mix of data sources is reprehensible, please do recognize that the result *can* still be true – if population (i.e., people entering the job market) grows faster than jobs…

• Rob

The method may not account for all of the unemployed, but this is as it has been through more than one administration. Thus, while you may disagree with the method of counting, the consistency of using this method gives us a barometer for change. If it moves from 9.7% to 9.9%, it indicates an increase in unemployment based on the current and consistently applied method of accounting. Additionally, the same increase spread over a larger population count would actually show a smaller % increase.

With regard to presenting a conflicting message, I think it displays a more genuine perspective than you seem to give it.

The message is in the conflict. Conflicting information does not always obscure a message, but can in fact provide relative clarity and in this case temper the enthusiasm of those hopping on the single 290,000 jobs added bandwagon (reflected in the market at present).

It seems odd to me that since information was clearly shared on how the statistics are gathered and clarity is provided around why we would see an increase in unemployment as jobs rose, you would cry foul.

Personally, I too think we should find a more accurate method of accounting for unemployment, but I don't imagine it would have changed today's message much.

• Actually, my biggest issue is how this translates into sound bites.   Just read the headline. http://money.cnn.com/2010/05/07/news/economy/jobs…
Yeah – if you read the whole article AND apply critical thinking to it, you might reach the same conclusion.  My fear is that most people won’t do the thinking and as a result just get the sound bite.

• Max Lybbert

I had forgotten that the numbers are calculated differently. The main claim is simply that the official unemployment figure is "the number of people without jobs who are looking for jobs." That does not include "people who want jobs but have given up looking" — a.k.a., discouraged workers. The rise is being attributed to discouraged workers looking for jobs, meaning that they get added to the unemployed numbers. Also, of course, each month new people enter the workforce but may not get jobs right away.

• This is unfortunately a widespread problem. Look at how inflation numbers are tortured until they give the answer the government wants to hear (2%, give or take, no matter what happens to real estate or food or energy – you know, the essentials whose price changes actually matter to consumers). That's why people end up creating sites such as shadowstats.com.

Tufte has recently been hired by another government agency, but I don't have high hopes because I think the obfuscating is deliberate rather than the byproduct of incompetence. We might end up with better-looking charts that give a veneer of credibility to that bullshit, with no change to the underlying fudging.

• Great additional point.  The lack of “normalizing the data over time” – continually varying (or changing) the methodology and then comparing the numbers, is another consistent trick the government plays on us.

• On the other hand, it makes sense to have two different metrics developed in two different ways. They allow you either to have more faith in your conclusion or to ask deeper questions about what's going on in the economy. It's common to make compromises in measurement approach because of problems with measurability. This is okay provided that you measure the same way each time.

For example, how do you measure the number of people who have given up looking for a job but will resume their search if/when the economy picks up? Just who are all these people for whom work is apparently optional?

Frequently changing the methodology and then comparing "old method numbers" to "new method numbers" is just inexcusable.

• Norris Krueger

From my own sordid days in economic forecasting – data torturing can be necessary but can become a game unto itself. The numbers are politicized, but the bigger political influence can be the economists/statisticians trying to make the data look elegant. A given data series “ought” to look like X. (And any time you smooth data, you risk ADDING bias. Use a 4-month moving average to smooth the data & you may well suddenly find a 4-month cycle in your data.)
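A quick way to see the smoothing point for yourself is the sketch below. It is a toy experiment on simulated white noise, not on any real data series; the seed, series length and helper names are all assumptions made for illustration. Smoothing pure noise with a 4-period moving average makes neighbouring values strongly correlated (often called the Slutsky-Yule effect), and that induced structure is the kind of thing that can be mistaken for a cycle.

```python
import random

def moving_average(xs, window):
    """Simple trailing moving average over a list of numbers."""
    return [sum(xs[i:i + window]) / window for i in range(len(xs) - window + 1)]

def autocorrelation(xs, lag):
    """Sample autocorrelation of xs at the given lag."""
    n = len(xs)
    mean = sum(xs) / n
    variance = sum((x - mean) ** 2 for x in xs)
    covariance = sum((xs[i] - mean) * (xs[i + lag] - mean) for i in range(n - lag))
    return covariance / variance

rng = random.Random(0)                               # fixed seed for repeatability
noise = [rng.gauss(0.0, 1.0) for _ in range(5000)]   # no structure by construction
smoothed = moving_average(noise, 4)

raw_r1 = autocorrelation(noise, 1)
smooth_r1 = autocorrelation(smoothed, 1)

print(f"lag-1 autocorrelation, raw noise: {raw_r1:.2f}")
print(f"lag-1 autocorrelation, smoothed:  {smooth_r1:.2f}")
```

The raw series should show essentially no correlation between consecutive values, while the smoothed one shows a lot, even though no real pattern was ever there.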

Anyway, do you want data that is timely, accurate, consistent, cheap and actually measures what you want? Can’t have it. Pick two or three. (And/or find clever proxies like percent of MSA’s beer consumption that is microbrew.. great measure of local economic confidence.)

I can remember when the CPI was measuring inflation as ~2-3% but if you looked at less-current data (out of the GDP calcs) like the implicit deflator for Personal Consumption Expenditures, the inflation number was 7-8% inflation. The President even tried to deep-six that data, LOL.

What I hear bubbling up in the comments is the question of “Are we measuring the right thing? Or are we measuring a proxy?” As a predictor of anything useful, EMployment data is much better (less squirrelly than UNemployment).

There are measures of underemployment but they basically look at PT versus FT hours, not people taking a lesser job. And… what about the folks working off the books? That has to be going up (always does in recessions, often surging as the economy starts to rebound). One way to proxy that is simply the amount of US currency per capita, btw. Anyway, how do we fit *those* people in?

I also remember they used to have two different monthly series for household/personal income – they were never in sync. The seemingly messier survey-based data actually was more valuable because you got it immediately – the census-type data takes months & months to get right. The survey-based data was more accurate & timely while the census data was more precise. (oops- terminology: Imagine firing a series of shots at a target – accuracy is how close to dead-center the shots are on average; precision is how tightly they cluster. You can be very precise and very off-target.)
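The target-shooting picture can be sketched in a few lines of Python. Everything here is simulated: the "true" value, the size of the bias and the noise levels are made-up assumptions, and the `survey` and `census` labels are just names for the two kinds of error, not models of the actual income series.

```python
import random
import statistics

rng = random.Random(42)   # fixed seed for repeatability
TRUE_VALUE = 100.0        # hypothetical "true" figure we are trying to measure

# Accurate but imprecise: centred on the truth, widely scattered.
survey = [rng.gauss(TRUE_VALUE, 10.0) for _ in range(1000)]
# Precise but inaccurate: tightly clustered around an (assumed) offset.
census = [rng.gauss(TRUE_VALUE + 15.0, 1.0) for _ in range(1000)]

def bias(samples):
    """Accuracy: how far the average shot lands from dead-center."""
    return statistics.mean(samples) - TRUE_VALUE

def spread(samples):
    """Precision: how tightly the shots cluster (sample standard deviation)."""
    return statistics.stdev(samples)

for name, samples in (("survey", survey), ("census", census)):
    print(f"{name}: bias={bias(samples):+.2f}  spread={spread(samples):.2f}")
```

The tightly clustered series looks more trustworthy shot by shot, but its average is the one that misses the target.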

Oskar Morgenstern (of von Neumann & Morgenstern fame) did a famous study in 1950 of US government economic statistics. His conclusion: pretty accurate..plus or minus 50%. All that noise often makes the data less than useless. He recommended that we trade precision for accuracy. If you add biases that are known, it’s much easier to adjust them usefully than if you have consistent data where the biases are unknown (see ‘ridge regression’). Anyway, I think it was Kenneth Boulding who reprised that study in the 1980s and found… the same as Morgenstern, LOL

One final thought: Just remember that the serious analysts know this & don’t act based on soundbites but, like the politicians & media, they are more than happy to use them to rationalize what happened. To quote Mr Twain, “predicting is hard, especially about the future”! 😉

• Michael

They also fail to mention temporary government jobs like census workers, which at best blurs the picture.

SUCH language, bovine scatology, indeed!

Yup, again, now that you've described your mother's accomplishments and shown her picture, no one's going to blame your language on her!

My language would be much worse than yours.

But, of course! Every business needs a 'business model', and the media has BS! They REALLY like it, and that's what they do: Everything at the 4th grade level except sex at the 10th grade level except anything having to do with numbers at the 2nd grade level or lower.

Some of the Fed stats may actually be okay, but the newsies are grotesquely incompetent at reporting them and don't know and don't care.

So, for

"The first clause says 'U. S. Economy Adds 290,000 Jobs in April.' This means to me that a bunch of people found new jobs in April. A bunch. Yay! Good economy."

Well the clause more likely means that in April the number of people hired was H and the number of people fired was F and H – F = 290,000. So, if F > 0, and I have to believe it is much greater than 0, then nicely H is much greater than 290,000. So, the number of people who "found jobs" is greater than 290,000.

But any such details are way, WAY beyond the 2nd grade and, hence, also the newsies.

In particular they just will NOT give definitions of their terms or quantities. And we don't know how the measurements are done. We don't get graphs of the quantities over time or in comparison with related quantities. Net, we get BS.

The failings of the media go on and on.

Yup, a nice solution is possible. And the solution solves this problem and many more. Working on it. Need to type in some more Web pages!

• Adding to Tufte's design constructs that would illuminate this confusion, is a new book on understanding health statistics
How to see through the hype in medical news, ads and public service announcements.

While you and most of your readers have the background and experience to see through the planned confusion, most readers do not understand messages like "Colon cancer will strike 1 in 19 people" or the exaggerated numbers in colon cancer survival statistics. This book does an excellent job of giving laymen the tools to see through the hype and planned misinformation.

I'll go further, it is time for the VC community to get aggressive behind a GOV2.0 movement, premised on opening government up, and aggressively privatizing bureaucracy with open systems.

See my examples in a series here at Big Government: http://bit.ly/cpFCcu

• digik

The U.S. unemployment figures have been measured this way for at least a decade (not defending it). I first came across it in a class debate topic trying to compare the relative merits of the U.S. economic system vs. European systems – hmm, maybe that was actually almost two decades ago. It was a problem because the U.S. unemployment numbers were reported just as you noted, but European governments tended to report some variant of the others discussed here. So, before even starting, one had to overcome the entrenched thought that U.S. unemployment was lower than European, when actually they were comparable at the time.

If this bothers you, then you could also look at total tax loads and start to see how those differences add up

• They actually have several reported figures. The one generally reported by news sources is the U-3, which ignores the under-employed and discouraged workers. There is, however, another number, the U-6, which is the total unemployment including the under-employed and discouraged workers, and it stays around double what we actually see in the U-3. Here is a good overview of the U-6, and an explanation of it: http://portalseven.com/employment/unemployment_ra
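For anyone who wants to see how the two series differ mechanically, here is a small Python sketch. The formulas follow the BLS definitions as I understand them (U-3 is the unemployed as a share of the labor force; U-6 adds the marginally attached, which includes discouraged workers, plus people working part time for economic reasons, and also adds the marginally attached to the denominator). The head counts are invented for illustration and are not the April 2010 figures.

```python
def u3(unemployed, labor_force):
    """Headline rate: unemployed as a percent of the civilian labor force."""
    return 100.0 * unemployed / labor_force

def u6(unemployed, marginally_attached, part_time_economic, labor_force):
    """Broader rate: adds marginally attached workers (including discouraged
    workers) and involuntary part-timers to the numerator, and the
    marginally attached to the denominator."""
    numerator = unemployed + marginally_attached + part_time_economic
    denominator = labor_force + marginally_attached
    return 100.0 * numerator / denominator

# Hypothetical head counts, in thousands (illustrative only).
labor_force = 154_000
unemployed = 15_000
marginally_attached = 2_400
part_time_economic = 9_000

headline = u3(unemployed, labor_force)
broad = u6(unemployed, marginally_attached, part_time_economic, labor_force)

print(f"U-3: {headline:.1f}%   U-6: {broad:.1f}%")
```

Same economy, same month, two very different headlines depending on which line of the table gets quoted.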

• Ralph

There are quite a few good Wall Street blogs that have reported on these issues for years. I think the issue has a cultural component that does not get any attention. The general populace is completely apathetic about this issue. They don't seem to care whether the data is any good or not. The press finds that if they try to tackle these issues, their readers change the channel. That, or we have become so bad at teaching math that the majority don't even understand that there are some fundamentals about numbers that must be followed for them to have any meaning.

• Norris Krueger

@Jimminy – thanks for the link – I had forgotten the series names. Idaho's numbers are scarier than I thought… +1

@ Brad – not sure it matters as it's your blog but another +1, evoking painful (and amusing) memories of my sordid past.

@everyone – great discussion, thanks!

Not to complicate things- it is also important to disaggregate. Employment changes = Number of jobs created – number of jobs lost.

Even in the best of times, jobs go away; even in the worst of times, jobs get created. We need to look inside the job numbers. Policy prescriptions depend on the details. (And it will affect how we forecast employment numbers.)

More people were laid off in 2000-2001 than in 2008-2010. Think about that.

In this recession, a lot of people got laid off but proportionately there was an even greater dropoff in jobs created. That shortfall in job creation is far worse than in 2000-2001. [The direct link to the Heritage Foundation report: http://bit.ly/dDyiBs (you will like this!)]

Are businesses creating more jobs? Are they laying off fewer? Pretty much every time a recession has ended… it's job creation that led us out. (And *business* formation! Net business formation used to be the #1 or #2 leading indicator of GDP but they changed the data series in ways that made it less predictive.)

Check http://www.youreconomy.org for the underlying data for your state/county/MSA. It has the number of biz establishments entering & exiting (by size of firm and eventually by industry) AND the number of jobs added & jobs lost by: Firm Births & Deaths, Growing & Shrinking (also jobs made/lost by business migration but that's a teeny number). Idaho from 2000-2007 gained ~70K jobs overall… but underneath that there was a lot of churn (287K jobs from births but -282K lost from deaths; 200K jobs from growing firms & 134K jobs lost from firms shrinking; a small net loss from business migration). Where Idaho differs from national norms is in the number of jobs created by births [lots of self-employment but not many scaling], and growing firms could have done better. Deaths & shrinking are comparable to national norms. Hence the policy prescription is to focus on growing firms. But in another state, it might well be the opposite.
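The Idaho numbers above can be tallied in a few lines of Python. The component figures are the ones quoted in this comment (in thousands); the migration figure is not given, so it is left out, and the ~70K overall gain simply implies it was a small loss on the order of 1K. The variable names are just for illustration.

```python
# Idaho 2000-2007, thousands of jobs, as quoted in the comment above.
components = {
    "firm births": 287,
    "firm deaths": -282,
    "growing firms": 200,
    "shrinking firms": -134,
}

net_before_migration = sum(components.values())
gross_created = sum(v for v in components.values() if v > 0)
gross_lost = -sum(v for v in components.values() if v < 0)

print(f"gross jobs created: {gross_created}K")
print(f"gross jobs lost:    {gross_lost}K")
print(f"net (before migration): {net_before_migration}K")
```

A net change of about 70K sits on top of roughly 900K jobs of gross churn, which is the whole point about looking inside the aggregate.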

Anyway, sorry for the long-winded way of saying: another problem with the stats is that they are aggregating multiple dynamics. Do check out http://www.youreconomy.org, which is fun to play with (and annoy government officials with… heh, heh)

• One press spin that is BS that always ticks me off is the often used headline or its variations:

“(blah blah about economy downturn) – yet entrepreneurs are optimistic about the economy. Entrepreneur, Mr X, says ‘my company is booming, (blah blah self promote blah)’ so it seems things are looking up”

umm, someone needs to tell the idiot reporters that entrepreneurs are just about always optimistic!

…one more relevant item, Brad: I believe Uncle Sam is adding hundreds of thousands (I believe I saw like 300k+) of govt jobs just for the census – now that is something that needs to be done, but for that to be bundled into metrics about how our economy is growing certainly qualifies as uber BS.

BTW, – That Dodd finance bill is still looking very hairy and scary…

• I never believe anything the government ever tells me. If the government tells me it's sunny outside but I see it's raining, I'm going to bring an umbrella!

The government is filled with nothing but liars, charlatans, etc. It's disgusting. I wish they'd take a modified, neoteric ethics oath. If they lie, cheat, steal they should be sent to the guillotine!

For real statistics, I'd suggest checking out Shadow Statistics. The real unemployment rate is at around 20 per cent and it will increase in time because the only thing government can do is take away jobs from the private sector into the public sector and once the money completely runs out then we'll all be unemployed!
