Fun with… not understanding statistics

Via Daring Fireball, I came across this piece written by Kevin Drum for Mother Jones on how the Wall Street Journal has manipulated an income chart in order to show that the American middle class has all the money. That’s the chart on the left here:

Income chart

Drum made the chart on the right to show that the upper class, which he defines as anyone who makes more than $200,000, is, in fact, in possession of the bulk of the U.S. wealth.

The problem is, both charts are wrong.

Drum defines the “upper class” as anybody who makes more than $200,000, based on what the Democrats claim are the targets of their tax laws, which seems perfectly reasonable to me.

Inexplicably, however, after lumping the rich together that way, he breaks out every other category individually. In other words, after accusing the Journal of having created an “optical illusion,” he proceeds to do exactly the same thing to prove his point.

That’s not the way this chart should have been drawn. You can’t lump together the statistical groups that are convenient to make your case and fail to do so with the others only because they are not as favourable.

A look behind the numbers

The interesting thing is, there is no need to manipulate the numbers to prove Drum’s point. You just need to present them the right way.

If we define the “rich” as anyone with income above $200,000, then we need to also define the “poor” and the “middle class.” But how?

According to Wikipedia, the poverty line is established at around $10,000 per person. That’s as arbitrary a number as any, but it is an officially recognized one, so we can use it and at least be on equal footing with the definition of the “rich.” The numbers on the Journal’s chart are not fine-grained enough for this income level, so I decided to use $20,000 instead [1].

We can then further define the “working class” as anyone who is above the poverty line but below the median income level (about $40,000, rounded up to $50,000 so that we can use the numbers in the original charts), and the “middle class” as anyone who makes more than $50,000 but less than $200,000 [2].
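For anyone who wants to play along, the four groups can be written down as a small Python function. This is only a sketch: the cutoffs come from the text above, but the names (`classify`, `POVERTY_CUTOFF`, and so on) and the way incomes that land exactly on a cutoff are assigned are assumptions made for the sake of the example.

```python
# Cutoffs taken from the text above. How an income that sits exactly on a
# cutoff is assigned is an assumption of this sketch, not something the
# original charts specify.
POVERTY_CUTOFF = 20_000   # stand-in for the poverty line (see footnote 1)
MEDIAN_CUTOFF = 50_000    # median income, rounded up to match the chart data
RICH_CUTOFF = 200_000     # Drum's threshold for the "rich"


def classify(income: float) -> str:
    """Return the class label for a yearly income in dollars."""
    if income < POVERTY_CUTOFF:
        return "poor"
    if income < MEDIAN_CUTOFF:
        return "working class"
    if income <= RICH_CUTOFF:
        return "middle class"
    return "rich"
```

Calling `classify(35_000)` returns `"working class"`, and `classify(250_000)` returns `"rich"`.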

Once we do that, the chart appears to agree with the Journal’s interpretation of the facts—the middle class does have more money than any other group:

Income distribution by class

But that’s still pointless

This, however, brings us to the real issue at hand. The problem with the chart that appears in the Journal is not visual trickery but a conceptual flaw.

Neither chart in Drum’s article tells us anything useful, because both are concerned with classes and not people. Because classes are arbitrarily defined groups of people, knowing that one owns more of the wealth than another doesn’t really mean much.

On the other hand, we could see how that translates into individual ownership of wealth, that is, how much of the total pie one person of any given class owns. This is where things get interesting, because people are not spread evenly across the groups, as this chart demonstrates:

Population by class

Now we can get a more accurate picture of how wealthy each individual in the four groups I have defined really is:

Percentage of wealth by person

As you can see, the average “rich” person owns a much bigger percentage of the total wealth than the average “middle class” person, while the poor own… well, almost nothing.

This, I presume, is what Drum wanted to demonstrate all along: that while the middle class as a whole may own more wealth than the rich, the average middle class person owns much less than the average rich person—the textbook definition of income inequality.
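The per-person figures boil down to one division: take each class’s share of the total and divide it by the number of people in that class. Here is a minimal Python sketch of that step. The function name `per_person_share` is hypothetical, and the numbers are made up purely to show the mechanics; they are not the data behind the charts.

```python
def per_person_share(class_share: dict, class_population: dict) -> dict:
    """Divide each class's share of the total by its headcount."""
    return {
        name: class_share[name] / class_population[name]
        for name in class_share
    }


# Made-up illustrative numbers, NOT the data behind the charts above.
example_share = {"middle class": 0.60, "rich": 0.40}   # fraction of the total
example_population = {"middle class": 60, "rich": 4}    # people in each class

shares = per_person_share(example_share, example_population)
```

With these invented figures the middle class holds the larger slice as a group, yet each rich person’s slice is about ten times the size of each middle class person’s, which is the same pattern the charts show.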

What does it really mean, though?

Of course, one could take this one step further and wonder what the actual meaning of these charts and the numbers behind them is.

After all, the numbers only tell us that the rich are richer than the middle class, and the middle class is richer than the poor—but that’s exactly what the definitions of “poor,” “middle class,” and “rich” tell us, without the need for two pages of calculations.

My personal perspective is that these numbers on their own tell us nothing one way or another. In a capitalist society, it seems to me that opportunity, and not wealth, is the true determinant of fairness. That is, given an equal set of goals, everyone should have the same access to them, without regard to anything other than their abilities. Income inequality then becomes a function, rather than a cause, of an individual’s willingness to roll up their sleeves and make the most of their natural talents.

Obviously, that’s never the case, and wealth undoubtedly is a factor in real access to opportunity, but that’s not something that we’re going to learn by looking at charts like the ones published by the Journal, by Drum, or by yours truly.

And now, for some intellectual honesty

I would be remiss if I didn’t alert you to the fact that I didn’t have a lot of time to work on these charts, and had to cobble together data from different sources in a hurry. In some cases, I also had to resort to correlating data from slightly different sets by throwing in an intelligent guess or two.

Therefore, it’s possible that my numbers are off and my conclusions incorrect, though I don’t think that’s the case. Or, to be more precise, my numbers may be slightly off, but the end result should be consistent with reality.

  1. This gives me enough data to work with without affecting the end result, because the lower end of the scale has very little wealth.
  2. For those keeping score, this is an ever-so-slightly modified breakdown from Leonard Beeghley’s Structure of Social Stratification in the United States.