CENTRAL TENDENCIES

Every writer has an "idiolect," or a personal vocabulary of distinguishing words that they use a lot, or common words they use hardly at all. For example, Shakespeare rarely used the word "also" for some reason. In all of his writing the word appears fewer than 40 times in total. His contemporaries used it as much as we use it today, which is to say: constantly. Shakespeare, for reasons known only to him, just did not use the word much. Maybe he didn't like it, or maybe it reflects some extreme hyper-local dialect of English he learned.

One of the things that stands out about my own writing is how often I use the word "accordingly." Another thing I say a lot – intentionally – is "modal."

I apologize in advance if this is information you already know, but several times in the last year I've had very intelligent people – editors, journalists, people in the publishing industry – send me edits on things I've written indicating that they are not clear on what modal means. It is possible that, working in an academic field in which everyone is used to dealing with data sets, I encounter that term regularly enough to assume it is common. But when people who know the language highlight it and say "This isn't clear, please explain," it obviously is less common elsewhere.

I say "modal" a lot because when people say "average," they almost always mean "modal." It's a pet peeve. Allow me very briefly to explain the most familiar measures of central tendency in data and explain why you see a certain kind of news story in political journalism that incorrectly substitutes average for modal.

Average (or mean) is widely understood. Add up the salaries of a group of 10, divide the sum by 10, and that is the average salary. Unfortunately average is also used in non-data contexts as an adjective meaning "ordinary" or "common." That is bad.

Why? Well here's a true story – of the 11 players on my high school football team's defense, our average net worth today is well over $5 million. Seriously. Ten of us make totally unremarkable incomes doing normal jobs, and the eleventh guy made over $50 million playing in the NFL. On average, we're all worth seven figures!

So, averages can be very misleading. Especially in smaller sets of data.

Median is the middle value that divides a set of observations in half. If the median household income in the US is $56,516, that means 50% of households earn less and 50% earn more. Imagine every observation in the data lined up in a row; the median is the one right in the middle.

While the median household income is $56,000, the average is $79,000. See? High values – people making billions of dollars – skew the average toward the right (on a simple graph).
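To put rough numbers on the football example, here is a minimal Python sketch. The individual net worths are invented for illustration (ten ordinary figures and one of $55 million); only the shape of the story comes from the anecdote above.

```python
from statistics import mean, median

# Hypothetical net worths: ten ordinary teammates plus one NFL player.
# These figures are made up for illustration, not real data.
ordinary = [150_000] * 10
nfl_player = [55_000_000]
net_worths = ordinary + nfl_player

average = mean(net_worths)   # pulled far upward by the single huge value
middle = median(net_worths)  # the sixth of eleven values when sorted

print(f"mean:   ${average:,.0f}")
print(f"median: ${middle:,.0f}")
```

With these made-up numbers the mean lands above $5 million while the median stays at the ordinary figure, which is the same gap that shows up between average and median household income.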

That brings us, finally, to mode. The mode is the most commonly observed value in a data set. This generally is only useful – but then tends to be the most useful – when the data are categorized subjectively. For example, say we decided to categorize households as Rich, Middle Class, and Poor based on some subjective cutoff points. Count up the total number in each category; the one with the most is your mode.
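As a rough sketch of that counting step in Python: the cutoff points and the list of incomes below are hypothetical, chosen only to show the mechanics, and `categorize` is a helper name I made up.

```python
from collections import Counter

def categorize(income):
    """Assign a label using arbitrary, purely illustrative cutoffs."""
    if income < 30_000:
        return "Poor"
    if income < 150_000:
        return "Middle Class"
    return "Rich"

# Hypothetical household incomes, not real data.
incomes = [18_000, 25_000, 42_000, 56_000, 61_000, 75_000, 98_000, 210_000]

# Count the households in each category; the biggest count is the mode.
counts = Counter(categorize(x) for x in incomes)
modal_category, n = counts.most_common(1)[0]

print(counts)
print(modal_category, n)
```

Move the cutoffs and the modal category can change, which is exactly why the subjectivity of the categories matters.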

Another great example of when mode is useful is academic: grades. Say the Dean wants to know how my students did in a course. There are 20 students. I calculate everyone's grade as a percentage of all possible points. I say, "The average grade was 80%." But what does that mean? There are almost infinite combinations of 20 percentages that will average to 80%. Maybe 16 students got 100% (A+) and 4 got 0% (F). Maybe all 20 got exactly 80% (B-). A better way to reflect the performance of the class would probably be to say, 16 students got an A+. But 4 students enrolled and never showed up, so they got an F.
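Here is a small Python sketch of the two classes described above, using only the standard library. The variable names are mine; the grades are the ones from the example.

```python
from statistics import mean, multimode

# Class A: 16 students with 100%, 4 who never showed up and got 0%.
class_a = [100] * 16 + [0] * 4
# Class B: all 20 students with exactly 80%.
class_b = [80] * 20

for name, grades in (("A", class_a), ("B", class_b)):
    # multimode returns a list of the most common value(s).
    print(name, "mean:", mean(grades), "mode:", multimode(grades))
```

Both classes report the same mean, but the mode tells the Dean two very different stories.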

That brings me to the reason I think about this daily: news stories about the "average" voter in the United States.

There is no "average" voter because several of the important variables for "measuring" voters are categories, like race, gender, and educational attainment, rather than continuous values. You cannot "average" race or the attainment of degrees in the American electorate. What you could do, just as one example, is to say that 44% of the electorate falls into the category of "White, no college degree." Therefore, the "Cletus Safari / Diner Enthusiast" guy constantly being interviewed in the media is not average; he cannot be. He is the modal American voter, as long as the criteria of interest are race and education.
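The same point can be sketched in Python. The nine "voters" below are a tiny invented sample, not real electorate data; the sketch only shows that a mean is undefined for category labels while a mode is not.

```python
from statistics import mean, mode

# A tiny made-up sample of voters, labeled by race and education.
voters = [
    "White, no college degree",
    "White, college degree",
    "White, no college degree",
    "Black, no college degree",
    "Hispanic, college degree",
    "White, no college degree",
    "Black, college degree",
    "White, college degree",
    "White, no college degree",
]

# Averaging labels is meaningless; the statistics module refuses to do it.
try:
    mean(voters)
    can_average = True
except TypeError:
    can_average = False

# The mode is well defined: it is simply the most common label.
modal_voter = mode(voters)

print(can_average)
print(modal_voter)
```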

It's a petty hill to die on, and most people understand fine what is meant in common usage when a journalist refers to the "average voter." But it is incorrect as well as silly – because we have a perfectly good term for what he or she actually means.

32 thoughts on “CENTRAL TENDENCIES”

  • J. Dryden says:

    Fun Fact: Upon receiving one’s PhD, “A Petty Hill to Die On” is tattooed on your [redacted by the Illuminati] in prison ink. Ed can leave academia, but that shit’s in his bones, now.

  • I support you in your determination to tilt at windmills, and will myself continue to protest "impact" as a verb. Because I HATES IT. (And don't even get me started on "impactful" dear god.)

  • c u n d gulag says:

    Interesting post, Ed.
    As usual, you find the kernel of something interesting in the most mundane of subjects.

    I'm curious, based on what you wrote, what would the defining characteristics be of a "median" American voter?
    An Independent of uncertain ethnicity and sexual orientation along the NE/KS border?

    Anyone?

  • My pet peeve (and I work in government as a policy analyst (city planner)) is "utilize". Why utilize "utilize" when you can just use "use" instead?

  • YES THANK YOU!

    And if you thought "modal" gets you funny looks, try using "bimodal"! For those of you who don't know: sometimes (e.g. with the A+/F grades example Ed gives), a data set doesn't have your classic one-hump-in-the-middle bell-curve-y sort of distribution, and instead has two (or more) clusters. The "mode" is the most common single value, and "modal" describes data at or near that value, in the middle of that main cluster. But if there are *two* clusters, we say it's "bimodal".

    Which happens a *lot*, and when it does, that's a good way to suggest that mean/median are going to be particularly unuseful ways to describe the data.

  • This is cool because I first looked up the meaning of "modal" while reading one of your posts, and learned something about data analysis. I've been more discriminating about the use of "average" since then, though I don't think I'll start saying "the modal Jo" any time soon. I may instead switch to "the typical Jo", as I think that's the better euphemism for "modal".

    Anyway, thanks for the typically illuminating rant/illustration/lesson! And thanks @blahedo for the cool follow-up!

  • Benny Lava says:

    Modal is a very rarely used term even in the social sciences. It is especially obfuscating because modal has many other competing definitions outside of your niche. Like modal jazz or mode of transportation. So I guess RIP Ed.

  • Benny Lava says:

    P.S. this is why journalists don't use the phrase mean voter even when they know what it means. Because the readers don't due to a competing definition.

  • ProfessorPlum says:

    In the same mode as "use" and "utilize", I hate hate hate it when people talk about their "methodology" when they really are talking about their "method". When did that nonsense start?

  • I agree with complaints about "emails" and especially "impact" and "utilize" plus a few other abuses of word meaning and language usage. Like what's with this "metric" stuff? Huh? Supposed to make you sound like you wear tortoise shell glasses and elbow patches? Oh, god. I just dated myself. Well, why not, no one else will. Never mind.

  • Is there a website that will analyse pasted text for patterns such as over- and underrepresented words, to find such styles? There are word clouds, of course, but they don't quite do that.

  • I'm just glad that somewhere, a copyeditor is employed.

    I hope they're full-time and paid.

  • john danley says:

    Modal should be dried on a low to medium-high temperature and taken from the clothes dryer while slightly damp to reduce wrinkling.

  • Very educational, and thanks for the word "idiolect." I first became aware of my own when being misquoted, with words I never use. It's like seeing your name under someone else's photo.

  • If history teaches us anything, it's that rebelling against a rising (albeit incorrect) common parlance is a futile battle. Literally. <– See what I did there?

  • @ Benny Lava

    "P.S. this is why journalists don't use the phrase mean voter even when they know what it means. Because the readers don't due to a competing definition."

    Yet they use, "Base voter". Base and mean are both words used to indicate the same thing. Maybe it's an inside joke.

    Yes, I'm being deliberately obtuse–practicing in case they win the world.

  • It’s probably out of fashion, but I learned the meanings of “average” (mean, median, mode) from this book: https://www.amazon.com/How-Lie-Statistics-Darrell-Huff/dp/0393310728/ref=sr_1_2?gclid=CjwKCAjwq-TmBRBdEiwAaO1en5zRSVEx1C2Bx14dXaXizG8LFi5GWGpAodm1dpFK9S1_U7U1K5hXQRoCIjQQAvD_BwE&hvadid=241635027273&hvdev=t&hvlocphy=9009746&hvnetw=g&hvpos=1t1&hvqmt=e&hvrand=5502291426301281619&hvtargid=kwd-96237380&hydadcr=21906_10171159&keywords=how+to+lie+with+statistics&qid=1557748235&s=gateway&sr=8-2 The author would later compromise his integrity by coming up with a “How to Lie With Statistics about Smoking” at the behest of the tobacco industry: never published, but cited. That shouldn’t diminish the original book. I was maybe 12 when I first read it.

  • What really annoys me is when people say some form of the following: "You do realize that half the population, by definition, has a below average IQ?" Of course it's easy to get the point they intend to make, and it's perfectly forgivable to confuse mean with median. It's the smarmy, pompous, "look how smart I am" tone it's usually said in that makes it so galling.

  • I just realized that I fit in the category 'white, no college degree' and I'm an FDR Democrat. Any pundit interviewing me in a diner would get some unexpected answers.

  • One of my all time favorites is “Did ya know that most people have more than the average number of legs?”

  • Of course, I could then snark that most voters should not be served with ice cream on top.

    Fact is, "average" is shorthand for "most likely encountered for race, gender and faith" voter. It is a means of faintly obsolescent romanticism among the members of the press, even those outside the mainstream, as they can't seem to recognize that "Christian White Guy" is becoming an increasingly outdated means of measurement of the (strict sense) mode. They can't break out of it, though – which is why they're chasing these guys out in the boonies to "listen to their stories". It beats working, I guess, and gives them a chance to get to travel (although why anyone would want to travel to West Bumfkistan, Iowa, is beyond my mortal ken, if one does not have immediate business or family out there).

    "Christian" doesn't mean a whole hell of a lot anymore – most fools who voted for Donald Trump were not motivated by ostensibly Christian goals so much as maintaining their place in the social pile-up. (I have an atheist friend who hated Hillary Clinton and refused to vote for her – in Wisconsin, no less, which gave me reason to make him less of a friend when Trump won.) Nor does "white" or "male" (or even "heterosexual"), as many white males vote against their own values, needs or even their own empowerment. I get the feeling that white males these days don't *want* to think or do things for themselves. They're living in an age when wanting to do that reveals a certain intellectual and social independence, and many of them have just left their parent's basement…
