Web stats

Statistics are hard, as Charles helpfully pointed
out to me a couple of weeks ago. But one has the idea that techier people
grasp the relevant concepts more than your standard arty journalist might. And
then one reads Paul Boutin
in Slate:

While the Web guys admit they could be off by half, Nielsen claims its television
ratings have a margin of error of 4
percent.

If you follow that link, you’ll find that it doesn’t quite say what Boutin
says it says. In fact, the words "margin of error" don’t even appear.
Rather, one finds this:

According to sampling theory and a very tasty laboratory test, 19 out of
20 times we take a well-stirred sample of soup containing 5,000 vegetable
pieces, we get between 48% and 52% carrots. There is no guarantee that the
percentage of carrots in a sample of this size will be between 48% and 52%
(one time in 20 it will be outside this range, but usually not far outside
this range). The same sampling errors apply to a representative sample of
television viewers.

Ignore the carrot language for the time being. What Nielsen is saying here
is that the company is 95% certain that its TV ratings are within 4 percentage
points of, well, something. But that something isn’t the "true figure"
– the actual number of households watching a certain program. Nielsen
first assumes that its sample is perfectly representative (that’s what
they mean by "a well-stirred sample"); only then does it
calculate the margin of error. (This is true of all opinion polls, by the way,
including – and especially – political ones.)

In other words, there are two ways that Nielsen can be more than 4 points out in
its TV ratings. On the one hand, it could simply be unlucky. Indeed, even with a
perfect sample, about 5% of its ratings will be more than 4 points out; it’s just
that no one knows which 5% they are. Alternatively, its methodology could be
imperfect. Any problem with the representativeness of the sample, or reporting
bias, or technological glitches, is not included in what Boutin calls the
"margin of error". Which means that if there were any way of actually measuring
exactly how many households were watching a given TV program on a given night,
we’d find that more than 5% of Nielsen’s ratings would be more than 4 points
off base.
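
To put rough numbers on that argument, here is a minimal sketch in Python. It
is not Nielsen’s method: it assumes sampling error is normally distributed,
treats the quoted 4 points as a 95% margin of error, and models an
unrepresentative sample as a fixed offset. The names `miss_rate` and `bias` are
made up for illustration.

```python
import math


def normal_cdf(x: float) -> float:
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def miss_rate(margin: float, bias: float = 0.0, z: float = 1.96) -> float:
    """Chance that a rating lands more than `margin` points from the truth.

    Assumption: sampling error is normal and `margin` is the 95% margin of
    error, so one standard error equals margin / z.
    `bias` is a hypothetical systematic offset (unrepresentative sample,
    reporting bias, glitches), in the same units as `margin`.
    """
    standard_error = margin / z
    # The rating is off by bias + standard_error * Z, with Z standard normal.
    upper = normal_cdf((margin - bias) / standard_error)
    lower = normal_cdf((-margin - bias) / standard_error)
    return 1.0 - (upper - lower)


if __name__ == "__main__":
    for bias in (0.0, 1.0, 2.0):
        print(f"bias {bias:.0f} points -> miss rate {miss_rate(4.0, bias):.1%}")
```

With no bias the function returns about 5%, which is the "unlucky" case. Any
non-zero bias, in either direction, pushes the miss rate above 5%, and the
stated margin of error says nothing about how large that bias is.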

But there isn’t. So Nielsen ratings are accepted as the least bad option for
broadcasters and advertisers. On the web, of course, there are alternative ways
of measuring traffic to websites – looking at one’s own server logs being
the most obvious – and so it’s much easier to tell when Nielsen is wide
of the mark. "The more I dig into how Web ratings work, the more I realize
people in other media are in denial," says Boutin. Which might be true,
if people in other media really believed the Nielsen rankings. But in fact,
those people are simply making the only decision they can make: to take Nielsen’s
figures at face value, because there is no alternative.

Anybody counting anything is going to make a mistake. Take SAT scores, for
instance. There will, on occasion, be errors in the way that a certain person’s
test has been graded. The machine goes wonky, the wrong score is spat out, with
major or minor consequences. One hopes those errors are very infrequent. But
more to the point, there will often be occasions when someone with high scholastic
aptitude gets a low SAT score, or someone with low scholastic aptitude gets
a high SAT score. Everything from a hangover to a complicated love life to a
successful test-cramming service can affect SAT scores, which means they are
a far from perfect proxy for whatever it is they’re trying to measure. But it’s
useful for there to be something standard and quantifiable in the academic world,
and the SAT is one of the least-bad options.

The fact is that it’s not people in other media who are in denial, it’s Boutin’s
"web moguls". They think that because they have hard-and-fast numbers
for their own website, there is or can be some kind of knowable truth about
how many pageviews and unique visitors they have. In reality, however, just
like anything else quantifiable, there are going to be measuring mistakes both
big and small. Everybody else has been resigned to this for decades. It’s only
on the web that people still dare to hope.
