12/19

Arbitron doesn’t like Audience Estimate Comparisons

Can you convert streaming metrics to shares and compare those shares to the numbers in an Arbitron ranker?

Of course you can, but that doesn’t mean Arbitron will like it.
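
For the mechanics, here is a minimal sketch of one such conversion. It assumes the streaming service reports total listening hours for a daypart, treats average concurrent listeners (total listening hours divided by the hours in the daypart) as a stand-in for AQH persons, and borrows the broadcast market total from a ranker. Every function name and figure below is hypothetical, and this is not Arbitron's or any streaming service's published method.

```python
def average_audience(total_listening_hours, hours_in_daypart):
    """Average concurrent listeners, used here as a stand-in for AQH persons."""
    return total_listening_hours / hours_in_daypart


def aqh_rating(aqh_persons, population):
    """AQH persons as a percentage of the market population."""
    return 100.0 * aqh_persons / population


def aqh_share(aqh_persons, total_listening_persons):
    """AQH persons as a percentage of everyone listening in the market."""
    return 100.0 * aqh_persons / total_listening_persons


# Hypothetical inputs: one week of Mon-Sun 6a-midnight (18 hours x 7 days).
hours = 18 * 7
stream_aqh = average_audience(total_listening_hours=1_260_000,
                              hours_in_daypart=hours)

broadcast_aqh = 390_000   # assumed market total taken from a ranker
population = 4_000_000    # assumed metro population

# The stream is added to the listening base before computing its share.
rating = aqh_rating(stream_aqh, population)
share = aqh_share(stream_aqh, broadcast_aqh + stream_aqh)

print(f"AQH persons: {stream_aqh:,.0f}")
print(f"AQH rating:  {rating:.2f}")
print(f"AQH share:   {share:.1f}")
```

Whether the stream belongs in the share denominator, and whose population estimate goes into the rating, are exactly the methodology choices Arbitron raises further down.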

Today Arbitron released their “Thoughts on Comparing Audience Estimates,” and while they make some good points, those points don’t tell the whole story.

Let’s dive in.

Don’t compare the two, says Arbitron, because…

One to Many vs. Many to One

Some Internet music services are using the traditional radio audience metrics of average quarter-hour (AQH) and Cume. To date, these metrics have only been applied to “one-to-many” curated broadcast stations, which then can be aggregated to create combinations of stations. AQH and Cume estimates, whether produced by Arbitron or other measurement services, historically have been subject to minimum reporting standards limiting the number of stations that are reported in any individual market, even though some listening occurs to small local or out-of-market stations.

The listening model for most Internet music services is “one to one.” As an example, a user of an Internet music service may not be served an ad until being signed on for a specified amount of time.

The listening model for broadcast radio is “one to many;” specifically, listeners are exposed to the same commercials at the same time and without regard to how long they have been listening to the station.

So what?

Ratings are measures of aggregated listening.  That listening can be gathered into an infinite number and variety of buckets, whether or not they are defined as a “station.”  The role of a station as “one to many” has no bearing whatsoever on the degree to which listeners are aggregated by third parties with different means of aggregating those listeners. The presence or absence of a broadcast tower has nothing to do with the average number of listeners tuned in to any one brand, whether or not each of those listeners has a unique listening experience outside the spots.

Meanwhile, there is no need for every listener to be exposed to the same commercial at the same time in order for them to hear that same commercial on their own time.  Reach is reach.  We’re talking about spots here, not the Super Bowl.

How Estimates are Calculated

Arbitron audience estimates are subject to limitations explicitly cited in our reports, and Arbitron publishes a Description of Methodology that explains in full detail the methods employed in developing our audience estimates. Users of audience estimates from Internet music services should consider whether those estimates come with a similar set of limitations and whether they are accompanied by a detailed description of methodology that allows potential users of the data to evaluate the estimates. Important factors to consider include the source of the population estimates required to create the ratings and the geographic definitions of the “metro survey areas,” as well as many other procedures. Arbitron believes that unless a user of Internet music service audience estimates has directly comparable descriptions of how each of the estimates is derived, the estimates should not be considered equivalent to Arbitron audience estimates.

This is reasonable on the surface.  But any party looking to publish ratings from streaming providers and compare those with Arbitron has an enormous incentive to mirror Arbitron’s methodology to whatever degree possible and sensible.  It is inevitable that the differences between these technologies will only shrink over time.

Indeed, even Arbitron is struggling with how to compare streaming numbers to AQH metrics, which are dated artifacts of a time decades ago when radio featured programs and those programs were 15 minutes long.  Perhaps we should be bringing Arbitron up to date rather than blowing dust onto streaming metrics, which are based on every user with 100% accuracy, not a smattering of sampled users with sketchy accuracy.

Who’s There?

One important attribute of the Arbitron PPM methodology is the number of tools that Arbitron employs to indicate what persons are exposed to and the duration of the exposure. To our knowledge, many Internet music channels simply indicate that a session started. There appears to be no way of confirming if anyone is on the other end throughout the session.

For example, the PPM service requires that panelists keep the device with them and in motion. If the minimum requirement for motion is not met in a day, then the panelist will not be counted. If a panelist moves away from the encoded audio source, the PPM detects this change. Arbitron also applies a Qualification Edit, which is a process that screens data quality, meter status, and motion detection data in order to determine a panelist’s In-Tab status for a given media day.

Likewise, with the Diary service, Arbitron instructs participants to write down when they hear a radio— whether they chose the station or not—from the time they start listening to the time they stop.

Users of audience estimates from Internet music services should consider that while these services may use a “time out” or similar function to determine if someone is listening, this is not equivalent to the Diary and PPM compliance requirements, and that non-equivalency may result in editing rules that impact the audience estimate.

The truth is that “keeping a device in motion” proves only that a listener “keeps a device in motion.”  Arbitron knows when a meter is moving in the proximity of a radio, not whether a listener is actually LISTENING to the radio or even whether that device is in the possession of a listener – or his dog.

Meanwhile, any argument which suggests that diaries are in any way more accurate than streaming data is laughable on its face.  Research has shown that lots of diary completion happens at the end of the day, when the listener “reconstructs” what he listened to throughout the day and fills in blanks with listening that may never have occurred (hence the impact of “recall”).  In some cases listening is fabricated from imagination and the answer to “who’s there?” is “whoever I invent.”

In fact, every medium measured any way you like has a “black hole” where attention is involved.  TV is measured without knowing exactly where attention is being paid, and so it is with radio.

Validation of Self-Reported Demographic/Geographic Data

Users of Internet music service audience estimates should consider whether the service’s self-reported registration data are reliable and that users do not have multiple accounts (the “unique” aspect of the Cume estimate) or have provided inaccurate information about their gender, age, or location. As part of Arbitron’s services, participating households are directly contacted to ensure that household composition data are correct. For example, if a person uses more than one account, it would invalidate any measure of reach or Cume because a single person would be counted more than once.

Again, legitimate on the surface.  But streaming providers can audit this information to determine the rate of inaccuracy, if any.  My gut tells me it will not be significant.

And even with a frictional rate of error, we need to compare that against Arbitron’s heavily audited process on a relatively paltry sample of respondents intended to represent an entire marketplace of consumers.

So pick your errors.  Which do you think is greater?  The errors that come from some self-reported inaccuracies or the ones that come from, for example, amplifying the results of a PPM sample of 1,600 in Boston across dozens of stations, segments, periods, and dayparts in a marketplace of more than 4 million?
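
To put rough numbers on the sampling side of that question, here is a back-of-the-envelope sketch. It assumes simple random sampling and the normal approximation for a proportion, which a weighted PPM panel does not strictly satisfy, so the outputs are illustrative and are not Arbitron's published reliability figures. The 2% rating and the 200-person demo cell are hypothetical; only the 1,600 panel size comes from the example above.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion p from a sample of n."""
    return z * math.sqrt(p * (1.0 - p) / n)


rating = 0.02        # hypothetical: a station with a 2.0 rating
panel = 1600         # panel size from the Boston example
demo_cell = 200      # hypothetical: one demo/daypart slice of that panel

full = margin_of_error(rating, panel)      # roughly +/- 0.69 rating points
cell = margin_of_error(rating, demo_cell)  # roughly +/- 1.94 rating points

for label, moe in (("full panel", full), ("demo cell", cell)):
    print(f"{label}: 2.0 rating +/- {100 * moe:.2f} points "
          f"({100 * moe / rating:.0f}% of the estimate)")
```

Under those assumptions the uncertainty on the full panel is about a third of the estimate, and once the panel is cut down to a single demo cell it is nearly as large as the estimate itself.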
