Some weird stats around Fairfax’s specialist digital properties …


Was digging through Netview today and noticed some really weird stats around 2 of Fairfax’s ‘marquee’ properties – The Vine and Essential Baby.

Some questions if I may …

How come 60% of Essential Baby’s users are male, when if you look at gender splits across sites like Baby Centre and Bub Hub they’re generally 65%+ female?

And how come 67% of users of The Vine – ie, the youth website targeting 18-29s – are over 35 years of age?

These aren’t made up stats – they’re part of the new Netview panel numbers.

The question needs to be asked, when an advertiser is buying either Essential Baby or The Vine … what are they really buying?

Are they buying hip youth or new mums … or are they buying the same general older male audience they could get across The Age or SMH?

35 responses to “Some weird stats around Fairfax’s specialist digital properties …”

  1. Another cracking bit of insight Ben. I can sense the squirming at Fairfax.

  2. Pingback: 67% of The Vine are over 35 years old says Netview « Sound Alliance Blog

  3. Pingback: Counting eyeballs online – The Content Makers

  4. Hi Ben,

    At this particular point in time, the Netview numbers for many smaller sites across the industry, including The Vine and Essential Baby, are not robust enough yet at a demographic level to give a good indication of actual audience breakdowns. This issue is something that Nielsen are working hard to rectify. Just to confirm – over 70% of TheVine’s Members are aged between 18 – 30 years*, debunking any myth that TheVine is not targeting the youth market. Essential Baby’s audience is made up of 90% women, 9% women trying to conceive and 1% men*.

    *Source: Internal Fairfax Digital Member Survey, June 2009

  5. talkingdigital

    hi melanie – so are you saying the netview numbers for the vine and eb are not robust enough yet at a demographic level? even though they are reporting 282k and 400k ub’s respectively? is this the view of the IAB/MFA/Nielsen?

    so does this mean for any site under 500k (lets say) we should just ignore the demographic info nielsen reports?

    btw nice to have a commentator from fairfax on here

  6. Good sleuthing Ben. I can only hope that you have queried Nielsen Online on an official basis as well.

    A couple of thought starters though. There can be chasms of difference between registered users and active users – which is why I tend to politely ignore registration numbers and profiles. It would be interesting to compare the profile of those who have “ever accessed” those sites (i.e. the reached) versus the last month’s data. If the “ever accessed” looks more like the registration numbers then it is a usage issue for that month, if not we have to dig deeper to see what is driving this skew. It would also be interesting to look at the profile over a few months – i.e. the old (smaller) panel and the new (larger) panel. Was there a discontinuity with the new panel, or has it always been this way?

    Regarding the UA thresholds I know that the increased panel size did allow for a lower reporting threshold – but for the life of me I can’t recall what it is … can anyone at NO or the IAB step in here. Mind you, the lower you go the less reliable the data is as was pointed out. Ben, you also quoted UBs – doesn’t NetView report UAs? Have either you or myself mixed this up, or were you quoting UBs from another source … because UAs do not equal UBs. (I don’t have access to the NetView software so have to work from memory here – a tool known to be faulty at times)

  7. those numbers are netview so technically should be UA … not UB. my bad.

    i guess my question is this … it feels to me that melanie is spinning data she doesn’t like the look of. if the demo info is so unreliable then why have the large publishers endorsed the new netview. appears they’re happy to use the info when it suits and then question it when it doesn’t

    all we want is a clear set of data and no spin. i’m with you john – registered user data really doesn’t mean much as it only tells a small part of the story. there’s a common, generally spoken, theory that FD boosts Vine traffic by attracting middle aged males from their mastheads with promises of young skin. all the nielsen numbers do is back this up.

    it’s as mysterious as the MI data which shows absolutely massive overlap between watoday and brisbane times users and smh and age readers. the cynic in me says it’s just smart traffic pushing/distro. but i’m a cynic.

  8. I haven’t seen that the demo data is “so unreliable”, but at a macro level the new NetView data passes the “well that makes sense” test.

    What is interesting is how come Baby Centre and Bub Hub show the expected female skew, whereas Essential Baby doesn’t. In my many years of panel experience, it would be extremely hard to build a panel which skews in different directions like that. That is, something else appears to be driving Essential Baby’s “maleness”. I wonder if the “time spent” metric has any clues? Do the male visitors spend less time than the female visitors, and different patterns to the two other sites, which would indicate different casualness in visitation by gender?

  9. with Essential Baby these are the numbers I’ve grabbed

    400,000 users
    4.36 sessions per month
    33m, 12s per month
    53 pages per person

    Now looking at these it appears to be a loyal audience. For the main sites these numbers are strong.

    Kidspot.com.au’s numbers don’t tell as good a story … they have 824k users but … users spend only 2m 15s on the site a month across 1.6 sessions and 5 pages consumed. To me this suggests minimal audience loyalty and engagement and heavy reliance on search (ie – in/out traffic rarely returning)

    so compare the 2 and EB looks great. Until … the issue of audience gender split

    EB – 60.25% male, 39.75% female
    Kidspot – 66.05% female, 33.95% male

    Now lets look at web page views composition

    EB – 56.15% male, 43.86% female
    Kidspot – 74.6% female, 25.4% male

    So not only are there more males on EB than females, they’re also consuming more pages.

    So Melanie – what is the reason for this? Is it still the fault of Netview?

  10. Thinking about it and digging through some data … I suspect “story seeding” could be behind the high UA with the male skew.

    The BIG difference in the new NetView panel is the ‘at-work’ usage for ‘at-home’ panellists is now accounted for.

    Is it too big a stretch of the imagination to say that a story on the EB site that is seeded on the SMH site could be drawing in male at-work users who link through to just that story and then vamoose (hence the question about ‘time-spent’ metrics), which drives up the UAs but also introduces a male skew that doesn’t reflect the registered users who would have accessed via the EB home page?

    Just a guess mind you …

  11. i have just contacted megan clarken at nielsen for an official response around this too.

    stay tuned.

  12. talkingdigital

    Ok – so this is what I received from Stuart Pyke from Nielsen as their official response

    “Hi Ben,

    “Nielsen back’s the reliability of our demographics. As both John and yourself have alluded to previously in this thread there are many reasons why member data will differ from the overall data when it comes to demographics. Our overall data obviously includes all the visitors who drop by and read a single article as well as the core readership that no doubt represent the most active users of a site (this is presumably what any publisher would capture in their membership). When there is a particularly popular article that draws visitors from a demographic that is different to the core readership then it follows that the overall demographics that we report in NetView will skew away from the core readership.

    “I should point out that this is not an entirely uncommon occurrence and it happens every now and then for all the large publishers. It almost always is driven by a particularly popular story and has the most dramatic effect when the story is linked from a larger website elsewhere in the publishers’ network (and the larger website has a different demographic to the landing site).”

  13. Pingback: Weird stats - mUmBRELLA

  14. Pingback: The mystery of the parenting site that has lots of males, and the youth hangout for over 35’s … « talking digital – Ben Shepherd

  15. The Quantcast stats offer closer demographics, even though neither of those sites are on direct measure
    http://www.quantcast.com/thevine.com.au
    http://www.quantcast.com/essentialbaby.com.au

  16. Nielsen’s Netview panel is a joke, 7000 people? This has the reliability of TV rating panels…

    Debating the details of such a tiny panel is pointless.

  17. are you serious simon? how many people does the netview panel need for you to consider it not a joke?

  18. Hey ben, As we all know the power of digital media is the capability to track and measure, in theory we should be able to know the actual visitation demographics, not just a general survey.

    7,000 = a sample size of 0.03% and considering the research is making claims across so many demos and profiles it needs to be much more substantial. If the panel was 50,000 it would still only be 0.2% of the population, but a much better sample size.

    Hitwise gathers data from over 3,000,000 internet connections across Australia, 428 times the amount of data being used by Nielsen…

  19. PS Don’t mean to vent, I lost a frustrating game of indoor soccer last night, maybe I haven’t gotten over it yet 😉

  20. Well we’re starting to flush a few out now aren’t we.

    First, Duncan. Yes the Quantcast data does have the gender split we’d expect. But did you notice that this was US users only? I wonder how many US users would have seen the male-skew “seed stories” on SMH. I did notice that EB ‘spiked’ in the US’s “rough estimate” (gotta love that honesty!) – so maybe some were referred traffic as well.

    Now onto Simon T. Small. I assume you come from the “Small Knowledge Of Audience Measurement” line of the Smalls.

    Have you ever had a blood test Simon? If so, did they take a 5ml syringe of blood? Well the average human body has 6 litres of blood. So they take .08% of your blood. I suppose that is a joke as well? Do you ask the nurse to drain you dry just to make sure?

    One of the first things you learn when you study statistics (you do have a statistical degree don’t you Simon? … well I assume so given your confidence in using a term like “joke”) is that when you start seeing the same results time and time again, any further sampling is simply redundant and a waste of money. As a simple example, how many people would you have to observe in a public place to work out what proportion of the population was male and female? You’d have a pretty good idea after n=50 … you don’t have to observe all 22m people in Australia.

    Simon, I gather you come from the “just because I can count it I do” school, even if what you are counting is the wrong thing or is misleading (like unique cookies). I think it was Einstein who said it best “Not everything you can count should be, and not everything you should count can be”.

    Regarding Hitwise (realising that sample size is not the holy grail), are there any ISPs that it doesn’t include (leaving aside that a single IP does not equate to a single user)? Is it true that Australia’s largest ISP with around half of Australia’s traffic (and all the skews and biases that introduces) is not included? Can anyone confirm that?

    So Simon, I look forward to your supporting evidence as to why an n=7,000 continuously reporting panel of people that reflect the composition of the Australian online population is a joke.

    I didn’t lose any indoor soccer games last night, so this is not a vent, it’s a rational and reasoned analysis I hope. Oh, I did watch the Aussies go 6-0 up in the ODI, so that does make me feel better.

    Onwards and upwards with the debate!

  21. Please, please keep this thread alive and the comments flowing. I haven’t had so much fun online since I clicked a link on SMH and then flicked through a gallery of arty nude chicks on The Vine.

    Stig (over 35 / male)

  22. Clearly Stig we mean different things when we discuss “online figures” … hehehe

  23. The appearance of businessday.com.au and leaguehq.com.au as numbers 6 and 7 on the Google Trends “also visited” list for essentialbaby indicates the Fairfax washing machine is doing some work.
    Essentialbaby is a member driven site so would expect the real metrics will be as expected. The rest is just noise.

  24. When I ran The Fix at Ninemsn, the celebrity vertical had more women over 35 than any other single demographic.
    The majority were professional women who got their celebrity news from The Fix when they arrived at work.
    It is over simplistic to assume that celebrity news only appeals to Gen Y.

  25. John (of Gap Research?)

    1. why then do so many websites in Netview come up marked as ‘not enough data’? The major problem is when you’re looking at websites with low-medium volumes of traffic (most Australian websites). With a 7000 person panel on a 100,000 UB website you might only get 33 people (0.03%) from the panel looking at the site and I’m sorry if I’m wrong in thinking that is a poor data sample. And unlike you, I’m certainly not a statistician.

    2. Why would you turn down sample data of 3M for sample data of 7,000?

    FYI, search for ‘Hitwise’ in Google and you’ll find out a little about this company.
    http://www.hitwise.com/au/

    John, I’m just sorry to get an old school fella like yourself all worked up

  26. talkingdigital

    hi ricky – no one is saying that celeb trash only appeals to Gen Y. the question is more why are over 35’s going to a site that is supposed to be edgy/cool/contemporary.

    last time i checked the vine isn’t positioned as a gossip/trash site and it certainly isn’t sold that way in market.

  27. Indeed Ben.
    The point I was making was that Celebrity Fix was trashy, and deliberately so, and staffed by Gen Yers so we had thought that was who it would appeal to.
    In fact, internally we called it celebrity news for addicts by addicts, because the authors were not journos, they were singers in bands, former staffers with Big Brother and so on.
    But it was interesting that over time, as the audiences coalesced into defined groups, we found the largest single group for celebrity was women over 35.
    Music was teen boys, movies was women in their mid-20s and TV was men in their 30s interested mainly in sport.
    It was an interesting insight.

  28. Very simple Simon (and very apropos). Because amongst all 7,000 there was so little traffic proportionately to allow drill down. The ESOMAR guideline is n=35 reporting sample. Let’s be generous and say n=100. That means you had 7,000 people from which you collected ALL traffic. That means that 99% of them didn’t go to that site. This 99% estimate is pretty darned ACCURATE – I mean it might be 97% or 98% – but we’re confident that bugger all people use the website. What it also means is that to drill down any further is sheer folly, and RESPONSIBLE researchers point that out.

    Let’s also look at your 100,000 UB website. That actually means 100,000 unique cookies – not people. This would equate to somewhere between 25,000 and 50,000 individuals who would have visited the site (remember advertisers buy people and not cookies!). And that is a global figure. That’s also the accumulation across a month. A decent footy game gets more than that through the gate.

    You appear to be so besotted with big numbers that you’d probably take 3 million Zimbabwean dollars ahead of 7,000 Aussie dollars.

    Sorry, but if ‘old-school’ means quality over quantity – then count me in ! Of course I would select a representative panel of n=7,000 IN WHICH EVERYONE HAD AN EQUAL CHANCE OF BEING REPRESENTED over 3million IP addresses for which the other 10m are EXCLUDED. That is a recipe for serious bias. Bias is a MUCH MORE SERIOUS issue than the sampling fraction which is accounted for by weighting.

    And something just struck me … why am I explaining this? I strongly doubt you will get it. As you say, you’re not a statistician. I’m guessing you’re not a researcher. However I am betting that you are a salesman – hence the love for big numbers even if they are simply unbelievable.

    I sincerely hope I haven’t bored the pants off everyone! Have a good weekend one and all,

  29. Simon, please leave those that know what they’re on about to it, and go back to playing with your agency with a weird name, rubbish website and ridiculous ethos.

  30. Beeg, thankyou for your contribution to the discussion.

    John, I totally agree with you, it provides a pretty valuable measure in terms of visitors, as per Ben’s original post we’re not discussing the total traffic volumes.

    My issue, and I could be wrong, is with demographic profiling, building a profile based on websites with relatively small traffic volumes (lots of Australian websites).

    I’m now unsure of my opinion, but please use the example below to tell me if I’m right or wrong or otherwise.

    Back to the example of the 100,000 UBs and (assuming your calculation) that equates to 50,000 people/individuals/humanoids. That equates to measuring demographic data on the website based on 15 people (based on a 0.03% sample size).

    Let’s say Nielsen quote that the site gets a 41%/59% male/female split, and to keep things simple, let’s assume it doesn’t vary per age range. Let’s assume that 24% of the individuals are 25-34, that means that 1-2 males aged 25-34 (out of a maximum of 6ish) from the panel contributed to the claim.

    THAT is where I’m concerned about the integrity of the data/claims and why it relates to Ben’s original post.

    John, what are your views on this?

    Ben, am I on another planet?

  31. A good example Simon.

    For small sites – your 50,000 UAs averages out at around 2,000 people a day, so that’s small – the demos are a bit iffy, because we’re slicing and dicing an already small number.

    Let’s make sure we’re clear on one thing here though … we have a PANEL of n=7,000 as opposed to a SAMPLE of n=7,000. Most people think these are the same thing – they’re not. Samples provide a single point in time. Panels provide longitudinal (i.e. repeat usage) analysis.

    But there is a ‘hidden benefit’ of panels over samples, and that is a thing called the “effective sample size”. In essence, the panel is n=7,000 people a day. That is, over a month we have over n=200,000 “people days”. That is, the number of observations (think of it as cells on a spreadsheet) is now extremely large. However, a panel with 200,000 observations is NOT as efficient as a single point in time sample of n=200,000 – because we get repeat observations. In some instances, where behaviour is extremely varied (the Internet?) they are very efficient. In other instances, where behaviour is extremely consistent, repetitive or predictable (voting intention?), they are nowhere near as efficient.

    So back to your slice’n’dice demo example it might be an average of 2-3 in a demo, but across a month this is more like 100 observations. To a statistician, observations are more important than sample – as long as they know how the observations are collected so that the “effective sample size” can be calculated.

    You will notice that the new NetView still only reports on around 1,000 sites – they’re the ones that under the formula are “robust” (i.e. effective sample size is deemed to be high enough). Sites below that level simply have too few observations to get a reliable estimate and are suppressed.

    But Simon, you raise a valid point. Maybe there are TWO thresholds – one to release “All People” UA, and one to allow drill-down to demos. The latter would require a higher threshold of course.

    What do people think of that idea?

    Finally, there is a delicious irony in all this. The sites that need much larger sample sizes to allow reporting and demo drill-down are the very small sites. These are the sites that can least afford to pay for it! Remember that in order to make a sample (or panel) twice as efficient (i.e. halve the standard error) you have to QUADRUPLE your sample (and your costs of course!)

    This is the thinking behind the hybrid. How we can leverage tagged traffic data (tagged so we know that all sites are enumerated for volume on the same basis) to allow reporting for very small sites … and then meld that data with the panel data to allow demographic reporting. Needless to say I am confident we can “crack this nut” – it won’t be easy … but we’re gonna try !

  32. Thanks for clearing that up John. Have you reviewed the offering that Hitwise provide? It may be a solution for the smaller sites…?

  33. Pingback: Five to follow – Ben Shepherd; Margaret Simon; Amnesia; David Knox; Stilgherrian - mUmBRELLA

  34. Simon, I have had a look at the site, and I found the information sparse under the “How we do it” section. About the only substance I could glean was 25m people globally and 3m people in Australia.

    I’m not even sure what they mean by “people”. From what I can gather (and if anyone has any detail on the Hitwise process I’d LOVE to see it … and I mean DETAIL) they collect anonymised IP addresses from ISPs. How they “impute” people I don’t know. Take myself for example. Both my wife and I share the same connection (IP address) from home (which is also our home office) through a router. How Hitwise would know that there was 1, 2, 3 etc people behind that IP is beyond my reckoning – let alone work out our demographics. Being generous, they may mean 3m households of the 7+m households in Australia … but who wants household data.

    However, this all comes down to skews and biases. If more than 50% of IP addresses are excluded then that’s potentially a big problem. Word on the street is that they don’t include Bigpond. How would they get the Bigpond home page anywhere near correct? What about all the (for example) free AFL traffic on Bigpond – I doubt they would get that correct either.

    I repeat, if anyone knows detail about Hitwise I’d love to know more. Nielsen Online were VERY open and transparent when they went through the IAB audit process – hence the detailed knowledge of their system, its strengths and its weaknesses.

  35. I’ve sent this post to a contact at Hitwise, hopefully they’ll follow up with a comment 🙂
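
For anyone who wants to check the sample-size arithmetic argued over in comments 18, 20, 25, 30 and 31, here is a rough sketch in Python. It is not Nielsen's method. It assumes the panel behaves like a simple random sample of a 22m population (the figure used in comment 20), it uses the textbook normal-approximation margin of error for a proportion, and it ignores weighting and the repeat-observation "effective sample size" point raised in comment 31. The function names are made up for illustration.

```python
import math


def expected_panellists(site_audience, panel_size, population):
    """Expected number of panel members who visit a site in the period,
    ASSUMING the panel is a simple random sample of the population."""
    return site_audience * panel_size / population


def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion p observed on n people.
    Normal approximation; unreliable when n is very small."""
    return z * math.sqrt(p * (1 - p) / n)


PANEL = 7_000            # panel size quoted in the thread
POPULATION = 22_000_000  # assumption: the 22m figure from comment 20

# Comment 30's example: a site with 50,000 real visitors and a 41% male split.
n_site = expected_panellists(50_000, PANEL, POPULATION)
print(f"panellists on the site: {n_site:.1f}")
print(f"margin on the 41% male figure: +/-{margin_of_error(0.41, n_site):.1%}")

# Comment 20's example: a roughly 50/50 split observed on n=50, then on the full panel.
print(f"margin at n=50:    +/-{margin_of_error(0.5, 50):.1%}")
print(f"margin at n=7,000: +/-{margin_of_error(0.5, PANEL):.1%}")

# Comment 31's rule of thumb: quadrupling the sample halves the error.
ratio = margin_of_error(0.41, 4 * 16) / margin_of_error(0.41, 16)
print(f"error ratio after quadrupling n: {ratio:.2f}")
```

Under those assumptions the small site ends up with about 16 panellists and a margin of roughly 24 percentage points on its gender split, while the full 7,000 panel is good to a little over 1 point. That is consistent with both sides of the argument: the panel is fine at the top level and iffy once you drill into the demographics of a small site.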
