Fact Check: LGF PageViews – Broken, or Rigged
We’ve heard more than a few inquiries from DoD readers who’ve wondered about how LGF calculates the “views” stat that appears on each LGF thread and page. For myself, I’ve certainly visited a lot of LGF pages (especially old ones), and I’ll admit that something always seemed a little fishy. And we know that Charles dumped sitemeter a while ago, along with Quantcast more recently, which makes it a little tougher to verify anything from 3rd party sources. Instead, Johnson opted to go with his own custom-built page view counter last September to display the stats publicly at the top of each thread. But how accurate is his “views” thingy, anyway?
We had left the subject alone for a long time, but thanks to CJ’s recent trash tweets, he has provided the perfect excuse for us to try to sort this out (and we apologise if this gets a little wonky):
Never mind the fact that “views” doesn’t translate directly into “people” for any website (because of refreshes/revisits), or that not all those people (or any?) were necessarily laughing. I think we should dive into this a little further, and see if even the 21,600 views part is accurate.
First, let’s start with what Johnson himself has said about how views are calculated (this was regarding a change that was made a little over a month after the feature was installed):
If I’m reading that right, it means that any visitor to LGF’s front page is actually registering 10 page views: one into each thread counter, whether the thread was actually read or not (and 9 of them aren’t likely to be on the screen without a scroll). Additionally, we’re to assume that if said visitor actually clicks on any of the threads, that would count as another “view”, and up the counter yet again (and if this person goes back to the front page after reading, it would register – you guessed it – another 10 views?). It’s a slick way to pad stats, IMO, but unusable for comparison to other sites (DoD, for example, has a separate view counter for the front page in our dashboard, and front page views don’t affect thread views).
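For the wonks, here is a minimal sketch of that counting scheme as I read it. This is not LGF's actual code (we haven't seen it); the function names, the thread IDs, and the 10-threads-per-front-page figure are all my own assumptions for illustration.

```python
from collections import Counter

# Assumption: the front page lists 10 threads, per the reading above.
THREADS_ON_FRONT_PAGE = 10


def front_page_load(counters, front_page_threads):
    """Hypothetical model: one front page load bumps every listed thread."""
    for thread_id in front_page_threads:
        counters[thread_id] += 1


def thread_load(counters, thread_id):
    """Loading a single thread page bumps only that thread's counter."""
    counters[thread_id] += 1


def simulate_session():
    """One visitor: front page, click one thread, back to the front page."""
    counters = Counter()
    threads = [f"thread-{i}" for i in range(1, THREADS_ON_FRONT_PAGE + 1)]
    front_page_load(counters, threads)  # arrives at the front page
    thread_load(counters, "thread-1")   # reads exactly one thread
    front_page_load(counters, threads)  # returns to the front page
    return counters


counters = simulate_session()
print(sum(counters.values()))  # total "views" recorded: 21
print(counters["thread-1"])    # credited to the one thread actually read: 3
```

Under those assumptions, one visitor who reads one thread leaves 21 "views" behind, spread over ten counters.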
So, 21,600? Not really. Of course we’d need to know what % of LGF’s incoming traffic is front page vs. direct thread links to get an idea of how far short the real number might fall (compared to where a normal blog would record it), but it is unquestionably going to be less. But if CJ wants to count the thread that is 10 spots down on the front page as “viewed”, well….whatever.
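To put rough numbers on it, here is a back-of-the-envelope sketch. The front page shares below are made-up placeholders, not measurements (we don't have LGF's traffic breakdown), and `direct_views` is just a name I picked. It assumes some share of the displayed count came from front page loads rather than from anyone opening the thread itself.

```python
def direct_views(displayed_views, front_page_share):
    """Estimate views from people who actually opened the thread.

    front_page_share is the (assumed) fraction of the displayed count
    that came from front page loads, between 0.0 and 1.0.
    """
    if not 0.0 <= front_page_share <= 1.0:
        raise ValueError("front_page_share must be between 0 and 1")
    return round(displayed_views * (1.0 - front_page_share))


# Hypothetical shares, purely for illustration.
for share in (0.25, 0.50, 0.75):
    estimate = direct_views(21_600, share)
    print(f"{share:.0%} from front page -> {estimate:,} direct views")
```

Whatever the true share is, any value above zero pulls the real figure below 21,600.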
But wait, it gets better (or worse, depending on how you look at it)…
While spelunking through the depths of the LGF archives, I noticed that a lot of these old threads showed the counter increase in my browser as the page was loading, and in many cases I could have sworn that it was by more than one. While a jump like this would make sense for an article on the front page (because of all the traffic), it seems weird that it would happen on a thread from, say, 2008. So, we had The Boiler Room put in a little overtime to see if they could take this front page/“scroll assume” cheater effect out of the equation, using old threads that aren’t on the front page and are unlikely to have any interfering traffic. What we found was pretty interesting:
Boiler Room engineer No. 2 explains the methodology:
I picked twenty old (2008) threads at random, and (using a Selenium script) hit ten of them 20 times each over about 20 minutes. The other ten threads I hit just twice – once at the beginning and once at the end of the twenty minute period. The idea here is to eliminate the influence (or at least quantify it if it’s there) of other people coincidentally hitting the same old threads.
After about 200 hits over twenty minutes, the ten test threads showed their view counts increase by over 400. The ten control threads – hit ten times over 20 minutes – showed only about a 20-hit increase in LGF view counts. This is about the same 2×1 ratio – expected – but proves (to me anyway) that the ‘extra’ 200 hits on the ten test threads were from me and not from some highly coincidental other traffic on those same threads (since this hypothetical other traffic didn’t show up at all on the control group threads).
Then I switched control threads for test threads and repeated the experiment, with the same result (i.e. the new test threads got about 400 bumps in view counts for 200 hits, etc.).
I checked against a different, non-LGF site – and 20 hits by the script produced exactly 20 bumps in the page views. So that tells me there’s nothing funky in the internals of the Selenium code that hits a page twice for whatever reason in the process of loading the page into the browser.
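The arithmetic behind No. 2's test can be sketched like so. The hit and bump figures come straight from his description above (with "over 400" rounded down to 400); the `inflation_ratio` helper is just my own illustration, not part of his Selenium script.

```python
def inflation_ratio(hits_sent, counter_increase):
    """Counter bumps recorded per page load we actually sent."""
    if hits_sent <= 0:
        raise ValueError("hits_sent must be positive")
    return counter_increase / hits_sent


# LGF test threads: about 200 scripted hits, over 400 bumps (400 used here).
lgf_ratio = inflation_ratio(200, 400)

# Non-LGF sanity check: 20 scripted hits, exactly 20 bumps.
baseline_ratio = inflation_ratio(20, 20)

print(f"LGF test threads: {lgf_ratio:.1f} bumps per hit")
print(f"Non-LGF site:     {baseline_ratio:.1f} bumps per hit")
print(f"Unexplained bumps on LGF: {400 - 200}")
```

A counter that behaves itself should land at 1.0 bumps per hit, which is what the non-LGF site did.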
No. 2 went on to explain that while checking some of the threads “manually”, he would sometimes notice view jumps of 3 (and even 4). That IS fishy.
But wait…it gets even better (or worse, depending on how you look at it)…
Since I prefer to double-check our engineers, I thought I’d try some of this stuff out for myself, using a handful of old LGF threads from 2003. Sure enough, well…just watch:
In any case, if you add in the front page tomfoolery, you’ve got a view counter that is set up to display substantially inflated numbers.
Ya don’t say?
Fact checked! 21,600? Busted…bigtime.
(Hat tip: The Boiler Room)
Update: Patterico links.
Also, in the day since we posted this, we’ve naturally had a lot of folks try to duplicate what happened in the video (Patterico said he couldn’t). Going by the other feedback that we’re getting, it appears that the refresh issue I’m demonstrating is specific to IE users, and that you must allow the page to load completely for it to jump by 9 like that.
So, I whipped up another quick video, this time just refreshing the page with comments, and making sure to wait for the page to load completely. I’m using IE8 on my VAIO, and was able to get it to jump by 9 with each refresh: