If you have a website then, as with any marketing, you need to measure how it is performing. One essential tool for this, used by many websites, is Google Analytics.
Once set up and left to run, it records loads of useful data such as the number of visits to your site, the time spent on each page, whether the visitor leaves the site or goes on to look at more (the bounce rate) and lots of other stuff that you probably won’t ever look at.
If you regularly look at your analytics data (which you should), you might have noticed a marked increase in the number of recorded visits / sessions / pageviews to your site recently. So, that’s good, isn’t it? Well, maybe not!
There is a rapidly growing problem with what is called Google Analytics SPAM.
There are actually two types. One is called “referral SPAM”, which originates from spambots directly accessing your website. The other, more painful, version is what is now known as “Ghost referrals”. Both result in fake referrals from hackers being recorded in your Google Analytics account, giving the impression that you are getting many more visits than you really are.
The effects of Ghost referral SPAM are generally more of a problem on smaller sites with low volumes of traffic. You can see in the example plot below, that of the total sessions recorded (the blue line) nearly 60% were seen to be from fake (Ghost) referrals (the orange line).
In this other example, 89% of the data was from Ghost referrals!
We make use of the analytics data to see how a site is performing and for the basis of decisions on how to improve it. Clearly, if there is a lot of additional fake data included it can skew the results and lead to bad decisions.
The first thing, therefore, is to be aware of it, and then to identify it so that you can extract the real data and reveal the true picture.
How do they do it?
In order to register Ghost referrals in your analytics data, the hackers don’t actually visit your website. Instead, they inject data directly into your Google Analytics account.
Google developed their “Google Analytics Measurement Protocol” to allow developers to send raw user interaction data directly to Google Analytics servers. Unfortunately, the hackers were quick to spot this and make use of it for their dastardly deeds.
Current thinking in the industry is that the spammers generally use randomly generated Google Analytics IDs, which means the spammer does not actually know which site’s analytics account they are spamming. However, there is nothing to stop them specifically targeting a site if they wanted to.
One of the parameters the protocol expects for a page view is the page name. Because they don’t visit your website, they don’t have a real page name, so they often just insert dummy names. This explains why you may see non-existent page names listed in your analytics data.
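To make this more concrete, here is a rough sketch of what a single hit looks like when decoded. It assumes the parameter names documented for the Universal Analytics version of the Measurement Protocol (tid for the property ID, dh for the hostname, dp for the page, dr for the referrer). The payload itself, including the property ID and the domains, is entirely made up for illustration and nothing is sent anywhere.

```python
from urllib.parse import parse_qs

# Illustrative payload only: every value here is invented for the example.
SAMPLE_HIT = (
    "v=1&tid=UA-12345-1&cid=555&t=pageview"
    "&dh=example.com&dp=%2Fnot-a-real-page"
    "&dr=http%3A%2F%2Fspam-referrer.invalid%2F"
)


def describe_hit(payload: str) -> dict:
    """Decode a URL-encoded hit and pick out the fields discussed above."""
    fields = {key: values[0] for key, values in parse_qs(payload).items()}
    return {
        "property_id": fields.get("tid"),  # the Google Analytics ID
        "hit_type": fields.get("t"),       # e.g. "pageview"
        "hostname": fields.get("dh"),      # site the hit claims to be for
        "page": fields.get("dp"),          # page name, often a dummy value
        "referrer": fields.get("dr"),      # the fake referral
    }


hit = describe_hit(SAMPLE_HIT)
for name, value in hit.items():
    print(f"{name}: {value}")
```

Note that every field is simply whatever the sender chose to type in, which is why a dummy page name and a fake referrer can end up in your reports without anyone visiting your site.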
Why do they do it?
The actual benefit of referral SPAM to the hackers is not so clear but the following are some thoughts from the industry.
- Maliciously inflating your website traffic to mess up your data.
- Tricking you into visiting malicious websites (URLs) listed in your referral reports.*
- Generating backlinks for the hackers (or people they are working for) from publicly accessible server logs.
- Hiding the real referrer headers while attacking the website.
Another thought I had is that this could be used by unscrupulous SEO companies to falsely indicate that the work they are doing (or not) is producing an increase in the number of visits to your website.
* One important point to note is DON’T visit any of the pages you find listed in fake referrals in your data. Doing so leaves you open to being infected with malware and/or viruses.
Stopping Ghost referrals
Although they are not saying much about it at the moment, Google are working on a way to overcome this problem.
Since they never actually visited your site, you can’t block Ghost referrals at your website server. All you can do is create filters in your analytics account to at least make them visible so that you can subtract them from the totals to get the real numbers.
Unfortunately, the biggest problem with Ghost referrals is that the spammers change them constantly in order to make them difficult to identify, so you need to be continuously building and updating these filters.
The bottom line is that it increases the amount of work you have to do, but if you ignore it then, as you can see from the example plots above, it could lead you to make very wrong decisions about your marketing.
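As a rough illustration of the idea, the sketch below builds one pattern from a hand-maintained list of spam referrers, then uses it to separate the fake sessions from the real ones. The domain names and session counts are hypothetical, and the rows stand in for data you might export from your referral report. It is not how Google Analytics filters work internally, just a way to show the arithmetic.

```python
import re

# Hypothetical spam referrer domains. A real list has to be kept up to date
# by hand, because the spammers keep changing the names they use.
SPAM_REFERRERS = ["spam-referrer.invalid", "free-traffic.invalid"]


def build_spam_pattern(domains):
    """Compile one regex matching any listed domain or one of its subdomains."""
    alternatives = "|".join(re.escape(domain) for domain in domains)
    return re.compile(rf"(^|\.)({alternatives})$", re.IGNORECASE)


def split_sessions(rows, pattern):
    """Return (real, ghost) session totals for rows of (referrer, sessions)."""
    real = ghost = 0
    for referrer, sessions in rows:
        if pattern.search(referrer):
            ghost += sessions
        else:
            real += sessions
    return real, ghost


# Made-up example data: (referrer, sessions).
rows = [
    ("google.com", 30),
    ("spam-referrer.invalid", 45),
    ("www.free-traffic.invalid", 15),
    ("partner-site.example", 10),
]

pattern = build_spam_pattern(SPAM_REFERRERS)
real, ghost = split_sessions(rows, pattern)
ghost_share = ghost / (real + ghost)
print(f"real={real} ghost={ghost} ghost share={ghost_share:.0%}")
```

Keeping all the names in a single list means that when a new spam referrer turns up, you only have to add one entry and rerun the numbers.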