Removing Bad (Spam) Traffic From Google Analytics

If you use Google Analytics, you’ve probably noticed a ton of fake traffic in your website analytics over the past few months: traffic from referrals like social-buttons.com, best-seo-offer.com or 100dollars-seo.com (and other obviously legit sites). You’ll notice this traffic has a zero session view time and only visits the root of your site. Actually, “visits” isn’t even the correct term because, to my knowledge, these sites are just loading the Google Analytics JavaScript with your tracking code. I guess they believe it’s an easy way to get you to see their site “giving” you traffic, so that you visit them – and who knows what happens next.

Below are the steps I use to create a new segment in Google Analytics. You can use this segment instead of the “All Sessions” default.

 

1. Add a new segment:

[Screenshot 1]

2. Give it a name:

(I called mine “Real Traffic”, since I’m attempting to keep only actual user visit data).

[Screenshot 2]

3. Go to the “advanced” area to start adding:

[Screenshot 3]

4. Let’s add the first rule to remove data with blank hostnames:

Make sure Sessions and Include are set, then select Hostname and Matches Regex. We’re including multiple domains: your domain and any others you want to include (like webcache.googleusercontent.com). Be sure to add your domain to this list! Note: in the regex, “|” means “or” and “.*” matches any run of characters, so “|.*” starts another alternative that can begin with anything.
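To see how an include pattern like this behaves before saving the segment, you can try it against a few hostnames locally. This is only a sketch: the pattern below is an assumed example (the exact one from my screenshot isn't reproduced here), yourdomain.com is a placeholder, classify_hostname is a made-up helper name, and grep -E is standing in for the Analytics regex matcher.

```shell
#!/usr/bin/env bash
# Hypothetical include pattern: replace yourdomain.com with your own domain.
HOSTNAME_REGEX='yourdomain\.com|.*googleusercontent\.com'

# Prints "keep" when the hostname matches the include pattern, "drop" otherwise.
classify_hostname() {
  if printf '%s\n' "$1" | grep -Eq "$HOSTNAME_REGEX"; then
    echo "keep"
  else
    echo "drop"
  fi
}

# Try a mix of real-looking and junk hostnames.
for host in "www.yourdomain.com" "webcache.googleusercontent.com" "(not set)" "social-buttons.com"; do
  printf '%-35s %s\n' "$host" "$(classify_hostname "$host")"
done
```

Blank or “(not set)” hostnames never match either alternative, which is why an include rule like this drops them.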

[Screenshot 4]

5. Second and last rule:

The second rule has two conditions and removes the rest of the spam data by matching referral sources. This is where the action happens and the fake traffic really starts to get cleaned out. Make sure to set Sessions to Exclude, and use “and” between the two parts of the condition.

[Screenshot 5]

 

That should be it. Save your segment and see if it makes a difference. I prefer this method over filters since it doesn’t remove any data.

In some of my test sites, I’m finding 96% of the traffic is fake (below is a comparison).

[Screenshot: ga-sample]

 

For reference, here are the sites I’ve found so far to be junk and I’m excluding:

social-buttons.com
simple-share-buttons.com
free-share-buttons.com
free-social-buttons.com
event-tracking.com
Get-Free-Traffic-Now.com
buttons-for-website.com
semalt.com
best-seo-offer.com
best-seo-solution.com
buttons-for-your-website.com
makemoneyonline.com
100dollars-seo.com
dailyrank.net

Edit 7/15 - 2 more additions to add:
success-seo.com
videos-for-your-business.com

Edit 8/3 - another awesome referrer:
yourserverisdown.com
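If you keep the junk domains in a list like this, a small script can join them into a single regex to paste into the exclude condition. A minimal sketch, assuming the condition is set to match a regex; build_referrer_regex is a made-up helper name and only a few of the domains are shown:

```shell
#!/usr/bin/env bash
# Build one regex alternation from a list of spam referrer domains.
# Dots are escaped so "semalt.com" does not also match "semaltXcom".
build_referrer_regex() {
  printf '%s\n' "$@" | sed 's/\./\\./g' | paste -sd '|' -
}

spam_regex="$(build_referrer_regex social-buttons.com semalt.com best-seo-offer.com 100dollars-seo.com)"
echo "$spam_regex"
```

When a new referrer shows up, add it to the argument list and regenerate the pattern instead of editing the regex by hand.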


Force WWW & Fix Redundant Hostnames on Google / SEO

I feel the term “SEO” is completely overused. However, there are a few things you want to do besides just having great content. One is to make sure your site URL is consistent: chrisbitting.com is a different hostname from http://www.chrisbitting.com.

Google Analytics will provide you a suggestion to fix this if you’re experiencing traffic from multiple hostnames. Something like:

Property http://www.yourdomain.com is receiving data from redundant hostnames. Some of the redundant hostnames are:

This is easy to fix using your Global.asax file. Just add this to your code, replacing “yourdomain” with your actual domain. Application_BeginRequest catches requests for the bare domain and redirects them to the www URL, issuing a 301 so search engines treat the move as permanent.

void Application_BeginRequest(object sender, EventArgs e)
{
    Uri url = HttpContext.Current.Request.Url;

    // Compare only the host, so the path and query string keep their original casing
    if (url.Host.Equals("yourdomain.com", StringComparison.OrdinalIgnoreCase))
    {
        // Rebuild the same URL (scheme, path and query) on the www host
        UriBuilder target = new UriBuilder(url) { Host = "www.yourdomain.com" };

        // A 301 tells search engines the move is permanent
        HttpContext.Current.Response.Status = "301 Moved Permanently";
        HttpContext.Current.Response.AddHeader("Location", target.Uri.AbsoluteUri);
        HttpContext.Current.Response.End();
    }
}

You could do this using web.config + rewrite, but I enjoy this method more.
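Once the redirect is deployed, one way to check it is to look at the response headers from the command line. A rough sketch, assuming curl is installed; summarize_redirect is a made-up helper name and yourdomain.com stands in for your real domain:

```shell
#!/usr/bin/env bash
# Reads HTTP response headers on stdin and prints "<status code> <Location value>".
summarize_redirect() {
  tr -d '\r' | awk '
    NR == 1 { status = $2 }                        # e.g. "HTTP/1.1 301 Moved Permanently"
    tolower($1) == "location:" { location = $2 }   # header names are case-insensitive
    END { print status, location }'
}

# Usage (hypothetical domain):
#   curl -sI http://yourdomain.com/some/page | summarize_redirect
```

For the bare domain you would hope to see 301 followed by the same path on the www host.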


Creating a copy of your website using GNU Wget for Windows or OS X

There are times when you want to have a copy of your site (the frontend / user side). GNU Wget has been around a long time, but in my opinion, it’s still a great tool to back up / mirror websites.

Wget has many options and parameters, of which I won’t even scratch the surface, but below are the simple steps to get Wget set up and running on Windows and OS X machines. Wget is a command line utility, so it might appear overwhelming, but don’t worry, it’s cake!

Windows steps:

Step 1. Download / install Wget. Visit http://gnuwin32.sourceforge.net/packages/wget.htm and choose to download the Setup labeled “Complete package, except sources.”

Step 2. After installation is finished, open a command prompt (cmd.exe).

Step 3. Go to your GNU application folder (on 64 bit it’s in C:\Program Files (x86)\GnuWin32\bin, on 32 bit, it’s probably in C:\Program Files\GnuWin32\bin).

Step 4. To test if wget is installed correctly, run “wget -V”. It should return the current version and some credit. If not, revisit previous steps.

Step 5. To download / mirror a site, run wget -e robots=off -r -l 0 -P "c:\temp" http://www.chrisbitting.com – replacing “c:\temp” with the folder you want the site files to download to and “chrisbitting.com” with your site address. (-r with -l 0 recurses with no depth limit, and -e robots=off tells Wget to ignore robots.txt.)


 

You should now see the command prompt update with the progress. Depending on the size of your site, it may take some time to download everything. After it’s finished, your directory should contain a mirror of your site, including html, css, images, etc.

 

Apple OS X steps:

Step 1. Open a blank terminal.

Step 2. Install homebrew by running:

ruby -e "$(curl -fsSL https://raw.github.com/Homebrew/homebrew/go/install)"

Step 3. After installing brew (and entering your password), run:

brew doctor

Step 4. Now install Wget using:

brew install wget

Step 5. When installation finishes, run “wget -V” to ensure Wget installed correctly. It should return the current version and some credit. If not, revisit previous steps.

Step 6. To download / mirror a site, run wget -e robots=off -r -l 0 -P "/temp" http://www.chrisbitting.com – replacing “/temp” with the folder you want the site files to download to and “chrisbitting.com” with your site address.


You should now see the terminal update with the progress. Depending on the size of your site, it may take some time to download everything. After it’s finished, your folder should contain a mirror of your site, including html, css, images, etc.

To see the multitude of options Wget provides, run “wget --help”. Happy downloading!
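If you mirror sites regularly, the command can be wrapped in a small shell function. This is just a sketch for OS X (or any bash shell) using the same flags as in the steps; mirror_site is a made-up name and the destination folder in the example is hypothetical.

```shell
#!/usr/bin/env bash
# Mirror a site into a target folder.
#   -e robots=off  ignore robots.txt exclusions
#   -r -l 0        recurse with no depth limit
#   -P DIR         save everything under DIR
mirror_site() {
  local url="$1" dest="$2"
  mkdir -p "$dest" || return 1   # make sure the destination folder exists
  wget -e robots=off -r -l 0 -P "$dest" "$url"
}

# Example:
#   mirror_site "http://www.chrisbitting.com" "$HOME/site-copy"
```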

 


Creating a .bash_profile file in OS X and adding PATH directories

If you’re starting out with a fresh install of OS X (10.9 in my example) and are using any development tools, at some point I’m sure you’ll want to add some directories to your system PATH. In short: this allows you to use an application in a specific directory from any other directory – commonly when you’re running commands in Terminal.

To start, we’ll utilize a text editor – in my case I’m using TextMate – but any plain text editor should do. Let’s get to it:

  1. Let’s first make sure you don’t already have a .bash_profile. In TextMate, go to File > Open. Browse to your home folder (with the house icon) and click “Show Hidden Files”. In your home folder you shouldn’t already see a .bash_profile file. (If you do, then you don’t need to create a new file: open your file, make changes and skip to step 5.)
  2. Cancel the open dialog and enter some text into the untitled file currently open. You’re usually entering something like: export PATH=${PATH}:/somedirectory/asubdirectory:/anotherdirectory
  3. Now let’s save our new .bash_profile. Go to File > Save As. Browse to your home folder (with the little house icon again). Enter the filename as “.bash_profile” (without quotes).
  4. If you get a message saying “names that begin with a dot are reserved for the system”, choose “Use ‘.’”.
  5. That’s it. Now if you already have a terminal open, run source ~/.bash_profile (this just gives that terminal access to the updated PATH).
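If you’d rather skip the text editor, the same change can be made from Terminal. A minimal sketch: add_to_path is a made-up helper name and the directory in the example is a placeholder. It appends the export line only when that exact line isn’t already in the file.

```shell
#!/usr/bin/env bash
# Append a PATH entry to a profile file unless that exact line is already there.
#   $1 = directory to add
#   $2 = profile file (defaults to ~/.bash_profile)
add_to_path() {
  local dir="$1" profile="${2:-$HOME/.bash_profile}"
  local line="export PATH=\${PATH}:$dir"   # written literally, expanded at login
  touch "$profile"                          # create the file if it is missing
  if ! grep -Fqx "$line" "$profile"; then
    printf '%s\n' "$line" >> "$profile"
  fi
}

# Example with a placeholder directory:
#   add_to_path "/somedirectory/asubdirectory" && source ~/.bash_profile
```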

Local web server for testing / development using Node.js and http-server

If you’re developing HTML / JavaScript applications and want to test locally, you will often go beyond what local file access (file:///C:/…) in browsers will allow (like XMLHttpRequests, JSON calls, cross-domain access and Access-Control-Allow-Origin restrictions).

A simple alternative to deploying your code to Apache or IIS is to install a local HTTP server. http-server for Node.js is a fast, easy-to-install app that lets you serve any directory at http://localhost.

Installing this simple http server only takes a few steps:

  1. Install Node.js if you don’t already have it installed (from http://nodejs.org)
  2. In a command prompt / terminal, now run:
    npm install http-server -g
    

    (this installs http-server globally so you can access from any folder or directory)

  3. Now, using the command prompt or terminal, browse to a folder with some HTML you want to serve over HTTP (e.g. c:\someproject\).
  4. Run:
    http-server
    
  5. Open your browser and visit http://localhost:8080.

 

You can change port 8080 (the default) to anything using “-p”, so http-server -p 8088 would change your local site to http://localhost:8088

Run http-server --help to see the other options available for running.


Wirecast – Unable to start, find the Quicktime plug-in solution

If you’re installing Wirecast (a great live streaming application by Telestream), you may run across this little error when starting it up for the first time:

 

Wirecast was unable to start
Unable to find the QuickTime plug-in. Please reinstall Wirecast.


I would not recommend reinstalling Wirecast but simply downloading and installing QuickTime:

  1. Visit www.apple.com/quicktime/download/ and download
  2. Run Setup
  3. I would uncheck the boxes on this dialog:
    [Screenshot: wcast2]
  4. And at the end of the install I would also click “No Thanks”:
    [Screenshot: wcast3]

 

 

You should now be able to run Wirecast without the QuickTime error. This applied to Wirecast 5 (5.0.3) and Windows 7 x64.

As a side note, I’ve been experimenting with the DaCast streaming provider; so far it seems to work well.


Fixing / Removing Invalid Characters from a File Path / Name – c#

Below is a simple method for fixing bad filenames and paths. This uses the character lists from Path.GetInvalidPathChars and Path.GetInvalidFileNameChars (part of System.IO).

You should be able to pass a filename, directory or path. For example, calling these three lines returns the following:

cleanPath(@"c:\tem|<p\fi<>le.txt")
cleanPath(@"c:\tem|<p\")
cleanPath(@"fi<le.txt")

Returns:

c:\tem-p\fi-le.txt
c:\tem-p\
fi-le.txt

You can also pass a string that’s used to replace the bad characters.

cleanPath(@"c:\tem|<p\fi<>le.txt", string.Empty)

Returns:

c:\temp\file.txt

private string cleanPath(string toCleanPath, string replaceWith = "-")
{
    // requires: using System.IO;
    // get just the filename - can't use Path.GetFileName since the path might be bad!
    string[] pathParts = toCleanPath.Split(new char[] { '\\' });
    string newFileName = pathParts[pathParts.Length - 1];
    // get just the path
    string newPath = toCleanPath.Substring(0, toCleanPath.Length - newFileName.Length);
    // clean bad path chars
    foreach (char badChar in Path.GetInvalidPathChars())
    {
        newPath = newPath.Replace(badChar.ToString(), replaceWith);
    }
    // clean bad filename chars
    foreach (char badChar in Path.GetInvalidFileNameChars())
    {
        newFileName = newFileName.Replace(badChar.ToString(), replaceWith);
    }
    // collapse runs of "replaceWith", ie: change "test-----file.txt" to "test-file.txt"
    // (loop, because a single Replace pass only halves each run)
    if (!string.IsNullOrWhiteSpace(replaceWith))
    {
        string doubled = replaceWith + replaceWith;
        while (newPath.Contains(doubled))
        {
            newPath = newPath.Replace(doubled, replaceWith);
        }
        while (newFileName.Contains(doubled))
        {
            newFileName = newFileName.Replace(doubled, replaceWith);
        }
    }
    // return new, clean path:
    return newPath + newFileName;
}

Hope it helps!
