Saturday, February 5, 2011

fake traffic bots

Faking traffic to one's website is not done only for ego-boosting reasons ("look how many people read me, ma"), but also for profit. Here we try to show some examples and clarify some myths.

Many new bloggers, who start on the principle “if you build it, they will come”, might be misled into thinking that all the bloggers who boast several thousand visitors a day are actually getting them. The sad reality is that almost all such visitors could be fake and, furthermore, you cannot really tell, as these sites do not usually publish their stats. Even if they did, you’d have no way of telling whether these stats were “doctored”.

But what do such scripts look like, and how easily can they be obtained?

A quick Google search will take you to numerous websites willing to sell such scripts. I would strongly advise against buying anything of that sort. You could however play a bit with the following scripts, as published by icfun. The scripts were released to increase the View Count of videos published on MySpace, but they can be easily adapted for any other similar purposes.

NB: I have not tried these scripts and most likely they no longer work as they are very old, but they could give you a good idea of what fake traffic scripts look like and how they are used.

Suppose you have uploaded a video linked at http://vids.myspace.com/index.cfm?fuseaction=vids.individual&VideoID=2841784. Just extract the video ID and use one of the scripts below to increase the video view count.

Perl

Here is an example of a Perl script meant to increase the video view count. The code is meant to run inside a loop, once per desired view.

use LWP::UserAgent;
use HTTP::Request::Common;
my $id = 2841784;
my $url = "http://mediaservices.myspace.com/services/media/video.asmx/IncrementVideoPlays?videoID=$id&token=29254735542157_ec2&versionID=1";
my $ua = LWP::UserAgent->new;
$ua->agent('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/2006120418 Firefox/2.0.0.1');
my $req = HTTP::Request->new(GET => $url);
$req->content_type('application/x-www-form-urlencoded');
my $res = $ua->request($req);

That's it. If you want to add proxy support to your script, add the following:

my $ua = LWP::UserAgent->new;
my $proxy_url = "http://78.29.232.10:8080/";##change your ip and port
$ua->proxy('http', $proxy_url);

PHP

If you don't know Perl, here is the equivalent PHP code, which uses curl. As before, the code goes inside a loop. curl also supports proxies.

$id = 2841784;
$url = "http://mediaservices.myspace.com/services/media/video.asmx/IncrementVideoPlays?videoID=$id&token=29254735542157_ec2&versionID=1";
$ch = curl_init();
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1); ## return the content into a variable
curl_setopt ($ch, CURLOPT_URL, $url);
curl_setopt ($ch, CURLOPT_TIMEOUT, 20);
curl_setopt ($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11');
$content = curl_exec ($ch);
curl_close ($ch);

If you don't know curl or do not have it, try this simpler code. Again, use it inside your loop. Beware: as written, this call does not go through a proxy.

$id = 2841784;
$url = "http://mediaservices.myspace.com/services/media/video.asmx/IncrementVideoPlays?videoID=$id&token=29254735542157_ec2&versionID=1";
file_get_contents($url);

Python

For Python lovers, here is the code to use. As before, it goes inside your loop.

import urllib2
id = "2841784";
url = "http://mediaservices.myspace.com/services/media/video.asmx/IncrementVideoPlays?videoID="+id+"&token=29254735542157_ec2&versionID=1";
opener = urllib2.build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0')]
usock = opener.open(url)
url = usock.geturl()
data = usock.read()
usock.close()

Sometimes you might find yourself in the opposite situation, trying to determine whether certain hits on your pages come from a bot or from legitimate readers. This is not easy and involves some cyber-detective work. Luckily, Sajal Kayan has published a script that distinguishes a genuine Googlebot from an SEO bot merely claiming to be one. It does a reverse DNS lookup on each IP address and then checks that the resulting host name resolves back to the same IP:

#!/usr/bin/env python

'''

logazier.py - v0.0.1

Tested with nginx log file. should work with apache also.
Before using make sure to adjust the "filename" and "trusted" variables.

Other than standard Python libraries, PyDNS is also needed.
PyDNS can be found at : http://pydns.sourceforge.net/ or
installed by running : easy_install pydns

Copyright (C) 2009 Sajal Kayan - sajal at thaindian.com

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.

'''

import os
import commands
import DNS
import re

#Note IP must be first field in access log.. or youll need to adjust awk command below
filename = "/path/to/access.log" # path to your apache logs (relative or absolute).
trusted = 'trusted.txt' # Path to store trusted IPs (relative or absolute)

def revlookup(ip):
    try:
        host = DNS.revlookup(ip)
        return host
    except:
        return "err"

def lookup(host):
    try:
        ip = DNS.DnsRequest(qtype='A').req(host).answers[0]['data']
        return ip
    except:
        return "err"



# Remove 'ionice -c3 nice -n15 ' if u dont care about hogging all resources ....
command = "ionice -c3 nice -n15 grep Googlebot " + filename
#Read trusted ip list so we may ignore them
#os.system("touch " + trusted)
if os.path.exists(trusted):
    f = open(trusted)
    while 1:
        line = f.readline()
        if not line: break
        # print line
        #process(line)
        command += "| grep -v " + re.sub("\n", "", line)
    f.close()
else:
    open(trusted, 'w').close()

command += ' | awk \'{ print $1 }\' '
#print command
ips = commands.getstatusoutput(command)
ips = ips[1].split("\n")
uniqueips = set(ips)
ips = sorted([(ips.count(ip), ip, revlookup(ip)) for ip in uniqueips])
ips.reverse()

for count, ip, host in ips:
    if host[-13:] == "googlebot.com" and lookup(host) == ip :
        print str(count) + " - " + ip + " - " + host + " - TRUSTED"
        #add to cache of trusted
        text_file = open(trusted, "a")
        text_file.write(ip + "\n")
        text_file.close()
    else:
        print str(count) + " - " + ip + " - " + host + " - FAKE - " + lookup(host)
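
The script above targets Python 2 (it relies on the commands module and on print statements). For readers on Python 3, the same idea can be sketched with the standard library alone. This is only a minimal sketch, not the original author's code: the function names (count_claimed_googlebot_ips, is_verified_googlebot) and the injectable reverse / forward parameters are my own, hypothetical choices. It also assumes IPv4 addresses and, like the original, that the IP is the first field of each log line. Matching on ".googlebot.com" with the leading dot is my assumption as well; it avoids accepting a host such as "evilgooglebot.com".

```python
import socket
from collections import Counter

# Assumption: only hosts under googlebot.com count as genuine, as in logazier.py.
GOOGLEBOT_SUFFIX = ".googlebot.com"


def count_claimed_googlebot_ips(log_lines):
    """Count hits per IP for log lines that mention Googlebot.

    Mirrors the grep | awk pipeline: the IP is taken from the first field.
    """
    counts = Counter()
    for line in log_lines:
        if "Googlebot" in line:
            fields = line.split()
            if fields:
                counts[fields[0]] += 1
    return counts


def _reverse_dns(ip):
    """Return the host name for an IP, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def _forward_dns(host):
    """Return the list of IPv4 addresses for a host name (empty on failure)."""
    try:
        return socket.gethostbyname_ex(host)[2]
    except OSError:
        return []


def is_verified_googlebot(ip, reverse=_reverse_dns, forward=_forward_dns):
    """Forward-confirmed reverse DNS check.

    The IP must reverse-resolve to a host under googlebot.com, and that host
    must resolve back to the same IP. The reverse / forward parameters exist
    only so the check can be tried without touching the network.
    """
    host = reverse(ip)
    if host is None:
        return False
    host = host.rstrip(".").lower()
    if not host.endswith(GOOGLEBOT_SUFFIX):
        return False
    return ip in forward(host)
```

To use it, you would feed the lines of your access log to count_claimed_googlebot_ips and then call is_verified_googlebot on each IP it returns; anything that fails the check is only claiming to be Googlebot.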

Note that I do not provide support for modification and / or implementation; if you need help, ask the original author or hire a freelancer.


Finally, have a look at the following video coming straight from Google on how you can draw more visitors to your site with white hat techniques.


Sources / More info: logazier.py (article), icfun-scripts


