Monday, July 23, 2012

Blocking an IP address from accessing your blog

I recently noticed typical spider activity on one of my blogs. A certain IP would visit, then, within seconds, proceed to access only the updated articles or the articles with new comments. For a while I considered blocking this IP, then I decided on a better solution.

If you are on Blogger / Blogspot, one solution is to capture the offending IP via a stats package that displays the last IPs to have visited your site: Statcounter, ShinyStat, Sitemeter, Histats (or the somewhat local gTop or WTAstatistics). Here are their measured loading times (some load in parallel, though, so it doesn’t really matter):

  • Feedjit - 2x1s
  • Trafic. - 0.3s
  • Gtop. - 0.5s
  • WTA statistics. - 0.2s
  • Histats.com - 0.4s
  • Sitemeter.com - 0.5s + 0.7s
  • ShinyStat - 0.2s
  • whois.amung.us - 0.3s
  • mybloglog - 1.2s
  • statcounter.com - 0.1s
  • quantcast - 0.1s

Once you have identified the IP that bothers you, you might employ a PHP script hosted on a free webhost such as $0.00 WebHost. One such PHP script could be (see Sources below for the original):

<?php
/*
Blogspot IP address blocker. Add this line to your template, right before </head>:
<SCRIPT LANGUAGE='javascript' SRC='path/filename.php' TYPE='text/javascript'></SCRIPT>
*/
header("Content-Type: text/javascript"); // the output is loaded by the browser as a script
$iplist = array("IP Address 1", "IP Address 2", "IP Address 3"); // the list of banned IPs (or IP prefixes)
$ip = $_SERVER["REMOTE_ADDR"]; // get the visitor's IP address
// echo "$ip";
$found = false;
foreach ($iplist as $value) { // scan the list
    if (strpos($ip, $value) === 0) { // the visitor's IP starts with a banned entry
        $found = true;
        break; // no need to check the rest of the list
    }
}
if ($found) {
    echo "top.location = \"error.html\";\n"; // page to divert to
}
?>

To install it, change the IP addresses to the ones you want banned (if you only have one IP address, delete the others) and create the “f off” page, then replace error.html with its full path. Finally, add the line <SCRIPT LANGUAGE='javascript' SRC='path/filename.php' TYPE='text/javascript'></SCRIPT> to your blog template, right before </head>, after replacing path/filename.php with the full address and name of your PHP script.

(The IP we are discussing is 94.75.116.x, belonging to Aster.pl in Warsaw, Mazowieckie. It starts by accessing the updated page, then goes to each updated article.)
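Since the script above matches by prefix, an entry such as "94.75.116." covers the whole 94.75.116.x range, but a full address used as a prefix would also catch longer addresses that merely start with it. A minimal JavaScript sketch of the matching rule, assuming a hypothetical helper named isBanned and a made-up second address (neither comes from the original script), might look like this:

```javascript
// Hypothetical helper, not part of the original PHP script.
// Entries ending with a dot are treated as range prefixes (e.g. "94.75.116.");
// any other entry must match the visitor's IP exactly.
function isBanned(ip, bannedList) {
  return bannedList.some((entry) =>
    entry.endsWith(".") ? ip.startsWith(entry) : ip === entry
  );
}

// "203.0.113.7" is a made-up documentation address used only for illustration.
const banned = ["94.75.116.", "203.0.113.7"];

console.log(isBanned("94.75.116.42", banned)); // true: inside the banned range
console.log(isBanned("203.0.113.7", banned));  // true: exact match
console.log(isBanned("203.0.113.70", banned)); // false: only starts with a banned address
```

The same rule could be ported back to the PHP script by keeping the strpos check for entries that end with a dot and using a plain equality test for the others.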

The problem with this approach is that an IP address can easily be changed, and you may find yourself playing a cat-and-mouse game with your “reader”. That is wasted time. Furthermore, we publish full feeds and don’t restrict our content in any way; on the contrary, we try to make it as easy to access as possible.

A better approach would be to password protect a part of each article, while providing some way for the reader to guess the password – via hints or by sending it monthly to your sponsors or those who donate. This approach would defeat spiders and content aggregators and would make it easier for readers to read your content via your page, as you intended.

Several methods to encrypt and decrypt parts of your articles via JavaScript have been presented earlier on in this blog. You may find one of them in action at asa.zamo.ca (click the “Arata-mi motivele.” [“Show me the reasons.”] link at the end of that article and use the password consisting of the quote in the link above it).
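To give a rough idea of how hiding part of an article behind a password can work, here is a toy sketch. It is not one of the methods presented earlier on this blog, and the function names hideText and revealText are hypothetical. It uses a simple XOR with the password, which is obfuscation rather than real encryption, though that may be enough to keep spiders and aggregators from seeing the text.

```javascript
// Toy XOR obfuscation: NOT real cryptography, just a sketch of the idea.

// XOR each character code with the matching character of the password.
function xorWithPassword(codes, password) {
  return codes.map((code, i) => code ^ password.charCodeAt(i % password.length));
}

// Turn plain text into a hex string that can be pasted into the article.
function hideText(plainText, password) {
  const codes = [];
  for (let i = 0; i < plainText.length; i++) {
    codes.push(plainText.charCodeAt(i));
  }
  return xorWithPassword(codes, password)
    .map((code) => code.toString(16).padStart(4, "0"))
    .join("");
}

// Turn the hex string back into text; a wrong password yields gibberish.
function revealText(hiddenHex, password) {
  const codes = (hiddenHex.match(/.{4}/g) || []).map((h) => parseInt(h, 16));
  return xorWithPassword(codes, password)
    .map((code) => String.fromCharCode(code))
    .join("");
}

const hidden = hideText("The hidden paragraph.", "quote");
console.log(hidden);                      // hex string to publish in the article
console.log(revealText(hidden, "quote")); // The hidden paragraph.
```

On the page itself, the hex string would sit in the article and a link would prompt the reader for the password before calling revealText and writing the result into the post.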

Later edit: Relying on Sitemeter for IP addresses might be a bad idea: it hides the last octet, and it has started serving ads (first seen 28-07-2012). Even though I had it embedded in a way that made it invisible, that is still more than I am willing to accept.

Sources / More info: ip-block
