Saturday, May 15, 2010

Hacking .htaccess I

I recently uploaded some JavaScript files to my webserver to be used on some other external websites. They did not work at first and in the troubleshooting process I remembered how important it is to understand how .htaccess works.

Most people who do not use a platform such as Blogger or self-host end up on an Apache server with shared hosting. In that situation they cannot modify httpd.conf, but they can and should use .htaccess in individual directories. This config file accepts most of the directives that httpd.conf accepts (as far as the host's AllowOverride setting permits) and applies to the directory it sits in plus everything below it, so a file placed in the main directory of your web space affects the whole site. The following summary, courtesy of 1and1, lists the main commands that can be used.

| Command | Description | Examples (unrelated) |
|---|---|---|
| ErrorDocument | Define your own custom error pages | |
| AddType | Assign a MIME type to a file ending | AddType x-mapp-php4 .html .htm; AddType x-mapp-php4 .php3 |
| RewriteEngine | Activate mod_rewrite module | RewriteEngine on; RewriteBase /; RewriteRule ^([a-z]+)\.html$ /index.php?$1 [R,L] (/xyz.html -> /index.php?xyz) |
| Allow/Deny | Host or IP based access control | <FilesMatch "\.inc$"> order deny,allow; allow from all </FilesMatch> (allows .inc files d/l) |
| FilesMatch | File based access control | <Files filename_to_be_parsed> ForceType x-mapp-php4 </Files> (files no ext -> PHP scripts) |
| AuthType | "Basic" password check | |
| Redirect | Redirection to another page or site | |
| Options | (de)activate index, symbolic links, etc. | Options +Indexes (dir indexing); CheckSpelling off (no suggestions) |

Sometimes, .htaccess files are named differently, according to the AccessFileName directive.

1. Authentication

Before setting up user authentication for folders and / or files, you need to create a password file. You will normally create it through a shell prompt:

htpasswd -c /usr/local/apache/passwd/passwords username

If you don’t have shell access, you will have to use a form such as the one provided by 4webhelp. Store the password file in a directory that is inaccessible for browsing but readable by the Apache web server (the www-data user, or apache on RedHat / Fedora).

# chown www-data:www-data /home/secure/apasswords
# chmod 0660 /home/secure/apasswords

A common htaccess hack using groups looks like this. If you are not using groups, omit the last two lines and replace them with “require valid-user”:

AuthType Basic
AuthName "Password Required"
AuthUserFile /www/passwords/password.file
AuthGroupFile /www/passwords/group.file
Require group admins

To protect a single file, add the following after the first 3 lines:

<Files my-secret-file.html>
require valid-user
</Files>

Make sure .htaccess is located in the directory to be protected or where my-secret-file.html resides.
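
Putting the pieces above together, a complete .htaccess that protects only my-secret-file.html might look like the following. This is a minimal sketch that reuses the example password-file path from above; substitute the real location of your own file.

```apache
# Sketch: protect a single file with Basic authentication.
# The AuthUserFile path is the example path used earlier, not a real location.
AuthType Basic
AuthName "Password Required"
AuthUserFile /www/passwords/password.file

# Only this one file asks for a login; everything else stays public.
<Files my-secret-file.html>
    Require valid-user
</Files>
```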

2. Block IP addresses

To block certain IP addresses from accessing your site, use:

Order allow,deny
Deny from
Deny from 222.222.
Deny from
Allow from all

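With the “allow,deny” order (written without a space after the comma), the “allow” directives are evaluated before the “deny” ones, a request that matches neither is refused, and a request that matches both is refused because the “deny” match wins. The default Apache order is “deny,allow”: the “deny” directives are evaluated first, a request that matches neither is let through, and only hosts specifically denied (and not re-allowed) are refused. This means that if you only want to deny certain IPs under the default order, you list them in “deny” without bothering with any other statements.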
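
As an illustration, here is the same pattern filled in with addresses from the ranges reserved for documentation (192.0.2.0/24 and 198.51.100.0/24). The addresses are placeholders chosen for this sketch, not the ones from the original example. A partial address such as 198.51.100 matches every host whose address starts with those octets.

```apache
# Sketch: let everyone in except the listed addresses.
# 192.0.2.10 and 198.51.100 are documentation placeholders (assumptions).
Order allow,deny
Allow from all
# A single host
Deny from 192.0.2.10
# A whole range, given as a partial address
Deny from 198.51.100
```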

3. Block referrer spam

If you publish your stats or if your blog has some way to acknowledge those who link to you, you might find yourself to be a target of people who generate bogus requests that appear to come from their sites. You publish a link to them and they get your linklove. To prevent them from accessing your site, use:

# set the spam_ref variable
SetEnvIfNoCase Referer "^http://(www.)?" spam_ref=1

SetEnvIfNoCase Referer "^http://(www.)?" spam_ref=1

SetEnvIfNoCase Referer "^casino-poker" spam_ref=1

# block all referrers that have spam_ref set
<FilesMatch "(.*)">
Order Allow,Deny
Allow from all
Deny from env=spam_ref
</FilesMatch>

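You can also ban proactively based on spam words (the pattern is a regular expression, so a bare word matches anywhere in the referrer and needs no * wildcards):

SetEnvIfNoCase Referer "some_word" spam_ref=1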

some_word can be something like phentermine, viagra, cialis, shemale, porn, nude, celebrity etc.
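
For reference, a filled-in version of the whole block could look roughly like this. The domain spammer.example is a made-up placeholder (the .example suffix is reserved for documentation), and the word list is just a sample.

```apache
# Sketch: flag unwanted referrers, then refuse the flagged requests.
# "spammer.example" is a hypothetical domain used only for illustration.
SetEnvIfNoCase Referer "^https?://(www\.)?spammer\.example" spam_ref=1
# A bare word matches anywhere in the Referer header.
SetEnvIfNoCase Referer "casino-poker" spam_ref=1

<FilesMatch ".*">
    Order Allow,Deny
    Allow from all
    Deny from env=spam_ref
</FilesMatch>
```

Flagged visitors receive a 403 Forbidden response, so they never show up as successful hits in your stats.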

4. Custom 404 (Page Not Found)

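The code is rather simple. Local URL paths work best, because pointing ErrorDocument at an external URL makes Apache send a redirect and the original error status is lost: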

ErrorDocument 404 /errors/notfound.html
ErrorDocument 500 /internal_error.html
ErrorDocument 401 /authorization_required.html
ErrorDocument 403 /forbidden.html

Make sure that “notfound.html” (or whatever you are using) contains absolute links, such as http://server/.. or /images/.., because the error page is served at whatever URL triggered the error and relative links would resolve against that URL.

The following are some of the most common HTTP error codes:

400 Bad Request
The server received a request it cannot handle, for example because of bad syntax.

401 Unauthorized
This error shows up when a user does not supply proper login credentials for .htaccess-based user/password protection.

403 Forbidden
The requested page is forbidden. This error shows up, for example, when a Deny from directive matches the visitor.

404 Not Found
As the error message says, the page that you have requested cannot be found on the server.

410 Gone
The requested page has been removed permanently.

500 Internal Server Error
The server encountered an error. Such messages usually show up with CGI scripts, but you can also get one when you have bad syntax in your .htaccess file.

5. Allow directory listing

Most webhosts disallow directory listing by default for security reasons. It is, however, a great way to host and serve files quickly. To enable it in certain directories (and their subdirectories) use:

Options +Indexes
IndexIgnore *.gif *.zip *.txt

This will enable directory indexing but leave certain files out of the listing. Note that IndexIgnore only hides them, so they can still be fetched by anyone who knows the URL. Optionally, add +MultiViews +FollowSymLinks to the first line, or use FancyIndexing, which provides a plethora of other options:

<IfModule mod_autoindex.c>
IndexOptions FancyIndexing IconHeight=16 IconWidth=16
</IfModule>

The IconHeight and IconWidth are optional. So are:

IconsAreLinks SuppressHTMLPreamble

To use header files, just drop a file such as HEADER.html in each directory (stock Apache configurations look for that name and append README.html as a footer), or use one header for all of them:

HeaderName /inc/header.html
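
Combined, the directives from this section might be arranged as in the sketch below. It only uses options already mentioned above. SuppressHTMLPreamble is left out on purpose, since with it the header file has to supply the opening HTML itself.

```apache
# Sketch: enable a tidy directory listing for this directory and below.
Options +Indexes

<IfModule mod_autoindex.c>
    # Fancy listing with small icons that are part of the link.
    IndexOptions FancyIndexing IconsAreLinks IconHeight=16 IconWidth=16
    # Leave these file types out of the listing (they stay downloadable).
    IndexIgnore *.gif *.zip *.txt
    # Shared header shown above every listing (path from the example above).
    HeaderName /inc/header.html
</IfModule>
```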

6. Disable Hot-Linking

Hot-linking refers to the use of your images on other sites. The best defence is to watermark your images, thus transforming theft into advertising. For those times when the traffic penalty is too high, use the following:

RewriteEngine on

RewriteCond %{HTTP_REFERER} !^$

RewriteCond %{HTTP_REFERER} !^http://(www.)?*$ [NC]

RewriteRule \.(gif|jpe?g|png)$ - [F]

If more than one site uses those files, try:

RewriteEngine on

# Options +FollowSymlinks

RewriteCond %{HTTP_REFERER} !^$

RewriteCond %{HTTP_REFERER} !^http://(www.)?*$ [NC]

RewriteCond %{HTTP_REFERER} !^http://(www.)?*$ [NC]

RewriteRule \.(mp3|jpg|wav)$ - [F]

If you are hosting movies as opposed to images, replace the filetypes with (mov|avi|wmv|mpe?g).
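
A filled-in version of the rules above, assuming your own sites live at example.com and example.org (placeholder domains reserved for documentation), could be sketched like this:

```apache
# Sketch: refuse image requests that come from pages on other sites.
# example.com and example.org stand in for your own domains (assumptions).
RewriteEngine on

# Let through requests with no referrer (direct visits, some proxies).
RewriteCond %{HTTP_REFERER} !^$
# Let through requests coming from your own sites, with or without www.
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com(/|$) [NC]
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.org(/|$) [NC]
# Everything else asking for an image gets 403 Forbidden.
RewriteRule \.(gif|jpe?g|png)$ - [F,NC]
```

The conditions are ANDed together, so a request is blocked only when its referrer is non-empty and matches none of the allowed sites.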

7. Redirect URLs

The code to achieve this is rather simple:

Redirect /folder

Without a status argument the redirect is temporary (302). To use a permanent 301 redirect, just add “permanent” between “Redirect” and “/folder”. You can also use regular expressions:

RedirectMatch "\.html$"
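
For illustration, both forms with a target filled in might look as follows. The target URLs on www.example.com and the .html to .php mapping are invented for this sketch and are not the targets from the original example.

```apache
# Sketch: permanent (301) redirects with hypothetical targets.

# Everything under /folder moves to /newfolder on the placeholder host.
Redirect permanent /folder http://www.example.com/newfolder

# Regular-expression form: /page.html goes to /page.php ($1 is the captured part).
RedirectMatch permanent "^/(.*)\.html$" "http://www.example.com/$1.php"
```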

8. Add Mime-Types

This is the main reason why I revisited .htaccess and ended up writing this article.

AddType video/x-ms-asf asf asx
AddType audio/x-ms-wma .wma

You might want to use the above to define mime-types for webhosts that are not properly configured. You can also use it to force the browser to download certain files rather than opening them:

AddType  application/octet-stream  .doc .xls .pdf

It turned out that my server was properly configured after all; however, it took a bit of time after upload for the JavaScript files to start being served with the proper application/x-javascript MIME type.
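
Had the server needed it, the fix would have been a single AddType line like the first one in the sketch below. The second part is a different technique from the one shown above: rather than relabelling files as application/octet-stream, it keeps their real MIME type and asks the browser to save them. It assumes mod_headers is available on the host, which is not a given on shared hosting.

```apache
# Sketch: declare the JavaScript type explicitly, in case the host does not.
AddType application/x-javascript .js

# Alternative way to force a download (assumes mod_headers is loaded):
# keep the real MIME type and ask the browser to save the file instead.
<IfModule mod_headers.c>
    <FilesMatch "\.(doc|xls|pdf)$">
        Header set Content-Disposition attachment
    </FilesMatch>
</IfModule>
```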

9. Force SSL/https

If you want users to access your site with https only, place the .htaccess in the root folder of the website and use

RewriteEngine On 
RewriteCond %{SERVER_PORT} 80 
RewriteRule ^(.*)$$1 [R,L]

If you only want to force it for a particular folder, use

RewriteEngine On 
RewriteCond %{SERVER_PORT} 80 
RewriteCond %{REQUEST_URI} somefolder 
RewriteRule ^(.*)$$1 [R,L]
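
One way to write the redirect target without hard-coding a domain is to reuse the host name from the request. The sketch below assumes the .htaccess sits in the site root and that the https site answers on the same host name; it also uses a permanent (301) redirect, where the rules above use the default temporary one.

```apache
# Sketch: send every plain-HTTP request to the same host and path over https.
RewriteEngine On
RewriteCond %{SERVER_PORT} ^80$
RewriteRule ^(.*)$ https://%{HTTP_HOST}/$1 [R=301,L]

# Variant for one folder only ("somefolder" is the placeholder used above).
# RewriteCond %{SERVER_PORT} ^80$
# RewriteCond %{REQUEST_URI} ^/somefolder(/|$)
# RewriteRule ^(.*)$ https://%{HTTP_HOST}/$1 [R=301,L]
```

In a root-level .htaccess the matched path has no leading slash, which is why the target adds one before $1. The query string is carried over automatically.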

10. Maintenance page

Sometimes while working on your page you do not want to allow access until all work is complete and tested. Use the following:

RewriteEngine on
RewriteCond %{REQUEST_URI} !/maintenance.html$
RewriteCond %{REMOTE_ADDR} !^123\.123\.123\.123
RewriteRule $ /maintenance.html [R=302,L]

In a future episode we might look into more advanced techniques.

Sources / More info: 1&1, best-mime, iana-mime, htaccess-apache, htaccess-tips, webhostingtalk, htaccess-cracking, htaccess-wp, apache-auth, w3-http-error-codes, htaccess-eg, yt-htaccess
