Apache mod_rewrite + Vanity URLs + PHP


I'm trying to make the most of my year-end break by familiarizing myself with some additional web development tools like PHP, HTML5 and jQuery. One of the projects I'm working on is a sort of rich-content internet clipboard like my clipray.net site, but with the added capability of storing images, files and rich text. One thing I really like about clipray is that I can create a new clipboard just by typing a new clipboard name after the site name in the URL bar (like http://clipray.net/newclipboard). This was achieved using Rails Routing.

I was able to replicate this behavior using PHP and Apache.

This article should:

  • Describe mod_rewrite and its rule system
  • Show a working example of a VirtualHost directive that includes mod_rewrite rules
  • Show how to make this work on the PHP side of things

Notes

 

mod_rewrite:

mod_rewrite is an Apache module that rewrites or redirects a requested URL based on a set of regular-expression rules. It comes in handy if you want your site to have a user-friendly URL system (commonly referred to as Vanity URLs). The Apache articles linked above explain the module in detail.

 

Enable the RewriteEngine and set up Rules:

 
You can enable the rewriting engine by applying this configuration line:

RewriteEngine On

Once enabled you can create RewriteRules. Here is an example of a working rewrite / redirection rule:

RewriteRule ^/([a-zA-Z0-9]+)/*$ /clip.php?id=$1

If a user were to access a website which uses this rule (say http://html5clipboard.net/newclipboard) this is what would happen:

  1. The rewrite rule is evaluated and /newclipboard is found to match the regex
  2. The name of the clipboard (without the leading slash) is captured by the parentheses and stored in the $1 variable
  3. This is the URL path that actually gets accessed: http://html5clipboard.net/clip.php?id=newclipboard

 

There are a few things to note:

  • Rules start with the RewriteRule directive and a space
  • The first argument is a regular expression which describes the paths you want to rewrite
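    • The ^ (caret) and $ (dollar sign) symbols anchor the regex to the start and end of the URL path, so the whole path has to match
  • The second argument is the path the request is rewritten to if the rule matches the URL input
    • Note the $1 symbol in the above example. This expands to whatever the first pair of parentheses in the regex (the first argument) captured.
  • There is also an optional third argument which isn't used in this example: a bracketed list of flags such as [L] (stop processing further rules) or [R] (send an external redirect). Details can be found in the mod_rewrite docs listed in the Notes section above.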
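
To get a feel for what the pattern accepts before touching the Apache config, it can be tried out in plain PHP. mod_rewrite uses Perl-compatible regular expressions, so preg_match is a reasonable stand-in. This is only a sketch: the function name matchVanityPath and the sample paths are made up for illustration, and the pattern is written with a capture group around the name so that there is something for $1 to hold.

```php
<?php
// Hypothetical helper (not part of Apache or the article's site): mimics how the
// vanity rule's pattern would be matched against a URL path.
function matchVanityPath(string $path): ?string
{
    // '#' is used as the regex delimiter so the slashes need no escaping.
    // $matches[1] plays the role of $1 in the RewriteRule substitution.
    if (preg_match('#^/([a-zA-Z0-9]+)/*$#', $path, $matches) === 1) {
        return $matches[1];
    }
    return null; // the rule would not fire for this path
}

// Sample paths chosen for illustration only.
foreach (['/newclipboard', '/newclipboard/', '/about/us', '/clip.php'] as $path) {
    $name = matchVanityPath($path);
    echo $path, ' => ', $name === null ? 'no match' : "/clip.php?id=$name", "\n";
}
```

Paths with a second segment or a dot in them fall outside the character class, which is why /clip.php itself is not rewritten again.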

 

Configuring mod_rewrite rules in an Apache VirtualHost directive:

While there are lots of resources online that describe how to use rewrite rules, there are comparatively few which discuss the details of implementing them in a VirtualHost. I wasn't able to find one that showed a complete VirtualHost directive that is set up with rewrite rules. Having a working example easily visible goes a long way towards helping me understand how to apply something in my own environment.

At the bottom of the Apache httpd.conf file (it could be named differently if you are using a Linux distribution other than CentOS 6) there are configuration settings for VirtualHosts. In order to get non-SSL virtual hosting to work you must enable name-based virtual hosts in your Apache config file. Here is a copy/paste of the snippet that must be enabled:

#
# Use name-based virtual hosting.
#
NameVirtualHost *:80

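Basically, you need to uncomment the NameVirtualHost line shown above. It should be easy to find. (This applies to Apache 2.2, which is what CentOS 6 ships. In Apache 2.4 the NameVirtualHost directive is deprecated and no longer needed.)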

Once NameVirtualHost is enabled (*:80 means it applies to all IP addresses on port 80) you can create a VirtualHost directive which contains the rewrite rules. Here is a working example. The mod_rewrite configuration lines are the ones that start with RewriteEngine, RewriteCond and RewriteRule:

<VirtualHost *:80>
    ServerName html5clipboard.net
    ServerAlias www.html5clipboard.net

    DocumentRoot /var/www/apps/php/webapp

    #
    # Enable rewrite engine
    RewriteEngine On

    #
    # Rewrite rules that apply to this virtual host
    # Leave a space for site-specific pages
    RewriteRule ^/about/us$ /about.php [L]
    RewriteRule ^/about/legal$ /legal.php [L]

    # Main rule which allows for arbitrary vanity URLs.
    # The two conditions skip the rule when the request maps to a real
    # folder or file. Conditions only apply to the rule directly below them.
    RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-d
    RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-f
    RewriteRule ^/([a-zA-Z0-9]+)/*$ /service/clip.php?id=$1 [L]

    # Logging
    ErrorLog logs/webapp.net-error_log
    CustomLog logs/webapp.net-access_log common
</VirtualHost>

Here is a line-by-line explanation of the above VirtualHost configuration. I'll explain everything so it is clear what applies to mod_rewrite and what is just a normal part of a VirtualHost directive:

  • <VirtualHost *:80>
    • Specifies the start of the virtual host configuration
  • ServerName html5clipboard.net
    • Sets the hostname that this V-Host answers to
  • ServerAlias www.html5clipboard.net
    • Indicates that this V-Host will also respond to traffic destined for the www subdomain
  • DocumentRoot /var/www/apps/php/webapp
    • Specifies the directory that holds the html/php/other files that correspond to the website
  • RewriteEngine On
    • Turns on mod_rewrite for this virtual host
  • RewriteRule ^/about/us$ /about.php [L]
  • RewriteRule ^/about/legal$ /legal.php [L]
    • These are two rules we want to be tried before the main vanity URL rule. Rules are processed in the order they are defined in the config file (top to bottom). The [L] flag stops rule processing as soon as one of them matches.
  • RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-d
  • RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-f
    • These two conditions prevent the next rule from matching real folders or files that exist in your document root location. A RewriteCond only applies to the RewriteRule that immediately follows it, which is why they sit directly above the vanity URL rule. Inside a VirtualHost the request has not been mapped to the filesystem yet, so the path is built from %{DOCUMENT_ROOT} and %{REQUEST_URI} rather than read from %{SCRIPT_FILENAME}.
  • RewriteRule ^/([a-zA-Z0-9]+)/*$ /service/clip.php?id=$1 [L]
    • This is the vanity URL rule, which will match if none of the other rules do. The parentheses capture the clipboard name so that $1 has something to expand to.
  • ErrorLog logs/webapp.net-error_log
  • CustomLog logs/webapp.net-access_log common
    • These two directives specify where logging for the VirtualHost should take place.
  • </VirtualHost> - Indicates the end of a virtual server directive
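
The top-to-bottom ordering is easier to see with a toy model. The sketch below is plain PHP and only imitates the idea: it walks an ordered table of patterns modelled on the VirtualHost above (written with a capture group in the vanity pattern) and stops at the first one that matches, as if every rule carried the [L] flag. The constant REWRITE_RULES and the function rewritePath are hypothetical names, and real mod_rewrite does a good deal more (conditions, flags, filesystem checks).

```php
<?php
// Illustrative only: an ordered rule table modelled on the VirtualHost example.
// Keys are PCRE patterns, values are substitution strings.
const REWRITE_RULES = [
    '#^/about/us$#'         => '/about.php',
    '#^/about/legal$#'      => '/legal.php',
    '#^/([a-zA-Z0-9]+)/*$#' => '/service/clip.php?id=$1',
];

// Hypothetical helper: returns the path after the first matching rule is applied.
function rewritePath(string $path): string
{
    foreach (REWRITE_RULES as $pattern => $target) {
        if (preg_match($pattern, $path) === 1) {
            // preg_replace expands $1 in the target, much like the
            // substitution string of a RewriteRule.
            return preg_replace($pattern, $target, $path);
        }
    }
    return $path; // no rule matched, so the path is left unchanged
}

// Sample paths chosen for illustration only.
foreach (['/about/us', '/about/legal', '/newclipboard', '/img/logo.png'] as $path) {
    echo $path, ' => ', rewritePath($path), "\n";
}
```

Because the site-specific patterns come first, /about/us never reaches the vanity rule. A single-segment path such as /about would, though, and would be treated as a clipboard name.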

 

Parsing the Vanity URL in PHP:

All of the heavy lifting has been done by this point. What remains is to extract the vanity URL via PHP and process it according to your wishes. In the example listed at the top of the article I was using a PHP file named clip.php to perform some action based on the Vanity URL input.

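Here is the complete source for clip.php:

<?php
// REQUEST_URI still holds the path the visitor originally typed, e.g. /newclipboard
$clipboard = $_SERVER['REQUEST_URI'];
$clipboard = trim($clipboard, "/");
// Escape the name before echoing it so markup in the URL cannot end up in the page
echo htmlspecialchars($clipboard, ENT_QUOTES, 'UTF-8') . " has been activated!";
?>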

 

Explanation:

  • The name of the clipboard / vanity URL is read in from $_SERVER['REQUEST_URI']
    • $_SERVER is a PHP superglobal variable which contains information about the web request. REQUEST_URI keeps the originally requested path even after Apache has rewritten the request internally.
  • Forward slashes are trimmed from the beginning/end of the request
  • A message with the name of the clipboard is displayed back to the user

Not much to it, but you get the idea. With the name you could look up user information in a database or pull data from a web service / alternate document location on your web server.
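
Before using the name for a lookup, it is worth checking that it really looks like a clipboard name. What follows is a sketch of one way to do that, not the code running on my site. It assumes the rewrite rule captures the name and passes it along as the id query parameter, falls back to the request path if it doesn't, and only accepts the characters the rewrite pattern allows. The function name clipboardNameFromRequest is made up for this example.

```php
<?php
// Hypothetical helper, not part of the original clip.php.
// Returns a validated clipboard name, or null if the request does not carry one.
function clipboardNameFromRequest(array $query, string $requestUri): ?string
{
    // Prefer the id parameter (assumed to be set by the rewrite rule).
    $candidate = $query['id'] ?? null;
    if (!is_string($candidate)) {
        // Fall back to the path part of the original request, minus slashes.
        $candidate = trim((string) parse_url($requestUri, PHP_URL_PATH), '/');
    }

    // Accept only the characters the rewrite pattern allows.
    // The D modifier stops $ from matching before a trailing newline.
    return preg_match('/^[a-zA-Z0-9]+$/D', $candidate) === 1 ? $candidate : null;
}

$name = clipboardNameFromRequest($_GET, $_SERVER['REQUEST_URI'] ?? '/');
if ($name === null) {
    http_response_code(404);
    echo 'Unknown clipboard';
} else {
    // Escaping is redundant after validation, but cheap to keep as a habit.
    echo htmlspecialchars($name, ENT_QUOTES, 'UTF-8') . ' has been activated!';
}
```

Keeping the validation in a small function means the same check can be reused wherever the name is later handed to a database query or a file path.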