Force a www Subdomain Using An HTAccess File

There is a well-known issue involving the indexing of websites in search engines like Google. The URL http://imperium.ca/ is different from the URL http://www.imperium.ca. We expect them to take us to the same site because they usually do. Unfortunately, they could be set up to take you to completely different sites, so search engines like Google need to track them separately.

As a result, Google may split your page rank across these multiple domains and cause half to be listed as supplemental results[1]. If the pages are deemed to be duplicate content, they could even be dropped from the search engine results altogether. The solution is to use a consistent form so that the search engines never track the second domain. Doing a 301 redirect will force a consistent domain and solve your problems.

You can find lots of 301 redirect solutions (like these examples from webconfs.com or these ones from this Google group) on the web that use code or htaccess files. Unfortunately, these examples cannot be generalized to every situation. Each solution is built for a specific domain. HTAccess files are built with the flexibility of a full regular expression engine. By taking advantage of that, we can build a script that manages the format of your URL so that the same file can be used for any domain.

HTAccess
HTAccess files are a great way to configure your Apache web server on a per-directory basis. An htaccess file in your root directory will be applied to all subdirectories, but placing one in a subdirectory can override configurations from parent htaccess files. It’s incredibly easy to maintain. With the mod_rewrite module enabled, it’s incredibly powerful too. But, as I learned, it is also incredibly unintuitive to build or test. I’ll walk you through the steps that I took to redirect all website traffic to adhere to a consistent format.

There are a number of syntax items and terms that you need to understand first. I’ve defined the ones you will use, but be careful. Some terms can mean something different in a different context.

Directives
RewriteCond specifies a condition that must be met for the script to run the subsequent RewriteRule. It requires an input and a regular expression, and allows optional flags.

RewriteRule does the work of rewriting the URL into a different format. It requires a regular expression that is matched against the requested URL path and a substitution that describes the new URL, and it allows optional flags.

Regular Expression Syntax
A set of square brackets [ ] defines a list of characters, any one of which the regular expression can match.

A set of parentheses ( ) defines a group of rules that apply to a section of an input string. These groups have one other important use: they allow us to capture the input that matches the regular expressions in them and store it in a variable that can be called later.

A pattern that starts with the ! symbol matches only strings that do not satisfy the regular expression that follows it. This negation is a mod_rewrite convention rather than part of regular expression syntax itself.

The ^ symbol denotes the start of a string, unless it is used as the first character inside square brackets [ ]. There, it instead serves the same purpose as the ! character, but relative to the contents of the square brackets: [^/.] matches any character that is not a slash or a dot.

The $ symbol denotes the end of a string.

The . symbol represents the wildcard. It will match any character.

The \ symbol escapes special characters. For example, \. would match a dot (.) in a string rather than acting as the usual wildcard.

The * symbol matches 0 or more occurrences of the previous character or group.

The + symbol matches 1 or more occurrences of the previous character or group.
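Because these patterns are awkward to test inside Apache, it can help to try them somewhere else first. The following is a minimal sketch in Python, assuming that Python's re module behaves like Apache's regular expression engine for these simple patterns. The matches helper is hypothetical and exists only for this illustration. The ! prefix is not included because it belongs to mod_rewrite, not to regular expressions.

```python
import re


def matches(pattern: str, text: str) -> bool:
    """Return True when the whole of `text` satisfies `pattern`."""
    return re.fullmatch(pattern, text) is not None


# Square brackets: one character taken from the list.
print(matches(r"[abc]", "b"))             # True

# ^ inside square brackets: any character NOT in the list.
print(matches(r"[^/.]+", "imperium"))     # True
print(matches(r"[^/.]+", "imperium.ca"))  # False, it contains a dot

# An unescaped . is a wildcard; \. is a literal dot.
print(matches(r"a.c", "abc"))             # True
print(matches(r"a\.c", "abc"))            # False
print(matches(r"a\.c", "a.c"))            # True

# * allows zero repetitions, + needs at least one.
print(matches(r"ab*", "a"))               # True
print(matches(r"ab+", "a"))               # False

# Parentheses capture the text they match for later use.
captured = re.match(r"^([^/.]+)\.([^/.]+)$", "imperium.ca").groups()
print(captured)                           # ('imperium', 'ca')
```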

Server Variables
A % symbol is used to identify a variable that is specified in curly braces {}. For example, %{HTTP_HOST} is a variable that holds the host name from the request (the URL stripped of the protocol and the request URI), while %{REQUEST_URI} is a variable that only stores the request URI, the path at the end of a URL.
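To make the split concrete, here is a rough sketch in Python of how one URL breaks down into these pieces. The function name split_like_apache is hypothetical and the example URL is made up. It also assumes that the query string is held separately in %{QUERY_STRING}, as the Apache documentation describes, rather than being part of %{REQUEST_URI}.

```python
from urllib.parse import urlsplit


def split_like_apache(url: str) -> dict:
    """Approximate the server variables Apache would see for `url`."""
    parts = urlsplit(url)
    return {
        # Host name as sent by the browser, without protocol or path.
        "HTTP_HOST": parts.netloc,
        # Path only; browsers request "/" when the URL has no path.
        "REQUEST_URI": parts.path or "/",
        # Everything after the "?" (assumed to live in its own variable).
        "QUERY_STRING": parts.query,
    }


variables = split_like_apache("http://imperium.ca/about/team.html?id=3")
print(variables["HTTP_HOST"])     # imperium.ca
print(variables["REQUEST_URI"])   # /about/team.html
print(variables["QUERY_STRING"])  # id=3
```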

Enable Apache’s mod_rewrite module
The first thing your htaccess file needs to do is turn on the rewrite engine that the mod_rewrite module provides. The module itself must already be loaded by the server. As long as your server is configured to allow you to override these options, all you need to do is add these lines.

Options +FollowSymlinks
RewriteEngine On
RewriteBase /

Force This URL to Have a Subdomain
We want to make sure that a subdomain is always in the URL. But if some other subdomain is already present, we don’t want to overwrite it. So, these rules should capture all requests where the host is not empty and has exactly one dot (.) in it, then redirect them to a counterpart that has the ‘www’ subdomain specified.

# don't capture any requests where the HTTP_HOST is empty
# but do capture requests that have exactly 1 dot
# then permanently (301) redirect these requests to a properly formatted URL.
RewriteCond %{HTTP_HOST} !^$
RewriteCond %{HTTP_HOST} ^([^/.]+)\.([^/.]+)$ [NC]
RewriteRule ^(.*)$ http://www.%1.%2%{REQUEST_URI} [L,R=301]

The %1 and %2 you see in the last line are what distinguish this code from most other htaccess files you will find. In the previous RewriteCond, we capture the input that matches the regular expressions in the parentheses. We can then output these in our RewriteRule using the % symbol. The number identifies which captured group we want to use (a third set of parentheses in the RewriteCond regular expression could be used by adding %3).
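The decision these three lines make can be approximated outside Apache. The sketch below is an illustration in Python, not what mod_rewrite runs internally. The function name force_www is hypothetical, and the pattern is written with the dot between the two groups escaped as \. so that it only matches a literal dot.

```python
import re
from typing import Optional

# Same shape as the RewriteCond pattern; re.IGNORECASE stands in for [NC].
HOST_PATTERN = re.compile(r"^([^/.]+)\.([^/.]+)$", re.IGNORECASE)


def force_www(http_host: str, request_uri: str) -> Optional[str]:
    """Return the redirect target, or None when no redirect applies."""
    if http_host == "":
        return None  # first condition: skip requests with an empty host
    match = HOST_PATTERN.match(http_host)
    if match is None:
        return None  # second condition: host must have exactly one dot
    first, second = match.groups()  # play the role of %1 and %2
    return f"http://www.{first}.{second}{request_uri}"


print(force_www("imperium.ca", "/about/"))       # http://www.imperium.ca/about/
print(force_www("www.imperium.ca", "/about/"))   # None
print(force_www("blog.imperium.ca", "/about/"))  # None
```

One consequence of the exactly-one-dot test is that a bare host with two dots, such as the made-up example.co.uk, is treated as if it already had a subdomain and is left alone.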

Force a trailing slash
The absence of a trailing slash (/) compared to the presence of one can cause the same problems as the absence (or presence) of a subdomain. Obviously, we wouldn’t want to force a URL to a physical file to have a slash at the end, but we want a uniform way to print our URLs.

# don't capture any requests that are actual files
# but do capture all other requests that do not end in a slash
# then permanently (301) redirect these requests to instead have a trailing slash
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} ^.+[^/]$
RewriteRule ^(.+)$ $1/ [L,R=301]

In the last line, you will notice a $1 in the RewriteRule. This is a lot like the %1 and %2 in the previous code segment that forced a subdomain, but the difference is where the variable is pulled from. Using a $ rather than a % tells the rule to retrieve the variable from the regular expression of the RewriteRule itself rather than from the RewriteCond regular expression. Just like with the %, the variable is retrieved from the content that matches the parentheses.
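The same kind of dry run works for this rule. The following Python sketch is only an approximation: force_trailing_slash is a hypothetical name, and the is_file argument stands in for Apache's -f check against the real filesystem, which this sketch does not perform.

```python
import re
from typing import Optional


def force_trailing_slash(request_uri: str, is_file: bool) -> Optional[str]:
    """Return the redirect path, or None when no redirect applies."""
    if is_file:
        return None  # first condition: leave real files alone
    if re.match(r"^.+[^/]$", request_uri) is None:
        return None  # second condition: already ends in a slash (or is "/")
    return request_uri + "/"  # the $1/ substitution


print(force_trailing_slash("/about", is_file=False))    # /about/
print(force_trailing_slash("/about/", is_file=False))   # None
print(force_trailing_slash("/logo.png", is_file=True))  # None
```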

The Downside
The only issue I have found with this so far is that a mod_rewrite redirect will cause posted data to be lost. The post data doesn’t get reposted when you redirect, because browsers typically follow the redirect with a plain GET request. The htaccess file just isn’t equipped with the tools to manage this.

I’ve had to go through all of my forms and ensure that the action attribute always points to a properly formatted URL. It is an unfortunate issue because it means that you can’t just drop the file into any web project and assume your issues are over. Furthermore, it can be a difficult issue to find if you are not aware of this problem or have not thoroughly tested your code after implementing this script.

Luckily, if you are implementing this before you start your project, it won’t be much of a problem. Typos in the action attribute can be caught during your usual coding/testing cycles.

The Upside
I’ve already touched on a few advantages in this article:

Your URLs are canonical so search engines like Google know what to index. This will maintain or improve your page rank.
The script is built with flexible regular expressions so that you can put this file into any project without needing to change any code.
In addition to this, we gain a consistency that we can inherently count on. If you’ve ever tried to pull the SERVER_NAME server variable from your code, you would know that this returns whatever your URL specifies. A check against this variable to see what page/domain a user is requesting often forces you to compare against numerous different possible URLs to ensure you have covered all cases. Using this script, you can count on your URLs consistently being in the same expected format.

Additional Resources
While researching this solution, I bookmarked the links that best aided me. If you need to know more about htaccess files so you can customize yours to be tailored to your specific problem, then I recommend reviewing these sources.

Apache Documentation
Data Koncepts
Ranking Labs
mod_rewrite, a beginner’s guide by Neil Crosby
Matt Cutts: Gadgets, Google, and SEO

New Healthcare Technology Shifts Attention From Medical Charts To Patient Needs

In collaboration with Scribe Healthcare Technologies, Imperium Inc. has created an application that allows medical practitioners and staff to spend more time with patients and less with paperwork.

The new application, Scribe Direct, automates the transfer of audio files from medical offices to transcriptionists at Scribe Healthcare Technologies. During the transfer, the application automatically archives the audio files according to doctor so they can be accessed at later dates. Once the transfer has been successfully completed, the application creates a log file to record each success. If, for whatever reason, an error occurs during transfer, Scribe Direct reattempts the transmission until it is completed successfully.

Through correspondence with Scribe’s President and Chief Executive Officer, Mark D. Boyce, this application has been fine-tuned to ensure that Scribe’s clients are presented with a product that meets their needs. Prior to the application’s development, doctors who use transcriptionists would create treatment notes on audio recording devices, instead of filling out charts and making handwritten notes during or after treatment. These audio files would then be sent to Scribe and, once received, transcriptionists would convert the audio to text. Automating the file transfer and storage procedures in this process saves doctors and their staff time, thereby improving the overall efficiency of the entire medical office.

For more information on Scribe’s services and products, please visit www.scribe.com.

Imperium Brings New Sunnyside Site to Life

Calgary’s own Sunnyside Home & Garden Centre has decided to better service their customers by making home and garden tips available online. Their new website provides access to everything home owners and gardeners require to make plans for customizing and decorating living spaces.

The key features of the new site — step-by-step project tutorials, a vast plant database, and printable brochures containing recommendations regarding home and garden maintenance — are all designed to aid and inspire. Users can visit the site to contact home and garden experts for advice or to view current weather forecasts. Sunnyside’s return policies and upcoming sales and events, as well as changes to store hours are now posted. In addition, directions to the home and garden centre from any location are accessible on the website. Product and company information are also posted on the site.

Imperium Inc. designed Sunnyside’s website with the needs of its users in mind and with consideration to the company’s ideals. Sunnyside is a company with roots that date back to the early 1900s. A business such as Sunnyside appreciates the value of roots for establishing growth, so it strives to stay in touch with and support what makes it the successful company it is today — its customers.

For more information, please visit www.sunnysidehomeandgarden.com.

Laptop Donated to Support Animal Care

To show its support, Imperium Inc. has donated a 10-inch MSI Netbook with a 120 GB hard drive to the Cochrane & Area Humane Society for its Saving Lives – Changing Lives Fundraising Gala. The 6th annual event will be held Saturday, June 13th, 2009 at the Cochrane Ranche House.

The laptop, along with many other exciting items, will be displayed in an auction to raise funds for animal care. The evening will feature both a live auction and a silent one and all proceeds from the evening go directly towards providing care for animals at the shelter. Those who attend will also have the opportunity to enjoy cocktails, a delicious banquet, and live entertainment. Special guests include local artist Graham Mehain, blues guitarist Leroy Cassell, and Canadian Country Music Award Winner Jessie Farrell.

In previous years, this event has been a great success thanks to the planning and organization of the shelter’s dedicated volunteers. Last year alone, the gala raised over $32,000 for the animals of the Cochrane & Area Humane Society.

Tables of eight are available. Tickets for the evening can be purchased for $125 and they usually sell out quickly. Tickets can be purchased at the shelter or pre-ordered and picked up at the door. To order tickets, e-mail tickets@cochranehumane.ca or contact Nicky Blackshaw via phone (403) 932-2072 (ext. 105).