Permanently Redirecting URLs using 301 Headers in Static and Dynamic Sites

Permanently redirecting traffic using a 301 header is the proper and fail-safe way to ensure that your web visitors continue to find content that has permanently moved to a new location. Search engine ranking is crucial for any business, and changes in web infrastructure or in best practices can change the URI of content items. Because search engine indexes take time to update (10 days to 3 months), it is important to keep the old URLs working until all external referrers have been updated.

Why there may be a need to use new paths

Since you last overhauled your website, you have probably gained knowledge and technologies have changed. You may now have a uniform naming system for your paths, ideally one that embeds keywords in document paths to maximize your ranking advantage. This makes it necessary for a web producer to create new URL paths for existing documents while keeping the old URLs active until everyone is updated.

How to implement

You cannot reliably depend on client technology to handle any task beyond displaying simple standards-compliant markup. Therefore, implement the 301 header redirect on the server, using one of the following methods:

Server-side Implementation

Windows IIS with Active Server Pages (ASP)
<%@ Language=VBScript %>
<%
Response.Status = "301 Moved Permanently"
Response.AddHeader "Location", "http://newpagelocation.com/path"
Response.End
%>

PHP
<?php
header("HTTP/1.1 301 Moved Permanently");
header("Location: /my-new-path");
exit();
?>

ColdFusion
<CFHEADER statuscode="301" statustext="Moved Permanently">
<CFHEADER name="Location" value="http://newpagelocation.com/path">

mod_rewrite (Apache Web Server)
RewriteEngine On
RewriteRule ^contact\.php$ http://newpagelocation.com/path [R=permanent,L]
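
Whichever server-side method you use, it is worth confirming that the old URL really answers with a 301 status and the expected Location header. The following is a minimal Python sketch for that check. Python is used here only as a neutral scripting language, and the function names are illustrative assumptions, not part of any of the platforms above. It sends a single HEAD request and does not follow the redirect; some servers may not answer HEAD requests the same way as GET.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status_and_location(url, timeout=10.0):
    """Send one HEAD request and return (status, Location header) without following redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        # Rebuild the request target from the path and query string.
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def is_permanent_redirect(status, location, expected_location):
    """True only for a 301 response that points at the expected new URL."""
    return status == 301 and location == expected_location
```

A possible use, with a hypothetical old URL on your own host:

```python
status, location = fetch_status_and_location("http://www.example.com/contact.php")
print(is_permanent_redirect(status, location, "http://newpagelocation.com/path"))
```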

Addendum for Web Content Management System

Placing this code in the site template and testing every requested URL to see whether the page has been redirected adds overhead to each request and can slow your system or even cause errors. It is better to create a placeholder page at the URL of each removed page and place the 301 redirect code in it.

NB: If your CMS can schedule publishing and archival of pages, you can schedule when to retire the redirect for a given page (allow about 6 months for all search engines to update their indexes).

For example, to implement this in Drupal:

  • Create a regular node of any node type
  • Select the PHP code input format
  • Place the PHP redirect code (see the PHP example above) in the body and set the destination URL to the new path alias (so that engines index the new name rather than the node ID)
  • Create a path alias for this document bearing the URL of the OLD path
  • Save and publish the document

 _________________________

How it works

Because you create a document at the same path as the old location, a request for the old URL reaches the redirecting node, which answers with a 301 redirect to the new location. This lets you keep your old URLs working while promoting your new paths and telling search engines that the page has moved.

NB: Sending a 301 header redirect is the best and proper way to redirect permanently moved pages.

Client-side Redirection

As explained at the beginning of this document, the only reliable and proper way to redirect traffic from a page that no longer exists and signal its new location is the 301 header redirect. If you have no access to server technologies, the next best thing is to use client-side scripting or a META REFRESH tag. These methods do not indicate a permanent change of URI, so bookmarks and index-driven agents such as search engines and link referrers will not know that your content has moved unless a person manually updates the path.

JavaScript
<script type="text/javascript">
window.location.href='http://newpagelocation.com/path';
</script>

META REFRESH

<html>
<head>
<meta http-equiv="refresh" content="0;url=http://newpagelocation.com/path">
</head>
<body>
This page has moved to <a href="http://newpagelocation.com/path">http://newpagelocation.com/path</a>
</body>
</html>

Drupal: Comparative Review of a Flexible and Powerful Web CMS

I first encountered Drupal (4.5) in early 2004, when a hosting provider I was using made it available to me. I had previously evaluated and used other GPL and commercial CMS applications. Drupal takes longer to get up to speed with than simpler CMS applications such as Mambo/Joomla and Magnolia, which lack flexibility, but it is easier to understand and more feature-complete than more complex and 'bloated' CMS applications from the likes of Microsoft and Vignette. Drupal's flexibility and its ability to be configured and tweaked for virtually unlimited uses, despite its apparent complexity, have led to its continued use in all my web projects.

As a web production manager by employment and a freelance CMS consultant, I find Drupal's flexibility, constantly growing toolkit of modules, and vibrant user community priceless when implementing a CMS. The daily use and maintenance of Drupal-based websites has enabled me to regularly support new members of the Drupal community, which is my small way of showing appreciation for this valuable infrastructure. This is an excerpt of an entry submitted for the Drupal Newsletter (April 2006).

Removing ?SESSID from Drupal URLs After Login and Search

?PHPSESSID= appears in indexed URLs and in all path requests after any form submission (login or search) in Drupal 4.6.6.

Drupal 4.7 Comment

This issue seems to have been resolved in Drupal 4.7.
After upgrading from Drupal 4.6.6 to Drupal 4.7, this issue stopped occurring. It seems that the problem was caused by a security fix that upgraded Drupal 4.6.5 to Drupal 4.6.6. If you are using version 4.6.6, please continue reading the rest of this article for detailed coverage of this issue as well as the suggested solutions.

Description of Problem

The session ID appearing in Drupal URLs is both unsightly and potentially damaging to your SEO objectives and to any successes already made. I am writing this document because, after getting 300 of my web pages indexed by Google, it was impossible to ignore the impact of having session IDs show in my indexed URLs. I also began to get blank pages when calling some URLs or surfing my website as a logged-in user, notably on login (/user) and search submissions.

I did some online reading to find out why this was happening. The session IDs were undoing the benefits of using clean URLs in Drupal, because they made the URLs look like query-string URLs from a dynamic application that paid no attention to SEO requirements for acceptable URL paths.

Attempted Resolution

After isolating the cases that caused the session ID to appear in the URL, and since there was no clearly documented configuration switch to get rid of it, I decided to look through the many files involved in Drupal path generation for a line referring to SESSID, since this part of the URL did not change and was probably a hard-coded prefix of that variable. I downloaded the code and searched it with a desktop application, but did not find it.

Next, I suspected that .htaccess might contain something that triggered session IDs, having read in the Drupal support forums that the session ID was being embedded to pass state between pages for users. This, however, did not explain why pages indexed by Google carried it, since the spidering bot did not and could not log in.

Why Do Indexed URLs have SESSID?

I concluded that there are three cases that cause the SESSID to appear in the URL:

  • Calling the login or user information URL - .../user
  • Using the search form
  • Browsing as a logged-in user (related to the first cause)

In my opinion, the only purpose this ID could serve is to maintain login sessions; for other visits (such as search bot indexing trips) it is not necessary.

I concluded that it appeared in indexed URLs because I had a login link at the top of the screen: once the bot called that URL, a session ID was created and then used in all the subsequently indexed paths. I quickly removed the login link to curb the problem in the short term and to ensure that Google's indexing from then on did not include the session ID.
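
To audit a list of indexed URLs for this problem, a short script can flag and clean the session parameter. The following is a sketch in Python, assuming the parameter is named PHPSESSID (PHP's default session name); the function names are made up for this example and example.com stands in for your own site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def has_session_id(url, param="PHPSESSID"):
    """True if the URL's query string carries the session parameter."""
    query = urlsplit(url).query
    return any(key == param for key, _ in parse_qsl(query, keep_blank_values=True))


def strip_session_id(url, param="PHPSESSID"):
    """Return the URL with the session parameter removed and other parameters kept."""
    parts = urlsplit(url)
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key != param]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), parts.fragment))


# Hypothetical indexed URLs, used only to show the calls.
for indexed in ["http://example.com/user?PHPSESSID=abc123",
                "http://example.com/search/node?keys=drupal&PHPSESSID=abc123"]:
    if has_session_id(indexed):
        print(indexed, "->", strip_session_id(indexed))
```

This only helps you find and report affected URLs; it does not stop the server from generating them.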

Solution

To solve this problem, I knew that I could find the file that was creating the session ID and remove or change the line to exclude it. I searched the Drupal knowledge base and found at least two existing bug reports related to this: 'PHP Session ID in Google' and 'Some URLs get ?PHPSESSID added to them'. These two 'solutions' did not work for me, either because of my hosting situation (Site5) or because the submitted patch was not compatible with Drupal 4.6.6. I persistently got a 500 error after modifying .htaccess.

Working Solution: Modifying Drupal's common.inc

I decided to edit the /includes/common.inc file in Drupal to modify the code that writes the session ID into the URL.

Between lines 170 and 189, I commented out (using /* */) the session ID fragments on lines 184 and 187 so that the session ID is never appended to the URL. A comment on an existing bug report on the Drupal website suggests that this section is unnecessary and should be removed [http://drupal.org/node/4109#comment-38607]

---------------------

function drupal_goto($path = '', $query = NULL, $fragment = NULL) {
  if ($_REQUEST['destination']) {
    extract(parse_url($_REQUEST['destination']));
  }
  else if ($_REQUEST['edit']['destination']) {
    extract(parse_url($_REQUEST['edit']['destination']));
  }

  $url = url($path, $query, $fragment, TRUE);

  if (ini_get('session.use_trans_sid') && session_id() && !strstr($url, session_id())) {
    $sid = session_name() . '=' . session_id();

    if (strstr($url, '?') && !strstr($url, $sid)) {
      $url = $url /* .'&'. $sid*/;
    }
    else {
      $url = $url /*.'?'. $sid*/;
    }
  }

---------------------
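
The effect of the two commented-out fragments is easier to see in isolation. The sketch below re-expresses the original, unmodified branch in Python purely for illustration. It is not Drupal code, and the function and parameter names are made up for this example.

```python
def append_session_id(url, session_name, session_id, use_trans_sid=True):
    """Mirror of the unmodified logic: add 'name=id' to a redirect URL."""
    # Nothing is added when transparent session IDs are off, when there is
    # no active session, or when the ID is already present in the URL.
    if not (use_trans_sid and session_id) or session_id in url:
        return url
    sid = f"{session_name}={session_id}"
    # Use '&' when the URL already has a query string, '?' otherwise.
    separator = "&" if "?" in url else "?"
    return url + separator + sid


# Hypothetical values, used only to show the call.
print(append_session_id("http://example.com/user", "PHPSESSID", "abc123"))
print(append_session_id("http://example.com/user", "PHPSESSID", "abc123",
                        use_trans_sid=False))
```

With the two fragments commented out, the edited PHP behaves like the second call for every redirect: the URL is passed through unchanged.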

This removed session IDs from my URLs, although I still get occasional blank pages when searching or logging in. I currently attribute these to the theme that I am using (Simple2), because I previously tested with a different theme and did not get blank pages. Once I build a new theme, I will switch to it to eliminate the second problem. In the meantime, any pages that get indexed will not have the ugly session ID in the URL, and hopefully when Google re-indexes the old pages, it will clear the session IDs.

Recommended Solution

Because this problem is a PHP session issue, the recommended solution is to block the use of session IDs in the URL to maintain state. This requires all applications to use cookies instead of URL session IDs to maintain state from one page to another.

Add the following lines to the .htaccess file:

php_value session.use_only_cookies 1
php_value session.use_trans_sid 0

Common instructions say that you should just place these lines in the .htaccess file. When I did that, I got a 500 error; the error went away when I wrapped the lines in an IfModule block:

<IfModule mod_php4.c>
php_value session.use_only_cookies 1
php_value session.use_trans_sid 0
</IfModule>

Note: a 500 error caused by php_value usually means that PHP is not loaded as an Apache module (for example, it runs as CGI). In that case the IfModule wrapper only makes Apache skip the directives, so they have no effect, and the settings.php approach described next is the one that applies.

NB: Drupal's settings.php includes some run-time initialisation settings. It is NOT recommended to repeat the same setting in multiple places, as this could create a conflict and cause lengthy troubleshooting.

Here are the settings in Drupal's settings.php that do the same thing as the .htaccess additions discussed above:

ini_set('session.use_only_cookies', 1);
ini_set('session.use_trans_sid', 0);


Drupal: Sending email without an SMTP Server configured in PHP

Because most Drupal installations are assumed to run on Unix or Linux/LAMP platforms, there is not enough information on how to implement the email function when installing on Windows or any other operating system that lacks the sendmail engine typical of Linux and other Unix-like operating systems.
