301-redirect.ppt 331KB Jun 23 2011 07:22:58 AM
301 Redirect:
How Do I Love You, Let Me Count the Ways
presented by Stephan Spencer,
Founder & President, Netconcepts
© 2009 Stephan M Spencer Netconcepts www.netconcepts.com sspencer@netconcepts.com
Time To Drink From the Firehose!
No need to take furious notes though.
(Phew!)
Download this Powerpoint right now
from
www.netconcepts.com/learn/301redirect.ppt
Let’s Go Under the Hood with 301s
In .htaccess (or httpd.conf), you can redirect
individual URLs, the contents of directories, or entire
domains:
– Redirect 301 /old_url.htm http://www.example.com/new_url.htm
– Redirect 301 /old_dir/ http://www.example.com/new_dir/
– Redirect 301 / http://www.example.com/
Pattern matching can be done with RedirectMatch 301
– RedirectMatch 301 ^/(.+)/index\.html$ http://www.example.com/$1/
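To see what the RedirectMatch pattern above captures, here is a minimal sketch that uses Python's re module as a stand-in for Apache's regex engine. That the two engines agree on this simple pattern is an assumption, and the function name redirect_match is hypothetical.

```python
import re

# Same pattern and target as the RedirectMatch line; Python's \1 plays the role of $1.
PATTERN = re.compile(r"^/(.+)/index\.html$")
TARGET = r"http://www.example.com/\1/"

def redirect_match(path):
    """Return the redirect target for path, or None when the pattern does not match."""
    match = PATTERN.match(path)
    return match.expand(TARGET) if match else None

print(redirect_match("/blog/2009/index.html"))
print(redirect_match("/index.html"))
```

Note that a root-level /index.html is not matched, because (.+) needs at least one character between the two slashes.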
301 Redirects via Rewrite Rules
My preference is to use Apache’s mod_rewrite
module and set up rewrite rules that use the
[R=301] flag. Or, if you’re on Microsoft IIS Server, use the
ISAPI_Rewrite plugin.
The rewrite rules go in either .htaccess or your
Apache config file (e.g. httpd.conf, sites_conf/…)
– Precede all the rewrite rules with the line “RewriteEngine on”
– If within .htaccess, also add another line “RewriteBase /”
(never add it to the server config). In .htaccess, rules are
matched without the leading slash, so you won’t need “^/” at
the beginning of your rules, just “^”
An Example Rewrite Rule
A simple example for httpd.conf
– RewriteRule ^/(.*)/index\.html$ /$1/ [R=301,L]
Store stuff in memory with () then access
via variable $1
A rough equivalent for .htaccess
– RewriteBase /
– RewriteRule ^(.*)/?index\.html$ /$1/ [R=301,L]
Ah, but there’s an error with the rule
immediately above. Hint: “.*” is “greedy”
The Magic of Regular Expressions
You need to become a master of pattern matching
– * means 0 or more of the immediately preceding character
– + means 1 or more of the immediately preceding character
– ? means 0 or 1 occurrence of the immediately preceding char
– ^ means the beginning of the string, $ means the end of it
– . means any character (i.e. wildcard)
– \ “escapes” the character that follows, e.g. \. means dot
– [ ] is for character ranges, e.g. [A-Za-z].
– ^ inside [] brackets means “not”, e.g. [^/]
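Each of the metacharacters above can be tried out directly. The sketch below uses Python's re module, which is an assumption standing in for Apache's engine (the two agree on these basics). The sample patterns and strings are made up for illustration.

```python
import re

# (pattern, sample string, whether a match is expected)
CASES = [
    (r"^ab*c$", "ac", True),          # * allows zero occurrences of "b"
    (r"^ab+c$", "ac", False),         # + requires at least one "b"
    (r"^ab?c$", "abbc", False),       # ? allows at most one "b"
    (r"^a.c$", "a/c", True),          # . matches any single character
    (r"^a\.c$", "a/c", False),        # \. matches only a literal dot
    (r"^[A-Za-z]+$", "Hello", True),  # character range
    (r"^[^/]+$", "a/b", False),       # ^ inside [] negates the class
]

def results():
    """Return whether each pattern matched its sample string."""
    return [re.search(pattern, text) is not None for pattern, text, _ in CASES]

for (pattern, text, expected), got in zip(CASES, results()):
    print(pattern, repr(text), "matched" if got else "no match")
```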
Regular Expression Errors
Incredibly easy to make errors in regular
expressions
When debugging, RewriteLog and
RewriteLogLevel (4+) are your friends! (On Apache 2.4
and later, use “LogLevel rewrite:trace4” instead.)
Back to the previous example...
– RewriteRule ^(.*)/?index\.html$ /$1/ [L,R=301]
What’s the problem? .* is greedy and so it will
capture the “/” within memory
– http://www.example.com/blah/index.html redirects to http://www.example.com/blah//
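The double slash is easy to reproduce outside Apache. Below is a minimal sketch using Python's re module as a stand-in for mod_rewrite's matching (an assumption). The helper name rewrite is hypothetical, and the non-greedy pattern is just one possible fix, not necessarily the one the author would pick.

```python
import re

def rewrite(pattern, path):
    """Apply pattern to an .htaccess-style path (no leading slash) and build /$1/."""
    match = re.match(pattern, path)
    return f"/{match.group(1)}/" if match else None

GREEDY = r"^(.*)/?index\.html$"   # the rule from the slide
LAZY = r"^(.*?)/?index\.html$"    # one possible fix: make the capture non-greedy

print(rewrite(GREEDY, "blah/index.html"))  # /blah//
print(rewrite(LAZY, "blah/index.html"))    # /blah/
```

Neither variant handles a bare index.html at the site root, which comes out as //. That case would need its own rule.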
Regular Expression Gotchas
“Greedy” expressions. Use a negated class such as [^/] or .*? instead
of .*
– e.g. [^/]+/[^/] instead of .*/.*
– e.g. ^(.*?)/ instead of ^(.*)/
.* can match on nothing. Use .+ instead
– e.g. .+/ instead of .*/
Unintentional substring matches because ^ or
$ wasn’t specified or . was used for a dot
instead of \.
– e.g. ^/default\.htm$ instead of /default.htm
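The substring gotcha is worth seeing with real inputs. This sketch compares the loose and strict patterns from the last bullet, using Python's re.search as a stand-in for Apache's unanchored matching (an assumption). The sample paths are invented.

```python
import re

LOOSE = r"/default.htm"       # unanchored, unescaped dot
STRICT = r"^/default\.htm$"   # anchored, literal dot

def matches(pattern, path):
    """Return True when pattern is found anywhere in path."""
    return re.search(pattern, path) is not None

for path in ["/default.htm", "/default.html", "/docs/default_htm_notes"]:
    print(path, matches(LOOSE, path), matches(STRICT, path))
```

The loose pattern fires on all three paths; the strict one fires only on the first.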
Let’s Go Deeper Down the Rabbit Hole
A more complex example
– RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
– RewriteRule ^(.*)$ http://www.example.com/$1 [L,R=301]
[NC] flag makes the rewrite condition case-insensitive
[L] flag saves on server processing
[QSA] flag not needed. The original query string is carried
over by default when the destination URL has none of its own.
Don’t want the query string maintained? Put
? at the end of the destination URL in the rule.
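The host check and the query-string behavior can be modeled in a few lines. This is a rough sketch, not how Apache implements it: the function canonical_redirect and its keep_query switch are hypothetical names, with keep_query=False standing in for a trailing ? on the destination URL.

```python
import re

CANONICAL_HOST = "www.example.com"
HOST_OK = re.compile(r"^www\.example\.com$", re.IGNORECASE)  # re.IGNORECASE mirrors [NC]

def canonical_redirect(host, path, query="", keep_query=True):
    """Return the 301 target for a non-canonical host, or None if no redirect is needed."""
    if HOST_OK.search(host):
        return None
    target = f"http://{CANONICAL_HOST}/{path.lstrip('/')}"
    # By default the original query string rides along; keep_query=False models
    # a trailing "?" on the substitution.
    if keep_query and query:
        target += "?" + query
    return target

print(canonical_redirect("example.com", "/page.html", "a=1"))
print(canonical_redirect("example.com", "/page.html", "a=1", keep_query=False))
print(canonical_redirect("WWW.Example.COM", "/page.html"))
```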
Speaking of Tracking Parameters
Here’s how to 301 static URLs with a tracking
param appended to their canonical equivalent
(minus the param)
– RewriteCond %{QUERY_STRING} ^source=[a-z0-9]*$
– RewriteRule ^(.*)$ /$1? [L,R=301]
And for dynamic URLs...
– RewriteCond %{QUERY_STRING} ^(.+)&source=[a-z0-9]+(&?.*)$
– RewriteRule ^(.*)$ /$1?%1%2 [L,R=301]
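To check which query strings the two conditions actually catch, here is a small sketch that runs the same patterns through Python's re module (an assumption standing in for mod_rewrite). The function name strip_source is hypothetical.

```python
import re

STATIC = re.compile(r"^source=[a-z0-9]*$")
DYNAMIC = re.compile(r"^(.+)&source=[a-z0-9]+(&?.*)$")

def strip_source(query):
    """Return the query string without the tracking param, or None if neither rule fires."""
    if STATIC.match(query):
        return ""
    match = DYNAMIC.match(query)
    if match:
        return match.group(1) + match.group(2)   # %1%2 in the rule
    return None

print(strip_source("source=abc"))
print(strip_source("id=5&source=abc&page=2"))
print(strip_source("source=abc&id=5"))
```

The last call returns None: when source comes first and other parameters follow, neither pattern matches, so that ordering would need a third rule.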
More Fun with Tracking Parameters
Need to do some fancy stuff with cookies before
301ing? Invoke a script that cookies the user
then 301s them to the canonical URL.
– RewriteCond %{QUERY_STRING} ^source=([a-z0-9]*)$
– RewriteRule ^(.*)$ /cookiefirst.php?source=%1&dest=$1 [L]
Note the lack of a R=301 flag above. That’s on
purpose. No need to expose this script to the
user. Use a rewrite and let the script send the
301 after it’s done its work.
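The deck does not show cookiefirst.php itself, so the following is only a guess at what such a script might do, sketched in Python rather than PHP. The function name, the cookie name source, and the choice to force a same-site path for dest are all assumptions.

```python
import re
from http.cookies import SimpleCookie

def cookie_then_redirect(source, dest):
    """Build the status and headers such a script might send: set a cookie, then 301."""
    if not re.fullmatch(r"[a-z0-9]*", source):
        source = ""   # ignore anything the rewrite condition would not have passed
    cookie = SimpleCookie()
    cookie["source"] = source
    cookie["source"]["path"] = "/"
    # dest arrives without a leading slash in .htaccess context; forcing a single
    # leading slash keeps the redirect on this site.
    location = "/" + dest.lstrip("/")
    headers = [
        ("Set-Cookie", cookie["source"].OutputString()),
        ("Location", location),
    ]
    return 301, headers

status, headers = cookie_then_redirect("abc", "products/widget")
print(status, headers)
```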
301 Retired Legacy URLs
Got legacy dynamic URLs you’re trying to phase out
after switching to static URLs? 301 them...
– RewriteCond %{QUERY_STRING} id=([0-9]+)
– RewriteRule ^get_product\.php$ /products/%1.html? [L,R=301]
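The id condition above is unanchored, so it is worth checking what else it matches. The sketch below models the rule with Python's re module (an assumption). STRICT is a hypothetical tighter variant and legacy_redirect is a made-up helper name.

```python
import re

LOOSE = re.compile(r"id=([0-9]+)")                  # the condition as written
STRICT = re.compile(r"(?:^|&)id=([0-9]+)(?:&|$)")   # hypothetical tighter variant

def legacy_redirect(path, query, cond=LOOSE):
    """Map get_product.php?id=N to /products/N.html, or return None."""
    if not re.search(r"^get_product\.php$", path):
        return None
    match = cond.search(query)
    return f"/products/{match.group(1)}.html" if match else None

print(legacy_redirect("get_product.php", "id=1001"))
print(legacy_redirect("get_product.php", "vid=7"))
print(legacy_redirect("get_product.php", "vid=7", STRICT))
```

The loose condition also fires on vid=7, because id=7 is a substring of it. The anchored variant does not.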
Switching to keyword URLs and the script can’t do
anything with the keywords if passed as params? Use
RewriteMap and have a lookup table as a text file.
– RewriteMap prodmap txt:/home/someusername/prodmap.txt
– RewriteRule ^/product/([0-9]+)$ ${prodmap:$1} [L,R=301]
301 Retired Legacy URLs
What would the lookup table for the above rule look
like?
– 1001 /products/canon-g10-digital-camera
– 1002 /products/128-gig-ipod-classic
DBM files are supported too. Faster than text file.
You could use a script that takes the requested input
and delivers back its corresponding output.
– RewriteMap prodmap prg:/home/someusername/mapscript.pl
– RewriteRule ^/product/([0-9]+)$ ${prodmap:$1} [L,R=301]
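The deck does not include mapscript.pl, so here is a hedged sketch, in Python rather than Perl, of what a map program along those lines might look like. It assumes the usual prg: convention of one key per input line and one reply per output line, with the string NULL for a miss. The names load_map and serve are hypothetical, and the table reuses the two sample entries above.

```python
import io

MAP_TEXT = """\
1001 /products/canon-g10-digital-camera
1002 /products/128-gig-ipod-classic
"""

def load_map(text):
    """Parse a txt: style map: one whitespace-separated key/value pair per line."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):   # skip blank lines and comments
            continue
        key, value = line.split(None, 1)
        table[key] = value
    return table

def serve(table, stdin, stdout):
    """Answer one lookup per input line, as a prg: map program is expected to."""
    for key in stdin:
        stdout.write(table.get(key.strip(), "NULL") + "\n")
        stdout.flush()   # replies must not sit in a buffer

out = io.StringIO()
serve(load_map(MAP_TEXT), io.StringIO("1001\n9999\n"), out)
print(out.getvalue())
```

In a real deployment the last three lines would be replaced by a call such as serve(table, sys.stdin, sys.stdout), which needs import sys.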
Canonicalization
Non-www and typo domains
– (The example mentioned earlier...)
– RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
– RewriteRule ^(.*)$ http://www.example.com/$1 [L,R=301]
HTTPS
– (If you have a separate secure server, you can skip this
first line)
– RewriteCond %{HTTPS} on
– RewriteRule ^catalog/(.*) http://www.example.com/catalog/$1 [L,R=301]
Canonicalization
If trailing slash is missing, add it
– RewriteRule ^(.*[^/])$ /$1/ [L,R=301]
– WordPress handles this by default. Yay
WordPress!
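The trailing-slash rule matches any path that does not already end in a slash, file names included. The sketch below shows that with Python's re module (an assumption standing in for mod_rewrite). SKIP_FILES is a hypothetical variant that leaves alone any path whose last segment contains a dot, and add_slash is a made-up helper.

```python
import re

ORIGINAL = re.compile(r"^(.*[^/])$")          # the rule as written
SKIP_FILES = re.compile(r"^(.*/)?[^/.]+$")    # hypothetical: last segment has no dot

def add_slash(path, pattern=ORIGINAL):
    """Return the slash-terminated target for an .htaccess-style path, or None."""
    match = pattern.match(path)
    return f"/{match.group(0)}/" if match else None

print(add_slash("about"))
print(add_slash("about/"))
print(add_slash("page.html"))
print(add_slash("page.html", SKIP_FILES))
```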
Iterative URL Optimization
When iteratively optimizing a page’s URL,
301 all previous iterations directly to the
latest iteration. Don’t daisy chain 301s.
– WordPress handles this beautifully, and by default
– Tip: Use Netconcepts’ “SEO Title Tag” plugin to
mass edit all your permalink post URLs and let
WordPress handle the 301s automagically. But
don’t then “set it and forget it”. Continue
optimizing the URLs iteratively over time to
maximize search traffic.
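If you maintain your own redirect table, avoiding daisy chains amounts to pointing every old URL at the final destination. A minimal sketch, assuming the redirects live in a plain old-to-new dictionary; the function name flatten and the sample URLs are hypothetical.

```python
def flatten(redirects):
    """Point every old URL straight at its final destination."""
    flat = {}
    for start in redirects:
        seen = {start}
        target = redirects[start]
        while target in redirects:          # keep following the chain
            if target in seen:
                raise ValueError(f"redirect loop involving {target}")
            seen.add(target)
            target = redirects[target]
        flat[start] = target
    return flat

chain = {
    "/p.php?id=1": "/product-1",
    "/product-1": "/canon-g10",
    "/canon-g10": "/canon-g10-digital-camera",
}
print(flatten(chain))
```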
If You’re on Microsoft IIS Server
ISAPI_Rewrite not that different from mod_rewrite
Rewrite rules go in httpd.ini file
Precede first rewrite rule with “[ISAPI_Rewrite]”
Capitalization and IIS’ case insensitivity w.r.t. URLs
– RewriteRule (.*) http://www.example.com$1 [I,RP,L]
Non-www and typo domains
– RewriteCond Host: (?!www\.example\.com)
– RewriteRule (.*) http://www.example.com$1 [I,RP,L]
More IIS Examples
Drop the default
– RewriteRule (.*)/default\.htm $1/ [I,RP,L]
Add trailing slash if it’s missing
– RewriteCond Host: (.*)
– RewriteRule ([^.?]+[^.?/]) http\://$1$2/ [I,RP,L]
Conditional Redirects?
Risky territory! Read
Redirects: Good, Bad & Conditional
To selectively redirect bots that request URLs with
session IDs to the URL sans session ID:
– RewriteCond %{QUERY_STRING} PHPSESSID
RewriteCond %{HTTP_USER_AGENT} Googlebot.* [OR]
RewriteCond %{HTTP_USER_AGENT} ^msnbot.* [OR]
RewriteCond %{HTTP_USER_AGENT} Slurp [OR]
RewriteCond %{HTTP_USER_AGENT} Ask\ Jeeves
RewriteRule ^(.*)$ /$1? [R=301,L]
browscap.ini provides spiders’ user agents
Conditional Redirects Not Necessary
Almost always another way (w/o using user agent or
IP)
In the above example, simply 301 everybody – bots
and humans alike – and stop appending PHPSESSID
– See http://yoast.com/phpsessid-url-redirect/ for more on
this.
– If you have to keep session IDs for functionality reasons, you
could use a script to detect whether the session has
expired, and 301 the URL to the canonical equivalent if it
has.
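Redirecting everybody means computing the same canonical URL for bots and humans. Here is a minimal sketch using Python's standard urllib.parse; the function name strip_session_id is hypothetical, and unlike a bare ? on the destination it keeps the other query parameters.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def strip_session_id(url, param="PHPSESSID"):
    """Return the URL without the session parameter, or None if it was not present."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    kept = [(key, value) for key, value in pairs if key != param]
    if len(kept) == len(pairs):
        return None   # nothing to strip, so no redirect
    return urlunsplit(parts._replace(query=urlencode(kept)))

print(strip_session_id("http://www.example.com/cart?PHPSESSID=abc123&item=7"))
print(strip_session_id("http://www.example.com/cart?PHPSESSID=abc123"))
print(strip_session_id("http://www.example.com/cart?item=7"))
```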
Matt Cutts will be talking about this topic tomorrow in
“Ask the Search Engines” session. Don’t miss it!
Capture PageRank on Dead Pages
Traditional approach is to serve up a 404, which drops that
obsolete URL out of the index, squandering that URL’s link
juice.
But what if you 301 redirect to something valuable (e.g.
your home page or the category page one level up) and
dynamically include a small error notice?
Or return a 200 status code instead, so that the spiders
follow the links on the error page? Then include a meta
robots noindex so the error page itself doesn’t get indexed.
IMPORTANT: Don’t respond to garbage (nonsense) URLs
with anything but a 404 status code. Googlebot looks for
this!
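The policy above boils down to three outcomes. This is a rough sketch under made-up data: the RETIRED and LIVE tables and the respond function are hypothetical, and it covers only the 301 option, not the 200-plus-noindex alternative.

```python
RETIRED = {
    # hypothetical retired URL -> closest live page
    "/products/old-camera.html": "/products/cameras/",
}
LIVE = {"/", "/products/cameras/"}

def respond(path):
    """Return (status, location) for a request path."""
    if path in LIVE:
        return 200, None
    if path in RETIRED:
        return 301, RETIRED[path]   # pass the old URL's links on to a relevant page
    return 404, None                # nonsense URLs must still get a real 404

print(respond("/products/old-camera.html"))
print(respond("/asdf-qwerty-12345"))
```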
Thanks!
This Powerpoint can be downloaded from
www.netconcepts.com/learn/301redirect.ppt
For a 180-minute screencast (including
90 minutes of Q&A) on SEO for large
dynamic websites, with transcripts,
email seo@netconcepts.com
Questions after the show? Email me at
stephan@netconcepts.com