htaccess file tips and tricks

nigelcopley
User
Hi everyone,

I'd like to share some changes you can make to your .htaccess file to improve your SEO.

This is the original .htaccess:

Code:
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule .*\.html$ index.php [L]

The problem is that, depending on your setup, your site's homepage may be reachable at several different URLs, such as:

www.mydomain.com/
www.mydomain.com/index.html
www.mydomain.com/index.php

This results in duplicate content in Google, which will not get you banned but will hinder your site's ability to rank well.

To counteract this, add the following lines to your .htaccess file, changing the domain on the second line to your own:

Code:
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.php\ HTTP/
RewriteRule ^(([^/]+/)*)index\.php$ http://www.mydomain.com/$1 [R=301,L]

If you have any other .htaccess tips and tricks, please continue this thread.

Nigel
Last edited by nigelcopley, 2008-10-01 12:03
2008-10-01 12:00
davidm
User
Posts: 82
Paris, France
Thanks Nigel for the tip :)

I know I have set up my TL install with additional htaccess directives, also to avoid duplicate content (this is fairly common knowledge, and I believe it would be easier if TL shipped with this code commented out so users could choose whether they need it):

If you want to redirect all requests without www to your domain with www:

Code:
RewriteCond %{HTTP_HOST} ^domain\.tld$ [NC]
RewriteRule ^(.*)$ http://www.domain.tld/$1 [QSA,L,R=301]

If you want to redirect all requests with www to your domain without www:

Code:
RewriteCond %{HTTP_HOST} ^www\.domain\.tld$ [NC]
RewriteRule ^(.*)$ http://domain.tld/$1 [R=301,L]

Now, one thing I'd like to know is how to improve the default htaccess for error handling.

I have noticed that if I type in a non-existent page this way, http://nodeo.net/toto.html -> everything works

If I type in http://nodeo.net/toto or http://nodeo.net/toto.php or http://nodeo.net/toto.js or http://nodeo.net/toto/ -> it doesn't work :(

Now it might be my setup (running a dedicated box with Debian Etch), and I could handle this with htaccess and not use TYPOlight's error page, but I don't think it's right to do it this way... Anyone?
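
One way this is sometimes approached, sketched here under assumptions: the default rule only hands URLs ending in .html to index.php, so a request such as /toto or /toto.php never reaches the CMS and Apache answers with its own 404. Broadening the pattern passes every request that does not match an existing file or directory to index.php. Whether TYPOlight then shows its error page for those URLs is an assumption and is not confirmed here.

```apache
RewriteEngine On
RewriteBase /

# Skip anything that exists on disk (images, CSS, scripts, folders)
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Assumption: send every other request, not only *.html, to the front controller
RewriteRule .* index.php [L]
```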
Last edited by davidm, 2008-11-25 17:36
++ open source enthusiast, textpattern and modx veteran, typolight student ++

.: loving typolight's versatility and flexibility, impressed by reliability and security and enjoying fast development time :.

++ amazed by the Catalog/CatalogExt/Taxonomy combo and lots of high quality modules... kudos ! +++
2008-11-25 17:34
nigelcopley
Hi David,

You're welcome. Whilst for you and me it may be common knowledge, I wouldn't say that it is for everyone else here. As for your own little problem, try putting this in your htaccess:
Code:
ErrorDocument 404 /file.html

Expanding on this, you could also have any of the following in your htaccess file:

Code:
ErrorDocument 400 /errors/badrequest.html
ErrorDocument 401 /errors/authreqd.html
ErrorDocument 403 /errors/forbid.html
ErrorDocument 404 /errors/notfound.html
ErrorDocument 500 /errors/serverr.html
Last edited by nigelcopley, 2008-11-25 17:46
2008-11-25 17:44
davidm
Thanks Nigel. As I said, I could handle my 404 as you suggest; that's what I have in mind if I don't find the answer to my question, which is: how can TYPOlight handle errors for any scenario, e.g. not only toto.html but also toto/, toto.php, etc.? That would make sense since we have an error page type. I know modx (another CMS I use) handles it, for example.
2008-11-25 17:51
nigelcopley
OK, I see what you're saying now!

If you know anything about PHP, you could take a look at PageError404.php in /system/modules/frontend. That's about as much help as I can be.
2008-11-25 18:05
davidm
I don't know much about PHP but I'll check this out and maybe take inspiration from modx, thanks :)

Edit: Ended up asking for help on this one, as I'd like to know if it's normal behavior or trouble on my part...
Last edited by davidm, 2008-11-25 20:58
2008-11-25 20:22
davidm
nigelcopley wrote:
To counteract this you should add the following lines to your htaccess file, changing the domain to your own on the second line

Code:
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.php\ HTTP/
RewriteRule ^(([^/]+/)*)index\.php$ http://www.mydomain.com/$1 [R=301,L]

Tested it, but it doesn't seem to work for me: when I type in index.php or index.html, it stays there and I don't get redirected to the site root :(
I deactivated the custom www / no-www rewrite, but nothing changes; it still doesn't work.
2008-11-26 10:16
nigelcopley
It definitely works:

www.businessfeet.com/index.php

You may also find it easier to use the robots.txt file to disallow those pages from being indexed, it works just as well!!!

User-agent: *
Disallow: /index.php
Disallow: /index.html
Last edited by nigelcopley, 2008-11-26 10:29
2008-11-26 10:27
nigelcopley
I just checked my htaccess and I have the rule in twice, once for html and once for php. Maybe someone could show us a way to condense this code.

Code:
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.php\ HTTP/
RewriteRule ^(([^/]+/)*)index\.php$ http://www.businessfeet.com/$1 [R=301,L]
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.html\ HTTP/
RewriteRule ^(([^/]+/)*)index\.html$ http://www.businessfeet.com/$1 [R=301,L]
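
The two pairs above could probably be condensed into one by using an alternation for the file extension. This is only a sketch: www.example.com is a placeholder domain, and the rule has not been tested on the sites discussed here.

```apache
# Match only what the client actually requested (THE_REQUEST), so the
# internal rewrite to index.php does not trigger a redirect loop
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.(php|html)\ HTTP/
# $1 is still the directory prefix; the extension lands in $3 and is unused
RewriteRule ^(([^/]+/)*)index\.(php|html)$ http://www.example.com/$1 [R=301,L]
```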
2008-11-26 10:32
davidm
OK, thanks a lot for the fast answer. I'll update accordingly and re-test!
2008-11-26 10:50
davidm
OK, I switched the order of the directives in the htaccess and added the www, and now index.php is properly redirected to the root, but not index.html.

Code:
# Avoid duplicate content issues

RewriteCond %{HTTP_HOST} ^www\.nodeo\.net$ [NC]
RewriteRule ^(.*)$ http://nodeo.net/$1 [R=301,L]

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.php\ HTTP/
RewriteRule ^(([^/]+/)*)index\.php$ http://www.nodeo.net/$1 [R=301,L]

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.html\ HTTP/
RewriteRule ^(([^/]+/)*)index\.html$ http://www.nodeo.net/$1 [R=301,L]

Also I dumbly failed to mention that I am using the FolderURL extension. Maybe there's something there...
2008-11-26 11:42
nigelcopley
FolderURL shouldn't cause any issues.

I did notice, though, that you are redirecting www to non-www and then redirecting index.html/index.php to www.xyz.com/.

Try removing the www. from the second and third directives.
2008-11-26 11:58
davidm
You're right. At the beginning I did it with no www, but it didn't work in either case.
Adding the www helped with index.php, which now works.

There is probably something to change in the rewrite rules for people who, like me, have a no-www redirect set up.

Not solved yet...
2008-11-26 12:15
nigelcopley
I just changed mine to a non-www redirect and it works fine.

Maybe you could post your htaccess file here so that I can see it.
2008-11-26 12:26
davidm
Thanks for the help, here it is:

Code:
# Enable mod_rewrite
RewriteEngine On
RewriteBase /

# Block any URI protocol in the query string
RewriteCond %{QUERY_STRING} (ftp|https?):|/etc/ [NC]
RewriteRule .* - [F,L]

# Block any URI protocol in the request
RewriteCond %{REQUEST_URI} (ftp|https?):|/etc/ [NC]
RewriteRule .* - [F,L]

# Rewrite TYPOlight URLs
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php [L]

# Avoid duplicate content issues

RewriteCond %{HTTP_HOST} ^www\.nodeo\.net$ [NC]
RewriteRule ^(.*)$ http://nodeo.net/$1 [R=301,L]

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.php\ HTTP/
RewriteRule ^(([^/]+/)*)index\.php$ http://www.nodeo.net/$1 [R=301,L]

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.html\ HTTP/
RewriteRule ^(([^/]+/)*)index\.html$ http://www.nodeo.net/$1 [R=301,L]

# The following directives stop screen flicker in IE on CSS rollovers. If
# needed, un-comment the following rules. When they're in place, you may have
# to do a force-refresh in order to see changes in your designs.

ExpiresActive On
ExpiresByType image/gif A2592000
ExpiresByType image/jpeg A2592000
ExpiresByType image/png A2592000
BrowserMatch "MSIE" brokenvary=1
BrowserMatch "Mozilla/4.[0-9]{2}" brokenvary=1
BrowserMatch "Opera" !brokenvary
SetEnvIf brokenvary 1 force-no-vary

As I said, I tried switching your index rules to non-www, but it doesn't work...

BTW, running Apache 2.2.8.
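
A possible explanation, not confirmed in the thread: mod_rewrite evaluates rules in order, and in the file above the catch-all rule with [L] sits before the redirects. A request for index.html matches no file on disk, so it is rewritten to index.php before the index.html redirect is ever tested, whereas index.php exists as a file, skips the catch-all, and reaches its redirect. Below is a sketch of the same rules reordered, assuming the non-www host is meant to be the canonical one. The alternation for the extension is a shorthand for the two separate index rules.

```apache
RewriteEngine On
RewriteBase /

# 1. Canonical host first: www -> non-www
RewriteCond %{HTTP_HOST} ^www\.nodeo\.net$ [NC]
RewriteRule ^(.*)$ http://nodeo.net/$1 [R=301,L]

# 2. Then strip index.php / index.html, redirecting to the same non-www host
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.(php|html)\ HTTP/
RewriteRule ^(([^/]+/)*)index\.(php|html)$ http://nodeo.net/$1 [R=301,L]

# 3. Only now hand everything else to the CMS front controller
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php [L]
```

Keeping the external redirects above the internal rewrite means they are tested against the URL the visitor typed, not against the already rewritten index.php.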
Last edited by davidm, 2008-11-26 12:48
2008-11-26 12:44