Joomla Robots.txt File

The robots.txt file is a handy little file located in the root directory of your Joomla installation. At first glance it appears to hold very little information and to be of very little use. That impression is wrong: the file is very handy, and Googlebot actually looks for it when scanning your website for pages.

Googlebot reads through the file, and each Disallow line is a request from the website asking search bots not to scan a particular directory or path. So what does this mean for our search engine optimisation efforts? For pages to rank well in the search engines, they must have valid, unique content, valid and unique titles and meta tags, and be relevant to the subject the person has just searched for.

With Joomla and its many components, we often find a few that do not work well with the most common URL rewriting programs, such as JoomSEF, OpenSEF and the like. These are the components we want to exclude from the search engines, because they do more harm than good. The components we have the most trouble with at Kanga Internet are the Events Calendar and ExtCalendar components. Both create hundreds of redundant and blank pages when scanned by the search engines, and these pages clog up the search engines as they scan your website. What we should be aiming for is valid, unique pages that our customers are searching for and that we want to push forward as our most prominent pages. Clogging leads to poor rankings.

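So how can we make this change?

We can simply add the following line to our robots.txt file, under the existing User-agent: * line, replacing /badseocomponent/ with the path of the component we want to exclude: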

{moscode}Disallow: /badseocomponent/{/moscode}
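Before uploading the file, it can be worth checking that the new line does what we expect. The following is a minimal sketch using Python's standard urllib.robotparser module. The robots.txt content, the helper name is_allowed and the sample paths are assumptions made for illustration, and Python's parser only approximates how Googlebot itself interprets the file.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content; /badseocomponent/ is the placeholder path used above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /badseocomponent/
"""

def is_allowed(robots_txt, path, agent="Googlebot"):
    """Return True if `agent` may fetch `path` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)

# Hypothetical paths: one inside the excluded component, one ordinary page.
print(is_allowed(ROBOTS_TXT, "/badseocomponent/page"))
print(is_allowed(ROBOTS_TXT, "/news/article"))
```

The first path should be reported as not allowed and the second as allowed, because a Disallow rule matches every URL whose path starts with the given prefix.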

In my last blog post I spoke about turning off PDF and print pages, but we can also stop the search engines from scanning the PDF and print pages by adding a line to the robots.txt file, like so: {moscode}Disallow: /index2.php{/moscode}

You may also want to unblock images by removing the Disallow: /images/ line, so that Google can scan your images and pictures. If you have a photo gallery, this can mean your images get indexed in Google's image search and drive more traffic to your website.

Once you have JoomSEF or another SEF URL rewriting component installed, you can also safely block the /index.php file and the /option directory, as these are both direct URLs and will no longer be necessary. Each Disallow rule must sit on its own line: {moscode}Disallow: /index.php
Disallow: /option{/moscode}
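To see the combined effect of the rules in this post, here is a rough sketch that runs a handful of URLs through Python's standard urllib.robotparser and lists the ones that would be blocked. The rule set, the helper name blocked_paths and every sample path are assumptions for illustration only. Swap in real URLs from your own site, and note that Googlebot's own matching may differ in edge cases.

```python
from urllib.robotparser import RobotFileParser

# Assumed rule set combining the lines discussed in this post.
# There is no /images/ rule, so images stay open to the search engines.
ROBOTS_TXT = """\
User-agent: *
Disallow: /badseocomponent/
Disallow: /index2.php
Disallow: /index.php
Disallow: /option
"""

# Hypothetical sample paths; replace with real URLs from your own site.
SAMPLE_PATHS = [
    "/index.php?option=com_content&task=view&id=5",
    "/index2.php?option=com_content&do_pdf=1&id=5",
    "/option,com_events/",
    "/images/gallery/photo1.jpg",
    "/services/web-design.html",
]

def blocked_paths(robots_txt, paths, agent="Googlebot"):
    """Return the subset of `paths` that `agent` is asked not to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(agent, p)]

for path in blocked_paths(ROBOTS_TXT, SAMPLE_PATHS):
    print("blocked:", path)
```

Because rules match by prefix, Disallow: /option would also block any SEF page whose address happens to begin with /option, so it is worth checking your own menu aliases against the list before relying on it.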

