Preventing pages from being indexed: the robots meta tag

August 3rd, 2007
The simplest way to keep search engine robots from indexing a particular page is to use the "robots" meta tag. For example:

<meta name="robots" content="noindex,follow">

This tells search engine robots not to index the page, but still to follow the links it contains. The robots meta tag must be placed inside the page's <head> element.
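
As a rough sketch of the placement, here is a skeleton page with the tag in position. The title and body text are placeholders made up for illustration; only the meta line comes from the example above.

```html
<html>
<head>
  <title>Example page</title>
  <!-- Ask robots not to index this page, but still to follow its links -->
  <meta name="robots" content="noindex,follow">
</head>
<body>
  <p>Page content goes here.</p>
</body>
</html>
```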

Two further directives are available: "index", which explicitly tells the robot to index the page, and "nofollow", which instructs the robot not to follow the links on it. The default behaviour is "index,follow": index the page and follow its links.
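
Taken together, the four directives give four possible combinations. They are listed side by side below for comparison only; a real page would carry just one of these tags.

```html
<!-- Default behaviour: index the page and follow its links -->
<meta name="robots" content="index,follow">

<!-- Do not index the page, but follow its links -->
<meta name="robots" content="noindex,follow">

<!-- Index the page, but do not follow its links -->
<meta name="robots" content="index,nofollow">

<!-- Neither index the page nor follow its links -->
<meta name="robots" content="noindex,nofollow">
```

Since the first combination matches the default, leaving the tag out should have the same effect on robots that honour it.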

This method is simpler to implement than a /robots.txt file, and it does not require webmaster privileges on the server. It has a major drawback, though: not every search engine robot supports this meta tag. Googlebot is one that does.

There are several reasons for not wanting a page to be indexed. Some pages act more as gateways to pages at a deeper level than as containers of content in their own right. A typical example is the front page of a news website, which contains mostly the headlines of the latest news rather than the stories themselves.
