Blogger now supports a custom robots.txt file. This is very useful because it lets us control which parts of the blog search engine robots may visit, and therefore whether a given article is likely to be indexed or not.
By default, every blog that uses the Blogger platform will have a robots.txt as follows:
User-agent: Mediapartners-Google
Disallow:
User-agent: *
Disallow: /search
Allow: /
Sitemap: http://www.example.com/feeds/posts/default?orderby=updated
Here is what each part means:
Mediapartners-Google is the robot used by Google AdSense. Leave this group as it is: if you change it by mistake, the ads served may no longer match your content.
The next group applies to all other robots, which is what the asterisk (*) stands for. In the default configuration, the line Disallow: /search keeps robots away from every URL that starts with /search, and that includes the label pages of our blog (/search/label/...), so labels are not indexed.
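To see how these default rules behave, here is a minimal sketch that feeds them to Python's standard urllib.robotparser module and asks whether a few paths may be crawled. The robot name "ExampleBot", the post path, and the label name are made up for illustration, and real search engines may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# The default Blogger rules quoted above.
DEFAULT_ROBOTS_TXT = """\
User-agent: Mediapartners-Google
Disallow:
User-agent: *
Disallow: /search
Allow: /
Sitemap: http://www.example.com/feeds/posts/default?orderby=updated
"""


def can_crawl(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if user_agent may fetch path under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


# "ExampleBot" and both paths are hypothetical sample values.
print(can_crawl(DEFAULT_ROBOTS_TXT, "ExampleBot", "/2013/01/some-post.html"))  # True
print(can_crawl(DEFAULT_ROBOTS_TXT, "ExampleBot", "/search/label/SEO"))  # False
# The AdSense robot has an empty Disallow, so nothing is off limits to it.
print(can_crawl(DEFAULT_ROBOTS_TXT, "Mediapartners-Google", "/search/label/SEO"))  # True
```

An ordinary post is allowed, a label page is blocked for general robots, and the AdSense robot can still read everything.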
Keep in mind that a single slash (/) matches your homepage and every URL beneath it. So if you want the label pages to be indexed, do not simply shorten the rule to Disallow: / because that tells robots not to crawl your blog at all. Instead, leave the Disallow value empty, as in the example below:
User-agent: Mediapartners-Google
Disallow:
User-agent: *
Disallow:
Allow: /
Sitemap: http://www.example.com/feeds/posts/default?orderby=updated
With the configuration above, all of the articles and the label pages can be indexed. To block robots from a particular page (I take my FAQ page as an example), you can simply write the following:
User-agent: Mediapartners-Google
Disallow:
User-agent: *
Disallow: /p/faq.html
Allow: /
Sitemap: http://www.example.com/feeds/posts/default?orderby=updated
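If you edit these rules often, a small script can generate the file for you. The sketch below is only an illustration: build_robots_txt is a hypothetical helper, not something Blogger provides. It reproduces the layout used in this article from a list of paths to block, and refuses a bare slash so the whole blog cannot be blocked by accident.

```python
def build_robots_txt(blocked_paths, sitemap_url):
    """Return robots.txt text in the layout used in this article.

    blocked_paths: paths to hide from general robots, e.g. ["/p/faq.html"].
    An empty list produces an empty Disallow, which blocks nothing.
    """
    for path in blocked_paths:
        if not path.startswith("/"):
            raise ValueError(f"path must start with a slash: {path!r}")
        if path == "/":
            raise ValueError("Disallow: / would block the entire blog")

    lines = [
        "User-agent: Mediapartners-Google",
        "Disallow:",  # leave the AdSense robot unrestricted
        "User-agent: *",
    ]
    if blocked_paths:
        lines += [f"Disallow: {path}" for path in blocked_paths]
    else:
        lines.append("Disallow:")
    lines.append("Allow: /")
    lines.append(f"Sitemap: {sitemap_url}")
    return "\n".join(lines) + "\n"


SITEMAP = "http://www.example.com/feeds/posts/default?orderby=updated"
print(build_robots_txt(["/p/faq.html"], SITEMAP))
```

Calling build_robots_txt([], SITEMAP) gives the configuration in which everything is indexed, while passing ["/p/faq.html"] gives the FAQ example.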
Update: Removing Disallow: /search also opens the paginated /search pages on Blogspot to robots. To fix this pagination problem, we can use the following configuration, which blocks only the pagination pages:
User-agent: Mediapartners-Google
Disallow:
User-agent: *
Disallow: /search?updated-min=
Disallow: /search?updated-max=
Disallow: /search/label/*?updated-min=
Disallow: /search/label/*?updated-max=
Allow: /
Sitemap: http://www.example.com/feeds/posts/default?orderby=updated
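The label rules above rely on the * wildcard, which, as far as I know, Python's built-in urllib.robotparser does not expand. The sketch below is a small hand-written matcher rather than any official implementation. It assumes the precedence rule described in Google's documentation and RFC 9309 (the longest matching pattern wins, and Allow wins a tie); other robots may behave differently. The sample URLs are hypothetical.

```python
import re

# The rules for "User-agent: *" from the configuration above.
RULES = [
    ("disallow", "/search?updated-min="),
    ("disallow", "/search?updated-max="),
    ("disallow", "/search/label/*?updated-min="),
    ("disallow", "/search/label/*?updated-max="),
    ("allow", "/"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")  # a trailing $ pins the end of the URL
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(rules, path):
    """Apply the assumed longest-match rule; Allow wins ties."""
    best = None  # (pattern length, is_allow)
    for directive, pattern in rules:
        if not pattern:
            continue  # an empty value restricts nothing
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), directive == "allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]


# Hypothetical sample URLs.
print(is_allowed(RULES, "/search/label/SEO"))  # True
print(is_allowed(RULES, "/search/label/SEO?updated-max=2013-01-05&max-results=20"))  # False
print(is_allowed(RULES, "/search?updated-max=2013-01-05&max-results=7"))  # False
```

The label page itself stays crawlable, while its older-posts pages and the paginated /search pages are blocked.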
After making the changes, confirm that the file looks the way we want by visiting www.example.com/robots.txt. Replace example.com with your own domain name.
Warning! Use with caution. Incorrect use of these features can result in your blog being ignored by search engines.
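As an extra safety net, a quick script can confirm that the homepage is still open to robots after an edit. This is a minimal sketch using Python's standard urllib.robotparser. The robot name "ExampleBot" is made up, and check_live_blog needs network access, so it is defined here but not called.

```python
from urllib.robotparser import RobotFileParser


def blog_is_crawlable(robots_txt: str, user_agent: str = "ExampleBot") -> bool:
    """Return True if user_agent may fetch the homepage under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, "/")


def check_live_blog(domain: str, user_agent: str = "ExampleBot") -> bool:
    """Download http://<domain>/robots.txt and run the same check."""
    parser = RobotFileParser()
    parser.set_url(f"http://{domain}/robots.txt")
    parser.read()  # performs the HTTP request
    return parser.can_fetch(user_agent, "/")


# A mistaken configuration that shuts every robot out of the whole blog.
BROKEN = "User-agent: *\nDisallow: /\n"
print(blog_is_crawlable(BROKEN))  # False
```

To test a published blog, call check_live_blog("www.example.com") with your own domain in place of www.example.com.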
Note: Last updated on January 5, 2013.