Why does anyone need a robots.txt?
Creating a robots.txt file will not improve your search engine ranking by itself, but it tells robots which files and directories you do not want crawled and indexed. Whenever a robot crawls your site, the first thing it looks for is the robots.txt file. If it cannot find one, it will crawl and index your entire site. A missing robots.txt file also generates a 404 error in your server logs every time a robot requests it, making it harder to spot genuine 404 errors.
A simple robots.txt file?
Open Notepad and type the following. Please note that this text allows all robots to crawl and index all files. Save the file as robots.txt and upload it to the root directory of your domain.
This allows all robots to crawl all files on your server.
User-agent: *
Disallow:
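If you want to check how a crawler would interpret these rules, Python's standard-library urllib.robotparser can parse them locally. This is a minimal sketch; the example.com URL is just a placeholder.

```python
from urllib.robotparser import RobotFileParser

# The rules from the example above: allow every robot to crawl everything.
rules = """User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# An empty Disallow value blocks nothing, so any path is allowed.
print(parser.can_fetch("*", "http://example.com/index.html"))  # True
```

The same parser can be pointed at a live file with set_url() and read() instead of parse().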
How to disallow certain files?
Open Notepad and type the following. In this example you are not allowing robots to crawl the images directory on your server. Please note: disallowing a directory or file keeps friendly robots from crawling it, so it will generally not show up in the search engines. HOWEVER, this is only effective for well-behaved robots; robots can choose to ignore your instructions.
This allows all robots to crawl all files except those in the images directory.
User-agent: *
Disallow: /images/
This allows all robots to crawl all files except those in the images and awstats directories.
User-agent: *
Disallow: /images/
Disallow: /awstats/
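You can verify that these two Disallow lines behave as intended with the same standard-library parser; the paths below are hypothetical examples.

```python
from urllib.robotparser import RobotFileParser

# The rules from the example above: block /images/ and /awstats/ for everyone.
rules = """User-agent: *
Disallow: /images/
Disallow: /awstats/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Paths under the disallowed directories are blocked...
print(parser.can_fetch("*", "http://example.com/images/logo.png"))  # False
# ...while everything else remains crawlable.
print(parser.can_fetch("*", "http://example.com/about.html"))       # True
```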
For every directory or file that you do not want crawled, just add another Disallow line.
This denies Googlebot-Image access to all files on your domain.
User-agent: Googlebot-Image
Disallow: /
This denies Googlebot-Image access to your images directory.
User-agent: Googlebot-Image
Disallow: /images/
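Because these rules name a specific user agent, other robots are unaffected. A quick local check with urllib.robotparser (the second bot name is made up for illustration):

```python
from urllib.robotparser import RobotFileParser

# The rules from the example above: block /images/ only for Googlebot-Image.
rules = """User-agent: Googlebot-Image
Disallow: /images/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# The named agent is blocked from the images directory...
print(parser.can_fetch("Googlebot-Image", "http://example.com/images/a.jpg"))  # False
# ...but an unlisted robot has no matching rules and may crawl it.
print(parser.can_fetch("SomeOtherBot", "http://example.com/images/a.jpg"))     # True
```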
Upload the file to the root directory of your domain. To confirm that it has no errors, run it through the validator at http://tool.motoricerca.info/robots-checker.phtml