The robots.txt file
The robots.txt file is a simple text file that tells search engine crawlers, such as Googlebot, which areas of a domain may be crawled and which may not. In addition, a reference to the XML sitemap can also be included in the robots.txt file.
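A sitemap reference is a single line anywhere in the file; the URL below is a placeholder for your own sitemap location:

```
Sitemap: https://www.example.com/sitemap.xml
```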
But most of the time we get confused about it: what should we allow, and what should we block? To clear up that confusion, here are some robots.txt samples covering the most common cases.
The “User-agent: *” means this section applies to all robots. The “Disallow: /” tells the robot that it should not visit any pages on the site.
To exclude all robots from the entire server
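The wildcard user-agent matches every crawler, and the bare slash blocks the entire site:

```
User-agent: *
Disallow: /
```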
To allow all robots complete access
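An empty Disallow value means nothing is blocked, so every robot may crawl everything (leaving out the robots.txt file entirely has the same effect):

```
User-agent: *
Disallow:
```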
To exclude all robots from part of the server
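Each Disallow line blocks one path prefix for all robots; the directory names below are placeholders for your own paths:

```
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /private/
```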
To exclude a single robot
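Name the specific crawler in the User-agent line; "BadBot" below is a placeholder for the robot's actual user-agent name:

```
User-agent: BadBot
Disallow: /
```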
To allow a single robot
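Give the chosen robot an empty Disallow (allowing everything) and block all other robots; "Googlebot" is used here as the example crawler:

```
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
```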
To exclude all files except one
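One way to do this uses the Allow directive, which Googlebot and most major crawlers support (it is not part of the original robots.txt standard); the more specific Allow rule takes precedence over the site-wide Disallow. The file path below is a placeholder:

```
User-agent: *
Allow: /public/index.html
Disallow: /
```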
After publishing your robots.txt file, you will want to be sure you got it right: will Google be able to crawl your site or not? To find out, there is an awesome tool for testing robots.txt, the robots.txt Tester in Google Search Console, which shows whether your web content can be accessed by crawlers.