Crawl postponed because robots.txt was inaccessible.
I am getting a lot of messages from Google about problems crawling some old sites I built, saying it is pausing future crawls. When I check Google Webmaster Tools, it gives the reason as a missing robots.txt file.
I do not have a robots.txt file because I want all pages crawled. In fact, I remember previous discussions saying you were better off without this file if you wanted everything crawled. Google states, “You need a robots.txt file only if your site includes content that you don't want search engines to index. If you want search engines to index everything in your site, you don't need a robots.txt file (not even an empty one).”
Then why do I get the error “Crawl postponed because robots.txt was inaccessible”? In Google Webmaster Tools I see three errors from it trying to find this file.
When I fetch as Googlebot, it says "(Caution symbol) Unreachable robots.txt".
Is Google changing its position on this? It has happened on multiple unrelated websites (none monetized), just regular sites.
You don't have to use the robots.txt file just for blocking. You can also "allow all" giving unrestricted access like this:
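User-agent: *
Disallow:

The blank Disallow line means nothing is off limits, so everything gets crawled.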
That should satisfy G and get your sites crawled again.
I would suggest doing a little reading, though. There are directories that can (and should) be blocked without affecting anything; /cgi-bin/ comes to mind.
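For example, to keep everything open except /cgi-bin/, the file would be:

User-agent: *
Disallow: /cgi-bin/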
I think I am going to have to add that to all my websites. I got another email 15 minutes ago, and it sounds like Google stops crawling because of this. Here is a copy:
website name dot com/: Googlebot can't access your site
Over the last 24 hours, Googlebot encountered 1 errors while attempting to access your robots.txt. To ensure that we didn't crawl any pages listed in that file, we postponed our crawl. Your site's overall robots.txt error rate is 100.0%.
You can see more details about these errors in Webmaster Tools.
Scott's right and some people also add "images" to that "do not crawl" list.
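Assuming your images sit in an /images/ directory at the site root, the combined file would look something like:

User-agent: *
Disallow: /cgi-bin/
Disallow: /images/

Adjust the paths to match how each site is actually laid out.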
I received about another dozen emails from Google for other websites I manage. What bugs me even more is that I have to tell Google to crawl all the pages.
I think I'll fire up FileZilla and just upload the robots.txt file to every single website. I wonder if this is part of some new change on their end.
I guess if you take the "glass half full" approach to life it shows that the Big G actually abides by the robots.txt doctrine...