Show HN: Yikes, imagine trusting Google's documentation
(usesnippet.app)
The snippet from "search docs crawling indexing pause online business" states that adding a Disallow: / rule for Googlebot in robots.txt will keep Googlebot away permanently as long as the rule remains. "search help office hours 2023 june", however, advises against disallowing all crawling via robots.txt, warning that such a file "may remove the website's content, and potentially its URLs, from Google Search." This directly contradicts the claim that a full-disallow rule safely blocks Googlebot without negative consequences, so the two pages genuinely conflict over the effect and advisability of using a disallow rule to block Googlebot.
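For reference, the rule the first page is describing is just a two-line group in robots.txt scoped to Googlebot (written out here for clarity, not quoted from either doc):

    User-agent: Googlebot
    Disallow: /

Other crawlers aren't affected by a Googlebot-specific group; they follow their own group, or a User-agent: * group if one exists.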
If you want to block Googlebot "permanently", why would you expect to stay listed in Search? The first page actually agrees with the second - if you only want to temporarily block crawling, it recommends not blocking Googlebot.
Actually, your last "conflict" is bad too. A 503 when fetching robots.txt does stop Google from crawling the site, for at least twelve hours and possibly forever (if other pages keep returning errors). The only crawling Google will continue to do is retrying the robots.txt fetch.
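To be concrete about the mechanism being argued over: the pause is triggered by the robots.txt URL itself returning a server error, while every other URL can keep serving normally. A minimal sketch of that setup using only Python's standard library (the handler name, port, and Retry-After value are my own choices, not taken from either doc):

    from http.server import BaseHTTPRequestHandler, HTTPServer

    class PauseCrawlingHandler(BaseHTTPRequestHandler):
        # Answer 503 for robots.txt only; serve normal pages otherwise.
        def do_GET(self):
            if self.path == "/robots.txt":
                # A 5xx on robots.txt is the signal that pauses crawling;
                # human visitors never request this URL.
                self.send_response(503)
                self.send_header("Retry-After", "86400")  # advisory, value arbitrary
                self.end_headers()
            else:
                body = b"<html><body>Site is up for humans.</body></html>"
                self.send_response(200)
                self.send_header("Content-Type", "text/html")
                self.send_header("Content-Length", str(len(body)))
                self.end_headers()
                self.wfile.write(body)

    if __name__ == "__main__":
        HTTPServer(("", 8000), PauseCrawlingHandler).serve_forever()

Run it and curl localhost:8000/robots.txt to see the 503; any other path returns 200.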
I appreciate what you're trying to set up here but 2/4 is a pretty bad record for a demo.
I somewhat disagree with you on the last conflict: one document states pretty clearly that returning a 503 for robots.txt "blocks all crawling", while the other says there's a 12-hour block after which Google may decide to crawl the other pages (not just robots.txt, as you said).
Thanks for the feedback though; there's definitely some work to be done on validating the conflicts we surface.