What Is robots.txt (and Why It Matters)

robots.txt sounds technical, but the concept is simple, and a misconfigured file is one of the most common reasons a website disappears from Google entirely.

What is robots.txt?

robots.txt is a plain text file placed at the root of your website, at yourdomain.com/robots.txt. Crawlers only look for it at that address. It contains instructions for search engine crawlers, telling them which pages or sections of the site they’re allowed to visit.

It looks something like this:

User-agent: *
Disallow: /wp-admin/
Allow: /

This tells all crawlers: don’t visit the admin area, but everything else is fair game.
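If you’d like to see how a crawler reads those three lines, here is a minimal sketch using Python’s standard-library robots.txt parser. The domain example.com and the two page paths are placeholders, not part of any real site. One caveat: Python’s parser applies the first rule that matches, while Google picks the most specific matching rule. For this example both approaches give the same answer.

```python
from urllib.robotparser import RobotFileParser

# The example rules from above, one list item per line of the file.
rules = [
    "User-agent: *",
    "Disallow: /wp-admin/",
    "Allow: /",
]

parser = RobotFileParser()
parser.parse(rules)

# example.com and both paths are placeholders for illustration.
admin_allowed = parser.can_fetch("Googlebot", "https://example.com/wp-admin/options.php")
blog_allowed = parser.can_fetch("Googlebot", "https://example.com/blog/hello-world/")

print("admin area crawlable:", admin_allowed)
print("blog post crawlable:", blog_allowed)
```

The admin URL should come back as not crawlable and the blog post as crawlable, which matches the plain-English reading of the file.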

Why does robots.txt matter for your business?

If robots.txt contains an instruction that accidentally blocks Google from your site, Google will stop crawling your pages. It can’t index the content of pages it can’t crawl, and pages with no indexed content effectively vanish from search results.

The most dangerous line is:

Disallow: /

This single instruction tells Google to stay away from your entire website. It’s often added by developers during a site build — to prevent the unfinished site from appearing in Google — and then forgotten when the site goes live.
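To make the effect concrete, the sketch below feeds a leftover staging file to Python’s standard-library parser and asks whether a few URLs are crawlable. The helper name is_crawlable, the example.com domain, and the sample paths are all illustrative assumptions.

```python
from urllib.robotparser import RobotFileParser


def is_crawlable(robots_lines, url, agent="Googlebot"):
    """Return True if the given robots.txt lines let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(agent, url)


# A typical leftover from a site build: block everything for everyone.
staging_rules = ["User-agent: *", "Disallow: /"]

# Placeholder URLs; swap in pages from your own site.
sample_urls = [
    "https://example.com/",
    "https://example.com/services/",
    "https://example.com/blog/launch-announcement/",
]

blocked = [url for url in sample_urls if not is_crawlable(staging_rules, url)]
print(f"{len(blocked)} of {len(sample_urls)} sample URLs are blocked")
```

Every sample URL ends up blocked, including the homepage. A check like this could be run as part of a launch checklist so the staging rule doesn’t slip into production.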

How do you check your robots.txt?

Go to yourdomain.com/robots.txt in your browser. If you see a page with text, your robots.txt file exists. If you get a 404 error, there’s no robots.txt file (which is fine — Google will crawl everything by default).

Look for a Disallow: / line with nothing after the slash. If it sits under User-agent: * or User-agent: Googlebot and no Allow rules override it, it’s blocking Google from your entire site.
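If you’d rather not read the file by eye, the same check can be roughly automated. The sketch below takes the text of a robots.txt file (for example, copied from your browser) and lists which crawlers have a bare Disallow: / rule. The function name blanket_blocked_agents and the sample file are hypothetical, and the parsing is deliberately simplified: it doesn’t account for Allow rules that might override the block.

```python
def blanket_blocked_agents(robots_text):
    """Return the user-agents whose rule group contains a bare 'Disallow: /'."""
    blocked = []
    agents = []        # user-agents of the group currently being read
    in_rules = False   # True once the current group has at least one rule
    for raw in robots_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:                      # a rule was seen: new group starts
                agents, in_rules = [], False
            agents.append(value)
        elif field in ("allow", "disallow"):
            in_rules = True
            if field == "disallow" and value == "/":
                blocked.extend(a for a in agents if a not in blocked)
    return blocked


# Made-up file contents for illustration.
sample = """\
User-agent: *
Disallow: /

User-agent: Googlebot
Disallow: /private/
"""

print(blanket_blocked_agents(sample))
```

For the sample file this reports the * group only. Seeing * or Googlebot in the result is the signal to look closer.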

Can robots.txt block individual pages?

Yes. You can block specific folders or pages:

Disallow: /private/
Disallow: /checkout/

This is useful for pages you don’t want Google to crawl. The problem arises when pages you do want crawled and indexed are accidentally caught by a Disallow rule.
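Accidental blocks usually happen because a Disallow rule matches every URL that starts with the given path. The sketch below compares the two rules above with a hypothetical typo, Disallow: /p, and checks a short list of pages that should stay crawlable. The User-agent line, the helper name blocked_urls, the domain, and the page paths are assumptions added for illustration.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_lines, urls, agent="Googlebot"):
    """Return the subset of `urls` that the rules stop `agent` from fetching."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return [url for url in urls if not parser.can_fetch(agent, url)]


# Pages we want in search results (placeholders).
must_stay_crawlable = [
    "https://example.com/",
    "https://example.com/products/",
    "https://example.com/pricing/",
]

# The rules as intended, with an assumed User-agent line in front.
intended = ["User-agent: *", "Disallow: /private/", "Disallow: /checkout/"]

# Hypothetical typo: "/p" matches every path that begins with "/p".
too_broad = ["User-agent: *", "Disallow: /p", "Disallow: /checkout/"]

print("intended rules block:", blocked_urls(intended, must_stay_crawlable))
print("too-broad rules block:", blocked_urls(too_broad, must_stay_crawlable))
```

With the intended rules nothing on the list is blocked. With the too-broad rule, the products and pricing pages are both caught.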

Does Google always follow robots.txt?

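Google says it respects robots.txt, but with one caveat: if a page has been linked to from elsewhere on the web, Google may still show it in search results (as a URL without a description) even if robots.txt blocks crawling. To fully prevent a page from appearing, you need a noindex meta tag (<meta name="robots" content="noindex">), not just a robots.txt rule. The page must also stay crawlable: if robots.txt blocks it, Google never fetches the page and never sees the noindex tag.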
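To confirm that a page really carries the tag, you can scan its HTML for a robots meta tag whose content includes noindex. The sketch below does this with Python’s built-in HTML parser. The class name RobotsMetaFinder, the function has_noindex, and the sample HTML are illustrative assumptions, and the check only covers the meta tag, not other ways of sending a noindex signal.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Records whether a robots (or googlebot) meta tag contains 'noindex'."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() in ("robots", "googlebot"):
            content = attributes.get("content", "")
            directives = [d.strip().lower() for d in content.split(",")]
            if "noindex" in directives:
                self.noindex = True


def has_noindex(html):
    """Return True if the HTML contains a noindex robots meta tag."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.noindex


# Made-up page source for illustration.
page = '<html><head><meta name="robots" content="noindex, nofollow"></head><body></body></html>'
print("noindex present:", has_noindex(page))
```

Remember that this tag only helps on pages robots.txt does not block, because Google has to fetch the page to read it.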

