How to Create a robots.txt File for Your Website
The robots.txt file is one of the most powerful, and most misused, tools in SEO. It is a simple text file placed at the root of your website that tells search engine crawlers which pages or sections they may or may not access. Getting it wrong can stop Google from crawling your entire site. Getting it right helps crawlers spend their crawl budget on your most important content.
What is robots.txt?
robots.txt is a plain text file stored at yourdomain.com/robots.txt. When a search engine crawler (Googlebot, Bingbot, etc.) visits your site, it checks this file before crawling any other page. The file uses a simple syntax to specify which paths each crawler may or may not access.
Example of a basic robots.txt file:
User-agent: *
Disallow: /admin/
Disallow: /private/
Allow: /
Sitemap: https://yourdomain.com/sitemap.xml
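To see how a crawler might read the example above, the rules can be fed to Python's standard-library `urllib.robotparser`. This is only a sketch: the domain and the two test URLs are placeholders, and Python's parser applies the first matching rule in file order, which can differ from Google's longest-match behaviour on more complex files.

```python
from urllib.robotparser import RobotFileParser

# The example file from above, one directive per line.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /private/
Allow: /
Sitemap: https://yourdomain.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Placeholder URLs: one inside a disallowed folder, one ordinary page.
admin_allowed = parser.can_fetch("Googlebot", "https://yourdomain.com/admin/login")
blog_allowed = parser.can_fetch("Googlebot", "https://yourdomain.com/blog/post")

# site_maps() is available in Python 3.8 and later.
sitemaps = parser.site_maps()

print(admin_allowed, blog_allowed, sitemaps)
```

Because the file only has a `User-agent: *` group, the same answers apply to any crawler name passed to `can_fetch`.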
How to Create a robots.txt File for Free
Our Robots.txt Generator creates a correctly formatted robots.txt file in seconds without any technical knowledge:
- Go to the Robots.txt Generator
- Select which crawlers to target (all crawlers, or specific bots)
- Enter the paths you want to disallow
- Add your sitemap URL
- Click Generate — the tool outputs the complete file content
- Copy the output and save it as robots.txt
- Upload it to your website's root directory (the same folder as your homepage)
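The generator's internals are not shown here, but the kind of output it produces from the steps above can be approximated in a few lines of Python. The function name `build_robots_txt` and its parameters are hypothetical, chosen for illustration, and the sketch handles a single user-agent group only.

```python
def build_robots_txt(user_agent="*", disallow=(), sitemap_url=None):
    """Return robots.txt content for one user-agent group (hypothetical helper)."""
    lines = [f"User-agent: {user_agent}"]
    if disallow:
        lines += [f"Disallow: {path}" for path in disallow]
    else:
        # An empty Disallow value means nothing is blocked for this group.
        lines.append("Disallow:")
    if sitemap_url:
        # Blank line separates the group from the sitemap reference.
        lines += ["", f"Sitemap: {sitemap_url}"]
    return "\n".join(lines) + "\n"


content = build_robots_txt(
    disallow=["/admin/", "/private/"],
    sitemap_url="https://yourdomain.com/sitemap.xml",  # placeholder URL
)
print(content, end="")
```

Writing `content` to a file named robots.txt and uploading it to the site root corresponds to the last two steps.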
What Should You Disallow in robots.txt?
Block crawlers from areas that have no place in search results and would only waste crawl budget:
- /admin/ — your CMS admin panel
- /private/ or /members/ — password-protected sections
- /cart/ and /checkout/ — e-commerce transaction pages
- /search/ — internal search results pages (these create infinite crawl paths)
- /tmp/ and /cache/ — temporary files
- Duplicate content pages (parameter-based URLs like ?sort=price&filter=red)
Critical robots.txt Mistakes to Avoid
Blocking your entire site
The most catastrophic error is adding Disallow: / under User-agent: *, which blocks all compliant crawlers from your entire site. Google can then no longer crawl any of your pages. Always review your robots.txt carefully before publishing changes, and check the live file in Google Search Console's robots.txt report afterward.
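A quick automated check can catch this mistake before a file goes live. The sketch below uses Python's standard `urllib.robotparser`; the helper name `homepage_blocked` and the domain are assumptions for illustration. A blocked homepage is treated as the warning sign, since a blanket Disallow: / always blocks it.

```python
from urllib.robotparser import RobotFileParser


def homepage_blocked(robots_txt, user_agent="*"):
    """Return True if the root path "/" is disallowed for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # Placeholder domain; only the path ("/") is matched against the rules.
    return not parser.can_fetch(user_agent, "https://yourdomain.com/")


broken = "User-agent: *\nDisallow: /\n"
safe = "User-agent: *\nDisallow: /admin/\n"

print(homepage_blocked(broken))
print(homepage_blocked(safe))
```

Running a check like this as part of a deploy script is one way to avoid publishing a file that shuts out every crawler.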
Blocking CSS and JavaScript files
Google needs to render your pages — including CSS and JavaScript — to properly understand and index them. Never block /css/, /js/, or /theme/ directories. Blocking these files can cause Google to see a broken version of your pages and rank them lower.
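One way to spot blocked assets is to collect the stylesheet and script URLs a page references and test each against the robots.txt rules. The following is a rough sketch using only the Python standard library. The class and function names are hypothetical, the sample HTML and domain are placeholders, and it assumes every asset is served from the same host as the robots.txt file.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser


class AssetCollector(HTMLParser):
    """Collect stylesheet and script URLs referenced by a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.assets = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "script" and attrs.get("src"):
            self.assets.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "link" and attrs.get("href"):
            if "stylesheet" in (attrs.get("rel") or "").lower():
                self.assets.append(urljoin(self.base_url, attrs["href"]))


def blocked_assets(robots_txt, page_url, html, user_agent="Googlebot"):
    """Return the asset URLs on the page that user_agent may not fetch."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    collector = AssetCollector(page_url)
    collector.feed(html)
    return [url for url in collector.assets if not rules.can_fetch(user_agent, url)]


robots_txt = "User-agent: *\nDisallow: /js/\n"
html = '<link rel="stylesheet" href="/css/site.css"><script src="/js/app.js"></script>'
blocked = blocked_assets(robots_txt, "https://yourdomain.com/", html)
print(blocked)
```

An empty list means none of the referenced CSS or JavaScript files are disallowed for that crawler.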
Thinking robots.txt prevents indexing
Blocking a URL in robots.txt prevents Google from crawling it, but does not guarantee it will not be indexed. If other sites link to a disallowed URL, Google may still index it (without reading its content). Use a noindex meta tag on pages you want excluded from search results, and do not disallow those pages in robots.txt: Google can only see the noindex tag if it is allowed to crawl the page.
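To confirm that a page actually carries a noindex directive, its HTML can be scanned for a robots meta tag. This sketch uses Python's built-in `html.parser`; the names `RobotsMetaParser` and `has_noindex` are hypothetical, and it looks only at `<meta name="robots">`, not at crawler-specific tags or HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Record the directives of any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            # Directives are comma-separated, e.g. "noindex, follow".
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


def has_noindex(html):
    """Return True if the page's robots meta tag includes noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(has_noindex(page))
```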
Forgetting the Sitemap directive
Always include your sitemap URL in your robots.txt, conventionally at the bottom. This lets any crawler that supports the Sitemap directive find your sitemap, whether or not you have submitted it to Search Console.
How to Verify Your robots.txt is Working
After uploading your robots.txt file, open yourdomain.com/robots.txt in your browser to confirm it is being served. In Google Search Console, go to Settings → robots.txt to open the robots.txt report, which shows the file Google last fetched along with any errors or warnings. You can also use our Google Index Checker on key pages to confirm they are still indexed after any robots.txt changes.
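The same check can be scripted against the live file. The sketch below, again using Python's standard library, derives the robots.txt location from any page URL and then asks whether a given crawler may fetch that page. The function names are hypothetical, the domain is a placeholder, and `is_crawlable` makes a real network request, so it is defined here but not called.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser


def robots_url(page_url):
    """Return the robots.txt URL for the host that serves page_url."""
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def is_crawlable(page_url, user_agent="Googlebot"):
    """Fetch the live robots.txt (network call) and test one URL against it."""
    parser = RobotFileParser(robots_url(page_url))
    parser.read()
    return parser.can_fetch(user_agent, page_url)


print(robots_url("https://yourdomain.com/blog/post?ref=home"))
```

Calling `is_crawlable` on a handful of key pages after each robots.txt change gives an early signal if an important URL has been blocked by accident.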
A well-configured robots.txt file helps Google spend its crawl budget on your most valuable pages, which can improve how quickly new content gets indexed and ranked.