Free SEO Tool

Robots.txt
Generator

Build a valid robots.txt file with precise crawl control. Define user-agent rule groups, set Allow and Disallow paths, block AI training bots, add a sitemap reference — then download and deploy in seconds.

Live Preview · User-Agent Rules · Allow & Disallow · AI Bot Blocking · CMS Presets · Crawl-Delay
Quick Presets

Apply a ready-made rule set for your platform. Clicking a preset adds a new rule group with recommended Disallow paths.

Quick Tips
PLACEMENT → Must live at /robots.txt root
→ One file per domain / subdomain
→ Case-sensitive on Linux servers

KEY RULES → * wildcard matches all bots
→ Disallow: / blocks everything
→ Empty Disallow means allow all
→ More specific rule wins ties

REMEMBER → robots.txt ≠ noindex
→ Use noindex tag to deindex
→ Always reference your sitemap
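The tips above, combined into one minimal file (example.com and the paths are placeholders):

```txt
# Baseline rules for all bots
User-agent: *
Disallow: /admin/        # block a directory
Allow: /admin/login.css  # longer (more specific) rule wins the tie

Sitemap: https://example.com/sitemap.xml
```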

Step-by-Step Guide

How to Use the
Robots.txt Generator

01
Enter Your Domain

Type your root domain into the Domain field. This auto-generates the Sitemap URL reference and validates that all paths belong to the correct origin.

02
Apply a Preset

Choose a CMS preset — WordPress, Shopify, Next.js, or others — to instantly add a recommended set of Disallow rules tailored for that platform.
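For illustration, a widely used WordPress rule set looks like the following; the tool's actual preset may differ:

```txt
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
```

The Allow line is important: many WordPress plugins make front-end requests through admin-ajax.php, so blocking it outright can break rendering for crawlers.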

03
Add Rule Groups

Click Add Rule Group to create a user-agent block. Use * for all bots, or enter a specific crawler like Googlebot, Bingbot, or GPTBot.
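Each rule group becomes one User-agent block in the generated file, for example:

```txt
User-agent: *
Disallow: /tmp/

User-agent: GPTBot
Disallow: /
```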

04
Set Allow & Disallow

Add Disallow paths to block crawler access and Allow paths to carve out exceptions within blocked directories. All paths must start with /.
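A broad Disallow with a carved-out Allow exception looks like this (paths are placeholders):

```txt
User-agent: *
Disallow: /private/
Allow: /private/status.html
```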

05
Toggle Options

Enable AI bot blocking, bad bot blocking, timestamps, and comments in Global Settings. The live preview and raw text update instantly with every change.

06
Download & Deploy

Download your robots.txt file and upload it to the root of your domain. Verify it's accessible at yoursite.com/robots.txt before submitting your sitemap.
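Before (or after) uploading, you can sanity-check the file's logic with Python's standard-library parser. This is a sketch: the file content and URLs below are placeholders for your own.

```python
from urllib import robotparser

# Hypothetical contents of the downloaded robots.txt; substitute your own.
content = """\
User-agent: *
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml
"""

rp = robotparser.RobotFileParser()
rp.parse(content.splitlines())

# Pages you expect to be crawlable should come back True.
print(rp.can_fetch("Googlebot", "https://example.com/pricing"))      # True
print(rp.can_fetch("Googlebot", "https://example.com/admin/users"))  # False
```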

Common Questions

Frequently Asked
Questions

What is a robots.txt file and do I need one?
A robots.txt file is a plain text file placed at the root of your domain that tells web crawlers which parts of your site to access or avoid. It follows the Robots Exclusion Protocol and is the first file most bots request when visiting a site. While not strictly required, it is strongly recommended for any live website — it protects admin areas, conserves crawl budget, and signals where your sitemap lives.
Does Disallow in robots.txt stop a page from appearing in search results?
Not reliably. Disallowing a URL prevents crawlers from visiting it, but Google may still index and show the page in search results if it discovers the URL through external links or other signals. To reliably prevent a page from appearing in search results, add a noindex meta tag or an X-Robots-Tag HTTP header directly to the page — not robots.txt.
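For reference, the two reliable deindexing mechanisms mentioned above look like this:

```txt
<!-- In the page's <head> -->
<meta name="robots" content="noindex">

# Or as an HTTP response header
X-Robots-Tag: noindex
```

Note that a crawler must be able to fetch the page to see either signal, so do not Disallow a URL you are trying to deindex.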
What is the difference between Allow and Disallow?
Disallow tells a crawler it cannot access a given path. Allow explicitly grants access to a path that would otherwise be blocked by a broader Disallow rule. For example, Disallow: /private/ blocks the whole directory, while Allow: /private/status.html permits that specific file. When both rules apply to the same path, the more specific (longer) rule wins. If two rules have equal length, Allow takes precedence for Google.
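The precedence behavior can be checked with Python's standard-library parser. One caveat: `urllib.robotparser` returns the first matching rule in file order rather than the longest match Google uses, so the narrower Allow is listed before the broader Disallow here; with that ordering both interpretations agree. The domain is a placeholder.

```python
from urllib import robotparser

rules = """\
User-agent: *
Allow: /private/status.html
Disallow: /private/
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

print(rp.can_fetch("*", "https://example.com/private/status.html"))  # True
print(rp.can_fetch("*", "https://example.com/private/secret.html"))  # False
print(rp.can_fetch("*", "https://example.com/about"))                # True
```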
Should I block AI training crawlers in robots.txt?
That depends on your preferences and content strategy. Bots like GPTBot (OpenAI), ClaudeBot (Anthropic), CCBot (Common Crawl), and Google-Extended crawl web content to build AI training datasets. You can block any or all of them by adding Disallow: / under their specific user-agent name. Blocking AI crawlers does not affect your search engine rankings but prevents your content from being used to train AI models.
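Blocking all four crawlers named above looks like this in the generated file:

```txt
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /
```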
What is crawl-delay and should I use it?
Crawl-delay instructs a bot to wait a set number of seconds between successive requests to your server, which can reduce strain from aggressive crawlers. Note that Googlebot ignores Crawl-delay entirely; Google manages its own crawl rate automatically. Bing and some other crawlers do respect the directive. Only set a crawl delay if you have a specific server-load reason to do so.
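A ten-second delay for Bingbot, for example:

```txt
User-agent: Bingbot
Crawl-delay: 10
```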
Where must the robots.txt file be placed?
The robots.txt file must be accessible at the exact path /robots.txt at the root of your domain — for example https://yoursite.com/robots.txt. It cannot be placed in a subdirectory. Each subdomain requires its own robots.txt file at its own root. After uploading, always verify it loads correctly in a browser and returns an HTTP 200 status code before relying on it.

About This Tool

What is a
Robots.txt Generator?

The Tool

The SEO HQ Robots.txt Generator lets you build a fully valid robots.txt file without writing a line of code. Create multiple user-agent rule groups, add Allow and Disallow paths with real-time validation, apply CMS-specific presets, toggle AI bot blocking, and reference your sitemap — all with a live syntax-highlighted preview that updates instantly.

Why It Matters

A correctly configured robots.txt is your first line of crawl control. It protects sensitive paths from being crawled, prevents duplicate content from consuming crawl budget, and signals to all bots where your sitemap lives. A poorly written file can inadvertently block critical pages or leave admin sections accessible to every crawler on the internet.

Key Features

  • Multiple user-agent rule groups
  • Allow and Disallow path directives per group
  • CMS presets: WordPress, Shopify, Next.js, and more
  • AI crawler and bad bot blocking toggles
  • Live syntax-highlighted preview with raw text view
  • One-click download as robots.txt or copy to clipboard

Best Practices

  • Always include a Sitemap: reference line
  • Use * to set baseline rules for all bots
  • Never block CSS, JS, or images from Googlebot
  • Test with the robots.txt report in Google Search Console
  • Review and update after every major site restructure
  • Use noindex — not robots.txt — to prevent indexation
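Putting these practices together, a minimal well-formed file might look like the following; the domain and paths are placeholders:

```txt
User-agent: *
Disallow: /admin/
Disallow: /cgi-bin/

User-agent: GPTBot
Disallow: /

Sitemap: https://example.com/sitemap.xml
```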