To make sure specific pages are not unintentionally blocked from search engines in Webflow, you need to configure both your robots.txt file and your page-level SEO settings correctly.
1. Understand robots.txt Purpose and Scope
- The robots.txt file tells search engine crawlers which parts of your site they should or should not access.
- Rules under "User-agent: *" apply to all crawlers unless a more specific user-agent group exists for a given bot.
- Keep in mind that robots.txt controls crawling, not indexing. If a page has a noindex meta tag and is also blocked in robots.txt, search engines can't crawl the page to see the noindex, so they may still index its URL based on external links (but without a content preview).
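The scoping rule and the noindex conflict described above can be sketched with Python's standard urllib.robotparser module. This is only an illustration: the paths /drafts/ and /private/, the example.com domain, and the helper names are hypothetical and not part of any Webflow setup.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: one group for all bots, one specific to Googlebot.
ROBOTS_TXT = """\
User-agent: *
Disallow: /drafts/

User-agent: Googlebot
Disallow: /private/
"""


def parse_robots(text):
    """Build a parser from robots.txt text (no network access needed)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def noindex_is_hidden(parser, user_agent, url, has_noindex):
    """True when the page carries noindex but the crawler may not fetch it,
    so the crawler never gets to read the noindex directive."""
    return has_noindex and not parser.can_fetch(user_agent, url)


parser = parse_robots(ROBOTS_TXT)
# Googlebot follows only its own group, so /drafts/ is not blocked for it.
print(parser.can_fetch("Googlebot", "https://example.com/drafts/a"))
# A page that is both blocked and marked noindex is the problem case.
print(noindex_is_hidden(parser, "Googlebot", "https://example.com/private/a", True))
```

If a page must stay out of the index, the safer combination is to leave it crawlable and rely on noindex alone.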
2. Access the robots.txt File in Webflow
- Go to Site settings (formerly Project Settings) for your site.
- Click on the SEO tab.
- Scroll to the robots.txt section to edit.
3. Format the robots.txt to Allow Important Pages
- Make sure you're not using a "Disallow: /" directive, which blocks the entire site.
- Only block paths you want to hide, such as staging content or utility pages.
- Example:
User-agent: *
Disallow: /internal-page/
Allow: /services/
(The Allow line is optional; use it only if you’ve previously disallowed a broader directory.)
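Before pasting rules into Webflow, you could sanity-check them locally. The sketch below uses Python's standard urllib.robotparser on the example rules; the example.com domain and the function names are assumptions made for illustration. Note that Python's parser applies the first matching rule while Google applies the most specific one, so results can differ for overlapping rules (they agree for this simple example).

```python
from urllib.robotparser import RobotFileParser

# The example rules from this step.
EXAMPLE = """\
User-agent: *
Disallow: /internal-page/
Allow: /services/
"""


def blocks_entire_site(text):
    """Return True if any line is a bare 'Disallow: /' directive."""
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        field, _, value = line.partition(":")
        if field.strip().lower() == "disallow" and value.strip() == "/":
            return True
    return False


def blocked_paths(text, site, paths, user_agent="*"):
    """Return the subset of paths the given user agent may not fetch."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return [p for p in paths if not parser.can_fetch(user_agent, site + p)]


print(blocks_entire_site(EXAMPLE))
print(blocked_paths(EXAMPLE, "https://example.com",
                    ["/", "/services/", "/internal-page/"]))
```

Listing the pages you care about in the paths argument gives a quick way to catch an overly broad Disallow before publishing.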
4. Use Page-Specific Settings for Noindex/Index
- Go to the page in the Webflow Designer.
- Open the Page Settings (gear icon).
- Under SEO settings, make sure “Exclude this page from search engine indexing” is unchecked for pages you want indexed.
- Check this option only for pages you want to prevent from being indexed.
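To confirm what a published page actually tells crawlers, you can look for a robots meta tag in its HTML. The following is a minimal sketch using Python's standard html.parser; it assumes the exclude option results in a standard robots meta tag containing noindex, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> (or "googlebot") tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() in {"robots", "googlebot"}:
            content = attr.get("content") or ""
            # Directives are comma-separated, e.g. "noindex, nofollow".
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def has_noindex(html):
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print(has_noindex(page))
```

Running this against the saved HTML of pages you want indexed should report no noindex directive.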
5. Publish to Check Changes
- Publish your site after making changes to both robots.txt and page-level SEO settings.
- Use tools like Google Search Console’s URL Inspection tool or its robots.txt report (which replaced the retired robots.txt Tester) to confirm behavior.
Summary
To ensure pages are not ignored by search engines, avoid blocking them in robots.txt and make sure their SEO settings don’t have "exclude from indexing" enabled. Confirm and test changes using Google tools after publishing.