Troubleshooting crawl errors

Learn how Cloudflare interacts with search engine crawlers (particularly Google’s) and how to troubleshoot crawl errors.


Overview

Cloudflare whitelists search engine crawlers and bots. If you observe crawl issues, or see Cloudflare challenges presented to a search engine crawler or bot, troubleshoot the crawl errors using the methods in this guide, then contact Cloudflare support with the information you gathered.


Adjust Google and Bing crawl rates

To optimize CDN performance, Google and Bing assign special crawl rates to websites that use CDN services. Special crawl rates do not negatively affect Search Engine Optimization (SEO) or Search Engine Results Pages (SERPs). To change your crawl rates for Bing and Google, follow each search engine's documentation on adjusting crawl rates.


Prevent crawl errors

Review the following recommendations to prevent crawl errors:

  • Do not block Google crawler IP addresses via Firewall Rules or IP Access Rules within the Cloudflare Firewall app.
    Confirm an IP address belongs to Google by consulting Google’s documentation on verifying Googlebot IP addresses.
  • Do not block the United States via Firewall Rules or IP Access Rules within the Cloudflare Firewall app.
  • Do not block Google or Bing User-Agents in your .htaccess, server configuration, robots.txt, or web application.
    Google uses a variety of User-Agents to crawl your website. You can test your robots.txt via Google.
  • Do not allow crawling of files in the /cdn-cgi/ directory. This path is used internally by Cloudflare, and Google encounters errors when crawling it. Disallow crawls of cdn-cgi via robots.txt:
    Disallow: /cdn-cgi/
    Errors for cdn-cgi do not impact site rankings.
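
The IP address and robots.txt recommendations above can be spot-checked with a short script. The following Python sketch is an illustration only, not an official Cloudflare or Google tool. The function names are hypothetical, www.example.com is a placeholder, and the accepted hostname suffixes (googlebot.com and google.com) are an assumption that you should confirm against Google's current documentation. It verifies a crawler IP with a reverse DNS lookup followed by a forward lookup, and checks that a robots.txt body still admits Googlebot while disallowing /cdn-cgi/.

```python
import socket
from urllib.robotparser import RobotFileParser

# Assumption: hostnames of genuine Google crawlers end in one of these suffixes.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(ip, reverse=socket.gethostbyaddr,
                          forward=socket.gethostbyname_ex):
    """Return True if `ip` reverse-resolves to a Google hostname that
    resolves forward to the same IP.

    `reverse` and `forward` default to the system resolver and can be
    replaced for offline testing. The default forward lookup is IPv4 only.
    """
    try:
        hostname = reverse(ip)[0].lower().rstrip(".")
        if not hostname.endswith(GOOGLE_SUFFIXES):
            return False
        # The forward lookup guards against spoofed reverse DNS records.
        return ip in forward(hostname)[2]
    except OSError:  # covers socket.herror and socket.gaierror
        return False


def check_robots(robots_txt, site="https://www.example.com"):
    """Report whether Googlebot may crawl the home page and whether
    paths under /cdn-cgi/ are disallowed, given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {
        "googlebot_allowed": parser.can_fetch("Googlebot", site + "/"),
        # "/cdn-cgi/anything" is an arbitrary path under the directory.
        "cdn_cgi_disallowed": not parser.can_fetch(
            "Googlebot", site + "/cdn-cgi/anything"),
    }
```

Before adding or keeping a block rule for an address that claims to be Googlebot, call is_verified_googlebot() with that address. Pass the contents of your robots.txt file to check_robots(); both values in the result should be True.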

Troubleshoot crawl errors

Troubleshooting steps for the most commonly reported crawl errors are listed below.

HTTP 4XX Errors

HTTP 4XX errors are the most common type of crawl error. Cloudflare delivers these errors from your web server to Google. They have various causes, such as a missing page on your web server or a malformed link in your HTML. The solution depends on the problem encountered.

HTTP 5XX Errors

HTTP 5XX errors indicate that either Cloudflare or your origin web server experienced an internal error. To correlate occurrences of crawl errors with site outages, monitor your origin web server’s health. Monitor your website both through Cloudflare and directly against your origin web server IPs: comparing the two results shows whether an error originated at Cloudflare or at your origin web server.
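
The comparison described above can be sketched in Python. This is a minimal illustration under stated assumptions, not a Cloudflare tool: the function names and the returned labels are hypothetical, the probe uses plain HTTP on port 80 only, and it assumes the machine running it is allowed to reach your origin directly (an origin firewall that only admits Cloudflare IPs makes the direct probe fail for unrelated reasons).

```python
import http.client


def probe(hostname, connect_to, timeout=10):
    """Send GET / to `connect_to` with the Host header set to `hostname`.

    Pass the hostname itself as `connect_to` to go through Cloudflare, or
    an origin IP to bypass it. Returns the HTTP status, or None if the
    connection failed.
    """
    conn = http.client.HTTPConnection(connect_to, 80, timeout=timeout)
    try:
        conn.request("GET", "/", headers={"Host": hostname})
        return conn.getresponse().status
    except (OSError, http.client.HTTPException):
        return None
    finally:
        conn.close()


def attribute_5xx(via_cloudflare, direct_origin):
    """Suggest where a 5XX error originated from two probe results.

    Each argument is an HTTP status code or None (connection failed).
    """
    def failed(status):
        return status is None or 500 <= status <= 599

    if failed(direct_origin):
        return "origin"
    if failed(via_cloudflare):
        return "cloudflare-or-network-path"
    return "no-5xx-observed"
```

Run both probes at the same time on a schedule and record the results with timestamps, so you can line them up with the dates on which Google reported crawl errors.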

DNS Errors

Troubleshooting steps vary depending on whether your domain is on Cloudflare via a Full or CNAME setup. To verify which setup your domain uses, open a terminal and execute the following command (replace www.example.com with your Cloudflare domain):

dig +short SOA www.example.com

For domains on a CNAME setup, the response contains cdn.cloudflare.net. For example:

example.com.cdn.cloudflare.net.

For domains on a Full setup, the response lists nameservers in the cloudflare.com domain. For example:

  josh.ns.cloudflare.com. dns.cloudflare.com. 2013050901 10000 2400 604800 3600

Once you’ve confirmed how your domain was set up with Cloudflare, proceed with the troubleshooting steps appropriate to your domain setup.
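
If you need to check several domains, the same test can be scripted. The following Python sketch applies the two rules described above to the text returned by dig; the function names are hypothetical, and run_dig() assumes the dig command is installed and on your PATH.

```python
import subprocess


def classify_setup(dig_output):
    """Classify `dig +short SOA` output as a CNAME or Full setup.

    Returns "CNAME", "Full", or "unknown" if neither pattern appears.
    """
    text = dig_output.lower()
    # Check the CNAME pattern first: it is the more specific match.
    if "cdn.cloudflare.net" in text:
        return "CNAME"
    if "cloudflare.com" in text:
        return "Full"
    return "unknown"


def run_dig(domain):
    """Run `dig +short SOA <domain>` and return its standard output."""
    result = subprocess.run(
        ["dig", "+short", "SOA", domain],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

For example, classify_setup(run_dig("www.example.com")) runs the lookup and classifies the answer in one step. A result of "unknown" means the output matched neither pattern, so inspect the dig response manually.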

CNAME

Contact your hosting provider to investigate DNS errors and provide the date Google encountered DNS errors. Additionally, review the Cloudflare System Status page for any network outages on the date the errors were encountered by Google.

Full

Contact Cloudflare support and provide the date and time that Google observed the errors.

Requesting troubleshooting assistance

If the above troubleshooting steps do not resolve your crawl errors, follow the steps below to export crawler errors as a .csv file from your Google Webmaster Tools Dashboard. Include this .csv file when contacting Cloudflare Support.

  1. Log in to your Google Webmaster Tools account and navigate to the Health section of the affected domain.
  2. Click Crawl Errors in the left-hand navigation.
  3. Click Download to export the list of errors as a .csv file.
  4. Provide the downloaded .csv file to Cloudflare support.
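
Before you send the export, a quick tally can show which section of this guide applies (4XX, 5XX, or DNS). The following Python sketch counts rows by the value in one column. It is illustrative only: the function name is hypothetical, and the default column name "Response Code" is an assumption, so open your .csv file and pass the header your export actually uses.

```python
import csv
from collections import Counter


def summarize_errors(csv_file, column="Response Code"):
    """Count rows in an exported crawl-error CSV by the value in `column`.

    `csv_file` is an open text file (or any iterable of lines) whose first
    line is the header row. The default column name is an assumption.
    """
    counts = Counter()
    for row in csv.DictReader(csv_file):
        value = (row.get(column) or "").strip() or "(blank)"
        counts[value] += 1
    return counts
```

For example, open the file with open("crawl-errors.csv", newline="") and pass the file object to summarize_errors(); the file name here is a placeholder. If every row is counted as "(blank)", the column name does not match your export.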

Related resources

Google’s documentation on crawl errors and troubleshooting
