Immediately after a client-side navigation, the page works with CSR. Reload your browser to see how it behaves on a fresh request.
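A quick way to check what a non-JS crawler actually gets is to fetch the raw HTML and look for the page content before any JavaScript runs (the URL and grep string here are placeholders):

> curl -s https://example.com/some-page | grep "text you expect on the page"

If the text is only rendered client-side, it won't appear in that output.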
> curl -I -H "User-Agent: Googlebot" https://www.cloudflare.com
HTTP/2 403
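That 403 is expected: spoofing the user agent isn't enough, since sites can verify real Googlebot traffic. Google's documented method is a reverse DNS lookup plus a forward confirmation (the angle-bracket values are placeholders):

> host <crawler IP>             # should resolve to a *.googlebot.com or *.google.com name
> host <hostname from above>    # should resolve back to the same IP

If either step fails, the request isn't from Googlebot no matter what the user agent says.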
https://www.checkbot.io/robots.txt
I should probably add this SEO tip too because the purpose of robots.txt is confusing: If you want to remove/deindex a page from Google search, you counterintuitively need to allow the page to be crawled in the robots.txt file, and then add a noindex response header or noindex meta tag to the page. This way the crawler gets to see the noindex instruction. Robots.txt controls which pages can be crawled, not which pages can be indexed.
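Concretely, the combination looks something like this (/page-to-remove is a made-up path):

  # robots.txt — leave the page crawlable so the crawler can see the noindex
  User-agent: *
  Allow: /page-to-remove

and then serve the page with either the response header

  X-Robots-Tag: noindex

or the meta tag

  <meta name="robots" content="noindex">

Once Google recrawls the page and sees the noindex, it drops out of the index.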
Does anyone know of other counterintuitive tips like that?
Here is mine: https://FreeSolitaire.win/robots.txt
https://www.cloudflare.com/sitemap.xml
which contains links to educational materials like
https://www.cloudflare.com/learning/ddos/layer-3-ddos-attack...
Potentially interesting to see their flattened information architecture (IA).
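A quick-and-dirty way to skim that structure is to pull the <loc> entries out of the sitemap (grep isn't an XML parser, but it's fine for a look):

> curl -s https://www.cloudflare.com/sitemap.xml | grep -o '<loc>[^<]*</loc>'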