Google Made a Big Robots.txt Announcement Today
Updated: July 2nd, 2019
We’ve got some bad news for you. Google announced they’re killing noindex support in the robots.txt file. Publishers that have relied on this directive (er, all of us) have until September 1st, 2019 to switch to an alternative method.
There is good news though! There are a few ways you can do this without interrupting your day-to-day operations.
Robots.txt Noindex: An Unofficial Directive
The noindex robots.txt directive isn’t going to be supported anymore because it was never an official directive in the first place.
In the past, Google supported many of these “unofficial” directives but has decided to cease support as part of its move away from “undocumented and unsupported rules in robots.txt”:
“Today we’re saying goodbye to undocumented and unsupported rules in robots.txt. If you were relying on these rules, learn about your options in our blog post.”
So that’s it, folks. Another core feature that SEOs have relied on for over a decade is now defunct. The unofficial nofollow and crawl-delay robots.txt directives are being retired along with it.
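For reference, this is what the now-defunct directive looked like in a robots.txt file (the path here is just an illustrative example):

```txt
# Deprecated: Google ignores this rule as of September 1st, 2019
Noindex: /private-page/
```

If you have lines like this in your robots.txt, they’ll simply be ignored once support ends.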
How Do You Noindex a Page without Robots.txt?
If you’re panicking that Noindex support is gone, don’t. You aren’t out of options to hide your content from search results! Google has a few options for you to continue blocking a page from its search crawlers:
- Noindex in robots meta tags – Noindex support is only gone from the robots.txt file; you can still use it in robots meta tags! This is the most effective way to remove URLs from the index when crawling is allowed. Detailed instructions are available directly from Google Support.
- 404 & 410 HTTP status codes – Both of these codes indicate that the page doesn’t exist. If it doesn’t exist, it gets dropped from the index once it’s crawled and processed.
- Password protection – Google can’t index something that’s hidden behind a password! If you’ve got content you want hidden, you can password protect it and rest assured it won’t be seen (unless you use markup to indicate subscription or paywalled content, that is).
- Disallow in robots.txt – It’s only noindex that’s going away. You can still disallow search engines from crawling a specific page, which generally keeps it out of the index. If you use WordPress, the Yoast plugin lets you do this with a simple checkbox. Search engines may still index a disallowed URL based on links from other pages, but since they can’t see the content, Google aims to make such pages less visible in the future.
- Search Console Remove URL tool – A quick and easy way to remove a URL temporarily from search results. The keyword there is temporarily, though. Look to one of the first four options for a more permanent solution.
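For the first option, the robots meta tag goes in the `<head>` of any page you want dropped from the index. A minimal sketch:

```html
<!-- Place in the <head> of the page you want kept out of Google's index -->
<meta name="robots" content="noindex">
```

For non-HTML files like PDFs, the equivalent is the `X-Robots-Tag: noindex` HTTP response header. Either way, the page must remain crawlable so Google can actually see the noindex instruction.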
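And for the disallow option, a minimal robots.txt sketch that blocks crawling of a directory (the path is illustrative):

```txt
User-agent: *
Disallow: /members-only/
```

Keep in mind that Disallow stops crawling, not indexing; a disallowed URL can still show up in results if other pages link to it.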
Why Kill Noindex Support, Google? WHY!?
In simple terms, they’re moving to standardize the Robots Exclusion Protocol. They’ve announced that they’re open-sourcing their production robots.txt parser.
It left people with questions like “Why isn’t a code handler for other rules like crawl-delay included in the code?”
According to Google, “Since these rules were never documented by Google, naturally, their usage in relation to Googlebot is very low. Digging further, we saw their usage was contradicted by other rules in all but 0.001% of all robots.txt files on the internet. These mistakes hurt websites’ presence in Google’s search results in ways we don’t think webmasters intended.”
So all in all, it’s a move geared towards standardizing the protocol while also rectifying cases where contradictory rules caused undesirable results for the webmasters who wrote them.
Why is Robots.txt Noindex Support a Big Deal?
Many webmasters have relied on this method to block certain content from showing up in search results. The most important thing is to make sure you update your method of telling Google to noindex your pages. If you’re using nofollow or crawl-delay directives in robots.txt, you’ll want to switch to an officially supported method for those as well, since they’re affected by this update too.
Let’s tip our hats to our old friend. Time to move on and use an official method to block those crawlers!