Google is in improvement mode. It is the name that seems to know everything: think of a word and it will tell you the meaning. With competition increasing, Google has started making changes intended to make its platform safer and more responsive for its millions of users worldwide.
Google made the following announcements on the Google Webmaster blog, with the retirement of unsupported rules taking effect on September 1, 2019:
- From September 1, 2019, Google will stop honoring unsupported and unpublished rules in the robots exclusion protocol.
- They also released their robots.txt parser as an open source project.
- Google is currently working on making the robots exclusion protocol a standard.
- Their next target is unsupported implementations of the internet draft, such as crawl-delay, nofollow, and noindex, because these rules were never documented and they hurt a website's presence in search results.
Google has been looking to make these changes for years and has now started by pushing to standardize the protocol. These steps are meant to maintain a healthy environment and to prepare for future open source releases. Google also listed some alternatives you can use to replace unsupported directives in robots.txt files.
- Noindex in robots meta tags: one of the best ways to remove URLs from the index when crawling is allowed.
- 404 and 410 status codes: both mean the page does not exist, so they help drop URLs from Google's index once those URLs are crawled and processed.
- Password protection: according to Google's Webmaster blog, hiding a page behind a login will generally remove it from Google's index, unless markup is used to indicate subscription or paywalled content.
- Search Console Remove URL tool: a quick and easy way to remove a URL temporarily from Google's search results.
- Disallow in robots.txt: search engines can only index pages they know about, so blocking a page from being crawled usually means its content won't be indexed.
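Two of the alternatives above, the robots noindex signal and the 410 status code, can be sketched in a few lines of Python. This is only an illustration, not a recommended server setup: the path sets and the `plan_response` function are hypothetical names made up for this example. The sketch also sets the `X-Robots-Tag` response header, which this article does not cover but which is the HTTP header form of the robots meta tag.

```python
from http import HTTPStatus

# Hypothetical paths, used only for illustration.
NOINDEX_PATHS = {"/internal-search", "/thank-you"}
REMOVED_PATHS = {"/old-promo"}


def plan_response(path):
    """Return (status_code, headers, body) for a request path."""
    if path in REMOVED_PATHS:
        # 410 Gone signals that the page was removed on purpose.
        return HTTPStatus.GONE.value, {}, ""

    headers = {"Content-Type": "text/html; charset=utf-8"}
    robots_meta = ""
    if path in NOINDEX_PATHS:
        # The page stays crawlable but asks not to be indexed,
        # both in the HTTP header and in the HTML head.
        headers["X-Robots-Tag"] = "noindex"
        robots_meta = '<meta name="robots" content="noindex">'

    body = f"<html><head>{robots_meta}</head><body>Page content</body></html>"
    return HTTPStatus.OK.value, headers, body
```

Note that a page carrying noindex must not also be disallowed in robots.txt, because the crawler has to fetch the page to see the noindex signal.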
Google also clarified that the search engine may still index a URL based on links from other pages, without seeing its content, and that it aims to make such pages less visible in the future. Google took this step to avoid problems for websites. After analysis, it found that the noindex directive in robots.txt files was never documented and got in the way of the smooth functioning of Google search results, which affects a website's presence.
Things to do
Please note that you should no longer use the noindex directive in the robots.txt file. If you do, switch to one of the alternatives above before September 1 to avoid any issues. Also start looking for alternatives to the nofollow and crawl-delay commands, since the next announcement is expected to target unsupported implementations of the internet draft, such as crawl-delay, nofollow, and noindex.
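To find out whether your own file is affected, you can scan it for the three rules named above. The following is a minimal sketch that assumes the robots.txt content is already loaded as a string; `find_unsupported_rules` and the sample file are hypothetical and only for illustration.

```python
# Rules named in this article as unsupported in robots.txt.
UNSUPPORTED = {"noindex", "nofollow", "crawl-delay"}


def find_unsupported_rules(robots_txt):
    """Return (line_number, directive, value) for each unsupported rule."""
    findings = []
    for number, raw in enumerate(robots_txt.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        directive, value = line.split(":", 1)
        directive = directive.strip().lower()
        if directive in UNSUPPORTED:
            findings.append((number, directive, value.strip()))
    return findings


# Made-up sample file for demonstration.
SAMPLE = """\
User-agent: *
Disallow: /private/
Noindex: /drafts/
Crawl-delay: 10
"""

for number, directive, value in find_unsupported_rules(SAMPLE):
    print(f"line {number}: {directive} -> {value}")
```

Each reported line is a candidate for replacement with one of the alternatives listed earlier.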