How To Check To See If Blocked Pages Are Indexed

You put a robots.txt on your site expecting it to keep Google out of certain pages. But you worry – did you do it correctly? Is Google following it? Is the index as tight as it could be?

Here’s a question for you. If you have a page blocked by robots.txt, will Google put it in the index? If you answered no, you’re incorrect. A robots.txt file blocks crawling, not indexing. Google will indeed index a page blocked by robots.txt if it’s linked from one of your own pages (without a rel="nofollow") or from another website. It doesn’t usually rank well because Google can’t see what’s on the page, but it still receives PageRank from those links. What a waste! Google will probably give you a snippet like this:

[Image: robots.txt snippet]

2 Ways To Check For Indexed Pages You Thought Were Blocked

Don’t worry, I have a couple of relatively painless ways to check your own indexation.

Using Google.com (More Manual)

Visit your robots.txt file and look at the blocked directories. Let’s use Toys R Us as an example:

  1. Check http://www.toysrus.com/robots.txt
  2. Take the first blocked directory (as of 3/22/2015, it’s /search/)
  3. Query Google.com using site:toysrus.com inurl:/search/ (this will attempt to find any URL that has /search/ on the toysrus.com site)
  4. Take note of any listings stating “a description for this result is not available because of this site’s robots.txt”
  5. Repeat with all the other blocked directories
  6. Find all linking pages and determine your best course of action (e.g., a nofollow attribute, a meta noindex, or the “remove URL” tool in Google Webmaster Tools)
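
If you have a lot of blocked directories, steps 1 through 5 get tedious, so the query building can be scripted. Here is a minimal Python sketch, not a finished tool. It assumes you have already pasted the robots.txt contents into a string; the sample rules below are made up for illustration (only /search/ comes from the Toys R Us example), and the function names are hypothetical.

```python
def blocked_paths(robots_txt: str) -> list[str]:
    """Return the Disallow paths in order of first appearance."""
    paths = []
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        value = value.strip()
        # An empty Disallow blocks nothing, so skip it; also skip duplicates.
        if field.strip().lower() == "disallow" and value and value not in paths:
            paths.append(value)
    return paths


def site_queries(domain: str, robots_txt: str) -> list[str]:
    """Build one 'site: inurl:' query per blocked path."""
    return [f"site:{domain} inurl:{path}" for path in blocked_paths(robots_txt)]


# Sample robots.txt content (assumed, for illustration only).
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /search/
Disallow: /cart/   # hypothetical second rule
Disallow:
"""

queries = site_queries("toysrus.com", SAMPLE_ROBOTS)
for query in queries:
    print(query)
```

Each printed line can be pasted straight into Google.com. The sketch ignores which user-agent group a rule belongs to, so treat the output as a checklist rather than an exact picture of what Googlebot obeys.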

In some cases, this trick returns noisy results. If you tried the example above, you probably didn’t find blocked URLs until about page 5, where you see “In order to show you the most relevant results, we have omitted some entries very similar to the 47 already displayed. If you like, you can repeat the search with the omitted results included.” Since Toys R Us also uses “search” as a URL parameter, Google matches those URLs too. This is not the “search” we’re looking for.
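
One way to cut through that noise is to check each result URL against the robots.txt rules and keep only the ones that are really blocked. The sketch below uses Python's standard urllib.robotparser for that. The two URLs are invented for illustration, the helper name is hypothetical, and the standard library parser does simple prefix matching, so it may not mirror how Google handles wildcard rules.

```python
from urllib.robotparser import RobotFileParser


def truly_blocked(urls, robots_txt, user_agent="Googlebot"):
    """Return only the URLs that the given robots.txt rules disallow."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


ROBOTS = "User-agent: *\nDisallow: /search/\n"

# Hypothetical result URLs: the first sits in the blocked directory,
# the second only has "search" as a parameter name.
URLS = [
    "http://www.toysrus.com/search/index.jsp?kw=lego",
    "http://www.toysrus.com/product/index.jsp?search=lego",
]

blocked = truly_blocked(URLS, ROBOTS)
for url in blocked:
    print(url)
```

Only URLs whose path starts with a disallowed directory survive the filter, which is the "search" we are actually looking for.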

Using RankTank’s SERPitude (Less Manual)

Remember the rich snippet tester using Google Docs? Well, the mighty Mr. Malseed made a more powerful desktop tool called SERPitude. It’s a free, easy-to-use rich snippet checking tool that shows you live Google snippets. Why not use it to check for snippets that say “a description for this result is not available because of this site’s robots.txt”? It’s a powerful scraper that accepts advanced search operators.

Or, enter no query and scrape the entire index (though you’ll have to click “add next 100 results” a few times, depending on how big the site is).

Here’s an example of SERPitude with an operator:

[Image: SERPitude]

Download SERPitude

The only drawback is that SERPitude doesn’t expand the “In order to show you the most relevant results…” link, so in some cases you may have to rely on Google.com directly. But you can export all the data and manipulate it in Excel to find the footprints. That’s pretty awesome.
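
If you would rather not do the footprint hunt in Excel, the same filter is a few lines of Python. This is only a sketch: I am assuming the export is a CSV with "url" and "description" columns, which may not match SERPitude's actual export format, so adjust the column names to whatever your file contains. The sample rows are made up.

```python
import csv
import io

# Lowercase fragment of the robots.txt snippet text to search for.
FOOTPRINT = "description for this result is not available"


def find_blocked_rows(csv_file, url_col="url", desc_col="description"):
    """Return URLs whose description contains the robots.txt footprint."""
    reader = csv.DictReader(csv_file)
    return [
        row[url_col]
        for row in reader
        if FOOTPRINT in (row.get(desc_col) or "").lower()
    ]


# Made-up export data standing in for a real file.
SAMPLE_EXPORT = io.StringIO(
    "url,title,description\n"
    "http://www.toysrus.com/search/index.jsp,Search,"
    "A description for this result is not available because of this site's robots.txt\n"
    "http://www.toysrus.com/,Toys R Us,Official Toys R Us site\n"
)

blocked_urls = find_blocked_rows(SAMPLE_EXPORT)
for url in blocked_urls:
    print(url)
```

To run it on a real export, replace the sample with an opened file, for example open("export.csv", newline="", encoding="utf-8"), where the file name is a placeholder.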

Summary

SERPitude is certainly a great tool for this purpose, and it’s also solid for understanding what the SERPs are showing your searchers. Now that you’ve identified your blocked pages, the real fun comes in tracking them down, deindexing them, and plugging the links with a rel="nofollow". Go to it, Sherlock!

Bill Sebald

Managing Partner

I've been doing SEO since 1996. Blogger, speaker, and occasionally teaching at Drexel and Philadelphia University. I started Greenlane in 2005 to help clients leverage search marketing to hit business goals. I love this stuff.
