You put a robots.txt on your site expecting it to keep Google out of certain pages. But you worry – did you do it correctly? Is Google following it? Is the index as tight as it could be? Here’s a question for you. If you have a page blocked by robots.txt, will Google put it in the index? If you answered no, you’re incorrect. Google will indeed index a page blocked by robots.txt if it’s linked from one of your own pages (without a rel="nofollow" on the link) or from another website. It doesn’t usually rank well because Google can’t see what’s on the page, but it does get PageRank passed through it.
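To make the distinction concrete, here is a minimal sketch that checks whether a URL is blocked from crawling, using Python’s standard-library robots.txt parser. The robots.txt rules, the example.com URLs and the `is_crawl_blocked` helper are all hypothetical, made up for illustration. Python’s parser also doesn’t promise to match Google’s own rule handling (wildcards, for instance), so treat the result as a rough check only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_crawl_blocked(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt rules disallow crawling this URL.

    This only answers "may the bot fetch the page?". It says nothing about
    whether the URL can still show up in the index via links pointing at it.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical URLs on a placeholder domain.
    for url in (
        "https://example.com/private/page.html",
        "https://example.com/public/page.html",
    ):
        status = "blocked from crawling" if is_crawl_blocked(ROBOTS_TXT, url) else "crawlable"
        print(f"{url}: {status}")
```

A “blocked” result here means Google can’t read the page. It doesn’t mean the URL stays out of the index. And because a blocked page is never fetched, Google can’t see a noindex directive on it either, so a page usually has to stay crawlable for noindex to work.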
Google gives you a few ways to “deindex” pages. That is, kick pages out of its index. The problem is, despite some serious speed improvements in crawling and indexation, Google is still pretty slow to deindex pages and act on canonical tags. This quick trick can help you isolate and remove pages en masse.
If a website is a mess of URLs and duplicate content, Google will throw its hands up in the air out of frustration. This is a bad spot to be in. You’ll find your traffic and rankings drop while your indexation becomes bloated. Your crawl rate (which we’ve found correlates with traffic) will be curbed. It could all seem very sudden, or it could be gradual. Every case is different – but it’s always a nightmare. Keeping track of website changes is critical in SEO. The other day I peeked into our own Google Webmaster Tools account, and saw something pretty alarming in the “index status” report.
The folks at Mountain View made the conscious decision that keywords alone couldn’t deliver the results they wanted to see (ahem, “their users wanted to see”). Google tried some different modeling but ultimately came around to semantic search (that is, using semantic technology to refine the query results). Now I said much of the industry has picked up on it. Not all. I still see a lot of people pretending Panda, Penguin and Hummingbird never happened. That’s unfortunate for innocent clients around the world. But most of us reading this are probably students of a new lexicon, with words like “triples” and “entities” and “semiotics” and “topic modeling.”
Entity optimization as a big SEO play isn’t quite upon us yet. It’s a slow-growing Google addition. I know – it frustrates me too. There’s so much potential, which I believe will greatly improve search results in the future. Google isn’t showing nearly all the fruits of what it knows through entities, whether through cards or search results – at least not relative to the way it ranks on keywords alone. But can knowledge cards help bring qualified traffic while considering searcher intent? SEOs always talk about searcher intent. Anyone who’s been doing SEO for a while knows that building for intent can be a challenge.
I remember blowing a boss’s mind a few years ago with a theory that Google would eventually rank (in part) based on its own internal understanding of your object. If Wikipedia could know so much about an object, why couldn’t Google? In the end, I was basically describing semantic search and entities, something that had already lived as a concept on the fringe of the mainstream.
For one reason or another, plenty of sites are in the doghouse. The dust has settled a bit. Google has gotten more specific about the penalties and warnings through their notifications, and much of the confusion is no longer… as confusing. We’re now in the aftermath – the grass is slowly growing again and the sky is starting to clear. A lot of companies that sold black hat link building work have vanished (and seem to have their phone off the hook). Some companies who sold black hat work are now even charging to remove the links they built for you (we know who you are!). But at the end of the day, if you were snared by Google for willingly – or maybe unknowingly – creating “unnatural links,” the only thing to do is get yourself out of the doghouse.
As a search engine junkie, I’m always pulling for the little guy with a unique idea. I love competition in the marketplace. Sadly, the dominance of Google (and others) forced many of these into the shadows. But if you are curious like me, I present Greenlane’s active list of search engines: Baidu – The leading…