How Google’s BERT Works In SEO

By Bill Sebald | December 4, 2019 | July 15, 2023

Every few years, Google announces a significant update to its organic search system, from the inclusion of semantic search in Hummingbird to the announcement of machine learning (RankBrain). With each update, Google is trying to get better at two things: understanding the intent behind a query and understanding the language of webpages. The better Google gets at these two skills, the more people will use Google search.

Why does Google keep investing in its search engine? Because it simply cannot continue ranking pages based on keywords alone. Users’ expectations of how well Google answers their needs keep evolving, and their queries are getting longer and more conversational. Google’s only solution is constant improvement in its technology.

Let’s pause for a definition: Natural language processing (NLP) is a branch of artificial intelligence that helps computers understand, interpret and manipulate human language. NLP draws from many disciplines, including computer science and computational linguistics, in its pursuit to fill the gap between human communication and computer understanding. (Definition source.)

What do we know about BERT?

BERT is an acronym for Bidirectional Encoder Representations from Transformers.
BERT was first released in 2018 as an open-source base for pre-training deep learning models, with the ultimate goal of improving natural language processing. It may sound complicated, but think of it like gathering the materials before training an employee to do a particular job.
BERT was announced as part of Google’s systems by Pandu Nayak (VP of Search) on Oct 25, 2019.
Pandu says, “This technology enables anyone to train their own state-of-the-art question answering system.” That certainly includes Google.

The specific version of BERT that Google is using is very powerful. As Britney Muller explains in her BERT video on Moz, it’s a multi-tool that replaces several one-off tools. And it’s only going to get smarter.

Google started with Wikipedia as the corpus. BERT looks at clips of text and converts them to vectors. (Think of a vector as a translation for computers.) Then it uses a technique called masking to blank out a word. During training, BERT reviews the text both before and after the masked word (and possibly the entire clip of text) and tries to figure out what the missing word is. The more it practices, the better it gets at understanding the context of the whole clip of text.
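To make the masking idea concrete, here is a minimal sketch in Python. It is a toy, not BERT: instead of a neural network it simply counts which word appeared between a given left-hand and right-hand neighbor in a tiny corpus. The corpus, the `[MASK]` handling, and the function names `build_context_counts` and `predict_masked` are all made up for illustration. What it does share with BERT is the key idea that the guess depends on the words on both sides of the blank.

```python
from collections import Counter, defaultdict

# Tiny stand-in corpus (hypothetical; real pre-training uses vastly more text).
CORPUS = [
    "the bank approved the loan",
    "the bank approved the mortgage",
    "she sat on the river bank",
    "the river flooded the town",
    "he repaid the loan early",
]

MASK = "[MASK]"


def build_context_counts(sentences):
    """Count which word appears between each (left word, right word) pair."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # Pad with start/end markers so the first and last words have two neighbors.
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for i in range(1, len(tokens) - 1):
            counts[(tokens[i - 1], tokens[i + 1])][tokens[i]] += 1
    return counts


def predict_masked(sentence, counts):
    """Guess the masked word from the words on BOTH sides of it.

    Returns None if this pair of neighbors was never seen in the corpus.
    Raises ValueError if the sentence contains no [MASK] token.
    """
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    i = tokens.index(MASK)
    candidates = counts.get((tokens[i - 1], tokens[i + 1]))
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


if __name__ == "__main__":
    counts = build_context_counts(CORPUS)
    print(predict_masked("the [MASK] approved the loan", counts))  # bank
    print(predict_masked("the [MASK] flooded the town", counts))   # river
```

Notice that the word before the blank is "the" in both examples. It is the word after the blank that settles the guess, which is the "bidirectional" part of the name.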

Additionally, the more Google can understand the question (query), and the more Google can understand the answers it finds around the web, the better Google can match the appropriate answer to a question.
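A rough way to picture that matching step is to turn the query and each candidate passage into vectors and pick the passage whose vector is closest to the query. The sketch below is only an illustration under heavy assumptions: it uses plain word counts as vectors, whereas BERT produces learned vectors that capture context, and the helper names (`to_vector`, `cosine_similarity`, `best_match`) and sample passages are hypothetical. Nothing here reflects how Google actually scores pages.

```python
import math
from collections import Counter


def to_vector(text):
    """Turn text into a sparse word-count vector (a crude stand-in for learned vectors)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(query, passages):
    """Return the passage whose vector is most similar to the query's vector."""
    q = to_vector(query)
    return max(passages, key=lambda p: cosine_similarity(q, to_vector(p)))


if __name__ == "__main__":
    passages = [
        "you can renew a passport by mail or in person",
        "the best pasta recipes for a weeknight dinner",
        "passport photos must be two inches square",
    ]
    print(best_match("how do i renew a passport", passages))
```

Word counts only reward shared words, which is exactly the keyword-matching limitation described earlier. The point of a model like BERT is to produce vectors where passages can be close to a query because of meaning, not just overlap.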

Google says you cannot optimize for BERT. That makes sense: the training data is from Wikipedia, not from your pages. But if you write poor, chunky, keyword-stuffed copy as part of your SEO campaigns, you can certainly improve that copy so it reads as better natural language. Then a smarter Google can see your improved copy and consider using it as a result for more queries. If you write chunks of copy for eCommerce pages, or phone in your blog posts, I believe Google will now have a better idea of how low-quality your copy is.

Final Thoughts

As Google gets smarter, your content should keep in step. I still see plenty of poorly written copy on the web, as well as content so complicated that the context is lost on most normal readers. If you write for an average human, I believe you’ll have an advantage when it’s your turn to get interpreted by Google. And maybe this will give you a higher likelihood of earning a rich snippet.
