Senior SEO Consultant


Semantic search and entity optimisation are not new notions in SEO; however, recent Google updates, patents and studies have brought them to the forefront of popular conversation.

Throughout this post I’m going to explore how to research entities, what they mean and how they can be used to create better website structures. By website structures, I mean the ecosystem as a whole: content, template-level information architecture, website architecture and internal linking.

Throughout this article I’m going to be using a number of processes and techniques, as well as some software – the processes and techniques I’ll go into as we get to them, but from a top level the tools I’ll be using are:

Why Is This Important to Google?

Failure to provide relevant and useful search results would lead to a drastic loss of trust among users around the world and, consequently, a fall in revenue.

So it makes sense that Google prioritizes content salience, or relevance, over trust and authority, up to a point. For a page to rank highly, Google must first be satisfied that the candidate content clears a salience threshold before it turns to other factors. In other words, your content needs to match a number of user search intents.

Back in the good old days of search engine optimization, we obsessed over keyword density and used it as a yardstick for salience.

However, that turned out to be very ineffective, as it encouraged keyword stuffing and paid no attention to conveying useful information.

Over the years, several SEO tools emerged that allowed content writers to analyze keywords by search frequency, using the Google AdWords API and other analytics software.

Eventually, people became obsessed with stuffing the most-searched keywords into their content, focusing on keyword density and ditching keyword relevancy entirely, and Google retaliated by taking away the AdWords API.

Google has put a great deal of effort into convincing us to forget about keywords and focus on providing good-quality content that people will love. It is interesting to note, however, that Google’s efforts to turn unstructured data into structured data are in danger of bringing back SEOs.

What Is Salience?

Google’s Natural Language API is a powerful tool that brings together a lot of Google’s algorithms, so let’s have a look at what it’s all about. It’s one of a handful of tools that let you compare your content with the content currently ranking on Google, to test whether yours is more relevant or salient.

You can also use Google image search tags to identify entities. However, when performing this exercise at scale I prefer to blend multiple data sources, and through the API (0 – 5k units a month for free) you can automate a lot of the analysis.

With Google’s NLP, you can derive insights from unstructured text and determine the salience of content, but how does this work?

Google’s NLP AI is programmed to analyze content by splitting it into what we call “entities”. An entity represents a phrase in the text, such as a person, an organization, or a location.
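As a rough illustration of what an entity analysis call looks like, here is a minimal sketch in Python using only the standard library against the API's public REST endpoint. The function names (`build_request_body`, `analyze_entities`, `extract_entities`) are hypothetical helpers, not part of any Google library, and `api_key` is assumed to belong to a Google Cloud project with the Natural Language API enabled.

```python
import json
import urllib.request

# Public REST endpoint for entity analysis (Cloud Natural Language API v1).
ENDPOINT = "https://language.googleapis.com/v1/documents:analyzeEntities"


def build_request_body(text):
    """Build the JSON payload for an analyzeEntities call on plain text."""
    return {
        "document": {"type": "PLAIN_TEXT", "content": text},
        "encodingType": "UTF8",
    }


def analyze_entities(text, api_key):
    """POST the text to the API and return the decoded JSON response.

    Assumption: `api_key` is valid for a project with the API enabled.
    This function makes a network call and may count against your quota.
    """
    request = urllib.request.Request(
        f"{ENDPOINT}?key={api_key}",
        data=json.dumps(build_request_body(text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def extract_entities(response):
    """Reduce an API response to (name, type, salience) tuples."""
    return [
        (e["name"], e.get("type", "UNKNOWN"), e.get("salience", 0.0))
        for e in response.get("entities", [])
    ]
```

Calling `extract_entities(analyze_entities(page_text, api_key))` would then give you a flat list to work with in a spreadsheet or script.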

An example of the Google Natural Language API using a paragraph from my Travel SEO Guide

Salience Scores

These numbers (salience scores) show the importance or centrality of each entity to the entire content or article.

Using the example picture above, the most salient entity in the article is <holiday website>. The other entities are then listed in order of salience within the text, and each entity is assigned a score between 0 and 1.
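If you are working with the scores outside the demo tool, the same ordering is easy to reproduce. The sketch below assumes you already have (name, salience) pairs; the sample values are made up for illustration and are not real API output, and `rank_by_salience` is a hypothetical helper.

```python
def rank_by_salience(entities, top_n=None):
    """Sort (name, salience) pairs from most to least salient."""
    ranked = sorted(entities, key=lambda item: item[1], reverse=True)
    return ranked if top_n is None else ranked[:top_n]


# Hypothetical scores for illustration only, not real API output.
sample = [("flights", 0.12), ("holiday website", 0.48), ("hotels", 0.21)]

for name, salience in rank_by_salience(sample):
    print(f"{salience:.2f}  {name}")
```

Passing `top_n` trims the list to the handful of entities that dominate the page, which is usually where the analysis starts.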

Determining the Salience or Relevance of a Piece of Content

For many years now, Google has collected and organized vast amounts of information from the internet, and it understands how difficult it is to file and categorize unstructured content.

For instance, if you walk into a library and request a book on the topic “stars”, does the librarian direct you to the Astronomy, Astrology, or Autobiography section?

A single topic can relate to a number of different categories and fields. Consequently, without more specific information about your request, the librarian could end up bewildered and give you a result you don’t want.

Now replace the library and librarian with Google and the search bar.

When analyzing the content of unstructured web pages, Google’s NLP breaks every piece of content down into smaller components. It’s logical that a website is about many things, and even a single web page is about many things too. In fact, a single sentence can be about many things.

In the meantime, website owners want their websites optimized to increase performance (traffic and conversions). So whilst this is all good, how do you turn it into something actionable?

Taking Advantage of NLP Salience as a Measurable Search Factor

In March 2016, Barry Schwartz covered a story on Search Engine Land in which Google’s Andrey Lipattsev revealed that links, content and RankBrain are the three most prominent signals in Google’s algorithm.

Now, I’ve made my thoughts clear on the whole artificial intelligence optimisation (AIO) issue, in that you can’t discernibly optimise for it. If you want, you can spin that voice search is artificial intelligence – but ultimately all of search (on Google) is influenced by RankBrain.

However, from the SEL article we can infer that salience is a measurable factor in determining content relevancy, and therefore its likelihood to perform within competitive SERPs (assuming all other factors are equal, which they’re not).

Salience can, however, be an asset in determining how well optimized your content is in terms of user value, related entities, and its potential to satisfy secondary intents as well as the primary user search intent. It can also help inform content structures and architectures for supporting content.
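One way to make this actionable is a simple entity gap check between your page and a page that currently ranks. The following is a sketch under stated assumptions rather than an established method: both inputs are (name, salience) pairs you have already pulled from the API, duplicate names are collapsed by keeping the highest score, and the `min_salience` cut-off of 0.05 is an arbitrary illustrative value. The helper names are hypothetical.

```python
def best_salience(entities):
    """Map each lower-cased entity name to its highest salience score.

    The API can return the same name more than once, so duplicates are
    collapsed by keeping the maximum score (a simplifying assumption).
    """
    best = {}
    for name, salience in entities:
        key = name.lower()
        best[key] = max(salience, best.get(key, 0.0))
    return best


def entity_gaps(mine, theirs, min_salience=0.05):
    """List entities the ranking page covers that mine lacks or underplays.

    Returns (name, my_score, their_score) rows, largest difference first.
    """
    my_scores = best_salience(mine)
    gaps = []
    for name, their_score in best_salience(theirs).items():
        my_score = my_scores.get(name, 0.0)
        if their_score >= min_salience and their_score > my_score:
            gaps.append((name, my_score, their_score))
    return sorted(gaps, key=lambda row: row[2] - row[1], reverse=True)
```

The output is a shortlist of related entities to consider covering in the page itself or in supporting content. It says nothing about how Google weighs them.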

Isn’t This the Same as E-A-T?

No, and never use that phrase again.

E-A-T was first covered in depth by Jennifer Slegg in 2016, and then again more recently (and loudly) by Marie Haynes, and for many it’s become a go-to buzzword for optimizing content from an SEO perspective.

However, at no point has Google “increased trust as a factor” or anything like that. The rater guidelines are disconnected from the algorithms:

“You can view the rater guidelines as where we want the search algorithm to go,” Ben Gomes, Google’s vice president of search, assistant and news, told CNBC. “They don’t tell you how the algorithm is ranking results, but they fundamentally show what the algorithm should do.”

Also, algorithms require measurability – so how the hell would the algorithm quantitatively measure expertise?

 

  • André Moura

    Hi Dan. Very nice text. I’m starting with this and your article was very valuable. One question: i ran an entity analysis call and in my text I had multiple times the keyword “project manager” mentioned. What’s the reason to get the api response with different entities “project manager”? How should we work with these results? (to illustrate here: I had 8 times project manager mentioned in the original text, and the result provided by google i got 4 “project manager” each one with a different salience)

    • Dan Taylor

      Hi André

      From my understanding it relates to the usage within the corpus of text, for example:

      a) Betty is carrying an axe.
      b) Betty is carrying an axe, she’s going to cut firewood.
      c) Betty is carrying an axe, she has been chopping firewood outside.

      Different meanings in subtext and context, but at the end of the day, Betty is still carrying an axe.

      • André Moura

        Thanks, Dan! 🙂