Beyond keywords: how entities impact modern SEO strategies
The introduction of entities has been a major change in the transition from “search engines 2.0” to “search engines 3.0”.
This article examines these shifts and the impact of entities in modern SEO. It also explains how you can adapt your strategy to succeed in this new age.
Build your own SEO notional machine
A teacher who taught me to code in my early years introduced a concept called “notional machines” which changed my perspective on programming and SEO.
It’s the developer’s mental model of what occurs inside a computer when they hit run.
My teacher stressed that the more accurate and detailed this mental representation, the better I would be able to solve new problems.
The most successful programmers are those who have developed the most accurate and reliable notional machines.
The analogy applies to SEO: whenever we learn a new concept, review a case study, or observe a change in the environment, we update our mental model of how search engines function.
Skilled SEOs achieve better results than unskilled ones because they can pull solutions out of a more accurate model.
Anders Ericsson’s research in this field provides solid evidence that confirms the point.
According to his studies, those who excel at their professions have superior mental models that are more easily accessible.
The models help them understand complex cause-and-effect relationships, determine what is important in a scenario and identify processes that may not be immediately obvious.
The addition of entities has altered how Google’s search engine works.
Many SEO professionals seem to still be operating under “search engine 2.0” rules, even though “search engine 3.0” follows a slightly different set of rules.
Entity SEO draws its concepts and vocabulary from the machine learning and information retrieval disciplines.
Because those terms are rarely distilled to their essence, they can appear complex. You’ll discover that the concepts aren’t too complicated once we simplify them.
The goal of this project is to create a notional machine that shows how search engines today use entities.
This article will show you how to update your SEO knowledge to reflect the new reality.
Understanding the “why” of these changes may seem insignificant, but many SEO professionals “hack” the matrix by using their knowledge of how Google interprets web pages to their advantage.
In recent years, people have built sites with millions of visitors by deliberately shaping Google’s understanding of their subject matter.
Refresher: how we got to search engine 2.0
Let’s first review the changes that have been made to the original version 1.0.
At first, search engines were based on a “bag of words model”.
This model treated the document as a collection of words without regard to their context or arrangement.
The search engine refers to an inverted index, a data structure that maps words to their locations in a collection of documents, and then retrieves the documents with the most matches.
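To make the mechanics concrete, here is a minimal sketch of a bag-of-words retrieval system (a toy example in Python with invented documents, not anything Google actually runs): build an inverted index, then rank documents purely by how many query words they contain.

```python
from collections import defaultdict

# Toy document collection; real engines index billions of pages.
documents = {
    1: "jaguar cars announce a new luxury model",
    2: "the jaguar is a big cat native to the americas",
    3: "jacksonville jaguars win their opening game",
}

# Build the inverted index: word -> set of document IDs containing it.
index = defaultdict(set)
for doc_id, text in documents.items():
    for word in text.lower().split():
        index[word].add(doc_id)

def search(query):
    """Rank documents purely by how many query words they contain."""
    scores = defaultdict(int)
    for word in query.lower().split():
        for doc_id in index.get(word, set()):
            scores[doc_id] += 1
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

print(search("jaguar"))  # Matches docs 1 and 2 equally -- no sense of context.
```

Notice that “jaguar” matches the car page and the animal page equally: the model has no way to tell which one the user meant.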
This model often failed to deliver accurate and relevant search results because it could not understand the context and semantics of either documents or queries.
If, for example, you searched for “jaguar” using the bag-of-words model, the search engine would return only documents that contained the word “jaguar,” without taking context into account.
A user looking for the Jaguar brand of cars might instead find information about the jaguar animal or the Jacksonville Jaguars football team.
Google’s “search engine 2.0” adopted new strategies. This iteration of Google’s search engine aimed to understand the intent behind a user’s query, rather than just matching words.
If a user searches for “jaguar”, the engine can now use the search history of the user and the location to determine the most likely context.
The engine may prioritize results about the car brand if the user has previously searched for Jaguar models or lives in an area with high demand for Jaguar cars.
The introduction of personalized search results, which take into account factors such as user history and location, significantly improved the relevance and precision of results. This was a major evolution from the “bag of words” model to “search engine 2.0.”
Search engine 2.0 vs. search engine 3.0
We had to change our mental models as we moved from “search engine 1.0” to “search engine 2.0.”
SEO professionals began to realize the importance of quality backlinks, which led them to abandon backlinking software and seek links from higher-quality sites, among other changes.
It’s evident that in the age of “search engine 3.0,” the mental shift required to adapt to these changes is not yet complete.
Many concepts from 2.0 are still in use, mostly because practitioners require time to see the correlation between the adjustments they make and the results that follow.
Many SEO professionals are still struggling to adapt to the changes; they may have tried, but without success.
In order to clarify these new distinctions, and to provide some guidance on how you can modify your approach, I’ll present a simplified but useful comparison between “search engines 2.0” and “search engines 3.0”.
Information retrieval and query processing
Imagine entering the search term “Elvis” in Google.
In the 2.0 era, the growing sophistication of Google’s algorithms already allowed it to go beyond simple keyword matching and infer the user intent behind a query.
If a user searched for “Elvis,” the system would use machine learning and natural language processing to anticipate and understand the intention behind the query.
It would then return results that included the word, ranked (almost entirely) on the relevance of the copy on each page and on personalization parameters such as user history and location.
This model had its limitations: it relied heavily on keywords, the user’s search history and location, and the phrases found in the text of indexed pages.
Elvis could refer to Elvis Presley or Elvis Costello. It could also be a restaurant in your area named “Elvis”.
This was a challenge because the engine relied heavily on the user to refine and specify the query, and it was still restricted by the semantics of the text itself.
Improvements to query processing in 3.0
The introduction of entities has revolutionized search.
Hummingbird, RankBrain and other algorithms have paved the way for entities to play a greater role since 2012.
Entities are distinct concepts or things that can be people, places or objects.
In our previous example, “Elvis”, no longer a simple keyword, is now recognized as a distinct entity and likely refers to the famous singer Elvis Presley.
When an entity such as “Elvis Presley” is identified, a search engine can now link a variety of attributes to it, such as his music, filmography, and birth and death dates.
This new search method significantly expands the search scope. Before, a search for “Elvis” would primarily focus on the 2,000,000 pages that included the exact keyword “Elvis.”
In this entity-centric search model, the engine will also consider pages that are related to Elvis’ attributes.
This could potentially expand the search to 10,000,000 pages even if they don’t all explicitly mention “Elvis.”
This model also allows the search engine to understand that other keywords relating to Elvis’s attributes, such as “Graceland” and “Blue Suede Shoes,” are implicitly related to “Elvis.”
Searching for these terms can also yield information about Elvis. This broadens the search results.
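As a rough illustration of the idea (the entity, its attributes and the expansion logic below are invented for this example; they are not Google’s Knowledge Graph or algorithm), an entity-centric engine can expand a query through an entity’s attributes instead of matching the literal string alone:

```python
# Hypothetical slice of a knowledge graph: an entity and its attributes.
knowledge_graph = {
    "Elvis Presley": {
        "occupation": ["singer", "actor"],
        "home": "Graceland",
        "notable_songs": ["Blue Suede Shoes", "Hound Dog"],
        "born": "1935-01-08",
        "died": "1977-08-16",
    }
}

def expand_query(query):
    """Map a keyword to a known entity, then add the entity's attribute
    values as additional search terms (a toy version of entity expansion)."""
    expanded = {query}
    for entity, attributes in knowledge_graph.items():
        if query.lower() in entity.lower():
            expanded.add(entity)
            for value in attributes.values():
                expanded.update(value if isinstance(value, list) else [value])
    return expanded

print(expand_query("Elvis"))
# A page about "Graceland" or "Blue Suede Shoes" can now be retrieved
# even if it never uses the exact word "Elvis".
```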
Search engine 3.0: Topic boundaries and query processing
Google’s perception of the topics that should be on a single web page has also changed significantly as a result of these improvements in entity processing.
In the 2.0 era, it was beneficial to create a separate page for each keyword so that the page could be optimized specifically for that term.
In “search engine 3.0”, however, the boundaries are more fluid. They are constantly updated based on machine-learning predictions and observed behavior.
In the new age, a single page can cover a broad range of subtopics or focus on one specific aspect, and this flexibility lets websites become authorities in broad or niche areas.
Example
Take crayons as an example. A website could cover everything there is to learn about crayons – their types, history, manufacturing process and usage tips.
This website is intended to be a leading authority on all things ‘crayons.’
Another website may focus on red crayons and their unique pigments. It might also include statistics about their popularity, or cultural significance.
This website is trying to establish itself as a topical authority within a more limited, but still valid, context. The focus on “red crayons” must, however, be in line with the website’s overall goal.
Adding micro-contexts that do not match your website’s overall purpose may confuse Google and dilute your site’s topical authority.
Theoretically, a site could focus its content on “labels found on red crayons.”
It is a very specific topic, and you might wonder if Google will recognize it as an authority on the subject.
Machine learning is used by social media sites to predict how users will interact with certain content.
The system could recognize that a user is interested in “labels on red crayons” if they interact with that content frequently, and the website providing it might then be recognized as a subject expert.
Google could theoretically do something similar, or at least maintain expectations about how well content should perform based on the metrics that they track.
Google takes into account several factors to determine the answer.
Does this topic generate a lot of searches?
The site could be considered a topical authority if people actively search for information on “labels on red crayons” and it provides valuable, comprehensive content.
How strong are the user metrics?
Google may interpret long visits, low bounce rates and other engagement signals as indications of a site’s authority on a topic.
Topical authority is an idea based on the relationships between different topics (entities). You can consider your site a topical authority on subjects as broad as “technology” or as narrow as “vintage typesetters.”
What matters is that your website displays positive user behavior and uses entities to build relationships within its content. Google will then rely on the site you have created to improve its understanding of the topic, regardless of how many searches there are.
Takeaways and SEO applications
More comprehensive content wins
Previously, many pages were excluded from searches because they didn’t contain the exact words used in the query.
A well-linked webpage that did not include a specific search term would not appear in the results, even if it had other strong ranking factors such as user engagement and backlinks.
This encouraged SEOs to create many narrow, keyword-focused pieces of content, each built to rank for a single keyword.
The game has changed with the introduction of 3.0, which focuses on entities and their relationships.
The exact search term no longer matters as much. Google will now attempt to link related entities across your site to the entities found on your page.
The algorithm determines your approximate relevance and then ranks you accordingly. This fundamental change allows pages with strong ranking factors to compete even if specific terms are missing.
Content creators and SEO specialists should focus on creating comprehensive and expansive content.
Instead of spreading your topics over multiple articles, focus your efforts on broad, in-depth articles.
You can use the current SERPs to help you identify important topics. However, don’t limit yourself by them.
If you go beyond the current topical coverage of the SERPs, you will give the user more valuable and comprehensive content.
This lets you cater to the user’s current query as well as any related queries they may have, and it increases your content’s visibility and relevance in this new age of search.
Focus on answer intent rather than keywords, and be careful with headlines
SEO has evolved in the “search engine 3.0” era. You can’t just insert keywords from a Search Console report into your content and hope for better rankings.
Google’s advanced algorithm can now detect when keywords are used out of context. This can confuse the algorithm, and lead to a possible drop in rankings.
Header order matters
Connect the ideas that are most important to your goal, and make sure the content below each header is relevant to the topic of that headline.
Remember brainstorming in your elementary school writing class?
We would draw circles and write topics inside them, then link those circles by drawing lines between the smaller circles that held topics related to our story.
Don’t overcomplicate the process – you can use the same strategy for headings.
Search engine 3.0 requires a more deliberate approach to keyword use, while also addressing the user’s intent and maintaining context in order to improve relevancy and ranking potential.
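If you want to audit header order on a live page, a small script can pull out the heading outline for review. This sketch uses the requests and BeautifulSoup libraries (my choice for illustration, not tools the article prescribes), and the URL is a placeholder:

```python
import requests
from bs4 import BeautifulSoup

def heading_outline(url):
    """Fetch a page and print its h1-h4 headings in document order,
    indented by level, so the topical hierarchy is easy to review."""
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    for tag in soup.find_all(["h1", "h2", "h3", "h4"]):
        level = int(tag.name[1])
        print("  " * (level - 1) + f"{tag.name}: {tag.get_text(strip=True)}")

heading_outline("https://example.com/your-article")  # Replace with a real URL.
```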
Documents are scored and ranked
After a search engine such as Google has found potentially relevant documents, it must score and rank those pages for the user.
The evolution of AI and NLP has markedly changed the way documents are ranked. This marks a clear difference between the 2.0 and the 3.0 eras.
2.0 era (post-bag-of-words, pre-RankBrain)
Google’s scoring system in the 2.0 era was largely driven by algorithms such as PageRank, Hummingbird and Panda.
These algorithms relied heavily on keyword matching and the number of backlinks to rank documents. Each document was given a score, and the pages were then sorted by that score.
Panda and Penguin were algorithmic evolutions that penalized sites for trying to manipulate the system and moved the engine further away from pure keyword matching.
Keyword-based systems were also more computationally efficient than the more evolved language methods of the time; the hardware simply wasn’t yet advanced enough to run those methods and still deliver fast results.
Search engine 3.0: Ranking and scoring
Google’s method of scoring and ranking documents in the “search engine 3.0 landscape” has changed significantly.
Both software and hardware upgrades have led to this. Google evaluates the suitability of a web page for a query by evaluating several factors.
The main difference is that Google can now quantify relevance directly, instead of relying on external signals such as backlinks to determine the best content:
Factual accuracy
Content from reliable sources that is factually accurate continues to rank higher. Google’s research on Knowledge-Based Trust affirms this:
“We call the trustworthiness scores we computed Knowledge-Based Trust (KBT)… A manual evaluation of a small subset of results confirms that the method is effective.”
User interaction signals
Google considers both historical and current user engagement data for a webpage, which is one reason it can be problematic to publish low-quality content and plan to edit it later.
Google’s “Engagement & Experience Based Ranking” patent (US20140244560A1) outlines this shift, highlighting the use of historical engagement scores in its ranking considerations.
Quality engagements
Engagements such as long-clicks that keep the user on your site for a considerable amount of time are valuable.
The ranking of your website can be negatively affected by non-quality engagements such as quick returns to search results (“pogo-sticking”).
These engagement metrics will boost your topical authority and influence your ranking.
Poor user engagement, however, can result in a drop in your page’s rankings, and it can take time to recover. This highlights the importance of providing relevant, high-quality content that encourages engagement.
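Google doesn’t expose these signals directly, but you can approximate the same idea with your own analytics data. Here is a hedged sketch (the session records and thresholds are invented for illustration) that flags pages with short dwell times or high quick-return rates:

```python
# Hypothetical session records: (page, seconds_on_page, returned_to_serp_quickly)
sessions = [
    ("/red-crayons", 210, False),
    ("/red-crayons", 15, True),
    ("/crayon-history", 8, True),
    ("/crayon-history", 12, True),
]

def engagement_report(sessions, dwell_threshold=30):
    """Aggregate average dwell time and 'pogo-stick' rate per page."""
    stats = {}
    for page, seconds, pogo in sessions:
        entry = stats.setdefault(page, {"visits": 0, "seconds": 0, "pogos": 0})
        entry["visits"] += 1
        entry["seconds"] += seconds
        entry["pogos"] += int(pogo)
    for page, entry in stats.items():
        avg = entry["seconds"] / entry["visits"]
        pogo_rate = entry["pogos"] / entry["visits"]
        flag = "review" if avg < dwell_threshold or pogo_rate > 0.5 else "ok"
        print(f"{page}: avg dwell {avg:.0f}s, pogo rate {pogo_rate:.0%} -> {flag}")

engagement_report(sessions)
```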
Takeaways and SEO applications
Fact-checking
Google will check for factual accuracy, so spend time creating accurate content.
Do proper research, check the facts, and cite reputable sources. Using a fact-checking schema can increase the credibility and relevance of your informative articles.
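One concrete option is schema.org’s ClaimReview markup, the structured data type used for fact-check results. Below is a minimal sketch that generates the JSON-LD from Python; the URLs, claim text and rating values are placeholders you would replace with your own:

```python
import json

# Placeholder values -- swap in your own page, claim, and verdict.
claim_review = {
    "@context": "https://schema.org",
    "@type": "ClaimReview",
    "url": "https://example.com/fact-check/red-crayon-pigments",
    "claimReviewed": "Red crayons use the same pigment as lipstick.",
    "author": {"@type": "Organization", "name": "Example Crayon Review"},
    "reviewRating": {
        "@type": "Rating",
        "ratingValue": 2,
        "bestRating": 5,
        "worstRating": 1,
        "alternateName": "Mostly false",
    },
    "itemReviewed": {
        "@type": "Claim",
        "datePublished": "2023-05-01",
        "author": {"@type": "Organization", "name": "Example Source"},
    },
}

# Embed this in the page's <head> as application/ld+json.
print('<script type="application/ld+json">')
print(json.dumps(claim_review, indent=2))
print("</script>")
```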
User engagement
Pay attention to user engagement metrics on your pages, and consider revising your strategy if your content doesn’t engage users as you expected.
Crawling and indexing
Let’s wrap up this exploration of the search process by looking at how Google has evolved its web crawling and indexing methods with its focus on entities.
Understanding these changes is crucial, as they directly affect how you structure your website and develop your content strategy, including how you construct a topical map.
In the “search engine 2.0 era,” Google’s spiders (also known as web crawlers) systematically searched the internet for new and updated webpages.
They would follow the links on each webpage and collect information about it to be stored in Google’s database. The process was designed to discover new content and keep the index up to date.
Google then indexed all pages found by its crawlers.
The content (text, images, videos) of each page was analyzed, and the page was classified based on that content.
It was primarily the keywords and phrases in the text that were considered, along with factors such as backlinks, to determine the relevance and authority of a web page.
Things have become much more complicated in the “search engine 3.0 era”.
Google’s crawlers continue to discover new and updated web pages by following links across the internet, but now they also try to determine what the keywords on a page mean.
A page about Elvis, for example, could also be indexed with related entities such as “Graceland,” “Rock and Roll Music,” and “Blue Suede Shoes.”
They also follow your internal links in order to determine the relationships between your website’s entities.
It’s a little like a librarian who doesn’t just catalog books by their titles, but also reads them to understand how they relate to one another and to the overall themes of the collection.
By gaining this deeper understanding of content, Google can now deliver more precise and relevant search results.
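You can get a feel for this kind of entity extraction with an off-the-shelf NLP library. The sketch below uses spaCy (my choice for illustration; it is not what Google uses) to pull named entities out of page copy – the raw material an entity-aware indexer could link to known entities:

```python
import spacy

# Requires: pip install spacy && python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

page_text = (
    "Elvis Presley recorded Blue Suede Shoes and lived at Graceland "
    "in Memphis, Tennessee."
)

doc = nlp(page_text)
for ent in doc.ents:
    # ent.label_ is the predicted entity type (PERSON, GPE, WORK_OF_ART, ...)
    print(ent.text, "->", ent.label_)

# An entity-aware index could now file this page under "Elvis Presley",
# "Graceland" and "Memphis", not just under the literal words on the page.
```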
What is the relationship between crawling and topical authority or entities?
Google doesn’t just crawl a website page by page anymore; it also looks at the theme or topic of the entire website.
This is where topical authority comes in.
A website that consistently produces high-quality material on a particular topic can be viewed as an authority in the field.
Google can give a site a boost in search results if it deems it an authority, which is why you’ll often see sites with small backlink profiles ranking for competitive terms thanks to their topical authority scores.
Google has only recently begun to publicly acknowledge the topical authority concept, even though it has been part of SEO thinking for a while.
On May 23, 2023, Google published “Understanding News Topic Authority.”
Even though many SEOs were convinced that topical authority played a role in ranking, it had never been confirmed in Google’s published content (aside from digging through patents).
Don’t be fooled by the word “news”: topical authority is a Google-wide measure that applies to all websites Google crawls, not only news sites.
Google’s US20180046717A patent describes this concept of topical authorities.
The patent describes how to determine a website’s level of authority by evaluating the depth and consistency of the topic.
A website that consistently publishes high-quality articles about “organic gardening” could have a high purity factor (yes, Google considers your site’s ability to stay on topic), which would contribute to a higher score.
Google can also graph the content you create, much as large language models like ChatGPT represent words as embeddings.
Google can then determine whether the content on your site is consistent and similar, enhancing its understanding of your website’s authority.
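You can run a rough version of that consistency check on your own site by embedding page titles or summaries and measuring how tightly they cluster. This sketch uses the sentence-transformers library as an illustrative stand-in (Google’s internal embeddings are not public), with placeholder page titles:

```python
from itertools import combinations

import numpy as np
from sentence_transformers import SentenceTransformer

# Titles or summaries of your site's pages (placeholders).
pages = [
    "A complete guide to organic gardening in raised beds",
    "How to start a compost pile for your organic garden",
    "The best vintage film cameras of the 1970s",  # off-topic outlier
]

model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(pages)

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Pairwise similarity: low scores flag content that drifts off-topic.
for (i, a), (j, b) in combinations(enumerate(embeddings), 2):
    print(f"{pages[i][:35]!r} vs {pages[j][:35]!r}: {cosine(a, b):.2f}")
```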
In essence, Google’s new indexing system does not only focus on understanding individual pages, but also recognizes the website’s topical focus.
This highlights the importance of a consistent content strategy as it can have a significant impact on your website’s visibility.
Takeaways and SEO applications
Topic focus
Google can detect when your website deviates from its primary topic, and inconsistent content can cause confusion about the purpose and goal of your website.
Keep your content strategy consistent to reap the benefits of topical authority.
Content depth
It is important to build depth into your content, but only relevant depth. The depth of content should be based on your site’s main purpose.
If your website’s main purpose is to inform visitors about digital photography, then don’t write in depth about the history and development of film cameras.
It’s not a good fit for your website’s main focus, which is digital techniques. Instead, deepen your content by exploring various digital photography techniques, reviewing digital cameras, or providing tips for editing digital photos.
Too much content may dilute your authority
A website that has too much content can dilute its purpose and meaning.
Make sure your sitemap only contains content that supports the key ideas you want to promote and is of sufficient quality for Google to understand your entities.
Use contextual bridges
It’s crucial to connect new content to your website’s primary goal by using “contextual bridges” – contextual internal links that tie it back to your core topic.
Ask yourself, before adding any new pages to your site, how they can be tied back to the main goal.
Google will then begin to associate your new page with the primary entity.
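A quick way to sanity-check those contextual bridges is to model your internal links as a graph and verify that every page can reach the page representing your primary entity. This is a minimal sketch with hypothetical URLs, not a prescribed tool:

```python
from collections import deque

# Hypothetical internal link graph: page -> pages it links to.
internal_links = {
    "/": ["/digital-photography-guide"],
    "/digital-photography-guide": ["/editing-tips", "/camera-reviews"],
    "/editing-tips": ["/digital-photography-guide"],
    "/camera-reviews": ["/digital-photography-guide"],
    "/film-camera-history": [],  # new page with no contextual bridge
}

def reaches_pillar(start, pillar, links):
    """Breadth-first search: can a reader (or crawler) get from this
    page to the pillar page by following internal links?"""
    seen, queue = {start}, deque([start])
    while queue:
        page = queue.popleft()
        if page == pillar:
            return True
        for nxt in links.get(page, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

for page in internal_links:
    ok = reaches_pillar(page, "/digital-photography-guide", internal_links)
    print(page, "->", "bridged" if ok else "orphaned from the primary entity")
```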
Limitations and constraints on topical authority
There are some limitations to the sites we build.
Google still gives a fair amount of weight to ranking factors that date back to the 2.0 era: how long a site has been online and its backlinks.
It takes time to establish topical authority. That timeline can be shortened with the recent explosion of AI content creation tools, but it still takes time.
Topical authority is also relative to how authoritative the other sites in your niche are.
Even if you create amazing content based on an incredible topical map, you will be compared with other sites in the same niche.
And then we are back to the age-old problems of backlinks and time.
It is very difficult to surpass sites that have done a fantastic job of developing an entity on a domain that has been online for several years. Possible, sure, but difficult nonetheless.
Let’s talk links.
Even seasoned SEOs can struggle to create sites that rank high without backlinks.
Backlinks remain a major ranking factor. They might not be quite as powerful as before, but they still have a lot of power.
The biggest problem with giving backlinks such heavy ranking weight is that many of the strongest backlink profiles belong to news conglomerate websites, which don’t “specialize” in any particular topic.
You’ve probably seen this: when you Google “best widget for xyz,” the top 10-15 results all come from news networks claiming to be the best sources for purchasing these widgets.
Are these news sites specialized in the creation or sale of these widgets?
Are these news sites able to provide a topical perspective on these widgets?
Not at all.
Why do news sites still dominate SERPs if they don’t have any topical authority? The answer comes down to backlink profiles and how long their domains have been online.
Editors at large news networks, knowing that anything they publish will rank extremely high the moment they click the publish button, solicit the sale of ad space.
The companies buying those placements also know that their product will appear near the top of Google’s SERPs, so they are willing to pay thousands of dollars for the feature.
Parasite SEO is the term used to describe the way these advertisers piggyback on a news website’s ability to dominate the SERPs with whatever it publishes.
Even if your site is a genuine topical authority, it may still struggle to compete with these powerhouses.
Unfortunately, until Google fixes this problem, being a topical expert isn’t always sufficient to compete in SERPs dominated by news sites.
Mastering SEO in the age of entities
By guiding you along the path from query processing to indexing and ranking, I hope I’ve helped you update your “notional machine” to better reflect the latest changes to Google’s search engine.
You can improve the ranking of your website and that of your clients by focusing your efforts and time on this refined understanding.
It’s important to remember, too, that theory only shines when it’s applied in practice.
Affiliate SEO practitioners, for example, discovered long ago that producing a large amount of content on their topic could result in a ranking boost from topical authority.
That realization existed long before we had the vocabulary of entity SEO to describe it.
SEO is a journey that never stops evolving. There are always new opportunities to improve.
Armed with these insights and knowledge, it’s now time to experiment and create your own SEO strategy. The proof is in the pudding. Enjoy your testing!
This article was co-written with Paul DeMott.
This article is part of a series on entity SEO. You can read the other articles in the series here:
- The ultimate guide to entity SEO
- How do you optimize for entities?
- 3 ways AI can be used to optimize sitewide entities