Study: 31% of international websites have hreflang errors
Many SEOs find hreflang difficult to implement. For those who work in only one language, the many syntax variations can be hard to follow.
It is also difficult to account for specific language nuances or regional targeting, which are typically understood only by native speakers or by someone who has studied the language thoroughly.
Improper hreflang implementation can lead to problems such as duplicate content and poor SERP visibility, both of which are detrimental to SEO performance.
Hreflang should therefore be implemented with care. It is well documented, and errors can be identified using various SEO tools.
Hreflang errors study
I was able to access the NerdyData database, which allowed me to determine how common hreflang issues are and which ones are most prevalent.
NerdyData provided a list of 18,786 websites that contained at least one instance of hreflang declaring an alternate within the source code. This study is limited to hreflang implemented in the source code.
It does not cover hreflang implemented through XML sitemaps or the HTTP header.
The study was conducted by:
- Running crawls in Screaming Frog to verify the presence of hreflang on the homepages.
- Removing GEO-IP redirects so that the complete URL list resolved with 200 status codes.
- Processing the URLs in batches with Visual SEO Studio and HreflangChecker.com to spot common issues.
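As a rough illustration of what "hreflang in the source code" means in practice, the sketch below pulls hreflang annotations out of a page's HTML using only the Python standard library. It is not the tooling used in the study; the class name, function name, and sample markup are assumptions made for this example.

```python
from html.parser import HTMLParser


class HreflangParser(HTMLParser):
    """Collect (hreflang, href) pairs from <link rel="alternate"> tags."""

    def __init__(self):
        super().__init__()
        self.annotations = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower().split()
        hreflang = attributes.get("hreflang")
        href = attributes.get("href")
        # Only keep alternate links that carry both an hreflang and an href.
        if "alternate" in rel and hreflang and href:
            self.annotations.append((hreflang.strip().lower(), href.strip()))


def extract_hreflang(html):
    """Return the hreflang annotations found in an HTML document."""
    parser = HreflangParser()
    parser.feed(html)
    return parser.annotations


# Hypothetical sample markup for illustration only.
SAMPLE_HTML = """
<html><head>
<link rel="alternate" hreflang="en-GB" href="https://example.com/uk/" />
<link rel="alternate" hreflang="de" href="https://example.com/de/" />
<link rel="stylesheet" href="/style.css" />
</head><body></body></html>
"""

if __name__ == "__main__":
    for value, url in extract_hreflang(SAMPLE_HTML):
        print(value, url)
```

Values are lowercased here so that later comparisons are case-insensitive. Like the study, this only sees annotations in the HTML source, not those in XML sitemaps or HTTP headers.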
31.02% of websites contain conflicting hreflang directives
My research shows that 31.02% of websites serving multiple languages have conflicting hreflang directives. Conflicting hreflang occurs when a website has multiple hreflang tags whose language or geographical targeting contradict one another.
Simply put, more than one URL has been assigned to the same language or region, which confuses search engines.
This confusion can lead to duplicate content and incorrect indexing, which can make it difficult to rank well in the SERP.
Even if your website ranks well, users will have a poor experience if they are served the wrong version of the page.
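One way to picture a conflict is a cluster in which the same hreflang value points at two different URLs. The sketch below flags that case; the function name and the example.com cluster are hypothetical, and other kinds of conflict are not covered.

```python
from collections import defaultdict


def find_conflicts(annotations):
    """Return hreflang values that are mapped to more than one distinct URL."""
    urls_by_value = defaultdict(set)
    for hreflang, url in annotations:
        urls_by_value[hreflang.strip().lower()].add(url.strip())
    return {
        value: sorted(urls)
        for value, urls in urls_by_value.items()
        if len(urls) > 1
    }


# Hypothetical cluster: "en-gb" is declared twice with different targets.
cluster = [
    ("en-gb", "https://example.com/uk/"),
    ("en-gb", "https://example.com/en/"),
    ("de", "https://example.com/de/"),
]

print(find_conflicts(cluster))
```

Note that the reverse situation, one URL listed under several hreflang values, is not treated as a conflict here, since a single page can legitimately serve more than one language-region pair.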
16.04% of the hreflang clusters lack self-referencing tags
Self-referencing hreflang occurs when a page contains an hreflang tag pointing to its own URL.
The page essentially indicates that it is available in multiple languages, including the language of the page itself.
Although it may seem redundant, this is good practice for international SEO. Unfortunately, 16.04% of websites with multiple languages don’t have self-referencing hreflang tags.
When self-referencing hreflang tags are used, search engines can better understand the relationship between the different versions of the same page, including versions in other languages.
This matters because hreflang is one of approximately 20 canonicalization signals.
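A self-reference check can be sketched as a comparison between the page's own URL and the URLs listed in its hreflang cluster. The helper names below are hypothetical, and the normalization (lowercasing the host and ignoring a trailing slash) is an assumption that may not suit every site.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Lowercase scheme and host and drop a trailing slash (an assumption)."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/") or "/"
    return (parts.scheme.lower(), parts.netloc.lower(), path, parts.query)


def has_self_reference(page_url, annotations):
    """True if any hreflang annotation points back at the page itself."""
    target = normalize(page_url)
    return any(normalize(url) == target for _, url in annotations)


# Hypothetical cluster found on https://example.com/uk/
cluster = [
    ("en-gb", "https://example.com/uk/"),
    ("de", "https://example.com/de/"),
]

print(has_self_reference("https://Example.com/uk", cluster))
print(has_self_reference("https://example.com/fr/", cluster))
```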
47.95% of websites don’t use x-default
The x-default attribute tells search engines that a page isn’t targeting a particular language or location, which makes it the default version.
This is especially helpful when none of the available versions delivers content in the user’s preferred language.
Hreflang doesn’t require the x-default attribute, and 47.95% of multilanguage websites currently don’t use it.
It can still be useful for users searching in a language that isn’t yet available, because it helps search engines find the best version of the page to show them.
Remember that x-default should serve only as the fallback for when no other language version matches. Every language version that does exist should be identified with its own hreflang tag.
X-default should be avoided on pages that are specific to a language or location.
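The fallback role of x-default can be illustrated with a deliberately simplified selection routine. This is not how any search engine actually chooses a version; the function name, the matching order (exact value, then bare language, then x-default), and the cluster are assumptions for the sake of the example.

```python
def pick_version(preferred, annotations):
    """Pick a URL for a preferred locale, falling back to x-default."""
    mapping = {value.strip().lower(): url for value, url in annotations}
    preferred = preferred.strip().lower()
    language = preferred.split("-")[0]
    # Try the exact value, then the bare language, then the default.
    for candidate in (preferred, language, "x-default"):
        if candidate in mapping:
            return mapping[candidate]
    return None


# Hypothetical cluster with a default version.
cluster = [
    ("en-gb", "https://example.com/uk/"),
    ("de", "https://example.com/de/"),
    ("x-default", "https://example.com/"),
]

print(pick_version("de-AT", cluster))
print(pick_version("fr", cluster))
```

Without the x-default entry, the request for "fr" would return nothing, which is the gap the attribute is meant to fill.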
8.91% of hreflang clusters contain at least one instance of an invalid language code
It is important to use the two-letter ISO 639-1 format for language codes within your hreflang attributes.
Language codes are easy to get wrong, and the resulting mistakes can cause multiple problems for international targeting.
My research revealed that 8.91% of sites targeting more than one language contain unrecognized language codes.
This could be due to confusion over how language and region codes are combined, but there may be other causes.
Some language codes do not match the English spelling of the language or the country it is associated with.
You might think that the language code for Croatian is “cr”, but it is actually “hr”. Because the codes aren’t always intuitive, it’s easy to make mistakes when implementing them.
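A common source of invalid language codes is using a country code where a language code belongs. The sketch below checks the language part of an hreflang value against a small, hand-picked table of such mix-ups. The table and function name are assumptions for illustration, and a complete check would need the full ISO 639-1 list. A lookup like this only catches codes that do not exist; a code that exists but names a different language would still pass.

```python
# Country codes that are sometimes used by mistake as language codes,
# mapped to the ISO 639-1 language code that was probably intended.
# This table is a small illustrative sample, not a complete list.
COMMON_MIXUPS = {"jp": "ja", "dk": "da", "cz": "cs", "gr": "el", "cn": "zh"}


def check_language(value):
    """Return a description of the problem, or None if nothing was flagged."""
    value = value.strip().lower()
    if value == "x-default":
        return None
    language = value.replace("_", "-").split("-")[0]
    if len(language) != 2 or not language.isalpha():
        return f"'{language}' is not a two-letter language code"
    if language in COMMON_MIXUPS:
        intended = COMMON_MIXUPS[language]
        return f"'{language}' is a country code; the language code is '{intended}'"
    return None


for sample in ("en-GB", "jp-JP", "eng", "x-default"):
    print(sample, "->", check_language(sample))
```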
1.6% of hreflang clusters have at least one instance containing invalid region codes
In contrast to the previous statistics, very few hreflang clusters include invalid region codes.
Although the two-letter ISO 3166-1 region codes are not required, they are helpful when targeting the same language in multiple countries that have different spelling rules. They give search engines more context about both language and location.
To target users based in the United States, you must use the code “en-US”. To target users based in the United Kingdom, it should be set to “en-GB”. Getting the region code wrong will make your targeting irrelevant.
Two of the most common errors are “en-UK” and “en-EU”.
These entries target English and attempt to target the UK and Europe, but both region codes are invalid: the code for the United Kingdom is GB (Great Britain), and you can’t target Europe as a country.
Spanish targeting in Latin America can also prove problematic. Clusters may use codes such as es-419 and es-437 in an attempt to target the entire region. Instead of targeting the region as a whole, you should target individual countries or leave Spanish as a general language.
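The region mistakes described above can be caught with a simple format check. The sketch below accepts only a two-letter language optionally followed by a hyphen and a two-letter region, and flags the UK and EU codes explicitly. The function name and messages are assumptions, and the pattern deliberately ignores script subtags, so a value containing one would be reported as malformed here.

```python
import re

# language, optionally followed by "-" and a two-letter region
PATTERN = re.compile(r"^([a-z]{2})(?:-([a-z]{2}))?$")

# Region codes named in the text as invalid, with a short explanation.
KNOWN_BAD_REGIONS = {
    "uk": "use 'gb' for the United Kingdom",
    "eu": "Europe is not a country",
}


def check_value(value):
    """Return a list of problems found in a single hreflang value."""
    problems = []
    cleaned = value.strip()
    if cleaned.lower() == "x-default":
        return problems
    if "_" in cleaned:
        problems.append("uses '_' instead of '-'")
    normalized = cleaned.replace("_", "-").lower()
    match = PATTERN.match(normalized)
    if not match:
        problems.append("not in language or language-region form")
        return problems
    region = match.group(2)
    if region in KNOWN_BAD_REGIONS:
        problems.append(KNOWN_BAD_REGIONS[region])
    return problems


for sample in ("en-GB", "en_US", "enGB", "en-UK", "en-EU", "es-419"):
    print(sample, "->", check_value(sample))
```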
22.4% of hreflang clusters include irregular or unusual language-region combinations
Using hreflang to target a country in a language that is not native to it has many benefits. The most important is making the site easier to use for people who don’t speak the local language.
Dutch, for example, is the national language of the Netherlands. However, an estimated 95% of the population speaks English. There are also around 97,8000 British nationals who live in the Netherlands.
Given the high number of English speakers, it makes sense to target the Netherlands with English-language pages.
But not all combinations make sense. Consider, for example, a cluster that targets Chinese speakers in Zambia.
Although such combinations are technically valid and will pass an hreflang test, they are unlikely to reach a meaningful audience of Chinese speakers in Zambia.
Alternate versions that make no sense can create additional crawl demand, and Google may treat those versions as duplicates and override the canonicals.
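Unusual combinations cannot be detected by syntax alone, so one option is to compare each value against a hand-maintained table of languages you actually expect to serve in each region and send anything else for manual review. The table below is a hypothetical example and would need to reflect your own markets; the function name is also an assumption.

```python
# Hypothetical, hand-maintained table: languages expected per region.
EXPECTED_LANGUAGES = {
    "nl": {"nl", "en"},
    "gb": {"en"},
    "zm": {"en"},
}


def needs_review(value):
    """True if a language-region pair falls outside the expected table."""
    language, _, region = value.strip().lower().partition("-")
    # Values without a region, or regions not in the table, are not flagged.
    if not region or region not in EXPECTED_LANGUAGES:
        return False
    return language not in EXPECTED_LANGUAGES[region]


for sample in ("en-NL", "zh-ZM", "de", "fr-FR"):
    print(sample, "->", needs_review(sample))
```

A flag here is only a prompt to check whether the version serves a real audience, not proof that the combination is wrong.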
Search Engine Land – Study: 31% of international websites have hreflang errors