Understanding and resolving ‘Discovered – currently not indexed’
If you see “Discovered – currently not indexed” in Google Search Console, it means Google knows the URL exists but hasn’t crawled or indexed it yet.
This does not necessarily mean the page will never be processed. As Google’s documentation indicates, it may return to the page later.
Google could also be hindered by other factors, such as:
- Server problems or technical issues on the site that restrict or prevent Google from crawling.
- Issues relating to the page itself, such as quality.
You can also use the Google Search Console URL Inspection API to query URLs for their coverageState (and other useful data points).
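For example, here is a minimal Python sketch using the official google-api-python-client library. It assumes a service account (or OAuth user) that has been granted access to the Search Console property; the credentials file name and URLs are placeholders.

```python
from google.oauth2 import service_account
from googleapiclient.discovery import build

SCOPES = ["https://www.googleapis.com/auth/webmasters.readonly"]

# Placeholder service account key; the account must be added to the property.
credentials = service_account.Credentials.from_service_account_file(
    "service-account.json", scopes=SCOPES
)
service = build("searchconsole", "v1", credentials=credentials)

# Placeholder URLs; for a domain property, siteUrl takes the form "sc-domain:example.com".
body = {
    "inspectionUrl": "https://www.example.com/some-page/",
    "siteUrl": "https://www.example.com/",
}
response = service.urlInspection().index().inspect(body=body).execute()

index_status = response["inspectionResult"]["indexStatusResult"]
print(index_status.get("coverageState"))  # e.g. "Discovered - currently not indexed"
print(index_status.get("lastCrawlTime"))  # missing if the URL has never been crawled
```

Looping a call like this over a list of URLs lets you monitor coverage states at scale instead of checking each URL by hand in the interface.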
Request indexing via Google Search Console
This is a simple solution and will solve the problem in most cases.
Google is sometimes simply slow to crawl new URLs. In other cases, however, the root cause is more complex.
One of the following things could happen when you ask for indexing:
- The URL moves to “Crawled – currently not indexed”
- Temporary indexing
Both of these symptoms are signs of deeper issues.
This is because requesting indexing can sometimes give your URL a temporary boost, lifting it above the quality threshold just long enough for it to be indexed temporarily.
Quality issues on pages
Here is where the vocabulary can get confusing: how can Google determine page quality if the page hasn’t been crawled yet?
It’s a great question, and the answer is that it can’t, at least not directly.
Instead, Google makes assumptions about a page’s quality based on other pages on the domain, and it uses URL patterns and site architecture to classify pages.
As a result, these pages can be held back from moving from “awareness” into the crawl queue because of the perceived low quality of similar pages.
Pages with similar URL patterns, or pages located in the same areas of the site architecture, may offer a lower value proposition than other content targeting the same keywords and user intents.
There are many possible causes, including:
- The depth of the main content.
- Presentation.
- Supporting content.
- The uniqueness of the content and the perspectives offered.
- Or more manipulative issues (i.e., the content is low-quality, auto-generated, spun or directly duplicates existing content).
If you want Google to crawl your content with greater purpose, it is worth improving the quality of both the site cluster and the specific pages within it.
You can also remove or consolidate low-quality pages to increase the proportion of high-quality pages on the website.
Efficiency and crawl budget
The crawl budget in SEO is often misunderstood.
Most websites do not need to worry about it: Google’s Gary Illyes has stated that 90% of websites do not need to think about crawl budget. It is typically a concern for large enterprise websites.
Crawl efficiency, on the other hand, can affect websites of any size and can cause problems with how Google crawls and processes the site.
To illustrate, if your website:
- Duplicates URLs with parameters appended.
- Resolves both with and without trailing slashes.
- Is available on both HTTP and HTTPS.
- Serves content from multiple hostnames (e.g., https://website.com and https://www.website.com).
If so, you could be experiencing duplication issues that affect Google’s assumptions about crawl priority, based on its wider assessment of the site.
Google may be wasting crawl resources on unnecessary requests and duplicate URLs. Because Googlebot crawls websites in portions, this can leave it without enough resources to discover all of the URLs you publish.
Crawl your website regularly and make sure that:
- Pages resolve to a single subdomain (as intended).
- Pages resolve to a single protocol (HTTPS).
- URLs with parameters are canonicalized to the root URL (where appropriate).
- Internal links do not point at redirects.
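A quick way to spot-check the first three points is the rough sketch below. It uses the third-party requests library, and the domain and path are placeholders: fetch a few common variants of the same page and confirm they all end up at one canonical destination.

```python
import requests


def final_destination(url: str) -> tuple[int, str]:
    """Follow redirects and return the final status code and URL."""
    response = requests.get(url, allow_redirects=True, timeout=10)
    return response.status_code, response.url


# Placeholder variants of the same page.
variants = [
    "http://example.com/blog",
    "https://example.com/blog/",
    "https://www.example.com/blog",
    "https://www.example.com/blog/?utm_source=test",
]

results = {variant: final_destination(variant) for variant in variants}
for variant, (status, final_url) in results.items():
    print(f"{variant} -> {status} {final_url}")

# Ideally every variant redirects to one canonical URL; several distinct
# 200-responding destinations point to duplication that wastes crawl resources.
if len({final_url for _, final_url in results.values()}) > 1:
    print("Warning: variants resolve to more than one destination.")
```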
You can also prevent parameters, such as ecommerce product filter parameters, from being crawled by disallowing them in the robots.txt file.
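For instance, a hypothetical robots.txt snippet along these lines could block common filter and sort parameters from being crawled. The parameter names are invented; audit your own URLs first, because disallowed URLs cannot be crawled at all.

```
User-agent: *
# Hypothetical ecommerce filter/sort parameters
Disallow: /*?*filter=
Disallow: /*?*sort=
Disallow: /*?*size=
```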
Google also takes your server into account when allocating crawl budget to your site.
Crawling issues can occur if your server responds slowly or is overloaded. In that situation, Googlebot may be unable to access pages, and some content may not get crawled.
Google will try to return later to crawl and index the site, but this will inevitably delay the entire process.
Internal linking
Internal links between pages are essential for any website.
Google tends to pay less attention to URLs that don’t have enough internal links pointing to them, and it may even exclude them from its index.
Crawlers such as Screaming Frog and Sitebulb can help you check the number of internal links pointing to each page on your website.
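To see what such a check involves, here is a toy Python sketch that counts inbound internal links across a handful of pages. It assumes the third-party requests and beautifulsoup4 packages, and the domain and page list are placeholders; dedicated crawlers do the same job at site scale.

```python
from collections import Counter
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup

SITE = "https://www.example.com"  # placeholder domain
pages_to_check = [f"{SITE}/", f"{SITE}/blog/", f"{SITE}/about/"]  # placeholder pages

inbound_links = Counter()
for page in pages_to_check:
    html = requests.get(page, timeout=10).text
    for anchor in BeautifulSoup(html, "html.parser").find_all("a", href=True):
        target = urljoin(page, anchor["href"]).split("#")[0]
        if urlparse(target).netloc == urlparse(SITE).netloc:  # internal links only
            inbound_links[target] += 1

# URLs with few or no inbound internal links are the ones most at risk.
for url, count in inbound_links.most_common():
    print(count, url)
```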
Optimizing internal linking is easier if your website has a well-organized, logical structure.
If you are having trouble with this, you can “hack” crawl depth by using HTML sitemaps.
Although these are designed for humans rather than machines and are often considered relics, they can still be useful.
If your website has many URLs, split them across multiple sitemap pages rather than linking them all from a single page.
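As a simple illustration (the section names and URLs are invented), one page of an HTML sitemap might cover a single section of the site, with other sections living on their own sitemap pages:

```html
<!-- /sitemap/blog/ : one of several HTML sitemap pages, each covering one section -->
<h2>Blog articles</h2>
<ul>
  <li><a href="/blog/crawl-budget-basics/">Crawl budget basics</a></li>
  <li><a href="/blog/internal-linking-guide/">Internal linking guide</a></li>
  <li><a href="/blog/discovered-not-indexed/">Fixing "Discovered - currently not indexed"</a></li>
</ul>
<p><a href="/sitemap/products/">Product pages sitemap</a></p>
```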
Internal links must also use the <a> anchor tag with an href attribute, rather than relying on JavaScript functions such as onClick().
If you are using a JavaScript or Jamstack framework, investigate how it handles internal links and make sure proper anchor tags are rendered in the HTML.
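As a quick illustration (the URL is a placeholder), Googlebot can reliably follow the first pattern below, but it may never discover the second because there is no crawlable href:

```html
<!-- Crawlable: a real anchor with an href -->
<a href="/guides/internal-linking/">Internal linking guide</a>

<!-- Not reliably crawlable: navigation happens only via JavaScript -->
<span onclick="window.location='/guides/internal-linking/'">Internal linking guide</span>
```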
The post “Understanding and resolving ‘Discovered – currently not indexed’” first appeared on Search Engine Land.