19 Mar 2022
At the end of January, Google announced that a new API for Search Console was launching. This API allows you to query, at scale, the data found through the “URL inspection” functionality in Search Console. This data was previously only available through the user interface, one URL at a time.
Some of the important data in this report includes:
- Whether the URL is indexed and eligible to show in Google’s search results.
- Whether the URL is in a sitemap.
- When the URL was last crawled.
- Whether Google has found a canonical and whether that is being respected.
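To show where these data points sit in practice, here is a minimal sketch that pulls them out of a single API response. The field names follow the `indexStatusResult` object in Google's API reference as we understand it. The `summarise_inspection` helper and the sample payload are our own illustrations, not output from a real property.

```python
from typing import Any, Dict


def summarise_inspection(response: Dict[str, Any]) -> Dict[str, Any]:
    """Pull the headline fields out of one URL Inspection API response."""
    status = response.get("inspectionResult", {}).get("indexStatusResult", {})
    google_canonical = status.get("googleCanonical")
    user_canonical = status.get("userCanonical")
    return {
        # Assumption: a "PASS" verdict means the URL is indexed.
        "indexed": status.get("verdict") == "PASS",
        "coverage_state": status.get("coverageState"),
        # The response lists any sitemaps that reference the URL.
        "in_sitemap": bool(status.get("sitemap")),
        "last_crawl_time": status.get("lastCrawlTime"),
        "google_canonical": google_canonical,
        "user_canonical": user_canonical,
        # Only meaningful when both canonicals are reported.
        "canonical_respected": (
            google_canonical == user_canonical
            if google_canonical and user_canonical
            else None
        ),
    }


# Hypothetical payload, shaped like a real response.
sample_response = {
    "inspectionResult": {
        "indexStatusResult": {
            "verdict": "PASS",
            "coverageState": "Submitted and indexed",
            "sitemap": ["https://www.example.com/sitemap.xml"],
            "lastCrawlTime": "2022-03-10T08:15:00Z",
            "googleCanonical": "https://www.example.com/page",
            "userCanonical": "https://www.example.com/page",
        }
    }
}

summary = summarise_inspection(sample_response)
print(summary)
```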
What can you do with the URL Inspection API?
Google provides detailed instructions on how to access the API should you wish to create your own request. But more excitingly, we’re already seeing SEO tool providers deploy functionality to make that process easier. And they’re grouping and visualising the data to start discovering insights too.
Two of our favourite tools are Sitebulb and Screaming Frog, both of which have been fast off the blocks with new functionality to help draw insights from this newly-scaled data source.
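If you would rather build your own request than rely on a tool, the sketch below shows roughly what one looks like. It assumes you have already obtained an OAuth access token for a Google account with access to the property. The token, the example URLs and the `build_inspection_request` helper are placeholders of ours. The request is only constructed here, not sent.

```python
import json
import urllib.request

INSPECT_ENDPOINT = (
    "https://searchconsole.googleapis.com/v1/urlInspection/index:inspect"
)


def build_inspection_request(
    page_url: str,
    property_url: str,
    access_token: str,
    language_code: str = "en-GB",
) -> urllib.request.Request:
    """Build (but do not send) a URL Inspection API request for one page."""
    body = {
        "inspectionUrl": page_url,   # the page you want inspected
        "siteUrl": property_url,     # the Search Console property it belongs to
        "languageCode": language_code,
    }
    return urllib.request.Request(
        INSPECT_ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values for illustration only.
request = build_inspection_request(
    page_url="https://www.example.com/page",
    property_url="https://www.example.com/",
    access_token="YOUR_ACCESS_TOKEN",
)

# To send it for real you would call urllib.request.urlopen(request)
# and parse the JSON body of the response.
```

Each call inspects one URL and counts as one request against the daily limit discussed below.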
Before you start having a play yourself, there are some core things to remember.
The API limits requests to 2,000 URLs per day
That limit applies to each website property in Search Console. If your site is larger than 2,000 URLs and you’re using a crawler like Sitebulb or Screaming Frog to interrogate the API, you will want to control which URLs are requested so that you don’t waste your budget on unimportant URLs. Sitebulb has in-built functionality that aims to request data only for the most valuable URLs via its “URL rank” metric. However, it’s still wise to crawl only the URLs that you want API requests for.
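If you are scripting your own requests, one simple way to stay within budget is to rank your URLs and keep only the top 2,000. The sketch below assumes you already have a priority score for each URL (for example from analytics or a crawler export). The scores and the `pick_urls_for_inspection` helper are hypothetical, and this is not how Sitebulb's “URL rank” is calculated.

```python
from typing import Dict, List

DAILY_QUOTA = 2000  # requests per Search Console property per day


def pick_urls_for_inspection(
    url_scores: Dict[str, float], quota: int = DAILY_QUOTA
) -> List[str]:
    """Return the highest-scoring URLs, capped at the daily quota."""
    # Sort by score (highest first), then by URL so ties are stable.
    ranked = sorted(url_scores, key=lambda url: (-url_scores[url], url))
    return ranked[:quota]


# Hypothetical priority scores.
scores = {"/a": 0.9, "/b": 0.5, "/c": 0.9, "/d": 0.1}
shortlist = pick_urls_for_inspection(scores, quota=3)
print(shortlist)
```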
The limit is 2,000 URL requests per website per day
Yes, you read that correctly: the limit is shared across every tool, so it’s not possible to do 2,000 in Sitebulb and another 2,000 in Screaming Frog or any other tool.
The only way to get more requests per day is to segment your site into smaller “website properties” in Search Console, e.g. by sub-folder. Each of these can handle 2,000 requests per day. However, you will need to update your credentials in each tool for each new crawl with a new web property, and then stitch the data back together again. Make sure to discuss API usage with anyone else who shares your limit, as their requests will be denied if you’ve already reached the limit for the day.
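If you do split a site into several properties, you will need to decide which property each URL is requested under. A rough sketch, assuming URL-prefix properties and a separate 2,000-request allowance for each: match every URL to its most specific property and set aside anything that does not fit. The `assign_to_properties` helper and the example properties are our own.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def assign_to_properties(
    urls: List[str], properties: List[str], quota: int = 2000
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Group URLs under their most specific property, capped per property."""
    # Check the longest (most specific) prefixes first.
    ordered = sorted(properties, key=len, reverse=True)
    batches: Dict[str, List[str]] = defaultdict(list)
    overflow: List[str] = []
    for url in urls:
        match = next((p for p in ordered if url.startswith(p)), None)
        if match is None or len(batches[match]) >= quota:
            # No matching property, or that property's quota is used up.
            overflow.append(url)
        else:
            batches[match].append(url)
    return dict(batches), overflow


# Hypothetical properties and URLs; quota lowered to 1 to show the cap.
batches, overflow = assign_to_properties(
    urls=[
        "https://example.com/blog/a",
        "https://example.com/blog/b",
        "https://example.com/shop/x",
        "https://other.com/",
    ],
    properties=["https://example.com/", "https://example.com/blog/"],
    quota=1,
)
print(batches, overflow)
```

Keeping the results keyed by URL makes it easier to stitch the per-property data back together afterwards.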
How to use the URL Inspection API to find the gems in your data
Now that we have URL Inspection tool data at scale, we can start to identify trends, monitor changes over time, diagnose site-wide issues and troubleshoot specific problems more quickly. Understandably, these tools visualise the data slightly differently, so what you can get out of each will vary.
Here are some of our favourite uses for this new data source:
- Identifying unindexed content – these tools now allow us to easily determine whether priority content has been indexed. This information can be overlaid with other crawl data to highlight whether this is deliberate (e.g. due to a noindex tag) or unintended behaviour. When it’s the latter we can quickly dig into why our otherwise indexable content is not being indexed and take action
- Finding indexed content not in the XML sitemap – whilst we previously had access to sampled reports of indexed content that wasn’t in a submitted sitemap, this new functionality allows us to report on whether our 2,000 most valuable URLs are included. If they’re missing we can take action at scale to rectify it
- Fixing pages where Google has chosen to ignore the canonical tag and selected its own – this is another example where we previously had sampled data in Search Console but can now quickly validate the size of this problem across 2,000 priority URLs. This helps to identify trends between the content and the potential reasons for Google to ignore the specified canonical
- Ensuring valuable content is regularly being crawled – using the “Last crawl” date we can see whether our valuable content is being crawled at a frequency that we’re happy with. If we know that content has been updated more recently than the last crawl date we can take action to request crawling so Google can index the changes. Likewise, if our important content hasn’t been crawled in a while we can look at why that is
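The four checks above can be sketched as a single function that flags issues for one URL. It assumes the field names from the API's `indexStatusResult` object, a `has_noindex` flag taken from your own crawl data, and a 30-day freshness threshold that we picked arbitrarily. The issue labels are ours too.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, List


def flag_issues(
    status: Dict[str, Any],
    now: datetime,
    has_noindex: bool = False,
    max_crawl_age_days: int = 30,
) -> List[str]:
    """Return the issue labels that apply to one URL's index status."""
    issues = []
    indexed = status.get("verdict") == "PASS"  # assumption: PASS means indexed
    if not indexed and not has_noindex:
        issues.append("unindexed")  # not indexed, and not deliberately so
    if indexed and not status.get("sitemap"):
        issues.append("indexed_not_in_sitemap")
    google = status.get("googleCanonical")
    declared = status.get("userCanonical")
    if google and declared and google != declared:
        issues.append("canonical_overridden")
    last_crawl = status.get("lastCrawlTime")
    if last_crawl:
        # Assumes timestamps like "2022-03-10T08:15:00Z".
        crawled = datetime.fromisoformat(last_crawl.replace("Z", "+00:00"))
        if now - crawled > timedelta(days=max_crawl_age_days):
            issues.append("stale_crawl")
    return issues


# Hypothetical status for one URL.
today = datetime(2022, 3, 19, tzinfo=timezone.utc)
example_status = {
    "verdict": "PASS",
    "sitemap": [],
    "googleCanonical": "https://example.com/a",
    "userCanonical": "https://example.com/b",
    "lastCrawlTime": "2022-01-01T00:00:00Z",
}
print(flag_issues(example_status, now=today))
```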
When might the URL Inspection API be useful?
This new data will be particularly helpful if you’re expecting big changes to the content on your site and you know which URLs are likely to be affected. Here are some examples of specific times to use the API:
Once a new site has gone live and the appropriate post launch activities have taken place (new sitemaps submitted, high value URLs requested for indexing, redirects tested, technical error checks performed) these reports will be invaluable for understanding how quickly Google is processing and indexing your new content.
Content hub launches or rapid publishing schedules
If you’re submitting multiple pages of content to be crawled and indexed in a short time frame, then populating a crawl list and querying those URLs each day will help you to stay on top of when they’re being indexed.
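As a rough illustration of that daily check, the sketch below takes one snapshot of verdicts per day and records the first day each URL was reported as indexed. The snapshot structure and the `first_indexed_dates` helper are assumptions of ours. In practice the verdicts would come from your tool's export or your own API calls.

```python
from typing import Dict


def first_indexed_dates(
    daily_verdicts: Dict[str, Dict[str, str]]
) -> Dict[str, str]:
    """Map each URL to the first day (ISO date) it was reported as indexed."""
    first_seen: Dict[str, str] = {}
    # ISO dates (YYYY-MM-DD) sort chronologically as plain strings.
    for day in sorted(daily_verdicts):
        for url, verdict in daily_verdicts[day].items():
            if verdict == "PASS" and url not in first_seen:
                first_seen[url] = day
    return first_seen


# Hypothetical snapshots, deliberately out of order.
snapshots = {
    "2022-03-02": {"/hub/guide-1": "PASS", "/hub/guide-2": "NEUTRAL"},
    "2022-03-01": {"/hub/guide-1": "NEUTRAL", "/hub/guide-2": "NEUTRAL"},
    "2022-03-03": {"/hub/guide-1": "PASS", "/hub/guide-2": "PASS"},
}
indexed_on = first_indexed_dates(snapshots)
print(indexed_on)
```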
Changes to product naming, offers and terminology
If you need to make subtle changes to body copy to reflect a change to product naming, remove legacy promotional offers or change the terminology around your product or service, then this will enable you to identify whether that content has been crawled since you made the change. Customers who click through from the search results will see the amended content straight away. Until Google recrawls the page, though, your site may still show up in results for the terminology you’ve removed, or have the old wording pulled into the description shown in the results. Acting on these insights helps to close that gap.
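A simple way to spot pages that Google has not revisited since an edit is to compare your own change timestamps with the last crawl time the API reports. The sketch below assumes you keep a timezone-aware record of when each page was changed. The `awaiting_recrawl` helper and the sample data are hypothetical.

```python
from datetime import datetime, timezone
from typing import Dict, List, Optional


def parse_crawl_time(value: Optional[str]) -> Optional[datetime]:
    """Parse a timestamp like "2022-03-05T10:00:00Z"; None if missing."""
    if not value:
        return None
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def awaiting_recrawl(
    changed_at: Dict[str, datetime], last_crawl: Dict[str, Optional[str]]
) -> List[str]:
    """Return URLs whose last crawl predates the content change."""
    pending = []
    for url, changed in changed_at.items():
        crawled = parse_crawl_time(last_crawl.get(url))
        # Never crawled, or crawled before the edit went live.
        if crawled is None or crawled < changed:
            pending.append(url)
    return pending


# Hypothetical change log and crawl times.
edit_time = datetime(2022, 3, 1, tzinfo=timezone.utc)
changes = {"/product-a": edit_time, "/product-b": edit_time, "/product-c": edit_time}
crawls = {
    "/product-a": "2022-03-05T10:00:00Z",
    "/product-b": "2022-02-20T10:00:00Z",
}
pending = awaiting_recrawl(changes, crawls)
print(pending)
```

URLs in the resulting list are candidates for a manual indexing request in Search Console.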
It’s still early days for the URL Inspection API, tool developers and SEOs, but we can see huge potential for this new API and are excited to discover new ways of using it.
If you’re considering a site migration or large-scale content changes for your website, or have lots of content due to launch soon, and would like SEO support, please get in touch.