What is it?
Are you auditing your site’s content effectively?
Every website has a rich bank of data sitting in its templates.
This can often be overlooked when auditing and evaluating content.
With custom extractions, you can pull any recurring template element for a large set of similar pages, such as your website’s blog section.
For example, you could extract elements such as Author name, Publishing date, Article category, Tags used, Word count of the main article body, or even Number of comments.
Why is it important?
Content audits are a necessary, even vital, part of on-site optimisation.
Generally they focus on reviewing content performance and quality at a page-by-page level, often manually.
This can lead to important site-wide trends being overlooked, as well as creating unnecessary work for the person conducting the audit.
By incorporating custom extractions into the content audit approach, you save valuable time by pulling in bulk the information you would otherwise have to check manually, one page at a time.
Once pulled, you can overlay performance data to slice and segment your site’s content in new ways.
This analysis can reveal valuable insights that would be impractical to find by reviewing pages one by one.
Questions that custom extractions can help you answer might include:
- Which blog categories consistently perform best?
- Does one of your authors perform better than the others?
- Is your blog section using tags consistently and effectively?
- Do certain categories or themes correlate with higher engagement?
- What’s the average word count of a comment in each category?
What to do next?
To get started with custom extractions, you’ll need just two things:
- A tool to conduct the extraction
- An XPath query to tell the tool where to find the data
XPath is a query language for selecting parts of an XML or HTML document, such as a specific element in a page template. You don’t need technical knowledge to create an XPath query, as there are free tools that will work them out for you. We particularly like the ‘XPath Helper’ Chrome browser extension.
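To make the idea concrete, here is a minimal sketch of XPath queries pulling template elements from a single page. The markup, class names and the `extract_fields` helper are all hypothetical, invented for illustration. It uses Python's standard-library `xml.etree.ElementTree`, which supports only a subset of XPath and expects well-formed markup, so treat it as an illustration of the concept rather than a replacement for a crawling tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified blog template (kept well-formed so the stdlib parser accepts it).
PAGE = """
<html>
  <body>
    <article>
      <span class="author">Jane Doe</span>
      <time class="published" datetime="2023-05-01">1 May 2023</time>
      <a class="category">SEO</a>
      <ul class="tags">
        <li>audits</li>
        <li>xpath</li>
      </ul>
      <div class="post-body"><p>First paragraph of the article.</p><p>Second one here.</p></div>
    </article>
  </body>
</html>
"""


def extract_fields(page: str) -> dict:
    """Apply one XPath query per template element and return the results."""
    root = ET.fromstring(page.strip())
    body = root.find(".//div[@class='post-body']")
    return {
        "author": root.findtext(".//span[@class='author']"),
        "published": root.find(".//time[@class='published']").get("datetime"),
        "category": root.findtext(".//a[@class='category']"),
        "tags": [li.text for li in root.findall(".//ul[@class='tags']/li")],
        # Word count of the main article body: join all text nodes, split on whitespace.
        "word_count": len(" ".join(body.itertext()).split()),
    }


fields = extract_fields(PAGE)
print(fields)
```

Because every article shares the same template, the same handful of queries can be reused across the whole blog section.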
Once you have your XPath, you’ll need a tool to plug it into. A dedicated site crawling tool like Screaming Frog SEO Spider is preferable, but Google Sheets’ IMPORTXML function is a great free alternative.
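If you go the Google Sheets route, each cell needs a formula of the form `=IMPORTXML(url, xpath_query)`. As a rough sketch, the snippet below builds those formulas for a list of pages so they can be pasted into a sheet. The URLs, the author selector and the `importxml_formula` helper are hypothetical, and some spreadsheet locales use a semicolon rather than a comma between arguments.

```python
def importxml_formula(url: str, xpath: str) -> str:
    """Build a Google Sheets IMPORTXML formula for one URL and one XPath query."""

    def quote(value: str) -> str:
        # Sheets escapes a double quote inside a string literal by doubling it.
        return '"' + value.replace('"', '""') + '"'

    return f"=IMPORTXML({quote(url)}, {quote(xpath)})"


# Hypothetical blog URLs and a hypothetical XPath for the author element.
urls = [
    "https://example.com/blog/post-1",
    "https://example.com/blog/post-2",
]
author_xpath = "//span[@class='author']"

formulas = [importxml_formula(url, author_xpath) for url in urls]
for formula in formulas:
    print(formula)
```

Writing the XPath with single quotes around attribute values keeps the formula readable, since no escaping is then needed.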
Once you have all of this ready, the steps are simple:
- Create your custom extraction with your XPath query in your tool of choice.
- Run a site audit for the list of pages that your XPath query applies to, e.g. all blog articles.
- Export the extracted data from each page into a spreadsheet and overlay performance data from your site’s analytics platform.
- Dig into the data, identify trends, and supercharge your content!
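The overlay step above can be sketched in a few lines once both exports are in hand. This is a minimal illustration, assuming a crawl export with one row per page and an analytics export keyed by the same URL. The sample rows, the pageview figures and the `average_by` helper are all hypothetical, and average pageviews is only one possible performance measure.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical crawl export: one row per page, with the extracted template fields.
extracted = [
    {"url": "/blog/a", "author": "Jane", "category": "SEO", "word_count": 1200},
    {"url": "/blog/b", "author": "Sam", "category": "SEO", "word_count": 800},
    {"url": "/blog/c", "author": "Jane", "category": "PPC", "word_count": 950},
]

# Hypothetical analytics export: pageviews keyed by the same URL.
pageviews = {"/blog/a": 3000, "/blog/b": 1000, "/blog/c": 500}


def average_by(rows, performance, field):
    """Average a performance metric across pages, grouped by an extracted field."""
    grouped = defaultdict(list)
    for row in rows:
        # Pages missing from the analytics export are skipped, not counted as zero.
        if row["url"] in performance:
            grouped[row[field]].append(performance[row["url"]])
    return {key: mean(values) for key, values in grouped.items()}


by_category = average_by(extracted, pageviews, "category")
by_author = average_by(extracted, pageviews, "author")
print(by_category)
print(by_author)
```

Grouping by a different extracted field, such as tags or publishing month, only requires changing the `field` argument, which is what makes the extracted data useful for slicing content in new ways.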
For further support with custom extractions for content audits, reach out to your content team, or get in touch with us and we’ll be happy to help.