How deeply your website is crawled depends on several factors, including your website’s authority, content popularity and social media footprint. On top of that, there are technical factors that play an important role when it comes to indexing and crawling. A key factor is having a good website architecture that supports all the pages properly and ensures that they are visible and accessible.
Well-designed website architecture ensures that the deepest tier is no more than three clicks away from the homepage, which is good for user experience. Furthermore, from a search engine optimisation perspective, having your deepest pages within three clicks allows you to distribute PageRank properly, which in turn makes your lower-tier pages more visible and accessible to spiders.
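You can check the three-click rule yourself with a breadth-first search over your internal-link graph. Here is a minimal sketch in Python, assuming you have already crawled your site into a page → outgoing-links map (the `links` dictionary below is made up purely for illustration):

```python
from collections import deque

# Hypothetical internal-link map: page -> pages it links to.
# In practice you would build this from a crawl of your own site.
links = {
    "/": ["/products", "/about"],
    "/products": ["/products/widgets"],
    "/products/widgets": ["/products/widgets/blue-widget"],
    "/about": [],
    "/products/widgets/blue-widget": [],
}

def click_depths(link_map, start="/"):
    """Return the minimum number of clicks from `start` to each page (BFS)."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in link_map.get(page, []):
            if target not in depths:        # first visit = shortest click path
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

depths = click_depths(links)
# Pages deeper than three clicks are candidates for extra internal links.
too_deep = [page for page, depth in depths.items() if depth > 3]
```

Any page that never appears in `depths` at all is orphaned, which is an even bigger problem than being buried deep.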
Let’s face it, not all websites are built properly. You may come across websites that are many years old, or ones built using steam-engine-era content management systems that create tiers six or more levels deep. Distributing PageRank to the lowest tier becomes a huge issue because pages are buried very deep, so you have to support those pages through your highest-level landing pages or second tier. You can do this by implementing an internal linking structure that supports weaker pages with links from high-PR pages.
Remember, links that are above the fold and surrounded by text distribute PageRank better than links in the lower parts of the page, such as the footer or sidebars.
Let’s assume you are working on a website with thousands of pages, some of which have high PageRank while others are buried so deep that they won’t rank well for their relevant keywords. In such a case, you need to do the following:
- Find all the high-PR pages of your website
- Internally link the high-PR pages to lowest-tier pages with targeted anchor text, ensuring the pages have some degree of relevancy
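When there are thousands of pages, even drafting a first-pass linking plan by hand is slow. A crude way to shortlist candidate pairings is to score how many URL slug words a deep page shares with each high-PR page. The sketch below is a naive, hypothetical heuristic (the page lists are invented for illustration) and is no substitute for the manual relevancy check described later:

```python
# Hypothetical lists -- in practice these come from your PR audit and your crawl.
high_pr_pages = ["/guides/blue-widgets", "/news/widget-awards"]
deep_pages = ["/products/widgets/blue-widget", "/archive/2009/widget-history"]

def slug_tokens(url):
    """Split a URL path into word tokens, e.g. '/guides/blue-widgets' -> {'guides', 'blue', 'widgets'}."""
    return {t for part in url.strip("/").split("/") for t in part.split("-") if t}

def best_match(deep_page, candidates):
    """Pick the high-PR page whose slug shares the most tokens with the deep page."""
    return max(candidates, key=lambda c: len(slug_tokens(c) & slug_tokens(deep_page)))

# First-pass plan: deep page -> suggested high-PR page to link from.
plan = {page: best_match(page, high_pr_pages) for page in deep_pages}
```

Token overlap only surfaces suggestions; you still need to open each pair of pages and confirm they are genuinely related before adding a link.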
Find high PR pages
Finding high-PR pages manually can be a rather daunting task, so here is a relatively quick way of doing it. You need to install the SeoQuake extension for Chrome. SeoQuake displays Google PageRank, Alexa Rank and other SEO parameters right within the SERPs.
Once you have installed SeoQuake follow these steps:
- Turn Google Instant off.
- Go to Advanced Search and select 100 results per page. (Without switching off Google Instant, you will be unable to show 100 results per page.)
- Edit SeoQuake’s Options, un-tick all the unwanted parameters and hit Save. Disabling unwanted parameters reduces the wait time.
- Use the site: operator on your domain name in Google search, e.g. site:number10.gov.uk
- Straight away you will get 100 pages of number10.gov.uk with the relevant PR value for each page.
- Now you need to export SeoQuake’s data and sort the top-PR pages in Excel. Click on “View as CSV”.
- Select all and copy the data.
- Open your favourite text editor, e.g. Notepad, paste the data and save the file as data.txt. Repeat this step for every page of the SERP pagination, saving all the data in a single file. Remember, if you save the data with a .csv extension you will have problems sorting it in Excel because everything will end up in a single column.
- Start Excel and open data.txt, which will automatically bring up the Text Import Wizard. Select “Delimited” and click Next. On the following screen, tick Semicolon under Delimiters. This will create two columns, one for the URL and one for Google PageRank.
- You should now see both columns in your worksheet. Highlight column B in its entirety and select Filter from the Sort & Filter functions.
- Click on the down arrow next to Google PageRank and un-tick “Google PageRank, n/a”. (If you see “wait…”, it means that SeoQuake didn’t get a chance to retrieve all the PR values for the SERPs; try again and give it some time.)
- Click on the Filter’s down arrow next to Google PageRank and select “Sort Largest to Smallest”.
This will sort all the pages by PageRank. You can now see all the highest-PR pages for your website; it’s time to find relevant deep-tier pages to link to internally. Remember to carefully examine each page and make sure you are linking to relevant pages. With this technique you want to achieve two things:
- Steer the spider to the lowest-tier pages.
- Distribute PageRank juice to weak pages that don’t rank well for their relevant terms.
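If you prefer scripting to spreadsheets, the Excel filter-and-sort steps above can be done in a few lines of Python. This sketch assumes the export is semicolon-delimited with a URL column and a PageRank column, as described above; the sample data and values are made up, and in real use you would read your saved data.txt instead of the inline string:

```python
import csv
from io import StringIO

# Made-up sample of a semicolon-delimited export; in practice use open("data.txt").
data_txt = """\
http://example.gov.uk/;8
http://example.gov.uk/news;6
http://example.gov.uk/archive/old-page;n/a
http://example.gov.uk/policies;7
"""

rows = []
for url, pr in csv.reader(StringIO(data_txt), delimiter=";"):
    if pr.isdigit():  # drop "n/a" and "wait..." rows, like un-ticking them in Excel
        rows.append((url, int(pr)))

# Equivalent of Excel's "Sort Largest to Smallest" on the PageRank column.
rows.sort(key=lambda row: row[1], reverse=True)
```

The result is the same ranked list of your highest-PR pages, without the Text Import Wizard.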