For simplicity, this post focuses on semantic markup in HTML. HTML on its own has many tags that can be interpreted and used in different ways.

In an earlier post about contextualisation I mentioned how to emphasize text to give web pages meaning, and how that affects the way search engines treat the data in the SERPs (Search Engine Results Pages, or rankings).

Semantic markup plays a large part in how this is achieved. Semantic code is more commonplace in HTML now, partly because many web designers push for and adhere to standards, and partly as a by-product of businesses demanding better quality code, so that their data has a greater chance of portability across the large array of devices and software that handle content and information nowadays.

What does semantic mean?

The Oxford English Dictionary defines it as follows:

(Adjective) Relating to meaning in language or logic.

In terms of coding for the Web (in particular with derivatives of HTML) it is a relational method of giving data meaning through the use of tags and attributes.
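
As a rough illustration (the class name and the text are made up for this sketch), the two snippets below can be styled to look much the same in a browser, but only the second tells software what each piece of text is:

<!-- Non-semantic: the tags say nothing about what the text is -->
<div class="big-text">Our Services</div>
<div>We build and maintain websites.</div>

<!-- Semantic: a heading followed by a paragraph -->
<h1>Our Services</h1>
<p>We build and maintain websites.</p>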

Why use semantic markup?

As well as having benefits for SEO, using semantic markup aids devices, software and search engines in interpreting data. It also helps web developers: separating style from content creates cleaner code, usually with a lighter footprint, which makes websites easier to maintain and update.

Imagine a Word document you’ve written that has no punctuation, runs its titles and subheadings into the body text, has no new lines or paragraphs, no bullet points and, to spice things up, includes all sorts of numbers taken from an expenses report…

What you would be left with is a garbled mess of words and numbers. It would be a laborious task, not only for you but for anyone else, to work out what was going on in the document.

Punctuation, bolding and tables are, in a sense, our own human-readable form of semantic markup. We use visual cues and symbols to make sense of the information we’re looking at.

Remember, it is not so much about how the information is displayed as about how it should be interpreted.

For instance, we know a comma is a pause and a full stop is the end of a sentence. Titles and subheadings are usually short and are separated by distinct new lines. Semantic markup works in almost the same way; it can give us visual cues, but the underlying meaning always remains the same.

So what does it have to do with the Search Engines?

Search engines, and Google in particular, look for these tags within the HTML of a document. Using the tags correctly can help your rankings, with the added benefit of having that information displayed on other parts of their services.

For SEO, marking up your site semantically has a lot of benefits. For instance: code is lighter and easier for bots to crawl; the text-to-code ratio is improved; your main content is likely to be higher in the source code (rather than being obstructed by ugly styling code or the content from a stray side column); and titles, words and menu items have a greater chance of being interpreted correctly.

Can you give an example of using semantic markup?

The most common tags that should be used for semantic purposes, and that you have come across or will come across, are the heading tags (<h1> to <h6>), the paragraph <p> tag and the <strong> tag.

Most tags display their content slightly differently by default. Those that change the layout are most likely block-level elements, meaning that (visually) their content starts on a new line and is followed by a new line within the web page. (There are quite a few exceptions, such as the inline <em> and <strong> tags mentioned in the examples below.)

You can tell whether a web page is well formed semantically by disabling CSS and JavaScript in your browser and then loading the page. The page should still be human-readable from the top down (header, content, side columns, then footer, usually in that order), without text flying out to the sides or mysteriously being inserted. Sure, it will look ugly, but the document should still be readable and make sense.

In the end, though, you will still have to look at the code to see whether it is formed correctly (there are no strict rules, only a consensus of guidelines to stick to).

To give you an idea, here is a simplified extract taken from a semantically correct page:

<div id="page">
  <div id="header">
    <div id="menu">
      <ul>
        <li><a href="home.html">Home</a></li>
        <li><a href="services.html">Our Services</a></li>
        <li><a href="contact.html">Contact Us</a></li>
      </ul>
    </div>
  </div>
  <div id="content">
    <h1>Your Main Heading</h1>
    <p>Your main content</p>
    <h2>Your Sub Heading</h2>
    <p>Some more text on sub-heading</p>
  </div>
  <div id="side-column">
    <!-- Your side column, which might contain text, calls to action, images and/or helpful links -->
  </div>
  <div id="footer">
    <!-- Your straplines, essential links and copyright info -->
  </div>
</div>

From the extract above you can see there are a number of <div>s present. <div>s and <span>s carry no meaning of their own, so they are used to group content and give the layout (the visual) something to hook onto, separately from the content (the data). Separating things this way helps in organising your site. For example, you could style your header’s content without affecting other sections simply by changing the CSS in one file, which is far more efficient than changing code on every page.
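
As a minimal sketch of that idea (the colours are invented for illustration, and the rules are shown in a <style> block only for brevity; in an external stylesheet you would drop the <style> tags), the header can be styled by its id without touching any other section:

<style>
/* Applies only to the element with id="header" and the links inside it */
#header { background-color: #003366; color: #ffffff; }
#header a { color: #ffffff; }
</style>

Kept in one shared stylesheet, a change to either rule would update the header on every page that links to that file.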

Let us have a look at a few tags…

<h1><h2><h3><h4><h5><h6> (Headings)

<h1>Your Title</h1>
<h2>Your Subtitle</h2>

Usage: For containing the titles and subtitles of your content. Ideally there should be only one <h1> per page. Heading tags should be used like a tree structure, with an <h1> at the top, branches of <h2>s beneath it, offshoots of <h3>s within those <h2>s, and so on.
Example of incorrect usage: Using an <img> tag to replace the title. If titles need to look smarter, it might be wiser to keep the text in the heading and use the CSS background property instead.
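
A sketch of the tree structure described above, using made-up headings and indented only to make the nesting easier to see:

<h1>Our Services</h1>
  <h2>Web Design</h2>
    <h3>Small Business Sites</h3>
    <h3>Online Shops</h3>
  <h2>Search Engine Optimisation</h2>
    <h3>Site Audits</h3>

And one possible way of dressing up a title without replacing it with an image tag. The file name title-bg.png and the padding value are assumptions for this sketch; the heading text stays in the markup for search engines and other software to read:

<style>
/* Decorative graphic sits behind the heading text */
h1 { background: url("title-bg.png") no-repeat left center; padding-left: 50px; }
</style>
<h1>Your Title</h1>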

<p> (Paragraphs)

<p>A paragraph</p>

Usage: Used for a paragraph of text. Every paragraph should be wrapped in its own <p> tags.
Example of incorrect usage: You might spot <p> </p> in someone’s source. This is lazy code for a new line; the page might have been designed using an old version of Dreamweaver or a CMS text editor. Best practice would be to replace it with a <br /> (line break tag) or, in most cases where it is used to make space at the bottom of a document, to sort out the spacing in the footer or the height of the side column using CSS.
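
For example, rather than padding out the bottom of the content with empty paragraphs, the gap can be set once in the stylesheet. This is only a sketch: the 40px value is arbitrary, and the footer id is the one used in the extract earlier in this post.

<!-- Instead of this -->
<p>Last paragraph of content.</p>
<p> </p>
<p> </p>

<!-- Let CSS create the space above the footer -->
<style>
#footer { margin-top: 40px; }
</style>
<p>Last paragraph of content.</p>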

<em> & <strong> (Emphasis)

<em>I’m emphasizing this!</em>
<strong>Important: Please note.</strong>

Usage: Both are used for emphasizing/stressing a portion of text.
Example of incorrect usage: Sometimes designers use the <b> and <i> tags to make portions of their text stand out. Although, technically, there is nothing wrong with this, those tags are for display purposes and do not really belong in the content. By using <em> or <strong> instead, you’re emphasizing the text itself. If the bold or italics are purely for styling, they should be handled through CSS (font-style: italic; or font-weight: bold;) on the containing tag or through an inline element such as <span>.
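
To make the difference concrete, here is a small sketch (the sentences and the tagline class name are invented for illustration):

<!-- Display only: the tags describe how the text looks -->
<p><b>Important:</b> the office is <i>closed</i> on Mondays.</p>

<!-- Semantic: the tags describe what the text means -->
<p><strong>Important:</strong> the office is <em>closed</em> on Mondays.</p>

<!-- Purely decorative italics, handled through CSS on a <span> -->
<style>
.tagline { font-style: italic; }
</style>
<p>Acme Web Design <span class="tagline">since 1999</span></p>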

<ul><ol><li> & <dl><dt><dd> (List Elements & Definition Lists)

<ul><li>1st list item</li><li>2nd list item</li></ul>
<dl><dt>Car</dt><dd>A vehicle with four wheels.</dd><dt>Motorbike</dt><dd>A vehicle with 2 wheels.</dd></dl>

Usage: These are used for structural-relational purposes. List items are the equivalent of bullet points (<ul>) or numbered points (<ol>), and definition lists are for terms along with their definitions.
Example of incorrect usage: Some designers wrap their list of items in a <p> tag and separate each item with a <br> tag; this is commonly seen in main menus as well. Using the list elements instead defines the items as being related.
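
A short before-and-after sketch of a main menu, reusing the page names from the extract earlier in this post:

<!-- Items strung together with line breaks: nothing says they belong together -->
<p>
<a href="home.html">Home</a><br>
<a href="services.html">Our Services</a><br>
<a href="contact.html">Contact Us</a>
</p>

<!-- The same menu as an unordered list: three related items -->
<ul>
<li><a href="home.html">Home</a></li>
<li><a href="services.html">Our Services</a></li>
<li><a href="contact.html">Contact Us</a></li>
</ul>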

<table><tr><th><td> (Tables)

<table>
<tr><th>Column1 Heading</th><th>Column2 Heading</th></tr>
<tr><td>Column1 Data</td><td>Column2 Data</td></tr>
</table>

Usage: For tabulated data, that is, for showing rows and columns of data (much like what is displayed on a spreadsheet).
Example of incorrect usage: A common misconception is that tables should not be used at all, which is incorrect. What tables should not be used for is laying out your whole site or portions of it. Coding sites using only tables was particularly rife in the early days of the Web, as it was deemed the safest way of getting content to display the same in all the older browsers. Browsers are more in line with each other in displaying content now, so coding your site wholly using tables is frowned upon.

As you can see from the examples of incorrect usage above, in most cases bad coding comes from using a tag for display purposes. Always, where possible, separate content from layout.

The pros far outweigh the cons. Again, semantic code is cleaner, makes your website easier to update and maintain and, importantly, when used to good effect can lead to better rankings.