One of the most important things HTML5 brings us is the possibility of adding semantic information to our code. This means we can tell robots or crawlers what the content inside a tag is about.
For example, the tags <aside>, <header>, <footer> or <nav> act like a div that also explains which type of content you are going to find inside it. That’s semantics.
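As a rough sketch of the idea, a page skeleton built with elements of this kind might look like the following. The titles, link targets and text are placeholders invented for illustration, not taken from any real site.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Example page</title>
</head>
<body>
  <!-- Introductory content for the whole page -->
  <header>
    <h1>My blog</h1>
    <!-- Main navigation links -->
    <nav>
      <a href="/">Home</a>
      <a href="/archive">Archive</a>
    </nav>
  </header>

  <p>Main text of the page.</p>

  <!-- Content related to the main text, often shown as a sidebar -->
  <aside>
    <p>Related links and notes.</p>
  </aside>

  <footer>
    <p>Contact and copyright information.</p>
  </footer>
</body>
</html>
```

Visually this behaves much like a stack of divs, but a crawler can tell the navigation from the footer without guessing from class names.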
But you do not need those new HTML5 tags to have a semantic page. HTML4 and XHTML are also semantic, though to a lesser degree, if you use the standards correctly.
How do we use standards correctly to have a semantic web?
Well, we only need to use the tags we normally use for what they were designed for. That is, if you have a heading or title for a text, it should be placed in the <hN> tags (<h1>, <h2>, <h3>, …). If you then add some text, it should go in a paragraph tag (<p>), and to apply an inline style make sure to use a <span>.
Each tag has its purpose: a div is a block while a span is inline. You cannot place a div inside a <p> because a paragraph cannot contain web blocks, only text and inline elements. Nor can you place a div inside a heading (h1, h2, …) or a span, or a <p> inside a <p>, etc.
This means some tags will not let you place whatever you want inside them*. You cannot write <p><h1></h1></p> or <h1><p></p></h1>, but you can write <h1></h1><p></p>.
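To make the rule concrete, here is a small sketch of an invalid nesting, the structure a browser parsing the page as text/html typically builds from it, and a valid alternative. The text content is placeholder.

```html
<!-- Invalid: a block element inside a paragraph -->
<p>Intro text <div>A box</div> more text</p>

<!-- What an HTML5 parser builds from the line above: the <div> start tag
     closes the open <p>, and the stray </p> produces an empty paragraph -->
<p>Intro text </p><div>A box</div> more text<p></p>

<!-- Valid: siblings instead of nesting -->
<p>Intro text</p>
<div>A box</div>
<p>more text</p>
```

The page still renders, but the resulting structure is not the one that was written, which is why styles aimed at the paragraph can appear to stop halfway.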
Divs are the most flexible of all: they apply no style and have no semantic role, they are just web blocks, so you can use divs everywhere to make headings, paragraphs, list items, etc. You can have as many divs as you want inside the first one, or a chain of them (div->div->div->div->div->…). This leads lots of front-end developers to fall into divitis: instead of using the correct tag they use divs for everything, sidestepping the nesting rules between tags and losing all semantic information too.
* That is, if you want to follow standards. Of course, browsers will have no problem showing your non-standard content.
Solving the divitis problem
If you have fallen into the divitis maelstrom, don’t worry. The recipe is cheap and easy, though the cure will take some time. You only need to remember how you used HTML before divs and run the W3C validator a lot until you remember everything.
Here’s an example of how a web page shouldn’t be done:
<div class="title">Title <div class="date">30/01/2011</div></div>
<div class="text">Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque at magna velit. Maecenas suscipit, lorem vel ultrices congue, diam tellus ullamcorper magna, non ultrices orci nulla sit amet dui.</div>
<div id="listBox">
  <div class="listItem">Lorem ipsum dolor sit amet</div>
  <div class="listItem">Lorem ipsum dolor sit amet</div>
</div>
And here’s the same example done well:
<h1>Title <span class="date">30/01/2011</span></h1>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque at magna velit. Maecenas suscipit, lorem vel ultrices congue, diam tellus ullamcorper magna, non ultrices orci nulla sit amet dui.</p>
<ol>
  <li>Lorem ipsum dolor sit amet</li>
  <li>Lorem ipsum dolor sit amet</li>
</ol>
Divs are very useful for establishing page modules, areas or boxes, but not for everything in your page.
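A minimal sketch of that division of labor: the div marks out a page area, and semantic tags describe what is inside it. The id and class names here are hypothetical.

```html
<!-- The div only groups the module so it can be positioned and styled -->
<div id="news" class="box">
  <h2>Latest news</h2>
  <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</p>
  <ul>
    <li>Lorem ipsum dolor sit amet</li>
    <li>Lorem ipsum dolor sit amet</li>
  </ul>
</div>
```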
Is that all we have in HTML4 or XHTML?
Correct use of tags is not the only way to make a page semantic in the older standards; there are also some largely unknown tags that exist for this very purpose.
- abbr. This lets you use an abbreviation and establish its meaning.
Example: <abbr title="Mister">Mr</abbr>
- acronym. Just like abbr but for acronyms (HTML5 makes it obsolete in favor of abbr).
Example: <acronym title="Extensible Hypertext Markup Language">XHTML</acronym>
- address. To establish that something is an address.
Example: <address>19 Mine Street, London</address>
- cite. To cite others’ work.
Example: <cite>Jeffrey Zeldman, Designing with web standards, page 145</cite>
- code. This is not only semantic but also shows source code in a monospace font.
Example: <code>var i int = 0;</code>
- del. Used to mark text as deleted without actually removing it. Normally used when something has been updated or corrected.
Example: Google is <del>evil</del> cool.
- dfn. To mark a word whose meaning is being defined.
Example: <dfn>Standard</dfn>: A group of rules approved by a group of people.
There are many more you can find just by checking the W3C specs, all of them very useful for screen readers used by blind people, translators, or even crawlers.
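To see several of these tags working together, here is a short made-up fragment. The sentences and the name are invented for illustration only.

```html
<!-- abbr gives the expansion; del marks the corrected word -->
<p><abbr title="Mister">Mr</abbr> Smith updated the page on <del>Monday</del> Tuesday.</p>

<!-- dfn marks the term being defined in this sentence -->
<p>A <dfn>standard</dfn> is a group of rules approved by a group of people.</p>

<!-- code marks a fragment of source code -->
<p>Declare the counter with <code>var i = 0;</code> before the loop.</p>

<!-- cite marks a reference to someone else's work -->
<p>Further reading: <cite>Jeffrey Zeldman, Designing with web standards</cite></p>

<address>19 Mine Street, London</address>
```

None of these tags changes the layout much by default, yet each one tells a machine something that a plain span or div would not.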