HTML5 sectioning elements, headings, and document outlines

A subject I have returned to a couple of times is how to use headings to make good document outlines in HTML documents. See “Headings and document structure conclusions” for a summary of my reasoning.

Recently I’ve been taking a closer look at how HTML5 changes the way we create document outlines. I’m not entirely sure that I have understood the specification fully (it isn’t written with “authors” like me in mind after all). If I have understood how it works, I think the new outline algorithm requires you to think carefully when using the new sectioning elements (article, section, nav, and aside) if you also want a coherent document outline without untitled sections.

The HTML 4.01 outline

To explain what I mean, let’s look at some examples. Here is a simplified HTML 4.01 markup structure of a typical “document/article” page:

<body>
	<div id="header">Site title etc.</div>
	<div id="nav">
		<ul>
			<li><a href="/">Nav item</a></li>
		</ul>
	</div>
	<div id="main">
		<h1>Article title</h1>
		<p>Article content.</p>
		<h2>Article sub-heading</h2>
		<p>More content.</p>
		<h3>Article sub-sub-heading</h3>
		<p>More content.</p>
	</div>
	<div id="content-secondary">
		<h2>Sidebar heading</h2>
		<h3>Sidebar sub-heading</h3>
	</div>
	<div id="footer">
		<h2>Footer heading</h2>
		<p>Footer content.</p>
	</div>
</body>

That creates the following document outline (you can use the Web Developer extension to check document outlines):

  1. Article title
    1. Article sub-heading
      1. Article sub-sub-heading
    2. Sidebar heading
      1. Sidebar sub-heading
    3. Footer heading

You could argue that the sidebar and footer headings don’t really belong in the outline for the Article title, but let’s leave that for another discussion. The point here is that we have a clear outline of the headings on this page, with the title of the article as the top level heading. We want to keep that when converting this markup structure to HTML5.

The HTML5 outline

So let’s take the HTML 4.01 structure and change it to make use of the sectioning elements in HTML5. Most tutorials and articles I’ve seen suggest something like this:

<body>
	<header>
		Site title etc.
		<nav>
			<ul>
				<li><a href="/">Nav item</a></li>
			</ul>
		</nav>
	</header>
	<article id="main">
		<h1>Article title</h1>
		<p>Article content.</p>
		<h2>Article sub-heading</h2>
		<p>More content.</p>
		<h3>Article sub-sub-heading</h3>
		<p>More content.</p>
	</article>
	<aside>
		<h2>Sidebar heading</h2>
		<h3>Sidebar sub-heading</h3>
	</aside>
	<footer>
		<h2>Footer heading</h2>
		<p>Footer content.</p>
	</footer>
</body>

For reasons of backwards compatibility I have not changed all headings to h1 elements as suggested in the HTML5 specification, instead keeping the same levels for all headings (using h1 everywhere has no effect in this case anyway). Because of this the document outline will be identical to that in the HTML 4.01 example in a user agent that doesn’t implement the HTML5 outline algorithm. However, if you use a tool such as the HTML5 outliner that does implement said algorithm you get this outline:

  1. Footer heading
    1. Untitled NAV
    2. Article title
      1. Article sub-heading
        1. Article sub-sub-heading
    3. Sidebar heading
      1. Sidebar sub-heading

Say what? How did the footer heading become the top-level heading for this page? And what’s with the “Untitled NAV”?

The footer is now the header?

First the footer. The footer element is not sectioning content, i.e. it does not create a new section. This leaves <h2>Footer heading</h2> as the only heading in the context of the body element’s section. Since the body element is the document’s sectioning root, the outline algorithm makes that h2 the top level heading, even though it is the last heading in the document and only an h2.
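To convince myself, I reduced the page to a minimal sketch. The markup below is my own cut-down version of the example above, not something taken from the specification:

```html
<body>
	<!-- article is sectioning content, so its h1 titles the article's own section -->
	<article>
		<h1>Article title</h1>
	</article>
	<!-- footer is not sectioning content, so this h2 belongs to the body's section -->
	<footer>
		<h2>Footer heading</h2>
	</footer>
</body>
```

If I read the algorithm correctly, the body’s section is still missing a heading when the h2 is reached, so the h2 is taken as its heading and “Article title” ends up as a sub-section of “Footer heading”.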

This took me a while to wrap my head around, and it still feels quite unintuitive to me.

So how can we work around this, then? To prevent the footer’s heading from becoming the page heading we can wrap the footer element’s contents in a section element:

<footer>
	<section>
		<h2>Footer heading</h2>
		<p>Footer content.</p>
	</section>
</footer>

This really feels like a hack, but it does move the footer heading to its expected position in the document outline. The problem is that it leaves the body element without a heading, so now the outline looks like this:

  1. Untitled BODY
    1. Untitled NAV
    2. Article title
      1. Article sub-heading
        1. Article sub-sub-heading
    3. Sidebar heading
      1. Sidebar sub-heading
    4. Footer heading

nav sectioning

Now we have two untitled sections. The nav element is untitled since it is a sectioning element, and so the outline algorithm creates a new section for it. Since there is no heading in the nav element, it becomes “Untitled NAV”.

I don’t see why nav should be a sectioning element. It doesn’t seem right. Sure, navigation lists can often benefit from having a heading to label them, but since nav is a sectioning element you have to put a heading inside each nav element to avoid ugly and possibly confusing untitled sections in your outline. So let’s do that:

<nav>
	<h2>Main navigation</h2>
	<ul>
		<li><a href="/">Nav item</a></li>
	</ul>
</nav>

Avoiding “Untitled BODY”

To avoid making the body element an untitled section, you need a heading that is outside of any sectioning elements. The problem with this is that it prevents you from putting the actual content of a document inside an article element while also having its heading be the document’s top level heading.

Because of this the article element only seems useful when you have multiple articles on a single page sharing a common top level heading. Examples of this can be a home page or a blog’s archive pages. But it doesn’t seem to make sense on pages displaying a single article, such as this page.
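As a rough illustration of the multi-article case, a blog archive page might be marked up along these lines. The page title and article titles are placeholders of mine, not taken from a real site:

```html
<body>
	<!-- Heading outside any sectioning element: it titles the body's section -->
	<h1>Archive for February 2011</h1>
	<!-- Each article creates its own section, titled by its own heading -->
	<article>
		<h2>First article title</h2>
		<p>Article summary.</p>
	</article>
	<article>
		<h2>Second article title</h2>
		<p>Article summary.</p>
	</article>
</body>
```

If I have understood the algorithm, the h1 becomes the top level heading and each article becomes a titled sub-section of it, so there are no untitled sections.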

I can think of a few different ways of approaching this problem.

Avoid using any sectioning elements (no)

Forgetting about the sectioning elements removes the problem. Outlines will behave as before. The obvious drawback is that you miss out on any hypothetical advantages using these elements will give you once user agents support them. Hmm.

Repeat the article title in a heading outside any sectioning element (no)

Duplicating the main heading, then probably hiding it with CSS, just to get it in its proper place in the document outline? Probably not such a good idea, whether you look at it with your accessibility or your SEO hat on.
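For the record, a sketch of what this would involve. The visually-hidden class name is hypothetical, and you would still need CSS of your own to hide the duplicate heading:

```html
<body>
	<!-- Duplicate of the article title, outside any sectioning element,
	     so that it titles the body's section (hypothetical class name) -->
	<h1 class="visually-hidden">Article title</h1>
	<article id="main">
		<!-- The visible heading, which titles the article's own section -->
		<h1>Article title</h1>
		<p>Article content.</p>
	</article>
</body>
```

As far as I can tell the outline then lists “Article title” twice, once as the top level heading and once as its own sub-section.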

Make the site title the top level heading (no)

Some argue that the site title/logo/organisation name should always be the top level heading. I don’t share that opinion—it’s the document title that’s most important when you look at a single page. If you’re in the “my logo is my h1 and every page on my site has the same title” camp this can work. I’m not.
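For comparison, here is a sketch of what that approach could look like while still keeping the article element. It is my own variation on the example markup, with the site title wrapped in an h1:

```html
<body>
	<header>
		<!-- Outside any sectioning element, so it titles the body's section -->
		<h1>Site title etc.</h1>
	</header>
	<article id="main">
		<!-- Titles the article's own section, one level below the site title -->
		<h1>Article title</h1>
		<p>Article content.</p>
	</article>
</body>
```

As far as I can tell this outline has no untitled sections, but “Article title” is demoted to a sub-section of the site title on every page.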

Avoid using the article element on single-article pages (probably)

What about using the sectioning elements, but not wrapping the content on single-article pages in an article element, so that the article’s heading ends up in the body element’s section? Something like this:

<body>
	<header>
		Site title etc.
		<nav>
			<h2>Main navigation</h2>
			<ul>
				<li><a href="/">Nav item</a></li>
			</ul>
		</nav>
	</header>
	<div id="main">
		<h1>Article title</h1>
		<p>Article content.</p>
		<h2>Article sub-heading</h2>
		<p>More content.</p>
		<h3>Article sub-sub-heading</h3>
		<p>More content.</p>
	</div>
	<aside>
		<h2>Sidebar heading</h2>
		<h3>Sidebar sub-heading</h3>
	</aside>
	<footer>
		<h2>Footer heading</h2>
		<p>Footer content.</p>
	</footer>
</body>

That gives us the following outline:

  1. Article title
    1. Main navigation
    2. Article sub-heading
      1. Article sub-sub-heading
    3. Sidebar heading
      1. Sidebar sub-heading
    4. Footer heading

Getting close. The only issue here is that the order of the headings in the outline doesn’t match their order in the markup: “Main navigation” comes first in the markup but second in the outline. That could potentially confuse someone who navigates by headings in a user agent that presents the headings in outline order. I have no idea if this is what screen readers, for instance, will do once (if) they start supporting HTML5, so it’s very hard to tell whether it will be a problem.

Another tweak of this is to go the “site title as h1” way and change the header to this:

<header>
	<h1>Site title etc.</h1>
	<nav>
		<h2>Main navigation</h2>
		<ul>
			<li><a href="/">Nav item</a></li>
		</ul>
	</nav>
</header>

That would make the outline look like this:

  1. Site title etc.
    1. Main navigation
  2. Article title
    1. Article sub-heading
      1. Article sub-sub-heading
    2. Sidebar heading
      1. Sidebar sub-heading
    3. Footer heading

Hmm. Maybe. I still think an outline without the site title in it makes it clearer what the current page is about, so the version without an h1 around the site title is the markup structure I’m leaning towards using for single-article pages.

Are you confused too?

While writing this I get the feeling that the HTML5 outline algorithm might be a bit too complex. It could be that it takes some getting used to, of course, but at the moment it seems like it causes more problems than it solves, at least if you want to avoid untitled sections in your document outlines.

Of course it’s possible that I have misunderstood something, in which case I’d be happy if someone could correct me.

Posted on February 11, 2011 in HTML 5, Accessibility