The html tag, when expanded with a few attributes, is a real boon for increasing the accessibility of your web pages. In this, the third article in my “From the Top” series, I’ll introduce each of those attributes and explain the benefits of their use.
- The lang Attribute
- Language Tags and Localisation
- Reading Order
- Why it’s Important to Define the Language
- The xmlns Attribute
The lang Attribute
Declaring the language of a web page occurs at two levels. Firstly, the primary language of a document, which search engines may use to return only web pages in a specific language, is declared in either the Hypertext Transfer Protocol (HTTP) header or a meta tag. Your access to server settings, or your ability to perform content negotiation, will affect which method you use.
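As a rough illustration of the two document-level options described above, the sketch below shows the meta tag form, with the equivalent HTTP response header noted in a comment. The value en and the page title are assumptions for an English-language page; whether you can send the header at all depends on your server configuration.

```html
<!-- Option 1: sent by the server as an HTTP response header:
     Content-Language: en -->

<!-- Option 2: declared in the document head with a meta tag -->
<head>
  <meta http-equiv="Content-Language" content="en">
  <title>Example page</title>
</head>
```

You would normally pick one of the two. The header is the better fit if you control the server, and the meta tag is the fallback if you do not.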
The second, more specific level defines the default text-processing language in which a specific range of text is written. For Hypertext Markup Language (HTML) this is achieved using the lang attribute, and for Extensible Markup Language (XML), in our case Extensible Hypertext Markup Language (XHTML), using xml:lang. Where both are present (i.e. in backwards compatibility mode), xml:lang takes precedence.
For both flavours of markup the text-processing, or natural, language is inherited along the document hierarchy, so the html tag is the ideal place to apply your main language to the entire document. The default text-processing language can be changed further along the hierarchy by applying the lang attribute to a more specific element.
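A minimal sketch of that inheritance, assuming an English page that quotes a short French phrase (the sentences themselves are made up for illustration):

```html
<html lang="en">
  <head>
    <title>Language inheritance example</title>
  </head>
  <body>
    <!-- Inherits lang="en" from the html element -->
    <p>The chef wished us all a pleasant meal.</p>

    <!-- The span overrides the inherited language for its own content only -->
    <p>She said <span lang="fr">bon appétit</span> and left.</p>
  </body>
</html>
```

Everything outside the span is still treated as English, so a screen reader would only need to switch pronunciation for the two French words.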
Language Tags and Localisation
The value of the lang (or xml:lang) attribute is referred to as a language tag. It comprises a primary subtag optionally followed by further subtags, each separated by a hyphen. The primary subtag is a two- or three-letter language code such as en for English, de for German and fr for French. Where both a two-letter and a three-letter code exist for the same language, the two-letter code should be used. By including a further subtag the natural language of the document can be localised for dialect or region, so en-GB would identify British English text and fr-CA would indicate content written in Canadian French. Subtags are case-insensitive. There are also special-case primary subtags such as x-, but these will be outside your normal usage (unless you have a killer site for Klingons, that is), so I will leave those for you to look at another day.
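To make the structure of a language tag concrete, here is a small sketch using the tags mentioned above. The sample sentences are invented for illustration.

```html
<!-- Primary subtag only: English, no region specified -->
<p lang="en">Welcome to our site.</p>

<!-- Primary subtag plus a region subtag: British English -->
<p lang="en-GB">Our favourite colour is grey.</p>

<!-- Primary subtag plus a region subtag: Canadian French -->
<p lang="fr-CA">Bienvenue au Québec.</p>

<!-- Subtags are case-insensitive, so this identifies the same language as en-GB -->
<p lang="EN-gb">The lift is out of order.</p>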
Reading Order
It is easy to forget that whereas “western” languages are read from left to right, there are also major languages such as Arabic and Hebrew that read in the opposite direction. The reading order is not necessarily inherited from the chosen language tag, so we will add the dir attribute to the html tag and assign the value “ltr” to it. For languages read from right to left the attribute value would be “rtl”. These are the only two options; imagine the fun to be had with “ttb” (top to bottom) for authentic Japanese writing (and yes, “rtl” also).
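The following sketch shows the dir attribute set on the html tag and then overridden for a single right-to-left passage. It assumes a mostly English page containing one Arabic phrase (“hello world”), chosen only as an example.

```html
<html lang="en" dir="ltr">
  <head>
    <title>Reading order example</title>
  </head>
  <body>
    <!-- Inherits the left-to-right direction from the html element -->
    <p>This paragraph is read from left to right.</p>

    <!-- Language and direction are declared separately for the Arabic text -->
    <p lang="ar" dir="rtl">مرحبا بالعالم</p>
  </body>
</html>
```

Note that lang and dir are set side by side on the Arabic paragraph, since the direction is not assumed from the language tag alone.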
Why it’s Important to Define the Language
Declaring the text-processing, or “natural” language of a page is beneficial for many purposes:
- To assist screen readers and braille translators.
- To meet World Wide Web Consortium (W3C) Web Accessibility Initiative (WAI) guidelines – specifically checkpoints 4.1 and 4.3.
- To meet legislative requirements, for example the Disability Discrimination Act (DDA) in the UK.
- To provide authoring tools with the ability to check spelling and grammar.
- To identify the correct language of a section of text for translation tools.
- To style information in a specified language using Cascading Style Sheets (CSS).
- To filter search engine results based on the user’s language preference.
- To assist other people or devices in parsing the text of the document with XSL or some other scripting.
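One of the uses listed above, styling by language, can be sketched with the CSS :lang() pseudo-class. The style rule and sample text here are assumptions chosen for illustration, not a recommendation.

```html
<html lang="en">
  <head>
    <title>Styling by language example</title>
    <style type="text/css">
      /* Matches any element whose language is French (including fr-CA),
         whether declared on the element itself or inherited */
      :lang(fr) { font-style: italic; }
    </style>
  </head>
  <body>
    <p>The menu simply read <span lang="fr">plat du jour</span>.</p>
  </body>
</html>
```

Because the selector works from the declared language, the rule only takes effect where the lang attribute has been applied correctly.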
The xmlns Attribute
If your markup is XHTML, another attribute you must include is the xmlns declaration for the XHTML namespace. Remember that XHTML is a reformulation of HTML as an application of XML. An XML namespace is a collection of names, identified by a Uniform Resource Identifier (URI) reference, that are used in XML documents as element types and attribute names. You need to declare the namespace so that a user agent knows which elements belong to which language. The namespace is declared using the attribute xmlns followed by the URI, which for our purposes is http://www.w3.org/1999/xhtml.
To maximise the universal accessibility of our pages we should always include language information in them. We can identify the natural language of the content by using the lang attribute and/or, for XHTML, the xml:lang attribute, and we must always include the XML namespace if using XHTML. Additionally, we can specify the primary language of the document using HTTP headers or the meta tag. Examples of the opening html tag include:
For XHTML 1.0 in backwards compatibility mode:
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en" dir="ltr">
For XHTML served as XML:
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" dir="ltr">
For HTML 4:
<html lang="en" dir="ltr">
Next in “From the Top”
Next week I’ll cover the title tag and provide a few tips from around the Internet on writing effective page titles.
The Complete “From the Top” Series
- Document Type Definitions.
- MIME and Content Negotiation.
- Defining Content Language — this article.
- The Head Element.
- The Title Element.
- The Meta Element.
- The Link Element.