From the Top: The Meta Element

The meta element provides web document authors the ability to provide information about the document rather than the content itself. In this, the sixth article in my series “From the Top”, I continue to explore what we may wish to include between the head tags. This article is not a discussion of Search Engine Optimisation (SEO) techniques, although the guidelines offered may help (and ethically so) in that regard.

The Meta Element

Generally, specifying meta data means declaring a property and its value. For example, in the head of this document you will find the following meta information:

<meta name="author" content="Karl Dawson" />

In the above example, the name attribute identifies the property “author”, and the content attribute defines that property’s value. Allowable attributes include:

  • name
  • content
  • lang
  • dir
  • scheme
  • http-equiv

Whereas the name attribute may be substituted with the http-equiv attribute, the content attribute is always present in a meta tag because its value is the meta data. Also note that meta is an empty element with no closing tag: in Extensible Hypertext Markup Language (XHTML) the tag ends with />, whereas in Hypertext Markup Language (HTML) it ends with a plain >.
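
As a minimal sketch of that difference, here is the author tag from earlier written both ways. The HTML form is a transcription for illustration, not markup taken from this blog:

```html
<!-- HTML 4.01: the tag ends with a plain > and has no closing tag -->
<meta name="author" content="Karl Dawson">

<!-- XHTML 1.0: the empty element is closed with /> -->
<meta name="author" content="Karl Dawson" />
```

Whichever form you choose, use it consistently with the document type declared at the top of the page.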

The Name and Content Attributes – working as a pair

The name attribute identifies a property, and the content attribute that follows supplies that property’s value.

Popular property values include:

  • keywords
  • description
  • generator
  • author
  • copyright

A meta tag describing keywords, for example <meta name="keywords" content="keyword1, keyword2, etc">, is often included to give search engines keywords with which to find your website. However, because of widespread abuse through keyword spamming, many search engines ignore this meta tag and instead weight a page for relevancy to a search term based on the actual content of the page. Additionally, if you repeat a keyword too many times your site may be penalised and given a lower ranking. To cover your bases you might still want to include a list of keywords, but don’t get too hung up on them. Choose 15 to 20 keywords from your title, content headings and opening paragraphs to assist, but not harm, your search engine ranking.

Often accompanying a keyword meta tag, a description meta tag provides search engines and directories with a summary of the page. Write a short, unique description of about 25 words and no more than perhaps 160 to 180 characters (including whitespace). This blog uses a WordPress plugin that does this automatically by taking the first few sentences of the article, essentially the introduction. If your content management system (CMS) or blog publishing system has a similar feature then you may wish to adopt a similar approach (or, for a bespoke development, write one of course).

An example from the home page of this blog:

<meta name="description" content="personal blog on web standards and accessibility by Karl Dawson" />

The generator meta tag is needless fluff as far as I’m concerned, but for this blog I’ve left it in place; all it does is indicate what program was used to create the blog (WordPress). If anyone can suggest any benefit of this to visitors or authors (other than idle curiosity), please leave a comment below.

Author and copyright meta tags are probably useful, especially for articles and tutorials written for blogs where your name doesn’t form part of the website’s name. Again, an example from this site would look like:

<meta name="author" content="Karl Dawson" />
<meta name="copyright" content="Copyright That Standards Guy - Theme and Mods Mike Cherim - http://green-beast.com/. All rights reserved." />

Where the content attribute’s value is a Uniform Resource Identifier (URI) you may prefer to use a link tag instead of a meta tag:

<meta name="DC.identifier" content="http://www.ietf.org/rfc/rfc1866.txt">

becomes…

<link rel="DC.identifier" type="text/plain" href="http://www.ietf.org/rfc/rfc1866.txt">

The Lang and Dir Attributes — describing the meta data some more

At this point it is probably worthwhile to mention the lang and dir attributes. The dir attribute sets the text direction (ltr or rtl) of the content attribute’s value. The lang attribute helps speech synthesizers apply language-dependent pronunciation rules to that value. This would be useful, for example, if the primary language of your website was English but a page author had a French name. In this case you would modify the author meta tag as follows:

<meta name="author" lang="fr" content="Thierry Henry" />
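
The dir attribute can be sketched in the same way. Assuming a hypothetical author whose name is written in Hebrew, a right-to-left script, the tag might look something like this (the name is invented purely for illustration):

```html
<!-- lang="he" marks the value as Hebrew; dir="rtl" marks it as right-to-left text -->
<meta name="author" lang="he" dir="rtl" content="דוד כהן" />
```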

For a fuller description of these attributes be sure to read an earlier article on content language.

The Scheme Attribute

Just like the lang and dir attributes, the scheme attribute assists user agents by providing context to correctly interpret meta data. Our American cousins, for example, write their dates differently from us Brits (and everyone else too?), so “10-3-06” may mean either 10 March 2006 or 3 October 2006. To eliminate any ambiguity you would set the value of the scheme attribute to “Day-Month-Year”. For example, an article written on 10 March 2006 could have the following meta tag:

<meta scheme="Day-Month-Year" name="date" content="10-3-06" />

The scheme attribute can also provide non-critical, helpful information to user agents by giving a contextual reference to a property value.
The World Wide Web Consortium (W3C) give the following example of an International Standard Book Number (ISBN) being applied to a web page about a book:

<meta scheme="ISBN" name="identifier" content="0-8230-2355-9" />

Values for the scheme attribute depend on the property name and the associated profile.
A popular meta data profile would be that of the Dublin Core Metadata Initiative (DCMI).

“The [DCMI] is an organization dedicated to promoting the widespread adoption of interoperable meta data standards and developing specialized meta data vocabularies for describing resources that enable more intelligent information discovery systems”.
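
As a rough sketch of how a profile, property names and a scheme fit together, Dublin Core meta data is conventionally embedded along the following lines. The title and date values are placeholders rather than this blog’s actual markup, and the profile URI is the one I understand DCMI to have recommended for HTML, so check it against current DCMI guidance before relying on it:

```html
<head profile="http://dublincore.org/documents/dcq-html/">
  <title>From the Top: The Meta Element</title>
  <!-- Bind the DC and DCTERMS prefixes to their namespaces -->
  <link rel="schema.DC" href="http://purl.org/dc/elements/1.1/" />
  <link rel="schema.DCTERMS" href="http://purl.org/dc/terms/" />
  <meta name="DC.title" content="From the Top: The Meta Element" />
  <meta name="DC.creator" content="Karl Dawson" />
  <!-- The scheme says the date is in W3C date and time format (year-month-day) -->
  <meta name="DC.date" scheme="DCTERMS.W3CDTF" content="2006-03-10" />
</head>
```

Here the scheme does the same job as “Day-Month-Year” in the earlier example, but it names a published date format rather than an ad hoc one.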

The HTTP-EQUIV Attribute

The http-equiv attribute may be used in place of the name attribute to stand in for an HTTP response header. Allowable values include Content-Type, Expires, Set-Cookie, Refresh and PICS-Label.

  • Content-Type: Authors may use meta tags to define default information for a document, such as the default scripting language (Content-Script-Type), the default style sheet language (Content-Style-Type) and the document character encoding (Content-Type), with the last being the most popular of the three. Many argue that the information contained in a content-type meta tag such as <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> should be sent in a real HTTP header, but when a page is viewed offline it may be beneficial to have this information available to the user agent.
  • Expires: An http-equiv value of Expires indicates that a document will expire on the date set in the content attribute. When the date passes, the user agent should load a new copy from the server rather than use a copy from its cache.
  • Set-Cookie: This sets the name and value for a persistent cookie in the following format: <meta http-equiv="Set-Cookie" content="value=n;expires=date; path=url">, where “value” is the name of the cookie and “n” is the value to be stored in it. If expires=date; is omitted then the cookie will be deleted when the browser closes (a session cookie).
  • Refresh: Never use refresh, as it makes the page inaccessible to some users. Even if you only intend to deliver overall level A compliance, this is important and easy to achieve. If it is necessary to automatically forward a visitor to a different page, use a server-side redirect.
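
To make the list above concrete, here is a hedged sketch of a head fragment using the first three values, together with the two default-language declarations. The cookie name and value (theme=dark), the expiry date and the path are invented for illustration and do not come from this blog:

```html
<!-- Character encoding, plus default scripting and style sheet languages -->
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta http-equiv="Content-Script-Type" content="text/javascript" />
<meta http-equiv="Content-Style-Type" content="text/css" />

<!-- Ask user agents to fetch a fresh copy after this date (HTTP date format) -->
<meta http-equiv="Expires" content="Sun, 31 Dec 2006 23:59:59 GMT" />

<!-- Hypothetical persistent cookie; drop "expires=..." for a session cookie -->
<meta http-equiv="Set-Cookie" content="theme=dark; expires=Sun, 31 Dec 2006 23:59:59 GMT; path=/" />
```

Where you control the server, sending the same information as genuine HTTP headers is generally the more dependable option; the meta versions are best treated as a fallback.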

Platform for Internet Content Selection (PICS) labeling

This is an infrastructure that associates meta data with Internet content. Originally developed for parents and teachers to control what children could view on the Internet, it also allows for other uses of labeling such as intellectual property rights management. With filtering and other ratings sites using this system, it could be beneficial to include such meta data in future projects. Safe Surf and the Internet Content Rating Association (ICRA) are two such self-certifying rating systems, with the latter recommended in the UK Cabinet Office’s web developer’s handbook. It should be noted that the ICRA moved away from the use of the PICS system in 2005, but ICRAplus does still recognise such labeling.

The recommended method of distributing the PICS label is to include it in the HTTP header, but failing that you can either maintain a label bureau (on your own web server or with an external host such as Cyber Patrol) or use an http-equiv meta tag. An example label from this blog looks like:

<meta http-equiv="PICS-Label" content='(PICS-1.1 "http://www.classify.org/safesurf/" L gen true for "http://www.thatstandardsguy.co.uk" r (SS~~000 1))'>

Other meta tags

There are two other meta tags that you may also wish to consider using. The first stops the image toolbar in Internet Explorer from showing when you hover over an image. Essentially, the toolbar provides a way to expand an image to full size or to save it without opening the context menu. It can easily be argued that interfering with the user agent is the route to the dark side, but some people will want to use the tag to maintain their designed look. The second tag prevents Microsoft products from auto-generating “smart tags” on your web pages. These tags would underline key words on your web pages, much like a spell checker would, and create a link to Microsoft approved (paid for) advertising. I don’t think Internet Explorer 6 shipped with this “feature”, so if anyone can expand on this please leave a comment. Both tags are shown below:

<meta http-equiv="imagetoolbar" content="no" />
<meta name="MSSmartTagsPreventParsing" content="true" />

Conclusion

Meta data is used to describe the document itself rather than the content within. Include well-written meta tags for keywords and description for search engines and their result pages and consider other meta tags, using profiles like Dublin Core that add useful information about the web page.

Next in “From the Top”

In the next article I will be looking at how to include links to other documents and files in the head section of your web pages.

Further Reading

The Complete “From the Top” Series

 


7 Responses to From the Top: The Meta Element

  1. Mike Cherim says:

    As usual, Karl, this is pretty thorough and well written. Regarding dates, I’ve read somewhere that the only way to write a date fully accepted in all corners of the world is year/month/day, numerically: 2006-03-13. Good stuff.

  2. Thierry says:

    Nicely written, but I find these two confusing:

    The meta element provides web document authors the ability to provide information about the document rather than the content itself.

    Author and copyright meta tags are probably useful, especially for articles and tutorials..

    So are they related to the document itself or its content? What name should appear in there? The name of the author of the document (designer/developer) or the name of the person who actually wrote the content (article, tutorial, whatever)?

  3. Karl Dawson says:

    Hi Thierry,

    I would use the name of the person who owns the document rather than the developer’s name for the author — especially if you (as a developer) had created a website for someone else. With self-publishing to blogs like this one then the author is most likely the developer also.

    The content and the web page (document) are naturally linked when it comes to keywords and description — a page about cats is bound to mention that in the description meta tag but meta data can be more abstract than that case when you consider embedding time stamps etc.

    In a content management system (CMS) environment within a large organisation the author meta tag would provide a means to identify someone responsible for the page and hence if a problem was reported with the content — broken link, spelling error etc. then the support team (who are most likely not responsible for content) would know who to contact to correct the page. The detail, or content, on the page is irrelevant in this example.

    Regards, Karl

  4. Usually the thing that is overlooked by most authors regarding the META element is that the default scripting and style sheet languages may be defined there.

  5. Tommy Olsson says:

    It’s worth mentioning that you cannot actually change the content type using an HTTP equivalent. You may specify the character encoding, though.

    In other words, you can’t use
    <meta http-equiv="Content-Type" content="application/xhtml+xml; charset=utf-8"/>
    to serve a document as XHTML. A user agent needs this information before the HTTP body is received, in order to select a suitable parser. It needs to go in a true HTTP header.

  6. Daryl A. Szady says:

    I use these two meta elements:

    UTC, so there isn’t any ambiguity when it was created or modified.

  7. Daryl A. Szady says:

    Sorry I didn’t reply in the correct form before:
    <meta name="date" content="2006-04-06T05:41:27.178Z" />
    <meta name="modified" content="2006-04-06T08:04:44.73Z" />