Transitioning to XHTML is the easiest step to understand, because
it's a matter of simply following rules. When you run a site through
the W3C HTML Validator, it doesn't care if the site uses table-based
layout or CSS-based layout, or if your site is accessible or not. As
long as the actual HTML code follows specific rules, your site will
validate. You'll also be happy to know that you can transition to XHTML
using FrontPage 2003 as well as Expression Web.
First, you should be aware that there are two main "flavors" of XHTML --
XHTML Strict, which is what you'll be learning here, and XHTML
Transitional. XHTML Transitional is much looser than Strict and allows
you to have older types of HTML code. But there are many good reasons
for following the Strict rules -- for example, updating your site
design will take less time in the future and your site will be more
compatible with future improvements in technology.
Testing Your Web Site
To test your web site and see how many rules it breaks, go to http://validator.w3.org/
and type the URL of one of your web pages, then click Validate. When
you get to the report, you may need to revalidate after selecting the
XHTML 1.0 Strict Doctype. Within Expression, you can also go to Tools
> Compatibility Reports. Click the green arrow in the Compatibility
Reports panel and select XHTML 1.0 Strict, then click the Check button.
There may be a lot of errors listed! Don't worry -- once you go through
these next steps, you'll have (mostly) squeaky-clean XHTML.
10 Steps to an XHTML Strict Web Site
- The first step is to make sure that you have the proper DOCTYPE defined.
The DOCTYPE is a line of code that goes at the very top of your HTML
code. It tells the browser which set of rules to follow.
- If your page does not already have a DOCTYPE, you
will have to copy and paste this code into the very first line of your
HTML code:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
- In Expression Web, you can set the program so that all new pages use
the XHTML Strict DOCTYPE. Go to Tools > Page Editor Options. Click the
Authoring tab and set the Doctype dropdown to XHTML 1.0 Strict.
- Also in Expression Web, if you have an old DOCTYPE on your page (or
don't have one at all), you can highlight it, then type Ctrl-Enter to
access the Code Snippets dialog. Select dtx1s and the correct DOCTYPE
code will automatically be added.
- Another key rule is that your tags have to be properly "nested." For example, if you have text that is both bold (<b>...</b>) and italicized (<i>...</i>), you need to make sure that one
set of tags contains the other, like layers of an onion. You can have
either <b><i>bold on the outside</i></b> or <i><b>italic on the outside</b></i>, but you can't have <b><i>confused and improper nesting</b></i>.
- If you use FrontPage's formatting tools, this will most likely not be an issue.
- You only have to worry about this if you hand-code a lot and are in
the habit of improperly nesting tags. You will have to run through your
code and fix those tags. Unfortunately the validator isn't smart enough
to tell you when tags aren't nested properly -- the error that is
generated looks exactly the same as what's discussed next…
- Next, you'll want to make sure that all of your HTML tags are properly "closed." There are two types of tags -- ones that have both an opening and closing tag that contain content in between them (such as <p>…</p>, <h1>…</h1>, <b>…</b>) and ones that are "empty," without a closing tag (such as <img ...>, <br>, and <hr>). The
rule for XHTML is that tags with content must have both opening and
closing tags, and empty tags need to have a slash before the last
bracket, like this: <img ... />. For example, the line-break tag should be written like <br /> instead of simply <br>.
- FrontPage has a bad habit of leaving the closing tag off sometimes, so you might find floating <p> tags without the </p> or floating <li> tags without the </li>.
- FrontPage doesn't automatically add the closing slash to empty
elements, either. Expression Web will add the closing slash if you add
objects and elements from the panels, but if you are used to
hand-coding HTML without that closing slash, it won't add it for you.
- In the HTML Validator, this type of error is described like this:
end tag for "element name" omitted, but OMITTAG NO was specified.
- The great news is that FrontPage 2003 and Expression Web will fix these errors for
you, relatively easily. In the Code Pane, right-click and select the
Apply XHTML Formatting Rules option. Those closing slashes and tags
will be added for you! Run your page through the validator again and
all of those errors should disappear.
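Because XHTML is XML, the nesting and closing rules above can also be spot-checked with any XML parser. The sketch below is one hedged way to do that with Python's standard xml.etree.ElementTree module. The function name wellformedness_error and the dummy <root> wrapper are assumptions made for illustration; they are not part of FrontPage, Expression Web, or the W3C validator. It only tests nesting and closing, not the full Strict rules, and because no DTD is loaded it will also complain about named entities such as &nbsp;.

```python
import xml.etree.ElementTree as ET


def wellformedness_error(fragment):
    """Return None if the fragment is well-formed XML, else the parser's message.

    Hypothetical helper: the name and the <root> wrapper are illustrative only.
    """
    try:
        # Wrap the fragment in a dummy root so several top-level tags still parse.
        ET.fromstring("<root>" + fragment + "</root>")
    except ET.ParseError as err:
        return str(err)
    return None


if __name__ == "__main__":
    samples = [
        "<b><i>bold on the outside</i></b>",           # properly nested
        "<b><i>confused and improper nesting</b></i>",  # improperly nested
        "<p>First line<br>Second line</p>",             # empty tag not closed
        "<p>First line<br />Second line</p>",           # empty tag closed
    ]
    for sample in samples:
        print(sample, "->", wellformedness_error(sample) or "well-formed")
```

As with the validator, a nesting mistake and a missing closing tag both surface as the same kind of "mismatched tag" complaint, so the reported position is only a starting point for finding the real problem.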
- Another XHTML rule is that all the HTML tags and attribute names have to be lowercase --
<p>, not <P> -- and the attribute values have to be in quotation marks.
An attribute is a property that is coded within the tag. For example,
this img tag has the src, width, height, and alt attributes defined:
<img src="image.jpg" width="300" height="200" alt="Photo" />
Both the tag name (img) and the attribute names (src, width, height, and alt) must be lowercase, and the attribute values must be surrounded with quotation marks.
- Old FrontPage code is sometimes uppercase. In FrontPage 2003, you can
make sure that the default code formatting is set properly by going to
Tools > Page Options, clicking the Code Formatting tab, and checking
"Tag names are lowercase" and "Attribute names are lowercase."
- In the HTML Validator, this type of error is described like this:
element "UPPERCASE ELEMENT NAME" undefined or
attribute "UPPERCASE ATTRIBUTE NAME" undefined
- Option 1: Go through your site by hand and fix the tags and attributes to be lowercase.
- Option 2: Let FrontPage or Expression Web reformat the HTML for you.
First make sure the default code formatting is set properly (see first
bullet point). Then, right-click in the HTML and choose Reformat HTML.
- Why would you ever choose Option 1? By using the Reformat HTML
command, FrontPage/Expression Web doesn't just make uppercase
tags/attributes lowercase, it also changes all of the indents and
carriage returns to fit its idea of what organized code should look
like. If you have a specific way of formatting your HTML code, it could
mess everything up! If you care about how your HTML code is formatted,
you would want to choose Option 1 over Option 2. If you don't care,
Option 2 is certainly less time-consuming.
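If you want the lowercase fix without losing your own indents and line breaks, a small script can sit somewhere between Option 1 and Option 2. The following is a rough sketch in Python using regular expressions; lowercase_markup and both patterns are hypothetical names written for this example, not a feature of FrontPage or Expression Web. It assumes reasonably tidy markup: it does not add missing quotation marks, it leaves valueless attributes alone, and it does not understand scripts that happen to contain tag-like text, so check the result before saving over the original file.

```python
import re

# A start or end tag: "<", optional "/", the tag name, then everything up to ">"
# (quoted attribute values may themselves contain ">").
TAG = re.compile(r"""<(/?)([A-Za-z][A-Za-z0-9]*)((?:[^<>"']|"[^"]*"|'[^']*')*)>""")

# One attribute with a value: name, "=", then a quoted or unquoted value.
ATTR = re.compile(r"""([A-Za-z_:][-\w:.]*)(\s*=\s*)("[^"]*"|'[^']*'|[^\s"'>]+)""")


def lowercase_markup(markup):
    """Lowercase tag and attribute names; leave values and whitespace untouched."""

    def fix_tag(match):
        slash, name, rest = match.groups()
        # Matching each attribute together with its value keeps the contents
        # of quoted values from being rescanned and altered.
        rest = ATTR.sub(lambda m: m.group(1).lower() + m.group(2) + m.group(3), rest)
        return "<" + slash + name.lower() + rest + ">"

    return TAG.sub(fix_tag, markup)


if __name__ == "__main__":
    print(lowercase_markup('<P>\n    <IMG SRC="Photo.JPG" ALT="Photo" />\n</P>'))
```

The DOCTYPE line and comments start with "<!" and are skipped by the tag pattern, so the uppercase DOCTYPE keyword is left as it is.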
- Don't use FrontPage "bots."
Specifically, FrontPage link bars, FrontPage DHTML scripting, FrontPage
forms processed by FrontPage Server Extensions, FrontPage hit counters,
FrontPage Photo Galleries, and many other FrontPage components will not
validate as XHTML Strict. (See the last article in this series for more
information about alternatives to using FrontPage bots.)
- A very beneficial XHTML rule is that all images need to have the
alt description defined.
This is usually a short phrase that describes the picture for
viewers who may have their images turned off or for blind people who
use a screen reader to navigate web pages. It can also help with search
engine relevancy as it allows your picture to be part of your content.
- In FrontPage, double-click on the picture to bring up
the Picture Properties dialog box. Click the General tab and enter a
short text description under the Alternative Representation > Text
field.
- In Expression Web, double-click on the picture to
bring up the Picture Properties dialog box. Enter a short text
description in the Alternate Text field.
- You can hand-code a description in the HTML by adding the alt attribute inside an image tag:
<img src="image.jpg" width="200" height="100" alt="Alternative description goes here" />
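On a large page it can be tedious to hunt for images without an alt attribute by eye. Here is a minimal sketch, assuming Python's standard html.parser module, that lists the src of every img tag with no alt attribute at all. The class and function names are made up for this example. An empty alt="" counts as present here, since this only checks that the attribute exists, not that the description is meaningful.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collects the src of every <img> tag that has no alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag and attribute names before calling this,
        # and also routes self-closed tags such as <img ... /> through here.
        attributes = dict(attrs)
        if tag == "img" and "alt" not in attributes:
            self.missing.append(attributes.get("src", "(no src)"))


def images_missing_alt(markup):
    finder = MissingAltFinder()
    finder.feed(markup)
    finder.close()
    return finder.missing


if __name__ == "__main__":
    page = '<p><img src="a.jpg" alt="Photo" /><img src="b.jpg" width="10" /></p>'
    print(images_missing_alt(page))
```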
- There are certain tags that are not allowed to be in other tags.
- There are some tags that are called "block elements." These typically
reserve a whole "row" for their content. For example, headings and
paragraphs typically have some space before and after them. Some
frequently used block elements include the headings (h1, h2, h3, h4, h5, h6), paragraphs (p), horizontal rules (hr), and blockquotes (blockquote).
Other tags are called "inline elements," which means that they show up
in the same line as the content around them. For example, bold text
doesn't create its own paragraph spacing above and below it, but shows
up in the context of the rest of the text. Some inline elements include
bold and strong text (b, strong), italicized and emphasized text (i, em), images (img), and links (a). The basic rule to remember here is that you cannot have block elements inside of inline elements. For example, you can't have <b><p>Bold paragraph</p></b>; you have to have <p><b>Bold paragraph</b></p>.
- There are also some common-sense elements that can't be nested. For example, a link element (<a href="...">Link</a>)
cannot contain another link element -- it's impossible to have a "link
inside a link." And a form element can't contain another form inside of
it.
- Finally, you can't have text or inline elements directly in the main <body> of a page. So, you can't have:
<body>Text in a page</body>
or <body><img src="photo.jpg" ... /></body>
The content instead has to be inside a block-level element. Here are several perfectly valid possibilities:
<body><p>Text in a paragraph</p></body>
<body><div>Text in a div</div></body>
<body><h1>Text in a heading</h1></body>
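The "no block elements inside inline elements" rule can be checked mechanically by remembering which tags are currently open. The sketch below assumes Python's standard html.parser module. The BLOCK and INLINE sets contain only the elements named in this article, so they are deliberately incomplete, and block_inside_inline is a hypothetical name. Treat it as an illustration of the idea rather than a replacement for the validator.

```python
from html.parser import HTMLParser

# Only the elements this article mentions; the real DTD defines many more.
BLOCK = {"p", "div", "blockquote", "hr", "h1", "h2", "h3", "h4", "h5", "h6"}
INLINE = {"a", "b", "strong", "i", "em"}
EMPTY = {"br", "hr", "img"}  # tags with no content, never kept on the stack


class BlockInInlineFinder(HTMLParser):
    """Records (inline, block) pairs where a block tag opens inside an inline tag."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK:
            for ancestor in self.open_tags:
                if ancestor in INLINE:
                    self.problems.append((ancestor, tag))
                    break
        if tag not in EMPTY:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; ignore stray closing tags.
        if tag in self.open_tags:
            while self.open_tags.pop() != tag:
                pass


def block_inside_inline(markup):
    finder = BlockInInlineFinder()
    finder.feed(markup)
    finder.close()
    return finder.problems


if __name__ == "__main__":
    print(block_inside_inline("<b><p>Bold paragraph</p></b>"))
    print(block_inside_inline("<p><b>Bold paragraph</b></p>"))
```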
- Ampersands have to be represented by &amp;
in XHTML Strict. This includes ampersands that you want to show up on a
page and ampersands inside of hyperlink URLs. It does not include
ampersands that are used for other "special characters" such as the
&copy; code to display the copyright symbol, or ampersands that
are within an area of script.
- If you have already run the Apply XHTML Formatting command, this problem has most likely already been fixed!
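If you prefer to fix ampersands yourself, the rule "escape every ampersand that is not already the start of a special-character code" fits in one regular expression. This is a hedged sketch in Python; escape_bare_ampersands is a name invented for the example. It does not know where scripts begin and end, so it should only be run on ordinary text and URLs, not on blocks of Javascript.

```python
import re

# An ampersand that is NOT followed by a complete character code such as
# "copy;", "#169;" or "#xA9;".
BARE_AMPERSAND = re.compile(r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)")


def escape_bare_ampersands(text):
    """Replace bare ampersands with &amp; and leave existing codes alone."""
    return BARE_AMPERSAND.sub("&amp;", text)


if __name__ == "__main__":
    print(escape_bare_ampersands('<a href="page.asp?id=1&cat=2">Tom & Jerry</a>'))
    print(escape_bare_ampersands("&copy; 2007 Tom &amp; Jerry"))
```

Because codes that are already written correctly are skipped, running the function twice on the same text gives the same result as running it once.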
- This next rule has to do with blocks of scripting or styles. If you have blocks of stylesheet info or Javascript in your site, the "old school" way was to use the comment tags (<!-- and -->) like this:
<style type="text/css">
<!--
stylesheet code went here
-->
</style>
<script type="text/javascript">
<!--
scripting went here
-->
</script>
In XHTML Strict, however, you should replace the comment tags with this instead:
<style type="text/css">
<![CDATA[
stylesheet code went here
]]>
</style>
<script type="text/javascript">
<![CDATA[
scripting went here
]]>
</script>
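This may not work in certain browsers, however, so you should test your
site thoroughly. (When a page is served as ordinary HTML, browsers
treat the CDATA markers as part of the style or script itself. A common
workaround is to wrap each marker in a comment, such as /*<![CDATA[*/
and /*]]>*/.) It may be worth it to only use external stylesheets
and external javascript files (in a .css or .js file) to avoid the
issue altogether.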
- Here comes the most difficult part of XHTML 1.0 Strict compliance, and
the big reason why people with older sites might choose to throw them
out the window and start over from scratch (or with a template that is
already XHTML-compliant). Basically, there are several attributes from old HTML that simply can't be used in XHTML Strict. Fixing
these issues is sometimes easy, but often you will need to
integrate a stylesheet in order to resolve the issue. Below, I've
listed some of the more common issues that you may run into and a few
ideas for how to fix them.
- Errors: there is no attribute "language" or required attribute "type" not specified
This affects Javascript tags.
- Wrong:
<script language="javascript">
- Right:
<script type="text/javascript">
- Errors: there is no attribute "topmargin" (or leftmargin, marginwidth, marginheight)
This is often found in older sites in the <body> tag.
- Wrong:
<body topmargin="0" marginwidth="0" marginheight="0" leftmargin="0">
- Right:
HTML code: <body>
Stylesheet code: body { margin: 0px; padding: 0px; }
- Error: there is no attribute "border"
This affects the <img> tag.
- Wrong:
<img src="picture.jpg" border="0" width="10" height="10" alt="Picture" />
- Right:
HTML code: <img src="picture.jpg" width="10" height="10" alt="Picture" />
Stylesheet code: img { border: 0px; }
- Error: there is no attribute "hspace" (or "vspace")
This also affects the <img> tag.
- Wrong:
<img src="picture.jpg" hspace="5" width="10" height="10" alt="Picture" />
- Right:
In the stylesheet, you can create a class that adds margins, then apply
it to the image. For example, the stylesheet may have this code that
creates a class with a margin to the left and right of the image,
instead of using the hspace attribute:
.imagewithmargin { margin-left: 5px; margin-right: 5px; }
The HTML code can then apply the special class to the image:
<img src="picture.jpg" class="imagewithmargin" width="10" height="10" alt="Picture" />
- Error: there is no attribute "align"
The align attribute has in the past been used for paragraphs, headings, tables, images, and divs.
- Wrong:
<p align="right"> or <img align="left" … />
- Right:
Create custom classes in the stylesheet for text or image alignment and
apply the classes in the HTML as needed. Below is the CSS code:
.textright { text-align: right; } /* aligns text to the right */
.floatleft { float: left; } /* applied to an image, it "sticks" to the left and text wraps around it */
The HTML code could then look like this instead:
<p class="textright">
<img class="floatleft" ... />
- Error: there is no attribute "name"
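The name attribute is often used on the <form> tag. XHTML Strict does
not allow it there, and the id attribute should be used instead. (Form
fields such as <input>, <select>, and <textarea> keep their name
attributes, which are still needed to submit their values.) This may
require some modification of your Javascript, as well.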
- Wrong:
<form name="loginform">
- Right:
<form id="loginform">
- Error: there is no attribute "height" (or "width" or "background")
This error mainly applies to tables and table cells. Tables can have
widths, but cannot have heights. Table cells can't have either widths
or heights. Neither can use the background attribute to define a
background image, and the bgcolor attribute for background colors is
not part of XHTML Strict either (it is only allowed in XHTML
Transitional). However, you can use an inline style to set widths,
heights, and backgrounds of table cells. It is definitely not
recommended to ever set the height for a table.
- Wrong:
<table width="100" height="200">
<tr>
<td width="100" height="100" background="image.gif">Table cell</td>
</tr>
</table>
- Right:
<table width="100">
<tr>
<td style="width: 100px; height: 100px; background: url('image.gif');">Table cell</td>
</tr>
</table>
- Recommended:
Use stylesheets and custom classes to format background colors,
background images, and other attributes of tables and table cells.
Right now, it's okay to use cellpadding, cellspacing, and border
to format tables. However, you can do all the same things using
stylesheets, so you might as well learn how to do it now so that it's
easier to make modifications in the future. Here's how:
- Instead of:
<table border="1" bordercolor="#ff0000" cellspacing="0" cellpadding="3">
- Do this:
Add these custom classes in the stylesheet:
.redborder { border: solid 1px #ff0000; border-spacing: 0px; border-collapse: collapse; } /* replaces cellspacing and borders */
.redborder td { padding: 3px; } /* replaces cellpadding */
Then in the HTML code, just add a class to the table:
<table class="redborder">
Doesn't that look cleaner?
- Error: there is no element "font"
The <font> tag, which has often been used to define text colors, fonts,
and more, is officially Out. All text formatting should be done using a
stylesheet. Below is one example, but understanding how to create
stylesheets will be covered in a separate article.
- Instead of:
<p><font color="#ff0000">This is red text!</font> This is not.</p>
- Do this:
Stylesheet code has a custom class:
.red { color: #ff0000; }
HTML code references the class:
<p><span class="red">This is red text!</span> This is not.</p>
- This, by the way, means that you shouldn't use FrontPage's
font-formatting toolbar for picking fonts, font sizes, or colors, as
FrontPage 2003 and earlier will automatically generate <font> tags.
Instead, use CSS and create a custom class with the properties that
you're looking for. (Expression Web will create a custom class for you
when you use the formatting toolbar, so that's okay!)
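Before starting on the fixes in this last step, it can help to get a quick list of how much old-style markup a page contains. The sketch below assumes Python's standard html.parser module and flags only the attributes and the <font> tag discussed in this step. The lookup table and the names FLAGGED_ATTRS, ALIGN_TAGS, and find_legacy_markup are illustrative assumptions, not the full XHTML Strict rule set, so the W3C validator remains the final word.

```python
from html.parser import HTMLParser

# Attributes discussed in this article, keyed by the tag they appear on.
FLAGGED_ATTRS = {
    "script": {"language"},
    "body": {"topmargin", "leftmargin", "marginwidth", "marginheight"},
    "img": {"border", "hspace", "vspace"},
    "form": {"name"},
    "table": {"height", "background"},
    "td": {"width", "height", "background"},
}
# Tags on which the article says the align attribute was commonly used.
ALIGN_TAGS = {"p", "div", "table", "img", "h1", "h2", "h3", "h4", "h5", "h6"}


class LegacyMarkupFinder(HTMLParser):
    """Collects (tag, attribute) pairs; attribute is None for a <font> tag."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag == "font":
            self.problems.append(("font", None))
        flagged = set(FLAGGED_ATTRS.get(tag, set()))
        if tag in ALIGN_TAGS:
            flagged.add("align")
        for name, _value in attrs:
            if name in flagged:
                self.problems.append((tag, name))


def find_legacy_markup(markup):
    finder = LegacyMarkupFinder()
    finder.feed(markup)
    finder.close()
    return finder.problems


if __name__ == "__main__":
    page = (
        '<body topmargin="0"><p align="right">'
        '<font color="#ff0000">Red</font></p></body>'
    )
    for tag, attribute in find_legacy_markup(page):
        print(tag, attribute or "(element not allowed)")
```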
This article doesn't cover every single error or problem that you
may encounter, but you've certainly learned the basics and most
frequent issues when moving to XHTML Strict!
Is There a Shortcut?
You may be wondering if there is an easier way to do all of the above. The answer is yes:
- You can use the Dreamweaver program to convert your pages to
XHTML. There is documentation on the internet and within Dreamweaver if
you would like to learn more. The nice thing is that if Dreamweaver
comes across something that it can't fix, it logs it in a report so
that you can go through it and fix it yourself.
- There is a program called HTML Tidy that you can download for free at http://sourceforge.net/projects/tidy.
However, this is NOT a typical "Windows program" with easy-to-follow
dialog boxes. You will definitely have to read the documentation and
support information before attempting to use it (http://tidy.sourceforge.net/).
Good News for the Future
The good news is that Expression Web will generally help you to
create XHTML Strict code, so it's fixing old sites that can be
time-consuming. If you accidentally leave off a closing tag, Expression
Web will highlight code so that you realize that something is wrong.
(Unfortunately it isn't always smart enough to highlight the right
code, but the problem will usually be nearby.) Also, if you use an
attribute that doesn't exist anymore in XHTML Strict, Expression Web
will put a squiggly red underline, similar to how Microsoft Word points
out spelling problems.
PixelMill has many templates that have been pre-validated to be
XHTML 1.0 Strict (at least the base layout part!). Applying the things
from this article should help you to keep them validated!
In the next article, I'll explain what it means to have semantic markup, and why it's a good thing...