XHTML and How to Write It
Because CSS is a mechanism for styling XHTML, you can't start using CSS until you have a solid grounding in XHTML. And what, exactly, is XHTML? XHTML is a reformulation of HTML as XML. Didja get that? Put (very) simply, XHTML is based on the free-form structure of XML, where tags can be named to actually describe the content they contain; for example, <starname>Cher</starname>. This very powerful capability of XML means that when you develop your set of custom tags for your XML content, you also must create a second document, known as a DTD (document type definition) or a similarly formatted XML schema, to explain to the device that is interpreting the XML how to handle those tags. XML has been almost universally adopted in business, and the fact that the same X (for eXtensible) is now in XHTML emphasizes the unstoppable movement toward the separation of presentation and content.
The rest of this chapter is dedicated to the latest, completely reformulated, totally modern, and altogether more flexible version of HTML. Ladies and gentlemen, please welcome … XHTML!
XHTML Markup Rules
Correctly written XHTML markup gives you the best chance that your pages will display correctly in a broad variety of devices for years to come. The clean, easy-to-write, and flexible nature of XHTML produces code that loads fast, is easy to understand when editing, and prepares your content for use in a variety of applications.
You can easily determine if your site complies with Web standards: if your markup is well-formed and valid XHTML, and your style sheet is valid CSS, then it will comply. (Whether it's well designed or not is a rather more subjective matter, but we will consider that as we go along.) Well formed means that the XHTML is structured correctly, according to the markup rules described in this chapter. Valid means the markup contains only XHTML, with no meaningless tags, tags that are not closed properly, or deprecated (phased out, but still operational) HTML tags.
You can check to see if your page meets these criteria by uploading the page onto a server and then going to http://validator.w3.org and entering the page's URL. Press Submit, and in a few seconds you are presented with either a detailed list of the page's errors or the very satisfying "This Page Is Valid XHTML!" message (Figure 1.2). CSS can be validated in the same way at http://jigsaw.w3.org/css-validator.
Figure 1.2. If your site complies with Web standards, you'll get the ever-gratifying This Page Is Valid XHTML message from the W3C validator.
Here's the complete (and mercifully short) list of the coding requirements for XHTML compliance:
Entities not only help avoid parsing errors like the one just mentioned, but they also enable certain symbols to be displayed at all, such as &copy; for the copyright symbol (©). Every symbolic entity begins with an ampersand (&) and ends with a semicolon (;). Because of this, you probably aren't surprised to find out that XHTML regards ampersands in your code as the start of entities, and so you must also encode ampersands as entities when you want them to appear in your content; the ampersand entity is &amp;.
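As a minimal sketch of the entity rules above, the following fragment encodes an ampersand, a left angle bracket, and a copyright symbol. The paragraph text is made up purely for illustration.

```html
<!-- Hypothetical content: each special character is written as an entity -->
<p>Fish &amp; chips cost &lt; 5 pounds. &copy; 2004 Example Cafe</p>
<!-- A browser renders this as: Fish & chips cost < 5 pounds. © 2004 Example Cafe -->
```

Writing the raw characters instead would risk the parser treating the ampersand as the start of an entity and the angle bracket as the start of a tag.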
A good rule of thumb is that if a character you want to use is not printed on the keys of your keyboard (such as é, ®, ©, or £), you need to use an entity in your markup. HTML 4 defines roughly 250 named entities, and numeric character references (such as &#233; for é) reach the many thousands of other characters that encompass the character sets of most of the world's major languages. You can find a list of the commonly used entities at the Web Design Group site (www.htmlhelp.com/reference/html40/entities).
And those are the rules of XHTML markup; they are relatively simple, but you must follow them exactly if you want your pages to validate (and you do).
Understanding Markup
Here is a sample unstyled but valid XHTML page that illustrates the rules of XHTML (Figure 1.3):
Figure 1.3. This unstyled but valid XHTML isn't visually interesting, but it is definitely usable.
The page isn't pretty, but it is certainly usable. And this page's markup is lean and simple. There is no presentational code, and this XHTML passes muster with the W3C HTML validator. In Chapter 3, I'll begin teaching you how to turn this unstyled markup into a more attractive-looking page using CSS. Now let's get into more detail on the XHTML rules by taking a look, line by line, at the markup that created the page shown in Figure 1.3.
LINES 1 - 2
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
Here the DOCTYPE is set to XHTML 1.0 Strict. In this case, you're indicating that the code will be interpreted as pure, non-backward-compatible XHTML. I focus on the strict DOCTYPEs throughout this book, which means I do not use any deprecated HTML. If you need to support deprecated HTML tags such as frames, you need a different DOCTYPE (see "XHTML Markup Rules," #1 earlier in this chapter).
LINE 3
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
Next is the opening html tag, which did not have attributes in the past. Now it has a URL that points to the namespace (the collection of element and attribute names) of this document. As mentioned earlier in "XHTML Markup Rules," the DOCTYPE and namespace declarations ensure that the browser understands what flavor of (X)HTML you are using, so it interprets your code as you intended.
LINE 4
<head>
This tag opens the document head. The head of your document, which is sandwiched between the head and /head tags, contains information that, with the exception of the title, is not displayed to the viewer. Besides the essential head tags I list next (Lines 5-9), optionally there can be others: meta tags can contain all kinds of information (page descriptions, keywords, author names, etc.) used by search engines and other indexing software that might visit your site.
There can also be script and style tags that contain JavaScript and CSS that relate to, and can only be used by, the page they are on.
LINE 5
<title>A Sample XHTML Document</title>
The title tag is required: an XHTML 1.0 Strict page without one fails validation, and after you read the "About Title Tags" sidebar, you will always want to write a good one.
LINES 6 - 7
<meta http-equiv="Content-type" content="text/html; charset=iso-8859-1" />
<meta http-equiv="Content-Language" content="en-us" />
These two meta tags provide information that helps the browser and server properly manage and display the page. Note that as nonenclosing tags, they are both closed with the space-slash-angle bracket construction. You should always provide character encoding information, which ensures that the browser displays the page with an appropriate character set. Here, in the first meta tag, iso-8859-1 is the code for Latin-1, the alphabet and associated symbols used in writing English and some other languages (see "XHTML Markup Rules," #3 earlier in this chapter). Declaring the language is also good practice. In the second meta tag, I state that the language is U.S. English. The language declaration does not control text direction; right-to-left display for languages such as Hebrew or Arabic is set separately with the dir attribute.
LINES 8 - 9
<link href="demo_styles.css" rel="stylesheet" type="text/css" />
</head>
The link tag links the XHTML markup to a CSS style sheet, which is a separate file located using the href attribute. (I show you how to create a linked CSS style sheet later in this book; the file does not exist yet, so in this case the browser does not find it and simply ignores this line.) The link tag isn't required, but linking is how you relate a style sheet to your markup, and by adding the same style sheet link to each page of your site, you can enable the pages to all share the same set of styles.
You can also use the @import rule to link to a style sheet, and I'll show you both of these linking methods, and when you might use one or the other or both, later in the book. Make sure you close the document head using the /head tag.
LINE 10
<body>
Start the document body. The body contains the content that displays on your page.
LINE 11
<!--header-->
This is a comment. It is not displayed; it is just here to make the code more understandable. Note that in XHTML you can only use two dashes, instead of the unlimited number allowed by HTML, at the start and end of each comment.
LINE 12
<div id="logo">
<img src="logo_area.jpg" width="150" height="80" alt="Stylin logo" />
Divs divide the page into rectangular, box-like areas. These areas are invisible unless you turn their borders on or color their backgrounds. This div tag has an id attribute with the value of "logo"; you can use this ID name to target CSS styles at this div to set its position, size, background color, and much more. Furthermore, the div allows you to position all the content within it as a group and target styles at each of the tags it contains.
The logo image tag (img) is a nonenclosing element and is therefore closed with a slash before the closing angle bracket. Note the alt attribute, whose text displays if the graphic doesn't load and is spoken by a screen reader. You must put an alt attribute on every image, even if the value is "" (that is, two quotes with nothing, not even a space, in between). Only use an empty value if the image serves no informational purpose. You could omit the alt attributes entirely, but an XHTML validator flags any img tag that lacks one, and doing so isn't user friendly and does not aid accessibility. Note that all attribute values (such as the 150 and 80 in this example) must now be in quotes. Yes, really.
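To illustrate the alt rules just described, here is a sketch with one informative image and one purely decorative image. The file name divider.gif and its dimensions are hypothetical; logo_area.jpg comes from the sample page.

```html
<!-- Informative image: the alt text describes what the image conveys -->
<img src="logo_area.jpg" width="150" height="80" alt="Stylin logo" />
<!-- Purely decorative image (hypothetical file): empty alt, so screen readers can skip it -->
<img src="divider.gif" width="400" height="2" alt="" />
```

Both tags carry an alt attribute and quoted attribute values, so neither should trip the validator on those counts.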
LINES 13 - 15
<h3>a New Riders book by Charles Wyke-Smith</h3>
</div>
<!--end header-->
A level 3 heading is a block-level element and therefore it occurs on a new line, or more precisely, under the previous element. No br / tags are required. Remember to close the header division using the /div tag and add a comment noting that the header ends here.
LINES 16 - 20
<!--main content-->
<div class="contentarea">
<h1>Moving to XHTML</h1>
<p>Creating XHTML compliant pages simply requires following a few simple rules. These rules may seem counter-intuitive or just a lot of extra work at first, but the benefits are significant and actually make coding sites much easier. Also, XHTML code can be easily validated online, so you can be sure your code is correctly written.</p>
<p>Here are the key requirements for successful validation of your XHTML code.</p>
Now the content area starts with a div, which is a block-level element. The main header is a level 1 heading. Next are two paragraphs. Paragraph tags, like all enclosing tags, must be closed with a closing tag that begins with a forward slash; in this case, /p. Note that paragraphs are block-level elements and have a default amount of space around them, top and bottom.
LINES 21 - 31
<ol>
<li>Declare a DOCTYPE.</li>
<li>Declare an XML namespace.</li>
<li>Declare your content type.</li>
<li>Close every tag, enclosing or non-enclosing.</li>
<li>All tags must be nested correctly.</li>
<li>Inline tags can't contain block-level tags.</li>
<li>Write tags in lowercase.</li>
<li>Attributes must have values and must be quoted.</li>
<li>Use encoded equivalents for left brace and ampersand.</li>
</ol>
This is an ordered list; each list item has a number by default. (Unordered lists (ul) have bullets by default rather than numbers.)
LINE 32
<a href="more.htm">more about these requirements</a>
This is a hyperlink to a page named more.htm in the same folder as the current page.
LINES 33 - 34
</div>
<!--end main content-->
This closes the content area div.
The comment is, of course, optional.
LINES 35 - 37
<!--navigation-->
<div id="navigation">
<p>Here are some useful links from the web site of the <acronym title="World Wide Web Consortium">W3C</acronym> (World Wide Web Consortium), the guiding body of the web's development.</p>
It's good practice to style acronyms in a way that differentiates them from the text around them. Internet Explorer does not provide any default styling for acronyms; Safari will put them in italics (such as in Figure 1.3). If you add a title attribute to an acronym, a tool tip containing the text from the title attribute pops up when a user mouses over it. It's also good practice to indicate the tool tip's availability by underlining the acronym with a dotted line; this is achieved by styling the acronym element with a dotted border-bottom. Don't make the underline solid, which by convention would indicate the text is a link. These same markup techniques can also be applied to the abbr (abbreviation) tag.
LINES 38 - 45
<ul>
<li><a href="http://validator.w3.org">W3C's XHTML validator</a></li>
<li><a href="http://jigsaw.w3.org/css-validator/">W3C's CSS validator</a></li>
<li><a href="http://www.w3.org/MarkUp/">XHTML Resources</a></li>
<li><a href="http://www.w3.org/Style/CSS/">CSS Resources</a></li>
</ul>
</div>
<!--end navigation-->
This navigation aid is constructed as a list in which each list item is a link. All of this is inside a div block with an ID that enables you to reference it accurately from the style sheet. Note that there is no line break (which, for you purists, is purely presentational markup) at the end of each link; none is needed. By default, links appear in a row because they are inline elements, but here, because they are contained within list items, which are block-level elements, they display stacked.
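Returning to the acronym in lines 35 - 37, the dotted underline suggested there could be sketched as follows, placed in the document head. The 1px border width and the cursor declaration are assumptions for illustration; the sample page itself specifies neither.

```html
<style type="text/css">
/* A dotted (not solid) underline hints that a tool tip is available */
acronym, abbr {
  border-bottom: 1px dotted;
  cursor: help; /* assumption: a help cursor reinforces the hint */
}
</style>
```

Keeping the border dotted rather than solid preserves the convention that a solid underline means a link.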
LINES 46 - 50
<!--footer-->
<div id="homepagefooter">
<p>&copy; 2004 Charlie Wyke-Smith and New Riders.</p>
</div>
<!--end footer-->
The last element of the page is a div that contains the footer text inside a paragraph tag.
LINES 51 - 53
</body>
</html>
<!--end of sample doc-->
Now you just close out the body and the page, and you're done. Any questions? No? Good! Moving right along . . .
Document Hierarchy: Meet the XHTML Family
OK, the document hierarchy is one more important concept you need to understand before you can get to CSS. The document hierarchy is like a family tree or an organizational chart based on the nesting of a page's XHTML tags. A good way to learn to understand this concept is to take a snip of the body section of the markup we just discussed and strip out the content so that you can better see the organization of the tags. Here's the stripped-down header:
<body>
<!--header - this is just a comment, not code-->
<div id="logo">
  <img />
  <h3> </h3>
</div>
<!--end header - remaining tags removed here for clarity-->
</body>
Now you can clearly see the relationships of the tags; for example, in the markup, you can see that the body tag contains (or nests) all the other tags. You can also see that the div tag (with the ID of "logo") contains two tags: an img tag and an h3 tag. Figure 1.4 shows another way to represent this structure: a hierarchy diagram.
Figure 1.4. You can clearly see the hierarchical structure in this diagram.
When examining this hierarchical view, we can say that both the img tag and the h3 tag are the children of the div tag, because it is the containing element of both. In turn, the div tag is the parent tag of both of them, and the img tag and the h3 tag are siblings of one another because they both have the same parent tag. Finally, the body tag is an ancestor tag of the img and h3 tags, because they are indirectly descended from it.
In the same way, the img and h3 tags (and the div, for that matter) are descendants of the body tag. To quote Sly Stone: "It's a family affair . . . " In CSS, you write a kind of shorthand based on these relationships; for example:
div#logo img {some CSS styling in here}
Such a CSS rule only targets img tags inside of (descended from) the div with the ID of "logo" (the # is the CSS symbol for an ID). This rule means "any image that is descended from the div with an ID of 'logo'"; other img tags in the page are unaffected by this rule because they aren't contained within the "logo" div. In this way, you can add a border around just this image or set its margin to move it away from surrounding elements. We will get into learning to write CSS rules like this in great detail in the next chapter, but the important concept to understand is that every element within the body of your document is a descendant of the body tag, and, depending on its location in the markup, the element could be an ancestor, a parent, a child, or a sibling to other tags in the document hierarchy. By creating rules that use (and often combine) references to IDs, classes, and the hierarchy structure, you have the means by which you can accurately dictate which CSS rules affect which XHTML elements, and this is exactly what you will learn to do next.
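As a rough sketch of how such a rule plays out, the fragment below pairs the div#logo img selector with two images. The border declaration and the second image, photo.jpg, are hypothetical additions for illustration; in a full page the style element would sit in the document head.

```html
<style type="text/css">
/* Targets only img tags descended from the div with the ID "logo" */
div#logo img { border: 1px solid #999; }
</style>

<div id="logo">
  <!-- Descended from div#logo, so this image gets the border -->
  <img src="logo_area.jpg" width="150" height="80" alt="Stylin logo" />
</div>
<div class="contentarea">
  <!-- Hypothetical image outside div#logo: unaffected by the rule -->
  <img src="photo.jpg" width="200" height="150" alt="Sample photo" />
</div>
```

Because the selector names the ancestor as well as the tag, neither image needs a class or ID of its own to be styled (or left alone).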