1.1 The Web's Fall from Grace
Back in the dimly remembered, early years of the Web (1990-1993), HTML was a
fairly lean language. It was composed almost entirely of structural
elements that were useful for describing things like paragraphs,
hyperlinks, lists, and headings. It had nothing even remotely
approaching tables, frames, or the complex markup we assume is a
necessary part of creating web pages. The general idea was that HTML
would be a structural markup language, used to describe the various
parts of a document. Very little was said about how those parts
should be displayed. The language wasn't concerned
with appearance. It was just a clean little markup scheme.
Then came Mosaic.
Suddenly, the power of the World Wide Web was obvious to almost
anyone who spent more than 10 minutes playing with it. Jumping from
one document to another was no harder than pointing the mouse cursor
at a specially colored bit of text, or even an image, and clicking
the mouse button. Even better, text and images could be displayed
together, and all you needed to create a page was a plain-text
editor. It was free, it was open, and it was cool.
Web sites began to spring up everywhere. There were personal
journals, university sites, corporate sites, and more. As the number
of sites increased, so did the demand for new HTML elements that
would each perform a specific function. Authors started demanding
that they be able to make text boldfaced or italicized.
At the time, HTML wasn't equipped to handle those
sorts of desires. You could declare a bit of text to be emphasized,
but that wasn't necessarily the same as being
italicized—it could be boldfaced instead, or even normal text
with a different color, depending on the user's
browser and her preferences. There was nothing to ensure that what
the author created was what the reader would see.
As a result of these pressures, markup elements like
<B> and <I> started
to creep into the language. Suddenly, a structural language started
to become presentational.
1.1.1 What a Mess
Years later, we have inherited the problems of this haphazard
process. Large parts of HTML 3.2 and HTML 4.0, for example, were devoted to
presentational considerations. The ability to color and size text
through the font element, to apply background
colors and images to documents and tables, to use
table attributes (such as cellspacing), and to make text blink on and off
are all the legacy of the original cries for "more control!"
For an example of the mess in action, take a quick glance at almost
any corporate web site's markup. The sheer amount of
markup in comparison to actual useful information is astonishing.
Even worse, for most sites, the markup is almost entirely made up of
tables and font elements, none of which conveys
any real semantic meaning to what's being presented.
From a structural standpoint, these pages are little better than
random strings of letters.
For example, let's assume that for page titles, an
author is using font elements instead of heading
elements like h1:
<font size="+3" face="Helvetica" color="red">Page Title</font>
Structurally speaking, the font tag has no
meaning. This makes the document far less useful. What good is a
font tag to a speech-synthesis browser, for
example? If an author uses heading elements instead of
font elements, though, the speaking browser can
use a certain speaking style to read the text. With the
font tag, the browser has no way to know that the
text is any different from other text.
Why do authors run roughshod over
structure and meaning this way? Because they want readers to see the
page as they designed it. To use structural HTML markup is to give up
a lot of control over a page's appearance, and it
certainly doesn't allow for the kind of densely
packed page designs that have become so popular over the years. But
consider the following problems with such a roughshod approach:
Unstructured pages make content indexing inordinately difficult. A
truly powerful search engine would allow users to search only page
titles, or only section headings within pages, or only paragraph
text, or perhaps only those paragraphs that are marked as being
important. In order to accomplish such a feat, however, the page
contents must be contained within some sort of structural
markup, which is exactly the sort of markup most pages lack. Google, for
example, does pay attention to markup structure when indexing pages,
so a well-structured page may improve your Google rank.
Lack of structure reduces accessibility. Imagine that you are blind
and rely on a speech-synthesis browser to search the Web. Which would
you prefer: a structured page that lets your browser read only
section headings so that you can choose which section
you'd like to hear more about; or a page that is so
lacking in structure that your browser is forced to read the entire
thing with no indication of what's a heading,
what's a paragraph, and what's
important? Let's return to Google: the search
engine is, in effect, the world's most active blind
user, with millions of friends who accept its every suggestion about
where to surf and shop.
Advanced page presentation is possible only with some sort of
document structure. Imagine a page in which only the section headings
are shown, with an arrow next to each. The user can decide which
section heading applies to him and click on it, thus revealing the
text of that section.
Structured markup is easier to maintain. How many times have you
spent several minutes hunting through someone else's
HTML (or even your own) in search of the one little error
that's messing up your page in one browser or
another? How much time have you spent writing nested tables and
font elements, just to get a sidebar with white
hyperlinks in it? How many linebreak elements have you inserted
trying to get exactly the right separation between a title and the
following text? By using structural markup, you can clean up your
code and make it easier to find what you're looking
for.
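To make the indexing and accessibility points above a little more concrete, here is a minimal sketch of how a program might build an outline of a page from its heading elements. It uses Python's standard html.parser module. The names HeadingCollector and outline, and both sample documents, are invented for this illustration; they are not how any particular search engine or speaking browser actually works.

```python
from html.parser import HTMLParser

# Heading elements defined by HTML; the parser reports tag names in lowercase.
HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingCollector(HTMLParser):
    """Hypothetical helper: collects the text of every heading element."""

    def __init__(self):
        super().__init__()
        self.headings = []     # (tag, text) pairs in document order
        self._open_tag = None  # heading we are currently inside, if any
        self._text = []        # text fragments seen inside that heading

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS and self._open_tag is None:
            self._open_tag = tag
            self._text = []

    def handle_data(self, data):
        if self._open_tag is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == self._open_tag:
            self.headings.append((tag, "".join(self._text).strip()))
            self._open_tag = None


def outline(markup):
    """Return the list of (tag, text) headings found in the markup."""
    parser = HeadingCollector()
    parser.feed(markup)
    parser.close()
    return parser.headings


# Two made-up pages that look similar on screen but differ in structure.
structured = (
    "<h1>Page Title</h1><p>Some text.</p>"
    "<h2>Section One</h2><p>More text.</p>"
)
presentational = (
    '<font size="+3" face="Helvetica" color="red">Page Title</font><br>'
    'Some text.<br><font size="+2">Section One</font><br>More text.'
)

print(outline(structured))      # [('h1', 'Page Title'), ('h2', 'Section One')]
print(outline(presentational))  # []
```

The structured page yields a usable outline, while the font-based page yields nothing at all: as far as the program can tell, its "title" is just more text. A tool could try to guess at headings from font sizes, but that would be a heuristic layered on top of markup that never said what the text was.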
Granted, a fully structured document is a little plain. Because of that
single fact, a hundred arguments in favor of structural markup
won't sway a marketing department from using the
type of HTML that was so prevalent at the end of the 20th century,
and which persists even today. What we need is a way to combine
structural markup with attractive page presentation.