Providing Hints for Search Engines

Fact: There is absolutely nothing you can do to guarantee that your site will appear in the top 10 search results for a particular word or phrase in any major search engine (short of buying ad space from the search site, that is). After all, if there were, why couldn't everyone else who wants to be number 1 on the list do it, too? What you can do is avoid being last on the list and give yourself as good a chance as anyone else of being first.

Each search engine uses a different method for determining which pages are likely to be most relevant and should therefore be sorted to the top of a search result list. You don't need to get too hung up about the differences, though, because they all use some combination of the same basic criteria. The following list includes almost everything any search engine considers when trying to evaluate which pages best match one or more keywords. The first three of these criteria are used by almost every major search engine, and most of them also use at least one or two of the other criteria.

  • Do the keywords appear in the <title> tag of the page?

  • Do the keywords appear in the first few lines of the page?

  • How many times do the keywords appear in the entire page?

  • Do the keywords appear in a <meta /> tag in the page?

  • Do the keywords appear in the names of image files and alt text for images in the page?

  • How many other pages in the same web site link to the page?

  • How many other pages in other web sites link to the page? How many other pages link to those pages?

  • How many times have people chosen this page from a previous search list result?

  • Is the page rated highly in a human-generated directory?
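To make the first three criteria concrete, here is a minimal sketch in Python of how a page might be scored on them. It is only an illustration: the function name score_page, the 200-character cutoff for "the first few lines," and the whole-word matching rule are my own assumptions, not the method any real search engine uses.

```python
import re


def score_page(title: str, body_text: str, keywords: list[str],
               lead_chars: int = 200) -> dict[str, int]:
    """Count keyword hits for three illustrative signals.

    lead_chars is an assumed stand-in for "the first few lines of the page".
    """
    title_l = title.lower()
    body_l = body_text.lower()
    lead_l = body_l[:lead_chars]
    scores = {"in_title": 0, "in_lead": 0, "total_occurrences": 0}
    for kw in keywords:
        # Match the keyword as a whole word, ignoring case.
        pattern = re.compile(r"\b" + re.escape(kw.lower()) + r"\b")
        if pattern.search(title_l):
            scores["in_title"] += 1
        if pattern.search(lead_l):
            scores["in_lead"] += 1
        scores["total_occurrences"] += len(pattern.findall(body_l))
    return scores
```

Calling score_page with a page's title, its visible text, and a list such as ["fractal", "chaos"] returns the three counts, which makes it easy to see at a glance whether your key terms are missing from the title or the opening text.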

By the Way

Yahoo! and Ask Jeeves differ from most search engines in that real people analyze and categorize the web sites that are added to their directories.


Clearly, the most important thing you can do to improve your position is to consider what word combinations your intended audience is most likely to enter. I'd recommend that you not concern yourself with common single-word searches; the lists they generate are usually so long that trying to make it to the top is like playing the lottery. Focus instead on uncommon words and two- or three-word combinations that are most likely to indicate relevance to your topic. Make sure that those terms and phrases occur several times on your page, and be certain to put the most important ones in the <title> tag and the first heading or introductory paragraph.

By the Way

Some over-eager web page authors put dozens or even hundreds of repetitions of the same word on their pages, sometimes in small print or a hard-to-see color, just to get the search engines to sort that page to the top of the list whenever someone searches for that word. This practice is called search engine spamming.

Don't be tempted to try this sort of thing. All the major search engines are on to this practice, and immediately delete any page from their database that sets off a "spam detector" by repeating the same word or group of words in a suspicious pattern. It's still fine (and quite beneficial) to have several occurrences of important search words on a page. Make sure, however, that you use the words in normal sentences or phrases, and the spam police will leave you alone.
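Search engines don't publish the rules their spam detectors follow, so the following Python sketch is purely illustrative. It measures what share of a page's words is taken up by the single most repeated word; the function name top_word_share and the four-letter minimum (to skip words like "the") are assumptions of mine.

```python
import re
from collections import Counter


def top_word_share(text: str, min_length: int = 4) -> tuple[str, float]:
    """Return the most repeated word and its share of all counted words."""
    # Only words of at least min_length letters are counted (an assumed filter).
    words = [w for w in re.findall(r"[a-z']+", text.lower())
             if len(w) >= min_length]
    if not words:
        return ("", 0.0)
    word, count = Counter(words).most_common(1)[0]
    return (word, count / len(words))
```

On a page written in normal sentences, even an important keyword accounts for only a small share of the words. A very large share is a hint that the text reads like a word list rather than prose.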


Of all the search engine evaluation criteria just listed, the use of <meta /> tags is probably the most poorly understood. Some people rave about <meta /> tags as if using them could instantly move you to the top of every search list. Other people dismiss <meta /> tags as ineffective and useless. Neither of these extremes is true.

A <meta /> tag is a general-purpose tag you can put in the <head> portion of any document to specify some information about the page that doesn't belong in the <body> text. Most major search engines look at <meta /> tags to provide them with a short description of your page and some keywords to identify what your page is about. For example, your automatic cockroach flattener order form might include the following two tags:

<meta name="description"
content="Order form for the SuperSquish cockroach flattener." />
<meta name="keywords"
content="cockroach,roaches,kill,squish,supersquish" />

Watch Out!

Always place <meta /> tags inside the <head> section, before the closing </head> tag. By convention, they follow the <title> and </title> tags.

XHTML requires every document to include exactly one <title> in its <head> section, but it does not require <title> to be the very first tag there.


The first tag in this example ensures that the search engine has an accurate description of the page to present on its search results list. The second <meta /> tag can slightly increase your page's ranking on the list whenever any of your specified keywords are included in a search query, at least in search engines that read it.

You should always include <meta /> tags with name="description" and name="keywords" attributes in any page that you want to be indexed by a search engine. Doing so may not have a dramatic effect on your position in search lists, and not all search engines look for <meta /> tags, but it can only help.

Did you Know?

The previous cockroach example aside, search engine experts suggest that the ideal length of a page description in a <meta /> tag is in the 100- to 200-character range. For keywords, the recommended length is in the 200- to 400-character range. Experts also suggest not wasting spaces between keywords, as the cockroach example shows. And finally, don't go crazy repeating the same keywords in multiple phrases in the keywords list; some search engines will penalize you for overdoing it.
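If you'd like to check your own tags against these suggestions, a few lines of Python will do it. This is a rough sketch: the function name check_meta_lengths is hypothetical, and the ranges are simply the ones quoted above, not hard limits enforced by any search engine.

```python
def check_meta_lengths(description: str, keywords: str) -> list[str]:
    """Return warnings for values outside the suggested length ranges."""
    warnings = []
    if not 100 <= len(description) <= 200:
        warnings.append(
            f"description has {len(description)} characters (suggested: 100-200)")
    if not 200 <= len(keywords) <= 400:
        warnings.append(
            f"keywords have {len(keywords)} characters (suggested: 200-400)")
    # Spaces after commas use up characters without adding keywords.
    if ", " in keywords:
        warnings.append("keywords contain spaces after commas")
    return warnings
```

An empty list means both values fall inside the suggested ranges. The short cockroach tags would produce two warnings, one for each length.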


Did you Know?

In the unlikely event that you don't want a page to be included in search engine databases at all, you can put the following <meta /> tag in the <head> portion of that page:

<meta name="robots" content="noindex" />

This causes some search robots to ignore the page. For more robust protection from prying robot eyes, ask the person who manages your web server to include your page address in the server's robots.txt file. (She will know what that means and how to do it.) All major search spiders will then be sure to ignore your pages. This might apply to internal company pages that you'd rather not be readily available via public searches.
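If you're curious what a robots.txt rule actually does, Python's standard urllib.robotparser module can evaluate one for you. In this sketch, the /internal/ directory and the example.com addresses are made up for illustration, and is_allowed is a helper name of my own.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt that asks all robots to stay out of /internal/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /internal/
"""


def is_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Report whether the given robots.txt rules permit fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

With these rules, is_allowed(ROBOTS_TXT, "http://example.com/internal/report.html") reports that the page is off limits, while pages outside /internal/ remain open to robots. Keep in mind that robots.txt is a request that well-behaved robots honor, not a lock on the door.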


To give you a concrete example of how to improve search engine results, consider the page listed in Listing 23.1 and shown in Figure 23.1. This page should be fairly easy to find because it deals with a specific topic and includes several occurrences of some uncommon technical terms for which people interested in this subject would be likely to search. However, there are several things you could do to improve the chances of this page appearing high on a search engine results list.

Listing 23.1. A Page That Will Present Some Problems During an Internet Site Search
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
  "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
  <head>
    <title>Fractal Central</title>
  </head>

  <body style="background-image:url(fractalback.jpg); color:#003399">
    <div style="text-align:center">
      <img src="fractalaccent.gif" alt="" />
    </div>
    <div style="width:133px; float:left; padding:6px; text-align:center;
    border-width:4px; border-style:ridge">
      Discover the latest software, books and more at our online store.<br />
      <a href="orderform.html"><img src="orderform.gif" alt="Order Form"
      style="border-style:none" /></a>
    </div>
    <div style="float:left; padding:6px">
      <h2>A Comprehensive Guide to the<br />
      Art and Science of Chaos and Complexity</h2>
      <p>What's that? You say you're hearing about "fractals" and "chaos" all
      over the place, but still aren't too sure what they are? How about a
      quick summary of some key concepts:</p>
      <ol>
        <li><p>Even the simplest systems become deeply complex and richly
        beautiful when a process is "iterated" over and over, using the
        results of each step as the starting point of the next. This is how
        Nature creates a magnificently detailed 300-foot redwood tree from a
        seed the size of your fingernail.</p></li>
        <li><p>Most "iterated systems" are easily simulated on computers,
        but only a few are predictable and controllable. Why? Because a tiny
        influence, like a "butterfly flapping its wings," can be strangely
        amplified to have major consequences such as completely changing
        tomorrow's weather in a distant part of the world.</p></li>
        <li><p>Fractals can be magnified forever without loss of detail, so
        mathematics that relies on straight lines is useless with them.
        However, they give us a new concept called "fractal dimension" which
        can measure the texture and complexity of anything from coastlines to
        storm clouds.</p></li>
        <li><p>While fractals win prizes at graphics shows, their chaotic
        patterns pop up in every branch of science. Physicists find beautiful
        artwork coming out of their plotters. "Strange attractors" with
        fractal turbulence appear in celestial mechanics. Biologists diagnose
        "dynamical diseases" when fractal rhythms fall out of sync. Even pure
        mathematicians go on tour with dazzling videos of their
        research.</p></li>
      </ol>
      <p>Think all these folks may be on to something?</p>
    </div>
    <div style="text-align:center">
      <a href="http://netletter.com/nonsense/"><img src="findout.gif"
      alt="Find Out More" style="border-style:none" /></a>
    </div>
  </body>
</html>

Figure 23.1. The first part of the page shown in Listing 23.1, as it appears in a web browser.


The contents of the page in Listing 23.2 and Figure 23.2 look to a human being almost the same as those of the page in Listing 23.1 and Figure 23.1. To search robots and search engines, however, these two pages appear quite different. The following list summarizes the changes and explains why I made each modification:

  • I added some important search terms to the <title> tag and the first heading on the page. The original page didn't even include the word fractal in either of these two key positions.

  • I added <meta /> tags to assist search engines with a description and keywords.

  • I added a very descriptive alt attribute to the first <img /> tag. Not all search engines read and index alt text, but some do.

  • I took out the quotation marks around technical terms (such as "fractal" and "iterated") because some search engines consider "fractal" to be a different word from fractal. I replaced the quotation marks with the character entity &quot;, which search robots simply disregard. XHTML strictly requires the &quot; entity only inside double-quoted attribute values, but it is equally valid in body text.

  • I added the keyword fractal twice to the text in the order-form box.
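To see a page roughly the way a search robot might, you can pull out just the parts these changes touch. The following sketch uses Python's standard html.parser module; the class name SearchHintExtractor and the helper extract_hints are hypothetical, and real search robots certainly do more than this.

```python
from html.parser import HTMLParser


class SearchHintExtractor(HTMLParser):
    """Collect the title, named <meta /> tags, and non-empty alt text."""

    def __init__(self) -> None:
        super().__init__()
        self.title = ""
        self.meta: dict[str, str] = {}
        self.alt_texts: list[str] = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attr.get("name"):
            self.meta[attr["name"].lower()] = attr.get("content") or ""
        elif tag == "img" and attr.get("alt"):
            self.alt_texts.append(attr["alt"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_hints(page: str) -> SearchHintExtractor:
    """Parse page and return the extractor holding what was found."""
    parser = SearchHintExtractor()
    parser.feed(page)
    parser.close()
    return parser
```

Run against the original page, the result has a title with no subject keywords beyond "Fractal," an empty meta dictionary, and no alt text for the first image. Run against the improved page, all three are filled in.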

Figure 23.2. The first part of the page in Listing 23.2, as it appears in a web browser.


It is impossible to quantify how much more frequently people searching for information on fractals and chaos were able to find the page shown in Listing 23.2 versus the page shown in Listing 23.1, but it's a sure bet that none of the changes could do anything but improve the page's visibility to search engines. As is often the case, the improvements made for the benefit of the search spiders probably made the page's subject easier for humans to recognize and understand as well. This makes optimizing a page for search engines a win-win effort!

Listing 23.2. An Improvement on the Page in Listing 23.1
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
  "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
  <head>
    <title>Fractal Central: A Guide to Fractals, Chaos, and Complexity</title>
    <meta name="description" content="A comprehensive guide to fractal
    geometry, chaos science and complexity theory." />
    <meta name="keywords"  content="fractal, fractals, chaos science, chaos
    theory, fractal geometry, complexity, complexity theory" />
  </head>

  <body style="background-image:url(fractalback.jpg); color:#003399">
    <div style="text-align:center">
      <img src="fractalaccent.gif" alt="Fractal Central: A Guide to Fractals,
      Chaos, and Complexity" />
    </div>
    <div style="width:133px; float:left; padding:6px; text-align:center;
    border-width:4px; border-style:ridge">
      Discover the latest fractal software, books and more at the
      <span style="font-weight:bold">Fractal Central</span> online store.<br />
      <a href="orderform.html"><img src="orderform.gif" alt="Order Form"
      style="border-style:none" /></a>
    </div>
    <div style="float:left; padding:6px">
      <h2>A Comprehensive Guide to Fractal Geometry,<br />
      Chaos Science, and Complexity Theory</h2>
      <p>What's that? You say you're hearing about &quot;fractals&quot; and
      &quot;chaos&quot; all over the place, but still aren't too sure what
      they are? How about a quick summary of some key concepts:</p>
      <ol>
        <li><p>Even the simplest systems become deeply complex and richly
        beautiful when a process is &quot;iterated&quot; over and over, using
        the results of each step as the starting point of the next. This is
        how Nature creates a magnificently detailed 300-foot redwood tree from
        a seed the size of your fingernail.</p></li>
        <li><p>Most &quot;iterated systems&quot; are easily simulated on
        computers, but only a few are predictable and controllable. Why?
        Because a tiny influence, like a &quot;butterfly flapping its
        wings,&quot; can be strangely amplified to have major consequences
        such as completely changing tomorrow's weather in a distant part of
        the world.</p></li>
        <li><p>Fractals can be magnified forever without loss of detail, so
        mathematics that relies on straight lines is useless with them.
        However, they give us a new concept called &quot;fractal
        dimension&quot; which can measure the texture and complexity of
        anything from coastlines to storm clouds.</p></li>
        <li><p>While fractals win prizes at graphics shows, their chaotic
        patterns pop up in every branch of science. Physicists find beautiful
        artwork coming out of their plotters. &quot;Strange attractors&quot;
        with fractal turbulence appear in celestial mechanics. Biologists
        diagnose &quot;dynamical diseases&quot; when fractal rhythms fall out
        of sync. Even pure mathematicians go on tour with dazzling videos of
        their research.</p></li>
      </ol>
      <p>Think all these folks may be on to something?</p>
    </div>
    <div style="text-align:center">
      <a href="http://netletter.com/nonsense/"><img src="findout.gif"
      alt="Find Out More" style="border-style:none" /></a>
    </div>
  </body>
</html>

By the Way

If you've read any popular computer magazines in the past few years, you've probably found claims that XML, the "HTML of the future," will make it much easier to find what you're looking for on the Internet. You might be wondering how to get the web pages you create hooked up to this magical new searching miracle.

The good news is that XML will indeed eventually make online searching easier and more efficient. The bad news is that neither XML nor its offspring, XHTML, can make it any easier for people to find your pages today or even in the very near future. The technologies that will rely on XML to improve web searching are still under development, and may not reach the mainstream for years to come. Having said that, using XHTML and CSS in lieu of old-style HTML is a step in the right direction toward a more organized, and therefore more searchable, Web.

While you're waiting for The Next Big Thing to hit, you might want to keep an eye on future developments in web searching by stopping by the Search Engine Watch site at http://searchenginewatch.com/ every month or so.


