The Quest for Hits : The Ultimate Guide to Getting Listed with the Search Engines

Glossary of Search Engine & Directory Terms

404 Error
The error code that is returned when you attempt to access a page that no longer exists. It is generally a good idea to report pages with 404 errors to the search engines so that they stay up to date. If you find that a page that ranks ahead of yours is no longer available, resubmit that page's URL to the search engine in question; the dead listing should soon disappear, letting your own page move up.

Algorithm
The closely guarded "recipe" that each search engine uses to decide how to rank the results of a given search. Each search engine has a unique algorithm; the purpose of the algorithm is to try to second-guess what people are really looking for when they specify search terms. The closer your site fits the algorithm, the better it will place during searches. Most of the effort expended in positioning sites for search engines is concerned with trying to reverse-engineer (work out) the algorithms of those search engines.

Alliance
A relationship between two search engines, or between a search engine and another site, that increases the traffic to the search engine. For instance, Yahoo! has a deal with AltaVista whereby it sends "overflow" searches (i.e. ones that do not return any useful results) to AltaVista's much larger database. Both AltaVista and HotBot send traffic to LookSmart. Changing alliances can have a huge effect on the size (and hence the importance) of different search engines.

Base URL
The root or main URL of your site; your home or welcome page; the page where most of your visitors will start their journey. Some search engines (and most directories) will only allow you to submit one URL from any given site; make sure you submit the base URL.

Boolean expression
Logical modifiers that can be used to change the results of a search. The exact modifiers vary by search engine. Some examples are "AND" and "OR". For instance, a search for "email AND address" would return all the sites that contain both words, while a search for "email OR address" would return all the sites that contain at least one of the words. Sadly, not enough people know how to use these expressions; the results returned by search engines usually improve if Boolean expressions are used.
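To make the difference between AND and OR concrete, here is a minimal sketch in Python. The four pages in PAGES and the helper names are made up for illustration; real search engines work on far larger indexes and their exact rules vary.

```python
# Toy index: page name -> page text (hypothetical data for illustration).
PAGES = {
    "a.html": "get a free email address today",
    "b.html": "send email to your friends",
    "c.html": "find a street address",
    "d.html": "cookie recipes",
}

def words(text):
    """Split page text into a set of lowercase words."""
    return set(text.lower().split())

def search_and(term1, term2):
    """Pages containing both terms."""
    return sorted(p for p, t in PAGES.items()
                  if term1 in words(t) and term2 in words(t))

def search_or(term1, term2):
    """Pages containing at least one of the terms."""
    return sorted(p for p, t in PAGES.items()
                  if term1 in words(t) or term2 in words(t))

print(search_and("email", "address"))  # ['a.html']
print(search_or("email", "address"))   # ['a.html', 'b.html', 'c.html']
```

AND narrows the results to pages with both words; OR widens them to pages with either.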

Category
The subject area in a directory that your site appears in. Yahoo!, for instance, has tens of thousands of categories, from the top category (such as "Computers & Internet") to very specific subcategories deep within the site. The choice of the right category is a very important part of the process of listing your site; choose an irrelevant category or very high-level category and your site may never be listed, or never found. On the other hand, if you choose a category that is very deeply buried many layers into the site, none of your potential visitors may bother to dig down that far.

Crawling
The process of indexing a site using a spider or robot. Usually, the spider records a subset of all the information on the page, according to the algorithm it was programmed to follow. Some spiders will crawl all the links off a given page or site; other spiders only index the exact URL they are given.
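As a rough illustration of "recording a subset of the information on the page", the sketch below uses Python's standard html.parser module to keep only a page's title and its outgoing links. The class name TinySpider and the sample page are hypothetical; a real spider would fetch pages over the network and follow its own engine's rules.

```python
from html.parser import HTMLParser

class TinySpider(HTMLParser):
    """Records only a subset of a page: its title and outgoing links."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Only text inside <title> is kept; body text is ignored here.
        if self._in_title:
            self.title += data

# Hypothetical page content; a real spider would download this.
PAGE = """<html><head><title>Free Email</title></head>
<body><a href="/signup.html">Sign up</a> <a href="/help.html">Help</a></body></html>"""

spider = TinySpider()
spider.feed(PAGE)
print(spider.title)  # Free Email
print(spider.links)  # ['/signup.html', '/help.html']
```

A spider that crawls a whole site would go on to visit each address in links; one that only indexes the exact URL it is given would stop here.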

Directory
A site that gathers information about hundreds (or thousands, or indeed hundreds of thousands) of other sites, broken down by category. Directories usually use the information you specify for the description of your site, rather than taking it from your site itself.

Modifier
Special characters and conventions you can use to influence the results of a search, in a similar way to Boolean expressions. For instance, a "+" in front of a search term often means "This term must appear on the page" and a "-" means "This term must not appear on the page". Other modifiers include "link:" [show me all the links to this URL], "word1 word2" [show me all the pages containing this exact expression] etc.
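The "+" and "-" modifiers can be sketched as a simple filter. This assumes one common convention ("+term" must appear, "-term" must not, unmarked terms are optional); the function name matches is made up, and individual search engines may treat these characters differently.

```python
def matches(query, page_text):
    """Return True if the page satisfies the '+' and '-' modifiers in query.

    Assumed convention: '+term' must appear, '-term' must not appear,
    and unmarked terms are optional (they do not affect matching here).
    """
    page_words = set(page_text.lower().split())
    for token in query.lower().split():
        if token.startswith("+") and token[1:] not in page_words:
            return False
        if token.startswith("-") and token[1:] in page_words:
            return False
    return True

print(matches("+email -free", "cheap email hosting"))    # True
print(matches("+email -free", "free email address"))     # False
print(matches("+email -free", "street address finder"))  # False
```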

Netscape effect
This is what I call the substantial boost search engines receive from being listed on Netscape's default search page. Although this traffic is decreasing in importance, Netscape still has a major effect on the popularity of the different search engines -- and it charges them correspondingly inflated prices for the privilege of an enhanced listing.

Rank
The position of your site in the list of results returned for a given set of search terms. For instance, a site might rank 15th for the search terms "free email" but 2nd for the terms "free email address". Your site's rank will change constantly as new sites are registered with search engines and other sites shut down; retrospective spamming can have a huge (negative) effect on your site's rank. In general, if your site does not appear within the first 15-30 results for given search terms, you are unlikely to get many visitors based on those search terms.

Relevancy
A measure of how closely your site matches the search terms, calculated according to the algorithm employed by that particular search engine. Relevancy is usually expressed as a percentage. Some search engines are not very discriminating; HotBot, for instance, often returns dozens or hundreds of pages with a relevancy of 99% or better for a given set of search terms.
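Real relevancy formulas are closely guarded, so the following is only a toy stand-in: it scores a page by the percentage of search terms that appear on it. The function name and the sample page are invented for illustration.

```python
def relevancy(search_terms, page_text):
    """Percentage of the search terms that appear on the page (toy measure)."""
    terms = search_terms.lower().split()
    page_words = set(page_text.lower().split())
    if not terms:
        return 0.0
    found = sum(1 for term in terms if term in page_words)
    return 100.0 * found / len(terms)

page = "sign up for a free email account"
print(relevancy("free email", page))                    # 100.0
print(round(relevancy("free email address", page), 1))  # 66.7
```

A crude measure like this gives every page containing all the terms the same top score, which hints at why an undiscriminating engine can return hundreds of near-perfect matches.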

Retrospective spamming
This is what I call accidental spamming that occurs when a search engine changes the rules under your feet and filters the sites in its database according to the new rules. For instance, a search engine might suddenly introduce a rule that it will only list up to 40 pages per site. You currently have 41 pages listed with that search engine -- you are suddenly labelled a spammer and your site is reduced in rank or deleted completely. There is very little you can do to guard against retrospective spamming, since by definition it comes as a surprise. However, you can at least try to restrict the number of pages you submit so that only the most relevant pages are listed; don't make use of any practices that could easily be seen as spamming, and of course keep a regular eye on your ranking in case you suddenly vanish from the search engine!

Robot
A robot or 'bot is a small software program that crawls the Net collecting information about sites. See spider.

Robot exclusion file
A special text file, usually named "robots.txt", that controls how robots will treat your site. You can specify certain files or directories that you do not want robots to index, or you can even block specific robots from indexing certain areas of your site. This is very useful for robots such as the Excite Spider that crawl ALL the pages of your site; without any kind of blocking information, they may end up indexing temporary pages, stats pages and all sorts of other cyber-driftwood that has accumulated around your site.
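Here is a small sketch of what such a file might look like and how a well-behaved robot would read it, using Python's standard urllib.robotparser module. The directory names and the robot name "ExampleBot" are hypothetical; substitute the paths and robot names that apply to your own site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: all robots are kept out of /tmp/ and
# /stats/, and a made-up robot called "ExampleBot" is blocked completely.
ROBOTS_TXT = """\
User-agent: *
Disallow: /tmp/
Disallow: /stats/

User-agent: ExampleBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

print(parser.can_fetch("AnyRobot", "/index.html"))       # True
print(parser.can_fetch("AnyRobot", "/stats/june.html"))  # False
print(parser.can_fetch("ExampleBot", "/index.html"))     # False
```

Bear in mind that the file is only a request: it works for robots that choose to honour it.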

Search engine
A database of sites built up by crawling the Net collecting information, either at the behest of site owners or through an automatic independent process.

Search terms
The word or phrase you enter into the search box on a search engine or directory. The more specific the search terms, the more relevant the results returned from the search. For example, if you are looking for the best cookie recipe you've ever tasted, a search for "recipe" is likely to be completely useless. A search for "chocolate cookie recipe" is a step in the right direction. "delicious chocolate cookies" might be a good choice too, although you're likely to get results for stores selling cookies as well. Your best bet would be "recipe for delicious chocolate cookies". You need to imagine the search terms that potential visitors are likely to use to find your site, then make sure your site ranks well for those terms on the search engines.

Spamming
The practice of cheating the search engines by exploiting their algorithms to make your site appear more relevant than it really is. For instance, if it is clear that a search engine's algorithm ranks pages in which the search terms appear several times over those where they only appear once, it is tempting to stuff your page full of these terms (often in a small font or in a colour that is hard to see against a coloured background). Spamming is a really bad idea. Every search engine has its own definition of spam; step outside the bounds they have defined and your site will be reduced in rank or deleted from the search engine altogether. In the worst-case scenario, it will be blacklisted. People usually spam either because they have followed dated (or just plain wrong!) information from a dubious promotion-related site, or because of ignorance. I reckon that only a minority of people set out with the aim of deliberately misleading the search engines, but it's a moot point, as the results will be the same. So always read the instructions before submitting your site, err on the side of caution and don't even think about cheating. Even then, you could occasionally run into trouble (see retrospective spamming).

Spider
Another word for robot.

Spidering
Another word for crawling.

Stop terms
Search terms so common that they have no effect, or even a negative effect, on your site's rank. For instance, AltaVista considers "web" to be a stop term, and will ignore the word web completely in most searches (except when combined with other expressions using a Boolean modifier). If your page is filled with stop terms, it is very unlikely to rank well.
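The effect of stop terms on a query can be sketched as a simple filter. The stop list below is hypothetical apart from "web", which is the one AltaVista example given above; each engine keeps its own list.

```python
# Hypothetical stop list; only "web" comes from the AltaVista example.
STOP_TERMS = {"web", "the", "a", "of"}

def strip_stop_terms(query):
    """Drop stop terms from a query, keeping the remaining words in order."""
    return [w for w in query.lower().split() if w not in STOP_TERMS]

print(strip_stop_terms("free web email"))  # ['free', 'email']
print(strip_stop_terms("the web"))         # []
```

The second query shows the danger: if every word on your page is a stop term, nothing is left for the search engine to match.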

VirtualPromote Forums
The best place -- after this site, of course -- to get up-to-date information about the search engines. So bookmark this page, then head over there to learn more...

Yahoo!
The #1 site on the Internet for driving traffic to your own site. A good listing here is a key part of any promotion campaign. Yahoo! is aware of its elite position, and has been consistently moving the goalposts in favour of higher quality sites. Still, if you have followed all the advice in the course so far, you're likely to sail through its listing process.

This site is part of PR2 : Free Website Promotion Course