404 Error
The error code that is returned when you attempt to
access a page that no longer exists. It is generally a
good idea to report pages with 404 errors to the search engines so that they stay up
to date. If you find that a page that ranks
ahead of yours is no longer available, always resubmit it
to the search engine in question
and it will soon disappear, letting your own page jump
up.
Algorithm
The closely guarded "recipe" that each search engine uses to decide how to
rank the results of a given search.
Each search engine has a unique
algorithm; the purpose of the algorithm is to try to
second-guess what people are really looking for when they
specify search terms. The closer
your site fits the algorithm, the better it will place
during searches. Most of the effort expended in
positioning sites for search engines
is concerned with trying to reverse-engineer (work out)
the algorithms of those search
engines.
Alliance
A relationship between two search
engines, or between a search
engine and another site, that increases the traffic
to the search engine. For
instance, Yahoo! has a deal with AltaVista whereby it
sends "overflow" searches (i.e. ones that do
not return any useful results) to AltaVista's much larger
database. Both AltaVista and HotBot send traffic to
LookSmart. Changing alliances can have a huge effect on
the size (and hence the importance) of different search engines.
Base URL
The root or main URL of your site; your home or welcome
page; the page where most of your visitors will start
their journey. Some search engines
(and most directories) will only allow you to submit one
URL from any given site; make sure you submit the base
URL.
Boolean expression
Logical modifiers that can be
used to change the results of a search. The exact modifiers vary by search engine. Some examples are
"AND" and "OR". For instance, a search
for email AND address would return all the sites that
contain both words, while a search for email OR address would
return all the sites that contain at least one of the
words. Sadly, not enough people know how to use these
expressions; the results returned by search
engines usually improve if Boolean expressions are
used.
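
To make the difference between AND and OR concrete, here is a minimal sketch in Python. The page names, the page text and the function names are all made up for illustration; real search engines work on far larger indexes and each has its own rules.

```python
# Hypothetical toy "index": page name -> page text (not from any real search engine).
PAGES = {
    "a.html": "free email address for everyone",
    "b.html": "send email to your friends",
    "c.html": "find a street address",
}


def words(text):
    """Split page text into a set of lowercase words."""
    return set(text.lower().split())


def search_and(pages, term1, term2):
    """Return the pages that contain both terms."""
    return sorted(
        name for name, text in pages.items()
        if term1 in words(text) and term2 in words(text)
    )


def search_or(pages, term1, term2):
    """Return the pages that contain at least one of the terms."""
    return sorted(
        name for name, text in pages.items()
        if term1 in words(text) or term2 in words(text)
    )
```

With this toy data, search_and(PAGES, "email", "address") matches only a.html, while search_or(PAGES, "email", "address") matches all three pages.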
Category
The subject area in a directory
that your site appears in. Yahoo!, for instance, has tens
of thousands of categories, from the top category (such
as "Computers & Internet") to very specific
subcategories deep within the site. The choice of the
right category is a very important part of the process of
listing your site; choose an irrelevant category or very
high-level category and your site may never be listed, or
never found. On the other hand, if you choose a category
that is very deeply buried many layers into the site,
none of your potential visitors may bother to dig down
that far.
Crawling
The process of indexing a site using a spider
or robot. Usually, the spider records a subset of all the
information on the page, according to the algorithm it was programmed to
follow. Some spiders will crawl all
the links off a given page or site; other spiders only index the exact URL they
are given.
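
As a rough illustration of how a spider finds the links off a page, the sketch below collects every link target from a piece of HTML using Python's standard library. The class and function names are my own, and a real spider would also fetch the pages, respect the robot exclusion file and record the page content.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the absolute URL of every <a href="..."> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the URL of the page itself.
                self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html):
    """Return the links found in html, in the order they appear."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links
```

A spider that crawls a whole site would call extract_links on each page it visits and queue the results; a spider that only indexes the exact URL it is given would skip this step.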
Directory
A site that gathers information about hundreds (or
thousands, or indeed hundreds of thousands) of other
sites, broken down by category.
Directories usually use the information you specify for
the description of your site, rather than taking it from
your site itself.
Modifier
Special characters and conventions you can use to
influence the results of a search, in a similar way to Boolean expressions. For instance, a
"+" in front of a search
term often means "This term must appear on the
page" and a "-" means "This term must
not appear on the page". Other modifiers include
"link:" [show me all the links to this URL],
"word1 word2" [show me all the pages containing
this exact expression] etc.
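
The "+" and "-" modifiers can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name is hypothetical, terms without a modifier are treated as optional, and each search engine applies its own rules.

```python
def matches(query, page_text):
    """Check a page against the "+" (must appear) and "-" (must not appear) modifiers.

    Terms without a modifier are treated as optional in this sketch.
    """
    page_words = set(page_text.lower().split())
    for token in query.lower().split():
        if token.startswith("+") and token[1:] not in page_words:
            return False  # a required term is missing
        if token.startswith("-") and token[1:] in page_words:
            return False  # an excluded term is present
    return True
```

For example, the query "+email -spam" would match a page containing "free email address" but not a page containing "email spam offers".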
Netscape effect
This is what I call the substantial boost search engines receive from being
listed on Netscape's default search page. Although this
traffic is decreasing in importance, Netscape still has a
major effect on the popularity of the different search engines -- and it charges
them accordingly inflated prices for the privilege of an
enhanced listing.
Rank
The position of your site in the list of results returned
for a given set of search terms.
For instance, a site might rank 15th for the search terms "free email"
but 2nd for the terms "free email address". Your
site's rank will change constantly as new sites are
registered with search engines
and other sites shut down; retrospective spamming
can have a huge (negative) effect on your site's rank. In
general, if your site does not appear within the first
15-30 results for given search terms,
you are unlikely to get many visitors based on those search terms.
Relevancy
A measure of how closely your site matches the search terms, calculated according
to the algorithm employed by
that particular search engine.
Relevancy is usually expressed as a percentage. Some search engines are not very
discriminating; HotBot, for instance, often returns
dozens or hundreds of pages with a relevancy of 99% or
better for a given set of search
terms.
Retrospective spamming
This is what I call accidental spamming
that occurs when a search engine
changes the rules under your feet and filters the sites
in its database according to the new rules. For instance,
a search engine might suddenly
introduce a rule that it will only list up to 40 pages
per site. You currently have 41 pages listed with that search engine -- you are suddenly
labelled a spammer and your site is reduced in rank or deleted completely. There is
very little you can do to guard against retrospective
spamming, since by definition it comes as a surprise.
However, you can at least try to restrict the number of
pages you submit so that only the most relevant pages are
listed; don't make use of any practices that could easily
be seen as spamming, and of course
keep a regular eye on your ranking in
case you suddenly vanish from the search
engine!
Robot
A robot or 'bot is a small software program that crawls the Net collecting
information about sites. See spider.
Robot exclusion file
A special text file, usually named
"robots.txt", that controls how robots will treat your site. You can
specify certain files or directories that you do not want
robots to index, or you can even
block specific robots from indexing
certain areas of your site. This is very useful for robots such as the Excite Spider that crawl ALL the pages of your site;
without any kind of blocking information, they may end up
indexing temporary pages, stats pages and all sorts of
other cyber-driftwood that has accumulated around your
site.
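
Here is a small sketch of what such a file can look like and how a well-behaved robot would read it, using the robots.txt parser from Python's standard library. The directory names, the robot name "ExampleBot" and the www.example.com address are placeholders I have made up; substitute your own.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep every robot out of two directories,
# and keep one specific (made-up) robot out of the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /tmp/
Disallow: /stats/

User-agent: ExampleBot
Disallow: /
"""


def build_parser(robots_text):
    """Parse robots.txt text the way a well-behaved robot would."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
```

With these rules, parser.can_fetch("AnyBot", "http://www.example.com/index.html") allows the home page, the same call for a page under /tmp/ is refused, and "ExampleBot" is refused everywhere. Bear in mind that the file is only a request; it relies on the robot choosing to honour it.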
Search engine
A database of sites built up by crawling the Net collecting
information, either at the behest of site owners or
through an automatic independent process.
Search terms
The word or phrase you enter into the search box on a search engine or directory. The more specific the
search terms, the more relevant the results returned from
the search. For example, if you are looking for the best
cookie recipe you've ever tasted, a search for "recipe"
is likely to be completely useless. "chocolate
cookie recipe" is a step in the right
direction. "delicious chocolate cookies"
might be a good choice too, although you're likely to get
results for stores selling cookies as well. Your best bet
would be "recipe for delicious chocolate
cookies". You need to imagine the search
terms that potential visitors are likely to use to find
your site, then make sure your site ranks
well for those terms on the search
engines.
Spamming
The practice of cheating the search
engines by making use of their algorithm to make your
site appear more relevant than it really is. For
instance, if it is clear that a search
engine's algorithm ranks pages in
which the search terms appear
several times over those where they only appear once, it
is tempting to stuff your page full of these terms (often
in a small font or in a colour that is hard to see
against a coloured background). Spamming is a really bad
idea: every search engine has
its own definition of spam, and if you step outside the bounds they
have defined, your site will be reduced in rank or deleted from the search engine altogether. In the
worst case scenario, it will be blacklisted. People
usually spam either because they have followed dated (or
just plain wrong!) information from a dubious
promotion-related site, or because of ignorance. I reckon
that only a minority of people set out with the aim of
deliberately misleading the search
engines. It's a moot point as the results will be the
same. So always read the instructions before submitting
your site, err on the side of caution and don't even
think about cheating. Even then, you could occasionally
run into trouble (see retrospective
spamming).
Spider
Another word for robot.
Spidering
Another word for crawling.
Stop terms
Search terms so common that they
have no effect, or even a negative effect, on your site's
rank. For instance, AltaVista
considers "web" to be a stop term, and will
ignore the word web completely in most searches (except
when combined with other expressions using a Boolean modifier). If your page is
filled with stop terms, it is very unlikely to rank well.
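
The way a search engine drops stop terms from a query can be sketched as follows. Only "web" comes from the AltaVista example above; the other entries in the list, and the function name, are assumptions for illustration, since each search engine keeps its own list.

```python
# Hypothetical stop-term list. Only "web" comes from the example above;
# the other words are assumed for illustration.
STOP_TERMS = {"web", "the", "a", "of"}


def effective_terms(query, stop_terms=STOP_TERMS):
    """Return the search terms that remain once stop terms are ignored."""
    return [word for word in query.lower().split() if word not in stop_terms]
```

A search for "free web email" would then be treated as a search for "free" and "email" only.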
VirtualPromote Forums
The best place -- after this site, of course -- to get up
to date information about the search
engines. So bookmark this page, then head
over there to learn more...
Yahoo!
The #1 site on the Internet for driving traffic to your
own site. A good listing here is a key part of any
promotion campaign. Yahoo! is aware of its elite
position, and has been consistently moving the goalposts
in favour of higher quality sites. Still, if you have
followed all the advice in the course so far, you're
likely to sail through its listing process.