What is a search engine and how does it work?
On the Internet, a search engine has three parts:
A spider (also called a "crawler" or a "bot"), which visits
every page, or a representative set of pages, on every
searchable web site, reads it, and then follows the hypertext
links on those pages to reach the other pages that the site
links to.

A catalog or index, which is built by programs that compile
the pages read from those web sites, and
A program which receives your search request, compares it to
the entries in the index, and returns the results to you.
An alternative to using a search engine is to explore a
structured directory of topics. Yahoo, which also lets you
use its search engine, is the most widely used directory on
the Web. A number of Web portal sites offer both the search
engine and directory approaches to finding information.
Not all search engines are created equal, but all of them have
a few basic components that are essential to their use. Some
components are more visible than others to the average user,
but all of them must work in tandem to create a
high-performance search tool. The three basic actions that a
search engine must perform to be useful are: gather
information, analyze information, and display information.
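As a rough illustration of the index and the search program described above, here is a minimal sketch in Python. It assumes the spider has already gathered the page text; the function names build_index and search, the sample URLs, and the rule that a page matches only if it contains every query word are all assumptions made for this example, not a description of how any real search engine works.

```python
from collections import defaultdict


def build_index(pages):
    """Analyze step: map each lowercase word to the set of URLs containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Display step: return the URLs that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return []
    # Start with the pages for the first word, then narrow with each further word.
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return sorted(results)  # sorted only to keep the output stable


# Hypothetical pages standing in for what a spider gathered.
pages = {
    "http://example.com/a": "search engines use spiders",
    "http://example.com/b": "spiders follow links",
}
index = build_index(pages)
print(search(index, "spiders follow"))
```

A real index also stores information used for ranking, such as where and how often a word appears on a page; this sketch only records which pages contain which words.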
The main difference between the major search engines is
how these tasks are performed and how often they are
performed.
Gathering information
Spiders are the programs that search engines use to collect
information about web sites on the Internet. These programs
traverse the World Wide Web, gathering the content of web
sites and storing that information for later processing.
There are two basic ways that spiders can find your web
site. You can tell the search engine about your web site, or
let it find your site on its own. Typically, a search engine
has a place on its web site where you can suggest a site.
After a site has been suggested, the search engine's spider
will visit that web site to collect information about it.
Spiders also follow the links on each web site to find
linked sites to visit. This is how a spider will find your
site by itself. The more web sites that link to your site,
the more likely it is that a spider will find your site
without you telling it your site's URL.
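The two ways a spider can find a site can be sketched as a breadth-first traversal of a link graph. The example below is only an illustration: it uses a small in-memory dictionary instead of real web requests, and the name crawl and the site names are hypothetical. Submitted sites are the starting points (seeds); every other site is reached only if some visited site links to it.

```python
from collections import deque


def crawl(links, seeds):
    """Visit the seed sites, then every site reachable by following links.

    links maps a site to the list of sites it links to.
    Returns the sites in the order they are first visited.
    """
    visited = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        site = queue.popleft()
        visited.append(site)
        for target in links.get(site, []):
            if target not in seen:  # never queue the same site twice
                seen.add(target)
                queue.append(target)
    return visited


# Hypothetical link graph: nothing links to site-d.
links = {
    "site-a": ["site-b"],
    "site-b": ["site-c", "site-a"],
    "site-c": [],
    "site-d": ["site-a"],
}

print(crawl(links, ["site-a"]))            # site-d is never found
print(crawl(links, ["site-a", "site-d"]))  # site-d is found once it is submitted
```

In this sketch, site-d stays invisible until it is submitted as a seed, because no other site links to it.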
Usually, search engine spiders will revisit your site when
you submit your URL again, when the spider finds a link to
your site, or after a specified amount of time has passed
since its last visit. Depending on the number of web sites
that the spider needs to visit and the resources that the
spider has at its disposal, it can take days or months for a
spider to visit or revisit your web site.
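The three revisit triggers described above can be written as one small decision function. This is a sketch under stated assumptions: the name should_revisit, its parameters, and the 30-day default interval are invented for illustration, and real search engines do not publish their exact scheduling rules.

```python
from datetime import datetime, timedelta


def should_revisit(last_visit, now, resubmitted=False, new_link_found=False,
                   revisit_interval=timedelta(days=30)):
    """Return True if any of the three revisit triggers applies."""
    if resubmitted or new_link_found:
        return True
    # Otherwise revisit only once enough time has passed since the last visit.
    return now - last_visit >= revisit_interval


last_visit = datetime(2024, 1, 1)
print(should_revisit(last_visit, datetime(2024, 1, 10)))
print(should_revisit(last_visit, datetime(2024, 1, 10), resubmitted=True))
print(should_revisit(last_visit, datetime(2024, 3, 1)))
```

Even when this function says a revisit is due, the actual visit may still be delayed, since the spider has many other sites to visit and limited resources.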
Displaying information
Search engines take a search request from a user and display
a list of web pages that relate to that topic. The returned
results give clues about the algorithm used to analyze the
web pages in the search engine's index. When a search engine
displays the file size of a web page or a percentage next to
each result, you can use that information to work out how to
optimize your web pages for that search engine. Some search
engines return results in order of relevance; others mix up
the results to make sure that the pages returned come from
different sites. No matter how a search engine displays the
information requested by a user, the result listing is
typically a user's first impression of your web site. It is
important to follow any guidelines that search engines give
and to research how each search engine analyzes web pages,
so that you not only get a good ranking in search results
but also get an accurate description of your site.
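The two display strategies mentioned above, ordering by relevance and mixing results so that they come from different sites, can be compared with a short sketch. The relevance scores, the URLs, and the function names by_relevance and one_per_site are assumptions for illustration only; keeping just the best page per site is one simple way to mix results, not the method of any particular search engine.

```python
from urllib.parse import urlsplit


def by_relevance(results):
    """Order (url, score) pairs from the highest score to the lowest."""
    return sorted(results, key=lambda result: result[1], reverse=True)


def one_per_site(results):
    """Keep only the best-scoring page from each host, in relevance order."""
    seen_hosts = set()
    picked = []
    for url, score in by_relevance(results):
        host = urlsplit(url).netloc
        if host not in seen_hosts:
            seen_hosts.add(host)
            picked.append((url, score))
    return picked


# Hypothetical results with made-up relevance scores.
results = [
    ("http://one.example/page1", 0.70),
    ("http://two.example/home", 0.80),
    ("http://one.example/page2", 0.95),
]

print(by_relevance(results))
print(one_per_site(results))
```

With pure relevance ordering, both pages from one.example appear in the list; with one page per site, only its best page is kept, which leaves room for results from other sites.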