See Your Site With the Eyes of a Spider
Putting effort into optimizing a site is great, but what
counts is how search engines see that effort. While even the
most careful optimization does not guarantee a top position
in search results, a site that ignores basic SEO principles
is almost certain to score poorly with search engines. One
way to check in advance how search engines see your SEO
efforts is to use a search engine spider simulator.
Spiders Explained

Basically, all search engine spiders work on the same
principle: they crawl the Web and index pages, which are
stored in a database. The search engine later applies various
algorithms to determine the ranking, relevancy, etc. of the
collected pages. While the algorithms for calculating ranking
and relevancy differ widely among search engines, the way
they index sites is more or less uniform, so it is very
important to know what spiders are interested in and what
they ignore.
Search engine spiders are robots and they do not read your
pages the way a human does. Instead, they tend to see only
particular elements and are blind to many extras (Flash,
JavaScript) that are intended for humans. Since spiders
determine whether humans will find your site, it is worth
considering what spiders like and what they don't.
Flash, JavaScript, Image Text or Frames?!
Flash, JavaScript and image text are NOT visible to search
engines. Frames are a real disaster in terms of SEO ranking.
All of them might be great in terms of design and usability,
but for search engines they are absolutely wrong. One of the
worst mistakes you can make is to have a Flash intro page
(with or without frames, it can hardly get worse) with the
keywords buried in the animation. Run a page that contains
Flash and images (and preferably no text and no inbound or
outbound hyperlinks) through the Search Engine Spider
Simulator tool and you will see that to search engines this
page appears almost blank.
Running your site through this simulator will show you more
than the fact that Flash and JavaScript are not SEO
favorites. In a way, spiders are like text browsers: they
don't see anything that is not a piece of text. An image with
text in it therefore means nothing to a spider, which will
ignore it. A workaround (recommended as an SEO best practice)
is to include a meaningful description of the image in the
ALT attribute of the <IMG> tag, but be careful not to use too
many keywords in it because you risk penalties for keyword
stuffing. The ALT attribute is especially important when you
use images rather than text for links. You can also use ALT
text to describe what a Flash movie is about but, again, be
careful not to cross the line between optimization and
over-optimization.
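The "text browser" view described above can be approximated
in a few lines. The following is only a rough sketch, not the
way any particular simulator or search engine works: it uses
Python's standard html.parser module, and the names
SpiderView and spider_text are made up for this illustration.
It keeps visible text and ALT text and ignores everything
inside <script> and <style>.

```python
from html.parser import HTMLParser


class SpiderView(HTMLParser):
    """Hypothetical helper: collects roughly what a text-only crawler could read."""

    SKIPPED = {"script", "style"}  # contents treated as invisible to the spider

    def __init__(self):
        super().__init__()
        self.chunks = []       # visible text and ALT text, in source order
        self._skip_depth = 0   # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag == "img":
            # An image contributes text only through its ALT attribute.
            alt = dict(attrs).get("alt")
            if alt and alt.strip():
                self.chunks.append(alt.strip())

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def spider_text(html):
    """Return the text a simple, text-only spider would see on the page."""
    parser = SpiderView()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

Under these assumptions, a page that consists only of an
image without ALT text and a script yields an empty string,
which matches the "almost blank" result described earlier.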
Are Your Hyperlinks Spiderable?
The search engine spider simulator can be of great help when
you are trying to figure out whether your hyperlinks lead to
the right place. For instance, link exchange websites often
put fake links to your site with JavaScript (using mouseover
events and similar tricks to make the link look genuine), but
this is not actually a link that search engines will see and
follow. Since the spider simulator will not display such
links, you'll know that something is wrong with the link.
It is highly recommended to use the <noscript> tag rather
than rely on JavaScript-based menus alone. The reason is that
JavaScript-based menus are not spiderable, so all the links
in them will be ignored. The solution to this problem is to
repeat all menu item links inside a <noscript> tag. The
<noscript> tag can hold a lot, but please avoid using it for
link stuffing or any other kind of SEO manipulation.
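To make the difference between a real link and a
JavaScript-only link concrete, here is a minimal sketch that
lists the links a simple spider could follow. It is an
illustration under stated assumptions, not the simulator's
actual logic: it treats only <a> tags with a real href as
links, and the names LinkCollector and spiderable_links are
hypothetical.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects href values a crawler could follow."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return  # onclick/onmouseover handlers on other tags are not links
        href = (dict(attrs).get("href") or "").strip()
        # Skip empty targets, in-page anchors, and javascript: pseudo-links.
        if not href or href.startswith("#"):
            return
        if href.lower().startswith("javascript:"):
            return
        self.links.append(href)


def spiderable_links(html):
    """Return the followable link targets in source order."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links
```

In this sketch a "link" built from a mouseover or click
handler is not reported, while the same target repeated as a
plain <a href="..."> inside <noscript> is.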
If you happen to have tons of hyperlinks on your pages
(although it is highly recommended to have fewer than 100
hyperlinks on a page), you might have a hard time checking
whether they all work. For instance, if you have pages that
display “403 Forbidden”, “404 Page Not Found” or similar
errors that prevent the spider from accessing the page, it is
certain that those pages will not be indexed. Note that a
spider simulator does not deal with 403 and 404 errors: it
shows where links lead, not whether the target of each link
is actually in place, so you need other tools to check that
the targets of your hyperlinks exist and are the intended
ones.
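One such "other tool" can be a small script. The sketch below
is only an assumption about how a basic check might look: it
sends a HEAD request with Python's standard urllib module and
reports every URL that answers with an error such as 403 or
404 or that cannot be reached at all. The names fetch_status
and broken_links are hypothetical, and some servers do not
answer HEAD requests properly, so treat the result as a hint
rather than proof.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if it cannot be reached."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # for example 403 or 404
    except OSError:
        return None  # DNS failure, refused connection, timeout, etc.


def broken_links(urls, get_status=fetch_status):
    """Map each problematic URL to its status code (None means unreachable)."""
    problems = {}
    for url in urls:
        status = get_status(url)
        if status is None or status >= 400:
            problems[url] = status
    return problems
```

The get_status parameter exists so that the check can be
tried out with a stand-in function instead of real network
requests.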
Looking for Your Keywords
While there are specific tools, like the Keyword Playground
or the Website Keyword Suggestions, that deal with keywords
in more detail, search engine spider simulators also help you
see, with the eyes of a spider, where keywords are located
in the text of the page. Why is this important? Because
keywords in the first paragraphs of a page weigh more than
keywords in the middle or at the end. And even if keywords
visually appear to us to be at the top, this may not be the
way spiders see them. Consider a standard Web page laid out
with tables. In this case the code that describes the page
layout (like navigation links or separate cells with text
that is the same sitewide) might come first in the HTML
source and, what is worse, can be so long that the actual
page-specific content is screens away from the top of the
page. When we look at the page in a browser, everything
seems fine: the page-specific content is on top. But since
in the HTML code the order is just the opposite, the page
will not be recognized as keyword-rich.
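The gap between visual order and source order can be
measured. The following sketch is an illustration only; the
name first_keyword_position is hypothetical and the word
matching is deliberately naive (single words, no punctuation
handling). It reads the page text in HTML source order and
reports how far into that text a keyword first appears, where
0.0 means the very first word.

```python
from html.parser import HTMLParser


class _TextInSourceOrder(HTMLParser):
    """Hypothetical helper: collects lowercase words in HTML source order."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._skip = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.words.extend(data.lower().split())


def first_keyword_position(html, keyword):
    """Return the relative position (0.0 = start) of the keyword, or None."""
    parser = _TextInSourceOrder()
    parser.feed(html)
    parser.close()
    try:
        index = parser.words.index(keyword.lower())
    except ValueError:
        return None  # keyword does not appear as a standalone word
    return index / len(parser.words)
```

For a table layout whose first cell holds a long navigation
block, the value for a content keyword moves toward 1.0 even
though the content looks like it is at the top in a browser.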
Are Dynamic Pages Too Dynamic to Be Seen at All?
Dynamic pages (especially ones with question marks in the
URL) are another extra that spiders do not love, although
many search engines do index dynamic pages as well. Running
the spider simulator will give you an idea of how well your
dynamic pages are accepted by search engines. Useful
suggestions on how to deal with search engines and dynamic
URLs can be found in the Dynamic URLs vs. Static URLs
article.
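As a quick way to spot URLs of the kind mentioned above, the
sketch below flags a URL as dynamic when it carries a query
string (the part after the question mark) and lists its
parameter names. This is a simplifying assumption made for
illustration, and describe_url is a hypothetical name; it
relies only on Python's standard urllib.parse module.

```python
from urllib.parse import parse_qsl, urlsplit


def describe_url(url):
    """Report whether a URL has a query string and which parameters it carries."""
    parts = urlsplit(url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    return {
        "dynamic": bool(parts.query),                # True if anything follows "?"
        "parameters": [name for name, _ in params],  # names in order of appearance
    }
```

Running it over the list of links on a page gives a rough
picture of how many of them depend on query parameters.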
Meta Keywords and Meta Description
Meta keywords and meta description, as the names imply, are
found in the <META> tags of an HTML page. Meta keywords and
meta descriptions were once the single most important
criterion for determining the relevance of a page, but search
engines now employ alternative mechanisms for determining
relevancy, so you can safely skip listing keywords and a
description in meta tags. The exception is when you want to
give the spider instructions about what to index and what
not to; apart from that, meta tags are not very useful
anymore.
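The spider instructions mentioned above are commonly given in
a meta tag named "robots", with values such as noindex or
nofollow. The sketch below shows one way to read them from a
page. It is an illustration under that assumption, the names
MetaReader and robots_directives are hypothetical, and
individual search engines may support additional values.

```python
from html.parser import HTMLParser


class MetaReader(HTMLParser):
    """Hypothetical helper: collects name/content pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        if name:
            self.meta[name] = attributes.get("content") or ""


def robots_directives(html):
    """Return the set of directives in the robots meta tag (empty if absent)."""
    reader = MetaReader()
    reader.feed(html)
    reader.close()
    content = reader.meta.get("robots", "")
    return {token.strip().lower() for token in content.split(",") if token.strip()}
```

An empty result means the page gives the spider no such
instructions in its meta tags.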