<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:blogChannel="http://backend.userland.com/blogChannelModule" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:pingback="http://madskills.com/public/xml/rss/module/pingback/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">
  <channel>
    <title>jane and robot</title>
    <description>search friendly design patterns for web development</description>
    <link>http://janeandrobot.com/</link>
    <docs>http://www.rssboard.org/rss-specification</docs>
    <generator>BlogEngine.Net Syndication Generator 1.0.0.0 (http://dotnetblogengine.net/)</generator>
    <language>en-US</language>
    <blogChannel:blogRoll>http://janeandrobot.com/opml.axd</blogChannel:blogRoll>
    <blogChannel:blink>http://www.janeandrobot.com/syndication.axd</blogChannel:blink>
    <dc:creator>My name</dc:creator>
    <dc:title>jane and robot</dc:title>
    <atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" href="http://feeds.feedburner.com/janeandrobot" type="application/rss+xml" /><item>
      <title>URL Referrer Tracking</title>
      <description>&lt;p&gt;There may be instances when you want to track the source of a request, and a common way of doing so is by using tracking parameters in URLs. Unfortunately, implementing referrer tracking in this way can result in significant issues with search engines. In particular, it can cause duplicate content issues (since the search engine bot finds multiple valid URLs that point to the same page) and ranking issues (since all the links to the page aren't to the same URL). &lt;/p&gt;  &lt;p&gt;Let's say that Jane and Robot uploaded two different online training seminars to YouTube as part of a viral marketing effort to drive more traffic to our site. To gauge our return on investment from each of these seminars, we've added a tracking parameter to the link within each YouTube description that a customer can click on to learn more. Here are the two URLs: http://janeandrobot.com/?from=promo-seminar-1 and http://janeandrobot.com/?from=promo-seminar-2. Each would bring the customer to our home page (the same page served by http://janeandrobot.com) and we would track the conversions based on the from parameter in the URL. &lt;/p&gt;  &lt;p&gt;While this solution may seem to work well initially, it can result in low quality tracking data and impact our search acquisition. Here's a summary of the major problems: &lt;/p&gt;  &lt;ol&gt;   &lt;li&gt;     &lt;p&gt;&lt;strong&gt;Duplicate content&lt;/strong&gt; - search engines sometimes have difficulty determining if two URLs contain the exact same page (see &lt;a href="http://janeandrobot.com/post/canonical-url-canonicalization-domain.aspx"&gt;canonicalization&lt;/a&gt; for more information). In this case, we're creating this problem because we've created multiple URLs for the same page. Search engines are likely to find all three URLs for the home page and store and rank them as separate content within their index. 
This could cause the search engine robots to crawl the page three times instead of just once (which may not be a big deal if we are only tracking two promotions, but could become a big problem if we used similar tracking parameters for many other campaigns and URLs). Not only are the robots using more bandwidth than is necessary, but since they don't crawl a site infinitely, they could spend all the allotted time crawling duplicate pages and never get to some of the good unique pages on the site. &lt;/p&gt;   &lt;/li&gt;    &lt;li&gt;     &lt;p&gt;&lt;strong&gt;Ranking&lt;/strong&gt; - search engines use the number of quality links pointing to a URL as a major signal in determining the authority and usefulness of that content. Because we now have three different URLs pointing to the same page, people have three choices when linking to it. The result is a lower rank for all of the variations of the URL. Search engines generally filter out duplicates, so for instance, if the original (canonical) home page has 100 incoming links and each URL with a tracking parameter has 25 links, then search engines might filter out the two URLs with fewer links and show only the canonical URL, ranking it at position eight for a particular query based on those 100 incoming links. If all incoming links were to the same URL, then search engines would count 150 links to the home page and might rank it at position three for that same query.&amp;#160; &lt;br /&gt;        &lt;br /&gt;Another danger is that if one of the YouTube promo videos becomes exceptionally popular, its promo URL might gain more links than the original home page URL. Using this same example, if one of the promo URLs gained 200 links, search engines might choose to display it in the search results over the original home page. 
This could cause a confusing experience for potential customers who are looking for your home page (http://janeandrobot.com/?from=promo-seminar-1 doesn't look like a home page and searchers might be less likely to click on it, thinking it's not the page they're looking for). It's also not ideal from a branding perspective. &lt;/p&gt;   &lt;/li&gt;    &lt;li&gt;&lt;strong&gt;Reporting quality&lt;/strong&gt; - as social networking sites become more popular, we become more of a sharing culture online. Many people use bookmarks, online bookmarking sites such as Delicious, email, and other sharing sites such as Facebook, Twitter, and FriendFeed to save and share URLs. They'll click on a URL, and if they like it, copy and paste it from the browser's address bar. If the link they're saving/sharing happens to be one of our promotional links, then they have preserved this link for all time, and everyone who clicks through the link will look identical to someone coming through the promo. This skews the reporting numbers of who went to the site after viewing the video -- which was why we set up the tracking parameters in the first place! &lt;/li&gt; &lt;/ol&gt;  &lt;h2&gt;Implementation Options&lt;/h2&gt;  &lt;p&gt;Unfortunately there is no perfect solution for this scenario, and what works best for you depends on your infrastructure and situation. Here we've listed several common solutions that you can choose from to improve your own implementation. We generally recommend the first solution (Redirects), but there are pros and cons to each option that you should review carefully before making your decision. &lt;/p&gt;  &lt;h3&gt;Redirects (and Cookies) &lt;/h3&gt;  &lt;p&gt;The first option strives to solve the problem by trapping all of the promotional requests, recording the tracking information, then removing the tracking parameter from the URL. 
This can be time consuming to implement, but it is the best all-round solution to address the three major issues listed above. &lt;/p&gt;  &lt;p&gt;If you wanted to get fancy, and track a user's entire session based on your referral parameter, then you can use this method as well and simply set a cookie on the client machine at the same time you trap the request. This is recommended to understand the value of traffic from different sources. In either case, here are the steps you'll need to undertake: &lt;/p&gt;  &lt;p&gt;&lt;strong&gt;1. Trap the incoming request&lt;/strong&gt; - find where your web site application's logic processes the HTTP request for your page. Trap each request at that point and check if it has a tracking parameter. If it does, record this in your internal referral tracking system. You can record this either in your server logs, or in a custom referral tracking database you maintain on your own. &lt;/p&gt;  &lt;ul&gt;   &lt;li&gt;If you also would like to track the entire user's session, then you should also use this opportunity to set a cookie on the client. &lt;/li&gt; &lt;/ul&gt;  &lt;p&gt;&lt;strong&gt;2. Implement the redirect&lt;/strong&gt; - the next step is to implement a 301 redirect from the current URL to the same page without the tracking parameter (or the canonical URL). Don't forget to set the Cache-Control header in the HTTP response to ensure that all the requests come to your server and don't get handled automatically in some network-based cache. Here's what a sample redirect header might look like: &lt;/p&gt;  &lt;div class="csharpcode"&gt;   &lt;pre class="alt"&gt;HTTP/1.1 301 Moved Permanently&lt;/pre&gt;

  &lt;pre class="alt"&gt;Location: http://janeandrobot.com/&lt;/pre&gt;

  &lt;pre class="alt"&gt;Cache-Control: max-age=0&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;em&gt;Note that ASP.Net and IIS both use 302 redirects by default, so you may need to manually create the 301 response code.&lt;/em&gt; &lt;/p&gt;

&lt;p&gt;The way this works is that when a search engine encounters a promotional URL (http://janeandrobot.com/?from=promo-seminar-1) it issues an HTTP GET request to the URL. The HTTP response tells the search engine that this page has been permanently moved (301 Redirect) and provides the new address (the same as the old address but without the tracking parameter). The search engine then discards the first URL (with the tracking code) and only stores the second URL (without the tracking code). And everything is right in the world. &lt;/p&gt;
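
&lt;p&gt;The two steps above could be sketched roughly as follows. This is a minimal, framework-neutral Python illustration rather than a definitive implementation: the function name handle_request, the cookie name referral, and the choice to return the status, headers, and recorded value as a tuple are all assumptions made for the example. Only the from parameter, the 301 status, and the Cache-Control value come from the text above.&lt;/p&gt;

```python
from urllib.parse import parse_qsl, quote, urlencode, urlsplit, urlunsplit

TRACKING_PARAM = "from"  # parameter name used in the article's example URLs


def handle_request(url):
    """Return (status, headers, referral) for an incoming request URL.

    referral is the tracking value to record, or None when the URL carries
    no tracking parameter and the page should simply be served.
    """
    parts = urlsplit(url)
    referral = None
    kept = []
    for name, value in parse_qsl(parts.query, keep_blank_values=True):
        if name == TRACKING_PARAM:
            referral = value  # step 1: trap the value so it can be recorded
        else:
            kept.append((name, value))
    if referral is None:
        return 200, {}, None
    # Step 2: permanent redirect to the same URL without the tracking parameter.
    clean = urlunsplit((parts.scheme, parts.netloc, parts.path or "/", urlencode(kept), ""))
    headers = {
        "Location": clean,
        "Cache-Control": "max-age=0",
        # Optional session tracking; the cookie name is hypothetical.
        "Set-Cookie": "referral=" + quote(referral) + "; Path=/",
    }
    return 301, headers, referral
```

&lt;p&gt;Given http://janeandrobot.com/?from=promo-seminar-1, this sketch returns a 301 whose Location is http://janeandrobot.com/ and hands back promo-seminar-1 for you to record. A URL without the parameter is served normally.&lt;/p&gt;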

&lt;p&gt;This implementation is one of the best options, but it does have some limitations: &lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;One downside of this method is that it requires you to manage your own referral tracking system. Because it traps the referral parameters and removes them from the URL before the page actually loads, 3rd party referral tracking applications like Google Analytics, Omniture, WebTrends or Microsoft adCenter Analytics will not be able to track these referrals. &lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;Canonical URL &amp;lt;link /&amp;gt; Tag&lt;/h3&gt;

&lt;p&gt;Possibly the simplest option to solve this issue is to take advantage of a new standard recently adopted by Google, Yahoo and Microsoft Live Search. Their solution to this problem is to use a new rel value on the &amp;lt;link /&amp;gt; tag to explicitly tell them what the canonical URL for the page is. Assuming the &amp;lt;link /&amp;gt; tag has been created correctly, the search engines will treat this like a 301 redirect to the canonical URL.&lt;/p&gt;

&lt;p&gt;Here's an example of using this tag:&lt;/p&gt;

&lt;div class="csharpcode"&gt;
  &lt;pre class="alt"&gt;&amp;lt;html&amp;gt;&lt;/pre&gt;

  &lt;pre class="alt"&gt;   &amp;lt;head&amp;gt;&lt;/pre&gt;

  &lt;pre class="alt"&gt;      &amp;lt;link rel=&amp;quot;canonical&amp;quot; href=&amp;quot;http://janeandrobot.com&amp;quot; /&amp;gt;&lt;/pre&gt;

  &lt;pre class="alt"&gt;   &amp;lt;/head&amp;gt;&lt;/pre&gt;

  &lt;pre class="alt"&gt;&amp;lt;/html&amp;gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Here are a few notes about implementing this tag:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Search engines view this as a hint, not a command. Implementing this tag isn't a guarantee, although Google said they will try their best to make it work. The reason they can't give any guarantees is that they may detect that you are implementing it incorrectly, or that it is being used for some type of spammy scenario.&lt;/p&gt;
  &lt;/li&gt;

  &lt;li&gt;
    &lt;p&gt;Relative or absolute URLs are supported within the &lt;strong&gt;href&lt;/strong&gt; attribute. However, I recommend that you use absolute URLs whenever possible. This helps the search engines further normalize the URLs because they see what protocol (http or https) you use, and whether or not you are prefixing your domain with &amp;quot;www.&amp;quot;.&lt;/p&gt;
  &lt;/li&gt;

  &lt;li&gt;
    &lt;p&gt;Sub-domains are supported, separate domains are not. With this tag you can specify a different sub-domain; for example, within this URL (http://janeandrobot.com?from=promo-seminar-2) you could specify this canonical URL (http://videos.janeandrobot.com). However, the &amp;lt;link /&amp;gt; tag would not be valid if you specify a completely different domain like this: http://janeandrobot-videos.com. &lt;/p&gt;
  &lt;/li&gt;

  &lt;li&gt;
    &lt;p&gt;Common Pitfalls... You'll want to ensure that you don't do anything silly like (i) create an infinite loop with two canonical tags pointing to each other, or (ii) have the canonical tag point to a page that returns a 404 status code. You should also make sure that your canonical URL is generally a short and simple URL.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;
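
&lt;p&gt;The two pitfalls named in the last note can be checked mechanically. The sketch below is only an illustration: it assumes you have already collected, from your own crawl, a dictionary mapping each page URL to the canonical URL it declares (canonical_of) and a dictionary of HTTP status codes (status_of). Both names, and the function itself, are hypothetical, and it only detects loops between exactly two pages.&lt;/p&gt;

```python
def canonical_problems(canonical_of, status_of):
    """Return (page, problem) pairs for the two pitfalls described above."""
    problems = []
    for page, target in canonical_of.items():
        # Pitfall (i): two pages that name each other as canonical.
        if target != page and canonical_of.get(target) == page:
            problems.append((page, "loop with " + target))
        # Pitfall (ii): the canonical target does not exist.
        if status_of.get(target) == 404:
            problems.append((page, "canonical target returns 404"))
    return problems
```

&lt;p&gt;A page whose canonical URL is itself is not reported, since that is a normal and harmless configuration.&lt;/p&gt;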

&lt;p&gt;While this implementation seems a little too good to be true, there are a few potential downsides. The first is that if you implement it incorrectly, the search engines will simply ignore it, and that could be complicated to debug. The other issue is that it fixes issues #1 (duplicate content) and #2 (ranking) but does nothing to fix the 3rd issue of reporting. Still, given all of that I would likely implement this option first and do the others when I had some spare dev cycles.&lt;/p&gt;

&lt;h3&gt;URL Fragment&lt;/h3&gt;

&lt;p&gt;A simple and elegant option is to place the tracking parameter behind a hash mark in the URL, creating a URL fragment. Traditionally, fragments are used to denote links within a page, and are ignored completely by search engines. In fact, search engines simply truncate the fragment from the URL. &lt;/p&gt;

&lt;p&gt;Old URL &lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;http://janeandrobot.com/?from=promo-seminar-1 &lt;/li&gt;

  &lt;li&gt;http://janeandrobot.com/?from=promo-seminar-2 &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;New URL with URL Fragment &lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;http://janeandrobot.com/&lt;font color="#ff0000"&gt;#&lt;/font&gt;from=promo-seminar-1 &lt;/li&gt;

  &lt;li&gt;http://janeandrobot.com/&lt;font color="#ff0000"&gt;#&lt;/font&gt;from=promo-seminar-2 &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By default, Google Analytics will ignore the fragment as well. However, there is a simple workaround, provided to us by &lt;a href="http://www.kaushik.net/avinash/"&gt;Avinash Kaushik&lt;/a&gt;, Google's web metrics evangelist, using the following JavaScript: &lt;/p&gt;

&lt;div class="csharpcode"&gt;
  &lt;pre class="alt"&gt;var pageTracker = _gat._getTracker(&amp;quot;UA-12345-1&amp;quot;); &lt;/pre&gt;

  &lt;pre class="alt"&gt;&amp;#160; &lt;/pre&gt;

  &lt;pre class="alt"&gt;// Solution for domain level only &lt;/pre&gt;

  &lt;pre class="alt"&gt;pageTracker._trackPageview(document.location.pathname + &amp;quot;/&amp;quot; + document.location.hash); &lt;/pre&gt;

  &lt;pre class="alt"&gt;&amp;#160; &lt;/pre&gt;

  &lt;pre class="alt"&gt;// If you have a path included in the URL as well&amp;#160; &lt;/pre&gt;

  &lt;pre class="alt"&gt;pageTracker._trackPageview(document.location.pathname + document.location.search + &lt;/pre&gt;

  &lt;pre class="alt"&gt;                           &amp;quot;/&amp;quot; + document.location.hash); &lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;You can create a few additional variations of this if you also have additional queries in the URL you would like to track. Check with your web analytics provider to find out if you need to customize your implementation to account for using URL fragments for tracking. &lt;/p&gt;

&lt;p&gt;Does this sound too simple and easy to be true? There are a couple downsides to this approach: &lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;This option fixes issues 1 (duplicate content) &amp;amp; 2 (ranking) listed above, but it will not address the 3rd issue of reporting. You could still encounter some reporting issues using this method if people are bookmarking or emailing around the URL. &lt;/li&gt;

  &lt;li&gt;Typically you'll have to write some custom code to parse the URL fragment. Since it's a non-standard implementation, standard methods may not support this. &lt;/li&gt;
&lt;/ul&gt;
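
&lt;p&gt;The custom parsing mentioned in the last point can be quite small. Here is a hedged Python sketch, assuming the fragment reuses query-string syntax as in the example URLs above. Keep in mind that browsers do not send the fragment to the server, so code like this would only see URLs that were captured on the client and reported back, for example by your analytics script. The function name referral_from_fragment is an assumption for illustration.&lt;/p&gt;

```python
from urllib.parse import parse_qs, urlsplit


def referral_from_fragment(url, param="from"):
    """Return the tracking value stored after the hash mark, or None."""
    fragment = urlsplit(url).fragment       # e.g. "from=promo-seminar-1"
    values = parse_qs(fragment).get(param)  # fragment parsed like a query string
    return values[0] if values else None
```

&lt;p&gt;For http://janeandrobot.com/#from=promo-seminar-1 this returns promo-seminar-1, and for a URL with no fragment it returns None.&lt;/p&gt;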

&lt;h3&gt;Robots Exclusion Protocol&lt;/h3&gt;

&lt;p&gt;Another relatively simple solution is to use robots.txt to ensure that search engines are not indexing URLs that contain tracking parameters. This method enables you to ensure that the original (canonical) version of the URL is always the one indexed and avoids the duplicate content issues involving indexing and bandwidth. &lt;/p&gt;

&lt;p&gt;Assuming that all of our tracking parameters will follow a similar pattern to this: &lt;/p&gt;

&lt;p&gt;http://janeandrobot.com/?from=&amp;lt;PromoID&amp;gt; &lt;/p&gt;

&lt;p&gt;we can easily create a pattern that will match for this. Below is a robots.txt file that implements the pattern: &lt;/p&gt;

&lt;div class="csharpcode"&gt;
  &lt;pre class="alt"&gt;# Sample Robots.txt file, single query parameter&lt;/pre&gt;

  &lt;pre class="alt"&gt;User-agent: *&lt;/pre&gt;

  &lt;pre class="alt"&gt;Disallow: /?from=&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The User-agent line means that this rule should apply to all search engines (or robots crawling your site), and the Disallow line tells them that they can't crawl any URL on the site whose path starts with '/?from=', followed by a promotional code of any length. See complete information on using the &lt;a href="http://janeandrobot.com/post/Managing-Robots-Access-To-Your-Website.aspx"&gt;Robots Exclusion Protocol&lt;/a&gt;. Use this pattern if you will have multiple query parameters: &lt;/p&gt;

&lt;div class="csharpcode"&gt;
  &lt;pre class="alt"&gt;# Sample Robots.txt file, multiple query parameters&lt;/pre&gt;

  &lt;pre class="alt"&gt;User-agent: *&lt;/pre&gt;

  &lt;pre class="alt"&gt;Disallow: /*from=&lt;/pre&gt;
&lt;/div&gt;
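
&lt;p&gt;For a rough local sanity check of these two patterns, the matching can be approximated in a few lines of Python. This is a simplified sketch, not how any search engine actually implements it: it assumes a Disallow value is matched against the start of the path and query, that * matches any run of characters, and that a trailing $ anchors the end. It ignores Allow rules, per-robot groups, and rule precedence, and the function name is_disallowed is hypothetical.&lt;/p&gt;

```python
import re


def is_disallowed(path_and_query, disallow_patterns):
    """Return True if any Disallow pattern matches the start of the path."""
    for pattern in disallow_patterns:
        if not pattern:
            continue  # an empty Disallow value blocks nothing
        anchored = pattern.endswith("$")
        body = pattern[:-1] if anchored else pattern
        # Escape everything, then turn the escaped "*" back into "match anything".
        regex = re.escape(body).replace(r"\*", ".*")
        if anchored:
            regex += "$"
        if re.match(regex, path_and_query):
            return True
    return False
```

&lt;p&gt;With the single-parameter pattern, /?from=promo-seminar-1 is blocked and / is allowed. A URL where from is not the first query parameter is only caught by the wildcard pattern.&lt;/p&gt;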

&lt;p&gt;Once you've implemented the pattern appropriate for your site, you can easily check to see if it is working correctly by using the &lt;a href="http://www.google.com/support/webmasters/bin/answer.py?answer=35237"&gt;Google Webmaster Tools robots.txt analysis tool&lt;/a&gt;. It enables you to test specific URLs against a test robots.txt file. Note that although this tool tests GoogleBot specifically, all the major search engines &lt;a href="http://searchengineland.com/yahoo-google-microsoft-clarify-robotstxt-support-14125.php"&gt;support the same pattern matching rules&lt;/a&gt;. In &lt;a href="https://www.google.com/webmasters/tools"&gt;Google Webmaster Tools&lt;/a&gt;: &lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Add the site, then click &lt;strong&gt;Tools&lt;/strong&gt; &amp;gt; &lt;strong&gt;Analyze robots.txt&lt;/strong&gt;. (Unlike most features in Google Webmaster Tools, you don't need to verify ownership of the site to use the robots.txt analysis tool). The tool displays the current robots.txt file. &lt;/li&gt;

  &lt;li&gt;Modify this file with the Disallow line for the tracking parameter. (If the site doesn't yet have a robots.txt file, you'll need to copy in both the User-agent and Disallow lines.) &lt;/li&gt;

  &lt;li&gt;In the Test URLs box, add a couple of the URLs you want to block. Also add a few URLs you &lt;strong&gt;do&lt;/strong&gt; want indexed (such as the original version of the URL that you're adding tracking parameters to). &lt;/li&gt;

  &lt;li&gt;Click &lt;strong&gt;Check&lt;/strong&gt;. The tool displays how Googlebot would interpret the robots.txt file and if each URL you are testing would be blocked or allowed. &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;At this point you may be thinking, wow, I can do all this and &lt;em&gt;not &lt;/em&gt;have to write any new code? Unfortunately, there are even more downsides to this approach than the others: &lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;This option will fix issue 1 (duplicate content), but not issues 2 (ranking) and 3 (reporting). This can be a good interim solution while you're implementing the more complete redirects solution, but it often isn't useful enough on its own. &lt;/li&gt;

  &lt;li&gt;Likely this will take a little bit of extra testing to ensure you get the patterns correct in your robots.txt file and don't inadvertently block content you want indexed. &lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;Yahoo Site Explorer&lt;/h3&gt;

&lt;p&gt;Yahoo provides an online tool designed to solve this scenario. However, the solution only helps with Yahoo search traffic. To use the Yahoo fix, simply go to &lt;a href="http://siteexplorer.search.yahoo.com"&gt;http://siteexplorer.search.yahoo.com&lt;/a&gt; and create an account for your web site in the Yahoo Site Explorer tool. Once you've verified ownership of your web site, you can use their &lt;a href="http://www.ysearchblog.com/archives/000479.html"&gt;Dynamic URL Rewriting&lt;/a&gt; tool to indicate which parameters in your URLs Yahoo should ignore. &lt;/p&gt;

&lt;p&gt;&lt;a href="http://janeandrobot.com/image.axd?picture=WindowsLiveWriter/URLReferrerTracking_E5DF/image_2.png"&gt;&lt;img style="border-right-width: 0px; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" border="0" alt="image" src="http://janeandrobot.com/image.axd?picture=WindowsLiveWriter/URLReferrerTracking_E5DF/image_thumb.png" width="660" height="90" /&gt;&lt;/a&gt; &lt;/p&gt;

&lt;p&gt;Simply specify the name of the parameter you use for referral tracking (in our example it is 'from'), and set the action 'Remove from URLs'. Yahoo will then remove that parameter from all of your URLs while processing them and give you a handy little report about how many URLs were impacted. &lt;/p&gt;

&lt;p&gt;Again, this is another solution that seems too easy to be true, but again, there are some significant limitations with this approach: &lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;At the end of the day this is still a Yahoo-only solution. With approximately 20% market share, it is likely this will not meet all of your needs. However, if you do get some percentage of your traffic from Yahoo, there is no harm in doing this in the short term while you implement another method in the longer term. &lt;/li&gt;

  &lt;li&gt;The other problem with this solution is that it doesn't solve issue #3 (reporting), so you are still susceptible to reporting errors due to folks bookmarking and emailing your URLs with tracking codes. &lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Common Pitfalls &lt;/h2&gt;

&lt;h3&gt;Cloaking &amp;amp; Conditional Redirects&lt;/h3&gt;

&lt;p&gt;Some web sites and SEO consultants attempt to solve this by a technique called cloaking or conditional redirects. Essentially what these methods do is check if the HTTP GET request is coming from a search engine and then show them something different than normal users see. This something different could be a simple 301 redirect back to the page without the tracking parameter similar to our first solution above. The difference is that our solution implemented this redirect for all requesters, and cloaking/conditional redirects implement it only for search engines. &lt;/p&gt;

&lt;p&gt;The big problem with this implementation method is that cloaking and conditional redirects are explicitly prohibited in the webmaster guidelines for &lt;a href="http://www.google.com/support/webmasters/bin/answer.py?answer=66355"&gt;Google&lt;/a&gt;, &lt;a href="http://help.yahoo.com/l/us/yahoo/search/basics/basics-18.html"&gt;Yahoo&lt;/a&gt; and &lt;a href="http://help.live.com/help.aspx?mkt=en-us&amp;amp;project=wl_webmasters"&gt;Live Search&lt;/a&gt;.&amp;#160; If you use this method, you risk your pages being penalized or banned by the search engines. The primary reason they prohibit these behaviors is that they want to know exactly what content they are presenting to searchers using their service. When a web site shows something different to a search engine robot than to a general user, a search engine can never be sure what the user will see when they go to the web site. So, even if you're thinking of implementing cloaking for what seems to be a &lt;a href="http://www.ninebyblue.com/blog/whats-really-black-hat-anyway/"&gt;valid, and not deceptive, reason&lt;/a&gt;, it's still a technique search engines strongly discourage. &lt;/p&gt;

&lt;p&gt;This leads to the second major problem with this implementation method - it adds significant complexity, and it can be difficult to monitor whether or not it's working - e.g. you have to test it pretending to be each of the three search engines' robots. When things go wrong, it is likely that you're not going to see it right away, and by the time you do, your search engine traffic may already be impacted. Check out this example when Nike ran into an &lt;a href="http://www.vabeachkevin.com/nikecom-pay-attention-googlebot-cloaking-broken/"&gt;issue with cloaking&lt;/a&gt;. &lt;/p&gt;

&lt;h3&gt;Crazy Tracking Codes &lt;/h3&gt;

&lt;p&gt;Many studies on the web show that &lt;a href="http://www.marketingsherpa.com/article.php?ident=30181"&gt;customers prefer short, understandable URLs&lt;/a&gt; over long complicated ones, and are more likely to click on them in the search results. In addition, users prefer descriptive keywords in URLs. Therefore, it might be worth your time to spend a few extra minutes thinking about the tracking codes you use to see if you can make them friendlier. &lt;/p&gt;

&lt;p&gt;Good examples &lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;?from=promo &lt;/li&gt;

  &lt;li&gt;?from=developer-video &lt;/li&gt;

  &lt;li&gt;?partner=a768sdf129 &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Bad examples &lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;?i=A768SDF129,re23ADFA,style-23423,date-2008-02-01&amp;amp;page=2 &lt;/li&gt;

  &lt;li&gt;?IAmSpyingOnYou=a768sdf129&amp;amp;YouAreASucker=re23adfd &lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Testing Your Implementation&lt;/h2&gt;

&lt;p&gt;So you've implemented your new favorite method, it compiles on your dev box, and now it's time to roll it into production, right? Maybe not! The initial goal of referrer URL-based tracking was to understand where your traffic was coming from so you can use that information to optimize your business. To ensure the data you're collecting is actually useful, we highly recommend that you do some testing to ensure that all the common scenarios are working the way you expect, and you know where the holes are in your measurement capabilities. As with all metrics on the web, there will be holes in your data so you need to know what they are and account for them. &lt;/p&gt;

&lt;p&gt;The first step in testing the implementation is to try it with a test parameter, walking the full scenario through start to finish. &lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Create several phoney promotional links that reflect the actual types of links you expect. This could be on your home page, product pages or with many additional query parameters that you might encounter. &lt;/li&gt;

  &lt;li&gt;Place these fake promotional links in a location that won't confuse your customers but is likely to get indexed by search engines. Using a social networking site or a blog might serve this well. &lt;/li&gt;

  &lt;li&gt;Click through those links as a customer and verify that you get to the correct page with a good user experience. Be sure to take these into account as well: 
    &lt;ul&gt;
      &lt;li&gt;&lt;em&gt;Redirects operating properly (if you're using them)&lt;/em&gt; - use the &lt;a href="https://addons.mozilla.org/en-US/firefox/addon/3829"&gt;Live HTTP Headers&lt;/a&gt; tool in Firefox to ensure the application is providing the correct headers (301 redirect and caching). &lt;/li&gt;

      &lt;li&gt;&lt;em&gt;Major browsers all work&lt;/em&gt; - if you're using cookies, you should test all the major browsers to ensure that they support cookies and that your scenario works the way you might expect. Don't forget to try common mobile browsers if your customers access your site this way. &lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;

  &lt;li&gt;Check out the search engine experience to ensure that you're not running into the duplicate content or ranking issues. 
    &lt;ul&gt;
      &lt;li&gt;&lt;em&gt;Major Engines submit URL&lt;/em&gt; - if you place the test URLs in the right social network or place on your blog, they should get indexed within a week or so. If they don't, you can also try the &amp;quot;submit a URL&amp;quot; pages from &lt;a href="http://www.google.com/addurl/"&gt;Google&lt;/a&gt;, &lt;a href="http://siteexplorer.search.yahoo.com/submit"&gt;Yahoo&lt;/a&gt; and &lt;a href="http://search.msn.com.sg/docs/submit.aspx"&gt;Microsoft&lt;/a&gt;, though they are not guaranteed to work. Essentially you want to make sure the search engines have had the opportunity to see these URLs. &lt;/li&gt;

      &lt;li&gt;&lt;em&gt;Use 'site:' command to ensure tracking URLs are not indexed&lt;/em&gt; - here's an example query in &lt;a href="http://janeandrobot.com/admin/Pages/site:janeandrobot.com%20inurl:from"&gt;Google&lt;/a&gt;, &lt;a href="http://siteexplorer.search.yahoo.com/search?p=http%3A%2F%2Fjaneandrobot.com&amp;amp;fr=sfp"&gt;Yahoo&lt;/a&gt;, and &lt;a href="http://search.live.com/results.aspx?q=site%3Ajaneandrobot.com&amp;amp;first=1&amp;amp;FORM=PERE"&gt;Microsoft&lt;/a&gt; showing that our Jane and Robot example promotional URLs are not indexed. &lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;

  &lt;li&gt;Take a look at your metrics and ensure the numbers you're recording correlate to the testing you are doing. Some additional things to consider: 
    &lt;ul&gt;
      &lt;li&gt;&lt;u&gt;Internal referrals &lt;/u&gt;- you might also want to add some logic to your application to filter out (or exclude) all referrals from the development team and your own employees. This is often done by checking requests against a list of known employee or company IP addresses and scrubbing those from your tracking data. &lt;/li&gt;

      &lt;li&gt;&lt;u&gt;Caching Issues &lt;/u&gt;- you might also want to try out several scenarios with multiple subsequent requests. You'll want to ensure that every request is going to your server and not getting cached somewhere along the way. &lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ol&gt;
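
&lt;p&gt;The internal-referral filter described in the last step might look something like the sketch below. It is an illustration under stated assumptions: the network ranges are placeholders to replace with your own, and the record format (a tuple of client IP address and referral code) and the function names are hypothetical.&lt;/p&gt;

```python
import ipaddress

# Placeholder ranges; replace with your own office and VPN networks.
INTERNAL_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("203.0.113.0/24"),
]


def is_internal(ip):
    """Return True if the address belongs to one of the internal networks."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in INTERNAL_NETWORKS)


def external_referrals(records):
    """Keep only (ip, referral) records that came from outside the company."""
    return [record for record in records if not is_internal(record[0])]
```

&lt;p&gt;Running external_referrals over your recorded referrals before reporting scrubs out clicks made by the development team while testing.&lt;/p&gt;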

&lt;h2&gt;Related Resources&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Related Internet Standards: 
    &lt;ul&gt;
      &lt;li&gt;&lt;a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9"&gt;W3C Standard For Cache-Control Header&lt;/a&gt; &lt;/li&gt;

      &lt;li&gt;&lt;a href="http://www.apps.ietf.org/rfc/rfc2396.html"&gt;URI Specification&lt;/a&gt; (how URLs work) &lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;

  &lt;li&gt;Tools Used in Article: 
    &lt;ul&gt;
      &lt;li&gt;&lt;a href="http://google.com/webmasters/tools"&gt;Google Webmaster Tools&lt;/a&gt; (Robots.txt Tester) &lt;/li&gt;

      &lt;li&gt;&lt;a href="http://siteexplorer.search.yahoo.com/"&gt;Yahoo Site Explorer&lt;/a&gt; (Dynamic URL Rewriting) &lt;/li&gt;

      &lt;li&gt;&lt;a href="https://addons.mozilla.org/en-US/firefox/addon/3829"&gt;Live HTTP Headers&lt;/a&gt; (View HTTP Headers) &lt;/li&gt;

      &lt;li&gt;Suggest URL Tool - &lt;a href="http://www.google.com/addurl/"&gt;Google&lt;/a&gt;, &lt;a href="http://siteexplorer.search.yahoo.com/submit"&gt;Yahoo&lt;/a&gt;, &lt;a href="http://search.msn.com.sg/docs/submit.aspx"&gt;Microsoft&lt;/a&gt; &lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;

  &lt;li&gt;Canonical Link Tag Standard: 
    &lt;ul&gt;
      &lt;li&gt;&lt;a href="http://searchengineland.com/canonical-tag-16537"&gt;Search Engine Land Article&lt;/a&gt; (Best Practices)&lt;/li&gt;

      &lt;li&gt;&lt;a href="http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html"&gt;Google Announcement Blog Post&lt;/a&gt;&lt;/li&gt;

      &lt;li&gt;&lt;a href="http://www.google.com/support/webmasters/bin/answer.py?hl=en&amp;amp;answer=139394"&gt;Google Help Documentation&lt;/a&gt;&lt;/li&gt;

      &lt;li&gt;&lt;a href="http://ysearchblog.com/2009/02/12/fighting-duplication-adding-more-arrows-to-your-quiver/"&gt;Yahoo Announcement Blog Post&lt;/a&gt;&lt;/li&gt;

      &lt;li&gt;&lt;a href="http://blogs.msdn.com/webmaster/archive/2009/02/12/partnering-to-help-solve-duplicate-content-issues.aspx"&gt;Live Search Announcement Blog Post&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;

  &lt;li&gt;Related articles 
    &lt;ul&gt;
      &lt;li&gt;&lt;a href="http://www.google.com/support/webmasters/bin/answer.py?answer=66359"&gt;Duplicate Content - Google Technical Support&lt;/a&gt; &lt;/li&gt;

      &lt;li&gt;&lt;a href="http://blogs.omniture.com/2008/10/01/campaign-tracking-inside-omniture-sitecatalyst/"&gt;URL Tracking in Omniture's SiteCatalyst&lt;/a&gt; &lt;/li&gt;

      &lt;li&gt;&lt;a href="http://www.google.com/support/googleanalytics/bin/answer.py?answer=55515"&gt;Goal Tracking in Google Analytics&lt;/a&gt; &lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;

  &lt;li&gt;&lt;a href="http://www.kaushik.net/avinash"&gt;Occam&amp;#8217;s Razor by Avinash Kaushik&lt;/a&gt; &lt;/li&gt;
&lt;/ul&gt;</description>
      <link>http://feedproxy.google.com/~r/janeandrobot/~3/ohl_CJWpbow/post.aspx</link>
      <author>Nathan Buggia</author>
      <comments>http://janeandrobot.com/post/URL-Referrer-Tracking.aspx#comment</comments>
      <guid isPermaLink="false">http://janeandrobot.com/post.aspx?id=cda667bc-9934-4975-a893-c002181d0582</guid>
      <pubDate>Sun, 30 Nov 2008 13:21:00 -0700</pubDate>
      <category>Design Patterns</category>
      <dc:publisher>Nathan Buggia</dc:publisher>
      <pingback:server>http://janeandrobot.com/pingback.axd</pingback:server>
      <pingback:target>http://janeandrobot.com/post.aspx?id=cda667bc-9934-4975-a893-c002181d0582</pingback:target>
      <slash:comments>40</slash:comments>
      <trackback:ping>http://janeandrobot.com/trackback.axd?id=cda667bc-9934-4975-a893-c002181d0582</trackback:ping>
      <wfw:comment>http://janeandrobot.com/post/URL-Referrer-Tracking.aspx#comment</wfw:comment>
      <wfw:commentRss>http://janeandrobot.com/syndication.axd?post=cda667bc-9934-4975-a893-c002181d0582</wfw:commentRss>
    <feedburner:origLink>http://janeandrobot.com/post.aspx?id=cda667bc-9934-4975-a893-c002181d0582</feedburner:origLink></item>
    <item>
      <title>Managing Robots' Access To Your Website</title>
      <description>&lt;p&gt;
Controlling which content search engines can find is crucial for many websites. Fortunately, the major search engines and other well-behaved robots observe the &lt;a href="http://www.robotstxt.org/" target="_blank"&gt;Robots Exclusion Protocol&lt;/a&gt; (REP), which has evolved organically since the early 1990s to provide a set of controls over which parts of a web site search engine robots can crawl and index. 
&lt;/p&gt;
&lt;p&gt;
Article Sections: 
&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;&lt;a href="#Capabilities_of_the_REP"&gt;Capabilities of REP&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href="#Deciding_what_should_be_Public_vs._Private"&gt;Deciding What Should be Public vs. Private&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href="#Implementing_the_REP"&gt;Implementing the REP&lt;/a&gt; 
	&lt;ul style="margin-bottom: 0pt"&gt;
		&lt;li&gt;&lt;a href="#Site_Level_Implementation_(Robots.txt)"&gt;Site Level&lt;/a&gt;&lt;/li&gt;
		&lt;li&gt;&lt;a href="#Page_Level_Implementation_(META_Tags)"&gt;Page Level (Meta Tags)&lt;/a&gt;&lt;br /&gt;
		&lt;/li&gt;
		&lt;li&gt;&lt;a href="#HTTP_Header_Implementation_(X-ROBOTS-Tag)"&gt;Page Level (HTTP Header)&lt;/a&gt;&lt;/li&gt;
		&lt;li&gt;&lt;a href="#Content_Level_Implementation"&gt;Content Level&lt;/a&gt;&lt;br /&gt;
		&lt;/li&gt;
	&lt;/ul&gt;
	&lt;/li&gt;
	&lt;li&gt;&lt;a href="#Common_implementation_mistakes"&gt;Common Mistakes&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href="#Testing_your_implementation_"&gt;Testing Your Implementation&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href="#removal"&gt;Removing Content From Search Engine Indices&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href="#Additional_Resources:_"&gt;Additional Resources&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;&lt;a name="Capabilities_of_the_REP" title="Capabilities_of_the_REP"&gt;&lt;/a&gt;Capabilities of the REP&lt;/h2&gt;
&lt;p&gt;
The Robots Exclusion Protocol provides controls that can be applied at the site level (robots.txt), at the page level (META tag, or X-Robots-Tag), or at the HTML element level to control both the crawl of your site and the way it&amp;#39;s listed in the search engine results pages (SERPs). Below is a table listing the common scenarios, directives, and which search engines support them. 
&lt;/p&gt;
&lt;table border="1" cellspacing="0" cellpadding="2"&gt;
	&lt;tbody&gt;
		&lt;tr&gt;
			&lt;td valign="top"&gt;&lt;strong&gt;Use Case&lt;/strong&gt;&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&lt;strong&gt;Robots.txt&lt;/strong&gt;&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&lt;strong&gt;META/ X-Robots-Tag&lt;/strong&gt;&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&lt;strong&gt;Other&lt;/strong&gt;&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&lt;strong&gt;Supported By&lt;/strong&gt;&lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td valign="top"&gt;Allow access to your content&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;Allow&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;FOLLOW&lt;br /&gt;
			INDEX&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&amp;nbsp;&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&lt;a href="http://www.google.com/support/webmasters/bin/answer.py?answer=40364"&gt;Google&lt;/a&gt; &lt;br /&gt;
			&lt;a href="http://help.yahoo.com/l/us/yahoo/search/webcrawler/slurp-02.html"&gt;Yahoo&lt;/a&gt; &lt;br /&gt;
			&lt;a href="http://blogs.msdn.com/webmaster/archive/2008/06/03/robots-exclusion-protocol-joining-together-to-provide-better-documentation.aspx"&gt;Microsoft&lt;/a&gt; &lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td valign="top"&gt;Disallow access to your content&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;Disallow &lt;br /&gt;
			&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;NOINDEX &lt;br /&gt;
			NOFOLLOW&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&amp;nbsp;&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&lt;a href="http://www.google.com/support/webmasters/bin/answer.py?hl=en&amp;amp;answer=35303"&gt;Google&lt;/a&gt; &lt;br /&gt;
			&lt;a href="http://help.yahoo.com/l/us/yahoo/search/webcrawler/slurp-02.html"&gt;Yahoo&lt;/a&gt; &lt;br /&gt;
			&lt;a href="http://blogs.msdn.com/webmaster/archive/2008/06/03/robots-exclusion-protocol-joining-together-to-provide-better-documentation.aspx"&gt;Microsoft&lt;/a&gt; &lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td valign="top"&gt;Disallow indexing of images on the page&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&amp;nbsp;&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;NOIMAGEINDEX&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&amp;nbsp;&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&lt;a href="http://www.google.com/support/webmasters/bin/answer.py?hl=en&amp;amp;answer=79892"&gt;Google&lt;/a&gt;&lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td valign="top"&gt;Disallow the display of a cached version of your content in the SERP&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&amp;nbsp;&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;NOARCHIVE&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&amp;nbsp;&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&lt;a href="http://www.google.com/support/webmasters/bin/answer.py?answer=35306="&gt;Google&lt;/a&gt; &lt;br /&gt;
			&lt;a href="http://help.yahoo.com/l/us/yahoo/search/deletion/basics-10.html"&gt;Yahoo&lt;/a&gt; &lt;br /&gt;
			&lt;a href="http://blogs.msdn.com/webmaster/archive/2008/06/03/robots-exclusion-protocol-joining-together-to-provide-better-documentation.aspx"&gt;Microsoft&lt;/a&gt; &lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td valign="top"&gt;Disallow the creation of a description for this content in the SERP&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&amp;nbsp;&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;NOSNIPPET&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&amp;nbsp;&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&lt;a href="http://www.google.com/support/webmasters/bin/answer.py?answer=35304"&gt;Google&lt;/a&gt; &lt;br /&gt;
			&lt;a href="http://www.ysearchblog.com/archives/000587.html"&gt;Yahoo&lt;/a&gt; &lt;br /&gt;
			&lt;a href="http://blogs.msdn.com/webmaster/archive/2008/06/03/robots-exclusion-protocol-joining-together-to-provide-better-documentation.aspx"&gt;Microsoft&lt;/a&gt; &lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td valign="top"&gt;Disallow the translation of your content into other languages&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&amp;nbsp;&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;NOTRANSLATE&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&amp;nbsp;&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&lt;a href="http://www.google.com/help/faq_translation.html#donttrans"&gt;Google&lt;/a&gt; &lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td valign="top"&gt;Do not follow or give weight to links within this content&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&amp;nbsp;&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;NOFOLLOW &lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;a href attribute:&lt;br /&gt;
			rel=NOFOLLOW&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&lt;a href="http://www.google.com/support/webmasters/bin/answer.py?answer=96569"&gt;Google&lt;/a&gt; &lt;br /&gt;
			&lt;a href="http://www.ysearchblog.com/archives/000069.html"&gt;Yahoo&lt;/a&gt; &lt;br /&gt;
			&lt;a href="http://blogs.msdn.com/livesearch/archive/2005/01/18/nofollow_tags.aspx"&gt;Microsoft&lt;/a&gt; &lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td valign="top"&gt;Do not use the &lt;a href="http://www.dmoz.org/" target="_blank"&gt;Open Directory Project&lt;/a&gt; (ODP) to create descriptions for your content in the SERP&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&amp;nbsp;&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;NOODP&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&amp;nbsp;&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&lt;a href="http://www.google.com/support/webmasters/bin/answer.py?answer=35264"&gt;Google&lt;/a&gt; &lt;br /&gt;
			&lt;a href="http://help.yahoo.com/l/us/yahoo/search/indexing/indexing-11.html"&gt;Yahoo&lt;/a&gt; &lt;br /&gt;
			&lt;a href="http://blogs.msdn.com/webmaster/archive/2008/06/03/robots-exclusion-protocol-joining-together-to-provide-better-documentation.aspx"&gt;Microsoft&lt;/a&gt; &lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td valign="top"&gt;Do not use the Yahoo Directory to create descriptions for your content in the SERP&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&amp;nbsp;&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;NOYDIR&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&amp;nbsp;&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&lt;a href="http://blogs.msdn.com/webmaster/archive/2008/06/03/robots-exclusion-protocol-joining-together-to-provide-better-documentation.aspx"&gt;Yahoo&lt;/a&gt;&lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td valign="top"&gt;Do not index this specific element within an HTML page&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&amp;nbsp;&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&amp;nbsp;&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;class=robots-nocontent&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&lt;a href="http://www.ysearchblog.com/archives/000444.html"&gt;Yahoo&lt;/a&gt;&lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td valign="top"&gt;Stop indexing this content after a specific date&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&amp;nbsp;&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;UNAVAILABLE_AFTER&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&amp;nbsp;&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&lt;a href="http://googleblog.blogspot.com/2007/07/robots-exclusion-protocol-now-with-even.html"&gt;Google&lt;/a&gt;&lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td valign="top"&gt;Specify a sitemap file or a sitemap index file&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;Sitemap&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&amp;nbsp;&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&amp;nbsp;&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&lt;a href="http://www.google.com/support/webmasters/bin/answer.py?hl=en&amp;amp;answer=64748"&gt;Google&lt;/a&gt; &lt;br /&gt;
			&lt;a href="http://www.ysearchblog.com/archives/000437.html"&gt;Yahoo&lt;/a&gt; &lt;br /&gt;
			&lt;a href="http://blogs.msdn.com/livesearch/archive/2007/04/11/discovering-sitemaps.aspx"&gt;Microsoft&lt;/a&gt; &lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td valign="top"&gt;Specify how frequently a crawler may access your website&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;Crawl-Delay &lt;br /&gt;
			&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&amp;nbsp;&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&lt;a href="http://google.com/webmaster"&gt;Google WMT&lt;/a&gt;&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&lt;a href="http://help.yahoo.com/l/us/yahoo/search/webcrawler/slurp-03.html"&gt;Yahoo&lt;/a&gt; &lt;br /&gt;
			&lt;a href="http://blogs.msdn.com/webmaster/archive/2008/04/18/ramping-up-msnbot.aspx"&gt;Microsoft&lt;/a&gt; &lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td valign="top"&gt;Authenticate the identity of the crawler&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&amp;nbsp;&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&amp;nbsp;&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;Reverse DNS Lookup&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&lt;a href="http://googlewebmastercentral.blogspot.com/2006/09/how-to-verify-googlebot.html"&gt;Google&lt;/a&gt; &lt;br /&gt;
			&lt;a href="http://www.ysearchblog.com/archives/000460.html"&gt;Yahoo&lt;/a&gt; &lt;br /&gt;
			&lt;a href="http://blogs.msdn.com/livesearch/archive/2006/11/29/search-robots-in-disguise.aspx"&gt;Microsoft&lt;/a&gt; &lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td valign="top"&gt;Request removal of your content from the engine&amp;#39;s index&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&amp;nbsp;&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&amp;nbsp;&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&lt;a href="http://google.com/webmaster"&gt;Google WMT&lt;/a&gt; &lt;br /&gt;
			&lt;a href="http://siteexplorer.search.yahoo.com"&gt;Yahoo SE&lt;/a&gt; &lt;br /&gt;
			&lt;a href="http://webmaster.live.com/"&gt;Microsoft WMT&lt;/a&gt;&lt;/td&gt;
			&lt;td valign="top" style="text-align: center"&gt;&lt;a href="http://googlewebmastercentral.blogspot.com/2007/04/requesting-removal-of-content-from-our.html"&gt;Google&lt;/a&gt; &lt;br /&gt;
			&lt;a href="http://help.yahoo.com/l/us/yahoo/search/siteexplorer/delete/"&gt;Yahoo&lt;/a&gt; &lt;br /&gt;
			Microsoft &lt;/td&gt;
		&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;
&lt;h2&gt;&lt;a name="Deciding_what_should_be_Public_vs._Private" title="Deciding_what_should_be_Public_vs._Private"&gt;&lt;/a&gt;Deciding What Should be Public vs. Private&lt;/h2&gt;
&lt;p&gt;
One of the first steps in managing robots is knowing which content should be public and which should be private. Start with the assumption that, by default, everything is public, then explicitly identify the items that are private. 
&lt;/p&gt;
&lt;p&gt;
If you want search engines to access all the content on your site, you don&amp;#39;t need a robots.txt file at all. When a search engine tries to access the robots.txt file on your site and the server can&amp;#39;t return one (ideally by returning a 404 HTTP status code), the search engine treats this the same as a robots.txt file that allows access to everything. 
&lt;/p&gt;
&lt;p&gt;
Every website and every business has a different set of needs, so there&amp;#39;s no blanket rule for what to make private, but some common elements may apply. 
&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;
	&lt;p&gt;
	&lt;strong&gt;Private data&lt;/strong&gt; - You may have content on your site that you don&amp;#39;t want to be searchable in search engines. For instance, you may have private user information (such as addresses) that you don&amp;#39;t want surfaced. For this type of content, you may want to use a more secure approach that keeps all visitors from the pages (such as password protection). However, some types of content are fine for visitor access, but not search engine access. For instance, you may run a discussion forum that is open for public viewing, but you may not want individual posts to appear in search results for forum member names. 
	&lt;/p&gt;
	&lt;/li&gt;
	&lt;li&gt;
	&lt;p&gt;
	&lt;strong&gt;&lt;a name="noncontent" title="noncontent"&gt;&lt;/a&gt;Non-content content&lt;/strong&gt; - Some content, like &lt;a href="http://janeandrobot.com/post/Effectively-Using-Images.aspx#noncontent"&gt;images used for navigation&lt;/a&gt;, provides little value to searchers. It&amp;#39;s not harmful to include these items in search engine indices, but since search engines allocate limited bandwidth to crawl each site and limited space to store content from each site, it may make sense to block these items to help direct the bots to the content on your site that you do want indexed. 
	&lt;/p&gt;
	&lt;/li&gt;
	&lt;li&gt;
	&lt;p&gt;
	&lt;strong&gt;Printer-friendly pages&lt;/strong&gt; - If you have specific pages (URLs) that are formatted for printing, you may want to block them to avoid duplicate content issues. The drawback to allowing the printer-friendly page to be indexed is that it could be listed in the search results instead of the default version of the page, which wouldn&amp;#39;t provide an ideal experience for a visitor coming to the site through search. 
	&lt;/p&gt;
	&lt;/li&gt;
	&lt;li&gt;
	&lt;p&gt;
	&lt;strong&gt;Affiliate links and advertising&lt;/strong&gt; - If you include advertising on your site, you can keep search engine robots from following the ad links by pointing each link at a blocked intermediate page that then redirects visitors to the destination page. (There are other methods for implementing advertising-based links as well.) 
	&lt;/p&gt;
	&lt;/li&gt;
	&lt;li&gt;
	&lt;p&gt;
	&lt;strong&gt;Landing pages&lt;/strong&gt; - Your site may include multiple variations of entry pages used for advertising purposes. For instance, you may run AdWords campaigns that link to a particular version of a page based on the ad, or you may print different URLs for different print ad campaigns (either for tracking purposes or to provide a custom experience related to the ad). Since these pages are meant to be an extension of the ad, and are generally near duplicates of the default version of the page, you may want to block these landing pages from being indexed. 
	&lt;/p&gt;
	&lt;/li&gt;
	&lt;li&gt;
	&lt;p&gt;
	&lt;strong&gt;Experimental pages&lt;/strong&gt; - As you try new ideas on your site (for instance, using A/B testing), you likely want to block all but the original page from being indexed during the experiment. 
	&lt;/p&gt;
	&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;&lt;a name="Implementing_the_REP" title="Implementing_the_REP"&gt;&lt;/a&gt;Implementing the REP&lt;/h2&gt;
&lt;p&gt;
REP is flexible and can be implemented in a number of ways. This flexibility lets you easily specify some policies for your entire site (or subdomain) and then refine them at the page or link level as needed. 
&lt;/p&gt;
&lt;h3&gt;&lt;a name="Site_Level_Implementation_(Robots.txt)" title="Site_Level_Implementation_(Robots.txt)"&gt;&lt;/a&gt;Site Level Implementation (Robots.txt)&lt;/h3&gt;
&lt;p&gt;
Site-wide directives are stored in a robots.txt file, which must be located in the root directory of each domain or subdomain (e.g. &lt;a href="http://janeandrobot.com/robots.txt"&gt;http://janeandrobot.com/robots.txt&lt;/a&gt;). Note that a robots.txt file applies only to the hostname where it is placed and does not apply to subdomains. So a robots.txt file located at &lt;a href="http://microsoft.com/robots.txt"&gt;http://microsoft.com/robots.txt&lt;/a&gt; will not apply to the MSDN subdomain &lt;a href="http://msdn.microsoft.com/"&gt;http://msdn.microsoft.com&lt;/a&gt;. However, the robots.txt file does apply to all subfolders and pages within the specified hostname. 
&lt;/p&gt;
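&lt;p&gt;
Because each hostname has its own robots.txt file, the file that governs a page can be worked out from the page URL alone. The following is a minimal Python sketch of that lookup. The helper name &lt;code&gt;robots_txt_url&lt;/code&gt; and the page paths are hypothetical and used only for illustration. 
&lt;/p&gt;

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url):
    """Return the robots.txt URL that governs page_url (hypothetical helper)."""
    parts = urlsplit(page_url)
    # Keep only the scheme and hostname; the file sits at the root of the host.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# A page on the subdomain is governed by the subdomain's own file.
print(robots_txt_url("http://msdn.microsoft.com/library/page.aspx"))
# A page in a subfolder is still governed by the root file of its hostname.
print(robots_txt_url("http://janeandrobot.com/post/example.aspx"))
```

&lt;p&gt;
The two calls print different robots.txt locations because the hostnames differ, while the folder depth of the page makes no difference. 
&lt;/p&gt;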
&lt;p&gt;
A robots.txt file is a UTF-8 encoded file that contains entries that consist of a user-agent line (that tells the search engine robot if the entry is directed at it) and one or more directives that specify content that the search engine robot is blocked from crawling or indexing. A simple robots.txt file is shown below. 
&lt;/p&gt;
&lt;div class="csharpcode"&gt;
&lt;pre class="alt"&gt;
User-agent: *
&lt;/pre&gt;
&lt;pre class="alt"&gt;
Disallow: /private
&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;
&lt;code&gt;user-agent:&lt;/code&gt; - Specifies which robots the entry applies to. 
&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;Set this to &lt;code&gt;*&lt;/code&gt; to specify that this entry applies to all search engine robots. &lt;/li&gt;
	&lt;li&gt;Set this to a specific robot name to provide instructions for just that robot. You can find a complete list of robot names at &lt;a href="http://www.robotstxt.org"&gt;robotstxt.org&lt;/a&gt;. &lt;/li&gt;
	&lt;li&gt;If you direct an entry at a particular robot, then it obeys that entry &lt;em&gt;instead&lt;/em&gt; of any entries defined for &lt;code&gt;user-agent: * &lt;/code&gt;(rather than in addition to those entries).&lt;/li&gt;
&lt;/ul&gt;
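&lt;p&gt;
The "instead of" behavior in the last point can be checked with the &lt;code&gt;urllib.robotparser&lt;/code&gt; module from the Python standard library. This is only a sketch: the robot names &lt;code&gt;ExampleBot&lt;/code&gt; and &lt;code&gt;OtherBot&lt;/code&gt;, the paths, and the URLs are invented for illustration, and the module is a simple parser that may not match every detail of how the major search engines read the file. 
&lt;/p&gt;

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one entry for all robots and one for a robot
# called ExampleBot (an invented name used only in this sketch).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private

User-agent: ExampleBot
Disallow: /drafts
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A robot without its own entry falls back to the "*" entry.
other_private = parser.can_fetch("OtherBot", "http://www.example.com/private/a.html")
# ExampleBot obeys only its own entry, so /private is not blocked for it.
example_private = parser.can_fetch("ExampleBot", "http://www.example.com/private/a.html")
example_drafts = parser.can_fetch("ExampleBot", "http://www.example.com/drafts/a.html")

print(other_private, example_private, example_drafts)
```

&lt;p&gt;
If you want a named robot to stay out of &lt;code&gt;/private&lt;/code&gt; as well, the rule has to be repeated inside that robot's own entry. 
&lt;/p&gt;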
&lt;p&gt;
The major search engines have multiple robots that crawl the web for different types of content (such as images or mobile). The names of an engine&amp;#39;s robots generally begin with the same string, so if you block the primary robot, all robots for that search engine are blocked as well. However, if you want to block only a more specific robot, you can block it directly and still allow the primary web crawler access. 
&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;&lt;strong&gt;&lt;a href="http://www.google.com/support/webmasters/bin/answer.py?answer=40364"&gt;Google&lt;/a&gt;&lt;/strong&gt; - The primary search engine robot is Googlebot.&amp;nbsp; &lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;&lt;a href="http://help.yahoo.com/l/us/yahoo/search/webcrawler/slurp-02.html"&gt;Yahoo!&lt;/a&gt;&lt;/strong&gt; - The primary search engine robot is Slurp. &lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;&lt;a href="http://blogs.msdn.com/livesearch/archive/2006/11/29/search-robots-in-disguise.aspx"&gt;Live Search&lt;/a&gt;&lt;/strong&gt; - The primary search engine robot is MSNbot.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;
&lt;code&gt;Disallow: &lt;/code&gt;- Specifies what content is blocked. 
&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;Must begin with a slash (&lt;code&gt;/&lt;/code&gt;). &lt;/li&gt;
	&lt;li&gt;Blocks access to any URL whose path begins with the specified value. For instance, &lt;code&gt;Disallow: /images&lt;/code&gt; blocks access to &lt;code&gt;/images/&lt;/code&gt;, &lt;code&gt;/images/image1.jpg&lt;/code&gt;, and &lt;code&gt;/images10&lt;/code&gt;. &lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;
You can specify other rules for search engine robots in addition to the standard instructions that block access to content as noted in &lt;a href="#other"&gt;other robot instructions&lt;/a&gt;. 
&lt;/p&gt;
&lt;p&gt;
Some things to note about robots.txt implementation: 
&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;The major search engines support pattern matching using the asterisk character (*) for wildcard match and the dollar sign ($) for end of sequence matching as described below in &lt;a href="#patterns"&gt;using pattern matching&lt;/a&gt;.&lt;/li&gt;
	&lt;li&gt;The robots.txt file is case sensitive, so &lt;code&gt;Disallow: /images &lt;/code&gt;would block &lt;code&gt;http://www.example.com/images&lt;/code&gt; but not &lt;code&gt;http://www.example.com/Images&lt;/code&gt;.&lt;/li&gt;
	&lt;li&gt;If conflicts exist in the file, the robot obeys the longest (and therefore generally more specific) line.&lt;/li&gt;
&lt;/ul&gt;
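&lt;p&gt;
The prefix matching and case sensitivity described above can be tried out with Python's standard &lt;code&gt;urllib.robotparser&lt;/code&gt; module. The sketch below exercises only those two behaviors. The &lt;code&gt;www.example.com&lt;/code&gt; host and the &lt;code&gt;/index.html&lt;/code&gt; path are placeholders, and the result reflects this one parser rather than any particular search engine. 
&lt;/p&gt;

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.parse([
    "User-agent: *",
    "Disallow: /images",
])

BASE = "http://www.example.com"  # placeholder host
# Each path is checked against the single Disallow rule above.
results = {
    path: parser.can_fetch("*", BASE + path)
    for path in ["/images/", "/images/image1.jpg", "/images10", "/Images", "/index.html"]
}

for path, allowed in results.items():
    print(path, "allowed" if allowed else "blocked")
```

&lt;p&gt;
The first three paths are reported as blocked because they begin with &lt;code&gt;/images&lt;/code&gt;, while &lt;code&gt;/Images&lt;/code&gt; is allowed because the comparison is case sensitive. 
&lt;/p&gt;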
&lt;h4&gt;Basic Samples&lt;/h4&gt;
&lt;p&gt;
&lt;em&gt;Block all robots&lt;/em&gt; - Useful when your site is in pre-launch development and isn&amp;#39;t ready for search traffic. 
&lt;/p&gt;
&lt;div class="csharpcode"&gt;
&lt;pre class="alt"&gt;
# This keeps out all well-behaved robots.
&lt;/pre&gt;
&lt;pre class="alt"&gt;
# Disallow: * is not valid.
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&amp;nbsp;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
User-agent: *
&lt;/pre&gt;
&lt;pre class="alt"&gt;
Disallow: /
&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;
&lt;em&gt;Keep out all bots by default&lt;/em&gt; - Blocks all pages except those specified. Not recommended, as it is difficult to maintain and diagnose. 
&lt;/p&gt;
&lt;div class="csharpcode"&gt;
&lt;pre class="alt"&gt;
# Stay out unless otherwise stated
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&amp;nbsp;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
User-agent: *
&lt;/pre&gt;
&lt;pre class="alt"&gt;
Disallow: /
&lt;/pre&gt;
&lt;pre class="alt"&gt;
Allow: /Public/
&lt;/pre&gt;
&lt;pre class="alt"&gt;
Allow: /articles/
&lt;/pre&gt;
&lt;pre class="alt"&gt;
Allow: /images/
&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;
&lt;em&gt;Block specific content&lt;/em&gt; - The most common usage of robots.txt. 
&lt;/p&gt;
&lt;div class="csharpcode"&gt;
&lt;pre class="alt"&gt;
# Block access to the images folder
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&amp;nbsp;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
User-agent: *
&lt;/pre&gt;
&lt;pre class="alt"&gt;
Disallow: /images/
&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;
&lt;a name="allow" title="allow"&gt;&lt;/a&gt;&lt;em&gt;Allow specific content&lt;/em&gt; - Block a folder, but allow access to selected pages in that folder. 
&lt;/p&gt;
&lt;div class="csharpcode"&gt;
&lt;pre class="alt"&gt;
# Block everything in the images folder
&lt;/pre&gt;
&lt;pre class="alt"&gt;
# Except allow images/image1.jpg
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&amp;nbsp;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
User-agent: *
&lt;/pre&gt;
&lt;pre class="alt"&gt;
Disallow: /images/
&lt;/pre&gt;
&lt;pre class="alt"&gt;
Allow: /images/image1.jpg
&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;
&lt;em&gt;Allow specific robot&lt;/em&gt; - Block a class of robots (for instance, Googlebot), but allow a specific bot in that class (for instance, Googlebot-Mobile). 
&lt;/p&gt;
&lt;div class="csharpcode"&gt;
&lt;pre class="alt"&gt;
# Block Googlebot access
&lt;/pre&gt;
&lt;pre class="alt"&gt;
# Allow Googlebot-Mobile access
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&amp;nbsp;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
User-agent: Googlebot
&lt;/pre&gt;
&lt;pre class="alt"&gt;
Disallow: /
&lt;/pre&gt;
&lt;pre class="alt"&gt;
User-agent: Googlebot-Mobile
&lt;/pre&gt;
&lt;pre class="alt"&gt;
Allow: /
&lt;/pre&gt;
&lt;/div&gt;
&lt;h4&gt;&lt;a name="patterns" title="patterns"&gt;&lt;/a&gt;Pattern Matching Examples&lt;/h4&gt;
&lt;p&gt;
The major engines support two types of pattern matching. 
&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;&lt;strong&gt;*&lt;/strong&gt; matches any sequence of characters.&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;$&lt;/strong&gt; matches the end of the URL.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;
&lt;em&gt;Block access to URLs that contain a set of characters&lt;/em&gt; - Use the asterisk (*) to specify a wildcard. 
&lt;/p&gt;
&lt;div class="csharpcode"&gt;
&lt;pre class="alt"&gt;
# Block access to all URLs that include an ampersand
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&amp;nbsp;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
User-agent: *
&lt;/pre&gt;
&lt;pre class="alt"&gt;
Disallow: /*&amp;amp;
&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;
This directive would block search engines from crawling &lt;code&gt;http://www.example.com/page1.asp?id=5&amp;amp;sessionid=xyz&lt;/code&gt;. 
&lt;/p&gt;
&lt;p&gt;
&lt;em&gt;Block access to URLs that end with a set of characters&lt;/em&gt; - Use the dollar sign ($) to specify the end of the URL. 
&lt;/p&gt;
&lt;div class="answer_heading"&gt;
&lt;div class="csharpcode"&gt;
&lt;pre class="alt"&gt;
# Block access to all URLs that end in .cgi
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&amp;nbsp;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
User-agent: *
&lt;/pre&gt;
&lt;pre class="alt"&gt;
Disallow: /*.cgi$
&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;
This directive would block search engines from crawling &lt;code&gt;http://www.example.com/script1.cgi&lt;/code&gt; but &lt;em&gt;not&lt;/em&gt; from crawling &lt;code&gt;http://www.example.com/script1.cgi?value=1&lt;/code&gt;. 
&lt;/p&gt;
&lt;p&gt;
&lt;em&gt;Selectively allow access to a URL that matches a blocked pattern&lt;/em&gt; - Use the &lt;code&gt;Allow&lt;/code&gt; directive in conjunction with pattern matching for more complex implementations. 
&lt;/p&gt;
&lt;div class="csharpcode"&gt;
&lt;pre class="alt"&gt;
# Block access to URLs that contain ?
&lt;/pre&gt;
&lt;pre class="alt"&gt;
# Allow access to URLs that end in ?
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&amp;nbsp;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
User-agent: *
&lt;/pre&gt;
&lt;pre class="alt"&gt;
Disallow: /*?
&lt;/pre&gt;
&lt;pre class="alt"&gt;
Allow: /*?$
&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;
That directive blocks all URLs that contain &lt;code&gt;?&lt;/code&gt; except those that end in &lt;code&gt;?&lt;/code&gt;. In this example, the default version of the page will be indexable: 
&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;&lt;code&gt;http://www.example.com/productlisting.aspx?&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;
Variations of the page will be blocked: 
&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;&lt;code&gt;http://www.example.com/productlisting.aspx?nav=price&lt;/code&gt;&lt;/li&gt;
	&lt;li&gt;&lt;code&gt;http://www.example.com/productlisting.aspx?sort=alpha&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
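&lt;p&gt;
To make the matching rules above concrete, here is a small Python sketch that models them. It is not any search engine's actual implementation. The function names &lt;code&gt;pattern_to_regex&lt;/code&gt; and &lt;code&gt;is_allowed&lt;/code&gt; are invented for this example, and the sketch assumes that the longest matching pattern wins and that a path matching no rule is allowed. 
&lt;/p&gt;

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regular expression."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the pattern, then turn each escaped asterisk into "any characters".
    body = re.escape(pattern).replace(r"\*", ".*")
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(rules, path):
    """Decide whether path may be crawled.

    rules is a list of (directive, pattern) pairs. Assumption for this sketch:
    the longest matching pattern wins, and an unmatched path is allowed.
    """
    matching = [(len(pattern), directive)
                for directive, pattern in rules
                if pattern_to_regex(pattern).match(path)]
    if not matching:
        return True
    _, directive = max(matching, key=lambda item: item[0])
    return directive == "allow"


cgi_rules = [("disallow", "/*.cgi$")]
query_rules = [("disallow", "/*?"), ("allow", "/*?$")]

print(is_allowed(cgi_rules, "/script1.cgi"))
print(is_allowed(cgi_rules, "/script1.cgi?value=1"))
print(is_allowed(query_rules, "/productlisting.aspx?"))
print(is_allowed(query_rules, "/productlisting.aspx?nav=price"))
```

&lt;p&gt;
The two rule sets mirror the &lt;code&gt;.cgi&lt;/code&gt; and &lt;code&gt;?&lt;/code&gt; samples above, so the sketch can be used to experiment with other paths before editing a live robots.txt file. 
&lt;/p&gt;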
&lt;h4&gt;&lt;a name="other" title="other"&gt;&lt;/a&gt;Other robot instructions&lt;/h4&gt;
&lt;/div&gt;
&lt;p&gt;
&lt;em&gt;Specify a Sitemap or Sitemap index file&lt;/em&gt; - If you&amp;#39;d like to provide search engines with a comprehensive list of your best URLs, you can provide one or more &lt;a href="http://sitemaps.org" target="_blank"&gt;Sitemap&lt;/a&gt; autodiscovery directives. Note that the user-agent line does not apply to this directive, so you cannot use it to offer a Sitemap to some search engines but not others. 
&lt;/p&gt;
&lt;div class="csharpcode"&gt;
&lt;pre class="alt"&gt;
# Please take my sitemap and index everything!
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&amp;nbsp;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
Sitemap: http://janeandrobot.com/sitemap.axd
&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;
&lt;em&gt;Reduce the crawling load&lt;/em&gt; - The &lt;code&gt;Crawl-delay&lt;/code&gt; directive only works with Microsoft and Yahoo. For Google, you&amp;#39;ll need to specify a slower crawling speed through their &lt;a href="http://google.com/webmaster" target="_blank"&gt;Webmaster Tools&lt;/a&gt;. Be careful when implementing this: if you slow down the crawl too much, robots won&amp;#39;t be able to get to all of your site and you may lose pages from the index. 
&lt;/p&gt;
&lt;div class="csharpcode"&gt;
&lt;pre class="alt"&gt;
# MSNBot, please wait 5 seconds in between visits
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&amp;nbsp;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
User-agent: msnbot
&lt;/pre&gt;
&lt;pre class="alt"&gt;
Crawl-delay: 5
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&amp;nbsp;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
# Yahoo&amp;#39;s Slurp, please wait 12 seconds in between visits
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&amp;nbsp;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
User-agent: slurp
&lt;/pre&gt;
&lt;pre class="alt"&gt;
Crawl-delay: 12
&lt;/pre&gt;
&lt;/div&gt;
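&lt;p&gt;
If you want to check how a parser reads these per-robot delays, Python&amp;#39;s standard &lt;code&gt;urllib.robotparser&lt;/code&gt; module understands &lt;code&gt;Crawl-delay&lt;/code&gt;. This is only a sketch of one parser&amp;#39;s interpretation, not a statement of how MSNBot or Slurp behave internally. The robots.txt text is the example above with the comments removed.
&lt;/p&gt;
&lt;div class="csharpcode"&gt;
&lt;pre class="alt"&gt;
```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: msnbot
Crawl-delay: 5

User-agent: slurp
Crawl-delay: 12
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# crawl_delay returns None when no section applies to the given robot.
delays = {
    agent: parser.crawl_delay(agent)
    for agent in ("msnbot", "slurp", "Googlebot")
}
print(delays)
```
&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;
The lookup for Googlebot comes back empty because no section in this file applies to it, which matches the point above that Google&amp;#39;s crawl rate has to be set elsewhere.
&lt;/p&gt;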
&lt;h3&gt;&lt;a name="Page_Level_Implementation_(META_Tags)" title="Page_Level_Implementation_(META_Tags)"&gt;&lt;/a&gt;Page Level Implementation (META Tags)&lt;/h3&gt;
&lt;p&gt;
The REP page-level directives allow you to refine the site-wide policies on a page-by-page basis. 
&lt;/p&gt;
&lt;p&gt;
&lt;em&gt;Placing a meta tag on the page&lt;/em&gt; - Place the meta tag inside the head tag. Separate multiple directives with commas inside the content attribute, e.g. &amp;lt;meta name=&amp;quot;ROBOTS&amp;quot; content=&amp;quot;Directive1, Directive2&amp;quot;&amp;gt;. 
&lt;/p&gt;
&lt;div class="csharpcode"&gt;
&lt;pre class="alt"&gt;
&lt;span class="kwrd"&gt;&amp;lt;&lt;/span&gt;&lt;span class="html"&gt;html&lt;/span&gt;&lt;span class="kwrd"&gt;&amp;gt;&lt;/span&gt;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&lt;span class="kwrd"&gt;&amp;lt;&lt;/span&gt;&lt;span class="html"&gt;head&lt;/span&gt;&lt;span class="kwrd"&gt;&amp;gt;&lt;/span&gt;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&lt;span class="kwrd"&gt;&amp;lt;&lt;/span&gt;&lt;span class="html"&gt;title&lt;/span&gt;&lt;span class="kwrd"&gt;&amp;gt;&lt;/span&gt;Your title here&lt;span class="kwrd"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="html"&gt;title&lt;/span&gt;&lt;span class="kwrd"&gt;&amp;gt;&lt;/span&gt;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&lt;span class="kwrd"&gt;&amp;lt;&lt;/span&gt;&lt;span class="html"&gt;meta&lt;/span&gt; &lt;span class="attr"&gt;name&lt;/span&gt;&lt;span class="kwrd"&gt;=&amp;quot;ROBOTS&amp;quot;&lt;/span&gt; &lt;span class="attr"&gt;content&lt;/span&gt;&lt;span class="kwrd"&gt;=&amp;quot;NOINDEX&amp;quot;&lt;/span&gt;&lt;span class="kwrd"&gt;&amp;gt;&lt;/span&gt;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&lt;span class="kwrd"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="html"&gt;head&lt;/span&gt;&lt;span class="kwrd"&gt;&amp;gt;&lt;/span&gt;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&lt;span class="kwrd"&gt;&amp;lt;&lt;/span&gt;&lt;span class="html"&gt;body&lt;/span&gt;&lt;span class="kwrd"&gt;&amp;gt;&lt;/span&gt;Your page here&lt;span class="kwrd"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="html"&gt;body&lt;/span&gt;&lt;span class="kwrd"&gt;&amp;gt;&lt;/span&gt;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&lt;span class="kwrd"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="html"&gt;html&lt;/span&gt;&lt;span class="kwrd"&gt;&amp;gt;&lt;/span&gt;
&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;
&lt;em&gt;Targeting a specific search engine&lt;/em&gt; - Within the meta tag you can specify which search engine you would like to target, or you can target them all. 
&lt;/p&gt;
&lt;div class="csharpcode"&gt;
&lt;pre class="alt"&gt;
&lt;span class="rem"&gt;&amp;lt;!-- Applies to All Robots --&amp;gt;&lt;/span&gt;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&lt;span class="kwrd"&gt;&amp;lt;&lt;/span&gt;&lt;span class="html"&gt;meta&lt;/span&gt; &lt;span class="attr"&gt;name&lt;/span&gt;&lt;span class="kwrd"&gt;=&amp;quot;ROBOTS&amp;quot;&lt;/span&gt; &lt;span class="attr"&gt;content&lt;/span&gt;&lt;span class="kwrd"&gt;=&amp;quot;NOINDEX&amp;quot;&lt;/span&gt;&lt;span class="kwrd"&gt;&amp;gt;&lt;/span&gt;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&amp;nbsp;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&lt;span class="rem"&gt;&amp;lt;!-- ONLY GoogleBot --&amp;gt;&lt;/span&gt;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&lt;span class="kwrd"&gt;&amp;lt;&lt;/span&gt;&lt;span class="html"&gt;meta&lt;/span&gt; &lt;span class="attr"&gt;name&lt;/span&gt;&lt;span class="kwrd"&gt;=&amp;quot;Googlebot&amp;quot;&lt;/span&gt; &lt;span class="attr"&gt;content&lt;/span&gt;&lt;span class="kwrd"&gt;=&amp;quot;NOINDEX&amp;quot;&lt;/span&gt;&lt;span class="kwrd"&gt;&amp;gt;&lt;/span&gt;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&amp;nbsp;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&lt;span class="rem"&gt;&amp;lt;!-- ONLY Slurp (Yahoo) --&amp;gt;&lt;/span&gt;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&lt;span class="kwrd"&gt;&amp;lt;&lt;/span&gt;&lt;span class="html"&gt;meta&lt;/span&gt; &lt;span class="attr"&gt;name&lt;/span&gt;&lt;span class="kwrd"&gt;=&amp;quot;Slurp&amp;quot;&lt;/span&gt; &lt;span class="attr"&gt;content&lt;/span&gt;&lt;span class="kwrd"&gt;=&amp;quot;NOINDEX&amp;quot;&lt;/span&gt;&lt;span class="kwrd"&gt;&amp;gt;&lt;/span&gt;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&amp;nbsp;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&lt;span class="rem"&gt;&amp;lt;!-- ONLY MSNBot (Microsoft) --&amp;gt;&lt;/span&gt;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&lt;span class="kwrd"&gt;&amp;lt;&lt;/span&gt;&lt;span class="html"&gt;meta&lt;/span&gt; &lt;span class="attr"&gt;name&lt;/span&gt;&lt;span class="kwrd"&gt;=&amp;quot;MSNBot&amp;quot;&lt;/span&gt; &lt;span class="attr"&gt;content&lt;/span&gt;&lt;span class="kwrd"&gt;=&amp;quot;NOINDEX&amp;quot;&lt;/span&gt;&lt;span class="kwrd"&gt;&amp;gt;&lt;/span&gt;
&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;
&lt;em&gt;Control how your listings appear&lt;/em&gt; - there is a set of options you can use to determine how your site shows up on the search engine results page (SERP). You can exert some control over how the description is created, and remove the &amp;quot;Cached page&amp;quot; link. 
&lt;/p&gt;
&lt;img src="http://janeandrobot.com/image.axd?picture=example-SERP.gif" alt="Example search engine results page (SERP)" width="542" height="83" /&gt;
&lt;div class="csharpcode"&gt;
&lt;pre class="alt"&gt;
&lt;span class="rem"&gt;&amp;lt;!-- Do not show a description for this page --&amp;gt;&lt;/span&gt;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&lt;span class="kwrd"&gt;&amp;lt;&lt;/span&gt;&lt;span class="html"&gt;meta&lt;/span&gt; &lt;span class="attr"&gt;name&lt;/span&gt;&lt;span class="kwrd"&gt;=&amp;quot;ROBOTS&amp;quot;&lt;/span&gt; &lt;span class="attr"&gt;content&lt;/span&gt;&lt;span class="kwrd"&gt;=&amp;quot;NOSNIPPET&amp;quot;&lt;/span&gt;&lt;span class="kwrd"&gt;&amp;gt;&lt;/span&gt;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&amp;nbsp;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&lt;span class="rem"&gt;&amp;lt;!-- Do not use http://dmoz.org to create a description --&amp;gt;&lt;/span&gt;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&lt;span class="kwrd"&gt;&amp;lt;&lt;/span&gt;&lt;span class="html"&gt;meta&lt;/span&gt; &lt;span class="attr"&gt;name&lt;/span&gt;&lt;span class="kwrd"&gt;=&amp;quot;ROBOTS&amp;quot;&lt;/span&gt; &lt;span class="attr"&gt;content&lt;/span&gt;&lt;span class="kwrd"&gt;=&amp;quot;NOODP&amp;quot;&lt;/span&gt;&lt;span class="kwrd"&gt;&amp;gt;&lt;/span&gt;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&amp;nbsp;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&lt;span class="rem"&gt;&amp;lt;!-- Do not present a cached version of the document in a search result --&amp;gt;&lt;/span&gt;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&lt;span class="kwrd"&gt;&amp;lt;&lt;/span&gt;&lt;span class="html"&gt;meta&lt;/span&gt; &lt;span class="attr"&gt;name&lt;/span&gt;&lt;span class="kwrd"&gt;=&amp;quot;ROBOTS&amp;quot;&lt;/span&gt; &lt;span class="attr"&gt;content&lt;/span&gt;&lt;span class="kwrd"&gt;=&amp;quot;NOARCHIVE&amp;quot;&lt;/span&gt;&lt;span class="kwrd"&gt;&amp;gt;&lt;/span&gt;
&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;
&lt;em&gt;Using other directives&lt;/em&gt; - Other meta robots directives are shown below. 
&lt;/p&gt;
&lt;div class="csharpcode"&gt;
&lt;pre class="alt"&gt;
&lt;span class="rem"&gt;&amp;lt;!-- Do not trust links on this page, could be user generated content (UGC) --&amp;gt;&lt;/span&gt;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&lt;span class="kwrd"&gt;&amp;lt;&lt;/span&gt;&lt;span class="html"&gt;meta&lt;/span&gt; &lt;span class="attr"&gt;name&lt;/span&gt;&lt;span class="kwrd"&gt;=&amp;quot;ROBOTS&amp;quot;&lt;/span&gt; &lt;span class="attr"&gt;content&lt;/span&gt;&lt;span class="kwrd"&gt;=&amp;quot;NOFOLLOW&amp;quot;&lt;/span&gt;&lt;span class="kwrd"&gt;&amp;gt;&lt;/span&gt;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&amp;nbsp;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&lt;span class="rem"&gt;&amp;lt;!-- Do not index this page --&amp;gt;&lt;/span&gt;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&lt;span class="kwrd"&gt;&amp;lt;&lt;/span&gt;&lt;span class="html"&gt;meta&lt;/span&gt; &lt;span class="attr"&gt;name&lt;/span&gt;&lt;span class="kwrd"&gt;=&amp;quot;ROBOTS&amp;quot;&lt;/span&gt; &lt;span class="attr"&gt;content&lt;/span&gt;&lt;span class="kwrd"&gt;=&amp;quot;NOINDEX&amp;quot;&lt;/span&gt;&lt;span class="kwrd"&gt;&amp;gt;&lt;/span&gt;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&amp;nbsp;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&lt;span class="rem"&gt;&amp;lt;!-- Do not index any images on this page (they will still be indexed if they are&lt;/span&gt;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&lt;span class="rem"&gt;     linked elsewhere). Better to use robots.txt if you really want them safe.&lt;/span&gt;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&lt;span class="rem"&gt;     This is a Google-only tag. --&amp;gt;&lt;/span&gt;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&lt;span class="kwrd"&gt;&amp;lt;&lt;/span&gt;&lt;span class="html"&gt;meta&lt;/span&gt; &lt;span class="attr"&gt;name&lt;/span&gt;&lt;span class="kwrd"&gt;=&amp;quot;GOOGLEBOT&amp;quot;&lt;/span&gt; &lt;span class="attr"&gt;content&lt;/span&gt;&lt;span class="kwrd"&gt;=&amp;quot;NOIMAGEINDEX&amp;quot;&lt;/span&gt;&lt;span class="kwrd"&gt;&amp;gt;&lt;/span&gt;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&amp;nbsp;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&lt;span class="rem"&gt;&amp;lt;!-- Do not translate this page into other languages --&amp;gt;&lt;/span&gt;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&lt;span class="kwrd"&gt;&amp;lt;&lt;/span&gt;&lt;span class="html"&gt;meta&lt;/span&gt; &lt;span class="attr"&gt;name&lt;/span&gt;&lt;span class="kwrd"&gt;=&amp;quot;ROBOTS&amp;quot;&lt;/span&gt; &lt;span class="attr"&gt;content&lt;/span&gt;&lt;span class="kwrd"&gt;=&amp;quot;NOTRANSLATE&amp;quot;&lt;/span&gt;&lt;span class="kwrd"&gt;&amp;gt;&lt;/span&gt;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&amp;nbsp;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&lt;span class="rem"&gt;&amp;lt;!-- NOT RECOMMENDED, there really isn&amp;#39;t much point in using these --&amp;gt;&lt;/span&gt;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&lt;span class="kwrd"&gt;&amp;lt;&lt;/span&gt;&lt;span class="html"&gt;meta&lt;/span&gt; &lt;span class="attr"&gt;name&lt;/span&gt;&lt;span class="kwrd"&gt;=&amp;quot;ROBOTS&amp;quot;&lt;/span&gt; &lt;span class="attr"&gt;content&lt;/span&gt;&lt;span class="kwrd"&gt;=&amp;quot;FOLLOW&amp;quot;&lt;/span&gt;&lt;span class="kwrd"&gt;&amp;gt;&lt;/span&gt;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&lt;span class="kwrd"&gt;&amp;lt;&lt;/span&gt;&lt;span class="html"&gt;meta&lt;/span&gt; &lt;span class="attr"&gt;name&lt;/span&gt;&lt;span class="kwrd"&gt;=&amp;quot;ROBOTS&amp;quot;&lt;/span&gt; &lt;span class="attr"&gt;content&lt;/span&gt;&lt;span class="kwrd"&gt;=&amp;quot;UNAVAILABLE_AFTER&amp;quot;&lt;/span&gt;&lt;span class="kwrd"&gt;&amp;gt;&lt;/span&gt;
&lt;/pre&gt;
&lt;/div&gt;
&lt;h3&gt;&lt;a name="HTTP_Header_Implementation_(X-ROBOTS-Tag)" title="HTTP_Header_Implementation_(X-ROBOTS-Tag)"&gt;&lt;/a&gt;HTTP Header Implementation (X-ROBOTS-Tag)&lt;/h3&gt;
&lt;p&gt;
The X-Robots-Tag HTTP header allows developers to specify page-level REP directives for content types other than text/html, such as PDF, DOC, PPT, or dynamically generated images. 
&lt;/p&gt;
&lt;p&gt;
&lt;em&gt;Using the X-Robots-Tag&lt;/em&gt; - to use the X-Robots-Tag, add it to your HTTP response headers as shown below. To specify multiple directives, you can either separate them with commas or add them as separate header lines. 
&lt;/p&gt;
&lt;div class="csharpcode"&gt;
&lt;pre class="alt"&gt;
HTTP/1.x 200 OK
&lt;/pre&gt;
&lt;pre class="alt"&gt;
Cache-Control: private
&lt;/pre&gt;
&lt;pre class="alt"&gt;
Content-Length: 2199552
&lt;/pre&gt;
&lt;pre class="alt"&gt;
Content-Type: application/octet-stream
&lt;/pre&gt;
&lt;pre class="alt"&gt;
Server: Microsoft-IIS/7.0
&lt;/pre&gt;
&lt;pre class="alt"&gt;
content-disposition: inline; filename=01 - The truth about SEO.ppt
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&lt;strong&gt;X-Robots-Tag: noindex, nosnippet&lt;/strong&gt;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
X-Powered-By: ASP.NET
&lt;/pre&gt;
&lt;pre class="alt"&gt;
Date: Sun, 01 Jun 2008 19:25:47 GMT
&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;
The X-Robots-Tag header supports most of the same directives as the meta tag. The only limitation of this method compared to the meta tag implementation is that there is no way to target a specific robot, though that probably isn&amp;#39;t a big deal for most use cases. 
&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;&lt;span style="font-family: Courier New"&gt;X-Robots-Tag: noindex&lt;/span&gt;&lt;/li&gt;
	&lt;li&gt;&lt;span style="font-family: Courier New"&gt;X-Robots-Tag: nosnippet&lt;/span&gt;&lt;/li&gt;
	&lt;li&gt;&lt;span style="font-family: Courier New"&gt;X-Robots-Tag: notranslate&lt;/span&gt;&lt;/li&gt;
	&lt;li&gt;&lt;span style="font-family: Courier New"&gt;X-Robots-Tag: noarchive&lt;/span&gt;&lt;/li&gt;
	&lt;li&gt;&lt;span style="font-family: Courier New"&gt;X-Robots-Tag: unavailable_after: 7 Jul 2007 16:30:00 GMT&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
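&lt;p&gt;
How you add the header depends on your web server or framework. As one hedged illustration, here is a minimal Python WSGI sketch that attaches &lt;code&gt;X-Robots-Tag&lt;/code&gt; to responses for certain file types. The extension list, the placeholder response body, and the names &lt;code&gt;app&lt;/code&gt; and &lt;code&gt;response_headers&lt;/code&gt; are assumptions made for this example. They are not part of the REP.
&lt;/p&gt;
&lt;div class="csharpcode"&gt;
&lt;pre class="alt"&gt;
```python
from wsgiref.util import setup_testing_defaults

# Hypothetical set of file extensions to keep out of the index.
NOINDEX_EXTENSIONS = (".pdf", ".doc", ".ppt")


def app(environ, start_response):
    """Serve a placeholder body, adding X-Robots-Tag for matching file types."""
    path = environ.get("PATH_INFO", "")
    headers = [("Content-Type", "application/octet-stream")]
    if path.lower().endswith(NOINDEX_EXTENSIONS):
        headers.append(("X-Robots-Tag", "noindex, nosnippet"))
    start_response("200 OK", headers)
    return [b"placeholder file contents"]


def response_headers(path):
    """Call the app in-process and return the headers it would send."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured.update(headers)

    app(environ, start_response)
    return captured


print(response_headers("/slides/seo.ppt"))
print(response_headers("/index.html"))
```
&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;
Only the first response carries the header. To try the sketch against real requests, you could serve &lt;code&gt;app&lt;/code&gt; with &lt;code&gt;wsgiref.simple_server.make_server&lt;/code&gt; and inspect the responses in your browser&amp;#39;s developer tools.
&lt;/p&gt;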
&lt;h3&gt;&lt;a name="Content_Level_Implementation" title="Content_Level_Implementation"&gt;&lt;/a&gt;Content Level Implementation&lt;/h3&gt;
&lt;p&gt;
You can further refine your site level and page level directives within several content tags. 
&lt;/p&gt;
&lt;p&gt;
Each anchor tag (link) can be modified to tell search engines that you do not vouch for the page the link points to. This is typically used for links within user generated content (UGC) like wikis, blog comments, reviews and other community sites. 
&lt;/p&gt;
&lt;div class="csharpcode"&gt;
&lt;pre class="alt"&gt;
&amp;lt;a href=&amp;quot;#&amp;quot; rel=&amp;quot;NOFOLLOW&amp;quot;&amp;gt;My Hyperlink&amp;lt;/a&amp;gt;
&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;
Also, in Yahoo Search you can specify which &amp;lt;div&amp;gt; elements on a page you would not like indexed using the &lt;code&gt;class=&amp;quot;robots-nocontent&amp;quot;&lt;/code&gt; attribute. However, we don&amp;#39;t recommend relying on this attribute because no other engine supports it, which limits its usefulness. 
&lt;/p&gt;
&lt;div class="csharpcode"&gt;
&lt;pre class="alt"&gt;
&amp;lt;div class=&amp;quot;robots-nocontent&amp;quot;&amp;gt;
&lt;/pre&gt;
&lt;pre class="alt"&gt;
No content for you! (or at least Yahoo!)
&lt;/pre&gt;
&lt;pre class="alt"&gt;
&amp;lt;/div&amp;gt;
&lt;/pre&gt;
&lt;/div&gt;
&lt;h2&gt;&lt;a name="Common_implementation_mistakes" title="Common_implementation_mistakes"&gt;&lt;/a&gt;Common Mistakes&lt;/h2&gt;
&lt;p&gt;
While implementing the REP is generally straightforward, there are a few common mistakes. 
&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;
	&lt;p&gt;
	&lt;em&gt;Googlebot follows the most specific directive, ignoring all others&lt;/em&gt;. In the robots.txt file, if you specify a section for all user-agents (&lt;code&gt;user-agent: *&lt;/code&gt;) and also declare a section for Googlebot (&lt;code&gt;user-agent: Googlebot&lt;/code&gt;), Google will disregard all sections in the robots.txt file except the Googlebot section. This could potentially leave you exposing much more content to Google than you might have thought. 
	&lt;/p&gt;
	&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="csharpcode"&gt;
&lt;pre class="alt"&gt;
# This keeps out all well-behaved robots			
&lt;/pre&gt;
&lt;pre class="alt"&gt;
					&amp;nbsp;			
&lt;/pre&gt;
&lt;pre class="alt"&gt;
User-agent: *			
&lt;/pre&gt;
&lt;pre class="alt"&gt;
Disallow: /			
&lt;/pre&gt;
&lt;pre class="alt"&gt;
							&amp;nbsp;			
&lt;/pre&gt;
&lt;pre class="alt"&gt;
# This looks like it is giving Google access to only this directory, but since it is a			
&lt;/pre&gt;
&lt;pre class="alt"&gt;
# GoogleBot specific section, Google will disregard the previous section			
&lt;/pre&gt;
&lt;pre class="alt"&gt;
# and access the whole site.				
&lt;/pre&gt;
&lt;pre class="alt"&gt;
							&amp;nbsp;			
&lt;/pre&gt;
&lt;pre class="alt"&gt;
User-agent: Googlebot			
&lt;/pre&gt;
&lt;pre class="alt"&gt;
Allow: /Content_For_Google/			
&lt;/pre&gt;
&lt;/div&gt;
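&lt;p&gt;
You can reproduce this behavior offline with Python&amp;#39;s standard &lt;code&gt;urllib.robotparser&lt;/code&gt; module, which also applies only the most specific matching section. This is a sketch of one parser&amp;#39;s interpretation and not a substitute for Google&amp;#39;s own testing tool. The &lt;code&gt;FIXED&lt;/code&gt; variant, which repeats the &lt;code&gt;Disallow&lt;/code&gt; rule inside the Googlebot section, is one possible correction, and the URLs are placeholders.
&lt;/p&gt;
&lt;div class="csharpcode"&gt;
&lt;pre class="alt"&gt;
```python
from urllib.robotparser import RobotFileParser

# The robots.txt from the example above, without the comments.
MISTAKE = """\
User-agent: *
Disallow: /

User-agent: Googlebot
Allow: /Content_For_Google/
"""

# One possible fix: restate the block inside the Googlebot section.
FIXED = """\
User-agent: *
Disallow: /

User-agent: Googlebot
Allow: /Content_For_Google/
Disallow: /
"""


def allowed(robots_txt, agent, url):
    """Return True if this parser would let the agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


private_url = "http://www.example.com/private/report.html"
google_url = "http://www.example.com/Content_For_Google/page.html"

print(allowed(MISTAKE, "Googlebot", private_url))     # Googlebot gets in
print(allowed(MISTAKE, "SomeOtherBot", private_url))  # other robots are kept out
print(allowed(FIXED, "Googlebot", private_url))       # now blocked
print(allowed(FIXED, "Googlebot", google_url))        # still allowed
```
&lt;/pre&gt;
&lt;/div&gt;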
&lt;ul&gt;
	&lt;li&gt;
	&lt;p&gt;
	&lt;em&gt;NOFOLLOW will most likely not prevent indexing&lt;/em&gt; - if you use &lt;code&gt;NOFOLLOW&lt;/code&gt; at either the page or the link level, it is still possible for the linked pages to be indexed because the search engine may have found a reference to them from another source. Also note that &lt;code&gt;rel=&amp;quot;NOFOLLOW&amp;quot;&lt;/code&gt; within your anchor tag is treated as a recommendation by the search engines, not a command. 
	&lt;/p&gt;
	&lt;p&gt;
	To ensure that content is not indexed, either use the &lt;code&gt;Disallow&lt;/code&gt; directive at the site level, or use &lt;code&gt;NOINDEX&lt;/code&gt; at the page level. 
	&lt;/p&gt;
	&lt;/li&gt;
	&lt;li&gt;
	&lt;p&gt;
	&lt;em&gt;Directives that are not recommended&lt;/em&gt; - the directives in the REP are all about exceptions. By default, the robots assume they can crawl your whole site. Therefore, you do not need to explicitly use the &lt;code&gt;FOLLOW&lt;/code&gt; and &lt;code&gt;INDEX&lt;/code&gt; directives, as they will not be taken into account by the search engines. It sounds silly, but I&amp;#39;ve seen a few sites that have implemented these on every page and every link. 
	&lt;/p&gt;
	&lt;p&gt;
	Another directive that is not recommended is the &lt;code&gt;NOCACHE&lt;/code&gt; directive. This was created by Microsoft, and is synonymous with &lt;code&gt;NOARCHIVE&lt;/code&gt;. While Microsoft will most likely continue to support the directive, it is better to use &lt;code&gt;NOARCHIVE&lt;/code&gt; so it will work on all the search engines. 
	&lt;/p&gt;
	&lt;/li&gt;
	&lt;li&gt;
	&lt;p&gt;
	&lt;em&gt;Be cognizant of case&lt;/em&gt; - when referencing files and URLs in the robots.txt file, use a defensive approach to URL case, as the major engines do not handle it the same way (e.g. /Files does not always equal /files). 
	&lt;/p&gt;
	&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;&lt;a name="Testing_your_implementation_" title="Testing_your_implementation_"&gt;&lt;/a&gt;Testing Your Implementation &lt;/h2&gt;
&lt;p&gt;
As you&amp;#39;re implementing your REP design, you should test it both before and after you deploy it. The easiest way to test it is to use the robots validator in either Google&amp;#39;s or Microsoft&amp;#39;s Webmaster Tools. These tools are generally good enough test beds for most folks. However, advanced developers (or paranoid ones with critical business requirements) will want to definitively know what the robots are doing, not simply rely on what the robots say they are doing. These folks will want to look at the tools as well as at their server logs. 
&lt;/p&gt;
&lt;p&gt;
In addition to using validation tools, checking the search engines&amp;#39; reports on what they couldn&amp;#39;t access, and looking at log data to see what the search engine robots are crawling, you should check the search engine results to see if any pages you intend to block are being indexed. If they are, use the methods described in this section to ensure you are blocking them correctly and &lt;a href="#removal"&gt;use the search engine tools to request that the pages be removed&lt;/a&gt;. 
&lt;/p&gt;
&lt;p&gt;
&lt;a name="partial" title="partial"&gt;&lt;/a&gt;&lt;em&gt;When Blocked Content Appears to be Indexed&lt;/em&gt; - If search engines are blocked from crawling pages, they may still index the URL if the robot finds a link to that URL on a page that isn&amp;#39;t blocked. The listing may display the URL only, such as shown below. 
&lt;/p&gt;
&lt;p&gt;
&lt;img src="http://janeandrobot.com/image.axd?picture=urlonly.gif" alt="Google partially indexed results" width="364" height="282" /&gt; 
&lt;/p&gt;
&lt;p&gt;
Or, it may include a title and in some instances, a description. This makes it appear as though the search engine robot is disregarding the directive that blocks access to the page, but the search engine is in fact obeying the directive not to crawl the page and is using anchor text from the link to that page and descriptive details from either the page that contains the link or a source such as the &lt;a href="http://www.dmoz.org"&gt;Open Directory Project&lt;/a&gt;. 
&lt;/p&gt;
&lt;p&gt;
For more details, see: 
&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;&lt;a href="http://www.google.com/support/webmasters/bin/answer.py?answer=35667"&gt;Google: partially indexed page&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href="http://help.yahoo.com/l/us/yahoo/search/webcrawler/slurp-01.html"&gt;Yahoo!: thin documents&lt;/a&gt; &lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;a name="The_Easy_Way_" title="The_Easy_Way_"&gt;&lt;/a&gt;The Easy Way&lt;/h3&gt;
&lt;p&gt;
&lt;em&gt;Search Engine Tools For Validation&lt;/em&gt; - Both Google and Microsoft provide some tools as part of their Webmaster Centers to help you verify if you&amp;#39;ve configured your REP the way you expect. Let&amp;#39;s start with Google&amp;#39;s tools: 
&lt;/p&gt;
&lt;p&gt;
The first thing you should check is the list of URLs that Google has seen from your website and not indexed due to the REP. Note you can also download the list and filter, sort, and have-your-way-with-it in Excel. 
&lt;/p&gt;
&lt;img src="http://janeandrobot.com/image.axd?picture=webmaster-robotstxt-blocked.gif" alt="Google Webmaster Tools: Blocked URLs" width="450" height="266" /&gt;&amp;nbsp; 
&lt;p&gt;
The next step is to use their interactive robots.txt tool to analyze your rules and test specific URLs for blockage. When you pull up the tool, it should already be pre-populated with the robots.txt file Google has on file from the last time it crawled your site. You can enter a list of URLs you&amp;#39;d like to check below it and select the user-agent you&amp;#39;d like to check against, and the tool will tell you whether they are blocked. You can also use the tool to test changes to your robots.txt file to see how Google would interpret them. 
&lt;/p&gt;
&lt;img src="http://janeandrobot.com/image.axd?picture=Google-Analyze-RobotsTXT.jpg" alt="Google Webmaster Tools: robots.txt analysis" width="450" height="378" /&gt;&amp;nbsp; 
&lt;p&gt;
Microsoft has a similar tool in their &lt;a href="http://webmaster.live.com/"&gt;Webmaster Center&lt;/a&gt; that will validate a robots.txt file against the standard that MSNBot supports. To use the tool, log in, copy and paste your robots.txt file into the top field, and select &lt;strong&gt;Validate&lt;/strong&gt;. A list of all detectable issues is displayed in the bottom box. 
&lt;/p&gt;
&lt;img src="http://janeandrobot.com/image.axd?picture=Microsoft-RobotsTXT-Validat.jpg" alt="Microsoft Live Search Webmaster Tools: robots.txt validator" width="450" height="285" /&gt;&amp;nbsp; 
&lt;h3&gt;&lt;a name="The_Hard_Way_(More_Accurate)" title="The_Hard_Way_(More_Accurate)"&gt;&lt;/a&gt;The Hard Way&lt;/h3&gt;
&lt;p&gt;
&lt;em&gt;More Accurate Views of Robot Access Through Your Logs&lt;/em&gt; - If you have a specific business need to ensure that the robots are following your rules (or you&amp;#39;re just paranoid), then you should not simply rely on the tools the search engines provide to test compliance. You&amp;#39;re going to need to go straight to the horse&amp;#39;s mouth and analyze your web server logs to see exactly what the robots are doing. There is no one easy tool for doing this. You&amp;#39;ll likely have to use an existing tool such as &lt;a href="http://www.microsoft.com/downloads/details.aspx?FamilyID=890cd06b-abf8-4c25-91b2-f8d975cf8c07"&gt;Microsoft HTTP Log Parser&lt;/a&gt; or write your own. It isn&amp;#39;t difficult, but it will take some time to implement. Useful references for this are a list of all the robot &lt;a href="http://www.robotstxt.org/db.html"&gt;user agents&lt;/a&gt;, and more complete lists of bots from &lt;a href="http://www.google.com/support/webmasters/bin/answer.py?answer=40364"&gt;Google&lt;/a&gt; and &lt;a href="http://blogs.msdn.com/livesearch/archive/2006/11/29/search-robots-in-disguise.aspx"&gt;Microsoft&lt;/a&gt;. 
&lt;/p&gt;
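&lt;p&gt;
If you do write your own, the core of it can be quite small. The Python sketch below assumes your server writes Apache-style combined log lines (IIS uses a different format by default, so the regular expression would need adjusting), identifies robots by a substring of the user-agent, and reports requests for paths you meant to block. The robot tokens, the blocked prefix, and the sample log lines are all made up for illustration.
&lt;/p&gt;
&lt;div class="csharpcode"&gt;
&lt;pre class="alt"&gt;
```python
import re

# Combined log format: captures the request path and the user-agent.
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[[^\]]*\] "(?:GET|HEAD) (\S+) [^"]*" \d{3} \S+ "[^"]*" "([^"]*)"'
)

# Lowercase user-agent substrings used to recognize the major robots.
ROBOT_TOKENS = ("googlebot", "slurp", "msnbot")


def blocked_robot_hits(lines, blocked_prefixes):
    """Return (robot, path) pairs for robot requests to supposedly blocked paths."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines in an unexpected format
        path, user_agent = match.groups()
        agent = user_agent.lower()
        robot = next((token for token in ROBOT_TOKENS if token in agent), None)
        if robot is not None and path.startswith(tuple(blocked_prefixes)):
            hits.append((robot, path))
    return hits


# Made-up sample lines, for illustration only.
SAMPLE_LOG = [
    '66.249.66.1 - - [01/Jun/2008:19:25:47 +0000] "GET /private/report.html HTTP/1.1" '
    '200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.66.1 - - [01/Jun/2008:19:25:50 +0000] "GET /index.html HTTP/1.1" '
    '200 1024 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.10 - - [01/Jun/2008:19:26:02 +0000] "GET /private/report.html HTTP/1.1" '
    '200 512 "-" "Mozilla/5.0 (Windows; U)"',
]

print(blocked_robot_hits(SAMPLE_LOG, ["/private/"]))
```
&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;
Only the first sample line is reported: the second is a robot fetching an unblocked page, and the third is an ordinary browser. Since user-agent strings can be spoofed, it is worth verifying the robot&amp;#39;s identity before acting on a hit.
&lt;/p&gt;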
&lt;p&gt;
&lt;a name="verify" title="verify"&gt;&lt;/a&gt;&lt;em&gt;Verifying Robot Identity&lt;/em&gt; - You&amp;#39;ll also want to validate that the robots are who they say they are. Google, Yahoo and Microsoft all support &lt;a href="http://en.wikipedia.org/wiki/Reverse_DNS_lookup"&gt;Reverse DNS authentication&lt;/a&gt; of their robots. The process is pretty simple and is described by &lt;a href="http://googlewebmastercentral.blogspot.com/2006/09/how-to-verify-googlebot.html"&gt;Google&lt;/a&gt;, &lt;a href="http://www.ysearchblog.com/archives/000460.html"&gt;Yahoo&lt;/a&gt; and &lt;a href="http://blogs.msdn.com/livesearch/archive/2006/11/29/search-robots-in-disguise.aspx"&gt;Microsoft&lt;/a&gt;. Essentially, you do a reverse DNS lookup on the requesting IP address, check that the resulting hostname belongs to the search engine&amp;#39;s domain, and then do a forward lookup to confirm that the hostname resolves back to the same IP address. Because you match on the domain rather than on a list of IP addresses, you don&amp;#39;t need to update your code when the addresses change (which they will). 
&lt;/p&gt;
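&lt;p&gt;
The reverse-then-forward DNS check can be sketched in a few lines of Python using the standard &lt;code&gt;socket&lt;/code&gt; module. The hostname suffixes below are assumptions, so confirm the current ones in each engine&amp;#39;s documentation before relying on them. The function name &lt;code&gt;is_verified_robot&lt;/code&gt; and its two lookup parameters (which exist so the check can be tested without network access) are made up for this example.
&lt;/p&gt;
&lt;div class="csharpcode"&gt;
&lt;pre class="alt"&gt;
```python
import socket

# Assumed hostname suffixes; confirm them against each engine's documentation.
ROBOT_DOMAINS = (".googlebot.com", ".crawl.yahoo.net", ".search.msn.com")


def is_verified_robot(
    ip,
    reverse_lookup=socket.gethostbyaddr,
    forward_lookup=socket.gethostbyname_ex,
):
    """Return True if the IP reverse-resolves to a robot domain and back again."""
    try:
        # Step 1: reverse DNS lookup on the requesting IP address.
        hostname = reverse_lookup(ip)[0]
        # Step 2: the hostname must belong to one of the expected domains.
        if not hostname.lower().endswith(ROBOT_DOMAINS):
            return False
        # Step 3: the forward lookup must include the original IP address.
        return ip in forward_lookup(hostname)[2]
    except OSError:
        # Lookup failures (no PTR record, unknown host) count as unverified.
        return False
```
&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;
In a log analysis tool you would call &lt;code&gt;is_verified_robot&lt;/code&gt; once per distinct IP address and cache the answer, since DNS lookups are slow compared to parsing a log line.
&lt;/p&gt;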
&lt;p&gt;
Should you find any issues where one of the robots is not minding the REP or is misbehaving in some other way, you can always communicate directly with each engine through one of their forums: 
&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;&lt;a href="http://groups.google.com/group/Google_Webmaster_Help-Indexing/topics"&gt;Google Crawling, Indexing and Ranking Forum&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href="http://help.yahoo.com/l/us/yahoo/search/search_support.html"&gt;Yahoo Crawler Feedback Form&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href="http://forums.microsoft.com/webmaster/ShowForum.aspx?ForumID=1984&amp;amp;SiteID=79"&gt;Microsoft Crawler Error and Feedback Forum&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;&lt;a name="removal" title="removal"&gt;&lt;/a&gt;Removing Content From Search Engine Indices&lt;/h2&gt;
&lt;p&gt;
If you find that you haven&amp;#39;t implemented the techniques described here correctly and private content from your site is indexed, each of the major search engines has methods available for requesting that it be removed. For more information, see: 
&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;&lt;a href="http://googlewebmastercentral.blogspot.com/2007/04/requesting-removal-of-content-from-our.html"&gt;Google: Requesting removal of content from our index&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href="http://help.yahoo.com/l/us/yahoo/search/siteexplorer/delete/"&gt;Yahoo!: Deleting URLs&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href="https://support.live.com/eform.aspx?productKey=wlsearch&amp;amp;page=wlsupport_home_options_form_byemail&amp;amp;ct=eformts"&gt;Live Search: Requesting content removal&lt;/a&gt;&lt;br /&gt;
	&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;&lt;a name="Additional_Resources:_" title="Additional_Resources:_"&gt;&lt;/a&gt;Additional Resources: &lt;/h2&gt;
&lt;ul&gt;
	&lt;li&gt;Google 
	&lt;ul&gt;
		&lt;li&gt;&lt;a href="http://www.google.com/support/webmasters/bin/answer.py?answer=40362"&gt;How to create a robots.txt file&lt;/a&gt; &lt;/li&gt;
		&lt;li&gt;&lt;a href="http://www.google.com/support/webmasters/bin/answer.py?answer=40364"&gt;Descriptions of each user-agent that Google uses&lt;/a&gt; &lt;/li&gt;
		&lt;li&gt;&lt;a href="http://www.google.com/support/webmasters/bin/answer.py?answer=40367"&gt;How to use pattern matching&lt;/a&gt; &lt;/li&gt;
		&lt;li&gt;&lt;a href="http://www.google.com/support/webmasters/bin/answer.py?answer=40368"&gt;How often we recrawl your robots.txt file&lt;/a&gt; &lt;/li&gt;
		&lt;li&gt;&lt;a href="http://googlewebmastercentral.blogspot.com/2006/08/all-about-googlebot.html"&gt;All about Googlebot&lt;/a&gt; &lt;/li&gt;
	&lt;/ul&gt;
	&lt;/li&gt;
	&lt;li&gt;Yahoo! 
	&lt;ul&gt;
		&lt;li&gt;&lt;a href="http://www.ysearchblog.com/archives/000372.html"&gt;Wild card support&lt;/a&gt;&lt;/li&gt;
		&lt;li&gt;&lt;a href="http://www.ysearchblog.com/archives/000508.html"&gt;X-Robots tag directive support&lt;/a&gt; &lt;/li&gt;
	&lt;/ul&gt;
	&lt;/li&gt;
	&lt;li&gt;Microsoft Live Search 
	&lt;ul&gt;
		&lt;li&gt;&lt;a href="http://blogs.msdn.com/livesearch/archive/2006/11/29/search-robots-in-disguise.aspx"&gt;Search robots in disguise&lt;/a&gt;&lt;/li&gt;
	&lt;/ul&gt;
	&lt;/li&gt;
	&lt;li&gt;Other resources 
	&lt;ul&gt;
		&lt;li&gt;&lt;a href="http://searchengineland.com/070305-204850.php"&gt;Search Engine Land: Meta Robots Tag 101&lt;/a&gt;&lt;/li&gt;
		&lt;li&gt;&lt;a href="http://searchengineland.com/080603-121100.php"&gt;Search Engine Land: Yahoo!, Microsoft, Google Clarify Robots.txt Support&lt;/a&gt;&lt;/li&gt;
		&lt;li&gt;&lt;a href="http://searchengineland.com/070417-213813.php"&gt;Search Engine Land: URL Removal Options&lt;/a&gt;&lt;/li&gt;
		&lt;li&gt;&lt;a href="http://www.robotstxt.org/"&gt;robotstxt.org&lt;/a&gt;&lt;/li&gt;
		&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/Robots.txt"&gt;Wikipedia: Robots Exclusion Standard&lt;/a&gt;&lt;/li&gt;
	&lt;/ul&gt;
	&lt;/li&gt;
&lt;/ul&gt;</description>
      <link>http://feedproxy.google.com/~r/janeandrobot/~3/isCIvwhaZbU/post.aspx</link>
      <author>Vanessa Fox</author>
      <comments>http://janeandrobot.com/post/Managing-Robots-Access-To-Your-Website.aspx#comment</comments>
      <guid isPermaLink="false">http://janeandrobot.com/post.aspx?id=f9624323-567c-4e1c-b57f-e8d5a0db093b</guid>
      <pubDate>Wed, 04 Jun 2008 12:18:00 -0700</pubDate>
      <category>Reference</category>
      <dc:publisher>Vanessa Fox</dc:publisher>
      <pingback:server>http://janeandrobot.com/pingback.axd</pingback:server>
      <pingback:target>http://janeandrobot.com/post.aspx?id=f9624323-567c-4e1c-b57f-e8d5a0db093b</pingback:target>
      <slash:comments>64</slash:comments>
      <trackback:ping>http://janeandrobot.com/trackback.axd?id=f9624323-567c-4e1c-b57f-e8d5a0db093b</trackback:ping>
      <wfw:comment>http://janeandrobot.com/post/Managing-Robots-Access-To-Your-Website.aspx#comment</wfw:comment>
      <wfw:commentRss>http://janeandrobot.com/syndication.axd?post=f9624323-567c-4e1c-b57f-e8d5a0db093b</wfw:commentRss>
    <feedburner:origLink>http://janeandrobot.com/post.aspx?id=f9624323-567c-4e1c-b57f-e8d5a0db093b</feedburner:origLink></item>
    <item>
      <title>Effectively Using Images</title>
      <description>&lt;p&gt;
A picture is worth a thousand words. Unfortunately, when it comes to major search engines (which are still primarily text-based), a picture is worth a lot of blank space. Does this mean you shouldn&amp;#39;t use images on your site if you want to rank in search? Not at all. Just keep some simple things in mind when adding those images to your pages. As a bonus, these tips help not only with search engine robots, but with Jane as well! You want your site to be accessible in screen readers, to those who have images turned off in their browsers, and to those who have slow connections or are on mobile browsers and may have trouble loading images. 
&lt;/p&gt;
&lt;p&gt;
By providing search engine robots with textual information about the images on your site, your site can benefit not only from better placement in web search results, but in image search results also. Image search can provide substantial search traffic, so don&amp;#39;t overlook this as an acquisition channel. 
&lt;/p&gt;
&lt;p&gt;
Below are recommendations for using images effectively for both Jane and search engine robots.&amp;nbsp; 
&lt;/p&gt;
&lt;h2&gt;Don&amp;#39;t put text in images&lt;/h2&gt;
&lt;p&gt;
Put text in straight HTML whenever possible. Sometimes web designers like to put text in images because they can use a wider variety of fonts and can manipulate the design more freely. Much of this styling can be done with CSS and in cases where it can&amp;#39;t, the extra design a graphical version of the text provides may not really add visitor value. In fact, it may detract from usability because it may be difficult to read. It also may hurt viral efforts since it can&amp;#39;t be copied and pasted. If I want to send an email to all of my friends suggesting we all go to a hot new restaurant, I may want to copy and paste a few menu items from the restaurant&amp;#39;s web site to send to them. If the menu is in an image, I can&amp;#39;t do that. 
&lt;/p&gt;
&lt;h2&gt;Use the ALT attribute&lt;/h2&gt;
&lt;p&gt;
The most well-known method for making images accessible is effective use of the ALT attribute in the IMG element. And yet it&amp;#39;s very common to find empty ALT attributes all over the web. 
&lt;/p&gt;
&lt;!-- code formatted by http://manoli.net/csharpformat/ --&gt;
&lt;pre class="csharpcode"&gt;
&lt;span class="kwrd"&gt;&amp;lt;&lt;/span&gt;&lt;span class="html"&gt;img&lt;/span&gt; &lt;span class="attr"&gt;src&lt;/span&gt;&lt;span class="kwrd"&gt;=&amp;quot;/images/lavender-plant.jpg&amp;quot;&lt;/span&gt; &lt;span class="attr"&gt;alt&lt;/span&gt;&lt;span class="kwrd"&gt;=&amp;quot;Picture of a lavender plant&amp;quot;&lt;/span&gt;&lt;span class="kwrd"&gt;&amp;gt;&lt;/span&gt;
&lt;/pre&gt;
&lt;ul&gt;
	&lt;li&gt;
	&lt;p&gt;
	&lt;strong&gt;Make the text in the ALT attribute descriptive.&lt;/strong&gt; It should describe the image concisely. Think of someone browsing your site with a screen reader. How will they want the image presented? 
	&lt;/p&gt;
	&lt;/li&gt;
	&lt;li&gt;
	&lt;p&gt;
	&lt;strong&gt;Don&amp;#39;t stuff the ALT attribute with keywords.&lt;/strong&gt; A long ALT attribute full of keywords you want to rank for looks spammy to both your visitors and search engines, and may lead both to devalue your content. How can you tell if your ALT text is spammy or simply descriptive? It&amp;#39;s a judgment call, but if you can&amp;#39;t tell, get some objective opinions. ALT=&amp;quot;buy cheap viagra now cheap viagra online get viagra here&amp;quot; is probably going to be pretty obviously spammy to anyone you ask. 
	&lt;/p&gt;
	&lt;/li&gt;
	&lt;li&gt;
	&lt;p&gt;
	&lt;strong&gt;Make the ALT text relevant to the image.&lt;/strong&gt; Use the ALT text to describe the image, not as a place to add descriptive text about the page that isn&amp;#39;t directly relevant to the image. For instance, if the image is of a car, your ALT text should be something like &amp;quot;blue mini cooper&amp;quot; not &amp;quot;cheapcars.com has cheap cars available in every make and model including mini coopers, Volkswagens, and Ferraris like Magnum PI used to drive&amp;quot;. 
	&lt;/p&gt;
	&lt;/li&gt;
&lt;/ul&gt;
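&lt;p&gt;
One way to spot these problems across a page is to scan the markup automatically. The following is a minimal sketch rather than a definitive tool: the names AltAuditor and audit_alt_text and the 12-word threshold are assumptions made for illustration. It relies only on the standard Python html.parser module and flags IMG elements whose ALT text is missing, empty, or long enough to deserve a manual look. An empty ALT can be deliberate for purely decorative images, so treat the output as a review list, not a list of errors.
&lt;/p&gt;

```python
from html.parser import HTMLParser


class AltAuditor(HTMLParser):
    """Collects IMG elements whose ALT text is missing, empty, or very long."""

    def __init__(self, max_words=12):
        super().__init__()
        # Hypothetical threshold: longer ALT text is flagged for human review.
        self.max_words = max_words
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src", "(no src)")
        alt = attributes.get("alt")
        if alt is None:
            self.problems.append((src, "missing alt"))
        elif not alt.strip():
            self.problems.append((src, "empty alt"))
        elif len(alt.split()) > self.max_words:
            self.problems.append((src, "alt may be stuffed"))


def audit_alt_text(html, max_words=12):
    """Return a list of (src, problem) pairs for the IMG elements in html."""
    auditor = AltAuditor(max_words)
    auditor.feed(html)
    auditor.close()
    return auditor.problems
```

&lt;p&gt;
Passing the HTML of a page to audit_alt_text returns (src, problem) pairs in document order. Word count is only a rough proxy for keyword stuffing, so the final call still needs a human reader.
&lt;/p&gt;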
&lt;p&gt;
What about the TITLE attribute? It likely &lt;a href="http://googlewebmastercentral.blogspot.com/2007/12/using-alt-attributes-smartly.html"&gt;doesn&amp;#39;t provide direct search engine value&lt;/a&gt;, although it may be useful for your visitors. 
&lt;/p&gt;
&lt;h2&gt;Make image filenames descriptive&lt;/h2&gt;
&lt;p&gt;
If possible, describe the image in the name of the image file. For instance,&lt;strong&gt; lavender-plant.jpg&lt;/strong&gt; is better than &lt;strong&gt;image123.jpg&lt;/strong&gt;. If you are importing a lot of images, for instance, for a product database, it may be impractical to manually name each file. In this case, find programmatic ways to rename the images using text from how the images are tagged or categorized. If your filename includes multiple words, use hyphens to separate them. Search engines tend to see a hyphen as a separator and an underscore as a joiner, so lavender_plant would be seen as one word and lavender-plant would be seen as two. 
&lt;/p&gt;
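&lt;p&gt;
As a rough illustration of renaming images programmatically, the sketch below turns a tag or category label into a lowercase, hyphen-separated filename. The function name descriptive_filename is hypothetical, and the code assumes labels are plain ASCII text (other characters are simply dropped).
&lt;/p&gt;

```python
import re


def descriptive_filename(label, extension):
    """Build a lowercase, hyphen-separated filename from a tag or category label."""
    # Replace every run of characters that is not an ASCII letter or digit
    # (spaces, underscores, punctuation) with a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", label.lower()).strip("-")
    if not slug:
        raise ValueError("label contains no usable characters")
    return slug + "." + extension.lstrip(".").lower()
```

&lt;p&gt;
For example, the label &amp;quot;Lavender Plant&amp;quot; with the extension jpg becomes lavender-plant.jpg. A real product database would also need to handle two products that produce the same name.
&lt;/p&gt;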
&lt;h2&gt;Use image captions&lt;/h2&gt;
&lt;p&gt;
Provide a caption below or above the image that describes what it&amp;#39;s about and gives context for how it relates to the rest of the page. 
&lt;/p&gt;
&lt;h2&gt;Provide textual clues around the image&lt;/h2&gt;
&lt;p&gt;
Try to include text around the image that relates to what the image is about. Text on the page helps search engines know what the page itself is about, which helps the page rank for relevant queries, but text near images can help those images rank in image search results as well. 
&lt;/p&gt;
&lt;h2&gt;Be cautious about using images for navigational links&lt;/h2&gt;
&lt;p&gt;
If you use images in menus and other navigation, make sure the ALT text replicates the label that the image shows for that menu option. Also test the implementation by turning off images in your browser and making sure the links still work. Some implementations incorrectly require images to be enabled, causing search engine robots to be unable to follow those links. 
&lt;/p&gt;
&lt;p&gt;
Another potential usability issue with images and navigation is that if you use a textual link combined with a background image, the text may disappear if the image doesn&amp;#39;t load. (This issue can happen with this type of design in places other than menus, but that scenario is where it can be commonly seen.) 
&lt;/p&gt;
&lt;table border="0" style="margin: 0pt auto"&gt;
	&lt;tbody&gt;
		&lt;tr&gt;
			&lt;td&gt;
			&lt;div align="center"&gt;
			&lt;strong&gt;Navigational Link With Images Enabled&lt;/strong&gt; 
			&lt;/div&gt;
			&lt;/td&gt;
			&lt;td&gt;&lt;strong&gt;Navigational Link with Images Disabled&lt;/strong&gt;&lt;br /&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td&gt;&amp;nbsp;&lt;img style="margin-bottom: 0px" src="http://janeandrobot.com/image.axd?picture=image-example-with-background.png" alt="Navigational link with images enabled" width="221" height="40" /&gt;&lt;/td&gt;
			&lt;td&gt;&amp;nbsp;&lt;img style="margin-bottom: 0px" src="http://janeandrobot.com/image.axd?picture=image-example-without-background.png" alt="Navigational link with images disabled" width="242" height="43" /&gt;&lt;/td&gt;
		&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;
&lt;h2&gt;Be cautious about using images for headings and logos&lt;/h2&gt;
&lt;p&gt;
Many web sites use an image for the header of the page or for the company logo. This implementation works well, but be sure that you replicate the company name, heading text, or other words from that image in the ALT text. 
&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;
	&lt;p&gt;
	If you have an image as the page&amp;#39;s H1 tag, keep in mind that the H1 is one of the most important clues for a search engine to determine what the page is about, so consider using text instead of an image or at least using descriptive ALT text. In the example below, the code is using CSS to display an image of the company logo as the H1 tag. A better implementation would be to display the image in the header of the page, and use the H1 tag to give visitors and search engines descriptive information about the page. 
	&lt;/p&gt;
	&lt;/li&gt;&lt;!-- code formatted by http://manoli.net/csharpformat/ --&gt;
	&lt;pre class="csharpcode"&gt;
&lt;span class="kwrd"&gt;&amp;lt;&lt;/span&gt;&lt;span class="html"&gt;h1&lt;/span&gt; &lt;span class="attr"&gt;class&lt;/span&gt;&lt;span class="kwrd"&gt;=&amp;quot;home-logo&amp;quot;&lt;/span&gt;&lt;span class="kwrd"&gt;&amp;gt;&lt;/span&gt;Company Name&lt;span class="kwrd"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="html"&gt;h1&lt;/span&gt;&lt;span class="kwrd"&gt;&amp;gt;&lt;/span&gt;
	&lt;/pre&gt;
	&lt;p&gt;
	The CSS for this implementation uses a text-indent of -99999em to push the text far off-screen. This is not recommended, both because a visitor who loads the page with images turned off can&amp;#39;t see the text (so the heading space is simply blank) and because search engines may find the practice deceptive (the text is hidden). 
	&lt;/p&gt;
	&lt;pre&gt;
.home-logo {
	background: transparent url(/images/logo1.gif) no-repeat scroll center top;
	height: 63px;
	margin-top: 35px;
	text-indent: -99999em;
}
	&lt;/pre&gt;
	&lt;li&gt;
	&lt;p&gt;
	If your header includes an image of your company logo, avoid commonly used ALT text such as &amp;quot;home&amp;quot; or &amp;quot;logo&amp;quot;. Instead, succinctly describe your company or home page (using either the company name or a brief description of the site). (Also, avoid naming your company logo something like &lt;strong&gt;logo.jpg&lt;/strong&gt;.) 
	&lt;/p&gt;
	&lt;/li&gt;
	&lt;li&gt;
	&lt;p&gt;
	If your site includes a header that consists entirely of a large image, test the layout of the page with images turned off. In some cases, the result can be a large area of white space that pushes all content below the fold. In the example below, all company information and details about the site are lost without the header image.
	&lt;/p&gt;
	&lt;div style="text-align: center"&gt;
	&lt;img src="http://janeandrobot.com/image.axd?picture=header-images-off_small.gif" alt="" width="356" height="143" /&gt; 
	&lt;/div&gt;
	&lt;p&gt;
	&amp;nbsp;
	&lt;/p&gt;
	&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;&lt;a name="noncontent" title="noncontent"&gt;&lt;/a&gt;Block Non-Content Images&lt;/h2&gt;
&lt;p&gt;
If you use a lot of non-content images (for instance, arrows, bullets, and boxes), you likely don&amp;#39;t want those indexed. Since search engine robots spend limited time crawling each site, it may make sense to block them from crawling these types of images so they can spend all the available resources on the pages and images you do want indexed. As a bonus, if you want to provide an image search on your site (for instance, using the &lt;a href="http://nathanbuggia.com/post/Custom-Site-Search-Engine-Using-the-Live-Search-API.aspx"&gt;Live Search API&lt;/a&gt;), if only content images are indexed, then the image results will be more useful for your visitors. 
&lt;/p&gt;
&lt;p&gt;
A good way to block non-content images is to place them in a separate folder from your content images and then block that folder using robots.txt. For instance, if you place these images in a folder called &lt;strong&gt;no_index_images&lt;/strong&gt;, your robots.txt file would contain: 
&lt;/p&gt;
&lt;pre&gt;
User-agent: *
Disallow: /no_index_images/
&lt;/pre&gt;
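&lt;p&gt;
Before deploying a rule like this, it can be worth confirming that it blocks only what you intend. The sketch below is one way to check, using the standard Python urllib.robotparser module. The helper name is_crawlable and the two image paths are assumptions for illustration, and individual search engines may interpret robots.txt slightly differently than this parser does.
&lt;/p&gt;

```python
from urllib.robotparser import RobotFileParser

# The robots.txt rules from the example above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /no_index_images/
"""


def is_crawlable(robots_txt, path, user_agent="*"):
    """Return True if the given robots.txt text lets user_agent fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Hypothetical paths: one non-content image, one content image.
    for path in ("/no_index_images/arrow.gif", "/images/lavender-plant.jpg"):
        print(path, is_crawlable(ROBOTS_TXT, path))
```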
&lt;h2&gt;Opt-in to Google&amp;#39;s Image Labeler&lt;/h2&gt;
&lt;p&gt;
In Google&amp;#39;s Webmaster Tools, you can &lt;a href="http://www.google.com/support/webmasters/bin/answer.py?answer=48367"&gt;opt your images&lt;/a&gt; into their &lt;a href="http://images.google.com/imagelabeler/"&gt;Image Labeler&lt;/a&gt;. This enables others on the web to tag your images, which in turn provides Google with additional details about what the images are about and can help them rank for a wider variety of queries in Google Image Search. 
&lt;/p&gt;
&lt;h2&gt;Images can be search engine and user friendly&lt;/h2&gt;
&lt;p&gt;
With a little planning and good structure, you can effectively use images on your site in ways that benefit both Jane and robots. And by optimizing images in the ways described in this article, you may also be able to tap into an additional acquisition channel - image search.&amp;nbsp; 
&lt;/p&gt;</description>
      <link>http://feedproxy.google.com/~r/janeandrobot/~3/DiV2trXU1r0/post.aspx</link>
      <author>Vanessa Fox</author>
      <comments>http://janeandrobot.com/post/Effectively-Using-Images.aspx#comment</comments>
      <guid isPermaLink="false">http://janeandrobot.com/post.aspx?id=1dc89436-6604-4f3f-aaaa-a1aa7ba094c1</guid>
      <pubDate>Wed, 14 May 2008 07:30:00 -0700</pubDate>
      <category>Design Patterns</category>
      <dc:publisher>Vanessa Fox</dc:publisher>
      <pingback:server>http://janeandrobot.com/pingback.axd</pingback:server>
      <pingback:target>http://janeandrobot.com/post.aspx?id=1dc89436-6604-4f3f-aaaa-a1aa7ba094c1</pingback:target>
      <slash:comments>28</slash:comments>
      <trackback:ping>http://janeandrobot.com/trackback.axd?id=1dc89436-6604-4f3f-aaaa-a1aa7ba094c1</trackback:ping>
      <wfw:comment>http://janeandrobot.com/post/Effectively-Using-Images.aspx#comment</wfw:comment>
      <wfw:commentRss>http://janeandrobot.com/syndication.axd?post=1dc89436-6604-4f3f-aaaa-a1aa7ba094c1</wfw:commentRss>
    <feedburner:origLink>http://janeandrobot.com/post.aspx?id=1dc89436-6604-4f3f-aaaa-a1aa7ba094c1</feedburner:origLink></item>
    <item>
      <title>Domain Canonicalization</title>
      <description>&lt;p&gt;
Pop quiz: what&amp;#39;s the difference between the following 
URLs:
&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;http://website.com&lt;/li&gt;
	&lt;li&gt;http://www.website.com&lt;/li&gt;
	&lt;li&gt;http://website.com/default.php&lt;/li&gt;
	&lt;li&gt;http://www.website.com/default.php&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;
Give up? If you&amp;#39;re a user, then chances are you expect all of 
those URLs will lead you to the same page. Robots, 
however, are not as good at determining if pages are the 
same, so they often store each separately. A big part of how search engines rank pages is based on how many external links those pages have. If other sites on the web link to the different versions of your home page, then search engines may calculate the value of each URL separately, based on the number of links to each version. This can effectively diminish the potential rank 
your page would have if it were found (and linked to) by only one URL.
&lt;/p&gt;
&lt;p&gt;
The practice of consolidating all versions of a page under one URL is 
referred to as &amp;quot;canonicalization&amp;quot; (because you collapse all versions under the &amp;quot;canonical&amp;quot; or true version). The four examples listed 
above are the most common, but there are 
potentially many, many URLs that lead you to the 
same page. By adhering to several best practices, you should 
be able to address 90% of common site-wide canonicalization issues 
on your site and consequently improve how your site ranks.
&lt;/p&gt;
&lt;h3&gt;Recommendation&lt;/h3&gt;
&lt;p&gt;
The solution is to be 
explicit about the canonical form of your URLs. Following are four best practices to achieve this, with 
specific code and configuration examples.
&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;
	&lt;p&gt;
	&lt;strong&gt;Select WWW or Non-WWW, then redirect the other 
	option to your preferred version.&lt;br /&gt;
	&lt;/strong&gt;The hard part is choosing if you want your site to 
	be &lt;em&gt;&amp;quot;www.website.com&amp;quot;&lt;/em&gt; or simply &lt;em&gt;&amp;quot;website.com&amp;quot;&lt;/em&gt;. There is no right answer for every 
	company so you&amp;#39;ll have to figure this out on your own
	&lt;em&gt;(but, removing the &amp;quot;www.&amp;quot; saves your customers 4 
	keystrokes, which really add up on a mobile device, and 
	it makes your brand the first thing your customers see).&lt;/em&gt;
	&lt;/p&gt;
	&lt;p&gt;
	Once you&amp;#39;ve selected, you then need to find a way to 
	trap all requests to your application, check which form 
	is being used, and if it is not the correct form, initiate a 301 Redirect to the correct form. 
	For example, if the user types in &lt;em&gt;wikipedia.org&lt;/em&gt;, 
	they will automatically get redirected to &lt;em&gt;
	www.wikipedia.org&lt;/em&gt;.
	&lt;/p&gt;
	&lt;/li&gt;
	&lt;li&gt;
	&lt;p&gt;
	&lt;strong&gt;Remove the default filename from the end of your 
	URLs.&lt;/strong&gt;&lt;br /&gt;
	All web servers allow you to select one or more default 
	filenames to serve when the browser requests a 
	directory. For example, this website is run on IIS, so 
	when the user requests &lt;em&gt;&amp;quot;http://janeandrobot.com&amp;quot;&lt;/em&gt; 
	we really serve &lt;em&gt;
	&amp;quot;http://janeandrobot.com/default.aspx&amp;quot;. &lt;/em&gt;
	&lt;/p&gt;
	&lt;p&gt;
	In the same code you use to enforce www vs. non-www, you 
	should also check and see if the default filename is at 
	the end of the URL and then trim it off. So, &lt;em&gt;
	&amp;quot;http://janeandrobot.com/default.aspx&amp;quot;&lt;/em&gt; would be 
	converted to &lt;em&gt;&amp;quot;http://janeandrobot.com&amp;quot;&lt;/em&gt;.
	&lt;/p&gt;
	&lt;/li&gt;
	&lt;li&gt;
	&lt;p&gt;
	&lt;strong&gt;Link internally to the 
	canonical form of your URL&lt;/strong&gt;. &lt;br /&gt;
	Make sure 
	you always link to the proper canonical form of your 
	URLs from within your site. This practice helps encourage external sites to link to the site using the correct version as well (since those linking to you often cut and paste from your pages or RSS feed.) Note 
	there is a degree of diminishing returns here, so you don&amp;#39;t 
	need to spend the whole weekend hunting down every last 
	URL. Just make sure to review your site&amp;#39;s primary 
	navigation, top landing pages and blog.
	&lt;/p&gt;
	&lt;/li&gt;
	&lt;li&gt;
	&lt;p&gt;
	&lt;strong&gt;Use Google Webmaster Tools to tell Google the 
	correct form.&lt;/strong&gt;&lt;br /&gt;
	Implementing these best practices on your site is ideal, since doing so addresses the problem for 
	all search engines and gives your customers a 
	consistent, properly branded navigation experience. But what can you do if you reviewed 
	steps 1-3 and found that it would take six months to 
	implement on your production site? There is 
	something that you can do today: using
	&lt;a href="http://google.com/webmaster"&gt;Google&amp;#39;s Webmaster 
	Tools&lt;/a&gt;, you can navigate to the &amp;quot;Tools&amp;quot; 
	section and select &amp;quot;Set preferred domain.&amp;quot; Here you can specify if you&amp;#39;d like Google to use &lt;em&gt;
	&amp;quot;www.website.com&amp;quot;&lt;/em&gt; or &lt;em&gt;&amp;quot;website.com&amp;quot;&lt;/em&gt; in 
	their index and search results, as well as consolidate links to both versions. Note that while this 
	will provide you short-term benefit from Google, it does 
	not help you in Yahoo! or Live Search.
	&lt;/p&gt;
	&lt;/li&gt;
&lt;/ol&gt;
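&lt;p&gt;
The first two practices can be sketched as a single check that runs on every request. This is an illustration under stated assumptions, not the code this site runs: the function name canonical_redirect, the CANONICAL_HOST setting, and the DEFAULT_FILENAMES list are hypothetical, and the sketch ignores cases such as trailing slashes or mixed-case paths. It returns the URL that should receive a 301 redirect, or None when the requested URL is already canonical.
&lt;/p&gt;

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical settings for illustration; adjust them to your own site.
CANONICAL_HOST = "janeandrobot.com"
DEFAULT_FILENAMES = ("default.aspx", "index.html")


def canonical_redirect(url):
    """Return the URL to 301-redirect to, or None if url is already canonical."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()

    # Practice 1: send the www form to the preferred non-www host.
    if host == "www." + CANONICAL_HOST:
        host = CANONICAL_HOST
    netloc = host if parts.port is None else host + ":" + str(parts.port)

    # Practice 2: trim a default filename from the end of the path.
    path = parts.path
    directory, _, filename = path.rpartition("/")
    if filename.lower() in DEFAULT_FILENAMES:
        path = directory

    canonical = urlunsplit((parts.scheme, netloc, path, parts.query, parts.fragment))
    return None if canonical == url else canonical
```

&lt;p&gt;
In a real application this check would sit in whatever hook sees every request (for example a module, filter, or middleware), and a non-None result would be sent back with a 301 status and a Location header. Handling both rules in one place also means a visitor is redirected once, not twice.
&lt;/p&gt;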
&lt;h3&gt;Checking Your Website&lt;/h3&gt;
&lt;p&gt;
To check your website to see if you&amp;#39;re handling domain 
canonicalization correctly, you can use the
&lt;a href="https://addons.mozilla.org/en-US/firefox/addon/3829"&gt;
Live HTTP Headers&lt;/a&gt; add-on for Firefox.&amp;nbsp;
&lt;/p&gt;
&lt;div style="text-align: center"&gt;
&lt;img src="http://janeandrobot.com/image.axd?picture=live-http-headers-screensho.jpg" alt="" /&gt;
&lt;/div&gt;
&lt;p&gt;
Open the Live HTTP Headers tool, then 
try all the variations of the URL at several different 
levels to ensure they all redirect back to the appropriate 
canonical form. As you&amp;#39;re checking each variation, look at the HTTP headers using the Firefox 
plug-in to ensure they are all 301 redirects (and not, for instance, 302 
redirects). 
&lt;/p&gt;
&lt;p&gt;
Here&amp;#39;s an example test case:
&lt;/p&gt;
&lt;table border="1" cellspacing="0" cellpadding="2" style="width: 100%"&gt;
	&lt;tbody&gt;
		&lt;tr&gt;
			&lt;td&gt;&lt;strong&gt;Canonical URL Form&lt;/strong&gt;&lt;/td&gt;
			&lt;td&gt;&lt;strong&gt;Test Case&lt;/strong&gt;&lt;/td&gt;
			&lt;td align="right"&gt;&lt;strong&gt;Test Result&lt;/strong&gt;&lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td&gt;&lt;a href="http://janeandrobot.com"&gt;
			http://janeandrobot.com&lt;/a&gt;&lt;/td&gt;
			&lt;td&gt;janeandrobot.com&lt;/td&gt;
			&lt;td align="right"&gt;Success&lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td&gt;&amp;nbsp;&lt;/td&gt;
			&lt;td&gt;janeandrobot.com/default.aspx&lt;/td&gt;
			&lt;td align="right"&gt;Success&lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td&gt;&amp;nbsp;&lt;/td&gt;
			&lt;td&gt;www.janeandrobot.com&lt;/td&gt;
			&lt;td align="right"&gt;Success&lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td&gt;&amp;nbsp;&lt;/td&gt;
			&lt;td&gt;www.janeandrobot.com/default.aspx&lt;/td&gt;
			&lt;td align="right"&gt;Success&lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td&gt;&lt;a href="http://janeandrobot.com/about.aspx"&gt;
			http://janeandrobot.com/about.aspx&lt;/a&gt;&lt;/td&gt;
			&lt;td&gt;janeandrobot.com/about.aspx&lt;/td&gt;
			&lt;td align="right"&gt;Success&lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td&gt;&amp;nbsp;&lt;/td&gt;
			&lt;td&gt;www.janeandrobot.com/about.aspx&lt;/td&gt;
			&lt;td align="right"&gt;Success&lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td&gt;&lt;a href="http://janeandrobot.com/folder"&gt;
			http://janeandrobot.com/folder&lt;/a&gt;&lt;/td&gt;
			&lt;td&gt;janeandrobot.com/folder&lt;/td&gt;
			&lt;td align="right"&gt;Success&lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td&gt;&amp;nbsp;&lt;/td&gt;
			&lt;td&gt;janeandrobot.com/folder/default.aspx&lt;/td&gt;
			&lt;td align="right"&gt;Success&lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td&gt;&amp;nbsp;&lt;/td&gt;
			&lt;td&gt;www.janeandrobot.com/folder&lt;/td&gt;
			&lt;td align="right"&gt;Success&lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td&gt;&amp;nbsp;&lt;/td&gt;
			&lt;td&gt;www.janeandrobot.com/folder/default.aspx&lt;/td&gt;
			&lt;td align="right"&gt;Success&lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td&gt;
			&lt;a href="http://janeandrobot.com/folder/test.aspx"&gt;
			http://janeandrobot.com/folder/test.aspx&lt;/a&gt;&lt;/td&gt;
			&lt;td&gt;janeandrobot.com/folder/test.aspx&lt;/td&gt;
			&lt;td align="right"&gt;Success&lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td&gt;&amp;nbsp;&lt;/td&gt;
			&lt;td&gt;www.janeandrobot.com/folder/test.aspx&lt;/td&gt;
			&lt;td align="right"&gt;Success&lt;/td&gt;
		&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;
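&lt;p&gt;
Writing out every variation by hand gets tedious on a larger site, so a test list like the one above can be generated. The sketch below is one possible approach: the function name url_variations is hypothetical, default.aspx is assumed to be the default filename, and a last path segment without a dot is assumed to be a folder. Each generated URL would then be requested and its response checked for a 301 status pointing at the canonical form.
&lt;/p&gt;

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the default document configured on the server.
DEFAULT_FILENAME = "default.aspx"


def url_variations(canonical_url):
    """List the www/non-www and default-filename variants of a canonical URL."""
    parts = urlsplit(canonical_url)
    host = parts.netloc
    bare_host = host[4:] if host.startswith("www.") else host
    hosts = [bare_host, "www." + bare_host]

    paths = [parts.path]
    # Heuristic: a last segment without a dot is treated as a folder, which
    # can also be reached through the default filename.
    last_segment = parts.path.rsplit("/", 1)[-1]
    if "." not in last_segment:
        paths.append(parts.path.rstrip("/") + "/" + DEFAULT_FILENAME)

    return [
        urlunsplit((parts.scheme, h, p, "", ""))
        for h in hosts
        for p in paths
    ]
```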
&lt;h3&gt;Examples&lt;/h3&gt;
&lt;p class="style4"&gt;
Canonicalization issues are very common, and being a 
Microsoft employee, I don&amp;#39;t have to go far to find an 
example. Check out the website for Microsoft&amp;#39;s annual
&lt;a href="http://visitmix.com"&gt;Mix conference&lt;/a&gt; for web 
developers.&amp;nbsp;
&lt;/p&gt;
&lt;div style="text-align: center"&gt;
&lt;img src="http://janeandrobot.com/image.axd?picture=mix08-screen-shot.jpg" alt="" /&gt;
&lt;/div&gt;
&lt;p&gt;
I was able to generate the table below by plugging the 
common URL variations into Yahoo&amp;#39;s Site Explorer to find a list of links to each variation.&amp;nbsp;
&lt;/p&gt;
&lt;table border="1" cellspacing="0" cellpadding="2" style="width: 100%"&gt;
	&lt;tbody&gt;
		&lt;tr&gt;
			&lt;td&gt;&lt;strong&gt;URL Variation&lt;/strong&gt;&lt;/td&gt;
			&lt;td class="style3" align="right"&gt;&lt;strong&gt;Number of Links from 
			within website&lt;/strong&gt;&lt;/td&gt;
			&lt;td class="style3" align="right"&gt;&lt;strong&gt;Number of Links from 
			outside websites&lt;/strong&gt;&lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td&gt;&lt;a href="http://visitmix.com"&gt;
			http://visitmix.com&lt;/a&gt;&lt;/td&gt;
			&lt;td class="style3" align="right"&gt;17,663&lt;/td&gt;
			&lt;td class="style3" align="right"&gt;59,498 &lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td&gt;&lt;a href="http://www.visitmix.com"&gt;
			http://www.visitmix.com&lt;/a&gt;&lt;/td&gt;
			&lt;td class="style3" align="right"&gt;9,074&lt;/td&gt;
			&lt;td class="style3" align="right"&gt;22,179 &lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td&gt;&lt;a href="http://visitmix.com/default.aspx"&gt;
			http://visitmix.com/default.aspx&lt;/a&gt; &lt;/td&gt;
			&lt;td class="style3" align="right"&gt;0&lt;/td&gt;
			&lt;td class="style3" align="right"&gt;22&lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td&gt;&lt;a href="http://www.visitmix.com/default.aspx"&gt;
			http://www.visitmix.com/default.aspx&lt;/a&gt; &lt;/td&gt;
			&lt;td class="style3" align="right"&gt;0&lt;/td&gt;
			&lt;td class="style3" align="right"&gt;12&lt;/td&gt;
		&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;
&lt;br /&gt;
Looking through these numbers yields some interesting insights:
&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;
	&lt;p&gt;
	Not canonicalizing &lt;em&gt;&amp;quot;www&amp;quot;&lt;/em&gt; vs &lt;em&gt;&amp;quot;non-www&amp;quot;&lt;/em&gt; is 
	definitely hurting their ranking - you can tell because 
	each version has a large number of inlinks of its own. 
	Ranking is done on a logarithmic scale, so every 
	additional link is more valuable than the one before. If they redirected all versions to one canonical form, search engines would see their home page as having 81,711 external links, which would be a substantial boost.
	&lt;/p&gt;
	&lt;/li&gt;
	&lt;li&gt;
	&lt;p&gt;
	They are not good about using the same version of the 
	URL within their site. If you&amp;#39;re not cognizant of this 
	on your site, others won&amp;#39;t be either. It looks like they use &lt;em&gt;
	visitmix.com&lt;/em&gt; about two-thirds of the time internally, and
	&lt;em&gt;www.visitmix.com&lt;/em&gt; the other third.
	&lt;/p&gt;
	&lt;/li&gt;
&lt;/ul&gt;
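&lt;p&gt;
The arithmetic behind these observations is easy to reproduce. The short sketch below uses the link counts from the table above; the function names are hypothetical. It sums the external links across all four variations (81,711) and computes how often the non-www form is used for internal links (roughly 66%).
&lt;/p&gt;

```python
# Link counts copied from the table above: (internal links, external links).
LINK_COUNTS = {
    "http://visitmix.com": (17663, 59498),
    "http://www.visitmix.com": (9074, 22179),
    "http://visitmix.com/default.aspx": (0, 22),
    "http://www.visitmix.com/default.aspx": (0, 12),
}


def consolidated_external_links(counts):
    """External links one canonical URL would have if all variants redirected to it."""
    return sum(external for _, external in counts.values())


def internal_share(counts, url):
    """Fraction of all internal links that point at the given variant."""
    total_internal = sum(internal for internal, _ in counts.values())
    return counts[url][0] / total_internal


if __name__ == "__main__":
    print(consolidated_external_links(LINK_COUNTS))
    print(round(internal_share(LINK_COUNTS, "http://visitmix.com"), 2))
```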
&lt;h3&gt;Additional Resources&lt;/h3&gt;
&lt;ul&gt;
	&lt;li&gt;
	&lt;a href="http://www.mattcutts.com/blog/seo-advice-url-canonicalization/"&gt;Matt Cutts&amp;#39;s Article on Canonicalization&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href="http://www.sugarrae.com/be-a-normalizer-a-c14n-exterminator/"&gt;Additional Canonicalization Scenarios from Ian Ring&lt;/a&gt; - a few other great scenarios like capitalization and other default values. &lt;/li&gt;
	&lt;li&gt;&lt;a href="http://siteexplorer.search.yahoo.com/"&gt;Yahoo Site Explorer&lt;/a&gt; - see how many inlinks you have 
	for each URL variation&lt;/li&gt;
	&lt;li&gt;
	&lt;a href="https://addons.mozilla.org/en-US/firefox/addon/3829"&gt;Live HTTP Headers&lt;/a&gt; - check your redirects to make 
	sure you&amp;#39;re implementing 301 redirects, not 302s.&lt;/li&gt;
&lt;/ul&gt;</description>
      <link>http://feedproxy.google.com/~r/janeandrobot/~3/JG5pdaD7v5w/post.aspx</link>
      <author>Nathan Buggia</author>
      <comments>http://janeandrobot.com/post/canonical-url-canonicalization-domain.aspx#comment</comments>
      <guid isPermaLink="false">http://janeandrobot.com/post.aspx?id=bc82aab4-a7c9-410c-9849-b1a445048f0a</guid>
      <pubDate>Fri, 02 May 2008 11:52:00 -0700</pubDate>
      <category>Design Patterns</category>
      <dc:publisher>Nathan Buggia</dc:publisher>
      <pingback:server>http://janeandrobot.com/pingback.axd</pingback:server>
      <pingback:target>http://janeandrobot.com/post.aspx?id=bc82aab4-a7c9-410c-9849-b1a445048f0a</pingback:target>
      <slash:comments>23</slash:comments>
      <trackback:ping>http://janeandrobot.com/trackback.axd?id=bc82aab4-a7c9-410c-9849-b1a445048f0a</trackback:ping>
      <wfw:comment>http://janeandrobot.com/post/canonical-url-canonicalization-domain.aspx#comment</wfw:comment>
      <wfw:commentRss>http://janeandrobot.com/syndication.axd?post=bc82aab4-a7c9-410c-9849-b1a445048f0a</wfw:commentRss>
    <feedburner:origLink>http://janeandrobot.com/post.aspx?id=bc82aab4-a7c9-410c-9849-b1a445048f0a</feedburner:origLink></item>
    <item>
      <title>Search-Friendly Design Patterns For Web Developers</title>
      <description>&lt;p&gt;
Search engine optimization (SEO) isn&amp;#39;t a marketing gimmick, spammy scheme, or destroyer of web usability. It&amp;#39;s a fundamental building block of effective infrastructure design that ensures web applications can thrive in an online environment in which potential customers turn to search first. It also can dramatically improve site usability and visitor happiness. 
&lt;/p&gt;
&lt;p&gt;
Watch this space for design patterns, code snippets, case studies, and implementation techniques for elegant code that both ensures potential customers can find your site online as well as provides them a compelling experience once they arrive. 
&lt;/p&gt;
&lt;p&gt;
&lt;a href="http://janeandrobot.com/about.aspx"&gt;Learn more about us and our philosophy&lt;/a&gt;. 
&lt;/p&gt;
&lt;p&gt;
View our Web 2.0 Expo slides and resources. 
&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;&lt;a href="http://janeandrobot.com/admin/Pages/web20presentations.html"&gt;Presentation slides&lt;/a&gt;&amp;nbsp;(AJAX Version via SlideShare) 
	&lt;ul&gt;
		&lt;li&gt;&lt;a rel="enclosure" href="http://janeandrobot.com/file.axd?file=01+-+The+truth+about+SEO.ppt"&gt;01 - The truth about SEO.ppt (2.10 mb)&lt;/a&gt;&lt;/li&gt;
		&lt;li&gt;&lt;a rel="enclosure" href="http://janeandrobot.com/file.axd?file=02+-+How+search+engines+work.ppt"&gt;02 - How search engines work.ppt (1.70 mb)&lt;/a&gt;&lt;/li&gt;
		&lt;li&gt;&lt;a rel="enclosure" href="http://janeandrobot.com/file.axd?file=03+-+Building+pages.ppt"&gt;03 - Building pages.ppt (1.58 mb)&lt;/a&gt;&lt;/li&gt;
		&lt;li&gt;&lt;a rel="enclosure" href="http://janeandrobot.com/file.axd?file=04+-+Architecting+navigation.ppt"&gt;04 - Architecting navigation.ppt (2.35 mb)&lt;/a&gt;&lt;/li&gt;
		&lt;li&gt;&lt;a rel="enclosure" href="http://janeandrobot.com/file.axd?file=05+-+Marketing+and+Development.ppt"&gt;05 - Marketing and Development.ppt (1.16 mb)&lt;/a&gt;&lt;/li&gt;
		&lt;li&gt;&lt;a rel="enclosure" href="http://janeandrobot.com/file.axd?file=06+-+Diagnosing+issues.ppt"&gt;06 - Diagnosing issues.ppt (2.25 mb)&lt;/a&gt; &lt;/li&gt;
	&lt;/ul&gt;
	&lt;/li&gt;
	&lt;li&gt;&lt;a href="http://janeandrobot.com/admin/Pages/seo-developer-accessibility-checklist.html"&gt;Search accessibility checklist&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href="http://janeandrobot.com/admin/Pages/seo-developer-discoverability-checklist.html"&gt;Search discoverability checklist&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href="http://janeandrobot.com/admin/Pages/seo-developer-conversion-checklist.html"&gt;Search conversion checklist&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href="http://janeandrobot.com/admin/Pages/seo-developer-resources.html"&gt;SEO/development resources&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href="http://janeandrobot.com/admin/Pages/seo-developer-tools.html"&gt;SEO/development tools&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;
&lt;a id="ctl00_hlnkFeed2" href="http://feeds.feedburner.com/janeandrobot"&gt;&lt;span class="nav_tab"&gt;&lt;img id="ctl00_i2" style="border-width: 0px" src="/pics/rssButton.gif" alt="feed" width="12" height="12" /&gt;&lt;/span&gt;&lt;/a&gt;&lt;a href="http://feeds.feedburner.com/janeandrobot"&gt;Sign up for our feed&lt;/a&gt;. 
&lt;/p&gt;</description>
      <link>http://feedproxy.google.com/~r/janeandrobot/~3/8rZuXnJtLWI/post.aspx</link>
      <author>Vanessa Fox</author>
      <comments>http://janeandrobot.com/post/Search-Friendly-Design-Patterns-For-Web-Developers.aspx#comment</comments>
      <guid isPermaLink="false">http://janeandrobot.com/post.aspx?id=aa80d7d5-b819-4b9b-90af-3a06ea9c211e</guid>
      <pubDate>Thu, 24 Apr 2008 03:17:00 -0700</pubDate>
      <category>Editorial</category>
      <dc:publisher>Vanessa Fox</dc:publisher>
      <pingback:server>http://janeandrobot.com/pingback.axd</pingback:server>
      <pingback:target>http://janeandrobot.com/post.aspx?id=aa80d7d5-b819-4b9b-90af-3a06ea9c211e</pingback:target>
      <slash:comments>110</slash:comments>
      <trackback:ping>http://janeandrobot.com/trackback.axd?id=aa80d7d5-b819-4b9b-90af-3a06ea9c211e</trackback:ping>
      <wfw:comment>http://janeandrobot.com/post/Search-Friendly-Design-Patterns-For-Web-Developers.aspx#comment</wfw:comment>
      <wfw:commentRss>http://janeandrobot.com/syndication.axd?post=aa80d7d5-b819-4b9b-90af-3a06ea9c211e</wfw:commentRss>
    <feedburner:origLink>http://janeandrobot.com/post.aspx?id=aa80d7d5-b819-4b9b-90af-3a06ea9c211e</feedburner:origLink></item>
  </channel>
</rss>
