<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0"><channel><title>Webmaster Blog</title><link>http://www.bing.com/community/blogs/webmaster/default.aspx</link><description>Official blog of the Live Search Webmaster Center Team.</description><dc:language /><generator>CommunityServer 2008.5 SP2 (Build: 40407.4157)</generator><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" href="http://feeds.feedburner.com/msdn/webmaster" type="application/rss+xml" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com" /><item><title>Robots speaking many languages</title><link>http://feedproxy.google.com/~r/msdn/webmaster/~3/AkRT1rl6ZpU/robots-speaking-many-languages.aspx</link><pubDate>Thu, 05 Nov 2009 23:52:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9560071</guid><dc:creator>Webmaster Center team</dc:creator><slash:comments>2</slash:comments><wfw:commentRss>http://www.bing.com/community/blogs/webmaster/rsscomments.aspx?PostID=9560071</wfw:commentRss><comments>http://www.bing.com/community/blogs/webmaster/archive/2009/11/05/robots-speaking-many-languages.aspx#comments</comments><description>&lt;p&gt;We've already covered &lt;a target="_blank" href="http://www.bing.com/community/blogs/webmaster/archive/2009/08/21/prevent-a-bot-from-getting-lost-in-space-sem-101.aspx"&gt;in past blog articles&lt;/a&gt; some of the basics about how webmasters can use a file called robots.txt to control how search engine crawlers (aka bots) crawl their websites. But there is so much more to talk about with bots. 
So let's take a bit of a deeper dive into the subject.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Topic 1: Using the proper text file encoding&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Webmasters use the robots.txt file to define specifically which files and directories compliant search engine bots may or may not crawl. Robots.txt files are basically &lt;a target="_blank" href="http://en.wikipedia.org/wiki/Text_file"&gt;text files&lt;/a&gt;. However, even something as seemingly straightforward as a text file is not as simple as it might seem. The encoding scheme used to save the file makes a big difference. For example, when you use the quintessential text file editor, the Notepad utility in Windows, you can save your text files in your choice of the following encoding types:&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;ANSI (aka &lt;a target="_blank" href="http://en.wikipedia.org/wiki/Windows-1252"&gt;Windows-1252&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;&lt;a target="_blank" href="http://en.wikipedia.org/wiki/Unicode"&gt;Unicode&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Unicode &lt;a target="_blank" href="http://en.wikipedia.org/wiki/Big_Endian"&gt;big endian&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a target="_blank" href="http://en.wikipedia.org/wiki/Utf-8"&gt;UTF-8&lt;/a&gt; (usually defined as "Unicode Transformation Format," but I've also seen alternatives starting with either "Universal " or even "UCS," which itself stands for "&lt;a target="_blank" href="http://en.wikipedia.org/wiki/Universal_Character_Set"&gt;universal character set&lt;/a&gt;")&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;If you choose to save your robots.txt file as either Unicode or Unicode big endian, the resulting file will not be compatible with most search engine bots. &lt;/p&gt;
&lt;p&gt;&lt;b&gt;Robots.txt file requirements&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;To ensure that search engine bots can read the directives for blocking or allowing content access in your robots.txt file (not just Bing's bot, but all of them), save the file using one of the following compatible encoding formats:&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a target="_blank" href="http://en.wikipedia.org/wiki/ASCII"&gt;American Standard Code for Information Interchange (ASCII)&lt;/a&gt; (a 7-bit, 128 character set) &lt;/li&gt;
&lt;li&gt;&lt;a target="_blank" href="http://en.wikipedia.org/wiki/ISO/IEC_8859-1"&gt;ISO-8859-1&lt;/a&gt; (an 8-bit, 256 character set backward compatible with US ASCII)&lt;/li&gt;
&lt;li&gt;UTF-8 (a variable-length character encoding version of Unicode that is backwards compatible with US ASCII)&lt;/li&gt;
&lt;li&gt;Windows-1252 (aka ANSI, as used in Microsoft Windows, it is an 8-bit, 256 character set backward compatible with US ASCII)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Sticking with one of these compatible encoding formats will ensure that the bots you wish to control can read, and thus act upon, your robots.txt file. For more information, check out this &lt;a target="_blank" href="http://www.microsoft.com/typography/unicode/cs.htm"&gt;article covering the history of character sets&lt;/a&gt; from the Microsoft Typography team.&lt;/p&gt;
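&lt;p&gt;If you are not sure how an existing robots.txt file was saved, a short script can check it. The following is a minimal Python sketch, not an official tool. It assumes that a file saved as Unicode or Unicode big endian starts with a UTF-16 byte-order mark, and the function names are hypothetical.&lt;/p&gt;

```python
import codecs

# Byte-order marks that open UTF-16 little endian ("Unicode") and
# UTF-16 big endian ("Unicode big endian") files.
UTF16_BOMS = (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)


def looks_like_utf16(data):
    """Return True if the raw bytes start with a UTF-16 byte-order mark."""
    return data.startswith(UTF16_BOMS)


def resave_as_utf8(data):
    """Decode UTF-16 bytes (using the byte-order mark) and re-encode as UTF-8."""
    return data.decode("utf-16").encode("utf-8")


# Simulate a robots.txt file that was saved in a UTF-16 encoding.
sample = "User-agent: *\nDisallow: /private/\n".encode("utf-16")

print(looks_like_utf16(sample))  # True
print(resave_as_utf8(sample))
```

&lt;p&gt;In practice you would read the bytes from your own robots.txt file (opened in binary mode) instead of the simulated sample.&lt;/p&gt;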
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Topic 2: Writing non-ASCII alphabetic characters in robots.txt&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;The limited number of compatible file encoding formats for robots.txt exposes a potential problem for some users.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;The &lt;a target="_blank" href="http://www.ietf.org/"&gt;Internet Engineering Task Force (IETF)&lt;/a&gt; proclaims that &lt;a target="_blank" href="http://en.wikipedia.org/wiki/Uniform_Resource_Identifier"&gt;Uniform Resource Identifiers (URIs)&lt;/a&gt;, comprising both &lt;a target="_blank" href="http://en.wikipedia.org/wiki/Uniform_Resource_Locator"&gt;Uniform Resource Locators (URLs)&lt;/a&gt; and &lt;a target="_blank" href="http://en.wikipedia.org/wiki/Uniform_Resource_Name"&gt;Uniform Resource Names (URNs)&lt;/a&gt;, must be written using the US-ASCII character set. However, ASCII's 128 characters cover only the English alphabet, numbers, and punctuation marks. Some of the alphabetic characters from other Latin-based languages, such as &amp;ntilde; in Spanish and &amp;ccedil; in French, are left out of ASCII. More significantly, most characters in non-Latin-based alphabets, such as pi (&amp;pi;) in Greek, ya (я) in Cyrillic, and entire alphabets from many other world languages, can't be accurately written in the limited, English-oriented ASCII.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;This limitation with regard to robots.txt can come into play for webmasters when bots visit web servers using languages whose characters fall outside of the ASCII character set. If a robots.txt file is present on that server and it includes directives to block bot access to files and directories whose names include non-ASCII characters, the bot may not interpret the directive as the webmaster intended.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Percent encoding to the rescue&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;There is a way to make sure the bots can properly read file and directory path names, regardless of whether they adhere to ASCII standards. When writing directives that include characters unavailable in ASCII, you can "escape" (aka &lt;a target="_blank" href="http://en.wikipedia.org/wiki/Percent-encoding"&gt;percent-encode&lt;/a&gt;) them, which enables the bot to read them.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Percent-encoded characters, discussed in the IETF's &lt;a target="_blank" href="http://tools.ietf.org/html/rfc3986"&gt;RFC 3986&lt;/a&gt;, are used as character substitutes. A percent-encoded character is a sequence of one or more three-character codes, each representing one &lt;a target="_blank" href="http://en.wikipedia.org/wiki/Octet_%28computing%29"&gt;octet&lt;/a&gt; and consisting of the "%" sign followed by two &lt;a target="_blank" href="http://en.wikipedia.org/wiki/Hexadecimal"&gt;hexadecimal&lt;/a&gt; digits. Percent encoding converts the octets of the character's UTF-8 encoding into a sequence of one or more ASCII-based codes that a URI-compliant bot can read.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;To demonstrate what percent-encoded text looks like, type &lt;span style="font-family: Courier New;"&gt;www.%62%69%6e%67.com&lt;/span&gt; in your browser's address bar. It will be automatically decoded into www.bing.com. The octet codes %62, %69, %6e, and %67 are decoded by the browser into letters b, i, n, and g, respectively. Note, though, that percent encoding is really recommended for the non-ASCII characters in a URL path, where it minimizes the potential for decoding errors.&lt;/p&gt;
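&lt;p&gt;The same decoding can be reproduced outside the browser. Here is a minimal sketch using the unquote function from Python's standard urllib.parse module; the variable names are only for illustration.&lt;/p&gt;

```python
from urllib.parse import unquote

# Each %HH code is replaced by the character that octet represents.
encoded = "www.%62%69%6e%67.com"
decoded = unquote(encoded)

print(decoded)  # www.bing.com
```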
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Real world example&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Let's look at a real-world example. Suppose you were the webmaster for a website that contained the URL http://www.domain.com/папка/ (the folder name in the sample URL is written in Cyrillic and literally means "folder"). To block a bot from accessing that folder on your website using percent encoding in your robots.txt file, you would need to write the directive as follows:&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: Courier New;"&gt;Disallow: /%D0%BF%D0%B0%D0%BF%D0%BA%D0%B0/&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;If instead you simply wrote&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: Courier New;"&gt;Disallow: /папка/&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;the bot may not be able to read the directive and may thus fail to perform as desired.&lt;/p&gt;
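&lt;p&gt;If you have Python available, you can generate the encoded directive rather than type it by hand. This is a minimal sketch using the quote function from the standard urllib.parse module, which encodes characters as UTF-8 and leaves the "/" separators alone by default. The path is the sample folder from above.&lt;/p&gt;

```python
from urllib.parse import quote

# The sample Cyrillic folder name from the example URL.
path = "/папка/"

# quote() percent-encodes every non-ASCII character as its UTF-8 octets
# and, by default, keeps "/" unescaped.
directive = "Disallow: " + quote(path)

print(directive)  # Disallow: /%D0%BF%D0%B0%D0%BF%D0%BA%D0%B0/
```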
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Performing percent encoding&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;So how do you translate your non-ASCII characters into escape-encoded octets? Well, it's a bit of a chore, frankly. If you search for them, there are a few websites and/or tools that offer to perform percent encoding for you, but rather than endorse a site I know nothing about, I'll instead tell you how to manually calculate the conversion. If you want to use an automated tool, go for it. But knowing how the process works will allow you to verify that a tool encoded your characters correctly.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;i&gt;Warning!&lt;/i&gt;&lt;/b&gt; I'm going to get pretty tech geeky here. If working with hexadecimal and &lt;a target="_blank" href="http://en.wikipedia.org/wiki/Binary_numeral_system"&gt;binary&lt;/a&gt; numbers is not your thing, I apologize up front!&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;OK, thus warned, let's get to it. You first need to know the &lt;a target="_blank" href="http://en.wikipedia.org/wiki/List_of_Unicode_characters"&gt;Unicode hexadecimal value&lt;/a&gt; (the code point) for each character you want to encode. These values are usually presented as U+&lt;i&gt;HHHH&lt;/i&gt;. The four "H" hex digits are what you need.&lt;/p&gt;
&lt;p&gt;As defined in &lt;a target="_blank" href="http://www.ietf.org/rfc/rfc3987.txt"&gt;IETF RFC 3987&lt;/a&gt;, the escape-encoded characters can be between one and four octets in length. The first octet of the sequence defines how many octets you need to represent the specific UTF-8 character. The higher the hex number, the more octets you need to express it. Remember these rules:&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Characters with hex values between 0000 and 007F require only one octet. The &lt;a target="_blank" href="http://en.wikipedia.org/wiki/High-order_bit"&gt;high-order&lt;/a&gt; (left most) bit of the binary octet will always be 0 and the remaining seven bits are used to define the character.&lt;/li&gt;
&lt;li&gt;Characters with hex values between 0080 and 07FF require two octets. The right most octet (last of the sequence) will always have the first two highest order bits set to 10. The remaining six bit positions of that octet are the first six &lt;a target="_blank" href="http://en.wikipedia.org/wiki/Least_significant_bit"&gt;low-order bits&lt;/a&gt; of the hex number's converted binary value (I set the Calculator utility in Windows to &lt;b&gt;Scientific &lt;/b&gt;view to do that conversion). The next octet (the first in the sequence, positioned to the left of the last octet) always starts with the first three highest order bits set to 110 (the number of leading 1 bits indicates the number of octets needed to represent the character - in this case, two). The remaining higher bits of the binary-converted hex number will fill in the last five lower order bit positions (add one or more 0 at the high end if there aren't enough remaining bits to complete the 8-bit octet). &lt;/li&gt;
&lt;li&gt;Characters with hex values between 0800 and FFFF require three octets. Use the same right-to-left octet encoding process as the two-octet character, but start the first (highest) octet with 1110.&lt;/li&gt;
&lt;li&gt;Characters with hex values higher than FFFF require four octets. Use the same right-to-left octet encoding process as the two-octet character, but start the first (highest) octet with 11110.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Below is a table to help illustrate these concepts. The letter &lt;i&gt;n&lt;/i&gt; in the table represents the open bit positions in each octet for encoding the character's binary number.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;table cellpadding="0" cellspacing="0" border="1"&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td width="156" valign="top"&gt;
&lt;p align="center"&gt;&lt;b&gt;Hexadecimal value&lt;/b&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td width="306" valign="top"&gt;
&lt;p align="center"&gt;&lt;b&gt;Octet sequence (in binary)&lt;/b&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td width="156" valign="top"&gt;
&lt;p&gt;0000 0000-0000 007F&lt;/p&gt;
&lt;/td&gt;
&lt;td width="306" valign="top"&gt;
&lt;p&gt;0&lt;i&gt;nnnnnnn&lt;/i&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td width="156" valign="top"&gt;
&lt;p&gt;0000 0080-0000 07FF&lt;/p&gt;
&lt;/td&gt;
&lt;td width="306" valign="top"&gt;
&lt;p&gt;110&lt;i&gt;nnnnn&lt;/i&gt; 10&lt;i&gt;nnnnnn&lt;/i&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td width="156" valign="top"&gt;
&lt;p&gt;0000 0800-0000 FFFF&lt;/p&gt;
&lt;/td&gt;
&lt;td width="306" valign="top"&gt;
&lt;p&gt;1110&lt;i&gt;nnnn&lt;/i&gt; 10&lt;i&gt;nnnnnn&lt;/i&gt; 10&lt;i&gt;nnnnnn&lt;/i&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td width="156" valign="top"&gt;
&lt;p&gt;0001 0000-0010 FFFF&lt;/p&gt;
&lt;/td&gt;
&lt;td width="306" valign="top"&gt;
&lt;p&gt;11110&lt;i&gt;nnn&lt;/i&gt; 10&lt;i&gt;nnnnnn&lt;/i&gt; 10&lt;i&gt;nnnnnn&lt;/i&gt; 10&lt;i&gt;nnnnnn&lt;/i&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
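&lt;p&gt;The ranges in the table translate directly into a lookup. The following Python sketch (the function name is hypothetical) returns the number of octets for a given hex value and compares the answer with the length of Python's own UTF-8 encoding of a few sample characters.&lt;/p&gt;

```python
def octets_needed(code_point):
    """Return how many UTF-8 octets a character's hex value requires."""
    if code_point in range(0x0000, 0x0080):
        return 1
    if code_point in range(0x0080, 0x0800):
        return 2
    if code_point in range(0x0800, 0x10000):
        return 3
    if code_point in range(0x10000, 0x110000):
        return 4
    raise ValueError("value is outside the ranges in the table")


# One sample character from each row of the table.
samples = ("n", "п", "\u20ac", "\U0001F600")
for ch in samples:
    # Print the hex value, the table's answer, and Python's own byte count.
    print(format(ord(ch), "04X"), octets_needed(ord(ch)), len(ch.encode("utf-8")))
```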
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Let's demo this using the first letter of the Cyrillic example given above, п. To manually percent encode this UTF-8 character, do the following:&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Look up the &lt;a target="_blank" href="http://en.wikipedia.org/wiki/List_of_Unicode_characters"&gt;character's hex value&lt;/a&gt;. The hex value for the lower case version of this character is 043F.&lt;/li&gt;
&lt;li&gt;Use the table above to determine the number of octets needed. 043F requires two.&lt;/li&gt;
&lt;li&gt;Convert the hex value to binary. Windows Calculator converted it to 10000111111.&lt;/li&gt;
&lt;li&gt;Build the lowest order octet based on the rules stated earlier. We get 10111111.&lt;/li&gt;
&lt;li&gt;Build the next, higher order octet. We get 11010000.&lt;/li&gt;
&lt;li&gt;This results in a binary octet sequence of 11010000 10111111.&lt;/li&gt;
&lt;li&gt;Reconvert each octet in the sequence into hex. We get a converted sequence of D0 BF.&lt;/li&gt;
&lt;li&gt;Write each octet with a preceding percent symbol (and no spaces in-between, please!) to finish the encoding: %D0%BF&lt;/li&gt;
&lt;/ol&gt;
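&lt;p&gt;The manual steps above can also be sketched in a few lines of Python. This is an illustration of the two-octet case only, with a hypothetical function name, and it uses plain division and remainder arithmetic to split the bits the same way the steps do.&lt;/p&gt;

```python
def percent_encode_two_octet(char):
    """Percent-encode one character whose hex value is between 0080 and 07FF."""
    cp = ord(char)
    if cp not in range(0x80, 0x800):
        raise ValueError("character does not need exactly two octets")
    # Lowest order octet: 10 followed by the six low-order bits.
    low = 0b10000000 + cp % 64
    # Higher order octet: 110 followed by the remaining high-order bits.
    high = 0b11000000 + cp // 64
    return "%{:02X}%{:02X}".format(high, low)


print(format(ord("п"), "b"))          # 10000111111
print(percent_encode_two_octet("п"))  # %D0%BF
```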
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;You can confirm your percent encoded path works as expected by typing it into your browser as part of a URL. If it resolves correctly, you're golden.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;There's always more to talk about with robots (and so many other webmaster-related topics). If you have any questions, comments, or suggestions, feel free to post them in our &lt;a target="_blank" href="http://www.bing.com/community/forums/12256.aspx"&gt;SEM forum&lt;/a&gt;. Until next time...&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;i&gt;-- Rick DeJarnette, Bing Webmaster Center&lt;/i&gt;&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://www.bing.com/community/aggbug.aspx?PostID=9560071" width="1" height="1"&gt;&lt;img src="http://feeds.feedburner.com/~r/msdn/webmaster/~4/AkRT1rl6ZpU" height="1" width="1"/&gt;</description><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/SEO/default.aspx">SEO</category><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/Crawling/default.aspx">Crawling</category><feedburner:origLink>http://www.bing.com/community/blogs/webmaster/archive/2009/11/05/robots-speaking-many-languages.aspx</feedburner:origLink></item><item><title>MSNBot 1.1 is retired</title><link>http://feedproxy.google.com/~r/msdn/webmaster/~3/FGZHfB6cpnU/msnbot-1-1-is-retired.aspx</link><pubDate>Wed, 04 Nov 2009 22:50:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9559902</guid><dc:creator>Webmaster Center team</dc:creator><slash:comments>11</slash:comments><wfw:commentRss>http://www.bing.com/community/blogs/webmaster/rsscomments.aspx?PostID=9559902</wfw:commentRss><comments>http://www.bing.com/community/blogs/webmaster/archive/2009/11/04/msnbot-1-1-is-retired.aspx#comments</comments><description>&lt;p&gt;The Bing team has been talking about its new crawler (aka bot), MSNBot 2.0b, &lt;a target="_blank" href="http://www.bing.com/community/blogs/webmaster/archive/2009/07/17/new-bot-work-continues-at-bing.aspx/"&gt;in this blog&lt;/a&gt; for quite some time now. We have made numerous improvements in its performance, addressed some webmaster concerns, and published detailed information on &lt;a target="_blank" href="http://www.bing.com/community/blogs/webmaster/archive/2009/08/21/prevent-a-bot-from-getting-lost-in-space-sem-101.aspx"&gt;how to control the bot with a robots.txt file&lt;/a&gt;. Today we are announcing that the new bot is fully operational. 
This development will enable Bing to do a better job at gathering the information we need from the myriad of websites we index worldwide.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;As MSNBot 2.0b enters full-scale production, the time has come to retire our previous generation bot, MSNBot 1.1. By the end of the first week in November, you will no longer see the following user agent in your server logs:&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: Courier New;"&gt;msnbot/1.1 (+http://search.msn.com/msnbot.htm)&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;The only Bing user agent you will see in your logs from this point forward will be this, our new bot:&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: Courier New;"&gt;msnbot/2.0b (+http://search.msn.com/msnbot.htm)&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;This event is a major milestone for the Bing engineering team, and we look forward to the positive developments that this bot will bring to our search engine index and thus to Bing customers. We want to specifically thank all those webmasters who provided us with valuable feedback as we ramped up the production of the new bot. Your assistance and cooperation were essential to making this milestone happen.&lt;/p&gt;
&lt;p&gt;Stay tuned for more information about this and other developments from the Bing engineering team. We'll have a lot more to talk about in the coming weeks and months.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;If you have any questions, comments, or suggestions, feel free to post them in our &lt;a target="_blank" href="http://www.bing.com/community/forums/12252.aspx"&gt;Crawling/Indexing Discussion forum&lt;/a&gt;. Be back at you soon...&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;i&gt;-- Rick DeJarnette, Bing Webmaster Center&lt;/i&gt;&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://www.bing.com/community/aggbug.aspx?PostID=9559902" width="1" height="1"&gt;&lt;img src="http://feeds.feedburner.com/~r/msdn/webmaster/~4/FGZHfB6cpnU" height="1" width="1"/&gt;</description><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/Crawling/default.aspx">Crawling</category><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/Announcement/default.aspx">Announcement</category><feedburner:origLink>http://www.bing.com/community/blogs/webmaster/archive/2009/11/04/msnbot-1-1-is-retired.aspx</feedburner:origLink></item><item><title>Fixing 404 File Not Found frustrations (SEM 101)</title><link>http://feedproxy.google.com/~r/msdn/webmaster/~3/t1-BDVaCzfc/fixing-404-file-not-found-frustrations-sem-101.aspx</link><pubDate>Wed, 04 Nov 2009 20:43:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9559875</guid><dc:creator>Webmaster Center team</dc:creator><slash:comments>6</slash:comments><wfw:commentRss>http://www.bing.com/community/blogs/webmaster/rsscomments.aspx?PostID=9559875</wfw:commentRss><comments>http://www.bing.com/community/blogs/webmaster/archive/2009/11/04/fixing-404-file-not-found-frustrations-sem-101.aspx#comments</comments><description>&lt;p&gt;You've seen it. So have I. Nearly every person who has actively browsed the Web for more than 15 minutes has seen it. I'm talking about the dreaded 404 File Not Found error. When it occurs, users simply abandon their search on that site and go elsewhere. That's a potential lost sale, subscription, or download opportunity (aka conversion) for the affected site! It has been estimated that up to 10% of traffic to large websites on the Web is looking for pages that don't exist, so this is a big problem.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Back in the early days of the Web, when you entered an incorrect URL, you got a nearly blank, stark white screen containing nothing but the simple words, "404 File not found." Yeah, thanks. That's really helpful. But back in the day, when people actually wrote webpages in Notepad, that behavior was de rigueur. That was the way things were. Back then. But that was then. This is now. We can do better.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Unusable URLs&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;There are a great many ways for URLs to be rendered unusable. They can be mistyped or misspelled by the user, the page can be moved, renamed, or deleted by the webmaster, and the URL can be incorrectly written by external webmasters who create the outbound link from their sites to yours, which is the most frustrating situation for the webmaster of the intended destination.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;So when a potential customer of yours, interested in learning more about what you do or what it is your site has to offer, uses an erroneous URL for a page within your website today, what do they get? Do they get the Web's equivalent of the blue screen of death, a useless page that stops them dead in their tracks and forces them to move on to a competitor's site? Or do they get a page with helpful guidance that keeps them in your site, offering to assist them with finding the information they seek?&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;As webmaster, you control how you rename and remove pages from your site (see information on using redirects to point to moved and renamed pages in &lt;a target="_blank" href="http://www.bing.com/community/blogs/webmaster/archive/2009/06/26/site-architecture-and-seo-file-page-issues-sem-101.aspx"&gt;an earlier SEM 101 article&lt;/a&gt;). But you can't control the content of your inbound links, nor can you nudge that user who can't spell or remember your page-naming scheme (yet another good reminder to make the file names of your pages logical and easy to remember). So instead of cursing the darkness of silly users over whom you have no control, light a metaphorical candle by creating a &lt;a target="_blank" href="http://en.wikipedia.org/wiki/Custom_error_page"&gt;custom 404 error page&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Custom 404 error messages&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Creating a custom 404 error page is not that hard to do, but so many webmasters overlook this small but crucial step. Your custom 404 error page will appear in place of a generic 404 error message when the URL to your site is broken. By anticipating what users will likely want to know when they come to your site, you can proactively give them enough information in the custom error message to keep them in your site, and then provide them with easy means to find the information they want.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;OK, so this is a great idea, but how do you do it? Well, I've got you covered there. What you do depends on which web server platform you are using to host your site. Let's take a look at what you need to do to establish this safety net for your users. After all, adding this one little feature to your site might make the difference between a bounce and a conversion!&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Dynamic or static?&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Before we discuss anything else, you first must decide how you want to approach this situation. You can create a single, static custom 404 error page for your entire site. That will be pretty simple to implement and is likely a good fit for smaller sites, but larger sites might feel constrained by a single, static page. There are technologies available that allow you to create dynamic error pages based on script, and that might suit a larger site's needs better, but the caveat exists that if the host scripting engine suffers a failure, then you'll have no custom 404 message at all. Given that this is not a developer blog, I'll just point all webmasters interested in creating dynamic 404 error pages out to Bing searches for information on both &lt;a target="_blank" href="http://www.bing.com/search?q=dynamic+custom+404+page+IIS&amp;amp;form=QBRE&amp;amp;qs=n"&gt;Internet Information Services (IIS)&lt;/a&gt; and &lt;a target="_blank" href="http://www.bing.com/search?q=dynamic+custom+404+page+Apache+&amp;amp;go=&amp;amp;form=QBRE&amp;amp;qs=n"&gt;Apache&lt;/a&gt; platforms.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;If you're interested in learning more about basic, static custom 404 pages, follow on. After all, this is SEM 101, right?&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Create the custom page content&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;You need to decide what information and features you want to include in your custom 404 page. I suggest the following:&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Use the page template for your website to maintain consistency with your site's look and feel on this new page.&lt;/li&gt;
&lt;li&gt;Include your site's navigation scheme in this page. If you have created an HTML sitemap page or a dedicated site search box, include access to those features as well.&lt;/li&gt;
&lt;li&gt;In the page's text, first acknowledge that the URL for the page the user was intending to see does not exist. Then offer a quick description of your site's subject-matter theme and list the products/services/opportunities it offers. Follow that with a suggestion that the reader use the on-page site navigation (menus, sitemap, search box, etc.) to look for the information they are interested in.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Don't add huge, hard-to-read fonts, auto-play music, videos, or animations, or include anything else that may be off-putting to the first-time reader. Leave off the advertisements as well - they will distract the user from the critical mission at hand, which is to get the user back onto a real page within your site. Remember, the whole point of a custom 404 error page is to prevent a bounce (aka a single page visit session in which the visitor abandons the site without visiting any other pages). Keep the page clean and easy to read.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Save this custom page file to the root directory of your site (for this discussion, I'll call the new file 404.htm). Now let's cover how to employ it on the various web server platforms.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Apache&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Apache users can configure a special text file found in the root directory of their site to implement a custom 404 error page. The file, named .htaccess (the name begins with a dot and has no typical file name extension), can be edited in Notepad to include the following line (using our sample error page file):&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: Courier New;"&gt;ErrorDocument 404 /404.htm&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Of course, you must give the custom 404 error page file the name specified in the directive and store it in the specified location, or the error page itself will return a 404 File Not Found error! Use Bing search results to find more information on &lt;a target="_blank" href="http://www.bing.com/search?q=custom+404+page+Apache&amp;amp;go=&amp;amp;form=QBLH&amp;amp;qs=n"&gt;what to do in Apache for custom 404 messages&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;IIS &lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;If you are using IIS, the implementation of a custom 404 page is simple. Here's how:&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;First, open IIS and select the website you want to customize.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;In IIS versions 5 and 6:&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li value="2"&gt;Right-click the website and select &lt;b&gt;Properties&lt;/b&gt;.&lt;/li&gt;
&lt;li&gt;Click the &lt;b&gt;Custom Errors&lt;/b&gt; tab.&lt;/li&gt;
&lt;li&gt;Double-click the listing for status code &lt;b&gt;404&lt;/b&gt; to edit that setting.&lt;/li&gt;
&lt;li&gt;In the &lt;b&gt;Edit Custom Error Properties&lt;/b&gt; dialog box, set the &lt;b&gt;Message type&lt;/b&gt; drop down list to &lt;b&gt;URL&lt;/b&gt;.&lt;/li&gt;
&lt;li&gt;In the &lt;b&gt;URL&lt;/b&gt; text box, per our example earlier, type &lt;span style="font-family: Courier New;"&gt;/404.htm&lt;/span&gt;.&lt;/li&gt;
&lt;li&gt;Click &lt;b&gt;OK&lt;/b&gt; to save your work.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;In IIS 7 and higher:&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li value="2"&gt;In the right pane, double-click &lt;b&gt;Error Pages&lt;/b&gt;.&lt;/li&gt;
&lt;li&gt;Double-click the listing for status code &lt;b&gt;404&lt;/b&gt; to edit that setting.&lt;/li&gt;
&lt;li&gt;In the &lt;b&gt;Response Action&lt;/b&gt; group, select the option &lt;b&gt;Insert content from static file into the error response&lt;/b&gt;.&lt;/li&gt;
&lt;li&gt;Click &lt;b&gt;Set&lt;/b&gt;.&lt;/li&gt;
&lt;li&gt;In the &lt;b&gt;Root directory path&lt;/b&gt; text box, either type the physical path of the root directory (up to the portion of the path that branches into a specific locale directory) of the custom error page file or click &lt;b&gt;Browse&lt;/b&gt; and navigate to the error file root directory and then click &lt;b&gt;OK&lt;/b&gt;.&lt;/li&gt;
&lt;li&gt;In the &lt;b&gt;Relative file path&lt;/b&gt; text box, type the relative path of the localized error file, which, if our earlier example were for US English only, could be written as &lt;span style="font-family: Courier New;"&gt;\EN-US\404.htm&lt;/span&gt;.&lt;/li&gt;
&lt;li&gt;If you are implementing localized versions of a custom 404 error page, ensure the&lt;b&gt; Try to return the error file in the client language&lt;/b&gt; check box is selected, and then click &lt;b&gt;OK&lt;/b&gt; twice to save your work.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;IIS 7.0 and higher users can alternatively edit their web.config file to include a snippet of code to accomplish the same task. Using the sample file 404.htm referenced earlier, here is an example code snippet:&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: Courier New;"&gt;&amp;lt;?xml version="1.0" encoding="UTF-8"?&amp;gt;&lt;br /&gt;&amp;lt;configuration&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;lt;system.webServer&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;lt;httpErrors errorMode="DetailedLocalOnly"&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;lt;remove statusCode="404" /&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;lt;error statusCode="404" path="404.htm" responseMode="File" /&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;lt;/httpErrors&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;lt;/system.webServer&amp;gt;&lt;br /&gt;&amp;lt;/configuration&amp;gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Check out Bing search results for more information on &lt;a target="_blank" href="http://www.bing.com/search?q=custom+404+page+IIS&amp;amp;go=&amp;amp;form=QBRE&amp;amp;qs=n"&gt;creating custom 404 error pages in IIS&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Bing Web Page Error Toolkit&lt;/b&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Aside from creating your own home-grown, dynamic custom 404 page solution for IIS, there is another option for IIS users that kicks it up a notch. The &lt;a target="_blank" href="http://www.bing.com/developers/"&gt;Bing Developer Center&lt;/a&gt; team has augmented the IIS experience in creating even more useful customized 404 error pages. Straight from the &lt;a target="_blank" href="http://www.bing.com/community/blogs/developer/"&gt;Bing Developer Center blog&lt;/a&gt;, check out the post from this past summer, &lt;a target="_blank" href="http://www.bing.com/community/blogs/developer/archive/2009/06/23/customize-your-404-error-pages-with-the-bing-api-web-page-error-toolkit.aspx"&gt;Customize your 404 error pages with the Bing API Web Page Error Toolkit&lt;/a&gt;. The post reveals that the toolkit, built on the &lt;a target="_blank" href="http://www.bing.com/developers/s/API%20Basics.pdf"&gt;Bing API&lt;/a&gt;, replaces the default IIS 404 error page with a dynamically created Bing search page containing customized search results derived from keywords extracted from either the source &lt;a target="_blank" href="http://en.wikipedia.org/wiki/Uniform_Resource_Identifier"&gt;Uniform Resource Identifier (URI)&lt;/a&gt; or the &lt;a target="_blank" href="http://en.wikipedia.org/wiki/Http_request#Request_message"&gt;HTTP request&lt;/a&gt;. This creates a search list of relevant, alternative pages on your site, helping the user more easily and quickly find the information they originally wanted without abandoning their visit to your domain.&lt;/p&gt;
&lt;p&gt;The &lt;a target="_blank" href="http://www.microsoft.com/downloads/details.aspx?displaylang=en&amp;amp;FamilyID=deca3e03-4f93-46d1-affc-493c0e02eb63"&gt;Bing Web Page Error Toolkit&lt;/a&gt; is available as a free download. If you're using IIS, give it a try. Your users will show their gratitude with more conversions and fewer bounces! Note that you will need &lt;a target="_blank" href="http://www.microsoft.com/visualstudio/en-us/default.mspx"&gt;Microsoft Visual Studio&lt;/a&gt; to use this tool.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Test, test, test&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Once you have implemented your custom 404 page, test it out. In a browser, type your domain name followed by a short, random string of characters you know does not match any existing file or directory name. If everything was implemented correctly, you should get the custom 404 page you created in response to that bad URL. From that point forward, your potential customers will also see that page, and if the new error page's content is well done, they will more likely stay on your site. And that's the goal after all, right?&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;If you have any questions, comments, or suggestions, feel free to post them in our &lt;a target="_blank" href="http://www.bing.com/community/forums/12256.aspx"&gt;SEM forum&lt;/a&gt;. Later...&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;i&gt;-- Rick DeJarnette, Bing Webmaster Center&lt;/i&gt;&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://www.bing.com/community/aggbug.aspx?PostID=9559875" width="1" height="1"&gt;&lt;img src="http://feeds.feedburner.com/~r/msdn/webmaster/~4/t1-BDVaCzfc" height="1" width="1"/&gt;</description><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/SEO/default.aspx">SEO</category><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/SEO+101/default.aspx">SEO 101</category><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/SEM/default.aspx">SEM</category><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/SEM+101/default.aspx">SEM 101</category><feedburner:origLink>http://www.bing.com/community/blogs/webmaster/archive/2009/11/04/fixing-404-file-not-found-frustrations-sem-101.aspx</feedburner:origLink></item><item><title>Translator widget: Delivering your site to the world</title><link>http://feedproxy.google.com/~r/msdn/webmaster/~3/ZBHzE0YRub0/translator-widget-delivering-your-site-to-the-world.aspx</link><pubDate>Wed, 14 Oct 2009 17:04:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9556677</guid><dc:creator>Webmaster Center team</dc:creator><slash:comments>18</slash:comments><wfw:commentRss>http://www.bing.com/community/blogs/webmaster/rsscomments.aspx?PostID=9556677</wfw:commentRss><comments>http://www.bing.com/community/blogs/webmaster/archive/2009/10/14/translator-widget-delivering-your-site-to-the-world.aspx#comments</comments><description>&lt;p&gt;About &lt;a href="http://blogs.msdn.com/translation/archive/2009/03/18/announcing-the-microsoft-translator-web-page-widget.aspx"&gt;eight months back&lt;/a&gt;, the Microsoft Research Translator team delivered an entirely unique way of delivering your website's pages to visitors who speak a different language with no development effort on your part. 
Unlike any other translation widget/gadget available at that time, the Translator widget was unique in that it kept your audience on your site, rather than redirecting them to a proxy translation service. Since then, thousands of sites have adopted the Translator widget and have been able to attract a much broader audience from around the world.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Powered by the same machine translation technology that is used by &lt;a href="http://blogs.msdn.com/translation/archive/2009/06/10/microsoft-translator-instant-answers-now-on-bing.aspx"&gt;Bing&lt;/a&gt;, &lt;a href="http://www.ieaddons.com/en/details/translation/Bing_Translator/"&gt;Internet Explorer&lt;/a&gt;, and &lt;a href="http://blogs.technet.com/office_global_experience/archive/tags/Translate/default.aspx"&gt;Office&lt;/a&gt;, the Translator widget provides a free option to deliver a "gisting" experience to a non-native audience. While machine translation cannot replace a professional or human localization, it aims to provide a rough understanding (the gist) of the content on the page to those that cannot read the original language. The translation engine is worked on continuously to deliver better quality and more languages. You can learn more about the pioneering work being done by our researchers in this space over at the &lt;a href="http://research.microsoft.com/en-us/projects/mt/"&gt;Translator group's site&lt;/a&gt; at Microsoft Research. With the widget, given the on-demand nature of the translations, there is no load on your site and the freshest translations are delivered to the visitor.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Adopting the widget is as simple as copying and pasting a small snippet of JavaScript code into your site. You can customize and generate a snippet for your site at the &lt;a href="http://go.microsoft.com/?linkid=9656123"&gt;widget adoption portal&lt;/a&gt;. Once this code snippet is pasted into an appropriate area of your page, the Translator widget appears on your site to your users in the language their browser is set to. This localization of the widget user interface ensures that your site's audience always sees "Translate this page" in their language and can thereby kick off the translation (as shown in the images below). The Translator team is also planning to add an "automatic" translation functionality, where you can set the widget to auto-translate the page into the visitor's browser language upon arrival.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="http://www.bing.com/community/cfs-file.ashx/__key/CommunityServer.Components.UserFiles/00.00.19.19.87.Attached+Files/8512.Translator1.png" border="0" /&gt;&lt;img src="http://www.bing.com/community/cfs-file.ashx/__key/CommunityServer.Components.UserFiles/00.00.19.19.87.Attached+Files/0677.Translator2a.png" border="0" /&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Once translation has been kicked off, the page is translated using "progressive rendering" - a technique that ensures that the visitor can immediately get the benefit of translation without waiting for the whole page to be translated. As they navigate from page to page on your site, the pages get automatically translated, resulting in a seamless experience for your visitors. A progress bar and several other controls are also displayed to the visitor, floating at the top of the screen. Upon translation, hovering over the translated sentences displays tool tips that show the original source sentence, as shown in the image below. This can be useful in situations where the visitor has some familiarity with the source language.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="http://www.bing.com/community/cfs-file.ashx/__key/CommunityServer.Components.UserFiles/00.00.19.19.87.Attached+Files/2313.Translator3.png" border="0" /&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Another interesting feature of the widget is the ability to share a link to the translated page. A visitor to a site who has translated the page to a particular language can share a link to the translated version of the page. When the recipient clicks on the link, they are taken to the page and translation to their language is kicked off automatically. For example, this page (&lt;a href="http://viks.org/2009/06/11/instant-translations-in-bing/"&gt;http://viks.org/2009/06/11/instant-translations-in-bing/&lt;/a&gt;) can be auto-translated to Spanish by appending the code &lt;span style="font-family: Courier New;"&gt;#mstto=es&lt;/span&gt; to the end of the URL (&lt;a href="http://viks.org/2009/06/11/instant-translations-in-bing/#mstto=es"&gt;http://viks.org/2009/06/11/instant-translations-in-bing/#mstto=es&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;So, what are you waiting for? Go &lt;a href="http://go.microsoft.com/?linkid=9656123"&gt;get the widget&lt;/a&gt; and start making the Web more "worldly"!&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Stay tuned to the Webmaster Center blog and the &lt;a href="http://blogs.msdn.com/translation"&gt;Translator team's blog&lt;/a&gt; for more information on additions to the widget functionality. You can also &lt;a href="http://go.microsoft.com/?linkid=9656020"&gt;participate in the Translator user community&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;i&gt;-- Vikram Dendi, Senior Product Manager, Microsoft Research&lt;/i&gt;&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://www.bing.com/community/aggbug.aspx?PostID=9556677" width="1" height="1"&gt;&lt;img src="http://feeds.feedburner.com/~r/msdn/webmaster/~4/ZBHzE0YRub0" height="1" width="1"/&gt;</description><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/Announcement/default.aspx">Announcement</category><feedburner:origLink>http://www.bing.com/community/blogs/webmaster/archive/2009/10/14/translator-widget-delivering-your-site-to-the-world.aspx</feedburner:origLink></item><item><title>Webmaster Center blog Q&amp;A </title><link>http://feedproxy.google.com/~r/msdn/webmaster/~3/8JwsDztZJLM/webmaster-center-blog-q-amp-a.aspx</link><pubDate>Fri, 09 Oct 2009 20:33:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9555881</guid><dc:creator>Webmaster Center team</dc:creator><slash:comments>20</slash:comments><wfw:commentRss>http://www.bing.com/community/blogs/webmaster/rsscomments.aspx?PostID=9555881</wfw:commentRss><comments>http://www.bing.com/community/blogs/webmaster/archive/2009/10/09/webmaster-center-blog-q-amp-a.aspx#comments</comments><description>&lt;p&gt;We've been really busy here at the Bing Webmaster Center blog team, pumping out new content on a regular basis to create a &lt;a target="_blank" href="http://www.bing.com/community/search/livesearch.aspx?domain=www.bing.com%2fcommunity&amp;amp;q=%22SEM+101%22"&gt;nice library of content on issues that matter to webmasters and online publishers&lt;/a&gt;. I thought I'd take a moment to catch my breath, pause on creating a new thematic article (or yet another multi-part series!) for SEM 101, and address some commonly asked questions in the blog comments.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Q: Why wasn't my question in the blog comments answered?&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;A:&lt;/strong&gt; Well, the Webmaster Center blog is really not the right place for such back and forth exchanges. In fact, at the end of each blog post, there is a reminder to post your questions and comments in the &lt;a target="_blank" href="http://www.bing.com/community/forums/12256.aspx"&gt;SEM forum&lt;/a&gt;. The collection of &lt;a target="_blank" href="http://www.bing.com/community/forums/default.aspx?GroupID=11"&gt;Webmaster Center forums&lt;/a&gt; is specifically designed and staffed to address your questions, so if you have a question for the Bing Webmaster Center team, please post it in the forums where you will get a reply. Now if you have a question for other webmasters, you can certainly post that to the blog comments, but even then, you may get better results from posting it in the forums.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Q: Why did my blog comment disappear?&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;A:&lt;/b&gt; As we truly value all the input we receive from the webmaster community, we are reluctant to delete any comments, but on occasion we have to. If a blog post comment is simply blank, includes profanity or obviously objectionable, business-inappropriate content, or is merely an off-topic advertisement for an external website (the basic definition of web spam), we delete those comments. In cases where the same comment is repeated multiple times in the same post by the same sender, we delete the redundancies but leave the original.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;We love to get your feedback, and whether you like or hate our content, have a suggestion for clarifying a point, or care to share your own related story, we're really happy when you contribute to the community. Please continue to do so! But if your comment was deleted, there was a compelling reason for doing so. On rare occasions, we get someone who decides to post the same web spam comment across dozens of our posts simultaneously, sometimes even spanning beyond Webmaster and going into other Bing community blogs, such as Maps, Developer, Travel, or the main Search blog. I've seen a couple of instances where someone comment-bombed our blogs in a huge, redundant web spam blast. Those comments are all quickly deleted and those spammers are banned from posting again to the blogs. Seriously, who wants to read that junk?&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Q: I added in my site's URL in my blog comment. That's good link building for my site (it's coming from an authoritative site, after all), right?&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;A:&lt;/b&gt; Well, the basic intent of the idea is good. You do want to get as many high-quality, authoritative, inbound links as you can. That is one of the keys to improving the page rank of your site. But in this case, as so many blog commenters do this on a regular basis, the links entered in Bing blog comments are automatically created with the &lt;span style="font-family: Courier New;"&gt;rel="nofollow"&lt;/span&gt; attribute included in the anchor tag. This means that when search engines hit the blog page, the link using that attribute will not earn any inbound link credit for the referenced page. So sorry, folks, this one won't count. &lt;/p&gt;
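&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;As a rough illustration only (the address and link text here are made-up placeholders, and the exact markup our blog software generates may differ), a comment link carrying that attribute looks something like this:&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: Courier New;"&gt;&amp;lt;a href="http://www.example.com/" rel="nofollow"&amp;gt;My website&amp;lt;/a&amp;gt;&lt;/span&gt;&lt;/p&gt;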
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Make no mistake, earning high-quality inbound links is hard work. You need to get webmasters from authoritative sites to link to you (this is why link exchanges don't help build page rank value). You usually do that by providing high quality content on your site that those webmasters value. But simply adding a URL to a blog comment is far from hard. And many websites make it a policy to add the &lt;span style="font-family: Courier New;"&gt;rel="nofollow"&lt;/span&gt; attribute to all visitor-generated content links because that content can be so hard to police. Who wants to allow a visitor to link out to web spam or malware? And who has time to police the quality of every user-generated link?&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;You had a good idea with good intentions. But in this case, it's a waste of time to include your site's URL if your goal is to get an authoritative inbound link.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Q: Why isn't my site indexed yet?&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;A:&lt;/b&gt; This is exactly the sort of question you should post in the Webmaster Center's &lt;a target="_blank" href="http://www.bing.com/community/forums/12252.aspx"&gt;Crawling/Indexing Discussion forum&lt;/a&gt;. The number of variables here that can affect the answer specific to your question is enormous, including:&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the quality of your site's content&lt;/li&gt;
&lt;li&gt;the quantity and authoritative quality of your site's inbound links&lt;/li&gt;
&lt;li&gt;the ability of the search engine bot to discover and crawl your site's content&lt;/li&gt;
&lt;li&gt;the validity of the HTML code used&lt;/li&gt;
&lt;li&gt;the age of your site&lt;/li&gt;
&lt;li&gt;the freshness of the site's content over time&lt;/li&gt;
&lt;li&gt;whether or not malware was detected on your site&lt;/li&gt;
&lt;li&gt;whether or not the content is judged to be web spam or duplicate content copied from other sites&lt;/li&gt;
&lt;li&gt;whether your site violates the &lt;a target="_blank" href="http://help.live.com/help.aspx?project=wl_webmasters&amp;amp;querytype=keyword&amp;amp;query=senilediug&amp;amp;mkt=en-US"&gt;Bing search guidelines&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;and so much more&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;If you post this question in the Crawling/Indexing Discussion forum, our staff can look up your website's index information and help determine what can be done to improve your situation. Take advantage of their expertise and resources for this and other similar questions!&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Q: Do the search engine optimization (SEO)&amp;nbsp;recommendations you give for Bing affect my SEO performance with other search engines?&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;A:&lt;/strong&gt; Yes. They improve it! And of course, if you are actively performing legitimate, white-hat SEO activities for other search engines, it'll help you with Bing as well. The basic takeaway here is that SEO is still SEO, and Bing doesn't change that. If you perform solid, reputable SEO on your website, which entails a good deal of hard work, creating unique and valuable content, earning authoritative inbound links, and the like (see our &lt;a target="_blank" href="http://www.bing.com/community/search/livesearch.aspx?domain=www.bing.com%2fcommunity&amp;amp;q=%22SEM+101%22"&gt;library of SEM 101 content&lt;/a&gt; in this blog for details), you'll see benefits in all top-tier search engines. &lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;But remember: SEO efforts ultimately only optimize the rank position that your site's design, linking, and content deserve. They remove the technical obstacles that can keep your site from getting the rank it should. However, they won't get you anything more than that. The most important part of SEO is doing the hard work of building the value necessary to make your site stand out from the competing crowd of other websites for searchers. &lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;One other thing to remember: there is a long tail in search. After the few obvious keywords used in a particular field, there are many, many more keywords used to a lesser degree that still drive a lot of traffic to various niches of that field. Instead of always trying to be number 1 in a highly competitive field for the obvious keywords and faltering, consider finding a less competitive keyword niche in that same field and then doing the hard work necessary to earn solid ranking there.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;For more information on SEO and Bing, see our recent blog post, &lt;a target="_blank" href="http://www.bing.com/community/blogs/webmaster/archive/2009/09/03/search-engine-optimization-for-bing.aspx"&gt;Search Engine Optimization for Bing&lt;/a&gt;. &lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Q: How do I submit a Sitemap to Bing?&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;A:&lt;/b&gt; There are a couple of ways to do this. If you have registered your site with &lt;a target="_blank" href="http://www.bing.com/webmaster"&gt;Bing Webmaster Center tools&lt;/a&gt;, log into the site, select the site to use from the &lt;b&gt;Site List&lt;/b&gt; page (webmasters can register multiple sites for one account), and then click the &lt;b&gt;Sitemaps&lt;/b&gt; tab. From there, you can perform a direct Sitemap submission by typing the web address of your Sitemap file (such as &lt;span style="font-family: Courier New;"&gt;www.example.com/sitemap.xml&lt;/span&gt; -- be sure to omit the "HTTP://" protocol designation as it's not needed here). If you have not yet registered your site with Webmaster Center (why not?) and you just want to submit your Sitemap file through your web browser using our Sitemap ping service, use the following URL:&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: Courier New;"&gt;http://www.bing.com/webmaster/ping.aspx?sitemap=&lt;i&gt;add your Sitemap web address here&lt;/i&gt;&lt;/span&gt;&lt;/p&gt;
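&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;For example, assuming your Sitemap lives at the placeholder address &lt;span style="font-family: Courier New;"&gt;www.example.com/sitemap.xml&lt;/span&gt; mentioned above (substitute your own address), the complete ping URL would look like this:&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: Courier New;"&gt;http://www.bing.com/webmaster/ping.aspx?sitemap=www.example.com/sitemap.xml&lt;/span&gt;&lt;/p&gt;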
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Again, leave off the "HTTP://" as it isn't needed. For more information on using Sitemaps with Bing, see our Webmaster Center blog article, &lt;a target="_blank" href="http://www.bing.com/community/blogs/webmaster/archive/2009/08/15/uncovering-web-based-treasure-with-sitemaps-sem-101.aspx"&gt;Uncovering web-based treasure with Sitemaps (SEM 101)&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Q: Does Bing support sitemap index files?&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;A:&lt;/b&gt; Definitely. Bing supports Sitemaps with up to 50,000 entries, be they site URLs or, in the case of Sitemap index files, references to child Sitemap files. With a Sitemap index file containing 50,000 references to child Sitemaps, each of which contains 50,000 site URLs, your Sitemap strategy can reference up to 2.5 billion URLs. Let us know if you need more. :-)&lt;/p&gt;
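&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;For reference, here is a minimal sketch of a Sitemap index file in the format defined by the sitemaps.org protocol. The example.com domain and the child file names are placeholders we made up for illustration. A real index would carry one &lt;span style="font-family: Courier New;"&gt;sitemap&lt;/span&gt; entry per child Sitemap, up to the 50,000-entry limit:&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: Courier New;"&gt;&amp;lt;?xml version="1.0" encoding="UTF-8"?&amp;gt;&lt;br /&gt;&amp;lt;sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;lt;sitemap&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;lt;loc&amp;gt;http://www.example.com/sitemap1.xml&amp;lt;/loc&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;lt;/sitemap&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;lt;sitemap&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;lt;loc&amp;gt;http://www.example.com/sitemap2.xml&amp;lt;/loc&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;lt;/sitemap&amp;gt;&lt;br /&gt;&amp;lt;/sitemapindex&amp;gt;&lt;/span&gt;&lt;/p&gt;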
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;For more information on Sitemap index file support, see the Webmaster Center blog post, &lt;a target="_blank" href="http://www.bing.com/community/blogs/webmaster/archive/2009/06/12/bing-enhances-support-for-large-sitemaps.aspx"&gt;Bing enhances support for large Sitemaps&lt;/a&gt;. &lt;/p&gt;
&lt;p&gt;&lt;br /&gt;&lt;b&gt;Q: I can't find you anymore. Where did your blog recently move to? How do I get to it now?&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;A:&lt;/b&gt; The blog didn't actually move (well, not since June, when Bing was introduced). We're still at &lt;a target="_blank" href="http://www.bing.com/community/blogs/webmaster/"&gt;http://www.bing.com/community/blogs/webmaster/&lt;/a&gt;. But the &lt;b&gt;extras&lt;/b&gt; menu in the Bing user interface was recently removed and all references to &lt;a target="_blank" href="http://www.bing.com/webmaster"&gt;Webmaster Center&lt;/a&gt;, the &lt;a target="_blank" href="http://www.bing.com/community/"&gt;Bing Community&lt;/a&gt;, and the &lt;a target="_blank" href="http://www.bing.com/community/blogs/webmaster/"&gt;Bing Webmaster Center blogs&lt;/a&gt; and &lt;a target="_blank" href="http://www.bing.com/community/forums/default.aspx?GroupID=11"&gt;forums&lt;/a&gt;, were migrated under the &lt;b&gt;&lt;a target="_blank" href="http://www.bing.com/explore"&gt;More&lt;/a&gt;&lt;/b&gt; link, found on both the &lt;a target="_blank" href="http://www.bing.com/"&gt;Bing home page&lt;/a&gt; and the top left menu on other Bing pages. Some folks might have missed that. Please be sure to save the Webmaster Center blog to your browser favorites or, better yet, &lt;a target="_blank" href="http://www.bing.com/community/blogs/webmaster/rss.aspx"&gt;subscribe to our blog's RSS feed&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Q: Why are your blog columns so long?&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;A:&lt;/b&gt; No reason.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Q: Thank you!&lt;sub&gt;&lt;/sub&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;A:&lt;/b&gt; You're welcome! We get this comment most of all, and I wanted to make sure we acknowledged our appreciation for your kind words and support. &lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;As always, if you have any questions, comments, or suggestions, feel free to post them in our &lt;a target="_blank" href="http://www.bing.com/community/forums/12256.aspx"&gt;SEM forum&lt;/a&gt; (or any of the &lt;a target="_blank" href="http://www.bing.com/community/forums/default.aspx?GroupID=11"&gt;other Webmaster Center forums&lt;/a&gt; as appropriate). Until next time...&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;i&gt;-- Rick DeJarnette, Bing Webmaster Center&lt;/i&gt;&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://www.bing.com/community/aggbug.aspx?PostID=9555881" width="1" height="1"&gt;&lt;img src="http://feeds.feedburner.com/~r/msdn/webmaster/~4/8JwsDztZJLM" height="1" width="1"/&gt;</description><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/SEO/default.aspx">SEO</category><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/SEO+101/default.aspx">SEO 101</category><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/sitemaps/default.aspx">sitemaps</category><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/Crawling/default.aspx">Crawling</category><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/SEM/default.aspx">SEM</category><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/SEM+101/default.aspx">SEM 101</category><feedburner:origLink>http://www.bing.com/community/blogs/webmaster/archive/2009/10/09/webmaster-center-blog-q-amp-a.aspx</feedburner:origLink></item><item><title>The merciless malignancy of malware Part 4 (SEM 101)</title><link>http://feedproxy.google.com/~r/msdn/webmaster/~3/Fiu07OKuV9s/the-merciless-malignancy-of-malware-part-4-sem-101.aspx</link><pubDate>Thu, 01 Oct 2009 17:14:53 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9554029</guid><dc:creator>Webmaster Center team</dc:creator><slash:comments>5</slash:comments><wfw:commentRss>http://www.bing.com/community/blogs/webmaster/rsscomments.aspx?PostID=9554029</wfw:commentRss><comments>http://www.bing.com/community/blogs/webmaster/archive/2009/10/01/the-merciless-malignancy-of-malware-part-4-sem-101.aspx#comments</comments><description>&lt;p&gt;OK, so I totally geeked out with my recommendations on how to better secure your webmaster computing environment. 
As a result, I had too much material for one post and thus had to split it up into two pieces. Let’s wrap up this long series of posts on malware by finishing up with the last of the security recommendations.&lt;/p&gt;  &lt;p&gt;In &lt;a href="http://www.bing.com/community/blogs/webmaster/archive/2009/09/11/the-merciless-malignancy-of-malware-part-1-sem-101.aspx" target="_blank"&gt;Part 1&lt;/a&gt; of this series on &lt;a href="http://en.wikipedia.org/wiki/Malware" target="_blank"&gt;malware&lt;/a&gt;, we discussed how to detect a malware infection on your website using tools like &lt;a href="http://www.bing.com/webmaster/" target="_blank"&gt;Bing’s Webmaster Center&lt;/a&gt;. The &lt;a href="http://www.bing.com/community/blogs/webmaster/archive/2009/09/18/the-merciless-malignancy-of-malware-part-2-sem-101.aspx" target="_blank"&gt;Part 2&lt;/a&gt; post covered the resources and strategies for identifying the types and locations of malware code that typically affect websites with advice on how to remove it. The &lt;a href="http://www.bing.com/community/blogs/webmaster/archive/2009/09/24/the-merciless-malignancy-of-malware-part-3-sem-101.aspx" target="_blank"&gt;Part 3&lt;/a&gt; post began the run-down through 10 recommendations (well, the first 5, anyway!) on how to better secure your workstation and web server computers to prevent the malware from coming back. Today’s post, Part 4, finishes the list, and then includes information on what steps you can take to get that pesky malware warning message removed from your recently cleaned site in the Bing index.&lt;/p&gt;  &lt;p&gt;&lt;b&gt;Recommendations continued&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;Getting rid of malware is only part of the battle. Hardening your security practices to keep it away is just as important. 
Let’s continue the list of recommended security strategies started in the &lt;a href="http://www.bing.com/community/blogs/webmaster/archive/2009/09/24/the-merciless-malignancy-of-malware-part-3-sem-101.aspx" target="_blank"&gt;previous post&lt;/a&gt;.&lt;/p&gt;  &lt;p&gt;&lt;b&gt;6. Run Microsoft Update&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;I am presuming with this recommendation that you are running a modern Microsoft Windows operating system. Regularly run &lt;a href="http://windowsupdate.microsoft.com/" target="_blank"&gt;Microsoft Update&lt;/a&gt; on every Windows-based computer you use to touch your website. When you do so, I recommend that you click &lt;b&gt;Custom&lt;/b&gt; to see the full list of available updates for your computer rather than seeing only the &lt;b&gt;High Priority&lt;/b&gt; updates. Always keep current with the latest High Priority updates and strongly consider applying the other updates as well.&lt;/p&gt;  &lt;p&gt;Note that the second Tuesday of every month is commonly referred to as “Patch Tuesday” for Microsoft Update, and time should be set aside on those dates to make sure all Windows-based systems in your web server infrastructure get the necessary security updates. When necessary, Microsoft also releases high-priority security updates ahead of this schedule, so it pays to stay on top of these releases as they occur. Signing up to receive &lt;a href="http://technet.microsoft.com/en-us/security/dd252948.aspx" target="_blank"&gt;Microsoft Technical Security Notifications&lt;/a&gt; can help!&lt;/p&gt;  &lt;p&gt;&lt;b&gt;7. Update non-Microsoft applications, too&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;Applications that touch the Internet are at least as vulnerable to security holes as are web browsers and operating systems. Some major software manufacturers are beginning to build into their applications an online update system analogous to Microsoft Update. 
But not all have this feature yet, and not all that do perform the update automatically. It’s a really good idea to scan for and plug the often nasty security holes in the applications on your workstation through a software updating tool. I like the &lt;a href="http://secunia.com/vulnerability_scanning/" target="_blank"&gt;Secunia Software Inspector tool&lt;/a&gt; (check the licensing requirements for commercial use, but it’s free for many users), but there are many other choices out there. Be sure that the web applications you use are checked in that process. The bottom line is you need to regularly check for and install any software updates on all of the computers associated with your website.&lt;/p&gt;  &lt;p&gt;Keep in mind that software manufacturers regularly release updates for their products when they discover faulty features and security holes. The hacker community makes a point of studying those patches to learn what exploits the updates fix. If you don’t stay current with software updates, your computer may become vulnerable to reverse-engineered exploits.&lt;/p&gt;  &lt;p&gt;&lt;b&gt;8. Improve your wireless security&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;Many computers these days, especially laptops, are connected to the Internet only by wireless connections. If you work in a big organization with a security-conscious IT shop, you’re probably fine (while you’re at work, anyway). But many small shops and even more home users install their new Wi-Fi routers using default settings across the board. Hackers have developed such efficient wireless security cracking tools over the past decade that paranoia is no longer considered irrational or delusional behavior among IT security folks. (But if tin foil hats come out, all bets are off.)&lt;/p&gt;  &lt;p&gt;There are several things you can do to improve the security of your wireless network router. 
Dig up the user’s manual for that old router and learn how to do all of the following:&lt;/p&gt;  &lt;ul&gt;   &lt;li&gt;Update the device firmware. Go to the router manufacturer’s website and browse to the Downloads page for your model (typically within the site’s Support section) to see if you have the latest firmware release. If not, download it and install it. You may get new functional features and/or have known security holes resolved. Either way, look into it. The router’s manufacturer put it out there for a reason! &lt;/li&gt;    &lt;li&gt;Change the administrator password from the manufacturer’s default (using the tips in &lt;a href="http://www.microsoft.com/protect/fraud/passwords/create.aspx" target="_blank"&gt;Create strong passwords&lt;/a&gt;). Hackers typically know the default administrator password for various routers. Leaving yours with the default is honestly no better than disabling the admin password altogether. &lt;/li&gt;    &lt;li&gt;Change the network’s &lt;a href="http://en.wikipedia.org/wiki/SSID" target="_blank"&gt;Service Set Identifier (SSID)&lt;/a&gt; friendly name from its default to a name of your own choosing. Once that’s done, disable the SSID broadcast so that the wireless network is hidden. It’s harder to crack a wireless network if you don’t see it, especially when you don’t know its name! &lt;/li&gt;    &lt;li&gt;Enable &lt;a href="http://en.wikipedia.org/wiki/MAC_filtering" target="_blank"&gt;media access control (MAC) address filtering&lt;/a&gt; so that only computers and devices whose MAC addresses you specify can access the network. All others are denied access. &lt;/li&gt;    &lt;li&gt;Exclusively use &lt;a href="http://en.wikipedia.org/wiki/WPA2#WPA2" target="_blank"&gt;Wi-Fi Protected Access version 2 (WPA2)&lt;/a&gt; security with &lt;a href="http://en.wikipedia.org/wiki/Advanced_Encryption_Standard" target="_blank"&gt;Advanced Encryption Standard (AES) encryption&lt;/a&gt; for the most secure connections. 
Forget relying upon Wired Equivalent Privacy (WEP) or WPA using Temporal Key Integrity Protocol (TKIP) encryption for security. Modern wireless security cracking tools can break these encryption schemes in minutes, even with the longest keys. &lt;/li&gt;    &lt;li&gt;Enable advanced routing features such as SPI (as discussed in tip #3 in the &lt;a href="http://www.bing.com/community/blogs/webmaster/archive/2009/09/24/the-merciless-malignancy-of-malware-part-3-sem-101.aspx" target="_blank"&gt;previous post&lt;/a&gt;). If your wireless router doesn’t support SPI, it’s probably using old technology and it may be time to shop for a new, more secure Wi-Fi router. &lt;/li&gt; &lt;/ul&gt;  &lt;p&gt;Note that none of these changes by themselves will sufficiently upgrade your wireless security, but the aggregate value of implementing them all will make your wireless network much more difficult to crack. And unless you are dealing with extremely determined hackers with an abundance of both technical resources and time to focus on cracking your specific, secured network, they will almost always move on to another of the ubiquitous, softer targets in the wifisphere.&lt;/p&gt;  &lt;p&gt;&lt;b&gt;9. Protect your website’s configuration files&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;Ensure that the sensitive configuration files of your web server and your web applications aren't accessible to unauthorized, external users. Place them in directories that are not served to the public and then disable directory browsing on your web server. Refer to your web server documentation for specific instructions on how to do this. 
I also recommend researching additional methods of securing your web server, such as &lt;a href="http://www.bing.com/search?q=IIS+security&amp;amp;go=&amp;amp;form=QBRE&amp;amp;qs=n" target="_blank"&gt;IIS&lt;/a&gt; or &lt;a href="http://www.bing.com/search?q=apache+security&amp;amp;form=QBLH&amp;amp;qs=AS&amp;amp;pq=Apache+sec" target="_blank"&gt;Apache&lt;/a&gt;, from attack.&lt;/p&gt;  &lt;p&gt;&lt;b&gt;10. Perform data validation on user input&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;If your website accepts user input, ensure it is validated before processing or displaying it back to the user. For example, if you have a login form that accepts user names and passwords that are checked against a database, ensure that the input is scrubbed of any unexpected or invalid characters that might allow malicious manipulation of the database (a SQL injection attack). Also, if user input is accepted and displayed (such as on forums), ensure users aren't able to modify the source code of the webpage, such as by adding script or &amp;lt;iframe&amp;gt; HTML code. &lt;/p&gt;  &lt;p&gt;Also be sure that input from backend systems is validated. This protects the users of your website, even if attackers &amp;quot;only&amp;quot; managed to break into a backend system, like your database. For more information on similar, related website attacks, look into the topic of &lt;a href="http://en.wikipedia.org/wiki/Cross-site_scripting" target="_blank"&gt;cross-site scripting (XSS)&lt;/a&gt;.&lt;/p&gt;  &lt;p&gt;&lt;b&gt;Bonus tip: Back up your clean web content&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;Once you’ve ensured your site’s content and source code are clean, &lt;a href="http://en.wikipedia.org/wiki/Backup" target="_blank"&gt;back it up&lt;/a&gt;! &lt;a href="http://en.wikipedia.org/wiki/Disaster_recovery" target="_blank"&gt;Disaster recovery&lt;/a&gt; is not just about fires, floods, and earthquakes. 
A sudden, major malware infection ranks right up there in terms of potential business outages, so protect your work, your site, your business, and your customers who depend on you with proper, functional backups of clean code.&lt;/p&gt;  &lt;p&gt;&lt;b&gt;Even more information on securing servers&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;To be ultra secure, you might simply consider flattening and rebuilding the server from scratch. But don’t simply rebuild it to the way it was – remember, it was hacked in that state! Put in place all of the hardening steps mentioned earlier, as well as triple-checking all of your permissions settings, before putting the server back in service online. For more information on dealing with hacked servers, check out &lt;a href="http://www.antiphishing.org/reports/APWG_WTD_HackedWebsite.pdf" target="_blank"&gt;What to Do If Your Website Has Been Hacked by Phishers&lt;/a&gt;.&lt;/p&gt;  &lt;p&gt;&lt;b&gt;Request removal of the Bing malware warning&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;Once you’ve resolved your malware infection, closed the security vulnerabilities that allowed your computer to be successfully attacked, and uploaded your cleaned-up source code to your web server, you’ve got one more job to do. It’s time to request that Bing re-evaluate your website for malware. Here’s how:&lt;/p&gt;  &lt;ol&gt;   &lt;li&gt;Open the &lt;a href="https://support.discoverbing.com/eform.aspx?productKey=bingcontentremoval&amp;amp;ct=eformts" target="_blank"&gt;Bing support form&lt;/a&gt;. &lt;/li&gt;    &lt;li&gt;In the resulting &lt;b&gt;Windows Bing Support&lt;/b&gt; web form, type your full name and email addresses in the text boxes provided. &lt;/li&gt;    &lt;li&gt;In the &lt;b&gt;Service: Bing&lt;/b&gt; drop-down list, select &lt;b&gt;My Site has a malware warning&lt;/b&gt;. 
&lt;/li&gt;    &lt;li&gt;In the new drop-down list that appears below, select the option that best matches your specific situation (in this case, that’ll be &lt;b&gt;The malware has been removed&lt;/b&gt;). &lt;/li&gt;    &lt;li&gt;Complete the remainder of the form, adding as much detail as possible in the comments text box to help the support team resolve your request. Once completed, type the characters shown in the security image, and then click &lt;b&gt;Submit&lt;/b&gt;. &lt;/li&gt; &lt;/ol&gt;  &lt;p&gt;Once you complete this procedure, Bing will rescan your website to check that the malware has been removed. If confirmed, your content can then be reincluded in normal search results. After that, keep monitoring your site’s malware status in the Crawl Issues tool of Bing’s Webmaster Center, just to be sure you stay on top of any new issues.&lt;/p&gt;  &lt;p&gt;If you have any questions or comments about malware, please feel free to post them in our &lt;a href="http://www.bing.com/community/forums/12248.aspx" target="_blank"&gt;General Questions forum&lt;/a&gt;. For regular SEM and SEO questions and suggestions, please go to our &lt;a href="http://www.bing.com/community/forums/12256.aspx" target="_blank"&gt;SEM forum&lt;/a&gt;. 
See you again soon…&lt;/p&gt;  &lt;p&gt;&lt;i&gt;-- Rick DeJarnette, Bing Webmaster Center&lt;/i&gt;&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://www.bing.com/community/aggbug.aspx?PostID=9554029" width="1" height="1"&gt;&lt;img src="http://feeds.feedburner.com/~r/msdn/webmaster/~4/Fiu07OKuV9s" height="1" width="1"/&gt;</description><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/SEO/default.aspx">SEO</category><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/SEO+101/default.aspx">SEO 101</category><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/SEM/default.aspx">SEM</category><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/SEM+101/default.aspx">SEM 101</category><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/Security/default.aspx">Security</category><feedburner:origLink>http://www.bing.com/community/blogs/webmaster/archive/2009/10/01/the-merciless-malignancy-of-malware-part-4-sem-101.aspx</feedburner:origLink></item><item><title>The merciless malignancy of malware Part 3 (SEM 101)</title><link>http://feedproxy.google.com/~r/msdn/webmaster/~3/IAisJyCz9Ak/the-merciless-malignancy-of-malware-part-3-sem-101.aspx</link><pubDate>Thu, 24 Sep 2009 22:18:10 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9553308</guid><dc:creator>Webmaster Center team</dc:creator><slash:comments>3</slash:comments><wfw:commentRss>http://www.bing.com/community/blogs/webmaster/rsscomments.aspx?PostID=9553308</wfw:commentRss><comments>http://www.bing.com/community/blogs/webmaster/archive/2009/09/24/the-merciless-malignancy-of-malware-part-3-sem-101.aspx#comments</comments><description>&lt;p&gt;We’re going to diverge a bit from our regularly scheduled programming. 
Normally this column discusses search engine optimization (SEO) and related elements of search engine marketing (SEM), but we’re knee deep into our multi-part series on &lt;a href="http://en.wikipedia.org/wiki/Malware" target="_blank"&gt;malware&lt;/a&gt; and we’re going to begin the wrap-up with a talk about improving computer security. However, I geeked out a bit here, and the column went a bit long (yeah, even longer than usual!), so I decided to break this last section up into two pieces. Who wants to read a white paper as a blog post? I mean, besides me? :-)&lt;/p&gt;  &lt;p&gt;While beefing up your computer security practices won’t necessarily have a direct effect on your site’s SEO performance, consider the repercussions of not doing so. Presenting a malware-infected website to your customers is a great way to ruin the integrity and conversion potential of your online business. Top tier search engines like Bing will either block a malware-infected page from showing up in their search engine results pages (SERPs) or will redirect the affected page’s link to a malware warning message. Bing presents the following warning message when searchers click its SERP link for a malware-infected page:&lt;/p&gt;  &lt;p&gt;&lt;img border="0" src="http://www.bing.com/community/cfs-file.ashx/__key/CommunityServer.Components.UserFiles/00.00.19.19.87.Attached+Files/4666.Bing_5F00_Malware_5F00_warning.GIF" /&gt;&lt;/p&gt;  &lt;p&gt;Since the vast majority of searchers will never opt to click through to override a malware warning from a SERP, assuming the link to the affected page is even shown in the first place, failure to quickly address detected malware infections is a great way to kill off pretty much all of your search referral traffic. 
And those customers who navigate directly to your site will not likely come back once they’ve determined your site was the source of their newly acquired malware infection.&lt;/p&gt;  &lt;p&gt;In &lt;a href="http://www.bing.com/community/blogs/webmaster/archive/2009/09/11/the-merciless-malignancy-of-malware-part-1-sem-101.aspx" target="_blank"&gt;Part 1&lt;/a&gt; of this series on malware, we discussed how to detect a malware infection on your website using tools like &lt;a href="http://www.bing.com/webmaster/" target="_blank"&gt;Bing’s Webmaster Center&lt;/a&gt;. The &lt;a href="http://www.bing.com/community/blogs/webmaster/archive/2009/09/18/the-merciless-malignancy-of-malware-part-2-sem-101.aspx" target="_blank"&gt;Part 2&lt;/a&gt; post was a long discussion on the resources and strategies for identifying the types and locations of malware code that typically affect websites, and included high-level information on removing it from your site. Today’s post, Part 3, and the next one, Part 4, together present 10 solid recommendations on how to better secure your workstation and web server computers so that the infections don’t come back. After all, what good is it to invest time in shooing away a kitchen full of house flies when you haven’t bothered to close the screen door?&lt;/p&gt;  &lt;p&gt;&lt;b&gt;Recommended security strategies&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;Once malware is removed, you need to take steps to secure your website so that the malware doesn’t reappear in the future. Securing all of the computers involved with creating, managing, and serving your website is the key to success. If you were infected with malware, that means your computer infrastructure has one or more security vulnerabilities that need to be addressed. The following preventive measures are key tasks that either you or your hosting provider (likely a combination of both) need to take.&lt;/p&gt;  &lt;p&gt;&lt;b&gt;1. 
Install and use an antivirus tool&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;If you have not done so yet, install and run a fully capable &lt;a href="http://www.microsoft.com/windows/antivirus-partners/windows-7.aspx" target="_blank"&gt;antivirus software tool&lt;/a&gt; on the computer workstation you use to develop and upload your website content. If your web server is not otherwise protected, also install an appropriate antivirus solution on it as well. A high-quality antivirus product will support scanning embedded scripts and other locally saved webpage controls used in your website’s source code for any known malware, so don’t skimp on quality and features here.&lt;/p&gt;  &lt;p&gt;Once you have an antivirus solution installed, be sure to regularly update both the tool’s program code and its malware signature files used for detection. Most modern antivirus tools have update features built-in, but make sure the update feature is working as expected before setting it and forgetting it. If you need some convincing as to why keeping your antivirus solution updated is important, I can only refer you to the &lt;a href="http://www.microsoft.com/sir" target="_blank"&gt;Microsoft Security Intelligence Report&lt;/a&gt; (to which Bing is a key contributor). And lastly, remember to use your antivirus tool! You need to regularly scan your Internet-connected computers for malware to ensure they remain clean. &lt;/p&gt;  &lt;p&gt;Microsoft offers a free, web-based, anti-malware scanner called &lt;a href="http://safety.live.com/" target="_blank"&gt;Windows Live OneCare safety scanner&lt;/a&gt;. It works on computers running Windows XP, Windows Vista, and Windows 7. It checks for and removes viruses, spyware, and other likely unwanted software, as well as detects vulnerabilities in your Internet connection. 
Heck, it can even be used to clean up your hard drive and tune up your computer’s performance!&lt;/p&gt;  &lt;p&gt;Microsoft has also just released its &lt;a href="http://www.microsoft.com/security_essentials/" target="_blank"&gt;Microsoft Security Essentials&lt;/a&gt; program, a new, no-cost, anti-malware solution that runs in the background of your computer and protects it in real-time against viruses, spyware, and other malicious software. Check it out.&lt;/p&gt;  &lt;p&gt;&lt;b&gt;2. Install and use an anti-spyware tool&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;If your antivirus solution doesn’t specifically include it (and many do these days), you should also install a &lt;a href="http://en.wikipedia.org/wiki/Anti-spyware#Remedies_and_prevention" target="_blank"&gt;good anti-spyware scanning and protection tool&lt;/a&gt; on your workstation (since you likely don’t surf the Web directly from your web server, this protection is likely not needed there). As with the antivirus tool, keep this tool updated and use it regularly to scan your computer for problems. The last thing you want to do is introduce malware into your web server environment from a compromised workstation!&lt;/p&gt;  &lt;p&gt;Microsoft also offers a free antispyware tool called &lt;a href="http://www.microsoft.com/windows/products/winfamily/defender/" target="_blank"&gt;Windows Defender&lt;/a&gt;. It actively protects your computer in real-time against pop-ups, performance problems, and security threats by detecting and removing spyware and other unwanted software.&lt;/p&gt;  &lt;p&gt;&lt;b&gt;3. Use a firewall&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;At a minimum, you should use a software firewall utility to protect your workstation and server from external hackers. A software firewall blocks unauthorized and inappropriate network traffic to your computer. Hackers use such traffic to take control of, and thus install malware on, your system. 
&lt;a href="http://www.bing.com/search?q=software+firewall&amp;amp;form=QBLH&amp;amp;qs=AS&amp;amp;pq=software+firew" target="_blank"&gt;Many software firewall options exist&lt;/a&gt;, both for Windows users and users of other platforms. On your server, use the firewall to block all inbound traffic except for normal web server requests traffic and a secure access method for your webmaster site uploads from predefined computers.&lt;/p&gt;  &lt;p&gt;To improve security further, consider installing a separate hardware firewall device between your computers and the Internet that offers, at a minimum, &lt;a href="http://en.wikipedia.org/wiki/Stateful_firewall" target="_blank"&gt;stateful packet inspection (SPI)&lt;/a&gt;. Firewall devices use SPI to track the state of the network connections passing through them. Rogue or malformed TCP/IP network packets, sometimes implemented by hackers to get through weaker firewall solutions, are rejected by SPI-enabled firewalls. &lt;a href="http://en.wikipedia.org/wiki/Application_layer_firewall" target="_blank"&gt;Application-level filter&lt;/a&gt; firewalls are better yet, as they work at the application layer of the network protocol stack, where they can more safely examine which network protocol is used on which port and determine whether its use is appropriate.&lt;/p&gt;  &lt;p&gt;&lt;b&gt;4. Use a secure protocol to access your web server&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;Standard FTP protocol doesn’t encrypt the data as it’s transmitted, so if your computer or its network has been compromised by hacker using network sniffer technologies, your web server’s logon credentials are at risk of being stolen. As alluded to in the section on firewall, using &lt;a href="http://en.wikipedia.org/wiki/SSH_File_Transfer_Protocol" target="_blank"&gt;Secure FTP or Secure Shell (SSH)&lt;/a&gt; eliminates this potential vulnerability. 
Make sure you do this end-to-end, from the site developer to the webmaster and from the webmaster to the server.&lt;/p&gt;  &lt;p&gt;&lt;b&gt;5. Change and strengthen your passwords&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;Your computer security is usually only as good as the freshness and strength of the passwords you use to access your computer. If your passwords haven’t been changed since the days &lt;a href="http://en.wikipedia.org/wiki/%27N_Sync" target="_blank"&gt;'N Sync&lt;/a&gt; was still hot, it’s time to say &amp;quot;Bye Bye Bye&amp;quot; to that. You need to implement a regimen of regularly changing your passwords. And when you do, please make them harder to guess than “password” or something else hyper-obvious. Check out the article, &lt;a href="http://www.microsoft.com/protect/fraud/passwords/create.aspx" target="_blank"&gt;Create strong passwords&lt;/a&gt;, for helpful tips on doing this.&lt;/p&gt;  &lt;p&gt;Yeah, you don’t need to tell me that this is inconvenient. But if you choose to skip doing this, while you might be happier temporarily, hackers will be thrilled. Static, simple passwords are easy to crack, and once hackers figure out your logon credentials, they can do anything they want to your site, including locking you out! Imagine having a hacked site and you can’t even log in to fix the problem!&lt;/p&gt;  &lt;p&gt;&lt;b&gt;More recommendations to come&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;We’ll continue with another five recommendations for securing your webmaster computing environment in our next post. If you have any questions or comments about malware, please feel free to post them in our &lt;a href="http://www.bing.com/community/forums/12248.aspx" target="_blank"&gt;General Questions forum&lt;/a&gt;. For regular SEM and SEO questions and suggestions, please go to our &lt;a href="http://www.bing.com/community/forums/12256.aspx" target="_blank"&gt;SEM forum&lt;/a&gt;. 
I’ll be back…&lt;/p&gt;  &lt;p&gt;&lt;i&gt;-- Rick DeJarnette, Bing Webmaster Center&lt;/i&gt;&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://www.bing.com/community/aggbug.aspx?PostID=9553308" width="1" height="1"&gt;&lt;img src="http://feeds.feedburner.com/~r/msdn/webmaster/~4/IAisJyCz9Ak" height="1" width="1"/&gt;</description><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/SEO/default.aspx">SEO</category><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/SEO+101/default.aspx">SEO 101</category><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/SEM/default.aspx">SEM</category><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/SEM+101/default.aspx">SEM 101</category><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/Security/default.aspx">Security</category><feedburner:origLink>http://www.bing.com/community/blogs/webmaster/archive/2009/09/24/the-merciless-malignancy-of-malware-part-3-sem-101.aspx</feedburner:origLink></item><item><title>The merciless malignancy of malware Part 2 (SEM 101)</title><link>http://feedproxy.google.com/~r/msdn/webmaster/~3/1bkX-C0SDQk/the-merciless-malignancy-of-malware-part-2-sem-101.aspx</link><pubDate>Fri, 18 Sep 2009 22:34:04 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9552126</guid><dc:creator>Webmaster Center team</dc:creator><slash:comments>7</slash:comments><wfw:commentRss>http://www.bing.com/community/blogs/webmaster/rsscomments.aspx?PostID=9552126</wfw:commentRss><comments>http://www.bing.com/community/blogs/webmaster/archive/2009/09/18/the-merciless-malignancy-of-malware-part-2-sem-101.aspx#comments</comments><description>&lt;p&gt;Malware infections are no laughing matter. When they afflict your website, they can infect your customers, who won’t appreciate your sharing, intentional or not (and I’m guessing it’s not)! 
And if Bing discovers malware on your site, your listing in the Bing search engine results pages (SERPs) will either be completely omitted or the link to your site will be disabled, so when the searcher clicks on it, only a malware warning appears. All told, this is bad news for conversions, don’t you think?&lt;/p&gt;  &lt;p&gt;This article is Part 2 of a three-part series on malware. &lt;a href="http://www.bing.com/community/blogs/webmaster/archive/2009/09/11/the-merciless-malignancy-of-malware-part-1-sem-101.aspx" target="_blank"&gt;Part 1&lt;/a&gt; covered how to detect the presence of malware on your site by using the &lt;a href="http://www.bing.com/webmaster/" target="_blank"&gt;Bing Webmaster Center tools&lt;/a&gt; to get access to the information the bot sees when it crawls your site’s pages and the external links they contain. In this post, we’ll cover the available resources and strategies to do a malware clean-up job. It’s usually a big job, so let’s get right to it.&lt;/p&gt;  &lt;p&gt;&lt;b&gt;Cleaning up the mess&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;Bing’s detection of malware on your site usually indicates that your site was hacked. Comprehensive information on how to clean up each specific malware infection could fill an entire book (and this post is quite long as is). Instead of deep dives into specifics, let’s talk about strategies and resources for combating this problem.&lt;/p&gt;  &lt;p&gt;&lt;b&gt;Sources of malware code&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;There are three primary ways your website might be serving malware:&lt;/p&gt;  &lt;ul&gt;   &lt;li&gt;&lt;b&gt;External source.&lt;/b&gt; If hackers exploit an existing security vulnerability to gain access to your source code, they often edit your HTML or script files to make calls to externally based, malicious content on servers they control. Worse yet, hackers don’t even need access to your web server or source files to inflict this attack. 
If you include externally based content on your pages, and if the hackers can successfully attack the source at its external site, your pages then will unintentionally serve their malware to your customers. &lt;/li&gt;    &lt;li&gt;&lt;b&gt;Local source.&lt;/b&gt; Sometimes hackers, once they’ve gained full access to vulnerable web servers, put malware code directly in your webpage files and/or place malicious content in the directory structure of your website. Your page’s HTML source code may still appear to be clean, but in this case, the poisoned images, documents, or other binary files they call locally can be the source of the malware attack. &lt;/li&gt;    &lt;li&gt;&lt;b&gt;Man-in-the-middle attack.&lt;/b&gt; Although this is a less common form of attack due to its technical sophistication, hackers can, when server and network security is severely compromised, inject malware into your webpage content over the network as it travels from your web server to the end user. &lt;/li&gt; &lt;/ul&gt;  &lt;p&gt;Your webpage will be considered malicious if you serve malware from any source, be it from an external server, directly from your web server, or by man-in-the-middle attacks. A user browsing to your webpage from the Bing SERPs will not be able to distinguish your clean content from the malicious content inserted there by hackers. It’s all presented as content in your webpages, so you are ultimately responsible for protecting your customers.&lt;/p&gt;  &lt;p&gt;&lt;b&gt;Attack indicators&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;The malicious code changes that hackers will likely employ come in one or more of these five forms. If any of these elements in your code appear to be suspicious, unexpectedly modified, or unfamiliar to you as webmaster, investigate them further.&lt;/p&gt;  &lt;ul&gt;   &lt;li&gt;&lt;b&gt;Script code&lt;/b&gt;. 
Webmasters should check for new &lt;a href="http://en.wikipedia.org/wiki/Code_injection" target="_blank"&gt;JavaScript script code inserted&lt;/a&gt; into their pages. This code will be contained within &amp;lt;script&amp;gt; tags. It is common for inserted, malicious script to be encoded or encrypted and accompanied by its own decoding routine and key, which allows the script to execute but prevents the webmaster from being able to read and interpret the function of the code. This is known as &lt;a href="http://en.wikipedia.org/wiki/Obfuscated_code#Obfuscation_in_malicious_software" target="_blank"&gt;obfuscated JavaScript code&lt;/a&gt;. Such code will look like many wrapped lines of continuous, random alphanumeric characters within a &amp;lt;script&amp;gt; tag. If your pages do not normally use obfuscated scripts, this form of script code will definitely stand out as different. This malicious code usually runs when the page is loaded, and it typically appends an exploit or a poisoned, hidden control to the page as it is loaded. &lt;/li&gt;    &lt;li&gt;&lt;b&gt;&amp;lt;iframe&amp;gt; code.&lt;/b&gt; An &amp;lt;iframe&amp;gt; is simply an HTML tag that enables an unrelated HTML document to be loaded within another HTML document. The &amp;lt;iframe&amp;gt; tag enables hackers to inject poisoned HTML and script code into another webmaster’s webpage. Injected script code often employs &amp;lt;iframe&amp;gt; tags as the means of creating hidden windows that enable malware exploits to execute without the user’s knowledge. &lt;/li&gt;    &lt;li&gt;&lt;b&gt;Page redirect.&lt;/b&gt; If a hacker gets access to edit your home page, they can add code that will automatically and immediately redirect a web browser to another web page (usually to a similarly named, identical-looking page on an external server, but possibly to one created on your server) that runs malware as the page loads. 
This can be done by means of &amp;lt;meta&amp;gt; refresh, JavaScript, or even 301/302 redirects. Unless you find the code for the redirect when you examine your content, you typically won’t see any malware on your page because it’s not executed there. Be sure to also visually inspect your web server configuration for unauthorized redirects. &lt;/li&gt;    &lt;li&gt;&lt;b&gt;Externally sourced content.&lt;/b&gt; Note that while small, externally based controls, like hit counters, can be secure when you first install them, if the webmaster of that control’s host server is not security conscious, those once-benign controls can themselves become malicious vectors later on. Also, small advertising hosts can outsource their contracts to other advertising hosts, who might sub-contract that work out again several times down the line, all done in order to sell more advertising. But the farther the content gets from the original, trusted external host, the less you can rely on that host’s security practices and the more exposed your pages become. Only use external (third-party) content from highly trusted sources whose security practices are widely known to be good. &lt;/li&gt;    &lt;li&gt;&lt;b&gt;Obfuscation efforts.&lt;/b&gt; Attackers often try to hide their exploitation work from quick inspections by using external, referring domain names that are spelled very similarly to known, trusted entities of the Web. Check the spelling of domain names in all external resources to be sure the URLs were not changed to addresses that are similarly named but not the actual, intended target. This includes external references to advertisers, hit counters and other such controls, external images, analytics trackers, and the like. Also, look for URLs that substitute IP addresses for domain names, another common method of obfuscation. &lt;/li&gt; &lt;/ul&gt;  &lt;p&gt;&lt;b&gt;What can you do?&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;You gotta look at the code. 
If someone cracked your web server’s security and modified your source code, you need to find what’s changed as the first step in identifying and cleaning up the malware. You can do this by visually inspecting the HTML and script code on your pages for unauthorized changes.&lt;/p&gt;  &lt;p&gt;When you examine your source code, inspect the files as they exist on your web server. Look for newly added scripts in your HTML pages that execute when the page loads, especially obfuscated script. Consider any references to third-party domains in your source code as a potential source of malware. Suspects should include any inserted external code that runs on your site when the page is loaded, including hit counters, images, media content, and other externally sourced controls. External scripts should never be implicitly trusted without a careful consideration of that host’s security practices, as this is a major security vulnerability.&lt;/p&gt;  &lt;p&gt;As much as possible, remove unnecessary, externally sourced content to reduce your exposure to exploits beyond your control. Only embed content from trusted third parties into your webpages. If you discover some code that was added to or modified on your page without authorization or realize a once-trusted external page element now appears to be malicious, simply remove that portion of the code from your file to clean it up.&lt;/p&gt;  &lt;p&gt;Malware might also have been embedded in your existing images, document files, animations and media content, or other binary files that are presented on your pages. Scan all of these again for malware with an antivirus tool.&lt;/p&gt;  &lt;p&gt;If you are using a version control system for maintaining your site’s source code, you can easily redeploy the last known good version before the infection occurred. 
Just be sure that the versioned source code from your workstation is not the source of the malware.&lt;/p&gt;  &lt;p&gt;&lt;b&gt;Diagnostic tools to use&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;To help in your source code examination, use these tools for additional insight on cleaning up a malware mess:&lt;/p&gt;  &lt;ul&gt;   &lt;li&gt;&lt;b&gt;Run an antivirus utility on your source code.&lt;/b&gt; Install a fully capable &lt;a href="http://www.bing.com/search?q=antivirus+software&amp;amp;form=QBLH&amp;amp;filt=all&amp;amp;qs=AS&amp;amp;pq=antivirus" target="_blank"&gt;antivirus software tool&lt;/a&gt; (and regularly update it to be sure its program code and malware signatures are current) to run a thorough scan of the folders containing your website’s source code. It may be able to detect some forms of malware if they are locally installed on your web server or your webpage files were modified with unauthorized, malicious scripts. Also run a thorough antivirus scan of your personal workstation (the one you use to edit your site’s source code and connect to the web server for uploads). You may unknowingly infect an otherwise clean web server from a malware-infected workstation. And if you get a key logger infection on your workstation, the hacker controlling that malware might steal your web server’s FTP logon credentials, providing them with full access to attack your site with malicious content. &lt;/li&gt;    &lt;li&gt;&lt;b&gt;Run Fiddler HTTP proxy on your website.&lt;/b&gt; The &lt;a href="http://www.fiddler2.com/" target="_blank"&gt;Fiddler web debugging proxy tool&lt;/a&gt; is a no-cost tool used to see what HTTP calls are being made when your page is loaded. By examining the network traffic generated by your webpages, you can see if your pages are making unexpected calls to unknown resources, and if so, identify where they are going. 
Watch the &lt;a href="http://www.fiddler2.com/Fiddler/help/" target="_blank"&gt;Fiddler video tutorials and read its documentation&lt;/a&gt; to learn how this valuable tool is used and how it works. &lt;/li&gt; &lt;/ul&gt;  &lt;p&gt;&lt;b&gt;Checking for man-in-the-middle attacks &lt;/b&gt;&lt;/p&gt;  &lt;p&gt;You might also inspect your source code as received by your browser using the browser’s &lt;b&gt;View Source&lt;/b&gt; command to check for “man-in-the-middle” attacks. In that case, a direct inspection of the original webpage source code files on your web server would likely reveal no malware infection. However, by viewing the source code of the infected webpages in your web browser and comparing it to the original, clean file from the web server, you might find the malicious changes. If so, inform your web-hosting provider that they might be the victims of a &amp;quot;man-in-the-middle&amp;quot; attack. If your provider takes no action as a result, consider moving your website to a more trusted provider. Luckily, as this is a much more sophisticated attack, it is less common than overt modification of the code on your webpages.&lt;/p&gt;  &lt;p&gt;&lt;b&gt;&lt;i&gt;Warning!&lt;/i&gt;&lt;/b&gt; Make sure both your browser and your operating system are running the latest security updates, along with running up-to-date antivirus, anti-spyware, and software firewall products, to minimize the vulnerabilities to your computer when loading pages likely to be infected with malware.&lt;/p&gt;  &lt;p&gt;Also, most web browsers allow you to configure specific security settings for individual sites. Add your infected site to the list. (If your browser doesn’t allow you to specify security settings for individual sites, you can temporarily implement these settings for all sites during your testing, but you may want to revert those changes later to restore full functionality.) You’ll want to disable JavaScript for your tests. 
If you’re using Internet Explorer, you’ll also want to disable ActiveX controls. These changes will protect your computer from the infection methods used by malware.&lt;/p&gt;  &lt;p&gt;&lt;b&gt;Verifying your fixes&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;Once you have cleaned up the problem, you should verify your work to be sure the revised code is clean. &lt;/p&gt;  &lt;ul&gt;   &lt;li&gt;Visually inspect the page on the web server to be sure the edits are in place. &lt;/li&gt;    &lt;li&gt;Visually inspect the changed page and its source code in your web browser. &lt;/li&gt;    &lt;li&gt;Use Fiddler to ensure that the malware’s unexpected external network calls have been eliminated. &lt;/li&gt; &lt;/ul&gt;  &lt;p&gt;&lt;b&gt;A stumper&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;Sometimes you’ll scan your site’s code and find no clear source of malware, yet malware is clearly affecting the users of your website. If this is the case, look at portions of your code where you take user input without input validation, write cookies to the user’s computer, or other such personalized activity beyond simply displaying information to a generic user. Your site may be the victim of &lt;a href="http://en.wikipedia.org/wiki/Cross-site_scripting" target="_blank"&gt;cross-site scripting (XSS)&lt;/a&gt;. Resolving this specific issue is beyond the scope of this article, but it is very commonly used by hackers for exploiting computer security vulnerabilities, and you should learn how to protect your site against such attacks.&lt;/p&gt;  &lt;p&gt;&lt;b&gt;Additional information resources&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;Microsoft offers a number of useful, anti-malware resources to help you understand what you are up against and what you need to do. 
Check these out for starters:&lt;/p&gt;  &lt;ul&gt;   &lt;li&gt;Visit the &lt;a href="http://www.microsoft.com/security/portal/" target="_blank"&gt;Microsoft Malware Protection Center Threat Research and Response portal&lt;/a&gt; for information on malware, Microsoft security products, useful guidance and advice, and more. &lt;/li&gt;    &lt;li&gt;Visit the Microsoft TechNet &lt;a href="http://technet.microsoft.com/en-us/security/" target="_blank"&gt;Security TechCenter&lt;/a&gt; to access their vast library of security resources, including the articles:       &lt;ul&gt;       &lt;li&gt;&lt;a href="http://technet.microsoft.com/en-us/library/cc498723.aspx" target="_blank"&gt;Security and Updates&lt;/a&gt; &lt;/li&gt;        &lt;li&gt;&lt;a href="http://technet.microsoft.com/en-us/library/cc700813.aspx" target="_blank"&gt;Help: I Got Hacked. Now What Do I Do?&lt;/a&gt; &lt;/li&gt;        &lt;li&gt;&lt;a href="http://technet.microsoft.com/en-us/library/cc512653.aspx" target="_blank"&gt;Virus Management: Overview of the Malicious Software Removal Tool&lt;/a&gt; &lt;/li&gt;     &lt;/ul&gt;   &lt;/li&gt; &lt;/ul&gt;  &lt;p&gt;The topic of malware clean up is admittedly not really an introductory level subject, despite this being the SEM 101 column. But the negative implications of detected malware infections on a website are huge. Referrals from Bing will likely dry up after the bots detect malware because of end user protection mechanisms employed on the SERPs to prevent searchers from clicking an infected page. And on top of that, the few customers who choose to circumvent those protections on the SERP or who choose to browse directly to an infected site may possibly suffer the frustrating consequences of a malware infection. Either way, the folks whom you are trying to convert, either with a purchase, a subscription, or a download, will be forced to deal with the unpleasant mess left by the malware picked up from your site. They won’t remain your customers for long. 
And that’s why this topic needs to be addressed in SEM 101, even though it’s not really a 101-level topic.&lt;/p&gt;  &lt;p&gt;If you have any questions or comments about malware, please feel free to post them in our &lt;a href="http://www.bing.com/community/forums/12248.aspx" target="_blank"&gt;General Questions forum&lt;/a&gt;. For regular SEM and SEO questions and suggestions, please go to our &lt;a href="http://www.bing.com/community/forums/12256.aspx" target="_blank"&gt;SEM forum&lt;/a&gt;. Next up: how to better secure your computers against hacker attacks. Until then...&lt;/p&gt;  &lt;p&gt;&lt;i&gt;-- Rick DeJarnette, Bing Webmaster Center&lt;/i&gt;&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://www.bing.com/community/aggbug.aspx?PostID=9552126" width="1" height="1"&gt;&lt;img src="http://feeds.feedburner.com/~r/msdn/webmaster/~4/1bkX-C0SDQk" height="1" width="1"/&gt;</description><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/SEO/default.aspx">SEO</category><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/SEO+101/default.aspx">SEO 101</category><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/SEM/default.aspx">SEM</category><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/SEM+101/default.aspx">SEM 101</category><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/Security/default.aspx">Security</category><feedburner:origLink>http://www.bing.com/community/blogs/webmaster/archive/2009/09/18/the-merciless-malignancy-of-malware-part-2-sem-101.aspx</feedburner:origLink></item><item><title>Temporary glitch for adding sites to Webmaster tools</title><link>http://feedproxy.google.com/~r/msdn/webmaster/~3/MM5Wj-E4AXI/tools-outage.aspx</link><pubDate>Thu, 17 Sep 2009 19:15:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9551906</guid><dc:creator>Webmaster Center 
team</dc:creator><slash:comments>16</slash:comments><wfw:commentRss>http://www.bing.com/community/blogs/webmaster/rsscomments.aspx?PostID=9551906</wfw:commentRss><comments>http://www.bing.com/community/blogs/webmaster/archive/2009/09/17/tools-outage.aspx#comments</comments><description>&lt;p class="MsoNormal" style="margin: 0in 0in 0pt;"&gt;&lt;span style="font-family: Calibri; font-size: small;"&gt;As some of you may have experienced, our &amp;ldquo;Add a site&amp;rdquo; page has been unavailable in some locales for the past several days. This is not a site-wide or even a global issue.&amp;nbsp; Unfortunately, this happened due to an update that uncovered a bug in the original code, which caused us to disable the page in one of our data centers. The good news is that we are testing a fix now and will be releasing it to production just as soon as we are sure it will not cause further complications. I will update this post&amp;nbsp;when I receive word that the fix is in production. &lt;/span&gt;&lt;/p&gt;
&lt;p class="MsoNormal" style="margin: 0in 0in 0pt;"&gt;&lt;span style="font-family: Calibri; font-size: small;"&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p class="MsoNormal" style="margin: 0in 0in 0pt;"&gt;&lt;span style="font-family: Calibri; font-size: small;"&gt;On behalf of all of us here in the Bing Webmaster Center team, we would like to extend our apologies and thank you for your continued patience.&lt;/span&gt;&lt;/p&gt;
&lt;p class="MsoNormal" style="margin: 0in 0in 0pt;"&gt;&lt;span style="font-family: Calibri; font-size: small;"&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p class="MsoNormal" style="margin: 0in 0in 0pt;"&gt;&lt;em&gt;&lt;span style="font-family: Calibri; font-size: small;"&gt;--Brett Yount, Bing Webmaster Center &lt;/span&gt;&lt;/em&gt;&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://www.bing.com/community/aggbug.aspx?PostID=9551906" width="1" height="1"&gt;&lt;img src="http://feeds.feedburner.com/~r/msdn/webmaster/~4/MM5Wj-E4AXI" height="1" width="1"/&gt;</description><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/Announcement/default.aspx">Announcement</category><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/Tools+outage/default.aspx">Tools outage</category><feedburner:origLink>http://www.bing.com/community/blogs/webmaster/archive/2009/09/17/tools-outage.aspx</feedburner:origLink></item><item><title>The merciless malignancy of malware Part 1 (SEM 101)</title><link>http://feedproxy.google.com/~r/msdn/webmaster/~3/zHeZrpzHrHI/the-merciless-malignancy-of-malware-part-1-sem-101.aspx</link><pubDate>Fri, 11 Sep 2009 19:15:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9550111</guid><dc:creator>Webmaster Center team</dc:creator><slash:comments>15</slash:comments><wfw:commentRss>http://www.bing.com/community/blogs/webmaster/rsscomments.aspx?PostID=9550111</wfw:commentRss><comments>http://www.bing.com/community/blogs/webmaster/archive/2009/09/11/the-merciless-malignancy-of-malware-part-1-sem-101.aspx#comments</comments><description>&lt;p&gt;The Web is an incredible place, filled with amazing media, fascinating content, and wonderful social opportunities, and there&amp;rsquo;s more of each than anyone can possibly ever consume. But unfortunately, it&amp;rsquo;s not a benign place. There are more than a few malefactors out there who actively seek to take over your computer for a variety of nefarious purposes. 
These purposes usually include turning your computer into a:&lt;/p&gt; &lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt; &lt;ul&gt; &lt;li&gt;Member of their computer zombie army, available on command (and to the highest bidder) to execute massive &lt;a target="_blank" href="http://en.wikipedia.org/wiki/DDOS#Distributed_attack"&gt;distributed denial-of-service (DDOS) attacks&lt;/a&gt; on other web-based computers&lt;/li&gt; &lt;li&gt;Recorder of keystrokes so they can steal passwords to users&amp;rsquo; online financial accounts, along with their cash and other, personal data of value to identity thieves&lt;/li&gt; &lt;li&gt;Secret, hidden repository for their stolen and hacked software and pornographic content&lt;/li&gt; &lt;li&gt;Vector for spreading their malicious software (aka &lt;a target="_blank" href="http://en.wikipedia.org/wiki/Malware"&gt;malware&lt;/a&gt;) to other computers&lt;/li&gt; &lt;/ul&gt; &lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt; &lt;p&gt;The people who do this today are usually not the one-off, &lt;a target="_blank" href="http://en.wikipedia.org/wiki/Script_kiddies"&gt;script kiddies&lt;/a&gt; of yore. These miscreants are now often very sophisticated computer software engineers who work for organized criminal groups. And make no mistake: the motive is now profit-based, not simple mischief. These hackers attempt to do all this and more by infecting your computer with a wide variety of malware.&lt;/p&gt; &lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt; &lt;p&gt;Malware is the name for software created specifically to stealthily install, take control, and perform harmful actions on a computer without the computer owner&amp;rsquo;s knowledge or permission. Programs such as viruses, worms, Trojan horses, root kits, key loggers, malicious scripts, drive-by downloads, and corrupted program controls are today typically Internet-borne threats, much of it coming from otherwise innocent websites whose content is often secretly hacked. 
&lt;/p&gt; &lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt; &lt;p&gt;Many tech-savvy users know the basics of protecting their computers from these denizens of the dark, but not everyone does. That lapse in universal security consciousness has to include, sad to say, some webmasters and web server hosts. When Bing crawls the Web to gather new and revised content to index, it invariably comes across malware-infected sites. While a few appear to be clear attempts to lure in unsuspecting users like a &lt;a target="_blank" href="http://en.wikipedia.org/wiki/Venus_flytrap"&gt;Venus Flytrap&lt;/a&gt; waiting for its next insect meal, a large number of sites appear to be infected from external sources (aka hackers), and the webmasters of these affected sites are almost guaranteed to be innocent victims of sabotage.&lt;/p&gt; &lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt; &lt;p&gt;This is Part 1 of a three-part series on malware and what webmasters need to know. We&amp;rsquo;ll cover malware detection (how to tell if your site is infected), strategies and resources for cleaning up (what to do about it), and how to secure computers against the security vulnerabilities that allowed the malware to be injected there (how to stop it from coming back). We&amp;rsquo;ll also cover what to do once malware is cleaned up so that the Bing index lists your site as being clean again. Let&amp;rsquo;s get to it!&lt;/p&gt; &lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt; &lt;p&gt;&lt;b&gt;Detection&lt;/b&gt;&lt;/p&gt; &lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt; &lt;p&gt;So how do you know if your site has unwittingly become a malware vector? It&amp;rsquo;s not always easy for webmasters to tell. You can wait for victimized users to send you reports (often in the form of furiously rude complaints!), but by then who knows how many of your site&amp;rsquo;s visitors have been infected (and how many of them will come back once they determine where the infection came from)? 
&lt;/p&gt; &lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt; &lt;p&gt;The search engine crawlers (aka bots) have seen it all. They see the attempted effort to inject malware in drive-by attacks as they crawl the Web. While the bots themselves don&amp;rsquo;t get infected, they do note the source of the infection attempt in their database.&lt;/p&gt; &lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt; &lt;p&gt;Wouldn&amp;rsquo;t you like to peer into that database to see if the bot found malware on your site? Well, I&amp;rsquo;ve got good news for you. You can! &lt;a target="_blank" href="http://www.bing.com/webmaster"&gt;Bing&amp;rsquo;s Webmaster Center tools&lt;/a&gt; offer a peek at what the bot found when crawling your webpages. And unlike the webmaster tools from other search engines, Bing Webmaster Center will show you if we detected malware when we crawled your pages. To get this invaluable insider&amp;rsquo;s view of your site, you&amp;rsquo;ll need to first have an account with Webmaster Center. If you don&amp;rsquo;t yet have an account, follow the instructions at &lt;a target="_blank" href="http://help.live.com/Help.aspx?market=en-US&amp;amp;project=WL_Webmasters&amp;amp;querytype=topic&amp;amp;query=WL_WEBMASTERS_CONC_VerifyYourSite.htm"&gt;Authenticate your website&lt;/a&gt; to set up your account and register your site(s). Note that you&amp;rsquo;ll need access to either the root directory of your website or to the source code to your site&amp;rsquo;s default page for deploying a customized authentication code that proves you are the owner of the site. This data about your website is business confidential, after all!&lt;/p&gt; &lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt; &lt;p&gt;Once your site is registered and can be authenticated, log in to &lt;a target="_blank" href="http://www.bing.com/webmaster"&gt;the tools&lt;/a&gt;, click the registered site you want to investigate from the &lt;b&gt;Site List&lt;/b&gt; page, and then click the &lt;b&gt;Crawl Issues&lt;/b&gt; tool tab. 
In the &lt;b&gt;Select Issue Type&lt;/b&gt; drop down list, select &lt;b&gt;Malware Infected&lt;/b&gt;. If any infected pages were detected by &lt;a target="_blank" href="http://www.bing.com/community/blogs/webmaster/archive/2009/07/17/new-bot-work-continues-at-bing.aspx"&gt;MSNBot&lt;/a&gt;, we&amp;rsquo;ll identify those pages for you by file name. Note that getting no explicit results in the &lt;b&gt;Malware Infected&lt;/b&gt; list is not necessarily the equivalent of a clean bill of health for your entire website. That merely means we didn&amp;rsquo;t detect malware on the pages that are in the index. To see how many of your site&amp;rsquo;s pages are in the Bing index, click on the &lt;b&gt;Summary&lt;/b&gt; tool tab, and then look at the &lt;b&gt;Indexed pages&lt;/b&gt; field. If not every page in your site is indexed, you might remain reasonably suspicious, even with no detected malware. But if any malware was detected, consider this to be a giant red flag hoisted up high. In this case, every page on your site needs to be examined closely, especially those not indexed. A detected malware infection means your site has likely been hacked, and if your site&amp;rsquo;s security was compromised once, every page should be suspected as dirty until individually verified by you as clean. &lt;/p&gt; &lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt; &lt;p&gt;  You should also click on the &lt;b&gt;Outbound Links&lt;/b&gt; tool tab and select the &lt;b&gt;Show only outbound links to malware&lt;/b&gt; check box to see if you&amp;rsquo;re linking to any indexed, malware-infected pages on other sites. If so, you can protect your site&amp;rsquo;s customers by removing the link to the infected page. 
It&amp;rsquo;s also good form to inform your fellow webmaster of what you&amp;rsquo;ve detected on their site so they can fix the problem and you can restore the link (wouldn&amp;rsquo;t you want to know if another webmaster found something wrong with your site?).&lt;/p&gt; &lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt; &lt;p&gt;&lt;b&gt;Implications of a positive result&lt;/b&gt;&lt;/p&gt; &lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt; &lt;p&gt;OK, so unlucky you &amp;ndash; your site has one or more pages that were detected as infected with malware. What does this mean? Do you really need to fix it? Well, let&amp;rsquo;s address these questions by describing what Bing does with malware-infected sites. &lt;/p&gt; &lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt; &lt;p&gt;Through the use of its &lt;a target="_blank" href="http://www.bing.com/community/blogs/webmaster/archive/2009/06/17/bing-keeps-the-web-safe-with-malware-filter.aspx"&gt;malware filter&lt;/a&gt; and the &lt;a target="_blank" href="http://www.bing.com/community/blogs/search/archive/2008/12/02/battling-the-plague-of-the-web.aspx"&gt;drive-by download detection&lt;/a&gt; features, Bing helps protect its users against a variety of malware infections whenever possible. These protections either identify and remove malware sites from our search engine results pages (SERPs) or block access to infected URLs. If your malware-infected page does show up in the Bing SERP, the blue link to your page will be disabled. When a user clicks on the disabled link, instead of going to your page, they will see a malware warning box pop up to the right of the SERP listing. 
The pop up warning box looks like the following example: &lt;/p&gt; &lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt; &lt;p&gt;&lt;img src="http://www.bing.com/community/cfs-file.ashx/__key/CommunityServer.Components.UserFiles/00.00.19.19.87.Attached+Files/4666.Bing_5F00_Malware_5F00_warning.GIF" border="0" /&gt;&lt;/p&gt; &lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt; &lt;p&gt;A recent study at Microsoft revealed that 98% of searchers who get a malware notification will heed the warning and opt to not click the &lt;b&gt;visit the website&lt;/b&gt; link in the warning message. That means that if your site is flagged by Bing as malware-infected, your search engine referral traffic will drop off the charts! As such, it is in your best interest as webmaster to rectify the malware issue so that you can get your search engine referral business back in gear!&lt;/p&gt; &lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt; &lt;p&gt;In the next article of this series on malware, we&amp;rsquo;ll dive into strategies and identify resources for cleaning up a malware mess. If you have any questions or comments about malware, please feel free to post them in our &lt;a href="http://www.bing.com/community/forums/12248.aspx" target="_blank"&gt;General Questions forum&lt;/a&gt;. For regular SEM and SEO questions and suggestions, please go to our &lt;a href="http://www.bing.com/community/forums/12256.aspx" target="_blank"&gt;SEM forum&lt;/a&gt;. 
Until next time&amp;hellip;&lt;/p&gt; &lt;p&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt; &lt;p&gt;&lt;i&gt;-- Rick DeJarnette, Bing Webmaster Center&lt;/i&gt;&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://www.bing.com/community/aggbug.aspx?PostID=9550111" width="1" height="1"&gt;&lt;img src="http://feeds.feedburner.com/~r/msdn/webmaster/~4/zHeZrpzHrHI" height="1" width="1"/&gt;</description><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/SEO/default.aspx">SEO</category><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/SEO+101/default.aspx">SEO 101</category><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/SEM+101/default.aspx">SEM 101</category><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/Security/default.aspx">Security</category><feedburner:origLink>http://www.bing.com/community/blogs/webmaster/archive/2009/09/11/the-merciless-malignancy-of-malware-part-1-sem-101.aspx</feedburner:origLink></item><item><title>Search Engine Optimization for Bing</title><link>http://feedproxy.google.com/~r/msdn/webmaster/~3/ZlZy_5gm5iM/search-engine-optimization-for-bing.aspx</link><pubDate>Thu, 03 Sep 2009 22:20:21 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9548667</guid><dc:creator>Webmaster Center team</dc:creator><slash:comments>114</slash:comments><wfw:commentRss>http://www.bing.com/community/blogs/webmaster/rsscomments.aspx?PostID=9548667</wfw:commentRss><comments>http://www.bing.com/community/blogs/webmaster/archive/2009/09/03/search-engine-optimization-for-bing.aspx#comments</comments><description>&lt;p&gt;When I attended both SMX Advanced in Seattle back in June and SES San Jose just a couple of weeks ago, I heard a lot of questions from webmasters about Bing, especially pertaining to search engine optimization issues. 
Typically these included:&lt;/p&gt;  &lt;ul&gt;   &lt;li&gt;I want to do SEO for Bing—where should I start? &lt;/li&gt;    &lt;li&gt;How is Bing different in terms of SEO? &lt;/li&gt;    &lt;li&gt;What do webmasters need to know and do? &lt;/li&gt;    &lt;li&gt;Are there any insider tips for successful ranking? &lt;/li&gt; &lt;/ul&gt;  &lt;p&gt;I’ll tackle these questions by providing some useful, baseline information and include pointers to more detailed, pertinent docs.&lt;/p&gt;  &lt;p&gt;As you know, Bing is an evolution in the search engine space. With its innovative, new user interface (UI) design bringing new depth and opportunities for searchers, they can now quickly find the information they seek when they search the Internet. New UI features, such as Quick Tabs, Related Searches, and Document Preview (to name just a few), surface more information and present more opportunities to discover what searchers want to know so they can make more informed decisions more quickly. As a result, we describe Bing as a decision engine. (For more information on the new UI features in Bing, see the Bing Webmaster Center blog post, &lt;a href="http://www.bing.com/community/blogs/webmaster/archive/2009/06/11/bing-white-paper-for-webmasters-amp-publishers-released.aspx"&gt;Bing white paper for webmasters &amp;amp; publishers released&lt;/a&gt;.)&lt;/p&gt;  &lt;p&gt;Under the covers of the new UI, we do a lot of engineering work on a very large scale. For example, we crawl a variety of content types found on the Web, index that content, apply appropriate algorithms, and finally send relevant content to user queries in our search engine results pages (SERPs).&lt;/p&gt;  &lt;p&gt;&lt;b&gt;Bing’s SEO principles&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;SEO is fundamentally about creating websites that are good for people. 
The most basic advice we can give for achieving optimum rank for your site in Bing is to do the following:&lt;/p&gt;  &lt;ul&gt;   &lt;li&gt;Develop &lt;a href="http://www.bing.com/community/blogs/webmaster/archive/2009/05/27/are-you-content-with-your-content-sem-101.aspx"&gt;great, original content&lt;/a&gt; (including &lt;a href="http://www.bing.com/community/blogs/webmaster/archive/2009/05/20/put-your-keywords-where-the-emphasis-is-sem-101.aspx"&gt;well-implemented&lt;/a&gt; &lt;a href="http://www.bing.com/community/blogs/webmaster/archive/2009/05/15/the-key-to-picking-the-right-keywords-sem-101.aspx"&gt;keywords&lt;/a&gt;) directed toward &lt;a href="http://www.bing.com/community/blogs/webmaster/archive/2009/05/12/who-s-looking-for-you-sem-101.aspx"&gt;your intended audience&lt;/a&gt; &lt;/li&gt;    &lt;li&gt;Use &lt;a href="http://www.bing.com/community/blogs/webmaster/archive/2009/07/10/architecting-content-for-seo-sem-101.aspx"&gt;well-architected&lt;/a&gt; &lt;a href="http://www.bing.com/community/blogs/webmaster/archive/2009/07/18/head-s-up-on-lt-head-gt-tag-optimization-sem-101.aspx"&gt;code&lt;/a&gt; in &lt;a href="http://www.bing.com/community/blogs/webmaster/archive/2009/06/26/site-architecture-and-seo-file-page-issues-sem-101.aspx"&gt;your webpages&lt;/a&gt; (including &lt;a href="http://www.bing.com/community/blogs/webmaster/archive/2009/07/23/images-and-flash-and-script-oh-my-sem-101.aspx"&gt;images&lt;/a&gt; and &lt;a href="http://www.bing.com/community/blogs/webmaster/archive/2009/08/15/uncovering-web-based-treasure-with-sitemaps-sem-101.aspx"&gt;Sitemaps&lt;/a&gt;) so that users’ web browsers and search engine crawlers can read the content &lt;a href="http://www.bing.com/community/blogs/webmaster/archive/2009/08/21/prevent-a-bot-from-getting-lost-in-space-sem-101.aspx"&gt;you want indexed&lt;/a&gt;) &lt;/li&gt;    &lt;li&gt;Earn several, high-quality, &lt;a 
href="http://www.bing.com/community/blogs/webmaster/archive/2009/06/16/links-the-good-the-bad-and-the-ugly-part-1-sem-101.aspx"&gt;authoritative&lt;/a&gt; &lt;a href="http://www.bing.com/community/blogs/webmaster/archive/2009/06/19/links-the-good-the-bad-and-the-ugly-part-2-sem-101.aspx"&gt;inbound links&lt;/a&gt; &lt;/li&gt; &lt;/ul&gt;  &lt;p&gt;As you can see by the links, much of this material has already been discussed in-depth in the Webmaster Center team blog in our ongoing column, &lt;a href="http://www.bing.com/community/search/livesearch.aspx?domain=www.bing.com%2fcommunity&amp;amp;q=%22SEM+101%22"&gt;search engine marketing (SEM) 101&lt;/a&gt;.&lt;/p&gt;  &lt;p&gt;The type of SEO work and tasks webmasters need to perform to be successful in Bing hasn’t changed—all of the legitimate, time-tested, SEO skills and knowledge that webmasters have invested in previously apply fully today with Bing. Moreover, investments in solid, reputable SEO work made for Bing will bring similar improvements in your website’s page rank in other search engines as well.&lt;/p&gt;  &lt;p&gt;Ultimately, SEO is still SEO. Bing doesn’t change that. 
Bing’s new user interface design simply adds new opportunities for searchers to find the information they want more quickly and easily, and that benefits webmasters who have taken the time to work on the quality of their content and website architecture and have done the hard work of earning several high-quality inbound links.&lt;/p&gt;  &lt;p&gt;&lt;b&gt;Key content and tools for performing SEO with Bing&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;To keep up with the latest and greatest information coming from the Bing Webmaster Center team, we recommend that you follow and review the following content:&lt;/p&gt;  &lt;ul&gt;   &lt;li&gt;Review the Bing official &lt;a href="http://help.live.com/Help.aspx?market=en-US&amp;amp;project=WL_Webmasters&amp;amp;querytype=topic&amp;amp;query=WL_WEBMASTERS_REF_GuidelinesforSuccessfulIndexing.htm"&gt;guidelines for successful indexing&lt;/a&gt; document for various recommendations on technical and content issues as well as known problems that can affect your site’s rank &lt;/li&gt;    &lt;li&gt;Visit the &lt;a href="http://www.bing.com/community/blogs/webmaster/"&gt;Webmaster Center blog&lt;/a&gt; to keep up with the latest information from the team (you can even subscribe to &lt;a href="http://www.bing.com/community/blogs/webmaster/rss.aspx"&gt;our blog’s RSS feed&lt;/a&gt; to automate this process) &lt;/li&gt;    &lt;li&gt;&lt;a href="http://help.live.com/Help.aspx?market=en-US&amp;amp;project=WL_Webmasters&amp;amp;querytype=topic&amp;amp;query=WL_WEBMASTERS_CONC_VerifyYourSite.htm"&gt;Register all of your websites&lt;/a&gt; with Bing &lt;a href="http://www.bing.com/webmaster"&gt;Webmaster Center tools&lt;/a&gt;, where you can &lt;a href="http://help.live.com/Help.aspx?market=en-US&amp;amp;project=WL_Webmasters&amp;amp;querytype=topic&amp;amp;query=WL_WEBMASTERS_CONC_WebmasterTools.htm"&gt;use our tools&lt;/a&gt; to see all sorts of data about your website that is pertinent to webmasters &lt;/li&gt;    &lt;li&gt;Participate in our &lt;a 
href="http://www.bing.com/community/forums/default.aspx?GroupID=11"&gt;Webmaster Center user forums&lt;/a&gt; to ask questions and provide us with feedback &lt;/li&gt; &lt;/ul&gt;  &lt;p&gt;We look forward to working with you as partners in helping our mutual customers find the information they seek on the Internet.&lt;/p&gt;  &lt;p&gt;&lt;i&gt;-- Rajesh Srivastava, Principal Group Program Manager, Bing&lt;/i&gt;&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://www.bing.com/community/aggbug.aspx?PostID=9548667" width="1" height="1"&gt;&lt;img src="http://feeds.feedburner.com/~r/msdn/webmaster/~4/ZlZy_5gm5iM" height="1" width="1"/&gt;</description><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/SEO/default.aspx">SEO</category><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/SEO+101/default.aspx">SEO 101</category><feedburner:origLink>http://www.bing.com/community/blogs/webmaster/archive/2009/09/03/search-engine-optimization-for-bing.aspx</feedburner:origLink></item><item><title>How Microsoft handles bots clicking on ads</title><link>http://feedproxy.google.com/~r/msdn/webmaster/~3/snVqQfMdf8s/how-microsoft-handles-bots-clicking-on-ads.aspx</link><pubDate>Tue, 25 Aug 2009 16:51:26 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9546496</guid><dc:creator>Webmaster Center team</dc:creator><slash:comments>21</slash:comments><wfw:commentRss>http://www.bing.com/community/blogs/webmaster/rsscomments.aspx?PostID=9546496</wfw:commentRss><comments>http://www.bing.com/community/blogs/webmaster/archive/2009/08/25/how-microsoft-handles-bots-clicking-on-ads.aspx#comments</comments><description>&lt;p&gt;There’s been some recent discussion in the SEO blogosphere asserting that Bing clicks its own &lt;a href="https://adcenter.microsoft.com/"&gt;adCenter&lt;/a&gt; ads. This has created some misunderstanding. 
Let’s take a moment to clarify what is actually happening, and what this really means for webmasters and advertisers.&lt;/p&gt;  &lt;p&gt;The Bing team is aware of an issue shared by all search engines: paid advertising links on sites are, on occasion, crawled and indexed by search engines. Standard practice in the search industry is to scan web pages for the purpose of indexing and understanding the site’s content, and to determine which ads best match the destination site. &lt;b&gt;Microsoft adCenter does not charge an advertiser for clicks generated by any known search engine bots, including our own.&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;AdCenter uses a variety of techniques to remove bots, including the Interactive Advertising Bureau’s (IAB) Spiders and Robots protocol.&amp;#160; The IAB provides a list of known bots, and Microsoft bots are a part of that list. As a result, any activity generated by bots will not skew adCenter data because it will be categorized as low quality in adCenter Reports. You can view the Standard Quality and Low Quality data by accessing the adCenter Reports tab.&lt;/p&gt;  &lt;p&gt;In June 2009, Microsoft received Click Quality Accreditation from the IAB, which holds the industry’s highest standards in click measurement. The IAB and independent third-party auditors verified that adCenter meets their requirements for Click Quality Accreditation, which include not billing for our search bot’s ad clicks. For more information, visit the &lt;a href="http://community.microsoftadvertising.com/blogs/advertiser/archive/2009/06/29/adcenter-and-atlas-media-console-receive-click-measurement-accreditation-from-the-media-rating-council.aspx"&gt;adCenter Blog&lt;/a&gt;, or the &lt;a href="http://www.iab.net/about_the_iab/recent_press_releases/press_release_archive/press_release/pr-051209"&gt;IAB site&lt;/a&gt;.&lt;/p&gt;  &lt;p&gt;This issue exists for all search engines, and we all follow the practice of not charging for bot-driven ad clicks. 
We maintain the integrity of our engine and our advertiser’s experience as a very high priority, and welcome your feedback. Please visit our Webmaster &lt;a href="http://www.bing.com/community/forums/12252.aspx"&gt;Crawling/Indexing Discussion forum&lt;/a&gt; to leave your comments and questions.&lt;/p&gt;  &lt;p&gt;&lt;i&gt;-- Rajesh Srivastava, Principal Group Program Manager, Bing &lt;/i&gt;&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://www.bing.com/community/aggbug.aspx?PostID=9546496" width="1" height="1"&gt;&lt;img src="http://feeds.feedburner.com/~r/msdn/webmaster/~4/snVqQfMdf8s" height="1" width="1"/&gt;</description><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/Crawling/default.aspx">Crawling</category><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/Announcement/default.aspx">Announcement</category><feedburner:origLink>http://www.bing.com/community/blogs/webmaster/archive/2009/08/25/how-microsoft-handles-bots-clicking-on-ads.aspx</feedburner:origLink></item><item><title>Prevent a bot from getting “lost in space” (SEM 101)</title><link>http://feedproxy.google.com/~r/msdn/webmaster/~3/L03RdjlIaLY/prevent-a-bot-from-getting-lost-in-space-sem-101.aspx</link><pubDate>Fri, 21 Aug 2009 00:32:25 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9545942</guid><dc:creator>Webmaster Center team</dc:creator><slash:comments>20</slash:comments><wfw:commentRss>http://www.bing.com/community/blogs/webmaster/rsscomments.aspx?PostID=9545942</wfw:commentRss><comments>http://www.bing.com/community/blogs/webmaster/archive/2009/08/21/prevent-a-bot-from-getting-lost-in-space-sem-101.aspx#comments</comments><description>&lt;p&gt;We recently published a non-SEM 101 blog post on &lt;a href="http://www.bing.com/community/blogs/webmaster/archive/2009/08/10/crawl-delay-and-the-bing-crawler-msnbot.aspx"&gt;controlling the crawl rate of MSNBot&lt;/a&gt;, the Bing web crawler 
(aka robot, or simply just bot). That got me thinking about robots. Naturally, that led to &lt;a href="http://www.imdb.com/media/rm3078854656/ch0011083"&gt;The Robot&lt;/a&gt; on &lt;a href="http://en.wikipedia.org/wiki/Lost_in_Space"&gt;Lost in Space&lt;/a&gt;. &lt;a href="http://en.wikipedia.org/wiki/File:BillMumy1.jpg"&gt;Will Robinson&lt;/a&gt;, the show’s precocious youngster who was a whiz at 1960s-style, clunky electronics (even though the show was supposedly set in 1997!), was best friends with The Robot. They looked out for each other and helped each other in times of need.&lt;/p&gt;  &lt;p&gt;In a way, search engine bots and webmasters have a similar relationship. They need one another. Webmasters need search engine bots to crawl the pages of their sites so that they can be added to their indexes. Bots need webmasters to provide them with compelling content, well-formed code, and authoritative backlinks to serve to their search customers in search engine results pages (SERPs). This form of a “mutualistic, symbiotic” relationship is beneficial to both parties. Webmasters benefit from getting high quality content into the search engine index. Search engines benefit by being able to provide searchers with useful, relevant results to their queries, no matter how arcane.&lt;/p&gt;  &lt;p&gt;The Robot character was actually quite powerful in its analytical capabilities, and would alert the Robinson family when danger lurked (although I never understood why it did not sense the cloak-and-dagger danger in stowaway Dr. Smith). But most important was young Will Robinson’s ability to communicate with The Robot. He could direct The Robot’s powerful intelligence toward things that needed its attention (such as your average, run-of-the-mill, hostile space aliens, who coincidently always looked like expressionless, rubber masked-humans!). 
The Robot’s assistance helped Will survive his rough and tumble alien environment.&lt;/p&gt;  &lt;p&gt;As a webmaster, you, too, can communicate with the robot (the search engine variety). You can block it from crawling specified directories and files, override the generic crawl block for a subset of those files, block specified pages from being indexed, block the following of links on a page, and much more. Let’s take a look at how you can converse with your friendly search engine robot and help it navigate your website with an eye toward directing its behavior for crawling and indexing your content. This effort might help your site better survive in the rough and tumble environment of the Web.&lt;/p&gt;  &lt;p&gt;While Will Robinson conversed with The Robot in spoken English, you’ll need to communicate with the search engine bots using the &lt;a href="http://www.bing.com/community/blogs/webmaster/archive/2008/06/03/robots-exclusion-protocol-joining-together-to-provide-better-documentation.aspx"&gt;Robots Exclusion Protocol (REP)&lt;/a&gt;. You can do this in two ways (depending on what you want to do): within the HTML code of each page or in a separate file named robots.txt (which you save to the root directory of your website). While not a perfect analogy, consider the scope of the message to the robot for defining how you tell it what to do. For site- or directory-wide directives, you’ll typically use the robots.txt file. Page- and even link-specific directives are more often handled in a page’s HTML code.&lt;/p&gt;  &lt;p&gt;&lt;b&gt;The robots.txt file&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;The robots.txt file is commonly used to block bots (identified as user-agents within the context of the file) from accessing directories and files that contain data the webmaster doesn’t want added to the search index, such as scripts, databases, and other information that is not intended for public consumption or has no value for searchers. 
For example, a basic robots.txt file might include directives such as the following:&lt;/p&gt;  &lt;p&gt;&lt;font face="Courier New"&gt;User-agent: *      &lt;br /&gt;Disallow: /private/&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;The above sample robots.txt file code applies to all user-agents (bots to you and me) and blocks bot access to all files in the directory named /private on the web server. &lt;/p&gt;  &lt;p&gt;But you can do more than that with the robots.txt file. What if you had a ton of existing files in /private but wanted some of them made available for the crawler to see? Instead of re-architecting your site to move certain content to a new directory (and potentially breaking internal links along the way), use the Allow directive. Allow is a non-standard REP directive, but it’s supported by Bing and other major search engines. Note that to be compatible with the largest number of search engines, you should list all Allow directives before the generic Disallow directives for the same directory. Such a pair of directives might look like this:&lt;/p&gt;  &lt;p&gt;&lt;font face="Courier New"&gt;Allow: /private/public.doc      &lt;br /&gt;Disallow: /private/&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;b&gt;Note&lt;/b&gt; If there is some logical confusion and both Allow and Disallow directives apply to a URL, the Allow directive takes precedence.&lt;/p&gt;  &lt;p&gt;&lt;b&gt;Wildcards&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;The use of wildcards is supported in robots.txt. The “*” character can be used to represent characters appended to the ends of URLs, such as session IDs and extraneous parameters. 
Examples of each would look like this:&lt;/p&gt;  &lt;p&gt;&lt;font face="Courier New"&gt;Disallow: */tags/      &lt;br /&gt;Disallow: *private.aspx       &lt;br /&gt;Disallow: /*?sessionid&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;The 1&lt;sup&gt;st&lt;/sup&gt; line in the above example blocks bot access to any URL that contains a directory named “tags,” such as “/best_Sellers/tags/computer/”, “/newYearSpecial/tags/gift/shoes/”, and “/archive/2008/sales/tags/knife/spoon/”. The 2&lt;sup&gt;nd&lt;/sup&gt; line above blocks all URLs that end with the string “private.aspx”, regardless of the directory name (note that the preceding forward slash is redundant and thus not included). The last line above blocks access to any URL with “?sessionid” anywhere in its URL string, such as “/cart.aspx?sessionid=342bca31?”. &lt;/p&gt;  &lt;p&gt;&lt;b&gt;Notes&lt;/b&gt;&lt;/p&gt;  &lt;ul&gt;   &lt;li&gt;The last directive in the sample is not intended to block file and directory names that use that string, only URL parameters, so we added the “?” to the sample string to ensure that it works as expected. However, if the parameter “sessionid” will not always be the first in a URL, you can change the string to “*?*sessionid” so you’re sure you block the URLs you intend. If you only want to block parameter names and not values, use the string “*?*sessionid=”. If you delete the “?” from the example string, this directive will block URLs containing file and directory names that match the string. As you can see, this can be tricky, but also quite powerful. &lt;/li&gt;    &lt;li&gt;A trailing “*” is always redundant since that replicates the existing behavior for MSNBot. Disallowing “/private*” is the same as disallowing “/private”, so don’t bother adding wildcard directives for those cases. &lt;/li&gt; &lt;/ul&gt;  &lt;p&gt;You can use the “$” wildcard character to filter by file extension. 
&lt;/p&gt;  &lt;p&gt;&lt;font face="Courier New"&gt;Disallow: /*.docx$&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;b&gt;Note&lt;/b&gt; The directive above will disallow any URL containing the file name extension string “.docx” from being crawled, such as a URL containing the sample string, “/sample/hello.docx”. In comparison, the directive Disallow: /*.docx blocks more URLs, as it applies to more than just file name extension strings.&lt;/p&gt;  &lt;p&gt;&lt;b&gt;Sitemaps&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;If you have created an XML-based Sitemap file for your site (as discussed in the recent blog post &lt;a href="http://www.bing.com/community/blogs/webmaster/archive/2009/08/15/uncovering-web-based-treasure-with-sitemaps-sem-101.aspx"&gt;Uncovering web-based treasure with Sitemaps&lt;/a&gt;), you can add a reference to the location of your Sitemap file at the end of your robots.txt file. The syntax for a Sitemap reference is as follows:&lt;/p&gt;  &lt;p&gt;&lt;font face="Courier New"&gt;Sitemap: http://www.your-url.com/sitemap.xml&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;b&gt;Other issues&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;You can also add a crawl-delay directive to your robots.txt file to change the default pace at which Bing crawls your site. I’ll do my part in conserving electrons and avoiding redundancy by instead referring you to that post: &lt;a href="http://www.bing.com/community/blogs/webmaster/archive/2009/08/10/crawl-delay-and-the-bing-crawler-msnbot.aspx"&gt;Crawl delay and the Bing crawler, MSNBot&lt;/a&gt;.&lt;/p&gt;  &lt;p&gt;Whatever you choose to do with robots.txt, don’t play games with constant changes to the file. 
&lt;a href="http://www.bing.com/community/blogs/webmaster/archive/2008/09/13/is-your-robots-txt-file-on-the-clock.aspx"&gt;Note this story from our blog&lt;/a&gt; about a webmaster who was having problems with getting a site properly crawled, and we discovered they were swapping out differently configured robots.txt files in an automated fashion in a misguided effort to control crawling. That was not a helpful strategy!&lt;/p&gt;  &lt;p&gt;&lt;b&gt;File format&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;The robots.txt file must be saved in a standard text file format, such as ASCII or UTF-8, so it can be read by the bots. One easy way to verify that the proper file format is used is to edit the file in Microsoft Notepad. Save the file using the Notepad default file format type, &lt;b&gt;Text Documents (*.txt)&lt;/b&gt; with &lt;b&gt;ANSI&lt;/b&gt; encoding.&lt;/p&gt;  &lt;p&gt;&lt;b&gt;Validation&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;Once your robots.txt file is built, I suggest that you validate it before you consider it done. If you are a member of &lt;a href="http://www.bing.com/webmaster/"&gt;Bing Webmaster Center&lt;/a&gt;, log in to get access to our &lt;a href="http://www.bing.com/webmaster/RobotReportPage.aspx"&gt;online Robot.txt validation tool&lt;/a&gt;. Otherwise, there are &lt;a href="http://www.bing.com/search?q=validate+robots.txt&amp;amp;go=&amp;amp;form=QBLH&amp;amp;qs=n"&gt;a number of other online robots.txt validators available&lt;/a&gt; for you to use. We all want to avoid having The Robot say, “Warning! Warning! That does not compute!”&lt;/p&gt;  &lt;p&gt;&lt;b&gt;Maintenance&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;Once you’ve got a good robots.txt file built and validated, don’t just set it and forget it. Periodically audit the settings in the file, especially after you’ve gone through a site redesign. 
You need to be sure your blocking directives are still valid so that nothing is unintentionally blocking the bot from valid content or leaving sensitive material exposed.&lt;/p&gt;  &lt;p&gt;&lt;b&gt;HTML code&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;You can also put REP directives directly in your HTML code. Recall a few weeks back we discussed the way to &lt;a href="http://www.bing.com/community/blogs/webmaster/archive/2009/07/18/head-s-up-on-lt-head-gt-tag-optimization-sem-101.aspx"&gt;optimize the &amp;lt;head&amp;gt; tag&lt;/a&gt; and got into how to use &amp;lt;meta&amp;gt; tags? Well, that is where most of these REP directives go as well. Let’s take a look at how these work. Here’s a sample &amp;lt;meta&amp;gt; tag that addresses bots:&lt;/p&gt;  &lt;p&gt;&lt;font face="Courier New"&gt;&amp;lt;meta name=&amp;quot;robots&amp;quot; content=&amp;quot;noindex, nofollow&amp;quot;&amp;gt;&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;There are more options than the two listed. REP values for the content attribute are designed to be aggregated to combine functionality so that it performs just as you want.&lt;/p&gt;  &lt;p&gt;&lt;b&gt;Meta tags&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;The name=”robots” attribute of the &amp;lt;meta&amp;gt; tag is read by the bot when it accesses an HTML page. The directives listed in the content attribute tell it what (or more specifically, what not) to do. Let’s define the function of each of the content attribute values.    
&lt;br /&gt;    &lt;br /&gt;    &lt;table border="1" cellspacing="0" cellpadding="0"&gt;&lt;tbody&gt;       &lt;tr&gt;         &lt;td valign="top" width="80"&gt;           &lt;p&gt;&lt;b&gt;Value&lt;/b&gt;&lt;/p&gt;         &lt;/td&gt;          &lt;td valign="top" width="657"&gt;           &lt;p&gt;&lt;b&gt;Function&lt;/b&gt;&lt;/p&gt;         &lt;/td&gt;       &lt;/tr&gt;        &lt;tr&gt;         &lt;td valign="top" width="80"&gt;           &lt;p&gt;noindex&lt;/p&gt;         &lt;/td&gt;          &lt;td valign="top" width="657"&gt;           &lt;p&gt;Prevents the bot from indexing the contents of the page, but links on the page can be followed. This is useful when the page’s content is not intended for searchers to see.&lt;/p&gt;         &lt;/td&gt;       &lt;/tr&gt;        &lt;tr&gt;         &lt;td valign="top" width="80"&gt;           &lt;p&gt;nofollow&lt;/p&gt;         &lt;/td&gt;          &lt;td valign="top" width="657"&gt;           &lt;p&gt;Prevents the bot from following the links on the page, but the page can be indexed. This is useful if the links created on a page are not in control of the webmaster, such as with a blog or user forum. Links to spam or malware are not what careful webmasters want to serve to their customers!&lt;/p&gt;         &lt;/td&gt;       &lt;/tr&gt;        &lt;tr&gt;         &lt;td valign="top" width="80"&gt;           &lt;p&gt;nosnippet&lt;/p&gt;         &lt;/td&gt;          &lt;td valign="top" width="657"&gt;           &lt;p&gt;Instructs the bot to not display a snippet for that page in the SERPs. 
Snippets are the text description of the page shown between the page title and the blue link to the site.&lt;/p&gt;         &lt;/td&gt;       &lt;/tr&gt;        &lt;tr&gt;         &lt;td valign="top" width="80"&gt;           &lt;p&gt;noarchive&lt;/p&gt;         &lt;/td&gt;          &lt;td valign="top" width="657"&gt;           &lt;p&gt;Instructs the bot to not display a cache link for that page in the SERP.&lt;/p&gt;         &lt;/td&gt;       &lt;/tr&gt;        &lt;tr&gt;         &lt;td valign="top" width="80"&gt;           &lt;p&gt;nocache&lt;/p&gt;         &lt;/td&gt;          &lt;td valign="top" width="657"&gt;           &lt;p&gt;Same as noarchive.&lt;/p&gt;         &lt;/td&gt;       &lt;/tr&gt;        &lt;tr&gt;         &lt;td valign="top" width="80"&gt;           &lt;p&gt;noodp&lt;/p&gt;         &lt;/td&gt;          &lt;td valign="top" width="657"&gt;           &lt;p&gt;Instructs the bot to not use a title and snippet from the Open Directory Project (ODP) for that page in the SERP.&lt;/p&gt;         &lt;/td&gt;       &lt;/tr&gt;     &lt;/tbody&gt;&lt;/table&gt; &lt;/p&gt;  &lt;p&gt;Note that the HTML-based, bot-blocking directives within a page will override any Allow directives in robots.txt that apply to that HTML file.&lt;/p&gt;  &lt;p&gt;&lt;b&gt;Links&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;What if you want almost all of the links on a page followed, but for some reason, you want to block the bot from following one or a few? Well, there is a solution for that as well: the rel=”nofollow” attribute. To see it in action, look at the following sample anchor tag:&lt;/p&gt;  &lt;p&gt;&lt;font face="Courier New"&gt;&amp;lt;a rel=&amp;quot;nofollow&amp;quot; href=&amp;quot;http://www.untrustedsite.com/forum/stuff.aspx?var=1&amp;quot;&amp;gt;Read these forum comments&amp;lt;/a&amp;gt;&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;b&gt;Caveats&lt;/b&gt;&lt;/p&gt;  &lt;p&gt;Note that with the rel=”nofollow” attribute, a REP-compliant bot will not follow that specific link on that page. 
However, if any other page on the site (or a link on an external site) refers to the blocked page without any REP directives blocking it, the page may still be crawled and could make it into the index. This caveat goes for all REP link blocking directives that are not consistently applied to a specified page. And in the case of external sites linking to the page on your site that is supposed to be blocked, which local webmasters cannot control, these pages may still be crawled and indexed. In this case, the &amp;lt;meta&amp;gt; tag’s noindex solution for that page is the best option.&lt;/p&gt;  &lt;p&gt;I discuss this attribute in some detail at the end of a previous blog article. I’ll again conserve electrons by referring you to &lt;a href="http://www.bing.com/community/blogs/webmaster/archive/2009/07/01/making-links-work-for-you-sem-101.aspx"&gt;Making links work for you&lt;/a&gt;. Robots appreciate that kind of thing.&lt;/p&gt;  &lt;p&gt;For more information on REP, see our past big blog article on the subject, &lt;a href="http://www.bing.com/community/blogs/webmaster/archive/2008/06/03/robots-exclusion-protocol-joining-together-to-provide-better-documentation.aspx"&gt;Robots Exclusion Protocol: joining together to provide better documentation&lt;/a&gt;.&lt;/p&gt;  &lt;p&gt;Judicious use of REP directives will help shape your website’s presence on the SERPs. It’ll help direct the search engine bots to content you want them to see and block them from content you do not want indexed, be it because the content is not useful to searchers (such as shopping cart or logon pages) or because it contains potentially business confidential data (such as references to internal IT infrastructure in scripts). Your efforts to direct the traffic here will help prevent bots from getting “lost in space.” It’s only too bad that the search engine bots don’t wave their virtual arms and trumpet “Danger, Will Robinson!” whenever they can access content not intended for the index! 
But that’s your call to make, not theirs.&lt;/p&gt;  &lt;p&gt;If you have any questions, comments, or suggestions, feel free to post them in either our &lt;a href="http://www.bing.com/community/forums/12256.aspx"&gt;SEM forum&lt;/a&gt; or our &lt;a href="http://www.bing.com/community/forums/12252.aspx"&gt;Crawling/Indexing Discussion forum&lt;/a&gt;. Later…&lt;/p&gt;  &lt;p&gt;&lt;i&gt;-- Rick DeJarnette, Bing Webmaster Center&lt;/i&gt;&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://www.bing.com/community/aggbug.aspx?PostID=9545942" width="1" height="1"&gt;&lt;img src="http://feeds.feedburner.com/~r/msdn/webmaster/~4/L03RdjlIaLY" height="1" width="1"/&gt;</description><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/SEO/default.aspx">SEO</category><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/SEO+101/default.aspx">SEO 101</category><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/Crawling/default.aspx">Crawling</category><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/SEM/default.aspx">SEM</category><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/SEM+101/default.aspx">SEM 101</category><feedburner:origLink>http://www.bing.com/community/blogs/webmaster/archive/2009/08/21/prevent-a-bot-from-getting-lost-in-space-sem-101.aspx</feedburner:origLink></item><item><title>Getting the IIS SEO Toolkit up and running</title><link>http://feedproxy.google.com/~r/msdn/webmaster/~3/pFEP6_PgU3A/setting-up-iis-7-before-installing-iis-seo-toolkit.aspx</link><pubDate>Mon, 17 Aug 2009 21:26:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9545340</guid><dc:creator>Webmaster Center 
team</dc:creator><slash:comments>36</slash:comments><wfw:commentRss>http://www.bing.com/community/blogs/webmaster/rsscomments.aspx?PostID=9545340</wfw:commentRss><comments>http://www.bing.com/community/blogs/webmaster/archive/2009/08/17/setting-up-iis-7-before-installing-iis-seo-toolkit.aspx#comments</comments><description>&lt;p&gt;We recently published a popular blog post called &lt;a href="http://www.bing.com/community/blogs/webmaster/archive/2009/08/03/get-detailed-site-analysis-to-solve-problems-sem-101.aspx"&gt;Get detailed site analysis to solve problems&lt;/a&gt; that highlights the function and capabilities of the new, beta search engine optimization (SEO) Toolkit from the Microsoft Internet Information Services (IIS) team. The tool works as an extension to the latest version of IIS, version 7.&lt;/p&gt;
&lt;p&gt;We&amp;rsquo;ve received some webmaster feedback with specific questions on how to set up IIS 7 before installing the IIS SEO Toolkit. So to help simplify and clarify this task for webmasters (as clearly folks are very much interested in using the tool!), we developed this quick &amp;ldquo;How-to&amp;rdquo; post with detailed instructions for setting up IIS 7. We hope this helps!&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Minimum installation requirements&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;The IIS SEO Toolkit will run on any version of Microsoft Windows capable of running IIS 7. This includes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Windows Vista with Service Pack 1 (SP1) or later (only in the Home Premium, Business, and Ultimate editions of Vista) &lt;/li&gt;
&lt;li&gt;Windows Server 2008 &lt;/li&gt;
&lt;li&gt;Windows 7 &lt;/li&gt;
&lt;li&gt;Windows Server 2008 R2&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The IIS SEO Toolkit, due to its reliance on IIS 7, is not supported on Windows XP, Windows Server 2003, or on any alternative operating system.&lt;/p&gt;
&lt;p&gt;Note: You must be a member of either the Administrator or Web Server Administrator groups to perform this setup procedure.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Enable IIS 7&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Before you can install the IIS SEO Toolkit, you first need to enable IIS 7 on your computer. In both Windows Vista and Windows 7, this will install the web server software but will not, by default, open port 80 for inbound network traffic. To set up IIS 7, follow these instructions:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;1. Click &lt;b&gt;Start&lt;/b&gt;. &lt;/p&gt;
&lt;p&gt;2. Click &lt;b&gt;Control Panel&lt;/b&gt;. &lt;/p&gt;
&lt;p&gt;3. Click &lt;b&gt;Programs&lt;/b&gt;. &lt;/p&gt;
&lt;p&gt;4. Select &lt;b&gt;Turn Windows features on or off&lt;/b&gt; (as shown in the figure below). If you receive a Windows Security warning, click &lt;b&gt;Allow&lt;/b&gt;. &lt;/p&gt;
&lt;p&gt;&lt;img src="http://www.bing.com/community/cfs-file.ashx/__key/CommunityServer.Components.UserFiles/00.00.19.19.87.Attached+Files/1803.1_2D00_ControlPanel.gif" border="0" /&gt; &lt;/p&gt;
&lt;p&gt;5. In the resulting &lt;b&gt;Windows Features&lt;/b&gt; dialog box, select the &lt;b&gt;Internet Information Services&lt;/b&gt; check box (as shown in the figure below).&lt;/p&gt;
&lt;p&gt;&lt;img src="http://www.bing.com/community/cfs-file.ashx/__key/CommunityServer.Components.UserFiles/00.00.19.19.87.Attached+Files/0654.2_2D00_IIScheckbox.gif" border="0" /&gt; &lt;/p&gt;
&lt;p&gt;6. Click &lt;b&gt;+&lt;/b&gt; to expand the view to see the following nodes: &lt;b&gt;World Wide Web Services&lt;/b&gt;, and then &lt;b&gt;Application Development Features&lt;/b&gt;. Select the &lt;b&gt;.NET Extensibility&lt;/b&gt; check box (as shown in the figure below). &lt;/p&gt;
&lt;p&gt;&lt;img src="http://www.bing.com/community/cfs-file.ashx/__key/CommunityServer.Components.UserFiles/00.00.19.19.87.Attached+Files/0257.3_2D00_NETextensibilityCheckbox.gif" border="0" /&gt; &lt;/p&gt;
&lt;p&gt;7. Click &lt;b&gt;OK&lt;/b&gt;. The activation of the newly installed components may take a few minutes to complete. &lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;For more information on installing IIS 7, see the article &lt;a href="http://technet.microsoft.com/en-us/library/cc732624(WS.10).aspx"&gt;Installing IIS 7&lt;/a&gt; on Microsoft TechNet.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Install the IIS SEO Toolkit&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Once IIS 7 is enabled, follow these instructions to install the IIS SEO Toolkit extension.&lt;/p&gt;
&lt;p&gt;You&amp;rsquo;ll need to download the version of the toolkit corresponding to the type of Windows you are running (either 32-bit or 64-bit). If you are not sure what type you have installed, you can check by doing the following:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Click &lt;b&gt;Start&lt;/b&gt;.&lt;/li&gt;
&lt;li&gt;Right-click &lt;b&gt;Computer&lt;/b&gt;, and then click &lt;b&gt;Properties&lt;/b&gt;.&lt;/li&gt;
&lt;li&gt;In the &lt;b&gt;System&lt;/b&gt; group, &lt;b&gt;System Type&lt;/b&gt; will indicate whether you are running a 32-bit or a 64-bit version of Windows.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;To install the IIS SEO Toolkit, click the link below that matches your installed version of Windows to download the program installation file. You can choose to run the downloaded program directly or save it locally and then start it. After starting the installation of the IIS SEO Toolkit, follow the instructions as prompted.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="http://go.microsoft.com/?linkid=9668966"&gt;IIS SEO Toolkit beta for 32-bit Windows&lt;/a&gt; &lt;/li&gt;
&lt;li&gt;&lt;a href="http://go.microsoft.com/?linkid=9668967"&gt;IIS SEO Toolkit beta for 64-bit Windows&lt;/a&gt; &lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For an introduction to and information on the features of the IIS SEO Toolkit, see the Webmaster Center blog post called &lt;a href="http://www.bing.com/community/blogs/webmaster/archive/2009/08/03/get-detailed-site-analysis-to-solve-problems-sem-101.aspx"&gt;Get detailed site analysis to solve problems&lt;/a&gt;. Thanks for taking the IIS SEO Toolkit for a test spin! We think you&amp;rsquo;ll be impressed!&lt;/p&gt;
&lt;p&gt;&lt;i&gt;--Alessandro Catorcini, Lead Program Manager, Bing API &amp;amp; Rick DeJarnette, Bing Webmaster Center&lt;/i&gt;&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://www.bing.com/community/aggbug.aspx?PostID=9545340" width="1" height="1"&gt;&lt;img src="http://feeds.feedburner.com/~r/msdn/webmaster/~4/pFEP6_PgU3A" height="1" width="1"/&gt;</description><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/SEO/default.aspx">SEO</category><feedburner:origLink>http://www.bing.com/community/blogs/webmaster/archive/2009/08/17/setting-up-iis-7-before-installing-iis-seo-toolkit.aspx</feedburner:origLink></item><item><title>Uncovering web-based treasure with Sitemaps (SEM 101)</title><link>http://feedproxy.google.com/~r/msdn/webmaster/~3/a3SIfC_7Oqw/uncovering-web-based-treasure-with-sitemaps-sem-101.aspx</link><pubDate>Fri, 14 Aug 2009 23:09:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9544984</guid><dc:creator>Webmaster Center team</dc:creator><slash:comments>33</slash:comments><wfw:commentRss>http://www.bing.com/community/blogs/webmaster/rsscomments.aspx?PostID=9544984</wfw:commentRss><comments>http://www.bing.com/community/blogs/webmaster/archive/2009/08/15/uncovering-web-based-treasure-with-sitemaps-sem-101.aspx#comments</comments><description>&lt;p&gt;Have you ever noticed how pirate treasure maps are like Sitemaps? While your website may not contain a treasure of gold and silver (unless it&amp;rsquo;s a metals commodities trading site!), if you have good content, that is certainly treasure to someone who is looking for it. Unfortunately, it&amp;rsquo;s buried on your website and no one knows what&amp;rsquo;s there except you! But since you want to share your site&amp;rsquo;s treasure with others, you need to let them know what you have buried and where to find it. 
You can wait for search engine crawlers (aka bots) and random traffic to come by to browse, but that will take time and even then, they might not discover everything that you have to offer. Instead, you can help the search bots to dig up your treasured content with a Sitemap.&lt;/p&gt;
&lt;p&gt;Now I should pause for a moment to mention that you shouldn&amp;rsquo;t confuse sitemaps with Sitemaps. You&amp;rsquo;ve got that, right? Well, just in case that&amp;rsquo;s as clear as mud, keep this in mind: when referring to sitemap files in text (such as this!), use the lower case word &amp;ldquo;sitemap&amp;rdquo; to mean HTML-based files intended for users to browse. They typically contain a list of all the pages on your site. &lt;/p&gt;
&lt;p&gt;On the other hand, use the capitalized word &amp;ldquo;Sitemap&amp;rdquo; to mean XML-based files designed for use by search engine bots to collect data from webmasters identifying the most important pages and directories within their sites for crawling and indexing. Both types of sitemap files can (and probably should) use all lower case letters in their file name (such as sitemap.xml and sitemap.htm), but capitalize the references to the XML-based one in text to help readers distinguish which type of sitemap you are discussing. This article, coming from the perspective of a search engine, is focusing on Sitemaps, not sitemaps. You&amp;rsquo;re with me now, right? :-)&lt;/p&gt;
&lt;p&gt;A good Sitemap tells search engine bots about the content stored on a site. That helps the content be seen by the bot and, with any luck (assuming the content is well formed and has value), get into the index. Users who are on a content treasure quest will query search engines with keywords to locate the content they are seeking. If the search engine indexed the content found by the bot, which can be more likely when a good Sitemap is present, then that site&amp;rsquo;s content has a better chance of appearing in the search engine results pages (SERPs). After all, you can&amp;rsquo;t get onto the SERPs if your pages aren&amp;rsquo;t indexed! &lt;/p&gt;
&lt;p&gt;&lt;b&gt;Structure&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;A Sitemap file, saved to the root directory of your site, contains references to specific URL locations for pages (or to other Sitemap files on very large sites), often describing the last modified date, the typical change frequency for a page, and the priority the specified page has compared to the other content on your site.&lt;/p&gt;
&lt;p&gt;A brief example of the contents of a Sitemap file looks like this:&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: Courier New; font-size: small;"&gt;&amp;lt;?xml version="1.0" encoding="UTF-8"?&amp;gt; &lt;br /&gt;&amp;lt;urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"&amp;gt; &lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;url&amp;gt; &lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;loc&amp;gt;http://www.mysite.com/default.htm&amp;lt;/loc&amp;gt; &lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;lastmod&amp;gt;2009-03-01&amp;lt;/lastmod&amp;gt; &lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;changefreq&amp;gt;monthly&amp;lt;/changefreq&amp;gt; &lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;priority&amp;gt;0.8&amp;lt;/priority&amp;gt; &lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;/url&amp;gt; &lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;url&amp;gt; &lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;loc&amp;gt;http://www.mysite.com/contacts.htm&amp;lt;/loc&amp;gt; &lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;changefreq&amp;gt;yearly&amp;lt;/changefreq&amp;gt; &lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;priority&amp;gt;0.4&amp;lt;/priority&amp;gt; &lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;/url&amp;gt; &lt;br /&gt;&amp;lt;/urlset&amp;gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;The &amp;lt;urlset&amp;gt; tag is required, and its xmlns attribute identifies the version of the Sitemap protocol that the file follows. The &amp;lt;url&amp;gt; and &amp;lt;loc&amp;gt; tags are the minimum required for each page entry. The other tags, &amp;lt;lastmod&amp;gt;, &amp;lt;changefreq&amp;gt;, and &amp;lt;priority&amp;gt;, are optional. To see the formatting and allowed values for these optional tags, sail on over to &lt;a href="http://sitemaps.org/protocol.php"&gt;Sitemaps XML format&lt;/a&gt; for reference information. Note that not every page on your site needs to be listed in the Sitemap, only the ones containing valuable content for the user.&lt;/p&gt;
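&lt;p&gt;If you would rather generate the file than type it by hand, the structure described above can be sketched in a few lines of Python using only the standard library. This is a minimal sketch, not an official tool: the function name build_sitemap and the list-of-dictionaries input are assumptions made for illustration, and the page URLs are the same mysite.com examples used above.&lt;/p&gt;
&lt;pre&gt;
```python
import xml.etree.ElementTree as ET

# Namespace taken from the urlset tag in the example above.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
OPTIONAL_TAGS = ("lastmod", "changefreq", "priority")


def build_sitemap(pages):
    """Return Sitemap XML (as bytes) for a list of dicts with a 'loc' key.

    Hypothetical helper for illustration; not part of any Bing tool.
    """
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for page in pages:
        url = ET.SubElement(urlset, "url")
        # loc is the only required child of each url entry.
        ET.SubElement(url, "loc").text = page["loc"]
        # Optional tags are written only when the caller supplies them.
        for tag in OPTIONAL_TAGS:
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    # xml_declaration requires Python 3.8 or later.
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


pages = [
    {"loc": "http://www.mysite.com/default.htm", "lastmod": "2009-03-01",
     "changefreq": "monthly", "priority": 0.8},
    {"loc": "http://www.mysite.com/contacts.htm",
     "changefreq": "yearly", "priority": 0.4},
]
print(build_sitemap(pages).decode("utf-8"))
```
&lt;/pre&gt;
&lt;p&gt;Building the file through an XML library rather than by string concatenation means that special characters in a URL, such as an ampersand, are escaped for you.&lt;/p&gt;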
&lt;p&gt;&lt;b&gt;File formats&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Bing supports Sitemap files submitted as XML and gzip files, but not as HTM or HTML files (those would be sitemaps as opposed to Sitemaps, right? Besides, the XML content of a well-formed Sitemap file wouldn&amp;rsquo;t render correctly in browsers as an HTM file, anyway). If you&amp;rsquo;ve created a browsable HTML-based sitemap for your end users, they will thank you for the effort, but you can&amp;rsquo;t recycle it as a Sitemap. You&amp;rsquo;ll still need to create a separate, XML-based Sitemap file using the tag structure as noted above for submission to Bing.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Size matters not so much anymore&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;A typical treasure map only has one X to mark the spot. However, your Sitemap can list multiple locations identifying the treasures on your site. Common wisdom in the search engine optimization (SEO) community used to hold that there can be too much of a good thing: from a search engine perspective, the most effective size for a Sitemap was approximately 150 or fewer URLs. Anything more bountiful and the crawler might not take it all in. Well, not so fast, matey!&lt;/p&gt;
&lt;p&gt;Per the post &lt;a href="http://www.bing.com/community/blogs/webmaster/archive/2009/06/12/bing-enhances-support-for-large-sitemaps.aspx"&gt;Bing enhances support for large Sitemaps&lt;/a&gt; made in this blog just a few weeks ago, Bing now supports Sitemap files that contain up to 50,000 references (to either URLs or links to other, child Sitemap files). This development is a boon for webmasters of very large sites. They can now create multiple child Sitemap files, each dedicated to mapping specific areas of their content organization, and store those child Sitemap files in the base directories of those content areas. Then they can link to the child Sitemaps via their primary (aka index) Sitemap file (the one stored in the root directory of a site). One index Sitemap linking to 50,000 child Sitemaps, each of those referencing up to 50,000 URLs, means they can reference up to 2.5 billion URLs through the Sitemap technology, and the Bing crawler, MSNBot, will read it all. Now that&amp;rsquo;s a lot of treasure!&lt;/p&gt;
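&lt;p&gt;To make the arithmetic concrete, here is a rough Python sketch of splitting a long URL list into child Sitemaps and building the index file that links to them. The helper names, the page URLs, and the child Sitemap locations are hypothetical. The sitemapindex and sitemap element names come from the sitemaps.org protocol rather than from this post.&lt;/p&gt;
&lt;pre&gt;
```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_REFS = 50000  # per-file limit quoted in the post


def split_into_children(urls, limit=MAX_REFS):
    """Split a URL list into chunks that each fit in one child Sitemap."""
    return [urls[i:i + limit] for i in range(0, len(urls), limit)]


def build_index(child_sitemap_urls):
    """Return index Sitemap XML (bytes) linking to each child Sitemap.

    The caller is expected to pass no more than MAX_REFS child URLs.
    """
    index = ET.Element("sitemapindex", {"xmlns": SITEMAP_NS})
    for child_url in child_sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = child_url
    return ET.tostring(index, encoding="UTF-8", xml_declaration=True)


# 120,000 made-up page URLs need three child Sitemaps.
urls = ["http://www.mysite.com/page%d.htm" % n for n in range(120000)]
children = split_into_children(urls)
print([len(chunk) for chunk in children])  # [50000, 50000, 20000]

# One index of 50,000 children, each listing 50,000 URLs.
print(MAX_REFS * MAX_REFS)  # 2500000000
```
&lt;/pre&gt;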
&lt;p&gt;&lt;b&gt;Sitemap submission&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;There are multiple ways for webmasters to submit their Sitemaps to Bing:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;b&gt;Ping service.&lt;/b&gt; Using your browser&amp;rsquo;s address bar, you can directly submit your Sitemap to Bing. Type &lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: Courier New;"&gt;http://www.bing.com/webmaster/ping.aspx?sitemap=&lt;i&gt;www.YourURL.com/sitemap.xml&lt;/i&gt;&lt;/span&gt; &lt;br /&gt;&lt;br /&gt;substituting the full URL to your Sitemap file in place of the YourURL.com example. &lt;/li&gt;
&lt;li&gt;&lt;b&gt;Webmaster Center tools.&lt;/b&gt; You can sign in to Bing&amp;rsquo;s &lt;a href="http://www.bing.com/webmaster"&gt;Webmaster Center tools&lt;/a&gt; and use the &lt;b&gt;Sitemaps&lt;/b&gt; tool (if you are not already registered to use these free tools, this is a good reason to sign up and see all the other tools available to help you analyze and optimize your site). Simply copy the URL of your Sitemap into the &lt;b&gt;Direct sitemap submission&lt;/b&gt; text box, and then click &lt;b&gt;Submit&lt;/b&gt;. &lt;/li&gt;
&lt;li&gt;&lt;b&gt;Robots.txt file reference.&lt;/b&gt; If you are using a &lt;a href="http://www.robotstxt.org/"&gt;robots.txt&lt;/a&gt; file to tell search engine bots which files and directories not to crawl (and thus to keep out of their indexes), you can add a line to that file, typically at the end, that reads as follows: &lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: Courier New;"&gt;Sitemap: &lt;i&gt;http://www.YourURL.com/sitemap.xml&lt;/i&gt;&lt;/span&gt; &lt;br /&gt;&lt;br /&gt;substituting the full URL to your Sitemap file in place of the YourURL.com example. &lt;/li&gt;
&lt;/ul&gt;
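&lt;p&gt;If you manage several sites, the ping address and the robots.txt line from the list above can be assembled with a short script. The following Python sketch only builds the strings and sends nothing over the network. The function names are hypothetical, and percent-encoding the Sitemap URL is a conservative assumption: the ping example above shows the value unencoded.&lt;/p&gt;
&lt;pre&gt;
```python
from urllib.parse import quote

# Ping address as given in the list above.
PING_ENDPOINT = "http://www.bing.com/webmaster/ping.aspx"


def ping_url(sitemap_url):
    """Build the ping address for one Sitemap URL."""
    # Percent-encode the Sitemap URL so it is safe inside a query string
    # (an assumption; the post's example leaves the value unencoded).
    return PING_ENDPOINT + "?sitemap=" + quote(sitemap_url, safe="")


def robots_line(sitemap_url):
    """Build the line to append to robots.txt."""
    return "Sitemap: " + sitemap_url


sitemap = "http://www.YourURL.com/sitemap.xml"
print(ping_url(sitemap))
# http://www.bing.com/webmaster/ping.aspx?sitemap=http%3A%2F%2Fwww.YourURL.com%2Fsitemap.xml
print(robots_line(sitemap))
# Sitemap: http://www.YourURL.com/sitemap.xml
```
&lt;/pre&gt;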
&lt;p&gt;&lt;b&gt;Validation&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Before submitting your Sitemap to Bing, we recommend that you run the XML code you&amp;rsquo;ve written through a Sitemap validation tool. After all, what good is a treasure map if it has errors in it? Do a &lt;a href="http://search.live.com/results.aspx?q=sitemap+validate&amp;amp;mkt=en-us"&gt;search for your Sitemap validator of choice&lt;/a&gt; and follow the instructions on the page. If errors are found, correct them before you submit the Sitemap file to Bing.&lt;/p&gt;
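&lt;p&gt;As a quick first pass before you reach for a full validator, a few lines of Python can catch the most basic problems. This is only a sketch under stated assumptions: the function name check_sitemap is hypothetical, and it tests just three things (the XML is well formed, the root element is urlset in the Sitemap namespace, and every url entry has a non-empty loc). It does not replace validation against the full protocol.&lt;/p&gt;
&lt;pre&gt;
```python
import xml.etree.ElementTree as ET

# ElementTree writes namespaced tags as {namespace}name.
NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def check_sitemap(xml_text):
    """Return a list of problems found; an empty list means none were found."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return ["not well-formed XML: %s" % err]

    problems = []
    if root.tag != NS + "urlset":
        problems.append("root element is not urlset in the Sitemap namespace")
    for n, url in enumerate(root.findall(NS + "url"), start=1):
        loc = url.find(NS + "loc")
        # loc is the one required child of every url entry.
        if loc is None or not (loc.text or "").strip():
            problems.append("url entry %d has no loc" % n)
    return problems
```
&lt;/pre&gt;
&lt;p&gt;To use it, read your Sitemap file in binary mode and pass the contents to check_sitemap. An empty list means this basic check found nothing to fix.&lt;/p&gt;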
&lt;p&gt;Once you submit your Sitemap to Bing, we will read its contents, which will help us uncover more of the content treasures buried on your site and evaluate them as potential new additions to our index. And with more of your site&amp;rsquo;s content in the index, instead of users sailing on past your site to other ports of call in their quest for content treasure, they may stop at yours and exclaim, &amp;ldquo;Shiver me timbers, matey, look at what we have here!&amp;rdquo;&lt;/p&gt;
&lt;p&gt;If you have any questions, comments, or suggestions, feel free to post them in our &lt;a href="http://www.bing.com/community/forums/12256.aspx"&gt;SEM forum&lt;/a&gt;. Until next time&amp;hellip;&lt;/p&gt;
&lt;p&gt;&lt;i&gt;-- Rick DeJarnette, Bing Webmaster Center&lt;/i&gt;&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://www.bing.com/community/aggbug.aspx?PostID=9544984" width="1" height="1"&gt;&lt;img src="http://feeds.feedburner.com/~r/msdn/webmaster/~4/a3SIfC_7Oqw" height="1" width="1"/&gt;</description><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/SEO/default.aspx">SEO</category><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/SEO+101/default.aspx">SEO 101</category><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/sitemaps/default.aspx">sitemaps</category><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/SEM/default.aspx">SEM</category><category domain="http://www.bing.com/community/blogs/webmaster/archive/tags/SEM+101/default.aspx">SEM 101</category><feedburner:origLink>http://www.bing.com/community/blogs/webmaster/archive/2009/08/15/uncovering-web-based-treasure-with-sitemaps-sem-101.aspx</feedburner:origLink></item></channel></rss>
