<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">

<channel>
	<title>things of sorts</title>
	
	<link>http://ekstreme.com/thingsofsorts</link>
	<description>SEO, PHP, HTML, AJAX, JS, and life</description>
	<lastBuildDate>Thu, 29 Dec 2011 14:41:52 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3</generator>
		<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/thingsofsorts" /><feedburner:info uri="thingsofsorts" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><feedburner:browserFriendly></feedburner:browserFriendly><item>
		<title>I’m Joining Google</title>
		<link>http://ekstreme.com/thingsofsorts/ekstremecom/joining-google</link>
		<comments>http://ekstreme.com/thingsofsorts/ekstremecom/joining-google#comments</comments>
		<pubDate>Sun, 09 Jan 2011 18:00:04 +0000</pubDate>
		<dc:creator>Pierre</dc:creator>
				<category><![CDATA[eKstreme.com]]></category>

		<guid isPermaLink="false">http://ekstreme.com/thingsofsorts/?p=290</guid>
		<description><![CDATA[Tomorrow morning I will start working at Google&#8217;s London office as a Webmaster Trends Analyst. SEOs will immediately know what that role is, as I will be John Mueller&#8217;s teammate. For those who don&#8217;t know, it&#8217;s the team within Google that interacts with website owners and manages Webmaster Central, among other things. What does [...]]]></description>
			<content:encoded><![CDATA[<p>Tomorrow morning I will start working at Google&#8217;s London office as a Webmaster Trends Analyst. SEOs will immediately know what that role is, as I will be John Mueller&#8217;s teammate. For those who don&#8217;t know, it&#8217;s the team within Google that interacts with website owners and manages <a href="http://www.google.com/webmasters/">Webmaster Central</a>, among other things.</p>
<p>What does this mean for my websites? A few things:</p>
<ul>
<li>I will not be building any new tools. All my tools that already exist here and elsewhere are going to remain as they are and will not be updated.</li>
<li><strong>Very important corollary</strong>: The tools here on eKstreme.com and elsewhere, especially the SEO tools, were built before I joined Google and thus are not in any way official Google tools and must not be taken as endorsed by Google. So please, no one try this one. OK?</li>
<li>I&#8217;ve been running some SEO experiments here on eKstreme.com and elsewhere. Although I&#8217;ve stopped them, their effects might still be hiding in a cache somewhere or whatnot. If you see anything fishy/funny, it is NOT an official Google recommendation of how to do things. Heck, as an SEO I&#8217;ve pushed the boundaries a bit, so if you try anything you think I&#8217;ve tried or am trying, it&#8217;s at your own risk.</li>
<li>My <a href="http://www.ocwsearch.com/">OpenCourseWare search engine (OCW Search)</a> is now officially part of the <a href="http://www.ocwconsortium.org/">OpenCourseWare Consortium</a>. As I will not be able to work on it from now on, I&#8217;ve decided to donate it to the Consortium instead of shutting it down. Its dream lives on and it is now in much more capable hands than mine, hands that even have time to work on it!</li>
</ul>
<p>Some Yes questions: So will I start blogging more? Hopefully. Will I be at conferences? Yes, though I don&#8217;t know when. I&#8217;ll post here and hopefully get to meet more of the people I&#8217;ve been talking to over the years.</p>
<p>Some No questions: Will I tell you SEO secrets if you ask me? Of course not! Can you hire me? No! I&#8217;m done with the freelance work. Can we exchange links? Nope.</p>
<p>A little disclaimer: This website is still my own and anything written on it is my own personal view/opinion and not my employer&#8217;s. This is not an official Google blog.</p>
<p>Finally, you should <a href="http://www.twitter.com/pierrefar">follow me on twitter</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://ekstreme.com/thingsofsorts/ekstremecom/joining-google/feed</wfw:commentRss>
		<slash:comments>14</slash:comments>
		</item>
		<item>
		<title>Awesome new free SEO tool: Blekko</title>
		<link>http://ekstreme.com/thingsofsorts/seosem/awesome-new-free-seo-tool-blekko</link>
		<comments>http://ekstreme.com/thingsofsorts/seosem/awesome-new-free-seo-tool-blekko#comments</comments>
		<pubDate>Mon, 01 Nov 2010 09:18:39 +0000</pubDate>
		<dc:creator>Pierre</dc:creator>
				<category><![CDATA[SEO/SEM]]></category>

		<guid isPermaLink="false">http://ekstreme.com/thingsofsorts/?p=286</guid>
		<description><![CDATA[As of today, 1 November 2010, a new search engine is open to the public: blekko. Regardless of its ambitions (to be better than Google and topple them), it has a very useful treasure for internet marketers: a thorough SEO analysis of any URL or domain they have indexed. Before I go through some [...]]]></description>
			<content:encoded><![CDATA[<p>As of today, 1 November 2010, a new search engine is open to the public: <a href="http://blekko.com/">blekko</a>. Regardless of its ambitions (to be better than Google and topple them), it has a very useful treasure for internet marketers: a thorough SEO analysis of any URL or domain they have indexed.</p>
<p>Before I go through some of the data they share, note that they have a <a href="http://blekko.com/toolbar">toolbar</a> that allows you to &quot;view SEO data in real time&quot;. And because they allow you to mark pages as spam to kill them completely from your search results, the toolbar has a button to do that too. Interesting way to crowdsource spam control.</p>
<p>Back to SEO data: To see it, you simply search for the domain name you&#8217;re interested in followed by /domain. For example, for ekstreme.com: <a href="http://blekko.com/ws/ekstreme.com+/domain">ekstreme.com /domain</a>. That first page is a gold mine in its own right, as it tells you how many inbound links you have and from how many domains. And for each domain it lists, you can dig in a bit deeper to understand that domain. This is actually the first of many tabs shown. Clicking through them, you&#8217;ll see they display a few graphs (sadly, the ever useless pie charts), give crawl stats, and report duplicate content.</p>
<p>The crawl stats hint at how fresh or stale their data is, and from testing a few domains I own and some of my SEO clients&#8217; domains, it&#8217;s clear that freshness is a big issue: it&#8217;s very hit and miss, and for newer and/or smaller sites they are lagging significantly. For example, when analyzing <a href="http://www.ocwsearch.com/">OCW Search</a>, the numbers indicate they have crawled 8 pages (there are more <img src='http://ekstreme.com/thingsofsorts/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' />  ) and the average page length is 0 KB, which is clearly wrong. They also tell you the reverse IP address, and for OCW Search it&#8217;s Amazon, which is the previous host that I moved away from a couple of months ago. So take this data with a grain of salt!</p>
<p>The last report I want to highlight is the comparison. By default, when you click on the compare tab, you will be comparing the www domain with the non-www domain. Again, for ekstreme.com: <a href="http://blekko.com/ws/ekstreme.com+/compare">ekstreme.com /compare</a>. This tells you who is linking to each domain, which is a good rough estimate of the problem size if you are dealing with canonical URL issues. BUT, the real kicker here is that you can compare different domains, for example <a href="http://blekko.com/ws/ekstreme.com+/compare?c=www.ocwsearch.com">ekstreme.com and www.ocwsearch.com</a> (screenshot below). This is a gold mine for market analysis.</p>
<div><img src="/images/blekko.jpg" alt="blekko search domain comparison" width="471" height="477" /></div>
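<p>For anyone who wants to script these lookups, here is a minimal Python sketch that builds the two report URLs. It assumes the URL patterns seen in the links above stay valid; the function names are hypothetical, and nothing here is an official blekko API.</p>

```python
from urllib.parse import quote

# Base path copied from the report links in this post.
BLEKKO_BASE = "http://blekko.com/ws/"

def domain_report_url(domain):
    """URL of the /domain report for one domain."""
    return BLEKKO_BASE + domain + "+/domain"

def compare_report_url(domain, other=None):
    """URL of the /compare report; 'other' is an optional second domain."""
    url = BLEKKO_BASE + domain + "+/compare"
    if other is not None:
        # Percent-encode the second domain before using it as the 'c' parameter.
        url += "?c=" + quote(other, safe="")
    return url

print(domain_report_url("ekstreme.com"))
print(compare_report_url("ekstreme.com", "www.ocwsearch.com"))
```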
<p>So all in all an excellent free tool, but one that suffers from stale data &#8211; the problem for all SEO tools. My recommendation is the same as for any other SEO tool: use it while understanding the data&#8217;s limitations.</p>
<p>Finally, I&#8217;d like to note that this is a genius marketing strategy on blekko&#8217;s part: getting SEOs to talk about them and use them is a great way to get early traction. We search geeks are the very leading edge of early adopters, so well done on spotting this opportunity.</p>
]]></content:encoded>
			<wfw:commentRss>http://ekstreme.com/thingsofsorts/seosem/awesome-new-free-seo-tool-blekko/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Remember me?</title>
		<link>http://ekstreme.com/thingsofsorts/ekstremecom/remember-me</link>
		<comments>http://ekstreme.com/thingsofsorts/ekstremecom/remember-me#comments</comments>
		<pubDate>Wed, 22 Sep 2010 15:50:44 +0000</pubDate>
		<dc:creator>Pierre</dc:creator>
				<category><![CDATA[eKstreme.com]]></category>

		<guid isPermaLink="false">http://ekstreme.com/thingsofsorts/?p=282</guid>
		<description><![CDATA[Hello? tap, tap Is this thing on? The last blog post was from way back in January. Things have moved on since then but I just haven&#8217;t had time to blog about them. So here goes: I&#8217;m now a freelance web developer and SEO. Want to hire me? Look at my funnily-named SEO company website. [...]]]></description>
			<content:encoded><![CDATA[<p>Hello? <em>tap, tap</em> Is this thing on?</p>
<p>The last blog post was from way back in January. Things have moved on since then but I just haven&#8217;t had time to blog about them. So here goes:</p>
<ul>
<li>I&#8217;m now a freelance web developer and SEO. Want to hire me? Look at my funnily-named <a href="http://www.alphaveloptica.com/">SEO company website</a>.</li>
<li>I built, launched, and got sucked into the resulting whirlwind of OCW Search, a <a href="http://www.ocwsearch.com/">free online courses search engine</a>. OCW Search helps you find free downloadable courses from universities like MIT, Notre Dame, The Open University in the UK, and many more. Yes, I&#8217;m an SEO and I operate a search engine now. Let me tell you it&#8217;s an awesome experience.</li>
<li>I&#8217;m expanding OCW Search into a full-featured <a href="http://www.wisdombee.com/">online education startup</a>. Sign up to the mailing list to get access before everyone else <img src='http://ekstreme.com/thingsofsorts/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  You know you want to!</li>
<li>In June 2010, I gave a talk at MongoUK about the <a href="http://skillsmatter.com/podcast/cloud-grid/mongodb-full-text-search-with-sphinx">technology behind OCW Search</a>.</li>
</ul>
<p>And if you&#8217;re reading this, you&#8217;ll notice that eKstreme.com now looks different. Well, I thought the previous design was getting old, and since I was moving the site to a new server (and a new PHP framework I built), I thought this was a good chance to give it a fresh look.</p>
<p>Speaking of which, I retired a LOT of the tools and code previously available on eKstreme.com. Why? I just don&#8217;t have time to provide support and to keep updating them as APIs change and bugs are identified. I have a different focus now and it&#8217;s not fair to users or to me to keep semi-functional code released without support.</p>
]]></content:encoded>
			<wfw:commentRss>http://ekstreme.com/thingsofsorts/ekstremecom/remember-me/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Speaking at SES London 2010</title>
		<link>http://ekstreme.com/thingsofsorts/seosem/speaking-at-ses-london-2010</link>
		<comments>http://ekstreme.com/thingsofsorts/seosem/speaking-at-ses-london-2010#comments</comments>
		<pubDate>Sun, 24 Jan 2010 14:15:17 +0000</pubDate>
		<dc:creator>Pierre</dc:creator>
				<category><![CDATA[SEO/SEM]]></category>

		<guid isPermaLink="false">http://ekstreme.com/thingsofsorts/?p=252</guid>
		<description><![CDATA[I&#8217;m very happy to confirm this: I will be speaking at the Automating Twitter session at Search Engine Strategies 2010 in London on 18 February 2010. My talk will be about analytics for social media marketing. Whenever you launch a marketing campaign, you need to measure it in detail to understand its performance. I [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m very happy to confirm this: I will be speaking at the <a href="http://www.searchenginestrategies.com/london/agenda-day3.php#automating-twitter">Automating Twitter</a> session at Search Engine Strategies 2010 in London on 18 February 2010.</p>
<p style="float:right; width:280px; padding:0px;"><a href="http://www.searchenginestrategies.com/london/agenda-day3.php#automating-twitter"><img src="http://www.searchenginestrategies.com/_imgs/ses10_logo.gif" alt="SES 2010 Logo" width="260" height="90" /></a></p>
<p>My talk will be about analytics for social media marketing. Whenever you launch a marketing campaign, you need to measure it in detail to understand its performance. I will talk about some of the important actionable metrics you need to track, with code examples of how to track them. Other key topics covered are filtering, automation, and reporting, all of which feed into experimentation to find the most effective marketing messages.</p>
]]></content:encoded>
			<wfw:commentRss>http://ekstreme.com/thingsofsorts/seosem/speaking-at-ses-london-2010/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Google Alerts Now Spell Checks the Queries</title>
		<link>http://ekstreme.com/thingsofsorts/fun-web/google-alerts-now-spell-checks-the-queries</link>
		<comments>http://ekstreme.com/thingsofsorts/fun-web/google-alerts-now-spell-checks-the-queries#comments</comments>
		<pubDate>Wed, 12 Nov 2008 00:14:41 +0000</pubDate>
		<dc:creator>Pierre</dc:creator>
				<category><![CDATA[Fun Web]]></category>

		<guid isPermaLink="false">http://ekstreme.com/thingsofsorts/fun-web/google-alerts-now-spell-checks-the-queries</guid>
		<description><![CDATA[Lately I&#8217;ve been noticing a lot of weird hits coming in via my Google Alerts emails. I&#8217;ve dug into it and I think I&#8217;ve figured out what&#8217;s going on: Google Alerts is spell checking the queries and matching the spell-checked queries as it would in a search. This is in addition to matching the Alert queries [...]]]></description>
			<content:encoded><![CDATA[<p>Lately I&#8217;ve been noticing a lot of weird hits coming in via my Google Alerts emails. I&#8217;ve dug into it and I think I&#8217;ve figured out what&#8217;s going on: Google Alerts is spell checking the queries and matching the spell-checked queries as it would in a search. This is in addition to matching the Alert queries exactly, as it did previously. This new behavior kicked in about a week or 10 days ago.</p>
<p>For example: I keep an alert for [blogsci] because I have a website at blogsci.com. Up till recently, I used to get alerts only when the word &quot;blogsci&quot; was matched in a page. Now, I&#8217;m getting Alerts for pages that never mention the word &quot;blogsci&quot; but do match the spell-checked &quot;blog sci&quot;. So I get matches for &quot;&#8230;blog: sci-fi&#8230;&quot;. See what happened there?</p>
<p>Another example: I run a website with a domain name of XY.com where X is a word and Y is another word. My Alert is set to match it exactly as [XY]. This was going well until recently when I started getting alerts that match [X Y].</p>
<p>Another example: I have an alert for [cli.gs], my latest web app. I get a lot of spurious alerts for this because it matches [cli gs] which is a very popular combination apparently.</p>
<p>Anyone else seeing this weirdness? Any other interpretations? Thoughts in the comments please!</p>
]]></content:encoded>
			<wfw:commentRss>http://ekstreme.com/thingsofsorts/fun-web/google-alerts-now-spell-checks-the-queries/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Hey YouTube: UK = GB, and both are English</title>
		<link>http://ekstreme.com/thingsofsorts/fun-web/hey-youtube-uk-gb-and-both-are-english</link>
		<comments>http://ekstreme.com/thingsofsorts/fun-web/hey-youtube-uk-gb-and-both-are-english#comments</comments>
		<pubDate>Tue, 28 Oct 2008 20:44:40 +0000</pubDate>
		<dc:creator>Pierre</dc:creator>
				<category><![CDATA[Fun Web]]></category>

		<guid isPermaLink="false">http://ekstreme.com/thingsofsorts/fun-web/hey-youtube-uk-gb-and-both-are-english</guid>
		<description><![CDATA[Sometimes I see help messages that just leave me speechless. This message from YouTube about my automatically-set language preferences goes above and beyond anything I&#8217;ve seen in a long time because it has two big &#34;WTF moments&#34;: The problems? The red circles: The suggestion that English (UK) is different from English (GB). Psst. They&#8217;re the [...]]]></description>
			<content:encoded><![CDATA[<p>Sometimes I see help messages that just leave me speechless. This message from YouTube about my automatically-set language preferences goes above and beyond anything I&#8217;ve seen in a long time because it has two big &quot;WTF moments&quot;:</p>
<p>The problems?</p>
<ul>
<li>The red circles: The suggestion that English (UK) is different from English (GB). Psst. They&#8217;re the same thing. It&#8217;s an <a href="http://en.wikipedia.org/wiki/ISO_3166-1_alpha-2#Exceptional_reservations">exceptional reservation</a> in the ISO standard.</li>
<li>The black circle: The whole message is apparently not in English because the link at the bottom right corner gives me the option to view it in English. When I click it, I get the same message, but instead of suggesting English (UK), it suggests just plain old English. And oh, it gives me the option to change my language to the real English of English (US).</li>
</ul>
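<p>To make the UK/GB point concrete, here is a small Python sketch that treats the exceptionally reserved code UK as an alias of the official ISO 3166-1 alpha-2 code GB when comparing locale tags. The alias table and function names are illustrative assumptions, not anything YouTube is known to run.</p>

```python
# Assumption for illustration: map the reserved code "UK" onto the official "GB".
REGION_ALIASES = {"UK": "GB"}

def normalize_locale(tag):
    """Split a tag such as 'en-UK' or 'en_gb' into (language, region)."""
    parts = tag.replace("_", "-").split("-")
    language = parts[0].lower()
    region = parts[1].upper() if len(parts) > 1 else None
    # Replace alias regions with the official code; other values pass through.
    region = REGION_ALIASES.get(region, region)
    return language, region

def same_locale(a, b):
    """True when two tags name the same language and region after normalizing."""
    return normalize_locale(a) == normalize_locale(b)

print(same_locale("en-UK", "en-GB"))
print(same_locale("en-GB", "en-US"))
```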
<p>Hey, I have news for you YouTube: English, English (UK), English (GB) and English (US) are all freakin&#8217; English.</p>
]]></content:encoded>
			<wfw:commentRss>http://ekstreme.com/thingsofsorts/fun-web/hey-youtube-uk-gb-and-both-are-english/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Yahoo! Search Doing a SERPs Usability Survey</title>
		<link>http://ekstreme.com/thingsofsorts/fun-web/yahoo-search-doing-a-serps-usability-survey</link>
		<comments>http://ekstreme.com/thingsofsorts/fun-web/yahoo-search-doing-a-serps-usability-survey#comments</comments>
		<pubDate>Thu, 23 Oct 2008 09:12:20 +0000</pubDate>
		<dc:creator>Pierre</dc:creator>
				<category><![CDATA[Fun Web]]></category>

		<guid isPermaLink="false">http://ekstreme.com/thingsofsorts/fun-web/yahoo-search-doing-a-serps-usability-survey</guid>
		<description><![CDATA[I was just searching with Yahoo! and I saw a survey request from &#34;Yahoo! Surveys&#34;. It was a big purple box to the immediate right of the results list, and it was anchored to the bottom of the screen (so even if I scrolled down, it went down too). I clicked on it before I [...]]]></description>
			<content:encoded><![CDATA[<p>I was just searching with Yahoo! and I saw a survey request from &quot;Yahoo! Surveys&quot;. It was a big purple box to the immediate right of the results list, and it was anchored to the bottom of the screen (so even if I scrolled down, it went down too). I clicked on it before I realized I should have taken a screenshot, but I did take a screenshot of the single question in the survey. The question opens in a new window:</p>
<p><a href="http://i422.photobucket.com/albums/pp307/pierrefar/yahoo-survey.png"><img src="http://i422.photobucket.com/albums/pp307/pierrefar/yahoo-survey-2.png" border="0" alt="Photobucket"></a></p>
<p>Click for full size, and no, I&#8217;m not going to tell you what my answer was :p</p>
]]></content:encoded>
			<wfw:commentRss>http://ekstreme.com/thingsofsorts/fun-web/yahoo-search-doing-a-serps-usability-survey/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The SEOmoz Linkscape Ghost</title>
		<link>http://ekstreme.com/thingsofsorts/fun-web/the-seomoz-linkscape-ghost</link>
		<comments>http://ekstreme.com/thingsofsorts/fun-web/the-seomoz-linkscape-ghost#comments</comments>
		<pubDate>Wed, 08 Oct 2008 01:00:36 +0000</pubDate>
		<dc:creator>Pierre</dc:creator>
				<category><![CDATA[Fun Web]]></category>

		<guid isPermaLink="false">http://ekstreme.com/thingsofsorts/fun-web/the-seomoz-linkscape-ghost</guid>
		<description><![CDATA[If you&#8217;re part of the SEO industry, unless you&#8217;ve been living under a rock for the past couple of days, you will know that SEOmoz launched a new tool called Linkscape, to much fanfare. First things first, congrats and kudos are due to the SEOmoz team for building such a complex beast. It&#8217;s not easy [...]]]></description>
			<content:encoded><![CDATA[<p>If you&#8217;re part of the SEO industry, unless you&#8217;ve been living under a rock for the past couple of days, you will know that SEOmoz launched a new tool called Linkscape, to <a href="http://sphinn.com/story/77000">much fanfare</a>. First things first, congrats and kudos are due to the SEOmoz team for building such a complex beast. It&#8217;s not easy, at the very least on the technical level.</p>
<p>But there is a problem: SEOmoz has not disclosed the user agent (UA) of its crawler. Here I will talk about why this is a bad thing, and also go out on a limb and say: there is no SEOmoz crawler, at least not in the traditional sense. For the latter, I will offer a viable technical alternative, which may or may not be correct, but the fact that the alternative exists gives a sensible explanation as to why SEOmoz is not offering a straight answer to the UA question.</p>
<h2>Why Disclosing the UA is Essential</h2>
<p>Let&#8217;s not mince words: we as an SEO community like a little mud fight once in a while. We debate and discuss and, yes, fight. But one thing we all know is how to recognize malicious activity and differentiate it from merely aggressive activity.</p>
<p>Example: a bot scraping our content for an MFA site is a tolerated nuisance. We take steps to negate the effects of scrapers but at the end of the day we don&#8217;t fight them hard. On the other hand, a bot probing for security holes is treated like a witch in 1209AD.</p>
<p>Which is why Linkscape&#8217;s lack of disclosure hurts: We as a community work hard at identifying bots. SEOmoz is supposed to be a good citizen of the SEO world, and yet the lack of transparency goes against the spirit and the image of SEOmoz. On the one hand we have a company with a strong community doing good deeds (SEO trademark fight anyone?) and yet it behaves in a way we expect out of the shady side of the net we deal with every day.</p>
<p>Not just that: the data collected from us, about us, will be used against us. It&#8217;s called competitive intelligence.</p>
<p>And not just that: SEOmoz is using the data to make money. The free version is pathetic and the Pro version needs a monthly subscription.</p>
<p>To me, this kind of behavior (stealth, harmful, and to make money) puts Linkscape squarely in the naughty corner. I certainly didn&#8217;t expect this out of SEOmoz. Tough luck Rand and co: you have a great brand and I for one expect better!</p>
<p>But I won&#8217;t ask for a UA because I think there isn&#8217;t one.</p>
<h2>How To Build Linkscape</h2>
<p>It&#8217;s actually quite easy on a conceptual level. However, just like cooking, having a recipe doesn&#8217;t make you a great chef &#8211; there are lots of details that SEOmoz must have tackled successfully to build Linkscape. I am not trying to belittle their achievement, and all I can show you is one recipe. This recipe is completely my guess and could very well be wrong. I have not talked to anyone at SEOmoz.</p>
<p>So come on Pierre, what is it? The answer is the <a href="http://developer.yahoo.com/">Yahoo! Search API</a>. It&#8217;s an API giving programmers complete access to the Yahoo! index without crawling a single page. For example, the following URL:</p>
<div class="code">http://search.yahooapis.com/WebSearchService/V1/webSearch?appid=YahooDemo&#038;query=site%3Aseomoz.org%2F&#038;results=2</div>
<p>fetches the first two hits from a Yahoo! [site:seomoz.org] query. Interestingly, it tells you where the cache URLs are, and they reside on Yahoo! servers (unsurprisingly). So you fetch the cache from Yahoo!, do the analysis, save what you care about (links, titles, etc), and you&#8217;re done.</p>
<p>You&#8217;ll need to kick start this somehow with a seed set of sites. DMOZ and Wikipedia are usually good sources that are freely available. Wikipedia can even be downloaded so no one needs to know. Yahoo!&#8217;s very own Delicious, Digg, reddit, etc are also good starting points because they tell you what&#8217;s hot right now. The seed is basically a huge set of URLs from which you extract the domain names and do [site:domain] queries. Lather, rinse, repeat.</p>
<p>Notice that you won&#8217;t need to crawl a single page yourself. You let Yahoo! do the work for you. Neat, no?</p>
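<p>The seed step of this recipe can be sketched in a few lines of Python. This only builds the [site:domain] query URLs from a list of seed URLs and makes no network requests. The endpoint and the appid, query, and results parameters are copied from the example URL above; the function names and the seed list are hypothetical. As with the recipe itself, this is a guess at one possible approach, not SEOmoz&#8217;s actual method.</p>

```python
from urllib.parse import urlencode, urlsplit

# Endpoint copied from the example URL in this post.
API_ENDPOINT = "http://search.yahooapis.com/WebSearchService/V1/webSearch"

def seed_domains(seed_urls):
    """Extract unique host names from seed URLs, keeping first-seen order."""
    hosts = {}
    for url in seed_urls:
        host = urlsplit(url).hostname
        if host:
            hosts.setdefault(host, True)
    return list(hosts)

def site_query_url(domain, results=2, appid="YahooDemo"):
    """Build the query URL for a [site:domain/] search."""
    params = [("appid", appid), ("query", "site:" + domain + "/"), ("results", results)]
    return API_ENDPOINT + "?" + urlencode(params)

# Hypothetical seed list; in the recipe it would come from DMOZ, Wikipedia, etc.
seeds = ["http://www.seomoz.org/blog", "http://www.seomoz.org/tools", "http://en.wikipedia.org/wiki/SEO"]
for host in seed_domains(seeds):
    print(site_query_url(host))
```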
<h2>So What Should SEOmoz Disclose?</h2>
<p>Above I said two potentially conflicting things: that SEOmoz should disclose the Linkscape user agent, and that Linkscape doesn&#8217;t need to have a user agent at all. So what exactly am I asking from SEOmoz?</p>
<p>Easy: complete disclosure. If SEOmoz is using a traditional crawler, we must have its UA and the IP addresses. It&#8217;s only a matter of time before we find them. If not, SEOmoz needs to explain clearly why not.</p>
]]></content:encoded>
			<wfw:commentRss>http://ekstreme.com/thingsofsorts/fun-web/the-seomoz-linkscape-ghost/feed</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Announcing Cligs: Short URLs with Analytics and SEO Friendliness</title>
		<link>http://ekstreme.com/thingsofsorts/fun-web/announcing-cligs-short-urls-with-analytics-and-seo-friendliness</link>
		<comments>http://ekstreme.com/thingsofsorts/fun-web/announcing-cligs-short-urls-with-analytics-and-seo-friendliness#comments</comments>
		<pubDate>Mon, 15 Sep 2008 22:17:29 +0000</pubDate>
		<dc:creator>Pierre</dc:creator>
				<category><![CDATA[Fun Web]]></category>

		<guid isPermaLink="false">http://ekstreme.com/thingsofsorts/fun-web/announcing-cligs-short-urls-with-analytics-and-seo-friendliness</guid>
		<description><![CDATA[That&#8217;s right folks, the short URL market is broken and I&#8217;m fixing it. The new service is called Cligs (like Clicks but with a G). It&#8217;s a short URL service on steroids. The key feature is that it tracks the clicks of the short URLs. What kind of analytics do you get? At launch right [...]]]></description>
			<content:encoded><![CDATA[<p>That&#8217;s right folks, the short URL market is broken and I&#8217;m fixing it. The new service is called <a href="http://cli.gs/">Cligs</a> (like Clicks but with a G). It&#8217;s a short URL service on steroids. The key feature is that it tracks the clicks of the short URLs.</p>
<p>What kind of analytics do you get? At launch right now:</p>
<ul>
<li>Cligs gives you tons of <strong>traffic data and analytics</strong> about the traffic your short URLs get. This includes:
<ul>
<li>Number of hits</li>
<li>Referral stats</li>
<li>Mentions on twitter, blogs, and the web</li>
<li>Mentions of the destination URL on twitter, blogs, the web, and delicious</li>
</ul>
<p>And lots more! And if you want more data, <a href="http://blog.cli.gs/contact">just let me know</a>!</p></li>
<li><strong>Cligs forwards with a 301 Permanent Redirect</strong> so your destination URL gets <strong>full SEO benefits</strong> of the link. If you are an affiliate marketer, this means you can hide your backlinks, get traffic, get statistics, and get the SEO benefits.</li>
<li>With Cligs, you can create an <strong>unlimited number of short URLs</strong> for the same destination URL. This is great because you can promote the same destination at different sites like twitter or facebook by using different cligs and watch how each source sends you traffic.</li>
</ul>
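<p>To illustrate the mechanics in the list above, here is a minimal in-memory Python sketch: several short codes can point at the same destination, each visit is counted per code, and the response is a 301 with a Location header. This is not the actual Cligs code; the class name, method names, and example codes are hypothetical.</p>

```python
class ShortUrlStore:
    """Toy short-URL table with per-code hit counts (illustration only)."""

    def __init__(self):
        self.destinations = {}  # short code -> destination URL
        self.hits = {}          # short code -> number of resolved requests

    def add(self, code, destination):
        """Register a short code; many codes may share one destination."""
        self.destinations[code] = destination
        self.hits[code] = 0

    def resolve(self, code):
        """Return (status, headers) for a request to the given short code."""
        destination = self.destinations.get(code)
        if destination is None:
            return 404, {}
        self.hits[code] += 1
        # 301 Moved Permanently, pointing at the destination URL.
        return 301, {"Location": destination}

store = ShortUrlStore()
store.add("tw1", "http://example.com/page")  # code promoted on twitter
store.add("fb1", "http://example.com/page")  # code promoted on facebook
store.resolve("tw1")
store.resolve("tw1")
store.resolve("fb1")
print(store.hits)
```

<p>One design consideration with this approach: 301 responses are cacheable by default, so repeat visits from the same browser may never reach the server and would go uncounted.</p>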
<p>That&#8217;s just the start. There are a ton of new features that are going to be added in the coming few days and weeks, including some SEO-useful analytics.</p>
<p>And, of course, there is a bookmarklet:</p>
<p><a href="javascript:(function(){ window.open('http://cli.gs/cligs/new?url='+encodeURIComponent(location.href)); })();">Shorten Link @ Cli.gs</a></p>
<p>So what are you waiting for? Stop using plain-vanilla short URL services and start using <a href="http://cli.gs/">Cligs</a>.</p>
<p>Comments and feedback <a href="http://blog.cli.gs/contact">most welcome</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://ekstreme.com/thingsofsorts/fun-web/announcing-cligs-short-urls-with-analytics-and-seo-friendliness/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>New Stealth Crawler from Yahoo!</title>
		<link>http://ekstreme.com/thingsofsorts/web-programming/new-stealth-crawler-from-yahoo</link>
		<comments>http://ekstreme.com/thingsofsorts/web-programming/new-stealth-crawler-from-yahoo#comments</comments>
		<pubDate>Fri, 12 Sep 2008 13:00:10 +0000</pubDate>
		<dc:creator>Pierre</dc:creator>
				<category><![CDATA[Web programming]]></category>

		<guid isPermaLink="false">http://ekstreme.com/thingsofsorts/web-programming/new-stealth-crawler-from-yahoo</guid>
		<description><![CDATA[For the past few months, I&#8217;ve been tracking a crawler from Yahoo! that does not identify itself on my science blog. The bot&#8217;s details are: Requested page: /science/converting-blood-groups At: 06 May 2008 10:21:05 AM GMT Routed to: /index.php Referred from: http://blogsci.com/science/converting-blood-groups Remote: crawl1.image.srch.kr1.yahoo.com (203.212.174.181) Request: HTTP/1.1 GET Accepting: HTTP: */* Charset: Encoding: Languages: UA: Cookies: [...]]]></description>
			<content:encoded><![CDATA[<p>For the past few months, I&#8217;ve been tracking a crawler from Yahoo! that does not identify itself on my <a href="http://blogsci.com/">science blog</a>. The bot&#8217;s details are:</p>
<div class="code">Requested page: /science/converting-blood-groups
<ul>
<li>At: 06 May 2008 10:21:05 AM GMT</li>
<li>Routed to: /index.php</li>
<li>Referred from: <a href="http://blogsci.com/science/converting-blood-groups">http://blogsci.com/science/converting-blood-groups</a></li>
<li>Remote: crawl1.image.srch.kr1.yahoo.com (203.212.174.181)</li>
<li>Request: HTTP/1.1 GET</li>
<li>Accepting:
<ul>
<li>HTTP: */*</li>
<li>Charset: </li>
<li>Encoding: </li>
<li>Languages: </li>
</ul>
</li>
<li>UA: </li>
<li>Cookies: </li>
</ul>
</div>
<p>Notice a few interesting details: there is no user-agent string; it provides an HTTP_REFERER header that&#8217;s the same as the page being requested; it comes from *.yahoo.com, not the usual yahoo.net for Slurp; and the host name contains &quot;image&quot; and &quot;srch&quot;.</p>
<p>The crawling is very low-volume: a few hits a day at most, with many days seeing just a single hit.</p>
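<p>These details can be checked mechanically. Here is a minimal Python sketch, assuming each log entry has already been parsed into the hypothetical fields shown (requested path, referer, user agent, and reverse-DNS host name). The function name and signal labels are illustrative and not from any logging library.</p>

```python
from urllib.parse import urlsplit

def stealth_signals(path, referer, user_agent, remote_host):
    """List which of the traits described above a single request shows."""
    signals = []
    if not user_agent.strip():
        signals.append("empty user agent")
    # Referer whose path is the very page being requested.
    if referer and urlsplit(referer).path == path:
        signals.append("self referer")
    labels = remote_host.lower().split(".")
    if labels[-2:] == ["yahoo", "com"]:
        signals.append("yahoo.com host")
    if "image" in labels and "srch" in labels:
        signals.append("image search host labels")
    return signals

# Values taken from the log entry above.
print(stealth_signals(
    "/science/converting-blood-groups",
    "http://blogsci.com/science/converting-blood-groups",
    "",
    "crawl1.image.srch.kr1.yahoo.com",
))
```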
<p>What&#8217;s really interesting is how laser-targeted it is: it&#8217;s only requested the same two pages many times since May. The pages are the specific blog post linked to above plus the archive page that contains that post, so it&#8217;s likely something about that post that&#8217;s of interest to the bot. And yes, the post contains an image, and the image is the only one in the main content of the archive.</p>
<p>I&#8217;ll dig deeper when I have a chance. Please let me know in the comments below if you&#8217;re seeing something similar.</p>
]]></content:encoded>
			<wfw:commentRss>http://ekstreme.com/thingsofsorts/web-programming/new-stealth-crawler-from-yahoo/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
