From Marketing in the Age of Google: Excerpts From Chapter 5 (How Search Engines Work) and Chapter 7 (Working With Developers)
The major search engines that account for most of the market share today are the automated type: Google, Yahoo!, and Microsoft Bing. (Yahoo! and Microsoft have reached an agreement under which Yahoo! will replace its search engine with Bing, but this hasn’t happened as of late 2009.) All three have a similar overall infrastructure:
- Web crawlers (also known as “spiders” or “robots”) that crawl the Web. These crawlers follow links to discover the pages on the Web.
- Extraction processes that gather information from those pages (such as textual content, metadata, and links).
- Index storage that stores the content from Web pages. Content is generally stored using word-based keys, similar to the index in a book. When you look up a word in the index of a book, you learn the page number that word is on. Similarly, with a search engine index, the search engine can look up a word that someone is searching for and find out all the Web pages associated with that word.
- Results scoring that determines which pages are the most relevant for each search. When someone does a search (called a “query”) and the search engine checks the index for all the Web pages associated with that search, the search engine needs a way to rank those Web pages into an order that is useful for the searcher. Search engines use a number of factors in scoring, and they adjust these factors continually based on new algorithms, tests, and other criteria. Search engines keep the details of these scoring factors secret. Once the search engine compiles and ranks the pages that are relevant for the query, it displays them in a list called “organic results.” The ranking process happens at the time of the query.
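To make the book-index analogy concrete, here is a minimal sketch in Python of a word-keyed index and a query-time ranking step. The page URLs, the page text, and the function names are made up for illustration. The scoring rule (counting how often the query words appear) is only a stand-in, since the real scoring factors are secret and far more involved.

```python
from collections import defaultdict

# Hypothetical pages, keyed by URL. Real engines store far more per page.
PAGES = {
    "example.com/a": "search engines crawl the web",
    "example.com/b": "search engines rank pages and search results",
    "example.com/c": "cooking recipes",
}


def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, pages, query):
    """Return URLs containing every query word, best toy score first."""
    words = query.lower().split()
    if not words:
        return []
    # Look up each word in the index and keep only pages that have them all.
    matches = set.intersection(*(index.get(w, set()) for w in words))

    def score(url):
        # Toy relevance score (an assumption): count query-word occurrences.
        tokens = pages[url].lower().split()
        return sum(tokens.count(w) for w in words)

    # Ranking happens here, at query time, not when the index is built.
    return sorted(matches, key=lambda url: (-score(url), url))


INDEX = build_index(PAGES)
print(search(INDEX, PAGES, "search engines"))
```

The index is built once, ahead of time, while the ordering is computed for each query. That mirrors the split described above between index storage and results scoring.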
Working With Developers
A site’s technical infrastructure is a vital part of search acquisition. If the search engine bots can’t crawl and extract content from a site’s pages, that site has little chance of ranking well in search engines for relevant queries. In addition, changing a site’s content management system or server setup, merging sites after an acquisition, building microsites, and other common activities can greatly impact search acquisition.
Web developers should understand the principles of searchability and build them into the infrastructure. They should also have a set of best practices for modifying that infrastructure and troubleshooting problems. This chapter provides an overview of the technical issues involved with search acquisition. You can give this chapter directly to your Web developers. Or, if you’re interested in the technical details, you can use it to better understand the issues so you can have productive conversations with the development team about how to build search best practices into the development process.
For search best practices to be successfully implemented in Web development, the development team needs executive support. This means the developers need training to expand their core skill set to include making Web infrastructure searchable, time to test searchability in addition to functionality, and context about why they are being asked to make changes. Web developers know how to make sites functional and might be suspicious if you ask them to make changes that don’t seem to improve functionality. But if you explain that, whereas both 301 and 302 redirects transfer the visitor from one page to another, only a 301 tells search engines to index the new page instead of the old one, you’ll get the developers on your side much more quickly.
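As a rough illustration of what “crawl and extract” means, the sketch below uses Python’s standard `html.parser` module to pull links and visible text out of a page’s HTML. The class name, function name, and sample markup are made up for illustration, and real crawlers do considerably more than this.

```python
from html.parser import HTMLParser


class PageExtractor(HTMLParser):
    """Collect link targets and text from static HTML (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        # Links are how a crawler discovers further pages.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        # Keep non-empty text runs; this is the content that can be indexed.
        if data.strip():
            self.text_parts.append(data.strip())


def extract(html):
    """Return (links, text) found in the given HTML string."""
    parser = PageExtractor()
    parser.feed(html)
    parser.close()
    return parser.links, " ".join(parser.text_parts)


# Hypothetical page markup.
SAMPLE = '<html><body><h1>Widgets</h1><p>Buy <a href="/blue">blue widgets</a></p></body></html>'
links, text = extract(SAMPLE)
print(links)
print(text)
```

A simple parser like this one sees only what is in the HTML it is given. Text or links that appear only after scripts run in a browser never reach it.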
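For developers, the 301 versus 302 distinction comes down to the status line the server sends. The sketch below is a minimal WSGI application in Python that answers permanently moved paths with a 301 and a `Location` header. The paths and the `MOVED_PERMANENTLY` mapping are hypothetical, and a real site would usually configure this in the Web server or framework rather than by hand.

```python
from http import HTTPStatus

# Hypothetical mapping of retired URLs to their permanent replacements.
MOVED_PERMANENTLY = {"/old-page": "/new-page"}


def app(environ, start_response):
    """Minimal WSGI app: 301 for permanently moved paths, 200 otherwise."""
    path = environ.get("PATH_INFO", "/")
    target = MOVED_PERMANENTLY.get(path)
    if target is not None:
        # 301 signals a permanent move. HTTPStatus.FOUND (302) would send
        # the visitor to the same place but signal a temporary move.
        status = HTTPStatus.MOVED_PERMANENTLY
        start_response(f"{status.value} {status.phrase}", [("Location", target)])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"ok"]


if __name__ == "__main__":
    # Call the app directly to inspect the status line and headers.
    seen = []
    app({"PATH_INFO": "/old-page"}, lambda status, headers: seen.append((status, headers)))
    print(seen[0])
```

The visitor ends up on `/new-page` either way, which is why the difference is easy to miss when testing only for functionality.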
- Diagnostic tools and technical resources (for Flash, JavaScript, AJAX, IIS, asp.net, Apache, and more)
- Technical checklists
- Managing Search Engine Access Via the Robots Exclusion Protocol
- Effectively Using Images
- Domain Canonicalization
- How to implement AJAX in a Google-friendly Way
- Pros and Cons of URL Canonicalization Options
- What Site Owners, Web Developers, and SEOs Should Know About the Yahoo/Microsoft Deal
- Google, Flash, and JavaScript
- Making Geotargeted Content Findable For the Right Searchers


