<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/atom10full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><feed xmlns="http://www.w3.org/2005/Atom" xml:base="http://hecker.org">
  <title type="text">Frank Hecker - Site Information</title>
  <subtitle type="text">Information about Frank Hecker's web site</subtitle>
  
  <link rel="alternate" type="text/html" hreflang="en" href="http://hecker.org/site/" />
  <id>tag:hecker.org,2004:/site</id>
  <generator uri="http://www.blosxom.com/" version="2.0">Blosxom</generator>
  <rights>Copyright 2004-2006 Frank Hecker, http://www.hecker.org/</rights>
  
  
  <updated>2005-09-22T06:25:00Z</updated>
<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/atom+xml" href="http://feeds.feedburner.com/hecker-site" /><feedburner:info xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" uri="hecker-site" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><entry>
    <id>tag:hecker.org,2005:/site/feedback-welcome</id>
    <link rel="alternate" type="text/html" href="http://hecker.org/site/feedback-welcome" />

    <title type="text">Feedback is now welcome</title>
    <published>2005-09-22T06:25:00Z</published>
    <updated>2005-09-22T06:25:00Z</updated>
    <category term="site" />
    <author>
      <name>Frank Hecker</name>
      <uri>http://hecker.org</uri>
    </author>
    <content type="xhtml" xml:base="http://hecker.org" xml:lang="en">
<div xmlns="http://www.w3.org/1999/xhtml"><p>After much struggle I've finally managed to get my blog to support
comments and TrackBacks. (This is what I get for using <a href="http://www.blosxom.com/">"roll your
own" blogging software</a>.) I'll blog some more
later about how I did this, for any Blosxom users who happen to be
interested; in the meantime please report any problems to me, either
as comments on this post (if you're able to) or via email.</p>

<p>UPDATE: I now have a <a href="http://hecker.org/blosxom/feedback">blog post</a> describing the
new Blosxom plugin I wrote to support comments and TrackBacks.</p>
</div>
    </content>
  </entry>
<entry>
    <id>tag:hecker.org,2005:/site/todo</id>
    <link rel="alternate" type="text/html" href="http://hecker.org/site/todo" />

    <title type="text">To-do list</title>
    <published>2005-01-09T14:10:00Z</published>
    <updated>2005-01-09T14:10:00Z</updated>
    <category term="site" />
    <author>
      <name>Frank Hecker</name>
      <uri>http://hecker.org</uri>
    </author>
    <content type="xhtml" xml:base="http://hecker.org" xml:lang="en">
<div xmlns="http://www.w3.org/1999/xhtml"><p>As many people can attest, sometimes you spend more time (and have
more fun) tinkering with the underpinnings of a web site than
actually writing new content to be posted on it. In that spirit, here
is my current list of things I'm planning to add to or change about my
web site and blog.</p>

<p>(I've left old items in place but simply marked them as deleted.)</p>

<p>The following are "must-haves":</p>

<ul>
<li><del>blog overlays old site</del> (working)
<ul>
<li><del>URI rewriting for canonical URIs</del>
(<a href="http://hecker.org/site/uri-rewriting">done</a>)</li>
<li>don't special case POST for now (but may need to later?)</li>
</ul></li>
<li><del>extensionless URIs</del> (working)
<ul>
<li><del>plugin for Blosxom</del> (<a href="http://hecker.org/blosxom/extensionless">done</a>)</li>
<li><del>load as 20extensionless? (after content negotiation, before
everything else)</del> (done, loaded as first extension)</li>
<li><del>fix breadcrumbs, check for issues with other plugins
(seemore, etc.)</del> (no seemore patch needed, need to post
patch to breadcrumbs)</li>
<li><del>MultiViews for Apache</del> (decided not to do this, but
instead special-case <code>.html</code> in <a href="http://hecker.org/site/uri-rewriting">URI rewriting</a>,
will use <code>.var</code> files if I ever need more than that)</li>
<li><del>check for cruft (e.g., <code>*.bak</code>)</del> (done, site can now
be regenerated from scratch and be cleaned up with <code>make
remote-clean</code>)</li>
</ul></li>
<li><del>XHTML version</del> (postponed)
<ul>
<li><del>return <code>.xhtml</code> vs <code>.html</code> based on <code>Accept</code> header?</del>
(decided against this, as it would require special-casing Internet
Explorer)</li>
</ul></li>
<li>basic feed (working)
<ul>
<li><del>vanilla RSS 0.91</del> (done)</li>
<li><del>validate feed (feedvalidator.org)</del> (done)</li>
<li><del>try to "stupefy" excerpts to remove fancy punctuation? strip
HTML?</del> (left as is for now)</li>
<li>fix problem with example HTML tags messing up in NewsFire</li>
</ul></li>
<li>site and category intros
<ul>
<li><del>use readme plugin</del> (done)</li>
<li>add readme content to feed as description?</li>
</ul></li>
<li>HTML 4.01 Strict
<ul>
<li><del>create HTML 4.01 Strict flavour files</del> (done)</li>
<li><del>patch Markdown to generate HTML-compliant empty tags</del>
(<a href="http://hecker.org/blosxom/markdown-empty-element-suffix-patch">done</a>)</li>
<li><del>put validation check in footer</del> (done)</li>
<li>validate all visible pages</li>
<li>fix old pages</li>
</ul></li>
<li>CSS
<ul>
<li>simplify CSS more if possible</li>
<li>change old pages to new design (in progress)</li>
<li><del>add print media style</del> (done)</li>
<li>eliminate common.css? (may keep this for Netscape Navigator 4.x)</li>
<li><del>determine proper invocation</del> (done, use <code>@import</code> to
hide from Netscape Navigator 4.x)</li>
</ul></li>
<li>accessibility
<ul>
<li>do Bobby, etc., checks</li>
<li><del>get tips from diveintomark</del> (done)</li>
<li>accessibility statement (have draft)</li>
</ul></li>
<li>typography/markup
<ul>
<li>tweak SmartyPants as necessary (not needed?)</li>
<li>add needed entities to template files (including RSS?), readme files</li>
</ul></li>
<li><del>dating posts</del> (working)
<ul>
<li><del>put date stamp in entry file</del> (done)</li>
<li><del>compare entries_cache_meta plugin to meta plugin plus
something else</del> (done, using entries_cache_meta)</li>
<li><del>handle dates for statically-generated pages</del> (have
initial hack)</li>
</ul></li>
<li><del>decide on archive</del> (done, using the archives plugin)</li>
<li>nail down site hierarchy (see below)</li>
<li>other
<ul>
<li><del>look at bettertitles?</del> (done, using bettertitles)</li>
<li><del>change Blosxom file extension from .txt to something else (e.g.,
.bls)</del> (leaving as-is for now)</li>
<li><del>finalize plugin list and establish an order for them to run</del>
(done, using <code>SEQUENCE</code> file to generate prefixes on the fly when
copying over new or modified plugins)</li>
<li><del>fix readme invocation (add P tag to readme.html?)</del>
(no, this causes problems in practice, use one-paragraph readme
blurbs)</li>
</ul></li>
</ul>

<p>The following are desirable but not necessarily "must-haves":</p>

<ul>
<li>search
<ul>
<li>using Google SiteSearch for now</li>
</ul></li>
<li>comment/writebacks
<ul>
<li>keep up with new wb versions</li>
<li>research URL rewriting for POST</li>
</ul></li>
<li>other feed types
<ul>
<li>RSS 1.0 and 2.0 (adds dates)</li>
<li><del>Atom</del> (done, using atomfeed plugin)</li>
</ul></li>
<li><del>content negotiation for HTML vs XHTML</del> (no, see previous note)
<ul>
<li><del>look for <code>Accept: application/xhtml+xml</code></del> (not sent
by Internet Explorer or Safari)</li>
<li><del>html flavour vs xhtml</del></li>
<li><del>would need to modify Markdown, etc., to generate both
flavours</del> (but needed for feeds)</li>
</ul></li>
<li>cache control
<ul>
<li><del>send <code>Last-Modified</code> header</del> (<a href="http://hecker.org/blosxom/lastmodified2">done</a>)</li>
<li><del>send <code>ETag</code> header</del> (done)</li>
<li><del>send <code>Expires</code> header</del> (done)</li>
<li><del>send <code>Cache-Control</code> header</del> (done)</li>
<li><del>send <code>Content-Length</code> header</del> (done)</li>
<li><del>do <code>If-Modified-Since</code> check ourselves</del> (done)</li>
<li><del>do <code>If-None-Match</code> check ourselves</del> (done)</li>
<li><del>add support for <code>ETag</code> as strong validator</del> (done)</li>
</ul></li>
</ul>
</div>
    </content>
  </entry>
<entry>
    <id>tag:hecker.org,2005:/site/copyright</id>
    <link rel="alternate" type="text/html" href="http://hecker.org/site/copyright" />

    <title type="text">Copyright and license</title>
    <published>2005-01-08T13:20:00Z</published>
    <updated>2005-01-08T13:20:00Z</updated>
    <category term="site" />
    <author>
      <name>Frank Hecker</name>
      <uri>http://hecker.org</uri>
    </author>
    <content type="html" xml:base="http://hecker.org" xml:lang="en">
&lt;p&gt;I've done a lot of work related to software licensing as part of the
&lt;a href="http://www.mozilla.org/MPL/relicensing-faq.html" title="Frequently asked questions about relicensing of the Mozilla software"&gt;Mozilla relicensing project&lt;/a&gt;
and when I worked at &lt;a href="http://www.collab.net/" title="CollabNet: collaborative software development on demand"&gt;CollabNet&lt;/a&gt;.
As a result of enduring endless wrangling about licensing terms
I've been put off complex licensing schemes, and prefer to make my own
works available under very liberal terms.&lt;/p&gt;

&lt;p&gt;For software I prefer use of the so-called "&lt;a href="http://www.opensource.org/licenses/mit-license.php" title="Reference copy of the MIT license maintained by the Open Source Initiative"&gt;MIT license&lt;/a&gt;" (also
known as the "X11 license") as being the most simple and
easy-to-understand license in common use, and the one least likely to
cause issues for others who want to reuse my software. For my writings
in general I use an even simpler license that grants essentially
blanket permissions subject only to a minimal notification
requirement.&lt;/p&gt;

&lt;p&gt;(I don't use the &lt;a href="http://creativecommons.org/licenses/by/2.0/"&gt;Creative Commons attribution license&lt;/a&gt; because I
don't see it adding much value for my purposes, and the additional
language in the Creative Commons license just gives people &lt;a href="http://people.debian.org/~evan/ccsummary.html" title="debian-legal summary of Creative Commons 2.0 licenses"&gt;more
things to argue about&lt;/a&gt;. However if for some reason you want to
reuse my works under a Creative Commons license or comparable
licenses, such as the &lt;a href="http://www.gnu.org/copyleft/fdl.html"&gt;GNU Free Documentation License&lt;/a&gt; or the
&lt;a href="http://opencontent.org/openpub/"&gt;Open Publication License&lt;/a&gt;, just &lt;a href="mailto:hecker@hecker.org"&gt;send me a request&lt;/a&gt;; I'll
almost certainly grant you whatever permissions you require.)&lt;/p&gt;

&lt;h2&gt;Content license&lt;/h2&gt;

&lt;p&gt;Unless otherwise noted the following copyright notice and license
terms apply to all normal prose material (i.e., anything other than
software and associated documentation) created by me and published on
this site:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Copyright &amp;copy; 1994-2006 Frank Hecker (http://www.hecker.org)&lt;/p&gt;
  
  &lt;p&gt;You may freely distribute this material in whole or in part, with or
  without modifications, in the original language or in translation, by
  itself or as part of other works, in any form and for any purpose,
  and may permit others to whom you distribute this material to do so as
  well, provided only that you include the above copyright notice and
  this permissions notice in all copies or substantial portions of the
  material.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;Software license&lt;/h2&gt;

&lt;p&gt;Unless otherwise noted the following copyright notice and license
terms apply to all software and associated documentation created by me
and distributed on this site or as part of this site (including CSS
stylesheets, HTML templates, and so on):&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Copyright &amp;copy; 1994-2006 Frank Hecker (http://www.hecker.org)&lt;/p&gt;
  
  &lt;p&gt;Permission is hereby granted, free of charge, to any person obtaining
  a copy of this software and associated documentation files (the
  "Software"), to deal in the Software without restriction, including
  without limitation the rights to use, copy, modify, merge, publish,
  distribute, sublicense, and/or sell copies of the Software, and to
  permit persons to whom the Software is furnished to do so, subject to
  the following conditions:&lt;/p&gt;
  
  &lt;p&gt;The above copyright notice and this permission notice shall be
  included in all copies or substantial portions of the Software.&lt;/p&gt;
  
  &lt;p&gt;THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
  EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
  MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
  NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
  LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
  OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
  WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;Mini-FAQ&lt;/h2&gt;

&lt;p&gt;One thing I've noticed with open source and free software licensing is
that many people don't seem to take the licenses at face value, but
are always asking questions about what they can and can't do, even
when the issue is directly addressed by the license itself. In that
spirit here is a "mini-FAQ" intended to forestall such questions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;"Do I need to ask your permission before I do something with your
writings or your software?" No, you do not. However I'd appreciate
it if you'd send me an email message if you find anything I've
created to be of enough interest to you that you're motivated to
redistribute or reuse it; that may motivate me in turn to keep
creating new stuff.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;"Can I use your writings or software in commercial products like
books or proprietary software?" Yes, you can, and you don't have to
pay me a cent. However if you'd like me to enhance something I've
written or create new related material then I might be willing to do
that for a fee, depending on the nature of the project and whether
my work situation permits me to do so.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;"If I use your writings or software in creating my own works, what
do I need to do in order to comply with your license terms?" Not
much at all: If you use brief extracts then you don't need to do
anything at all; this includes quoting material in blog postings,
magazine articles, books and other contexts that would traditionally
be considered "fair use". (However both common courtesy and
scholarly tradition would dictate that you attribute the material to
me.) If you redistribute entire articles or source code files, or
incorporate substantial portions of them in your own work, then the
license terms require that you duplicate my copyright notice and
permissions notice (i.e., the license terms themselves) in an
appropriate place (e.g., accompanying the writings or source code in
question).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;"Can I use your software in software licensed under the GPL, or your
writings in documents licensed under the GNU Free Documentation
License?" Yes. For software I'm using the MIT/X11 license, which the
&lt;a href="http://www.gnu.org/fsf/fsf.html"&gt;Free Software Foundation&lt;/a&gt; considers to be &lt;a href="http://www.gnu.org/licenses/gpl-faq.html#WhatDoesCompatMean"&gt;compatible with the
GPL&lt;/a&gt;. For my writings I'm using a license that is so
liberal I'd be surprised if there were any compatibility issue with
the GFDL. However note that using my software (or writings) in a
GPL-licensed program (or GFDL-licensed document) does &lt;em&gt;not&lt;/em&gt; mean
that you can simply replace my license notice with a GPL (or GFDL)
license notice; if you want to do that then you need to ask my
permission (which I'll almost certainly grant).&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;UPDATE: Changed the copyright notice to reflect the new year.&lt;/p&gt;

    </content>
  </entry>
<entry>
    <id>tag:hecker.org,2005:/site/feeds</id>
    <link rel="alternate" type="text/html" href="http://hecker.org/site/feeds" />

    <title type="text">Syndication feeds</title>
    <published>2005-01-05T13:30:00Z</published>
    <updated>2005-01-05T13:30:00Z</updated>
    <category term="site" />
    <author>
      <name>Frank Hecker</name>
      <uri>http://hecker.org</uri>
    </author>
    <content type="xhtml" xml:base="http://hecker.org" xml:lang="en">
<div xmlns="http://www.w3.org/1999/xhtml"><p>If you'd like to receive full-text articles from this site as they are
published, you can subscribe to one or more of the following feeds, in
the formats indicated; simply cut and paste the URLs into your feed
reader of choice. The Atom feeds are preferred; I maintain the RSS
feeds only for older news aggregators that are not yet Atom-enabled.</p>

<ul>
<li>All blog articles. This feed includes all articles posted in any of
the categories discussed below (<a href="http://hecker.org/index.atom" title="Syndicate this site in Atom format">Atom</a> or <a href="http://hecker.org/index.rss" title="Syndicate this site in RSS 0.91 format">RSS</a>).</li>
<li>Mozilla-related articles. This feed includes all articles posted in
the "mozilla" category (<a href="http://hecker.org/mozilla/index.atom" title="Syndicate the 'mozilla' category in Atom format">Atom</a> or <a href="http://hecker.org/mozilla/index.rss" title="Syndicate the 'mozilla' category in RSS 0.91 format">RSS</a>).</li>
<li>Blosxom-related articles. This feed includes all articles posted in
the "blosxom" category (<a href="http://hecker.org/blosxom/index.atom" title="Syndicate the 'blosxom' category in Atom format">Atom</a> or <a href="http://hecker.org/blosxom/index.rss" title="Syndicate the 'blosxom' category in RSS 0.91 format">RSS</a>).</li>
<li>Articles about this site and how it's built. This feed includes all
articles posted in the "site" category (<a href="http://hecker.org/site/index.atom" title="Syndicate the 'site' category in Atom format">Atom</a> or
<a href="http://hecker.org/site/index.rss" title="Syndicate the 'site' category in RSS 0.91 format">RSS</a>).</li>
<li>Miscellaneous articles that don't fit into any of the categories
above. This feed includes all articles posted in the "misc" category
(<a href="http://hecker.org/misc/index.atom" title="Syndicate the 'misc' category in Atom format">Atom</a> or <a href="http://hecker.org/misc/index.rss" title="Syndicate the 'misc' category in RSS 0.91 format">RSS</a>).</li>
</ul>

<p>Note that I don't post new articles that often, so you should set your
feed reader to check no more than once a day.</p>
</div>
    </content>
  </entry>
<entry>
    <id>tag:hecker.org,2004:/site/design-philosophy</id>
    <link rel="alternate" type="text/html" href="http://hecker.org/site/design-philosophy" />

    <title type="text">Design philosophy</title>
    <published>2004-12-11T07:53:00Z</published>
    <updated>2004-12-11T07:53:00Z</updated>
    <category term="site" />
    <author>
      <name>Frank Hecker</name>
      <uri>http://hecker.org</uri>
    </author>
    <content type="xhtml" xml:base="http://hecker.org" xml:lang="en">
<div xmlns="http://www.w3.org/1999/xhtml"><p>The basic principles I tried to follow in creating this site were
as follows:</p>

<ul>
<li>The site should be entirely text-based, with minimal or no use of
graphics.</li>
<li>All web pages on the site should validate as HTML 4.01 Strict.</li>
<li>All web pages on the site should be accessible using URIs that hide
the details of the particular content type or page generation
mechanism associated with the page.</li>
<li>The site should be a transparent upgrade from my previous site
(created a few years ago), so that all previous URLs should
continue to work.</li>
</ul>

<p>The following sections expand on these points:</p>

<h2>Text-based site</h2>

<p>When it comes to designing a web site, I have an unfortunate problem:
I very much appreciate good graphic design but am utterly incapable of
actually doing graphic design. I guess I could try stealing (I mean,
"taking inspiration from") someone else's design, but unless I did a
literal copy I'd probably mess up the design in the process of reusing
it.</p>

<p>I've therefore decided to avoid the problem entirely and just create
an entirely text-based site using a minimal design, minimal enough
that hopefully I won't screw it up too badly. As a side benefit this
means that the site should load faster; this may help offset the
penalty of doing URL rewriting and dynamic page generation for the
weblog.</p>

<p>(Note that if some graphic designer out there wants to help me create
a better looking site, whether on a volunteer basis or for a fee, I'd
be happy to entertain any offers!)</p>

<h2>Use of HTML 4.01 Strict</h2>

<p>Since I volunteer for the Mozilla project I feel it's incumbent on me
to support the <a href="http://www.webstandards.org/" title="The Web Standards Project">use of web standards</a>; this would also show off
the standards support in <a href="http://www.mozilla.org/products/firefox/">Firefox</a> and the browser component of the
Mozilla Suite. I then faced the choice between using <a href="http://www.w3.org/TR/html4" title="HTML 4.01 specification">HTML
4.01</a> or <a href="http://www.w3.org/TR/xhtml1" title="XHTML 1.0 specification">XHTML 1.0</a> (or <a href="http://www.w3.org/TR/xhtml11/" title="XHTML 1.1 specification">XHTML 1.1</a> if I
wanted to be truly <i>au courant</i>).</p>

<p>In the end I was swayed by <a href="http://www.hixie.ch/advocacy/xhtml" title="Sending XHTML as text/html Considered Harmful">Ian Hickson's comments on XHTML</a>
and decided to use HTML 4.01 Strict. I did do some experiments in
serving HTML or XHTML depending on the browser; in particular, I
implemented a test Blosxom plugin to do content negotiation and send
back content of the appropriate type depending on what the browser
would accept. (The plugin works by checking the HTTP Accept header and
tweaking the Blosxom "flavour" value appropriately.) However in the
end I decided it wasn't worth the trouble.</p>

<p>(Briefly, the problem is that some browsers that can properly
interpret the XHTML content type <code>application/xhtml+xml</code>, like Safari
and Opera, don't indicate an explicit preference for that content type
over the HTML content type <code>text/html</code>. But if a browser doesn't
indicate an explicit preference then we have to send HTML; otherwise
we'd break Internet Explorer. So we can't take advantage of the XHTML
support in Safari and Opera without doing browser sniffing; the more
"pure" approach using just content negotiation works only for Mozilla,
Firefox, and other browsers that explicitly indicate a preference for
XHTML.)</p>
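
<p>As a rough illustration of the negotiation rule just described, here is a
minimal Python sketch. It is not the actual Blosxom plugin, which is not
shown here. The function names <code>parse_accept</code> and
<code>choose_flavour</code> and the flavour names <code>xhtml</code> and
<code>html</code> are assumptions made for the example, and wildcard ranges
such as <code>*/*</code> are deliberately not counted as an explicit
preference.</p>

<pre>
```python
def parse_accept(header):
    """Map each media range in an Accept header to its q-value (default 1.0)."""
    prefs = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        prefs[media_type] = q
    return prefs


def choose_flavour(accept_header):
    """Return "xhtml" only when XHTML is explicitly preferred over HTML."""
    prefs = parse_accept(accept_header or "")
    xhtml_q = prefs.get("application/xhtml+xml", 0.0)
    html_q = prefs.get("text/html", 0.0)
    # A tie or a missing XHTML entry falls back to HTML, so browsers that
    # send no explicit preference (or only */*) keep getting text/html.
    return "xhtml" if xhtml_q > html_q else "html"
```
</pre>

<p>Under this rule a header such as
<code>application/xhtml+xml,text/html;q=0.9</code> selects the XHTML flavour,
while a bare <code>*/*</code>, or a header listing both types with equal
weight, selects HTML.</p>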

<h2>Cool URIs</h2>

<p>My guides in designing a URI scheme have been Tim Berners-Lee's essay
"<a href="http://www.w3.org/Provider/Style/URI">Cool URIs don't change</a>" and Jakob Nielsen's essay "<a href="http://www.useit.com/alertbox/990321.html">URL
as UI</a>".</p>

<p>I haven't taken all their advice; in particular, I didn't want to
implement the full date-based URI scheme recommended by Berners-Lee, or
the use of a spell-checking web server as recommended by Nielsen.</p>

<p>These are the principles I'm trying to follow:</p>

<ul>
<li><p>allow file extensions to be omitted on all site URIs, e.g.,
<code>.../site/design-philosophy</code> instead of
<code>.../site/design-philosophy.html</code></p></li>
<li><p>use only lower-case alphanumeric text in URI path components, with a
hyphen used to separate words, e.g., <code>design-philosophy</code> instead of
<code>design_philosophy</code>, <code>DesignPhilosophy</code>, or (worst of all)
<code>design%20philosophy</code></p></li>
<li><p>hide the fact that Blosxom (or some other mechanism) is being used
to generate pages, e.g., <code>.../site/design-philosophy</code> instead of
<code>.../cgi-bin/blosxom.cgi/site/design-philosophy</code></p></li>
<li><p>use a flat one- or two-level hierarchy for organizing the site,
e.g., <code>.../mozilla/foo</code> instead of
<code>.../computers/internet/browsers/mozilla/foo</code></p></li>
</ul>
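
<p>The second principle above is easy to apply mechanically. The following
Python sketch turns a title into a lower-case, hyphen-separated path
component. The function name <code>slugify</code> is a hypothetical helper
introduced only for this example, and it handles ASCII text only.</p>

<pre>
```python
import re


def slugify(title):
    """Lower-case the title and join its alphanumeric runs with hyphens."""
    # Anything that is not an ASCII letter or digit (spaces, underscores,
    # punctuation) is treated as a word separator.
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)
```
</pre>

<p>Note that this sketch does not split run-together words, so
<code>DesignPhilosophy</code> would become <code>designphilosophy</code>
rather than <code>design-philosophy</code>.</p>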

<h2>Transparent upgrade</h2>

<p>I've had my web site active for several years, and in some cases I've
published articles that other people have linked to. I wanted to turn
my home page into the index page for my weblog, and have category
index pages under that, e.g., <code>http://www.hecker.org/mozilla/</code> for my
Mozilla-related blog postings. However at the same time I needed to ensure
that the URIs for existing directories and files would still work.</p>

<p>In particular, I wanted to ensure the following:</p>

<ul>
<li><p>all URIs for existing documents (e.g.,
<code>http://www.hecker.org/info/bio.html</code>) would continue to return the
appropriate document</p></li>
<li><p>all URIs for existing directories (e.g.,
<code>http://www.hecker.org/writings/</code>) would return either the existing
index page or (optionally) an index page for a blog category (e.g.,
<code>http://www.hecker.org/mozilla/</code>)</p></li>
<li><p>all other URIs would be considered as requests for blog-related
pages</p></li>
</ul>

<h2>Implementation</h2>

<p>See the site's <a href="http://hecker.org/site/colophon">colophon</a> for more information on how I put the
above philosophy into practice. (In short, it involves using the
Blosxom blogging system and lots of URI rewriting.)</p>
</div>
    </content>
  </entry>
<entry>
    <id>tag:hecker.org,2004:/site/colophon</id>
    <link rel="alternate" type="text/html" href="http://hecker.org/site/colophon" />

    <title type="text">Colophon</title>
    <published>2004-12-11T07:30:00Z</published>
    <updated>2004-12-11T07:30:00Z</updated>
    <category term="site" />
    <author>
      <name>Frank Hecker</name>
      <uri>http://hecker.org</uri>
    </author>
    <content type="xhtml" xml:base="http://hecker.org" xml:lang="en">
<div xmlns="http://www.w3.org/1999/xhtml"><p>This site is a mixture of static content and dynamic content served
through the <a href="http://www.blosxom.com/">Blosxom blogging system</a>. I use
various URI rewriting rules and a number of Blosxom plugins (some
slightly hacked) in order to implement the site according to my
personal <a href="http://hecker.org/site/design-philosophy">design philosophy</a>.</p>

<h2>Overall implementation</h2>

<p>As mentioned above, my site is (for the most part) powered either
directly or indirectly by Blosxom. I use three different approaches to
serving site content:</p>

<ul>
<li><p>All weblog pages (including the site home page) are dynamically
generated by Blosxom.</p></li>
<li><p>Most remaining (non-weblog) pages, eventually including all my
<a href="http://hecker.org/writings">writings</a>, are statically generated by Blosxom, with the
resulting HTML files then copied into the document root directory.</p></li>
<li><p>The remaining web pages and other files (e.g., images, PDF files,
tarballs, and the like) are traditional static web content. Over
time I'll migrate as many of these files as possible into Blosxom
(among other things, so I can take advantage of
<a href="http://hecker.org/blosxom/markdown">Markdown</a> for text formatting).</p></li>
</ul>

<p>The site is hosted by an Intel server running Apache and Linux; since
I administer the server I have complete control over the Apache
configuration, and implement extensive URI rewriting to integrate the
three types of content together in a reasonably seamless manner.</p>

<p>I use <a href="http://subversion.tigris.org/">Subversion</a> to maintain a version-controlled repository of
all the files needed to (re)create the site. (One of the reasons I
chose Blosxom was that it stores weblog content as regular text
files in the file system--as opposed to in a relational database, as
a lot of other blog software does--and thus is well suited to use
with a version control system.) I then use <a href="http://www.gnu.org/software/make/make.html">GNU Make</a> to automate
pushing new content to the site or updating existing content.</p>

<h2>URI rewriting</h2>

<p>My overall goal is that URIs used to request site content should be
entirely independent of the techniques used to serve the content; I
use URI rewriting to achieve this goal. In general, if a requested URI
refers to existing static content on the site then that content is
returned directly by the web server; otherwise Blosxom is invoked to
resolve the URI and generate the appropriate content. However there
are exceptions; for example, we force URIs referring to certain
pre-existing directories (e.g., <code>/mozilla</code>) to be instead interpreted
as Blosxom categories.</p>
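
<p>The decision just described can be summarized in a few lines of Python.
This is only a sketch of the logic, not the Apache rewriting rules
themselves. The function name <code>choose_handler</code>, the
<code>forced_categories</code> parameter and the return values
<code>"static"</code> and <code>"blosxom"</code> are all assumptions made for
the example, and it does no sanitizing of the requested path.</p>

<pre>
```python
import os.path


def choose_handler(uri_path, document_root, forced_categories):
    """Decide whether the web server or Blosxom should answer a request."""
    trimmed = uri_path.rstrip("/") or "/"
    # Exception: some pre-existing directories are always treated as
    # Blosxom categories, even though they exist in the file system.
    if trimmed in forced_categories:
        return "blosxom"
    # General rule: existing static content is served directly.
    local_path = os.path.join(document_root, uri_path.lstrip("/"))
    if os.path.exists(local_path):
        return "static"
    # Everything else is handed to Blosxom to resolve.
    return "blosxom"
```
</pre>

<p>For example, with <code>forced_categories</code> set to
<code>("/mozilla",)</code>, a request for <code>/mozilla/</code> goes to
Blosxom even if a <code>mozilla</code> directory exists under the document
root, while a request for an existing file is served as static content.</p>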

<p>I implemented the URI rewriting rules directly in the <code>httpd.conf</code>
Apache configuration file (actually, in a file included by that
file). For more information, including the actual rewriting rules, see
my discussion of <a href="http://hecker.org/site/uri-rewriting">URI rewriting and canonical URIs</a>.</p>

<h2>Extensionless URIs and URI redirection</h2>

<p>I wrote two Blosxom plugins to implement my chosen URI scheme. The
<a href="http://hecker.org/blosxom/extensionless">extensionless</a> plugin allows use of URIs without file extensions
to refer to individual entry pages. The <a href="http://hecker.org/blosxom/canonicaluri">canonicaluri</a> plugin forces
redirects on category and date-based archive pages if the URI omits a
trailing slash, in order to emulate standard Apache behavior for URIs
referring to directories; it also forces a redirect if a URI for an
individual entry page includes a (spurious) trailing slash or an
unneeded <code>.html</code> extension.</p>
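
<p>The redirect rules above can be sketched as a small Python function. This
is not the plugin's actual code; the function name
<code>canonical_redirect</code> and the boolean parameter
<code>is_directory_like</code> are invented for illustration, and how the
plugin itself tells the two kinds of page apart is not shown here.</p>

<pre>
```python
def canonical_redirect(path, is_directory_like):
    """Return the path to redirect to, or None if path is already canonical.

    is_directory_like is True for category and date-based archive pages
    and False for individual entry pages.
    """
    if is_directory_like:
        # Category and archive URIs must end with a slash.
        return None if path.endswith("/") else path + "/"
    # Entry URIs must not end with a slash or an unneeded .html extension.
    canonical = path.rstrip("/") or "/"
    if canonical.endswith(".html"):
        canonical = canonical[: -len(".html")]
    return None if canonical == path else canonical
```
</pre>

<p>A caller would issue an HTTP redirect to the returned path whenever the
result is not <code>None</code>, and otherwise serve the page as
requested.</p>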

<h2>Two-column "liquid" layout</h2>

<p>I implemented the site layout using the techniques described in the
article <a href="http://www.alistapart.com/articles/negativemargins/">"Creating Liquid Layouts with Negative Margins"</a>
published on the <a href="http://www.alistapart.com/">A List Apart</a> site. However I didn't
implement the full set of techniques because I wasn't using background
colors or images.</p>
</div>
    </content>
  </entry>
<entry>
    <id>tag:hecker.org,2004:/site/uri-rewriting</id>
    <link rel="alternate" type="text/html" href="http://hecker.org/site/uri-rewriting" />

    <title type="text">URI rewriting and canonical URIs</title>
    <published>2004-11-18T12:10:00Z</published>
    <updated>2004-11-18T12:10:00Z</updated>
    <category term="site" />
    <author>
      <name>Frank Hecker</name>
      <uri>http://hecker.org</uri>
    </author>
    <content type="xhtml" xml:base="http://hecker.org" xml:lang="en">
<div xmlns="http://www.w3.org/1999/xhtml"><p>Here I document the way in which I use URI rewriting (along with
redirection and a couple of Blosxom plugins) to help implement my
personal <a href="http://hecker.org/site/design-philosophy">design philosophy</a> for my web
site. My goal is to create a unified URI space within which static and
dynamic content can transparently co-exist, with publicly-visible URIs
for human-readable content (i.e., HTML pages) having a canonical form
that omits file extensions or other content type specifiers.</p>

<p>Achieving this goal requires solving two separate problems:
intermixing dynamic and static content (in my case, Blosxom-generated
and non-Blosxom content) in the same URI hierarchy, and recognizing
and enforcing preferred canonical forms for URIs.</p>

<h2>Intermixing dynamic and static content</h2>

<p>As an example of freely intermixing dynamically-generated content and
static content (e.g., images) within the same URI hierarchy, consider
a Blosxom-based web site where <code>http://www.example.com/foo</code> is a
Blosxom category, <code>/foo/bar.html</code> is the HTML page displayed for an
individual Blosxom entry in that category, and <code>/foo/baz.jpg</code> is an
image referenced by that entry. (We assume that the site is already
using one of the <a href="http://www.blosxom.com/faq/cgi/hide_cgi_bit.htm" title="Blosxom FAQ: How do I hide the /cgi-bin/blosxom.cgi bit of my URL?">suggested techniques</a> for hiding the
<code>/cgi-bin/blosxom.cgi</code> part of the URI.) With Blosxom there are at
least two possible approaches to support such intermixing.</p>

<p>One approach would be to invoke Blosxom (i.e., <code>blosxom.cgi</code>) for each
and every URI processed, and then to use the <a href="http://www.blosxom.com/plugins/display/binary.htm">binary plugin</a> or the
<a href="http://www.blosxom.com/plugins/files/static_file.htm">static_file plugin</a> based on it; these Blosxom plugins check to see
if a requested URI corresponds to an existing file in the file system
and, if so, they return the file's contents as the output (as opposed
to trying to generate a Blosxom page).</p>

<p>The second possible approach is the reverse: Have Apache serve up
existing files (and indices for existing directories), and invoke
Blosxom only when the URI references a file (or directory) that
doesn't exist. This is the approach I've taken, for various reasons;
in particular, I wanted to avoid the overhead of invoking Blosxom for
each and every URI. This approach is also compatible with a strategy
of converting all or part of the Blosxom-managed content into static
files. For example, one could run <code>blosxom.cgi</code> in static mode to
generate files and directories under the Apache document root; Apache
would then serve up those files and directories just as it would
non-Blosxom content.</p>

<h2>Recognizing and enforcing canonical URIs</h2>

<p>We want to enforce the following rules for how URIs should be
represented:</p>

<ul>
<li><p>URIs used to access HTML content for directories, Blosxom
categories, and Blosxom date-based archive pages should have no
<code>index.html</code> component and one (and only one) trailing slash:</p>

<pre><code>http://www.example.com/foo/
http://www.example.com/foo/2004/11/14/
</code></pre></li>
<li><p>URIs used to access HTML content for static web pages and individual
Blosxom entries should have no <code>.html</code> extension and no trailing
slash:</p>

<pre><code>http://www.example.com/foo/bar
</code></pre></li>
<li><p>URIs used to access all other (non-HTML) content should have
filename extensions; <code>index.*</code> components should be included for
directories, Blosxom categories, and Blosxom date-based archives,
etc.:</p>

<pre><code>http://www.example.com/foo/baz.png
http://www.example.com/foo/index.rss
http://www.example.com/foo/2004/11/14/index.rss
</code></pre></li>
</ul>

<p>Enforcing these conventions is relatively straightforward: if a
requested URI does not follow the above rules then we simply force a
redirect to the canonical URI. However, determining whether a URI is
already in the proper canonical form or not is less straightforward,
especially when we intermix Blosxom and non-Blosxom content.</p>

<p>For example, if the URI <code>http://www.example.com/foo/bar</code> is requested
then we have to do the following checks:</p>

<ul>
<li>Is <code>/foo/bar</code> an existing directory?</li>
<li>Is there an existing HTML file <code>/foo/bar.html</code>?</li>
<li>Is <code>/foo/bar</code> a Blosxom category?</li>
<li>Is there an individual Blosxom entry <code>/foo/bar.html</code>?</li>
</ul>

<p>In practice we do these checks in the order shown, both to eliminate
ambiguity (for example, if there's a directory <code>/foo/bar</code> and also a
Blosxom entry <code>/foo/bar.html</code>) and also because it's easier to
implement: The first two checks are done by Apache in the course of
doing URI rewriting, and the last two checks are done by a Blosxom
plugin once the URI has been passed off to Blosxom for processing.</p>
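
<p>The order of the four checks can be sketched as a small Python function.
This is only an illustration of the precedence described above, not code
that runs on the site; the function name and the four sets passed to it
are hypothetical stand-ins for the file system and the Blosxom data
directory.</p>

```python
from typing import Optional


def classify(path: str, dirs: set, files: set,
             categories: set, entries: set) -> Optional[str]:
    """Say which kind of resource an extensionless path refers to.

    The checks run in the order given in the text; the first hit wins.
    All four sets are hypothetical inputs used only for illustration.
    """
    if path in dirs:                 # 1. existing directory (Apache)
        return "directory"
    if path + ".html" in files:      # 2. existing HTML file (Apache)
        return "static page"
    if path in categories:           # 3. Blosxom category (plugin)
        return "category"
    if path + ".html" in entries:    # 4. Blosxom entry (plugin)
        return "entry"
    return None
```

<p>With a directory <code>/foo/bar</code> and a Blosxom entry <code>/foo/bar.html</code> both
present, the directory wins because it is checked first.</p>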

<h2>Strategy</h2>

<p>Here's the overall strategy we follow in doing URI rewriting:</p>

<ul>
<li><p>We divide requests into those that can be satisfied by returning a
static file or directory index and those that are for
dynamically-generated content; the latter are rewritten into URIs
that invoke the necessary CGI script (<code>blosxom.cgi</code>). Note that in
some cases we can make an immediate determination as to whether
content is static or dynamic, while in other cases we have to wait
for the results of subsequent URI rewriting rules.</p></li>
<li><p>If a request references an existing directory then if necessary we
force a redirect to the canonical URI for the directory, namely a
URI with one (and only one) trailing slash after the directory name.</p></li>
<li><p>If a request is for an existing <code>index.html</code> file then we force a
redirect to the canonical URI for the directory in which that file
is located.</p></li>
<li><p>If a request is for any other existing HTML file then we allow and
require that the <code>.html</code> file extension be omitted. If the <code>.html</code>
file extension is included in the requested URI then we force a
redirect to the canonical URI for the file, namely a URI that omits
the <code>.html</code> file extension and has no trailing slashes.</p></li>
<li><p>If a request is for an existing file (or a symbolic link to a file)
then it is handled by Apache in the normal way. Any other requests
are passed to Blosxom for processing.</p></li>
<li><p>Blosxom (actually, a Blosxom plugin) then performs its own set of
checks on URIs, and rewrites URIs and/or forces redirects if
needed.</p></li>
</ul>
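
<p>As a rough model of the Apache side of this strategy, the following
Python sketch maps a request path to an action. It is a simplification
under stated assumptions: the function name, the <code>dirs</code> and <code>files</code>
sets, and the action labels are all hypothetical, and the real work is
done by the rewriting rules rather than by code like this.</p>

```python
def apache_action(path: str, dirs: set, files: set) -> tuple:
    """Return (action, path) for a request, following the strategy above.

    dirs holds directory paths without a trailing slash and files holds
    file paths; both are hypothetical stand-ins for the document root.
    """
    bare = path.rstrip("/")
    if bare in dirs:
        canonical = bare + "/"       # exactly one trailing slash
        if path == canonical:
            return ("serve", path)
        return ("redirect", canonical)
    if bare.endswith("/index.html") and bare in files:
        # Canonical form is the enclosing directory, with trailing slash.
        return ("redirect", bare[: -len("index.html")])
    if bare.endswith(".html") and bare in files:
        return ("redirect", bare[: -len(".html")])
    if bare + ".html" in files:
        # Extensionless URI for an HTML file: serve the file itself.
        if path == bare:
            return ("serve", bare + ".html")
        return ("redirect", bare)
    if bare in files:
        if path == bare:
            return ("serve", bare)
        return ("redirect", bare)
    return ("blosxom", path)         # everything else goes to Blosxom
```

<p>For example, with a directory <code>/foo</code> and a file <code>/foo/bar.html</code>, a request
for <code>/foo</code> is redirected to <code>/foo/</code>, a request for <code>/foo/bar.html</code> is
redirected to <code>/foo/bar</code>, and a request for <code>/foo/bar</code> is served from
<code>/foo/bar.html</code>.</p>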

<h2>URI rewriting quirks</h2>

<p>Before I discuss the URI rewriting rules themselves, here are various
points to keep in mind when reading the rules; some of these points
are not necessarily immediately apparent from reading the
documentation for the <a href="http://httpd.apache.org/docs-2.0/mod/mod_rewrite.html"><code>mod_rewrite</code> module</a> and the
Apache <a href="http://httpd.apache.org/docs-2.0/misc/rewriteguide.html">URL Rewriting Guide</a>:</p>

<ul>
<li><p>In our case we have root access to the system and can put our
rewriting rules in the master Apache configuration file
(<code>httpd.conf</code>). These rules will <em>not</em> work as is if you do not have
root access and have to put your rewriting rules in a <code>.htaccess</code>
file. (I don't have time to revise the rules to work for the
<code>.htaccess</code> case, but perhaps someone else can do so; unfortunately
URI rewriting in a <code>.htaccess</code> file is much more complicated than when
done in <code>httpd.conf</code>.)</p></li>
<li><p><a href="http://httpd.apache.org/docs-2.0/mod/mod_rewrite.html#rewriterule">RewriteRule</a> directives are evaluated in order looking for a
match of the URI against the left-hand side of the RewriteRule
directive. <a href="http://httpd.apache.org/docs-2.0/mod/mod_rewrite.html#rewritecond">RewriteCond</a> directives are looked at <em>only</em> if the
corresponding RewriteRule directive matches. If the left-hand side
of the RewriteRule directive matches and the corresponding
RewriteCond directives (if any) evaluate true, then the URI is
rewritten into the form specified by the right-hand side of the
RewriteRule directive. Otherwise Apache just goes on to the next
RewriteRule.</p></li>
<li><p>The URI matched against the RewriteRule directives is not the full
URI but rather just the path component of the URI; it does <em>not</em>
include either the hostname part or any query string. Thus, for
example, if the original request was for
<code>http://www.example.com/foo/bar?flav=baz</code> then the RewriteRule
directives will be matched against <code>/foo/bar</code>. (At least this is
true in my case; this works slightly differently if you have to put
your rules in a <code>.htaccess</code> file.)</p></li>
<li><p>Rule matching is done using regular expressions modeled on those in
Perl. However, Apache doesn't support the full range of Perl regular
expressions, and in particular doesn't support any options to
address the "greedy matching" problem. For example, if you match
URIs against an expression like <code>^(.*)//?$</code> (for example, to detect
excess trailing slashes) then for a URI like <code>/foo//</code> Apache will
assign <code>$1</code> to be <code>/foo/</code> instead of <code>/foo</code> as we'd like. We have to
hack around this problem as described in the next section.</p></li>
<li><p>Once we have rewritten a URI to our satisfaction we normally want to
stop URI rewriting at that point, and use the <code>L</code> ("last") flag
to do this; otherwise Apache would continue evaluating RewriteRule
directives looking for further matches. However, using <code>L</code> by itself
is normally not sufficient to get Apache to do the right thing;
typically we have to include other flags as discussed below.</p></li>
<li><p>Normally once rewriting ends Apache will simply take the (possibly
rewritten) path component of the URI and append it to the defined
"document root" value (i.e., the directory where static web content
is located). This causes a problem if the URI actually refers to a
path for which an Apache <a href="http://httpd.apache.org/docs-2.0/mod/mod_alias.html#alias">Alias</a> or <a href="http://httpd.apache.org/docs-2.0/mod/mod_alias.html#scriptalias">ScriptAlias</a> directive is
defined. (For example, on my server <code>/cgi-bin</code> is not in the main
document root, but is located elsewhere as defined by a ScriptAlias
directive.) To fix this we use the <code>PT</code> ("pass through") flag where
needed to tell Apache to first pass the URI through to the
<a href="http://httpd.apache.org/docs-2.0/mod/mod_alias.html"><code>mod_alias</code> module</a>.</p></li>
<li><p>If the URI refers to an existing directory then after the first
round of rewriting ends the Apache <code>mod_dir</code> module will initiate
so-called "subrequests" for URIs with <code>index.html</code>, etc., appended,
and each subrequest will start a new round of rewriting.  We don't
want to have any of our rewriting rules invoked in that case (among
other things, this can lead to problems with looping), so we also
use the <code>NS</code> ("no subrequest") flag to note that our rules should
not be invoked for such internal subrequests.</p></li>
<li><p>In some cases we don't simply want to rewrite the URI, we want to
correct perceived mistakes in how the URI was originally
requested. (For example, we want all URIs referring to existing
directories to end in one--and only one--trailing slash.) For these
cases we use the <code>R</code> flag to tell Apache to redirect the user's
browser to a new and corrected URI. (We actually use <code>R=301</code> to
specify the HTTP code returned, in this case a code meaning "the URI
has permanently moved".) In combination with the <code>L</code> flag this
immediately ends rewriting for the original URI; a new round of
rewriting starts once the browser requests the new URI.</p></li>
</ul>
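
<p>The "greedy matching" behavior mentioned above is easy to reproduce
outside Apache. The sketch below uses Python's <code>re</code> module purely as a
stand-in for Apache's regular expression engine, on the assumption that
both treat <code>(.*)</code> greedily for this pattern; the helper name is
hypothetical.</p>

```python
import re

# The pattern from the text: a greedy (.*) followed by one or two slashes.
GREEDY = re.compile(r"^(.*)//?$")


def captured_prefix(path: str) -> str:
    """Return what $1 would hold after matching GREEDY against path.

    If the pattern does not match, the path is returned unchanged.
    """
    match = GREEDY.match(path)
    return match.group(1) if match else path
```

<p>Here <code>captured_prefix("/foo//")</code> returns <code>/foo/</code> rather than <code>/foo</code>: the
greedy group takes the first of the two trailing slashes for itself and
leaves only one for the rest of the pattern.</p>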

<h2>The rewriting rules</h2>

<p>Here are the actual Apache URI rewriting rules we use, in order of
evaluation and application:</p>

<ol>
<li><p>We first enable the rewriting engine and specify a location for
logging rewriting actions:</p>

<pre><code>RewriteLog logs/hecker_error_log
RewriteLogLevel 0
RewriteEngine on
</code></pre>

<p>Note that <a href="http://httpd.apache.org/docs-2.0/mod/mod_rewrite.html#rewriteloglevel">RewriteLogLevel</a> should be set to a non-zero value to
enable logging; a loglevel value of 9 produces lots of output and
can be very useful when debugging rewriting rules and/or learning
how rewriting works.</p>

<p>Also note that the <a href="http://httpd.apache.org/docs-2.0/mod/mod_rewrite.html#rewritebase">RewriteBase</a> directive is not needed here
because we are specifying rewriting rules in the <code>httpd.conf</code>
configuration file; RewriteBase is typically needed only when
specifying rewriting rules in a <code>.htaccess</code> file.</p></li>
<li><p>We start the process of handling trailing slashes by stripping all
but the last two trailing slashes off of all URIs:</p>

<pre><code>RewriteRule  ^(.*)///  $1//  [N,NS]
</code></pre>

<p>The form of this rule is a hack to get around the "greedy matching"
problem described above, which prevents us from using the regular
expression <code>^(.*)//+$</code> for this purpose. For any URI ending in at
least three slashes, we rewrite the URI to remove a trailing slash
and then use the <code>N</code> ("next round") flag to restart rewriting from
the beginning. Since this is the first rule this simply repeats the
rule until no more than two trailing slashes remain on the URI.</p>

<p>Why stop at two, as opposed to removing all but one slash? Because
we want to keep track of URIs with excess trailing slashes and
force a redirect further on down; hence we leave at least one
excess slash on the URI.</p></li>
<li><p>We protect against potential cross-site tracing (XST) attacks by
rejecting all HTTP requests with either the TRACE or TRACK
methods. (See the relevant <a href="http://www.whitehatsec.com/press_releases/WH-PR-20030120.pdf">WhiteHat Security press release</a>
and a related <a href="http://archives.neohapsis.com/archives/vulnwatch/2003-q1/0035.html">VulnWatch list posting</a>.) The <code>F</code> flag causes
Apache to immediately send back an HTTP response of 403
("FORBIDDEN"). (The <code>L</code> flag is not needed here because the <code>F</code>
flag causes rewriting to stop automatically.)</p>

<pre><code>RewriteCond  %{REQUEST_METHOD}  ^(TRACE|TRACK)
RewriteRule  .*  -  [F,NS]
</code></pre></li>
<li><p>We eliminate duplicate trailing slashes at the end of URIs by
redirecting to a new URI with only one trailing slash, to enforce
canonical forms for directory URIs and also get rid of some
potential problems when requesting Blosxom pages.</p>

<p>Note that the first rule above guarantees that by the time we get
to this rule we'll never see more than two trailing slashes on a
URI.</p>

<pre><code>RewriteRule  ^(.*)//$  $1/  [L,R=301,NS]
</code></pre></li>
<li><p>We don't attempt to do rewriting of URIs that are handled
separately and do not correspond to either Blosxom content or
content in our document root. In particular, we don't attempt to
rewrite <code>/cgi-bin/...</code> or <code>/usage/...</code> URIs (for which we define
aliases for the virtual host associated with the site) or
<code>/icons/...</code> URIs (for which <code>httpd.conf</code> defines a server-wide
alias). Instead we just pass these URIs through to be handled by
<code>mod_alias</code>.</p>

<p>Note that there is also a server-wide alias for <code>/manual/</code> but we
ignore this since we aren't serving up a copy of the Apache
documentation.  (We include <code>/icons/</code> because it is needed for
auto-generated directory listings.)</p>

<pre><code>RewriteRule  ^/(cgi-bin|icons|usage)(/.*)?$  -  [L,PT,NS]
</code></pre></li>
<li><p>We have Blosxom override the directory index for <code>/</code> (the home
page) and other existing directories that should be treated as
Blosxom categories instead.  (References to other stuff under those
directories, including <code>index.html</code> pages, are handled below.)</p>

<p>Note that these rewrite rules must come first in order to handle
these special cases before we check for existing directories in
general. The URIs may or may not have a trailing slash; we need
to handle both cases. We pass any trailing slash on to Blosxom,
which can then force a redirect if the trailing slash is omitted;
we do it this way (rather than forcing a redirect here) so that
such redirection can be done consistently for all Blosxom
categories.</p>

<p>Also note that in order for Blosxom to produce the correct results
the directories in question should <em>not</em> have any <code>index.*</code> files
within the directory. Further on down we explicitly handle the case
where an <code>index.html</code> page is requested (by redirecting to the
canonical URI without <code>index.html</code>) but references to other
<code>index.*</code> pages (e.g., <code>/misc/index.rss</code>) will not return the
corresponding Blosxom page if a file of that name is already
present in the directory.</p>

<pre><code>RewriteRule  ^(/?)$  /cgi-bin/blosxom.cgi$1  [L,PT,NS]
RewriteRule  ^/(blosxom|misc|mozilla)(/?)$ \
        /cgi-bin/blosxom.cgi/$1$2  [L,PT,NS]
</code></pre></li>
<li><p>We next check to see if the URI corresponds to an existing
directory under the document root. If so and the URI has a trailing
slash then we simply stop rewriting and use the URI as is;
otherwise we add a trailing slash and force a redirect to the new
URI.</p>

<p>Note that we can't use <code>%{REQUEST_FILENAME}</code> in the directory
existence check because that variable has not yet been set; hence
we have to explicitly append the URI to the document root pathname.</p>

<pre><code>RewriteCond  %{DOCUMENT_ROOT}/$1  -d
RewriteRule  ^/(.*)/$  -  [L,NS]

RewriteCond  %{DOCUMENT_ROOT}/$1  -d
RewriteRule  ^/(.*)$  /$1/  [L,R=301,NS]
</code></pre></li>
<li><p>If the URI is explicitly requesting an <code>index.html</code> file for an
existing directory (e.g., <code>/foo/index.html</code>) then we force a
redirect to the canonical URI for that directory (e.g., <code>/foo/</code>).</p>

<p>Note that we catch the case where the URI (incorrectly) includes a
trailing slash (e.g., <code>/foo/index.html/</code>), and redirect that to the
canonical URI as well. Note also that skipping this rule on
internal subrequests is particularly important; otherwise we'd
cause major looping problems with URIs requesting directory
indices.</p>

<pre><code>RewriteCond  %{DOCUMENT_ROOT}$1  -d
RewriteRule  ^(.*)/index\.html(/?)$  $1/  [L,R=301,NS]
</code></pre></li>
<li><p>We check to see if the URI is explicitly requesting an existing
HTML file (e.g., <code>/foo/bar.html</code> where <code>bar.html</code> exists). If so
then we force a redirect to the canonical URI for the file, without
the <code>.html</code> file extension (e.g., <code>/foo/bar</code>).</p>

<p>As with the previous rule, we properly handle the case where the
URI (incorrectly) includes a trailing slash, and we make sure
to avoid problems with internal subrequests for directory index
URIs.</p>

<p>Finally, note a minor bug in the rule: It incorrectly rewrites a
URI like <code>/foo/.html</code> that references a hidden file named <code>.html</code>;
I chose to ignore this uncommon (and arguably nonsensical) case.</p>

<pre><code>RewriteCond  %{DOCUMENT_ROOT}/$1.html  -f
RewriteRule  ^/(.*)\.html/?$  /$1  [L,R=301,NS]
</code></pre></li>
<li><p>We check to see if the URI corresponds to an existing HTML file
after adding a <code>.html</code> extension. If so, we rewrite the URI to
include the extension and pass it through to Apache.</p>

<p>Again we properly handle the case where a trailing slash has been
incorrectly added. Also note that we use the <code>OR</code> flag on the
RewriteCond directive because we are checking to see if the URI
references a regular file <em>or</em> a symbolic link; by default all
RewriteCond directives must evaluate true in order for the
corresponding rewriting rule to be invoked.</p>

<pre><code>RewriteCond  %{DOCUMENT_ROOT}/$1.html  -f [OR]
RewriteCond  %{DOCUMENT_ROOT}/$1.html  -l
RewriteRule  ^/(.*)/$  /$1.html  [L,R=301,NS]


RewriteCond  %{DOCUMENT_ROOT}/$1.html  -f [OR]
RewriteCond  %{DOCUMENT_ROOT}/$1.html  -l
RewriteRule  ^/(.*)$  /$1.html  [L,NS]
</code></pre></li>
<li><p>We check to see if the URI corresponds to any other (non-HTML)
existing file or symbolic link and, if so, we force a redirect if
a trailing slash is present.</p>

<pre><code>RewriteCond  %{DOCUMENT_ROOT}/$1  -f [OR]
RewriteCond  %{DOCUMENT_ROOT}/$1  -l
RewriteRule  ^/(.*)/$  /$1  [L,R=301,NS]
</code></pre></li>
<li><p>Finally, we pass the URI on to Blosxom if it does not correspond
to an existing (non-HTML) file or symlink.</p>

<pre><code>RewriteCond  %{DOCUMENT_ROOT}/$1  !-f
RewriteCond  %{DOCUMENT_ROOT}/$1  !-l
RewriteRule  ^/(.*)$  /cgi-bin/blosxom.cgi/$1  [L,PT,NS]
</code></pre></li>
</ol>
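
<p>To see how the two trailing-slash rules work together, here is a small
Python model of them. It is a sketch only: Python's <code>re</code> module stands in
for Apache's matcher, the loop stands in for the <code>N</code> flag, and both
function names are hypothetical.</p>

```python
import re


def strip_to_two_slashes(path: str) -> str:
    """Model the [N] rule: drop one slash per pass until no '///' remains."""
    pattern = re.compile(r"^(.*)///")
    while pattern.search(path):
        # One rewrite per pass, like restarting the rules with the N flag.
        path = pattern.sub(r"\1//", path, count=1)
    return path


def redirect_target(path: str):
    """Model the R=301 rule: return the redirect target, or None.

    None means the path has no excess trailing slash, so no redirect
    would be issued by this rule.
    """
    path = strip_to_two_slashes(path)
    match = re.match(r"^(.*)//$", path)
    return match.group(1) + "/" if match else None
```

<p>A request for <code>/foo////</code> is first reduced to <code>/foo//</code> and then redirected
to <code>/foo/</code>, while <code>/foo/</code> and <code>/foo</code> pass through this pair of rules
without a redirect.</p>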

<p>This concludes the rewriting rules.</p>

<h2>After (Apache) rewriting ends</h2>

<p>Once Apache has concluded all rewriting (including further rewriting
done for new requests due to redirects) all URIs should be in one of
the following forms, and are handled as indicated:</p>

<ul>
<li><p>Canonical URIs for existing directories (e.g., <code>/foo/</code>). Each such
URI is subsequently processed by the <code>mod_dir</code> module, which will
attempt to look for a directory index file as specified by the
<a href="http://httpd.apache.org/docs-2.0/mod/mod_dir.html#directoryindex">DirectoryIndex</a> directive. The <code>mod_dir</code> module may generate
internal subrequests, e.g., for <code>/foo/index.html</code>, but these will
not invoke our rewriting rules.</p>

<p>If <code>mod_dir</code> cannot find an index file then a directory listing will
be generated by the <a href="http://httpd.apache.org/docs-2.0/mod/mod_autoindex.html"><code>mod_autoindex</code> module</a> if the
<code>Indexes</code> option is specified for the <a href="http://httpd.apache.org/docs-2.0/mod/core.html#options">Options</a> directive.</p></li>
<li><p>URIs for existing HTML files (e.g., <code>/foo/bar.html</code>) rewritten from
the canonical form of such URIs (e.g., <code>/foo/bar</code>). Each such URI is
handled normally by Apache (<em>without</em> going through <code>mod_alias</code>,
since there's no need to do so).</p></li>
<li><p>Canonical URIs for existing non-HTML files (e.g.,
<code>/foo/baz.png</code>). Each such URI is handled normally by Apache (again
without going through <code>mod_alias</code>).</p></li>
<li><p>URIs to be handled by Blosxom (e.g.,
<code>/cgi-bin/blosxom.cgi/foo/</code>). Each such URI is passed to <code>mod_alias</code>
(to determine the location of the <code>cgi-bin</code> directory) and then the
<code>blosxom.cgi</code> script is invoked with the path specified (e.g.,
<code>/foo/</code>). Note that the path in question is not necessarily (yet) in
canonical form, except that it is guaranteed not to have more than
one trailing slash.</p></li>
<li><p>Other (non-Blosxom) <code>cgi-bin</code> URIs or other URIs needing further
translation (e.g., <code>/icons</code> URIs for images in auto-generated
directory listings). Each such URI is passed to <code>mod_alias</code>.</p></li>
</ul>

<h2>Canonical URIs in Blosxom</h2>

<p>Once a URI is passed to Blosxom then we have to do the same sorts of
URI checks, rewriting, and/or redirection done by the Apache URI
rewriting rules. We divide this work into two separate plugins:</p>

<ul>
<li><p>The <a href="http://hecker.org/blosxom/extensionless">extensionless plugin</a> checks requests for which the requested
URI lacks a flavour extension (e.g., <code>/foo/bar</code>) and adds the
appropriate flavour extension (<code>.html</code> in our case) to the variable
<code>$blosxom::path_info</code> if there is an individual entry corresponding
to that URI.</p></li>
<li><p>The <a href="http://hecker.org/blosxom/canonicaluri">canonicaluri plugin</a> checks URIs to see if they are in
canonical form and forces a redirect if necessary.</p></li>
</ul>

<p>For more information see the documentation for those plugins.</p>
</div>
    </content>
  </entry>
<entry>
    <id>tag:hecker.org,2004:/site/accessibility</id>
    <link rel="alternate" type="text/html" href="http://hecker.org/site/accessibility" />

    <title type="text">Accessibility statement for www.hecker.org</title>
    <published>2004-10-20T12:12:00Z</published>
    <updated>2004-10-20T12:12:00Z</updated>
    <category term="site" />
    <author>
      <name>Frank Hecker</name>
      <uri>http://hecker.org</uri>
    </author>
    <content type="xhtml" xml:base="http://hecker.org" xml:lang="en">
<div xmlns="http://www.w3.org/1999/xhtml"><p>I've tried to make this site accessible to as many people as possible;
here I describe the accessibility features of this site. (This
statement is based on <a href="http://diveintomark.org/about/accessibility/">Mark Pilgrim's accessibility
statement</a>.) If you have
any questions or comments about the accessibility of this site, feel
free to email me at <a href="mailto:hecker@hecker.org">hecker@hecker.org</a>.</p>

<h2>Access keys</h2>

<p>Most browsers support jumping to specific links by typing special key
combinations defined on the web site.  On Windows, you can press
<acronym>ALT</acronym> + an access key; on Macintosh, you can press
<acronym>Control</acronym> + an access key.</p>

<p>The home page and all archives define the following access keys:</p>

<dl>
<dt>Access key 1</dt>
<dd>Home page</dd>
<dt>Access key 4</dt>
<dd>Search box</dd>
<dt>Access key 9</dt>
<dd>Feedback</dd>
<dt>Access key 0</dt>
<dd>Accessibility statement</dd>
</dl>

<p>(Note that I didn't define an access key to skip to the main content
because the main content is already the first thing on the page.)</p>

<h2>Standards compliance</h2>

<ol>
<li><p>I intend to ensure that all pages are Bobby AAA approved. More on
that later as I complete the necessary work.</p></li>
<li><p>I intend to ensure that all pages are Section 508 compliant. More
on that later as I complete the necessary work.</p></li>
<li><p>The home page and blog archives validate as HTML 4.01 Strict. (Some
older pages on the site have not yet been modified to validate
properly.)</p></li>
<li><p>The home page, blog archives, and other pages use structured
semantic markup.  For example, on pages with more than one entry H2
tags are used for individual post titles, so that JAWS users can
skip to the next post with ALT+INSERT+2.</p></li>
</ol>

<h2>Navigation aids</h2>

<ol>
<li><p>All blog archive pages have <code>rel=home</code> links to aid navigation in
text-only browsers and screen readers; I may add <code>rel=previous</code>,
<code>next</code>, and <code>up</code> links in the future. (Unfortunately <code>prev</code> and
<code>next</code> in particular are not simple to implement in Blosxom, the
blogging system I'm using.) Mozilla users can take advantage of
this feature by selecting the View menu, Show/Hide, Site Navigation
Bar, Show Only As Needed (or Show Always).  Opera 7 has similar
functionality.</p></li>
<li><p>The home page and all archive pages include a search box (access key
4).</p></li>
</ol>

<h2>Links</h2>

<ol>
<li><p>Many links have title attributes which describe the link in greater
detail, unless the text of the link already fully describes the
target (such as the headline of an article).</p></li>
<li><p>Whenever possible, links are written to make sense out of context.
Many browsers (such as JAWS, Home Page Reader, Lynx, and Opera) can
extract the list of links on a page and allow the user to browse the
list, separately from the page.</p></li>
<li><p>Link text is never duplicated; two links with the same link text
always point to the same address.</p></li>
<li><p>There are no "<code>javascript:</code>" pseudo-links.  All links can be
followed in any browser, even if scripting is turned off.</p></li>
<li><p>There are no links that open new windows without warning.</p></li>
</ol>

<h2>Images</h2>

<ol>
<li>With one exception (a photo for my biography page) this site does not
use images at all.</li>
</ol>

<h2>Visual design</h2>

<p>This site and all its archives use cascading style sheets for visual
layout.</p>

<ol>
<li><p>The style sheets for this site do not specify a base font size, and
use relative font sizes to specify the appearance of headings and
related text. Text on this site should be resizable in any browser
that permits text resizing.</p></li>
<li><p>If your browser or browsing device does not support stylesheets at
all, the content of each page is still readable.</p></li>
</ol>

<h2>References</h2>

<p>In creating this site I made use of Mark Pilgrim's
"<a href="http://diveintoaccessibility.org/" title="30 days to a more accessible web site">Dive Into Accessibility</a>" book and related materials. See the book
for a complete list of other references.</p>
</div>
    </content>
  </entry>
</feed>
