<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:blogger='http://schemas.google.com/blogger/2008' xmlns:georss='http://www.georss.org/georss' xmlns:gd="http://schemas.google.com/g/2005" xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-9361273</id><updated>2024-03-14T02:30:00.133-04:00</updated><category term="XML"/><category term="JHOVE"/><category term="HTML"/><category term="W3C"/><category term="PDF"/><category term="TIFF"/><category term="metadata"/><category term="GDFR"/><category term="conferences"/><category term="JPEG"/><category term="JPEG2000"/><category term="Microsoft"/><category term="preservation"/><category term="BigTIFF"/><category term="ISO"/><category term="MP3"/><category term="ODF"/><category term="ZIP"/><category term="DLF"/><category term="DROID"/><category term="IANA"/><category term="JBIG"/><category term="JPEG XR"/><category term="OOXML"/><category term="Planets"/><category term="Pronom"/><category term="UDFR"/><category term="UTF-8"/><category term="Unicode"/><category term="WARC"/><category term="XBRL"/><category term="XHTML"/><category term="humor"/><title type='text'>File Formats Blog</title><subtitle type='html'>This blog has moved. 
The permanent address for this blog is &lt;a href=&quot;http://www.mcgath.com/fileformatsblog&quot;&gt;www.mcgath.com/fileformatsblog&lt;/a&gt;.&#xa;&#xa;Having observed how capriciously Blogger/Blogspot treats its own content providers, I am no longer using it and do not recommend it as a blogging platform.</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://fileformats.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default?alt=atom'/><link rel='alternate' type='text/html' href='http://fileformats.blogspot.com/'/><link rel='next' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default?alt=atom&amp;start-index=26&amp;max-results=25'/><author><name>Gary McGath</name><uri>http://www.blogger.com/profile/12880087933512343984</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>241</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-9361273.post-5797842628536090377</id><published>2009-08-09T08:20:00.002-04:00</published><updated>2009-08-09T08:33:45.176-04:00</updated><title type='text'>Moving day</title><content type='html'>&lt;p&gt;This blog is now &lt;a href=&quot;http://fileformats.wordpress.com&quot;&gt;located at Wordpress&lt;/a&gt;. Old posts will remain here for the foreseeable future, but all new posts will be at the new location.
&lt;p&gt;
The reason for this move is the increasingly capricious policies of Blogspot (i.e., Google) in declaring blogs to have &quot;objectionable content.&quot; I haven&#39;t been affected by that, not counting the time they mysteriously declared this a &quot;spam blog&quot; several years ago along with dozens or hundreds of other blogs, but I&#39;m hoping Wordpress will be a more reasonable host.
&lt;p&gt;
If you&#39;re using &lt;a href=&quot;http://www.mcgath.com/fileformatsblog&quot;&gt;http://www.mcgath.com/fileformatsblog&lt;/a&gt; to bookmark this blog, then you won&#39;t be affected, except that you won&#39;t even see this post and the look will be a bit different. 
&lt;p&gt;
You&#39;ll have to register for a Wordpress account if you want to comment there. There may also be an Open ID option.&lt;/p&gt;</content><link rel='replies' type='application/atom+xml' href='http://fileformats.blogspot.com/feeds/5797842628536090377/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/9361273/5797842628536090377' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/5797842628536090377'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/5797842628536090377'/><link rel='alternate' type='text/html' href='http://fileformats.blogspot.com/2009/08/moving-day.html' title='Moving day'/><author><name>Gary McGath</name><uri>http://www.blogger.com/profile/12880087933512343984</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9361273.post-7346703340231464787</id><published>2009-08-03T15:17:00.000-04:00</published><updated>2009-08-03T15:20:26.954-04:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="JHOVE"/><category scheme="http://www.blogger.com/atom/ns#" term="PDF"/><title type='text'>JHOVE 1.4</title><content type='html'>&lt;p&gt;&lt;a href=&quot;http://sourceforge.net/projects/jhove/&quot;&gt;JHOVE 1.4&lt;/a&gt; is now available on SourceForge. 
The main change is that PDF/A compliance is more accurately identified than before, and is based on the final standard rather than a draft.&lt;/p&gt;</content><link rel='replies' type='application/atom+xml' href='http://fileformats.blogspot.com/feeds/7346703340231464787/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/9361273/7346703340231464787' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/7346703340231464787'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/7346703340231464787'/><link rel='alternate' type='text/html' href='http://fileformats.blogspot.com/2009/08/jhove-14.html' title='JHOVE 1.4'/><author><name>Gary McGath</name><uri>http://www.blogger.com/profile/12880087933512343984</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9361273.post-1813579469393813734</id><published>2009-07-31T08:58:00.002-04:00</published><updated>2009-07-31T09:02:44.012-04:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="ISO"/><category scheme="http://www.blogger.com/atom/ns#" term="JPEG"/><category scheme="http://www.blogger.com/atom/ns#" term="JPEG XR"/><title type='text'>JPEG XR is ISO standard</title><content type='html'>&lt;p&gt;JPEG XR, formerly known as Microsoft HD Photo, is now an international standard, as reported in a &lt;a href=&quot;http://jpeg.org/newsrel26.html&quot;&gt;JPEG press release&lt;/a&gt;. 
&lt;p&gt;
Thanks to &lt;a href=&quot;http://blogs.msdn.com/billcrow/archive/2009/07/29/jpeg-xr-is-now-an-international-standard.aspx&quot;&gt;Bill Crow&#39;s blog&lt;/a&gt; for calling this to my attention.&lt;/p&gt;</content><link rel='replies' type='application/atom+xml' href='http://fileformats.blogspot.com/feeds/1813579469393813734/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/9361273/1813579469393813734' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/1813579469393813734'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/1813579469393813734'/><link rel='alternate' type='text/html' href='http://fileformats.blogspot.com/2009/07/jpeg-xr-is-iso-standard.html' title='JPEG XR is ISO standard'/><author><name>Gary McGath</name><uri>http://www.blogger.com/profile/12880087933512343984</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9361273.post-3972156050543449866</id><published>2009-07-22T13:50:00.002-04:00</published><updated>2009-07-22T13:55:09.295-04:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="JHOVE"/><title type='text'>JHOVE2 workshop</title><content type='html'>&lt;p&gt;A &lt;a href=&quot;http://jhove2.eventbrite.com/&quot;&gt;workshop on JHOVE2&lt;/a&gt; will be held after the conclusion of &lt;a href=&quot;http://www.cdlib.org/iPres/&quot;&gt;iPres 2009&lt;/a&gt; in San Francisco, on October 7, 2009. This will include, for the first time, a presentation of the prototype code. 
&lt;/p&gt;</content><link rel='replies' type='application/atom+xml' href='http://fileformats.blogspot.com/feeds/3972156050543449866/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/9361273/3972156050543449866' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/3972156050543449866'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/3972156050543449866'/><link rel='alternate' type='text/html' href='http://fileformats.blogspot.com/2009/07/jhove2-workshop.html' title='JHOVE2 workshop'/><author><name>Gary McGath</name><uri>http://www.blogger.com/profile/12880087933512343984</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9361273.post-3734186765523678749</id><published>2009-07-01T08:25:00.000-04:00</published><updated>2009-07-01T09:05:48.140-04:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="HTML"/><category scheme="http://www.blogger.com/atom/ns#" term="W3C"/><title type='text'>Jumping the gun on HTML 5</title><content type='html'>&lt;p&gt;With the release of Firefox 3.5, references to the &quot;HTML 5 standard&quot; are on the rise in tech news. 
&lt;a href=&quot;http://news.cnet.com/8301-17939_109-10275863-2.html&quot;&gt;CNET&lt;/a&gt; refers to &quot;new standards such as HTML 5.&quot; The &lt;a href=&quot;http://features.csmonitor.com/innovation/2009/06/30/download-of-the-day-firefox-35/&quot;&gt;Christian Science Monitor&lt;/a&gt; says &quot;this latest version [of Firefox] adds support for the HTML 5 web standard.&quot; &lt;a href=&quot;http://industry.bnet.com/technology/10002353/three-ways-html-5-is-transforming-it/&quot;&gt;BNET Technology&lt;/a&gt; has a whole article on HTML 5 without once suggesting that it isn&#39;t in final form.
&lt;p&gt;
But as the World Wide Web Consortium notes, &lt;a href=&quot;http://www.w3.org/TR/html5/&quot;&gt;HTML 5 is still a long way from being settled&lt;/a&gt;. The latest working draft states: &quot;Implementors should be aware that this specification is not stable. &lt;b&gt;Implementors who are not taking part in the discussions are likely to find the specification changing out from under them in incompatible ways.&lt;/b&gt;&quot; (Emphasis in the original.)
&lt;p&gt;
This most likely means that there will be several different &quot;HTML 5&quot;s from different vendors, and that it will be a long time before they all come close to agreement. HTML 5 was intended to close the gap among different implementations by specifying more aspects of the language and basing it on an object model rather than a syntactic one, but we&#39;re probably in for more of the same techno-Babel.
&lt;/p&gt;</content><link rel='replies' type='application/atom+xml' href='http://fileformats.blogspot.com/feeds/3734186765523678749/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/9361273/3734186765523678749' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/3734186765523678749'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/3734186765523678749'/><link rel='alternate' type='text/html' href='http://fileformats.blogspot.com/2009/07/jumping-gun-on-html-5.html' title='Jumping the gun on HTML 5'/><author><name>Gary McGath</name><uri>http://www.blogger.com/profile/12880087933512343984</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9361273.post-4473814605826103816</id><published>2009-06-11T10:19:00.001-04:00</published><updated>2009-06-11T10:22:45.965-04:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="DROID"/><title type='text'>DROID 4.0</title><content type='html'>&lt;p&gt;The National Archives (UK) has released &lt;a href=&quot;http://sourceforge.net/forum/forum.php?forum_id=963409&quot;&gt;DROID 4.0&lt;/a&gt;, the latest version of its file format identification tool.&lt;/p&gt;
&lt;blockquote&gt;
The focus of the latest major release is the inclusion of the DCS (Digital Continuity Service) and Planets, Collection Profiler. DROID now runs in two modes. The original file identification mode, and the new “profile” mode that allows users to obtain file format information gathered from large distributed sources of digital files, providing users with aggregated statistical data and reports to help them take appropriate management decisions regarding risk associated with such large collections of files.
&lt;/blockquote&gt;</content><link rel='replies' type='application/atom+xml' href='http://fileformats.blogspot.com/feeds/4473814605826103816/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/9361273/4473814605826103816' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/4473814605826103816'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/4473814605826103816'/><link rel='alternate' type='text/html' href='http://fileformats.blogspot.com/2009/06/droid-40.html' title='DROID 4.0'/><author><name>Gary McGath</name><uri>http://www.blogger.com/profile/12880087933512343984</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9361273.post-4219809532280264382</id><published>2009-06-06T08:16:00.002-04:00</published><updated>2009-06-06T08:21:42.618-04:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="JHOVE"/><category scheme="http://www.blogger.com/atom/ns#" term="preservation"/><title type='text'>JHOVE 1.3 is out</title><content type='html'>&lt;p&gt;&lt;a href=&quot;https://sourceforge.net/projects/jhove/&quot;&gt;JHOVE 1.3&lt;/a&gt; is now available. This version includes fixes to some serious bugs in the PDF module. It now has a much lower rate of spurious rejections. Read the &lt;a href=&quot;https://sourceforge.net/project/shownotes.php?group_id=221311&amp;release_id=687368&quot;&gt;release notes&lt;/a&gt; for full details.
&lt;p&gt;
I&#39;ve already made a few post-release changes to source code, removing a stack dump for debugging purposes which I&#39;d inadvertently left in, and updating version information where I&#39;d forgotten to. Sigh ... I need to put all the configuration information in one place, instead of having it scattered through a dozen source files, before the next release.
&lt;/p&gt;</content><link rel='replies' type='application/atom+xml' href='http://fileformats.blogspot.com/feeds/4219809532280264382/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/9361273/4219809532280264382' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/4219809532280264382'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/4219809532280264382'/><link rel='alternate' type='text/html' href='http://fileformats.blogspot.com/2009/06/jhove-13-is-out.html' title='JHOVE 1.3 is out'/><author><name>Gary McGath</name><uri>http://www.blogger.com/profile/12880087933512343984</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9361273.post-825331680292004254</id><published>2009-06-05T09:49:00.002-04:00</published><updated>2009-06-05T09:54:11.939-04:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="WARC"/><title type='text'>WARC is ISO standard</title><content type='html'>&lt;p&gt;The WARC (Web Archive) format is now an &lt;a href=&quot;http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=44717&quot;&gt;ISO standard&lt;/a&gt;, available for a mere 118 Swiss francs. Or you can grab a &lt;a href=&quot;http://bibnum.bnf.fr/WARC/WARC_ISO_28500_version1_latestdraft.pdf&quot;&gt;near-final draft&lt;/a&gt; (PDF) for free.
&lt;p&gt;
Found by way of &lt;a href=&quot;http://digitizationblog.interoperating.info/node/438&quot;&gt;digitizationblog&lt;/a&gt;.
&lt;/p&gt;</content><link rel='replies' type='application/atom+xml' href='http://fileformats.blogspot.com/feeds/825331680292004254/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/9361273/825331680292004254' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/825331680292004254'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/825331680292004254'/><link rel='alternate' type='text/html' href='http://fileformats.blogspot.com/2009/06/warc-is-iso-standard.html' title='WARC is ISO standard'/><author><name>Gary McGath</name><uri>http://www.blogger.com/profile/12880087933512343984</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9361273.post-2916809938927306109</id><published>2009-05-21T15:21:00.002-04:00</published><updated>2009-05-21T15:26:09.357-04:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="HTML"/><category scheme="http://www.blogger.com/atom/ns#" term="W3C"/><category scheme="http://www.blogger.com/atom/ns#" term="XHTML"/><category scheme="http://www.blogger.com/atom/ns#" term="XML"/><title type='text'>W3C rescinds four drafts</title><content type='html'>&lt;p&gt;This is unusual; W3C has &lt;a href=&quot;http://www.w3.org/News/2009#item79&quot;&gt;rescinded four XHTML-related drafts&lt;/a&gt;. This means that they roll back to the previous versions. The rescinded drafts are:
&lt;/p&gt;
&lt;ul&gt; 
&lt;li&gt;&lt;a href=&quot;http://www.w3.org/TR/2009/PER-xhtml11-20090507/&quot;&gt; XHTML™ 1.1 - Module-based XHTML - Second Edition&lt;/a&gt;&lt;/li&gt; 
&lt;li&gt;&lt;a href=&quot;http://www.w3.org/TR/2009/PER-xhtml-basic-20090507/&quot;&gt;XHTML™ Basic 1.1 - Second Edition&lt;/a&gt;&lt;/li&gt; 
&lt;li&gt;&lt;a href=&quot;http://www.w3.org/TR/2009/PER-xhtml-print-20090507/&quot;&gt;XHTML-Print - Second Edition&lt;/a&gt;&lt;/li&gt; 
&lt;li&gt;&lt;a href=&quot;http://www.w3.org/TR/2009/PER-xhtml1-20090507/&quot;&gt;XHTML™ 1.0&lt;/a&gt;&lt;/li&gt; 
&lt;/ul&gt;</content><link rel='replies' type='application/atom+xml' href='http://fileformats.blogspot.com/feeds/2916809938927306109/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/9361273/2916809938927306109' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/2916809938927306109'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/2916809938927306109'/><link rel='alternate' type='text/html' href='http://fileformats.blogspot.com/2009/05/w3c-rescinds-four-drafts.html' title='W3C rescinds four drafts'/><author><name>Gary McGath</name><uri>http://www.blogger.com/profile/12880087933512343984</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9361273.post-448441199835544434</id><published>2009-05-19T14:53:00.003-04:00</published><updated>2009-05-19T14:58:18.884-04:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="JPEG2000"/><title type='text'>Survey results on JPEG2000</title><content type='html'>&lt;p&gt;The results of a &lt;a href=&quot;http://digitalcommons.uconn.edu/libr_pubs/16/&quot;&gt;University of Connecticut study&lt;/a&gt; on how libraries are using JPEG2000 format are available online. 
Thanks to &lt;a href=&quot;http://hurstassociates.blogspot.com/2009/05/jpeg-2000-survey-results.html&quot;&gt;Digitization 101&lt;/a&gt; for the link.&lt;/p&gt;</content><link rel='replies' type='application/atom+xml' href='http://fileformats.blogspot.com/feeds/448441199835544434/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/9361273/448441199835544434' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/448441199835544434'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/448441199835544434'/><link rel='alternate' type='text/html' href='http://fileformats.blogspot.com/2009/05/survey-results-on-jpeg2000.html' title='Survey results on JPEG2000'/><author><name>Gary McGath</name><uri>http://www.blogger.com/profile/12880087933512343984</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9361273.post-7723834578448188068</id><published>2009-05-08T08:00:00.000-04:00</published><updated>2009-05-08T08:00:00.563-04:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="Unicode"/><category scheme="http://www.blogger.com/atom/ns#" term="UTF-8"/><title type='text'>Spoofing characters</title><content type='html'>&lt;p&gt;You wouldn&#39;t think there&#39;s a security issue in the UTF-8 character set, but there is, in an indirect way. In a &lt;a href=&quot;http://blogs.sun.com/CoreJavaTechTips/entry/the_overhaul_of_java_utf&quot;&gt;note on a Sun Java blog&lt;/a&gt;, it&#39;s explained that in the old (20th century) definition of UTF-8, some characters could be represented by more than one byte sequence. 
For example, an ASCII character (0x01 through 0x7F) could also be encoded as a two-byte sequence beginning with 0xC0 or 0xC1; the slash (0x2F), for instance, could be written as 0xC0 0xAF. This creates problems when security filters look for certain characters in order to stop cross-site scripting or SQL injection; spoofed characters can get past filters that don&#39;t take the alternate byte representations into account. For this reason, the &lt;a href=&quot;http://www.unicode.org/versions/corrigendum1.html&quot;&gt;current UTF-8 requirements&lt;/a&gt; specify that the shortest byte representation of a character is the only legitimate one. 
&lt;p&gt;
This change was made in 2000, but not all implementations of UTF-8 have caught up. Sun has only recently fixed this in Java, with JDK 7, OpenJDK 6, JDK 6 update 11 and later, JDK 5.0u17, and Java 1.4.2_19. (If you&#39;re using Java 1.3 or earlier, you&#39;re probably stuck, but why would you do that?)
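&lt;/p&gt;
&lt;p&gt;
As a minimal sketch of the shortest-form rule, the Python snippet below feeds a strict decoder both the normal one-byte slash (0x2F) and the two-byte sequence 0xC0 0xAF, an overlong form of the same character. The helper name &lt;code&gt;is_strict_utf8&lt;/code&gt; is made up for illustration, and the snippet assumes a current Python 3, whose built-in UTF-8 codec rejects non-shortest forms. It is not taken from the Sun note.
&lt;/p&gt;
&lt;pre&gt;
```python
# Sketch: a strict UTF-8 decoder should refuse overlong (non-shortest) forms.
# is_strict_utf8 is a hypothetical helper name used only for this example.

def is_strict_utf8(data):
    """Return True if data decodes under Python's strict UTF-8 codec."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


slash = b"/"                          # shortest form: the single byte 0x2F
overlong_slash = bytes([0xC0, 0xAF])  # obsolete two-byte form of the slash

print(is_strict_utf8(slash))           # prints True
print(is_strict_utf8(overlong_slash))  # prints False
```
&lt;/pre&gt;
&lt;p&gt;
A filter that decodes its input strictly before scanning it, and rejects anything that fails to decode, should not be exposed to the alternate byte representations in the first place.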
&lt;/p&gt;</content><link rel='replies' type='application/atom+xml' href='http://fileformats.blogspot.com/feeds/7723834578448188068/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/9361273/7723834578448188068' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/7723834578448188068'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/7723834578448188068'/><link rel='alternate' type='text/html' href='http://fileformats.blogspot.com/2009/05/spoofing-characters.html' title='Spoofing characters'/><author><name>Gary McGath</name><uri>http://www.blogger.com/profile/12880087933512343984</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9361273.post-4900055539036993283</id><published>2009-05-05T16:20:00.001-04:00</published><updated>2009-05-05T16:28:58.264-04:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="PDF"/><title type='text'>PDF with DRM</title><content type='html'>&lt;p&gt;PDFZone &lt;a href=&quot;http://www.pdfzone.com/c/a/Authoring/Vitrium-Unleashes-Version-2-of-PDF-DRM-for-the-Masses/?kc=rss&quot;&gt;discusses ProtectedPDF&lt;/a&gt;, a way of incorporating digital rights management in PDF. According to the article, ProtectedPDF files can be read in Acrobat Reader without any plugins, which presumably means that the files are fully PDF-compliant. Internet access to Vitrium&#39;s server is required to open documents. While the article doesn&#39;t say so, the only way I can think of doing this is through JavaScript.
&lt;p&gt;
The value or abusiveness of DRM depends on how it is used. For short-term use of a document, it can be a sensible way of limiting access. For limiting the distribution of content while supposedly allowing recipients permanent access, it doesn&#39;t work so well. If Vitrium ever goes out of business or gets tired of supporting ProtectedPDF, then all the rights-managed documents become inaccessible. &quot;Buying&quot; DRM-protected content, as opposed to renting it, is a dubious proposition. The effectiveness of DRM is limited; if you can get content on your computer, there&#39;s always a way to copy it.
&lt;p&gt;
Assuming JavaScript is required, users must enable it to view ProtectedPDF documents. (A &lt;a href=&quot;http://www2.iccsafe.org/states/questions/protectedpdf.htm&quot;&gt;third-party FAQ&lt;/a&gt; confirms this.) This exposes users to JavaScript-based vulnerabilities in Acrobat Reader, as well as possible loss of privacy. The article notes that ProtectedPDF files can report back on who&#39;s reading them. People are aware of these risks when using a web browser, less so when using a PDF reader.
&lt;p&gt;
As a technical concept, it&#39;s very interesting that DRM can be implemented within the PDF specification. How it works out in practice remains to be seen.&lt;/p&gt;</content><link rel='replies' type='application/atom+xml' href='http://fileformats.blogspot.com/feeds/4900055539036993283/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/9361273/4900055539036993283' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/4900055539036993283'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/4900055539036993283'/><link rel='alternate' type='text/html' href='http://fileformats.blogspot.com/2009/05/pdf-with-drm.html' title='PDF with DRM'/><author><name>Gary McGath</name><uri>http://www.blogger.com/profile/12880087933512343984</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9361273.post-6726609675327481752</id><published>2009-04-24T15:02:00.002-04:00</published><updated>2009-04-24T15:13:02.250-04:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="HTML"/><category scheme="http://www.blogger.com/atom/ns#" term="W3C"/><title type='text'>HTML 5 update</title><content type='html'>&lt;p&gt;W3C has published new working drafts of:
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;http://www.w3.org/TR/2009/WD-html5-20090423/&quot;&gt;HTML 5: A vocabulary and associated APIs for HTML and XHTML&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;http://www.w3.org/TR/2009/WD-html5-diff-20090423/&quot;&gt;HTML 5 Differences from HTML 4&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;http://www.w3.org/TR/2009/WD-websockets-20090423/&quot;&gt;The Web Sockets API&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;http://www.w3.org/TR/2009/WD-eventsource-20090423/&quot;&gt;Server-sent Events&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;http://www.w3.org/TR/2009/WD-webstorage-20090423/&quot;&gt;Web Storage&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;http://www.w3.org/TR/2009/WD-workers-20090423/&quot;&gt;Web Workers&lt;/a&gt;&lt;/li&gt; 
&lt;/ul&gt;
&lt;p&gt;
The last four are newly broken out from the HTML 5 specification. Or should we call it Ginormica?&lt;/p&gt;</content><link rel='replies' type='application/atom+xml' href='http://fileformats.blogspot.com/feeds/6726609675327481752/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/9361273/6726609675327481752' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/6726609675327481752'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/6726609675327481752'/><link rel='alternate' type='text/html' href='http://fileformats.blogspot.com/2009/04/html-5-update.html' title='HTML 5 update'/><author><name>Gary McGath</name><uri>http://www.blogger.com/profile/12880087933512343984</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9361273.post-5569245698376026205</id><published>2009-04-17T18:47:00.002-04:00</published><updated>2009-04-17T18:52:29.723-04:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="conferences"/><category scheme="http://www.blogger.com/atom/ns#" term="DLF"/><title type='text'>CLIR absorbs DLF</title><content type='html'>&lt;p&gt;The &lt;a href=&quot;http://www.diglib.org/&quot;&gt;Digital Library Federation&lt;/a&gt; is being merged into the &lt;a href=&quot;http://www.clir.org/&quot;&gt;Council on Library and Information Resources&lt;/a&gt; (CLIR), effective July 1, 2009. 
The &lt;a href=&quot;http://www.diglib.org/forums/spring2009/&quot;&gt;DLF Spring Forum&lt;/a&gt; isn&#39;t affected by this.&lt;/p&gt;</content><link rel='replies' type='application/atom+xml' href='http://fileformats.blogspot.com/feeds/5569245698376026205/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/9361273/5569245698376026205' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/5569245698376026205'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/5569245698376026205'/><link rel='alternate' type='text/html' href='http://fileformats.blogspot.com/2009/04/clir-absorbs-dlf.html' title='CLIR absorbs DLF'/><author><name>Gary McGath</name><uri>http://www.blogger.com/profile/12880087933512343984</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9361273.post-8184542810371133249</id><published>2009-04-08T15:35:00.002-04:00</published><updated>2009-04-08T15:55:39.589-04:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="GDFR"/><category scheme="http://www.blogger.com/atom/ns#" term="Pronom"/><category scheme="http://www.blogger.com/atom/ns#" term="UDFR"/><title type='text'>GDFR + Pronom = UDFR</title><content type='html'>&lt;p&gt;For some time, people working in digital preservation have been wondering what was happening with the &lt;a href=&quot;http://gdfr.info&quot;&gt;Global Digital Format Registry (GDFR)&lt;/a&gt;. The answer was: Not much. But a lot of discussions were happening on how to combine the openness of GDFR with the maturity of &lt;a href=&quot;http://www.nationalarchives.gov.uk/PRONOM/Default.aspx&quot;&gt;Pronom&lt;/a&gt;. 
Now it can be told: Planning for the &lt;a href=&quot;http://gdfr.info/udfr.html&quot;&gt;&lt;i&gt;Unified&lt;/i&gt; Digital Formats Registry (UDFR)&lt;/a&gt; has been announced. 
&lt;p&gt;
The &lt;a href=&quot;http://gdfr.info/udfr_docs/Unified_Digital_Formats_Registry.pdf&quot;&gt;proposal and roadmap&lt;/a&gt; (PDF) states:
&lt;/p&gt;&lt;blockquote&gt;
There are two major efforts underway to create a format registry with 
complimentary strengths and weaknesses.   PRONOM, created by The National Archives 
(TNA) in the UK, has a strong technological base, and has been building a database of 
original information about various digital formats.  PRONOM at this point however is 
owned and maintained by a single organization, making it vulnerable to changes in that 
institution.  The Global Digital Formats Registry (GDFR) effort, hosted by Harvard 
University, has developed a model for a registry based on shared governance, cooperative data contribution, and distributed data hosting.  However, GDFR is technically less far along in development, and has not yet begun database building.  
Given the paucity of resources in the digital preservation community it would be highly unfortunate if these efforts were to compete for resources.  Therefore a group of involved and interested institutions have agreed to join together to create a single shared formats registry drawing on the individual strengths of the two existing efforts.  The initiative would:
&lt;br&gt;&amp;nbsp;&lt;br&gt; 
&amp;nbsp;&amp;nbsp;* be technically based on the existing PRONOM system and database; 
&amp;nbsp;&amp;nbsp;* create a community governance model for the registry involving all institutions willing to contribute to its development;   
&amp;nbsp;&amp;nbsp;* develop a mechanism for the distribution of the registry data in such a way as to support local extensions and additions to the database; 
&amp;nbsp;&amp;nbsp;* develop both technical and organizational support for distributed input to the registry, including some form of quality vetting of contributed data. 
&lt;/blockquote&gt;
&lt;p&gt;
The start of the working group came at &lt;a href=&quot;http://www.bl.uk/ipres2008/&quot;&gt;iPres&lt;/a&gt; in London last September, but I couldn&#39;t talk about it publicly then. The working group includes members from the British Library, the California Digital Library, the Harvard University Libraries, the National Archives, the National Library of Australia, the National Library of New Zealand, Portico, and Tessella. 
&lt;p&gt;
HUL has registered the domain udfr.org, but there is no content up yet.&lt;/p&gt;</content><link rel='replies' type='application/atom+xml' href='http://fileformats.blogspot.com/feeds/8184542810371133249/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/9361273/8184542810371133249' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/8184542810371133249'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/8184542810371133249'/><link rel='alternate' type='text/html' href='http://fileformats.blogspot.com/2009/04/gdfr-pronom-udfr.html' title='GDFR + Pronom = UDFR'/><author><name>Gary McGath</name><uri>http://www.blogger.com/profile/12880087933512343984</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9361273.post-916594759325192787</id><published>2009-03-31T15:15:00.003-04:00</published><updated>2009-03-31T15:21:35.228-04:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="JHOVE"/><title type='text'>Keeping track of JHOVE2</title><content type='html'>&lt;p&gt;Sheila Morrissey and Stephen Abrams mentioned to me that some people have been looking at the old JHOVE2 proposal on the HUL website, even though it&#39;s far out of date. I&#39;ve just removed a couple of links to it on the &lt;a href=&quot;http://hul.harvard.edu/jhove&quot;&gt;JHOVE website&lt;/a&gt;. 
Here&#39;s a link to the &lt;a href=&quot;http://confluence.ucop.edu/download/attachments/2098004/JHOVE2-Project-Proposal.doc?version=1&quot;&gt;current and funded proposal&lt;/a&gt; (DOC format).</content><link rel='replies' type='application/atom+xml' href='http://fileformats.blogspot.com/feeds/916594759325192787/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/9361273/916594759325192787' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/916594759325192787'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/916594759325192787'/><link rel='alternate' type='text/html' href='http://fileformats.blogspot.com/2009/03/keeping-track-of-jhove2.html' title='Keeping track of JHOVE2'/><author><name>Gary McGath</name><uri>http://www.blogger.com/profile/12880087933512343984</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9361273.post-709054952908985786</id><published>2009-03-24T14:57:00.004-04:00</published><updated>2009-03-24T15:01:28.059-04:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="conferences"/><category scheme="http://www.blogger.com/atom/ns#" term="PDF"/><title type='text'>PDF/A conference</title><content type='html'>The third international PDF/A conference will be held in Berlin on June 17-19, 2009. 
Here&#39;s the &lt;a href=&quot;http://www.pdfa.org/lib/exe/fetch.php?id=press%3Aen&amp;cache=cache&amp;media=press:press_pdfa_conference-berlin.pdf&quot;&gt;announcement&lt;/a&gt; (PDF, naturally).</content><link rel='replies' type='application/atom+xml' href='http://fileformats.blogspot.com/feeds/709054952908985786/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/9361273/709054952908985786' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/709054952908985786'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/709054952908985786'/><link rel='alternate' type='text/html' href='http://fileformats.blogspot.com/2009/03/pdfa-conference.html' title='PDF/A conference'/><author><name>Gary McGath</name><uri>http://www.blogger.com/profile/12880087933512343984</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9361273.post-130113244315035480</id><published>2009-03-23T06:45:00.001-04:00</published><updated>2009-03-25T10:40:03.619-04:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="JHOVE"/><category scheme="http://www.blogger.com/atom/ns#" term="preservation"/><title type='text'>Reconstructability and digital preservation</title><content type='html'>&lt;p&gt;
In my post on &lt;a href=&quot;http://fileformats.blogspot.com/2009/03/preservation-vs-format-tolerance.html&quot;&gt;Preservation vs. format tolerance&lt;/a&gt;, I briefly addressed the issue of how strictly format compliance should be required for preservation purposes. Here I&#39;d like to expand on that. A frequently-mentioned idea in digital preservation is that people in the future might need to examine old files armed only with the specification for the format. Following this view, files worthy of preservation should strictly follow the specification, so that our descendants can write a file reader that implements the specification and open the ancient documents without difficulty. 
&lt;p&gt;
There are several problems with this view. First, many specifications aren&#39;t unambiguously written. It&#39;s often unclear what is a requirement and what is desirable. (The TIFF specification has this problem.) Second, future programmers won&#39;t just give up if their software won&#39;t open a file; they&#39;ll look to see what&#39;s going wrong and will be able to work around some problems. Third, not all deviations make the file unreadable; some cause the loss of only limited information. For instance, a malformed date in TIFF or PDF may cause problems reading the date, but won&#39;t otherwise impact the ability to read the file.
&lt;p&gt;
JHOVE 1.x takes a &quot;one strike, you&#39;re out&quot; approach to compliance, with certain exceptions. If a file violates the specification in any way, it&#39;s ill-formed or invalid and you&#39;re told only that it&#39;s a &quot;bytestream.&quot; (We have compromised this approach on specific features, but it&#39;s still the basic approach. A single character with the high bit set in an ASCII file, for instance, is enough for JHOVE to report that the file isn&#39;t well-formed ASCII.)
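&lt;p&gt;
To make the ASCII example concrete, here is a minimal Python sketch of a one-strike check. It is not JHOVE&#39;s code (JHOVE is written in Java), and the function names are made up for illustration; it only shows how a single high-bit byte can flip the verdict for a whole file.

```python
def first_non_ascii(data: bytes) -> int:
    """Return the offset of the first byte with the high bit set, or -1 if none."""
    for offset, byte in enumerate(data):
        if byte >= 0x80:  # high bit set: outside 7-bit ASCII
            return offset
    return -1


def strict_verdict(data: bytes) -> str:
    """One-strike rule (hypothetical helper): any high-bit byte demotes the file."""
    return "ASCII" if first_non_ascii(data) == -1 else "bytestream"
```

Reporting the offset of the offending byte, rather than only the verdict, is the kind of detail that would let a more tolerant policy decide how much harm was actually done.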
&lt;p&gt;
JHOVE2 improves on this by introducing the concept of &lt;i&gt;assessment&lt;/i&gt;. According to the JHOVE2 functional requirements (version 1.3), &quot;Assessment is the process of determining the level of acceptability of a digital object for a specific purpose on the b[asis] of locally-defined policy rules. Since these rules are configurable, assessment is considered (vis-a-vis validation) a subjective determination.&quot; From what I&#39;ve seen, this concept hasn&#39;t been very well nailed down yet, though I&#39;m sure Stephen Abrams has thought a lot more about it than what I&#39;ve read.
&lt;p&gt;
What I&#39;m proposing here is that a metric of assessment should be &lt;i&gt;reconstructability&lt;/i&gt;, the degree to which a file can be reconstructed in spite of non-standard aspects in its content or structure.
&lt;p&gt;
To a certain extent, estimates of reconstructability require guesswork about what information people will have in the future. In the case of TIFF, will they have just the 6.0 specification? Will they have the Photoshop supplement? The contents of technical mailing lists? But we can recognize that some deviations are easier to reconstruct by guesswork and common sense than others. If strictly following byte-alignment requirements makes files look broken, it won&#39;t be hard to try relaxing the requirement. If being unable to decipher a field means losing just one piece of metadata, that&#39;s not usually a disaster.
&lt;p&gt;
Conversely, a file can be completely well-formed and valid, yet have reconstructability issues because of external dependencies. Interestingly, all schema-based XML files have this problem. Schemas are necessarily external to the data file, and the schema&#39;s location is a URI which might easily become worthless in the future. Sometimes schemas change, even though they shouldn&#39;t; for instance, the &lt;a href=&quot;http://www.loc.gov/standards/mix/mix.xsd&quot;&gt;schema for MIX metadata&lt;/a&gt; used to be the one for MIX 0.2, but was replaced with the schema for MIX 2.0, breaking the validation of MIX 0.2 documents. So unless an XML file is packaged together with its schema, it should lose points for reconstructability. On the positive side, anyone with knowledge of the ASCII character set and a few representative files to study should be able to reconstruct the basic rules of XML, even without documentation of the format, and if the tags reflect their intended meaning, a file can be largely self-documenting.
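&lt;p&gt;
As a rough illustration of how a tool might spot this kind of external dependency, the Python sketch below lists the schema URIs that a document&#39;s root element points to. The function name is hypothetical and the sketch is deliberately narrow: it looks only at the root element and only at the two standard XML Schema instance attributes, so it says nothing about DTDs or schema hints deeper in the document.

```python
import xml.etree.ElementTree as ET

# Namespace of the standard XML Schema instance attributes.
XSI = "http://www.w3.org/2001/XMLSchema-instance"


def external_schema_uris(xml_text: str) -> list[str]:
    """List the schema locations named on the root element of an XML document."""
    root = ET.fromstring(xml_text)
    uris = []
    # xsi:schemaLocation holds whitespace-separated (namespace, location) pairs;
    # the locations are the second item of each pair.
    pairs = root.get("{%s}schemaLocation" % XSI, "").split()
    uris.extend(pairs[1::2])
    # xsi:noNamespaceSchemaLocation holds a single location.
    single = root.get("{%s}noNamespaceSchemaLocation" % XSI)
    if single:
        uris.append(single)
    return uris
```

Every URI this returns is something a future reader would need to resolve, or find packaged alongside the file, before validation is possible.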
&lt;p&gt;
HTML provides a worrisome case. The large majority of the HTML on the Web is defective in one or more ways, and browsers use ingenious tricks to approximate the intent of the author. Read a file with the wrong browser, and it may be a complete mess. Then there&#39;s the matter of external links, which may include JavaScript and CSS files that are essential to the presentation. HTML accounts for a large portion of the digital information available today, but it presents severe reconstructability problems.
&lt;p&gt;
This post is just a first shot at describing the idea, and it&#39;s possible I&#39;m repeating what someone else has said (if so, someone please tell me). But I hope it will be a springboard for discussion.
&lt;/p&gt;</content><link rel='replies' type='application/atom+xml' href='http://fileformats.blogspot.com/feeds/130113244315035480/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/9361273/130113244315035480' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/130113244315035480'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/130113244315035480'/><link rel='alternate' type='text/html' href='http://fileformats.blogspot.com/2009/03/reconstructability-and-digital.html' title='Reconstructability and digital preservation'/><author><name>Gary McGath</name><uri>http://www.blogger.com/profile/12880087933512343984</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9361273.post-3008344087115995427</id><published>2009-03-18T00:15:00.001-04:00</published><updated>2009-03-18T12:16:38.071-04:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="JHOVE"/><category scheme="http://www.blogger.com/atom/ns#" term="PDF"/><category scheme="http://www.blogger.com/atom/ns#" term="preservation"/><title type='text'>Preservation vs. format tolerance</title><content type='html'>In doing a search for blog posts about JHOVE, I came across a very interesting &lt;a href=&quot;http://blog.dshr.org/2009/01/postels-law.html&quot;&gt;discussion of the conflict between exact format checking and error tolerance&lt;/a&gt;. &quot;Postel&#39;s law&quot; implies that readers should overlook errors in files whenever feasible. But what should digital repositories do? Should they be completely strict, because we don&#39;t know how future readers will work? 
Or should they overlook some harmless errors rather than reject otherwise perfectly good documents? JHOVE uses a &quot;one strike, you&#39;re out&quot; approach, which is often less than ideal. But accepting any document which doesn&#39;t break the current Adobe Reader would be reckless. The balance is tricky. Some errors are deadly, some harmless, and validation software should really be able to distinguish degrees of harm.</content><link rel='replies' type='application/atom+xml' href='http://fileformats.blogspot.com/feeds/3008344087115995427/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/9361273/3008344087115995427' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/3008344087115995427'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/3008344087115995427'/><link rel='alternate' type='text/html' href='http://fileformats.blogspot.com/2009/03/preservation-vs-format-tolerance.html' title='Preservation vs. 
format tolerance'/><author><name>Gary McGath</name><uri>http://www.blogger.com/profile/12880087933512343984</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9361273.post-2010881998930488779</id><published>2009-03-16T06:09:00.002-04:00</published><updated>2009-03-16T06:11:34.777-04:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="MP3"/><title type='text'>Services which ban MP3 format</title><content type='html'>&lt;p&gt;Over at &lt;a href=&quot;http://taking-license.blogspot.com/2009/03/sites-which-prohibit-music-files.html&quot;&gt;Taking License&lt;/a&gt;, I&#39;ve posted a discussion of Internet services which, for reasons beyond my comprehension, have a specific prohibition on the MP3 file format.&lt;/p&gt;</content><link rel='replies' type='application/atom+xml' href='http://fileformats.blogspot.com/feeds/2010881998930488779/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/9361273/2010881998930488779' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/2010881998930488779'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/2010881998930488779'/><link rel='alternate' type='text/html' href='http://fileformats.blogspot.com/2009/03/services-which-ban-mp3-format.html' title='Services which ban MP3 format'/><author><name>Gary McGath</name><uri>http://www.blogger.com/profile/12880087933512343984</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9361273.post-2312937276425578333</id><published>2009-03-05T12:35:00.004-05:00</published><updated>2009-03-05T12:57:08.837-05:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="Planets"/><category scheme="http://www.blogger.com/atom/ns#" term="preservation"/><title type='text'>Planets Testbed</title><content type='html'>&lt;p&gt;The &lt;a href=&quot;http://planets-project.eu/&quot;&gt;Planets project&lt;/a&gt; has announced a web application, the &lt;a href=&quot;http://testbed.planets-project.eu/testbed/&quot;&gt;Planets Testbed&lt;/a&gt;, designed to allow evaluation of digital preservation strategies. Testers can use various digital preservation tools (including JHOVE) and migration paths. People with an interest in digital preservation from libraries, archives, museums, and so on are invited to take part. Institutions will be able to register and carry out experiments.
&lt;p&gt;
I should mention that the login page triggers some urgent warning messages in Firefox, since it uses a self-signed security certificate. Since you won&#39;t be giving them any credit card numbers or other sensitive information, you shouldn&#39;t have to worry about that.
&lt;p&gt;
By the way, congratulations to Adrian Brown on his appointment to the Parliamentary Archives.
&lt;/p&gt;</content><link rel='replies' type='application/atom+xml' href='http://fileformats.blogspot.com/feeds/2312937276425578333/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/9361273/2312937276425578333' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/2312937276425578333'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/2312937276425578333'/><link rel='alternate' type='text/html' href='http://fileformats.blogspot.com/2009/03/planets-testbed.html' title='Planets Testbed'/><author><name>Gary McGath</name><uri>http://www.blogger.com/profile/12880087933512343984</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9361273.post-3942751541490216393</id><published>2009-02-26T16:07:00.002-05:00</published><updated>2009-02-26T16:10:43.477-05:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="W3C"/><category scheme="http://www.blogger.com/atom/ns#" term="XML"/><title type='text'>New working drafts from XML working security group</title><content type='html'>The XML security working group of the W3C has &lt;a href=&quot;http://www.w3.org/News/2009#item25&quot;&gt;released eight new working drafts&lt;/a&gt;. 
These deal with cryptographic algorithms, use cases, key derivation, and best practices.</content><link rel='replies' type='application/atom+xml' href='http://fileformats.blogspot.com/feeds/3942751541490216393/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/9361273/3942751541490216393' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/3942751541490216393'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/3942751541490216393'/><link rel='alternate' type='text/html' href='http://fileformats.blogspot.com/2009/02/new-working-drafts-from-xml-working.html' title='New working drafts from XML working security group'/><author><name>Gary McGath</name><uri>http://www.blogger.com/profile/12880087933512343984</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9361273.post-4160188746130925918</id><published>2009-02-23T12:15:00.000-05:00</published><updated>2009-02-23T12:16:03.220-05:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="ODF"/><category scheme="http://www.blogger.com/atom/ns#" term="XML"/><title type='text'>ODF-Next</title><content type='html'>&lt;p&gt;OASIS has started to solicit &lt;a href=&quot;http://lists.oasis-open.org/archives/tc-announce/200902/msg00007.html&quot;&gt;input for &quot;ODF-Next,&quot;&lt;/a&gt; the version which will follow ODF 1.2. This input will go on concurrently with the finalization of 1.2. Proposals must be received by March 31 to be considered in the ODF-Next requirements report, which is expected to come out around May 1. Proposals previously submitted to the ODF TC&#39;s public comment
list don&#39;t need to be resubmitted.
&lt;p&gt;
See &lt;a href=&quot;http://www.robweir.com/blog/2009/02/looking-for-good-ideas-for-odf-next.html&quot;&gt;discussion by Rob Weir&lt;/a&gt;.
&lt;p&gt;</content><link rel='replies' type='application/atom+xml' href='http://fileformats.blogspot.com/feeds/4160188746130925918/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/9361273/4160188746130925918' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/4160188746130925918'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/4160188746130925918'/><link rel='alternate' type='text/html' href='http://fileformats.blogspot.com/2009/02/odf-next.html' title='ODF-Next'/><author><name>Gary McGath</name><uri>http://www.blogger.com/profile/12880087933512343984</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9361273.post-5221215372433079497</id><published>2009-02-11T12:30:00.002-05:00</published><updated>2009-02-11T12:34:46.388-05:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="JHOVE"/><title type='text'>JHOVE 1.2</title><content type='html'>&lt;p&gt;&lt;a href=&quot;https://sourceforge.net/projects/jhove/&quot;&gt;JHOVE 1.2&lt;/a&gt; is now up on SourceForge. This release has some bug fixes, and includes scripts which were accidentally left out of the 1.1 release.
&lt;/p&gt;</content><link rel='replies' type='application/atom+xml' href='http://fileformats.blogspot.com/feeds/5221215372433079497/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/9361273/5221215372433079497' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/5221215372433079497'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/5221215372433079497'/><link rel='alternate' type='text/html' href='http://fileformats.blogspot.com/2009/02/jhove-12.html' title='JHOVE 1.2'/><author><name>Gary McGath</name><uri>http://www.blogger.com/profile/12880087933512343984</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9361273.post-6025263548290547518</id><published>2009-02-04T07:00:00.000-05:00</published><updated>2009-02-04T07:00:01.487-05:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="JHOVE"/><title type='text'>JHOVE2 functional requirements</title><content type='html'>&lt;p&gt;The latest (version 1.3) &lt;a href=&quot;http://confluence.ucop.edu/display/JHOVE2Info/Functional+Requirements&quot;&gt;functional requirements&lt;/a&gt; for JHOVE2 have been posted.&lt;/p&gt;</content><link rel='replies' type='application/atom+xml' href='http://fileformats.blogspot.com/feeds/6025263548290547518/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/9361273/6025263548290547518' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9361273/posts/default/6025263548290547518'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/9361273/posts/default/6025263548290547518'/><link rel='alternate' type='text/html' href='http://fileformats.blogspot.com/2009/02/jhove2-functional-requirements.html' title='JHOVE2 functional requirements'/><author><name>Gary McGath</name><uri>http://www.blogger.com/profile/12880087933512343984</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry></feed>