<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="http://feeds.feedburner.com/~d/styles/rss2full.xsl" type="text/xsl" media="screen"?><?xml-stylesheet href="http://feeds.feedburner.com/~d/styles/itemcontent.css" type="text/css" media="screen"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">

<channel>
	<title>iNODE</title>
	
	<link>http://timesync.gmu.edu/wordpress</link>
	<description>The weblog of Digital Programs and Systems at George Mason University Libraries</description>
	<pubDate>Sat, 19 Jul 2008 01:14:07 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.6</generator>
	<language>en</language>
			<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" href="http://feeds.feedburner.com/Inode" type="application/rss+xml" /><feedburner:browserFriendly></feedburner:browserFriendly><item>
		<title>Say?</title>
		<link>http://timesync.gmu.edu/wordpress/?p=593</link>
		<comments>http://timesync.gmu.edu/wordpress/?p=593#comments</comments>
		<pubDate>Fri, 18 Jul 2008 14:44:49 +0000</pubDate>
		<dc:creator>Wally</dc:creator>
		
		<category><![CDATA[Apple / OSX]]></category>

		<category><![CDATA[Desktop Software]]></category>

		<category><![CDATA[Library Tech]]></category>

		<guid isPermaLink="false">http://timesync.gmu.edu/wordpress/?p=593</guid>
		<description><![CDATA[
Earlier this week I attended the Project Bamboo workshop at Princeton and thought I&#8217;d share a discovery I made while trying to complete the pre-workshop reading assignments attendees received:
1. Please read the proposal in its entirety.  The proposal can be found at:
http://projectbamboo.org/files/docs/bamboo_proposal.pdf
2. Please read the Identifying Scholarly Practices handout. This handout can be found [...]]]></description>
			<content:encoded><![CDATA[<abbr class="unapi-id" title="http://timesync.gmu.edu/wordpress/?p=593"><!-- &nbsp; --></abbr>
<p>Earlier this week I attended the <a href="http://www.projectbamboo.org">Project Bamboo</a> workshop at Princeton and thought I&#8217;d share a discovery I made while trying to complete the pre-workshop reading assignments attendees received:</p>
<p style="padding-left: 30px;">1. Please read the proposal in its entirety.  The proposal can be found at:</p>
<p style="padding-left: 30px;"><a href="http://projectbamboo.org/files/docs/bamboo_proposal.pdf">http://projectbamboo.org/files/docs/bamboo_proposal.pdf</a></p>
<p style="padding-left: 30px;">2. Please read the Identifying Scholarly Practices handout. This handout can be found at:</p>
<p style="padding-left: 30px;"><a href="http://projectbamboo.org/files/3/Identifying_Scholarly_Practices.pdf">http://projectbamboo.org/files/3/Identifying_Scholarly_Practices.pdf</a></p>
<p>All through the week leading up to the workshop I figured I&#8217;d surely get around to reading those documents but I never did.   The night before I was to leave, I realized it just wasn&#8217;t going to happen.   Then I had a thought: why not run the text of these documents through my Mac&#8217;s &#8220;Text to Speech&#8221; service, capture the output and later listen to it as a podcast during the dead time of my  3+ hour drive up to New Jersey?</p>
<p>I launched <a href="http://www.ambrosiasw.com/utilities/wiretap/">WireTap Studio</a> (to capture the sound) then opened the proposal PDF in Preview, highlighted the text of the document, selected &#8220;Services -&gt; Speech -&gt; Start Speaking Text&#8221; under Preview&#8217;s application menu and hit the record button.  I stopped after 30 seconds and imported the mp3 into iTunes. Sounded terrible: lots of ambient noise and sort of muddy sound quality.   Oh yeah, I was just picking up the tinny sound of the MacBook&#8217;s speakers with the low-quality built-in mic, no wonder it sounded so bad.</p>
<p>Next I tried to use WireTap Studio to intercept the audio stream (could also do this with <a href="http://rogueamoeba.com/audiohijackpro/">Audio Hijack Pro</a>) and found that I couldn&#8217;t seem to tap into (and grab) the Speech services audio.  It wasn&#8217;t associated with the application and no matter what I selected as input, it didn&#8217;t get the speech audio.   I assume it can be done but I wasn&#8217;t having any success.  Time to Google&#8230;</p>
<p>Doh! Turns out there&#8217;s a Unix command baked right into OS X (since 10.3) that not only does exactly what I was trying to do, it does it much faster than the real-time capture I was experimenting with.  Meet &#8220;say&#8221;:</p>
<pre><strong>say [-v voice] [-o out.aiff] [-f file]</strong></pre>
<p>So, I opened the PDF in Preview, used Command-A to select all the text, pasted it into a text file using BBEdit, chopped out the parts I didn&#8217;t care about then saved it to the desktop.  Then in a terminal window, issued this command:</p>
<p><strong>say -v Alex -o ~/desktop/bamboo.aiff -f ~/desktop/bamboo.txt</strong></p>
<p>Alex is the &#8220;new and improved&#8221; voice in OS X 10.5 (Leopard).   He has much better inflection and sounds much more human and much less Cylon.    If you really get into this (or need a voice that deals with a language other than US English), you can purchase additional voices from Cepstral (<a href="http://www.cepstral.com">http://www.cepstral.com</a>).  The voices are roughly $30 each.</p>
<p>In a little less than 3 minutes wall time, &#8216;say&#8217; produced the bamboo.aiff file that was easily imported into iTunes (2 hours, 5 minutes of audio).   Here&#8217;s a representative sample of how Alex sounded with the material:</p>
<p style="padding-left: 60px;"><em>&#8230;information technologists to collectively tackle the question: How can we enhance arts and humanities research through the development of shared technology services? This proposal represents an 18-month planning and community design program, the Bamboo Planning Project, where through a series of conversations and workshops, we will map out the scholarly practices and common technology challenges across and among disciplines, and discover where a coordinated, cross-disciplinary development effort can best foster academic innovation. Input into the Bamboo process&#8230;</em></p>
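<p>If there are several documents to get through, the say command shown above is easy to wrap in a loop. What follows is a minimal sketch, assuming a folder of plain-text files: the function name <code>speak_all</code> and the <code>DRY_RUN</code> switch are hypothetical, and it uses only the -v, -o and -f options already shown.</p>

```shell
# Hypothetical helper: turn every .txt file in a directory into an .aiff file.
# Uses only the say options shown in this post (-v voice, -o output, -f input).
# Set DRY_RUN=1 to print the commands instead of running them.
speak_all() {
  dir="$1"
  voice="${2:-Alex}"                     # default to the Leopard voice
  for txt in "$dir"/*.txt; do
    [ -e "$txt" ] || continue            # no .txt files: nothing to do
    out="${txt%.txt}.aiff"               # bamboo.txt becomes bamboo.aiff
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "say -v $voice -o $out -f $txt"
    else
      say -v "$voice" -o "$out" -f "$txt"
    fi
  done
}
```

<p>For example, <code>speak_all ~/desktop/readings</code> would write one .aiff file next to each .txt file in that (made-up) folder.</p>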
]]></content:encoded>
			<wfw:commentRss>http://timesync.gmu.edu/wordpress/?feed=rss2&amp;p=593</wfw:commentRss>
		</item>
		<item>
		<title>Help Wanted, Passwords, Zotero Syncs! and Bamboo</title>
		<link>http://timesync.gmu.edu/wordpress/?p=591</link>
		<comments>http://timesync.gmu.edu/wordpress/?p=591#comments</comments>
		<pubDate>Fri, 11 Jul 2008 03:07:08 +0000</pubDate>
		<dc:creator>Wally</dc:creator>
		
		<category><![CDATA[Coding]]></category>

		<category><![CDATA[Library Tech]]></category>

		<guid isPermaLink="false">http://timesync.gmu.edu/wordpress/?p=591</guid>
		<description><![CDATA[
Digital Library Developer
We&#8217;ve posted a job advertisement for a Digital Library Developer and I encourage you to apply if you have an interest in building the sort of tools today&#8217;s library could really use but tomorrow&#8217;s digital library will absolutely require.
You&#8217;ll find the full posting (and online application form) at http://jobs.gmu.edu (position number FA730z).
Here&#8217;s the [...]]]></description>
			<content:encoded><![CDATA[<abbr class="unapi-id" title="http://timesync.gmu.edu/wordpress/?p=591"><!-- &nbsp; --></abbr>
<p><em><strong>Digital Library Developer</strong></em></p>
<p>We&#8217;ve posted a job advertisement for a Digital Library Developer and I encourage you to apply if you have an interest in building the sort of tools today&#8217;s library could really use but tomorrow&#8217;s digital library will absolutely require.</p>
<p>You&#8217;ll find the full posting (and online application form) at <a href="http://jobs.gmu.edu">http://jobs.gmu.edu</a> (position number FA730z).</p>
<p>Here&#8217;s the heart of the announcement:</p>
<p>George Mason University, University Libraries seeks a Digital Library Developer to join our innovative Digital Programs and Systems division as we build new ways to deliver library content and services.</p>
<p>Duties include: Anticipating and investigating trends in digital library technology so we can respond quickly to new opportunities. Provide primary support for new initiatives in resource discovery, digital preservation, knowledge management, and scholarly communication. This position reports to the Associate University Librarian for Digital Programs and Systems.</p>
<p><span id="more-591"></span></p>
<p>QUALIFICATIONS: ALA-accredited MLS or ALA-recognized foreign equivalent or Masters in information science or information systems or a related field required. (Other advanced degree may be considered.) Ability to work independently on technology implementation projects; excellent analytical skills to support problem solving and systems analysis; ability to handle multiple, simultaneous priorities. Progressively responsible information technology work experience and demonstrated competence with web development tools and technologies. Capacity to meet requirements for reappointment and promotion under Librarian&#8217;s Handbook.</p>
<p>PREFERRED: Experience with at least one of the following technologies: SOLR/Lucene; Java Enterprise Edition (JEE) to include servlets and JSP with Tomcat or similar technologies; LAMP (Linux, Apache, MySQL, PHP/Perl/Python); experience with XML and related technologies (particularly XPATH and XSLT).</p>
<p>Salary range: $52,500 - $62,500</p>
<p><em><strong>Strong Passwords</strong></em></p>
<p>I&#8217;ve <a href="http://timesync.gmu.edu/wordpress/?p=331">written previously</a> about the value of a strong password (or to be more precise, about the hell that can break loose when you have a weak one).   I recently stumbled across a quick way  to generate a really strong password that you can then let your Mac&#8217;s keychain remember for you.    If you want an 8 character password, in a terminal window, type:</p>
<pre class="syntax-highlight:delphi">
openssl rand -base64 6
</pre>
<p>You&#8217;ll get something like this:</p>
<p>36qOaUvH</p>
<p>For reasons I haven&#8217;t taken the time to discover, if I don&#8217;t run the command as root, I also get an &#8220;unable to write &#8216;random state&#8217;&#8221; message after the password displays.   That&#8217;s annoying but not really a problem.</p>
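<p>The 6 in that command is a count of random bytes, not characters: base64 encodes every 3 bytes as 4 characters, so 6 bytes come out as 8 characters. Here is a small sketch of that arithmetic. The helper names <code>pw_bytes</code> and <code>make_password</code> are hypothetical, and it assumes the length you ask for is a multiple of 4 (other lengths won&#8217;t come out exactly and the output may end in = padding).</p>

```shell
# base64 maps 3 bytes to 4 characters, so N characters need N * 3 / 4 bytes.
# Hypothetical helper: how many random bytes to request for N characters.
pw_bytes() {
  echo $(( $1 * 3 / 4 ))
}

# Hypothetical wrapper around the command shown above.
make_password() {
  openssl rand -base64 "$(pw_bytes "$1")"
}
```

<p>With these defined, <code>make_password 16</code> asks openssl for 12 random bytes and prints a 16-character password.</p>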
<p><em><strong>Zotero&#8217;s syncing</strong></em></p>
<p>A new preview <a href="http://www.zotero.org/documentation/sync_preview">build of Zotero (1.5) is available</a> from my friends over at the <a href="http://chnm.gmu.edu">Center for History and New Media</a>.  Yes, this version lets you sync your Zotero database across multiple computers.   Hallelujah!</p>
<p>I installed it on my office Mac Pro earlier today and on my laptop at home this evening.   Once I had the 1.5 release on the second machine, I added my forum username/password in the Zotero add-on preference pane and hit the sync button&#8230;in seconds I had the 11 records that my desktop machine sent to the sync server earlier today.  Cool.</p>
<p>My personal research toolbox consists of <a href="http://www.devon-technologies.com/products/devonagent/index.html">DevonAgent</a> (for web-based research), a <a href="http://timesync.gmu.edu/wordpress/?p=317">ScanSnap</a> (to convert paper to PDFs) and <a href="http://www.devon-technologies.com/products/devonthink/index.html">DevonThink Pro Office</a> (to OCR the PDFs and then store and index the output of both the ScanSnap and DevonAgent).   This combination works really well for me and thus I never built a large Zotero database.  Now, with syncing, I&#8217;m thinking I&#8217;ll begin working Zotero into my information workflow (ingest into Zotero from whatever browser I&#8217;m using at the moment, then export relevant records to DevonThink as appropriate).</p>
<p><em><strong>Project Bamboo</strong></em></p>
<p>Off to Princeton next week for a <a href="http://www.projectbamboo.org">Project Bamboo</a> workshop.   I&#8217;m sure it will be an interesting and informative experience but right now I&#8217;m just in awe of the scope of planning.   For me, planning is the stuff you do after you&#8217;ve been working for a while and something begins to take shape—you start <em>planning</em> how it might actually be made useful.  Project Bamboo plans for eighteen months with workshops held around the country (and Paris, too)?   This is very different and I&#8217;m looking forward to seeing how it works up close.</p>
]]></content:encoded>
			<wfw:commentRss>http://timesync.gmu.edu/wordpress/?feed=rss2&amp;p=591</wfw:commentRss>
		</item>
		<item>
		<title>Sproutcore</title>
		<link>http://timesync.gmu.edu/wordpress/?p=580</link>
		<comments>http://timesync.gmu.edu/wordpress/?p=580#comments</comments>
		<pubDate>Wed, 02 Jul 2008 16:55:27 +0000</pubDate>
		<dc:creator>Wally</dc:creator>
		
		<category><![CDATA[Apple / OSX]]></category>

		<category><![CDATA[Coding]]></category>

		<guid isPermaLink="false">http://timesync.gmu.edu/wordpress/?p=580</guid>
		<description><![CDATA[

It seems there&#8217;s some sort of new web development framework released every week or two but the other day I found one that shows a lot of promise: Sproutcore.
Odd name but an interesting concept.  At the most recent WWDC (Apple&#8217;s World Wide Developer&#8217;s Conference), Sproutcore was revealed as the &#8220;engine&#8221; behind many of the [...]]]></description>
			<content:encoded><![CDATA[<abbr class="unapi-id" title="http://timesync.gmu.edu/wordpress/?p=580"><!-- &nbsp; --></abbr>
<p><a href="http://timesync.gmu.edu/wordpress/wp-content/uploads/2008/07/sproutcore.png"><img src="http://timesync.gmu.edu/wordpress/wp-content/uploads/2008/07/sproutcore.png" alt="" align="left" /></a></p>
<p>It seems there&#8217;s some sort of new web development framework released every week or two but the other day I found one that shows a lot of promise: <a href="http://www.sproutcore.com/">Sproutcore</a>.</p>
<p>Odd name but an interesting concept.  At the most recent <a href="http://developer.apple.com/wwdc/">WWDC</a> (Apple&#8217;s Worldwide Developers Conference), Sproutcore was revealed as the &#8220;engine&#8221; behind many of the new services on Apple&#8217;s .Mac replacement (<a href="http://www.apple.com/mobileme/">MobileMe</a>).  Many are suggesting the real purpose is an open-source, plugin-free alternative to Adobe&#8217;s <a href="http://www.adobe.com/products/flash/">Flash</a> and Microsoft&#8217;s <a href="http://silverlight.net/">Silverlight</a>.  If you&#8217;re interested in how something like Sproutcore fits in with cloud computing, Google, Flash, and the future, you should read the <a href="http://www.roughlydrafted.com/2008/06/14/cocoa-for-windows-flash-killer-sproutcore/">&#8220;Cocoa for Windows + Flash Killer = SproutCore&#8221;</a> post on Roughly Drafted from June 14th.</p>
<p>From the Sproutcore site:</p>
<p><strong>What is SproutCore?</strong></p>
<p><em>SproutCore is a framework for building applications in JavaScript with remarkably little amounts of code. It can help you build full “thick” client applications in the web browser that can create and modify data, often completely independent of your web server, communicating with your server via Ajax only when they need to save or load data. </em></p>
<p>I spent an hour or so working through the &#8220;hello world&#8221; demo and it&#8217;s cool. You do development coding in Ruby with an interactive server process that simplifies the code-test-debug-code cycle.  When done, there&#8217;s a standalone SproutCore utility that converts everything into static JavaScript and CSS files, ready for deployment under Apache or whatever.  Here&#8217;s my &#8216;production&#8217; version of the demo:</p>
<p><a href="http://deskbox.gmu.edu/static/hello_world/">hello_world</a></p>
<p style="text-align: left;">I tested the look on both Windows (Firefox and IE7) and Mac (Firefox 3) and for this simple demo, at least, rendering was identical across platforms.  I think this is going to be a framework to watch.   It&#8217;s open source, doesn&#8217;t rely on plugins, is reasonably platform neutral (I&#8217;ve seen implementations on Ubuntu and Windows boxes) and relies on basic internet standards (JavaScript and CSS).</p>
<p><a href="http://www.sproutcore.com"><br />
<strong>http://www.sproutcore.com</strong></a></p>
]]></content:encoded>
			<wfw:commentRss>http://timesync.gmu.edu/wordpress/?feed=rss2&amp;p=580</wfw:commentRss>
		</item>
		<item>
		<title>Free Science</title>
		<link>http://timesync.gmu.edu/wordpress/?p=562</link>
		<comments>http://timesync.gmu.edu/wordpress/?p=562#comments</comments>
		<pubDate>Tue, 17 Jun 2008 14:42:29 +0000</pubDate>
		<dc:creator>Wally</dc:creator>
		
		<category><![CDATA[Library Tech]]></category>

		<category><![CDATA[Security]]></category>

		<guid isPermaLink="false">http://timesync.gmu.edu/wordpress/?p=562</guid>
		<description><![CDATA[
I often hear my fellow librarians lamenting the fact that so many students fail to appreciate the wealth of resources available to them.   I agree, up to a point, but  they have to qualify the scope of their complaint the next time I hear it—every day I see students going to great lengths to [...]]]></description>
			<content:encoded><![CDATA[<abbr class="unapi-id" title="http://timesync.gmu.edu/wordpress/?p=562"><!-- &nbsp; --></abbr>
<p>I often hear my fellow librarians lamenting the fact that so many students fail to appreciate the wealth of resources available to them.   I agree, up to a point, but  they have to qualify the scope of their complaint the next time I hear it—every day I see students going to great lengths to use our e-resources.  Trouble is they&#8217;re not our students.</p>
<p>Earlier today I decided to do a quick scan of our log files, sorting for multiple logins on the same username from different IP addresses.  Wasn&#8217;t long before my investigation led me to this page (excerpted below):</p>
<p><a href="http://timesync.gmu.edu/wordpress/wp-content/uploads/2008/06/crooks.jpg"><img class="alignleft" title="crooks" src="http://timesync.gmu.edu/wordpress/wp-content/uploads/2008/06/crooks.jpg" alt="" /></a></p>
<p>View it yourself at: <a href="http://www.smso.net/forum/forumdisplay.php?f=317">http://www.smso.net/forum/forumdisplay.php?f=317</a></p>
<p>If you go to this site, you&#8217;ll find 12 pages of username/password combinations for gaining illicit access to many, many libraries around the world.    Seems the domain (smso.net) is registered to Mutib Al Tamimi on King Fahad Street in Riyadh, Saudi Arabia, and operates under the name Saudi Medical Site Online.</p>
<p>I especially enjoyed this:  explaining how to help keep the SMSO Free Science Team working for all &#8220;Researchers&#8221; (after all, sharing and helping is their hallmark).  It explains why the spelling in the first excerpt looks kinda odd (e.g., IOwa State Un1versity):</p>
<p><a href="http://timesync.gmu.edu/wordpress/wp-content/uploads/2008/06/caminosnapz002.jpg"><img class="alignleft size-full wp-image-565" title="caminosnapz002" src="http://timesync.gmu.edu/wordpress/wp-content/uploads/2008/06/caminosnapz002.jpg" alt="" width="499" height="308" /></a></p>
<p>If you&#8217;re in library IT and responsible for authentication issues, I&#8217;ll recommend a quick visit to the Saudi Medical Site Online—just to make sure you haven&#8217;t left a door open.  As I write this, the <a href="http://www.smso.net/">SMSO home page</a> advertises a new login link for a university library every day so your number will probably come up sooner or later.</p>
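<p>A log scan like the one described above can be sketched with awk. The log layout here is an assumption (client IP in the first whitespace-separated field, username in the second); real proxy logs will differ, so adjust the field numbers to match. The function name <code>shared_logins</code> is hypothetical.</p>

```shell
# Hypothetical log layout: field 1 = client IP, field 2 = username.
# Prints "count username" for every username seen from more than one
# distinct IP address, busiest accounts first.
shared_logins() {
  awk '
    { key = $2 SUBSEP $1 }                               # username + IP pair
    !(key in seen) { seen[key] = 1; count[$2]++ }        # count each pair once
    END { for (u in count) if (count[u] > 1) print count[u], u }
  ' "$@" | sort -rn
}
```

<p>Run it as <code>shared_logins access.log</code> (the file name is a placeholder). A username that shows up from dozens of addresses in a short window is worth a closer look.</p>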
]]></content:encoded>
			<wfw:commentRss>http://timesync.gmu.edu/wordpress/?feed=rss2&amp;p=562</wfw:commentRss>
		</item>
		<item>
		<title>DIY up and running</title>
		<link>http://timesync.gmu.edu/wordpress/?p=556</link>
		<comments>http://timesync.gmu.edu/wordpress/?p=556#comments</comments>
		<pubDate>Fri, 13 Jun 2008 01:21:01 +0000</pubDate>
		<dc:creator>Wally</dc:creator>
		
		<category><![CDATA[Desktop Software]]></category>

		<category><![CDATA[Library Tech]]></category>

		<category><![CDATA[MARS]]></category>

		<category><![CDATA[Off Topic]]></category>

		<guid isPermaLink="false">http://timesync.gmu.edu/wordpress/?p=556</guid>
		<description><![CDATA[
It has taken about a week off and on to get all the pieces in place: installation of the scanner, software installation, camera calibration, three days lost while we waited for a replacement lighting housing, RTFM&#8217;ing and what not but I finished generating my first e-book with the ATIZ Bookdrive DIY scanner late this afternoon [...]]]></description>
			<content:encoded><![CDATA[<abbr class="unapi-id" title="http://timesync.gmu.edu/wordpress/?p=556"><!-- &nbsp; --></abbr>
<p><a href="http://timesync.gmu.edu/wordpress/wp-content/uploads/2008/06/55almanac4.jpg"><img title="55almanac4" src="http://timesync.gmu.edu/wordpress/wp-content/uploads/2008/06/55almanac4.jpg" alt="" align="left" /></a>It has taken about a week off and on to get all the pieces in place: installation of the scanner, software installation, camera calibration, three days lost while we waited for a replacement lighting housing, RTFM&#8217;ing and what not but I finished generating my first e-book with the <a href="http://www.atiz.com">ATIZ</a> Bookdrive DIY scanner late this afternoon (the title is one I picked up in a used bookshop in Boston a few years back).</p>
<p>It took roughly 20 minutes to scan the 240+ page book (the scanning software indicated I was working thru the book at a 632 pages-per-hour clip) and then another 30 minutes or so for post-processing (deskewing the jpg images, applying automatic crops, despeckling the page backgrounds, improving contrast and ultimately producing a PDF).   While scanning is a hands-on endeavor, for the most part post-processing runs unattended (after you interactively test and then set processing parameters).</p>
<p>The &#8220;optimized&#8221; version of the PDF weighed in at 18 megabytes, a major improvement over the &#8220;raw&#8221; version (500MB).   I suspect I can get that down but will have to experiment with what lower-resolution scans will do (I was working at 300dpi).  I still need to work through various post-processing options (e.g., although the pages were faded the sample could use some brightening and I think I needed to use a higher aperture for better focus), but I can tell this is going to work well once I figure out the optimal settings and workflow.</p>
<p>Here&#8217;s a link to a sample PDF with a few pages from the book.  The schematics of older ballparks are kind of interesting.</p>
<p><a href='http://timesync.gmu.edu/wordpress/wp-content/uploads/2008/06/1955sample.pdf'>Sample </a>  (800K)</p>
]]></content:encoded>
			<wfw:commentRss>http://timesync.gmu.edu/wordpress/?feed=rss2&amp;p=556</wfw:commentRss>
		</item>
		<item>
		<title>Moving Forward with Backing Up</title>
		<link>http://timesync.gmu.edu/wordpress/?p=552</link>
		<comments>http://timesync.gmu.edu/wordpress/?p=552#comments</comments>
		<pubDate>Wed, 11 Jun 2008 20:56:57 +0000</pubDate>
		<dc:creator>Wally</dc:creator>
		
		<category><![CDATA[Coding]]></category>

		<category><![CDATA[General]]></category>

		<category><![CDATA[Library Tech]]></category>

		<category><![CDATA[MARS]]></category>

		<category><![CDATA[Security]]></category>

		<guid isPermaLink="false">http://timesync.gmu.edu/wordpress/?p=552</guid>
		<description><![CDATA[
If you have sysadmin duties in a library like Mason&#8217;s (where our core technologies actually reside in the library and not the computer center), backing up is one big part of what you do.  Though heretical at the time, we abandoned tape for disk-to-disk backup in the early &#8217;90&#8217;s so while it doesn&#8217;t take [...]]]></description>
			<content:encoded><![CDATA[<abbr class="unapi-id" title="http://timesync.gmu.edu/wordpress/?p=552"><!-- &nbsp; --></abbr>
<p>If you have sysadmin duties in a library like Mason&#8217;s (where our core technologies actually reside in the library and not the computer center), backing up is one big part of what you do.  Though heretical at the time, we abandoned tape for disk-to-disk backup in the early &#8217;90s so while it doesn&#8217;t take a lot of any one person&#8217;s time it&#8217;s still something that has to be scripted, tested, monitored and regularly thought about.  When it comes to the actual work of getting the backups done, well, it&#8217;s pretty much just one machine talking to another at cron-induced intervals.<br />
<span id="more-552"></span></p>
<p>For some of our systems, backup is a simple script containing a find/cpio combination.  Here&#8217;s the key:</p>
<pre>cd /DirectoryToBackup
find . -depth -print | cpio -pdmv /PlaceWhereYouWantToPutStuff</pre>
<p>&#8216;find&#8217; steps recursively through all the files and sub-directories, sending each file&#8217;s name to &#8216;cpio&#8217; which then copies it to the destination <strong>if the file is newer</strong> than a copy in the destination directory.  You probably see the gotcha here:  If you have deleted a file that was backed up last week it will stay in the destination directory since there&#8217;s nothing newer to overwrite it.</p>
<p>That&#8217;s where rsync shines: it makes sure the destination is identical to the original (deleting files in the destination directory that don&#8217;t exist in the original).   But I digress..</p>
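<p>To make the contrast with find/cpio concrete, here is a minimal rsync sketch. The -a (archive) and --delete options are standard rsync flags; the function name <code>mirror_dir</code> is hypothetical and this is an illustration, not one of our production scripts. Because --delete removes files from the destination, it&#8217;s worth adding -n (dry run) the first time to see what would go.</p>

```shell
# Hypothetical helper: make the destination an exact mirror of the source.
# -a preserves permissions, times and symlinks and recurses into directories;
# --delete removes destination files that no longer exist in the source.
mirror_dir() {
  src="$1"
  dest="$2"
  # The trailing slash on the source copies its contents,
  # not the directory itself.
  rsync -a --delete "$src"/ "$dest"/
}
```

<p>Using the placeholder names from the cpio example, <code>mirror_dir /DirectoryToBackup /PlaceWhereYouWantToPutStuff</code> copies new and changed files and also clears out last week&#8217;s deleted ones.</p>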
<p>For some of our systems we&#8217;ve created elaborate backup/redundancy environments. Our Voyager ILS, for example, lives on a striped, mirrored, hot-spared RAID5 array inside a Sun V880 which gives an immediate measure of protection against any single disk failure.  Each evening, we take the system down for about 13 minutes and run a backup from this mirrored array to yet another striped partition on a different set of drives in the same machine.   When that finishes, Voyager returns and a follow-on process copies that new &#8220;backup&#8221; partition to another machine in yet another part of the library.  This copy of the backup goes to a different machine each night so that over the course of seven days, we have seven copies spread out across seven different systems.   Were we to have catastrophic failure, we could cycle back through these backups and surely find one that could be restored.</p>
<p>For most of our other systems, like <a href="http://mars.gmu.edu">MARS</a> (our DSpace institutional/digital repository) less manic measures are sufficient.  The hardware environment for that system enables us to function well on a weekly backup.</p>
<p>MARS lives on an XServe RAID (RAID5) array with two hot spare drives so we&#8217;re really only talking about needing to survive a failure of more than three drives (a component of the array and both spares), which isn&#8217;t likely in the course of a single week. At weekly intervals we mirror the bitstreams partition to another RAID5 partition on a nearby server.   We follow a slightly different process for the postgres-based metadata but always have several restorable copies lying about.</p>
<p>For those services that live on Apple XServes (e.g., <a href="http://furbo.gmu.edu/cgi-bin/ers/OSCRgen.cgi">E-Reserves</a>, our <a href="http://phobos.gmu.edu/melange">research portals</a>, <a href="http://mars.gmu.edu">MARS</a> and parts of the library&#8217;s <a href="http://furbo.gmu.edu/dbwiz/SPT--BrowseResources.php">website</a>), we use <a href="http://www.shirt-pocket.com/SuperDuper/SuperDuperDescription.html">SuperDuper!</a> to clone the system boot drive (with Applications) to a spare internal drive on a weekly schedule and then mirror important data via rsync to another machine.</p>
<p>At the center of our backup strategy for every system is an Apple XServe with a 1.8 TB RAID5 array.  Throughout the week this machine sets up and tears down NFS exports to our Solaris, Apple or Linux servers for rsync mirroring of important data.</p>
<p style="padding-left: 30px;"><em>Note: To improve security, we export the relevant directory from the backup server to the specific machine we want to backup.  The backup process is launched on the production machine (via cron) which then mounts the exported drive from our backup server.   We never export directories from our production machines and restrict access to the NFS daemons via hosts.allow.   Not foolproof, of course, but combined with a few other measures it is much more secure than failing to make backups would prove to be.<br />
</em></p>
<p>Then, weekly this large data store of backups is mirrored to a <a href="http://eshop.macsales.com/item/Other%20World%20Computing/MEAQ7500GB16/">500GB external drive</a> attached to the XServe via the FireWire port.  We have two of these FireWire drives and rotate them weekly: When finished, this week&#8217;s copy goes across campus to another building and last week&#8217;s drive returns and is connected to capture next week&#8217;s data.   In this way, we have a copy of all data stored off site and assume it will never be more than a couple of weeks old.    We&#8217;ll next have to look into getting a drive completely off campus since we&#8217;re reasonably well protected in case of something like a building fire but still quite vulnerable to a disaster with a slightly larger footprint (e.g., meteorites, flood, etc.).</p>
<p><img src="http://timesync.gmu.edu/wordpress/wp-content/uploads/2008/06/ff223d0d-f48c-486f-9542-786ea2e79c3a.jpg" border="0" alt="FF223D0D-F48C-486F-9542-786EA2E79C3A.jpg" width="94" height="96" align="left" />Earlier this week I incorporated a <a href="http://www.drobo.com/">Drobo</a> into the backup regimen: as one more spot where I can store the most important 500 or so gigabytes of library data.    I decided to go with the Drobo since I also had a half-terabyte of video that I needed to back up somewhere and putting it on an expensive RAID unit seemed a waste of higher-throughput space.  The Drobo seemed a reasonable alternative.   The Drobo isn&#8217;t fast by any means (throttled by a USB 2 interface and a controller that doesn&#8217;t appear to offer much in the way of throughput) but it is dead-simple easy to use. To quote Howlin&#8217; Wolf, &#8220;it&#8217;s <a href="http://www.bluesforpeace.com/lyrics/built-for-comfort.htm">built for comfort</a>, it ain&#8217;t built for speed.&#8221;</p>
<p>I put four 500GB Seagate SATA-2 drives in the Drobo (yielding roughly 1.36 terabytes of protected storage) and in no time had a functioning array ready for data (by the way, SATA-I would be fine since the Drobo can&#8217;t keep up with that level of throughput either).  It&#8217;s not RAID; instead the Drobo uses some sort of proprietary storage virtualization system.    According to the documentation, if a drive in the Drobo fails, doing a hot swap with a replacement drive will restore the data and reestablish redundancy.  I&#8217;ll just think of it as a black box appliance.</p>
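<p>One plausible reading of that 1.36 terabyte figure, offered as an assumption rather than anything from the Drobo documentation: with four equal drives, one drive&#8217;s worth of space goes to redundancy, and the remaining 1,500 decimal gigabytes are reported in binary terabytes (2^40 bytes). The helper name <code>usable_tib</code> is hypothetical.</p>

```shell
# Assumption: with N equal drives, one drive's worth of capacity is reserved
# for redundancy, and the total is reported in binary terabytes (2^40 bytes).
# Arguments: number of drives, size of each drive in decimal gigabytes.
usable_tib() {
  awk -v n="$1" -v gb="$2" \
    'BEGIN { printf "%.2f\n", (n - 1) * gb * 1e9 / 2^40 }'
}

usable_tib 4 500    # prints 1.36
```

<p>The same arithmetic with three 500GB drives gives about 0.91, so this sketch is only a rough guide to what a given mix of equal drives might yield.</p>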
<p>With the Drobo attached to my Mac Pro, I just completed a backup of close to 280 gigabytes of data from an external drive attached to the same computer via FireWire 800. On larger files, throughput averaged roughly 15MB/s according to the rsync progress reports:</p>
<pre>/weblogs/ezproxy/FY0506/01_31_06-03_03_06.log
  1015703564 100%   15.83MB/s    0:01:01  (65313, 100.0% of 268220)
/weblogs/ezproxy/FY0506/10_25_05-01_30_2006.log
  2147483647 100%   15.97MB/s    0:02:08  (65314, 100.0% of 268220)
/weblogs/ezproxy/FY0506/ezproxy.log.030306_050306
  1847436796 100%   16.10MB/s    0:01:49  (65315, 100.0% of 268220)
/weblogs/ezproxy/FY0506/ezproxy.log.082505_102405
  1561279782 100%   15.86MB/s    0:01:33  (65316, 100.0% of 268220)</pre>
<p>The plan going forward is to take the external drive that&#8217;s pulled from our XServe each week and rsync it to the Drobo before sending it across campus to its &#8220;undisclosed&#8221; location.  This will give me a nearby copy of the data and thanks to the Drobo&#8217;s built-in redundancy, a safer feeling than I have exposing backup data to the failure of a single-spindle drive.</p>
]]></content:encoded>
			<wfw:commentRss>http://timesync.gmu.edu/wordpress/?feed=rss2&amp;p=552</wfw:commentRss>
		</item>
	</channel>
</rss>
