<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Jamie Todd Rubin</title>
	<atom:link href="https://jamierubin.net/feed/" rel="self" type="application/rss+xml" />
	<link>https://jamierubin.net</link>
	<description>Writer, Coder, Avid Reader</description>
	<lastBuildDate>Tue, 09 Jun 2026 18:16:07 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	

<image>
	<url>https://i0.wp.com/jamierubin.net/wp-content/uploads/2021/06/cropped-cropped-JTR-Social-Media-New.jpg?fit=32%2C32&#038;ssl=1</url>
	<title>Jamie Todd Rubin</title>
	<link>https://jamierubin.net</link>
	<width>32</width>
	<height>32</height>
</image> 
<site xmlns="com-wordpress:feed-additions:1">194599522</site>	<item>
		<title>`ark`: A Personal Archive System, Part 3: The Store — Where the Archive Actually Lives</title>
		<link>https://jamierubin.net/2026/06/09/ark-part-3/</link>
					<comments>https://jamierubin.net/2026/06/09/ark-part-3/#respond</comments>
		
		<dc:creator><![CDATA[Jamie Todd Rubin]]></dc:creator>
		<pubDate>Tue, 09 Jun 2026 18:13:36 +0000</pubDate>
				<category><![CDATA[Technology & Gadgets]]></category>
		<category><![CDATA[essay]]></category>
		<guid isPermaLink="false">https://jamierubin.net/?p=27705</guid>

					<description><![CDATA[That last post. What can I say. Four thousand plus words? Did I really do that to you? It won&#8217;t happen again. If I could do it over, I would. When I set out to (very tentatively) write this series of posts on ark, I intended it as a way of showcasing something fun [&#8230;]]]></description>
										<content:encoded><![CDATA[
<p class="wp-block-paragraph">That <a href="https://jamierubin.net/2026/06/03/ark-part-2/">last post</a>. What can I say. Four thousand plus words? Did I really do that to you? It won&#8217;t happen again. If I could do it over, I would<sup data-fn="cd57032d-680f-4726-9f47-a61c85e52bf5" class="fn"><a href="#cd57032d-680f-4726-9f47-a61c85e52bf5" id="cd57032d-680f-4726-9f47-a61c85e52bf5-link">1</a></sup>. When I set out to (very tentatively) write this series of posts on <code>ark</code>, I intended it as a way of showcasing something fun that I built that actually turned out to be the thing that I always wanted to build. I forgot to bring the fun to that last post and it reads (to me) like one big &#8220;look what I can do!&#8221; flail. And this piece! This piece started out in the same direction, only worse. It was mired in technical detail. At one point, in the first draft, I wrote:</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p class="wp-block-paragraph">This is naturally a more technical piece than most that I write, given the nature of what I am describing. I&#8217;ll do my best to smooth out those rough edges, but know that I am aware this isn&#8217;t for my usual audience.</p>
</blockquote>



<p class="wp-block-paragraph">Lazy, lazy, lazy. And not the right intention. Fortunately, I caught onto what I was doing when only a thousand words or so had been set down. Pauciloquy<sup data-fn="49d49dd4-d6d4-4d86-9ad0-5f888aa46b5a" class="fn"><a href="#49d49dd4-d6d4-4d86-9ad0-5f888aa46b5a" id="49d49dd4-d6d4-4d86-9ad0-5f888aa46b5a-link">2</a></sup> is called for and there is still (barely) enough to recover. And so here we are, about to talk about <code>ark</code>&#8216;s store, or, where&#8217;d all those files go? Where to begin&#8230;?</p>



<h2 class="wp-block-heading">The Dreaded Org Chart</h2>



<p class="wp-block-paragraph">I&#8217;ve been unsatisfied with the hierarchical structure of file systems since I ran that very first <code>catalog</code> command on an Apple <code>][ e</code>. I&#8217;ve lived with them ever since, an accumulating succession of decades that have done little to quench the burning dissatisfaction with the way files are stored on computers. It sometimes seems like a large part of my avocation in technology has been a desperate search for a way out of the rigid hierarchy<sup data-fn="da0ac8d6-43f8-431f-9e12-68c8fd70bd21" class="fn"><a href="#da0ac8d6-43f8-431f-9e12-68c8fd70bd21" id="da0ac8d6-43f8-431f-9e12-68c8fd70bd21-link">3</a></sup>. &#8220;Tear down the wall!&#8221;</p>



<p class="wp-block-paragraph">The search <a href="https://jamierubin.net/going-paperless/">took me to Evernote</a> with its notebooks and tags, <a href="https://jamierubin.net/blog-series/practically-paperless-with-obsidian/">and then to Obsidian</a> with its org chart and tags. But Obsidian introduced, to me at least, the notion of a graph: that is, links between files that form edges between nodes. It is a powerful idea, more powerful than I realized. And so when I approached this hobby project and was considering the design, I had two strong ideas in mind:</p>



<ol class="wp-block-list">
<li>Get me off this org chart!</li>



<li>How might I take advantage of graphs?</li>
</ol>



<h2 class="wp-block-heading">Simple Requirements</h2>



<p class="wp-block-paragraph">In my limited imagination, there are two poles on the file storage spectrum<sup data-fn="0ee0f906-4545-436c-9daf-d06e60559b68" class="fn"><a href="#0ee0f906-4545-436c-9daf-d06e60559b68" id="0ee0f906-4545-436c-9daf-d06e60559b68-link">4</a></sup>: a simple listing of files and a graph of files where every file points to every other file. <code>ark</code> is designed to be as close to the simple listing of files as possible. My requirements were, therefore, simple:</p>



<ol class="wp-block-list">
<li>There should only ever be one of each item. <em>Preventing</em> duplicates makes things easier to find. I can&#8217;t tell you how many times I have found three different copies of a Word document or photo on my computer.</li>



<li>Items in the archive must be described separately from the files themselves. File systems provide the bare minimum capacity for describing a file. An archive is more than a file system so I need a way of describing those files to make finding them as easy as possible.</li>



<li>The archive only stores finished products. Working documents, working files don&#8217;t get into the archive until they are finished.</li>
</ol>



<p class="wp-block-paragraph">With these requirements in hand, I set about meeting each one. To ensure that each item in the archive is unique, it gets a unique file name based on its digital DNA. The set of bytes that makes up a file can be &#8220;hashed&#8221; into a number that is, for all practical purposes, unique to that set of bytes. <code>ark</code> uses <code>sha256</code> for its hashing mechanism. That number becomes not only the name of the file, but its identifier in <code>ark</code>&#8216;s database. What it means in practice is that if I bring in an exact copy of a file that already exists in <code>ark</code>, it doesn&#8217;t get added a second time; it is simply ignored in favor of the copy that is already in the system.</p>
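
<p class="wp-block-paragraph">As a rough illustration of the idea (a minimal Python sketch, not <code>ark</code>'s actual code), the hash of the bytes can serve as both name and identifier, and a second copy of the same bytes is simply skipped. The function names and the in-memory dictionary are hypothetical stand-ins for the real store:</p>

```python
import hashlib


def content_id(data: bytes) -> str:
    """Return the SHA-256 hex digest of the bytes; it doubles as name and ID."""
    return hashlib.sha256(data).hexdigest()


def add_item(store: dict, data: bytes) -> tuple:
    """Add bytes to an in-memory store keyed by content ID.

    Returns (item_id, added); added is False when identical bytes are present.
    """
    item_id = content_id(data)
    if item_id in store:
        return item_id, False  # exact duplicate: keep the existing copy
    store[item_id] = data
    return item_id, True


store = {}
first_id, first_added = add_item(store, b"tax return 2025")
second_id, second_added = add_item(store, b"tax return 2025")
print(first_id == second_id, first_added, second_added, len(store))
# prints: True True False 1
```

<p class="wp-block-paragraph">Because the identifier is derived from the content alone, two files with different names but identical bytes collapse into one item.</p>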



<p class="wp-block-paragraph">To describe the files in the system, <code>ark</code> uses a SQLite database. This allows <code>ark</code> to have full-text search and semantic search capabilities without running a database server. The SQLite database is just another file on my computer. True, SQLite is not designed to be a multi-user database, but <code>ark</code> is not designed to be a multi-user application, so we&#8217;re all good here. All of the metadata needed to describe a file is stored in the database. That metadata breaks down into five categories:</p>



<ol class="wp-block-list">
<li>Classification (doc type, series, sub-series, format, sensitivity, priority, etc.)</li>



<li>Provenance (date authored, record origin, original source path, physical location, etc.)</li>



<li>Identification (doc ID, sha256, title, store path)</li>



<li>Quality (OCR status, OCR quality)</li>



<li>Content (full-text search content, LLM summary, embeddings)</li>
</ol>
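
<p class="wp-block-paragraph">To make the five categories concrete, here is a minimal sketch using Python's built-in <code>sqlite3</code> module. The table and column names are assumptions for illustration, not <code>ark</code>'s real schema, and the substring search is only a stand-in for real full-text or semantic search:</p>

```python
import sqlite3

# Hypothetical schema: a column or two for each metadata category in the list above.
SCHEMA = """
CREATE TABLE documents (
    sha256      TEXT PRIMARY KEY,   -- identification
    title       TEXT,               -- identification
    doc_type    TEXT,               -- classification
    sensitivity TEXT,               -- classification
    authored_on TEXT,               -- provenance (ISO date)
    ocr_status  TEXT,               -- quality
    content     TEXT                -- content used for search
)
"""


def open_catalog(path=":memory:"):
    """Open (or create) the catalog; the whole database is a single local file."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def describe(conn, row):
    """Insert a metadata row; INSERT OR IGNORE skips a sha256 already cataloged."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO documents VALUES (?, ?, ?, ?, ?, ?, ?)", row
    )
    return cur.rowcount == 1


def find(conn, term):
    """Naive substring search over title and content."""
    like = "%" + term + "%"
    rows = conn.execute(
        "SELECT title FROM documents WHERE title LIKE ? OR content LIKE ? "
        "ORDER BY title",
        (like, like),
    )
    return [r[0] for r in rows]


conn = open_catalog()
row = ("ab12", "Water bill", "pdf", "normal", "2026-05-01", "done",
       "Utility invoice for May")
print(describe(conn, row), describe(conn, row), find(conn, "invoice"))
# prints: True False ['Water bill']
```

<p class="wp-block-paragraph">Making the hash the primary key means the database enforces the one-of-each-item rule on its own.</p>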



<p class="wp-block-paragraph">Finally, <code>ark</code> uses a &#8220;copy-in&#8221; strategy for files. That means that the source file is left untouched, wherever it comes from, and a copy of the file is brought into the archive. Because of this, and the other requirements I listed above, I can store all of the files in a flat structure within <code>ark</code>&#8216;s store. After all, I never need to know the actual file name. I just need to be able to describe what I am looking for and the database takes care of the rest.</p>
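
<p class="wp-block-paragraph">A copy-in step along these lines might look like the following Python sketch. It is an assumption-laden illustration, not <code>ark</code>'s implementation: the function names are hypothetical, and whether the real store keeps file extensions or shards the directory is not stated in this post.</p>

```python
import hashlib
import shutil
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large files never need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_in(source: Path, store_dir: Path) -> Path:
    """Copy `source` into a flat store under its hash; the source is left untouched."""
    store_dir.mkdir(parents=True, exist_ok=True)
    target = store_dir / file_sha256(source)
    if not target.exists():  # identical bytes already stored: nothing to do
        shutil.copyfile(source, target)
    return target
```

<p class="wp-block-paragraph">Calling <code>copy_in</code> twice on files with the same bytes returns the same path and leaves a single file in the store, which is why a flat directory is enough.</p>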



<h2 class="wp-block-heading">Connections</h2>



<p class="wp-block-paragraph">In your standard OS, files sit there on a file system completely unconnected. But in a personal archive, people are first-class citizens. So in addition to the database storing information about files, it stores information about the people in those files. And since it is the connections that make an archive like this come alive, <code>ark</code> supports three kinds of connections:</p>



<ol class="wp-block-list">
<li><strong>Document-to-document</strong>: one email is linked to another as a &#8220;reply-to&#8221;. A receipt is linked to an invoice. Documents can link to one another.</li>



<li><strong>Document-to-person</strong>: Who wrote it, received it, who&#8217;s mentioned, who&#8217;s the subject. The same person may show up in lots of roles across documents. This is the single most-queried table in the archive. Anything person-aware reads from here.</li>



<li><strong>Person-to-person</strong>: Friends, family, colleagues, including date ranges. A handyman who works on your house may retire, and someone else takes over. The connections capture it all.</li>
</ol>
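
<p class="wp-block-paragraph">For readers who think in tables, the document-to-person connection can be sketched as an edge table in SQLite. Everything named here (tables, columns, sample rows) is hypothetical and only meant to show the shape of the idea:</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE documents (id TEXT PRIMARY KEY, title TEXT, authored_on TEXT);
CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
-- One row per edge; the same person can hold different roles on different documents.
CREATE TABLE document_people (doc_id TEXT, person_id INTEGER, role TEXT);
""")
conn.executemany("INSERT INTO documents VALUES (?, ?, ?)", [
    ("d1", "Letter to a friend", "2026-01-12"),
    ("d2", "Concert photos", "2026-05-19"),
    ("d3", "Old school report", "2019-03-02"),
])
conn.execute("INSERT INTO people VALUES (1, 'Sam')")
conn.executemany("INSERT INTO document_people VALUES (?, ?, ?)", [
    ("d1", 1, "mentioned"), ("d2", 1, "subject"), ("d3", 1, "subject"),
])


def documents_about(conn, person_id, since):
    """Follow person-to-document edges, keeping documents dated on or after `since`."""
    rows = conn.execute(
        """SELECT d.title, dp.role
           FROM document_people AS dp
           JOIN documents AS d ON d.id = dp.doc_id
           WHERE dp.person_id = ? AND d.authored_on >= ?
           ORDER BY d.authored_on""",
        (person_id, since),
    )
    return rows.fetchall()


print(documents_about(conn, 1, "2024-06-09"))
# prints: [('Letter to a friend', 'mentioned'), ('Concert photos', 'subject')]
```

<p class="wp-block-paragraph">The query never asks where a document was filed; it only follows edges from the person, which is the point of the graph.</p>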



<p class="wp-block-paragraph">Here&#8217;s where things get fun! In an org chart, the answer to &#8220;show me everything about my son from the last two years&#8221; depends on remembering which folder you filed each thing in. In a graph, it doesn&#8217;t. I asked Claude Desktop<sup data-fn="4f27e668-0e90-4579-a814-3a32cca089b2" class="fn"><a href="#4f27e668-0e90-4579-a814-3a32cca089b2" id="4f27e668-0e90-4579-a814-3a32cca089b2-link">5</a></sup> that exact question and got back this: </p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p class="wp-block-paragraph"><strong>Theater &amp; Performance</strong> — The dominant thread. He&#8217;d been active in community theater for a while: directing plays, performing in musicals (the creepy Wolf in Into the Woods), and preparing for the role of Edwin Drood at his high school. He&#8217;s a baritone/tenor in the choral group and was being recruited for an a cappella group. By May 2026, a choral concert I attended. Then the big news: he and a friend auditioned for <em>Guys and Dolls</em> — a six-hour audition — and he was offered the lead role of Nathan Detroit. Around 200 lines.</p>



<p class="wp-block-paragraph"><strong>Prom</strong> — May 19, 2026. He went with a friend. Dinner first, then photos at the Jefferson Memorial via Uber Black, then the prom itself. I left the car at the venue so he could drive home. He got home around 10pm, changed, headed to a post-prom party, was home just after 1am. I noted he &#8220;seemed very grown up about the whole thing.&#8221;</p>



<p class="wp-block-paragraph"><strong>Memorial Day 2026</strong> — At our friends&#8217; house. He &#8220;christened the pool as he does every year.&#8221;</p>



<p class="wp-block-paragraph"><strong>College Prep</strong> — From a family Zoom in late May: he&#8217;s working on the Common Application and asking for recommendation letters. Summer plans still up in the air — SAT prep, camp counselor work, or a possible Europe trip.</p>



<p class="wp-block-paragraph"><strong>Lawn</strong> — An October 2025 iMessage asks whether he can cut the lawn that weekend.</p>



<p class="wp-block-paragraph"><strong>Rutgers campus tour </strong>— Mentioned in passing in the Zoom notes: I drove by my childhood New Jersey house &#8220;after a Rutgers campus tour with him.&#8221;</p>



<p class="wp-block-paragraph"><strong>Cotillion</strong> — Per a January 2026 letter, he was working as an instructor — teaching students to dance and how to conduct themselves in society.</p>



<p class="wp-block-paragraph"><strong>Driving</strong> — Also from that January letter, the dry observation that his having a license &#8220;does wonders for the nervous system.&#8221;</p>
</blockquote>



<p class="wp-block-paragraph">One question; every connected thing — pulled together by edges the archive had been quietly maintaining while I wasn&#8217;t looking. Letters, iMessages, Zoom call notes, photos, calendar entries, diaries, social media — formats I don&#8217;t normally think of as connected, returning one coherent picture of my kid over a two-year span<sup data-fn="fae962e7-8160-4a93-b80b-f20ab949065d" class="fn"><a href="#fae962e7-8160-4a93-b80b-f20ab949065d" id="fae962e7-8160-4a93-b80b-f20ab949065d-link">6</a></sup>. The org chart could have stored these. Only the graph could have answered them. (More on how that LLM call works under the hood — bundles, edges, redaction — next post.)</p>



<h2 class="wp-block-heading">Two Ways to Organize</h2>



<p class="wp-block-paragraph">In looking at how archivists tend to organize archives, a 4-tiered, um, hierarchy (sorry!) emerged as a trend:</p>



<ol class="wp-block-list">
<li>Series (biographical, correspondence, writings, research, professional, financial, legal, medical, etc.).</li>



<li>Sub-series (fixed categories that fall underneath each of the series).</li>



<li>File. A collection of items in a series/sub-series in a physical archive.</li>



<li>Item. The thing itself.</li>
</ol>



<p class="wp-block-paragraph">In <code>ark</code> we have hard mappings to three of the four: series, sub-series, and the item itself.</p>



<p class="wp-block-paragraph">Series and sub-series are categories that form a controlled vocabulary. But I find it useful to have user-curated groupings as well. While <code>ark</code> can use tags, I created something called a &#8220;collection,&#8221; which is a curated grouping named after the reason that the items are grouped together. For example, #2026-tax-documents, or #2019-house-purchase, or #vacation-in-the-golden-age-notes. Documents can, of course, have a series and sub-series, be tagged, <em>and</em> be members of one or more collections. The nice thing about collections is that they can be used as input for other <code>ark</code> commands. For instance:</p>



<p class="wp-block-paragraph"><code>ark bundle '#vacation-in-the-golden-age-notes' | ark task summarize</code></p>



<p class="wp-block-paragraph">which will create a bundle of all of the items in the <code>#vacation-in-the-golden-age-notes</code> collection and then use an LLM to summarize the entire bundle.</p>



<p class="wp-block-paragraph">Bottom line: a collection is a list of items in the archive <em>with context</em>.</p>
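
<p class="wp-block-paragraph">One way to picture a collection in code is as a named, ordered set of item IDs that carries its reason for existing. This is a hypothetical Python sketch; the class, its fields, and the <code>resolve</code> helper are illustrative assumptions rather than <code>ark</code>'s data model:</p>

```python
from dataclasses import dataclass, field


@dataclass
class Collection:
    name: str    # for example "2026-tax-documents"
    reason: str  # the context: why these items belong together
    members: list = field(default_factory=list)  # document IDs, in curated order

    def add(self, doc_id: str) -> None:
        """Add an item once; order of first addition is preserved."""
        if doc_id not in self.members:
            self.members.append(doc_id)


def resolve(collections: dict, ref: str) -> list:
    """Turn a '#name' reference into the member IDs a downstream command consumes."""
    if not ref.startswith("#"):
        raise ValueError("collection references start with '#'")
    return list(collections[ref[1:]].members)


taxes = Collection("2026-tax-documents", "Everything needed to file for 2026")
taxes.add("doc-a")
taxes.add("doc-b")
taxes.add("doc-a")  # already a member, ignored
collections = {taxes.name: taxes}
print(resolve(collections, "#2026-tax-documents"))
# prints: ['doc-a', 'doc-b']
```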



<h2 class="wp-block-heading">Some Things Aren&#8217;t Documents</h2>



<p class="wp-block-paragraph">Most things in <code>ark</code> are documents: emails, PDFs, photos, diary entries, Office documents, text files. But several things in the archive aren&#8217;t documents; instead, they have their own dedicated database tables. These include reading events (to manage my reading list), health data from Apple Health and Fitbit, location data pulled off photos and extracted from other sources like diary entries, and more.</p>



<p class="wp-block-paragraph">I&#8217;ll write about each of these later on in this series. For the store, it is useful to know that these live alongside the document model and follow the same rules: they are addressable with a unique <code>sha256</code> identifier, auditable, and integrated with <code>ark</code>&#8216;s core command set (although they sometimes have commands of their own).</p>



<h2 class="wp-block-heading">Two Things That Touch Everything</h2>



<p class="wp-block-paragraph">Two things in <code>ark</code>&#8216;s data model don&#8217;t sit in any one layer; they sit over all of them:</p>



<ol class="wp-block-list">
<li><strong>Sensitivity</strong>: This is set on every item that comes into the archive. The archive treats sensitivity as a query filter, not a display hint. Items that are sensitive are automatically routed through different code paths. For example, if I use Claude Desktop to ask a question and the result includes sensitive data, the code path prevents &#8220;restricted&#8221; data from leaving the local machine so that it never gets to Claude Desktop. If the data is marked &#8220;sensitive&#8221;, any sensitive information is stripped and replaced with &#8220;[REDACTED]&#8221; before being sent off the local machine. This is true everywhere data might be exfiltrated off the local machine.</li>



<li><strong>Annotations</strong>: I have a lot to say about things (pauciloquy goes only so far). I engineered the annotations layer to sit atop everything in the archive. This way, I can add notes and comments to a document, a book record, a watch event, an Apple Health record, a person — anything in the archive can be annotated. Those annotations are searchable, and they are surfaced most commonly when looking at a document in the archive. This allows me to add context without touching the original item.</li>
</ol>
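
<p class="wp-block-paragraph">The sensitivity rule can be sketched as a filter applied before anything leaves the machine. In this Python sketch the level names "restricted" and "sensitive" come from the description above, while the "normal" level, the record layout, and the regular expressions used to spot sensitive text are assumptions; the post does not say how <code>ark</code> detects what to redact.</p>

```python
import re

# Hypothetical patterns for "sensitive information" (an assumption for illustration).
PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # US SSN-shaped numbers
    re.compile(r"\b\d{13,16}\b"),          # long digit runs such as card numbers
]


def prepare_for_export(docs):
    """Return only what may leave the local machine.

    restricted: dropped entirely; sensitive: redacted; anything else: unchanged.
    """
    out = []
    for doc in docs:
        level = doc["sensitivity"]
        if level == "restricted":
            continue  # never leaves the machine
        text = doc["text"]
        if level == "sensitive":
            for pattern in PATTERNS:
                text = pattern.sub("[REDACTED]", text)
        out.append({"id": doc["id"], "text": text})
    return out


docs = [
    {"id": "a", "sensitivity": "normal", "text": "Lunch on Friday"},
    {"id": "b", "sensitivity": "sensitive", "text": "SSN 123-45-6789 on file"},
    {"id": "c", "sensitivity": "restricted", "text": "Medical results"},
]
for item in prepare_for_export(docs):
    print(item["id"], item["text"])
# prints:
# a Lunch on Friday
# b SSN [REDACTED] on file
```

<p class="wp-block-paragraph">Putting the check in the function that builds the outgoing payload, rather than in the display code, is what makes sensitivity a filter instead of a hint.</p>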



<h2 class="wp-block-heading">Conclusion</h2>



<p class="wp-block-paragraph">This design keeps <code>ark</code> entirely self-contained on my local machine. The sensitivity layer ensures that documents that shouldn&#8217;t leave the machine don&#8217;t. SQLite handles full-text and semantic search. And so far, this scales well. As of this writing, my store is 125 GB, not counting the SQLite database, which adds another 9 GB:</p>


<div class="wp-block-code">
	<div class="cm-editor">
		<div class="cm-scroller">
			
<pre>
<code><div class="cm-line">=== ark store stats ===</div><div class="cm-line"></div><div class="cm-line">  Strategy:       copy</div><div class="cm-line">  Store path:     /Users/[username]/.local/share/ark/store</div><div class="cm-line">  Store size:     125.0 GB  (390,749 files on disk)</div><div class="cm-line">  In store (DB):  365,774 document(s)</div><div class="cm-line">  Index-only:     330,680 document(s)  (no managed copy)</div><div class="cm-line">  Total docs:     696,454</div><div class="cm-line"></div><div class="cm-line">  Type                    Total   In store   Index-only</div><div class="cm-line">  ────────────────────  ───────  ─────────  ───────────</div><div class="cm-line">  email                 281,740    272,068        9,672</div><div class="cm-line">  browser_visit         106,616          0      106,616</div><div class="cm-line">  image                  85,415     77,182        8,233</div><div class="cm-line">  imessage               41,267          0       41,267</div><div class="cm-line">  tweet                  27,599          0       27,599</div><div class="cm-line">  watch_event            26,149          0       26,149</div><div class="cm-line">  calendar_event         21,564          0       21,564</div><div class="cm-line">  facebook-post          19,234          0       19,234</div><div class="cm-line">  cli_command            15,189          0       15,189</div><div class="cm-line">  note                   12,914          7       12,907</div><div class="cm-line">  health_day              9,411          0        9,411</div><div class="cm-line">  pdf                     8,400      8,400            0</div><div class="cm-line">  blog_comment            8,040          0        8,040</div><div class="cm-line">  blog_post               7,477          0        7,477</div><div class="cm-line">  purchase                5,813          0        5,813</div><div class="cm-line">  music_play              5,194          0        5,194</div><div class="cm-line">  attachment              4,302      4,302            0</div><div class="cm-line">  git_commit              2,261          0        2,261</div><div class="cm-line">  office                  2,078      2,078            0</div><div class="cm-line">  reading_finished        1,547          0        1,547</div><div class="cm-line">  book                    1,257          0        1,257</div><div class="cm-line">  text                    1,114      1,114            0</div><div class="cm-line">  diary_entry               804        612          192</div><div class="cm-line">  reminder                  373          0          373</div><div class="cm-line">  action_item               209          0          209</div><div class="cm-line">  playlist                  124          0          124</div><div class="cm-line">  video                      96          0           96</div><div class="cm-line">  review                     83          0           83</div><div class="cm-line">  weather_snapshot           52          0           52</div><div class="cm-line">  blog_page                  38          0           38</div><div class="cm-line">  subscription               27          0           27</div><div class="cm-line">  message                    22          0           22</div><div class="cm-line">  reading_started            19          0           19</div><div class="cm-line">  code_file                  11         11            0</div><div class="cm-line">  outbox_draft                8          0            8</div><div class="cm-line">  conversation                3          0            3</div><div class="cm-line">  day_summary                 3          0            3</div><div class="cm-line">  timeline_event              1          0            1</div><div class="cm-line"></div><div class="cm-line"></div></code></pre>
		</div>
	</div>
</div>


<p class="wp-block-paragraph">You&#8217;ll notice the &#8220;Index-only&#8221; column has some big numbers. Some doc types — browser history, iMessages, tweets — live as index entries pointing at source databases or cloud accounts. The original bytes aren&#8217;t worth duplicating, so <code>ark</code> keeps the metadata and content for search but doesn&#8217;t manage a separate copy.</p>



<p class="wp-block-paragraph">That&#8217;s nearly 700,000 items in <code>ark</code>. Most of these items were ingested automatically into <code>ark</code> from a variety of sources. I&#8217;ll talk about &#8220;ingestion at scale&#8221; next time.</p>


<ol class="wp-block-footnotes"><li id="cd57032d-680f-4726-9f47-a61c85e52bf5">I suppose I <em>could</em> do it over, but a long-standing tenet of this blog is that I make mistakes in public, and learn from them. It does no one any good to erase those mistakes and pretend they don&#8217;t exist. <a href="#cd57032d-680f-4726-9f47-a61c85e52bf5-link" aria-label="Jump to footnote reference 1"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/21a9.png" alt="↩" class="wp-smiley" style="height: 1em; max-height: 1em;" />︎</a></li><li id="49d49dd4-d6d4-4d86-9ad0-5f888aa46b5a">Today&#8217;s <a href="https://wordsmith.org/awad/">A.Word.A.Day</a> word. This is the first mailing list I ever subscribed to. According to `ark` my first message from the list came on February 2, 1997. I think I&#8217;ve subscribed longer, but I may have been deleting that email. (I did delete email early on, not thinking about future value.) I still enjoy reading it each day it arrives in my inbox. I wonder how many people out there have subscribed to, and actively read, a mailing list for just about 30 years now? <a href="#49d49dd4-d6d4-4d86-9ad0-5f888aa46b5a-link" aria-label="Jump to footnote reference 2"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/21a9.png" alt="↩" class="wp-smiley" style="height: 1em; max-height: 1em;" />︎</a></li><li id="da0ac8d6-43f8-431f-9e12-68c8fd70bd21">That word is just too hard to type. I&#8217;m going to stick with &#8220;org chart&#8221; going forward if you don&#8217;t mind. <a href="#da0ac8d6-43f8-431f-9e12-68c8fd70bd21-link" aria-label="Jump to footnote reference 3"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/21a9.png" alt="↩" class="wp-smiley" style="height: 1em; max-height: 1em;" />︎</a></li><li id="0ee0f906-4545-436c-9daf-d06e60559b68">I&#8217;m speaking here of a file as an atomic unit of measure. I know that files are made up of bytes, but for the purposes of a workable archive, the file is the base unit. 
<a href="#0ee0f906-4545-436c-9daf-d06e60559b68-link" aria-label="Jump to footnote reference 4"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/21a9.png" alt="↩" class="wp-smiley" style="height: 1em; max-height: 1em;" />︎</a></li><li id="4f27e668-0e90-4579-a814-3a32cca089b2">Recall from the previous post how `ark` commands are exposed to LLMs via an MCP server. This is an example of that in action. Here, Claude Desktop is taking my natural language question and turning it into a series of `ark` queries (this is the agentic model) to get back the best possible answer. <a href="#4f27e668-0e90-4579-a814-3a32cca089b2-link" aria-label="Jump to footnote reference 5"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/21a9.png" alt="↩" class="wp-smiley" style="height: 1em; max-height: 1em;" />︎</a></li><li id="fae962e7-8160-4a93-b80b-f20ab949065d">Keep in mind I&#8217;ve only been using `ark` for a few months so many of these connections were &#8220;discovered&#8221; by the ingestion process (more on this next time). Over time, this should become a richer source of linkages. <a href="#fae962e7-8160-4a93-b80b-f20ab949065d-link" aria-label="Jump to footnote reference 6"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/21a9.png" alt="↩" class="wp-smiley" style="height: 1em; max-height: 1em;" />︎</a></li></ol>]]></content:encoded>
					
					<wfw:commentRss>https://jamierubin.net/2026/06/09/ark-part-3/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">27705</post-id>	</item>
		<item>
		<title>`ark`: A Personal Archive System, Part 2: A Day with the Archives</title>
		<link>https://jamierubin.net/2026/06/03/ark-part-2/</link>
					<comments>https://jamierubin.net/2026/06/03/ark-part-2/#respond</comments>
		
		<dc:creator><![CDATA[Jamie Todd Rubin]]></dc:creator>
		<pubDate>Wed, 03 Jun 2026 12:00:00 +0000</pubDate>
				<category><![CDATA[Technology & Gadgets]]></category>
		<category><![CDATA[essay]]></category>
		<guid isPermaLink="false">https://jamierubin.net/?p=27683</guid>

					<description><![CDATA[This is the second in a new series describing ark, a command-line-based personal archive system I have developed and am using as my primary archiving tool. If you are not interested in this type of tech post, feel free to skip it. For those who are interested, I plan to provide a new entry in the [&#8230;]]]></description>
										<content:encoded><![CDATA[
<p class="is-style-info wp-block-paragraph">This is the second in a <a href="https://jamierubin.net/ark/">new series</a> describing <code>ark</code>, a command-line-based personal archive system I have developed and am using as my primary archiving tool. If you are not interested in this type of tech post, feel free to skip it. For those who are interested, I plan to provide a new entry in the series each Tuesday for the next 13 weeks.</p>



<p class="wp-block-paragraph">&#8220;Pivot! <em>Pivot!</em> <strong>PIVOT!</strong>&#8221; &#8212; Ross Geller</p>



<p class="wp-block-paragraph">I had planned to write about <code>ark</code>&#8216;s store and how it manages the items in the archive today. But it occurred to me that in writing about <code>ark</code> last week, I didn&#8217;t really <em>show</em> any of the system. Instead, I told a lot about the system. And as they say in the writing world: <em>show, don&#8217;t tell</em>. So this week I&#8217;ve opted to show what <code>ark</code> looks like so you get a tour before seeing the plumbing. <a href="https://jamierubin.net/2026/05/26/ark-part-1/">Part 1</a> made the case for the archivist&#8217;s lens. Part 2 shows what it looks like. I&#8217;ve cast this as seven &#8220;scenes&#8221; through a day &#8212; not necessarily a literal day, but seven realistic moments, stitched into one. The post looks long only because of the archive output for the examples I&#8217;ve provided.</p>



<h2 class="wp-block-heading">Scene 1 &#8211; Morning: Catching Up</h2>



<p class="wp-block-paragraph"><code>ark</code> holds a wide range of items from throughout my life. I need an easy way to see what&#8217;s there for a given day, and this is where the <code>ark day</code> command comes in handy. Here is what it looks like for the day before I began writing this post:</p>


<div class="wp-block-code">
	<div class="cm-editor">
		<div class="cm-scroller">
			
<pre>
<code><div class="cm-line">ark day yesterday</div><div class="cm-line"># Saturday, May 30, 2026  ·  yesterday</div><div class="cm-line">&#x1f4cd; Arlington, VA  ·  &#x1f324; Partly cloudy  73°/63°F</div><div class="cm-line">&#x1f4aa; 7,405 steps  6h 37m sleep  72 bpm  HRV 38 ms</div><div class="cm-line">   SpO2 95.4%  VO2max 35.3</div><div class="cm-line"></div><div class="cm-line">## Calendar (2 events)</div><div class="cm-line">[1]  dare to dream  *Family*</div><div class="cm-line">[2]  zach soccer  *Family*</div><div class="cm-line"></div><div class="cm-line">## Diary (1 entry)</div><div class="cm-line">[3] Evening Notes</div><div class="cm-line">    A beautiful day today. From the time I woke up and headed out for my walk, through dinner on the deck with Kelly and the…</div><div class="cm-line"></div><div class="cm-line">## Reading (3)</div><div class="cm-line">[4] *The Complete Essays* by Michel de Montaigne  (started today)</div><div class="cm-line">[5] *In Memory Yet Green* by Isaac Asimov  (finished today ✓)  ✎4</div><div class="cm-line">[6] *Proto* by Laura Spinney  (started 2 days ago)</div><div class="cm-line"></div><div class="cm-line">## Reminders (1 open)</div><div class="cm-line">[7] Check into my flight  *My Tasks*  &#x26a0; overdue 1 day</div><div class="cm-line"></div><div class="cm-line">## Email (37 messages — none from known contacts, 30 filtered)</div><div class="cm-line">… 37 other messages  (use --all-emails to see all)</div><div class="cm-line"></div><div class="cm-line">## People</div><div class="cm-line">[8] Kelly R.  (in 3 documents today)</div><div class="cm-line">[9] Eric S.  (in 2 documents today)</div><div class="cm-line">[10] Lisa K.  (in 1 document today)</div><div class="cm-line">[11] Zachary R.  (in 1 document today)</div><div class="cm-line"></div><div class="cm-line">## Messages (7 messages)</div><div class="cm-line">[12] Kelly R. —   (+2 messages)</div><div class="cm-line">[13] Eric S. — Yeah but mine&apos;s not on your...  (+1 message, group)</div><div class="cm-line">[14] Zachary R. — https://www.youtube.com/watch?v=0h2jZtuRuic</div><div class="cm-line">[15] Lisa K. — Maybe he’s role playing  (group)</div><div class="cm-line"></div><div class="cm-line"></div></code></pre>
		</div>
	</div>
</div>


<p class="wp-block-paragraph">A few things to note here. First, <code>ark</code> is pulling information from the archive from a variety of sources. The header information contains the date (and how long it has been from that date to the time the command was run). It has my location, which <code>ark</code> captures either explicitly through artifacts in the archive, or through an inferred default location over time. The health information (steps, sleep, etc.) comes from Apple Health.</p>



<p class="wp-block-paragraph">Seven sections follow that try to provide a high-level picture of the day; the calendar and diary entries lead the way. The numbers (<code>[1]</code>) are sigils that can be used with other <code>ark</code> commands in place of longer document IDs. So I can type <code>ark show @3</code> to see the full diary entry.</p>
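
<p class="wp-block-paragraph">The sigil mechanism can be pictured as a small lookup table that a listing command leaves behind for the next command. The Python sketch below is a guess at the general shape, with hypothetical function names and document IDs; how <code>ark</code> actually stores and expires sigils is not described here.</p>

```python
def assign_sigils(doc_ids):
    """Number the items of a listing from 1, like the [N] markers in the digest."""
    return {n: doc_id for n, doc_id in enumerate(doc_ids, start=1)}


def resolve_ref(ref, sigils):
    """Expand '@N' to the document ID from the last listing; pass other refs through."""
    if ref.startswith("@") and ref[1:].isdigit():
        n = int(ref[1:])
        if n not in sigils:
            raise KeyError("no item [" + str(n) + "] in the last listing")
        return sigils[n]
    return ref


sigils = assign_sigils(["cal-9f2a", "cal-77c1", "diary-03be"])
print(resolve_ref("@3", sigils), resolve_ref("diary-03be", sigils))
# prints: diary-03be diary-03be
```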



<p class="wp-block-paragraph">Reading is important to me so I can see what I was reading on that day, including what I started reading and what I finished. Any reminders from Apple Reminders show up for that day. Also email, but note that the email is filtered in this view to show only mail from known contacts, and despite getting 37 messages, none of them were from my known contacts that I&#8217;d want appearing here.</p>



<p class="wp-block-paragraph">There is a list of people I interacted with on that day, or who appear in documents that were ingested on that day, followed by a list of text messages (or message groups) that took place on that day.</p>



<p class="wp-block-paragraph">The command provides an easy way to get a view of any day in the archive. I could also have <code>ark</code> summarize the day for me via <code>ark day 2026-05-30 --summarize</code>. Like most <code>ark</code> commands, there are plenty of options for filtering and controlling the output. Here is what <code>ark day --help</code> looks like:</p>


<div class="wp-block-code">
	<div class="cm-editor">
		<div class="cm-scroller">
			
<pre>
<code><div class="cm-line">usage: ark day [-h] [--on-this-day] [--deep] [--all-emails] [--all-calendars] [--verbose] [--full]</div><div class="cm-line">               [--format {table,json}] [--bundle] [--summarize] [--regenerate]</div><div class="cm-line">               [DATE]</div><div class="cm-line"></div><div class="cm-line">Compact daily digest — every item gets a [N] sigil for follow-on commands.</div><div class="cm-line"></div><div class="cm-line">positional arguments:</div><div class="cm-line">  DATE                  Date to view (YYYY-MM-DD, &apos;today&apos;, &apos;yesterday&apos;). Default: today</div><div class="cm-line"></div><div class="cm-line">options:</div><div class="cm-line">  -h, --help            show this help message and exit</div><div class="cm-line">  --on-this-day         Include documents from the same date in other years</div><div class="cm-line">  --deep                Include low-priority items (music plays, browser visits)</div><div class="cm-line">  --all-emails          Show all emails, ignoring contact and blocklist filters</div><div class="cm-line">  --all-calendars       Show events from all calendars, ignoring the day.calendar.include filter</div><div class="cm-line">  --verbose             Full narrative output (old default behavior)</div><div class="cm-line">  --full                Show full document text instead of excerpts (--verbose mode)</div><div class="cm-line">  --format {table,json}</div><div class="cm-line">                        Output format: table (default) or json</div><div class="cm-line">  --bundle              Output as ark bundle format (pipe to ark task)</div><div class="cm-line">  --summarize           Display or generate a narrative summary of the day via LLM</div><div class="cm-line">  --regenerate          Force regeneration of the day summary (implies --summarize)</div><div class="cm-line"></div><div class="cm-line">Subcommands:</div><div class="cm-line">  ark day rank [DATE] &lt;-2..+2&gt; [--note 
NOTE]   # set the Jim-Collins-style day-rank</div><div class="cm-line">  ark day rank [DATE] --clear                  # remove a rank</div><div class="cm-line">  ark day rank [DATE]                          # show the rank for a date</div><div class="cm-line">  ark day ranks [--since 30d|--year YYYY]      # chart + summary of recorded ranks</div><div class="cm-line"></div><div class="cm-line">See &apos;ark day rank --help&apos; or &apos;ark day ranks --help&apos; for details.</div><div class="cm-line"></div><div class="cm-line"></div></code></pre>
		</div>
	</div>
</div>
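
<p class="wp-block-paragraph">The <code>DATE</code> argument in the help text accepts three forms: <code>today</code>, <code>yesterday</code>, or <code>YYYY-MM-DD</code>. Here is a small Python sketch of how that kind of argument could be resolved. The function name <code>parse_day</code> and its <code>today</code> parameter are hypothetical and exist only to keep the example reproducible; the real command may handle dates differently.</p>

```python
from datetime import date, timedelta


def parse_day(text="today", today=None):
    """Resolve 'today', 'yesterday', or YYYY-MM-DD to a date object."""
    today = today or date.today()
    if text == "today":
        return today
    if text == "yesterday":
        return today - timedelta(days=1)
    return date.fromisoformat(text)  # raises ValueError on bad input


# Pin "today" so the example gives the same answer whenever it is run.
run_date = date(2026, 5, 31)
print(parse_day("yesterday", today=run_date))
print(parse_day("2026-05-30", today=run_date))
```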


<h2 class="wp-block-heading">Scene 2 &#8211; Triage: What Wants Attention</h2>



<p class="wp-block-paragraph">With a critical mass in the archive, and new items always coming in, it was a logical next step to make use of all of that information. After reviewing the day, I might make use of <code>ark</code>&#8217;s &#8220;stewardship&#8221; layer. <code>ark</code> reviews items in the archive, as well as items coming into the archive, and identifies things that require my attention or action. From the CLI I can get a digest of this as follows:</p>


<div class="wp-block-code">
	<div class="cm-editor">
		<div class="cm-scroller">
			
<pre>
<code><div class="cm-line">ark prepare digest</div><div class="cm-line"># Stewardship digest — 2026-05-31</div><div class="cm-line"></div><div class="cm-line">_act today: 0 · aging: 2 · outbox: 0 · stalled: 0 · review: 1_</div><div class="cm-line"></div><div class="cm-line">**Yesterday (2026-05-30) unranked.**  Rate it: `ark day rank yesterday &lt;-2..+2&gt;` (or skip — auto-clears once ranked).</div><div class="cm-line"></div><div class="cm-line">## Aging into Q1</div><div class="cm-line"></div><div class="cm-line">- **[orange] Pay Arlington County utilities bill due 06/16/26** — Arlington County Utilities Services Office</div><div class="cm-line">  age *10d* · Q2 · *reminder*  (`ark prepare show 722866`)</div><div class="cm-line">- **[orange] Complete patient experience survey for my doctor&apos;s office** — (medical provider)</div><div class="cm-line">  age *8d* · Q4 · *reminder*  (`ark prepare show 724994`)</div><div class="cm-line"></div><div class="cm-line">## Review (low-confidence classifications)</div><div class="cm-line"></div><div class="cm-line">- **Prepare for upcoming trip to Boston per American Airlines reminder** — American Airlines</div><div class="cm-line">  conf *0.70* · *reminder*  (`ark prepare show 727451`)</div><div class="cm-line"></div><div class="cm-line"></div></code></pre>
		</div>
	</div>
</div>


<p class="wp-block-paragraph">or in the TUI:</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/06/2026-05-31-ark-part-2-01.png?ssl=1" alt="Same digest in TUI view"/><figcaption class="wp-element-caption">Same digest in TUI view</figcaption></figure>



<p class="wp-block-paragraph">Here, I can see at a glance things that require my attention and take action on them<sup data-fn="85e60200-d86e-4e44-907e-2c0d73f6b696" class="fn"><a href="#85e60200-d86e-4e44-907e-2c0d73f6b696" id="85e60200-d86e-4e44-907e-2c0d73f6b696-link">1</a></sup>. <code>ark</code> uses the Eisenhower matrix to put actions into quadrants around (1) Do First, (2) Schedule, (3) Delegate, and (4) Delete. That is what those Q1, Q2, etc., represent.</p>
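
<p class="wp-block-paragraph">For anyone unfamiliar with the Eisenhower matrix, the quadrants come from two yes/no questions: is it urgent, and is it important? The Python sketch below shows just that mapping. How <code>ark</code> decides whether something is urgent or important is not covered here, so the two boolean inputs are an assumption.</p>

```python
def quadrant(urgent, important):
    """Map the two Eisenhower questions to a quadrant number and action."""
    if urgent and important:
        return 1, "Do First"
    if important:
        return 2, "Schedule"
    if urgent:
        return 3, "Delegate"
    return 4, "Delete"


# A bill that matters but is not due yet sits in Q2 until it becomes urgent.
print(quadrant(urgent=False, important=True))
print(quadrant(urgent=True, important=True))
```

<p class="wp-block-paragraph">This also suggests one reading of &#8220;Aging into Q1&#8221; in the digest: as a due date approaches, an item&#8217;s urgency would flip and move it toward the first quadrant.</p>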



<p class="wp-block-paragraph"><code>ark</code> knows how to dedupe these items as they come into the archive so that if I get 3 reminder emails about the same thing, the thing only shows up once in the digest (even though all 3 emails are stored in the archive).</p>
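
<p class="wp-block-paragraph">As a rough illustration of the dedupe idea, the Python sketch below groups reminder emails under a normalized key so that three stored emails produce one digest entry. The key (sender plus a cleaned-up subject line) is a crude stand-in chosen for this example; it is an assumption, not the matching <code>ark</code> actually performs, and the sample emails are made up.</p>

```python
import re
from collections import defaultdict


def dedupe_key(sender, subject):
    """Build a crude key: lowercase, drop 'Re:'/'Fwd:' prefixes and punctuation."""
    subject = re.sub(r"^\s*((re|fwd?):\s*)+", "", subject, flags=re.IGNORECASE)
    subject = re.sub(r"[^a-z0-9 ]+", " ", subject.lower())
    return sender.lower(), " ".join(subject.split())


def group_action_items(emails):
    """Return {key: [email ids]} so each key yields one digest entry."""
    groups = defaultdict(list)
    for email_id, sender, subject in emails:
        groups[dedupe_key(sender, subject)].append(email_id)
    return dict(groups)


# Hypothetical reminder emails about the same bill.
emails = [
    (1, "billing@example.com", "Your utilities bill is due"),
    (2, "billing@example.com", "RE: Your utilities bill is due!"),
    (3, "billing@example.com", "Fwd: your utilities bill is due"),
]
groups = group_action_items(emails)
print(len(groups), "digest entry for", len(emails), "stored emails")
```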



<p class="wp-block-paragraph">Working through the day, <code>ark prepare digest</code> is a nice follow-up to looking at the day or the previous day to get the context of what happened and what needs to happen.</p>



<h2 class="wp-block-heading">Scene 3 &#8211; Looking Back: Finding What&#8217;s In the Archive</h2>



<p class="wp-block-paragraph">Let&#8217;s face it, a big reason for having an archive is so that you can search it. <code>ark</code> has sophisticated search capabilities. First, <code>ark</code> can do full-text searches of anything in the archive, whether text, PDFs, Office documents, email, text messages, annotations, even photos. That&#8217;s because when something enters the archive, it goes through a process that &#8220;enriches&#8221; what comes in, making it searchable.</p>



<p class="wp-block-paragraph">Let&#8217;s start with a regular full-text search. Here, I am going to search for items in the archive with the words &#8220;tax 2025&#8221;. The other parameters in the command tell it to limit the results to 10 items, to leave out snippets of the items, and to show how long it took the search to run:</p>


<div class="wp-block-code">
	<div class="cm-editor">
		<div class="cm-scroller">
			
<pre>
<code><div class="cm-line">ark search &quot;tax 2025&quot; --limit 10 --no-snippet --timing</div><div class="cm-line">Searching: &apos;tax 2025&apos;</div><div class="cm-line">[1]   557281  imessage  2025-12-19  iMessage to High School Gang — 2025-12-19: He can now claim his tax-free…</div><div class="cm-line">[2]   151377  pdf       2025-08-09  Holiday Inn Express Neptune Receipt — 2025-08-07</div><div class="cm-line">[3]   152738  pdf       2025-01-07  Holiday Inn Express Florence Hotel Receipt — 2025-01-06</div><div class="cm-line">[4]   413583  pdf       2026-03-01  2025 St. Ann Faith Formation Donation Tax Statement</div><div class="cm-line">[5]   149508  pdf       2026-01-02  Holiday Inn Express and Suites Boynton Beach West — 2025-12-31</div><div class="cm-line">[6]   151497  pdf       2025-08-02  Holiday Inn Express and Suites Dickson City-Scranton Receipt — 2025-08-01</div><div class="cm-line">[7]   165270  email     2026-03-01  2025 Tax documents have been uploaded!</div><div class="cm-line">[8]   165269  email     2026-03-01  Re: 2025 Tax documents have been uploaded!</div><div class="cm-line">[9]   413659  pdf       2026-03-25  2025 Tax Summary</div><div class="cm-line">[10]  146864  email     2026-03-19  Prepare for tax time - access your 2025 prescription spending report</div><div class="cm-line"># 10 results</div><div class="cm-line"># query: 295ms</div><div class="cm-line"></div><div class="cm-line"></div></code></pre>
		</div>
	</div>
</div>


<p class="wp-block-paragraph">Not a bad mix of items, but probably not quite what I was looking for, either. The search ran in about 0.3 seconds, but the results were meh. I was really looking for tax-related documents, not documents with the words &#8220;tax&#8221; and &#8220;2025&#8221; in them.</p>



<p class="wp-block-paragraph">This is where a semantic search comes in. Semantic searches allow searching for <em>meaning</em>. </p>


<div class="wp-block-code">
	<div class="cm-editor">
		<div class="cm-scroller">
			
<pre>
<code><div class="cm-line">ark search &quot;documents I&apos;ll need at year&apos;s end&quot; --semantic --from 2026-01-01 --timing --limit 10</div><div class="cm-line">Semantic search: &quot;documents I&apos;ll need at year&apos;s end&quot;</div><div class="cm-line">[1]   148828  email           2026-01-23  Your 2025 tax documents are ready</div><div class="cm-line">[2]   413602  pdf             2026-03-01  2025 Tax Organizer - Completed</div><div class="cm-line">[3]   165270  email           2026-03-01  2025 Tax documents have been uploaded!</div><div class="cm-line">[4]   708408  pdf             2026-05-12  Brokerage Tax Package 10 — 2024 Tax Year</div><div class="cm-line">[5]   145936  email           2026-01-08  Important information about the year-end tax statement for your ho…</div><div class="cm-line">[6]   165269  email           2026-03-01  Re: 2025 Tax documents have been uploaded!</div><div class="cm-line">[7]   165261  email           2026-03-26  Re: 2025 Tax documents have been uploaded!</div><div class="cm-line">[8]   614083  calendar_event  2026-02-11  discuss CDMS remaining use cases</div><div class="cm-line">[9]   413598  pdf             2026-02-04  Form 1099-G — 2024 Tax Year</div><div class="cm-line">[10]  413659  pdf             2026-03-25  2025 Tax Summary</div><div class="cm-line"># 10 results</div><div class="cm-line"># query: 1052ms</div><div class="cm-line"></div><div class="cm-line"></div></code></pre>
		</div>
	</div>
</div>


<p class="wp-block-paragraph">In this case, I searched for &#8220;documents I&#8217;ll need at year&#8217;s end,&#8221; added a filter starting from the beginning of the year, and added the <code>--semantic</code> parameter to tell <code>ark</code> to perform a semantic search. The results are <em>much</em> closer to what I was looking for: compared with the simple full-text search, they are tax documents or tax-adjacent documents. The trade-off is (a) more time during ingestion to give each item embeddings<sup data-fn="bcd70b27-9736-4134-96fd-51cf9143dfa0" class="fn"><a href="#bcd70b27-9736-4134-96fd-51cf9143dfa0" id="bcd70b27-9736-4134-96fd-51cf9143dfa0-link">2</a></sup> that can be used by semantic search, and (b) a slightly longer time to execute the query (because it uses AI): just over 1 second compared to about 0.3 seconds. But even against an archive with 700,000 items, it is a negligible change in speed, and the better results far outweigh the cost.</p>
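
<p class="wp-block-paragraph">If you are curious what &#8220;searching for meaning&#8221; looks like mechanically, the usual approach is to compare the query&#8217;s embedding against each item&#8217;s embedding and rank by similarity. Below is a toy Python sketch using cosine similarity. The three-number vectors are hand-made stand-ins (real embeddings come from a model and are far longer), and the function names are hypothetical; I am not claiming this is the exact ranking <code>ark</code> uses.</p>

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_search(query_vec, index, limit=10):
    """Rank stored (title, vector) pairs by similarity to the query vector."""
    scored = [(cosine(query_vec, vec), title) for title, vec in index]
    scored.sort(reverse=True)
    return [title for _, title in scored[:limit]]


# Toy 3-dimensional "embeddings"; real ones come from a model and are much longer.
index = [
    ("2025 Tax Summary", [0.9, 0.1, 0.0]),
    ("Hotel receipt", [0.2, 0.9, 0.1]),
    ("zach soccer", [0.0, 0.1, 0.9]),
]
query = [0.8, 0.2, 0.0]  # stands in for the embedded query text
print(semantic_search(query, index, limit=2))
```

<p class="wp-block-paragraph">Notice that nothing in the ranking looks at the words themselves, which is why a query that never mentions &#8220;tax&#8221; can still surface tax documents.</p>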



<h2 class="wp-block-heading">Scene 4 &#8211; A Conversation with My Archive via Claude Desktop (MCP)</h2>



<p class="wp-block-paragraph">With more than 7,300 posts here written over 20+ years, I can&#8217;t always remember if I&#8217;ve written about something before. For instance, as I write this, I am about to head up to Boston for a conference. In reviewing the profiles of the various attendees, I couldn&#8217;t help being bemused by the wild garden of jargon on display. Immediately I thought: <em>great topic for the blog</em>. But I&#8217;ve written a lot over the years and it was entirely possible I&#8217;d written about this before. Since all of my blog posts are captured in <code>ark</code>, I decided to check there first.</p>



<p class="wp-block-paragraph">One of the great things about the design of the archive is that many of the commands in the system are exposed as tools to LLMs via an MCP server. That means I don&#8217;t always have to sit at the command line. I can go into Claude Desktop, for instance, and ask it directly: &#8220;I want to write a blog post about how business jargon annoys me. What have I written about jargon in the past?&#8221;</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/06/2026-05-31-ark-part-2-02.png?ssl=1" alt="A conversation with Claude"/><figcaption class="wp-element-caption">A conversation with Claude</figcaption></figure>



<p class="wp-block-paragraph">If you can&#8217;t read the screenshot, here is what Claude said in answer to my question:</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p class="wp-block-paragraph">Searched memory<br><br>The conversation search didn&#8217;t turn up anything specifically about jargon. Let me try the ark search directly, since that&#8217;s where your blog posts live.</p>



<p class="wp-block-paragraph">Used ark integration, loaded tools<br><br>There&#8217;s a very promising hit: &#8220;Meaningless Statements, Vague Gibberish, and Useless Qualifications&#8221; (2021) and &#8220;What not to do in a resume&#8221; (2006). Also that &#8220;AI jargon&#8221; email from 2025. Let me pull the relevant ones.</p>



<p class="wp-block-paragraph">Used ark integration<br><br>You&#8217;ve got two solid posts in the archive that directly touch on jargon. Here&#8217;s what I found:</p>



<p class="wp-block-paragraph"><strong>&#8220;Meaningless Statements, Vague Gibberish, and Useless Qualifications&#8221;</strong> (March 10, 2021) — Your most directly relevant piece. It was prompted by a spam email pitching IT services that was so thick with buzzwords it meant nothing. You used the Foundation passage where Holk&#8217;s symbolic logic analysis of a politician&#8217;s speech finds, after eliminating all the meaningless statements and vague gibberish, that nothing is left. You particularly skewered &#8220;talk synergies,&#8221; looked it up in Merriam-Webster, and imagined an email parser that auto-replies with Strunk &amp; White&#8217;s Rule 17: Omit needless words. The post got three comments.</p>



<p class="wp-block-paragraph"><strong>&#8220;What not to do in a resume&#8221;</strong> (November 30, 2006) — This is a broader piece about resume writing, but item #4 is explicitly about jargon: &#8220;Don&#8217;t overuse jargon.&#8221; You wrote that you grow annoyed when bullet points read like a foreign language, and gave the example &#8220;Implemented best-of-breed, open-standards ITIL-based WEP-encryption application&#8221; as something that sounds ridiculous even if true.</p>



<p class="wp-block-paragraph">So you&#8217;ve touched on this twice — once obliquely (resumes) and once head-on (the spam email). The 2021 post has the richest material, with the Asimov hook being particularly strong. For a new blog post, you might want to build on that Foundation framing more broadly, or shift the lens from spam email specifically to workplace communication generally. The 2021 post was reactive; a new one could be more systematic — here&#8217;s what jargon actually costs us, here are the worst offenders, here&#8217;s what plain language looks like instead.</p>
</blockquote>



<p class="wp-block-paragraph">So right away, I know that I&#8217;ve already written about jargon on the blog at least twice before, the last time in 2021 &#8212; so maybe it is time for something new.</p>



<p class="wp-block-paragraph">I wanted to point out the line where Claude said &#8220;Used ark integration, loaded tools.&#8221; When Claude couldn&#8217;t find anything in our conversations about jargon, it immediately knew to use the ark tool provided via the MCP integration to check the archive, and it used the correct search terms to find previous blog posts about jargon. All I had to do was ask my question.</p>



<p class="wp-block-paragraph">This is easier to use than the search commands on the CLI, but the two serve different purposes. The CLI tools can be used in conjunction with other tools and commands through <a href="https://en.wikipedia.org/wiki/Pipeline_(Unix)">pipes</a>, <a href="https://en.wikipedia.org/wiki/Redirection_(computing)">redirects</a>, and scripts. Using Claude lets me interact with the archive as part of other work I am doing. It also allows me to follow up on things. For instance, I might wonder how my writing style changed in the 15 years that separate the two posts.</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/06/2026-05-31-ark-part-2-03.png?ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">It is cut off in the screenshot, but here is the short version at the end of Claude&#8217;s response:</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p class="wp-block-paragraph"><strong>Short version</strong>: 2006 Jamie is a practical manager sharing hard-won tips. 2021 Jamie is a writer using an annoyance as a lens to say something about language, meaning, and the value of clarity — and reaching for Asimov to make the point stick.</p>
</blockquote>



<p class="wp-block-paragraph">Keep in mind, I could do this with any of the 700,000 documents in the archive. That said, there are some limits with the MCP, all intentionally imposed.</p>



<ol class="wp-block-list">
<li>All documents in the archive get a sensitivity level. Any documents with a &#8220;Restricted&#8221; sensitivity are never returned to the MCP. These documents never leave the local machine. If I want LLM summaries of these documents, I can get them from the CLI, which will fall back to a local LLM that runs on my laptop.</li>



<li>Documents marked &#8220;Sensitive&#8221; are swept for sensitive information and that information is redacted before returning it to the MCP. For instance, if an SSN, account number, phone number, etc., appears in a document, those all get replaced with &#8220;[REDACTED]&#8221; before the document is returned to the MCP.</li>



<li>All documents are swept for a known list of sensitive information (SSNs, etc.) and are redacted before leaving the local machine, regardless of sensitivity.</li>
</ol>
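
<p class="wp-block-paragraph">The three rules above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the lowercase sensitivity values, the function name <code>prepare_for_mcp</code>, and the two regular expressions are mine for the example, and a real sweep would need many more patterns than an SSN and a phone number.</p>

```python
import re

# Illustrative patterns only; a real sweep would need many more.
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
PHONE = re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b")


def prepare_for_mcp(text, sensitivity):
    """Return text that is safe to hand to the MCP server, or None."""
    if sensitivity == "restricted":
        return None  # never leaves the local machine
    text = SSN.sub("[REDACTED]", text)  # applied to every document
    if sensitivity == "sensitive":
        text = PHONE.sub("[REDACTED]", text)  # extra sweep for sensitive docs
    return text


print(prepare_for_mcp("SSN 123-45-6789, call 703-555-0100", "sensitive"))
print(prepare_for_mcp("SSN 123-45-6789, call 703-555-0100", "normal"))
print(prepare_for_mcp("anything", "restricted"))
```

<p class="wp-block-paragraph">The ordering matters: the restricted check comes first so that a restricted document is never even passed through the redaction step on its way out.</p>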



<p class="wp-block-paragraph">Still, I can converse with about 90% of the documents that are in the archive, and that can be fun. For instance, I asked Claude what music I was listening to on past May 31sts<sup data-fn="1bcefe97-72b4-4017-8155-cd0be37058af" class="fn"><a href="#1bcefe97-72b4-4017-8155-cd0be37058af" id="1bcefe97-72b4-4017-8155-cd0be37058af-link">3</a></sup>, and it came back with this:</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/06/2026-05-31-ark-part-2-04.png?ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">The best part about the MCP integration, for me, is that I use Claude as a kind of personal assistant and with access to my archive, Claude is much more effective in doing things for me because it has a rich context to draw from.</p>



<h2 class="wp-block-heading">Scene 5 &#8211; Adding My Voice: The Annotation System</h2>



<p class="wp-block-paragraph">It is one thing to have artifacts in an archive. But real value is added to the archive when you can layer in context to those artifacts. This is where the annotation system comes in. When items are pulled into the archive, they become read-only and cannot be changed<sup data-fn="35368443-6a51-489c-87c1-f31066165a2f" class="fn"><a href="#35368443-6a51-489c-87c1-f31066165a2f" id="35368443-6a51-489c-87c1-f31066165a2f-link">4</a></sup>. The annotation system provides a layer atop the item itself to provide context to that item. When showing that item in <code>ark</code>, any annotations that are associated with the item are displayed as part of the output. Annotations are available in searches, and there are multiple ways of making annotations.</p>



<ol class="wp-block-list">
<li>Directly through the CLI via the <code>ark annotate</code> command. This command uses the default editor set in your shell<sup data-fn="a1392855-38da-49b9-8006-6bc497fd32a3" class="fn"><a href="#a1392855-38da-49b9-8006-6bc497fd32a3" id="a1392855-38da-49b9-8006-6bc497fd32a3-link">5</a></sup>.</li>



<li>Piped into the <code>ark annotate</code> command. For instance, if I made a bunch of notes in a file called <code>notes.txt</code>, I could do this at the command line: <code>cat notes.txt | ark annotate document 1234</code> to add the annotation to document 1234.</li>



<li>Via the MCP. Yes, I can ask Claude to annotate for me.</li>
</ol>
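
<p class="wp-block-paragraph">The first two options differ only in where the text comes from: an editor when the command is run interactively, or standard input when something is piped in. A common way for a CLI to tell the two apart is to check whether its input is a terminal. Here is a minimal Python sketch of that check. The function name <code>read_annotation</code> is hypothetical, and this is a general technique rather than a description of how <code>ark annotate</code> is written.</p>

```python
import io
import sys


def read_annotation(stream=sys.stdin):
    """Return piped-in text, or None when input is an interactive terminal."""
    if stream.isatty():
        return None  # the caller would open the editor instead
    return stream.read().strip()


# Simulate `cat notes.txt | ...` with an in-memory stream.
piped = io.StringIO("loved the setting of the book\n")
print(read_annotation(piped))
```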



<p class="wp-block-paragraph">I have a book club meeting this evening. Our book was <em><a href="https://bookshop.org/p/books/wild-dark-shore-reese-s-book-club-pick-a-novel-charlotte-mcconaghy/45160eba5b1e0f1f">Wild Dark Shore</a></em> by Charlotte McConaghy. <code>ark</code> captures my books and reading so I can search for the book in ark as follows:</p>


<div class="wp-block-code">
	<div class="cm-editor">
		<div class="cm-scroller">
			
<pre>
<code><div class="cm-line">ark search &quot;wild dark shore&quot; --type book</div><div class="cm-line">Searching: &apos;wild dark shore&apos;</div><div class="cm-line">[1]  725665  book    Wild Dark Shore  ✎2</div><div class="cm-line">   ↳ Wild Dark Shore Charlotte McConaghy A family on a remote island. A mysterious woman washed ashore. A rising storm …</div><div class="cm-line"># 1 result</div><div class="cm-line"></div><div class="cm-line"></div></code></pre>
		</div>
	</div>
</div>


<p class="wp-block-paragraph">That <code>[1]</code> on the line with the matching record is a &#8220;sigil&#8221;. It is a shortcut reference to the item that I can use in future commands (until another search is run) instead of having to provide the full id (725665 in this case) each time I want to refer to it. When a book is added, <code>ark</code> searches sources on the Internet to pull in metadata about the book. When I look at the record via <code>ark show</code> here is what I see:</p>


<div class="wp-block-code">
	<div class="cm-editor">
		<div class="cm-scroller">
			
<pre>
<code><div class="cm-line">ark show @1</div><div class="cm-line">id:                725665</div><div class="cm-line">title:             Wild Dark Shore</div><div class="cm-line">doc_type:          book</div><div class="cm-line">record_class:      document</div><div class="cm-line">date_authored:</div><div class="cm-line">date_precision:    undated</div><div class="cm-line">date_ingested:     2026-05-24T14:29:23.876608+00:00</div><div class="cm-line">metadata_updated:  2026-05-24T14:31:12.699497+00:00</div><div class="cm-line">series:</div><div class="cm-line">sub_series:</div><div class="cm-line">sensitivity:       normal</div><div class="cm-line">priority:          normal</div><div class="cm-line">starred:           no</div><div class="cm-line">tags:              bookclub</div><div class="cm-line">slug:</div><div class="cm-line">physical_location:</div><div class="cm-line">source:</div><div class="cm-line">source_path:       ark:/books/work</div><div class="cm-line">store_path:</div><div class="cm-line">sha256:            7b3a2e46382a6c857413db215adb76e6192fe47d9f0c2658dd36b7b5406d2366</div><div class="cm-line">doc_status:        active</div><div class="cm-line">record_origin:     unknown</div><div class="cm-line">origin:</div><div class="cm-line">last_accessed:     2026-05-31 14:24:32</div><div class="cm-line">access_count:      8</div><div class="cm-line">author:            Charlotte McConaghy</div><div class="cm-line">published:         2025</div><div class="cm-line">genre:             Mystery</div><div class="cm-line">isbn:              9789025477356</div><div class="cm-line"></div><div class="cm-line">Wild Dark Shore Charlotte McConaghy A family on a remote island. A mysterious woman washed ashore. A rising storm on the horizon.</div><div class="cm-line"></div><div class="cm-line">Dominic Salt and his three children are caretakers of Shearwater, a tiny island not far from Antarctica. 
Home to the world’s largest seed bank, Shearwater was once full of researchers, but with sea levels rising, the Salts are now its final inhabitants. Until, during the worst storm the island has ever seen, a woman mysteriously washes ashore.</div><div class="cm-line"></div><div class="cm-line">Isolation has taken its toll on the Salts, but as they nurse the woman, Rowan, back to strength, it begins to feel like she might just be what they need. Rowan, long accustomed to protecting herself, starts imagining a future where she could belong to someone again.</div><div class="cm-line"></div><div class="cm-line">But Rowan isn’t telling the whole truth about why she set out for Shearwater. And when she discovers sabotaged radios and a freshly dug grave, she realizes Dominic is keeping his own secrets. As the storms on Shearwater gather force, they all must decide if they can trust each other enough to protect the precious seeds in their care before it’s too late―and if they can finally put the tragedies of the past behind them to create something new, together.</div><div class="cm-line"></div><div class="cm-line">A novel of breathtaking twists, dizzying beauty, and ferocious love, Wild Dark Shore is about the impossible choices we make to protect the people we love, even as the world around us disappears. 
Mystery Thriller Literary Fiction</div><div class="cm-line"></div><div class="cm-line">--- annotations ---</div><div class="cm-line"></div><div class="cm-line">[23] 2026-05-31 11:21</div><div class="cm-line">  Initial thoughts after reading:</div><div class="cm-line"></div><div class="cm-line">- characters: Rowan, Dom, Raff Orly, Fen</div><div class="cm-line">- a kind of Robinson Crusoe or Swiss Family Robinson about the story, only darker</div><div class="cm-line">- did a very good job for the first two-thirds at playing out the mystery</div><div class="cm-line">- last third of the book took too many zigs and zags and was on the verge of becoming parody</div><div class="cm-line">- loved the setting of the book</div><div class="cm-line">- wondered why there would only be one place in the world where the seeds would be kept for rebuilding civilization; one would think there would be multiple places; also why one an island as wracked and ravaged as this one.</div><div class="cm-line">- odd to have one chapter from Alex POV -- it throws off the rhythm of the story; as if author could not come up with any other way of conveying that info</div><div class="cm-line"></div><div class="cm-line">[24] 2026-05-31 11:30</div><div class="cm-line">  **Book Club Discussion Notes — May 31, 2026**</div><div class="cm-line"></div><div class="cm-line">*Initial reactions:* Strong first two-thirds that sustains mystery well; final third overloaded with reversals, verging on parody. The island setting is the book&apos;s most consistently successful element, grounded in McConaghy&apos;s research on the real Macquarie Island.</div><div class="cm-line">...</div><div class="cm-line"></div><div class="cm-line"></div></code></pre>
		</div>
	</div>
</div>


<p class="wp-block-paragraph">After all of that metadata for the book record, note the section called <code>--- annotations ---</code>. This is where I have gone and added annotations to this particular book record. I&#8217;ve added two, as you can see, but I can add as many as I want. These are not part of the book record, but rather are linked to the book record. The <code>ark show</code> command is pulling all of this together when it shows the book to me.</p>



<p class="wp-block-paragraph">Adding the annotation is as simple as running the command. Or, asking Claude to do it for you.</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/06/2026-05-31-ark-part-2-05.png?ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">Here, not only was Claude easily able to find what I was looking for, but once found, it summarized it as requested, and then added the annotation that I asked for. For me, this turned annotation from a chore into something easy. I can sit with my book and as I read, tell Claude to add various annotations to the book for the thoughts or passages I want to comment on.</p>



<h2 class="wp-block-heading">Scene 6 &#8211; <code>ark quality</code>: The Archive That Knows Where it is Weak</h2>



<p class="wp-block-paragraph">Archivists work to ensure the quality of the archives they manage. That not only means preserving the archive but enriching it with as much information as they can about the contents of the archive. Each item in <code>ark</code> has a certain set of metadata. It has the date of the item, the date it came into the archive, the source it came from, its location in the real world (if it is not digital in the first place). Each item has a title, a series and sub-series, a format, a sensitivity level. Items can have OCR data, embeddings, generated descriptions, and more. With all of that possible data surrounding an item in the archive, some way is needed to assess the quality of items in the archive.</p>



<p class="wp-block-paragraph">Enter the <code>ark quality</code> command.</p>



<p class="wp-block-paragraph">This command provides a way of looking at the archive as a whole and seeing the overall quality of its items. It also allows for looking at a specific item and seeing the quality of that item. Quality is a score from 0 to 100 that measures how complete the picture of the item (or archive) is. An item with no metadata at all rates a 0. An item with every possible element filled in rates 100. Here is a quality summary of the entire archive.</p>


<div class="wp-block-code">
	<div class="cm-editor">
		<div class="cm-scroller">
			
<pre>
<code><div class="cm-line">ark quality summary</div><div class="cm-line"></div><div class="cm-line">=== ark quality — Archive Completeness ===</div><div class="cm-line"></div><div class="cm-line">  Total documents:  681,576</div><div class="cm-line">  Median score:     76/100</div><div class="cm-line"></div><div class="cm-line">  Score distribution:</div><div class="cm-line">       0-25                                   11,116  (1.6%)</div><div class="cm-line">      26-50  ███                              85,298  (12.5%)</div><div class="cm-line">      51-75  ███████                         178,002  (26.1%)</div><div class="cm-line">     76-100  █████████████████               407,160  (59.7%)</div><div class="cm-line"></div><div class="cm-line">  Median score by type:</div><div class="cm-line">    attachment        ████                   24/100</div><div class="cm-line">    subscription      ███████                35/100</div><div class="cm-line">    reading_started   ████████               40/100</div><div class="cm-line">    calendar_event    █████████              47/100</div><div class="cm-line">    cli_command       █████████              47/100</div><div class="cm-line">    reminder          █████████              47/100</div><div class="cm-line">    note              ██████████             53/100</div><div class="cm-line">    video             ██████████             53/100</div><div class="cm-line">    reading_finished  ████████████           60/100</div><div class="cm-line">    code_file         █████████████          65/100</div><div class="cm-line">    conversation      █████████████          65/100</div><div class="cm-line">    imessage          █████████████          65/100</div><div class="cm-line">    outbox_draft      █████████████          65/100</div><div class="cm-line">    text              █████████████          65/100</div><div class="cm-line">    image             ██████████████         70/100</div><div class="cm-line">    browser_visit     
██████████████         73/100</div><div class="cm-line">    timeline_event    ██████████████         73/100</div><div class="cm-line">    watch_event       ██████████████         73/100</div><div class="cm-line">    blog_comment      ███████████████        76/100</div><div class="cm-line">    blog_page         ███████████████        76/100</div><div class="cm-line">    book              ███████████████        76/100</div><div class="cm-line">    day_summary       ███████████████        76/100</div><div class="cm-line">    email             ███████████████        76/100</div><div class="cm-line">    facebook-post     ███████████████        76/100</div><div class="cm-line">    message           ███████████████        76/100</div><div class="cm-line">    office            ███████████████        76/100</div><div class="cm-line">    pdf               ███████████████        76/100</div><div class="cm-line">    purchase          ███████████████        76/100</div><div class="cm-line">    review            ███████████████        76/100</div><div class="cm-line">    tweet             ███████████████        76/100</div><div class="cm-line">    git_commit        █████████████████      87/100</div><div class="cm-line">    music_play        █████████████████      87/100</div><div class="cm-line">    weather_snapshot  █████████████████      87/100</div><div class="cm-line">    action_item       █████████████████      88/100</div><div class="cm-line">    blog_post         █████████████████      88/100</div><div class="cm-line">    health_day        █████████████████      88/100</div><div class="cm-line">    playlist          █████████████████      88/100</div><div class="cm-line">    diary_entry       ████████████████████  100/100</div><div class="cm-line"></div><div class="cm-line"></div></code></pre>
		</div>
	</div>
</div>


<p class="wp-block-paragraph">Here you can see that about 60% of the items in the archive fall into the top scoring band (76-100). The highest-quality class of items is diary entries; the lowest is attachments. This makes sense because attachments are pulled in from email messages and, apart from descriptions generated by vision models and the embeddings computed on those descriptions, they have almost no other metadata.</p>
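<p class="wp-block-paragraph">As a quick sanity check on that "about 60%" figure, the percentages can be recomputed from the bucket counts printed in the summary. This is only arithmetic on the numbers shown above, not a reimplementation of how <code>ark</code> scores documents; the names <code>buckets</code> and <code>share</code> are made up for the illustration.</p>

```python
# Bucket counts copied from the `ark quality summary` output shown above.
buckets = {
    "0-25": 11_116,
    "26-50": 85_298,
    "51-75": 178_002,
    "76-100": 407_160,
}

total = sum(buckets.values())


def share(count, total):
    """Return a bucket's share of the archive as a percentage, rounded to one decimal."""
    return round(100 * count / total, 1)


shares = {label: share(count, total) for label, count in buckets.items()}

for label, pct in shares.items():
    print(f"{label:>7}  {buckets[label]:>9,}  ({pct}%)")
```

<p class="wp-block-paragraph">The four buckets add up to the reported total of 681,576 documents, and the top band works out to 59.7%, matching the summary.</p>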



<p class="wp-block-paragraph">Anything that comes into the archive has a title. Often it is simply the name of the file that was ingested, and we all know that filenames can be ambiguous or say little about what the file contains. <code>ark</code> can assess the quality of titles as well, for instance:</p>


<div class="wp-block-code">
	<div class="cm-editor">
		<div class="cm-scroller">
			
<pre>
<code><div class="cm-line">ark quality titles --limit 20</div><div class="cm-line"></div><div class="cm-line">=== ark quality titles — 20 document(s) with bad titles ===</div><div class="cm-line"></div><div class="cm-line">       ID  Reason              Type            Date        Title</div><div class="cm-line">  ───────  ──────────────────  ──────────────  ──────────  ──────────────────────────────────────────────────</div><div class="cm-line">     3792  filename w/ ext     attachment      —           Suggestions_for_OS_8_image.doc</div><div class="cm-line">     4048  filename w/ ext     attachment      —           Nuv.doc</div><div class="cm-line">     4245  filename w/ ext     attachment      —           database.doc</div><div class="cm-line">     4326  filename w/ ext     attachment      —           OS_8_seminar_outline.doc</div><div class="cm-line">     5032  filename w/ ext     attachment      —           Proposal.doc</div><div class="cm-line">     5342  filename w/ ext     attachment      —           g970527.doc</div><div class="cm-line">     5343  filename w/ ext     attachment      —           g970515.doc</div><div class="cm-line">     9101  filename w/ ext     attachment      —           updaterr.doc</div><div class="cm-line">    11223  filename w/ ext     attachment      —           REV980617ACOSTEDBOM1.XLS</div><div class="cm-line">    11608  filename w/ ext     attachment      —           adopters2.xls</div><div class="cm-line">    11937  filename w/ ext     attachment      —           Development.xls</div><div class="cm-line">    12092  filename w/ ext     attachment      —           kpatel.doc</div><div class="cm-line">    12859  filename w/ ext     attachment      —           BDY.RTF</div><div class="cm-line">    14769  filename w/ ext     attachment      —           98issues.doc</div><div class="cm-line">    15137  filename w/ ext     attachment      —           michael.doc</div><div class="cm-line">    15326  filename w/ ext     attachment      — 
          Invoice1.xls</div><div class="cm-line">    16099  filename w/ ext     attachment      —           Dun12.doc</div><div class="cm-line">    16200  filename w/ ext     attachment      —           Lanrover.doc</div><div class="cm-line">    16533  filename w/ ext     attachment      —           %adopters.xls</div><div class="cm-line">    16534  filename w/ ext     attachment      —           adopters.xls</div><div class="cm-line"></div><div class="cm-line">  20 document(s) shown.</div><div class="cm-line">  Use &apos;ark enrich titles&apos; to fix with LLM-proposed replacements.</div><div class="cm-line"></div><div class="cm-line"></div></code></pre>
		</div>
	</div>
</div>


<p class="wp-block-paragraph">Note that in addition to listing items in the archive with poor-quality titles, the command also tells you how to raise the quality of those titles using <code>ark enrich titles</code>. Automated enrichment is a key element of <code>ark</code>. It uses LLMs to read the document and give it a descriptive title based on the context. Each document class has its own title template. A letter, for instance, might use the format &#8220;Sender-to-recipient (date)&#8221;, and the LLM will produce titles in that format so that they are consistent across that class of items, raising the quality and consistency of the archive.</p>
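<p class="wp-block-paragraph">To make the two ideas concrete, here is a minimal Python sketch of a "filename w/ ext" check and a per-class title template. Everything in it is an assumption made for illustration: the regular expression, the function names, and the template table are hypothetical, and only the letter format itself comes from the text above. In the real workflow the field values would come from the LLM reading the document.</p>

```python
import re

# Hypothetical heuristic (not ark's actual rule): a title that is a single
# token ending in a 2-4 character extension is treated as a bare filename.
FILENAME_RE = re.compile(r"^\S+\.[A-Za-z0-9]{2,4}$")


def looks_like_filename(title):
    """Return True when the title looks like a filename with an extension."""
    return bool(FILENAME_RE.match(title.strip()))


# Hypothetical template table; only the letter format appears in the post.
TITLE_TEMPLATES = {
    "letter": "{sender}-to-{recipient} ({date})",
}


def format_title(doc_class, **fields):
    """Fill in the title template for a document class."""
    return TITLE_TEMPLATES[doc_class].format(**fields)


print(looks_like_filename("Proposal.doc"))
print(looks_like_filename("Evening Notes"))
print(format_title("letter", sender="Alice", recipient="Bob", date="1998-06-17"))
```

<p class="wp-block-paragraph">A check this crude will also flag short titles such as version strings, so it is best read as a way to find candidates for review rather than as a verdict.</p>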



<p class="wp-block-paragraph"><code>ark</code> also uses a controlled vocabulary for its metadata like series and sub-series. When something is ingested it may get a series that falls outside that controlled vocabulary. <code>ark quality vocab</code> can help to identify such items:</p>


<div class="wp-block-code">
	<div class="cm-editor">
		<div class="cm-scroller">
			
<pre>
<code><div class="cm-line">ark quality vocab</div><div class="cm-line"></div><div class="cm-line">=== ark quality vocab — Controlled Vocabulary Check ===</div><div class="cm-line"></div><div class="cm-line">  Vocabulary: 15 series configured</div><div class="cm-line">  Violations: 4 unknown series value(s), 10 unknown sub-series value(s)</div><div class="cm-line"></div><div class="cm-line">  Unknown series (4):</div><div class="cm-line">    Series                      Docs</div><div class="cm-line">    ────────────────────────  ──────</div><div class="cm-line">    health                      9392</div><div class="cm-line">    shopping                    5813</div><div class="cm-line">    diary                        624</div><div class="cm-line">    environmental                 43</div><div class="cm-line"></div><div class="cm-line">  Unknown sub-series (10):</div><div class="cm-line">    Series              Sub-series              Docs</div><div class="cm-line">    ──────────────────  ────────────────────  ──────</div><div class="cm-line">    media               watch                  26149</div><div class="cm-line">    shopping            digital                 2642 (unknown series)</div><div class="cm-line">    shopping            audible                 2049 (unknown series)</div><div class="cm-line">    shopping            retail                  1122 (unknown series)</div><div class="cm-line">    media               reviews                   83</div><div class="cm-line">    correspondence      amazon                    22</div><div class="cm-line">    legal               identity                  11</div><div class="cm-line">    legal               power-of-attorney          1</div><div class="cm-line">    legal               will                       1</div><div class="cm-line">    professional        systems administration       1</div><div class="cm-line"></div><div class="cm-line">  Sample documents with unknown series (up to 10):</div><div 
class="cm-line">         ID  Series              Sub-series        Title</div><div class="cm-line">    ───────  ──────────────────  ────────────────  ────────────────────────────────────────</div><div class="cm-line">     728106  environmental       —                 Weather: Arlington, VA on May 30, 2026</div><div class="cm-line">     728102  health              —                 Fitbit: May 30, 2026</div><div class="cm-line">     727759  environmental       —                 Weather: Arlington, VA on May 29, 2026</div><div class="cm-line">     727758  health              —                 Health: May 30, 2026</div><div class="cm-line">     727751  health              —                 Fitbit: May 29, 2026</div><div class="cm-line">     727457  environmental       —                 Weather: Arlington, VA on May 28, 2026</div><div class="cm-line">     727456  health              —                 Health: May 29, 2026</div><div class="cm-line">     727444  health              —                 Fitbit: May 28, 2026</div><div class="cm-line">     727087  environmental       —                 Weather: Arlington, VA on May 27, 2026</div><div class="cm-line">     727086  health              —                 Health: May 28, 2026</div><div class="cm-line"></div><div class="cm-line">  Fix: ark set document &lt;id&gt; --series &lt;valid-series&gt;</div><div class="cm-line">  Or:  ark normalize-series  (if renaming a whole class)</div><div class="cm-line"></div><div class="cm-line"></div></code></pre>
		</div>
	</div>
</div>


<p class="wp-block-paragraph">Once such items are identified, <code>ark</code> provides commands for fixing them, either one document at a time (<code>ark set document</code>) or across an entire series (<code>ark normalize-series</code>).</p>
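<p class="wp-block-paragraph">The logic behind a controlled-vocabulary check is simple enough to sketch in a few lines of Python. This is a sketch under stated assumptions, not <code>ark</code>'s implementation: the in-memory list of dictionaries stands in for the database, the function names are hypothetical, and <code>VALID_SERIES</code> lists only four of the 15 configured series (the ones the output above treats as known).</p>

```python
from collections import Counter

# Illustrative subset of the vocabulary; the real archive has 15 series configured.
VALID_SERIES = {"media", "correspondence", "legal", "professional"}

# Made-up sample rows standing in for documents in the archive.
docs = [
    {"id": 728106, "series": "environmental"},
    {"id": 728102, "series": "health"},
    {"id": 727758, "series": "health"},
    {"id": 1001, "series": "media"},
]


def unknown_series(docs, valid=VALID_SERIES):
    """Count documents per series value that falls outside the vocabulary."""
    counts = Counter(d["series"] for d in docs if d["series"] not in valid)
    return counts.most_common()


def normalize_series(docs, old, new):
    """Rename a whole class of documents in place; return how many changed."""
    changed = 0
    for d in docs:
        if d["series"] == old:
            d["series"] = new
            changed += 1
    return changed


violations = unknown_series(docs)
for series, count in violations:
    print(f"{series:<24}{count:>6}")
```

<p class="wp-block-paragraph">Fixing a single document amounts to assigning one row a valid series; the series-wide rename is the same assignment applied to every row that carries the old value.</p>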



<p class="wp-block-paragraph">All of this helps to raise the quality of the archive, making searching more accurate.</p>



<h2 class="wp-block-heading">Scene 7 &#8211; Closing the Day</h2>



<p class="wp-block-paragraph">We&#8217;ll conclude the tour with one way of looking at the work I got done yesterday. This is a good example because it shows how <code>ark</code> was designed around core Unix principles: each command does one thing, and its output can be used as input for other commands.</p>



<p class="wp-block-paragraph">Let&#8217;s begin by looking at what the archive ingested from yesterday. The easy way to do this is as follows (I&#8217;m limiting this to just 10 entries for space):</p>


<div class="wp-block-code">
	<div class="cm-editor">
		<div class="cm-scroller">
			
<pre>
<code><div class="cm-line">ark recent --days 1 --limit 10</div><div class="cm-line">[1]   728102  health_day    2026-05-31  2026-05-30  Fitbit: May 30, 2026</div><div class="cm-line">[2]   728101  diary_entry   2026-05-31  2026-05-30  Evening Notes</div><div class="cm-line">[3]   728100  git_commit    2026-05-31  2026-05-30  feat(cli): auto-follow supersedes chain to head on reads…</div><div class="cm-line">[4]   728099  git_commit    2026-05-31  2026-05-30  feat(search): identify multi-version docs via --has-vers…</div><div class="cm-line">[5]   728098  git_commit    2026-05-31  2026-05-30  feat(vim): :BlogShot — SnagIt → Vim → WordPress screensh…</div><div class="cm-line">[6]   728097  git_commit    2026-05-31  2026-05-30  fix(reading): ark reading finish &lt;id|@N&gt; stored raw CLI …</div><div class="cm-line">[7]   728096  git_commit    2026-05-31  2026-05-30  fix(blog): standalone images publish as wp:image blocks;…</div><div class="cm-line">[8]   728095  outbox_draft  2026-05-31  2026-05-31  Attempts  ✎1</div><div class="cm-line">[9]   728094  playlist      2026-05-31  2026-05-31  Recently Added (v5)</div><div class="cm-line">[10]  728089  music_play    2026-05-31  2026-05-28  On Your Shore — Enya</div><div class="cm-line"># 10 results</div><div class="cm-line"></div><div class="cm-line"></div></code></pre>
		</div>
	</div>
</div>


<p class="wp-block-paragraph">But if we want to know the total number of things ingested from yesterday, we can do this:</p>


<div class="wp-block-code">
	<div class="cm-editor">
		<div class="cm-scroller">
			
<pre>
<code><div class="cm-line">ark recent --days 1 --limit 200 | wc -l</div><div class="cm-line"># 81 results</div><div class="cm-line">      82</div><div class="cm-line"></div><div class="cm-line"></div></code></pre>
		</div>
	</div>
</div>


<p class="wp-block-paragraph">Next, we can use the <code>ark bundle</code> command to generate a &#8220;bundle&#8221; of one or more documents from <code>ark</code>. This one bundle would contain the contents of all of the documents we put into it. So for instance, if we wanted to generate a bundle with the full contents of everything that came into <code>ark</code> yesterday we could do this:</p>


<div class="wp-block-code">
	<div class="cm-editor">
		<div class="cm-scroller">
			
<pre>
<code><div class="cm-line">ark recent --days 1 --limit 200 | ark bundle</div><div class="cm-line"></div><div class="cm-line"></div></code></pre>
		</div>
	</div>
</div>


<p class="wp-block-paragraph">which would produce output containing the contents of all 81 items ingested yesterday &#8212; a total of 799 lines so I won&#8217;t show them all here.</p>
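<p class="wp-block-paragraph">For readers curious how a downstream command can consume that listing, here is a hedged Python sketch of a filter that pulls document IDs out of lines shaped like the <code>ark recent</code> output shown earlier. The post does not describe how <code>ark bundle</code> actually reads its input, so the regular expression, the function name, and the field names (including the neutral <code>date1</code> and <code>date2</code> for the two date columns) are all assumptions.</p>

```python
import re

# Assumed row shape: [N]  ID  TYPE  DATE  DATE  TITLE
LINE_RE = re.compile(
    r"^\[(?P<pos>\d+)\]\s+(?P<id>\d+)\s+(?P<type>\S+)\s+"
    r"(?P<date1>\d{4}-\d{2}-\d{2})\s+(?P<date2>\d{4}-\d{2}-\d{2})\s+(?P<title>.*)$"
)


def parse_recent(lines):
    """Yield one dict per result row, skipping lines such as '# 10 results'."""
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match:
            row = match.groupdict()
            row["id"] = int(row["id"])
            yield row


# Sample lines copied from the listing earlier in this scene.
sample = [
    "[1]   728102  health_day    2026-05-31  2026-05-30  Fitbit: May 30, 2026",
    "[2]   728101  diary_entry   2026-05-31  2026-05-30  Evening Notes",
    "# 10 results",
]

rows = list(parse_recent(sample))
ids = [row["id"] for row in rows]
print(ids)
```

<p class="wp-block-paragraph">Because the listing is plain text with one document per line, any tool that can read lines from standard input can take part in the pipeline.</p>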



<p class="wp-block-paragraph">Finally, we can pipe that bundle to an <code>ark task</code>. A <code>task</code> is a prompt for an LLM to do something with the data. We&#8217;ll use my &#8220;shipped-yesterday&#8221; task, which summarizes everything I &#8220;shipped&#8221; (code commits, posts written, emails sent, etc.). Putting it all together, it looks like this:</p>


<div class="wp-block-code">
	<div class="cm-editor">
		<div class="cm-scroller">
			
<pre>
<code><div class="cm-line">ark recent --days 1 --limit 200 | ark bundle | ark task shipped-yesterday</div><div class="cm-line"># 81 results</div><div class="cm-line">ark: warning: Redacted 1 sensitive pattern(s) before sending.</div><div class="cm-line">Running task: &apos;shipped-yesterday&apos; (standard)</div><div class="cm-line">## May 30, 2026 — Daily Output Report</div><div class="cm-line"></div><div class="cm-line">### Shipped</div><div class="cm-line"></div><div class="cm-line">Five commits to the **ark** repo:</div><div class="cm-line"></div><div class="cm-line">- feat(cli): auto-follow supersedes chain to head on reads (#94 Phase 1); bump to 1.68.0 (728100)</div><div class="cm-line">- feat(search): identify multi-version docs via --has-versions + (vN of M) annotation (#95); bump to 1.69.0 (728099)</div><div class="cm-line">- feat(vim): :BlogShot — SnagIt → Vim → WordPress screenshot workflow; bump to 1.69.1 (728098)</div><div class="cm-line">- fix(reading): ark reading finish stored raw CLI arg as event title; bump to 1.69.2 (728097)</div><div class="cm-line">- fix(blog): standalone images publish as wp:image blocks; caption via title (#96); bump to 1.69.3 (728096)</div><div class="cm-line"></div><div class="cm-line">### Written</div><div class="cm-line"></div><div class="cm-line">**Diary** — &quot;Evening Notes&quot; (May 30) covers a pleasant day at home, mentions writing an internal company blog post on AI use, notes a traffic spike (2,300+ views) on his public blog, and reflects on starting Montaigne&apos;s *Essays* and a desire to re-read *The Stand*.</div><div class="cm-line"></div><div class="cm-line">**Self-email** — &quot;Tomorrow&apos;s Plan — Sunday, May 31, 2026&quot; sent from j**.**@icloud.com to himself; a detailed day-plan covering the DCA→BOS flight, the dare-to-dream volunteer event, open action items, and upcoming occasions.</div><div class="cm-line"></div><div class="cm-line">### In Flight</div><div class="cm-line"></div><div 
class="cm-line">**&quot;Attempts&quot;** (outbox draft, WordPress destination) — a personal essay connecting the etymology of &quot;essay&quot; to the author&apos;s own writing history, from junior high through college, prompted by a morning walk and an archive search. The draft is complete in the outbox but not yet published.</div><div class="cm-line"></div><div class="cm-line"></div></code></pre>
		</div>
	</div>
</div>


<p class="wp-block-paragraph">The first thing to note is the warning that <code>ark</code> gives on the second line after the command:</p>



<p class="wp-block-paragraph"><code>ark: warning: Redacted 1 sensitive pattern(s) before sending</code></p>



<p class="wp-block-paragraph">This is an example of what I was talking about earlier. Sensitive information never leaves the local machine; it is either blocked, or, as in this case, redacted before sending it to an LLM.</p>
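<p class="wp-block-paragraph">A redaction pass of this kind can be sketched in Python as a set of regular expressions applied to the text before it goes anywhere. The patterns below are hypothetical stand-ins (the post does not list which patterns <code>ark</code> looks for), the sample text is invented, and this sketch counts individual matches, which may or may not be how the real warning counts.</p>

```python
import re

# Hypothetical patterns for illustration only.
SENSITIVE_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # US SSN-shaped numbers
    re.compile(r"\b\d{16}\b"),             # 16-digit runs shaped like card numbers
]


def redact(text, patterns=SENSITIVE_PATTERNS, placeholder="[REDACTED]"):
    """Return (redacted_text, number_of_matches_replaced)."""
    total = 0
    for pattern in patterns:
        text, replaced = pattern.subn(placeholder, text)
        total += replaced
    return text, total


bundle = "Renewal form lists SSN 123-45-6789 for the account."
safe, count = redact(bundle)
if count:
    print(f"warning: Redacted {count} sensitive pattern(s) before sending.")
print(safe)
```

<p class="wp-block-paragraph">The important design point is the ordering: the redaction runs locally, and only the returned text would ever be handed to a remote model.</p>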



<p class="wp-block-paragraph">The rest summarizes the work I did yesterday, based on how I specified the way I wanted it summarized in the &#8220;shipped-yesterday&#8221; task.</p>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<p class="wp-block-paragraph">Seven scenes, but really seven angles on one habit. Most days I use four or five of these without thinking about it: the archive surfaces the day, I notice what wants attention, I look for something I half-remember, I add a note to something I just read, I check what shipped. None of those scenes are unique to <code>ark</code> taken in isolation — search engines search, task managers triage, note apps annotate. What&#8217;s unusual is that they sit on top of the same archive, share the same discipline, and the parts compound. That&#8217;s the payoff Part 1 promised the archivist&#8217;s lens would deliver.</p>



<p class="wp-block-paragraph">The next post takes the floor out from under all of this and shows what holds it up: <code>ark</code>&#8217;s data model. A single SQLite database, a clean graph connecting documents to people to one another, and a schema disciplined enough that it should still answer the same questions thirty years from now. Plumbing always sounds duller than the demo, but it&#8217;s where this stops being a pile of clever commands and starts being an archive built to last.</p>


<ol class="wp-block-footnotes"><li id="85e60200-d86e-4e44-907e-2c0d73f6b696">Eventually, <code>ark</code> will be able to take some of these actions for me! <a href="#85e60200-d86e-4e44-907e-2c0d73f6b696-link" aria-label="Jump to footnote reference 1"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/21a9.png" alt="↩" class="wp-smiley" style="height: 1em; max-height: 1em;" />︎</a></li><li id="bcd70b27-9736-4134-96fd-51cf9143dfa0">An embedding is a way of converting text (or other data) into a list of numbers (a vector) that captures its meaning. The key property is that things with similar meanings end up with similar numbers, so you can do math on meaning. When you ask an LLM a question, your question is turned into an embedding and that math is applied to see how &#8220;similar&#8221; it is to other embeddings. <a href="#bcd70b27-9736-4134-96fd-51cf9143dfa0-link" aria-label="Jump to footnote reference 2"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/21a9.png" alt="↩" class="wp-smiley" style="height: 1em; max-height: 1em;" />︎</a></li><li id="1bcefe97-72b4-4017-8155-cd0be37058af">Yes, <code>ark</code> has a history of everything I&#8217;ve listened to (via Apple Music) and watched (via YouTube, Amazon, Netflix and a variety of sources). More on this in the next post in the series. <a href="#1bcefe97-72b4-4017-8155-cd0be37058af-link" aria-label="Jump to footnote reference 3"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/21a9.png" alt="↩" class="wp-smiley" style="height: 1em; max-height: 1em;" />︎</a></li><li id="35368443-6a51-489c-87c1-f31066165a2f">They can be versioned and superseded, but that is a topic for a future post. <a href="#35368443-6a51-489c-87c1-f31066165a2f-link" aria-label="Jump to footnote reference 4"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/21a9.png" alt="↩" class="wp-smiley" style="height: 1em; max-height: 1em;" />︎</a></li><li id="a1392855-38da-49b9-8006-6bc497fd32a3">MacVim in my case. 
<a href="#a1392855-38da-49b9-8006-6bc497fd32a3-link" aria-label="Jump to footnote reference 5"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/21a9.png" alt="↩" class="wp-smiley" style="height: 1em; max-height: 1em;" />︎</a></li></ol>]]></content:encoded>
					
					<wfw:commentRss>https://jamierubin.net/2026/06/03/ark-part-2/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">27683</post-id>	</item>
		<item>
		<title>Attempts</title>
		<link>https://jamierubin.net/2026/05/31/attempts/</link>
					<comments>https://jamierubin.net/2026/05/31/attempts/#respond</comments>
		
		<dc:creator><![CDATA[Jamie Todd Rubin]]></dc:creator>
		<pubDate>Sun, 31 May 2026 12:00:00 +0000</pubDate>
				<category><![CDATA[Personal & Family]]></category>
		<category><![CDATA[essay]]></category>
		<guid isPermaLink="false">https://jamierubin.net/?p=27654</guid>

					<description><![CDATA[It occurs to me that I have been writing essays for most of my life. At some point, early in life, the thought of writing an essay, a.k.a., a book report, filled me with dread. I&#8217;m not sure why, except that it was work I had to do and I always put off that work [&#8230;]]]></description>
										<content:encoded><![CDATA[
<p class="wp-block-paragraph">It occurs to me that I have been writing essays for most of my life. At some point, early in life, the thought of writing an essay, a.k.a. a book report, filled me with dread. I&#8217;m not sure why, except that it was work I had to do and I always put off that work until the last possible minute<sup data-fn="ae517852-09bd-4216-83c4-6f7d84974982" class="fn"><a href="#ae517852-09bd-4216-83c4-6f7d84974982" id="ae517852-09bd-4216-83c4-6f7d84974982-link">1</a></sup>. I would have sworn that it wasn&#8217;t until high school that I started to change my mind about essays, that there was something to the <em>attempt</em> itself, which, it turns out, is where the word comes from.</p>



<p class="wp-block-paragraph">And then this morning, on my early walk, I was listening to the SiriusXM 80s on 8 Top 40 countdown for this week in 1987. The music immediately had me back in 9th grade, my final year at Porter Junior High School, and less than a month from graduating. I don&#8217;t remember much about the writing I did in junior high. I know that in 7th grade, we had to research and write books (short books) and then make bindings for them. My book was on Egypt, a subject I never would have chosen, except that I was trying to impress a girl who liked Egypt. Indeed, I was much more into computers and technology than writing at that point in my life, even receiving recognition for a rudimentary flight simulator I created for my programming class. I received similar recognition for math. But writing? Not until high school!</p>



<p class="wp-block-paragraph">And yet&#8230; Hearing those songs this morning had me wondering about that final year at Porter. I returned from my walk, sat down at this very laptop, and at the command line asked <a href="https://jamierubin.net/ark/"><code>ark</code></a> what I had in my archive from 9th grade. And the very first match surprised me: an outstanding merit award for the young authors project!</p>



<figure class="wp-block-image size-large"><img data-recalc-dims="1" fetchpriority="high" decoding="async" width="550" height="341" data-attachment-id="27653" data-permalink="https://jamierubin.net/2026-05-30-attempts-02/" data-orig-file="https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/05/2026-05-30-attempts-02.png?fit=1258%2C780&amp;ssl=1" data-orig-size="1258,780" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;,&quot;alt&quot;:&quot;&quot;}" data-image-title="2026-05-30-attempts-02" data-image-description="" data-image-caption="" data-large-file="https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/05/2026-05-30-attempts-02.png?fit=550%2C341&amp;ssl=1" src="https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/05/2026-05-30-attempts-02.png?resize=550%2C341&#038;ssl=1" alt="" class="wp-image-27653" srcset="https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/05/2026-05-30-attempts-02.png?resize=550%2C341&amp;ssl=1 550w, https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/05/2026-05-30-attempts-02.png?resize=400%2C248&amp;ssl=1 400w, https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/05/2026-05-30-attempts-02.png?resize=768%2C476&amp;ssl=1 768w, https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/05/2026-05-30-attempts-02.png?resize=1200%2C744&amp;ssl=1 1200w, https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/05/2026-05-30-attempts-02.png?w=1258&amp;ssl=1 1258w" sizes="(max-width: 550px) 100vw, 550px" /></figure>



<p class="wp-block-paragraph">Now, truth be told, I don&#8217;t remember the young authors project, and I would never refer to myself as a young author (or an old one for that matter). A writer, yes; an author? Not me. And I suspect that whatever it was that I was writing<sup data-fn="acda599a-71c1-4dce-8e06-dae1c0865726" class="fn"><a href="#acda599a-71c1-4dce-8e06-dae1c0865726" id="acda599a-71c1-4dce-8e06-dae1c0865726-link">2</a></sup>, it was not essays. But it was a bridge of some kind that got me to high school, where the program I was in tested its students through a single method: the essay. It was there that I made countless &#8220;attempts&#8221; at writing essays to make an argument, to convince, or cajole, or to see just how many pages I could whip out in cursive in a 2-hour period without my hand falling off.</p>



<p class="wp-block-paragraph">It is because of <em>those</em> essays that the essays I wrote in college were both more enjoyable and easier to do. I&#8217;d already had my 10,000 hours of practice<sup data-fn="d901c8a0-5571-4279-9ef1-55dbb2f3f19a" class="fn"><a href="#d901c8a0-5571-4279-9ef1-55dbb2f3f19a" id="d901c8a0-5571-4279-9ef1-55dbb2f3f19a-link">3</a></sup>. And it was in those college essays &#8212; where I Puckishly avoided the traditional stilted style and wrote mostly colloquially &#8212; that I developed the style that would become what I think is the voice of these attempts.</p>



<p class="wp-block-paragraph">Incidentally, this week&#8217;s countdown wasn&#8217;t a bad one. But given that I am now out of junior high school some 39 years, I can&#8217;t for the life of me recall what the #1 song was.</p>


<ol class="wp-block-footnotes"><li id="ae517852-09bd-4216-83c4-6f7d84974982">This never really changed. In college I was something of an anomaly among my friends in putting off writing an essay until the last possible minute &#8212; often in the wee hours on the morning the paper was due. And yet, it flowed naturally somehow, without much effort, and I nearly always received an A. <a href="#ae517852-09bd-4216-83c4-6f7d84974982-link" aria-label="Jump to footnote reference 1"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/21a9.png" alt="↩" class="wp-smiley" style="height: 1em; max-height: 1em;" />︎</a></li><li id="acda599a-71c1-4dce-8e06-dae1c0865726">It is <em>not</em> preserved in my archive. <a href="#acda599a-71c1-4dce-8e06-dae1c0865726-link" aria-label="Jump to footnote reference 2"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/21a9.png" alt="↩" class="wp-smiley" style="height: 1em; max-height: 1em;" />︎</a></li><li id="d901c8a0-5571-4279-9ef1-55dbb2f3f19a">It <em>seemed</em> like 10,000 hours. <a href="#d901c8a0-5571-4279-9ef1-55dbb2f3f19a-link" aria-label="Jump to footnote reference 3"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/21a9.png" alt="↩" class="wp-smiley" style="height: 1em; max-height: 1em;" />︎</a></li></ol>]]></content:encoded>
					
					<wfw:commentRss>https://jamierubin.net/2026/05/31/attempts/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">27654</post-id>	</item>
		<item>
		<title>Chasing Windmills</title>
		<link>https://jamierubin.net/2026/05/28/chasing-windmills/</link>
					<comments>https://jamierubin.net/2026/05/28/chasing-windmills/#respond</comments>
		
		<dc:creator><![CDATA[Jamie Todd Rubin]]></dc:creator>
		<pubDate>Thu, 28 May 2026 12:00:00 +0000</pubDate>
				<category><![CDATA[Reading & Books]]></category>
		<category><![CDATA[essay]]></category>
		<guid isPermaLink="false">https://jamierubin.net/?p=27642</guid>

					<description><![CDATA[You know that ol&#8217; Peanuts cartoon? The one where Lucy holds the football for Charlie Brown so that he can kick it? C.B. is reluctant because he&#8217;s afraid Lucy will pull the ball away at the last second and he&#8217;ll fall on his can. But somehow, Lucy always convinces him that this time she won&#8217;t [&#8230;]]]></description>
										<content:encoded><![CDATA[
<figure class="wp-block-image size-large"><img data-recalc-dims="1" decoding="async" width="550" height="733" data-attachment-id="27646" data-permalink="https://jamierubin.net/img_0698/" data-orig-file="https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/05/img_0698.jpg?fit=1500%2C2000&amp;ssl=1" data-orig-size="1500,2000" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;1.6&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;iPhone 15&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;1779953037&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;5.96&quot;,&quot;iso&quot;:&quot;400&quot;,&quot;shutter_speed&quot;:&quot;0.016666666666667&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;1&quot;,&quot;alt&quot;:&quot;&quot;}" data-image-title="img_0698" data-image-description="" data-image-caption="" data-large-file="https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/05/img_0698.jpg?fit=550%2C733&amp;ssl=1" src="https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/05/img_0698.jpg?resize=550%2C733&#038;ssl=1" alt="" class="wp-image-27646" srcset="https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/05/img_0698.jpg?resize=550%2C733&amp;ssl=1 550w, https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/05/img_0698.jpg?resize=400%2C533&amp;ssl=1 400w, https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/05/img_0698.jpg?resize=768%2C1024&amp;ssl=1 768w, https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/05/img_0698.jpg?resize=1152%2C1536&amp;ssl=1 1152w, https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/05/img_0698.jpg?w=1500&amp;ssl=1 1500w" sizes="(max-width: 550px) 100vw, 550px" /></figure>



<p class="wp-block-paragraph">You know that ol&#8217; Peanuts cartoon? The one where Lucy holds the football for Charlie Brown so that he can kick it? C.B. is reluctant because he&#8217;s afraid Lucy will pull the ball away at the last second and he&#8217;ll fall on his can. But somehow, Lucy always convinces him that this time she won&#8217;t pull the ball away, and C.B. goes to kick the ball, and Lucy pulls the ball away, and there is C.B. on his can.</p>



<p class="wp-block-paragraph">That is me, chasing windmills.</p>



<p class="wp-block-paragraph">Windmills, in this instance, are interminably long books that, despite their interminable length, have an allure, a siren song, a mystique, and a cultish air of mystery that makes me return to them again and again, only to have Lucy pull the football from me, and whoops! Here I am on my can once again.</p>



<p class="wp-block-paragraph">I am foolishly seduced by these types of books, unable to resist their charms. And yet, unable to commit fully to the relationship. I am thinking about this now because, well, not to beat around the bush, but I&#8217;ve been ensorcelled by one of these windmills. Sitting in a waiting room today, I found myself reading<sup data-fn="0b5e2bd3-6f05-472a-ad7a-8e95ab8e8f63" class="fn"><a href="#0b5e2bd3-6f05-472a-ad7a-8e95ab8e8f63" id="0b5e2bd3-6f05-472a-ad7a-8e95ab8e8f63-link">1</a></sup>, for no good reason other than, well, wily charms, Boswell&#8217;s <em><a href="https://bookshop.org/p/books/the-life-of-johnson-james-boswell/f30be42e2b3442e7">Life of Johnson</a></em>. I could say it was because, again and again, I&#8217;ve heard that it is the ultimate biography, and as one who reads a lot of biographies<sup data-fn="9692be1d-b93f-4b9b-b7f7-b0c300fe8403" class="fn"><a href="#9692be1d-b93f-4b9b-b7f7-b0c300fe8403" id="9692be1d-b93f-4b9b-b7f7-b0c300fe8403-link">2</a></sup>, who could resist that tagline? But the truth is, I am weak in the knees for these windmills. The Oxford paperback edition on my shelf is 1,402 pages, <em>not</em> counting the index. And on some pages, footnotes consume almost the entire page! I feel faint.</p>



<p class="wp-block-paragraph">I&#8217;ll tilt at that windmill for a while. And then I&#8217;ll start to think to myself: what am I missing by continuing this fruitless quest? There&#8217;s that Laura Spinney book, <em><a href="https://bookshop.org/p/books/proto-how-one-ancient-language-went-global-laura-spinney/0e3ce5edcebe321f">Proto</a></em>, on early language, that I&#8217;ve wanted to read. Jim Rasenberger just came out with <em><a href="https://bookshop.org/p/books/a-perfect-coincidence-the-extraordinary-friendship-and-astonishing-deaths-of-john-adams-and-thomas-jefferson-jim-rasenberger/158622a7b14b5d74">Perfect Coincidence</a></em>, about the friendship of Jefferson and Adams, that I&#8217;ve had my eye on. Or there&#8217;s <em><a href="https://bookshop.org/p/books/the-death-and-life-of-zebulon-finch-volume-one-at-the-edge-of-empire-daniel-kraus/4484ca4bf46c638c?ean=9781481411400">The Death and Life of Zebulon Finch, Volume One</a></em> by Daniel Kraus, which, while long, is not in the category of a windmill.</p>



<p class="wp-block-paragraph">I&#8217;ve learned to indulge these peccadilloes knowing that the affairs will burn out, the attempts will fail. But the allure of those windmills remains.</p>



<p class="wp-block-paragraph">Other hills I&#8217;ve tried to climb, other windmills I&#8217;ve tilted at? Well, there&#8217;s <em><a href="https://bookshop.org/p/books/infinite-jest-30th-anniversary-edition-david-foster-wallace/6549eeba4077763d">Infinite Jest</a></em> by David Foster Wallace; there&#8217;s <em><a href="https://bookshop.org/p/books/marlborough-his-life-and-times-1933-winston-s-churchill/f995a1b49c7fe8e0">Marlborough: His Life and Times</a></em> by Winston Churchill; there&#8217;s <em><a href="https://bookshop.org/p/books/the-structure-of-evolutionary-theory-the-alexander-agassiz-professor-of-zoology-stephen-jay-gould/b7c5401cc50aca4d">The Structure of Evolutionary Theory</a></em> by Stephen Jay Gould.</p>



<p class="wp-block-paragraph">It is the height of irony that one of these windmills of mine would be <em>Don Quixote</em> by Miguel de Cervantes.</p>



<p class="wp-block-paragraph">But here I am, enchanted at the moment by <em>The Life of Johnson</em>, unable, for now, to set her aside for something more frivolous, with fewer moving blades, and a less imposing demeanor. At least until Lucy pulls away the football and I am back on my can.</p>


<ol class="wp-block-footnotes"><li id="0b5e2bd3-6f05-472a-ad7a-8e95ab8e8f63">I usually carry a book with me wherever I am, but for Puckish reasons, I find it delightful to be the only person in a waiting room / line / theater lobby who is reading a paper book rather than looking at a phone. <a href="#0b5e2bd3-6f05-472a-ad7a-8e95ab8e8f63-link" aria-label="Jump to footnote reference 1"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/21a9.png" alt="↩" class="wp-smiley" style="height: 1em; max-height: 1em;" />︎</a></li><li id="9692be1d-b93f-4b9b-b7f7-b0c300fe8403">A couple of hundred at least, if anyone&#8217;s wondering. <a href="#9692be1d-b93f-4b9b-b7f7-b0c300fe8403-link" aria-label="Jump to footnote reference 2"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/21a9.png" alt="↩" class="wp-smiley" style="height: 1em; max-height: 1em;" />︎</a></li></ol>


]]></content:encoded>
					
					<wfw:commentRss>https://jamierubin.net/2026/05/28/chasing-windmills/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">27642</post-id>	</item>
		<item>
		<title>`ark`: A Personal Archive System, Part 1: An Archivist&#8217;s View</title>
		<link>https://jamierubin.net/2026/05/26/ark-part-1/</link>
					<comments>https://jamierubin.net/2026/05/26/ark-part-1/#comments</comments>
		
		<dc:creator><![CDATA[Jamie Todd Rubin]]></dc:creator>
		<pubDate>Tue, 26 May 2026 11:59:48 +0000</pubDate>
				<category><![CDATA[Technology & Gadgets]]></category>
		<category><![CDATA[essay]]></category>
		<guid isPermaLink="false">https://jamierubin.net/?p=27613</guid>

					<description><![CDATA[Note: This post is the first in a new series of posts describing ark, a command-line-based personal archive system I have developed and am using as my primary archiving tool. If you are not interested in this type of tech post, feel free to skip it. For those who are interested, a plan to provide [&#8230;]]]></description>
										<content:encoded><![CDATA[
<p class="is-style-info wp-block-paragraph"><strong>Note</strong>: This post is the first in a <a href="https://jamierubin.net/ark/" data-type="page" data-id="27622">new series of posts</a> describing <code>ark</code>, a command-line-based personal archive system I have developed and am using as my primary archiving tool. If you are not interested in this type of tech post, feel free to skip it. For those who are interested, I plan to provide a new entry in the series each Tuesday for the next 14 weeks.<br><br><code>ark</code> is not publicly available on GitHub at the moment. The system has been highly tailored to how I work. It is also highly tailored to the Mac environment. I am on the fence about making it publicly available because I don&#8217;t have the time or inclination to support it. While I was careful with the design, the design was egocentric in that the one and only user I had in mind was me. Depending on the feedback I hear from folks over the next fourteen weeks, I&#8217;ll decide whether to make the code publicly available.</p>



<p class="wp-block-paragraph">A couple of months ago, I was searching for a trust document. I&#8217;ve got two computers and two external drives. I tried multiple combinations of searches across all devices. I used Spotlight on both machines, and then switched to Unix-based search commands. Despite knowing the document existed <em>somewhere</em>, my search abilities couldn&#8217;t surface it.</p>



<h2 class="wp-block-heading">I. A Moment of Friction</h2>



<p class="wp-block-paragraph">I have <a href="https://jamierubin.net/2019/05/29/the-jamie-todd-rubin-papers/">thought about personal archives</a> now and then &#8212; the kind you read about in biographies of notable people. I recall reading about how Boston University asked Isaac Asimov to collect his papers. I began to wonder if it was possible to create a similar archive for all of my papers. I took a small step in this direction in late 2024 with my <a href="https://jamierubin.net/2024/12/21/my-personal-archive-system/">Personal Archive System</a>. This was an experiment to see what was in the realm of the possible. It was web-based, but I hadn&#8217;t given much thought to the long-term design and architecture &#8212; the very kinds of things I do in my day job.</p>



<p class="wp-block-paragraph">At the same time, I had been using Claude Code at work to help me build a command-line system that made it easy to interact with Jira and tie Jira into an LLM for easy summarizing. I was impressed by how well Claude Code worked and how it felt almost like a real collaborator. In that project, I acted as an architect, and Claude Code did all of the grunt work for me.</p>



<p class="wp-block-paragraph">At the end of March, I decided to sit down and, with Claude Code as a kind of partner<sup data-fn="b78bde67-f210-4c2b-869f-aede595353b4" class="fn"><a href="#b78bde67-f210-4c2b-869f-aede595353b4" id="b78bde67-f210-4c2b-869f-aede595353b4-link">1</a></sup>, talk through how to best design a personal archive system that takes its core design principles from real archiving principles, while meeting my requirements for what I wanted in an archive system.</p>



<h2 class="wp-block-heading">II. An Archivist&#8217;s Lens</h2>



<p class="wp-block-paragraph">Longtime readers know that I have gone through at least two major iterations of personal archiving of sorts. The first, in the early 2010s, was the time I spent using Evernote to <a href="https://jamierubin.net/going-paperless/">go paperless</a>. That experiment lasted several years. Ultimately, however, it wasn&#8217;t a good fit for what I was trying to do. Part of the problem was that, at the time, I&#8217;m not sure I knew what I was trying to do. I knew that I wanted to be able to find things quickly. In order to do that, Evernote required some amount of metadata infrastructure (notebooks, tags), and for me, maintaining that became a roadblock.</p>



<p class="wp-block-paragraph">The second wave, in the 2020s, has been my use of Obsidian to go <a href="https://jamierubin.net/blog-series/practically-paperless-with-obsidian/">practically paperless</a>. One thing that attracted me to Obsidian was its simplicity. At its core, it was plain text, the most basic, most portable form of data there is. Another thing that attracted me was that it was entirely local. No need to store data in the cloud. Everything was on my local machine. Ultimately, however, what I discovered was that both Evernote and Obsidian were <em>working</em> tools. That is, places to do work, take notes, etc., rather than a stable archive of work <em>already</em> done. In other words, these tools are optimized for <em>now</em>. An archive is optimized for <em>posterity</em>.</p>



<p class="wp-block-paragraph">As I worked on the design of the system, one question that guided every decision was <em>how would a real archive handle this, and where does the personal context require an adaptation?</em> One obvious difference: in a public archive, the archivist and the subject are different people; here they are the same person. Other than that, the principles that guide a public archive could apply here. These principles include things like:</p>



<ul class="wp-block-list">
<li>Provenance</li>



<li>Finding aids</li>



<li>Controlled vocabulary</li>



<li>Sensitivity</li>



<li>Accession</li>
</ul>



<p class="wp-block-paragraph">For instance, every item in the archive comes from somewhere in either the digital or physical world. This is its provenance. An archivist typically organizes items in an archive in a hierarchy that starts with series at the top. A series might have sub-series. Items in a sub-series might go in one or more files. Within the files are the items themselves. There is always, therefore, a clear path to an item in the archive.</p>
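<p class="wp-block-paragraph">As a rough illustration of that &#8220;clear path&#8221; idea, here is a minimal Python sketch of the series, sub-series, file, item hierarchy. The class name, field names, and example values are all hypothetical; this is not how <code>ark</code> itself models the hierarchy, only one way the idea could be expressed.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ItemLocation:
    """Hypothetical position of one item in an archival hierarchy."""
    series: str
    subseries: str
    file: str
    item: str
    provenance: str  # where the item came from (assumed free-text field)

    def path(self):
        # Join the levels from broadest to narrowest into one readable path.
        return " / ".join([self.series, self.subseries, self.file, self.item])


# Example values are invented for illustration only.
trust = ItemLocation(
    series="Legal",
    subseries="Estate Planning",
    file="Trust Documents",
    item="living-trust.pdf",
    provenance="scanned paper original",
)
print(trust.path())
```

<p class="wp-block-paragraph">Because every level is required, an item cannot exist in this sketch without a complete path back to its series.</p>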



<h2 class="wp-block-heading">III. Non-Negotiables</h2>



<p class="wp-block-paragraph">I came into the design discussion with several non-negotiable design decisions:</p>



<ol class="wp-block-list">
<li><strong>Local data storage</strong>: The archive would be designed to be stored locally on a file system.</li>



<li><strong>A clear data egress boundary</strong>: A personal archive by its very nature will contain sensitive documents in a life: medical, financial, legal, etc. The sensitivity of these documents is decided at ingest and enforced at egress as a hard rule. As we will see, the archive makes use of LLMs for a variety of tasks. But most LLMs reside on the internet and that means sending data to them. <code>ark</code> was designed to block sensitive data from egress to these sources<sup data-fn="3fd35643-dda6-4f3c-b7ec-15fcffcfc421" class="fn"><a href="#3fd35643-dda6-4f3c-b7ec-15fcffcfc421" id="3fd35643-dda6-4f3c-b7ec-15fcffcfc421-link">2</a></sup>.</li>



<li><strong>Hands-off automation</strong>: When bringing in hundreds of thousands of documents, manual classification, linking, etc. is out of the question. There have to be mechanisms for automating this process.</li>



<li><strong>Durability over cleverness</strong>: Every architectural decision, from local file storage to the database, Unix composability, and the agnostic LLM layer, favors longevity. Services come and go. Formats change. APIs get deprecated. The goal isn&#8217;t to have the cleverest tool; it is to have a tool that still works fifty or a hundred years from now.</li>
</ol>
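<p class="wp-block-paragraph">The egress boundary in the second item can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the actual <code>ark</code> code: the <code>Document</code> class, the <code>choose_llm</code> function, and the <code>EgressBlocked</code> exception are hypothetical names. It assumes a sensitivity flag set at ingest and, per the footnote, a local LLM as the only permitted destination for sensitive material.</p>

```python
from dataclasses import dataclass


class EgressBlocked(Exception):
    """Raised when a sensitive document is about to leave the machine."""


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str
    sensitive: bool  # assumed to be decided once, at ingest


def choose_llm(doc, local_llm_available=False):
    """Return which kind of LLM may see this document (hypothetical helper)."""
    if not doc.sensitive:
        return "remote"
    if local_llm_available:
        # Sensitive material may only go to a model running on this machine.
        return "local"
    # Hard rule: with no local model, the document goes nowhere.
    raise EgressBlocked(f"{doc.doc_id} is sensitive and cannot be sent to a remote LLM")


blog_post = Document("doc-1", "A post about windmills", sensitive=False)
tax_return = Document("doc-2", "Scanned tax record", sensitive=True)

print(choose_llm(blog_post))
print(choose_llm(tax_return, local_llm_available=True))
```

<p class="wp-block-paragraph">The point of raising an exception rather than returning a default is that a blocked egress fails loudly, so no caller can forward a sensitive document by accident.</p>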



<h2 class="wp-block-heading">IV. Pipeline as an Operational Spine</h2>



<p class="wp-block-paragraph">There are seven stages that form the operational backbone of <code>ark</code>.</p>



<ol class="wp-block-list">
<li><strong>Ingest</strong>: Includes parsers for email, PDFs, images, Office documents, and Markdown, as well as other sources of data like calendars, text messages, reading data, music, health data, and more. When ingested, this data goes through type detection, text extraction, OCR, and deduping. Think of this as the narrow waist where everything entering the archive gets normalized into the same shape.</li>



<li><strong>Enrich</strong>: Two flavors of enrichment: (1) automated, which includes things like classification, embeddings, mood scoring, place and person extraction; (2) human enrichment, which includes things like annotations and people curation.</li>



<li><strong>Store</strong>: Content-addressed file storage plus a SQLite database holding records, the full-text index, embeddings, annotations, and the relationship graph. Ingested data enters the store and becomes a permanent, immutable document<sup data-fn="7c2421a9-190b-41c2-99b5-d1e1dd06886d" class="fn"><a href="#7c2421a9-190b-41c2-99b5-d1e1dd06886d" id="7c2421a9-190b-41c2-99b5-d1e1dd06886d-link">3</a></sup>.</li>



<li><strong>Search</strong>: Or more formally, retrieval, includes full-text, semantic, hybrid, and person-aware searches, and a scoring system that ranks results by signal rather than by raw text match.</li>



<li><strong>Surface</strong>: What the archive can show you outside of searches: timelines, &#8220;on this day&#8221;, action-item digests, etc. These are distinct from retrieval in that you don&#8217;t explicitly ask for them. The archive offers them.</li>



<li><strong>Synthesis</strong>: Creating &#8220;bundles&#8221;, LLM tasks, deep searches, MCP. This is where the archive stops being a corpus and starts answering questions.</li>



<li><strong>Stewardship</strong>: Proactively looking for and creating actions based on what is coming into the archive; automatically drafting replies, etc., for incoming activities. This is the bridge to automation that may allow the archive to act on my behalf as a kind of personal assistant.</li>
</ol>
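<p class="wp-block-paragraph">To make the Store stage more concrete, here is a small Python sketch of content-addressed storage paired with a SQLite table. It is an illustration under assumptions, not the <code>ark</code> implementation: the use of SHA-256, the two-character directory sharding, the <code>records</code> table, and the function names are all hypothetical choices made for this example.</p>

```python
import hashlib
import sqlite3
import tempfile
from pathlib import Path


def store_blob(root, data):
    """Write bytes under a path derived from their SHA-256 digest."""
    digest = hashlib.sha256(data).hexdigest()
    # Shard by the first two hex characters to keep directories small.
    target = Path(root) / digest[:2] / digest
    if not target.exists():  # identical content is stored only once
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)
    return digest


def add_record(db, digest, title):
    """Record the blob in SQLite; a repeated digest is ignored."""
    db.execute(
        "INSERT OR IGNORE INTO records (digest, title) VALUES (?, ?)",
        (digest, title),
    )


root = tempfile.mkdtemp()
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE records (digest TEXT PRIMARY KEY, title TEXT)")

first = store_blob(root, b"living trust, signed copy")
second = store_blob(root, b"living trust, signed copy")  # duplicate ingest
add_record(db, first, "Living Trust")
add_record(db, second, "Living Trust (again)")

count = db.execute("SELECT COUNT(*) FROM records").fetchone()[0]
print(first == second, count)
```

<p class="wp-block-paragraph">Because the file name is derived from the content, ingesting the same bytes twice yields the same address, which is one simple way to get both deduplication and immutability.</p>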



<h2 class="wp-block-heading">V. What&#8217;s Coming in the Rest of the Series</h2>



<p class="wp-block-paragraph">The remainder of this series will describe each of these layers from inside the design:</p>



<ul class="wp-block-list">
<li>Part 2 zooms into the data model and graph that serve as the connective tissue between documents, people, and annotations.</li>



<li>Part 3 covers the ingestion process &#8212; how data comes into the archive, in bulk, and through a daily ingestion process.</li>



<li>Part 4 covers retrieval.</li>



<li>Parts 5-6 cover the enrichment layers.</li>



<li>Part 7 gives a tour of the way books and reading are handled by <code>ark</code>. Both are a significant part of my daily life, and warranted their own functions in the system.</li>



<li>Part 8 covers synthesis: bundles, LLM tasks, deep searches, and the &#8220;recommend&#8221; surface.</li>



<li>Part 9 covers stewardship &#8212; the layer that turns ingested material into prioritized work.</li>



<li>Part 10 covers publishing back out via the outbox.</li>



<li>Part 11 covers the operational tools and functions of living with the archive day-to-day: sync, location, timeline, day rank, etc.</li>



<li>Part 12 covers the four interfaces: CLI, TUI, MCP, and Vim.</li>



<li>Part 13 is all about the long game: production, backup, export, and durability.</li>



<li>Part 14 closes with a portrait of what the archive actually holds.</li>
</ul>



<h2 class="wp-block-heading">VI. The Long Game</h2>



<p class="wp-block-paragraph"><code>ark</code> looks the way it does because archivists solved these problems first, and because the design choices that matter most at a five-year horizon are different from those that matter most at a fifty-year horizon. Keep that in mind as you read the rest of the series.</p>



<h2 class="wp-block-heading">VII. Coda</h2>



<p class="wp-block-paragraph">It took about a week of work to get the core system up and running, and to get 80% of the documents I had into the archive. As I write this, there are 680,497 items in the archive. There were around 400,000 at the end of that first week. At that point, I used the tool to look for the trust document I had been searching for. By then, the enrichment layer was in place, and the basic search functionality automatically enriched the search as well. I ran a simple search command in <code>ark</code>:</p>



<p class="wp-block-paragraph"><code>ark search "living trust" --type pdf</code></p>



<p class="wp-block-paragraph">which, after about 0.30 seconds, returned a single match: the exact document I was looking for in the first place.</p>



<p class="wp-block-paragraph">Two months of nights and weekends in — about 1,250 commits, 116,000 lines of Python, 4,800 tests — ark holds 680,497 documents across fifty tables: emails going back to 1994, every diary entry I&#8217;ve digitized, photos with their GPS, scanned tax records, blog posts, calendar events, books I&#8217;ve finished, music I&#8217;ve listened to. The 9 GB SQLite database and 125 GB content store live on this laptop and back themselves up overnight. Everything I&#8217;ve described above is what makes that pile navigable.</p>


<ol class="wp-block-footnotes"><li id="b78bde67-f210-4c2b-869f-aede595353b4">I am well aware that a large language model is nothing more than a complex mathematical construct, not a real entity or consciousness. I use the term &#8220;partner&#8221; as a convenient shorthand for how I worked with this particular tool. <a href="#b78bde67-f210-4c2b-869f-aede595353b4-link" aria-label="Jump to footnote reference 1"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/21a9.png" alt="↩" class="wp-smiley" style="height: 1em; max-height: 1em;" />︎</a></li><li id="3fd35643-dda6-4f3c-b7ec-15fcffcfc421">As we will see in a future post, <code>ark</code> has a fallback that allows these sensitive documents to be routed to a local LLM, if one exists. <a href="#3fd35643-dda6-4f3c-b7ec-15fcffcfc421-link" aria-label="Jump to footnote reference 2"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/21a9.png" alt="↩" class="wp-smiley" style="height: 1em; max-height: 1em;" />︎</a></li><li id="7c2421a9-190b-41c2-99b5-d1e1dd06886d"><code>ark</code> is version aware and a newer version of a document can &#8220;supersede&#8221; a previous version. The archive stores all of them, linking them together appropriately and displaying only the current version, unless an earlier version is requested. <a href="#7c2421a9-190b-41c2-99b5-d1e1dd06886d-link" aria-label="Jump to footnote reference 3"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/21a9.png" alt="↩" class="wp-smiley" style="height: 1em; max-height: 1em;" />︎</a></li></ol>]]></content:encoded>
					
					<wfw:commentRss>https://jamierubin.net/2026/05/26/ark-part-1/feed/</wfw:commentRss>
			<slash:comments>3</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">27613</post-id>	</item>
		<item>
		<title>My Blue Heaven</title>
		<link>https://jamierubin.net/2026/05/25/my-blue-heaven/</link>
					<comments>https://jamierubin.net/2026/05/25/my-blue-heaven/#respond</comments>
		
		<dc:creator><![CDATA[Jamie Todd Rubin]]></dc:creator>
		<pubDate>Mon, 25 May 2026 12:00:00 +0000</pubDate>
				<category><![CDATA[Personal & Family]]></category>
		<category><![CDATA[essay]]></category>
		<guid isPermaLink="false">https://jamierubin.net/?p=27606</guid>

					<description><![CDATA[The mail arrives, and rather than be full of junk ads for political campaigns, insurance companies, and furniture far too rich for my blood, it comes with three or four personalized advertisements for this college or that one. They are not addressed to me or Kelly, but rather to Zach, who, a year from now, [&#8230;]]]></description>
										<content:encoded><![CDATA[
<p class="wp-block-paragraph">The mail arrives, and rather than being full of junk ads for political campaigns, insurance companies, and furniture far too rich for my blood, it comes with three or four personalized advertisements for this college or that one. They are not addressed to me or Kelly, but rather to Zach, who, a year from now, will be wrapping up his senior year in high school. We&#8217;ll know where he is going to college by then, a thought that seemed utterly impossible 17 years ago when he <a href="https://jamierubin.net/2009/06/18/the-zachary-chronicles-episode-1-our-hero-arrives/">arrived on the scene</a>.</p>



<p class="wp-block-paragraph">I&#8217;ve been thinking a lot about my time in college, about the things I did, and didn&#8217;t do; thinking about the lifelong friends I made there; thinking about the classes I took and the professors I studied under. I&#8217;ve been thinking about the music I listened to while in school, whether it was while studying, or working in the dorm dish room, or walking to-and-from classes listening to cassettes on my Sony Walkman<sup data-fn="4697964c-7581-4ecb-a329-fc715bc8a81f" class="fn"><a href="#4697964c-7581-4ecb-a329-fc715bc8a81f" id="4697964c-7581-4ecb-a329-fc715bc8a81f-link">1</a></sup>. I&#8217;ve been thinking about the movies I saw, and the papers I wrote, and the late night, deep philosophical debates I had with friends and dorm-mates.</p>



<p class="wp-block-paragraph">A university is a place of learning &#8212; not just a place to discover the secrets of the universe, or to delve into the mysteries of human psychology, or the life cycles of insects, or whatever interests you &#8212; but a place to figure out who you are and what you want to do with your life. It is a place to meet new and interesting people, to try new things, whether it is a food or drink, a sport, a hobby. It is a place designed to encourage exploration and experimentation in all directions. Indeed, it is a unique experience in a life, an island of learning that is in many ways cut off from the real world. It is less haven and more heaven.</p>



<p class="wp-block-paragraph">In his memoir <em><a href="https://jamierubin.net/2025/04/06/shelf-life-8-i-asimov/">I. Asimov</a></em>, Isaac Asimov writes,</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p class="wp-block-paragraph">I dreamed I had died and gone to heaven. I looked about and knew where I was &#8212; green fields, fleecy clouds, perfumed air, and the distant, ravishing sound of the heavenly choir. And there was the recording angel smiling broadly at me in greeting&#8230; [I] asked, &#8220;Is there a typewriter I can use?&#8221;</p>
</blockquote>



<p class="wp-block-paragraph">In W.P. Kinsella&#8217;s marvelous novel <em><a href="https://bookshop.org/p/books/shoeless-joe-w-p-kinsella/dcaf00763b6333ff">Shoeless Joe</a></em>,<sup data-fn="de153714-07e2-4201-bf82-17ea3f4c4607" class="fn"><a href="#de153714-07e2-4201-bf82-17ea3f4c4607" id="de153714-07e2-4201-bf82-17ea3f4c4607-link">2</a></sup> after shagging flies and grounders in a cornfield, Shoeless Joe speaks:</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p class="wp-block-paragraph">&#8220;God what an outfield,&#8221; he says. &#8220;What a left field.&#8221; He looks up at me and I look down at him. &#8220;This must be heaven,&#8221; he says.<br><br>&#8220;No. It&#8217;s Iowa,&#8221; I reply automatically. But then I feel the night rubbing softly against my face like cherry blossoms; look at the sleeping girl-child in my arms, her small hand curled around one of my fingers; think of the fierce warmth of the woman waiting for me in the house; inhale the fresh-cut grass smell that seems locked in the air like permanent incense; and listen to the drone of the crowd, as below me, Shoeless Joe Jackson tenses, watching the angle of the distant bat for a clue as to where the ball will be hit.<br><br>&#8220;I think you&#8217;re right, Joe,&#8221; I say, but softly enough not to disturb his concentration.</p>
</blockquote>



<p class="wp-block-paragraph">For me, heaven would be a university, populated with countless interesting people: students, professors, faculty, staff. It would not be a place where, after a life well-lived, I would suddenly learn the meaning of the universe in one divine blast of knowledge. Rather, it would be a place where I could mingle with others, drink a pint at a local pub, watch a band, and in between, have limitless time to learn, to read, to study, and to slowly and carefully seek out the secrets of the universe for all eternity.</p>



<p class="wp-block-paragraph">Of course, this is what I do every day right here on Earth. It is what I have done ever since I picked up a copy of <em><a href="https://jamierubin.net/2021/01/18/backyard-astronomy-1979/">The Nine Planets</a></em> at the Franklin Township Public Library when I was five or six years old. It is why <a href="https://jamierubin.net/2025/02/25/my-hobby-is-learning/">my hobby is learning</a>. It is why it is a hobby that I will never give up. It is why, despite the turmoil and trying times we live through, I see life very much like a university, populated with countless interesting people, a place where I can mingle with others, drink a pint at a local pub, watch a band, and in between, learn, read, study, and slowly, carefully, seek out the secrets of the universe. Would that I had the limitless time.</p>


<ol class="wp-block-footnotes"><li id="4697964c-7581-4ecb-a329-fc715bc8a81f">This was long before MP3s, iPods, or streaming media. <a href="#4697964c-7581-4ecb-a329-fc715bc8a81f-link" aria-label="Jump to footnote reference 1"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/21a9.png" alt="↩" class="wp-smiley" style="height: 1em; max-height: 1em;" />︎</a></li><li id="de153714-07e2-4201-bf82-17ea3f4c4607">The film <em>Field of Dreams</em> is based on this book. <a href="#de153714-07e2-4201-bf82-17ea3f4c4607-link" aria-label="Jump to footnote reference 2"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/21a9.png" alt="↩" class="wp-smiley" style="height: 1em; max-height: 1em;" />︎</a></li></ol>]]></content:encoded>
					
					<wfw:commentRss>https://jamierubin.net/2026/05/25/my-blue-heaven/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">27606</post-id>	</item>
		<item>
		<title>A Prolific Morning</title>
		<link>https://jamierubin.net/2026/05/24/prolific-morning/</link>
					<comments>https://jamierubin.net/2026/05/24/prolific-morning/#respond</comments>
		
		<dc:creator><![CDATA[Jamie Todd Rubin]]></dc:creator>
		<pubDate>Sun, 24 May 2026 12:00:00 +0000</pubDate>
				<category><![CDATA[Personal & Family]]></category>
		<category><![CDATA[essay]]></category>
		<guid isPermaLink="false">https://jamierubin.net/?p=27599</guid>

					<description><![CDATA[My girls convinced me to take them to a local Barnes &#38; Noble because they were searching for something called a Squeezy Squishy toy &#8212; or something like that. So we braved the unseasonably cool, dreary, rainy weather to drive the short distance to our nearby B&#38;N so that they could seek out their treasure. [&#8230;]]]></description>
										<content:encoded><![CDATA[
<p class="wp-block-paragraph">My girls convinced me to take them to a local Barnes &amp; Noble because they were searching for something called a Squeezy Squishy toy &#8212; or something like that. So we braved the unseasonably cool, dreary, rainy weather to drive the short distance to our nearby B&amp;N so that they could seek out their treasure.</p>



<p class="wp-block-paragraph">As they scoured the store for any signs of Squeezy Squishy things, I wandered around, looking at the various tables of books. I couldn&#8217;t recall the last time I&#8217;d been in a Barnes &amp; Noble, but as I browsed through the various themed tables, I noticed something: at every table I looked at, no matter the theme, there was at least one book that I had read. At some tables, there were three or four. I pointed this out to the girls. They pointed to a random shelf (World History) and asked if I&#8217;d read anything there. I immediately pointed to <em>Churchill</em> by Andrew Roberts. It was a bit of an odd, yet oddly satisfying experience.</p>



<p class="wp-block-paragraph">It was a less satisfying experience for the girls. There were no Squeezy Squishy things to be had in Barnes &amp; Noble.</p>



<p class="wp-block-paragraph">I&#8217;ve always admired prolific people, whatever their profession. For a time, I was even mildly prolific writing here on the blog. But I was never a prolific writer of fiction, for instance. It would take me a month or more to write a story, and that is a non-starter when it comes to being prolific. Yet somehow, slowly, but steadily, I became a prolific reader. This pleases me, as silly as it may seem. I&#8217;ve written before that <a href="https://jamierubin.net/2025/02/25/my-hobby-is-learning/">my hobby is learning</a>. What pleases me even more is the range of my reading. No matter what area of the store I wandered through, there was something I&#8217;d read.</p>



<p class="wp-block-paragraph">Once we realized there were no Squeezy Squishy things at Barnes &amp; Noble, we headed home.</p>



<p class="wp-block-paragraph">And yet, the story has a coda. Being prolific runs in the genes. The girls are prolific shoppers. They located Squeezy Squishy things at a nearby Target. So I drove them over there and the store indeed had the Squeezy Squishy things, stored in a cabinet at guest services. Mission accomplished. With that success in hand, I took the girls to lunch at a diner next door.</p>



<p class="wp-block-paragraph">It was a prolific morning. </p>
]]></content:encoded>
					
					<wfw:commentRss>https://jamierubin.net/2026/05/24/prolific-morning/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">27599</post-id>	</item>
		<item>
		<title>Offices, Then and Now</title>
		<link>https://jamierubin.net/2026/05/23/offices-then-and-now/</link>
					<comments>https://jamierubin.net/2026/05/23/offices-then-and-now/#comments</comments>
		
		<dc:creator><![CDATA[Jamie Todd Rubin]]></dc:creator>
		<pubDate>Sat, 23 May 2026 13:00:29 +0000</pubDate>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[essay]]></category>
		<guid isPermaLink="false">https://jamierubin.net/?p=27592</guid>

					<description><![CDATA[As I wrapped up my work yesterday, I looked around my home office and flashed back to my home office 30 years earlier. My &#8220;office&#8221; back then was a desk in a small one-bedroom Studio City apartment, with a single 3-shelf bookcase. Today, my office is a room off the living room, a former sun [&#8230;]]]></description>
										<content:encoded><![CDATA[
<p class="wp-block-paragraph">As I wrapped up my work yesterday, I looked around my home office and flashed back to my home office 30 years earlier. My &#8220;office&#8221; back then was a desk in a small one-bedroom Studio City apartment, with a single 3-shelf bookcase. Today, my office is a room off the living room, a former sun room with a dozen windows overlooking front, back and side yards. I stand writing this essay at a sit-stand desk, and instead of 1 small bookshelf, all four sides of my office are filled by overflowing bookcases.</p>



<p class="wp-block-paragraph">The most remarkable thing, however, is the change in technology that has taken place over that three-decade span. I stood at my desk last night, listening to Elvis Costello on a Bose Bluetooth speaker that sat on a shelf to my right, the music streaming from my iPhone, which stood on a wireless charger on the opposite side of my desk. Instead of a single, bulky CRT on my desk, four flat screens sweep across the surface of my desk today: a MacBook Pro connected to two large external monitors, and my MacBook Air, all of which work in concert. The nest of wires that used to accumulate across my desk has been replaced by wireless keyboards and trackpads.</p>



<p class="wp-block-paragraph">And the software! I remember installing Microsoft Office on something like 26 3-1/2-inch floppy disks. Now, I can download and install a much more advanced Office in far less time than I could install it from those floppy disks. And with tools like Claude and Claude Code, I have a personal assistant to handle routine things for me, leaving me more time to focus on the things that I am most interested in.</p>



<p class="wp-block-paragraph">Still, despite all of these changes, some things don&#8217;t change much. I am writing this essay in Vim, which I used back then, and which I have used off-and-on ever since. The muscle memory of the keyboard commands has become second nature. I still spend the vast majority of my time in Unix command line terminals, or in similar LLM chats. In some sense, with the advent of LLMs, UXs have started moving back toward the command line, an irony, which, as a command line enthusiast, I relish.</p>



<p class="wp-block-paragraph">And while I can simply ask Siri to define or spell a word for me, I prefer to reach for one of the ten dictionaries that sit on a shelf beside my desk. And while streaming music is convenient, and I do it frequently, right here on a cabinet beside me, and sitting next to my ScanSnap ix1300 scanner, is a Victrola turntable, which my son got me as a present, and for which I have started to amass a small collection of vinyl, and which I much prefer for listening to things like Miles Davis&#8217; <em>Kind of Blue</em> or the Eagles&#8217; <em>Hotel California</em>.</p>



<p class="wp-block-paragraph">One thing that old office had that my current office lacks is a cabinet full of file folders. Over the three-decade span, my files have made their way into the digital realm, slowly at first, then at a more accelerated pace in the early 2010s when I <a href="https://jamierubin.net/going-paperless/">adopted Evernote</a>. When I gave up on Evernote in the early 2020s, <a href="https://jamierubin.net/blog-series/practically-paperless-with-obsidian/">I moved a lot of things into Obsidian</a>, but I found that Obsidian was more of a tool for writing than for managing an archive. So after searching around and looking at tools like DevonThink, among others, I finally built my own archive, first testing things out in what I call my Personal Archive System (PAS), and then really studying the problem the way an archivist would, and with the help of Claude Code, building a new and sophisticated command-line system called <code>ark</code>.</p>



<p class="wp-block-paragraph">Today, <code>ark</code> hosts my entire archive, more than 670,000 items, including every email I&#8217;ve sent or received since late 1994, digital documents going back to 1992, and scanned documents going back to my birth, as well as a pipeline of continually updated data for capturing browser history, CLI history, my entire social media history, photos, calendars, diaries, blog posts, comments, reading history, annotations, and Apple Health data. You name it, it is in the archive. The archive has a sophisticated search capability that allows not only full-text search, but embedded searches<sup data-fn="b695293d-88b6-456e-ae45-aad70c1cc4bb" class="fn"><a href="#b695293d-88b6-456e-ae45-aad70c1cc4bb" id="b695293d-88b6-456e-ae45-aad70c1cc4bb-link">1</a></sup> (every item gets an embedding, every photo gets an LLM description and embedding). Because I&#8217;ve created an MCP server<sup data-fn="bcf82055-2340-44a7-9d20-36791dc05dd1" class="fn"><a href="#bcf82055-2340-44a7-9d20-36791dc05dd1" id="bcf82055-2340-44a7-9d20-36791dc05dd1-link">2</a></sup> for the archive, I can &#8220;chat&#8221; with the archive using Claude. And I&#8217;ve built an egress layer to prevent sensitive information from leaving my local machine (for sensitive data, embeddings and LLM activities use a local LLM<sup data-fn="f6dade71-56c0-4fff-bc23-a8e82b9cb790" class="fn"><a href="#f6dade71-56c0-4fff-bc23-a8e82b9cb790" id="f6dade71-56c0-4fff-bc23-a8e82b9cb790-link">3</a></sup>).</p>



<p class="wp-block-paragraph">This is a big difference from my office of 30 years ago. Instead of hunting through paper documents, or hacking my way through files in the file system, I can type a simple command on the command line, or even ask Claude, &#8220;What posts have I written about my desk?&#8221; and it does all of the grunt work for me. That, at least for me, is a change for the better.</p>



<p class="wp-block-paragraph">P.S.: If you are curious about Claude&#8217;s answer to that question, here it is:</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p class="wp-block-paragraph">I can see the blog post titles clearly from the social media announcements. Here&#8217;s what turned up — looks like you&#8217;ve written at least four dedicated desk posts:</p>



<ol class="wp-block-list">
<li>&#8220;<a href="https://jamierubin.net/2010/09/14/my-desk-annotated/">My desk, annotated</a>&#8221; (September 2010) — the earliest one, with an annotated look at your desk setup.</li>



<li>&#8220;<a href="https://jamierubin.net/2019/05/16/my-cluttered-desk/">My Cluttered Desk</a>&#8221; (May 2019) — a post about desk clutter.</li>



<li>&#8220;<a href="https://jamierubin.net/2021/04/13/the-desk-and-the-desktop-musings-on-productivity-part-1/">The Desk and The Desktop: Musings on Productivity, Part 1</a>&#8221; (April 2021) — a more reflective productivity piece.</li>



<li>&#8220;<a href="https://jamierubin.net/2021/07/07/a-newer-healthier-desk-for-my-office/">A Newer Healthier Desk for My Office</a>&#8221; (July 2021) — likely tied to the L-shaped electric standing desk you bought from Amazon in June 2021.</li>



<li>&#8220;<a href="https://jamierubin.net/2021/08/10/a-cluttered-desk/">A Cluttered Desk</a>&#8221; (August 2021) — another clutter-themed post, just a month later.</li>
</ol>



<p class="wp-block-paragraph">There&#8217;s also a tweet from August 2021 quoting the Einstein line about cluttered desks and cluttered minds, which may have been a weekly recap referencing one of those posts.</p>



<p class="wp-block-paragraph">Want me to pull up the full content of any of these?</p>
</blockquote>



<p class="wp-block-paragraph">I told Claude it didn&#8217;t need to do that. I&#8217;ll just link to them here instead.</p>


<ol class="wp-block-footnotes"><li id="b695293d-88b6-456e-ae45-aad70c1cc4bb">An embedded search is a way to find things by <em>meaning</em> as opposed to just pattern matching. <a href="#b695293d-88b6-456e-ae45-aad70c1cc4bb-link" aria-label="Jump to footnote reference 1"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/21a9.png" alt="↩" class="wp-smiley" style="height: 1em; max-height: 1em;" />︎</a></li><li id="bcf82055-2340-44a7-9d20-36791dc05dd1">MCP (Model Context Protocol) is a way of exposing <code>ark</code> commands to an LLM like Claude. The LLM can then use my tool to answer questions. <a href="#bcf82055-2340-44a7-9d20-36791dc05dd1-link" aria-label="Jump to footnote reference 2"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/21a9.png" alt="↩" class="wp-smiley" style="height: 1em; max-height: 1em;" />︎</a></li><li id="f6dade71-56c0-4fff-bc23-a8e82b9cb790">I use LM Studio for hosting local models, <code>ark</code> uses llama-3.2-3b-instruct for local text-based LLM activity, and it uses qwen3-vl-8b for vision (getting descriptions from photos, documents, etc.) <a href="#f6dade71-56c0-4fff-bc23-a8e82b9cb790-link" aria-label="Jump to footnote reference 3"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/21a9.png" alt="↩" class="wp-smiley" style="height: 1em; max-height: 1em;" />︎</a></li></ol>]]></content:encoded>
					
					<wfw:commentRss>https://jamierubin.net/2026/05/23/offices-then-and-now/feed/</wfw:commentRss>
			<slash:comments>6</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">27592</post-id>	</item>
		<item>
		<title>30 Years of My Diary</title>
		<link>https://jamierubin.net/2026/04/06/30-years-of-my-diary/</link>
					<comments>https://jamierubin.net/2026/04/06/30-years-of-my-diary/#comments</comments>
		
		<dc:creator><![CDATA[Jamie Todd Rubin]]></dc:creator>
		<pubDate>Mon, 06 Apr 2026 11:33:08 +0000</pubDate>
				<category><![CDATA[Personal & Family]]></category>
		<category><![CDATA[essay]]></category>
		<guid isPermaLink="false">https://jamierubin.net/?p=27452</guid>

					<description><![CDATA[Isaac Asimov died 34 years ago today. 30 years ago today, inspired by Asimov’s lifelong habit of keeping a diary, I started my own diary habit, one that continues today. I woke up this morning and headed into my office, looking at the shelf of reference books beside my desk. In addition to half a [&#8230;]]]></description>
										<content:encoded><![CDATA[
<p class="wp-block-paragraph">Isaac Asimov died 34 years ago today. 30 years ago today, inspired by Asimov’s lifelong habit of keeping a diary, I started my own diary habit, one that continues today. I woke up this morning and headed into my office, looking at the shelf of reference books beside my desk. In addition to half a dozen dictionaries and other reference books, there are 43 notebooks of various kinds representing my handwritten diaries going back to April 6, 1996.</p>




<p class="wp-block-paragraph">I pulled out the very first diary, an Avery Record Book which I got in an office supply store in Sherman Oaks, California. It’s a slim volume because I was counting my pennies back then. I turned to the first page and read what I had written.</p>




<div class="wp-block-image"><figure class="aligncenter"><img data-recalc-dims="1" decoding="async" width="3949" height="5142" data-attachment-id="27451" data-permalink="https://jamierubin.net/img_0140/" data-orig-file="https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/04/IMG_0140.jpeg?fit=3949%2C5142&amp;ssl=1" data-orig-size="3949,5142" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;1.6&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;iPhone 15&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;1775459680&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;5.96&quot;,&quot;iso&quot;:&quot;640&quot;,&quot;shutter_speed&quot;:&quot;0.025&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;1&quot;}" data-image-title="My First Diary Volume" data-image-description="" data-image-caption="" data-large-file="https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/04/IMG_0140.jpeg?fit=550%2C716&amp;ssl=1" src="https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/04/IMG_0140.jpeg?resize=3949%2C5142&#038;ssl=1" alt="The first volume of my diary" class="wp-image-27451" srcset="https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/04/IMG_0140.jpeg?w=3949&amp;ssl=1 3949w, https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/04/IMG_0140.jpeg?resize=400%2C521&amp;ssl=1 400w, https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/04/IMG_0140.jpeg?resize=550%2C716&amp;ssl=1 550w, https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/04/IMG_0140.jpeg?resize=768%2C1000&amp;ssl=1 768w, https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/04/IMG_0140.jpeg?resize=1180%2C1536&amp;ssl=1 1180w, https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/04/IMG_0140.jpeg?resize=1573%2C2048&amp;ssl=1 1573w, https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/04/IMG_0140.jpeg?resize=1200%2C1563&amp;ssl=1 1200w, 
https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/04/IMG_0140.jpeg?w=2000&amp;ssl=1 2000w, https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/04/IMG_0140.jpeg?w=3000&amp;ssl=1 3000w" sizes="(max-width: 1000px) 100vw, 1000px" /><figcaption>The first volume of my diary</figcaption></figure></div>


<p class="wp-block-paragraph">The very first thing I mentioned in that first entry was that I had finished reading <em>I. Asimov</em>, Isaac Asimov’s posthumously published memoir (a retrospective follow-up to his massive 1980 double-volume autobiography). This was my 3rd time reading the book. I also mentioned starting to read Robert Heinlein’s <em>Double Star</em>, which, it turned out, became one of my favorite Heinlein books. I mentioned seeing the movie <em>Babe</em>. And I concluded the entry with a renewed desire to write science fiction stories and, someday, see them published.</p>




<p class="wp-block-paragraph">I have been working slowly over the last year or so to transcribe and digitize these diaries as part of my personal digital archive. The archive now contains digitized schoolwork and artwork going back to kindergarten, and email going back to October 18, 1994—even before my diary. Indeed, as of this morning, the archive contains more than a quarter of a million messages. By the time my diary is fully digitized, I’d expect there to be about 10,000 entries. I also have more than 7,000 blog posts archived there as well, making the archive a comprehensive collection of my output during my lifetime.</p>




<p class="wp-block-paragraph">There is another coincidence today: The <em>Artemis II</em> / <em>Orion</em> spacecraft and crew will surpass Apollo 13’s distance record from Earth later today, and head behind the moon, the first time humans have done so in 54 years. I’ve been following the mission assiduously, and I can’t help but think that Isaac Asimov would have been delighted by this mission. Even more incredible than 30 years of diary entries is the fact that once again, we’ve gone to the moon.</p>




]]></content:encoded>
					
					<wfw:commentRss>https://jamierubin.net/2026/04/06/30-years-of-my-diary/feed/</wfw:commentRss>
			<slash:comments>8</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">27452</post-id>	</item>
		<item>
		<title>The New Typewriter</title>
		<link>https://jamierubin.net/2026/01/08/the-new-typewriter/</link>
					<comments>https://jamierubin.net/2026/01/08/the-new-typewriter/#comments</comments>
		
		<dc:creator><![CDATA[Jamie Todd Rubin]]></dc:creator>
		<pubDate>Thu, 08 Jan 2026 13:23:33 +0000</pubDate>
				<category><![CDATA[Technology & Gadgets]]></category>
		<category><![CDATA[essay]]></category>
		<guid isPermaLink="false">https://jamierubin.net/?p=27279</guid>

					<description><![CDATA[I bought for myself, as a belated holiday gift, a brand new Royal Scriptor II Portable Electric Typewriter. I’ve been yearning for a working typewriter for at least a year now. What finally pushed me to make the purchase was my recent realization that I’ve spent at least a third of my life for the [&#8230;]]]></description>
										<content:encoded><![CDATA[
<p class="wp-block-paragraph">I bought for myself, as a belated holiday gift, a brand new Royal Scriptor II Portable Electric Typewriter. I’ve been yearning for a working typewriter<sup data-fn="73aadaed-e366-411e-9aee-1e8b040712c9" class="fn"><a href="#73aadaed-e366-411e-9aee-1e8b040712c9" id="73aadaed-e366-411e-9aee-1e8b040712c9-link">1</a></sup> for at least a year now. What finally pushed me to make the purchase was my <a href="https://jamierubin.net/2025/12/28/screen-weary-to-screen-wary/">recent realization</a> that I’ve spent <em>at least</em> a third of my life for the last 34 years looking at screens. Looking for ways to cut back, a typewriter seemed an ideal compromise: one that would allow me to write without always having to be looking at a screen<sup data-fn="26a8e37e-e381-4a71-9014-dcfd51513d76" class="fn"><a href="#26a8e37e-e381-4a71-9014-dcfd51513d76" id="26a8e37e-e381-4a71-9014-dcfd51513d76-link">2</a></sup>.</p>



<p class="wp-block-paragraph">The new typewriter arrived yesterday and what a joy it is!</p>



<figure class="wp-block-image aligncenter"><img data-recalc-dims="1" loading="lazy" decoding="async" width="4032" height="3024" data-attachment-id="27278" data-permalink="https://jamierubin.net/img_4874/" data-orig-file="https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/01/IMG_4874.jpeg?fit=4032%2C3024&amp;ssl=1" data-orig-size="4032,3024" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;1.6&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;iPhone 13&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;1767858517&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;5.1&quot;,&quot;iso&quot;:&quot;250&quot;,&quot;shutter_speed&quot;:&quot;0.016666666666667&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;1&quot;}" data-image-title="The New Typewriter" data-image-description="" data-image-caption="" data-large-file="https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/01/IMG_4874.jpeg?fit=550%2C413&amp;ssl=1" src="https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/01/IMG_4874.jpeg?resize=4032%2C3024&#038;ssl=1" alt="" class="wp-image-27278" srcset="https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/01/IMG_4874.jpeg?w=4032&amp;ssl=1 4032w, https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/01/IMG_4874.jpeg?resize=400%2C300&amp;ssl=1 400w, https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/01/IMG_4874.jpeg?resize=550%2C413&amp;ssl=1 550w, https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/01/IMG_4874.jpeg?resize=768%2C576&amp;ssl=1 768w, https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/01/IMG_4874.jpeg?resize=1536%2C1152&amp;ssl=1 1536w, https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/01/IMG_4874.jpeg?resize=2048%2C1536&amp;ssl=1 2048w, https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/01/IMG_4874.jpeg?resize=1200%2C900&amp;ssl=1 1200w, 
https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/01/IMG_4874.jpeg?resize=800%2C600&amp;ssl=1 800w, https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/01/IMG_4874.jpeg?resize=600%2C450&amp;ssl=1 600w, https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/01/IMG_4874.jpeg?resize=200%2C150&amp;ssl=1 200w, https://i0.wp.com/jamierubin.net/wp-content/uploads/2026/01/IMG_4874.jpeg?w=3000&amp;ssl=1 3000w" sizes="auto, (max-width: 1000px) 100vw, 1000px" /><figcaption class="wp-element-caption">My new Royal Scriptor II</figcaption></figure>



<p class="wp-block-paragraph">Admittedly, I originally set out to get a manual typewriter, but those seemed much harder to come by than a new Royal. Besides, I wanted a typewriter that was in working condition, not for show. As it turns out, the model that I got has some advanced features, including a screen on which you can see what you’re typing before it is typed on the page, so you can make corrections. <strong>I have avoided this feature so far</strong>. After spending about an hour learning the basics, I obtained a set of blank postcards, and set about writing half a dozen notes to friends and family on my new machine. I made two surprising discoveries:</p>



<ol class="wp-block-list">
<li>I am not nearly as good a typist as I think I am when editing capabilities and auto-correct are taken out of the picture. I type fast, but I make a typo every 3 or 4 lines of text. This discovery is a blessing in disguise: by using the typewriter, I think I can make myself a <em>better</em> typist.</li>



<li>When I get into a rhythm, with the staccato RAT-TAT-TAT of my typing as background music, I get what I can only describe as an endorphin rush that spurs me on, and is like nothing I’ve ever felt typing on a computer keyboard and watching the words appear on a screen.<br></li>
</ol>



<p class="wp-block-paragraph">There are a few useful features that I’ve already made use of. The device has a whopping 16K of text memory, which can be used to store text files, or format settings. I’ve used it for the latter, setting up my margins and tabs for standard letter paper, as well as 4&#215;6 postcards. This makes it easy to switch formatting settings based on my task. The typewriter can produce underlined text, bold text, and 3 different text pitches. It can auto-center a line, and right-justify a line. It has word correction features that make use of the correction tape, but I haven’t played around with those yet, preferring the old-fashioned X-ing out of my mistakes. This gives me visual cues of my progress over time.</p>



<p class="wp-block-paragraph">I’m excited to put my new toy to more use. The feeling I got when I used it yesterday was just wonderful. As I sit here typing these words into my laptop, I am imagining typing them into my new typewriter. It is not the same.</p>



<p class="has-text-align-right wp-block-paragraph">— Arlington, Virginia</p>


<ol class="wp-block-footnotes"><li id="73aadaed-e366-411e-9aee-1e8b040712c9">I have a manual Royal typewriter circa 1950, which I inherited from my grandfather, but it is no longer in working condition. <a href="#73aadaed-e366-411e-9aee-1e8b040712c9-link" aria-label="Jump to footnote reference 1"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/21a9.png" alt="↩" class="wp-smiley" style="height: 1em; max-height: 1em;" />︎</a></li><li id="26a8e37e-e381-4a71-9014-dcfd51513d76">Someone will note that I could also simply write longhand and at a significantly lower cost. The fact is I do write longhand, every day, in my diaries. That gives my hand and wrist enough of a workout. <a href="#26a8e37e-e381-4a71-9014-dcfd51513d76-link" aria-label="Jump to footnote reference 2"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/21a9.png" alt="↩" class="wp-smiley" style="height: 1em; max-height: 1em;" />︎</a></li></ol>]]></content:encoded>
					
					<wfw:commentRss>https://jamierubin.net/2026/01/08/the-new-typewriter/feed/</wfw:commentRss>
			<slash:comments>8</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">27279</post-id>	</item>
	</channel>
</rss>
