<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:blogger='http://schemas.google.com/blogger/2008' xmlns:georss='http://www.georss.org/georss' xmlns:gd="http://schemas.google.com/g/2005" xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-33534782</id><updated>2026-04-03T13:48:00.351+02:00</updated><category term="linux"/><category term="configuration/usage"/><category term="software"/><category term="评论和记事"/><category term="Windows"/><category term="Tinker"/><category term="hardware"/><category term="code"/><category term="network"/><category term="game"/><category term="firefox"/><category term="looking"/><category term="resource"/><category term="web"/><category term="TeX/LaTeX"/><category term="c/c++"/><category term="blog"/><category term="font"/><category term="security"/><category term="Flash"/><category term="color"/><category term="css"/><category term="sysadmin"/><category term="Experiment"/><category term="anti-virus"/><category term="python"/><category term="bootc"/><category term="mathematics"/><category term="Study"/><category term="crack"/><category term="immutable"/><category term="programming"/><category term="vim"/><category term="Visual C++"/><category term="backup"/><category term="javascript"/><category term="qubes os"/><category term="NixOS"/><category term="podman"/><category term="Thoughts"/><category term="ZFS"/><category term="asm"/><category term="blender"/><category term="c#"/><category term="cheat"/><category term="declarative"/><category term="esp32s3"/><category term="gameconqueror"/><category term="gmail"/><category term="ipad"/><category term="perl"/><category term="privacy"/><category term="selinux"/><category term="systemd"/><category term="ubuntu"/><category term="webvfx"/><category term="LXC"/><category term="email"/><category 
term="esp32"/><category term="google"/><title type='text'>WangLu&#39;s Notes</title><subtitle type='html'>久病成医 | Prolonged Illness Makes the Patient a Good Doctor</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://blog.wang-lu.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default?redirect=false'/><link rel='alternate' type='text/html' href='http://blog.wang-lu.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><link rel='next' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default?start-index=26&amp;max-results=25&amp;redirect=false'/><author><name>Lu Wang</name><uri>http://www.blogger.com/profile/00576609954224192924</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>415</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-33534782.post-988237530339696866</id><published>2026-03-26T00:13:00.003+01:00</published><updated>2026-03-26T00:13:56.285+01:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="sysadmin"/><category scheme="http://www.blogger.com/atom/ns#" term="评论和记事"/><title type='text'>Solving LVM Detection Failures in GRUB After a Force Shutdown</title><content type='html'>&lt;p &gt;After a routine system update and an unfortunate hang that required a hard reset, my Linux machine refused to boot. Instead of the familiar login prompt, I was greeted by a cryptic GRUB error:&amp;nbsp;&lt;code &gt;error: no cryptodisk module can handle this device.&lt;/code&gt;&lt;/p&gt;&lt;p &gt;My setup uses LUKS2 + LVM. 
From the GRUB rescue shell, I could actually decrypt the LUKS container. But once decrypted, GRUB completely failed to detect any LVM volumes. It simply acted as if the LVM structure didn&#39;t exist.&lt;/p&gt;&lt;p &gt;Meanwhile, when I booted from a live rescue USB, everything worked perfectly. I could open the LUKS container, and the volume group appeared immediately. Tools like&amp;nbsp;&lt;code &gt;vgck&lt;/code&gt;&amp;nbsp;and&amp;nbsp;&lt;code &gt;pvck&lt;/code&gt;&amp;nbsp;reported no issues.&lt;/p&gt;&lt;p &gt;After a lengthy discussion with an AI, I eventually found the magic commands:&lt;/p&gt;&lt;pre &gt;&lt;code &gt;vgcfgbackup -f lvm_backup.txt &amp;lt;vg_name&amp;gt;
vgcfgrestore -f lvm_backup.txt &amp;lt;vg_name&amp;gt;
&lt;/code&gt;&lt;/pre&gt;&lt;p &gt;After running these commands and rebooting, GRUB recognized the LVM volumes immediately, and I was back in my system.&lt;/p&gt;&lt;p &gt;Supposedly, this forces LVM to rewrite its metadata. The forced shutdown may have left the metadata in a state that the full LVM parser in Linux can handle but the more limited implementation in GRUB cannot.&lt;/p&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/33534782/988237530339696866' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/988237530339696866'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/988237530339696866'/><link rel='alternate' type='text/html' href='http://blog.wang-lu.com/2026/03/solving-lvm-detection-failures-in-grub.html' title='Solving LVM Detection Failures in GRUB After a Force Shutdown'/><author><name>Lu Wang</name><uri>http://www.blogger.com/profile/00576609954224192924</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-33534782.post-3938930141742382776</id><published>2026-03-12T20:17:00.006+01:00</published><updated>2026-03-12T20:17:39.701+01:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="configuration/usage"/><category scheme="http://www.blogger.com/atom/ns#" term="linux"/><category scheme="http://www.blogger.com/atom/ns#" term="qubes os"/><category scheme="http://www.blogger.com/atom/ns#" term="Windows"/><title type='text'>Notes on a Tricky Linux Installation: Qubes OS and Windows</title><content type='html'>&lt;p &gt;I recently tried to install Qubes OS alongside an existing Windows installation. 
It turned out to be surprisingly difficult, way harder than my last attempt, likely due to a combination of my encrypted&amp;nbsp;&lt;code &gt;/boot&lt;/code&gt;&amp;nbsp;setup and older hardware. Here are some notes from the process.&lt;/p&gt;&lt;h1 &gt;Shrinking an NTFS Volume&lt;/h1&gt;&lt;p &gt;I needed to free up some space from a Windows NTFS volume. Normally, this just takes a few clicks in Disk Management. This time, however, Windows reported a &quot;shrinkable volume&quot; that was suspiciously small.&lt;/p&gt;&lt;p &gt;Following&amp;nbsp;&lt;a data-href=&quot;https://superuser.com/a/1060508&quot; href=&quot;https://superuser.com/a/1060508&quot; &gt;this answer&lt;/a&gt;, I tried the standard fixes:&lt;/p&gt;&lt;ul &gt;&lt;li &gt;Disabled hibernation (&lt;code &gt;powercfg /h off&lt;/code&gt;)&lt;/li&gt;&lt;li &gt;Disabled the pagefile&lt;/li&gt;&lt;li &gt;Disabled system protection&lt;/li&gt;&lt;/ul&gt;&lt;p &gt;This increased the shrinkable volume a bit, but nowhere near the actual free space left on the partition.&lt;/p&gt;&lt;p &gt;Digging into the Windows Application logs in Event Viewer, I finally found the culprit:&amp;nbsp;&lt;code &gt;The last unmovable file appears to be: \$Mft::$DATA&lt;/code&gt;. It turns out&amp;nbsp;&lt;code &gt;$Mft&lt;/code&gt;&amp;nbsp;is the NTFS Master File Table, a metadata file that cannot be easily moved, and a simple defrag wasn&#39;t going to cut it. I tried a few third-party partition managers, but they all failed initially. Following a hint from one of the tools, I temporarily disabled BitLocker. That did the trick: AOMEI Partition Assistant was finally able to shrink the volume. Once it was done, I just had to re-enable everything.&lt;/p&gt;&lt;h1 &gt;Configuring the Display&lt;/h1&gt;&lt;p &gt;I use a dual-monitor setup (let&#39;s call them A and B). The Linux console and GUI installer assumed monitor A was the primary display, leaving monitor B either completely blank (in the terminal) or showing an empty desktop (in the GUI). 
I wanted everything on B, and the usual Super+Arrow Key shortcut wasn&#39;t working.&lt;/p&gt;&lt;p &gt;Here is how I forced the display:&lt;/p&gt;&lt;h2 &gt;Turn Off the Display in the Terminal&lt;/h2&gt;&lt;p &gt;Note: This forces the GUI installer to the correct screen, but doesn&#39;t change the terminal itself.&lt;/p&gt;&lt;ol &gt;&lt;li &gt;Switch to the terminal (&lt;code &gt;Ctrl&lt;/code&gt;&amp;nbsp;+&amp;nbsp;&lt;code &gt;Alt&lt;/code&gt;&amp;nbsp;+&amp;nbsp;&lt;code &gt;F2&lt;/code&gt;).&lt;/li&gt;&lt;li &gt;Find the &quot;bad&quot; display in&amp;nbsp;&lt;code &gt;/sys/class/drm/&lt;/code&gt;.&lt;/li&gt;&lt;li &gt;Run&amp;nbsp;&lt;code &gt;echo off &amp;gt; /sys/class/drm/card0-&amp;lt;DEVICE_NAME&amp;gt;/status&lt;/code&gt;.&lt;ul &gt;&lt;li &gt;Example:&amp;nbsp;&lt;code &gt;echo off &amp;gt; /sys/class/drm/card0-HDMI-A-1/status&lt;/code&gt;&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;li &gt;Switch back to the GUI (&lt;code &gt;Ctrl&lt;/code&gt;&amp;nbsp;+&amp;nbsp;&lt;code &gt;Alt&lt;/code&gt;&amp;nbsp;+&amp;nbsp;&lt;code &gt;F6&lt;/code&gt;). The installer should now be forced onto the only &quot;active&quot; screen.&lt;/li&gt;&lt;/ol&gt;&lt;h2 &gt;Turn Off the Display via Kernel Parameters&lt;/h2&gt;&lt;ol &gt;&lt;li &gt;First, find the device/port name by running:&amp;nbsp;&lt;code &gt;ls -d /sys/class/drm/card*-*&lt;/code&gt;. 
Examples:&lt;ul &gt;&lt;li &gt;&lt;code &gt;/sys/class/drm/card0-DP-1&lt;/code&gt;&amp;nbsp;(Integrated)&lt;/li&gt;&lt;li &gt;&lt;code &gt;/sys/class/drm/card1-HDMI-A-1&lt;/code&gt;&amp;nbsp;(Discrete)&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;li &gt;Add something like&amp;nbsp;&lt;code &gt;video=eDP-1:d&lt;/code&gt;&amp;nbsp;to the kernel parameters, i.e. the connector name without the&amp;nbsp;&lt;code &gt;cardN-&lt;/code&gt;&amp;nbsp;prefix, followed by&amp;nbsp;&lt;code &gt;:d&lt;/code&gt;&amp;nbsp;to disable that output.&lt;/li&gt;&lt;li &gt;This gets more complicated when the ports are the same but the cards are different, though I haven&#39;t tested that scenario.&lt;/li&gt;&lt;/ol&gt;&lt;h1 &gt;GRUB and Encrypted&amp;nbsp;&lt;code &gt;/boot&lt;/code&gt;&lt;/h1&gt;&lt;p &gt;GRUB 2.12 supports LUKS2, but it doesn&#39;t support the Argon2 hashing algorithm; that didn&#39;t arrive until GRUB 2.14.&lt;/p&gt;&lt;p &gt;While GRUB 2.14 works perfectly on my newer machine, it refused to boot properly on this older one. After hours of troubleshooting, I realized that although GRUB 2.14 could boot directly into Linux, it failed completely when trying to boot through Xen. Suspecting the issue lay somewhere between GRUB and Xen, I eventually gave up and downgraded to GRUB 2.12, changing my LUKS partition to use PBKDF2 instead of Argon2.&lt;/p&gt;&lt;p &gt;Another quirk: GRUB&#39;s decryption implementation feels about 100 times slower than&amp;nbsp;&lt;code &gt;cryptsetup&lt;/code&gt;. It likely lacks hardware acceleration for decryption, and the&amp;nbsp;&lt;code &gt;-A&lt;/code&gt;&amp;nbsp;parameter isn&#39;t available for the&amp;nbsp;&lt;code &gt;cryptomount&lt;/code&gt;&amp;nbsp;command in my version of GRUB. To keep boot times reasonable, I had to decrease the number of PBKDF2 iterations in LUKS.&lt;/p&gt;&lt;h1 &gt;LVM Issues&lt;/h1&gt;&lt;p &gt;Xen and Linux finally booted, but the celebration was cut short. 
The boot process stalled, complaining that the&amp;nbsp;&lt;code &gt;/dev/mapper/boot&lt;/code&gt;&amp;nbsp;device timed out, and it wouldn&#39;t even let me enter the rescue shell.&lt;/p&gt;&lt;p &gt;I fixed the rescue shell issue by booting from a live USB and giving&amp;nbsp;&lt;code &gt;root&lt;/code&gt;&amp;nbsp;a password. The timeout issue, however, was much weirder. It turned out that only the LVM logical volumes specifically listed in the&amp;nbsp;&lt;code &gt;rd.lvm.lv&lt;/code&gt;&amp;nbsp;kernel parameters were being unlocked; the rest were completely invisible. Even running&amp;nbsp;&lt;code &gt;vgs&lt;/code&gt;&amp;nbsp;and&amp;nbsp;&lt;code &gt;lvs&lt;/code&gt;&amp;nbsp;returned empty results.&lt;/p&gt;&lt;p &gt;I was eventually able to recover everything using the&amp;nbsp;&lt;code &gt;vgimportdevices -a&lt;/code&gt;&amp;nbsp;command. Oddly, this created a duplicate LVM entry in the&amp;nbsp;&lt;code &gt;system.devices&lt;/code&gt;&amp;nbsp;file, and manually removing the duplicate broke everything again. I ended up just deleting the file entirely and letting&amp;nbsp;&lt;code &gt;vgimportdevices&lt;/code&gt;&amp;nbsp;recreate it from scratch. It’s been running smoothly ever since.&lt;/p&gt;&lt;h1 &gt;Secure Boot&lt;/h1&gt;&lt;p &gt;Just as I got Qubes OS working with Secure Boot enabled, Windows broke. The UEFI complained that the boot signature wasn&#39;t recognized.&lt;/p&gt;&lt;p &gt;After some debugging, I figured out what happened: I had reset the Secure Boot keys on my motherboard to their factory defaults. Because the laptop is old, the factory defaults only contained the 2011 Microsoft keys. However, my Windows boot loader had been updated and was signed with the newer 2023 key.&lt;/p&gt;&lt;p &gt;I tried copying the Secure Boot database from another machine, but that failed (likely due to PK/KEK mismatch issues). 
After hours of trial and error, I solved it with a combination of the following actions:&lt;/p&gt;&lt;ul &gt;&lt;li &gt;Set Secure Boot to Setup mode.&lt;/li&gt;&lt;li &gt;Reset to factory keys&lt;/li&gt;&lt;li &gt;Update Secure Boot variables via Windows registry and scheduled tasks (&lt;a data-href=&quot;https://support.microsoft.com/en-us/topic/registry-key-updates-for-secure-boot-windows-devices-with-it-managed-updates-a7be69c9-4634-42e1-9ca1-df06f43f360d#bkmk_registry_keys&quot; href=&quot;https://support.microsoft.com/en-us/topic/registry-key-updates-for-secure-boot-windows-devices-with-it-managed-updates-a7be69c9-4634-42e1-9ca1-df06f43f360d#bkmk_registry_keys&quot; &gt;source&lt;/a&gt;)&lt;/li&gt;&lt;/ul&gt;&lt;pre &gt;&lt;code &gt;reg add HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Secureboot /v AvailableUpdates /t REG_DWORD /d &lt;span &gt;0&lt;/span&gt;x5944 /f

&lt;span &gt;Start-ScheduledTask&lt;/span&gt; &lt;span &gt;-TaskName&lt;/span&gt; &lt;span &gt;&quot;\Microsoft\Windows\PI\Secure-Boot-Update&quot;&lt;/span&gt;

&lt;span &gt;# Manually reboot the system when the AvailableUpdates becomes 0x4100&lt;/span&gt;

&lt;span &gt;Start-ScheduledTask&lt;/span&gt; &lt;span &gt;-TaskName&lt;/span&gt; &lt;span &gt;&quot;\Microsoft\Windows\PI\Secure-Boot-Update&quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;ul &gt;&lt;li &gt;Boot&amp;nbsp;&lt;code &gt;SecureBootRecovery.efi&lt;/code&gt;&amp;nbsp;from the Windows EFI partition.&lt;/li&gt;&lt;/ul&gt;&lt;p &gt;I suspect running&amp;nbsp;&lt;code &gt;SecureBootRecovery.efi&lt;/code&gt;&amp;nbsp;was the magic bullet, though the other steps likely set the stage. Surprisingly, this file never came up in my online troubleshooting; I just stumbled across it by accident while browsing the EFI partition.&lt;/p&gt;&lt;h1 &gt;Final Thoughts&lt;/h1&gt;&lt;p &gt;Looking back, I definitely stacked too many tricky components together—older hardware, dual-booting, encrypted&amp;nbsp;&lt;code &gt;/boot&lt;/code&gt;, LVM, and Secure Boot—and hit a bit of bad luck along the way. Fortunately, it all worked out in the end. What a journey!&lt;/p&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/33534782/3938930141742382776' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/3938930141742382776'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/3938930141742382776'/><link rel='alternate' type='text/html' href='http://blog.wang-lu.com/2026/03/notes-on-tricky-linux-installation.html' title='Notes on a Tricky Linux Installation: Qubes OS and Windows'/><author><name>Lu Wang</name><uri>http://www.blogger.com/profile/00576609954224192924</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-33534782.post-9093303707722793815</id><published>2026-03-08T23:56:00.003+01:00</published><updated>2026-03-08T23:56:12.389+01:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="configuration/usage"/><category 
scheme="http://www.blogger.com/atom/ns#" term="linux"/><category scheme="http://www.blogger.com/atom/ns#" term="qubes os"/><category scheme="http://www.blogger.com/atom/ns#" term="security"/><category scheme="http://www.blogger.com/atom/ns#" term="Tinker"/><title type='text'>Refined Boot for Qubes OS: Minimal USB Key, Dual Boot, Secure Boot</title><content type='html'>&lt;p &gt;I&#39;ve been running Qubes OS on Machine A alongside Windows for a while. My setup involved storing the unencrypted&amp;nbsp;&lt;code &gt;/boot&lt;/code&gt;&amp;nbsp;partition and the LUKS header on an external USB drive.&lt;/p&gt;&lt;p &gt;Recently, I planned to install Qubes on Machine B, also in a dual-boot configuration. However, the complexity jumped significantly:&lt;/p&gt;&lt;ul &gt;&lt;li &gt;&lt;p &gt;Machine B has Secure Boot enabled because BitLocker requires it. On previous installs, I grew tired of toggling Secure Boot in the BIOS every time I switched operating systems.&lt;/p&gt;&lt;/li&gt;&lt;li &gt;&lt;p &gt;I only have one USB drive. Managing separate&amp;nbsp;&lt;code &gt;/boot&lt;/code&gt;&amp;nbsp;partitions for two different Qubes installations on a single thumb drive is messy.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p &gt;After some experimentation, I found a way to solve both problems.&lt;/p&gt;&lt;h1 &gt;Sharing One USB Drive for Multiple Qubes Installations&lt;/h1&gt;&lt;p &gt;The solution is elegant: Don&#39;t store&amp;nbsp;&lt;code &gt;/boot&lt;/code&gt;&amp;nbsp;on the USB drive. Instead, move&amp;nbsp;&lt;code &gt;/boot&lt;/code&gt;&amp;nbsp;to the encrypted internal disk partition. The USB drive&#39;s only job is to unlock that partition and hand over control to the system. 
Once I grasped this concept, implementation was relatively straightforward with help from the&amp;nbsp;&lt;a data-href=&quot;https://wiki.archlinux.org/title/GRUB&quot; href=&quot;https://wiki.archlinux.org/title/GRUB&quot; &gt;Arch Wiki&lt;/a&gt;&amp;nbsp;and an AI.&lt;/p&gt;&lt;p &gt;Some notes:&lt;/p&gt;&lt;ul &gt;&lt;li &gt;The USB drive only needs an EFI partition containing the GRUB binary and the LUKS header files. In my case, these total less than 30 MB.&lt;/li&gt;&lt;li &gt;Recent GRUB versions support LUKS2, but only&amp;nbsp;&lt;a data-href=&quot;https://cgit.git.savannah.gnu.org/cgit/grub.git/commit/?id=6052fc2cf684dffa507a9d81f9f8b4cbe170e6b6&quot; href=&quot;https://cgit.git.savannah.gnu.org/cgit/grub.git/commit/?id=6052fc2cf684dffa507a9d81f9f8b4cbe170e6b6&quot; &gt;very recent commits&lt;/a&gt;&amp;nbsp;support Argon2. Fedora&#39;s version was too old, but the packages in Debian Sid worked. Arch should also work.&lt;/li&gt;&lt;li &gt;I wrote a simple&amp;nbsp;&lt;code &gt;grub.cfg&lt;/code&gt;&amp;nbsp;that loads the necessary modules (GPT, LUKS2), unlocks the partition via&amp;nbsp;&lt;code &gt;cryptomount&lt;/code&gt;&amp;nbsp;using the header file on the USB, and then passes control using&amp;nbsp;&lt;code &gt;configfile&lt;/code&gt;.&lt;/li&gt;&lt;li &gt;For this setup,&amp;nbsp;&lt;code &gt;/boot&lt;/code&gt;&amp;nbsp;can usually just be a directory on the root filesystem. 
However, because my root is on a thin-provisioned LVM logical volume (which GRUB does not yet support), I had to create a separate, standard LVM logical volume specifically for&amp;nbsp;&lt;code &gt;/boot&lt;/code&gt;.&lt;/li&gt;&lt;li &gt;I used&amp;nbsp;&lt;code &gt;grub-mkstandalone&lt;/code&gt;&amp;nbsp;to create a bundled GRUB binary and&amp;nbsp;&lt;code &gt;efibootmgr&lt;/code&gt;&amp;nbsp;to register it with the UEFI firmware.&lt;/li&gt;&lt;li &gt;When the kernel/initramfs boots from the now-unlocked&amp;nbsp;&lt;code &gt;/boot&lt;/code&gt;, it doesn&#39;t &quot;inherit&quot; the unlocked state. To avoid typing the password twice, I embedded a LUKS keyfile into the initramfs to automate the second unlock.&lt;/li&gt;&lt;li &gt;To support both machines, I copied both LUKS headers to the USB. I then modified grub.cfg to detect the machine&#39;s UUID via&amp;nbsp;&lt;code &gt;smbios&lt;/code&gt;, allowing it to automatically select the correct header and partition.&lt;/li&gt;&lt;/ul&gt;&lt;h2 &gt;The Benefits&lt;/h2&gt;&lt;p &gt;Compared to the standard &quot;detached USB&quot; method, this stores significantly less data on the unencrypted drive, making backups easier and security tighter.&lt;/p&gt;&lt;p &gt;It also solved a major headache: updates. Previously, I had to manually mount /boot before updating dom0 and remember to unmount it before rebooting. If I forgot and triggered a reboot of sys-usb, the system would often hang. Now, those days are over. 
I can update the system without the USB drive even being plugged in; the drive itself rarely needs updates.&lt;/p&gt;&lt;h1 &gt;Make Qubes OS Play Nice with Secure Boot&lt;/h1&gt;&lt;p &gt;It turned out I had misunderstood the statement &quot;Qubes OS doesn&#39;t support Secure Boot.&quot; What that actually means is that Qubes doesn&#39;t provide an out-of-the-box way to verify the entire chain (Xen, kernels, etc.).&lt;/p&gt;&lt;p &gt;However, if the goal is simply to allow Qubes to boot without disabling Secure Boot in the BIOS, it’s actually quite easy:&lt;/p&gt;&lt;ul &gt;&lt;li &gt;Set up&amp;nbsp;&lt;a data-href=&quot;https://wiki.archlinux.org/title/Unified_Extensible_Firmware_Interface/Secure_Boot#shim&quot; href=&quot;https://wiki.archlinux.org/title/Unified_Extensible_Firmware_Interface/Secure_Boot#shim&quot; &gt;shim&lt;/a&gt;, register the entry with UEFI, and point it to my GRUB EFI binary.&lt;/li&gt;&lt;li &gt;Ensure my GRUB binary contains an SBAT section. I used&amp;nbsp;&lt;code &gt;objcopy&lt;/code&gt;&amp;nbsp;to extract this from the Debian GRUB binary, then included it via&amp;nbsp;&lt;code &gt;grub-mkstandalone&lt;/code&gt;.&lt;/li&gt;&lt;li &gt;Disable MOK validation with&amp;nbsp;&lt;code &gt;mokutil --disable-validation&lt;/code&gt;. The MOK Manager refers to this as &quot;Disable Secure Boot&quot;, but it doesn&#39;t actually affect UEFI or Windows.&lt;/li&gt;&lt;li &gt;Enroll my GRUB binary hash during the first boot.&lt;/li&gt;&lt;/ul&gt;&lt;p &gt;That&#39;s it!&lt;/p&gt;&lt;p &gt;I was surprised by the simplicity. I expected to be wrestling with PK/KEK keys and accidentally bricking my Windows bootloader. 
Instead, learning how shim and MOK interact made the process painless.&lt;/p&gt;&lt;h2 &gt;Security Implications&lt;/h2&gt;&lt;p &gt;To be clear: this method allows Qubes to run alongside Secure Boot, but it does not (yet) cryptographically verify the Xen image or Linux kernels.&lt;/p&gt;&lt;p &gt;However, since that data resides on an encrypted partition and the bootloader is on a detached USB, it meets my personal threat model. In the future, I may look into creating my own keys to sign the images properly.&lt;/p&gt;&lt;p &gt;I’m also considering moving the GRUB binary to an internal partition. If I embed the LUKS header directly into the GRUB binary and enroll its hash, the shim would theoretically notify me if the binary was tampered with. It’s not a perfect defense against physical access, but it&#39;s a significant step up from a standard unencrypted boot.&lt;/p&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/33534782/9093303707722793815' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/9093303707722793815'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/9093303707722793815'/><link rel='alternate' type='text/html' href='http://blog.wang-lu.com/2026/03/refined-boot-for-qubes-os-minimal-usb.html' title='Refined Boot for Qubes OS: Minimal USB Key, Dual Boot, Secure Boot'/><author><name>Lu Wang</name><uri>http://www.blogger.com/profile/00576609954224192924</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-33534782.post-1831802453695424262</id><published>2026-02-05T21:58:00.005+01:00</published><updated>2026-02-05T21:59:18.778+01:00</updated><category 
scheme="http://www.blogger.com/atom/ns#" term="Experiment"/><category scheme="http://www.blogger.com/atom/ns#" term="programming"/><category scheme="http://www.blogger.com/atom/ns#" term="评论和记事"/><title type='text'>Writing Sudoku Solvers</title><content type='html'>&lt;p&gt;After writing a Nonogram solver, I decided to tackle a Sudoku solver to practice Rust. My goal wasn&#39;t just to support classic Sudoku rules, but also to handle variants such as Thermometer, Arrow, and Cage.&lt;/p&gt;&lt;h1&gt;1. Brute Force&lt;/h1&gt;&lt;p&gt;It is fairly easy to write a brute force or backtracking algorithm. This approach is sufficient for most classic Sudoku puzzles, but it becomes unbearably slow as soon as variant rules are introduced.&lt;/p&gt;&lt;p&gt;I considered this step a warmup, a baseline to improve upon.&lt;/p&gt;&lt;h1&gt;2. Constraint Propagation&lt;/h1&gt;&lt;p&gt;Here, I tried to introduce &quot;logical thinking&quot; to the algorithm. I used&amp;nbsp;&lt;code&gt;u16&lt;/code&gt;&amp;nbsp;as a bitmask to represent the possible values of a cell. Whenever a cell&#39;s state changes (due to guessing, backtracking, or propagation), the algorithm consults all constraints to eliminate impossible candidates.&lt;/p&gt;&lt;p&gt;While Nonogram is technically an NP-Complete problem, in practice, my constraint-propagation solver (without guessing) can solve almost all puzzles found online. I’ve only seen one exception where I had to guess a few cells. This suggests that puzzles designed for human players are meant to be solved via logical deduction, making them computationally &quot;easy.&quot;&lt;/p&gt;&lt;p&gt;It turns out Sudoku is similar. Although some backtracking is still needed, classic puzzles are typically solved within 100 to 200 µs (microseconds), while variants might take a few milliseconds. I did see one puzzle take ~20 seconds, but overall, I was happy with the result.&lt;/p&gt;&lt;p&gt;So, can it go faster? 
Two optimization options came to mind:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;When a cell changes, only consult&amp;nbsp;&lt;em&gt;relevant&lt;/em&gt;&amp;nbsp;constraints rather than re-checking all of them.&lt;/li&gt;&lt;li&gt;Instead of copying the entire board state for every guess, I could carefully track changes and undo them during backtracking. However, since the board state is already quite small, I doubted this would yield a significant performance boost in practice.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;Right before I decided to optimize further, I learned something new.&lt;/p&gt;&lt;h2&gt;3. Dancing Links (DLX)&lt;/h2&gt;&lt;p&gt;I discovered &quot;Dancing Links&quot; while asking an AI for the best Sudoku algorithms. This is a technique invented by Donald Knuth to efficiently implement Algorithm X, which solves the &quot;exact cover&quot; problem.&lt;/p&gt;&lt;p&gt;This is perfect for classic Sudoku, where we essentially try to find an exact cover between &quot;putting a digit into each cell&quot; and &quot;each row/column/box must contain exactly one digit X (where X is 1 to 9)&quot;.&lt;/p&gt;&lt;p&gt;Each guess &quot;covers&quot; the affected rows and columns by unlinking them from the linked-list structure.&amp;nbsp;&lt;strong&gt;The magical part:&lt;/strong&gt;&amp;nbsp;We can precisely undo this process by restoring the covered rows and columns. This means we&amp;nbsp;&lt;strong&gt;don&#39;t need to copy the state&lt;/strong&gt;&amp;nbsp;during backtracking! The algorithm only needs to remember the latest guess; the data structure itself holds the information required to reverse it.&lt;/p&gt;&lt;p&gt;However, there&#39;s a catch: things get complicated quickly with variant rules. Only a few rules (like distinct values) can be easily encoded into the DLX structure. 
For most others, I had to implement them as &quot;external observers&quot; that eliminate impossible candidates after a guess. This forced me to maintain an undo stack again, essentially plugging a constraint propagation engine back into DLX.&lt;/p&gt;&lt;p&gt;Another issue: DLX wasn&#39;t actually faster than my custom constraint propagation solver. Since it consistently took ~200 µs for classic puzzles, I didn&#39;t bother implementing the complex variant rules for it.&lt;/p&gt;&lt;h2&gt;4. Generic Solvers&lt;/h2&gt;&lt;p&gt;I talked to a colleague about my progress, and he asked, &quot;Why are you writing a custom solver? Why not just use a generic SAT/SMT solver like Z3?&quot;&lt;/p&gt;&lt;p&gt;That was a good point. I did some quick research and picked three candidates to test:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;a data-href=&quot;https://github.com/Z3Prover/z3&quot; href=&quot;https://github.com/Z3Prover/z3&quot;&gt;Z3&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a data-href=&quot;https://cvc5.github.io/&quot; href=&quot;https://cvc5.github.io/&quot;&gt;cvc5&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a data-href=&quot;https://developers.google.com/optimization&quot; href=&quot;https://developers.google.com/optimization&quot;&gt;OR-Tools&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;The results are shown below.&lt;/p&gt;&lt;h2&gt;5. 
Results&lt;/h2&gt;&lt;p&gt;Here is how the solvers compared.&amp;nbsp;&lt;em&gt;(cp = my constraint propagation implementation, dlx = my dancing links implementation)&lt;/em&gt;&lt;/p&gt;&lt;h3&gt;Classic Sudoku 1 (Easy)&lt;/h3&gt;&lt;ul&gt;&lt;li&gt;cp: ~100 µs&lt;/li&gt;&lt;li&gt;dlx: ~200 µs&lt;/li&gt;&lt;li&gt;OR-Tools: ~10 ms&lt;/li&gt;&lt;li&gt;Z3: ~20 ms&lt;/li&gt;&lt;li&gt;cvc5: ~70 ms&lt;/li&gt;&lt;/ul&gt;&lt;h3&gt;Classic Sudoku 2 (Harder)&lt;/h3&gt;&lt;ul&gt;&lt;li&gt;cp: ~200 µs&lt;/li&gt;&lt;li&gt;dlx: ~200 µs&lt;/li&gt;&lt;li&gt;OR-Tools: ~10 ms&lt;/li&gt;&lt;li&gt;Z3: ~80 ms&lt;/li&gt;&lt;li&gt;cvc5: ~200 ms&lt;/li&gt;&lt;/ul&gt;&lt;h3&gt;Hard: Empty Board + Thermometer Rules&lt;/h3&gt;&lt;ul&gt;&lt;li&gt;cp: ~3 ms&lt;/li&gt;&lt;li&gt;OR-Tools: ~30 ms&lt;/li&gt;&lt;li&gt;Z3: ~20 s&lt;/li&gt;&lt;li&gt;cvc5: ~23 s&lt;/li&gt;&lt;/ul&gt;&lt;h3&gt;Hard: Almost Empty Board + Arrow Rules&lt;/h3&gt;&lt;ul&gt;&lt;li&gt;OR-Tools: ~200 ms (slightly faster than cp)&lt;/li&gt;&lt;li&gt;cp: ~200 ms&lt;/li&gt;&lt;li&gt;Z3: ~20 s&lt;/li&gt;&lt;li&gt;cvc5: ~100 s&lt;/li&gt;&lt;/ul&gt;&lt;h2&gt;6. Discussion&lt;/h2&gt;&lt;p&gt;I found these results both surprising and reasonable.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;OR-Tools is significantly faster than Z3 and cvc5.&lt;/strong&gt;&amp;nbsp;I believe this is because OR-Tools uses CP-SAT, a constraint programming solver, whereas Z3 and cvc5 are primarily SMT (Satisfiability Modulo Theories) solvers. Since the Sudoku puzzles I used are designed for human logic, they align better with the constraint propagation techniques used by OR-Tools.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;My custom solver vs. OR-Tools.&lt;/strong&gt;&amp;nbsp;For easy puzzles, OR-Tools is slower than my implementation. This is likely due to initialization overhead, while my code is specialized for Sudoku. However, OR-Tools catches up quickly on complex puzzles. 
My constraint propagation logic is naturally inferior to the sophisticated heuristics inside OR-Tools, so as complexity rises, the generic solver wins.&lt;/p&gt;&lt;h2&gt;7. Conclusion&lt;/h2&gt;&lt;p&gt;I had a lot of fun and learned a great deal during this process. My next step is to explore OR-Tools further. Perhaps I&#39;ll write solvers for even more complex puzzles without reinventing the wheel!&lt;/p&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/33534782/1831802453695424262' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/1831802453695424262'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/1831802453695424262'/><link rel='alternate' type='text/html' href='http://blog.wang-lu.com/2026/02/after-writing-nonogram-solver-i-decided.html' title='Writing Sudoku Solvers'/><author><name>Lu Wang</name><uri>http://www.blogger.com/profile/00576609954224192924</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-33534782.post-2734655004633826937</id><published>2025-10-29T01:06:00.006+01:00</published><updated>2025-10-29T01:06:45.452+01:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="configuration/usage"/><category scheme="http://www.blogger.com/atom/ns#" term="Experiment"/><category scheme="http://www.blogger.com/atom/ns#" term="qubes os"/><title type='text'>An Adventure with Qubes OS</title><content type='html'>&lt;p class=&quot;code-line&quot; data-line=&quot;2&quot; dir=&quot;auto&quot; style=&quot;font-family: -apple-system, BlinkMacSystemFont, &amp;quot;Segoe WPC&amp;quot;, &amp;quot;Segoe UI&amp;quot;, system-ui, Ubuntu, &amp;quot;Droid Sans&amp;quot;, sans-serif; 
font-size: 14px; margin-bottom: 16px; margin-top: 0px; position: relative;&quot;&gt;I&#39;ve been experimenting with Qubes OS on my new laptop and wanted to share some notes on the experience.&lt;/p&gt;&lt;h2 class=&quot;code-line&quot; data-line=&quot;4&quot; dir=&quot;auto&quot; id=&quot;hardware&quot; style=&quot;border-bottom: 1px solid rgba(0, 0, 0, 0.18); border-left-color: rgba(0, 0, 0, 0.18); border-right-color: rgba(0, 0, 0, 0.18); border-top-color: rgba(0, 0, 0, 0.18); font-family: -apple-system, BlinkMacSystemFont, &amp;quot;Segoe WPC&amp;quot;, &amp;quot;Segoe UI&amp;quot;, system-ui, Ubuntu, &amp;quot;Droid Sans&amp;quot;, sans-serif; line-height: 1.25; margin-bottom: 16px; margin-top: 24px; padding-bottom: 0.3em; position: relative;&quot;&gt;Hardware&lt;/h2&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;6&quot; dir=&quot;auto&quot; style=&quot;font-family: -apple-system, BlinkMacSystemFont, &amp;quot;Segoe WPC&amp;quot;, &amp;quot;Segoe UI&amp;quot;, system-ui, Ubuntu, &amp;quot;Droid Sans&amp;quot;, sans-serif; font-size: 14px; margin-bottom: 16px; margin-top: 0px; position: relative;&quot;&gt;Overall, Qubes OS works quite well on my hardware. Aside from typical issues like deep sleep, speaker performance, and touchpad scroll speed, the experience has been smooth. I particularly like that I can boot directly from a microSD card. 
This allowed me to move the&amp;nbsp;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;/boot&lt;/code&gt;&amp;nbsp;partition to the card while completely disabling USB access in&amp;nbsp;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;dom0&lt;/code&gt;&amp;nbsp;for better security.&lt;/p&gt;&lt;h2 class=&quot;code-line&quot; data-line=&quot;9&quot; dir=&quot;auto&quot; id=&quot;detached-boot-and-luks-header&quot; style=&quot;border-bottom: 1px solid rgba(0, 0, 0, 0.18); border-left-color: rgba(0, 0, 0, 0.18); border-right-color: rgba(0, 0, 0, 0.18); border-top-color: rgba(0, 0, 0, 0.18); font-family: -apple-system, BlinkMacSystemFont, &amp;quot;Segoe WPC&amp;quot;, &amp;quot;Segoe UI&amp;quot;, system-ui, Ubuntu, &amp;quot;Droid Sans&amp;quot;, sans-serif; line-height: 1.25; margin-bottom: 16px; margin-top: 24px; padding-bottom: 0.3em; position: relative;&quot;&gt;Detached&amp;nbsp;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;/boot&lt;/code&gt;&amp;nbsp;and LUKS Header&lt;/h2&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;11&quot; dir=&quot;auto&quot; style=&quot;font-family: -apple-system, BlinkMacSystemFont, &amp;quot;Segoe WPC&amp;quot;, &amp;quot;Segoe UI&amp;quot;, system-ui, Ubuntu, &amp;quot;Droid Sans&amp;quot;, sans-serif; font-size: 14px; margin-bottom: 16px; margin-top: 0px; position: relative;&quot;&gt;Moving&amp;nbsp;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; 
font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;/boot&lt;/code&gt;&amp;nbsp;and the LUKS header to a microSD card is a fun project, but it has some drawbacks:&lt;/p&gt;&lt;ul class=&quot;code-line&quot; data-line=&quot;13&quot; dir=&quot;auto&quot; style=&quot;font-family: -apple-system, BlinkMacSystemFont, &amp;quot;Segoe WPC&amp;quot;, &amp;quot;Segoe UI&amp;quot;, system-ui, Ubuntu, &amp;quot;Droid Sans&amp;quot;, sans-serif; font-size: 14px; margin-bottom: 0.7em; margin-top: 0px; position: relative;&quot;&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;13&quot; dir=&quot;auto&quot; style=&quot;position: relative;&quot;&gt;I have to remember to mount&amp;nbsp;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;/boot&lt;/code&gt;&amp;nbsp;before updating&amp;nbsp;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;dom0&lt;/code&gt;.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;14&quot; dir=&quot;auto&quot; style=&quot;position: relative;&quot;&gt;The system won&#39;t shut down properly if I forget to unmount&amp;nbsp;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;/boot&lt;/code&gt;.&lt;/li&gt;&lt;/ul&gt;&lt;h2 class=&quot;code-line&quot; data-line=&quot;16&quot; dir=&quot;auto&quot; id=&quot;testing-qubes-os-43-rc3&quot; style=&quot;border-bottom: 1px solid rgba(0, 0, 0, 0.18); border-left-color: rgba(0, 0, 0, 0.18); border-right-color: 
rgba(0, 0, 0, 0.18); border-top-color: rgba(0, 0, 0, 0.18); font-family: -apple-system, BlinkMacSystemFont, &amp;quot;Segoe WPC&amp;quot;, &amp;quot;Segoe UI&amp;quot;, system-ui, Ubuntu, &amp;quot;Droid Sans&amp;quot;, sans-serif; line-height: 1.25; margin-bottom: 16px; margin-top: 24px; padding-bottom: 0.3em; position: relative;&quot;&gt;Testing Qubes OS 4.3 rc3&lt;/h2&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;18&quot; dir=&quot;auto&quot; style=&quot;font-family: -apple-system, BlinkMacSystemFont, &amp;quot;Segoe WPC&amp;quot;, &amp;quot;Segoe UI&amp;quot;, system-ui, Ubuntu, &amp;quot;Droid Sans&amp;quot;, sans-serif; font-size: 14px; margin-bottom: 16px; margin-top: 0px; position: relative;&quot;&gt;I decided to test the Qubes OS 4.3 rc3 release by performing an in-place upgrade. Unfortunately, the system failed to boot afterward.&lt;/p&gt;&lt;h3 class=&quot;code-line&quot; data-line=&quot;20&quot; dir=&quot;auto&quot; id=&quot;dracut-issues&quot; style=&quot;font-family: -apple-system, BlinkMacSystemFont, &amp;quot;Segoe WPC&amp;quot;, &amp;quot;Segoe UI&amp;quot;, system-ui, Ubuntu, &amp;quot;Droid Sans&amp;quot;, sans-serif; font-size: 1.25em; line-height: 1.25; margin-bottom: 16px; margin-top: 24px; position: relative;&quot;&gt;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;dracut&lt;/code&gt;&amp;nbsp;Issues&lt;/h3&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;22&quot; dir=&quot;auto&quot; style=&quot;font-family: -apple-system, BlinkMacSystemFont, &amp;quot;Segoe WPC&amp;quot;, &amp;quot;Segoe UI&amp;quot;, system-ui, Ubuntu, &amp;quot;Droid Sans&amp;quot;, sans-serif; font-size: 14px; margin-bottom: 16px; margin-top: 0px; position: relative;&quot;&gt;After the upgrade, the system would hang before prompting me for my LUKS password. 
Eventually, the&amp;nbsp;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;watchdog&lt;/code&gt;&amp;nbsp;timer would kick in and drop me into an emergency shell.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;24&quot; dir=&quot;auto&quot; style=&quot;font-family: -apple-system, BlinkMacSystemFont, &amp;quot;Segoe WPC&amp;quot;, &amp;quot;Segoe UI&amp;quot;, system-ui, Ubuntu, &amp;quot;Droid Sans&amp;quot;, sans-serif; font-size: 14px; margin-bottom: 16px; margin-top: 0px; position: relative;&quot;&gt;Using the emergency shell and the installation media, I was able to investigate. I realized something was wrong with&amp;nbsp;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;dracut&lt;/code&gt;, as it seemed unable to detect the encrypted disk. I tried including more files in the&amp;nbsp;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;dracut&lt;/code&gt;&amp;nbsp;configuration based on information from various sources, but that didn&#39;t help. So I gave up. Thankfully, the upgrade tool created an LVM snapshot, which made it easy to revert the changes. 
I did have to manually downgrade the kernel and Xen using the installation media to get my Xen domains working again.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;26&quot; dir=&quot;auto&quot; style=&quot;font-family: -apple-system, BlinkMacSystemFont, &amp;quot;Segoe WPC&amp;quot;, &amp;quot;Segoe UI&amp;quot;, system-ui, Ubuntu, &amp;quot;Droid Sans&amp;quot;, sans-serif; font-size: 14px; margin-bottom: 16px; margin-top: 0px; position: relative;&quot;&gt;After more research, I found a bug (&lt;a data-href=&quot;https://github.com/dracut-ng/dracut-ng/issues/684&quot; href=&quot;https://github.com/dracut-ng/dracut-ng/issues/684&quot; style=&quot;color: #006ab1; text-decoration-line: none;&quot;&gt;1&lt;/a&gt;,&amp;nbsp;&lt;a data-href=&quot;https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1078792&quot; href=&quot;https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1078792&quot; style=&quot;color: #006ab1; text-decoration-line: none;&quot;&gt;2&lt;/a&gt;) in the version of&amp;nbsp;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;dracut&lt;/code&gt;&amp;nbsp;included in Qubes OS 4.3. Essentially, the&amp;nbsp;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;crypt&lt;/code&gt;&amp;nbsp;module stops working when&amp;nbsp;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;systemd&lt;/code&gt;&amp;nbsp;is available. 
I also believe the&amp;nbsp;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;systemd-cryptsetup&lt;/code&gt;&amp;nbsp;module wasn&#39;t automatically included because the LUKS header was on a separate device, leading&amp;nbsp;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;dracut&lt;/code&gt;&amp;nbsp;to assume that LUKS decryption wasn&#39;t needed.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;28&quot; dir=&quot;auto&quot; style=&quot;font-family: -apple-system, BlinkMacSystemFont, &amp;quot;Segoe WPC&amp;quot;, &amp;quot;Segoe UI&amp;quot;, system-ui, Ubuntu, &amp;quot;Droid Sans&amp;quot;, sans-serif; font-size: 14px; margin-bottom: 16px; margin-top: 0px; position: relative;&quot;&gt;The fix was simple: manually enable the&amp;nbsp;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;systemd-cryptsetup&lt;/code&gt;&amp;nbsp;module in the&amp;nbsp;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;dracut&lt;/code&gt;&amp;nbsp;configuration.&lt;/p&gt;&lt;h3 class=&quot;code-line&quot; data-line=&quot;30&quot; dir=&quot;auto&quot; id=&quot;amdgpu-issues&quot; style=&quot;font-family: -apple-system, BlinkMacSystemFont, &amp;quot;Segoe WPC&amp;quot;, &amp;quot;Segoe UI&amp;quot;, system-ui, Ubuntu, &amp;quot;Droid Sans&amp;quot;, sans-serif; font-size: 1.25em; 
line-height: 1.25; margin-bottom: 16px; margin-top: 24px; position: relative;&quot;&gt;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;amdgpu&lt;/code&gt;&amp;nbsp;Issues&lt;/h3&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;32&quot; dir=&quot;auto&quot; style=&quot;font-family: -apple-system, BlinkMacSystemFont, &amp;quot;Segoe WPC&amp;quot;, &amp;quot;Segoe UI&amp;quot;, system-ui, Ubuntu, &amp;quot;Droid Sans&amp;quot;, sans-serif; font-size: 14px; margin-bottom: 16px; margin-top: 0px; position: relative;&quot;&gt;After the upgrade, only the old kernel worked, which is how I was able to fix the&amp;nbsp;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;dracut&lt;/code&gt;&amp;nbsp;issue. With the new kernels, the system would boot to a blank screen. 
Removing the&amp;nbsp;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;rhgb&lt;/code&gt;&amp;nbsp;and&amp;nbsp;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;quiet&lt;/code&gt;&amp;nbsp;kernel parameters revealed some log messages, but the blank screen remained.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;34&quot; dir=&quot;auto&quot; style=&quot;font-family: -apple-system, BlinkMacSystemFont, &amp;quot;Segoe WPC&amp;quot;, &amp;quot;Segoe UI&amp;quot;, system-ui, Ubuntu, &amp;quot;Droid Sans&amp;quot;, sans-serif; font-size: 14px; margin-bottom: 16px; margin-top: 0px; position: relative;&quot;&gt;Adding the&amp;nbsp;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;nomodeset&lt;/code&gt;&amp;nbsp;parameter disabled the graphics driver, which allowed me to enter the LUKS password and log in. 
However, this caused Xorg and LightDM to fail, with Xorg repeatedly crashing.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;36&quot; dir=&quot;auto&quot; style=&quot;font-family: -apple-system, BlinkMacSystemFont, &amp;quot;Segoe WPC&amp;quot;, &amp;quot;Segoe UI&amp;quot;, system-ui, Ubuntu, &amp;quot;Droid Sans&amp;quot;, sans-serif; font-size: 14px; margin-bottom: 16px; margin-top: 0px; position: relative;&quot;&gt;It was clear this was an issue with the&amp;nbsp;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;amdgpu&lt;/code&gt;&amp;nbsp;module. I tried several kernel parameters without success:&lt;/p&gt;&lt;ul class=&quot;code-line&quot; data-line=&quot;38&quot; dir=&quot;auto&quot; style=&quot;font-family: -apple-system, BlinkMacSystemFont, &amp;quot;Segoe WPC&amp;quot;, &amp;quot;Segoe UI&amp;quot;, system-ui, Ubuntu, &amp;quot;Droid Sans&amp;quot;, sans-serif; font-size: 14px; margin-bottom: 0.7em; margin-top: 0px; position: relative;&quot;&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;38&quot; dir=&quot;auto&quot; style=&quot;position: relative;&quot;&gt;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;amdgpu.modeset=1&lt;/code&gt;&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;39&quot; dir=&quot;auto&quot; style=&quot;position: relative;&quot;&gt;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;amdgpu.dc=0&lt;/code&gt;&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;40&quot; 
dir=&quot;auto&quot; style=&quot;position: relative;&quot;&gt;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;amdgpu.dpm=0&lt;/code&gt;&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;41&quot; dir=&quot;auto&quot; style=&quot;position: relative;&quot;&gt;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;amdgpu.ppfeaturemask=0xffffb&lt;/code&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;43&quot; dir=&quot;auto&quot; style=&quot;font-family: -apple-system, BlinkMacSystemFont, &amp;quot;Segoe WPC&amp;quot;, &amp;quot;Segoe UI&amp;quot;, system-ui, Ubuntu, &amp;quot;Droid Sans&amp;quot;, sans-serif; font-size: 14px; margin-bottom: 16px; margin-top: 0px; position: relative;&quot;&gt;Eventually, I found that&amp;nbsp;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;amdgpu.dcdebugmask=0x10&lt;/code&gt;&amp;nbsp;resolved the problem.&lt;/p&gt;&lt;h3 class=&quot;code-line&quot; data-line=&quot;45&quot; dir=&quot;auto&quot; id=&quot;qubesd-issues&quot; style=&quot;font-family: -apple-system, BlinkMacSystemFont, &amp;quot;Segoe WPC&amp;quot;, &amp;quot;Segoe UI&amp;quot;, system-ui, Ubuntu, &amp;quot;Droid Sans&amp;quot;, sans-serif; font-size: 1.25em; line-height: 1.25; margin-bottom: 16px; margin-top: 24px; position: relative;&quot;&gt;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 
1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;qubesd&lt;/code&gt;&amp;nbsp;Issues&lt;/h3&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;47&quot; dir=&quot;auto&quot; style=&quot;font-family: -apple-system, BlinkMacSystemFont, &amp;quot;Segoe WPC&amp;quot;, &amp;quot;Segoe UI&amp;quot;, system-ui, Ubuntu, &amp;quot;Droid Sans&amp;quot;, sans-serif; font-size: 14px; margin-bottom: 16px; margin-top: 0px; position: relative;&quot;&gt;After finally booting into the system, I couldn&#39;t attach any block devices, including my&amp;nbsp;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;/boot&lt;/code&gt;&amp;nbsp;partition, to&amp;nbsp;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;dom0&lt;/code&gt;. 
This turned out to be a bug in&amp;nbsp;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;qubesd&lt;/code&gt;, which I&amp;nbsp;&lt;a data-href=&quot;https://github.com/QubesOS/qubes-issues/issues/10361&quot; href=&quot;https://github.com/QubesOS/qubes-issues/issues/10361&quot; style=&quot;color: #006ab1; text-decoration-line: none;&quot;&gt;reported&lt;/a&gt;.&lt;/p&gt;&lt;h2 class=&quot;code-line&quot; data-line=&quot;49&quot; dir=&quot;auto&quot; id=&quot;backup-strategy&quot; style=&quot;border-bottom: 1px solid rgba(0, 0, 0, 0.18); border-left-color: rgba(0, 0, 0, 0.18); border-right-color: rgba(0, 0, 0, 0.18); border-top-color: rgba(0, 0, 0, 0.18); font-family: -apple-system, BlinkMacSystemFont, &amp;quot;Segoe WPC&amp;quot;, &amp;quot;Segoe UI&amp;quot;, system-ui, Ubuntu, &amp;quot;Droid Sans&amp;quot;, sans-serif; line-height: 1.25; margin-bottom: 16px; margin-top: 24px; padding-bottom: 0.3em; position: relative;&quot;&gt;Backup Strategy&lt;/h2&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;51&quot; dir=&quot;auto&quot; style=&quot;font-family: -apple-system, BlinkMacSystemFont, &amp;quot;Segoe WPC&amp;quot;, &amp;quot;Segoe UI&amp;quot;, system-ui, Ubuntu, &amp;quot;Droid Sans&amp;quot;, sans-serif; font-size: 14px; margin-bottom: 16px; margin-top: 0px; position: relative;&quot;&gt;Backing up data in Qubes OS can be tricky due to its design. 
Here is the high-level strategy I&#39;ve been planning:&lt;/p&gt;&lt;ul class=&quot;code-line&quot; data-line=&quot;53&quot; dir=&quot;auto&quot; style=&quot;font-family: -apple-system, BlinkMacSystemFont, &amp;quot;Segoe WPC&amp;quot;, &amp;quot;Segoe UI&amp;quot;, system-ui, Ubuntu, &amp;quot;Droid Sans&amp;quot;, sans-serif; font-size: 14px; margin-bottom: 0.7em; margin-top: 0px; position: relative;&quot;&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;53&quot; dir=&quot;auto&quot; style=&quot;position: relative;&quot;&gt;Create as few templates as possible.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;54&quot; dir=&quot;auto&quot; style=&quot;position: relative;&quot;&gt;Only use Salt to configure templates, so I only need to back up the Salt files in dom0.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;55&quot; dir=&quot;auto&quot; style=&quot;position: relative;&quot;&gt;Changes to dom0 are either managed by Salt, or the relevant files are included in the backup. 
Examples&lt;ul class=&quot;code-line&quot; data-line=&quot;56&quot; dir=&quot;auto&quot; style=&quot;margin-bottom: 0px; margin-top: 0px; position: relative;&quot;&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;56&quot; dir=&quot;auto&quot; style=&quot;position: relative;&quot;&gt;Xfce settings.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;57&quot; dir=&quot;auto&quot; style=&quot;position: relative;&quot;&gt;Qube settings/features, /etc/qubes&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;58&quot; dir=&quot;auto&quot; style=&quot;position: relative;&quot;&gt;/etc/default/grub&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;59&quot; dir=&quot;auto&quot; style=&quot;position: relative;&quot;&gt;/etc/dracut.conf.d&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;61&quot; dir=&quot;auto&quot; style=&quot;font-family: -apple-system, BlinkMacSystemFont, &amp;quot;Segoe WPC&amp;quot;, &amp;quot;Segoe UI&amp;quot;, system-ui, Ubuntu, &amp;quot;Droid Sans&amp;quot;, sans-serif; font-size: 14px; margin-bottom: 16px; margin-top: 0px; position: relative;&quot;&gt;My backup process is as follows:&lt;/p&gt;&lt;ul class=&quot;code-line&quot; data-line=&quot;63&quot; dir=&quot;auto&quot; style=&quot;font-family: -apple-system, BlinkMacSystemFont, &amp;quot;Segoe WPC&amp;quot;, &amp;quot;Segoe UI&amp;quot;, system-ui, Ubuntu, &amp;quot;Droid Sans&amp;quot;, sans-serif; font-size: 14px; margin-bottom: 0.7em; margin-top: 0px; position: relative;&quot;&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;63&quot; dir=&quot;auto&quot; style=&quot;position: relative;&quot;&gt;I use a dedicated&amp;nbsp;&lt;em&gt;disposable&lt;/em&gt;&amp;nbsp;VM with&amp;nbsp;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 
3px;&quot;&gt;restic&lt;/code&gt;&amp;nbsp;installed. The actual backup scripts and SSH keys remain in&amp;nbsp;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;dom0&lt;/code&gt;.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;64&quot; dir=&quot;auto&quot; style=&quot;position: relative;&quot;&gt;Data from&amp;nbsp;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;dom0&lt;/code&gt;&amp;nbsp;and other qubes is archived using&amp;nbsp;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;tar&lt;/code&gt;. 
Use&amp;nbsp;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;--transform&lt;/code&gt;&amp;nbsp;to prepend the VM name to the path.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;65&quot; dir=&quot;auto&quot; style=&quot;position: relative;&quot;&gt;Use&amp;nbsp;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;qvm-run --pass-io --no-gui&lt;/code&gt;&amp;nbsp;to pass the scripts, SSH keys, and data to the disposable VM, which then runs the scripts to execute the backup.&lt;/li&gt;&lt;/ul&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;67&quot; dir=&quot;auto&quot; style=&quot;font-family: -apple-system, BlinkMacSystemFont, &amp;quot;Segoe WPC&amp;quot;, &amp;quot;Segoe UI&amp;quot;, system-ui, Ubuntu, &amp;quot;Droid Sans&amp;quot;, sans-serif; font-size: 14px; margin-bottom: 16px; margin-top: 0px; position: relative;&quot;&gt;There are a few things to consider:&lt;/p&gt;&lt;ul class=&quot;code-line&quot; data-line=&quot;68&quot; dir=&quot;auto&quot; style=&quot;font-family: -apple-system, BlinkMacSystemFont, &amp;quot;Segoe WPC&amp;quot;, &amp;quot;Segoe UI&amp;quot;, system-ui, Ubuntu, &amp;quot;Droid Sans&amp;quot;, sans-serif; font-size: 14px; margin-bottom: 0.7em; margin-top: 0px; position: relative;&quot;&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;68&quot; dir=&quot;auto&quot; style=&quot;position: relative;&quot;&gt;Using&amp;nbsp;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 
3px;&quot;&gt;--pass-io&lt;/code&gt;&amp;nbsp;could have security implications. It might be possible to mitigate this by limiting the number of bytes passed and saving output to a file.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;69&quot; dir=&quot;auto&quot; style=&quot;position: relative;&quot;&gt;The&amp;nbsp;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;tar&lt;/code&gt;&amp;nbsp;archives are currently extracted in the disposable VM. If the data isn&#39;t trusted, this could be a security risk. In such cases, the extraction step could be skipped.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;70&quot; dir=&quot;auto&quot; style=&quot;position: relative;&quot;&gt;Accessing the backed-up data requires starting a VM. An alternative could be to create an LVM snapshot and back that up directly.&lt;/li&gt;&lt;/ul&gt;&lt;h2 class=&quot;code-line&quot; data-line=&quot;72&quot; dir=&quot;auto&quot; id=&quot;thought-manging-qubes-like-containers&quot; style=&quot;border-bottom: 1px solid rgba(0, 0, 0, 0.18); border-left-color: rgba(0, 0, 0, 0.18); border-right-color: rgba(0, 0, 0, 0.18); border-top-color: rgba(0, 0, 0, 0.18); font-family: -apple-system, BlinkMacSystemFont, &amp;quot;Segoe WPC&amp;quot;, &amp;quot;Segoe UI&amp;quot;, system-ui, Ubuntu, &amp;quot;Droid Sans&amp;quot;, sans-serif; line-height: 1.25; margin-bottom: 16px; margin-top: 24px; padding-bottom: 0.3em; position: relative;&quot;&gt;Thought: Managing Qubes like Containers&lt;/h2&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;74&quot; dir=&quot;auto&quot; style=&quot;font-family: -apple-system, BlinkMacSystemFont, &amp;quot;Segoe WPC&amp;quot;, &amp;quot;Segoe UI&amp;quot;, system-ui, Ubuntu, &amp;quot;Droid Sans&amp;quot;, sans-serif; font-size: 14px; margin-bottom: 16px; margin-top: 0px; 
position: relative;&quot;&gt;There are many similarities between Qubes VMs and containers (Docker, Podman):&lt;/p&gt;&lt;ul class=&quot;code-line&quot; data-line=&quot;76&quot; dir=&quot;auto&quot; style=&quot;font-family: -apple-system, BlinkMacSystemFont, &amp;quot;Segoe WPC&amp;quot;, &amp;quot;Segoe UI&amp;quot;, system-ui, Ubuntu, &amp;quot;Droid Sans&amp;quot;, sans-serif; font-size: 14px; margin-bottom: 0.7em; margin-top: 0px; position: relative;&quot;&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;76&quot; dir=&quot;auto&quot; style=&quot;position: relative;&quot;&gt;Template VMs are similar to building containers.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;77&quot; dir=&quot;auto&quot; style=&quot;position: relative;&quot;&gt;AppVMs are like running containers with persistent volumes.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;78&quot; dir=&quot;auto&quot; style=&quot;position: relative;&quot;&gt;Disposable VMs are like running containers without persistent volumes.&lt;/li&gt;&lt;/ul&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;80&quot; dir=&quot;auto&quot; style=&quot;font-family: -apple-system, BlinkMacSystemFont, &amp;quot;Segoe WPC&amp;quot;, &amp;quot;Segoe UI&amp;quot;, system-ui, Ubuntu, &amp;quot;Droid Sans&amp;quot;, sans-serif; font-size: 14px; margin-bottom: 16px; margin-top: 0px; position: relative;&quot;&gt;There are some gaps:&lt;/p&gt;&lt;ul class=&quot;code-line&quot; data-line=&quot;82&quot; dir=&quot;auto&quot; style=&quot;font-family: -apple-system, BlinkMacSystemFont, &amp;quot;Segoe WPC&amp;quot;, &amp;quot;Segoe UI&amp;quot;, system-ui, Ubuntu, &amp;quot;Droid Sans&amp;quot;, sans-serif; font-size: 14px; margin-bottom: 0.7em; margin-top: 0px; position: relative;&quot;&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;82&quot; dir=&quot;auto&quot; style=&quot;position: relative;&quot;&gt;One container can be based on others, and the shared parts are often stored only once as layers. 
While cloning a template VM can use&amp;nbsp;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;reflink&lt;/code&gt;&amp;nbsp;for initial efficiency, changes will eventually cause the data to diverge.&amp;nbsp;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;bootc&lt;/code&gt;&amp;nbsp;might help bridge this gap.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;83&quot; dir=&quot;auto&quot; style=&quot;position: relative;&quot;&gt;Containers support bind mounts, which are very convenient for backups. While&amp;nbsp;&lt;code style=&quot;background-color: rgba(0, 0, 0, 0.1); border-radius: 4px; color: #a31515; font-family: Consolas, &amp;quot;Courier New&amp;quot;, monospace; font-size: 1em; line-height: 1.357em; padding: 1px 3px;&quot;&gt;virtiofsd&lt;/code&gt;&amp;nbsp;could work for Xen, I guess there are security concerns. 
An alternative could be to centralize all data in one qube and share it with others over the network, but again, there could be security concerns.&lt;/li&gt;&lt;/ul&gt;&lt;h2 class=&quot;code-line&quot; data-line=&quot;85&quot; dir=&quot;auto&quot; id=&quot;conclusion&quot; style=&quot;border-bottom: 1px solid rgba(0, 0, 0, 0.18); border-left-color: rgba(0, 0, 0, 0.18); border-right-color: rgba(0, 0, 0, 0.18); border-top-color: rgba(0, 0, 0, 0.18); font-family: -apple-system, BlinkMacSystemFont, &amp;quot;Segoe WPC&amp;quot;, &amp;quot;Segoe UI&amp;quot;, system-ui, Ubuntu, &amp;quot;Droid Sans&amp;quot;, sans-serif; line-height: 1.25; margin-bottom: 16px; margin-top: 24px; padding-bottom: 0.3em; position: relative;&quot;&gt;Conclusion&lt;/h2&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;87&quot; dir=&quot;auto&quot; style=&quot;font-family: -apple-system, BlinkMacSystemFont, &amp;quot;Segoe WPC&amp;quot;, &amp;quot;Segoe UI&amp;quot;, system-ui, Ubuntu, &amp;quot;Droid Sans&amp;quot;, sans-serif; font-size: 14px; margin-bottom: 16px; margin-top: 0px; position: relative;&quot;&gt;It&#39;s been a fun and challenging journey exploring Qubes OS. While there were some hurdles with hardware and system upgrades, working through them has been a valuable learning experience. 
The security architecture is powerful, and I&#39;m excited to continue finding new ways to make it work for my setup.&lt;/p&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/33534782/2734655004633826937' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/2734655004633826937'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/2734655004633826937'/><link rel='alternate' type='text/html' href='http://blog.wang-lu.com/2025/10/an-adventure-with-qubes-os.html' title='An Adventure with Qubes OS'/><author><name>Lu Wang</name><uri>http://www.blogger.com/profile/00576609954224192924</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-33534782.post-6589436396464735016</id><published>2025-10-09T00:26:00.005+02:00</published><updated>2025-10-09T00:26:36.599+02:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="configuration/usage"/><category scheme="http://www.blogger.com/atom/ns#" term="podman"/><category scheme="http://www.blogger.com/atom/ns#" term="security"/><category scheme="http://www.blogger.com/atom/ns#" term="Tinker"/><title type='text'>A Rocky Migration: Moving from docker-compose to Podman and gVisor</title><content type='html'>&lt;p class=&quot;code-line&quot; data-line=&quot;2&quot; dir=&quot;auto&quot; &gt;I&#39;ve been running a few containers for several years. 
They were all running under rootless Docker with a single user.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;4&quot; dir=&quot;auto&quot; &gt;Initially, I planned to&amp;nbsp;&lt;a data-href=&quot;https://blog.wang-lu.com/2025/07/disposable-vms-for-home-lab-security.html&quot; href=&quot;https://blog.wang-lu.com/2025/07/disposable-vms-for-home-lab-security.html&quot; &gt;migrate the containers to VMs&lt;/a&gt;, but I couldn&#39;t get a stable workflow after about two months of effort. Later,&amp;nbsp;&lt;a data-href=&quot;https://blog.wang-lu.com/2025/09/gvisor-fresh-look-at-container-security.html&quot; href=&quot;https://blog.wang-lu.com/2025/09/gvisor-fresh-look-at-container-security.html&quot; &gt;gVisor caught my attention&lt;/a&gt;, and I decided to migrate to Podman with gVisor instead.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;6&quot; dir=&quot;auto&quot; &gt;The new plan is to run each container with&amp;nbsp;&lt;code &gt;--userns=auto&lt;/code&gt;&amp;nbsp;and use Quadlet for systemd integration. This approach provides better isolation and makes writing firewall rules easier.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;8&quot; dir=&quot;auto&quot; &gt;I&#39;m now close to migrating all my containers. Here are a couple of rough edges I&#39;d like to share.&lt;/p&gt;&lt;h2 class=&quot;code-line&quot; data-line=&quot;10&quot; dir=&quot;auto&quot; id=&quot;network-layout&quot; &gt;Network Layout&lt;/h2&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;12&quot; dir=&quot;auto&quot; &gt;I compared&amp;nbsp;&lt;a data-href=&quot;https://blog.wang-lu.com/2025/09/hardening-container-network-security.html&quot; href=&quot;https://blog.wang-lu.com/2025/09/hardening-container-network-security.html&quot; &gt;various networking options&lt;/a&gt;&amp;nbsp;and spent a few hours trying the one-interface-per-group approach before giving up. 
I settled on a single macvlan network and decided to use static IP addresses for my containers.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;14&quot; dir=&quot;auto&quot; &gt;To prevent a randomly assigned IP address from conflicting with a predefined one, I allocated a large IP range for my containers and assigned random addresses from that range.&lt;/p&gt;&lt;h2 class=&quot;code-line&quot; data-line=&quot;16&quot; dir=&quot;auto&quot; id=&quot;routing-issues&quot; &gt;Routing Issues&lt;/h2&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;17&quot; dir=&quot;auto&quot; &gt;I ran into a tricky routing problem. Let&#39;s say my host has a network interface&amp;nbsp;&lt;code &gt;eth0&lt;/code&gt;&amp;nbsp;and a veth pair where&amp;nbsp;&lt;code &gt;veth0-host&lt;/code&gt;&amp;nbsp;is on the host and&amp;nbsp;&lt;code &gt;veth0-ctr&lt;/code&gt;&amp;nbsp;is in the container.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;19&quot; dir=&quot;auto&quot; &gt;Here are the IP addresses:&lt;/p&gt;&lt;ul class=&quot;code-line&quot; data-line=&quot;20&quot; dir=&quot;auto&quot; &gt;&lt;li class=&quot;code-line&quot; data-line=&quot;20&quot; dir=&quot;auto&quot; &gt;&lt;code &gt;eth0&lt;/code&gt;: 192.168.0.1&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;21&quot; dir=&quot;auto&quot; &gt;&lt;code &gt;veth0-host&lt;/code&gt;: 192.168.13.1&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;22&quot; dir=&quot;auto&quot; &gt;&lt;code &gt;veth0-ctr&lt;/code&gt;: 192.168.13.100&lt;/li&gt;&lt;/ul&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;24&quot; dir=&quot;auto&quot; &gt;To allow an external client to talk to a service on&amp;nbsp;&lt;code &gt;192.168.13.100:1234&lt;/code&gt;, I set up a prerouting DNAT rule in nftables to forward traffic from&amp;nbsp;&lt;code &gt;192.168.0.1:1234&lt;/code&gt;&amp;nbsp;to&amp;nbsp;&lt;code &gt;192.168.13.100:1234&lt;/code&gt;. 
To my surprise, the host itself couldn&#39;t access&amp;nbsp;&lt;code &gt;192.168.0.1:1234&lt;/code&gt;.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;26&quot; dir=&quot;auto&quot; &gt;It turned out there were two issues:&lt;/p&gt;&lt;ol class=&quot;code-line&quot; data-line=&quot;27&quot; dir=&quot;auto&quot; &gt;&lt;li class=&quot;code-line&quot; data-line=&quot;27&quot; dir=&quot;auto&quot; &gt;DNAT in the&amp;nbsp;&lt;code &gt;prerouting&lt;/code&gt;&amp;nbsp;chain doesn&#39;t apply to loopback traffic from the host itself.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;28&quot; dir=&quot;auto&quot; &gt;Masquerade and SNAT also didn&#39;t work, likely because the kernel has a short-circuit mechanism for local traffic, so the transport happens at Layer 2.&lt;/li&gt;&lt;/ol&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;30&quot; dir=&quot;auto&quot; &gt;Because of this, the service at&amp;nbsp;&lt;code &gt;192.168.13.100&lt;/code&gt;&amp;nbsp;would send reply packets to&amp;nbsp;&lt;code &gt;192.168.0.1&lt;/code&gt;&amp;nbsp;instead of&amp;nbsp;&lt;code &gt;192.168.13.1&lt;/code&gt;. The packet would still be received on&amp;nbsp;&lt;code &gt;veth0-host&lt;/code&gt;, but my firewall would complain that this violates the &quot;strong host model&quot;.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;32&quot; dir=&quot;auto&quot; &gt;I didn&#39;t have this issue before, probably because port forwarding was handled at Layer 3 by slirp4netns. 
This seems to be a &quot;hairpin NAT&quot; issue, but it&#39;s more complicated with a veth pair.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;34&quot; dir=&quot;auto&quot; &gt;In the end, I just configured the host to talk to the container at&amp;nbsp;&lt;code &gt;192.168.13.100&lt;/code&gt;&amp;nbsp;directly.&lt;/p&gt;&lt;h2 class=&quot;code-line&quot; data-line=&quot;36&quot; dir=&quot;auto&quot; id=&quot;file-permissions&quot; &gt;File Permissions&lt;/h2&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;38&quot; dir=&quot;auto&quot; &gt;Because I&#39;m using&amp;nbsp;&lt;code &gt;--userns=auto&lt;/code&gt;, the&amp;nbsp;&lt;code &gt;:U&lt;/code&gt;&amp;nbsp;flag is almost a must when mounting volumes. Surprisingly, this didn&#39;t cause too many problems as long as I set up the correct user and group.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;40&quot; dir=&quot;auto&quot; &gt;Sometimes a container needs to access files on the host. If a group&amp;nbsp;&lt;code &gt;HOST_GID&lt;/code&gt;&amp;nbsp;has access to a file, we can grant access to the container&#39;s primary user with&amp;nbsp;&lt;code &gt;--userns=auto:gidmapping=$CONTAINER_GID:$HOST_GID:1 --group-add $CONTAINER_GID&lt;/code&gt;. Here,&amp;nbsp;&lt;code &gt;CONTAINER_GID&lt;/code&gt;&amp;nbsp;is an unused GID inside the container.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;42&quot; dir=&quot;auto&quot; &gt;However, this only works well with the default&amp;nbsp;&lt;code &gt;crun&lt;/code&gt;&amp;nbsp;runtime. With gVisor, I found two problems:&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;44&quot; dir=&quot;auto&quot; &gt;First, the permission only works if&amp;nbsp;&lt;code &gt;CONTAINER_GID&lt;/code&gt;&amp;nbsp;is the primary group of the container&#39;s main user. 
It doesn&#39;t work if it&#39;s a supplementary group.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;46&quot; dir=&quot;auto&quot; &gt;Second, gVisor&amp;nbsp;&lt;a data-href=&quot;https://github.com/google/gvisor/blob/73f1154f95e717ef2f7e8325cc84a0a0a63f4288/pkg/sentry/fsimpl/tmpfs/tmpfs.go#L191&quot; href=&quot;https://github.com/google/gvisor/blob/73f1154f95e717ef2f7e8325cc84a0a0a63f4288/pkg/sentry/fsimpl/tmpfs/tmpfs.go#L191&quot; &gt;does not seem to support POSIX ACL&lt;/a&gt;. This means the&amp;nbsp;&lt;code &gt;dac_override&lt;/code&gt;&amp;nbsp;capability is needed if the&amp;nbsp;&lt;code &gt;CONTAINER_GID&lt;/code&gt;&amp;nbsp;doesn&#39;t appear to have permission according to standard Unix permissions.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;48&quot; dir=&quot;auto&quot; &gt;It&#39;s not too bad in practice, but it was surprising until I figured out what was going on.&lt;/p&gt;&lt;h2 class=&quot;code-line&quot; data-line=&quot;50&quot; dir=&quot;auto&quot; id=&quot;default-dns-server&quot; &gt;Default DNS Server&lt;/h2&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;52&quot; dir=&quot;auto&quot; &gt;Docker provides a DNS server at&amp;nbsp;&lt;code &gt;127.0.0.11&lt;/code&gt;&amp;nbsp;for each container. Podman, however, creates a dedicated DNS server for each bridge network. Some of my containers relied on the Docker behavior and had this IP address hard-coded, which caused quite a bit of trouble.&lt;/p&gt;&lt;h2 class=&quot;code-line&quot; data-line=&quot;54&quot; dir=&quot;auto&quot; id=&quot;dns-servers-for-multiple-networks&quot; &gt;DNS Servers for Multiple Networks&lt;/h2&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;56&quot; dir=&quot;auto&quot; &gt;If a container joins both an internal bridge network and an external macvlan network, the container only sees the DNS server from the bridge network. 
IP routing still works, meaning the container can access the internet by IP, but it can&#39;t resolve external domains.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;58&quot; dir=&quot;auto&quot; &gt;This is a bug in Podman that has been fixed in the latest version, but it still exists in the version I&#39;m using from Debian. Below are the hacks I considered for this old version.&lt;/p&gt;&lt;h3 class=&quot;code-line&quot; data-line=&quot;60&quot; dir=&quot;auto&quot; id=&quot;option-1-override-dns&quot; &gt;Option 1: Override DNS&lt;/h3&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;62&quot; dir=&quot;auto&quot; &gt;If the container doesn&#39;t need to resolve internal container names, we can force it to use an external DNS server. However, the&amp;nbsp;&lt;code &gt;--dns&lt;/code&gt;&amp;nbsp;flag didn&#39;t work as I expected.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;64&quot; dir=&quot;auto&quot; &gt;According to&amp;nbsp;&lt;a data-href=&quot;https://github.com/containers/podman/issues/17499&quot; href=&quot;https://github.com/containers/podman/issues/17499&quot; &gt;Issue #17500&lt;/a&gt;, the server from&amp;nbsp;&lt;code &gt;--dns&lt;/code&gt;&amp;nbsp;is added&amp;nbsp;&lt;em&gt;into&lt;/em&gt;&amp;nbsp;the other DNS servers as an upstream, so that internal container names can be resolved first. 
This doesn&#39;t work in my case because the internal DNS server can&#39;t access the internet, so it can&#39;t forward the query.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;66&quot; dir=&quot;auto&quot; &gt;A simple workaround is to override&amp;nbsp;&lt;code &gt;/etc/resolv.conf&lt;/code&gt;&amp;nbsp;using a bind mount.&lt;/p&gt;&lt;h3 class=&quot;code-line&quot; data-line=&quot;68&quot; dir=&quot;auto&quot; id=&quot;option-2-http-proxy&quot; &gt;Option 2: HTTP Proxy&lt;/h3&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;70&quot; dir=&quot;auto&quot; &gt;If the container supports an HTTP proxy, we can remove it from the external network. Instead, we can add a proxy container to both the internal and external networks. The proxy can use its own external DNS server, and the original container can use this proxy via its internal IP.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;72&quot; dir=&quot;auto&quot; &gt;This should work in theory, but it felt like too much effort. It also surprised me that nginx doesn&#39;t support proxying HTTPS traffic (the&amp;nbsp;&lt;code &gt;CONNECT&lt;/code&gt;&amp;nbsp;method) without extra effort. If I had to go this route, I would probably use&amp;nbsp;&lt;code &gt;mitmproxy&lt;/code&gt;.&lt;/p&gt;&lt;h3 class=&quot;code-line&quot; data-line=&quot;74&quot; dir=&quot;auto&quot; id=&quot;option-3-transparent-http-proxy&quot; &gt;Option 3: Transparent HTTP Proxy&lt;/h3&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;76&quot; dir=&quot;auto&quot; &gt;I have&amp;nbsp;&lt;a data-href=&quot;https://blog.wang-lu.com/2025/09/hardening-container-network-security.html&quot; href=&quot;https://blog.wang-lu.com/2025/09/hardening-container-network-security.html&quot; &gt;set up a transparent HTTP proxy in my network&lt;/a&gt;. The container thinks it is talking to an external server, but my firewall redirects the traffic to an nginx server, which can forward or reject the traffic. 
Unlike a normal HTTP proxy, this is easy to do with nginx and is completely transparent to the client.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;78&quot; dir=&quot;auto&quot; &gt;Now, if a container only needs to access a few known HTTP/S domains, I just add entries for those domains to the container&#39;s&amp;nbsp;&lt;code &gt;/etc/hosts&lt;/code&gt;&amp;nbsp;file with an arbitrary IP (like 1.1.1.1), using&amp;nbsp;&lt;code &gt;AddHost=&lt;/code&gt;. Nginx completely ignores this IP; it reads the domain from the request and resolves it on its own. It&#39;s very hacky but also very practical.&lt;/p&gt;&lt;h2 class=&quot;code-line&quot; data-line=&quot;80&quot; dir=&quot;auto&quot; id=&quot;namespaces&quot; &gt;Namespaces&lt;/h2&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;82&quot; dir=&quot;auto&quot; &gt;Some containers assume they share the same user namespace, which is common when they are running under docker-compose.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;84&quot; dir=&quot;auto&quot; &gt;Podman has&amp;nbsp;&lt;code &gt;--userns=container:id&lt;/code&gt;&amp;nbsp;to join an existing container&#39;s user namespace, but this doesn&#39;t work with gVisor. From what I&#39;ve learned, this is related to gVisor&#39;s sandbox model.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;86&quot; dir=&quot;auto&quot; &gt;The solution is to put containers into a pod. However, with gVisor, containers will not join the network namespace of the pod due to the same security model. This wasn&#39;t a big deal for my use case, but it was unexpected.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;88&quot; dir=&quot;auto&quot; &gt;The same thing happens with the UTS namespace. 
When running with gVisor, a container&#39;s hostname becomes empty if it joins a pod.&amp;nbsp;&lt;a data-href=&quot;https://github.com/google/gvisor/issues/7995&quot; href=&quot;https://github.com/google/gvisor/issues/7995&quot; &gt;Issue #7995&lt;/a&gt;&amp;nbsp;is relevant here. Apparently, some binaries (like busybox&#39;s sendmail) don&#39;t like an empty hostname. The solution is to give the container a private UTS namespace:&amp;nbsp;&lt;code &gt;--uts=private&lt;/code&gt;.&lt;/p&gt;&lt;h2 class=&quot;code-line&quot; data-line=&quot;90&quot; dir=&quot;auto&quot; id=&quot;shared-volume&quot; &gt;Shared Volume&lt;/h2&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;92&quot; dir=&quot;auto&quot; &gt;Some of my containers use&amp;nbsp;&lt;code &gt;flock&lt;/code&gt;&amp;nbsp;on a shared volume to communicate. I think this is a bad design. And guess what? It doesn&#39;t work with gVisor.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;94&quot; dir=&quot;auto&quot; &gt;I thought I might need to&amp;nbsp;&lt;a data-href=&quot;https://gvisor.dev/docs/user_guide/containerd/configuration/#enabling-inotify-for-shared-volumes&quot; href=&quot;https://gvisor.dev/docs/user_guide/containerd/configuration/#enabling-inotify-for-shared-volumes&quot; &gt;set up some mount hints annotations&lt;/a&gt;, but that didn&#39;t help, so I guess&amp;nbsp;&lt;code &gt;inotify&lt;/code&gt;&amp;nbsp;wasn&#39;t the issue.&amp;nbsp;&lt;a data-href=&quot;https://gvisor.dev/docs/user_guide/filesystem/#shared-root-filesystem&quot; href=&quot;https://gvisor.dev/docs/user_guide/filesystem/#shared-root-filesystem&quot; &gt;Turning on shared file access&lt;/a&gt;&amp;nbsp;didn&#39;t work either.&lt;/p&gt;&lt;h2 class=&quot;code-line&quot; data-line=&quot;96&quot; dir=&quot;auto&quot; id=&quot;other-notes&quot; &gt;Other Notes&lt;/h2&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;98&quot; dir=&quot;auto&quot; &gt;I had planned to use socket activation extensively, but it turned 
out I didn&#39;t. Most containers need networking anyway, and it&#39;s much easier to manage port forwarding with simple firewall rules.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;100&quot; dir=&quot;auto&quot; &gt;It is possible to mount an empty volume, which acts as an&amp;nbsp;&lt;a data-href=&quot;https://docs.docker.com/engine/storage/volumes/#mounting-a-volume-over-existing-data&quot; href=&quot;https://docs.docker.com/engine/storage/volumes/#mounting-a-volume-over-existing-data&quot; &gt;upper overlay&lt;/a&gt;&amp;nbsp;on existing files at the mount destination. This is very useful for read-only containers.&lt;/p&gt;&lt;h2 class=&quot;code-line&quot; data-line=&quot;102&quot; dir=&quot;auto&quot; id=&quot;final-thoughts&quot; &gt;Final Thoughts&lt;/h2&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;104&quot; dir=&quot;auto&quot; &gt;This wasn&#39;t a trivial migration, but I guess that was expected since I was changing so many variables at once. In any case, I&#39;m quite happy with the new setup.&lt;/p&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/33534782/6589436396464735016' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/6589436396464735016'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/6589436396464735016'/><link rel='alternate' type='text/html' href='http://blog.wang-lu.com/2025/10/a-rocky-migration-moving-from-docker.html' title='A Rocky Migration: Moving from docker-compose to Podman and gVisor'/><author><name>Lu Wang</name><uri>http://www.blogger.com/profile/00576609954224192924</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-33534782.post-9211607361964053600</id><published>2025-09-25T22:54:00.009+02:00</published><updated>2025-09-26T11:31:39.379+02:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="configuration/usage"/><category scheme="http://www.blogger.com/atom/ns#" term="network"/><category scheme="http://www.blogger.com/atom/ns#" term="podman"/><title type='text'>Hardening Container Network Security: Filtering Outgoing Traffic</title><content type='html'>&lt;p&gt;I want to filter the outgoing network traffic for all of my containers based on a set of rules. For example:&lt;/p&gt;&lt;ul class=&quot;code-line&quot; data-line=&quot;4&quot; dir=&quot;auto&quot;&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;4&quot; dir=&quot;auto&quot;&gt;Some containers should be blocked from accessing the internet entirely.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;5&quot; dir=&quot;auto&quot;&gt;Some containers should have unrestricted internet access.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;6&quot; dir=&quot;auto&quot;&gt;Some containers should be able to access the internet, but not a specific list of URLs.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;7&quot; dir=&quot;auto&quot;&gt;Some containers should only be allowed to access a specific list of URLs.&lt;/li&gt;&lt;/ul&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;9&quot; dir=&quot;auto&quot;&gt;To manage this, I will define logical policy groups and assign each container to one. 
As a general rule, only DNS and HTTP/HTTPS traffic will be permitted.&lt;/p&gt;&lt;h1 class=&quot;code-line&quot; data-line=&quot;12&quot; dir=&quot;auto&quot; id=&quot;option-1-a-proxy-for-each-policy-group&quot;&gt;Option 1: A Proxy for Each Policy Group&lt;/h1&gt;&lt;div class=&quot;separator&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiPQGM-OtuNwhDgV55UdL-PjPPsfHOlWslsilOJ7QRuLAG1XrRr-8jugtmOIUiN1NKjByTxzte6j4fGBb4bPObZ_RK-RvLjrVi4nxkaeMtj908AlsPPjxDv3zYZrU8_COoL1I9mDiX64IlHQZ3IGnlJ95WvZPvieoCN5Mh1cKfIQZQVQ7DG1sXW2w/s610/1.png&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;400&quot; data-original-width=&quot;610&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiPQGM-OtuNwhDgV55UdL-PjPPsfHOlWslsilOJ7QRuLAG1XrRr-8jugtmOIUiN1NKjByTxzte6j4fGBb4bPObZ_RK-RvLjrVi4nxkaeMtj908AlsPPjxDv3zYZrU8_COoL1I9mDiX64IlHQZ3IGnlJ95WvZPvieoCN5Mh1cKfIQZQVQ7DG1sXW2w/s16000/1.png&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;16&quot; dir=&quot;auto&quot;&gt;Imagine Container A is only allowed to access&amp;nbsp;&lt;code&gt;www.google.com&lt;/code&gt;. 
Here’s how this approach would work:&lt;/p&gt;&lt;ol class=&quot;code-line&quot; data-line=&quot;18&quot; dir=&quot;auto&quot;&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;18&quot; dir=&quot;auto&quot;&gt;Create an Nginx (or&amp;nbsp;&lt;code&gt;socat&lt;/code&gt;) container that listens on port 443 and acts as a reverse proxy for&amp;nbsp;&lt;code&gt;www.google.com&lt;/code&gt;.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;19&quot; dir=&quot;auto&quot;&gt;Place both the Nginx proxy and Container A into an&amp;nbsp;&lt;em&gt;internal&lt;/em&gt;&amp;nbsp;container network.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;20&quot; dir=&quot;auto&quot;&gt;Within this network, add&amp;nbsp;&lt;code&gt;www.google.com&lt;/code&gt;&amp;nbsp;as a network alias for the Nginx container.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;21&quot; dir=&quot;auto&quot;&gt;Connect the Nginx container to a second network that has internet access.&lt;/li&gt;&lt;/ol&gt;&lt;h2 class=&quot;code-line&quot; data-line=&quot;23&quot; dir=&quot;auto&quot; id=&quot;thoughts&quot;&gt;Thoughts&lt;/h2&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;24&quot; dir=&quot;auto&quot;&gt;This is my current solution using&amp;nbsp;&lt;code&gt;docker-compose&lt;/code&gt;, and I believe it should also work with Podman.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;26&quot; dir=&quot;auto&quot;&gt;It is possible to use a single Nginx container to proxy multiple domains, even for HTTPS traffic. 
By using the&amp;nbsp;&lt;a data-href=&quot;https://nginx.org/en/docs/stream/ngx_stream_ssl_preread_module.html&quot; href=&quot;https://nginx.org/en/docs/stream/ngx_stream_ssl_preread_module.html&quot;&gt;&lt;code&gt;ngx_stream_ssl_preread_module&lt;/code&gt;&lt;/a&gt;, Nginx can inspect the requested domain from the TLS handshake and forward the traffic accordingly without needing to decrypt it.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;28&quot; dir=&quot;auto&quot;&gt;This option is straightforward to implement, and a key advantage is that I don&#39;t need to set up a custom DNS server. Firewall rules are also relatively easy to write.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;30&quot; dir=&quot;auto&quot;&gt;On the other hand, configuring and managing a separate proxy container for each rule can become tedious. I think using Quadlet files, especially with templates and drop-in overrides, could simplify this process.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;32&quot; dir=&quot;auto&quot;&gt;Another significant downside is the inability to log blocked traffic. 
If a container tries to access a domain that isn&#39;t explicitly proxied, the connection will simply fail without a log entry, making troubleshooting difficult.&lt;/p&gt;&lt;h1 class=&quot;code-line&quot; data-line=&quot;35&quot; dir=&quot;auto&quot; id=&quot;option-2-central-proxy-on-a-single-network&quot;&gt;Option 2: Central Proxy on a Single Network&lt;/h1&gt;&lt;div class=&quot;separator&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhw_tFs2RnYhT17zol5S-bXTfebeamHUltgH3ufhg0P6Izuh3tAD-ow6lrj1y39wHQAHfwFlKHArtUaTHdOALnBMBkkO2WsINzwDyvQqw0391KPtbMTee6F_scBuvETFHcqI_bK9gvJwzkqmK-M8eo_QJYqir2V8NIV4vpRp-ad380h_sWycYQdCg/s610/2.png&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;500&quot; data-original-width=&quot;610&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhw_tFs2RnYhT17zol5S-bXTfebeamHUltgH3ufhg0P6Izuh3tAD-ow6lrj1y39wHQAHfwFlKHArtUaTHdOALnBMBkkO2WsINzwDyvQqw0391KPtbMTee6F_scBuvETFHcqI_bK9gvJwzkqmK-M8eo_QJYqir2V8NIV4vpRp-ad380h_sWycYQdCg/s16000/2.png&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;39&quot; dir=&quot;auto&quot;&gt;In this design, we set up a central proxy for both HTTP/S and DNS traffic and then perform the following steps:&lt;/p&gt;&lt;ol class=&quot;code-line&quot; data-line=&quot;41&quot; dir=&quot;auto&quot;&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;41&quot; dir=&quot;auto&quot;&gt;Intercept and redirect all traffic from containers to the central proxy using&amp;nbsp;&lt;code&gt;nftables&lt;/code&gt;&amp;nbsp;rules. 
For DNS, this is simpler, as I can configure the container network to use my custom DNS server.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;42&quot; dir=&quot;auto&quot;&gt;The proxy must identify the source container to determine which policy group it belongs to.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;43&quot; dir=&quot;auto&quot;&gt;The proxy must identify the requested destination. This is easy for HTTP (from the URL) and DNS (from the query). For HTTPS, we can again use the SSL preread technique to find the domain in the TLS handshake.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;44&quot; dir=&quot;auto&quot;&gt;The proxy applies the policy, then either blocks or forwards the traffic.&lt;/li&gt;&lt;/ol&gt;&lt;h2 class=&quot;code-line&quot; data-line=&quot;46&quot; dir=&quot;auto&quot; id=&quot;networking&quot;&gt;Networking&lt;/h2&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;48&quot; dir=&quot;auto&quot;&gt;First, I would create a&amp;nbsp;&lt;code&gt;veth&lt;/code&gt;&amp;nbsp;pair. On one end, I would create a&amp;nbsp;&lt;code&gt;macvlan&lt;/code&gt;&amp;nbsp;network in &quot;private&quot; mode and connect the containers to it. The other end would be assigned an IP address on the host to allow routing. This essentially creates a bridge where connected containers are isolated from each other but can reach the gateway.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;50&quot; dir=&quot;auto&quot;&gt;Podman doesn&#39;t seem to support configuring a standard bridge with a mix of isolated and non-isolated ports. 
Note that the&amp;nbsp;&lt;code&gt;--isolate&lt;/code&gt;&amp;nbsp;option in&amp;nbsp;&lt;code&gt;podman network&lt;/code&gt;&amp;nbsp;isolates the entire network from other container networks, not individual ports on the bridge.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;52&quot; dir=&quot;auto&quot;&gt;In the diagram, the proxies are shown on a separate bridge connected to the internet, mainly for illustration. In practice, it might be easier to connect all containers to the same&amp;nbsp;&lt;code&gt;macvlan&lt;/code&gt;&amp;nbsp;network and use a firewall to control traffic flow. Although the&amp;nbsp;&lt;code&gt;macvlan&lt;/code&gt;&amp;nbsp;network is in private mode, the firewall can still permit &quot;hairpin&quot; packets so that specific containers can talk to each other.&lt;/p&gt;&lt;h2 class=&quot;code-line&quot; data-line=&quot;54&quot; dir=&quot;auto&quot; id=&quot;identifying-containers&quot;&gt;Identifying Containers&lt;/h2&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;56&quot; dir=&quot;auto&quot;&gt;We can identify containers by their IP addresses. The tricky part is ensuring these IP addresses are trustworthy and that the setup isn&#39;t prone to errors.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;58&quot; dir=&quot;auto&quot;&gt;Let&#39;s review the IPAM drivers supported by&amp;nbsp;&lt;code&gt;podman network&lt;/code&gt;:&lt;/p&gt;&lt;ul class=&quot;code-line&quot; data-line=&quot;60&quot; dir=&quot;auto&quot;&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;60&quot; dir=&quot;auto&quot;&gt;&lt;strong&gt;dhcp&lt;/strong&gt;: For each container, we can assign a fixed MAC address and create a static reservation in the DHCP server. The firewall can then reliably use the container&#39;s IP address to identify it. This assumes that containers are unprivileged and cannot change their own MAC or IP addresses. 
Ideally, the default address pool of the DHCP server should be disabled to prevent unassigned containers from getting an IP.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;61&quot; dir=&quot;auto&quot;&gt;&lt;strong&gt;host-local&lt;/strong&gt;: With this driver, we assign a static IP address during&amp;nbsp;&lt;code&gt;podman run&lt;/code&gt;. While this sounds simple, it&#39;s easy to forget to provide an IP when running a container manually. If that happens, Podman will assign an IP automatically. This could accidentally grant a container internet access or cause an IP conflict. I haven&#39;t found a way to disable this automatic IP address allocation.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;62&quot; dir=&quot;auto&quot;&gt;&lt;strong&gt;none&lt;/strong&gt;: This driver does not assign an IP address, and you cannot manually provide one either.&lt;/li&gt;&lt;/ul&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;64&quot; dir=&quot;auto&quot;&gt;In theory, using &quot;dhcp&quot; for the IP configuration should work. However, a practical issue emerges because&amp;nbsp;&lt;code&gt;systemd-networkd&lt;/code&gt;&amp;nbsp;respects the client ID in DHCP requests, and Podman&amp;nbsp;&lt;a data-href=&quot;https://github.com/containers/netavark/pull/1130&quot; href=&quot;https://github.com/containers/netavark/pull/1130&quot;&gt;sends the container ID as this client identifier&lt;/a&gt;. Furthermore, since&amp;nbsp;&lt;code&gt;podman-systemd&lt;/code&gt;&amp;nbsp;utilizes&amp;nbsp;&lt;code&gt;podman run --rm&lt;/code&gt;, restarting a&amp;nbsp;&lt;code&gt;container.service&lt;/code&gt;&amp;nbsp;generates a new container with a new ID.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;70&quot; dir=&quot;auto&quot;&gt;This combination means the DHCP server sees the restarted container as a new machine. It then refuses to offer the configured static IP address, believing the lease is still valid and held by the original container. 
I have not yet found a way to override this client ID, so I may need to evaluate a different DHCP server or abandon this approach.&lt;/p&gt;&lt;h2 class=&quot;code-line&quot; data-line=&quot;66&quot; dir=&quot;auto&quot; id=&quot;deciding-the-policy-group&quot;&gt;Deciding the Policy Group&lt;/h2&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;68&quot; dir=&quot;auto&quot;&gt;Once the container is identified, applying the policy is relatively easy:&lt;/p&gt;&lt;ul class=&quot;code-line&quot; data-line=&quot;70&quot; dir=&quot;auto&quot;&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;70&quot; dir=&quot;auto&quot;&gt;&lt;strong&gt;CoreDNS&lt;/strong&gt;&amp;nbsp;has the&amp;nbsp;&lt;a data-href=&quot;https://coredns.io/plugins/view/&quot; href=&quot;https://coredns.io/plugins/view/&quot;&gt;&lt;code&gt;view&lt;/code&gt;&lt;/a&gt;&amp;nbsp;plugin, which can apply different rules based on the client&#39;s IP address.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;71&quot; dir=&quot;auto&quot;&gt;&lt;strong&gt;Nginx&lt;/strong&gt;&amp;nbsp;has the&amp;nbsp;&lt;a data-href=&quot;https://nginx.org/en/docs/http/ngx_http_geo_module.html&quot; href=&quot;https://nginx.org/en/docs/http/ngx_http_geo_module.html&quot;&gt;&lt;code&gt;geo&lt;/code&gt;&lt;/a&gt;&amp;nbsp;module, which can be used to map a client&#39;s IP address to a variable for use in access rules. 
You can also use&amp;nbsp;&lt;code&gt;map $remote_addr&lt;/code&gt;.&lt;/li&gt;&lt;/ul&gt;&lt;h1 class=&quot;code-line&quot; data-line=&quot;73&quot; dir=&quot;auto&quot; id=&quot;option-3-one-network-per-policy-group&quot;&gt;Option 3: One Network Per Policy Group&lt;/h1&gt;&lt;div class=&quot;separator&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjlh82_WZVUeRfs4CPaVG5L-cKbqzUGMUp8CxhwrK-JJi6k_SK8LKZGaaSLNllL8QnCGdzHAtTCcFBiNtFLIvRjyRJ25AZDQ7bAIPl4m1BuzYoCBmOv7dUE9eUhk02qyjVB0wwzRSGVJCXnyyJ4_XAs4LH1WmKYk70sQDe3al3CB__sGKzqcz5R8w/s610/3.png&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;470&quot; data-original-width=&quot;610&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjlh82_WZVUeRfs4CPaVG5L-cKbqzUGMUp8CxhwrK-JJi6k_SK8LKZGaaSLNllL8QnCGdzHAtTCcFBiNtFLIvRjyRJ25AZDQ7bAIPl4m1BuzYoCBmOv7dUE9eUhk02qyjVB0wwzRSGVJCXnyyJ4_XAs4LH1WmKYk70sQDe3al3CB__sGKzqcz5R8w/s16000/3.png&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;75&quot; dir=&quot;auto&quot;&gt;This approach extends the &quot;veth+macvlan&quot; technique by creating a separate network for each policy group. We then use&amp;nbsp;&lt;code&gt;nftables&lt;/code&gt;&amp;nbsp;rules to forward traffic from all networks to a central proxy. 
This is similar to Option 2, but this time&amp;nbsp;&lt;code&gt;nftables&lt;/code&gt;&amp;nbsp;can identify the source policy group by the network interface the traffic arrives on.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;79&quot; dir=&quot;auto&quot;&gt;This approach is more secure if you are concerned about IP or MAC spoofing since the network interface is a more reliable identifier than an IP address alone.&lt;/p&gt;&lt;h2 class=&quot;code-line&quot; data-line=&quot;81&quot; dir=&quot;auto&quot; id=&quot;identifying-the-policy-group&quot;&gt;Identifying the Policy Group&lt;/h2&gt;&lt;ul class=&quot;code-line&quot; data-line=&quot;83&quot; dir=&quot;auto&quot;&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;83&quot; dir=&quot;auto&quot;&gt;&lt;strong&gt;By IP Address&lt;/strong&gt;: We can configure a DHCP server for each network with a non-overlapping IP range. The proxies can then identify containers by their IP address, just like in Option 2, but with greater trust since the IP is tied to a specific network. We still need to be cautious to ensure IP ranges don&#39;t overlap.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;84&quot; dir=&quot;auto&quot;&gt;&lt;strong&gt;By Interface&lt;/strong&gt;: We can identify traffic by the interface it comes from.&lt;ul class=&quot;code-line&quot; data-line=&quot;85&quot; dir=&quot;auto&quot;&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;85&quot; dir=&quot;auto&quot;&gt;CoreDNS has the&amp;nbsp;&lt;a data-href=&quot;https://coredns.io/plugins/bind/&quot; href=&quot;https://coredns.io/plugins/bind/&quot;&gt;&lt;code&gt;bind&lt;/code&gt;&lt;/a&gt;&amp;nbsp;plugin, which allows it to listen on specific host interfaces. However, this requires CoreDNS to run in the host network, and the proxy would need to be restarted every time a new policy group (and thus a new interface) is added. 
It&#39;s also unclear how this would work with Nginx.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;86&quot; dir=&quot;auto&quot;&gt;A variation is to run CoreDNS with port forwarding (or perhaps socket activation) so that it listens on all interfaces, and then use&amp;nbsp;&lt;a data-href=&quot;https://wiki.nftables.org/wiki-nftables/index.php/Performing_Network_Address_Translation_(NAT)#Redirect&quot; href=&quot;https://wiki.nftables.org/wiki-nftables/index.php/Performing_Network_Address_Translation_(NAT)#Redirect&quot;&gt;&lt;code&gt;redirect&lt;/code&gt;&lt;/a&gt;&amp;nbsp;in nftables. This way, the traffic within each policy group&amp;nbsp;&lt;em&gt;should&lt;/em&gt;&amp;nbsp;be redirected to the corresponding gateway. However, this setup sounds complicated and, as with the previous option, I&#39;m not sure it would work for Nginx.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;87&quot; dir=&quot;auto&quot;&gt;A more complex option is to use&amp;nbsp;&lt;code&gt;nftables&lt;/code&gt;&amp;nbsp;to map each incoming interface to a different port on the host. We could then run a proxy instance for each policy group, listening on its assigned port. This essentially moves the identification logic into&amp;nbsp;&lt;code&gt;nftables&lt;/code&gt;&amp;nbsp;and is useful if a proxy doesn&#39;t support IP-based policies, but the rules would be complicated and fragile.
For example, we would need rules to prevent a container from accessing a proxy port it&#39;s not authorized for.&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;/ul&gt;&lt;h1 class=&quot;code-line&quot; data-line=&quot;90&quot; dir=&quot;auto&quot; id=&quot;my-plan&quot;&gt;My Plan&lt;/h1&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;92&quot; dir=&quot;auto&quot;&gt;Ultimately, I need to find a balance between two goals:&lt;/p&gt;&lt;ul class=&quot;code-line&quot; data-line=&quot;94&quot; dir=&quot;auto&quot;&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;94&quot; dir=&quot;auto&quot;&gt;Maximum Security: Resisting vulnerabilities and malicious actors.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;95&quot; dir=&quot;auto&quot;&gt;Ease of Maintenance: Requiring minimal effort and not being error-prone.&lt;/li&gt;&lt;/ul&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;97&quot; dir=&quot;auto&quot;&gt;I will most likely implement Option 2, if I can make it work. It offers a good blend of centralized control and flexibility without the complexity of managing dozens of networks or proxy containers.&lt;/p&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/33534782/9211607361964053600' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/9211607361964053600'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/9211607361964053600'/><link rel='alternate' type='text/html' href='http://blog.wang-lu.com/2025/09/hardening-container-network-security.html' title='Hardening Container Network Security: Filtering Outgoing Traffic'/><author><name>Lu Wang</name><uri>http://www.blogger.com/profile/00576609954224192924</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiPQGM-OtuNwhDgV55UdL-PjPPsfHOlWslsilOJ7QRuLAG1XrRr-8jugtmOIUiN1NKjByTxzte6j4fGBb4bPObZ_RK-RvLjrVi4nxkaeMtj908AlsPPjxDv3zYZrU8_COoL1I9mDiX64IlHQZ3IGnlJ95WvZPvieoCN5Mh1cKfIQZQVQ7DG1sXW2w/s72-c/1.png" height="72" width="72"/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-33534782.post-2480761985834851375</id><published>2025-09-20T01:45:00.006+02:00</published><updated>2025-09-21T11:27:45.304+02:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="configuration/usage"/><category scheme="http://www.blogger.com/atom/ns#" term="Experiment"/><category scheme="http://www.blogger.com/atom/ns#" term="podman"/><title type='text'>A Journey into Podman: Notes on My First Adventure</title><content type='html'>&lt;p class=&quot;code-line&quot; data-line=&quot;2&quot; dir=&quot;auto&quot; &gt;For the last few days, I&#39;ve been experimenting with Podman. 
My goal was to get a feel for the setup, create a minimal yet scalable environment for a few containers, and identify potential problems early on.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;4&quot; dir=&quot;auto&quot; &gt;Here are my notes from this experience.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;6&quot; dir=&quot;auto&quot; &gt;[Updates 2025-09-21] Added more networking options and other information.&lt;/p&gt;&lt;h2 class=&quot;code-line&quot; data-line=&quot;8&quot; dir=&quot;auto&quot; id=&quot;quadlet&quot; &gt;Quadlet&lt;/h2&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;10&quot; dir=&quot;auto&quot; &gt;&lt;a data-href=&quot;https://docs.podman.io/en/latest/markdown/podman-systemd.unit.5.html&quot; href=&quot;https://docs.podman.io/en/latest/markdown/podman-systemd.unit.5.html&quot; &gt;Quadlet&lt;/a&gt;&amp;nbsp;allows you to define containers, networks and more using a syntax similar to systemd. This includes helpful features like drop-in overrides and templates.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;12&quot; dir=&quot;auto&quot; &gt;The framework is tightly integrated with systemd, and Quadlet actually generates real systemd units. This means I can directly write systemd options in my Quadlet files.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;14&quot; dir=&quot;auto&quot; &gt;One of the biggest benefits I&#39;ve found is how easy Quadlet makes it to set up socket activation. This allows me to place some containers in an internal network or even without a network at all.&lt;/p&gt;&lt;h2 class=&quot;code-line&quot; data-line=&quot;16&quot; dir=&quot;auto&quot; id=&quot;hardening-defaults&quot; &gt;Hardening Defaults&lt;/h2&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;18&quot; dir=&quot;auto&quot; &gt;Let&#39;s say I have a group of Systemd and Quadlet units, all named in the format of&amp;nbsp;&lt;code &gt;xyz-*&lt;/code&gt;. 
My goal is to define some secure, hardened default values for these units that can still be overridden by individual units.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;20&quot; dir=&quot;auto&quot; &gt;For example, I want to change the default for&amp;nbsp;&lt;code &gt;ProtectSystem=&lt;/code&gt;&amp;nbsp;from&amp;nbsp;&lt;code &gt;false&lt;/code&gt;&amp;nbsp;to&amp;nbsp;&lt;code &gt;true&lt;/code&gt;.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;22&quot; dir=&quot;auto&quot; &gt;Simply creating a&amp;nbsp;&lt;code &gt;xyz-.service.d/00-override.conf&lt;/code&gt;&amp;nbsp;doesn&#39;t work, because an individual unit cannot override this setting. While I could create a&amp;nbsp;&lt;code &gt;xyz-foo.service.d/10-override.conf&lt;/code&gt;&amp;nbsp;for each specific service, this would split the definition of&amp;nbsp;&lt;code &gt;xyz-foo&lt;/code&gt;&amp;nbsp;into two separate files, which isn&#39;t ideal.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;24&quot; dir=&quot;auto&quot; &gt;To solve this, I created a script that moves the main&amp;nbsp;&lt;code &gt;xyz-foo.service&lt;/code&gt;&amp;nbsp;file to&amp;nbsp;&lt;code &gt;xyz-foo.service.d/10-override.conf&lt;/code&gt;&amp;nbsp;and creates a nearly empty&amp;nbsp;&lt;code &gt;xyz-foo.service&lt;/code&gt;&amp;nbsp;as a placeholder. 
This facilitates the process of setting and overriding defaults.&lt;/p&gt;&lt;h2 class=&quot;code-line&quot; data-line=&quot;26&quot; dir=&quot;auto&quot; id=&quot;gvisor&quot; &gt;gVisor&lt;/h2&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;28&quot; dir=&quot;auto&quot; &gt;Setting up gVisor&#39;s&amp;nbsp;&lt;code &gt;runsc&lt;/code&gt;&amp;nbsp;was straightforward, and I haven&#39;t encountered any compatibility issues so far.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;30&quot; dir=&quot;auto&quot; &gt;Unfortunately, the version of gVisor in the Debian repositories is quite old, and&amp;nbsp;&lt;a data-href=&quot;https://www.debian.org/releases/trixie/release-notes/issues.en.html#go-and-rust-based-packages&quot; href=&quot;https://www.debian.org/releases/trixie/release-notes/issues.en.html#go-and-rust-based-packages&quot; &gt;Debian is unable to provide prompt security updates for it&lt;/a&gt;. This meant I had to add the official gVisor&amp;nbsp;&lt;code &gt;apt&lt;/code&gt;&amp;nbsp;repository. It&#39;s not the ideal solution, but it works.&lt;/p&gt;&lt;h2 class=&quot;code-line&quot; data-line=&quot;33&quot; dir=&quot;auto&quot; id=&quot;credentials&quot; &gt;Credentials&lt;/h2&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;35&quot; dir=&quot;auto&quot; &gt;Systemd-Credentials is a very handy tool for managing sensitive information and can be used directly in Quadlet files.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;37&quot; dir=&quot;auto&quot; &gt;However, I ran into an issue when using it with containers that have&amp;nbsp;&lt;code &gt;--userns=auto&lt;/code&gt;. Because Podman still runs as root, the credentials are only readable by the root user and not by the containers.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;39&quot; dir=&quot;auto&quot; &gt;Podman offers its own solution for this called&amp;nbsp;&lt;code &gt;podman-secret&lt;/code&gt;. 
This feature allows you to either have Podman store the secrets or use drivers to connect to your own storage solution.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;41&quot; dir=&quot;auto&quot; &gt;I prefer to keep all my secrets in a dedicated directory. To accommodate this, I wrote a simple script that registers a file as a Podman secret:&lt;/p&gt;&lt;pre &gt;&lt;code class=&quot;code-line&quot; data-line=&quot;43&quot; dir=&quot;auto&quot; &gt;podman secret create \
  --driver=shell \
  --driver-opts=list=&quot;echo $1&quot; \
  --driver-opts=lookup=&quot;cat $2&quot; \
  --driver-opts=store=/bin/true \
  --driver-opts=delete=/bin/true \
  --replace=true \
  &quot;$1&quot; - &amp;lt;&amp;lt;&amp;lt;SECRET
&lt;/code&gt;&lt;/pre&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;54&quot; dir=&quot;auto&quot; &gt;For any container that needs access to secrets, I simply call this script in&amp;nbsp;&lt;code &gt;ExecStartPre=&lt;/code&gt;.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;56&quot; dir=&quot;auto&quot; &gt;I ran into a few problems and filed&amp;nbsp;&lt;a data-href=&quot;https://github.com/containers/podman/issues/27130&quot; href=&quot;https://github.com/containers/podman/issues/27130&quot; &gt;an issue&lt;/a&gt;&amp;nbsp;on GitHub. Fortunately, workarounds are available.&lt;/p&gt;&lt;h2 class=&quot;code-line&quot; data-line=&quot;58&quot; dir=&quot;auto&quot; id=&quot;networking&quot; &gt;Networking&lt;/h2&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;60&quot; dir=&quot;auto&quot; &gt;My plan is to run a few groups of containers with the following requirements:&lt;/p&gt;&lt;ul class=&quot;code-line&quot; data-line=&quot;61&quot; dir=&quot;auto&quot; &gt;&lt;li class=&quot;code-line&quot; data-line=&quot;61&quot; dir=&quot;auto&quot; &gt;Containers within the same group can communicate with each other, but containers from different groups cannot.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;62&quot; dir=&quot;auto&quot; &gt;I want to avoid manually managing IP addresses for each group or container.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;63&quot; dir=&quot;auto&quot; &gt;It should be easy to write firewall rules to intercept container traffic.&lt;/li&gt;&lt;/ul&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;65&quot; dir=&quot;auto&quot; &gt;I explored a few options to achieve this:&lt;/p&gt;&lt;ol class=&quot;code-line&quot; data-line=&quot;67&quot; dir=&quot;auto&quot; &gt;&lt;li class=&quot;code-line&quot; data-line=&quot;67&quot; dir=&quot;auto&quot; &gt;&lt;p class=&quot;code-line&quot; data-line=&quot;67&quot; dir=&quot;auto&quot; &gt;&lt;strong&gt;Plain Bridges:&lt;/strong&gt;&amp;nbsp;The
simplest approach is to create a default bridge for each container group, which is the default behavior in Podman. Everything works out of the box, and I don&#39;t need to explicitly specify IP addresses. However, writing firewall rules is tricky because the IP addresses and bridge names are not predetermined. To make this work, I would need to carefully define an IP range and/or bridge name for each bridge.&lt;/p&gt;&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;69&quot; dir=&quot;auto&quot; &gt;&lt;p class=&quot;code-line&quot; data-line=&quot;69&quot; dir=&quot;auto&quot; &gt;&lt;strong&gt;Single Bridge:&lt;/strong&gt;&amp;nbsp;Another option is to create a bridge using&amp;nbsp;&lt;code &gt;systemd-networkd&lt;/code&gt;&amp;nbsp;and then, for each container group, create a Podman bridge network using the existing bridge with VLANs and/or isolation. Since the bridge is unmanaged, I will need to manually set up DHCP, DNS, and the firewall.&lt;/p&gt;&lt;ul class=&quot;code-line&quot; data-line=&quot;70&quot; dir=&quot;auto&quot; &gt;&lt;li class=&quot;code-line&quot; data-line=&quot;70&quot; dir=&quot;auto&quot; &gt;Simply setting up a DHCP server on the bridge (i.e. in bridge.network) doesn&#39;t work. I think this is because Podman, or more precisely netavark-dhcp-proxy, uses the interface (the bridge in this case) as a DHCP client rather than a server.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;71&quot; dir=&quot;auto&quot; &gt;It works if I add an external DHCP server to the bridge. E.g.
create a veth pair, put one end under the bridge and add a DHCP server on the other end.&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;73&quot; dir=&quot;auto&quot; &gt;&lt;p class=&quot;code-line&quot; data-line=&quot;73&quot; dir=&quot;auto&quot; &gt;&lt;strong&gt;VRF:&lt;/strong&gt;&amp;nbsp;I tried creating a VRF with&amp;nbsp;&lt;code &gt;systemd-networkd&lt;/code&gt;&amp;nbsp;and then creating a Podman bridge network for each container group, specifying the VRF. DHCP works with this setup, but the&amp;nbsp;&lt;a data-href=&quot;https://github.com/containers/aardvark-dns/pull/562&quot; href=&quot;https://github.com/containers/aardvark-dns/pull/562&quot; &gt;DNS server doesn&#39;t&lt;/a&gt;. While I was able to force&amp;nbsp;&lt;code &gt;aardvark-dns&lt;/code&gt;&amp;nbsp;to use the VRF by using a wrapper script (&lt;code &gt;exec ip vrf exec MY-VRF aardvark-dns &quot;$@&quot;&lt;/code&gt;), this solution felt too fragile for long-term use. Also, I wasn&#39;t able to easily isolate containers within the same network.&lt;/p&gt;&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;75&quot; dir=&quot;auto&quot; &gt;&lt;p class=&quot;code-line&quot; data-line=&quot;75&quot; dir=&quot;auto&quot; &gt;&lt;strong&gt;veth + macvlan:&lt;/strong&gt;&amp;nbsp;This setup involves creating a&amp;nbsp;&lt;code &gt;veth&lt;/code&gt;&amp;nbsp;pair for each container group, putting a DHCP server on one end, and then using the other end to create a Podman&amp;nbsp;&lt;code &gt;macvlan&lt;/code&gt;&amp;nbsp;network. This works, and it is easy to isolate containers within one Podman network by using the&amp;nbsp;&lt;code &gt;private&lt;/code&gt;&amp;nbsp;mode and setting up the necessary firewall rules.
But note that the DNS server is not supported in&amp;nbsp;&lt;code &gt;macvlan&lt;/code&gt;&amp;nbsp;mode.&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;&lt;h3 class=&quot;code-line&quot; data-line=&quot;77&quot; dir=&quot;auto&quot; id=&quot;my-networking-plan&quot; &gt;My Networking Plan&lt;/h3&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;79&quot; dir=&quot;auto&quot; &gt;All of these options more or less work, with different trade-offs. I plan to use 1 and 4 for different use cases:&lt;/p&gt;&lt;ol class=&quot;code-line&quot; data-line=&quot;81&quot; dir=&quot;auto&quot; &gt;&lt;li class=&quot;code-line&quot; data-line=&quot;81&quot; dir=&quot;auto&quot; &gt;When a group of containers needs to find each other by name, I put them into a bridge network (option 1) with&amp;nbsp;&lt;code &gt;Internal=true&lt;/code&gt;. Both DHCP and DNS work, and I don&#39;t need to write many firewall rules.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;82&quot; dir=&quot;auto&quot; &gt;When a container needs to access the Internet, I create a private macvlan network (option 4) and put the container into it. Since I can easily isolate containers within the same network, it is possible to put containers from different groups into the same bridge network. Then I can just use the bridge network to group containers by policies, e.g. some containers can access everything, but some are only allowed to visit specific URLs. I just need to write firewall rules for each bridge network.&lt;/li&gt;&lt;/ol&gt;&lt;h2 class=&quot;code-line&quot; data-line=&quot;87&quot; dir=&quot;auto&quot; id=&quot;final-thoughts&quot; &gt;Final Thoughts&lt;/h2&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;92&quot; dir=&quot;auto&quot; &gt;So far, the combination of Podman, Quadlet, and gVisor has been a positive experience. Not everything has worked perfectly, but I&#39;m quite happy with the setup.
If things continue to go well, I might be able to migrate my&amp;nbsp;&lt;code &gt;docker-compose&lt;/code&gt;&amp;nbsp;setup in the near future.&lt;/p&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/33534782/2480761985834851375' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/2480761985834851375'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/2480761985834851375'/><link rel='alternate' type='text/html' href='http://blog.wang-lu.com/2025/09/a-journey-into-podman-notes-on-my-first.html' title='A Journey into Podman: Notes on My First Adventure'/><author><name>Lu Wang</name><uri>http://www.blogger.com/profile/00576609954224192924</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-33534782.post-8558732413464279046</id><published>2025-09-17T22:08:00.009+02:00</published><updated>2025-09-17T22:21:18.784+02:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="configuration/usage"/><category scheme="http://www.blogger.com/atom/ns#" term="Experiment"/><category scheme="http://www.blogger.com/atom/ns#" term="security"/><category scheme="http://www.blogger.com/atom/ns#" term="selinux"/><title type='text'>gVisor: A Fresh Look at Container Security</title><content type='html'>&lt;p class=&quot;code-line&quot; data-line=&quot;2&quot; dir=&quot;auto&quot;&gt;My original plan was to stabilize my&amp;nbsp;&lt;a data-href=&quot;https://blog.wang-lu.com/2025/08/rethinking-my-vm-image-pipeline.html&quot; href=&quot;https://blog.wang-lu.com/2025/07/disposable-vms-for-home-lab-security.html&quot;&gt;VM pipeline&lt;/a&gt;&amp;nbsp;before deploying containers using a hardened stack of Podman, 
QEMU, SELinux, and user namespaces (&lt;code&gt;--userns=auto&lt;/code&gt;). However, the pipeline&#39;s complexity grew, requiring script rewrites and schema redesigns, and the process took much longer than anticipated.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;4&quot; dir=&quot;auto&quot;&gt;In the meantime, an interesting alternative has captured my attention:&amp;nbsp;&lt;a data-href=&quot;https://gvisor.dev/&quot; href=&quot;https://gvisor.dev/&quot;&gt;gVisor&lt;/a&gt;. It occupies a unique space between traditional SELinux policies and full-blown virtual machines, offering a compelling set of trade-offs.&lt;/p&gt;&lt;h2 class=&quot;code-line&quot; data-line=&quot;26&quot; dir=&quot;auto&quot; id=&quot;what-is-gvisor&quot;&gt;What is gVisor?&lt;/h2&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;8&quot; dir=&quot;auto&quot;&gt;At its core, gVisor is an application kernel, written in the memory-safe language Go, that provides an additional layer of isolation between containerized applications and the host operating system. It&#39;s essentially a user-space implementation of the Linux kernel&#39;s system call interface.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;10&quot; dir=&quot;auto&quot;&gt;The security model is explained&amp;nbsp;&lt;a data-href=&quot;https://gvisor.dev/docs/architecture_guide/security/&quot; href=&quot;https://gvisor.dev/docs/architecture_guide/security/&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;&lt;h2 class=&quot;code-line&quot; data-line=&quot;21&quot; dir=&quot;auto&quot; id=&quot;gvisor-in-practice&quot;&gt;gVisor in Practice&lt;/h2&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;14&quot; dir=&quot;auto&quot;&gt;gVisor provides an OCI-compliant runtime called&amp;nbsp;&lt;code&gt;runsc&lt;/code&gt;, which can be almost transparently integrated with container tools like Docker and Podman.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;16&quot; dir=&quot;auto&quot;&gt;And that&#39;s it! 
Unlike SELinux, here we don&#39;t need to write any policies. This is the most attractive feature for me.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;18&quot; dir=&quot;auto&quot;&gt;However, it comes with notable downsides:&lt;/p&gt;&lt;ul class=&quot;code-line&quot; data-line=&quot;19&quot; dir=&quot;auto&quot;&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;19&quot; dir=&quot;auto&quot;&gt;SELinux is not supported, so I cannot use gVisor and SELinux at the same time.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;20&quot; dir=&quot;auto&quot;&gt;&lt;code&gt;--ignore-cgroups&lt;/code&gt;&amp;nbsp;must be used for rootless Podman, which means cgroups won&#39;t work. This might be&amp;nbsp;&lt;a data-href=&quot;https://github.com/google/gvisor/issues/11544&quot; href=&quot;https://github.com/google/gvisor/issues/11544&quot;&gt;fixed later&lt;/a&gt;.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;21&quot; dir=&quot;auto&quot;&gt;There may be compatibility issues, because gVisor implements its own version of the syscalls.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;22&quot; dir=&quot;auto&quot;&gt;The performance overhead is higher, especially for IO-related syscalls. This is explained well&amp;nbsp;&lt;a data-href=&quot;https://gvisor.dev/docs/architecture_guide/performance/&quot; href=&quot;https://gvisor.dev/docs/architecture_guide/performance/&quot;&gt;here&lt;/a&gt;.&lt;/li&gt;&lt;/ul&gt;&lt;h2 class=&quot;code-line&quot; data-line=&quot;25&quot; dir=&quot;auto&quot; id=&quot;my-plan&quot;&gt;My Plan&lt;/h2&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;27&quot; dir=&quot;auto&quot;&gt;I plan to evaluate gVisor with a few of my simple containers.
Its promise of &quot;secure-by-default&quot; sandboxing without complex configuration is very appealing, especially for running applications where trust is a concern but the overhead of a full VM is undesirable.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;29&quot; dir=&quot;auto&quot;&gt;I also believe that I don&#39;t really need the fine-grained control offered by SELinux. Bind mounts (read-only, read-write) should be enough for me. Eventually I might even drop the VM pipeline and just use gVisor.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;31&quot; dir=&quot;auto&quot;&gt;We&#39;ll see.&lt;/p&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/33534782/8558732413464279046' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/8558732413464279046'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/8558732413464279046'/><link rel='alternate' type='text/html' href='http://blog.wang-lu.com/2025/09/gvisor-fresh-look-at-container-security.html' title='gVisor: A Fresh Look at Container Security'/><author><name>Lu Wang</name><uri>http://www.blogger.com/profile/00576609954224192924</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-33534782.post-7342797061981734885</id><published>2025-08-24T23:40:00.010+02:00</published><updated>2025-08-25T11:58:53.857+02:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="configuration/usage"/><category scheme="http://www.blogger.com/atom/ns#" term="sysadmin"/><title type='text'>A Declarative Approach to Config File Management</title><content type='html'>&lt;p class=&quot;code-line&quot; data-line=&quot;2&quot; 
dir=&quot;auto&quot;&gt;Configuration files for different services are rarely independent. For example, in nftables, I might tag traffic with a firewall mark, and that mark is then used by systemd-networkd or in ip routes. Similarly, when the name of the primary network interface changes, multiple services like nftables, postfix, and samba need to be updated.&lt;/p&gt;&lt;h2 class=&quot;code-line&quot; data-line=&quot;4&quot; dir=&quot;auto&quot; id=&quot;requirements&quot;&gt;Requirements&lt;/h2&gt;&lt;ul class=&quot;code-line&quot; data-line=&quot;6&quot; dir=&quot;auto&quot;&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;6&quot; dir=&quot;auto&quot;&gt;I want to define core data in one place, then update all config files with a simple command.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;7&quot; dir=&quot;auto&quot;&gt;If a configuration file is modified by an external process (for example, a package update from a vendor or distribution), the changes must be handled gracefully. Either the merge should be automatic and permanent, or I should be notified to easily resolve any conflicts.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;8&quot; dir=&quot;auto&quot;&gt;It should be obvious within the config file itself what changes I have made.&lt;/li&gt;&lt;/ul&gt;&lt;h2 class=&quot;code-line&quot; data-line=&quot;11&quot; dir=&quot;auto&quot; id=&quot;existing-solutions&quot;&gt;Existing Solutions&lt;/h2&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;13&quot; dir=&quot;auto&quot;&gt;I did a quick survey and found a few options.&lt;/p&gt;&lt;h3 class=&quot;code-line&quot; data-line=&quot;15&quot; dir=&quot;auto&quot; id=&quot;1-templates&quot;&gt;1. Templates&lt;/h3&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;16&quot; dir=&quot;auto&quot;&gt;These tools render a template using provided data sources.
To manage&amp;nbsp;&lt;code&gt;/etc/config.txt&lt;/code&gt;, I would create a&amp;nbsp;&lt;code&gt;/etc/config.txt.template&lt;/code&gt;&amp;nbsp;with all the moving parts marked using the required syntax.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;18&quot; dir=&quot;auto&quot;&gt;Examples include:&lt;/p&gt;&lt;ul class=&quot;code-line&quot; data-line=&quot;19&quot; dir=&quot;auto&quot;&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;19&quot; dir=&quot;auto&quot;&gt;&lt;a data-href=&quot;https://github.com/hairyhenderson/gomplate&quot; href=&quot;https://github.com/hairyhenderson/gomplate&quot;&gt;gomplate&lt;/a&gt;&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;20&quot; dir=&quot;auto&quot;&gt;&lt;a data-href=&quot;https://www.gnu.org/software/gettext/manual/html_node/envsubst-Invocation.html&quot; href=&quot;https://www.gnu.org/software/gettext/manual/html_node/envsubst-Invocation.html&quot;&gt;envsubst&lt;/a&gt;&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;21&quot; dir=&quot;auto&quot;&gt;&lt;a data-href=&quot;https://github.com/pallets/jinja/&quot; href=&quot;https://github.com/pallets/jinja/&quot;&gt;Jinja&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;23&quot; dir=&quot;auto&quot;&gt;The biggest issue is that the generated config file is no longer the source of truth. This means if the generated file is modified by other tools, those changes will be lost the next time I render the template.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;25&quot; dir=&quot;auto&quot;&gt;Perhaps these tools are better suited for scenarios like building images with&amp;nbsp;&lt;code&gt;bootc&lt;/code&gt;&amp;nbsp;or&amp;nbsp;&lt;code&gt;mkosi&lt;/code&gt;.&lt;/p&gt;&lt;h3 class=&quot;code-line&quot; data-line=&quot;27&quot; dir=&quot;auto&quot; id=&quot;2-patching&quot;&gt;2. 
Patching&lt;/h3&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;28&quot; dir=&quot;auto&quot;&gt;These tools record and apply the diff between a default state and the desired state.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;30&quot; dir=&quot;auto&quot;&gt;Examples include:&lt;/p&gt;&lt;ul class=&quot;code-line&quot; data-line=&quot;31&quot; dir=&quot;auto&quot;&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;31&quot; dir=&quot;auto&quot;&gt;&lt;code&gt;diff&lt;/code&gt;&amp;nbsp;and&amp;nbsp;&lt;code&gt;patch&lt;/code&gt;&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;32&quot; dir=&quot;auto&quot;&gt;Ansible&#39;s&amp;nbsp;&lt;a data-href=&quot;https://docs.ansible.com/ansible/latest/collections/ansible/builtin/lineinfile_module.html&quot; href=&quot;https://docs.ansible.com/ansible/latest/collections/ansible/builtin/lineinfile_module.html&quot;&gt;&lt;code&gt;lineinfile&lt;/code&gt;&lt;/a&gt;&amp;nbsp;and&amp;nbsp;&lt;a data-href=&quot;https://docs.ansible.com/ansible/latest/collections/ansible/builtin/blockinfile_module.html&quot; href=&quot;https://docs.ansible.com/ansible/latest/collections/ansible/builtin/blockinfile_module.html&quot;&gt;&lt;code&gt;blockinfile&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;33&quot; dir=&quot;auto&quot;&gt;&lt;a data-href=&quot;https://augeas.net/&quot; href=&quot;https://augeas.net/&quot;&gt;Augeas&lt;/a&gt;&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;34&quot; dir=&quot;auto&quot;&gt;&lt;a data-href=&quot;https://github.com/pixelb/crudini&quot; href=&quot;https://github.com/pixelb/crudini&quot;&gt;crudini&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;36&quot; dir=&quot;auto&quot;&gt;There are two issues with these tools:&lt;/p&gt;&lt;ol class=&quot;code-line&quot; data-line=&quot;38&quot; dir=&quot;auto&quot;&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;38&quot; 
dir=&quot;auto&quot;&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;38&quot; dir=&quot;auto&quot;&gt;The diff is stored separately from the config file, which is hard to read and maintain. I might also need to keep a copy of the original, unpatched file for reference.&lt;/p&gt;&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;40&quot; dir=&quot;auto&quot;&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;40&quot; dir=&quot;auto&quot;&gt;The patch might not be reliable if there isn&#39;t enough context to locate the exact area for patching. For example, consider a config file like this:&lt;/p&gt;&lt;pre&gt;&lt;code class=&quot;code-line&quot; data-line=&quot;41&quot; dir=&quot;auto&quot;&gt;[Config for user A]
// many lines
use_https = true

[Config for user B]
// many lines
use_https = false
&lt;/code&gt;&lt;/pre&gt;&lt;p class=&quot;code-line code-active-line&quot; data-line=&quot;50&quot; dir=&quot;auto&quot;&gt;We want to modify the&amp;nbsp;&lt;code&gt;use_https&lt;/code&gt;&amp;nbsp;setting for user A and generate a diff. Later, if the vendor&#39;s config file swaps the order of user A and B, the patch might still apply without error, but it will modify the wrong section!&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;52&quot; dir=&quot;auto&quot;&gt;Note that while Ansible can place markers around managed blocks, it must first insert them. For the initial insertion, it relies on regular expressions (&lt;code&gt;insertafter&lt;/code&gt;&amp;nbsp;and&amp;nbsp;&lt;code&gt;insertbefore&lt;/code&gt;) to find the location, which can be brittle.&lt;/p&gt;&lt;h3 class=&quot;code-line&quot; data-line=&quot;54&quot; dir=&quot;auto&quot; id=&quot;3-generators&quot;&gt;3. Generators&lt;/h3&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;55&quot; dir=&quot;auto&quot;&gt;NixOS allows you to generate all config files using custom data and functions in the same language.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;57&quot; dir=&quot;auto&quot;&gt;The biggest issues with this approach are:&lt;/p&gt;&lt;ol class=&quot;code-line&quot; data-line=&quot;59&quot; dir=&quot;auto&quot;&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;59&quot; dir=&quot;auto&quot;&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;59&quot; dir=&quot;auto&quot;&gt;You are forced to commit to a specific ecosystem like NixOS or another tool that fully manages your system&#39;s configuration.&lt;/p&gt;&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;61&quot; dir=&quot;auto&quot;&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;61&quot; dir=&quot;auto&quot;&gt;Merge conflicts almost never happen because your own NixOS configuration is just an override of the default values. 
This means you aren&#39;t notified of potential&amp;nbsp;&lt;em&gt;semantic&lt;/em&gt;&amp;nbsp;conflicts. For example, if a default value you were referencing changes upstream, your configuration will adapt silently, which may not be the desired behavior without a manual review.&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;&lt;h2 class=&quot;code-line&quot; data-line=&quot;63&quot; dir=&quot;auto&quot; id=&quot;my-plan&quot;&gt;My Plan&lt;/h2&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;64&quot; dir=&quot;auto&quot;&gt;The existing solutions I found almost solve my problem, but not 100%.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;66&quot; dir=&quot;auto&quot;&gt;The closest approach I found is a combination of:&lt;/p&gt;&lt;ul class=&quot;code-line&quot; data-line=&quot;68&quot; dir=&quot;auto&quot;&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;68&quot; dir=&quot;auto&quot;&gt;Adding a custom, unique anchor comment in the config file.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;69&quot; dir=&quot;auto&quot;&gt;Using Ansible&#39;s&amp;nbsp;&lt;code&gt;blockinfile&lt;/code&gt;&amp;nbsp;with the anchor comment for&amp;nbsp;&lt;code&gt;insertafter&lt;/code&gt;&amp;nbsp;or&amp;nbsp;&lt;code&gt;insertbefore&lt;/code&gt;.&lt;/li&gt;&lt;/ul&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;71&quot; dir=&quot;auto&quot;&gt;But I still don&#39;t like that the diff is stored separately from the config file. To solve this, my plan is to embed the template directly inside the configuration file, like this:&lt;/p&gt;&lt;pre&gt;&lt;code class=&quot;code-line&quot; data-line=&quot;73&quot; dir=&quot;auto&quot;&gt;### BEGIN MANAGED BLOCK
### binds_to = {{ config.permanent_lan_interface.name }}
### END OF TEMPLATE
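# Illustration only: after a run of the script, the rendered output would sit here.
# The interface name below is a hypothetical value, not taken from a real setup.
binds_to = enp3s0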
### END MANAGED BLOCK
&lt;/code&gt;&lt;/pre&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;80&quot; dir=&quot;auto&quot;&gt;I&#39;ll then write a script that:&lt;/p&gt;&lt;ul class=&quot;code-line&quot; data-line=&quot;81&quot; dir=&quot;auto&quot;&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;81&quot; dir=&quot;auto&quot;&gt;Deletes all text between&amp;nbsp;&lt;code&gt;END OF TEMPLATE&lt;/code&gt;&amp;nbsp;and&amp;nbsp;&lt;code&gt;END MANAGED BLOCK&lt;/code&gt;.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;82&quot; dir=&quot;auto&quot;&gt;Parses the template before&amp;nbsp;&lt;code&gt;END OF TEMPLATE&lt;/code&gt;.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;82&quot; dir=&quot;auto&quot;&gt;Renders the template using a library like Jinja.&lt;/li&gt;&lt;li class=&quot;code-line&quot; data-line=&quot;83&quot; dir=&quot;auto&quot;&gt;Puts the rendered template after&amp;nbsp;&lt;code&gt;END OF TEMPLATE&lt;/code&gt;.&lt;/li&gt;&lt;/ul&gt;&lt;h2 class=&quot;code-line&quot; data-line=&quot;85&quot; dir=&quot;auto&quot; id=&quot;final-thoughts&quot;&gt;Final Thoughts&lt;/h2&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;87&quot; dir=&quot;auto&quot;&gt;My current plan may not be elegant, but it seems to meet my requirements more effectively than the other solutions.&lt;/p&gt;&lt;p class=&quot;code-line&quot; data-line=&quot;89&quot; dir=&quot;auto&quot;&gt;Meanwhile, I&#39;m still looking for new options. 
Please let me know if you know any.&lt;/p&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/33534782/7342797061981734885' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/7342797061981734885'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/7342797061981734885'/><link rel='alternate' type='text/html' href='http://blog.wang-lu.com/2025/08/a-declarative-approach-to-config-file.html' title='A Declarative Approach to Config File Management'/><author><name>Lu Wang</name><uri>http://www.blogger.com/profile/00576609954224192924</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-33534782.post-7661329877104827719</id><published>2025-08-22T01:01:00.005+02:00</published><updated>2025-08-22T01:01:27.100+02:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="configuration/usage"/><category scheme="http://www.blogger.com/atom/ns#" term="network"/><category scheme="http://www.blogger.com/atom/ns#" term="security"/><category scheme="http://www.blogger.com/atom/ns#" term="sysadmin"/><category scheme="http://www.blogger.com/atom/ns#" term="Tinker"/><title type='text'>VM Networking From Scratch</title><content type='html'>&lt;p&gt;Now that I&#39;ve settled on my &lt;response-element&gt;&lt;link-block&gt;&lt;a href=&quot;https://blog.wang-lu.com/2025/08/rethinking-my-vm-image-pipeline.html&quot; target=&quot;_blank&quot;&gt;VM image pipeline&lt;/a&gt;&lt;/link-block&gt;&lt;/response-element&gt;, the next logical step is to tackle networking.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;My Requirements&lt;/h2&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;So far, I&#39;ve been using QEMU&#39;s default user-mode 
networking. It&#39;s convenient for quick tasks, allowing for easy port forwarding, Samba shares, and DNS with just a few flags. However, this setup is ultimately insufficient for my needs for a couple of key reasons:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;&lt;b&gt;Security and Isolation:&lt;/b&gt; In the default user-mode setup, a VM can access the host&#39;s services via &lt;code&gt;localhost&lt;/code&gt;. Worse, because it uses NAT, the VM can also access the host&#39;s entire LAN using the host&#39;s IP address. Ideally, VMs should have their own identifiable IP addresses, and more importantly, there should be strong network isolation between the host and the VMs.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;b&gt;Centralized Auditing:&lt;/b&gt; I want to audit all network traffic from my VMs through a centralized solution. This means I need a way to route all VM traffic through a single point of control.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Choosing the Right Tool&lt;/h2&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;For most people, tools like libvirt or Incus are the best choice for this task. They are well-maintained, thoroughly tested, and have well-designed command-line interfaces that are less error-prone. I should probably just choose one of them and be done with it.&lt;/p&gt;&lt;p&gt;...except that I&#39;m genuinely interested in learning the underlying building blocks and terminology. This is the main reason I chose to write QEMU scripts manually in the first place. Meanwhile, I find myself constantly referring to the documentation for these tools anyway when I&#39;m studying security options.&lt;/p&gt;&lt;p&gt;Maybe one day, when I&#39;m satisfied with my knowledge, I&#39;ll migrate my scripts to one of these excellent tools. 
But for now, let&#39;s learn by &lt;strike&gt;suffering&lt;/strike&gt; doing.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Bridge + TAP&lt;/h2&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;As many guides suggest, for anything beyond basic networking, the place to start is with a Linux bridge and TAP devices. A bridge acts like a virtual network switch, and a TAP device acts like a virtual network port for a VM to connect to that switch.&lt;/p&gt;&lt;p&gt;Thankfully, &lt;code&gt;systemd-networkd&lt;/code&gt; makes this setup fairly easy. In the &lt;code&gt;.network&lt;/code&gt; file for my bridge, setting &lt;code&gt;IPv4Forwarding=yes&lt;/code&gt; and &lt;code&gt;IPMasquerade=ipv4&lt;/code&gt; saves me from writing custom &lt;code&gt;nftables&lt;/code&gt; rules for NAT, which is a huge time-saver. QEMU also makes it simple to attach a VM&#39;s network interface to an existing TAP device.&lt;/p&gt;&lt;p&gt;To keep things tidy, I decided to automatically generate the &lt;code&gt;systemd-networkd&lt;/code&gt; configuration files (e.g., &lt;code&gt;.netdev&lt;/code&gt; and &lt;code&gt;.network&lt;/code&gt;) directly from my VM configuration files. I save these generated files to &lt;code&gt;/run/systemd/network/&lt;/code&gt;. This ensures I don&#39;t have to manually keep two sets of configurations in sync.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;IP Addresses&lt;/h2&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;The easiest way to assign IP addresses to VMs is to run a DHCP server on the bridge. Most standard cloud images, including &lt;code&gt;bootc&lt;/code&gt; images, are configured to use DHCP by default.&lt;/p&gt;&lt;p&gt;However, I ultimately decided to use static IP addresses. Setting up a DHCP server securely, whether on the host or in a dedicated VM, takes some effort. 
Even with a DHCP server, I would likely configure static reservations to make it easier to write firewall rules to prevent IP address spoofing.&lt;/p&gt;&lt;p&gt;So, my process for each VM looks like this:&lt;/p&gt;&lt;ol start=&quot;1&quot;&gt;&lt;li&gt;&lt;p&gt;Generate a unique MAC address and a static IP address, and store them in the VM&#39;s configuration file.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Before starting the VM, generate a temporary &lt;code&gt;systemd-networkd&lt;/code&gt; &lt;code&gt;.network&lt;/code&gt; file that matches the VM&#39;s MAC address and configures its static IP, gateway, and DNS settings.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Pass these configuration files into the VM at boot time using systemd&#39;s &lt;response-element&gt;&lt;link-block&gt;&lt;a externallink=&quot;&quot; href=&quot;https://www.freedesktop.org/software/systemd/man/latest/systemd.system-credentials.html#network.conf.*&quot; target=&quot;_blank&quot;&gt;&lt;code&gt;network.*&lt;/code&gt; credentials&lt;/a&gt;&lt;/link-block&gt;&lt;/response-element&gt; feature.&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;This should work perfectly... right?&lt;/p&gt;&lt;p&gt;Wrong! 
I quickly discovered that CentOS&amp;nbsp;does not ship &lt;code&gt;systemd-networkd&lt;/code&gt; (&lt;response-element&gt;&lt;link-block&gt;&lt;a externallink=&quot;&quot; href=&quot;https://bugzilla.redhat.com/show_bug.cgi?id=1650342&quot; target=&quot;_blank&quot;&gt;1&lt;/a&gt;&lt;/link-block&gt;&lt;/response-element&gt;, &lt;response-element&gt;&lt;link-block&gt;&lt;a externallink=&quot;&quot; href=&quot;https://bugzilla.redhat.com/show_bug.cgi?id=2020254&quot; target=&quot;_blank&quot;&gt;2&lt;/a&gt;&lt;/link-block&gt;&lt;/response-element&gt;, &lt;response-element&gt;&lt;link-block&gt;&lt;a externallink=&quot;&quot; href=&quot;https://bugzilla.redhat.com/show_bug.cgi?id=1962257&quot; target=&quot;_blank&quot;&gt;3&lt;/a&gt;&lt;/link-block&gt;&lt;/response-element&gt;).&lt;/p&gt;&lt;p&gt;After looking through the &lt;response-element&gt;&lt;link-block&gt;&lt;a externallink=&quot;&quot; href=&quot;https://docs.fedoraproject.org/en-US/bootc/sysconfig-network-configuration/&quot; target=&quot;_blank&quot;&gt;official options&lt;/a&gt;&lt;/link-block&gt;&lt;/response-element&gt; for &lt;code&gt;bootc&lt;/code&gt; images, I settled on using NetworkManager. This requires me to generate a NetworkManager keyfile and embed it into the container image. This isn&#39;t ideal because updating the network configuration requires rebuilding the image, which is slow. 
In the future, I might explore better options, such as:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;Separating the Linux kernel from the image and booting it directly with QEMU, allowing me to pass network configuration via &lt;a href=&quot;https://docs.fedoraproject.org/en-US/bootc/sysconfig-network-configuration/#_dracut_kernel_arguments&quot;&gt;kernel parameters&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Using a different base image that includes &lt;code&gt;systemd-networkd&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Inter-VM Traffic&lt;/h2&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;By default, all VMs connected to the same bridge can communicate with each other freely. This isn&#39;t what I want; my goal is to enforce a &quot;default deny&quot; policy and only allow traffic that is explicitly permitted.&lt;/p&gt;&lt;p&gt;After some research with some help from AI, I learned a few key terms: port isolation, private VLAN, and proxy ARP. It turns out these concepts are perfect for my use case.&lt;/p&gt;&lt;p&gt;Here’s what I discovered when I put it into practice:&lt;/p&gt;&lt;p&gt;I started with a standard bridge and TAP setup, with host firewall rules in &lt;code&gt;nftables&lt;/code&gt; that block all traffic. As expected, the VMs could not connect to the internet. However, they could still talk to each other. Why?&lt;/p&gt;&lt;p&gt;A quick debugging session with &lt;code&gt;nft monitor&lt;/code&gt; revealed that packets traveling between VMs on the same bridge never hit my &lt;code&gt;inet&lt;/code&gt; family firewall rules. This is because the bridge was forwarding the traffic at Layer 2 (like a real switch), so the host&#39;s Layer 3 IP-level firewall was never consulted. &lt;code&gt;nftables&lt;/code&gt; has a &lt;code&gt;bridge&lt;/code&gt; family specifically for filtering this kind of traffic.&lt;/p&gt;&lt;p&gt;Next, I enabled port isolation on the bridge. 
Now, even the &lt;code&gt;bridge&lt;/code&gt; family rules couldn&#39;t see the packets between VMs. This confirmed that port isolation operates at an even lower level, preventing the bridge from forwarding frames between isolated ports altogether.&lt;/p&gt;&lt;p&gt;This gave me the perfect foundation. Now, if I want to allow two VMs to communicate, I have to do it explicitly. I have two main options:&lt;/p&gt;&lt;ol start=&quot;1&quot;&gt;&lt;li&gt;&lt;p&gt;Force Gateway Routing: I can remove the local subnet route inside each VM, forcing them to send all packets (even to other VMs on the same subnet) to the bridge&#39;s gateway IP address. The host&#39;s routing stack will then receive the packets, which can be filtered by my standard &lt;code&gt;inet&lt;/code&gt; family &lt;code&gt;nftables&lt;/code&gt; rules.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Use Proxy ARP: I can enable &lt;code&gt;IPv4ProxyARPPrivateVLAN=yes&lt;/code&gt; on the bridge&#39;s network configuration. The host will then respond to ARP requests on behalf of the VMs. This tricks the VMs into sending all their packets to the host&#39;s MAC address.&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;Ultimately, both options achieve the same goal: they force Layer 2 traffic up to Layer 3, where it can be inspected by a central firewall. Option #2 is more elegant because it doesn&#39;t require custom network configuration inside the VMs. Option #1 seems less hacky.&lt;/p&gt;&lt;p&gt;Notes:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;My initial assumption was that with Proxy ARP (Option #2), the traffic would be captured by the &lt;code&gt;bridge&lt;/code&gt; family in &lt;code&gt;nftables&lt;/code&gt;. This is incorrect. The ARP resolution happens at Layer 2, but the subsequent IP packets are routed, so they are captured by the &lt;code&gt;ip&lt;/code&gt; or &lt;code&gt;inet&lt;/code&gt; families.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Proxy ARP doesn&#39;t remove the need for a Layer 3 firewall. 
A malicious VM could simply add its own static routes (as in Option #1) to try and communicate directly. The key is to have a firewall at the gateway that inspects all traffic, ensuring that even if a VM tries to bypass the intended path, the traffic is still filtered. The main benefit of port isolation is preventing direct, unfiltered Layer 2 communication.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Outgoing Traffic&lt;/h2&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;For controlling traffic leaving the host, I have a draft plan that provides strong isolation. The idea is to create a dedicated firewall VM.&lt;/p&gt;&lt;ol start=&quot;1&quot;&gt;&lt;li&gt;&lt;p&gt;On the host, I&#39;ll set up two bridges: &lt;code&gt;bridge-internal&lt;/code&gt; and &lt;code&gt;bridge-external&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;All my regular VMs will be connected to &lt;code&gt;bridge-internal&lt;/code&gt;. The host itself will &lt;b&gt;not&lt;/b&gt; have an IP address on this bridge. This ensures the VMs cannot directly talk to the host. If needed I can set up &lt;a href=&quot;https://wiki.archlinux.org/title/QEMU#Accessing_SSH_via_vsock&quot;&gt;SSH connection over vsock&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;I will set up a special firewall VM that has two network interfaces: one connected to &lt;code&gt;bridge-internal&lt;/code&gt; and the other to &lt;code&gt;bridge-external&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;The host&#39;s physical network interface will be connected to &lt;code&gt;bridge-external&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;With this setup, all outgoing traffic from the VMs &lt;i&gt;must&lt;/i&gt; pass through the firewall VM, giving me a single place to manage all rules. 
It also isolates the host&#39;s network stack from the VMs by default.&lt;/p&gt;&lt;p&gt;For services, I can configure the firewall VM:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;For non-HTTP services like NTP, I can set up forwarding or proxy rules.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;For HTTP/HTTPS traffic, I can set up a transparent proxy using Nginx. Previously, I thought this would require a separate proxy configuration for each domain, e.g. dedicated proxy and DNS entry, but AI showed me a much better way:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;Nginx&#39;s &lt;a href=&quot;https://nginx.org/en/docs/stream/ngx_stream_ssl_preread_module.html&quot;&gt;&lt;code&gt;ngx_stream_ssl_preread_module&lt;/code&gt;&lt;/a&gt;&amp;nbsp;allows it to inspect the SNI (Server Name Indication) in the TLS handshake without decrypting the traffic.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;I can use firewall rules to redirect all outgoing HTTPS traffic from &lt;code&gt;bridge-internal&lt;/code&gt; to this Nginx stream proxy.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;In the proxy, I can maintain a simple allowlist of domains and block everything else.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;I plan to explore this design further. For example, is it better to use the host as the firewall? Or split the firewall services into multiple VMs? Could &lt;code&gt;macvlan&lt;/code&gt; be useful here? These are questions for a future post.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Conclusion&lt;/h2&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;In the end, I&#39;ve replaced QEMU&#39;s basic networking with a much more secure, custom setup. 
Using a Linux bridge and port isolation, I can now force all VM traffic through a central firewall for inspection.&lt;/p&gt;&lt;p&gt;While it was more work than using a tool like libvirt, building this from scratch was a fantastic way to learn the fundamentals of VM networking and gain complete control over my environment.&lt;/p&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/33534782/7661329877104827719' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/7661329877104827719'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/7661329877104827719'/><link rel='alternate' type='text/html' href='http://blog.wang-lu.com/2025/08/vm-networking-from-scratch.html' title='VM Networking From Scratch'/><author><name>Lu Wang</name><uri>http://www.blogger.com/profile/00576609954224192924</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-33534782.post-4182028763855862231</id><published>2025-08-17T20:02:00.003+02:00</published><updated>2025-08-17T20:02:19.681+02:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="configuration/usage"/><category scheme="http://www.blogger.com/atom/ns#" term="评论和记事"/><title type='text'>Sending Emails with Curl: A Nifty Systemd Workaround</title><content type='html'>&lt;p&gt;I recently tried to create a simple systemd service to send an email notification, but my initial approach with &lt;code&gt;mail&lt;/code&gt; and &lt;code&gt;sendmail&lt;/code&gt; failed with a strange permission error.&lt;/p&gt;&lt;p&gt;My original service file looked like this:&lt;/p&gt;&lt;response-element class=&quot;&quot; 
ng-version=&quot;0.0.0-PLACEHOLDER&quot;&gt;&lt;code-block _nghost-ng-c1437408396=&quot;&quot; class=&quot;ng-tns-c1437408396-37 ng-star-inserted&quot;&gt;&lt;div _ngcontent-ng-c1437408396=&quot;&quot; class=&quot;code-block ng-tns-c1437408396-37 ng-animate-disabled ng-trigger ng-trigger-codeBlockRevealAnimation&quot; jslog=&quot;223238;track:impression,attention;BardVeMetadataKey:[[&amp;quot;r_e18f12acc685cf06&amp;quot;,&amp;quot;c_ca10d1e14253f191&amp;quot;,null,&amp;quot;rc_b628163f34a18d60&amp;quot;,null,null,&amp;quot;en&amp;quot;,null,1,null,null,1,0]]&quot;&gt;&lt;div _ngcontent-ng-c1437408396=&quot;&quot; class=&quot;formatted-code-block-internal-container ng-tns-c1437408396-37&quot;&gt;&lt;div _ngcontent-ng-c1437408396=&quot;&quot; class=&quot;animated-opacity ng-tns-c1437408396-37&quot;&gt;&lt;pre _ngcontent-ng-c1437408396=&quot;&quot; class=&quot;ng-tns-c1437408396-37&quot;&gt;&lt;code _ngcontent-ng-c1437408396=&quot;&quot; class=&quot;code-container formatted ng-tns-c1437408396-37 no-decoration-radius&quot; data-test-id=&quot;code-content&quot; role=&quot;text&quot;&gt;[Service]
ExecStart=mail --subject=Subject recipient@example.com
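# Not shown in the original snippet: as noted further down, the unit also used
# DynamicUser=true, which turns out to matter for the error that follows.
DynamicUser=true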
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code-block&gt;&lt;/response-element&gt;&lt;p&gt;The error message was a bit of a head-scratcher: &lt;code&gt;warning: mail_queue_enter: create file maildrop/....: Permission denied&lt;/code&gt;.&lt;/p&gt;&lt;p&gt;A quick search pointed me to &lt;a href=&quot;https://linux.m2osw.com/snapwebsites-postfixpostdrop18189-warning-mailqueueenter-create-file-maildrop25937318189-permission&quot;&gt;the cause&lt;/a&gt;: the &lt;code&gt;postdrop&lt;/code&gt; binary is setgid, and the systemd setting &lt;code&gt;NoNewPrivileges=true&lt;/code&gt; prevents it from gaining that group privilege when it is executed.&lt;/p&gt;&lt;p&gt;While I hadn&#39;t explicitly used that setting, I was using &lt;code&gt;DynamicUser=true&lt;/code&gt;, which &lt;a href=&quot;https://www.freedesktop.org/software/systemd/man/latest/systemd.exec.html#DynamicUser=&quot;&gt;implies and enforces&lt;/a&gt; &lt;code&gt;NoNewPrivileges=true&lt;/code&gt;. This meant my service, running as a temporary user, couldn&#39;t get the permissions it needed to interact with the mail queue. Note that this implication cannot be disabled or overridden.&lt;/p&gt;&lt;p&gt;I wanted to avoid creating a new, dedicated user for this task. I realized that the problem was how &lt;code&gt;mail&lt;/code&gt; and &lt;code&gt;sendmail&lt;/code&gt; directly interact with the mail queue. 
The solution was to bypass that entire process and talk directly to the local SMTP server.&lt;/p&gt;&lt;p&gt;I didn&#39;t want to install another dedicated SMTP client. Fortunately, I learned that &lt;code&gt;curl&lt;/code&gt; can also act as an SMTP client! This command worked perfectly, sending the email by talking to the SMTP server directly:&lt;/p&gt;&lt;response-element class=&quot;&quot; ng-version=&quot;0.0.0-PLACEHOLDER&quot;&gt;&lt;code-block _nghost-ng-c1437408396=&quot;&quot; class=&quot;ng-tns-c1437408396-38 ng-star-inserted&quot;&gt;&lt;div _ngcontent-ng-c1437408396=&quot;&quot; class=&quot;code-block ng-tns-c1437408396-38 ng-animate-disabled ng-trigger ng-trigger-codeBlockRevealAnimation&quot; jslog=&quot;223238;track:impression,attention;BardVeMetadataKey:[[&amp;quot;r_e18f12acc685cf06&amp;quot;,&amp;quot;c_ca10d1e14253f191&amp;quot;,null,&amp;quot;rc_b628163f34a18d60&amp;quot;,null,null,&amp;quot;en&amp;quot;,null,1,null,null,1,0]]&quot;&gt;&lt;div _ngcontent-ng-c1437408396=&quot;&quot; class=&quot;formatted-code-block-internal-container ng-tns-c1437408396-38&quot;&gt;&lt;div _ngcontent-ng-c1437408396=&quot;&quot; class=&quot;animated-opacity ng-tns-c1437408396-38&quot;&gt;&lt;pre _ngcontent-ng-c1437408396=&quot;&quot; class=&quot;ng-tns-c1437408396-38&quot;&gt;&lt;code _ngcontent-ng-c1437408396=&quot;&quot; class=&quot;code-container formatted ng-tns-c1437408396-38 no-decoration-radius&quot; data-test-id=&quot;code-content&quot; role=&quot;text&quot;&gt;curl --url smtp://localhost:25 --mail-rcpt recipient@example.com --upload-file body.txt&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code-block&gt;&lt;/response-element&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/33534782/4182028763855862231' title='0 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/33534782/posts/default/4182028763855862231'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/4182028763855862231'/><link rel='alternate' type='text/html' href='http://blog.wang-lu.com/2025/08/sending-emails-with-curl-niftysystemd.html' title='Sending Emails with Curl: A Nifty Systemd Workaround'/><author><name>Lu Wang</name><uri>http://www.blogger.com/profile/00576609954224192924</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-33534782.post-2822309351899655751</id><published>2025-08-14T22:06:00.001+02:00</published><updated>2025-08-14T22:06:06.748+02:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="bootc"/><category scheme="http://www.blogger.com/atom/ns#" term="declarative"/><category scheme="http://www.blogger.com/atom/ns#" term="Experiment"/><category scheme="http://www.blogger.com/atom/ns#" term="sysadmin"/><category scheme="http://www.blogger.com/atom/ns#" term="Thoughts"/><title type='text'>mkosi: First Impressions</title><content type='html'>&lt;p&gt;I stumbled upon &lt;a href=&quot;https://wiki.gentoo.org/wiki/Systemd/systemd-nspawn&quot;&gt;the Gentoo wiki page for &lt;code&gt;systemd-nspawn&lt;/code&gt;&lt;/a&gt;, which in turn led me to &lt;a href=&quot;http://nspawn.org&quot;&gt;nspawn.org&lt;/a&gt;, &lt;code&gt;&lt;a href=&quot;https://mkosi.systemd.io/&quot;&gt;mkosi&lt;/a&gt;&lt;/code&gt;, and later &lt;code&gt;&lt;a href=&quot;https://www.freedesktop.org/software/systemd/man/latest/systemd-sysupdate.html&quot;&gt;systemd-sysupdate&lt;/a&gt;&lt;/code&gt;.&lt;/p&gt;&lt;p&gt;&lt;code&gt;mkosi&lt;/code&gt; quickly caught my eye because it&#39;s almost exactly what I wanted to build myself, as &lt;response-element 
ng-version=&quot;0.0.0-PLACEHOLDER&quot;&gt;&lt;link-block _nghost-ng-c4269027044=&quot;&quot; class=&quot;ng-star-inserted&quot;&gt;&lt;a _ngcontent-ng-c4269027044=&quot;&quot; _nghost-ng-c143246426=&quot;&quot; class=&quot;ng-star-inserted&quot; externallink=&quot;&quot; href=&quot;https://blog.wang-lu.com/2025/08/rethinking-my-vm-image-pipeline.html&quot; jslog=&quot;197247;track:generic_click,impression,attention;BardVeMetadataKey:[[&amp;quot;r_a63f7c20f4e4c536&amp;quot;,&amp;quot;c_8a63d0552e45a6b3&amp;quot;,null,&amp;quot;rc_c70ed0d0e0621dd2&amp;quot;,null,null,&amp;quot;en&amp;quot;,null,1,null,null,1,0]]&quot; rel=&quot;noopener&quot; target=&quot;_blank&quot;&gt;mentioned in a previous post&lt;/a&gt;&lt;/link-block&gt;&lt;/response-element&gt;. So, I decided to spend my &quot;sysadmin fun quota&quot; on it.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Overview&lt;/h2&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;code&gt;mkosi&lt;/code&gt; is similar to &lt;code&gt;docker build&lt;/code&gt; or &lt;code&gt;podman build&lt;/code&gt;, but it&#39;s designed for creating full OS images. It focuses on development and testing. For example, much like &lt;code&gt;nix-shell&lt;/code&gt;, &lt;code&gt;mkosi&lt;/code&gt; can quickly launch a sandboxed shell with a specific distribution and selected packages installed. 
The &lt;response-element ng-version=&quot;0.0.0-PLACEHOLDER&quot;&gt;&lt;link-block _nghost-ng-c4269027044=&quot;&quot; class=&quot;ng-star-inserted&quot;&gt;&lt;a _ngcontent-ng-c4269027044=&quot;&quot; _nghost-ng-c143246426=&quot;&quot; class=&quot;ng-star-inserted&quot; externallink=&quot;&quot; href=&quot;https://github.com/systemd/systemd/tree/57aeb4a403bd6897b99f07c6efa9e8618df55731/mkosi&quot; jslog=&quot;197247;track:generic_click,impression,attention;BardVeMetadataKey:[[&amp;quot;r_a63f7c20f4e4c536&amp;quot;,&amp;quot;c_8a63d0552e45a6b3&amp;quot;,null,&amp;quot;rc_c70ed0d0e0621dd2&amp;quot;,null,null,&amp;quot;en&amp;quot;,null,1,null,null,1,0]]&quot; rel=&quot;noopener&quot; target=&quot;_blank&quot;&gt;systemd project&lt;/a&gt;&lt;/link-block&gt;&lt;/response-element&gt; itself uses &lt;code&gt;mkosi&lt;/code&gt; for testing across different distros.&lt;/p&gt;&lt;p&gt;The &lt;response-element ng-version=&quot;0.0.0-PLACEHOLDER&quot;&gt;&lt;link-block _nghost-ng-c4269027044=&quot;&quot; class=&quot;ng-star-inserted&quot;&gt;&lt;a _ngcontent-ng-c4269027044=&quot;&quot; _nghost-ng-c143246426=&quot;&quot; class=&quot;ng-star-inserted&quot; externallink=&quot;&quot; 
href=&quot;https://0pointer.net/blog/a-re-introduction-to-mkosi-a-tool-for-generating-os-images.html&quot; jslog=&quot;197247;track:generic_click,impression,attention;BardVeMetadataKey:[[&amp;quot;r_a63f7c20f4e4c536&amp;quot;,&amp;quot;c_8a63d0552e45a6b3&amp;quot;,null,&amp;quot;rc_c70ed0d0e0621dd2&amp;quot;,null,null,&amp;quot;en&amp;quot;,null,1,null,null,1,0]]&quot; rel=&quot;noopener&quot; target=&quot;_blank&quot;&gt;re-introduction article&lt;/a&gt;&lt;/link-block&gt;&lt;/response-element&gt;&amp;nbsp;is a great read.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Speed&lt;/h2&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;Note that this is by no means a rigorous benchmark.&lt;/p&gt;&lt;p&gt;My setup is an SSD with LUKS and an ext4 filesystem (without reflink support).&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;Building Container Images&lt;/h3&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;code&gt;mkosi&lt;/code&gt; is pretty fast. A simple &lt;code&gt;mkosi&lt;/code&gt; command creates a fresh Debian image. I used the &lt;code&gt;--incremental&lt;/code&gt; flag for subsequent builds.&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;First run: ~30s&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Second run (after trivial changes): ~5s&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Using &lt;code&gt;mkosi -p systemd&lt;/code&gt; allows the container to boot (via &lt;code&gt;systemd-nspawn -b&lt;/code&gt;), which adds only a few seconds to the build time.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;Building VM Images&lt;/h3&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;Building a VM image with &lt;code&gt;mkosi --include mkosi-vm&lt;/code&gt;&amp;nbsp;is a bit slower, likely due to the extra steps for installing a bootloader and kernel.&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;First run: ~1m 30s&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Second run (after minor changes): ~30s&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;Comparison with bootc&lt;/h3&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;I tried to build a fresh CentOS image using both 
tools.&lt;/p&gt;&lt;p&gt;&lt;code&gt;mkosi --include mkosi-vm -d centos&lt;/code&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;Duration: ~1m 30s&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Output disk size: 1.2 GB&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;code-block _nghost-ng-c1830171239=&quot;&quot; class=&quot;ng-tns-c1830171239-24 ng-star-inserted&quot;&gt;&lt;div _ngcontent-ng-c1830171239=&quot;&quot; class=&quot;code-block ng-tns-c1830171239-24 ng-animate-disabled ng-trigger ng-trigger-codeBlockRevealAnimation&quot; jslog=&quot;223238;track:impression,attention;BardVeMetadataKey:[[&amp;quot;r_a63f7c20f4e4c536&amp;quot;,&amp;quot;c_8a63d0552e45a6b3&amp;quot;,null,&amp;quot;rc_c70ed0d0e0621dd2&amp;quot;,null,null,&amp;quot;en&amp;quot;,null,1,null,null,1,0]]&quot;&gt;&lt;div _ngcontent-ng-c1830171239=&quot;&quot; class=&quot;code-block-decoration header-formatted gds-title-s ng-tns-c1830171239-24 ng-star-inserted&quot;&gt;&lt;div _ngcontent-ng-c1830171239=&quot;&quot; class=&quot;buttons ng-tns-c1830171239-24 ng-star-inserted&quot;&gt;&lt;!----&gt;&lt;!----&gt;&lt;!----&gt;&lt;/div&gt;&lt;!----&gt;&lt;/div&gt;&lt;!----&gt;&lt;div _ngcontent-ng-c1830171239=&quot;&quot; class=&quot;formatted-code-block-internal-container ng-tns-c1830171239-24&quot;&gt;&lt;div _ngcontent-ng-c1830171239=&quot;&quot; class=&quot;animated-opacity ng-tns-c1830171239-24&quot;&gt;&lt;pre _ngcontent-ng-c1830171239=&quot;&quot; class=&quot;ng-tns-c1830171239-24&quot;&gt;&lt;code _ngcontent-ng-c1830171239=&quot;&quot; class=&quot;code-container formatted ng-tns-c1830171239-24&quot; data-test-id=&quot;code-content&quot; role=&quot;text&quot;&gt;podman pull quay.io/centos-bootc/bootc-image-builder:latest &amp;amp;&amp;amp; \
podman run ... \
    quay.io/centos-bootc/bootc-image-builder:latest \
    --type raw ... \
    quay.io/centos-bootc/centos-bootc:stream9
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code-block&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;Duration: ~4m 30s&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Output disk size: 1.9 GB&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Notes:&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;ul style=&quot;text-align: left;&quot;&gt;&lt;li&gt;The&amp;nbsp;&lt;code&gt;bootc-image-builder&lt;/code&gt;&amp;nbsp;image was pre-pulled, so its pull time isn&#39;t included in the measurement.&lt;/li&gt;&lt;li&gt;The time to pull the base CentOS image is included.&lt;/li&gt;&lt;li&gt;I&#39;m generating a raw image here instead of QCOW2.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;Again, these numbers aren&#39;t directly comparable, and they only reflect my specific setup:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;&lt;code&gt;bootc-image-builder&lt;/code&gt; runs in a VM, while &lt;code&gt;mkosi&lt;/code&gt; runs directly on the host.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;code&gt;centos&lt;/code&gt;&amp;nbsp;and &lt;code&gt;centos-bootc&lt;/code&gt;&amp;nbsp;are different distributions, and their configurations (like installed packages) are also very different. 
This is obvious from the difference in their final image sizes.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Running Images with systemd-nspawn&lt;/h2&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;I attempted to get unprivileged &lt;code&gt;systemd-nspawn&lt;/code&gt; working but failed:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;&lt;code&gt;systemd-nsresourced.socket&lt;/code&gt; and &lt;code&gt;systemd-mountfsd.socket&lt;/code&gt; must be running.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;code&gt;systemd-mountfsd&lt;/code&gt; complains that the image is untrusted unless it&#39;s signed or located in a &lt;response-element ng-version=&quot;0.0.0-PLACEHOLDER&quot;&gt;&lt;link-block _nghost-ng-c4269027044=&quot;&quot; class=&quot;ng-star-inserted&quot;&gt;&lt;a _ngcontent-ng-c4269027044=&quot;&quot; _nghost-ng-c143246426=&quot;&quot; class=&quot;ng-star-inserted&quot; externallink=&quot;&quot; href=&quot;https://www.freedesktop.org/software/systemd/man/latest/systemd-mountfsd.service.html&quot; jslog=&quot;197247;track:generic_click,impression,attention;BardVeMetadataKey:[[&amp;quot;r_a63f7c20f4e4c536&amp;quot;,&amp;quot;c_8a63d0552e45a6b3&amp;quot;,null,&amp;quot;rc_c70ed0d0e0621dd2&amp;quot;,null,null,&amp;quot;en&amp;quot;,null,1,null,null,1,0]]&quot; rel=&quot;noopener&quot; target=&quot;_blank&quot;&gt;trusted location&lt;/a&gt;&lt;/link-block&gt;&lt;/response-element&gt;.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;I got stuck on 
&lt;response-element ng-version=&quot;0.0.0-PLACEHOLDER&quot;&gt;&lt;link-block _nghost-ng-c4269027044=&quot;&quot; class=&quot;ng-star-inserted&quot;&gt;&lt;a _ngcontent-ng-c4269027044=&quot;&quot; _nghost-ng-c143246426=&quot;&quot; class=&quot;ng-star-inserted&quot; externallink=&quot;&quot; href=&quot;https://github.com/systemd/systemd/issues/35387#issuecomment-3186098634&quot; jslog=&quot;197247;track:generic_click,impression,attention;BardVeMetadataKey:[[&amp;quot;r_a63f7c20f4e4c536&amp;quot;,&amp;quot;c_8a63d0552e45a6b3&amp;quot;,null,&amp;quot;rc_c70ed0d0e0621dd2&amp;quot;,null,null,&amp;quot;en&amp;quot;,null,1,null,null,1,0]]&quot; rel=&quot;noopener&quot; target=&quot;_blank&quot;&gt;another error&lt;/a&gt;&lt;/link-block&gt;&lt;/response-element&gt;.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Eventually, I resorted to using &lt;code&gt;sudo systemd-nspawn -U&lt;/code&gt;, which worked well. 
The &lt;code&gt;-b&lt;/code&gt; flag &quot;boots&quot; the image by running &lt;code&gt;systemd/init&lt;/code&gt;&amp;nbsp;as PID 1.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Running Images with QEMU&lt;/h2&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;code&gt;mkosi --kvm vm&lt;/code&gt;&amp;nbsp;works nicely.&lt;/p&gt;&lt;p&gt;Notes:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;Credentials are &lt;response-element ng-version=&quot;0.0.0-PLACEHOLDER&quot;&gt;&lt;link-block _nghost-ng-c4269027044=&quot;&quot; class=&quot;ng-star-inserted&quot;&gt;&lt;a _ngcontent-ng-c4269027044=&quot;&quot; _nghost-ng-c143246426=&quot;&quot; class=&quot;ng-star-inserted&quot; externallink=&quot;&quot; href=&quot;https://github.com/systemd/mkosi/blob/cdd2d1570e256ef0aa122c079e55f093cc0df453/mkosi/qemu.py#L1513&quot; jslog=&quot;197247;track:generic_click,impression,attention;BardVeMetadataKey:[[&amp;quot;r_a63f7c20f4e4c536&amp;quot;,&amp;quot;c_8a63d0552e45a6b3&amp;quot;,null,&amp;quot;rc_c70ed0d0e0621dd2&amp;quot;,null,null,&amp;quot;en&amp;quot;,null,1,null,null,1,0]]&quot; rel=&quot;noopener&quot; target=&quot;_blank&quot;&gt;visible in the command-line arguments&lt;/a&gt;&lt;/link-block&gt;&lt;/response-element&gt;.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;I&#39;m not a fan of all the default flags for QEMU, but &lt;code&gt;mkosi&lt;/code&gt; provides many options for customization.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Observations, Thoughts and Concerns&lt;/h2&gt;&lt;p&gt;&lt;/p&gt;&lt;ul style=&quot;text-align: left;&quot;&gt;&lt;li&gt;&lt;p&gt;&lt;code&gt;mkosi&lt;/code&gt;&amp;nbsp;is deeply integrated with systemd. Its configuration files also follow the systemd style: e.g. 
declarative, INI-formatted, with drop-in overrides.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;I wasn&#39;t able to test the performance benefits of reflink, because my filesystem doesn&#39;t support it and the disk images were small anyway.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;I also wasn&#39;t able to test if SELinux works. Supposedly, it needs an extra flag in &lt;code&gt;mkosi.conf&lt;/code&gt; and might be slow. On the other hand, it works out of the box in &lt;code&gt;bootc&lt;/code&gt; images.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;I don&#39;t really miss &lt;code&gt;Containerfile&lt;/code&gt;s much. I usually just need to copy files, and for my use case, a &lt;code&gt;Containerfile&lt;/code&gt; would essentially just be running my scripts with bind mounts. Plus, I don&#39;t use many layers. But I might miss having an immutable Linux setup.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;code&gt;mkosi&lt;/code&gt; &lt;b&gt;supports&lt;/b&gt; many popular distributions, while &lt;code&gt;bootc&lt;/code&gt;&amp;nbsp;only supports Fedora/CentOS.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;code&gt;mkosi&lt;/code&gt;&amp;nbsp;may add surprising modifications to the image:&lt;/p&gt;&lt;/li&gt;&lt;ul&gt;&lt;li&gt;&lt;code&gt;mkosi&lt;/code&gt; &lt;response-element ng-version=&quot;0.0.0-PLACEHOLDER&quot;&gt;&lt;link-block _nghost-ng-c4269027044=&quot;&quot; class=&quot;ng-star-inserted&quot;&gt;&lt;a _ngcontent-ng-c4269027044=&quot;&quot; _nghost-ng-c143246426=&quot;&quot; class=&quot;ng-star-inserted&quot; externallink=&quot;&quot; href=&quot;https://github.com/systemd/mkosi/blob/cdd2d1570e256ef0aa122c079e55f093cc0df453/mkosi/distributions/debian.py#L102&quot; 
jslog=&quot;197247;track:generic_click,impression,attention;BardVeMetadataKey:[[&amp;quot;r_a63f7c20f4e4c536&amp;quot;,&amp;quot;c_8a63d0552e45a6b3&amp;quot;,null,&amp;quot;rc_c70ed0d0e0621dd2&amp;quot;,null,null,&amp;quot;en&amp;quot;,null,1,null,null,1,0]]&quot; rel=&quot;noopener&quot; target=&quot;_blank&quot;&gt;doesn&#39;t use &lt;code&gt;debootstrap&lt;/code&gt;&lt;/a&gt;&lt;/link-block&gt;&lt;/response-element&gt;. It actually used to depend on it, but that dependency was &lt;response-element ng-version=&quot;0.0.0-PLACEHOLDER&quot;&gt;&lt;link-block _nghost-ng-c4269027044=&quot;&quot; class=&quot;ng-star-inserted&quot;&gt;&lt;a _ngcontent-ng-c4269027044=&quot;&quot; _nghost-ng-c143246426=&quot;&quot; class=&quot;ng-star-inserted&quot; externallink=&quot;&quot; href=&quot;https://github.com/systemd/mkosi/blob/cdd2d1570e256ef0aa122c079e55f093cc0df453/mkosi/resources/man/mkosi.news.7.md?plain=1#L896&quot; jslog=&quot;197247;track:generic_click,impression,attention;BardVeMetadataKey:[[&amp;quot;r_a63f7c20f4e4c536&amp;quot;,&amp;quot;c_8a63d0552e45a6b3&amp;quot;,null,&amp;quot;rc_c70ed0d0e0621dd2&amp;quot;,null,null,&amp;quot;en&amp;quot;,null,1,null,null,1,0]]&quot; rel=&quot;noopener&quot; 
target=&quot;_blank&quot;&gt;removed&lt;/a&gt;&lt;/link-block&gt;&lt;/response-element&gt;. Not sure if this approach is hacky.&lt;p&gt;&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;code&gt;mkosi&lt;/code&gt; may &lt;response-element ng-version=&quot;0.0.0-PLACEHOLDER&quot;&gt;&lt;link-block _nghost-ng-c4269027044=&quot;&quot; class=&quot;ng-star-inserted&quot;&gt;&lt;a _ngcontent-ng-c4269027044=&quot;&quot; _nghost-ng-c143246426=&quot;&quot; class=&quot;ng-star-inserted&quot; externallink=&quot;&quot; href=&quot;https://github.com/systemd/mkosi/blob/cdd2d1570e256ef0aa122c079e55f093cc0df453/mkosi/__init__.py#L2930&quot; jslog=&quot;197247;track:generic_click,impression,attention;BardVeMetadataKey:[[&amp;quot;r_a63f7c20f4e4c536&amp;quot;,&amp;quot;c_8a63d0552e45a6b3&amp;quot;,null,&amp;quot;rc_c70ed0d0e0621dd2&amp;quot;,null,null,&amp;quot;en&amp;quot;,null,1,null,null,1,0]]&quot; rel=&quot;noopener&quot; target=&quot;_blank&quot;&gt;inject its own SSH server unit&lt;/a&gt;&lt;/link-block&gt;&lt;/response-element&gt;.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;li&gt;&lt;p&gt;&lt;code&gt;zvol&lt;/code&gt; doesn&#39;t seem very reliable, so I&#39;ll probably avoid using it for another 
few years.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Conclusion&lt;/h2&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;code&gt;mkosi&lt;/code&gt; is a very interesting tool. While I&#39;m not ready to migrate my entire image-building pipeline yet, I might consider replacing my current LXC setup with it.&lt;/p&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/33534782/2822309351899655751' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/2822309351899655751'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/2822309351899655751'/><link rel='alternate' type='text/html' href='http://blog.wang-lu.com/2025/08/mkosi-first-impressions.html' title='mkosi: First Impressions'/><author><name>Lu Wang</name><uri>http://www.blogger.com/profile/00576609954224192924</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-33534782.post-4237445026018053509</id><published>2025-08-12T21:33:00.009+02:00</published><updated>2025-08-23T12:28:51.565+02:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="bootc"/><category scheme="http://www.blogger.com/atom/ns#" term="Experiment"/><category scheme="http://www.blogger.com/atom/ns#" term="immutable"/><category scheme="http://www.blogger.com/atom/ns#" term="sysadmin"/><category scheme="http://www.blogger.com/atom/ns#" term="Tinker"/><category scheme="http://www.blogger.com/atom/ns#" term="ZFS"/><title type='text'>Rethinking My VM Image Pipeline</title><content type='html'>&lt;p&gt;Today, my pipeline regularly builds images for my &lt;a href=&quot;https://blog.wang-lu.com/2025/07/disposable-vms-for-home-lab-security.html&quot;&gt;disposable VMs&lt;/a&gt;. 
Here&#39;s the current process:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;A dedicated builder VM reads&amp;nbsp;&lt;code&gt;Containerfile&lt;/code&gt;s for all VMs, including itself.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;The builder VM uses &lt;code&gt;podman build&lt;/code&gt; to create container images for all VMs.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;The builder VM then uses &lt;code&gt;bootc-image-builder&lt;/code&gt; to create disk images for all VMs.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;This process works well, but it has a significant issue: the disk images aren&#39;t built efficiently. Unlike container images, which benefit from reusable, cacheable layers, disk images are always built from scratch. This leads to long build times and limited opportunities for data deduplication.&lt;/p&gt;&lt;p&gt;To address this, I&#39;ve been exploring alternative options to improve the pipeline.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Disk Image Formats and Deduplication&lt;/h2&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;My Current Format: QCOW2&lt;/h3&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;I currently use QCOW2 with compression enabled. This format offers several features like snapshots, compression, and sparse files, which are useful when the underlying filesystem doesn&#39;t support them. However, if the filesystem does provide these features, QCOW2 doesn&#39;t offer many additional benefits over a simple raw disk image, at least for my use case.&lt;/p&gt;&lt;p&gt;Some notes:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;Raw disk images are more transparent and widely supported by various tools. It&#39;s also much easier to deduplicate raw image files than compressed QCOW2 images. A QCOW2 image without compression should theoretically be similar to a raw image, but I haven&#39;t verified this.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;The compression in QCOW2 is &quot;read-only,&quot; meaning new writes aren&#39;t compressed. 
This isn&#39;t a problem for me because my VMs are immutable, so the images are rarely written to after creation.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;span style=&quot;font-family: monospace;&quot;&gt;bootc-image-builder&amp;nbsp;&lt;/span&gt;actually builds the raw image first, before converting it into the QCOW2 format.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;The Power of Deduplication&lt;/h3&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;I expect deduplication to be highly effective in my setup because most of my disk images are very similar. There are a few ways to achieve this:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;Filesystem Deduplication: This approach can be either online (e.g., ZFS) or offline (e.g., btrfs). The filesystem finds duplicate data blocks within files and removes redundant data from the disk. This is a general solution but doesn&#39;t necessarily speed up the initial build process.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Proactive Deduplication: This method is about building new images by applying small changes to an existing one. For example, you can &quot;fork&quot; an image using &lt;code&gt;cp --reflink a.img b.img&lt;/code&gt; or &lt;code&gt;qemu-img create -b a.qcow2 b.qcow2&lt;/code&gt;. Only the differences between the two images are stored on disk. This approach can significantly speed up the build process because you are not building from scratch, but it requires images to be built incrementally, not from a clean slate.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Exploring New Approaches&lt;/h2&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;Bootc and In-Place Updates&lt;/h3&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;I&#39;m not currently using &lt;code&gt;bootc&lt;/code&gt; images in their intended way. 
&lt;code&gt;bootc&lt;/code&gt; is designed so you build a single disk image once and then update it in-place via a container registry.&lt;/p&gt;&lt;p&gt;I&#39;ve considered two ways of leveraging this:&lt;/p&gt;&lt;ol start=&quot;1&quot;&gt;&lt;li&gt;&lt;p&gt;I could trust the VMs to update themselves.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;I could maintain a &quot;trusted base image&quot; and follow this process:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;Create a base disk image using &lt;code&gt;bootc&lt;/code&gt;. This image is only used for building other images and never for running services.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;To create the disk image for a specific VM, say VM X, I would first fork the base image using &lt;code&gt;cp --reflink&lt;/code&gt; or &lt;code&gt;qemu-img create -b&lt;/code&gt; to create &lt;code&gt;X.img&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;I would then boot a VM using &lt;code&gt;X.img&lt;/code&gt; and have it upgrade itself using VM X&#39;s specific container image. This container image could either be served from the builder VM via a server or a mounted directory, or it could be built locally within the forked VM, potentially using shared layers from a mounted cache.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;This process seems workable, but it&#39;s overly complex for my taste. It involves running VMs during the build process, which would require a significant amount of scripting.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;Plain Disk Images and In-Place Updates&lt;/h3&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;This is similar to the &lt;code&gt;bootc&lt;/code&gt; approach but uses standard raw disk images. Again, I could set up temporary VMs for the build process, but instead of relying on &lt;code&gt;bootc&lt;/code&gt;&#39;s update mechanism, I would need custom scripts. This starts to resemble tools like cloud-init or Ansible.&lt;/p&gt;&lt;p&gt;A key benefit here is that a VM isn&#39;t a strict dependency. 
I could use something like &lt;code&gt;systemd-nspawn&lt;/code&gt; to directly modify the disk images in-place, which would simplify scripting and make the process more reliable. I did attempt this with &lt;code&gt;bootc&lt;/code&gt; images, but they don&#39;t work well with &lt;code&gt;systemd-nspawn&lt;/code&gt; out of the box because the partitions lack the UUIDs that &lt;code&gt;systemd-nspawn&lt;/code&gt; requires.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Final Thoughts&lt;/h2&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;Ultimately, I haven&#39;t found a truly satisfying improvement to my current build process. While some of these approaches could theoretically improve build times and reduce disk usage, they also make the build pipeline more complicated and less reliable. At this moment, I don&#39;t think the trade-off is worth it.&lt;/p&gt;&lt;p&gt;For now, I&#39;ll probably just experiment with deduplication on ZFS and reflink on XFS. I noted that ZFS doesn&#39;t support reflink (&lt;a href=&quot;https://github.com/openzfs/zfs/issues/405#issuecomment-1880208374&quot;&gt;zfs_bclone_enabled&lt;/a&gt;) by default, so that&#39;s a small hurdle.&lt;/p&gt;&lt;p&gt;This exploration has been an interesting learning experience. I&#39;ve revisited/discovered some relevant tools:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;libvirt&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;incus&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;systemd-nspawn&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;cloud-init&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;ansible&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;systemd-volatile-root.service&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Sometimes, when I&#39;m writing my own scripts, I feel like I&#39;m building a slimmed-down version of these tools myself. 
However, I&#39;m not yet convinced that it&#39;s the right time to fully switch to them.&lt;/p&gt;&lt;p&gt;[UPDATE]: I learned that this process is called Golden Image and Phoenix Server.&lt;/p&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/33534782/4237445026018053509' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/4237445026018053509'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/4237445026018053509'/><link rel='alternate' type='text/html' href='http://blog.wang-lu.com/2025/08/rethinking-my-vm-image-pipeline.html' title='Rethinking My VM Image Pipeline'/><author><name>Lu Wang</name><uri>http://www.blogger.com/profile/00576609954224192924</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-33534782.post-5945235029686505334</id><published>2025-08-06T22:09:00.007+02:00</published><updated>2025-08-07T11:27:01.189+02:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="bootc"/><category scheme="http://www.blogger.com/atom/ns#" term="configuration/usage"/><category scheme="http://www.blogger.com/atom/ns#" term="security"/><category scheme="http://www.blogger.com/atom/ns#" term="systemd"/><category scheme="http://www.blogger.com/atom/ns#" term="Tinker"/><title type='text'>A Practical Guide to Passing Secrets to VMs</title><content type='html'>&lt;p&gt;The central question is: how do you manage secrets like SSH keys, API keys, and passwords for &lt;a href=&quot;https://blog.wang-lu.com/2025/07/disposable-vms-for-home-lab-security.html&quot;&gt;disposable VMs&lt;/a&gt;? 🤷‍♂️&lt;/p&gt;&lt;p&gt;Let&#39;s establish some ground rules for this scenario. 
Suppose I want to pass an API key to the VM&amp;nbsp;&lt;span style=&quot;font-family: monospace;&quot;&gt;chimera&lt;/span&gt;, which is run by the&amp;nbsp;&lt;code&gt;chimera-runner&lt;/code&gt; user on the host. My security requirements are:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;On the host, only &lt;code&gt;root&lt;/code&gt; and&amp;nbsp;&lt;span style=&quot;font-family: monospace;&quot;&gt;chimera-runner&lt;/span&gt;&amp;nbsp;should have access to the secrets.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;In VM&amp;nbsp;&lt;span style=&quot;font-family: monospace;&quot;&gt;chimera&lt;/span&gt;, only &lt;code&gt;root&lt;/code&gt; and relevant service users should have access to the secrets.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;No one from other VMs, including their &lt;code&gt;root&lt;/code&gt; users, should have access to VM&amp;nbsp;&lt;span style=&quot;font-family: monospace;&quot;&gt;chimera&lt;/span&gt;&#39;s secrets.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;The guest VMs themselves are &lt;b&gt;not trusted&lt;/b&gt;.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;The &lt;a href=&quot;https://bootc-dev.github.io/bootc//building/secrets.html&quot; rel=&quot;noopener&quot; target=&quot;_blank&quot;&gt;bootc documentation&lt;/a&gt; on this topic is very informative.&lt;/p&gt;&lt;p&gt;At a high level, there are a few ways to achieve this.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;1. OEM Strings / Firmware&lt;/h2&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;QEMU can pass data to a VM via SMBIOS OEM strings (&lt;code&gt;-smbios&lt;/code&gt;) or firmware configuration (&lt;code&gt;-fw_cfg&lt;/code&gt;). Notably, both methods are supported by &lt;code&gt;systemd-creds&lt;/code&gt; using special keys.&lt;/p&gt;&lt;p&gt;This approach is practical for small pieces of data, like an individual password or an encryption key. 
It&#39;s not a new idea and was &lt;a href=&quot;https://lists.proxmox.com/pipermail/pve-devel/2017-October/028900.html&quot; rel=&quot;noopener&quot; target=&quot;_blank&quot;&gt;discussed&lt;/a&gt; years ago.&lt;/p&gt;&lt;p&gt;However, there are a few caveats:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;&lt;b&gt;Size Limits&lt;/b&gt;: The QEMU manpage states that the total size of all SMBIOS tables is limited to 65535 bytes. While not explicitly defined, &lt;code&gt;fw_cfg&lt;/code&gt; is also intended for small amounts of data.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;b&gt;Key Length&lt;/b&gt;: The maximum length of a &lt;code&gt;fw_cfg&lt;/code&gt; key name is 55 characters. 
If you use it with &lt;code&gt;systemd-creds&lt;/code&gt;, a special prefix is required, making the available space for your key name even shorter.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;b&gt;Security Risk&lt;/b&gt;: If you pass data as a string (e.g.&amp;nbsp;&lt;code&gt;-fw_cfg string=secrets&lt;/code&gt;), the secret becomes part of the QEMU process&#39;s command line, which is &lt;b&gt;visible to all users on the host!&lt;/b&gt;&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;b&gt;Bugs&lt;/b&gt;: You can provide a file to SMBIOS using &lt;code&gt;-smbios path=filename&lt;/code&gt;, which avoids exposing the secret on the command line. However, this feature is affected by &lt;a href=&quot;https://gitlab.com/qemu-project/qemu/-/issues/2879&quot; rel=&quot;noopener&quot; target=&quot;_blank&quot;&gt;a bug&lt;/a&gt; that is still present in 
Debian Bookworm.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;b&gt;Accessibility&lt;/b&gt;: Data passed via firmware appears in the guest&#39;s &lt;code&gt;/sys/firmware&lt;/code&gt; directory, which cannot be mounted as a typical block device.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;For these reasons, I generally use &lt;code&gt;-fw_cfg file=filepath&lt;/code&gt; for passing small secrets and leverage &lt;code&gt;systemd-creds&lt;/code&gt; within the guest whenever possible.&lt;/p&gt;&lt;p&gt;I might switch to&amp;nbsp;&lt;span style=&quot;font-family: monospace;&quot;&gt;-smbios path=filename&amp;nbsp;&lt;/span&gt;later, when the bug is fixed.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;2. Network Share&lt;/h2&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;This is probably the most common way of sharing files between a host and its guests. The host sets up a file-sharing service, and the guest connects to it to access the files.&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;&lt;b&gt;Pros&lt;/b&gt;: Changes on the host are reflected in the guest immediately, although this might not be important for static secrets.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;b&gt;Cons&lt;/b&gt;: The server on the host needs to authenticate clients (the VMs) to ensure one VM cannot access another&#39;s secrets. The guest VM also needs to set proper file permissions internally.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Common options include:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;&lt;b&gt;QEMU&#39;s built-in SMB server&lt;/b&gt;: Easy to set up, but the guest share is &lt;b&gt;accessible to everyone in the guest&lt;/b&gt;. A non-root user can access the content using userspace tools.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;b&gt;SMB Server&lt;/b&gt;: For each VM, you must create a dedicated user/password pair. 
This password must then be passed securely to the VM (using another method).&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;b&gt;NFS Server&lt;/b&gt;: NFS authenticates clients by IP address and trusts the client&#39;s &lt;code&gt;root&lt;/code&gt; user. This is risky because a compromised VM could spoof its IP address. An extra authentication layer, like WireGuard, might be necessary.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;b&gt;SSHFS&lt;/b&gt;: Each VM needs a dedicated SSH key pair stored securely. The host can use a standard SSH server configuration.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;While these options are viable, I find them less than ideal. They require significant effort to set up correctly, and the servers add extra maintenance overhead. Furthermore, SSHFS appears to be no longer actively maintained.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;3. Filesystem Passthrough&lt;/h2&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;This method is similar to a network share but is optimized for VM environments. Instead of the network stack, it uses a more direct channel to expose a host filesystem to the guest, which is generally faster.&lt;/p&gt;&lt;p&gt;Common options are:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;&lt;b&gt;9pfs&lt;/b&gt;: Easy to set up, but my guest OS (&lt;code&gt;centos-bootc:stream9&lt;/code&gt;) lacks the necessary kernel support, and I prefer not to compile custom kernels.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;b&gt;virtiofsd&lt;/b&gt;: This is a popular and high-performance method, but it requires shared memory between the host and guest, plus an extra daemon&amp;nbsp;running on the host.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Unfortunately, neither of these options worked for my specific setup.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;4. Credential Fetcher&lt;/h2&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;With this approach, the guest fetches credentials as needed, typically during boot. 
This can be implemented easily in the guest, for example, by using &lt;code&gt;scp&lt;/code&gt; to copy secrets into a &lt;code&gt;ramfs&lt;/code&gt; mount.&lt;/p&gt;&lt;p&gt;However, this requires setting up a server (e.g. SSH) on the host, which I wanted to avoid due to the added complexity and maintenance.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;5. Embed in Container Image&lt;/h2&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;It&#39;s possible to generate and embed secrets directly into the bootc container image during the build process.&lt;/p&gt;&lt;p&gt;An interesting variation is to encrypt the secrets, embed the encrypted data in the image, and pass the decryption key to the VM using another method (like an OEM string). You could use &lt;code&gt;systemd-creds&lt;/code&gt; for this and even bind decryption to a virtual TPM simulated by QEMU. However, as noted in &lt;a href=&quot;https://github.com/systemd/systemd/issues/33500&quot; rel=&quot;noopener&quot; target=&quot;_blank&quot;&gt;this discussion&lt;/a&gt;, this might not be the intended use case for &lt;code&gt;systemd-creds&lt;/code&gt;.&lt;/p&gt;&lt;p&gt;While this works in theory, it doesn&#39;t offer significant benefits for my workflow and feels tedious to implement.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;6. Disk Images&lt;/h2&gt;&lt;div&gt;Any file can be attached to the VM as a raw disk image; the guest can then read the data directly from the block device, without mounting a filesystem. Note that there may be issues if the file size is not aligned with the block size.&lt;/div&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;A more practical approach is to create a proper disk image containing the secrets and attach it to the VM. Any filesystem supported by the guest OS will work, but the ideal choice would have these characteristics:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;The disk image can be created &lt;b&gt;without root privileges&lt;/b&gt; on the host.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;The filesystem is &lt;b&gt;optimized for read-only data&lt;/b&gt;.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;I found three practical options:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;&lt;b&gt;EROFS&lt;/b&gt;: A modern, read-only filesystem that supports volume labels.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;b&gt;SquashFS&lt;/b&gt;: &lt;code&gt;mksquashfs&lt;/code&gt; is very flexible; its &quot;pseudo file&quot; feature lets you create files directly from command output.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;b&gt;ISO 9660&lt;/b&gt; (CD/DVD images): Universally supported but less flexible.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;I plan to use EROFS mainly because it supports volume labels. 
I cannot guarantee that the disk order will be consistent across all VMs. Therefore, the guest needs a reliable way to identify the secrets disk, and mounting by label is the easiest solution.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Conclusion&lt;/h2&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;After evaluating the options, I settled on the following combination:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;&lt;b&gt;OEM strings&lt;/b&gt; for small, individual secrets (like a decryption key).&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;b&gt;Read-only EROFS disk images&lt;/b&gt; for larger sets of secret files.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;b&gt;QEMU&#39;s built-in SMB server&lt;/b&gt; for sharing &lt;i&gt;encrypted&lt;/i&gt; data blobs, where the decryption key is passed separately via an OEM string.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Keep in mind that this solution is tailored to my specific use case, which has the following constraints:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;The guest VMs are not fully trusted.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;I want to minimize setting up and maintaining complex services on the host.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Automating the setup via scripting before a VM starts is acceptable and even preferable.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;The solution must scale easily to many different VMs.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/33534782/5945235029686505334' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/5945235029686505334'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/5945235029686505334'/><link rel='alternate' type='text/html' href='http://blog.wang-lu.com/2025/08/a-practical-guide-to-passing-secrets-to.html' title='A Practical Guide to Passing Secrets to VMs'/><author><name>Lu 
Wang</name><uri>http://www.blogger.com/profile/00576609954224192924</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-33534782.post-671061561909468829</id><published>2025-08-01T23:23:00.002+02:00</published><updated>2025-08-02T21:08:15.283+02:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="backup"/><category scheme="http://www.blogger.com/atom/ns#" term="bootc"/><category scheme="http://www.blogger.com/atom/ns#" term="configuration/usage"/><category scheme="http://www.blogger.com/atom/ns#" term="immutable"/><category scheme="http://www.blogger.com/atom/ns#" term="sysadmin"/><category scheme="http://www.blogger.com/atom/ns#" term="Tinker"/><title type='text'>Backing Up VM Disk Images</title><content type='html'>&lt;p&gt;[Update] I managed to work out the AppArmor profile, and decided to go with guestmount for now.&lt;/p&gt;&lt;p&gt;I am setting up a maintenance pipeline for my &lt;a href=&quot;https://blog.wang-lu.com/2025/07/disposable-vms-for-home-lab-security.html&quot;&gt;virtual machines&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;The pipeline has two main routines:&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;ol style=&quot;text-align: left;&quot;&gt;&lt;li&gt;The BACKUP routine: Every day, this routine shuts down each VM, backs up the data in /var, updates the VM&#39;s disk image if a new version is available, and then restarts it.&lt;/li&gt;&lt;li&gt;The BUILD routine: Every week, this routine uses a special builder VM to create new disk images for all VMs.&lt;/li&gt;&lt;/ol&gt;&lt;div&gt;There is a scheduling conflict with the builder VM: the BACKUP routine needs to shut it down, while the BUILD routine needs it running. To resolve this, I merged both into a single set of systemd services that runs daily. 
The BUILD routine starts automatically when the builder VM starts, at the end of the BACKUP routine. The builder VM&#39;s systemd unit has an ExecCondition= setting that causes the unit to be skipped six days a week.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;Surprisingly, the most difficult part of this pipeline was not the scheduling, but the backup process itself.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;There are two general approaches to backing up a VM: backing up the entire disk image or backing up the files directly from the filesystem.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;h2 style=&quot;text-align: left;&quot;&gt;Backing Up Disk Images&lt;/h2&gt;&lt;div&gt;Backing up a disk image is straightforward because disk images are just regular files. However, this method is often inefficient:&lt;/div&gt;&lt;div&gt;&lt;ul style=&quot;text-align: left;&quot;&gt;&lt;li&gt;Deduplication may not work well if the disk image is compressed.&lt;/li&gt;&lt;li&gt;Incremental backups are not natively supported.&lt;/li&gt;&lt;li&gt;The entire disk image must be backed up, including unused and deleted data.&lt;/li&gt;&lt;li&gt;You cannot easily choose which specific files to include or exclude from the backup.&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;&lt;div&gt;There are some ways to improve this approach:&lt;/div&gt;&lt;div&gt;&lt;ul style=&quot;text-align: left;&quot;&gt;&lt;li&gt;For deduplication: I can decompress the disk image and pipe the data stream directly to the backup software without saving the decompressed file.&lt;/li&gt;&lt;li&gt;For incremental backups: I can create snapshots and back up only the differences. I would also need to regularly merge the snapshots.&lt;/li&gt;&lt;ul&gt;&lt;li&gt;QEMU supports incremental backup for a running VM. 
Relevant articles: &lt;a href=&quot;https://qemu-project.gitlab.io/qemu/interop/bitmaps.html&quot;&gt;1&lt;/a&gt;, &lt;a href=&quot;https://wiki.qemu.org/Features/IncrementalBackup&quot;&gt;2&lt;/a&gt;.&lt;/li&gt;&lt;/ul&gt;&lt;li&gt;To reduce backup size: I can defragment, shrink, wipe, and sparsify the disk image (for example, with virt-sparsify) before backing it up.&lt;/li&gt;&lt;li&gt;To exclude specific files: I can put the files or directories that I don&#39;t want to back up on a separate disk image.&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;&lt;h2 style=&quot;text-align: left;&quot;&gt;Backing Up Filesystems&lt;/h2&gt;&lt;div&gt;One can back up files from inside the VM. It is also possible to mount the disk image from the host system when the VM is shut down.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Raw disk images can be mounted directly using loop devices.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Qcow2 images have several options:&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;ul style=&quot;text-align: left;&quot;&gt;&lt;li&gt;`qemu-nbd` can expose an image as a block device (e.g., /dev/nbd*), which can then be mounted. This requires the nbd kernel module and root access. The qemu-nbd man page warns that this may not be suitable for untrusted guests. To back up multiple VMs, I would also need a way to find an available /dev/nbd* device.&lt;/li&gt;&lt;ul&gt;&lt;li&gt;The block device can also be exported without root, using tools like `qemu-nbd`, `nbdfuse`, and `qemu-storage-daemon`. &lt;a href=&quot;https://www.qemu.org/2021/08/22/fuse-blkexport/&quot;&gt;This article&lt;/a&gt; is worth reading.&lt;/li&gt;&lt;li&gt;`qemu-nbd` also supports exposing an internal snapshot of a qcow2 image.&lt;/li&gt;&lt;/ul&gt;&lt;li&gt;`guestmount` can mount a disk image without needing root. It uses QEMU and FUSE. 
While it works, it can be slow, and creating a secure AppArmor profile for it is difficult due to its complexity.&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;&lt;div&gt;&lt;h2 style=&quot;text-align: left;&quot;&gt;My Thoughts&lt;/h2&gt;&lt;div&gt;All of these options have trade-offs between security, performance, complexity, and flexibility.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In my case, my priorities are:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;ul style=&quot;text-align: left;&quot;&gt;&lt;li&gt;Security: I do not trust the guest VMs. This means the guest should not connect to the host, and the host should not load the guest&#39;s disk image using a kernel module. While I could move the backup logic to a separate VM, this would add a lot of complexity.&lt;/li&gt;&lt;li&gt;Simplicity: I want a simple workflow that is easy to maintain and secure. I prefer to avoid writing complicated AppArmor profiles that are difficult to update.&lt;/li&gt;&lt;li&gt;Performance: I don&#39;t need the backup to be super fast, but I also don&#39;t want to keep a VM shut down for too long.&lt;/li&gt;&lt;li&gt;Space Efficiency: I don&#39;t have a large amount of data, so disk space is not a major concern.&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;&lt;div&gt;Considering these factors, I prefer backing up the entire disk image. Recall that my root filesystem is created from a Containerfile, and /etc is transient, so I primarily need to back up /var.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;For now, I plan to use qcow2 images with compression. 
I will decompress the disk image and pipe the data to the backup software.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In the future, I might explore some optimizations:&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;ul style=&quot;text-align: left;&quot;&gt;&lt;li&gt;Using a raw disk image on a ZFS host filesystem with compression and possibly deduplication enabled.&lt;/li&gt;&lt;li&gt;Taking a snapshot (either qcow2 or ZFS) and backing it up. This would allow me to restart the VM without waiting for the backup to finish.&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;/p&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/33534782/671061561909468829' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/671061561909468829'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/671061561909468829'/><link rel='alternate' type='text/html' href='http://blog.wang-lu.com/2025/08/backing-up-vm-disk-images.html' title='Backing Up VM Disk Images'/><author><name>Lu Wang</name><uri>http://www.blogger.com/profile/00576609954224192924</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-33534782.post-6214735207821216682</id><published>2025-07-28T22:00:00.001+02:00</published><updated>2025-07-28T22:00:07.559+02:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="configuration/usage"/><category scheme="http://www.blogger.com/atom/ns#" term="评论和记事"/><title type='text'>GNU Stow</title><content type='html'>&lt;p&gt;Just learned about &lt;a href=&quot;https://www.gnu.org/software/stow/&quot;&gt;GNU Stow&lt;/a&gt;, which is a tool for managing symlink 
farms.&lt;/p&gt;&lt;p&gt;Basically, the idea is to store all files in one place, then create symlinks throughout the system pointing to them.&lt;/p&gt;&lt;p&gt;There are various use cases, like dot files and installing/uninstalling packages. But I mostly use it for tracking system config files, similar to how NixOS works. In fact, I had written my own scripts with &quot;cp -rs&quot;, but GNU Stow works much better.&lt;/p&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/33534782/6214735207821216682' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/6214735207821216682'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/6214735207821216682'/><link rel='alternate' type='text/html' href='http://blog.wang-lu.com/2025/07/gnu-stow.html' title='GNU Stow'/><author><name>Lu Wang</name><uri>http://www.blogger.com/profile/00576609954224192924</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-33534782.post-7699581029209988628</id><published>2025-07-27T20:54:00.010+02:00</published><updated>2025-07-27T20:55:48.491+02:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="immutable"/><category scheme="http://www.blogger.com/atom/ns#" term="linux"/><category scheme="http://www.blogger.com/atom/ns#" term="NixOS"/><category scheme="http://www.blogger.com/atom/ns#" term="sysadmin"/><category scheme="http://www.blogger.com/atom/ns#" term="Tinker"/><title type='text'>Disposable VMs for Home Lab Security and Reproducibility</title><content type='html'>&lt;p&gt;Today, various services (native, LXC, Docker) are running on my server. 
I&#39;m mostly happy with the setup, but I decided to revisit my server&#39;s defenses under the assumption that a remote attacker or malicious code could compromise my services. A service might break out of its container or even gain root privilege.&lt;/p&gt;&lt;p&gt;VMs are a better security boundary than containers; they can limit the damage if an attacker gains root privilege. I cannot afford to run a dedicated VM for each service, so I will need to carefully group the services and run a dedicated VM for each group. Each group should be carefully designed based on the data accessed and the features/capabilities required. For example, some VMs may have access to my photos, while others may not have network access.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;h2 style=&quot;text-align: left;&quot;&gt;The Goal&lt;/h2&gt;&lt;p&gt;There are two particular issues I want to address:&lt;/p&gt;&lt;p&gt;First, I want VM images to be easily reproducible, which makes backup and restore trivial. NixOS and GNU Guix System are great examples, where you only need to back up the configuration file. However, I don&#39;t really like them because of their domain-specific language/design.&lt;/p&gt;&lt;p&gt;Second, I want to seal the system as much as possible. Even a compromised root user inside a VM should not be able to permanently infect the VM. Many so-called &quot;immutable&quot; Linux distributions are not truly 100% immutable. Often, they just mean a read-only /usr. Some can be easily broken via `mount -o remount,rw`, and most of them allow self-upgrade, meaning a malicious root user can still inject code via &quot;upgrade and reboot.&quot;&lt;/p&gt;&lt;h2 style=&quot;text-align: left;&quot;&gt;The Approach&lt;/h2&gt;&lt;p&gt;I use &lt;a href=&quot;https://bootc-dev.github.io/&quot;&gt;bootc containers&lt;/a&gt;. 
This allows me to build the whole system with standard scripts, and it offers the standard &quot;immutability.&quot;&lt;/p&gt;&lt;p&gt;Furthermore, I run QEMU with `--no-reboot --snapshot`, which means the system cannot update itself even with root privilege.&lt;/p&gt;&lt;p&gt;Lastly, I&#39;ll regularly build new images and restart the VM to pick up the latest security fixes.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;This approach is essentially managing VMs like containers. It&#39;s not a new idea; &lt;a href=&quot;https://words.filippo.io/frood/&quot;&gt;frood&lt;/a&gt; and &lt;a href=&quot;https://gokrazy.org/&quot;&gt;gokrazy&lt;/a&gt; are good examples of this principle.&lt;/p&gt;&lt;p&gt;On a side note, I also plan to learn more about KubeVirt and Nix VMs. In particular, I like the idea (from NixOS) that the guest can directly use the store from the host.&lt;/p&gt;&lt;h2 style=&quot;text-align: left;&quot;&gt;Notes about QEMU&lt;/h2&gt;&lt;p&gt;Persistent machine-local data is stored in /var, which lives on a separate disk image.&lt;/p&gt;&lt;p&gt;Secrets are sent to QEMU via &lt;a href=&quot;https://systemd.io/CREDENTIALS/&quot;&gt;systemd credentials&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;I tried &lt;a href=&quot;https://virtio-fs.gitlab.io/&quot;&gt;virtiofsd&lt;/a&gt;, but didn&#39;t like it. I ended up with Samba anyway. Maybe I&#39;ll revisit virtiofsd later.&lt;/p&gt;&lt;p&gt;To shut down the VM (e.g., via systemd), I created a dedicated admin user whose privileges are defined in the sudoers file, so that I can run `ssh admin@vm sudo poweroff`. The SSH key pair is regenerated before each VM boot. 
Related: In a systemd unit, ExecStop= does not have access to LoadCredential.&lt;/p&gt;&lt;p&gt;I use `-chardev socket,logfile=...` and `-serial` so that the systemd logs are not filled with console output, and I can view or attach to the serial console later.&lt;/p&gt;&lt;p&gt;I plan to learn more about virtio-balloon and pmem later.&lt;/p&gt;&lt;h2 style=&quot;text-align: left;&quot;&gt;Conclusion&lt;/h2&gt;&lt;p&gt;I find it very beneficial to deploy VMs. It allows me to shrink and harden the host OS (e.g., disable unprivileged user namespaces), and it allows me to design fine-grained access control.&lt;/p&gt;&lt;p&gt;Next, I&#39;ll start investigating how to organize the containers inside VMs.&amp;nbsp;&lt;/p&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/33534782/7699581029209988628' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/7699581029209988628'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/7699581029209988628'/><link rel='alternate' type='text/html' href='http://blog.wang-lu.com/2025/07/disposable-vms-for-home-lab-security.html' title='Disposable VMs for Home Lab Security and Reproducibility'/><author><name>Lu Wang</name><uri>http://www.blogger.com/profile/00576609954224192924</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-33534782.post-2286897951241508050</id><published>2025-06-14T08:47:00.006+02:00</published><updated>2025-06-14T08:49:40.413+02:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="bootc"/><category scheme="http://www.blogger.com/atom/ns#" term="security"/><category scheme="http://www.blogger.com/atom/ns#" term="selinux"/><category 
scheme="http://www.blogger.com/atom/ns#" term="systemd"/><category scheme="http://www.blogger.com/atom/ns#" term="Tinker"/><title type='text'>SELinux and useful systemd components</title><content type='html'>&lt;p&gt;I just learned about a few interesting and useful things while playing with bootc:&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;h2 style=&quot;text-align: left;&quot;&gt;systemd Components&lt;/h2&gt;&lt;div style=&quot;text-align: left;&quot;&gt;&lt;a href=&quot;https://www.freedesktop.org/software/systemd/man/latest/tmpfiles.d.html&quot;&gt;systemd-tmpfiles&lt;/a&gt;&amp;nbsp;and &lt;a href=&quot;https://www.freedesktop.org/software/systemd/man/latest/sysusers.d.html&quot;&gt;systemd-sysusers&lt;/a&gt;&amp;nbsp;allow managing files and users in a declarative way. Originally I learned about them for building bootc images, but later I realized that they are also very useful on Debian.&lt;/div&gt;&lt;div style=&quot;text-align: left;&quot;&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style=&quot;text-align: left;&quot;&gt;I learned about &lt;a href=&quot;https://systemd.io/CREDENTIALS/&quot;&gt;systemd credentials&lt;/a&gt;&amp;nbsp;as a way of passing SSH authorized keys to a QEMU VM, but after reading more, I realized they can be used in other interesting ways. My favorite: with LoadCredential=, I can run a script with DynamicUser=yes and the script can still access some root-only secrets.&lt;/div&gt;&lt;p&gt;&lt;br /&gt;I finally decided to migrate from cron to systemd timers. They are more interesting and handy than I expected, and the migration was less painful than I expected.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;h2 style=&quot;text-align: left;&quot;&gt;SELinux&lt;/h2&gt;&lt;div&gt;Actually, I heard about SELinux many years ago. 
Over time, I only knew SELinux as &quot;something about security, similar to AppArmor but more complicated&quot;.&lt;/div&gt;&lt;p&gt;Recently I got to learn more about it:&lt;/p&gt;&lt;p&gt;- &lt;a href=&quot;https://people.redhat.com/duffy/selinux/selinux-coloring-book_A4-Stapled.pdf&quot;&gt;https://people.redhat.com/duffy/selinux/selinux-coloring-book_A4-Stapled.pdf&lt;/a&gt;&lt;/p&gt;&lt;p&gt;- &lt;a href=&quot;https://www.youtube.com/watch?v=_WOKRaM-HI4&quot;&gt;https://www.youtube.com/watch?v=_WOKRaM-HI4&lt;/a&gt;&lt;/p&gt;&lt;p&gt;- &lt;a href=&quot;https://developers.redhat.com/articles/2025/04/11/my-advice-selinux-container-labeling#&quot;&gt;https://developers.redhat.com/articles/2025/04/11/my-advice-selinux-container-labeling#&lt;/a&gt;&lt;/p&gt;&lt;p&gt;- &lt;a href=&quot;https://docs.podman.io/en/v5.0.3/markdown/podmansh.1.html&quot;&gt;https://docs.podman.io/en/v5.0.3/markdown/podmansh.1.html&lt;/a&gt;&lt;/p&gt;&lt;p&gt;- &lt;a href=&quot;https://reintech.io/blog/securing-debian-12-with-selinux&quot;&gt;https://reintech.io/blog/securing-debian-12-with-selinux&lt;/a&gt;&lt;/p&gt;&lt;p&gt;- &lt;a href=&quot;https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/9/html/using_selinux/assembly_using-multi-category-security-mcs-for-data-confidentiality_using-selinux&quot;&gt;https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/9/html/using_selinux/assembly_using-multi-category-security-mcs-for-data-confidentiality_using-selinux&lt;/a&gt;&lt;/p&gt;&lt;p&gt;- &lt;a href=&quot;https://www.redhat.com/en/blog/how-selinux-separates-containers-using-multi-level-security&quot;&gt;https://www.redhat.com/en/blog/how-selinux-separates-containers-using-multi-level-security&lt;/a&gt;&lt;/p&gt;&lt;p&gt;- &lt;a 
href=&quot;https://www.redhat.com/en/blog/why-you-should-be-using-multi-category-security-your-linux-containers&quot;&gt;https://www.redhat.com/en/blog/why-you-should-be-using-multi-category-security-your-linux-containers&lt;/a&gt;&lt;/p&gt;&lt;p&gt;- &lt;a href=&quot;https://wiki.gentoo.org/wiki/SELinux/User-based_access_control&quot;&gt;https://wiki.gentoo.org/wiki/SELinux/User-based_access_control&lt;/a&gt;&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;With more knowledge about it, I feel that I like it much better than before:&lt;/p&gt;&lt;p&gt;- I actually have written my own tool to verify file permissions, in a similar fashion (regex -&amp;gt; permissions)&lt;/p&gt;&lt;p&gt;- rootless docker and apparmor didn&#39;t work very well together in my case. rootless podman and selinux might work better together.&lt;/p&gt;&lt;p&gt;- :Z is pretty nice for containers: &lt;a href=&quot;https://github.com/containers/container-selinux&quot;&gt;https://github.com/containers/container-selinux&lt;/a&gt;&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;I probably will start trying SELinux. 
With more confidence I might eventually enable it on Debian and replace AppArmor.&lt;/p&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/33534782/2286897951241508050' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/2286897951241508050'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/2286897951241508050'/><link rel='alternate' type='text/html' href='http://blog.wang-lu.com/2025/06/selinux-and-useful-systemd-components.html' title='SELinux and useful systemd components'/><author><name>Lu Wang</name><uri>http://www.blogger.com/profile/00576609954224192924</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-33534782.post-3795344406752487179</id><published>2025-05-10T22:23:00.004+02:00</published><updated>2025-05-10T22:26:36.034+02:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="bootc"/><category scheme="http://www.blogger.com/atom/ns#" term="immutable"/><category scheme="http://www.blogger.com/atom/ns#" term="linux"/><category scheme="http://www.blogger.com/atom/ns#" term="Tinker"/><category scheme="http://www.blogger.com/atom/ns#" term="评论和记事"/><title type='text'>First 3 Days with bootc</title><content type='html'>&lt;p&gt;I decided to spend some time playing with bootc. 
I was mostly inspired by the following articles:&amp;nbsp;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;ul style=&quot;text-align: left;&quot;&gt;&lt;li&gt;&lt;a href=&quot;https://blog.gripdev.xyz/2024/03/16/in-search-of-a-zero-toil-homelab-with-immutable-linux/&quot;&gt;CoreOS + native container&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://developers.redhat.com/articles/2024/09/24/bootc-getting-started-bootable-containers&quot;&gt;Hands-on demo&lt;/a&gt; (the last video): building a bootc image and auto-updating from a registry&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://fedoramagazine.org/building-your-own-atomic-bootc-desktop/&quot;&gt;bootc desktop&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=5ZN_7NDvavY&quot;&gt;bootc for homelab&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;h2 style=&quot;text-align: left;&quot;&gt;Day 1&lt;/h2&gt;&lt;p&gt;To install bootc in a VM I needed an image. bootc-image-builder requires root and I didn&#39;t want to run it on the host, so I chose CoreOS as the initial system and installed it in QEMU.&lt;/p&gt;&lt;p&gt;I thought it would be a great idea to share a folder from host to guest as podman container storage. However, it was not as smooth as I had expected:&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;ul style=&quot;text-align: left;&quot;&gt;&lt;li&gt;virtiofsd on Debian is too old, so I set up NFS.&lt;/li&gt;&lt;li&gt;rootless podman &lt;a href=&quot;https://www.redhat.com/en/blog/rootless-podman-nfs&quot;&gt;doesn&#39;t work well with NFS&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;rootful podman complained that the filesystem underneath overlayfs was missing features, and the performance was terrible.&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;I gave up. I guess I&#39;ll just use the CoreOS disk, which is only 10G and not enough.&lt;/div&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;h2 style=&quot;text-align: left;&quot;&gt;Day 2&lt;/h2&gt;&lt;p&gt;I didn&#39;t find a way of resizing a qcow2 image online. 
On the other hand, I figured that maybe I didn&#39;t need to build a disk image after all. CoreOS is already based on ostree, so maybe I could use `bootc switch`. This is essentially the same approach as in the first blog post.&lt;/p&gt;&lt;p&gt;`bootc switch` just worked and the system rebooted, but I could not log in (via SSH or locally). Fortunately (and quite nicely), I could roll back ostree even without logging in, because that can be done from GRUB.&lt;/p&gt;&lt;p&gt;I suspected it was because some files were overriding the files in the image.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;h2 style=&quot;text-align: left;&quot;&gt;Day 3&lt;/h2&gt;&lt;p&gt;I learned that QEMU has built-in Samba support, which is much easier to use than NFS.&lt;/p&gt;&lt;p&gt;Eventually I found that SELinux was the culprit. With `restorecon -R` I could log in from the QEMU terminal, but not via SSH. After logging in, `bootc status` threw an error about /boot, so I guess I needed `bootc install` after all.&lt;/p&gt;&lt;p&gt;So I went back to the original plan: resize the CoreOS image and build the bootc image inside CoreOS. It worked this time.&lt;/p&gt;&lt;p&gt;Now I need to complete the loop: the new bootc OS should build itself and update itself automatically. There are also a few more things to fix:&lt;/p&gt;&lt;p&gt;- /etc/fstab does not work if modified when building the container, so I need to create systemd mount units for the mounts&lt;/p&gt;&lt;p&gt;- /etc/hostname does not work if modified when building the container, so I need to set it after each boot&lt;/p&gt;&lt;p&gt;- Unlike `bootc install`, `bootc-image-builder` does not provide flags to override the bootc repository, so I need to set it after boot&lt;/p&gt;&lt;p&gt;- Transient /etc sounds like a good idea, but I&#39;ll need to manually configure /boot. 
I&#39;d like to enable it once the official docs explain it in more detail.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;h2 style=&quot;text-align: left;&quot;&gt;Conclusion&lt;/h2&gt;&lt;p&gt;bootc works and it&#39;s quite fun. Unfortunately, I haven&#39;t yet found a way of actually using it in production:&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;ul style=&quot;text-align: left;&quot;&gt;&lt;li&gt;I could put it into a VM, but sharing files between host and VM is not pretty at the moment (on Debian stable).&lt;/li&gt;&lt;li&gt;I don&#39;t trust it as my main server yet, and I don&#39;t have another machine on which I can install bootc bare-metal.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;I think I&#39;ll spend more time tinkering with it later.&lt;/p&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/33534782/3795344406752487179' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/3795344406752487179'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/3795344406752487179'/><link rel='alternate' type='text/html' href='http://blog.wang-lu.com/2025/05/first-3-days-with-bootc.html' title='First 3 Days with bootc'/><author><name>Lu Wang</name><uri>http://www.blogger.com/profile/00576609954224192924</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-33534782.post-6692553687352083499</id><published>2025-05-05T22:50:00.003+02:00</published><updated>2025-05-05T22:50:19.928+02:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="linux"/><category scheme="http://www.blogger.com/atom/ns#" term="sysadmin"/><category scheme="http://www.blogger.com/atom/ns#" term="Tinker"/><category 
scheme="http://www.blogger.com/atom/ns#" term="评论和记事"/><title type='text'>UID and GID: The New Order</title><content type='html'>&lt;p&gt;When I have important data on a device, I back it up to my server using dedicated user accounts. The other day, I checked /etc/passwd on my server and found entries like this:&lt;/p&gt;&lt;p&gt;&lt;span style=&quot;font-family: Source Code Pro;&quot;&gt;some-backup-user1:x:1003:1004:...&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span style=&quot;font-family: Source Code Pro;&quot;&gt;some-backup-user2:x:1004:1007:...&lt;/span&gt;&lt;/p&gt;&lt;p&gt;A few inconsistencies immediately bothered me:&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;ul style=&quot;text-align: left;&quot;&gt;&lt;li&gt;&lt;b&gt;UID/GID Mismatches:&lt;/b&gt; Many users have UIDs that don&#39;t match their primary GIDs. While this technically works and might seem like just an aesthetic concern, I realized that UIDs and GIDs are crucial metadata. I need to preserve them accurately for future system migrations to maintain correct file ownership.&lt;/li&gt;&lt;li&gt;&lt;b&gt;ID Ambiguity:&lt;/b&gt; The same number (e.g., 1004) could represent a User ID for one account and a Group ID for a completely different group. This overlap is a recipe for mistakes during administration tasks if I&#39;m not paying close attention.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Lack of Structure:&lt;/b&gt; User and group accounts created for very different purposes – regular logins, backup processes, container management, specific ACLs – were all jumbled together in the same ID range. This made management and auditing more cumbersome than necessary.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;So, I decided it was time for a cleanup. 
My goal was to reorganize these user and group IDs to achieve clarity and predictability, aiming for a system where I could infer the purpose of an ID even without direct access to /etc/passwd or /etc/group.&lt;/p&gt;&lt;h2 style=&quot;text-align: left;&quot;&gt;My Strategy&lt;/h2&gt;&lt;div&gt;To achieve this, I established the following strategy:&lt;/div&gt;&lt;div&gt;&lt;ul style=&quot;text-align: left;&quot;&gt;&lt;li&gt;&lt;b&gt;Purpose-Based ID Ranges:&lt;/b&gt; Users and groups serving similar functions will be grouped together by assigning them IDs within dedicated numerical ranges. This makes it easier to understand the role of an account at a glance. For example:&lt;/li&gt;&lt;ul&gt;&lt;li&gt;1000-1999: Backup-related users and groups.&lt;/li&gt;&lt;li&gt;2000-2999: Groups specifically for managing ACLs.&lt;/li&gt;&lt;/ul&gt;&lt;li&gt;&lt;b&gt;UID/GID Correspondence:&lt;/b&gt;&amp;nbsp;If a number X is used as both a UID and a GID, then GID X must be the primary group for the user with UID X. Unrelated users and groups never share the same ID number.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Allocation Within Ranges:&lt;/b&gt; Within each designated range, users and their primary groups start from the lower end, while secondary groups start from the higher end.&lt;/li&gt;&lt;ul&gt;&lt;li&gt;Example: backup-user1 has 1000:1000, while a group backup-users has GID 1999.&lt;/li&gt;&lt;/ul&gt;&lt;/ul&gt;&lt;/div&gt;&lt;h2 style=&quot;text-align: left;&quot;&gt;Options for Managing IDs&lt;/h2&gt;&lt;div&gt;I considered several approaches to implement this strategy:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;ol style=&quot;text-align: left;&quot;&gt;&lt;li&gt;&lt;b&gt;Manual Scripting:&lt;/b&gt; This would involve carefully crafting scripts using useradd, usermod, groupadd, and groupmod. However, this approach is fraught with risk – a single typo could cause significant problems. It&#39;s also labor-intensive to get right and tedious to maintain. 
I quickly ruled this out.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Virtual Machine with cloud-init:&lt;/b&gt; Using a VM combined with cloud-init offered a more structured way to script user/group creation via its built-in directives, executing reliably during boot. This reduces some risks compared to updating an existing system but it introduces the overhead of managing a VM (file sharing, updates, resource consumption), which I also wanted to avoid.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Virtual Machine with NixOS/Guix System:&lt;/b&gt; These operating systems offer truly declarative user management, which is very appealing. While I run simple NixOS instances, fully embracing either OS just for UID/GID management felt like overkill and required a significant learning investment. Plus, this option still carried the VM overhead.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Using sysusers.d:&lt;/b&gt; While searching for &quot;declarative user management&quot; solutions, I discovered sysusers.d. This systemd mechanism uses simple configuration files (like fstab) to declare users and groups and their properties (UIDs/GIDs). Systemd ensures these users/groups exist on boot. Crucially, it was already supported and installed on my Debian server. This offered a declarative, script-free, VM-free solution integrated directly into my existing OS – making it the clear choice.&lt;/li&gt;&lt;/ol&gt;&lt;/div&gt;&lt;h2 style=&quot;text-align: left;&quot;&gt;The Migration&lt;/h2&gt;&lt;div&gt;For each user and group, I added a corresponding line to a configuration file within /etc/sysusers.d. All IDs are manually allocated. Then I deleted the old users one by one and restarted the systemd-sysusers service. Note that the service does not touch existing users or groups.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;After that I needed to migrate the permission of existing files. 
This proved trickier than I had expected:&lt;/div&gt;&lt;div&gt;&lt;ul style=&quot;text-align: left;&quot;&gt;&lt;li&gt;chown has the nice `--from=OWNER:GROUP` parameter. The man page says &quot;Either may be omitted, in which case a match is not required for the omitted attribute.&quot; However:&lt;/li&gt;&lt;ul&gt;&lt;li&gt;`chown --from :group` works&lt;/li&gt;&lt;li&gt;`chown --from user` works&lt;/li&gt;&lt;li&gt;`chown --from user:` doesn&#39;t work.&lt;/li&gt;&lt;/ul&gt;&lt;li&gt;`chown user:` actually means `chown user:user`, i.e. it also changes the group (to that user&#39;s login group), whereas `chown user` updates only the owner.&lt;/li&gt;&lt;li&gt;`setfacl` by default updates the mask, so `setfacl -m g:group:rx` actually made all my files executable! I had to roll back to a ZFS snapshot and use `setfacl --no-mask` instead.&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;Finally, I also learned about `pwck -s` and `grpck -s`, which sort the entries in /etc/passwd and /etc/group by ID and improve my quality of life.&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;h2 style=&quot;text-align: left;&quot;&gt;What&#39;s Next&lt;/h2&gt;&lt;p&gt;I plan to apply similar principles to my container setup. Specifically, I want to create dedicated users for each rootless container. And I will have to figure out:&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;ul style=&quot;text-align: left;&quot;&gt;&lt;li&gt;How to manage subuid and subgid ranges in a structured, preferably declarative, way.&lt;/li&gt;&lt;li&gt;Whether to choose Docker or Podman.&lt;/li&gt;&lt;li&gt;Whether to use VMs. If so, which OS to use, e.g. 
NixOS, GuixSystem, CoreOS.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;/p&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/33534782/6692553687352083499' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/6692553687352083499'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/6692553687352083499'/><link rel='alternate' type='text/html' href='http://blog.wang-lu.com/2025/05/uid-and-gid-new-order.html' title='UID and GID: The New Order'/><author><name>Lu Wang</name><uri>http://www.blogger.com/profile/00576609954224192924</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-33534782.post-7665128434174373192</id><published>2025-04-18T12:17:00.008+02:00</published><updated>2025-04-18T14:21:46.253+02:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="declarative"/><category scheme="http://www.blogger.com/atom/ns#" term="immutable"/><category scheme="http://www.blogger.com/atom/ns#" term="linux"/><category scheme="http://www.blogger.com/atom/ns#" term="NixOS"/><category scheme="http://www.blogger.com/atom/ns#" term="sysadmin"/><category scheme="http://www.blogger.com/atom/ns#" term="Tinker"/><title type='text'>Exploring Immutable Distros and Declarative Management</title><content type='html'>&lt;p&gt;My current server setup, based on Debian Stable and Docker, has served me reliably for years. It&#39;s stable, familiar, and gets the job done. 
However, an &lt;a href=&quot;https://blog.gripdev.xyz/2024/03/16/in-search-of-a-zero-toil-homelab-with-immutable-linux/&quot;&gt;intriguing article&lt;/a&gt; I revisited recently about Fedora CoreOS, rpm-ostree, and OSTree native containers sparked my curiosity and sent me down a rabbit hole exploring alternative approaches to system management. Could there be a better way?&lt;/p&gt;&lt;h2 style=&quot;text-align: left;&quot;&gt;Core Goals &amp;amp; Requirements&lt;/h2&gt;&lt;div&gt;&lt;div&gt;Before diving into new technologies, I wanted to define what &quot;better&quot; means for my use case:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;ul style=&quot;text-align: left;&quot;&gt;&lt;li&gt;The base operating system must update automatically and reliably.&lt;/li&gt;&lt;li&gt;Hosted services (applications) should be updatable either automatically or manually, depending on the service.&lt;/li&gt;&lt;li&gt;Configuration and data files need to be easy to modify, and crucially, automatically tracked and backed up.&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;&lt;/div&gt;&lt;h2 style=&quot;text-align: left;&quot;&gt;Current Setup: Debian Stable + Docker&lt;/h2&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;My current infrastructure consists of several servers, all running Debian Stable.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;ul style=&quot;text-align: left;&quot;&gt;&lt;li&gt;System updates are handled automatically via unattended-upgrades.&lt;/li&gt;&lt;li&gt;Services consist of a mix of native Debian packages and applications running in rootless Docker containers. Docker images are updated either periodically via scripts or manually when needed.&lt;/li&gt;&lt;li&gt;Config and data files are backed up automatically using rsync, but tracking which files need backing up is manual. This&amp;nbsp;is my main pain point. 
Every time I add or modify a service or configuration file, I have to remember to update my backup scripts.&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;While this setup works, it&#39;s not perfect:&lt;/div&gt;&lt;div&gt;&lt;ul style=&quot;text-align: left;&quot;&gt;&lt;li&gt;The manual tracking of configuration files is error-prone and tedious. I like how systems like NixOS manage the entire system declaratively from configuration files, automatically tracking changes. However, I have reservations about NixOS&#39;s learning curve and ecosystem, and I generally prefer sticking to more &quot;traditional&quot; distributions like Debian if possible.&lt;/li&gt;&lt;ul&gt;&lt;li&gt;Some services are Internet-facing. While rootless Docker improves security over rootful Docker, it&#39;s not a full security boundary. Moving these services into dedicated VMs could offer better isolation.&lt;/li&gt;&lt;/ul&gt;&lt;/ul&gt;&lt;/div&gt;&lt;/div&gt;&lt;h2 style=&quot;text-align: left;&quot;&gt;Fedora CoreOS + OSTree Native Container&lt;/h2&gt;&lt;div&gt;&lt;div&gt;Fedora CoreOS is an automatically updating, minimal, monolithic, container-focused operating system. Key concepts here include:&lt;/div&gt;&lt;div&gt;&lt;ul style=&quot;text-align: left;&quot;&gt;&lt;li&gt;The core OS is largely read-only, making updates safer and more predictable (atomic rollbacks).&lt;/li&gt;&lt;li&gt;Configuration is applied on the first boot using Butane/Ignition. This is great for initial setup but less ideal for ongoing configuration changes.&lt;/li&gt;&lt;li&gt;rpm-ostree allows layering additional RPM packages onto the immutable base image. While OSTree tracks file changes, rpm-ostree itself doesn&#39;t inherently provide a declarative way to manage the list of installed packages or track arbitrary configuration file modifications in a user-friendly, declarative manifest like NixOS does. 
It knows files changed, but not necessarily why in a structured way (e.g., &quot;/usr/bin/vim was added&quot; vs &quot;package vim was installed&quot;).&lt;/li&gt;&lt;li&gt;OSTree Native Containers: The article that inspired me highlighted using OSTree&#39;s ability to pull container images as system updates. The workflow looks like this:&lt;/li&gt;&lt;ul&gt;&lt;li&gt;Define a system image using a Containerfile/Dockerfile, starting from a CoreOS base.&lt;/li&gt;&lt;li&gt;Build this definition into an OCI container image.&lt;/li&gt;&lt;li&gt;Install a standard CoreOS, using a minimal Butane config that tells rpm-ostree to &quot;rebase&quot; onto your custom container image.&lt;/li&gt;&lt;li&gt;To update the system or add packages, modify the Containerfile, rebuild the image, push it to a registry, and the CoreOS systems will eventually pull and apply it as their next OS update.&lt;/li&gt;&lt;/ul&gt;&lt;/ul&gt;This approach essentially creates a custom, version-controlled OS image. Variations include building the image via CI/CD (like quay.io), building locally, or even using &lt;a href=&quot;https://coreos.github.io/coreos-assembler/working/&quot;&gt;coreos-assembler&lt;/a&gt; with a lower-level &quot;treefile&quot; manifest.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;h2 style=&quot;text-align: left;&quot;&gt;Bootable Containers&lt;/h2&gt;&lt;div&gt;Emerging from the CoreOS world is &lt;a href=&quot;https://docs.fedoraproject.org/en-US/bootc/&quot;&gt;Fedora Bootc&lt;/a&gt;. This seems specifically designed for use cases requiring more customization than standard CoreOS allows. Instead of Ignition and rpm-ostree, &lt;a href=&quot;https://bootc-dev.github.io/bootc/&quot;&gt;bootc&lt;/a&gt; directly manages bootable container images derived from a Containerfile. You essentially build your entire OS, including customizations, as a container image, and the system boots directly into it and updates by pulling new image versions. 
This feels conceptually cleaner and more aligned with the goal of a declaratively built system image.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;h2 style=&quot;text-align: left;&quot;&gt;Cloud-init&lt;/h2&gt;&lt;/div&gt;&lt;div&gt;&lt;a href=&quot;https://cloudinit.readthedocs.io/en/latest/index.html&quot;&gt;cloud-init&lt;/a&gt; is the de facto standard for bootstrapping cloud instances across various Linux distributions. Like Butane/Ignition, it runs on first boot, but critically, it can be &lt;a href=&quot;https://cloudinit.readthedocs.io/en/latest/howto/rerun_cloud_init.html&quot;&gt;re-run&lt;/a&gt; on subsequent boots. This makes testing configuration changes much easier, as you don&#39;t necessarily need to re-image the entire machine for every small tweak. It&#39;s widely supported but focuses more on initial setup and less on managing the entire OS lifecycle declaratively, as the OSTree/bootc approaches do.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;h2 style=&quot;text-align: left;&quot;&gt;Guix&lt;/h2&gt;&lt;div&gt;Guix System is the GNU project&#39;s take on a declarative operating system, similar in principle to NixOS. It uses Guile/Scheme for its configuration language and boasts a clean command-line interface. While appealing from a purity perspective, its smaller community and adoption compared to NixOS make it a less pragmatic choice for me currently. 
Like NixOS, it can build VM images and manage VMs.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;h2 style=&quot;text-align: left;&quot;&gt;Conclusion and Thoughts&lt;/h2&gt;&lt;div&gt;&lt;div&gt;&lt;ul style=&quot;text-align: left;&quot;&gt;&lt;li&gt;Conceptually, Fedora Bootc is the most compelling alternative I found. It directly addresses the desire to build a customized, yet manageable and updatable, system image using familiar container tooling. The main drawback? It&#39;s very new, lacks widespread adoption, and crucially, there&#39;s no &quot;Debian Bootc&quot; yet. If a stable, Debian-based bootc implementation existed, I&#39;d likely jump on it.&lt;/li&gt;&lt;li&gt;Cloud-init is mature, widely adopted, and works across many distributions (including Debian). It could improve my bootstrapping process, but it doesn&#39;t solve the core desire for managing the entire system state declaratively post-install.&lt;/li&gt;&lt;li&gt;CoreOS/OSTree: While powerful, the standard CoreOS workflow with rpm-ostree layering feels slightly less integrated than bootc for building a heavily customized system declaratively. The native container approach is interesting but adds complexity.&lt;/li&gt;&lt;li&gt;NixOS/Guix/Ansible: These are powerful, but represent a significant shift in tooling and philosophy. They feel like a larger commitment than I&#39;m ready for, especially given my preference for sticking closer to traditional distribution paradigms if possible.&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;For the time being, I&#39;ll likely stick with my current Debian + Docker setup. 
On the other hand, I might as well start a VM and try some options for better understanding.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/33534782/7665128434174373192' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/7665128434174373192'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/7665128434174373192'/><link rel='alternate' type='text/html' href='http://blog.wang-lu.com/2025/04/exploring-immutable-distros-and.html' title='Exploring Immutable Distros and Declarative Management'/><author><name>Lu Wang</name><uri>http://www.blogger.com/profile/00576609954224192924</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-33534782.post-8794435045889862586</id><published>2025-04-12T21:57:00.000+02:00</published><updated>2025-04-12T21:57:11.434+02:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="linux"/><category scheme="http://www.blogger.com/atom/ns#" term="qubes os"/><category scheme="http://www.blogger.com/atom/ns#" term="security"/><title type='text'>Qubes OS: First Impressions</title><content type='html'>&lt;p&gt;A few days ago, while browsing security topics online, &lt;a href=&quot;https://www.qubes-os.org/&quot;&gt;Qubes OS&lt;/a&gt; surfaced—whether via YouTube recommendations or search results, I can&#39;t recall precisely. Intrigued by its unique approach to security through compartmentalization, I delved into the documentation and watched some demos. 
My interest was piqued enough that I felt compelled to install it and give it a try firsthand.&lt;/p&gt;&lt;p&gt;My overall first impression of Qubes OS is highly positive. Had I discovered it earlier, I might have reconsidered starting my hardware password manager project.&lt;/p&gt;&lt;p&gt;Conceptually, Qubes OS is not much different from running a bunch of virtual machines simultaneously. However, its brilliance lies in the seamless desktop integration and the well-designed template system, making it far more user-friendly than a manual VM setup. I was particularly impressed by the concept of disposable VMs for temporary tasks and the clear separation of critical functions like networking (sys-net) and USB handling (sys-usb) into their own isolated VMs.&lt;/p&gt;&lt;p&gt;Although the default Qubes OS environment is Fedora-based, it doesn&#39;t feel drastically different from my experiences with Arch or Debian. This reinforces my feeling that the core differences between many Linux distributions are shrinking. It&#39;s also a plus that Qubes officially supports templates for Debian and other systems, offering valuable flexibility. (While exploring Qubes, I noted other compartmentalization-focused OS projects exist, like &lt;a href=&quot;https://spectrum-os.org/&quot;&gt;Spectrum OS&lt;/a&gt; and &lt;a href=&quot;https://www.rancher.com/&quot;&gt;RancherOS&lt;/a&gt;, though I haven&#39;t investigated them yet.)&lt;/p&gt;&lt;p&gt;After using the system for just a few days, I can already see how its compartmentalized approach could be incredibly beneficial for managing different aspects of digital life securely. However, despite my enthusiasm, I&#39;m hesitant to adopt Qubes OS as my daily driver just yet. I encountered a few practical hurdles:&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;ol style=&quot;text-align: left;&quot;&gt;&lt;li&gt;&lt;b&gt;Secure Boot Support:&lt;/b&gt; Qubes OS doesn&#39;t currently support Secure Boot. 
This is inconvenient for my dual-boot setup with Windows, as it requires toggling the setting in the BIOS every time I switch operating systems. According to &lt;a href=&quot;https://www.youtube.com/watch?v=ZcF_RN04oq8&quot;&gt;a recent talk&lt;/a&gt;, support might arrive with version 4.3, which is encouraging.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Anti Evil Maid (AEM):&lt;/b&gt; While AEM is an excellent security concept, its current implementation has significant restrictions that make it difficult to use on many systems. My current workaround is just keeping the boot partition on a separate USB drive.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Backup Workflow:&lt;/b&gt; The built-in backup tool enforces encryption. While I understand the security rationale, this complicates my preferred strategy of frequent (e.g., hourly) incremental backups to a trusted location. Although tutorials exist for bypassing this, native support for optionally disabling encryption (perhaps with strong warnings) would be a welcome convenience.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Bluetooth:&lt;/b&gt; This seems to be a minor point, but Bluetooth isn&#39;t supported out of the box and requires additional setup to get working with peripherals.&lt;/li&gt;&lt;li&gt;&lt;b&gt;GPU Passthrough:&lt;/b&gt; Setting up GPU passthrough for performance-intensive applications (like gaming) appears non-trivial. This limitation means I&#39;ll likely need to keep Windows in a dual-boot setup for gaming purposes.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;In conclusion, Qubes OS is a fascinating and powerful operating system with a unique and compelling security architecture. I&#39;m genuinely impressed with its design and potential. However, the current practical limitations prevent me from making it my primary OS at this time. 
I&#39;ll definitely be keeping a close eye on its development, and may revisit it as these areas mature.&lt;/p&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/33534782/8794435045889862586' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/8794435045889862586'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/8794435045889862586'/><link rel='alternate' type='text/html' href='http://blog.wang-lu.com/2025/04/qubes-os-first-impressions.html' title='Qubes OS: First Impressions'/><author><name>Lu Wang</name><uri>http://www.blogger.com/profile/00576609954224192924</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-33534782.post-4338074821743049017</id><published>2025-03-18T23:49:00.002+01:00</published><updated>2025-03-19T09:31:30.300+01:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="esp32s3"/><category scheme="http://www.blogger.com/atom/ns#" term="hardware"/><category scheme="http://www.blogger.com/atom/ns#" term="security"/><category scheme="http://www.blogger.com/atom/ns#" term="Tinker"/><title type='text'>Cardputer as a Hardware Password Manager</title><content type='html'>&lt;p&gt;For the past month, I&#39;ve been prototyping my own hardware password manager using the &lt;a href=&quot;https://shop.m5stack.com/products/m5stack-cardputer-kit-w-m5stamps3?variant=44078073872641&quot;&gt;Cardputer&lt;/a&gt;, a compact device that&#39;s surprisingly perfect for the task.&amp;nbsp;&lt;/p&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a 
href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhDwqDLeNtulgYZAio_kc12G-U_g8ul7FdFp9TO4y1zkeNwGgQ6GP0OOKyeWIvRuJogw5IePe2-1NpBhoa34TLgDpZpzptVZrVSwmeNAoBljJZeBYVoxDt9qoq62ewHBmjoeRB4yI_eHAMCx-chsJRykDXjTRWGMlTeWBe5RBCq_EKyudBg0Aal7Q/s3468/cardputer.jpg&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;2821&quot; data-original-width=&quot;3468&quot; height=&quot;520&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhDwqDLeNtulgYZAio_kc12G-U_g8ul7FdFp9TO4y1zkeNwGgQ6GP0OOKyeWIvRuJogw5IePe2-1NpBhoa34TLgDpZpzptVZrVSwmeNAoBljJZeBYVoxDt9qoq62ewHBmjoeRB4yI_eHAMCx-chsJRykDXjTRWGMlTeWBe5RBCq_EKyudBg0Aal7Q/w640-h520/cardputer.jpg&quot; width=&quot;640&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;I learned about the Cardputer while searching for the ideal hardware to build a password manager. It&#39;s essentially a microcontroller (the ESP32-S3) packed with a screen and a keyboard. 
What really sold me on it were these features:&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;ul style=&quot;text-align: left;&quot;&gt;&lt;li&gt;&lt;b&gt;Direct Interaction:&lt;/b&gt; I can interact directly with the device – typing my master password, searching for credentials, and confirming password entry, all on the Cardputer itself.&lt;/li&gt;&lt;li&gt;&lt;b&gt;USB Keyboard Emulation:&lt;/b&gt; The device can seamlessly emulate a USB keyboard, allowing for both manual and automatic password entry into other devices.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Hardware-Accelerated Crypto: &lt;/b&gt;The ESP32-S3 boasts hardware acceleration for AES and SHA operations, crucial for secure password management.&lt;/li&gt;&lt;li&gt;&lt;span style=&quot;font-weight: 700;&quot;&gt;Secure Element Integration:&lt;/span&gt;&amp;nbsp;The Cardputer supports an optional ATECC608B secure element (available as &lt;a href=&quot;https://shop.m5stack.com/products/crypto-authentication-unit-atecc608b&quot;&gt;an official accessory&lt;/a&gt;), providing additional cryptographic capabilities, secure key storage, and true 
random number generation.&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;h2 style=&quot;text-align: left;&quot;&gt;Taming the Cardputer&lt;/h2&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The official development framework is &lt;a href=&quot;https://docs.espressif.com/projects/esp-idf/en/latest/esp32s3/get-started/index.html&quot;&gt;ESP-IDF&lt;/a&gt;, but after some research I decided to use &lt;a href=&quot;https://github.com/esp-rs/esp-hal&quot;&gt;esp-hal&lt;/a&gt;. I took it as a nice opportunity to learn Rust and embedded systems.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;My journey with esp-hal quickly turned into a hands-on crash course in embedded systems development. I wasn&#39;t just writing high-level application code; I was crafting drivers from the ground up.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This meant diving into the specifics of the Cardputer&#39;s hardware. I spent most of my time implementing drivers, not quite at the level of manipulating memory addresses directly, but still very much involved in the low-level details. This included tasks like scanning the keyboard&#39;s button matrix and translating physical presses into logical key events, as well as working with communication protocols like I2C and SPI to interact with peripherals.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Simultaneously, I delved into the world of cryptography, deepening my understanding of common algorithms and concepts like AES modes of operation, key derivation functions (KDFs), and secure hashing. This project became a fantastic opportunity to connect the theoretical aspects of cryptography with practical implementation.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This project also sparked an exhilarating, and at times challenging, adventure in learning and applying Rust. I deliberately chose the no_std and no_alloc path, embracing a truly bare-metal approach. 
This decision plunged me into the deep end of embedded Rust, where every byte of memory matters. It was both a blessing and a curse: while it demanded careful resource management, with everything either stack-allocated or statically allocated, it also provided unparalleled control and efficiency. Along the way, I encountered some of Rust&#39;s more… idiosyncratic features, wrestling with async function traits, navigating the intricacies of generic types, and occasionally resorting to seemingly superfluous .map(|x| x) closures to appease the borrow checker and its lifetime requirements.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;h2 style=&quot;text-align: left;&quot;&gt;Securing the Secrets&lt;/h2&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;The prototype is functional. While the source code isn&#39;t ready for release yet, I&#39;d like to share some of the key design decisions.&lt;/p&gt;&lt;p&gt;For data serialization, I chose &lt;a href=&quot;https://flatbuffers.dev/&quot;&gt;FlatBuffers&lt;/a&gt;. Its zero-copy nature is particularly well-suited to the no_alloc environment of this project, minimizing memory overhead. The main database structure is inspired by KeePass. Note that the database is read-only on the device; entries cannot be added, modified, or removed.&lt;/p&gt;&lt;p&gt;The database is encrypted using AES-GCM-SIV. The encryption key is derived via HKDF, with the input keying material (IKM) composed of three distinct components:&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;ul style=&quot;text-align: left;&quot;&gt;&lt;li&gt;&lt;b&gt;User&#39;s Master Password:&lt;/b&gt; The output of PBKDF2 applied to the user&#39;s master password, using a randomly generated salt.&lt;/li&gt;&lt;li&gt;&lt;b&gt;eFuse Key in ESP32S3&lt;/b&gt;: The result of HKDF-Expand using a pre-defined message, keyed by random bytes securely burned into an ESP32-S3 eFuse. 
This ties the encryption to the specific hardware.&lt;/li&gt;&lt;li&gt;&lt;b&gt;ATECC608B Private Key&lt;/b&gt;: The output of HKDF using a pre-defined message, keyed by a shared secret. This secret is established through ECDH key exchange between the ECC key stored securely within the ATECC608B and an ephemeral key pair generated by the encryptor (e.g., a host computer or another device).&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;Furthermore, when either or both of the hardware-based secrets (eFuse and ATECC608B) are enabled, they also contribute to individual password encryption. A separate key is derived from these secrets, using the password&#39;s sequence number within the database as the IKM for HKDF. This approach, inspired by KeePass&#39;s in-memory protection, provides an additional layer of security, though its benefit in this specific embedded context might be less pronounced than in a traditional software environment.&lt;/div&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;Okay, I might have gone a little overboard with the encryption. The eFuse and ATECC608B key derivation steps are probably not strictly necessary. But hey, it was fun to build! 
And even though secure elements aren&#39;t magic security shields (they do have known vulnerabilities), they make it way harder for someone to steal your passwords.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;h2 style=&quot;text-align: left;&quot;&gt;Final Thoughts&lt;/h2&gt;&lt;p&gt;The Cardputer has proven to be a surprisingly capable platform for this secure password manager project, though not without its limitations.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;h3 style=&quot;text-align: left;&quot;&gt;What I liked:&lt;/h3&gt;&lt;ul style=&quot;text-align: left;&quot;&gt;&lt;li&gt;&lt;b&gt;Affordability: &lt;/b&gt;The Cardputer&#39;s low cost is a major advantage, making this project accessible.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Well-Suited Hardware:&lt;/b&gt; The hardware feature set is almost ideal – providing all the necessary functionality (display, keyboard, secure element, connectivity) without unnecessary extras.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Small Attack Surface:&lt;/b&gt; The firmware footprint is relatively small (~300KB), minimizing the potential attack surface.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Fast Boot Time:&lt;/b&gt; The device boots up quickly, making it convenient to use.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Rust Ecosystem: &lt;/b&gt;The Rust ecosystem, with its package manager (cargo) and readily available crates, made development fast, easy, and enjoyable.&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;h3 style=&quot;text-align: left;&quot;&gt;Areas for Improvement:&lt;/h3&gt;&lt;p&gt;&lt;/p&gt;&lt;ul style=&quot;text-align: left;&quot;&gt;&lt;li&gt;&lt;b&gt;Performance Limitations:&lt;/b&gt; The ESP32-S3&#39;s processing speed is a bottleneck, particularly for computationally intensive operations like PBKDF2-SHA512, which takes approximately 5 seconds for 65536 rounds. 
On the other hand, performance is more than sufficient once the device is unlocked.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Documentation Gaps:&lt;/b&gt; The documentation for both the ESP32-S3 and ATECC608B could be improved. I encountered significant challenges setting up flash encryption, secure boot, and the undocumented KDF command in the ATECC608B.&lt;/li&gt;&lt;li&gt;&lt;b&gt;ATECC608B-TNGTLS Limitations:&lt;/b&gt; The Cardputer uses the ATECC608B-TNGTLS variant, which, unfortunately, does not enforce I/O encryption. This means communication between the ESP32-S3 and the secure element is not as secure as it could be.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Known Vulnerabilities:&lt;/b&gt; Both the ESP32-S3 and ATECC608B have known vulnerabilities. While these don&#39;t necessarily render the device insecure, they highlight the importance of a strong master password as the primary defense.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Missing RTIC Support:&lt;/b&gt; The lack of &lt;a href=&quot;https://rtic.rs/2/book/en/&quot;&gt;RTIC&lt;/a&gt;&amp;nbsp;support on the ESP32-S3 is a missed opportunity for learning and exploration.&lt;/li&gt;&lt;li&gt;&lt;b&gt;RustCrypto Hardware Acceleration Limitations:&lt;/b&gt;&amp;nbsp;RustCrypto&#39;s AES implementation can effectively&amp;nbsp;leverage hardware acceleration when available, but that does not seem to be true for the HMAC, HKDF, and PBKDF2 implementations. I believe this is&amp;nbsp;related to their dependency on the Clone trait.&lt;/li&gt;&lt;li&gt;&lt;b&gt;ECC Algorithm Support:&lt;/b&gt; The ATECC608B focuses on ECDSA curves and algorithms. 
Support for Ed25519 would be nice.&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/33534782/4338074821743049017' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/4338074821743049017'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/4338074821743049017'/><link rel='alternate' type='text/html' href='http://blog.wang-lu.com/2025/03/cardputer-as-hardware-password-manager.html' title='Cardputer as a Hardware Password Manager'/><author><name>Lu Wang</name><uri>http://www.blogger.com/profile/00576609954224192924</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhDwqDLeNtulgYZAio_kc12G-U_g8ul7FdFp9TO4y1zkeNwGgQ6GP0OOKyeWIvRuJogw5IePe2-1NpBhoa34TLgDpZpzptVZrVSwmeNAoBljJZeBYVoxDt9qoq62ewHBmjoeRB4yI_eHAMCx-chsJRykDXjTRWGMlTeWBe5RBCq_EKyudBg0Aal7Q/s72-w640-h520-c/cardputer.jpg" height="72" width="72"/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-33534782.post-2927719967447807409</id><published>2025-03-07T21:39:00.001+01:00</published><updated>2025-03-08T13:13:14.154+01:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="esp32"/><category scheme="http://www.blogger.com/atom/ns#" term="esp32s3"/><category scheme="http://www.blogger.com/atom/ns#" term="programming"/><title type='text'>ESP32S3: Flash Encryption and Secure Boot </title><content type='html'>&lt;p&gt;Flash encryption and secure boot are useful security features for the ESP32S3 chip. 
While not perfect, they definitely make it harder to extract the secrets in the chip.&lt;/p&gt;&lt;p&gt;However, it is tricky to enable both features at the same time. The topic is covered in the official documentation:&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;ul style=&quot;text-align: left;&quot;&gt;&lt;li&gt;&lt;a href=&quot;https://docs.espressif.com/projects/esp-idf/en/v5.4/esp32s3/security/security.html&quot;&gt;ESP32S3 Security Features&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://docs.espressif.com/projects/esp-idf/en/v5.4/esp32s3/security/security-features-enablement-workflows.html&quot;&gt;Security Features Enablement Workflows&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;In particular, the second one recommends enabling flash encryption before secure boot. Even so, I found the documentation confusing.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In the end I was able to enable both. Here are my findings.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;h2 style=&quot;text-align: left;&quot;&gt;My Understanding&lt;/h2&gt;&lt;div&gt;After my adventure, here&#39;s what I &lt;i&gt;think &lt;/i&gt;could have worked. 
WARNING, this is untested.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;ul style=&quot;text-align: left;&quot;&gt;&lt;li&gt;Follow&amp;nbsp;&lt;a href=&quot;https://docs.espressif.com/projects/esp-idf/en/v5.4/esp32s3/security/security-features-enablement-workflows.html&quot;&gt;Security Features Enablement Workflows&lt;/a&gt;:&lt;/li&gt;&lt;ul&gt;&lt;li&gt;Burn all the keys, as well as their purpose eFuses and read/write protections.&lt;/li&gt;&lt;li&gt;Burn the other security eFuses, but DO NOT burn&amp;nbsp;ENABLE_SECURITY_DOWNLOAD partway through; it is mentioned at the end of the instructions for both flash encryption and secure boot.&lt;/li&gt;&lt;li&gt;Burn&amp;nbsp;SPI_BOOT_CRYPT_CNT (all 3 bits) and&amp;nbsp;SECURE_BOOT_EN.&lt;/li&gt;&lt;/ul&gt;&lt;li&gt;Sign the bootloader and the application.&lt;/li&gt;&lt;li&gt;Encrypt the signed bootloader, the partition table and the signed application.&lt;/li&gt;&lt;li&gt;Flash the three encrypted pieces at the correct offsets.&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;h2 style=&quot;text-align: left;&quot;&gt;My Actual Adventure&lt;/h2&gt;&lt;div&gt;I had thought the official bootloader was necessary to enable the features, so I actually downloaded ESP-IDF and flashed the bootloader as the last step of enabling flash encryption. Later I found that ENABLE_SECURITY_DOWNLOAD had been automatically burned by the bootloader, because I used the recommended &quot;secure download mode&quot; option in `idf.py menuconfig`.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This was not the end of the world; it just meant I could no longer use `espefuse.py`. I could still use the C API to burn the eFuses. The API is quite user-friendly, and the high-level functions (e.g. burning a key, its purpose and its read/write protection at the same time) do check for common errors. 
I just had to read the documentation to understand the differences between blocks, bits and registers.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;One more thing about the bootloader: a recent version (5.3-ish) of ESP-IDF changed something in the application image header, making the new format incompatible with esp-hal versions predating &lt;a href=&quot;https://github.com/esp-rs/esp-hal/pull/3124&quot;&gt;this change&lt;/a&gt;. This took me quite a while to figure out.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;p&gt;&lt;/p&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/33534782/2927719967447807409' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/2927719967447807409'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/33534782/posts/default/2927719967447807409'/><link rel='alternate' type='text/html' href='http://blog.wang-lu.com/2025/03/esp32s3-flash-encryption-and-secure-boot.html' title='ESP32S3: Flash Encryption and Secure Boot '/><author><name>Lu Wang</name><uri>http://www.blogger.com/profile/00576609954224192924</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry></feed>