<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
 
 <title>Chris Hulbert is Splinter Software</title>
 <link href="http://www.splinter.com.au/atom.xml" rel="self"/>
 <link href="http://www.splinter.com.au/"/>
 <updated>2025-12-20T18:23:24+11:00</updated>
 <id>http://www.splinter.com.au</id>
 <author>
   <name>Chris Hulbert</name>
 </author>
 
 
 <entry>
   <title>Commander Keen 4-6 file formats</title>
   <link href="http://www.splinter.com.au/2025/12/20/commander-keen-4-6-file-formats/"/>
   <updated>2025-12-20T00:00:00+11:00</updated>
   <id>http://www.splinter.com.au/2025/12/20/commander-keen-4-6-file-formats/commander-keen-4-6-file-formats</id>
   <content type="html">&lt;p&gt;&lt;img src=&quot;/images/2025/keenformats.png&quot; alt=&quot;Keen 4 Map&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Recently I made &lt;a href=&quot;https://github.com/chrishulbert/dopefish-decoder&quot;&gt;Dopefish Decoder&lt;/a&gt;, a Rust tool for dumping the graphics from a very old-school id Software game: Commander Keen 4-6. It was a bit of work (fun work, though) combining information from various sources to figure out how to read it all, so here are the formats in rough EBNF! Further explanations for the more complex elements follow afterwards.&lt;/p&gt;

&lt;h2 id=&quot;files&quot;&gt;Files&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Graphics:
    &lt;ul&gt;
      &lt;li&gt;graph_head&lt;/li&gt;
      &lt;li&gt;graph_dict&lt;/li&gt;
      &lt;li&gt;egagraph&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Maps:
    &lt;ul&gt;
      &lt;li&gt;map_head&lt;/li&gt;
      &lt;li&gt;gamemaps&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;non-file-files&quot;&gt;Non-file files&lt;/h2&gt;

&lt;p&gt;The map_head/graph_head/graph_dict “files” are actually present inside the game executable. Having said that, in many mods they are their own separate files. To get them, the executable first needs to be &lt;a href=&quot;https://github.com/chrishulbert/dopefish-decoder/tree/main?tab=readme-ov-file#decompressing-exe-files&quot;&gt;decompressed&lt;/a&gt;, then &lt;a href=&quot;https://github.com/chrishulbert/dopefish-decoder/blob/main/src/versions.rs&quot;&gt;these offsets&lt;/a&gt; are used to extract them.&lt;/p&gt;

&lt;h2 id=&quot;ebnf&quot;&gt;EBNF&lt;/h2&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;// All multi-byte ints are little-endian.

graph_head = { graph offset }, graph length
graph length = 3 byte int // Matches length of egagraph file.
graph offset = 3 byte int 

graph_dict = { huffman node }
huffman node = node side, node side // Left, right.
node side = node value, node type
node value = byte
node type = leaf | node // Byte: 0 = leaf, else = node.

egagraph = { chunk }
egagraph = unmasked picture table chunk with header,
    masked picture table chunk with header,
    sprite table chunk with header,
    font a chunk with header,
    font b chunk with header,
    font c chunk with header,
    { unmasked picture chunk with header }, // Count from unmasked picture table.
    { masked picture chunk with header }, // Count from masked picture table.
    { sprite chunk with header }, // Count from sprite table.
    unmasked 8x8 tiles chunk without header, // One chunk for all tiles.
    masked 8x8 tiles chunk without header, // One chunk for all tiles.
    { unmasked 16x16 tile chunk without header },
    { masked 16x16 tile chunk without header },
    { text etc }
chunk without header = huffman encoded chunk
chunk with header = chunk decompressed length, huffman encoded chunk
chunk decompressed length = 4 byte int

picture table = { picture table entry }
picture table entry = width_pixels_divided_by_8, height_pixels
width_pixels_divided_by_8 = 2 byte int
height_pixels = 2 byte int

sprite table = { sprite table entry } // 18 bytes each.
sprite table entry = width_div_by_8, // All are 2 byte ints.
    height,
    x offset,
    y offset,
    clip left,
    clip top,
    clip right,
    clip bottom,
    shifts

image = picture | tile | sprite
unmasked image = red plane, green plane, blue plane, intensity plane
masked image = red plane, green plane, blue plane, intensity plane, mask plane

map_head = rlew key, { map header offset }
rlew key = 2 bytes
map header offset = 4 byte int // 0 means no map in this slot.

gamemaps = &quot;TED5v1.0&quot;, { map }
map = map planes, map header
map header = background plane offset, // 38 bytes.
    foreground plane offset,
    sprite plane offset,
    background plane length,
    foreground plane length,
    sprite plane length,
    tile count width,
    tile count height,
    map name
plane offset = 4 byte int
plane length = 2 byte int
tile count = 2 byte int
map name = 16 bytes asciiz
map planes = background carmackized plane,
    foreground carmackized plane,
    sprite carmackized plane
carmackized plane = carmackized decompressed length, carmackized data
carmackized decompressed length = 2 byte int
carmackized data = carmack compressed(rlew plane)
rlew plane = rlew decompressed length, rlew data
rlew decompressed length = 2 byte int
rlew data = rlew compressed(decompressed plane)
decompressed plane = { map plane row }
map plane row = { map plane element }
map plane element = 2 byte int
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
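
&lt;p&gt;The 3 byte ints in graph_head are unusual enough to deserve an example. Here’s a hedged Rust sketch (my own helper, not necessarily how Dopefish Decoder does it) of reading them; note the final entry parsed is the graph length:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;// Sketch: parse graph_head as a sequence of 3 byte little-endian ints.
// The last one is the graph length; the rest are chunk offsets.
fn read_3_byte_ints(bytes: &amp;amp;[u8]) -&amp;gt; Vec&amp;lt;u32&amp;gt; {
    bytes
        .chunks_exact(3)
        .map(|c| u32::from(c[0]) | (u32::from(c[1]) &amp;lt;&amp;lt; 8) | (u32::from(c[2]) &amp;lt;&amp;lt; 16))
        .collect()
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;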

&lt;h2 id=&quot;image-planes&quot;&gt;Image planes&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Images are stored in EGA planes.&lt;/li&gt;
  &lt;li&gt;Data is one whole-image plane, then the next plane, and so on.&lt;/li&gt;
  &lt;li&gt;Thus each pixel is represented 4-5 times across the data, once per plane.&lt;/li&gt;
  &lt;li&gt;Masked image planes: RGBIM.&lt;/li&gt;
  &lt;li&gt;Red, Green, Blue, Intensity, Mask.&lt;/li&gt;
  &lt;li&gt;Unmasked image planes: RGBI.&lt;/li&gt;
  &lt;li&gt;When the mask bit = 1, it is a transparent pixel.&lt;/li&gt;
  &lt;li&gt;Each pixel in a plane is represented by 1 bit.&lt;/li&gt;
  &lt;li&gt;Inside each byte, pixels run left to right from the most significant bit, so 0x80 is the leftmost pixel.&lt;/li&gt;
  &lt;li&gt;All widths are multiples of 8 so you don’t have to worry about rows starting mid-byte.&lt;/li&gt;
&lt;/ul&gt;
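
&lt;p&gt;To make the planar layout concrete, here’s a hedged Rust sketch (a hypothetical helper, not taken from the tool) that combines the four unmasked plane bits for one pixel into the standard 4-bit IRGB EGA palette index:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;// Given the byte from each unmasked plane (R, G, B, I) covering the
// same 8 pixels, return the EGA palette index (0-15) for pixel x (0-7).
// 0x80 is the leftmost pixel, matching the bit order described above.
fn ega_index(r: u8, g: u8, b: u8, i: u8, x: u8) -&amp;gt; u8 {
    let bit = |byte: u8| (byte &amp;gt;&amp;gt; (7 - x)) &amp;amp; 1;
    bit(i) &amp;lt;&amp;lt; 3 | bit(r) &amp;lt;&amp;lt; 2 | bit(g) &amp;lt;&amp;lt; 1 | bit(b)
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;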

&lt;h2 id=&quot;map-elements&quot;&gt;Map elements&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Map plane elements are ints representing which tile is displayed at that position.&lt;/li&gt;
  &lt;li&gt;Background plane corresponds to the unmasked tiles.&lt;/li&gt;
  &lt;li&gt;Foreground plane corresponds to the masked tiles.&lt;/li&gt;
  &lt;li&gt;Foreground plane elements are optional: 0 means no element at that position, so the first masked tile is represented by the value 1. In other words, subtract 1 from a non-zero value to get the tile index.&lt;/li&gt;
  &lt;li&gt;The background plane always has an element, so the above -1 adjustment does not apply.&lt;/li&gt;
&lt;/ul&gt;
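
&lt;p&gt;That off-by-one rule is easy to get wrong, so here’s a hedged sketch of it (a hypothetical helper of my own) in Rust:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;// Foreground (masked) plane: 0 = no tile, otherwise value - 1 indexes
// the masked 16x16 tiles. Background elements index tiles directly.
fn foreground_tile_index(element: u16) -&amp;gt; Option&amp;lt;usize&amp;gt; {
    match element {
        0 =&amp;gt; None,
        n =&amp;gt; Some(usize::from(n) - 1),
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;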

&lt;h2 id=&quot;rlew--huffman--carmackization&quot;&gt;RLEW / Huffman / Carmackization&lt;/h2&gt;

&lt;p&gt;These compression techniques are big topics, far too complex for EBNF, and out of scope for an article like this.&lt;/p&gt;

&lt;p&gt;They are probably best described in code, which also has links to further reading. Hopefully the following code is readable enough to communicate the how-to:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/chrishulbert/dopefish-decoder/blob/main/src/rlew.rs&quot;&gt;rlew.rs&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/chrishulbert/dopefish-decoder/blob/main/src/huffman.rs&quot;&gt;huffman.rs&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/chrishulbert/dopefish-decoder/blob/main/src/carmackization.rs&quot;&gt;carmackization.rs&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
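
&lt;p&gt;RLEW is the simplest of the three, though, so here’s a taste: a hedged Rust sketch of the expansion side (my own code, simpler than the linked rlew.rs), assuming the usual id convention that the tag word is followed by a count word and a value word:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;// Expand RLEW data: a stream of 16-bit words, where the tag word
// (the rlew key from map_head) means &quot;repeat value, count times&quot;.
fn rlew_expand(words: &amp;amp;[u16], tag: u16) -&amp;gt; Vec&amp;lt;u16&amp;gt; {
    let mut out = Vec::new();
    let mut iter = words.iter().copied();
    while let Some(w) = iter.next() {
        if w == tag {
            // The tag word is followed by a count word and a value word.
            let count = iter.next().unwrap_or(0);
            let value = iter.next().unwrap_or(0);
            for _ in 0..count {
                out.push(value);
            }
        } else {
            out.push(w);
        }
    }
    out
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;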

&lt;h2 id=&quot;summary&quot;&gt;Summary&lt;/h2&gt;

&lt;p&gt;I know this is the most random topic imaginable. Still, thanks for reading, I pinky promise this was written by a human, not AI, hope you found this fascinating if not useful, at least a tiny bit, God bless!&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Cloudflare Rust Analysis</title>
   <link href="http://www.splinter.com.au/2025/12/05/cloudflare-rust-analysis/"/>
   <updated>2025-12-05T00:00:00+11:00</updated>
   <id>http://www.splinter.com.au/2025/12/05/cloudflare-rust-analysis/cloudflare-rust-analysis</id>
   <content type="html">&lt;p&gt;&lt;img src=&quot;/images/2025/cloud.png&quot; alt=&quot;Angry Cloud from Commander Keen 4&quot; /&gt;&lt;/p&gt;

&lt;p&gt;A few weeks ago, there was a huge Cloudflare outage that knocked out half the internet for a while. As someone who has written a fair bit of Rust in my spare time (23KLOC according to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cloc&lt;/code&gt; over the last few years), I couldn’t resist the urge to add some constructive thoughts to the discussion around the Rust code that was identified for the outage.&lt;/p&gt;

&lt;p&gt;And I’m not going full Rust-Evangelism-Strike-Force here, as my pro-Swift conclusion will attest. Basically I’d just like to take this outage as an opportunity to recommend a couple of tricks for writing safer Rust code.&lt;/p&gt;

&lt;h2 id=&quot;the-culprit&quot;&gt;The culprit&lt;/h2&gt;

&lt;p&gt;So, here’s the culprit according to &lt;a href=&quot;https://blog.cloudflare.com/18-november-2025-outage/#memory-preallocation&quot;&gt;Cloudflare’s postmortem&lt;/a&gt;:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;pub fn fetch_features(
        &amp;amp;mut self,
        input: &amp;amp;dyn BotsInput,
        features: &amp;amp;mut Features,
) -&amp;gt; Result&amp;lt;(), (ErrorFlags, i32)&amp;gt; {
    features.checksum &amp;amp;= 0xffff_ffff_0000_0000;
    features.checksum |= u64::from(self.config.checksum);
    let (feature_values, _) = features
        .append_with_names(&amp;amp;self.config.feature_names)
        .unwrap();
    ...
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Apparently it processes new configuration, and crashed at the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;unwrap&lt;/code&gt; because configuration with too many features was passed in.&lt;/p&gt;

&lt;h2 id=&quot;code-review&quot;&gt;Code Review&lt;/h2&gt;

&lt;p&gt;Keep in mind that I’m not seeing the greater context of this function, so the following may be affected by that, but here are my thoughts re the above code:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;It returns a Result, with nothing for the success case, and a combo of ErrorFlags and an i32 for the failure case.&lt;/li&gt;
  &lt;li&gt;The presence of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&amp;amp;dyn&lt;/code&gt; for input indicates this uses dynamic dispatch, which means this isn’t intended as high-performance code. Which makes sense if this is just for loading configuration. Given that, they could have simply used &lt;a href=&quot;https://docs.rs/anyhow/latest/anyhow/&quot;&gt;anyhow&lt;/a&gt;’s all-purpose Result to make their lives simpler instead of this complex tuple for the error generic.&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;unwrap()&lt;/code&gt; is called. This is the big red flag, and something that should generally only be done in code that you are happy to have panic, e.g. command line utilities, but less so for services. Swift’s equivalent is the force-unwrap operator &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;!&lt;/code&gt;. When Swift was new, it was explained that the ! was chosen because it signifies danger, and stands out like a sore thumb in code reviews to encourage thorough examination. Rust’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;unwrap&lt;/code&gt; isn’t as obvious at review time, and thus can sneak through unnoticed.&lt;/li&gt;
  &lt;li&gt;Since we’re already in a function that returns Result, it would be more idiomatic to use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;?&lt;/code&gt; after the call to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;append_with_names&lt;/code&gt;, so that this function would hot-potato the error to the caller, instead of panicking.&lt;/li&gt;
  &lt;li&gt;If &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;append_with_names&lt;/code&gt; returns an Option not a Result, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ok_or(..)?&lt;/code&gt; would be a tidy option.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;alternative&quot;&gt;Alternative&lt;/h2&gt;

&lt;p&gt;Here I’ve changed the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;fetch_features&lt;/code&gt; function to be safer, with a couple options for how to gracefully handle this if &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;append_with_names&lt;/code&gt; returns either a Result or an Option (it isn’t clear which it is from Cloudflare’s snippet, so I’ve done both). Note that I’ve also added some boilerplate around all this to keep the fetch_features code as similar as possible, but also commented out some stuff that’s less relevant.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;fn main() {
    let mut fetcher = Fetcher::new();
    let mut features = Features::new();
    if let Err(e) = fetcher.fetch_features(&amp;amp;mut features) {
        // ... Gracefully handle the error here without panicking ...
        eprintln!(&quot;Error gracefully handled: {:#?}&quot;, e);
        return
    }
}

enum FeatureName {
    Foo,
    Bar,
}

struct Fetcher {
    feature_names: Vec&amp;lt;FeatureName&amp;gt;,
}

impl Fetcher {
    fn new() -&amp;gt; Self {
        Fetcher { feature_names: vec![] }
    }
    
    // This is the function Cloudflare said caused the outage:
    fn fetch_features(
        &amp;amp;mut self,
        // input: &amp;amp;dyn BotsInput,
        features: &amp;amp;mut Features,
    ) -&amp;gt; Result&amp;lt;(), (ErrorFlags, i32)&amp;gt; {
        // features.checksum &amp;amp;= 0xffff_ffff_0000_0000;
        // features.checksum |= u64::from(self.config.checksum);
        
        // If append_with_names returns a Result,
        // the question mark operator is safer than unwrap:
        let (feature_values, _) = features
            .append_with_names_result(&amp;amp;self.feature_names)?;
        
        // If append_with_names returns Option,
        // ok_or converts to a result, which forces you to be
        // explicit about what error is relevant,
        // which is then safely unwrapped using the question mark operator.
        let (feature_values, _) = features
            .append_with_names_option(&amp;amp;self.feature_names)
            .ok_or((ErrorFlags::AppendWithNamesFailed, -1))?;
        
        Ok(())
    }
}

#[derive(Debug)]
enum ErrorFlags {
    AppendWithNamesFailed,
    TooManyFeatures,
}

struct Features {
}

impl Features {
    fn new() -&amp;gt; Self {
        Features {}
    }
    
    // This is for if it returns a Result:
    fn append_with_names_result(
        &amp;amp;mut self,
        names: &amp;amp;[FeatureName],
    ) -&amp;gt; Result&amp;lt;(i32, i32), (ErrorFlags, i32)&amp;gt; {
        if names.len() &amp;gt; 200 { // Config is too big!
            Err((ErrorFlags::TooManyFeatures, -1))
        } else {
            Ok((42, 42))
        }
    }

    // This is for if it returns an Option:
    fn append_with_names_option(
        &amp;amp;mut self,
        names: &amp;amp;[FeatureName],
    ) -&amp;gt; Option&amp;lt;(i32, i32)&amp;gt; {
        if names.len() &amp;gt; 200 { // Config is too big!
            None
        } else {
            Some((42, 42))
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Feel free to paste this into the &lt;a href=&quot;https://play.rust-lang.org&quot;&gt;Rust Playground&lt;/a&gt; and see if you have better suggestions :)&lt;/p&gt;

&lt;h2 id=&quot;suggestions&quot;&gt;Suggestions&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Instead of unwrap, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;?&lt;/code&gt; operator is a great option, particularly if you are already in a function that returns a Result, so please take advantage of such a situation.&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ok_or&lt;/code&gt; is a great way to safely unwrap Options inside a Result function. It forces you to think about ‘what error should I return if there’s no value here?’.&lt;/li&gt;
  &lt;li&gt;Consider Swift! The exclamation point operator is a great way of drawing attention to danger in a code review, which is a fantastic piece of language ergonomics.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;summary&quot;&gt;Summary&lt;/h2&gt;

&lt;p&gt;If anyone from Cloudflare is reading this, I hope this critique does not come across as unkind, much of my code is not amazingly bulletproof either! And kudos to Cloudflare for allowing us to see some of their code in the postmortem :)&lt;/p&gt;

&lt;p&gt;Thanks for reading, I pinky promise this was written by a human, not AI, hope you found this useful, at least a tiny bit, God bless!&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Rust Compilation: Sequoia vs Tahoe</title>
   <link href="http://www.splinter.com.au/2025/12/04/sequoia-vs-tahoe/"/>
   <updated>2025-12-04T00:00:00+11:00</updated>
   <id>http://www.splinter.com.au/2025/12/04/sequoia-vs-tahoe/sequoia-vs-tahoe</id>
   <content type="html">&lt;p&gt;&lt;img src=&quot;/images/2025/isleoffire.png&quot; alt=&quot;Sequoia vs Tahoe&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Are you curious to know if upgrading from macOS Sequoia to Tahoe will affect compilation speeds? Everyone seems to be piling onto the anti-Tahoe bandwagon, so I thought I’d add some anecdata to the anecdotes going around.&lt;/p&gt;

&lt;p&gt;Note that I have two identical laptops, the only difference is that one has Tahoe:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Mac                   macOS         Time (lower is better)
---                   -----         -----
2025 M2 Air 16GB RAM  Sequoia 15.6  361.54s
2025 M2 Air 16GB RAM  Tahoe 26.1    360.88s
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;My core point is: Tahoe isn’t slower in my (admittedly simplistic) Rust compilation benchmark. It’s &lt;a href=&quot;https://www.youtube.com/watch?v=hou0lU8WMgo&quot;&gt;technically&lt;/a&gt; 0.2% faster, but that’s statistically insignificant.&lt;/p&gt;

&lt;p&gt;I’ve also thrown a few other Macs I had lying around into the mix, to add some colour to the conversation:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;2022 M1 Studio Ultra   Sequoia 15.6.1  512.63s
2025 M4 Air, 16GB RAM  Sequoia 15.6    378.13s
2022 M2 Air, 8GB RAM   Sequoia         343.97s
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Note that all Macs are ‘base models’ of their generation.&lt;/p&gt;

&lt;h2 id=&quot;benchmark-details&quot;&gt;Benchmark details&lt;/h2&gt;

&lt;p&gt;So, this benchmark is, as mentioned above, admittedly simple. I recently wrote a &lt;a href=&quot;https://www.reddit.com/r/rustjerk/comments/av5pog/higherres_rust_evangelism_strike_force_image/&quot;&gt;Rust&lt;/a&gt; tool to &lt;a href=&quot;https://github.com/chrishulbert/dopefish-decoder&quot;&gt;extract the sprites and maps from the Commander Keen episodes&lt;/a&gt;, and this benchmark times how long it takes to compile its 16 source files from scratch 400 times. Despite its simplicity, the two identical-hardware Macs scored within 0.2% of each other, so it is at least consistent.&lt;/p&gt;

&lt;p&gt;If you’d like to repeat it:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Fresh install of macOS if possible&lt;/li&gt;
  &lt;li&gt;Install default Rust via rustup.rs&lt;/li&gt;
  &lt;li&gt;My Macs were running rustc 1.91.1&lt;/li&gt;
  &lt;li&gt;Install homebrew via brew.sh&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;git clone https://github.com/chrishulbert/dopefish-decoder.git&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;Do your best to ensure other things aren’t running in the background&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;make bench&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;conspiracy-theory&quot;&gt;Conspiracy theory!&lt;/h2&gt;

&lt;p&gt;It’s surprising that the M4 doesn’t trounce the M2s! I wonder if Apple is actually putting M4 chips into the 2025 batch of “M2” laptops that have been updated to have 16GB RAM. Given the RAM is integrated with the CPU, maybe it was just simpler for them to put M4 chips in, rather than dust off the M2 designs, add more RAM, and restart the production line? And maybe they just didn’t bother to throttle them in some way. Maybe?&lt;/p&gt;

&lt;p&gt;Alternatively… perhaps this was just a poor benchmark? After all, my older M2 somehow came out fastest. But the performance consistency between the two identical laptops is remarkably tight, indicating at least some level of accuracy. My M4 also has a corporate security rootkit installed, which may slow things down. Lots to think about.&lt;/p&gt;

&lt;h2 id=&quot;ultra&quot;&gt;Ultra&lt;/h2&gt;

&lt;p&gt;It’s unfortunate to see the M1 Ultra taking a lot longer than the others. I guess the M1 is showing its age! I can see why Apple’s rumoured to have given up on the Mac Pro: by the time the Ultra team has managed to release an Mn Ultra, the Mn+1 Max is out and faster. If I were to make any recommendations here, I’d say forget previous-gen Ultras and instead buy the latest-gen Studio Max. Perhaps Ultra will become more relevant once the yearly pace of improvement in M processors slows down.&lt;/p&gt;

&lt;h2 id=&quot;summary&quot;&gt;Summary&lt;/h2&gt;

&lt;p&gt;So there you have it: Benchmarking is hard. Kudos to &lt;a href=&quot;https://www.youtube.com/@GamersNexus&quot;&gt;those who arguably do it well.&lt;/a&gt; If nothing else though, I wouldn’t be too worried about Tahoe slowing things down, it’s a perfectly &lt;a href=&quot;https://simpsons.fandom.com/wiki/Elizabeth_Hoover&quot;&gt;cromulent &lt;del&gt;word&lt;/del&gt;&lt;/a&gt; operating system. Thanks for reading, I pinky promise this was written by a human, not AI, hope you found this fascinating, at least a tiny bit, God bless!&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Better React Native devex through Expo Go</title>
   <link href="http://www.splinter.com.au/2025/09/05/react-native-expo-go-devex/"/>
   <updated>2025-09-05T00:00:00+10:00</updated>
   <id>http://www.splinter.com.au/2025/09/05/react-native-expo-go-devex/react-native-expo-go-devex</id>
   <content type="html">&lt;p&gt;&lt;img src=&quot;/images/2025/devex.jpg&quot; alt=&quot;Devex&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Having worked with React Native projects on and off for years now, I’ve come to appreciate that there are significant productivity and developer experience (devex) gains on the table that tend to be derailed the moment a native library is added to the mix. But what if you could keep that productivity flowing?&lt;/p&gt;

&lt;p&gt;Most people (somewhat rightly) think of Expo Go as the training wheels that nobody uses for serious React Native development. But you’re probably like me: the vast majority of daily work is simple Create-Read-Update-Delete (CRUD!) data manipulation. And what if, for that daily work, we didn’t need to fight with getting Xcode or Android Studio to compile, code sign, deal with cocoapods, ruby, gradle, and so on? What if most of your team didn’t even need to install Xcode/Studio at all? I believe this strategy can be beneficial for keeping you and your team productive, isolating all the pain of the native integration to the CI builds.&lt;/p&gt;

&lt;p&gt;So, how to get to this point? Some thoughts:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;When considering libraries, ask yourself ‘is this pure JS or native?’. For instance, when evaluating options for a feature flag library, you could use FooFlags (not a real product) or LaunchDarkly. FooFlags has a React Native library that wraps native code, whereas LaunchDarkly’s is pure JS. You should use the pure-JS one, because that gets you one step closer to being able to do your daily work in Expo Go.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Sometimes, companies release newer versions of their libraries that are pure JS. LaunchDarkly did this in the last year or two: their older library was native + JS shim, but their newer one is pure JS. In cases like these, you can upgrade to the latest pure JS one to make your life easier.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;If you have an unavoidably native component, you can wrap it in a pure-JS component that shows a placeholder. If this is a part of the app that you don’t need to work on very often, this can be a great way of having your cake and eating it too: Have native components for part of the app, yet still be able to spend most of your productive workday zipping along with Expo Go.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;If you have native modules, you can shim them to perform no-ops (or whatever is reasonable) when in Expo Go. I’ll demonstrate some strategies for achieving these last two points next:&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;your-native-expo-modules&quot;&gt;Your native expo modules&lt;/h2&gt;

&lt;p&gt;If you’ve made your own native module, you can ‘shim’ it out in such a way that it does nothing when run in the Expo Go environment. To do so, as an example, I modify the generated &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;modules/my-foo-module/src/MyFooModule.ts&lt;/code&gt; file as follows:&lt;/p&gt;

&lt;div class=&quot;language-tsx highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;NativeModule&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;requireNativeModule&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;expo&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;Constants&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;ExecutionEnvironment&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;dl&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;expo-constants&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;EventSubscription&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;expo-modules-core&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;MyFooModuleEvents&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;./MyFooModule.types&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

&lt;span class=&quot;kr&quot;&gt;declare&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;MyFooModule&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;extends&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;NativeModule&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;MyFooModuleEvents&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;PI&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;kr&quot;&gt;number&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;nf&quot;&gt;getValueSync&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt; &lt;span class=&quot;kr&quot;&gt;string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;nf&quot;&gt;setValueAsync&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;kr&quot;&gt;string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;Promise&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;void&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;nf&quot;&gt;doSomething&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;void&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;kd&quot;&gt;function&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;requireOrMock&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;MyFooModule&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;if &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;Constants&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;executionEnvironment&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;===&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;ExecutionEnvironment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;StoreClient&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;// Expo Go:&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
      &lt;span class=&quot;c1&quot;&gt;// My stuff, mocked:&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;PI&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;3.141&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;getValueSync&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;function &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt; &lt;span class=&quot;kr&quot;&gt;string&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;dl&quot;&gt;&apos;&apos;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;setValueAsync&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;async&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;function &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;kr&quot;&gt;string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;Promise&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;void&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{},&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;doSomething&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;function &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{},&lt;/span&gt;

      &lt;span class=&quot;c1&quot;&gt;// Generic expo module stuff:&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;addListener&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;function&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;EventName&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;extends&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;keyof&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;MyFooModuleEvents&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;&amp;gt;&lt;/span&gt;(
        eventName: EventName,
        listener: MyFooModuleEvents[EventName]): EventSubscription &lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;
          &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;remove&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{}&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
        &lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;,
      removeListener: function &lt;span class=&quot;p&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;EventName&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;extends&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;keyof&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;MyFooModuleEvents&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;&amp;gt;&lt;/span&gt;(
        eventName: EventName,
        listener: MyFooModuleEvents[EventName]): void &lt;span class=&quot;si&quot;&gt;{}&lt;/span&gt;,
      removeAllListeners: function (
        eventName: keyof MyFooModuleEvents): void &lt;span class=&quot;si&quot;&gt;{}&lt;/span&gt;,
      emit: function &lt;span class=&quot;p&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;EventName&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;extends&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;keyof&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;MyFooModuleEvents&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;&amp;gt;&lt;/span&gt;(
        eventName: EventName,
        ...args: Parameters&lt;span class=&quot;p&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;MyFooModuleEvents&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;EventName&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;&amp;gt;&lt;/span&gt;): void &lt;span class=&quot;si&quot;&gt;{}&lt;/span&gt;,
      listenerCount: function &lt;span class=&quot;p&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;EventName&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;extends&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;keyof&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;MyFooModuleEvents&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;&amp;gt;&lt;/span&gt;(
        eventName: EventName): number &lt;span class=&quot;si&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;
    } 
  } else &lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;requireNativeModule&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;MyFooModule&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;MyFooModule&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  &lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;
}
export default requireOrMock();
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;imported-library-components&quot;&gt;Imported library components&lt;/h2&gt;

&lt;p&gt;In our case, we use a native library for VOIP calling. We only have one component that uses this library, so I’ve added a ‘wrapper’ component that replaces our component with a placeholder when we’re using Expo Go. The wrapper works as follows:&lt;/p&gt;

&lt;div class=&quot;language-tsx highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;Constants&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;ExecutionEnvironment&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;dl&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;expo-constants&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;Text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;View&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;dl&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;react-native&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;MyComponentProps&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;dl&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;./MyComponent&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;// This wraps a MyComponent in such a way it is not instantiated for Expo Go.&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;export&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;default&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;function&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;MyComponentWrapper&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;props&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;MyComponentProps&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;if &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;Constants&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;executionEnvironment&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;===&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;ExecutionEnvironment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;StoreClient&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;// Expo Go:&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
      &lt;span class=&quot;p&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;View&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;style&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;flex&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;justifyContent&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;center&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;alignItems&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;center&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;&amp;gt;&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;Text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;&amp;gt;&lt;/span&gt;This is disabled while using Expo Go&lt;span class=&quot;p&quot;&gt;&amp;lt;/&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;Text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;&amp;gt;&lt;/span&gt;
      &lt;span class=&quot;p&quot;&gt;&amp;lt;/&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;View&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;&amp;gt;&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;// Production:&lt;/span&gt;
    &lt;span class=&quot;kd&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;default&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;MyComponent&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;require&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;./MyComponent&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;// Lazy import.&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;MyComponent&lt;/span&gt; &lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;props&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;MyComponent&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;&amp;gt;&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This wrapper has the same props as the actual component, so everywhere the component was previously used, you can simply use the wrapper instead.&lt;/p&gt;

&lt;p&gt;For this to work, you have to edit MyComponent.tsx and export its props like this:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;export interface MyComponentProps { ...
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;summary&quot;&gt;Summary&lt;/h2&gt;

&lt;p&gt;Hope you find this helpful! I strongly recommend using Expo Go for the sake of your team’s productivity if possible, and with the above tips, I think it is reasonably achievable. Thanks for reading, I pinky promise this was written by a human, not AI, hope you found this fascinating, at least a tiny bit, God bless!&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>The Maths of FM Synthesis</title>
   <link href="http://www.splinter.com.au/2024/10/09/maths-of-fm-synthesis/"/>
   <updated>2024-10-09T00:00:00+11:00</updated>
   <id>http://www.splinter.com.au/2024/10/09/maths-of-fm-synthesis/maths-of-fm-synthesis</id>
   <content type="html">&lt;p&gt;&lt;img src=&quot;/images/2024/fm.jpg&quot; alt=&quot;FM Synthesis&quot; /&gt;&lt;/p&gt;

&lt;p&gt;FM Synthesis is an old-school way of generating musical instrument sounds, initially popularised by the &lt;a href=&quot;https://en.wikipedia.org/wiki/Ad_Lib,_Inc.&quot;&gt;Adlib&lt;/a&gt; and SoundBlaster PC sound cards in the late ’80s (and, of course, in piano keyboards). Here’s an &lt;a href=&quot;https://chiptune.app/?play=Game%20MIDI%2FDescent%202%20(PC%E2%88%95DOS%2C%201996)%2FFM%2FD2-Descent-FM.mid&quot;&gt;example of what FM Synth music sounded like in games&lt;/a&gt;. Ahh the nostalgia.&lt;/p&gt;

&lt;p&gt;A friend who is a school music teacher found that his students all use the same identical samples for instruments for their creations. So I created &lt;a href=&quot;https://chrishulbert.github.io/you-synth/&quot;&gt;YouSynth, a web app that allows you to create any instrument you like using a basic form of FM synthesis, and download that instrument as a WAV file you can use anywhere, as well as play around with it using an attached MIDI keyboard. Please check it out!&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;So as to not leave out the maths teachers, I thought I’d write an article about how the maths for FM synthesis works! I think it’s fascinating, hopefully you might too. My dream is that maybe a maths teacher somewhere would use this as an interesting demonstration of applied maths to pique their students’ interest :)&lt;/p&gt;

&lt;h2 id=&quot;formula&quot;&gt;Formula&lt;/h2&gt;

&lt;p&gt;To start with, here’s the gist of it - for each sample, the value is:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;sin(
    carrierFrequency * time * 2 * pi
    +
    sin(modulatorFrequency * time * 2 * pi) * modulatorEnvelope
) * carrierEnvelope
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now let’s break that down.&lt;/p&gt;

&lt;h2 id=&quot;carrier-frequency&quot;&gt;Carrier frequency&lt;/h2&gt;

&lt;p&gt;The carrier frequency is the fundamental frequency of the note.
Eg for A4, it’s 440 Hz.
For Middle C, aka C4, it’s ~261.6 Hz.&lt;/p&gt;

&lt;p&gt;For each note you go up (including sharps), the frequency is multiplied by &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;2^(1/12)&lt;/code&gt;.
The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;1/12&lt;/code&gt; is because there are 12 semitones in each octave when including the sharps.
The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;2^&lt;/code&gt; is because frequencies double with each octave. Eg A4 is 440 Hz, and A5 is 880 Hz.&lt;/p&gt;

&lt;p&gt;When working with MIDI, each note gets a number representation: C4=60, C#4=61, D4=62, etc.
To convert from a midi note to a frequency, the formula is: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;440 * 2 ^ ((midiNote - 69) / 12)&lt;/code&gt;.&lt;/p&gt;
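&lt;p&gt;For example, plugging a few MIDI note numbers into that formula:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;A4 = 69: 440 * 2 ^ ((69 - 69) / 12) = 440 Hz
C4 = 60: 440 * 2 ^ ((60 - 69) / 12) = ~261.6 Hz (Middle C)
A5 = 81: 440 * 2 ^ ((81 - 69) / 12) = 880 Hz
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;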

&lt;h2 id=&quot;time&quot;&gt;Time&lt;/h2&gt;

&lt;p&gt;The time in the above formula is in seconds since the note started playing.
Since you’d typically be generating samples at a rate of 44100 or 48000 Hz, to convert from
the sample number to the time, this formula applies: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;time = sample / sampleRate&lt;/code&gt;.&lt;/p&gt;
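&lt;p&gt;For example, at a 44100 Hz sample rate, sample number 22050 corresponds to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;22050 / 44100 = 0.5&lt;/code&gt; seconds into the note.&lt;/p&gt;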

&lt;h2 id=&quot;pi&quot;&gt;Pi&lt;/h2&gt;

&lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;2 * pi&lt;/code&gt; is necessary because sin repeats its output every multiple of 2 * pi on its input.
An interesting aside: some credible mathematicians argue that tau (2 * pi) should be taught to students
instead of pi, because pi so often needs doubling before use that the doubled value arguably deserves to be
the famous constant. See the &lt;a href=&quot;https://tauday.com/tau-manifesto&quot;&gt;Tau manifesto&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;modulator-frequency&quot;&gt;Modulator frequency&lt;/h2&gt;

&lt;p&gt;The modulator is the waveform that ‘modulates’ the fundamental frequency. Think of it as the &lt;a href=&quot;https://guitar.fandom.com/wiki/Whammy_bar&quot;&gt;whammy bar&lt;/a&gt; on a guitar being wiggled up and down quickly.&lt;/p&gt;

&lt;p&gt;Typically the modulator frequency is a whole-number multiple or fraction of the fundamental frequency. Eg for a fundamental of 440 Hz, the following modulator frequencies all sound ‘nice’: 110 (440/4), 146.7 (440/3), 220 (440/2), 440, 880, 1320, etc.&lt;/p&gt;

&lt;h2 id=&quot;envelopes&quot;&gt;Envelopes&lt;/h2&gt;

&lt;p&gt;The envelopes control the amplitude/volume of the carrier and modulator over time. A typical envelope starts at zero, rises quickly to 100%, falls to a sustained volume of perhaps 50% where it remains while the piano key is held, then gradually returns to 0 once the key is released.&lt;/p&gt;

&lt;p&gt;A common strategy is the &lt;a href=&quot;https://en.wikipedia.org/wiki/Envelope_(music)#ADSR&quot;&gt;ADSR envelope&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;During the attack stage: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;amplitude = time / attackDuration&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;During decay stage: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;amplitude = 1 - (time - attackDuration) / decayDuration * (1 - sustainAmplitude)&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;During sustain stage: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;amplitude = sustainAmplitude&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;During release stage: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;amplitude = sustainAmplitude * (1 - releasingTime / releaseDuration)&lt;/code&gt;, where releasingTime is the time since the key was released.&lt;/p&gt;
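&lt;p&gt;Putting the four stages together, the envelope can be sketched as a single function. This is only pseudocode: the durations and sustainAmplitude are the envelope&#8217;s settings, and releasingTime is assumed to be the time since the key was released (the release fades linearly from the sustain level to zero):&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;function adsrAmplitude(time, keyIsHeld, releasingTime) {
    if keyIsHeld {
        if time &amp;lt; attackDuration {
            return time / attackDuration
        } else if time &amp;lt; attackDuration + decayDuration {
            return 1 - (time - attackDuration) / decayDuration * (1 - sustainAmplitude)
        } else {
            return sustainAmplitude
        }
    } else {
        return max(0, sustainAmplitude * (1 - releasingTime / releaseDuration))
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;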

&lt;h2 id=&quot;other-waves&quot;&gt;Other waves&lt;/h2&gt;

&lt;p&gt;To make more interesting sounds, other waveforms besides sine waves can be used.
Some common ones are square, triangle, and sawtooth.
Here are their formulae, each of which repeats every multiple of 1 on the input:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Sine &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;= sin(x * 2 * pi)&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;Square &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;= 4 * floor(x) - 2 * floor(2 * x) + 1&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;Triangle &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;= 2 * abs(2 * (x + 0.25 - floor(x + 0.75))) - 1&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;Sawtooth &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;= 2 * (x - floor(x + 0.5))&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
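&lt;p&gt;Tying it all together, generating all the samples for one note might look like the following pseudocode, where the two envelope functions are ADSR envelopes with their own settings:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;for sample in 0 to totalSamples {
    time = sample / sampleRate
    modulation = sin(modulatorFrequency * time * 2 * pi) * modulatorEnvelope(time)
    samples[sample] = sin(carrierFrequency * time * 2 * pi + modulation) * carrierEnvelope(time)
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;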

&lt;p&gt;So there you have it, the maths behind basic FM Synthesis. Thanks for reading, hope you found this fascinating, at least a tiny bit, God bless!&lt;/p&gt;

&lt;p&gt;Photo by Vackground on Unsplash&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Neural Networks from scratch #4: Training layers of neurons, backpropagation with pseudocode and a Rust demo</title>
   <link href="http://www.splinter.com.au/2024/07/10/neural-networks-4/"/>
   <updated>2024-07-10T00:00:00+10:00</updated>
   <id>http://www.splinter.com.au/2024/07/10/neural-networks-4/neural-networks-4</id>
   <content type="html">&lt;p&gt;&lt;img src=&quot;/images/2024/layers.jpg&quot; alt=&quot;Training layers of neurons&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Hi all, here’s the fourth on my series on neural networks / machine learning / AI from scratch. In the previous articles &lt;a href=&quot;/2024/03/10/neural-networks-1/index.html&quot;&gt;(please read them first!)&lt;/a&gt;, I explained how a single neuron works, then how to calculate the gradient of its weight and bias, and how you can use that gradient to train the neuron. In this article, I’ll explain how to determine the gradients when you have many layers of many neurons, and use those gradients to train the neural net.&lt;/p&gt;

&lt;p&gt;In my previous articles in this series, I used spreadsheets to make the maths easier to follow along. Unfortunately I don’t think I’ll be able to demonstrate this topic in a spreadsheet (I think it’d get out of hand), so I’ll keep it in code. I hope you can still follow along!&lt;/p&gt;

&lt;h2 id=&quot;data-model&quot;&gt;Data model&lt;/h2&gt;

&lt;p&gt;Pardon my pseudocode:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;class Net {
    layers: [Layer]
}

class Layer {
    neurons: [Neuron]
}

class Neuron {
    value: float
    bias: float
    weights: [float]
    activation_gradient: float
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Explanation:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;em&gt;Layers:&lt;/em&gt; The neural net is made up of multiple layers. The first one in the array is the input layer, the last one is the output layer.&lt;/li&gt;
  &lt;li&gt;&lt;em&gt;Neurons:&lt;/em&gt; The neurons that make up a layer. Each layer will typically have different numbers of neurons.&lt;/li&gt;
  &lt;li&gt;&lt;em&gt;Value:&lt;/em&gt; The output of each neuron.&lt;/li&gt;
  &lt;li&gt;&lt;em&gt;Bias:&lt;/em&gt; The bias of each neuron.&lt;/li&gt;
  &lt;li&gt;&lt;em&gt;Weights:&lt;/em&gt; Input weights for each neuron. This array’s size will be the number of inputs to this layer. For the first layer, this will be the number of inputs (aka features) to the neural net. For subsequent layers, this will be the count of neurons in the previous layer.&lt;/li&gt;
  &lt;li&gt;&lt;em&gt;Activation Gradient:&lt;/em&gt; These are the gradients of each neuron, chained to the latter layers via the magic of calculus. This is also equal to the gradient of the bias. Reading my second article in this series may help you understand what this gradient means :)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;highish-level-explanation&quot;&gt;High(ish) level explanation&lt;/h2&gt;

&lt;p&gt;What we’re trying to achieve here is to use calculus to determine the ‘gradient’ of every bias and every weight in this neural net. In order to do this, we have to ‘back propagate’ these gradients from the back to the front of the ‘layers’ array.&lt;/p&gt;

&lt;p&gt;Concretely - if, say, we had 3 layers: we’d figure out the gradients of the activation functions of layers[2], then use those values to calculate the gradients of layers[1], and then layers[0].&lt;/p&gt;

&lt;p&gt;Once we have the gradients of the activation functions for each neuron in each layer, it’s easy to figure out the gradient of the weights and bias for each neuron.&lt;/p&gt;

&lt;p&gt;And, as demonstrated in my previous article, once we have the gradients, we can ‘nudge’ the weights and biases in the direction that their gradients say, thus train the neural net.&lt;/p&gt;

&lt;h2 id=&quot;steps&quot;&gt;Steps&lt;/h2&gt;

&lt;p&gt;Training and determining the gradients go hand-in-hand, as you need the inputs to calculate the values of each neuron in the net, and you need the targets (aka desired outputs) to determine the gradients. Thus it’s a three-step process:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Forward pass (calculate each Neuron.value)&lt;/li&gt;
  &lt;li&gt;Backpropagation (calculate each Neuron.activation_gradient)&lt;/li&gt;
  &lt;li&gt;Train the weights and biases (adjust each Neuron.bias and Neuron.weights)&lt;/li&gt;
&lt;/ul&gt;
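&lt;p&gt;At the top level, these three steps are simply repeated over the training data. In pseudocode, where the three functions are the passes detailed in this article:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;for iteration in 0 to training_iterations {
    (inputs, targets) = pick an example from the training data
    forward_pass(net, inputs)
    backward_pass(net, targets)
    training_pass(net, inputs)
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;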

&lt;h2 id=&quot;forward-pass&quot;&gt;Forward pass&lt;/h2&gt;

&lt;p&gt;This pass fills in the ‘value’ fields.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;The first layer’s neurons must have the same number of weights as the number of inputs.&lt;/li&gt;
  &lt;li&gt;Each neuron’s value is calculated as tanh(bias + sum(weights * inputs)).&lt;/li&gt;
  &lt;li&gt;Since tanh is used as the activation function, this neural net can only work with inputs, outputs, and targets in the range -1 to +1.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Forward pass pseudocode:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;for layer in layers, first to last {
    if this is the first layer {
        for neuron in layer.neurons {
            total = neuron.bias
            for weight in neuron.weights {
                total += weight * inputs[weight_index]
            }
            neuron.value = tanh(total)
        }
    } else {
        previous_layer = layers[layer_index - 1]
        for neuron in layer.neurons {
            total = neuron.bias
            for weight in neuron.weights {
                total += weight * previous_layer.neurons[weight_index].value
            }
            neuron.value = tanh(total)
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;backward-pass-aka-backpropagation&quot;&gt;Backward pass (aka backpropagation)&lt;/h2&gt;

&lt;p&gt;This fills in the ‘activation_gradient’ fields.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Note that when iterating the layers here, you must go last to first.&lt;/li&gt;
  &lt;li&gt;The ‘targets’ are the array of output value(s) from the training data.&lt;/li&gt;
  &lt;li&gt;The last layer must have the same number of neurons as the number of targets.&lt;/li&gt;
  &lt;li&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(1 - value^2) * ...&lt;/code&gt; terms apply the chain rule: since value = tanh(total), and the derivative of tanh(x) is 1 - tanh(x)^2, the derivative of each neuron’s activation is 1 - value^2.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Backward pass pseudocode:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;for layer in reversed layers, last to first {
    if this is the last layer {
        for neuron in layer.neurons {
            neuron.activation_gradient =
                (1 - neuron.value^2) *
                (neuron.value - targets[neuron_index])
        }
    } else {
        next_layer = layers[layer_index + 1]
        for this_layer_neuron in layer.neurons {
            next_layer_gradient_sum = 0
            for next_layer_neuron in next_layer.neurons {
                next_layer_gradient_sum +=
                    next_layer_neuron.activation_gradient * 
                    next_layer_neuron.weights[this_layer_neuron_index]
            }
            this_layer_neuron.activation_gradient =
                (1 - this_layer_neuron.value^2) *
                next_layer_gradient_sum
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;training-pass&quot;&gt;Training pass&lt;/h2&gt;

&lt;p&gt;Now that you have the gradients, you can adjust the biases/weights to train the net to perform better.&lt;/p&gt;

&lt;p&gt;I’ll skim over this as it’s covered in my earlier articles in this series. The gist of it is that, for each neuron, the gradient is calculated for the bias and every weight, and the bias/weights are adjusted a little to ‘descend the gradient’. Perhaps my pseudocode might make more sense:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;learning_rate = 0.01 // Aka 1%
for layer in layers {
    if this is the first layer {
        for neuron in layer.neurons {
            neuron.bias -= neuron.activation_gradient * learning_rate
            for weight in neuron.weights {
                gradient_for_this_weight = inputs[weight_index] *
                    neuron.activation_gradient
                weight -= gradient_for_this_weight * learning_rate
            }
        }
    } else {
        previous_layer = layers[layer_index - 1]
        for neuron in layer.neurons {
            neuron.bias -= neuron.activation_gradient * learning_rate
            for weight in neuron.weights {
                gradient_for_this_weight =
                    previous_layer.neurons[weight_index].value *
                    neuron.activation_gradient
                weight -= gradient_for_this_weight * learning_rate
            }
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;rust-demo&quot;&gt;Rust demo&lt;/h2&gt;

&lt;p&gt;Because I’m a Rust tragic, here’s a demo. It’s kinda long, sorry, not sorry. It was fun to write :)&lt;/p&gt;

&lt;p&gt;This trains a neural network to calculate the area and circumference of a rectangle, given the width and height as inputs.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Width and height are scaled to the range 0.1 to 1, because that sits within the -1 to +1 range that the tanh activation function outputs.&lt;/li&gt;
  &lt;li&gt;Target values are also scaled to be in the range that tanh supports.&lt;/li&gt;
  &lt;li&gt;Initial biases and weights are randomly assigned.&lt;/li&gt;
&lt;/ul&gt;
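&lt;p&gt;As a sketch of the scaling (assuming, for illustration, that widths and heights range from 1 to 10):&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;scaled_width  = width / 10            // 1..10 becomes 0.1..1
scaled_height = height / 10           // 1..10 becomes 0.1..1
scaled_area   = width * height / 100  // max area of 100 becomes at most 1
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;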

&lt;p&gt;🦀🦀🦀&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;use rand::Rng;

struct Net {
    layers: Vec&amp;lt;Layer&amp;gt;,
}

struct Layer {
    neurons: Vec&amp;lt;Neuron&amp;gt;,
}

struct Neuron {
    value: f64,
    bias: f64,
    weights: Vec&amp;lt;f64&amp;gt;,
    activation_gradient: f64
}

const LEARNING_RATE: f64 = 0.001;

fn main() {
    let mut rng = rand::thread_rng();

    // Make a 3,3,2 neural net that inputs the width and height of a rectangle,
    // and outputs the area and circumference.
    let mut net = Net {
        layers: vec![
            Layer { // First layer has 2 weights to suit the 2 inputs.
                neurons: vec![
                    Neuron {
                        value: 0.,
                        bias: rng.gen_range(-1. .. 1.),
                        weights: vec![
                            rng.gen_range(-1. .. 1.),
                            rng.gen_range(-1. .. 1.),
                        ],
                        activation_gradient: 0.,
                    },
                    Neuron {
                        value: 0.,
                        bias: rng.gen_range(-1. .. 1.),
                        weights: vec![
                            rng.gen_range(-1. .. 1.),
                            rng.gen_range(-1. .. 1.),
                        ],
                        activation_gradient: 0.,
                    },
                    Neuron {
                        value: 0.,
                        bias: rng.gen_range(-1. .. 1.),
                        weights: vec![
                            rng.gen_range(-1. .. 1.),
                            rng.gen_range(-1. .. 1.),
                        ],
                        activation_gradient: 0.,
                    },
                ],
            },
            Layer { // Second layer neurons have the same number of weights as the previous layer has neurons.
                neurons: vec![
                    Neuron {
                        value: 0.,
                        bias: rng.gen_range(-1. .. 1.),
                        weights: vec![
                            rng.gen_range(-1. .. 1.),
                            rng.gen_range(-1. .. 1.),
                            rng.gen_range(-1. .. 1.),
                        ],
                        activation_gradient: 0.,
                    },
                    Neuron {
                        value: 0.,
                        bias: rng.gen_range(-1. .. 1.),
                        weights: vec![
                            rng.gen_range(-1. .. 1.),
                            rng.gen_range(-1. .. 1.),
                            rng.gen_range(-1. .. 1.),
                        ],
                        activation_gradient: 0.,
                    },
                    Neuron {
                        value: 0.,
                        bias: rng.gen_range(-1. .. 1.),
                        weights: vec![
                            rng.gen_range(-1. .. 1.),
                            rng.gen_range(-1. .. 1.),
                            rng.gen_range(-1. .. 1.),
                        ],
                        activation_gradient: 0.,
                    },
                ],
            },
            Layer { // Last layer has 2 neurons to suit 2 outputs.
                neurons: vec![
                    Neuron {
                        value: 0.,
                        bias: rng.gen_range(-1. .. 1.),
                        weights: vec![
                            rng.gen_range(-1. .. 1.),
                            rng.gen_range(-1. .. 1.),
                            rng.gen_range(-1. .. 1.),
                        ],
                        activation_gradient: 0.,
                    },
                    Neuron {
                        value: 0.,
                        bias: rng.gen_range(-1. .. 1.),
                        weights: vec![
                            rng.gen_range(-1. .. 1.),
                            rng.gen_range(-1. .. 1.),
                            rng.gen_range(-1. .. 1.),
                        ],
                        activation_gradient: 0.,
                    },
                ],
            },
        ],
    };

    // Train.
    let mut cumulative_error_counter: i64 = 0; // These vars are for averaging the errors.
    let mut area_error_percent_sum: f64 = 0.;
    let mut circumference_error_percent_sum: f64 = 0.;
    for training_iteration in 0..100_000_000 {
        // Inputs:
        let width: f64 = rng.gen_range(0.1 .. 1.);
        let height: f64 = rng.gen_range(0.1 .. 1.);
        let inputs: Vec&amp;lt;f64&amp;gt; = vec![width, height];

        // Targets (eg desired outputs):
        let area = width * height;
        let circumference_scaled = (height * 2. + width * 2.) * 0.25; // Scaled by 0.25 so it&apos;ll always be in range 0..1.
        let targets: Vec&amp;lt;f64&amp;gt; = vec![area, circumference_scaled];

        // Forward pass!
        for layer_index in 0..net.layers.len() {
            if layer_index == 0 {
                let layer = &amp;amp;mut net.layers[layer_index];
                for neuron in &amp;amp;mut layer.neurons {
                    let mut total = neuron.bias;
                    for (weight_index, weight) in neuron.weights.iter().enumerate() {
                        total += weight * inputs[weight_index];
                    }
                    neuron.value = total.tanh();
                }
            } else {
                // Workaround for Rust not allowing you to borrow two different vec elements simultaneously.
                let previous_layer: &amp;amp;Layer;
                unsafe { previous_layer = &amp;amp; *net.layers.as_ptr().add(layer_index - 1) }
                let layer = &amp;amp;mut net.layers[layer_index];
                for neuron in &amp;amp;mut layer.neurons {
                    let mut total = neuron.bias;
                    for (weight_index, weight) in neuron.weights.iter().enumerate() {
                        total += weight * previous_layer.neurons[weight_index].value;
                    }
                    neuron.value = total.tanh();
                }
            }
        }

        // Let&apos;s check the results!
        let outputs: Vec&amp;lt;f64&amp;gt; = net.layers.last().unwrap().neurons
            .iter().map(|n| n.value).collect();
        let area_error_percent = (targets[0] - outputs[0]).abs() / targets[0] * 100.;
        let circumference_error_percent = (targets[1] - outputs[1]).abs() / targets[1] * 100.;
        area_error_percent_sum += area_error_percent;
        circumference_error_percent_sum += circumference_error_percent;
        cumulative_error_counter += 1;
        if training_iteration % 10_000_000 == 0 {
            println!(&quot;Iteration {} errors: area {:.3}%, circumference: {:.3}% (smaller = better)&quot;,
                training_iteration,
                area_error_percent_sum / cumulative_error_counter as f64,
                circumference_error_percent_sum / cumulative_error_counter as f64);
            area_error_percent_sum = 0.;
            circumference_error_percent_sum = 0.;
            cumulative_error_counter = 0;
        }

        // Backward pass! (aka backpropagation)
        let layers_len = net.layers.len();
        for layer_index in (0..layers_len).rev() { // Reverse the order.
            if layer_index == layers_len - 1 { // Last layer.
                let layer = &amp;amp;mut net.layers[layer_index];
                for (neuron_index, neuron) in layer.neurons.iter_mut().enumerate() {
                    neuron.activation_gradient =
                        (1. - neuron.value * neuron.value) *
                        (neuron.value - targets[neuron_index]);
                }
            } else {
                // Workaround for Rust not allowing you to borrow two different vec elements simultaneously.
                let next_layer: &amp;amp;Layer;
                unsafe { next_layer = &amp;amp; *net.layers.as_ptr().add(layer_index + 1) }
                let layer = &amp;amp;mut net.layers[layer_index];
                for (this_layer_neuron_index, this_layer_neuron) in layer.neurons.iter_mut().enumerate() {
                    let mut next_layer_gradient_sum: f64 = 0.;
                    for next_layer_neuron in &amp;amp;next_layer.neurons {
                        next_layer_gradient_sum +=
                            next_layer_neuron.activation_gradient * 
                            next_layer_neuron.weights[this_layer_neuron_index];
                    }
                    this_layer_neuron.activation_gradient =
                        (1. - this_layer_neuron.value * this_layer_neuron.value) *
                        next_layer_gradient_sum;
                }
            }
        }

        // Training pass!
        for layer_index in 0..net.layers.len() {
            if layer_index == 0 {
                let layer = &amp;amp;mut net.layers[layer_index];
                for neuron in &amp;amp;mut layer.neurons {
                    neuron.bias -= neuron.activation_gradient * LEARNING_RATE;
                    for (weight_index, weight) in neuron.weights.iter_mut().enumerate() {
                        let gradient_for_this_weight =
                            inputs[weight_index] *
                            neuron.activation_gradient;
                        *weight -= gradient_for_this_weight * LEARNING_RATE;
                    }
                }
            } else {
                // Workaround for Rust not allowing you to borrow two different vec elements simultaneously.
                let previous_layer: &amp;amp;Layer;
                unsafe { previous_layer = &amp;amp; *net.layers.as_ptr().add(layer_index - 1) }
                let layer = &amp;amp;mut net.layers[layer_index];
                for neuron in &amp;amp;mut layer.neurons {
                    neuron.bias -= neuron.activation_gradient * LEARNING_RATE;
                    for (weight_index, weight) in neuron.weights.iter_mut().enumerate() {
                        let gradient_for_this_weight =
                            previous_layer.neurons[weight_index].value *
                            neuron.activation_gradient;
                        *weight -= gradient_for_this_weight * LEARNING_RATE;
                    }
                }
            }
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Which outputs:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Iteration 0 errors: area 223.106%, circumference: 13.175% (smaller = better)
Iteration 10000000 errors: area 17.861%, circumference: 1.123% (smaller = better)
Iteration 20000000 errors: area 14.656%, circumference: 0.790% (smaller = better)
Iteration 30000000 errors: area 14.516%, circumference: 0.698% (smaller = better)
Iteration 40000000 errors: area 6.359%, circumference: 0.882% (smaller = better)
Iteration 50000000 errors: area 2.966%, circumference: 0.875% (smaller = better)
Iteration 60000000 errors: area 2.769%, circumference: 0.807% (smaller = better)
Iteration 70000000 errors: area 2.600%, circumference: 0.698% (smaller = better)
Iteration 80000000 errors: area 2.401%, circumference: 0.573% (smaller = better)
Iteration 90000000 errors: area 2.166%, circumference: 0.468% (smaller = better)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;You can see the error percentage drop as it ‘learns’ to calculate the area and circumference of a rectangle. Magic!&lt;/p&gt;

&lt;p&gt;Thanks for reading, hope you found this helpful, at least a tiny bit, God bless!&lt;/p&gt;

&lt;p&gt;Photo by Jonas Hensel on Unsplash&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Previewable SwiftUI ViewModels</title>
   <link href="http://www.splinter.com.au/2024/05/16/previewable-swiftui-viewmodels/"/>
   <updated>2024-05-16T00:00:00+10:00</updated>
   <id>http://www.splinter.com.au/2024/05/16/previewable-swiftui-viewmodels/previewable-swiftui-viewmodels</id>
   <content type="html">&lt;p&gt;&lt;img src=&quot;/images/2024/viewmodels.jpg&quot; alt=&quot;Previewable SwiftUI ViewModels&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Hi all, I’d like to talk about a way to setup your ViewModels in SwiftUI to make previews easy:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;A)&lt;/strong&gt; Decouple your ViewModels from your Views.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;B)&lt;/strong&gt; Replace your ViewModel when previewing.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;C)&lt;/strong&gt; Easily inject any ViewState content when previewing.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;D)&lt;/strong&gt; Test your ViewModels without needing a View, instead testing their ViewState.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I’ve used a variant of this (simplified a little here) with a big team before, so I know it’s battle-proven. But of course this may be more helpful as a starting point for you, too.&lt;/p&gt;

&lt;p&gt;The general idea is this: Have a ‘ViewModel’ protocol, and make your Views have a generic constraint to accept any ViewModel that uses that view’s specific state/events, and use a preview viewmodel that adheres to the protocol.&lt;/p&gt;

&lt;h2 id=&quot;one-time-boilerplate&quot;&gt;One-time boilerplate&lt;/h2&gt;

&lt;p&gt;So here’s the generic ViewModel protocol that every screen will re-use.
ViewEvent is typically an enum, used by the View to send e.g. button presses to the ViewModel.
ViewState is the struct used to push the loaded/loading/error/whatever state to the View.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;protocol ViewModel&amp;lt;ViewEvent, ViewState&amp;gt;: ObservableObject {
    associatedtype ViewEvent
    associatedtype ViewState

    // For communication in the VM -&amp;gt; View direction:
    var viewState: ViewState { get set }

    // For communication in the View -&amp;gt; VM direction:
    func handle(event: ViewEvent)
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Somewhere you’ll have a ‘preview’ viewmodel.
This is declared once and used by all screens you want to preview.
I’m a fan of putting your preview code in a conditional compilation statement.
Note that this allows you to inject any viewstate you like.
Is ‘preview view’ a tautology? Should this be called &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;PreviewModel&lt;/code&gt; or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;PreViewModel&lt;/code&gt;? Flip a coin to decide…&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;#if targetEnvironment(simulator)
class PreviewViewModel&amp;lt;ViewEvent, ViewState&amp;gt;: ViewModel {
    @Published var viewState: ViewState

    init(viewState: ViewState) {
        self.viewState = viewState
    }

    func handle(event: ViewEvent) {
        print(&quot;Event: \(event)&quot;)
    }
}
#endif
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;view&quot;&gt;View&lt;/h2&gt;

&lt;p&gt;Before I show the view, I’ll introduce the event and states.
Firstly the event enum, this is the single ‘pipe’ via which the View calls through to the ViewModel (aspirationally… 2-way bindings sidestep this).
You will likely have associated values on some of these, eg the id of which row was pressed, that kind of thing:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;enum FooViewEvent {
    case hello
    case goodbye
    case present
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Next is the ViewState. This controls what is displayed.
Typically you might have a loading/loaded/error enum in here, among other things.
Notice there’s an ‘xIsPresented’ var here that is used in a 2-way-binding later for modal presentation:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;struct FooViewState: Equatable {
    var text: String
    var sheetIsPresented: Bool = false
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Ok, now the state and event are out of the way, here’s how a view might look.
Note the gnarly generic clause up the top, this is the trickiest part of this whole technique to be honest.
Basically it’s saying ‘I can accept any ViewModel that uses this particular screen’s event/state’.
Also note the 2-way binding for the modal sheet: even though this somewhat side-steps the idea of piping all input/output through the event/state concept, it’s very SwiftUI-idiomatic to use these bindings, so I don’t want to be overly rigid and make life difficult: we want to avoid ‘cutting against the grain’ when working with SwiftUI. So, yeah, this isn’t architecturally pure, but it is productive!&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;struct FooView&amp;lt;VM: ViewModel&amp;gt;: View
where VM.ViewEvent == FooViewEvent,
      VM.ViewState == FooViewState
{
    @StateObject var viewModel: VM

    var body: some View {
        VStack {
            Text(viewModel.viewState.text)
            Button(&quot;Hello&quot;) {
                viewModel.handle(event: .hello)
            }
            Button(&quot;Goodbye&quot;) {
                viewModel.handle(event: .goodbye)
            }
            Button(&quot;Present modal sheet&quot;) {
                viewModel.handle(event: .present)
            }
        }
        .sheet(isPresented: $viewModel.viewState.sheetIsPresented) {
            Text(&quot;This is a modal sheet!&quot;)
                .presentationDetents([.medium])
                .presentationDragIndicator(.visible)
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;viewmodel&quot;&gt;ViewModel&lt;/h2&gt;

&lt;p&gt;Last but not least is the ViewModel for this screen.
Note that because viewState is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;@Published&lt;/code&gt;, and ViewModel is a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;@StateObject&lt;/code&gt;, any updates to viewState are magically automatically applied to the View. It’s really simple, no Combine required!
Also note the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;xIsPresented&lt;/code&gt; is trivial to set to true to present something, far simpler than using some form of router which I fear can be convoluted.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;class FooViewModel: ViewModel {
    @Published var viewState: FooViewState

    init() {
        viewState = FooViewState(
            text: &quot;Nothing has happened yet.&quot;
        )
    }

    func handle(event: FooViewEvent) {
        switch event {
        case .hello:
            viewState.text = &quot;👋&quot;
        case .goodbye:
            viewState.text = &quot;😢&quot;
        case .present:
            viewState.sheetIsPresented = true
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;previews&quot;&gt;Previews&lt;/h2&gt;

&lt;p&gt;At the bottom of the view file you’ll want your previews.
By using the PreviewViewModel you can inject whatever ViewState you like:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;#if targetEnvironment(simulator)
#Preview {
    FooView(
        viewModel: PreviewViewModel(
            viewState: FooViewState(
                text: &quot;This is a preview!&quot;
            )
        )
    )
}
#endif
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;I hope this helps you use SwiftUI in a preview-friendly way! SwiftUI without previews is the pits…&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://gist.github.com/chrishulbert/9a21635a581e044f86e3ccc1d56010a6&quot;&gt;The source for this is on this github gist here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Thanks for reading, hope you found this helpful, at least a tiny bit, God bless!&lt;/p&gt;

&lt;p&gt;Photo by Yahya Gopalani on Unsplash
Font by Khurasan on Dafont&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Neural Networks explained with spreadsheets, 3: Training a single neuron</title>
   <link href="http://www.splinter.com.au/2024/04/22/neural-networks-3/"/>
   <updated>2024-04-22T00:00:00+10:00</updated>
   <id>http://www.splinter.com.au/2024/04/22/neural-networks-3/neural-networks-3</id>
   <content type="html">&lt;p&gt;&lt;img src=&quot;/images/2024/training.jpg&quot; alt=&quot;Training a single neuron&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Hi all, here’s the third in my series on neural networks / machine learning / AI from scratch. In the previous articles &lt;a href=&quot;/2024/03/10/neural-networks-1/index.html&quot;&gt;(please read them first!)&lt;/a&gt;, I explained how a single neuron works, and how to calculate the gradients of its weight and bias. In this article, I’ll explain how you can use those gradients to train the neuron.&lt;/p&gt;

&lt;h2 id=&quot;spreadsheet&quot;&gt;Spreadsheet&lt;/h2&gt;

&lt;p&gt;I recommend opening this spreadsheet in a separate tab, and viewing it as you read this post which explains the
maths: &lt;a href=&quot;https://docs.google.com/spreadsheets/d/1nSrsC1W1A_BQlJDi3nSpJ9bqJCWjRicNoX30O2aPVNY/edit?usp=sharing&quot;&gt;Single neuron training&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In case the linked spreadsheet is lost to posterity, here it is in slightly less well-formatted form
(note: for brevity’s sake, I’ve shortened references such as B2 to simply ‘B’ when referring to a column in the same row):&lt;/p&gt;

&lt;div class=&quot;my_spreadsheet_table_is_next&quot;&gt;&lt;/div&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt;A&lt;/th&gt;
      &lt;th&gt;B&lt;/th&gt;
      &lt;th&gt;C&lt;/th&gt;
      &lt;th&gt;D&lt;/th&gt;
      &lt;th&gt;E&lt;/th&gt;
      &lt;th&gt;F&lt;/th&gt;
      &lt;th&gt;G&lt;/th&gt;
      &lt;th&gt;H&lt;/th&gt;
      &lt;th&gt;I&lt;/th&gt;
      &lt;th&gt;J&lt;/th&gt;
      &lt;th&gt;K&lt;/th&gt;
      &lt;th&gt;L&lt;/th&gt;
      &lt;th&gt;M&lt;/th&gt;
      &lt;th&gt;N&lt;/th&gt;
      &lt;th&gt;O&lt;/th&gt;
      &lt;th&gt;P&lt;/th&gt;
      &lt;th&gt;Q&lt;/th&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;th&gt;1&lt;/th&gt;
      &lt;th&gt;Learning rate&lt;/th&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt;Training&lt;/th&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt;Neuron&lt;/th&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt;Outputs&lt;/th&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt; &lt;/th&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;th&gt;2&lt;/th&gt;
      &lt;th&gt;0.1&lt;/th&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt;In&lt;/th&gt;
      &lt;th&gt;Out&lt;/th&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt;Input&lt;/th&gt;
      &lt;th&gt;Weight&lt;/th&gt;
      &lt;th&gt;Weight gradient&lt;/th&gt;
      &lt;th&gt;Bias&lt;/th&gt;
      &lt;th&gt;Bias gradient&lt;/th&gt;
      &lt;th&gt;Net&lt;/th&gt;
      &lt;th&gt;Output&lt;/th&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt;Target&lt;/th&gt;
      &lt;th&gt;Attempt&lt;/th&gt;
      &lt;th&gt;Error&lt;/th&gt;
      &lt;th&gt;Loss&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;3&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt;0.01&lt;/td&gt;
      &lt;td&gt;0.1 (C*10)&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt;0.01 (C)&lt;/td&gt;
      &lt;td&gt;0.5&lt;/td&gt;
      &lt;td&gt;J * F&lt;/td&gt;
      &lt;td&gt;0.5&lt;/td&gt;
      &lt;td&gt;P * (1-L²)&lt;/td&gt;
      &lt;td&gt;F*G+I&lt;/td&gt;
      &lt;td&gt;Tanh(K)&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt;0.1 (D)&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;L-N&lt;/td&gt;
      &lt;td&gt;P² / 2&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;4&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt;0.01&lt;/td&gt;
      &lt;td&gt;0.1 (C*10)&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt;0.01 (C)&lt;/td&gt;
      &lt;td&gt;G3 - H3 * LEARNING_RATE&lt;/td&gt;
      &lt;td&gt;J * F&lt;/td&gt;
      &lt;td&gt;I3 - J3 * LEARNING_RATE&lt;/td&gt;
      &lt;td&gt;P * (1-L²)&lt;/td&gt;
      &lt;td&gt;F*G+I&lt;/td&gt;
      &lt;td&gt;Tanh(K)&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt;0.1 (D)&lt;/td&gt;
      &lt;td&gt;2&lt;/td&gt;
      &lt;td&gt;L-N&lt;/td&gt;
      &lt;td&gt;P² / 2&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;5&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt;0.01&lt;/td&gt;
      &lt;td&gt;0.1 (C*10)&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt;0.01 (C)&lt;/td&gt;
      &lt;td&gt;G4 - H4 * LEARNING_RATE&lt;/td&gt;
      &lt;td&gt;J * F&lt;/td&gt;
      &lt;td&gt;I4 - J4 * LEARNING_RATE&lt;/td&gt;
      &lt;td&gt;P * (1-L²)&lt;/td&gt;
      &lt;td&gt;F*G+I&lt;/td&gt;
      &lt;td&gt;Tanh(K)&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt;0.1 (D)&lt;/td&gt;
      &lt;td&gt;3&lt;/td&gt;
      &lt;td&gt;L-N&lt;/td&gt;
      &lt;td&gt;P² / 2&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;style&gt;
    div.my_spreadsheet_table_is_next + table td,th {
        padding: 0.1em;
        border: 1px solid #000;
    }
&lt;/style&gt;

&lt;h2 id=&quot;high-level-explanation&quot;&gt;High level explanation&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;Note: “Parameters” is the umbrella term for “weights and biases”.&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Row 3 starts with any old values for the parameters.&lt;/li&gt;
  &lt;li&gt;Row 4 optimises the parameters a little to decrease the error.&lt;/li&gt;
  &lt;li&gt;Rows 5–1000 repeat this optimisation process, aka ‘gradient descent’.&lt;/li&gt;
  &lt;li&gt;Eventually the optimised parameters will produce the output we want!&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;detailed-explanation&quot;&gt;Detailed explanation&lt;/h2&gt;

&lt;p&gt;A2 is the ‘learning rate’. This governs how much we ‘nudge’ our weight/bias each iteration. In this example it’s 10%, higher than the more common 0.1%–1%.&lt;/p&gt;

&lt;p&gt;Columns C-D are the ‘training data’. In this example we want to train the neuron to multiply by 10.&lt;/p&gt;

&lt;p&gt;Columns F-L are the neuron maths, as covered by my earlier articles. The two gradients in particular are tricky and important: They dictate which direction the bias/weight should respectively be ‘nudged’ to decrease the error.&lt;/p&gt;

&lt;p&gt;Columns N-Q are the outputs, and useful for producing the neat graph you’ll hopefully see in the actual spreadsheet, which demonstrates how the error decreases over the iterations.&lt;/p&gt;

&lt;p&gt;Row 3 is the initial data. At this point in a real implementation we would typically choose random values for the initial bias and weight, however I’ve chosen 0.5 to start with because it’s a nice round number.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🧨💣💥 Rows 4+ are the same as row 3, except that the parameters have some of their gradient subtracted each time. (this is the important bit)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Incidentally, this might help explain why training a NN uses far more computation than running it: all those gradient calculations, repeated over many iterations of training data.&lt;/p&gt;

&lt;p&gt;And there you have it, that’s how to use the gradients to train a single neuron. Next I’ll explain how to calculate the gradients for a network of them!&lt;/p&gt;

&lt;h2 id=&quot;rust-demo&quot;&gt;Rust demo&lt;/h2&gt;

&lt;p&gt;Because I’m a Rust tragic, here’s a demo:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;const LEARNING_RATE: f64 = 0.01;
const TRAINING_INPUT: f64 = 0.01;
const TRAINING_OUTPUT: f64 = 0.1;

fn main() {
    // Initial parameters.
    let mut weight: f64 = 0.5;
    let mut bias: f64 = 0.5;

    // Train.
    for _ in 0..100_000 {
        let net = TRAINING_INPUT * weight + bias;
        let output = net.tanh();
        let error = output - TRAINING_OUTPUT;
        let loss = error * error / 2.;
        let bias_gradient = error * (1. - output * output);
        let weight_gradient = bias_gradient * TRAINING_INPUT;
        weight -= weight_gradient * LEARNING_RATE;
        bias -= bias_gradient * LEARNING_RATE;
    }

    // Use the trained parameters:
    let trained_net = TRAINING_INPUT * weight + bias;
    let trained_output = trained_net.tanh();
    println!(&quot;Trained output: {}&quot;, trained_output);
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Which outputs:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Trained output: 0.1000000000000007
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Which matches the training output nicely!&lt;/p&gt;

&lt;p&gt;Thanks for reading, hope you found this helpful, at least a tiny bit, God bless!&lt;/p&gt;

&lt;p&gt;Photo by Eugene Golovesov on Unsplash&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Neural Networks explained with spreadsheets, 2: Gradients for a single neuron</title>
   <link href="http://www.splinter.com.au/2024/03/20/neural-networks-2/"/>
   <updated>2024-03-20T00:00:00+11:00</updated>
   <id>http://www.splinter.com.au/2024/03/20/neural-networks-2/neural-networks-2</id>
   <content type="html">&lt;p&gt;&lt;img src=&quot;/images/2024/gradients.jpg&quot; alt=&quot;Gradients for a single neuron&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Hi all, here’s the second in my series on neural networks / machine learning / AI from scratch. In the previous article &lt;a href=&quot;/2024/03/10/neural-networks-1/index.html&quot;&gt;(please read it first!)&lt;/a&gt;, I explained how a single neuron works. In this article, I’ll explain how you can determine the ‘gradients’ of that neuron, in other words how much effect the weight and bias each have on the final ‘loss’, using some high-school calculus. This is a prerequisite for training, which I’ll cover later.&lt;/p&gt;

&lt;h2 id=&quot;spreadsheet&quot;&gt;Spreadsheet&lt;/h2&gt;

&lt;p&gt;I recommend opening this spreadsheet in a separate tab, and viewing it as you read this post which explains the
maths: &lt;a href=&quot;https://docs.google.com/spreadsheets/d/1LPj7aTkAUWww4hIQIpqcL_iNi0FYz1yWEriR3ivTQdw/edit?usp=sharing&quot;&gt;Single neuron gradients&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In case the linked spreadsheet is lost to posterity, here it is in slightly less well-formatted form
(note: for brevity’s sake, I’ve shortened references such as B2 to simply ‘B’ when referring to a column in the same row):&lt;/p&gt;

&lt;div class=&quot;my_spreadsheet_table_is_next&quot;&gt;&lt;/div&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt;A&lt;/th&gt;
      &lt;th&gt;B&lt;/th&gt;
      &lt;th&gt;C&lt;/th&gt;
      &lt;th&gt;D&lt;/th&gt;
      &lt;th&gt;E&lt;/th&gt;
      &lt;th&gt;F&lt;/th&gt;
      &lt;th&gt;G&lt;/th&gt;
      &lt;th&gt;H&lt;/th&gt;
      &lt;th&gt;I&lt;/th&gt;
      &lt;th&gt;J&lt;/th&gt;
      &lt;th&gt;K&lt;/th&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;th&gt;1&lt;/th&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt;Input&lt;/th&gt;
      &lt;th&gt;Weight&lt;/th&gt;
      &lt;th&gt;Bias&lt;/th&gt;
      &lt;th&gt;Net&lt;/th&gt;
      &lt;th&gt;Output&lt;/th&gt;
      &lt;th&gt;Target&lt;/th&gt;
      &lt;th&gt;Error&lt;/th&gt;
      &lt;th&gt;Loss&lt;/th&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt; &lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;2&lt;/td&gt;
      &lt;td&gt;Neuron maths:&lt;/td&gt;
      &lt;td&gt;0.4&lt;/td&gt;
      &lt;td&gt;0.5&lt;/td&gt;
      &lt;td&gt;0.6&lt;/td&gt;
      &lt;td&gt;0.8 (B*C+D)&lt;/td&gt;
      &lt;td&gt;0.664 (tanh(E))&lt;/td&gt;
      &lt;td&gt;0.7&lt;/td&gt;
      &lt;td&gt;-0.035963 (F-G)&lt;/td&gt;
      &lt;td&gt;0.0006467 (H^2 / 2)&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;3&lt;/td&gt;
      &lt;td&gt;Real local gradients:&lt;/td&gt;
      &lt;td&gt;0.5 (C2)&lt;/td&gt;
      &lt;td&gt;0.4 (B2)&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;0.5591 (1-F2^2)&lt;/td&gt;
      &lt;td&gt;-0.036 (H2)&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;4&lt;/td&gt;
      &lt;td&gt;Real global gradients:&lt;/td&gt;
      &lt;td&gt;-0.0101 (B3*E)&lt;/td&gt;
      &lt;td&gt;-0.0080 (C3*E)&lt;/td&gt;
      &lt;td&gt;-0.0201 (E)&lt;/td&gt;
      &lt;td&gt;-0.0201 (E3*F)&lt;/td&gt;
      &lt;td&gt;-0.036 (F3)&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;5&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt;Faux gradient&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;6&lt;/td&gt;
      &lt;td&gt;Faux gradient of ‘output’:&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt;0.66414 (F2+Tiny)&lt;/td&gt;
      &lt;td&gt;0.7&lt;/td&gt;
      &lt;td&gt;-0.035863 (F-G)&lt;/td&gt;
      &lt;td&gt;0.0006431 (H^2 / 2)&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt;-0.0359 ((I - I2)/Tiny)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;7&lt;/td&gt;
      &lt;td&gt;Faux gradient of ‘net’:&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt;0.8001 (E2+Tiny)&lt;/td&gt;
      &lt;td&gt;0.66409 (tanh(E))&lt;/td&gt;
      &lt;td&gt;0.7&lt;/td&gt;
      &lt;td&gt;-0.035907 (F-G)&lt;/td&gt;
      &lt;td&gt;0.0006447 (H^2 / 2)&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt;-0.0201 ((I - I2)/Tiny)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;8&lt;/td&gt;
      &lt;td&gt;Faux gradient of ‘bias’:&lt;/td&gt;
      &lt;td&gt;0.4&lt;/td&gt;
      &lt;td&gt;0.5&lt;/td&gt;
      &lt;td&gt;0.6001 (D2+Tiny)&lt;/td&gt;
      &lt;td&gt;0.8001 (B*C+D)&lt;/td&gt;
      &lt;td&gt;0.66409 (tanh(E))&lt;/td&gt;
      &lt;td&gt;0.7&lt;/td&gt;
      &lt;td&gt;-0.035907 (F-G)&lt;/td&gt;
      &lt;td&gt;0.0006447 (H^2 / 2)&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt;-0.0201 ((I - I2)/Tiny)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;9&lt;/td&gt;
      &lt;td&gt;Faux gradient of ‘weight’:&lt;/td&gt;
      &lt;td&gt;0.4&lt;/td&gt;
      &lt;td&gt;0.5001 (C2+Tiny)&lt;/td&gt;
      &lt;td&gt;0.6&lt;/td&gt;
      &lt;td&gt;0.80004 (B*C+D)&lt;/td&gt;
      &lt;td&gt;0.66406 (tanh(E))&lt;/td&gt;
      &lt;td&gt;0.7&lt;/td&gt;
      &lt;td&gt;-0.035941 (F-G)&lt;/td&gt;
      &lt;td&gt;0.0006459 (H^2 / 2)&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt;-0.0080 ((I - I2)/Tiny)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;10&lt;/td&gt;
      &lt;td&gt;Faux gradient of ‘input’:&lt;/td&gt;
      &lt;td&gt;0.4001 (B2+Tiny)&lt;/td&gt;
      &lt;td&gt;0.5&lt;/td&gt;
      &lt;td&gt;0.6&lt;/td&gt;
      &lt;td&gt;0.80005 (B*C+D)&lt;/td&gt;
      &lt;td&gt;0.66406 (tanh(E))&lt;/td&gt;
      &lt;td&gt;0.7&lt;/td&gt;
      &lt;td&gt;-0.035935 (F-G)&lt;/td&gt;
      &lt;td&gt;0.0006457 (H^2 / 2)&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt;-0.0100 ((I - I2)/Tiny)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Tiny&lt;/td&gt;
      &lt;td&gt;0.0001&lt;/td&gt;
      &lt;td&gt;Moved down here to help with readability&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;style&gt;
    div.my_spreadsheet_table_is_next + table td,th {
        padding: 0.3em;
        border: 1px solid #000;
    }
&lt;/style&gt;

&lt;h2 id=&quot;what-is-the-gradient&quot;&gt;What is the gradient?&lt;/h2&gt;

&lt;p&gt;Firstly: what is the gradient? It is also known as the slope or derivative of a function (or, in the analogy below, a velocity).&lt;/p&gt;

&lt;p&gt;For a simple example, consider tides in a river mouth:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;At high tide (maximum position), the water is still (0 velocity).&lt;/li&gt;
  &lt;li&gt;Then, half-way from high to low tide (0 position), the water is rushing out and the level is falling fastest (maximum negative velocity).
This is the time when the waves are biggest and my friend almost drowned the other day on his jet ski, but that’s a story for another day!&lt;/li&gt;
  &lt;li&gt;Then, at low tide (minimum position), the water is still again (0 velocity).&lt;/li&gt;
  &lt;li&gt;Then, half-way from low to high tide (0 position again), the water is rushing in and the level is rising fastest (maximum positive velocity).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In this analogy, the height of the water is the position (like the values for the weights, bias, net, output, or loss),
and the velocity of the water is the &lt;em&gt;gradient&lt;/em&gt; (or derivative, or slope). Figuring out that gradient is what this article is all about.&lt;/p&gt;

&lt;p&gt;For a more thorough explanation of gradients, &lt;a href=&quot;https://en.wikipedia.org/wiki/Slope#Calculus&quot;&gt;check out Wikipedia&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;why-do-we-want-to-know-the-gradients&quot;&gt;Why do we want to know the gradients?&lt;/h2&gt;

&lt;p&gt;The reason we want the gradients of a neuron’s weight(s) and bias, is that we can use them to figure out whether we need to nudge
their values up or down a bit or leave them as-is, in order to get an output that’s closer to the target during training.&lt;/p&gt;

&lt;h2 id=&quot;faking-a-gradient&quot;&gt;Faking a gradient&lt;/h2&gt;

&lt;p&gt;You can fake a gradient by comparing the result of an equation vs the result when adding a tiny amount to the input.
These faux gradients are helpful for verifying our calculus later.&lt;/p&gt;

&lt;p&gt;Here’s the general way to fake a gradient:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Faux gradient of f(x) = ( f(x + tiny) - f(x) ) / tiny
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
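&lt;p&gt;As a quick sketch of that idea in Rust (using an example function of my own, x^2, rather than anything from the neuron):&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;// Faux (finite-difference) gradient of any f at x: (f(x + tiny) - f(x)) / tiny.
fn faux_gradient(f: impl Fn(f64) -&amp;gt; f64, x: f64) -&amp;gt; f64 {
    let tiny = 0.0001;
    (f(x + tiny) - f(x)) / tiny
}

fn main() {
    // Calculus says the gradient of x^2 at x = 3 is 2x = 6; the faux version gets close.
    println!(&quot;{:.3}&quot;, faux_gradient(|x| x * x, 3.0));
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;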

&lt;p&gt;To make it more specific to our neuron:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Faux gradient of how weight affects output = (
    tanh(input * (weight + tiny) + bias) -
    tanh(input * weight + bias)
) / tiny
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Or the big kahuna, run all the way through the loss function:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Faux gradient of how bias affects loss = (
    (tanh(input * weight + (bias + tiny)) - target)^2 / 2 
    -
    (tanh(input * weight + bias) - target)^2 / 2
) / tiny
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
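&lt;p&gt;Here’s a small Rust sketch of that bias example, plugging in the spreadsheet’s values (input 0.4, weight 0.5, bias 0.6, target 0.7):&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;fn loss(input: f64, weight: f64, bias: f64, target: f64) -&amp;gt; f64 {
    let output = (input * weight + bias).tanh();
    let error = output - target;
    error * error / 2.0
}

fn main() {
    let tiny = 0.0001;
    // Nudge only the bias, and see how much the loss moves per unit of nudge.
    let faux = (loss(0.4, 0.5, 0.6 + tiny, 0.7) - loss(0.4, 0.5, 0.6, 0.7)) / tiny;
    println!(&quot;Faux bias gradient: {:.4}&quot;, faux); // Matches the spreadsheet: -0.0201
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;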

&lt;p&gt;Please note that the loss function has changed since the previous article (it now has a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/ 2&lt;/code&gt;) - this makes the calculus simpler.&lt;/p&gt;

&lt;p&gt;You can look at rows 6 through 10 in the spreadsheet to see how these faux gradients are calculated. In columns B to I, various
things have the tiny value added to them, to see how this affects the final ‘loss’. For instance, on row 6, you can see I’m adding
the tiny value to the output, feeding that through to the loss function, and then applying &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(loss with tiny - loss without tiny) / tiny&lt;/code&gt; to
calculate the faux gradient. The rest of these faux gradients are similar.&lt;/p&gt;

&lt;h2 id=&quot;real-gradients-with-calculus&quot;&gt;Real gradients with calculus&lt;/h2&gt;

&lt;p&gt;Let’s use calculus to calculate the real gradients. Firstly we need to calculate the ‘local’ gradients. See row 3 in the spreadsheet as you follow along:&lt;/p&gt;

&lt;p&gt;What is a local gradient? Since all our calculations are performed in stages (eg net &amp;gt; output &amp;gt; error &amp;gt; loss), a local gradient is how much impact changes in one stage have on the next stage.&lt;/p&gt;

&lt;p&gt;A better maths teacher than I would be able to explain how we arrive at these, but here are the formulas:&lt;/p&gt;

&lt;h3 id=&quot;local-gradient-equations&quot;&gt;Local gradient equations&lt;/h3&gt;

&lt;p&gt;&lt;em&gt;(Note when I say ‘the gradient of Y with respect to X’ it means that X is the input/earlier stage, Y is the output/later stage, and it roughly means
‘if you nudge X, what impact will that have on Y?’.)&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Input (gradient of Net with respect to Input) = Weight (see B3)&lt;/li&gt;
  &lt;li&gt;Weight (gradient of Net with respect to Weight) = Input (see C3)&lt;/li&gt;
  &lt;li&gt;Bias (gradient of Net with respect to Bias) = 1 (see D3)&lt;/li&gt;
  &lt;li&gt;Net (gradient of Output with respect to Net) = 1 - Output^2 (see E3)&lt;/li&gt;
  &lt;li&gt;Output (gradient of Error with respect to Output) = 1&lt;/li&gt;
  &lt;li&gt;Error (gradient of Loss with respect to Error) = Error (this is where the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/ 2&lt;/code&gt; in our loss helps) (see F3, which holds these last two multiplied together)&lt;/li&gt;
&lt;/ul&gt;
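&lt;p&gt;The Net one is the least obvious (it’s the derivative of tanh), so here’s a quick Rust sanity check of mine, comparing it against a faux gradient at the spreadsheet’s net value of 0.8:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;fn main() {
    let net: f64 = 0.8; // E2 in the spreadsheet.
    let tiny = 0.0001;
    // Faux gradient: nudge the net, see how much the output moves.
    let faux = ((net + tiny).tanh() - net.tanh()) / tiny;
    // The claimed local gradient: 1 - Output^2.
    let output = net.tanh();
    let real = 1.0 - output * output;
    println!(&quot;faux: {:.4}, real: {:.4}&quot;, faux, real); // Both are about 0.559 (E3).
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;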

&lt;h3 id=&quot;global-gradients&quot;&gt;Global gradients&lt;/h3&gt;

&lt;p&gt;Next we need to combine the gradients using the calculus ‘chain rule’, so that we can get the impacts of each variable on the loss.&lt;/p&gt;

&lt;p&gt;These are calculated in reverse order (this is why it is called &lt;em&gt;back&lt;/em&gt;propagation) because most of these rely on the next step’s gradient.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Output (gradient of Loss with respect to Output) = Error (see F4)&lt;/li&gt;
  &lt;li&gt;Net (gradient of Loss with respect to Net) = (1 - Output^2) * Output global gradient (see E4)&lt;/li&gt;
  &lt;li&gt;Bias (gradient of Loss with respect to Bias) = Net global gradient (see D4)&lt;/li&gt;
  &lt;li&gt;Weight (gradient of Loss with respect to Weight) = Input * Net global gradient (see C4)&lt;/li&gt;
  &lt;li&gt;Input (gradient of Loss with respect to Input) = Weight * Net global gradient (see B4)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You may like to compare these with the respective faux gradients and see that they are (roughly) the same.&lt;/p&gt;

&lt;p&gt;And there you have it: the gradients for a single neuron. Next I’ll explain how to use these gradients for training!&lt;/p&gt;

&lt;h2 id=&quot;unnecessary-rust-implementation&quot;&gt;Unnecessary Rust implementation&lt;/h2&gt;

&lt;p&gt;Just for the hell of it, here’s an implementation in Rust:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;struct Neuron {
    input: f32,
    weight: f32,
    bias: f32,
    target: f32,
}

impl Neuron {
    fn net(&amp;amp;self) -&amp;gt; f32 {
        self.input * self.weight + self.bias
    }
    fn output(&amp;amp;self) -&amp;gt; f32 {
        self.net().tanh()
    }
    fn error(&amp;amp;self) -&amp;gt; f32 {
        self.output() - self.target
    }
    fn loss(&amp;amp;self) -&amp;gt; f32 {
        let e = self.error();
        e * e / 2.
    }
    // Gradient of loss with respect to output (the ‘output global gradient’).
    fn output_gradient(&amp;amp;self) -&amp;gt; f32 {
        self.error()
    }
    // Chain rule: local tanh derivative (1 - output^2) times the output gradient.
    fn net_gradient(&amp;amp;self) -&amp;gt; f32 {
        let o = self.output();
        let net_local_derivative = 1. - o * o;
        net_local_derivative * self.output_gradient()
    }
    // The bias’s local gradient is 1, so this is just the net gradient.
    fn bias_gradient(&amp;amp;self) -&amp;gt; f32 {
        self.net_gradient()
    }
    // The weight’s local gradient is the input.
    fn weight_gradient(&amp;amp;self) -&amp;gt; f32 {
        self.input * self.net_gradient()
    }
}

fn main() {
    let neuron = Neuron {
        input: 0.4,
        weight: 0.5,
        bias: 0.6,
        target: 0.7,
    };
    println!(&quot;Weight gradient: {:.4}&quot;, neuron.weight_gradient());
    println!(&quot;Bias gradient: {:.4}&quot;, neuron.bias_gradient());
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Which outputs:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Weight gradient: -0.0080
Bias gradient: -0.0201
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Which matches the spreadsheet nicely!&lt;/p&gt;

&lt;p&gt;Thanks for reading, hope you found this helpful, at least a tiny bit, God bless!&lt;/p&gt;

&lt;p&gt;Photo by Chinnu Indrakumar on Unsplash&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Neural Networks explained with spreadsheets, 1: A single neuron</title>
   <link href="http://www.splinter.com.au/2024/03/10/neural-networks-1/"/>
   <updated>2024-03-10T00:00:00+11:00</updated>
   <id>http://www.splinter.com.au/2024/03/10/neural-networks-1/neural-networks-1</id>
   <content type="html">&lt;p&gt;&lt;img src=&quot;/images/2024/neuron.jpg&quot; alt=&quot;Neural Networks explained with spreadsheets - 1&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Hi all, I’d like to do a series on neural networks (or machine learning, or AI), starting from the very basics, not using any frameworks. This is inspired by &lt;a href=&quot;https://www.youtube.com/watch?v=VMj-3S1tku0&quot;&gt;Andrej Karpathy’s intro video here&lt;/a&gt; so perhaps consider watching that (if you can find a few hours spare!). I seem to be writing about maths a lot lately, which gave me an idea: everyone understands spreadsheets (Excel / Pages / Google Sheets), so I’m going to use them to (hopefully!) make the maths clearer.&lt;/p&gt;

&lt;h2 id=&quot;explanation&quot;&gt;Explanation&lt;/h2&gt;

&lt;p&gt;I want to make ‘what is a neuron’ concrete in some way, to give you a ‘scaffold’ to build your learning on; I believe that helps.&lt;/p&gt;

&lt;p&gt;So: say you want to define a mathematical formula for ‘how much is a rectangular block of land worth’. It has two inputs: width and length. It might look like this:&lt;/p&gt;

&lt;p&gt;Land price($) = width(m) * length(m) * 200 + 100000&lt;/p&gt;

&lt;p&gt;You could call this a function: value(s) in, value out. A machine learning neuron is just one of these: It takes some input(s), does some maths with them, and outputs a value. And a massive grid of these neurons all connected together can achieve surprisingly complex results.&lt;/p&gt;
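&lt;p&gt;In Rust, that land-price function might look like this (my own illustration of the formula above):&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;// Values in, value out: the land-price formula as a function.
fn land_price(width_m: f64, length_m: f64) -&amp;gt; f64 {
    width_m * length_m * 200.0 + 100000.0
}

fn main() {
    // A 20m x 30m block: 20 * 30 * 200 + 100000 = 220000.
    println!(&quot;${}&quot;, land_price(20.0, 30.0));
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;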

&lt;h2 id=&quot;maths&quot;&gt;Maths&lt;/h2&gt;

&lt;p&gt;Here’s how the maths behind a single neuron works. There’s not much to it:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Net input = Input 1 * Weight 1  +  Input 2 * Weight 2  +  Bias
Output = tanh(Net input)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;spreadsheet&quot;&gt;Spreadsheet&lt;/h2&gt;

&lt;p&gt;Please click &lt;a href=&quot;https://docs.google.com/spreadsheets/d/1wRXAgKiUSwi3ty9zs1K3exoImrgJfSWxyLKXZr8Ba1E/edit?usp=sharing&quot;&gt;here to see the above in spreadsheet form&lt;/a&gt;. I tried embedding a nice JS spreadsheet but it didn’t work on mobile, thus the google sheets link. In case that doesn’t work, it looks like so:&lt;/p&gt;

&lt;div class=&quot;my_spreadsheet_table_is_next&quot;&gt;&lt;/div&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt;A&lt;/th&gt;
      &lt;th&gt;B&lt;/th&gt;
      &lt;th&gt;C&lt;/th&gt;
      &lt;th&gt;D&lt;/th&gt;
      &lt;th&gt;E&lt;/th&gt;
      &lt;th&gt;F&lt;/th&gt;
      &lt;th&gt;G&lt;/th&gt;
      &lt;th&gt;H&lt;/th&gt;
      &lt;th&gt;I&lt;/th&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;th&gt;1&lt;/th&gt;
      &lt;th&gt;Input 1&lt;/th&gt;
      &lt;th&gt;Input 2&lt;/th&gt;
      &lt;th&gt;Weight 1&lt;/th&gt;
      &lt;th&gt;Weight 2&lt;/th&gt;
      &lt;th&gt;Bias&lt;/th&gt;
      &lt;th&gt;Net&lt;/th&gt;
      &lt;th&gt;Output&lt;/th&gt;
      &lt;th&gt;Target&lt;/th&gt;
      &lt;th&gt;Loss&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;2&lt;/td&gt;
      &lt;td&gt;0.9&lt;/td&gt;
      &lt;td&gt;0.8&lt;/td&gt;
      &lt;td&gt;0.7&lt;/td&gt;
      &lt;td&gt;0.6&lt;/td&gt;
      &lt;td&gt;0.5&lt;/td&gt;
      &lt;td&gt;=A2 * C2 + B2 * D2 + E2&lt;/td&gt;
      &lt;td&gt;=tanh(F2)&lt;/td&gt;
      &lt;td&gt;0.8&lt;/td&gt;
      &lt;td&gt;=(G2 - H2)^2&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;style&gt;
    div.my_spreadsheet_table_is_next + table td,th {
        padding: 0.3em;
        border: 1px solid #000;
    }
&lt;/style&gt;

&lt;h2 id=&quot;explanation-1&quot;&gt;Explanation&lt;/h2&gt;

&lt;p&gt;You may be wondering what ‘tanh’ is. It’s a &lt;a href=&quot;https://en.wikipedia.org/wiki/Hyperbolic_functions&quot;&gt;hyperbolic tangent&lt;/a&gt;, which neatly squashes the net and spits out a value between -1 and 1. This is called the ‘activation function’ - there are other options (eg the logistic function) that can be used instead.&lt;/p&gt;
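&lt;p&gt;To see that squashing in action, here’s a tiny Rust demo of mine (not part of the spreadsheet) feeding a range of net values through tanh:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;fn main() {
    for net in [-10.0f64, -1.0, 0.0, 1.0, 10.0] {
        // No matter how extreme the net gets, the output stays within -1..1.
        println!(&quot;tanh({}) = {:.3}&quot;, net, net.tanh());
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;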

&lt;p&gt;Initial values for the weights and bias are random numbers in the range -1..1. They are tweaked in the learning process, which I’ll explain in an upcoming article. The collection of weights and biases is also called the &lt;em&gt;parameters&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Loss is used to calculate how ‘good’ a neural network is at calculating the desired target. It will always be zero or positive, and the closer to zero the better. In this simple example, it is calculated as the square of the output-vs-target delta:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;loss = (output - target)^2
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;obligatory-rust&quot;&gt;Obligatory Rust&lt;/h2&gt;

&lt;p&gt;Because I enjoy fooling around with Rust, here’s a little demo, perhaps this will solidify the concepts from a developer’s perspective:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;struct Neuron {
    input1: f32,
    input2: f32,
    weight1: f32,
    weight2: f32,
    bias: f32,
}

impl Neuron {
    fn net(&amp;amp;self) -&amp;gt; f32 {
        self.input1 * self.weight1 +
        self.input2 * self.weight2 + 
        self.bias
    }
    fn output(&amp;amp;self) -&amp;gt; f32 {
        self.net().tanh()
    }
    fn loss(&amp;amp;self, target: f32) -&amp;gt; f32 {
        let delta = target - self.output();
        delta * delta
    }
}

fn main() {
    let neuron = Neuron {
        input1: 0.1,
        input2: 0.2,
        weight1: 0.3,
        weight2: 0.4,
        bias: 0.5,
    };
    println!(&quot;Net: {:.3}&quot;, neuron.net());
    println!(&quot;Output: {:.3}&quot;, neuron.output());
    println!(&quot;Loss: {:.3}&quot;, neuron.loss(0.5));
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href=&quot;/2024/03/20/neural-networks-2/index.html&quot;&gt;My next article explains how to calculate the gradients of the inputs and weights, as a prerequisite for adjusting them when learning.&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Thanks for reading, hope you found this helpful, at least a tiny bit, God bless!&lt;/p&gt;

&lt;p&gt;Photo by Josh Riemer on Unsplash&lt;/p&gt;
</content>
 </entry>
 
 
</feed>