<feed xmlns="http://www.w3.org/2005/Atom" xml:base="https://blog.ltgt.net">
<title type="html">Thomas Broyer</title>
<link href="/" />
<link rel="self" href="https://blog.ltgt.net/rss.xml" />
<updated>2026-04-17T05:56:11+0000</updated>
<author>
  <name>Thomas Broyer</name>
  <uri>https://github.com/tbroyer</uri>
</author>
<id>http://blog.ltgt.net/rss.xml</id>
<link rel="license" type="application/rdf+xml"
      href="http://creativecommons.org/licenses/by/3.0/rdf" />
<rights type="html"><![CDATA[
Copyright © Thomas Broyer, blog.ltgt.net
]]></rights>


<entry>
  <title type="html">How do HTML event handlers work?</title>
  <link href="/html-event-handlers/" />
  <published>2024-11-05T00:00:00+0000</published>
  <updated>2024-11-10T00:00:00+0000</updated>
  <id>http://blog.ltgt.net/html-event-handlers/</id>
  
  
    <link rel="replies" href="https://dev.to/tbroyer/how-do-html-event-handlers-work-nok/comments" />
  
  <content type="html" xml:base="/html-event-handlers/"><![CDATA[
    <p><a href="https://html.spec.whatwg.org/multipage/webappapis.html#eventhandler">HTML event handlers</a> are those <code>onxxx</code> attributes and properties many of us are used to, but do you know how they actually work?
If you're writing custom elements and would like them to have such event handlers, what would you have to do? And what would you possibly be unable to implement? What differences would there be from <em>native</em> event handlers?</p>
<p>Before diving in: if you just want something usable, I wrote <a href="https://github.com/tbroyer/webfeet" title="The Webfeet library on GitHub">a library</a> that implements all this (<a href="/web-component-properties/">and more</a>) but first <a href="#recap">jump to the conclusion</a> for the limitations; otherwise, read on.</p>
<h2 id="high-level-overview">High-level overview</h2>
<p>First of all, an event handler is a property on an object whose name starts with <code>on</code> (followed by the event type) and whose value is a JS function (or null).
When that object is an element, the element also has a similarly-named attribute whose value will be parsed as JavaScript, with a variable named <code>event</code> whose value will be the current event being handled, and that can return <code>false</code> to cancel the event (how many times have we seen those infamous <code>oncontextmenu=&quot;return false&quot;</code> to <em>disable right click</em>?)</p>
<p>Setting an event handler is equivalent to adding a listener (removing the previous one if any) for the corresponding event type.</p>
<p>Quite simple, right? But the devil is in the details!</p>
<p>(fwiw, there are two special kinds of event handlers, <code>onerror</code> and <code>onbeforeunload</code>, that I won't talk about here.)</p>
<h2 id="in-details">In details</h2>
<p>Let's go through those details the devil hides in (in no particular order).</p>
<h3 id="globality">Globality</h3>
<p>All built-in event handlers on elements are <em>global</em>, and available on every element (actually, every <code>HTMLElement</code>; that excludes SVG and MathML elements).
This includes custom elements, so you won't need to implement, e.g., an <code>onclick</code> yourself: it's built into every element.
This also implies that as new event handlers are added to HTML in the future, they might conflict with your own event handlers for a <em>custom event</em> (this is also true of properties and methods that could later be added to the <code>Node</code>, <code>Element</code> and <code>HTMLElement</code> interfaces though).</p>
<aside role="doc-pullquote presentation" aria-hidden=true>Custom elements already have all "native" event handlers built-in.</aside>
<p>Conversely, this <em>globality</em> isn't something you'll be able to implement for a <em>custom event</em>: you can create an <code>onfoo</code> event handler on your custom element, but you won't be able to put an <code>onfoo</code> on a <code>&lt;div&gt;</code> element and expect it to do anything useful.
(Technically, you possibly could <em>monkey-patch</em> the <code>HTMLElement.prototype</code> and use a <code>MutationObserver</code> to detect the attribute, but you'll still miss attributes on detached elements and, well, <em>monkey-patching</em>… do I need to say more?)</p>
<p>To avoid forward-incompatibility (be future-proof) you might want to name your event handler with a dash or a non-ASCII character in its attribute name, and maybe an uppercase character in its property name.
When <a href="https://github.com/WICG/webcomponents/issues/1029">custom attributes</a> are a thing, then maybe this will also allow having such an attribute globally available on all elements.
I'm not sure it's a good idea though; if you ask me, I'd just use a <em>simple</em> name and hope HTML won't add a conflicting one in the future.</p>
<h3 id="return-value">Return value</h3>
<p>We briefly talked about the return value of the event handler function above: if it returns <code>false</code> then the event will be cancelled.</p>
<p>It happens that we're talking about the exact <code>false</code> value here, not just any <em>falsy</em> value.</p>
<p>Fwiw, by <em>cancelled</em> here, we mean just as if the event handler function had called <code>event.preventDefault()</code>.</p>
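<p>As a rough userland sketch of that rule (the <code>foo</code> event and the <code>listenWithHandler</code> helper are names I made up for illustration, they don't come from any spec or library), the listener wrapping the handler only has to compare the return value to <code>false</code> with strict equality:</p>
<pre class="language-js"><code class="language-js">// Hypothetical helper: wraps a handler function in a listener that
// cancels the event only when the handler returns exactly `false`.
function listenWithHandler(target, type, handler) {
  target.addEventListener(type, function (event) {
    const result = handler.call(this, event);
    if (result === false) {
      event.preventDefault();
    }
  });
}

const target = new EventTarget();

listenWithHandler(target, "foo", () => 0); // falsy, but not `false`
// dispatchEvent() returns false only when the event was cancelled
const notCancelled = target.dispatchEvent(new Event("foo", { cancelable: true }));

listenWithHandler(target, "foo", () => false); // the exact `false` value
const cancelled = !target.dispatchEvent(new Event("foo", { cancelable: true }));

console.log(notCancelled, cancelled);
// → true true</code></pre>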
<h3 id="listener-ordering">Listener ordering</h3>
<p>When you set an event handler, it adds an event listener for the corresponding event, so if you set it in between two <code>element.addEventListener()</code>, it'll be called in between the event listeners.</p>
<p>Now if you set it to another value later on, it won't actually remove the listener for the previous value and add one for the new value; it will actually reuse the existing listener!
This was likely some kind of optimization in browser engines in the past (from the time of Internet Explorer or even Netscape I suppose), but since websites relied on it, it's now part of the spec.</p>
<pre class="language-js"><code class="language-js"><span class="token keyword">const</span> events <span class="token operator">=</span> <span class="token punctuation">[</span><span class="token punctuation">]</span><span class="token punctuation">;</span>
element<span class="token punctuation">.</span><span class="token function">addEventListener</span><span class="token punctuation">(</span><span class="token string">"click"</span><span class="token punctuation">,</span> <span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">=></span> events<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token string">"click 1"</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
element<span class="token punctuation">.</span><span class="token function-variable function">onclick</span> <span class="token operator">=</span> <span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token string">"replaced below"</span><span class="token punctuation">;</span> <span class="token comment">// starts listening</span>
element<span class="token punctuation">.</span><span class="token function">addEventListener</span><span class="token punctuation">(</span><span class="token string">"click"</span><span class="token punctuation">,</span> <span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">=></span> events<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token string">"click 3"</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
element<span class="token punctuation">.</span><span class="token function-variable function">onclick</span> <span class="token operator">=</span> <span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">=></span> events<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token string">"click 2"</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token comment">// doesn't reorder the listeners</span>
element<span class="token punctuation">.</span><span class="token function">click</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
console<span class="token punctuation">.</span><span class="token function">log</span><span class="token punctuation">(</span>events<span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token comment">// → ["click 1", "click 2", "click 3"]</span></code></pre>
<p>If you remove an event handler (set the property to <code>null</code> –wait, there's more about it, see <a href="#non-function-property-values">below</a>– or remove the attribute), the listener will be removed though.
So if for any reason you want to make sure an event handler is added to the end of the listeners list, first remove any previous value, then set your own.</p>
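<p>Here's a minimal sketch of how a userland event handler could mimic this ordering behavior, assuming a hypothetical <code>FooTarget</code> class with an <code>onfoo</code> handler for a <code>foo</code> event (only function values are handled here, to keep things short): a single listener is registered on behalf of the handler and looks up the current handler when the event is dispatched.</p>
<pre class="language-js"><code class="language-js">// Hypothetical sketch: a target with an `onfoo` event handler property.
class FooTarget extends EventTarget {
  #handler = null;
  // The one listener registered on behalf of the handler: it reads the
  // current handler at dispatch time, so replacing the handler never
  // moves the listener within the list of listeners.
  #listener = (event) => {
    if (this.#handler.call(this, event) === false) {
      event.preventDefault();
    }
  };

  get onfoo() {
    return this.#handler;
  }
  set onfoo(value) {
    // Simplification: anything that isn't a function removes the handler
    this.#handler = typeof value === "function" ? value : null;
    if (this.#handler === null) {
      this.removeEventListener("foo", this.#listener);
    } else {
      // No-op (and no reordering) if the listener is already registered
      this.addEventListener("foo", this.#listener);
    }
  }
}

const events = [];
const target = new FooTarget();
target.addEventListener("foo", () => events.push("foo 1"));
target.onfoo = () => "replaced below"; // starts listening
target.addEventListener("foo", () => events.push("foo 3"));
target.onfoo = () => events.push("foo 2"); // doesn't reorder the listeners
target.dispatchEvent(new Event("foo"));
console.log(events);
// → ["foo 1", "foo 2", "foo 3"]</code></pre>
<p>This relies on <code>addEventListener()</code> ignoring a listener that's already registered for the same event type.</p>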
<h3 id="non-function-property-values">Non-function property values</h3>
<p>We talked about <em>setting an event handler</em> and <em>removing an event handler</em> already, but even there there are small details to account for.</p>
<p>When you set an event handler's property, any object value (which includes functions) will <em>set</em> the event handler (and possibly add an event listener).
When an event is dispatched, only function values will have any useful effect, but any object can be used to activate the corresponding event listener (and possibly later be replaced with a function value without reordering the listeners).</p>
<p>Conversely, any non-object, non-function value will be <a href="/web-component-properties/#type-coercion"><em>coerced</em></a> to <code>null</code> and will <em>remove</em> the event handler.</p>
<p>This means that <code>element.onclick = new Number(42)</code> <em>sets</em> the event handler (to some <em>useless</em> value, but still starts listening to the event), and <code>element.onclick = 42</code> <em>removes</em> it (and <code>element.onclick</code> then returns <code>null</code>).</p>
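<p>In a userland setter, that coercion could be approximated like this (<code>coerceEventHandler</code> is a hypothetical helper name, not something defined by the platform):</p>
<pre class="language-js"><code class="language-js">// Hypothetical helper mirroring the coercion described above:
// objects (functions included) are kept, anything else becomes null.
function coerceEventHandler(value) {
  // typeof null is "object" too, but null is returned as-is, which is what we want
  const isObject = typeof value === "object" || typeof value === "function";
  return isObject ? value : null;
}

console.log(coerceEventHandler(42));
// → null
console.log(typeof coerceEventHandler(new Number(42)));
// → "object"
console.log(typeof coerceEventHandler(() => {}));
// → "function"</code></pre>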
<h3 id="invalid-attribute-values-lazy-evaluation">Invalid attribute values, lazy evaluation</h3>
<p>Attribute values are never <code>null</code>, so they always <em>set</em> an event handler (to <em>remove</em> it, remove the attribute).
They're also evaluated lazily: invalid values (that can't be parsed as JavaScript) will be stored internally until they're needed (either the property is read, or an event is dispatched that should execute the event handler), at which point they'll be tentatively evaluated.</p>
<p>When the value cannot be parsed as JavaScript, an error is reported (to <code>window.onerror</code> among others) and the event handler's value is replaced with <code>null</code>, but this <em>won't</em> remove the event listener!
(so yes, you can have an event handler property returning <code>null</code> while having it listen to the event, and not have the listener be reordered when set to another value)</p>
<pre class="language-js"><code class="language-js"><span class="token keyword">const</span> events <span class="token operator">=</span> <span class="token punctuation">[</span><span class="token punctuation">]</span><span class="token punctuation">;</span>
element<span class="token punctuation">.</span><span class="token function">addEventListener</span><span class="token punctuation">(</span><span class="token string">"click"</span><span class="token punctuation">,</span> <span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">=></span> events<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token string">"click 1"</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
element<span class="token punctuation">.</span><span class="token function">setAttribute</span><span class="token punctuation">(</span><span class="token string">"onclick"</span><span class="token punctuation">,</span> <span class="token string">"}"</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token comment">// invalid, but starts listening</span>
console<span class="token punctuation">.</span><span class="token function">log</span><span class="token punctuation">(</span>element<span class="token punctuation">.</span>onclick<span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token comment">// reports an error and logs null, but doesn't stop listening</span>
element<span class="token punctuation">.</span><span class="token function">addEventListener</span><span class="token punctuation">(</span><span class="token string">"click"</span><span class="token punctuation">,</span> <span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">=></span> events<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token string">"click 3"</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
element<span class="token punctuation">.</span><span class="token function-variable function">onclick</span> <span class="token operator">=</span> <span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">=></span> events<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token string">"click 2"</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token comment">// doesn't reorder the listeners</span>
element<span class="token punctuation">.</span><span class="token function">click</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
console<span class="token punctuation">.</span><span class="token function">log</span><span class="token punctuation">(</span>events<span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token comment">// → ["click 1", "click 2", "click 3"]</span></code></pre>
<p>The error reports the original location of the value, that is the <code>setAttribute()</code> call in a script, or even the attribute in the HTML, even though the value is actually evaluated much later.
This is something that I don't think could be implemented in userland.</p>
<h3 id="scope">Scope</h3>
<p>We've said above that an <code>event</code> variable is available in the script set as an attribute value, but that's not the only <em>variable</em> in scope:
every property of the current element is directly readable as a variable as well.
Also in scope are properties of the associated <code>form</code> element if the element is <em>form-associated</em>, and properties of the <code>document</code>.</p>
<p>This means that <code>&lt;a onclick=&quot;alert(href)&quot;&gt;</code> will show the link's target URL, <code>&lt;button onclick=&quot;alert(action)&quot;&gt;</code> will show the form's target URL (as a side effect, you can also refer to other form elements by name), and <code>&lt;span onclick=&quot;alert(location)&quot;&gt;</code> will show the document's URL.</p>
<p>This is more or less equivalent to evaluating the attribute value inside this:</p>
<pre class="language-js"><code class="language-js"><span class="token keyword">with</span> <span class="token punctuation">(</span>document<span class="token punctuation">)</span> <span class="token punctuation">{</span>
  <span class="token keyword">with</span> <span class="token punctuation">(</span>element<span class="token punctuation">.</span>form<span class="token punctuation">)</span> <span class="token punctuation">{</span>
    <span class="token keyword">with</span> <span class="token punctuation">(</span>element<span class="token punctuation">)</span> <span class="token punctuation">{</span>
      <span class="token comment">// evaluate attribute value here</span>
    <span class="token punctuation">}</span>
  <span class="token punctuation">}</span>
<span class="token punctuation">}</span></code></pre>
<p>Related to scope too is the script's <em>base URL</em> that would be used when <code>import()</code>ing modules with a relative URL.
Browsers seem to behave differently already on that: Firefox resolves the path relative to the document URL, whereas Chrome and Safari fail to resolve the path to a URL (as if there was no base URL at all).
I don't think anything can be done here in a userland implementation.</p>
<h3 id="function-source-text">Function source text</h3>
<p>When the event handler has been set through an attribute, the function returned by the event handler property has a very specific <em>source text</em> (which is exposed by its <code>.toString()</code>), which is close to, but not exactly the same as what <code>new Function(&quot;event&quot;, attrValue)</code> would do (declaring a function with an <code>event</code> argument and the attribute's value as its body).</p>
<p>You couldn't directly use <code>new Function(&quot;event&quot;, attrValue)</code> anyway due to the <a href="#scope">scope</a> you need to set up, but there's a trick to control the exact source text of a function so this isn't insurmountable:</p>
<pre class="language-js"><code class="language-js"><span class="token keyword">const</span> handlerName <span class="token operator">=</span> <span class="token string">"onclick"</span>
<span class="token keyword">const</span> attrValue <span class="token operator">=</span> <span class="token string">"return false;"</span>
<span class="token keyword">const</span> fn <span class="token operator">=</span> <span class="token keyword">new</span> <span class="token class-name">Function</span><span class="token punctuation">(</span><span class="token template-string"><span class="token template-punctuation string">`</span><span class="token string">return function </span><span class="token interpolation"><span class="token interpolation-punctuation punctuation">${</span>handlerName<span class="token interpolation-punctuation punctuation">}</span></span><span class="token string">(event) {\n</span><span class="token interpolation"><span class="token interpolation-punctuation punctuation">${</span>attrValue<span class="token interpolation-punctuation punctuation">}</span></span><span class="token string">\n}</span><span class="token template-punctuation string">`</span></span><span class="token punctuation">)</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
console<span class="token punctuation">.</span><span class="token function">log</span><span class="token punctuation">(</span>fn<span class="token punctuation">.</span><span class="token function">toString</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
<span class="token comment">// → "function onclick(event) {\nreturn false;\n}"</span></code></pre>
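<p>Combining this trick with the <a href="#scope">scope</a> chain described earlier could look like the following sketch. The <code>compileHandler</code> helper and its parameters are hypothetical, plain objects stand in for the element and document so that it runs outside a browser, and nothing here protects against an attribute value that closes the function body early:</p>
<pre class="language-js"><code class="language-js">// Hypothetical helper: compiles an attribute value into a handler function
// whose scope chain is the document, then the form (if any), then the element.
function compileHandler(handlerName, attrValue, element, form, doc) {
  // Functions created with new Function() are non-strict, so `with` is allowed
  const factory = new Function(
    "document", "form", "element",
    `with (document) { with (form) { with (element) {
return function ${handlerName}(event) {\n${attrValue}\n}
} } }`
  );
  // `with (null)` would throw, so fall back to an empty object
  return factory(doc, form ?? {}, element);
}

// Plain objects stand in for a real element and document here
const link = { href: "https://example.com/" };
const fn = compileHandler("onfoo", "return href;", link, null, { location: "about:blank" });
console.log(fn.call(link, new Event("foo")));
// → "https://example.com/"
console.log(fn.toString());
// → "function onfoo(event) {\nreturn href;\n}"</code></pre>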
<h3 id="content-security-policy">Content Security Policy</h3>
<p>Last, but not least, event handler attribute values are rejected early by a Content Security Policy (CSP): the violation will be reported as soon as the attribute is tentatively set, and this won't have any effect on the state of the event handler (that could have been set through the property).</p>
<p>The CSP directive that controls event handler attributes is <code>script-src-attr</code> (which falls back to <code>script-src</code> if not set, or to <code>default-src</code>).
When implementing an event handler for a <em>custom event</em> in a custom element, the attribute value will have to be evaluated by scripting though (through <code>new Function()</code> most likely) so it will be controlled by <code>script-src</code> that will have to include either an appropriate hash source, or <code>'unsafe-eval'</code> (notice the difference from native event handlers that would use <code>'unsafe-inline'</code>, not <code>'unsafe-eval'</code>).
Hash sources will be a problem though, because you'll have to evaluate not just the attribute's value, but a script that embeds the attribute's value (to set up the <a href="#scope">scope</a> and <a href="#function-source-text">source text</a>).
And you'd have to actually evaluate both to make sure the attribute value doesn't mess with your evaluated script (think SQL injection but on JavaScript syntax).
This would mean that each event handler attribute would have to have two hash sources allowed in the <code>script-src</code> CSP directive, one of them being dependent on the custom element's implementation of the event handler.</p>
<p>An alternative would be to use a native event handler for parsing, but then the function would have that native event handler's name as its <a href="#function-source-text">function name</a>, and you'd have to make sure to use an element associated with the same form (if not using the custom element directly because e.g. you don't want to trigger mutation observers) to get the appropriate variables <a href="#scope">in scope</a>.</p>
<h2 id="recap">Recap: What does it mean for custom event handlers?</h2>
<p>As seen above, it's not possible to fully implement event handlers for a custom event in a way that would make it indistinguishable from <em>native</em> event handlers:</p>
<ul>
<li>they won't be globally available on every element (except maybe in the future with <em>custom attributes</em>)</li>
<li>a Content Security Policy won't be able to use <code>script-src-attr</code> on those custom event handlers, and if it uses hash sources, chances are that two hash sources will be needed for each attribute value (one of them being dependent on the custom event handler implementation details)</li>
<li>errors emitted by the scripts used as event handler attribute values won't point to the source of the attribute value</li>
<li>an <code>import()</code> with a relative URL, inside an event handler attribute value, won't behave the same as in a <em>native</em> event handler</li>
</ul>
<p>The first point alone (or the first two) might make one reevaluate the need for adding such event handlers at all.
And if you're thinking about only implementing the property, think about what it brings compared to <em>just</em> having users call <code>addEventListener()</code>.</p>
<p>That being said, <a href="https://github.com/tbroyer/webfeet" title="The Webfeet library on GitHub">I did the work</a> (more as an exercise than anything else), so feel free to go ahead and implement event handlers for your custom elements.</p>

  ]]></content>
</entry>

<entry>
  <title type="html">Making Web Component properties behave closer to the platform</title>
  <link href="/web-component-properties/" />
  <published>2024-01-21T00:00:00+0000</published>
  <updated>2024-02-25T00:00:00+0000</updated>
  <id>http://blog.ltgt.net/web-component-properties/</id>
  
  
    <link rel="replies" href="https://dev.to/tbroyer/making-web-component-properties-behave-closer-to-the-platform-c1n/comments" />
  
  <content type="html" xml:base="/web-component-properties/"><![CDATA[
    <p>Built-in HTML elements' properties all share similar behaviors, that don't come <em>for free</em> when you write your own custom elements. Let's see <a href="#what">what</a> those behaviors are, <a href="#why">why</a> you'd want to implement them in your web components, and <a href="#how">how</a> to do it, including how some web component libraries actually don't allow you to mimic those behaviors.</p>
<h2 id="what">Built-in elements' behaviors</h2>
<p>I said it already: built-in elements' properties all share similar behaviors, but there are actually several different such shared behaviors. First, there are properties (known as <em>IDL attributes</em> in the HTML specification) that <em>reflect</em> attributes (also known as <em>content attributes</em>); then there are other properties that are unrelated to attributes. One thing you won't find in built-in elements are properties whose value will change if an attribute changes, but that <em>won't</em> update the attribute value when they are changed themselves (in case you immediately thought of <code>value</code> or <code>checked</code> as counter-examples, the situation is actually a bit more complex: those attributes are reflected by the <code>defaultValue</code> and <code>defaultChecked</code> properties respectively, and the <code>value</code> and <code>checked</code> properties are based on an internal state and behave differently depending on whether the user already interacted with the element or not).</p>
<h3 id="type-coercion">Type coercion</h3>
<p>But I'll start with another aspect that is shared by all of them, whether reflected or not: typing. DOM interfaces are defined using <a href="https://webidl.spec.whatwg.org">WebIDL</a>, which has types and <em>extended attributes</em>, and defines a mapping of those to JavaScript. <a href="https://tc39.es/ecma262/#sec-ecmascript-language-types">Types in JavaScript</a> are rather limited: null, undefined, booleans, IEEE-754 floating-point numbers, big integers, strings, symbols, and objects (including errors, functions, promises, arrays, and typed arrays). WebIDL on the other hand defines, among others, 13 different numeric types (9 integer types and 4 floating point ones) that can be further annotated to change their overflowing behavior, and several string types (including enumerations).</p>
<p>The way those types are experienced by developers is that getting the property will always return a value of the defined type (that's easy, the element <em>owns</em> the value), and setting it (if not read-only) will coerce the assigned value to the defined type. So if you want your custom element to <em>feel</em> like a built-in one, you'll have to define a setter to coerce the value to some specific type. The underlying question is what should happen if someone assigns a value of an unexpected type or outside the expected value space?</p>
<aside role="doc-pullquote presentation" aria-hidden=true>Convert and validate the new value in a property custom setter.</aside>
<p>You probably don't want to use the exact WebIDL coercion rules though, but similar, approximated, rules that will behave the same most of the time and only diverge on some edge cases. The reason is that WebIDL is really weird: for instance, by default, numeric values overflow by wrapping around, so assigning 130 to a <code>byte</code> (whose value space ranges from -128 to 127) will coerce it to… -126! (128 wraps to -128, 129 to -127, and 130 to -126; and by the way 256 wraps to 0; for the curious, <code>BigInt.asIntN</code> and <code>BigInt.asUintN</code> will do such wrapping in JS, but you'll have to convert numbers to <code>BigInt</code> and back); non-integer values assigned to integer types are truncated by default, except when the type is annotated with <code>[Clamp]</code>, in which case they're rounded, with half-way values rounded towards even values (something that only happens <em>natively</em> in JS when setting such non-integer values on a <code>Uint8ClampedArray</code>: <code>Math.round(2.5)</code> is 3, but <code>Uint8ClampedArray.of(2.5)[0]</code> is 2).</p>
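<p>For the curious, here's what that wrapping looks like when going through <code>BigInt</code>. The <code>toByte</code> helper is a name I made up for this sketch, and it assumes finite numbers (WebIDL maps <code>NaN</code> and infinities to 0, whereas <code>BigInt()</code> would throw on them):</p>
<pre class="language-js"><code class="language-js">// Wraps a finite number into the value space of a WebIDL `byte` (-128 to 127):
// truncate, convert to BigInt, keep the low 8 bits as a signed integer, convert back.
function toByte(value) {
  return Number(BigInt.asIntN(8, BigInt(Math.trunc(value))));
}

console.log([127, 128, 129, 130, 256].map(toByte));
// → [127, -128, -127, -126, 0]</code></pre>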
<p>Overall, I feel like, as far as primitive/simple types are concerned, boolean, integers, double (not float), string (WebIDL's <code>DOMString</code>), and enumerations are all that's needed; truncating (or rounding, but with JavaScript rules), and clamping or enforcing ranges for integers. In other words, wrapping integers around is just weird, and what matters is coercing to the appropriate type and value space. Regarding enumerations, they're probably best handled by the reflection rules though (see below), and treated only as strings: no single built-in element has a property of a type that's a WebIDL <code>enum</code>.</p>
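<p>As an illustration of such approximated rules, here's a sketch of a setter that truncates and clamps. The <code>toClampedInteger</code> helper and the <code>Rating</code> class are hypothetical, and a plain class stands in for a custom element so that the code runs outside a browser:</p>
<pre class="language-js"><code class="language-js">// Hypothetical helper: converts to a number, truncates, maps NaN to 0,
// then clamps to the [min, max] range.
function toClampedInteger(value, min, max) {
  const number = Math.trunc(Number(value)) || 0; // NaN (and -0) become 0
  return Math.min(max, Math.max(min, number));
}

class Rating {
  #stars = 0;
  get stars() {
    return this.#stars;
  }
  set stars(value) {
    // coerce on setting, so that the getter always returns an integer from 0 to 5
    this.#stars = toClampedInteger(value, 0, 5);
  }
}

const rating = new Rating();
rating.stars = "3.7";
console.log(rating.stars); // → 3
rating.stars = 130;
console.log(rating.stars); // → 5
rating.stars = "many";
console.log(rating.stars); // → 0</code></pre>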
<h3 id="reflected-properties">Reflected properties</h3>
<p>Now let's get back to reflected properties: most properties of built-in elements <a href="https://html.spec.whatwg.org/multipage/common-dom-interfaces.html#reflecting-content-attributes-in-idl-attributes" title="HTML Living Standard: Reflecting content attributes in IDL attributes">reflect attributes</a> or similarly (but with specific rules) correspond to an attribute and change its value when set; non-reflected properties are those that either expose some internal state (e.g. the current value or validation state of a form field), computed value (from the DOM, such as the <code>selectedIndex</code> of a <code>select</code>, or the <code>cellIndex</code> of a table cell) or direct access to DOM elements (elements of a form, rows of a table, etc.), or that access other reflected properties with a transformed value (such as the <code>valueAsDate</code> and <code>valueAsNumber</code> of <code>input</code>). So if you want your custom element to <em>feel</em> like a built-in one, you'll want to use similar <em>reflection</em> wherever appropriate.</p>
<aside role="doc-pullquote presentation" aria-hidden=true>Have your properties reflect attributes by default.</aside>
<p>The way reflection is defined is that the source of truth is the attribute value: getting the property will actually parse the attribute value, and setting the property will <em>stringify</em> the value into the attribute. Note that this means possibly setting the attribute to an <em>invalid</em> value that will be <em>corrected</em> by the getter. An example of this is setting the <code>type</code> property of an <code>input</code> element to an unknown value: it will be reflected in the attribute as-is, but the getter will correct it to <code>text</code>. Another example where this is required behavior is with dependent attributes like those of <code>progress</code> or <code>meter</code> elements: without this you'd have to be very careful setting properties in the <em>right order</em> to avoid invalid combinations and having your set value immediately rewritten, but this behavior makes it possible to update properties in any order as the interactions between them are resolved internally and exposed by the getters: you can for example set the <code>value</code> to a value greater than <code>max</code> (on getting, <code>value</code> would be normalized to its default value) and then update the <code>max</code> (on getting, <code>value</code> could now return the value you previously set, because it wasn't actually rewritten on setting). Actually, these are not <em>technically</em> reflected then as they have specific rules, but at least they're consistent with <em>actual</em> reflected properties; for the purpose of this article, I'll consider them as reflected properties though.</p>
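<p>To make the <code>type</code> example more concrete, here's a sketch of a reflected property whose getter corrects unknown values. <code>FakeElement</code>, <code>FakeInput</code> and the shortened list of known types are all made up for the example: the fake element only mimics attribute storage so that the code runs outside a browser, whereas a real custom element would extend <code>HTMLElement</code>.</p>
<pre class="language-js"><code class="language-js">// Minimal stand-in for an element's attribute storage
class FakeElement {
  #attributes = new Map();
  getAttribute(name) {
    return this.#attributes.has(name) ? this.#attributes.get(name) : null;
  }
  setAttribute(name, value) {
    this.#attributes.set(name, String(value));
  }
}

const KNOWN_TYPES = ["text", "number", "checkbox"]; // hypothetical subset

class FakeInput extends FakeElement {
  get type() {
    // the attribute is the source of truth: parse (and correct) it on every read
    const value = (this.getAttribute("type") ?? "").toLowerCase();
    return KNOWN_TYPES.includes(value) ? value : "text";
  }
  set type(value) {
    // stringify into the attribute as-is, even if the value is unknown
    this.setAttribute("type", value);
  }
}

const input = new FakeInput();
input.type = "bogus";
console.log(input.getAttribute("type")); // → "bogus"
console.log(input.type); // → "text"</code></pre>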
<p>This is at least how it <em>theoretically</em> works; in practice, the parsed value can be <em>cached</em> to avoid parsing every time the property is read; but note that there can be several properties reflecting the same attribute (the best-known one probably being <code>className</code> and <code>classList</code> both reflecting the <code>class</code> attribute). Reflected properties can also have additional options, depending on their type, that will change the behavior of the getter and setter, not unlike WebIDL extended attributes.</p>
<p>Also note that HTML only defines reflection for a limited set of types (if looking only at primitive/simple types, only non-nullable and nullable strings and enumerations, <code>long</code>, <code>unsigned long</code>, and <code>double</code> are covered, and none of the narrower integer types, big integers, or the <code>unrestricted double</code> that allows <code>NaN</code> and infinity).</p>
<p>You can see how Mozilla tests the compliance of their built-in elements
<a href="https://github.com/mozilla/gecko-dev/blob/master/dom/html/test/reflect.js">in the Gecko repository</a> (the <code>ok</code> and <code>is</code> assertions are defined in their <a href="https://github.com/mozilla/gecko-dev/blob/master/testing/mochitest/tests/SimpleTest/SimpleTest.js"><code>SimpleTest</code></a> testing framework). And here's the Web Platform Tests' <a href="https://github.com/web-platform-tests/wpt/blob/master/html/dom/reflection.js">reflection harness</a>, with data for each built-in element in sibling files, that <a href="https://wpt.fyi/results/html/dom">almost every browser passes</a>.</p>
<h3 id="events">Events</h3>
<p>Most direct changes to properties and attributes don't fire events: user actions or method calls will both update a property <strong>and</strong> fire an event, but changing a property programmatically generally won't fire any event. There are a few exceptions though: the events of type <code>ToggleEvent</code> fired by changes to <a href="https://html.spec.whatwg.org/multipage/popover.html#the-popover-attribute%3Aconcept-element-attributes-change-ext">the <code>popover</code> attribute</a> or <a href="https://html.spec.whatwg.org/multipage/interactive-elements.html#the-details-element%3Aconcept-element-attributes-change-ext">the <code>open</code> attribute of <code>details</code> elements</a>, or the <code>select</code> event when changing the <a href="https://html.spec.whatwg.org/multipage/form-control-infrastructure.html#textFieldSelection%3Adom-textarea%2Finput-selectionstart-2" title="HTML Living Standard: selectionStart attribute's setter"><code>selectionStart</code></a>, <a href="https://html.spec.whatwg.org/multipage/form-control-infrastructure.html#textFieldSelection%3Adom-textarea%2Finput-selectionend-3" title="HTML Living Standard: selectionEnd attribute's setter"><code>selectionEnd</code></a> or <a href="https://html.spec.whatwg.org/multipage/form-control-infrastructure.html#textFieldSelection:dom-textarea/input-selectiondirection-4" title="HTML Living Standard: selectionDirection attribute's setter"><code>selectionDirection</code></a> properties of <code>input</code> and <code>textarea</code> elements (if you know of others, let me know); but notably changing the value of a form element programmatically won't fire a <code>change</code> or <code>input</code> event. So if you want your custom element to <em>feel</em> like a built-in one, don't fire events from your property setters or other attribute changed callbacks, but fire an event <em>when</em> (just after) you programmatically change them.</p>
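<p>As a rough sketch of that rule, the following uses <code>EventTarget</code> as a stand-in for <code>HTMLElement</code> so that it can run outside a browser. The <code>FakeToggle</code> class and its <code>simulateUserClick</code> method are hypothetical names; in an actual component that method would be the body of a real <code>click</code> listener:</p>
<pre class="language-js"><code class="language-js">// Hypothetical component; EventTarget stands in for HTMLElement here.
class FakeToggle extends EventTarget {
  #checked = false;

  get checked() {
    return this.#checked;
  }
  set checked(value) {
    // Programmatic change: update the state, fire nothing.
    this.#checked = Boolean(value);
  }

  // Would be called from a real "click" listener in an actual component.
  simulateUserClick() {
    this.#checked = !this.#checked;
    // User action: fire the event just after changing the state.
    this.dispatchEvent(new Event("change"));
  }
}

const toggle = new FakeToggle();
let changeCount = 0;
toggle.addEventListener("change", () => changeCount++);

toggle.checked = true; // no event
toggle.simulateUserClick(); // one "change" event
console.log(toggle.checked, changeCount); // logs: false 1</code></pre>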
<aside role="doc-pullquote presentation" aria-hidden=true>Don't fire events from your property setters or other attribute changed callbacks.</aside>
<h2 id="why">Why you'd want to implement those</h2>
<p>If you (your team, your company) are the only users of the web components (e.g. building an application out of web components, or an <em>internal</em> library of reusable components), then OK, don't use reflection if you don't need it: you'll be the only user anyway, so nobody will complain. If you're publicly sharing those components, then my opinion is that, following the principle of least astonishment, you should aim at behaving more like built-in elements, and reflect attributes.</p>
<p>Similarly, for type coercions, if you're the only users of the web components, it's OK to only rely on TypeScript (or Flow or whichever type-checker) to make sure you always pass values of the appropriate type to your properties (and methods), but if you share them publicly then you should in my opinion coerce or validate inputs, in which case you'd want to follow the principle of least astonishment as well, and thus use rules similar to WebIDL and reflection behaviors. This is particularly true for a library that can be used without specific tooling, which is generally the case for custom elements.</p>
<p>For example, all the following design systems can be used without tooling (some of them provide ready-to-use bundles, others can be used through import maps): Google's <a href="https://github.com/material-components/material-web/discussions/5239">Material Web</a>, Microsoft's <a href="https://github.com/microsoft/fluentui/tree/master/packages/web-components">Fluent UI</a>, IBM's <a href="https://carbondesignsystem.com/developing/frameworks/web-components/">Carbon</a>, Adobe's <a href="https://opensource.adobe.com/spectrum-web-components/">Spectrum</a>, Nordhealth's <a href="https://nordhealth.design/web-components/">Nord</a>, <a href="https://shoelace.style/">Shoelace</a>, etc.</p>
<h2 id="how">How to implement them</h2>
<p>Now that we've seen <a href="#what">what</a> we'd want to implement, and <a href="#why">why</a> we'd want to implement it, let's see <em>how</em> to do it. First without, and then <a href="#libraries">with</a> libraries.</p>
<p>I started collecting implementations that <em>strictly</em> follow (as an exercise, not as a goal) the above rules in <a href="https://github.com/tbroyer/custom-elements-reflection-tests">a GitHub repository</a> (strictly because it directly reuses the above-mentioned Gecko and Web Platform Tests harnesses).</p>
<h3 id="vanilla">Vanilla implementation</h3>
<p>In a <em>vanilla</em> custom element, things are rather straightforward:</p>
<pre class="language-js"><code class="language-js"><span class="token keyword">class</span> <span class="token class-name">MyElement</span> <span class="token keyword">extends</span> <span class="token class-name">HTMLElement</span> <span class="token punctuation">{</span>
  <span class="token keyword">get</span> <span class="token function">reflected</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
    <span class="token keyword">const</span> strVal <span class="token operator">=</span> <span class="token keyword">this</span><span class="token punctuation">.</span><span class="token function">getAttribute</span><span class="token punctuation">(</span><span class="token string">"reflected"</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token keyword">return</span> <span class="token function">parseValue</span><span class="token punctuation">(</span>strVal<span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
  <span class="token keyword">set</span> <span class="token function">reflected</span><span class="token punctuation">(</span><span class="token parameter">value</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
    <span class="token keyword">const</span> newValue <span class="token operator">=</span> <span class="token function">coerceType</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token comment">// …there might be additional validations here…</span>
    <span class="token keyword">this</span><span class="token punctuation">.</span><span class="token function">setAttribute</span><span class="token punctuation">(</span><span class="token string">"reflected"</span><span class="token punctuation">,</span> <span class="token function">stringifyValue</span><span class="token punctuation">(</span>newValue<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
<span class="token punctuation">}</span></code></pre>
<p>or with intermediate caching (note that the setter is identical: setting the attribute will trigger the <code>attributeChangedCallback</code>, which will close the loop):</p>
<pre class="language-js"><code class="language-js"><span class="token keyword">class</span> <span class="token class-name">MyElement</span> <span class="token keyword">extends</span> <span class="token class-name">HTMLElement</span> <span class="token punctuation">{</span>
  #reflected<span class="token punctuation">;</span>

  <span class="token keyword">get</span> <span class="token function">reflected</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
    <span class="token keyword">return</span> <span class="token keyword">this</span><span class="token punctuation">.</span>#reflected<span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
  <span class="token keyword">set</span> <span class="token function">reflected</span><span class="token punctuation">(</span><span class="token parameter">value</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
    <span class="token keyword">const</span> newValue <span class="token operator">=</span> <span class="token function">coerceType</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token comment">// …there might be additional validations here…</span>
    <span class="token keyword">this</span><span class="token punctuation">.</span><span class="token function">setAttribute</span><span class="token punctuation">(</span><span class="token string">"reflected"</span><span class="token punctuation">,</span> <span class="token function">stringifyValue</span><span class="token punctuation">(</span>newValue<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>

  <span class="token keyword">static</span> <span class="token keyword">get</span> <span class="token function">observedAttributes</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
    <span class="token keyword">return</span> <span class="token punctuation">[</span> <span class="token string">"reflected"</span> <span class="token punctuation">]</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
  <span class="token function">attributeChangedCallback</span><span class="token punctuation">(</span><span class="token parameter">name<span class="token punctuation">,</span> oldValue<span class="token punctuation">,</span> newValue</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
    <span class="token comment">// Note: in this case, we know it can only be the attribute named "reflected"</span>
    <span class="token keyword">this</span><span class="token punctuation">.</span>#reflected <span class="token operator">=</span> <span class="token function">parseValue</span><span class="token punctuation">(</span>newValue<span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
<span class="token punctuation">}</span></code></pre>
<p>And for a non-reflected property (here, a read-write property representing an internal state):</p>
<pre class="language-js"><code class="language-js"><span class="token keyword">class</span> <span class="token class-name">MyElement</span> <span class="token keyword">extends</span> <span class="token class-name">HTMLElement</span> <span class="token punctuation">{</span>
  #nonReflected<span class="token punctuation">;</span>
  <span class="token keyword">get</span> <span class="token function">nonReflected</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
    <span class="token keyword">return</span> <span class="token keyword">this</span><span class="token punctuation">.</span>#nonReflected<span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
  <span class="token keyword">set</span> <span class="token function">nonReflected</span><span class="token punctuation">(</span><span class="token parameter">value</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
    <span class="token keyword">const</span> newValue <span class="token operator">=</span> <span class="token function">coerceType</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token comment">// …there might be additional validations here…</span>
    <span class="token keyword">this</span><span class="token punctuation">.</span>#nonReflected <span class="token operator">=</span> newValue<span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
<span class="token punctuation">}</span></code></pre>
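<p>The <code>coerceType</code>, <code>parseValue</code> and <code>stringifyValue</code> helpers are left undefined in the snippets above. As an illustration only, here is what they could roughly look like for a property of type <code>long</code>. The parsing is a simplified take on HTML's rules for parsing integers, and the <code>defaultValue</code> parameter is an assumption of this sketch:</p>
<pre class="language-js"><code class="language-js">// Hypothetical helpers for a property of WebIDL type "long".

// WebIDL conversion to "long": ToNumber, then wrap to a 32-bit signed integer.
function coerceType(value) {
  return value | 0;
}

// Simplified take on HTML's "rules for parsing integers"; invalid or
// out-of-range values (and a missing attribute) fall back to the default.
function parseValue(strVal, defaultValue = 0) {
  if (strVal === null) return defaultValue;
  const match = /^[\t\n\f\r ]*([+-]?[0-9]+)/.exec(strVal);
  if (match === null) return defaultValue;
  const parsed = Number(match[1]);
  // (x | 0) === x only holds for integers that fit in 32 signed bits.
  return (parsed | 0) === parsed ? (parsed | 0) : defaultValue;
}

function stringifyValue(value) {
  return String(value);
}

console.log(coerceType("42"), coerceType(3.7), coerceType("abc")); // logs: 42 3 0
console.log(parseValue(" 12px"), parseValue("foo"), parseValue(null)); // logs: 12 0 0</code></pre>
<p>Note how the two directions are not symmetrical: coercion of the property value wraps out-of-range numbers, whereas parsing of the attribute value falls back to the default.</p>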
<p>Because many rules are common to many attributes (the <code>coerceType</code> operation is defined by WebIDL, or using similar rules, and the HTML specification defines a handful of <em>microsyntaxes</em> for the <code>parseValue</code> and <code>stringifyValue</code> operations), those could be packaged up in a helper library. And with decorators <a href="https://github.com/tc39/proposal-decorators">coming to ECMAScript</a> (and already available in TypeScript), those could be greatly simplified:</p>
<pre class="language-js"><code class="language-js"><span class="token keyword">class</span> <span class="token class-name">MyElement</span> <span class="token keyword">extends</span> <span class="token class-name">HTMLElement</span> <span class="token punctuation">{</span>
  @reflectInt accessor reflected<span class="token punctuation">;</span>
  @int accessor nonReflected<span class="token punctuation">;</span>
<span class="token punctuation">}</span></code></pre>
<p>I actually built such a library, mostly as an exercise (and I already learned a lot, most of the above details actually). It's currently not published on NPM but you can find it <a href="https://github.com/tbroyer/webfeet" title="The Webfeet library on GitHub">on GitHub</a>.</p>
<h3 id="libraries">With a library</h3>
<p>Surprisingly, web component libraries don't really help us here.</p>
<p>First, like many libraries nowadays, most expect people to just pass values of the appropriate types (relying on type checking through TypeScript) and basically leave you handling everything, including how to behave in the presence of unexpected values. While that's OK in a range of situations, as we've seen <a href="#why">above</a>, there are limits to this approach, and it's unfortunate that they don't provide tools to at least make type coercion easier.</p>
<p>Regarding reflected properties, most libraries tend to discourage you from doing it, while (fortunately!) supporting it, if only minimally.</p>
<p>All libraries (that I've looked at) support observed attributes though (changing the attribute value updates the property, but not the other way around), and most default to this behavior.</p>
<p>Now let's dive into the <em>how-to</em> with Lit, <a href="#fast">FAST</a>, and then <a href="#stencil">Stencil</a> (other libraries left as a so-called exercise for the reader).</p>
<h4 id="lit">With Lit</h4>
<p>By default, <a href="https://lit.dev/docs/components/properties/">Lit reactive properties</a> (annotated with <code>@property()</code>) observe the attribute of the same (or configured) name, using a converter to parse the value if needed (by default only handling numbers through a plain JavaScript number coercion, booleans, strings, or possibly objects or arrays through <code>JSON.parse()</code>; but a custom converter can be given). If your property is not associated to any attribute (but needs to be reactive to trigger a render when changed), then you can annotate it with <code>@property({ attribute: false })</code> or <code>@state()</code> (the latter is meant for internal state though, i.e. private properties).</p>
<p>To make a reactive property <a href="https://lit.dev/docs/components/properties/#reflected-attributes">reflect an attribute</a>, you'll add <code>reflect: true</code> to the <code>@property()</code> options, and Lit will use the converter to stringify the value too. This won't be done immediately though, but only as part of Lit's reactive update cycle. This timing is a slight deviation compared to built-in elements that's probably acceptable, but it makes it harder to implement some reflection rules (those that set the attribute to a different value than the one returned by the getter) as the converter will always be called with the property value (returned by the getter, so after normalization). For a component similar to <code>progress</code> or <code>meter</code> with dependent properties, Lit recommends correcting the values in a <code>willUpdate</code> callback (this is where you'd check whether the <code>value</code> is valid with respect to the <code>max</code> for instance, and possibly overwrite its value to bring it in-range); this means that attributes will have the corrected value, and this requires users to update all properties in the same <em>event loop</em> (which will most likely be the case anyway).</p>
<p>It should be noted that, surprisingly, Lit <em>actively</em> discourages reflecting attributes:</p>
<blockquote>
<p>Attributes should generally be considered input to the element from its owner, rather than under control of the element itself, so reflecting properties to attributes should be done sparingly. It's necessary today for cases like styling and accessibility, but this is likely to change as the platform adds features like the <code>:state</code> pseudo selector and the Accessibility Object Model, which fill these gaps.</p>
</blockquote>
<p>Needless to say, I disagree.</p>
<p>For type coercion and validation, Lit allows you to have <a href="https://lit.dev/docs/components/properties/#accessors-custom" title="Lit documentation: Reactive properties: creating custom property accessors">your own accessors</a> (and version 3 makes it <a href="https://lit.dev/docs/v3/releases/upgrade/#updates-to-lit-decorators" title="Lit 3 upgrade guide: Updates to Lit decorators">even easier</a>), so everything's ok here, particularly for non-reflected properties:</p>
<pre class="language-js"><code class="language-js"><span class="token keyword">class</span> <span class="token class-name">MyElement</span> <span class="token keyword">extends</span> <span class="token class-name">LitElement</span> <span class="token punctuation">{</span>
  #nonReflected<span class="token punctuation">;</span>
  <span class="token keyword">get</span> <span class="token function">nonReflected</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
    <span class="token keyword">return</span> <span class="token keyword">this</span><span class="token punctuation">.</span>#nonReflected<span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
  @<span class="token function">state</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
  <span class="token keyword">set</span> <span class="token function">nonReflected</span><span class="token punctuation">(</span><span class="token parameter">value</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
    <span class="token keyword">const</span> newValue <span class="token operator">=</span> <span class="token function">coerceType</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token comment">// …there might be additional validations here…</span>
    <span class="token keyword">this</span><span class="token punctuation">.</span>#nonReflected <span class="token operator">=</span> newValue<span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
<span class="token punctuation">}</span></code></pre>
<p>For those cases where you'd want the attribute to possibly have an <em>invalid</em> value (to be corrected by the property getter), it would mean using a non-reactive property wrapping a private reactive property (this assumes Lit won't flag them as errors in future versions), and parsing the value in its getter:</p>
<pre class="language-js"><code class="language-js"><span class="token keyword">class</span> <span class="token class-name">MyElement</span> <span class="token keyword">extends</span> <span class="token class-name">LitElement</span> <span class="token punctuation">{</span>
  @<span class="token function">property</span><span class="token punctuation">(</span><span class="token punctuation">{</span> <span class="token literal-property property">attribute</span><span class="token operator">:</span> <span class="token string">"reflected"</span><span class="token punctuation">,</span> <span class="token literal-property property">reflect</span><span class="token operator">:</span> <span class="token boolean">true</span> <span class="token punctuation">}</span><span class="token punctuation">)</span>
  accessor #reflected <span class="token operator">=</span> <span class="token string">""</span><span class="token punctuation">;</span>

  <span class="token keyword">get</span> <span class="token function">reflected</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
    <span class="token keyword">return</span> <span class="token function">parseValue</span><span class="token punctuation">(</span><span class="token keyword">this</span><span class="token punctuation">.</span>#reflected<span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
  <span class="token keyword">set</span> <span class="token function">reflected</span><span class="token punctuation">(</span><span class="token parameter">value</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
    <span class="token keyword">const</span> newValue <span class="token operator">=</span> <span class="token function">coerceType</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token comment">// …there might be additional validations here…</span>
    <span class="token keyword">this</span><span class="token punctuation">.</span>#reflected <span class="token operator">=</span> <span class="token function">stringifyValue</span><span class="token punctuation">(</span>newValue<span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
<span class="token punctuation">}</span></code></pre>
<p>or with intermediate caching (note that the setter is identical):</p>
<pre class="language-js"><code class="language-js"><span class="token keyword">class</span> <span class="token class-name">MyElement</span> <span class="token keyword">extends</span> <span class="token class-name">LitElement</span> <span class="token punctuation">{</span>
  @<span class="token function">property</span><span class="token punctuation">(</span><span class="token punctuation">{</span> <span class="token literal-property property">attribute</span><span class="token operator">:</span> <span class="token string">"reflected"</span><span class="token punctuation">,</span> <span class="token literal-property property">reflect</span><span class="token operator">:</span> <span class="token boolean">true</span> <span class="token punctuation">}</span><span class="token punctuation">)</span>
  accessor #reflected <span class="token operator">=</span> <span class="token string">""</span><span class="token punctuation">;</span>

  #parsedReflected <span class="token operator">=</span> <span class="token string">""</span><span class="token punctuation">;</span>
  <span class="token keyword">get</span> <span class="token function">reflected</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
    <span class="token keyword">return</span> <span class="token keyword">this</span><span class="token punctuation">.</span>#parsedReflected<span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
  <span class="token keyword">set</span> <span class="token function">reflected</span><span class="token punctuation">(</span><span class="token parameter">value</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
    <span class="token keyword">const</span> newValue <span class="token operator">=</span> <span class="token function">coerceType</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token comment">// …there might be additional validations here…</span>
    <span class="token keyword">this</span><span class="token punctuation">.</span>#reflected <span class="token operator">=</span> <span class="token function">stringifyValue</span><span class="token punctuation">(</span>newValue<span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>

  <span class="token function">willUpdate</span><span class="token punctuation">(</span><span class="token parameter">changedProperties</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
    <span class="token keyword">if</span> <span class="token punctuation">(</span>changedProperties<span class="token punctuation">.</span><span class="token function">has</span><span class="token punctuation">(</span><span class="token string">"#reflected"</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
      <span class="token keyword">this</span><span class="token punctuation">.</span>#parsedReflected <span class="token operator">=</span> <span class="token function">parseValue</span><span class="token punctuation">(</span><span class="token keyword">this</span><span class="token punctuation">.</span>#reflected<span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>
  <span class="token punctuation">}</span>
<span class="token punctuation">}</span></code></pre>
<p>It might actually be easier to directly set the attribute from the setter (and as a bonus behaving closer to built-in elements) and only rely on an <em>observed property</em> from Lit's point of view (setting the attribute will trigger <code>attributeChangedCallback</code> and thus Lit's <em>observation</em> code that will use the converter and then set the property):</p>
<pre class="language-js"><code class="language-js"><span class="token keyword">class</span> <span class="token class-name">MyElement</span> <span class="token keyword">extends</span> <span class="token class-name">LitElement</span> <span class="token punctuation">{</span>
  @<span class="token function">property</span><span class="token punctuation">(</span><span class="token punctuation">{</span>
    <span class="token literal-property property">attribute</span><span class="token operator">:</span> <span class="token string">"reflected"</span><span class="token punctuation">,</span>
    <span class="token function-variable function">converter</span><span class="token operator">:</span> <span class="token punctuation">(</span><span class="token parameter">value</span><span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token function">parseValue</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">,</span>
  <span class="token punctuation">}</span><span class="token punctuation">)</span>
  accessor #reflected <span class="token operator">=</span> <span class="token string">""</span><span class="token punctuation">;</span>

  <span class="token keyword">get</span> <span class="token function">reflected</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
    <span class="token keyword">return</span> <span class="token keyword">this</span><span class="token punctuation">.</span>#reflected<span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
  <span class="token keyword">set</span> <span class="token function">reflected</span><span class="token punctuation">(</span><span class="token parameter">value</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
    <span class="token keyword">const</span> newValue <span class="token operator">=</span> <span class="token function">coerceType</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token comment">// …there might be additional validations here…</span>
    <span class="token keyword">this</span><span class="token punctuation">.</span><span class="token function">setAttribute</span><span class="token punctuation">(</span><span class="token string">"reflected"</span><span class="token punctuation">,</span> <span class="token function">stringifyValue</span><span class="token punctuation">(</span>newValue<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
<span class="token punctuation">}</span></code></pre>
<p>Note that this is actually very similar to the approach in the vanilla implementation above but using Lit's own lifecycle hooks. It should also be noted that for a <code>USVString</code> that contains a URL (where the attribute value is resolved to a URL relative to the document base URI) the value needs to be processed in the getter (as it depends on an external state –the document base URI– that could change independently from the element).</p>
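<p>Here is a small sketch of why URL-valued properties have to be resolved in the getter rather than cached. <code>FakeLink</code> and its writable <code>baseURI</code> field are hypothetical stand-ins so that the snippet runs outside a browser (a real element would presumably read its own <code>baseURI</code>); only the standard <code>URL</code> constructor is used for the resolution:</p>
<pre class="language-js"><code class="language-js">// Hypothetical stand-in for an element with a URL-valued "src" attribute.
class FakeLink {
  baseURI = "https://example.com/a/"; // stand-in for the document base URI
  #src = null; // stand-in for the "src" attribute value

  setAttribute(name, value) {
    if (name === "src") this.#src = String(value);
  }

  get src() {
    if (this.#src === null) return "";
    try {
      // Resolve on every read: the base may have changed since last time.
      return new URL(this.#src, this.baseURI).href;
    } catch {
      // Unparsable URLs are returned as-is.
      return this.#src;
    }
  }
}

const link = new FakeLink();
link.setAttribute("src", "img.png");
console.log(link.src); // logs: https://example.com/a/img.png
link.baseURI = "https://example.com/b/";
console.log(link.src); // logs: https://example.com/b/img.png</code></pre>
<p>Had the resolved URL been computed once when the attribute changed, the second read would have returned a stale value.</p>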
<details>
<summary>A previous version of this article contained a different implementation that happened to be broken.</summary>
<pre class="language-js"><code class="language-js"><span class="token keyword">class</span> <span class="token class-name">MyElement</span> <span class="token keyword">extends</span> <span class="token class-name">LitElement</span> <span class="token punctuation">{</span>
  #reflected <span class="token operator">=</span> <span class="token string">""</span><span class="token punctuation">;</span>
  <span class="token keyword">get</span> <span class="token function">reflected</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
    <span class="token keyword">return</span> <span class="token keyword">this</span><span class="token punctuation">.</span>#reflected<span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
  @<span class="token function">property</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
  <span class="token keyword">set</span> <span class="token function">reflected</span><span class="token punctuation">(</span><span class="token parameter">value</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
    <span class="token keyword">const</span> newValue <span class="token operator">=</span> <span class="token function">coerceType</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token comment">// …there might be additional validations here…</span>
    <span class="token keyword">const</span> stringValue <span class="token operator">=</span> <span class="token function">stringifyValue</span><span class="token punctuation">(</span>newValue<span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token comment">// XXX: there might be a more optimized way</span>
    <span class="token comment">// than stringifying and then parsing</span>
    <span class="token keyword">this</span><span class="token punctuation">.</span>#reflected <span class="token operator">=</span> <span class="token function">parseValue</span><span class="token punctuation">(</span>stringValue<span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token comment">// Avoid unnecessarily triggering attributeChangedCallback</span>
    <span class="token comment">// that would reenter that setter.</span>
    <span class="token keyword">if</span> <span class="token punctuation">(</span><span class="token keyword">this</span><span class="token punctuation">.</span><span class="token function">getAttribute</span><span class="token punctuation">(</span><span class="token string">"reflected"</span><span class="token punctuation">)</span> <span class="token operator">!==</span> stringValue<span class="token punctuation">)</span> <span class="token punctuation">{</span>
      <span class="token keyword">this</span><span class="token punctuation">.</span><span class="token function">setAttribute</span><span class="token punctuation">(</span><span class="token string">"reflected"</span><span class="token punctuation">,</span> stringValue<span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>
  <span class="token punctuation">}</span>
<span class="token punctuation">}</span></code></pre>
<p>This implementation would for instance have the setter called with <code>null</code> when the attribute is removed, which actually needs to behave differently than user code calling the setter with <code>null</code>: in the former case the property should revert to its default value, in the latter case that <code>null</code> would be coerced to the string <code>&quot;null&quot;</code> or the numeric value <code>0</code> and the attribute would be added back with that value.</p>
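<p>The difference between the two kinds of <code>null</code> can be sketched with hypothetical versions of the helpers for a numeric (32-bit integer) property; the default value of <code>42</code> is an assumption picked only to make it distinguishable from <code>0</code>, and the parsing is a simplification of the HTML rules.</p>

```javascript
// Hypothetical helpers for a reflected 32-bit integer property (simplified sketch).
const DEFAULT_VALUE = 42;

// Setter side: coerce any JS value to a 32-bit signed integer (ToInt32).
function coerceType(value) {
  return value | 0;
}

function stringifyValue(value) {
  return String(value);
}

// Attribute side: `null` means the attribute is absent.
function parseValue(attributeValue) {
  if (attributeValue === null) return DEFAULT_VALUE;
  const match = /^[ \t\n\f\r]*([+-]?[0-9]+)/.exec(attributeValue);
  if (match === null) return DEFAULT_VALUE;
  const parsed = parseInt(match[1], 10);
  // Values that don't fit in 32 bits fall back to the default.
  return parsed === (parsed | 0) ? parsed : DEFAULT_VALUE;
}

// Attribute removed: the property reverts to its default value.
console.log(parseValue(null)); // 42

// User code calling the setter with null: null is coerced to 0,
// and the attribute is added back with the value "0".
const attributeValue = stringifyValue(coerceType(null));
console.log(attributeValue); // 0
console.log(parseValue(attributeValue)); // 0
```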
</details>
<p>If we're OK only reflecting valid values to attributes, then we can fully use converters but things aren't necessarily simpler (we still need the custom setter for type coercion and validation, and marking the internal property as reactive to avoid triggering the custom setter when the attribute changes; we don't directly deal with the attribute but we now have to <em>normalize</em> the value in the setter in the same way as stringifying it to the attribute and parsing it back, to have the getter return the appropriate value):</p>
<pre class="language-js"><code class="language-js"><span class="token keyword">const</span> customConverter <span class="token operator">=</span> <span class="token punctuation">{</span>
  <span class="token function">fromAttribute</span><span class="token punctuation">(</span><span class="token parameter">value</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
    <span class="token keyword">return</span> <span class="token function">parseValue</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span><span class="token punctuation">,</span>
  <span class="token function">toAttribute</span><span class="token punctuation">(</span><span class="token parameter">value</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
    <span class="token keyword">return</span> <span class="token function">stringifyValue</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span><span class="token punctuation">,</span>
<span class="token punctuation">}</span><span class="token punctuation">;</span>

<span class="token keyword">class</span> <span class="token class-name">MyElement</span> <span class="token keyword">extends</span> <span class="token class-name">LitElement</span> <span class="token punctuation">{</span>
  @<span class="token function">property</span><span class="token punctuation">(</span><span class="token punctuation">{</span> <span class="token literal-property property">reflect</span><span class="token operator">:</span> <span class="token boolean">true</span><span class="token punctuation">,</span> <span class="token literal-property property">converter</span><span class="token operator">:</span> customConverter <span class="token punctuation">}</span><span class="token punctuation">)</span>
  accessor #reflected <span class="token operator">=</span> <span class="token string">""</span><span class="token punctuation">;</span>
  <span class="token keyword">get</span> <span class="token function">reflected</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
    <span class="token keyword">return</span> <span class="token keyword">this</span><span class="token punctuation">.</span>#reflected<span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
  <span class="token keyword">set</span> <span class="token function">reflected</span><span class="token punctuation">(</span><span class="token parameter">value</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
    <span class="token keyword">const</span> newValue <span class="token operator">=</span> <span class="token function">coerceType</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token comment">// …there might be additional validations here…</span>
    <span class="token comment">// XXX: this should use a more optimized conversion/validation</span>
    <span class="token keyword">this</span><span class="token punctuation">.</span>#reflected <span class="token operator">=</span> <span class="token function">parseValue</span><span class="token punctuation">(</span><span class="token function">stringifyValue</span><span class="token punctuation">(</span>newValue<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
<span class="token punctuation">}</span></code></pre>
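<p>To illustrate what that normalization does, here is a sketch with hypothetical helpers for an integer property clamped to the range 1 to 10 (the range, the default value and the simplified parsing are all assumptions made for the example):</p>

```javascript
// Hypothetical helpers for an integer property clamped to [MIN, MAX] (simplified sketch).
const MIN = 1;
const MAX = 10;
const DEFAULT_VALUE = 1;

function coerceType(value) {
  return value | 0; // 32-bit signed integer coercion
}

function stringifyValue(value) {
  return String(value);
}

function parseValue(attributeValue) {
  if (attributeValue === null) return DEFAULT_VALUE;
  const parsed = parseInt(attributeValue, 10);
  if (Number.isNaN(parsed)) return DEFAULT_VALUE;
  return Math.min(Math.max(parsed, MIN), MAX);
}

// What the custom setter stores: the value as it would be read back
// after a round trip through the attribute.
function normalize(value) {
  return parseValue(stringifyValue(coerceType(value)));
}

console.log(normalize(15)); // 10
console.log(normalize("7")); // 7
console.log(normalize(-3)); // 1
```

<p>With this approach, setting the property to <code>15</code> stores <code>10</code>, so the attribute ends up with the valid value <code>"10"</code> rather than the <code>"15"</code> that was passed to the setter.</p>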
<h4 id="fast">With FAST</h4>
<p>I know <a href="https://www.fast.design/">FAST</a> is not used that much but I wanted to cover it as it seems to be the only library that <a href="https://www.fast.design/docs/fast-element/defining-elements#customizing-attributes" title="FAST Documentation: Building Components: Customizing Attributes">reflects attributes by default</a>. By default it won't do any type coercion unless you use the <code>mode: &quot;boolean&quot;</code>, which works <em>almost</em> like an HTML boolean attribute, except an attribute present but with the value <code>&quot;false&quot;</code> will coerce to a property value of <code>false</code>!</p>
<p>Otherwise, it works more or less like Lit, with one big difference: the converter's <code>fromView</code> is <em>also</em> called when setting the property (this means that <code>fromView</code> receives any <em>external</em> value, not just string values from the attribute). Unfortunately this doesn't really help us: most coercion rules need to throw at some point, and we want that to happen only in the property setters, never when parsing attribute values. And those rules that don't throw will possibly have different values between the attribute and the property getter (push the invalid value to the attribute, sanitize it in the property getter), or just behave differently between the property (e.g. turning a <code>null</code> into <code>0</code> or <code>&quot;null&quot;</code>) and the attribute (where <code>null</code> means the attribute is not set, and the property should then have its default value, which could be different from <code>0</code> and will likely be different from <code>&quot;null&quot;</code>).</p>
<p>This means that in the end the solutions are almost identical to the Lit ones (here using TypeScript's <em>legacy</em> decorators though; and applying the annotation on the <em>private</em> property to avoid triggering the custom setter on attribute change):</p>
<pre class="language-ts"><code class="language-ts"><span class="token keyword">class</span> <span class="token class-name">MyElement</span> <span class="token keyword">extends</span> <span class="token class-name">FASTElement</span> <span class="token punctuation">{</span>
  <span class="token decorator"><span class="token at operator">@</span><span class="token function">attr</span></span><span class="token punctuation">(</span><span class="token punctuation">{</span> attribute<span class="token operator">:</span> <span class="token string">"reflected"</span> <span class="token punctuation">}</span><span class="token punctuation">)</span>
  <span class="token keyword">private</span> _reflected <span class="token operator">=</span> <span class="token string">""</span><span class="token punctuation">;</span>

  <span class="token keyword">get</span> <span class="token function">reflected</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
    <span class="token keyword">return</span> <span class="token function">parseValue</span><span class="token punctuation">(</span><span class="token keyword">this</span><span class="token punctuation">.</span>_reflected<span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
  <span class="token keyword">set</span> <span class="token function">reflected</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span> <span class="token punctuation">{</span>
    <span class="token keyword">const</span> newValue <span class="token operator">=</span> <span class="token function">coerceType</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token comment">// …there might be additional validations here…</span>
    <span class="token keyword">this</span><span class="token punctuation">.</span>_reflected <span class="token operator">=</span> <span class="token function">stringifyValue</span><span class="token punctuation">(</span>newValue<span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
<span class="token punctuation">}</span></code></pre>
<p>or with intermediate caching (note that the setter is identical):</p>
<pre class="language-ts"><code class="language-ts"><span class="token keyword">class</span> <span class="token class-name">MyElement</span> <span class="token keyword">extends</span> <span class="token class-name">FASTElement</span> <span class="token punctuation">{</span>
  <span class="token decorator"><span class="token at operator">@</span><span class="token function">attr</span></span><span class="token punctuation">(</span><span class="token punctuation">{</span> attribute<span class="token operator">:</span> <span class="token string">"reflected"</span> <span class="token punctuation">}</span><span class="token punctuation">)</span>
  <span class="token keyword">private</span> _reflected <span class="token operator">=</span> <span class="token string">""</span><span class="token punctuation">;</span>

  <span class="token keyword">private</span> <span class="token function">_reflectedChanged</span><span class="token punctuation">(</span>oldValue<span class="token punctuation">,</span> newValue<span class="token punctuation">)</span> <span class="token punctuation">{</span>
    <span class="token keyword">this</span><span class="token punctuation">.</span>_parsedReflected <span class="token operator">=</span> <span class="token function">parseValue</span><span class="token punctuation">(</span>newValue<span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>

  <span class="token keyword">private</span> _parsedReflected<span class="token punctuation">;</span>
  <span class="token keyword">get</span> <span class="token function">reflected</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
    <span class="token keyword">return</span> <span class="token keyword">this</span><span class="token punctuation">.</span>_parsedReflected<span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
  <span class="token keyword">set</span> <span class="token function">reflected</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span> <span class="token punctuation">{</span>
    <span class="token keyword">const</span> newValue <span class="token operator">=</span> <span class="token function">coerceType</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token comment">// …there might be additional validations here…</span>
    <span class="token keyword">this</span><span class="token punctuation">.</span>_reflected <span class="token operator">=</span> <span class="token function">stringifyValue</span><span class="token punctuation">(</span>newValue<span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
<span class="token punctuation">}</span></code></pre>
<p>Or if you want immediate reflection to the attribute (the internal property can now be used to store the parsed value):</p>
<pre class="language-ts"><code class="language-ts"><span class="token keyword">class</span> <span class="token class-name">MyElement</span> <span class="token keyword">extends</span> <span class="token class-name">FASTElement</span> <span class="token punctuation">{</span>
  <span class="token decorator"><span class="token at operator">@</span><span class="token function">attr</span></span><span class="token punctuation">(</span><span class="token punctuation">{</span>
    attribute<span class="token operator">:</span> <span class="token string">"reflected"</span><span class="token punctuation">,</span>
    mode<span class="token operator">:</span> <span class="token string">"fromView"</span><span class="token punctuation">,</span>
    converter<span class="token operator">:</span> <span class="token punctuation">{</span>
      <span class="token function">fromView</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span> <span class="token punctuation">{</span>
        <span class="token keyword">return</span> <span class="token function">parseValue</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">;</span>
      <span class="token punctuation">}</span><span class="token punctuation">,</span>
      <span class="token function">toView</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span> <span class="token punctuation">{</span>
        <span class="token comment">// mandatory in the converter type</span>
        <span class="token class-name"><span class="token keyword">throw</span></span> <span class="token keyword">new</span> <span class="token class-name">Error</span><span class="token punctuation">(</span><span class="token string">"should never be called"</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
      <span class="token punctuation">}</span>
    <span class="token punctuation">}</span>
  <span class="token punctuation">}</span><span class="token punctuation">)</span>
  <span class="token keyword">private</span> _reflected<span class="token punctuation">;</span>

  <span class="token keyword">get</span> <span class="token function">reflected</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
    <span class="token keyword">return</span> <span class="token keyword">this</span><span class="token punctuation">.</span>_reflected <span class="token operator">??</span> <span class="token string">""</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
  <span class="token keyword">set</span> <span class="token function">reflected</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span> <span class="token punctuation">{</span>
    <span class="token keyword">const</span> newValue <span class="token operator">=</span> <span class="token function">coerceType</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token comment">// …there might be additional validations here…</span>
    <span class="token keyword">this</span><span class="token punctuation">.</span><span class="token function">setAttribute</span><span class="token punctuation">(</span><span class="token string">"reflected"</span><span class="token punctuation">,</span> <span class="token function">stringifyValue</span><span class="token punctuation">(</span>newValue<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
<span class="token punctuation">}</span></code></pre>
<p>Note that the internal property is not initialized, to avoid calling the converter's <code>fromView</code>, and the default value is handled in the getter instead (our <code>fromView</code> expects a string or null coming from the attribute, so we'd have to initialize the property with such a string value, which would hurt readability of the code as that could be a value different from the one actually stored in the property and returned by the public property getter).</p>
<p>If we're OK only reflecting valid values to attributes, then we can fully use converters but things aren't necessarily simpler (we still need the custom setter for type coercion and validation, and marking the internal property as reactive to avoid triggering the custom setter when the attribute changes; we don't directly deal with the attribute but we still need to call <code>stringifyValue</code> as we know the converter's <code>fromView</code> will receive the new value):</p>
<pre class="language-ts"><code class="language-ts"><span class="token keyword">const</span> customConverter <span class="token operator">=</span> <span class="token punctuation">{</span>
  <span class="token function">fromView</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span> <span class="token punctuation">{</span>
    <span class="token keyword">return</span> <span class="token function">parseValue</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span><span class="token punctuation">,</span>
  <span class="token function">toView</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span> <span class="token punctuation">{</span>
    <span class="token keyword">return</span> <span class="token function">stringifyValue</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span><span class="token punctuation">,</span>
<span class="token punctuation">}</span><span class="token punctuation">;</span>

<span class="token keyword">class</span> <span class="token class-name">MyElement</span> <span class="token keyword">extends</span> <span class="token class-name">FASTElement</span> <span class="token punctuation">{</span>
  <span class="token decorator"><span class="token at operator">@</span><span class="token function">attr</span></span><span class="token punctuation">(</span><span class="token punctuation">{</span> attribute<span class="token operator">:</span> <span class="token string">"reflected"</span><span class="token punctuation">,</span> converter<span class="token operator">:</span> customConverter <span class="token punctuation">}</span><span class="token punctuation">)</span>
  <span class="token keyword">private</span> _reflected<span class="token punctuation">;</span>

  <span class="token keyword">get</span> <span class="token function">reflected</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
    <span class="token keyword">return</span> <span class="token keyword">this</span><span class="token punctuation">.</span>_reflected <span class="token operator">??</span> <span class="token string">""</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
  <span class="token keyword">set</span> <span class="token function">reflected</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span> <span class="token punctuation">{</span>
    <span class="token keyword">const</span> newValue <span class="token operator">=</span> <span class="token function">coerceType</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token comment">// …there might be additional validations here…</span>
    <span class="token keyword">this</span><span class="token punctuation">.</span>_reflected <span class="token operator">=</span> <span class="token function">stringifyValue</span><span class="token punctuation">(</span>newValue<span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
<span class="token punctuation">}</span></code></pre>
<p>For non-reflected properties, you'd want to use <code>@observable</code> instead of <code>@attr</code>, except that it doesn't work on custom accessors, so you'd have to <a href="https://www.fast.design/docs/fast-element/observables-and-state#access-tracking" title="FAST Documentation: Building Components: Observables and State: Access Tracking">do it manually</a>:</p>
<pre class="language-ts"><code class="language-ts"><span class="token keyword">class</span> <span class="token class-name">MyElement</span> <span class="token keyword">extends</span> <span class="token class-name">FASTElement</span> <span class="token punctuation">{</span>
  <span class="token keyword">private</span> _nonReflected <span class="token operator">=</span> <span class="token string">""</span><span class="token punctuation">;</span>
  <span class="token keyword">get</span> <span class="token function">nonReflected</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
    Observable<span class="token punctuation">.</span><span class="token function">track</span><span class="token punctuation">(</span><span class="token keyword">this</span><span class="token punctuation">,</span> <span class="token string">'nonReflected'</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token keyword">return</span> <span class="token keyword">this</span><span class="token punctuation">.</span>_nonReflected<span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
  <span class="token keyword">set</span> <span class="token function">nonReflected</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span> <span class="token punctuation">{</span>
    <span class="token keyword">const</span> newValue <span class="token operator">=</span> <span class="token function">coerceType</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token comment">// …there might be additional validations here…</span>
    <span class="token keyword">this</span><span class="token punctuation">.</span>_nonReflected <span class="token operator">=</span> newValue<span class="token punctuation">;</span>
    Observable<span class="token punctuation">.</span><span class="token function">notify</span><span class="token punctuation">(</span><span class="token keyword">this</span><span class="token punctuation">,</span> <span class="token string">'nonReflected'</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
<span class="token punctuation">}</span></code></pre>
<h4 id="stencil">With Stencil</h4>
<p>First a disclosure: I have never actually used <a href="https://stenciljs.com/">Stencil</a>, I only played with it a bit locally in a hello-world project while writing this post.</p>
<p>Stencil is kind of special. It supports observable attributes through the <code>@Prop()</code> decorator, and reflected ones through <code>@Prop({ reflect: true })</code>. It will however reflect default values to attributes when the component initializes, doesn't support custom converters, and like FAST will convert an attribute value of <code>&quot;false&quot;</code> to a boolean <code>false</code>. You also have to add <code>mutable: true</code> to the <code>@Prop()</code> if the component modifies its value (Stencil assumes properties and attributes are inputs to the component, not state of the component).</p>
<p>A <code>@Prop()</code> must be public too, and cannot have custom accessors. You can use a <code>@Watch()</code> method to do some validation, but throwing from there won't prevent the property value from being updated; you can revert the property to the old value from the watch method, but other watch methods for the same property will then be called twice, and not necessarily in the correct order (depending on declaration order).</p>
<p>You cannot expose properties on the element's API if they are not annotated with <code>@Prop()</code>, making them at a minimum observe an attribute.</p>
<p>In other words, a Stencil component <strong>cannot</strong>, by design, <em>feel</em> like a built-in element (another thing specific to Stencil: besides <code>@Prop()</code> properties, you can expose methods through <code>@Method()</code> but they must be <code>async</code>).</p>

  ]]></content>
</entry>

<entry>
  <title type="html">Improving a web component, one step at a time</title>
  <link href="/web-component-step-by-step-improvement/" />
  <published>2023-12-16T00:00:00+0000</published>
  <updated>2023-12-16T00:00:00+0000</updated>
  <id>http://blog.ltgt.net/web-component-step-by-step-improvement/</id>
  
  
    <link rel="replies" href="https://dev.to/tbroyer/improving-a-web-component-one-step-at-a-time-2673/comments" />
  
  <content type="html" xml:base="/web-component-step-by-step-improvement/"><![CDATA[
    <p>Earlier this month, <a href="https://www.stefanjudis.com/">Stefan Judis</a> published a small <a href="https://www.stefanjudis.com/blog/a-web-component-to-make-your-text-sparkle/" title="A web component to make your text sparkle">web component that makes your text sparkle</a>.</p>
<p>In the spirit of so-called <a href="https://blog.jim-nielsen.com/2023/html-web-components/">HTML web components</a>, which apparently often come with some sort of aversion to the <a href="https://developer.mozilla.org/en-US/docs/Web/API/Web_components/Using_shadow_DOM" title="Using shadow DOM">shadow DOM</a>, the element directly manipulates the light DOM. As a developer of web apps with heavy DOM manipulations, and lover of <em>the platform</em>, this feels weird to me as it could possibly break so many things: other code that manipulates the DOM and now sees new elements and could also change them, handling of disconnection and reconnection of the element (as most such elements modify their children in the <code>connectedCallback</code> without checking whether it had already been done), <code>MutationObserver</code>, etc.</p>
<p>The first thing that came to my mind was that shadow DOM, for all its drawbacks and bugs, was the perfect fit for such an element, and I wanted to update Stefan's element to use the shadow DOM instead. Then a couple days ago, <a href="https://www.zachleat.com">Zach Leatherman</a> published <a href="https://www.zachleat.com/web/snow-fall/" title="&lt;snow-fall&gt; Web Component">a similar element</a> that makes it snow on its content, and <a href="https://piaille.fr/@tbroyer/111585454702562025">I was pleased</a> to see he used shadow DOM to encapsulate (hide) the snowflakes. That was the trigger for me to actually take the time to revisit Stefan's <code>&lt;sparkle-text&gt;</code> element, so here's a step by step of various improvements (in my opinion) I made.</p>
<p><em>Disclaimer before I begin: this is not in any way a criticism of Stefan's work! On the contrary actually, it wouldn't have been possible without this prior work. I just want to show things that <em>I</em> think could be improved, and this is all very much subjective.</em></p>
<p>I'll link to commits in <a href="https://github.com/tbroyer/sparkly-text">my fork</a> without any (intermediate) demo, as most of those changes don't have much impact on the element's behavior as seen by a reader of the web page, except in some specific situations (if you're interested in what they change when looked at through the DevTools, then clone the repository, run <code>npm install</code>, <code>npm run start</code>, then check out each commit in turn). The final state is available <a href="https://tbroyer.github.io/sparkly-text/">here</a> if you want to play with it in your DevTools.</p>
<h2 id="using-shadow-dom">Using shadow DOM</h2>
<p>The <a href="https://github.com/tbroyer/sparkly-text/commit/57ef19f625ce886e876a597e198cd4089152a99d" title="Git commit: Encapsulate sparkles in Shadow DOM">first step</a> was moving the sparkles to shadow DOM, to avoid touching the light DOM. This involves of course attaching shadow DOM, with a <code>&lt;slot&gt;</code> to let the light DOM show, and then changing where the sparkles are added, but also changing how CSS is handled!</p>
<figure>
<figcaption>Abridged diff of the changes (notably excluding CSS)</figcaption>
<pre class="language-diff"><code class="language-diff">@@ -66,16 +62,21 @@ class SparklyText extends HTMLElement {
<span class="token unchanged"><span class="token prefix unchanged"> </span>`;
<span class="token prefix unchanged"> </span>    let sheet = new CSSStyleSheet();
<span class="token prefix unchanged"> </span>    sheet.replaceSync(css);
</span><span class="token deleted-sign deleted"><span class="token prefix deleted">-</span>    document.adoptedStyleSheets = [...document.adoptedStyleSheets, sheet];
<span class="token prefix deleted">-</span>    _needsStyles = false;
</span><span class="token inserted-sign inserted"><span class="token prefix inserted">+</span>    this.shadowRoot.adoptedStyleSheets = [sheet];
</span><span class="token unchanged"><span class="token prefix unchanged"> </span>  }
<span class="token prefix unchanged"> </span>
<span class="token prefix unchanged"> </span>  connectedCallback() {
</span><span class="token inserted-sign inserted"><span class="token prefix inserted">+</span>    if (this.shadowRoot) {
<span class="token prefix inserted">+</span>      return;
<span class="token prefix inserted">+</span>    }
<span class="token prefix inserted">+</span>
</span><span class="token unchanged"><span class="token prefix unchanged"> </span>    this.#numberOfSparkles = parseInt(
<span class="token prefix unchanged"> </span>      this.getAttribute("number-of-sparkles") || `${this.#numberOfSparkles}`,
<span class="token prefix unchanged"> </span>      10
<span class="token prefix unchanged"> </span>    );
<span class="token prefix unchanged"> </span>
</span><span class="token inserted-sign inserted"><span class="token prefix inserted">+</span>    this.attachShadow({ mode: "open" });
<span class="token prefix inserted">+</span>    this.shadowRoot.append(document.createElement("slot"));
</span><span class="token unchanged"><span class="token prefix unchanged"> </span>    this.generateCss();
<span class="token prefix unchanged"> </span>    this.addSparkles();
<span class="token prefix unchanged"> </span>  }
</span>@@ -99,7 +100,7 @@ class SparklyText extends HTMLElement {
<span class="token unchanged"><span class="token prefix unchanged"> </span>      Math.random() * 110 - 5
<span class="token prefix unchanged"> </span>    }% - var(--_sparkle-base-size) / 2)`;
<span class="token prefix unchanged"> </span>
</span><span class="token deleted-sign deleted"><span class="token prefix deleted">-</span>    this.appendChild(sparkleWrapper);
</span><span class="token inserted-sign inserted"><span class="token prefix inserted">+</span>    this.shadowRoot.appendChild(sparkleWrapper);
</span><span class="token unchanged"><span class="token prefix unchanged"> </span>    sparkleWrapper.addEventListener("animationend", () => {
<span class="token prefix unchanged"> </span>      sparkleWrapper.remove();
<span class="token prefix unchanged"> </span>    });
</span></code></pre>
</figure>
<p>In Stefan's version, CSS is injected into the document, with a boolean to make sure it's done only once, and styles are <em>scoped</em> to <code>.sparkle-wrapper</code> descendants of the <code>sparkly-text</code> elements. With shadow DOM, we gain style encapsulation, so there's no need for that scoping: we can directly target <code>.sparkle-wrapper</code> and <code>svg</code> as they're in the shadow DOM, clearly separate from the HTML that had been authored. The CSS now needs to be adopted by each element's shadow root though (we'll improve that later), and what we need to make sure of is that the shadow DOM itself is initialized only once (I'm going step by step, so leaving this in the <code>connectedCallback</code>).</p>
<p>As a side effect, this also fixes an edge-case bug where the CSS would apply styles to any descendant SVG of the element, whether a sparkle or not (this could have been fixed by only targeting SVG inside <code>.sparkle-wrapper</code> actually); and of course with shadow DOM encapsulation, page author styles won't affect the sparkles either.</p>
<h2 id="small-performance-improvements">Small performance improvements</h2>
<p>Those are really small, and probably negligible, but I feel like they're good practice anyway so I didn't even bother measuring actually.</p>
<p>First, as said above, the CSS needs to be somehow <em>injected</em> into each element's shadow DOM, but the constructible stylesheet can actually be shared between all of them. I've thus split construction of the stylesheet from its adoption in the shadow DOM, and made sure construction was only done once. Again, to limit <a href="https://github.com/tbroyer/sparkly-text/commit/783d76d4766b70d7ca2d7767d1950f27f6a20d24" title="Git commit: Only create a single CSSStyleSheet">the changes</a>, everything's still in the same method, just moved inside an <code>if</code> (I think I would have personally constructed the stylesheet early, as soon as the script is loaded, rather than waiting for the element to actually be used; it probably doesn't make a huge difference).</p>
<pre class="language-diff"><code class="language-diff"><span class="token unchanged"><span class="token prefix unchanged"> </span>  generateCss() {
</span><span class="token deleted-sign deleted"><span class="token prefix deleted">-</span>    const css = `…`;
<span class="token prefix deleted">-</span>    let sheet = new CSSStyleSheet();
<span class="token prefix deleted">-</span>    sheet.replaceSync(css);
</span><span class="token inserted-sign inserted"><span class="token prefix inserted">+</span>    if (!sheet) {
<span class="token prefix inserted">+</span>      const css = `…`;
<span class="token prefix inserted">+</span>      sheet = new CSSStyleSheet();
<span class="token prefix inserted">+</span>      sheet.replaceSync(css);
<span class="token prefix inserted">+</span>    }
</span><span class="token unchanged"><span class="token prefix unchanged"> </span>    this.shadowRoot.adoptedStyleSheets = [sheet];
<span class="token prefix unchanged"> </span>  }
</span></code></pre>
<p>Similarly, each sparkle was created by setting the SVG markup through <code>innerHTML</code> on a new wrapper element. I <a href="https://github.com/tbroyer/sparkly-text/commit/887cdeafd58807bb7d96104178a806f45c109353" title="Git commit: Create sparkles by cloning a template node">changed that</a> to using <code>cloneNode(true)</code> on an element <em>prepared</em> only once.</p>
<pre class="language-diff"><code class="language-diff"><span class="token unchanged"><span class="token prefix unchanged"> </span>  addSparkle() {
</span><span class="token deleted-sign deleted"><span class="token prefix deleted">-</span>    const sparkleWrapper = document.createElement("span");
<span class="token prefix deleted">-</span>    sparkleWrapper.classList.add("sparkle-wrapper");
<span class="token prefix deleted">-</span>    sparkleWrapper.innerHTML = this.#sparkleSvg;
</span><span class="token inserted-sign inserted"><span class="token prefix inserted">+</span>    if (!sparkleTemplate) {
<span class="token prefix inserted">+</span>      sparkleTemplate = document.createElement("span");
<span class="token prefix inserted">+</span>      sparkleTemplate.classList.add("sparkle-wrapper");
<span class="token prefix inserted">+</span>      sparkleTemplate.innerHTML = this.#sparkleSvg;
<span class="token prefix inserted">+</span>    }
<span class="token prefix inserted">+</span>
<span class="token prefix inserted">+</span>    const sparkleWrapper = sparkleTemplate.cloneNode(true);
</span></code></pre>
<p>We actually don't even need the wrapper element, we could directly use the SVG <a href="https://github.com/tbroyer/sparkly-text/commit/9daa0df820ad113a745d1be7ae01ce9b6cf00711" title="Git commit: Remove sparkle-wrapper element">without wrapper</a>.</p>
<h2 id="handling-disconnection">Handling disconnection</h2>
<p>The element uses chained timers (a <code>setTimeout</code> callback that itself ends up calling <code>setTimeout</code> with the same callback, again and again) to re-add sparkles at random intervals (removing the sparkles is done as soon as the animation ends; and all of this is done only if the user didn't configure their browser to prefer reduced motion).</p>
<p>If the element is removed from the DOM, this unnecessarily continues in the background and could create memory leaks (in addition to just doing unnecessary work). <a href="https://github.com/tbroyer/sparkly-text/commit/dc4b731a3f33e5164b5f4d8cc867d76207069405" title="Git commit: Stop sparkling once disconnected">I started</a> with a very small change: check whether the element is still connected to the DOM before adding the sparkle (and calling <code>setTimeout</code> again). It could have been better (for some definition of better) to track the timer IDs so we could call <code>clearTimeout</code> in <code>disconnectedCallback</code>, but I feel like that would be unnecessarily complex.</p>
<pre class="language-diff"><code class="language-diff"><span class="token unchanged"><span class="token prefix unchanged"> </span>      const {matches:motionOK} = window.matchMedia('(prefers-reduced-motion: no-preference)');
</span><span class="token deleted-sign deleted"><span class="token prefix deleted">-</span>      if (motionOK) this.addSparkle();
</span><span class="token inserted-sign inserted"><span class="token prefix inserted">+</span>      if (motionOK &amp;&amp; this.isConnected) this.addSparkle();
</span></code></pre>
<p>This handles disconnection (as could be done by any <em>destructive</em> change to the DOM, like navigating with <a href="https://turbo.hotwired.dev/">Turbo</a> or <a href="https://htmx.org/">htmx</a>, I'm not even talking about using the element in a JavaScript-heavy web app) but not reconnection though, and we've exited early from the <code>connectedCallback</code> to avoid initializing the element twice, so this change actually broke our component in these situations where it's moved around, or stashed and then reinserted. To fix that, we need to always call <code>addSparkles</code> in <code>connectedCallback</code>, so move all the rest into an <code>if</code>, that's actually as simple as that… except that when the user prefers reduced motion, sparkles are never removed, so they keep piling up each time the element is connected again. One way to handle that, without introducing our own housekeeping of individual timers, is to just remove all sparkles on disconnection. Either that or conditionally add them in <code>connectedCallback</code> if either we're initializing the element (including attaching the shadow DOM) or the user doesn't prefer reduced motion. The difference between both approaches is in whether we want the small animation when the sparkles appear (and appearing at new random locations). <a href="https://github.com/tbroyer/sparkly-text/commit/ba8652eb490c41940fd531e2e87c6711cb1cc8d9" title="Git commit: Restart animation on reconnection">I went with the latter</a>.</p>
<p>This still doesn't handle the situation where <code>prefers-reduced-motion</code> changes while the element is displayed though: if it turns to <code>no-preference</code>, then sparkles will start animating (due to CSS) then disappear at the end of their animation (due to JS listening to the <code>animationend</code> event), and no other sparkle will be added (because the <code>setTimeout</code> chain would have been broken earlier). I don't feel like it's really worth fixing for such an element but it's also rather easy to handle so <a href="https://github.com/tbroyer/sparkly-text/commit/e0412236e1e5d8870cee14d044368eed46a060b1" title="Git commit: Handle prefers-reduced-motion changes">let's do it</a>: listen to the media query change and start the timers whenever the user no longer prefers reduced motion.</p>
<pre class="language-diff"><code class="language-diff">@@ -94,6 +94,19 @@ connectedCallback() {
<span class="token unchanged"><span class="token prefix unchanged"> </span>      );
<span class="token prefix unchanged"> </span>      this.addSparkles();
<span class="token prefix unchanged"> </span>    }
</span><span class="token inserted-sign inserted"><span class="token prefix inserted">+</span>
<span class="token prefix inserted">+</span>    motionOK.addEventListener("change", this.motionOkChange);
<span class="token prefix inserted">+</span>  }
<span class="token prefix inserted">+</span>
<span class="token prefix inserted">+</span>  disconnectedCallback() {
<span class="token prefix inserted">+</span>    motionOK.removeEventListener("change", this.motionOkChange);
<span class="token prefix inserted">+</span>  }
<span class="token prefix inserted">+</span>
<span class="token prefix inserted">+</span>  // Declare as an arrow function to get the appropriate 'this'
<span class="token prefix inserted">+</span>  motionOkChange = () => {
<span class="token prefix inserted">+</span>    if (motionOK.matches) {
<span class="token prefix inserted">+</span>      this.addSparkles();
<span class="token prefix inserted">+</span>    }
</span><span class="token unchanged"><span class="token prefix unchanged"> </span>  }
</span></code></pre>
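<p>As for the alternative I dismissed earlier in this section (tracking the timer so it can be cleared in <code>disconnectedCallback</code>), here's a minimal sketch of what it could look like. The <code>ChainedTimer</code> helper and its method names are hypothetical: they're not part of the element's code, and I'm assuming a single <code>setTimeout</code> chain per helper instance.</p>

```javascript
// Hypothetical helper (not in the original element): a setTimeout chain
// that can be cancelled as a whole.
class ChainedTimer {
  #timerId = null;
  #callback;
  #nextDelay;

  // callback: what to do on each tick (e.g. add a sparkle)
  // nextDelay: returns the delay, in milliseconds, before the next tick
  constructor(callback, nextDelay) {
    this.#callback = callback;
    this.#nextDelay = nextDelay;
  }

  get running() {
    return this.#timerId !== null;
  }

  start() {
    if (!this.running) this.#schedule();
  }

  stop() {
    // clearTimeout ignores null, so this is safe even when not running.
    clearTimeout(this.#timerId);
    this.#timerId = null;
  }

  #schedule() {
    const id = setTimeout(() => {
      this.#callback();
      // Re-arm only if the chain was neither stopped nor restarted meanwhile.
      if (this.#timerId === id) this.#schedule();
    }, this.#nextDelay());
    this.#timerId = id;
  }
}
```

<p>The element would then call <code>start()</code> from <code>connectedCallback</code> and <code>stop()</code> from <code>disconnectedCallback</code>, which replaces the <code>isConnected</code> check at the cost of one more object to manage per chain.</p>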
<h2 id="browser-compatibility">Browser compatibility</h2>
<p>Constructible stylesheets aren't supported in Safari 16.3 and earlier (and possibly other browsers). To avoid the code failing and strange things (probably, I haven't tested) happening, I started by <a href="https://github.com/tbroyer/sparkly-text/commit/59062d60e228111a9d00e5dd47695d0855cb937f" title="Git commit: Bail out early when constructible stylesheets aren't supported">bailing out early</a> if the browser doesn't support constructible stylesheets (the element would then just do nothing; I could have actually even avoided registering it at all). Fwiw, I borrowed the check from Zach's <code>&lt;snow-fall&gt;</code> which works this way already (thanks Zach). As an aside, it's a bit strange that the code assumed constructible stylesheets were available, but tested for the availability of the custom element registry 🤷</p>
<pre class="language-diff"><code class="language-diff"><span class="token unchanged"><span class="token prefix unchanged"> </span>  connectedCallback() {
</span><span class="token deleted-sign deleted"><span class="token prefix deleted">-</span>    if (this.shadowRoot) {
</span><span class="token inserted-sign inserted"><span class="token prefix inserted">+</span>    // https://caniuse.com/mdn-api_cssstylesheet_replacesync
<span class="token prefix inserted">+</span>    if (this.shadowRoot || !("replaceSync" in CSSStyleSheet.prototype)) {
</span><span class="token unchanged"><span class="token prefix unchanged"> </span>      return;
<span class="token prefix unchanged"> </span>    }
<span class="token prefix unchanged"> </span>
</span></code></pre>
<p>But Safari 16.3 and earlier still represent more than a third of users on macOS, and more than a quarter of users on iOS! (according to <a href="https://caniuse.com/">CanIUse</a>) To widen browser support, I therefore added <a href="https://github.com/tbroyer/sparkly-text/commit/e5785ef938678e55c9b039dad518f69bd40075ea" title="Git commit: Support Safari &lt; 16.4">a workaround</a>, which consists of injecting a <code>&lt;style&gt;</code> element in the shadow DOM. Contrary to the constructible stylesheet, styles cannot be shared by all elements though, as we've seen above, so we only conditionally fall back to that approach, and continue using a constructible stylesheet everywhere it's supported.</p>
<pre class="language-diff"><code class="language-diff"><span class="token deleted-sign deleted"><span class="token prefix deleted">-</span>      sheet = new CSSStyleSheet();
<span class="token prefix deleted">-</span>      sheet.replaceSync(css);
</span><span class="token inserted-sign inserted"><span class="token prefix inserted">+</span>      if (supportsConstructibleStylesheets) {
<span class="token prefix inserted">+</span>        sheet = new CSSStyleSheet();
<span class="token prefix inserted">+</span>        sheet.replaceSync(css);
<span class="token prefix inserted">+</span>      } else {
<span class="token prefix inserted">+</span>        sheet = document.createElement("style");
<span class="token prefix inserted">+</span>        sheet.textContent = css;
<span class="token prefix inserted">+</span>      }
</span><span class="token unchanged"><span class="token prefix unchanged"> </span>    }
</span>
<span class="token deleted-sign deleted"><span class="token prefix deleted">-</span>    this.shadowRoot.adoptedStyleSheets = [sheet];
</span><span class="token inserted-sign inserted"><span class="token prefix inserted">+</span>    if (supportsConstructibleStylesheets) {
<span class="token prefix inserted">+</span>      this.shadowRoot.adoptedStyleSheets = [sheet];
<span class="token prefix inserted">+</span>    } else {
<span class="token prefix inserted">+</span>      this.shadowRoot.append(sheet.cloneNode(true));
<span class="token prefix inserted">+</span>    }
</span></code></pre>
<h2 id="other-possible-improvements">Other possible improvements</h2>
<p>I stopped there but there's still room for improvement.</p>
<p>For instance, the <code>number-of-sparkles</code> attribute is read once when the element is connected, so changing the attribute afterwards won't have any effect (but will have if you disconnect and then reconnect the element). To handle that situation (if only because you don't control the order of initialization when that element is used within a JavaScript-heavy application with frameworks like React, Vue or Angular), one would have to listen to the attribute change and update the number of sparkles dynamically. This could be done either by removing all sparkles and recreating the correct number of them (with <code>addSparkles()</code>), but this would be a bit <em>abrupt</em>, or by reworking entirely how sparkles are managed so they could adapt dynamically (don't recreate a sparkle, let it <em>expire</em>, when changing the number of sparkles down, or create just as many sparkles as necessary when changing it up). I feel like this would bump complexity by an order of magnitude, so it's probably not worth it for such an element.</p>
<p>The number of sparkles could also be controlled by a property <a href="https://html.spec.whatwg.org/multipage/common-dom-interfaces.html#reflecting-content-attributes-in-idl-attributes" title="HTML Living Standard: Reflecting content attributes in IDL attributes">reflecting</a> the attribute; that would make the element more similar to built-in elements. Once the above is in place, this hopefully shouldn't be too hard.</p>
<p>That number of sparkles is expected to be, well, a number, and is currently parsed with <code>parseInt</code>, but the code doesn't handle parsing errors and could set the number of sparkles to <code>NaN</code>. Maybe we'd prefer using the default value in this case, and similarly for a zero or negative value; basically defining the attribute as a <a href="https://html.spec.whatwg.org/multipage/common-dom-interfaces.html#limited-to-only-non-negative-numbers-greater-than-zero-with-fallback" title="HTML Living Standard: Reflecting content attributes in IDL attributes, of type unsigned long limited to only positive numbers with fallback">number limited to only positive numbers with fallback</a>.</p>
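<p>A minimal sketch of such a parsing, assuming we keep relying on <code>parseInt</code> rather than implementing the exact parsing rules from the HTML specification (the <code>parsePositiveInt</code> name is mine, it's not in the element's code):</p>

```javascript
// Hypothetical helper: parse an attribute value as a positive integer,
// returning the fallback for missing, unparsable, zero or negative values.
function parsePositiveInt(value, fallback) {
  const parsed = parseInt(value ?? "", 10);
  // NaN > 0 is false, so parsing errors also end up using the fallback.
  return parsed > 0 ? parsed : fallback;
}
```

<p>In <code>connectedCallback</code>, this could then be called as <code>parsePositiveInt(this.getAttribute("number-of-sparkles"), this.#numberOfSparkles)</code>, keeping the current default whenever the attribute is absent or invalid.</p>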
<p>All this added complexity is, to me, what separates so-called <em>HTML web components</em> from others: they're designed to be used from HTML markup and not (or rarely) manipulated afterwards, so shortcuts can be taken to keep them simple.</p>
<p>Still speaking of that number of sparkles, the timers that create new sparkles are entirely disconnected from the animation that also makes them disappear. The animation length is actually configurable through the <code>--sparkly-text-animation-length</code> CSS custom property, but the timers' delay is not configurable (a random value between 2 and 3 seconds). This means that if we set the animation length to a higher value than 3 seconds, there will actually be more sparkles than the configured number, as new sparkles will be added before the previous one has disappeared. There are several ways to <em>fix</em> this (<strong>if</strong> we think it's a bug –this is debatable!– and is worth fixing): for instance we could use <a href="https://developer.mozilla.org/en-US/docs/Web/API/Web_Animations_API" title="MDN: Web Animations API">the Web Animations API</a> to read the computed timing of the animation and compute the timer's delay based on this value. Or we could let the animation repeat and move the element on <code>animationiteration</code>, rather than remove it and add another, and to add some randomness it could be temporarily paused and then restarted if we wanted (with a timer of some random delay). The code would be much different, but not necessarily more complex.</p>
<script type=module src=https://tbroyer.github.io/sparkly-text/sparkly-text.js></script>
<figure>
  <sparkly-text number-of-sparkles=10 style="--sparkly-text-animation-length: 10s">
    10 sparkles, animation lengthened to 10 seconds
  </sparkly-text>
  <p>There are currently <output></output> sparkles.</p>
  <script>
    const sparklyText = document.currentScript.parentElement.querySelector("sparkly-text");
    const output = document.currentScript.parentElement.querySelector("output");
    customElements.whenDefined("sparkly-text").then(() => {
      new MutationObserver(() => {
          output.value = sparklyText.shadowRoot.querySelectorAll(":host > svg").length;
      }).observe(sparklyText.shadowRoot, { childList: true });
    });
  </script>
</figure>
<p>Regarding the animation events (whether <code>animationend</code> like it is now, or possibly <code>animationiteration</code>), given that they bubble, they could be listened to on a single parent (the element itself –filtering out possible animations on light DOM children– or an intermediate element inserted to contain all sparkles). This could hopefully simplify the code handling each sparkle.</p>
<p>Last, but not least, the <code>addSparkles</code> and <code>addSparkle</code> methods could be made private, as there's no reason to expose them in the element's API.</p>
<h2 id="final-words">Final words</h2>
<p>Had I started from scratch, I probably wouldn't have written the element the same way. I tried to keep the changes small, one step at a time, rather than doing a big refactoring, or starting from scratch and comparing the outcome to the original, as my goal was to specifically show what I think could be improved and how it wouldn't necessarily involve big changes. Going farther, and/or possibly using a helper library (<a href="/web-component-libs-benefits/" title="The benefits of Web Component Libraries">I have written earlier</a> about their added value), is left as an exercise for the reader.</p>

  ]]></content>
</entry>

<entry>
  <title type="html">What are JWT?</title>
  <link href="/jwt/" />
  <published>2023-11-29T00:00:00+0000</published>
  <updated>2023-11-29T00:00:00+0000</updated>
  <id>http://blog.ltgt.net/jwt/</id>
  
  
    <link rel="replies" href="https://dev.to/tbroyer/what-are-jwt-nm0/comments" />
  
  <content type="html" xml:base="/jwt/"><![CDATA[
    <aside>
<p>This article's goal is to explain what JWTs are, for whenever you come across them.
As <a href="#criticism">we'll see</a>, you won't deliberately choose to use JWTs in a project, and more importantly: you won't use JWTs as <em>session tokens</em>.</p>
</aside>
<h2 id="what-is-it">What is it?</h2>
<blockquote>
<p>JSON Web Token (JWT) is a compact, URL-safe means of representing data to be transferred between two parties. The data is encoded as a JSON object that can be signed and/or encrypted.</p>
</blockquote>
<p>This is, paraphrased, the definition from the IETF standard that defines it (<a href="https://datatracker.ietf.org/doc/html/rfc7519">RFC 7519</a>).</p>
<h2 id="whats-the-point-whats-the-use-case">What's the point? What's the use case?</h2>
<p>So the goal is to transfer data, with some guarantees (or none, by the way): authenticity, integrity, even possibly confidentiality (if the message is encrypted). There are therefore many possible uses.</p>
<p>JWT is thus used in OpenID Connect to encode the <a href="https://openid.net/specs/openid-connect-core-1_0.html#IDToken">ID Token</a> that forwards to the application information on the authentication process that took place at the identity server. OpenID Connect also uses JWT to encode <em>aggregated claims</em>: information from other identity servers, for which we'll want to verify the authenticity and integrity.</p>
<p>A JWT <em>might</em> be used to authenticate to a server, such as with the OAuth 2 JWT Bearer (<a href="https://datatracker.ietf.org/doc/html/rfc7523">RFC 7523</a>).</p>
<p>Still in OAuth 2 land, access tokens <em>could</em> themselves be JWTs (<a href="https://datatracker.ietf.org/doc/html/rfc9068">RFC 9068</a>), authorization request parameters <em>could</em> be encoded as a JWT (<a href="https://datatracker.ietf.org/doc/html/rfc9101">RFC 9101</a>), as well as token introspection responses (<a href="https://tools.ietf.org/html/draft-ietf-oauth-jwt-introspection-response">IETF draft: JWT Response for OAuth Token Introspection</a>), and finally dynamic client registration uses a JWT to identify the software of which an instance attempts to register (so-called <em>software statements</em> of <a href="https://datatracker.ietf.org/doc/html/rfc7591">RFC 7591</a>).</p>
<h2 id="how-does-it-work">How does it work?</h2>
<p>A JWT is composed of at least 2 parts, separated with a <code>.</code> (dot), the first one always being the header. Each part is always encoded as <em>base64url</em>, a variant of Base 64 with the <code>+</code> and <code>/</code> characters (that have special meaning in URLs) replaced with <code>-</code> and <code>_</code> respectively, and without the trailing <code>=</code>.</p>
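<p>To make the encoding concrete, here's a minimal sketch of base64url built on top of the standard <code>btoa</code> and <code>atob</code> functions. The function names are mine, and this is only meant to illustrate the character substitutions, not to replace a dedicated library:</p>

```javascript
// Encode a string as UTF-8, then as base64url (no trailing "=").
function base64urlEncode(text) {
  const bytes = new TextEncoder().encode(text);
  let binary = "";
  for (const byte of bytes) binary += String.fromCharCode(byte);
  return btoa(binary)
    .replace(/\+/g, "-")
    .replace(/\//g, "_")
    .replace(/=+$/, "");
}

// Reverse operation: restore the Base 64 alphabet and padding, then decode.
function base64urlDecode(encoded) {
  const base64 = encoded.replace(/-/g, "+").replace(/_/g, "/");
  const padded = base64 + "=".repeat((4 - (base64.length % 4)) % 4);
  const bytes = Uint8Array.from(atob(padded), (c) => c.charCodeAt(0));
  return new TextDecoder().decode(bytes);
}
```

<p>For example, the header <code>{"alg":"none"}</code> encodes to <code>eyJhbGciOiJub25lIn0</code>.</p>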
<p>There are two types of JWTs: JSON Web Signature (JWS, defined by <a href="https://datatracker.ietf.org/doc/html/rfc7515">RFC 7515</a>), and JSON Web Encryption (JWE, defined by <a href="https://datatracker.ietf.org/doc/html/rfc7516">RFC 7516</a>). The most common case is the JWS, composed of 2 or 3 parts: the header, the payload, and optionally the signature. JWEs are rarer (and more complex) so I won't talk about them here.</p>
<p>The header, common to both types, describes the type of JWT (JWS or JWE) as well as the different signature, MAC, or encryption algorithms being used (codified by <a href="https://datatracker.ietf.org/doc/html/rfc7518">RFC 7518</a>), along with other useful information, as a JSON object.<br>
In the case of JWS, we'll find the signature or MAC algorithm, possibly a key identifier (whenever multiple keys can be used, e.g. to allow for key rotation), or even a URL pointing to information about the keys (in JWKS format, defined by <a href="https://datatracker.ietf.org/doc/html/rfc7517">RFC 7517</a>), etc.</p>
<p>In the case of JWS, the payload will generally be a JSON object with the transferred data (but technically could be another JWT).</p>
<p>The third part is the signature or MAC. This part is absent if the header says the JWT is unprotected (<code>&quot;alg&quot;:&quot;none&quot;</code>).</p>
<p>For debugging, one can use the <a href="https://jwt.io/#debugger-io">JWT Debugger</a> by Auth0 to decode JWTs <em>(beware not to use it with sensitive data, only on JWTs coming from test servers)</em>.</p>
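<p>Still for debugging only, the parts of a JWS can also be read locally with a few lines of code. This is a sketch under assumptions: the function name is hypothetical, the payload is assumed to be a JSON object, and <strong>nothing is verified</strong>, so the result must never be trusted.</p>

```javascript
// Debugging aid only: decodes a JWS without verifying its signature or MAC.
function decodeJwsForDebugging(jws) {
  const parts = jws.split(".");
  if (parts.length !== 2 && parts.length !== 3) {
    throw new Error("Not a JWS (expected 2 or 3 dot-separated parts)");
  }
  const [header, payload, signature = ""] = parts;
  return {
    header: decodeJsonPart(header),
    payload: decodeJsonPart(payload), // assumes a JSON payload
    signature, // left as-is (still base64url-encoded)
  };
}

// Decode one base64url part as UTF-8 encoded JSON.
function decodeJsonPart(part) {
  const base64 = part.replace(/-/g, "+").replace(/_/g, "/");
  const padded = base64 + "=".repeat((4 - (base64.length % 4)) % 4);
  const bytes = Uint8Array.from(atob(padded), (c) => c.charCodeAt(0));
  return JSON.parse(new TextDecoder().decode(bytes));
}
```

<p>The same warning applies as with the online debugger: only feed it JWTs coming from test servers.</p>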
<p>⚠️ JWT being almost always used in security-related contexts, handle them with care, specifically when it comes to their cryptographic components.</p>
<p>One <strong>MUST</strong> use dedicated libraries to manipulate JWTs, and be careful to use them correctly to avoid introducing vulnerabilities.</p>
<p><a href="https://datatracker.ietf.org/doc/html/rfc8725">RFC 8725</a> has a set of best practices when manipulating and using JWTs.</p>
<h2 id="criticism">Criticism</h2>
<p>Numerous security experts, among them cryptographers, vehemently criticize JWTs and advise against their use.</p>
<p>The main criticism relates to its complexity, even though it could look <em>simple</em> to developers:</p>
<ul>
<li>first, you need to know how to decode UTF-8 and JSON; each is a source of bugs (and potential vulnerabilities).</li>
<li>and of course because it's a generic format capable of signing and/or encrypting, or even not protecting anything at all (<code>&quot;alg&quot;:&quot;none&quot;</code>), with a <a href="https://www.iana.org/assignments/jose/jose.xhtml#web-signature-encryption-algorithms">list of supported algorithms</a> as long as your arm, you have to handle many cases (even if only to reject them).</li>
</ul>
<p>As a result, <a href="https://0xn3va.gitbook.io/cheat-sheets/web-application/json-web-token-vulnerabilities">a number of vulnerabilities</a> have been identified; among them (<a href="https://auth0.com/blog/critical-vulnerabilities-in-json-web-token-libraries/">identified as early as March 2015</a>):</p>
<ul>
<li>
<p>As the JWT itself declares the algorithm used to sign or encrypt it, software that receives it needs to partly trust it, or correctly check the used algorithm against a list of authorized algorithms. Because of its apparent simplicity, many libraries came out that didn't do those necessary checks and readily accepted unprotected JWTs (<code>&quot;alg&quot;:&quot;none&quot;</code>), allowing an attacker to use any JWT, without authenticity or integrity check. And as incredible as it may seem, <a href="https://www.howmanydayssinceajwtalgnonevuln.com/">we still find</a> vulnerable applications nowadays!</p>
<p>Note: in the same way, the header can directly include the public key to be used to verify the signature. Using it will prove the integrity of the JWT, but not its authenticity as the signature could have been generated by anyone.</p>
</li>
<li>
<p>Another attack involves using the public key intended to verify an asymmetric signature (<code>&quot;alg&quot;:&quot;RS256&quot;</code> or <code>&quot;alg&quot;:&quot;ES256&quot;</code>) as a MAC key (<code>&quot;alg&quot;:&quot;HS256&quot;</code>): the application receiving the JWT could then mistakenly validate the MAC and allow the JWT in. Anybody could then create a JWS that would be accepted by the application, when that one <em>thinks</em> it's verifying an asymmetric signature.</p>
<p>This vulnerability could be due to a misuse of the library used to verify JWTs, but also in some cases directly to its API that cannot tell between a public key and a shared secret (generally for the sake of making it easy to use).</p>
</li>
</ul>
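<p>Both attacks come down to letting the token decide how it should be verified. As an illustration only (dedicated libraries should do this for you, and the names below are hypothetical), here's a sketch where each trusted key is configured with the one algorithm it may be used with, and the header is checked against that configuration before any cryptographic operation:</p>

```javascript
// Hypothetical sketch: the receiver, not the token, decides which algorithm
// goes with which key. trustedKeys maps a key identifier ("kid") to an
// { alg, key } entry configured by the application.
function selectTrustedKey(header, trustedKeys) {
  const entry = trustedKeys.get(header.kid);
  if (!entry) {
    throw new Error("Unknown key identifier");
  }
  // Rejects "none", and "HS256" presented for a key configured for "RS256".
  if (header.alg !== entry.alg) {
    throw new Error("Algorithm not allowed for this key");
  }
  return entry;
}
```

<p>The actual signature or MAC verification would then use <code>entry.alg</code> and <code>entry.key</code>, never a key or algorithm taken from the header.</p>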
<p>Aside: despite ID Tokens in OpenID Connect being JWTs, you won't actually need to verify their signature as you generally get them through HTTPS, that already guarantees authenticity and integrity (and confidentiality), which saves us from a whole class of vulnerabilities.</p>
<p>Another criticism is due to the misuse of JWT, most often by ignorance or lack of expertise in software security: validity of a JWT is directly verifiable, without the need for a database of valid tokens or a validation service (authenticity and integrity are verifiable, so the validity period contained within the JWT is <em>reliable</em>), but it makes the JWT <strong>impossible to revoke</strong> (unless you add such a mechanism –possibly based on the <code>jti</code> claim, initially designed to protect against replay attacks– going against the whole reason for which JWT was chosen in the first place). If a JWT is used as a <em>session token</em> for example, it then becomes impossible to sign out or terminate a session. In most use cases (in the specifications), a JWT is validated and used as soon as it's received from the issuer, so revocation is not even an issue. It's when a JWT is stored by the receiver for a later use that the problem arises (such as with a <em>session token</em> or an <em>access token</em>).</p>
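<p>To illustrate why such a revocation mechanism defeats the purpose, here's a sketch of a <code>jti</code>-based revocation list (hypothetical names, in-memory only): it's server-side state that has to be consulted for every received token, which is exactly what a self-contained token was supposed to avoid.</p>

```javascript
// Hypothetical in-memory revocation list keyed by the "jti" claim.
class RevocationList {
  // jti → "exp" claim of the revoked token (seconds since the epoch)
  #revoked = new Map();

  revoke(jti, exp) {
    this.#revoked.set(jti, exp);
  }

  // Must be called for every token, in addition to the usual validation.
  isRevoked(jti, now = Date.now() / 1000) {
    this.#prune(now);
    return this.#revoked.has(jti);
  }

  // Expired tokens are rejected by their "exp" claim anyway,
  // so their entries can be forgotten.
  #prune(now) {
    for (const [jti, exp] of this.#revoked) {
      if (now >= exp) this.#revoked.delete(jti);
    }
  }
}
```

<p>In a real deployment this state would also have to be shared between all servers validating tokens, at which point an opaque session identifier looked up in the same store is simpler.</p>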
<p>Some articles critical of JWT:</p>
<ul>
<li><a href="https://evertpot.com/jwt-is-a-bad-default/">JWT should not be your default for sessions</a> (by Evert Pot, developer)</li>
<li><a href="https://developer.okta.com/blog/2017/08/17/why-jwts-suck-as-session-tokens">Why JWTs Suck as Session Tokens</a> (by Okta, vendor of an identity management platform)</li>
<li><a href="https://fly.io/blog/api-tokens-a-tedious-survey/#jwt">section &quot;JSON Web Tokens&quot; of &quot;API Tokens: A Tedious Survey&quot;</a> (by Thomas H. Ptacek, security researcher)</li>
<li><a href="https://www.scottbrady91.com/jose/alternatives-to-jwts">Alternatives to JWTs</a> (by Scott Brady, engineering manager specializing in identity management systems)</li>
<li><a href="http://cryto.net/~joepie91/blog/2016/06/13/stop-using-jwt-for-sessions/">Stop using JWTs for sessions</a> and <a href="http://cryto.net/~joepie91/blog/2016/06/19/stop-using-jwt-for-sessions-part-2-why-your-solution-doesnt-work/">Stop using JWT for sessions, part 2: Why your solution doesn’t work</a> (on a web site surprisingly without HTTPS)</li>
<li><a href="https://paragonie.com/blog/2017/03/jwt-json-web-tokens-is-bad-standard-that-everyone-should-avoid">No Way, JOSE! Javascript Object Signing and Encryption is a Bad Standard That Everyone Should Avoid</a> (by Scott “CiPHPerCoder” Arciszewski, cryptographer)</li>
<li><a href="https://scottarc.blog/2023/09/06/how-to-write-a-secure-jwt-library-if-you-absolutely-must/">How to Write a Secure JWT Library If You Absolutely Must</a> (by the same author)</li>
</ul>

  ]]></content>
</entry>

<entry>
  <title type="html">Beyond the login page</title>
  <link href="/beyond-the-login-page/" />
  <published>2023-11-29T00:00:00+0000</published>
  <updated>2023-11-29T00:00:00+0000</updated>
  <id>http://blog.ltgt.net/beyond-the-login-page/</id>
  
  
    <link rel="replies" href="https://dev.to/tbroyer/beyond-the-login-page-4hjd/comments" />
  
  <content type="html" xml:base="/beyond-the-login-page/"><![CDATA[
    <p>There are many blog posts floating around about “adding authentication to your application”, be it written in Node.js, ASP.NET, Java with Spring Boot, JS in the browser talking to a JSON-based Web API on the server, etc. Most of them handle the login page and password storage, and sometimes logout and a user registration page. But authentication is actually much more than that!</p>
<p>Don't get me wrong, it's great that we can describe in a single blog post how to do such things, but everyone should be aware that this is actually just the beginning of the journey, and most of the time those blog posts don't have any such warnings.</p>
<p>So here are some things to think about when “adding authentication to your application”:</p>
<ul>
<li>are you sure you store passwords securely? and verify them securely?</li>
<li>is your logout secure? (ideally it cannot be abused by tricking you into just clicking a link in an email or on a random site)</li>
<li>are passwords robust?
<ul>
<li>put a lower bound on password length (NIST <a href="https://pages.nist.gov/800-63-3/sp800-63b.html#5-authenticator-and-verifier-requirements">recommends</a> a minimum of 8 characters); don't set an upper bound, or if you really want to, make sure it's high enough (NIST recommends accepting at least 64 characters)</li>
<li>if possible, check passwords (at registration or change) against known compromised passwords (use <a href="https://haveibeenpwned.com/Passwords">Pwned Passwords</a> or similar)</li>
</ul>
</li>
<li>how well do you handle non-ASCII characters? For example, macOS and Windows encode diacritics differently, so make sure that someone who signed up on one device will be able to sign in on another (put differently, use <a href="https://en.wikipedia.org/wiki/Unicode_equivalence">Unicode normalization</a> on inputs; NIST recommends using NFKC or NFKD)</li>
<li>do you have a form to securely change the password? (when already authenticated)</li>
<li>are your forms actually compatible with password managers?</li>
<li>do you protect against brute-force attacks? if you do (e.g. by locking out accounts, or even just throttling), do you somehow protect legitimate users against DDoS?</li>
<li>once authenticated, how do you maintain the authenticated state (<em>sessions</em>; <abbr title="by the way">btw</abbr> don't use <a href="/jwt/">JWTs</a>)? and is this secure? (in other words, do you protect against <a href="https://en.wikipedia.org/wiki/Session_fixation">session fixation</a>? <a href="https://en.wikipedia.org/wiki/Cross-site_request_forgery">cross-site request forgery</a>?)</li>
<li>how long are your <em>sessions</em>? There's a balance between short and long sessions regarding security and convenience, but a choice needs to be made.</li>
<li>do you have a mechanism to ask for re-authentication before sensitive actions?</li>
<li>what do you do if a user forgot their password? Password recovery generally requires an email address, do you have one? how can you make sure that the user didn't mistype it and you will actually be able to use it when they need it? Put differently: you need a secure email verification process before you can have a secure password reset process. Implementing those processes securely goes beyond the scope of this post, but let's just say we've just come from one single blog post explaining how to “add authentication to your application” to a <em>series</em> of blog posts.</li>
<li>by the way, now that you store an email address for password reset purpose, how can the user securely update it? and by that I also mean, how do you handle the case where the account got breached and the attacker changes the email address? There's unfortunately no simple answer to that, because there are a handful of cases to handle: the user may have lost access to the previous email, an attacker may have gained access to the previous email, the user may still have access to the previous email but have mistyped the new email, etc.</li>
<li>speaking of changing passwords, do you make it easier for password managers? (spoiler: through a <a href="https://w3c.github.io/webappsec-change-password-url/"><code>/.well-known/change-password</code></a> URL)</li>
<li>do you handle multi-factor authentication? do you plan on handling it in the future? If you use SMS to send one-time codes, can the device <a href="https://web.dev/articles/sms-otp-form?hl=en">autofill the form</a>?</li>
<li>how about <a href="https://passkeys.dev/">passkeys</a>?</li>
</ul>
<p>That being said, I don't think I ever implemented <strong>all</strong> of the above <em>perfectly</em>. There are always tradeoffs. But these are things to think about and make choices, and sometimes deliberate choices to postpone things (or just not implement them, after pondering the risks). Unfortunately, I did however see big mistakes in implementations of the various processes hinted at above.</p>
<p>Most of the time nowadays, I prefer offloading this to an identity provider, using <a href="https://openid.net/connect/">OpenID Connect</a> or soon <a href="https://developer.mozilla.org/en-US/docs/Web/API/FedCM_API">Federated Credential Management (FedCM)</a>, even if that means shipping an identity provider as part of the deliverables (I generally go with <a href="https://keycloak.org/">Keycloak</a>, with <a href="https://github.com/adorsys/keycloak-config-cli">keycloak-config-cli</a> to provision its configuration). I'm obviously biased though as I work in IT services, developing software mainly for intranets/extranets, and companies now increasingly have their own identity providers or at a minimum have that in their roadmap. So <abbr title="Your mileage may vary">YMMV</abbr>.</p>
<p>And we've only talked about authentication, not even authorization!</p>
<p>Some resources to go further:</p>
<ul>
<li>NIST <a href="https://pages.nist.gov/800-63-3/sp800-63b.html">SP 800-63-3</a></li>
<li>OWASP:
<ul>
<li><a href="https://cheatsheetseries.owasp.org/cheatsheets/Authentication_Cheat_Sheet.html">Authentication Cheat Sheet</a></li>
<li><a href="https://cheatsheetseries.owasp.org/cheatsheets/Password_Storage_Cheat_Sheet.html">Password Storage Cheat Sheet</a></li>
<li><a href="https://cheatsheetseries.owasp.org/cheatsheets/Multifactor_Authentication_Cheat_Sheet.html">Multifactor Authentication Cheat Sheet</a></li>
<li><a href="https://cheatsheetseries.owasp.org/cheatsheets/Forgot_Password_Cheat_Sheet.html">Forgot Password Cheat Sheet</a></li>
<li><a href="https://cheatsheetseries.owasp.org/cheatsheets/Credential_Stuffing_Prevention_Cheat_Sheet.html">Credential Stuffing Prevention Cheat Sheet</a></li>
<li><a href="https://cheatsheetseries.owasp.org/cheatsheets/Session_Management_Cheat_Sheet.html">Session Management Cheat Sheet</a></li>
</ul>
</li>
<li>Troy Hunt's <a href="https://www.troyhunt.com/everything-you-ever-wanted-to-know/">Everything you ever wanted to know about building a secure password reset feature</a></li>
<li>Google:
<ul>
<li><a href="https://web.dev/articles/sign-in-form-best-practices?hl=en">Sign-in form best practices</a></li>
<li><a href="https://web.dev/articles/sign-up-form-best-practices?hl=en">Sign-up form best practices</a></li>
<li><a href="https://web.dev/articles/change-password-url?hl=en">Help users change passwords easily by adding a well-known URL for changing passwords</a></li>
<li><a href="https://web.dev/articles/sms-otp-form?hl=en">SMS OTP form best practices</a></li>
<li><a href="https://developers.google.com/identity/passkeys/">Passwordless login with passkeys</a></li>
</ul>
</li>
</ul>

  ]]></content>
</entry>

<entry>
  <title type="html">How I teach Git</title>
  <link href="/teaching-git/" />
  <published>2023-11-26T00:00:00+0000</published>
  <updated>2023-11-26T00:00:00+0000</updated>
  <id>http://blog.ltgt.net/teaching-git/</id>
  
  
    <link rel="replies" href="https://dev.to/tbroyer/how-i-teach-git-3nj3/comments" />
  
  <content type="html" xml:base="/teaching-git/"><![CDATA[
    <p>I've been using Git for a dozen years.
Eight years ago, I had to give a training session on Git (and GitHub) to a partner company about to create an open source project, and I'm going to tell you here about the way I taught it.
Incidentally, we created internal training sessions at work since then that use the same (or similar) approach.
That being said, I didn't invent anything: this is heavily inspired by what others wrote before, including <a href="https://git-scm.com/book/">the <cite>Pro Git</cite> book</a>, though not in the same order, and that <abbr title="in my opinion">IMO</abbr> can make a difference.</p>
<p>The reason I'm writing this post is because over the years, I've kept seeing people actually <em>use</em> Git without really understanding what they're doing; they'd either be locked into a very specific workflow they were told to follow, and unable to adapt to another that, say, an open source project is using (this also applies to open source maintainers not really understanding how external contributors use Git themselves), or they'd be totally lost if anything doesn't behave the way they thought it would, or if they made a mistake invoking Git commands.
I've been inspired to write it down by <a href="https://jvns.ca">Julia Evans</a>' (renewed) interest in Git, as she sometimes asks for comments on social networks.</p>
<p>My goal is not to actually teach you about Git, but more about sharing my approach to teaching Git, for others who will teach to possibly take inspiration.
So if you're learning Git, this post was not written with you in mind (sorry), and as such might not be self-sufficient, but hopefully the links to other learning resources will be enough to fill in the blanks and make it a helpful learning resource as well.
If you're a visual learner, those external learning resources are illustrated, or even oriented towards visual learning.</p>
<h2 id="mental-model">Mental model</h2>
<p>Once we're clear why we use a VCS (Version Control System) where we record changes inside <em>commits</em> (or in other words we <em>commit our changes</em> to the history; I'm assuming some familiarity with this terminology), let's look at Git more specifically.</p>
<p>One thing I think is crucial to understand Git, is getting an accurate mental model of the concepts behind it.</p>
<p>First, that's not really important, but Git doesn't actually record <em>changes</em>, but rather <em>snapshots</em> of our files (at least conceptually; it will use <em>packfiles</em> to store things efficiently and will actually store <em>changes</em> –diffs– in some cases), and will generate diffs on-demand.
This sometimes shows in the result of some commands though (like why some commands show one file removed and another added, while other commands show a file being renamed).</p>
<p>Now let's dive into some Git concepts, or how Git implements some common VCS concepts.</p>
<h3 id="commit">Commit</h3>
<p>A Git <em>commit</em> is:</p>
<ul>
<li>one or more parent commit(s), or none for the very first commit (<em>root</em>)</li>
<li>a commit message</li>
<li>an author and an author date (actually a timestamp with timezone offset)</li>
<li>a committer and commit date</li>
<li>and our files: their pathname relative to the repository root, their <em>mode</em> (UNIX file-system permissions), and their content</li>
</ul>
<p>Each commit is given an identifier determined by computing the SHA1 hash of this information: change a comma and you get a different SHA1, a different <em>commit object</em>.
(<abbr title="For what it's worth">Fwiw</abbr>, Git is slowly <a href="https://git-scm.com/docs/hash-function-transition">moving to SHA-256</a> as the hashing function).</p>
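<p>To make this concrete, here's a minimal sketch in a throwaway repository (assuming Git 2.28 or later for <code>git init -b</code>; the identity and file names are placeholders I made up) that prints a raw commit object, then shows that rewording the message gives a different identifier:</p>

```shell
# Throwaway repository; the identity below is a placeholder.
repo=$(mktemp -d)
cd "$repo"
git init -q -b main
git config user.name "Example Author"
git config user.email "author@example.com"

echo "hello" > README
git add README
git commit -q -m "first commit"
echo "more" > NOTES
git add NOTES
git commit -q -m "second commit"

# Print the raw commit object: tree, parent, author, committer, then the message.
commit_text=$(git cat-file -p HEAD)
echo "$commit_text"

# Changing anything (here, only the message) yields a different commit object.
before=$(git rev-parse HEAD)
git commit -q --amend -m "second commit, reworded"
after=$(git rev-parse HEAD)
echo "$before became $after"
```

<p>The files themselves don't appear in the commit object: they're reached through the <code>tree</code> line.</p>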
<details>
<summary>Aside: how's the SHA1 computed?</summary>
<p>Git's storage is <em>content-addressed</em>, meaning that each <em>object</em> is stored with a name that's directly derived from its content, in the form of its SHA1 hash.</p>
<p>Historically, Git stored everything in files, and we can still reason that way.
A file's content is stored as a <em>blob</em>, a directory is stored as a <em>tree</em> (a text file that lists files in the directory with their name, mode, and the SHA1 of the <em>blob</em> representing their content, and their subdirectories with their name and the SHA1 of their <em>tree</em>).</p>
<p>If you want the details, Julia Evans wrote an amazing (again) <a href="https://jvns.ca/blog/2023/09/14/in-a-git-repository--where-do-your-files-live-/">blog post</a>; or you can read it <a href="https://git-scm.com/book/en/v2/Git-Internals-Git-Objects">from the <cite>Pro Git</cite> book</a>.</p>
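<p>As a quick illustration for the simplest case, a <em>blob</em>: its name is the SHA1 of a small header followed by the content. This sketch assumes the default SHA1 object format and that a <code>sha1sum</code> command is available (on macOS you'd probably reach for <code>shasum</code> instead):</p>

```shell
# The name Git gives to a file's content...
git_id=$(printf 'hello\n' | git hash-object --stdin)

# ...is the SHA1 of a header ("blob", the size in bytes, a NUL byte)
# followed by the content itself ("hello" plus a newline is 6 bytes).
manual_id=$(printf 'blob 6\0hello\n' | sha1sum | cut -d' ' -f1)

echo "$git_id"
echo "$manual_id"
```

<p>Both lines should print the same hash; trees and commits are named the same way, with their own header and content format.</p>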
</details>
<figure>
<img src=https://git-scm.com/book/en/v2/images/commit-and-tree.png width=800 height=443 alt='A graph with 5 boxes organized in 3 columns, each box labelled with a 5-digit SHA1 prefix; the one on the left is sub-labelled "commit" and includes metadata "tree" with the SHA1 of the box in the middle, and "author" and "committer" both with value "Scott", and text "The initial commit of my project"; the box in the middle is sub-labelled "tree" and includes three lines, each labelled "blob", with the SHA1 of the 3 remaining boxes and what looks like file names: "README", "LICENSE" and "test.rb"; the last 3 boxes, aligned vertically on the right are all sub-labelled "blob" and contain what looks like the beginning of a README, LICENSE, and Ruby source file content; there are arrows linking boxes: the commit points to the tree, which points to the blobs.'>
<figcaption>A commit and its tree (source: <a href=https://git-scm.com/book/en/v2/Git-Branching-Branches-in-a-Nutshell><cite>Pro Git</cite></a>)</figcaption>
</figure>
<p>The <em>parent commit(s)</em> in a <em>commit</em> create a <a href="https://en.wikipedia.org/wiki/Directed_acyclic_graph">directed acyclic graph</a> that represents our history:
a <dfn>directed acyclic graph</dfn> is made of nodes (our commits) linked together with directed edges (each commit links to its parent(s) commit(s), there's a direction, hence <em>directed</em>) and cannot have loops/cycles (a commit will never be its own ancestor, none of its ancestor commits will link to it as a parent commit).</p>
<figure>
<img src=https://git-scm.com/book/en/v2/images/commits-and-parents.png width=800 height=265 alt='A graph with 6 boxes arranged in 2 lines and 3 columns; each box on the first line is labelled with a 5-digit SHA1 prefix, sub-labelled "commit" and with metadata "tree" and "parent" both with a 5-digit SHA1 prefix –different each time–, "author" and "committer" both with value "Scott", and some text representing the commit message; the box on the left has no "parent" value, the two other boxes have as "parent" the SHA1 of the box on their left; there&apos;s an arrow between those boxes, pointing to the left representing the "parent"; incidentally, the box on the left has the same SHA1 and same content as the commit box from the above figure; finally, each commit box also points to a box beneath it each labelled "Snapshot A", "Snapshot B", etc. and possibly representing the "tree" object linked from each commit.'>
<figcaption>Commits and their parents (source: <a href=https://git-scm.com/book/en/v2/Git-Branching-Branches-in-a-Nutshell><cite>Pro Git</cite></a>)</figcaption>
</figure>
<h3 id="references-branches-and-tags">References, branches and tags</h3>
<p>Now SHA1 hashes are impractical to work with as humans, and while Git allows us to work with unique SHA1 prefixes instead of the full SHA1 hash, we'd need simpler names to refer to our commits: enter <dfn>references</dfn>.
Those are <em>labels</em> for our commits that <em>we</em> chose (rather than Git).</p>
<p>There are several kinds of <em>references</em>:</p>
<ul>
<li><dfn>branches</dfn> are <em>moving</em> references (note that <code>main</code> or <code>master</code> aren't special in any way, their name is only a convention)</li>
<li><dfn>tags</dfn> are <em>immutable</em> references</li>
<li><dfn><code>HEAD</code></dfn> is a special reference that points to the <em>current commit</em>.
It generally points to a branch rather than directly to a commit (we'll see why later).
When a reference points to another reference, this is called a <a href="/confusing-git-terminology/#reference-symbolic-reference"><dfn>symbolic reference</dfn></a>.</li>
<li>there are other special references (<code>FETCH_HEAD</code>, <code>ORIG_HEAD</code>, etc.) that Git will set up for you during some operations</li>
</ul>
<figure>
<img src=https://git-scm.com/book/en/v2/images/branch-and-history.png width=800 height=430 alt='A graph with 9 boxes; 6 boxes are arranged the same as the above figure, and are labelled the same (three commits and their 3 trees); two boxes above the right-most (latest) commit, with arrows pointing towards it, are labelled "v1.0" and "master" respectively; the last box is above the "master" box, with an arrow pointing towards it, and is labelled "HEAD".'>
<figcaption>A branch and its commit history (source: <a href=https://git-scm.com/book/en/v2/Git-Branching-Branches-in-a-Nutshell><cite>Pro Git</cite></a>)</figcaption>
</figure>
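<p>You can look at those references yourself; a minimal sketch, again in a throwaway repository with a placeholder identity (and assuming Git 2.28 or later for <code>git init -b</code>):</p>

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q -b main
git config user.name "Example Author"
git config user.email "author@example.com"
git commit -q --allow-empty -m "first commit"
git tag v1.0

# HEAD is a symbolic reference: it names a branch rather than a commit.
head_target=$(git symbolic-ref HEAD)
echo "$head_target"   # prints refs/heads/main

# HEAD, the branch, and the tag all resolve to the same commit here.
git rev-parse HEAD main v1.0

# List every reference along with the commit it points to.
git show-ref
```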
<h3 id="the-three-states">The three states</h3>
<p>When you work in a Git repository, the files that you manipulate and record in the Git history are in your <em>working directory</em>.
To create commits, you'll <em>stage</em> files in the <a href="/confusing-git-terminology/#index-staged-cached"><em>index</em></a> or <em>staging area</em>.
When that's done you attach a commit message and move your <em>staged</em> files to the <em>history</em>.</p>
<p>And to close the loop, the <em>working directory</em> is initialized from a given commit from your <em>history</em>.</p>
<figure>
<img src=https://git-scm.com/book/en/v2/images/areas.png width=800 height=441 alt='A sequence diagram with 3 participants: "Working Directory", "Staging Area", and ".git directory (Repository)"; there&apos;s a "Checkout the project" message from the ".git directory" to the "Working Directory", then "Stage Fixes" from the "Working Directory" to the "Staging Area", and finally "Commit" from the "Staging Area" to the ".git directory".'>
<figcaption>Working tree, staging area, and Git directory (source: <a href="https://git-scm.com/book/en/v2/Getting-Started-What-is-Git%3F#_the_three_states"><cite>Pro Git</cite></a>)</figcaption>
</figure>
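<p>Here's a small sketch of a change going through the three states, as reported by <code>git status --short</code> (throwaway repository, placeholder identity and file name, Git 2.28 or later assumed for <code>git init -b</code>):</p>

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q -b main
git config user.name "Example Author"
git config user.email "author@example.com"
echo "first draft" > notes.txt
git add notes.txt
git commit -q -m "add notes"

# Edit the file: the change only exists in the working directory.
echo "second draft" > notes.txt
unstaged=$(git status --short)
echo "$unstaged"   # prints " M notes.txt" (M in the second column)

# Stage it: the change is now in the index, ready for the next commit.
git add notes.txt
staged=$(git status --short)
echo "$staged"     # prints "M  notes.txt" (M in the first column)

# Commit it: the change is in the history, nothing is left to report.
git commit -q -m "revise notes"
clean=$(git status --short)
```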
<h3 id="aside-ignoring-files">Aside: ignoring files</h3>
<p>Not all files need to have their history <em>tracked</em>: those generated by your build system (if any), those specific to your editor, and those specific to your operating system or other work environment.</p>
<p>Git allows defining naming patterns of files or directories to ignore.
This does not actually mean that Git will ignore them and they cannot be <em>tracked</em>, but that if they're not tracked, several Git operations won't show them to you or manipulate them (but you can manually add them to your history, and from then on they'll no longer be <em>ignored</em>).</p>
<p>Ignoring files is done by putting their pathname (possibly using globs) in ignore files:</p>
<ul>
<li><code>.gitignore</code> files anywhere in your repository define ignore patterns for the containing directory;
those ignore files are tracked in history as a means to share them between developers;
this is where you'll ignore those files generated by your build system
(<code>build/</code> for Gradle projects, <code>_site/</code> for an Eleventy website, etc.)</li>
<li><code>.git/info/exclude</code> is local to the repository on your machine; rarely used but sometimes useful so good to know about</li>
<li>and finally <code>~/.config/git/ignore</code> is global to the machine (for your user); this is where you'll ignore files that are specific to your machine, such as those specific to the editors you use, or those specific to your operating system (e.g. the <code>.DS_Store</code> on macOS, or <code>Thumbs.db</code> on Windows)</li>
</ul>
<h3 id="summing-up">Summing up</h3>
<p>Here's another representation of all those concepts:</p>
<figure>
<img src=https://marklodato.github.io/visual-git-guide/conventions.svg width=907 height=529 alt='A graph with 10 boxes; 5 boxes are arranged as a line in the center, labelled with 5-digit SHA1 prefixes and with arrows between them pointing from right to left; a note describes them as "commit objects, identified by SHA-1 hash", another note describes one of the arrows as "child points to a parent"; a pair of boxes (looking like a single box split horizontally in two boxes) is above the right-most (latest) commit, with an arrow pointing down towards it, the upper box of the pair is labelled "HEAD" and described as "reference to the current branch"; the  lower box is labelled "main" and described as "current branch"; a seventh box is above another commit, with an arrow pointing down towards it; it&apos;s labelled "stable" and described as "another branch"; the last two boxes are under the commit history, one above the other; the bottom-most box is labelled "Working Directory" and described as "files that you &apos;see&apos;", the other box, between it and the commit history, is labelled "Stage (Index)" and described as "files to go in the next commit".'>
<figcaption>Commits, references, and areas (source: <a href=https://marklodato.github.io/visual-git-guide/index-en.html#conventions><cite>A Visual Git Reference</cite></a>, Mark Lodato)</figcaption>
</figure>
<h2 id="basic-operations">Basic operations</h2>
<p>This is where we start talking about Git commands, and how they interact with the graph:</p>
<ul>
<li><code>git init</code> to initialize a new repository</li>
<li><code>git status</code> to get a summary of your files' state</li>
<li><code>git diff</code> to show changes between any two of your working directory, the index, the <code>HEAD</code>, or actually any commit</li>
<li><code>git log</code> to show and search into your history</li>
<li>creating commits
<ul>
<li><code>git add</code> to add files to the <em>index</em></li>
<li><code>git commit</code> to transform the <em>index</em> into a <em>commit</em> (with an added <em>commit message</em>)</li>
<li><code>git add -p</code> to add files interactively to the <em>index</em>:
pick which changes to add and which ones to leave only in your working directory,
on a file-by-file, part-by-part (called <em>hunk</em>) basis</li>
</ul>
</li>
<li>managing branches
<ul>
<li><code>git branch</code> to show branches, or create a branch</li>
<li><code>git switch</code> (also <code>git checkout</code>) to check out a branch (or any commit, any <em>tree</em>, actually) to your working directory</li>
<li><code>git switch -c</code> (also <code>git checkout -b</code>) as a shortcut for <code>git branch</code> and <code>git switch</code></li>
</ul>
</li>
<li><code>git grep</code> to search into your working directory, index, or any commit;
this is kind of an enhanced <code>grep -R</code> that's aware of Git</li>
<li><code>git blame</code> to know the last commit that changed each line of a given file (so, who to blame for a bug)</li>
<li><code>git stash</code> to put uncommitted changes aside (this includes <em>staged</em> files, as well as <em>tracked</em> files from the working directory), and later <em>unstash</em> them.</li>
</ul>
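<p>Since <code>git diff</code> is the one whose arguments tend to confuse people, here's a minimal sketch of its three most common forms (throwaway repository, placeholder identity and file name, Git 2.28 or later assumed for <code>git init -b</code>):</p>

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q -b main
git config user.name "Example Author"
git config user.email "author@example.com"
printf 'one\n' > file.txt
git add file.txt
git commit -q -m "one"

# Stage a first change, then make a second one without staging it.
printf 'one\ntwo\n' > file.txt
git add file.txt
printf 'one\ntwo\nthree\n' > file.txt

git diff            # working directory vs index: adds "three"
git diff --staged   # index vs HEAD: adds "two"
git diff HEAD       # working directory vs HEAD: adds "two" and "three"
```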
<h3 id="commit-branch-switching-and-head">Commit, branch switching, and HEAD</h3>
<p>When you create a commit (with <code>git commit</code>), Git not only creates the <em>commit object</em>, it also moves the <code>HEAD</code> to point to it.
If the <code>HEAD</code> actually points to a branch, as is generally the case, Git will move that branch to the new commit (and <code>HEAD</code> will continue to point to the branch).
Whenever the current branch is an ancestor of another branch (the commit pointed by the branch is also part of another branch), committing will move <code>HEAD</code> (and the current branch) the same way, and the two branches will <em>diverge</em>.</p>
<p>When you switch to another branch (with <code>git switch</code> or <code>git checkout</code>), <code>HEAD</code> moves to the new current branch, and your working directory and index are set up to resemble the state of that commit (uncommitted changes are tentatively kept; if Git is unable to do it, it will refuse the switch).</p>
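<p>A minimal sketch of two branches diverging (throwaway repository, placeholder identity and branch names, empty commits to keep it short, Git 2.28 or later assumed for <code>git init -b</code>):</p>

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q -b main
git config user.name "Example Author"
git config user.email "author@example.com"
git commit -q --allow-empty -m "A"
fork=$(git rev-parse HEAD)

# Create a second branch on the same commit, without switching to it.
git branch topic

# Committing moves the current branch (main); topic stays where it was.
git commit -q --allow-empty -m "B on main"

# Switch and commit again: the two branches have now diverged.
git switch -q topic
git commit -q --allow-empty -m "C on topic"

git log --graph --oneline --all
```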
<p>For more details, and visual representations, see the <a href="https://marklodato.github.io/visual-git-guide/index-en.html#commit">commit</a> and <a href="https://marklodato.github.io/visual-git-guide/index-en.html#checkout">checkout</a> sections of Mark Lodato's <cite>A Visual Git Reference</cite> (be aware that this reference was written years ago, when <code>git switch</code> and <code>git restore</code> didn't exist and <code>git checkout</code> was all we had; so the <em>checkout</em> section covers a bit more than <code>git switch</code> as a result).
Of course, the <cite>Pro Git</cite> book is also a good reference with visual representations; <a href="https://git-scm.com/book/en/v2/Git-Branching-Branches-in-a-Nutshell">the <cite>Branches in a Nutshell</cite> subchapter</a> covers a big part of all of the above.</p>
<h3 id="aside-git-is-conservative">Aside: Git is conservative</h3>
<p>As we've seen above, due to its <em>content-addressed storage</em>, any “change” to a commit (with <code>git commit --amend</code> for instance) will actually result in a different commit (different SHA1).
The <em>old commit</em> won't disappear immediately: Git uses <em>garbage collection</em> to eventually delete commits that aren't reachable from any <em>reference</em>.
This means that many mistakes can be recovered if you manage to find the commit SHA1 back (<code>git reflog</code> can help here, or the notation <code>&lt;branch-name&gt;@{&lt;n&gt;}</code>, e.g. <code>main@{1}</code> for the last commit that <code>main</code> pointed to before it changed).</p>
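<p>For example, here's a sketch of finding the commit that an amend replaced (throwaway repository, placeholder identity, Git 2.28 or later assumed for <code>git init -b</code>):</p>

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q -b main
git config user.name "Example Author"
git config user.email "author@example.com"
git commit -q --allow-empty -m "first"
git commit -q --allow-empty -m "second, with a typo"
original=$(git rev-parse HEAD)

git commit -q --allow-empty --amend -m "second"

# The branch moved to a new commit, but its reflog remembers where it was.
git reflog main
previous=$(git rev-parse 'main@{1}')

# The old commit object is still there and can be inspected, or given
# a reference again (e.g. with: git branch rescued "$previous").
git cat-file -t "$previous"
```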
<h3 id="working-with-branches">Working with branches</h3>
<p>We've seen above how branches can diverge.
But diverging calls for eventually <em>merging</em> changes back (with <code>git merge</code>).
Git is very good at that (as we'll see later).</p>
<p>A special case of merging is when the current branch is an ancestor of the branch being merged.
In this case, Git can do a <a href="/confusing-git-terminology/#can-be-fast-forwarded"><dfn>fast-forward merge</dfn></a>.</p>
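<p>A minimal sketch of such a merge (throwaway repository, placeholder identity and branch names, Git 2.28 or later assumed for <code>git init -b</code>); I'm using <code>--ff-only</code> here so that Git would refuse rather than create a merge commit if a fast-forward weren't possible:</p>

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q -b main
git config user.name "Example Author"
git config user.email "author@example.com"
git commit -q --allow-empty -m "A"
git switch -q -c topic
git commit -q --allow-empty -m "B"
git commit -q --allow-empty -m "C"

# Back on main, which is an ancestor of topic.
git switch -q main
git merge --ff-only topic

# No merge commit was created: main simply moved to where topic points.
git log --graph --oneline --all
```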
<p>Because operations between two branches will likely always target the same pair of branches, Git allows you to set up a branch to <em>track</em> another branch.
That other branch will be called the <em>upstream</em> of the branch that <em>tracks</em> it.
When set up, <code>git status</code> will, for example, tell you how much the two branches have diverged from one another: is the current branch <a href="/confusing-git-terminology/#your-branch-is-up-to-date-with-originmain"><em>up to date</em></a> with its upstream branch, <em>behind it</em> and <a href="/confusing-git-terminology/#can-be-fast-forwarded">can be fast-forwarded</a>, <em>ahead</em> by a number of commits, or have they diverged, each by some number of commits.
Other commands will use that information to provide good default values for parameters so they can be omitted.</p>
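<p>Here's a sketch of what that looks like, using a local branch as the upstream to keep things simple (throwaway repository, placeholder identity and branch names, Git 2.28 or later assumed for <code>git init -b</code>):</p>

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q -b main
git config user.name "Example Author"
git config user.email "author@example.com"
git commit -q --allow-empty -m "A"

git switch -q -c topic
git branch -q --set-upstream-to=main
git commit -q --allow-empty -m "B on topic"
git switch -q main
git commit -q --allow-empty -m "C on main"
git switch -q topic

# The first line reports the divergence: [ahead 1, behind 1]
git status --short --branch

# The same numbers on their own: commits only on topic, then only on its upstream.
counts=$(git rev-list --left-right --count 'topic...topic@{upstream}')
echo "$counts"
```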
<p>To integrate changes from another branch, rather than merging, another option is to <em>cherry-pick</em> (with the same-named command) a single commit, without its history:
Git will compute the changes brought in by that commit and apply the same changes to the current branch, creating a new commit similar to the original one
(if you want to know more about how Git actually does it, see Julia Evans' <a href="https://jvns.ca/blog/2023/11/10/how-cherry-pick-and-revert-work/"><cite>How git cherry-pick and revert use 3-way merge</cite></a>).</p>
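<p>As a minimal sketch (throwaway repository; the identity, branch and file names are placeholders; Git 2.28 or later assumed for <code>git init -b</code>), picking a single fix out of a branch that also contains unfinished work:</p>

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q -b main
git config user.name "Example Author"
git config user.email "author@example.com"
echo "base" > base.txt
git add base.txt
git commit -q -m "base"

git switch -q -c topic
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "fix a bug"
fix=$(git rev-parse HEAD)
echo "wip" > wip.txt
git add wip.txt
git commit -q -m "work in progress"

git switch -q main
echo "other" > other.txt
git add other.txt
git commit -q -m "unrelated change on main"

# Bring only the fix over to main, not the rest of topic's history.
git cherry-pick "$fix"
picked=$(git rev-parse HEAD)
git log --graph --oneline --all
```

<p>The new commit on <code>main</code> has the same message and brings the same change, but it's a different commit (different parent, hence different SHA1) than the one on <code>topic</code>.</p>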
<p>Finally, another command in your toolbelt is <code>rebase</code>.
You can see it as a way to do many cherry-picks at once but it's actually much more powerful (as we'll see below).
In its basic use though, it's just that: you give it a range of commits (between any commit as the starting point and an existing branch as the end point, defaulting to the current one) and a target, and it cherry-picks all those commits on top of the target and finally updates the branch used as the end point.
The command here is of the form <code>git rebase --onto=&lt;target&gt; &lt;start&gt; &lt;end&gt;</code>.
As with many Git commands, arguments can be omitted and will have default values and/or specific meanings: thus, <code>git rebase</code> is a shorthand for <code>git rebase --fork-point upstream</code> where <code>upstream</code> is the <a href="/confusing-git-terminology/#untracked-files-remote-tracking-branch-track-remote-branch">upstream</a> of the current branch (I'll ignore <code>--fork-point</code> here, its effect is subtle and not that important in every-day use), which itself is a shorthand for <code>git rebase upstream HEAD</code> (where <code>HEAD</code> must point to a branch), itself a shorthand for <code>git rebase --onto=upstream upstream HEAD</code>, a shorthand for <code>git rebase --onto=upstream $(git merge-base upstream HEAD) HEAD</code>, and will rebase all commits between the last common ancestor of <code>upstream</code> and the current branch on one hand and the current branch (i.e. all commits since they diverged) on the other hand, and will reapply them on top of <code>upstream</code>, then update the current branch to point to the new commits.
Explicit use of <code>--onto</code> (with a value different from the starting point) is rare actually, see <a href="/confusing-git-terminology/#git-rebase---onto">my previous post</a> for one use case.</p>
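<p>Here's a sketch of the fully spelled-out form on a small diverged history (throwaway repository; the identity, branch and file names are placeholders; Git 2.28 or later assumed for <code>git init -b</code>):</p>

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q -b main
git config user.name "Example Author"
git config user.email "author@example.com"
git commit -q --allow-empty -m "A"

git switch -q -c topic
echo 1 > one.txt
git add one.txt
git commit -q -m "topic 1"
echo 2 > two.txt
git add two.txt
git commit -q -m "topic 2"

git switch -q main
echo m > main.txt
git add main.txt
git commit -q -m "B on main"

# Replay the commits between the common ancestor and topic on top of main,
# then update topic (and switch to it).
git rebase -q --onto=main "$(git merge-base main topic)" topic

git log --graph --oneline --all
```

<p>With <code>main</code> set as the upstream of <code>topic</code>, and <code>topic</code> checked out, the same thing could be written as a bare <code>git rebase</code>.</p>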
<p>We cannot present <code>git rebase</code> without its interactive variant <code>git rebase -i</code>:
it starts with exactly the same behavior as the non-interactive variant,
but after computing what needs to be done, it'll allow you to edit it (as a text file in an editor, one action per line).
By default, all selected commits are cherry-picked, but you'll be able to reorder them, to skip some commit(s), or even combine some into a single commit.
You can actually cherry-pick a commit that was not initially selected, and even create merge commits, thus entirely rewriting the whole history!
Finally, you can also stop on a commit to <em>edit</em> it (using <code>git commit --amend</code> then, and/or possibly create new commits before continuing with the rebase), and/or run a given command between two commits.
This last option is so useful (to e.g. validate that you didn't break your project at each point of the history) that you can pass that command in an <code>--exec</code> option and Git will execute it between each rebased commit (this works with non-interactive rebase too; in interactive mode you'll see execution lines inserted between each cherry-pick line when given the ability to edit the rebase scenario).</p>
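<p>A sketch of <code>--exec</code> with a non-interactive rebase (throwaway repository; the identity, branch and file names are placeholders; Git 2.28 or later assumed for <code>git init -b</code>); the command here only records which commit it ran after, where you'd normally run your build or tests:</p>

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q -b main
git config user.name "Example Author"
git config user.email "author@example.com"
git commit -q --allow-empty -m "A"

git switch -q -c topic
echo 1 > one.txt
git add one.txt
git commit -q -m "topic 1"
echo 2 > two.txt
git add two.txt
git commit -q -m "topic 2"

git switch -q main
echo m > main.txt
git add main.txt
git commit -q -m "B on main"
git switch -q topic

# Rebase topic onto main, running a command after each replayed commit;
# the log file lives outside the repository so the working directory stays clean.
log=$(mktemp)
git rebase --exec "git log -1 --format=%s >> '$log'" main

cat "$log"
```

<p>If the command fails (non-zero exit status), the rebase stops right there so you can fix things, then <code>git rebase --continue</code>.</p>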
<p>For more details, and visual representations, see the <a href="https://marklodato.github.io/visual-git-guide/index-en.html#merge">merge</a>, <a href="https://marklodato.github.io/visual-git-guide/index-en.html#cherry-pick">cherry pick</a>, and <a href="https://marklodato.github.io/visual-git-guide/index-en.html#rebase">rebase</a> sections of Mark Lodato's <cite>A Visual Git Reference</cite>, and the <a href="https://git-scm.com/book/en/v2/Git-Branching-Basic-Branching-and-Merging"><cite>Basic Branching and Merging</cite></a>, <a href="https://git-scm.com/book/en/v2/Git-Branching-Rebasing"><cite>Rebasing</cite></a>, and <a href="https://git-scm.com/book/en/v2/Git-Tools-Rewriting-History"><cite>Rewriting History</cite></a> subchapters of the <cite>Pro Git</cite> book.
You can also look at the “branching and merging” diagrams from David Drysdale's <a href="https://lurklurk.org/gitpix/gitpix.html"><cite>Git Visual Reference</cite></a>.</p>
<h2 id="working-with-others">Working with others</h2>
<p>For now, we've only ever worked locally in our repository.
But Git was specifically built to work with others.</p>
<p>Let me introduce <em>remotes</em>.</p>
<h3 id="remotes">Remotes</h3>
<p>When you <em>clone</em> a repository, that repository becomes a <dfn>remote</dfn> of your local repository, named <code>origin</code> (just like with the <code>main</code> branch, this is just the default value and there's nothing special about the name itself, besides sometimes being used as the default value when a command argument is omitted).
You'll then start working, creating local commits and branches (therefore <em>forking</em> from the remote), and the remote will probably get some more commits and branches from its author in the meantime.
You'll thus want to synchronize those remote changes into your local repository, and want to quickly know what changes you made locally compared to the remote.
The way Git handles this is by recording the state of the remote it knows about (the branches, mainly) in a special namespace: <code>refs/remotes/</code>.
Those are known as <a href="/confusing-git-terminology/#untracked-files-remote-tracking-branch-track-remote-branch"><dfn>remote-tracking branches</dfn></a>.
Fwiw, local branches are stored in the <code>refs/heads/</code> namespace, and tags in <code>refs/tags/</code> (tags from remotes are generally <em>imported</em> right into <code>refs/tags/</code>, so for instance you lose the information of where they came from).
You can have as many remotes as needed, each with a name.
(Note that remotes don't necessarily live on other machines, they can actually be on the same machine, accessed directly from the filesystem, so you can play with remotes without having to set up anything.)</p>
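<p>As a quick sketch of that last point, the following clones a repository straight from a directory and lists the references it ends up with (the directory names and the <code>demo</code> identity are assumptions made for the example):</p>
<pre class="language-shell"><code class="language-shell">set -eu
# Throwaway identity so that commits work even without any Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/main
git -C upstream commit -q --allow-empty -m "first commit"
# Cloning from a plain directory: no server, no network.
git clone -q upstream clone
cd clone
git remote -v
# Local branches live in refs/heads/, remote-tracking branches in refs/remotes/.
git for-each-ref --format="%(refname)"</code></pre>
<p>The listing should include both <code>refs/heads/main</code> and <code>refs/remotes/origin/main</code>, pointing to the same commit.</p>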
<h3 id="fetching">Fetching</h3>
<p>Whenever you <em>fetch</em> from a remote (using <code>git fetch</code>, <code>git pull</code>, or <code>git remote update</code>), Git will talk to it to download the commits it doesn't yet know about, and will update the <em>remote-tracking branches</em> for the remote.
The exact set of references to be fetched, and where they're stored locally, can be passed to the <code>git fetch</code> command (as <a href="/confusing-git-terminology/#refspecs">refspecs</a>); the default value is defined in your repository's <code>.git/config</code>, and is configured by <code>git clone</code> or <code>git remote add</code> to take all branches (everything in <code>refs/heads/</code> on the remote) and put them in <code>refs/remotes/&lt;remote&gt;/</code> (so <code>refs/remotes/origin/</code> for the <code>origin</code> remote), with the same name (so <code>refs/heads/main</code> on the remote becomes <code>refs/remotes/origin/main</code> locally).</p>
<figure>
<img src=https://git-scm.com/book/en/v2/images/remote-branches-5.png width=800 height=577 alt='A diagram with 3 big boxes, representing machines or repositories, containing smaller boxes and arrows representing commit histories; one box is labelled "git.ourcompany.com", sublabelled "origin", and includes commits in a branch named "master"; another box is labelled "git.team1.ourcompany.com", sublabelled "teamone", and includes commits in a branch named "master"; the commit SHA1 hashes are the same in "origin" and "teamone" except "origin" has one more commit on its "master" branch, i.e. "teamone" is "behind"; the third box is labelled "My Computer", it includes the same commits as the other two boxes, but this time the branches are named "origin/master" and "teamone/master"; it also includes two more commits in a branch named "master", diverging from an earlier point of the remote branches.'>
<figcaption>Remotes and remote-tracking branches (source: <a href=https://git-scm.com/book/en/v2/Git-Branching-Remote-Branches><cite>Pro Git</cite></a>)</figcaption>
</figure>
<p>You'll then use branch-related commands to get changes from a <em>remote-tracking branch</em> to your local branch (<code>git merge</code> or <code>git rebase</code>), or <code>git pull</code> which is hardly more than a shorthand for <code>git fetch</code> followed by a <code>git merge</code> or <code>git rebase</code>.
<abbr title="By the way">BTW</abbr>, in a number of situations, Git will automatically set up a <em>remote-tracking branch</em> to be the <em>upstream</em> of a local branch when you create it (it will tell you about it when that happens).</p>
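<p>Here's a minimal sketch of a fetch followed by a fast-forward merge, using two throwaway repositories on the same machine (the directory names and the <code>demo</code> identity are made up for the example):</p>
<pre class="language-shell"><code class="language-shell">set -eu
# Throwaway identity so that commits work even without any Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/main
git -C upstream commit -q --allow-empty -m "B"
git clone -q upstream clone
# The default fetch refspec written by git clone.
git -C clone config --get remote.origin.fetch
# The remote moves on after we cloned.
git -C upstream commit -q --allow-empty -m "C"
cd clone
git fetch -q origin
# Only the remote-tracking branch moved: main is still at B, origin/main at C.
git --no-pager log --oneline --all --decorate
# Bring the local branch up to date, refusing anything but a fast-forward.
git merge -q --ff-only origin/main</code></pre>
<p>Running <code>git pull --ff-only</code> instead of the last two Git commands should give the same end result here.</p>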
<h3 id="pushing">Pushing</h3>
<p>To share your changes with others, they can either add your repository as a remote and <em>pull</em> from it (implying accessing your machine across the network), or you can <em>push</em> to a remote.
(If you ask someone to pull changes from your remote, this is called a… <em>pull request</em>, a term you'll have probably heard of from GitHub or similar services.)</p>
<p>Pushing is similar to fetching, in reverse: you'll send your commits to the remote and update its branch to point to the new commits.
As a safety measure, Git only allows remote branches to be <em>fast-forwarded</em>;
if you want to push changes that would update the remote branch in a non-fast-forward way, you'll have to <em>force</em> it, using <code>git push --force-with-lease</code> (or <code>git push --force</code>, but be careful: <code>--force-with-lease</code> will first ensure your <em>remote-tracking branch</em> is up-to-date with the remote's branch, to make sure nobody pushed changes to the branch since the last time you <em>fetched</em>; <code>--force</code> won't do that check, doing what you're telling it to do, at your own risk).</p>
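<p>The following sketch triggers that safety measure on purpose: it rewrites a commit that was already pushed, lets the plain push be rejected, then forces it with <code>--force-with-lease</code> (the bare <code>remote.git</code> repository, the <code>work</code> directory and the <code>demo</code> identity are assumptions for the example):</p>
<pre class="language-shell"><code class="language-shell">set -eu
# Throwaway identity so that commits work even without any Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
root=$(mktemp -d)
cd "$root"
git init -q --bare remote.git
git init -q work
cd work
git symbolic-ref HEAD refs/heads/main
git remote add origin "$root/remote.git"
git commit -q --allow-empty -m "first"
git push -q origin main
# Rewrite the commit we just shared: pushing is no longer a fast-forward.
git commit -q --amend --allow-empty -m "first, reworded"
if git push -q origin main; then
  pushed=plain
else
  # Rejected; force it, but only if the remote branch has not moved
  # since our remote-tracking branch was last updated.
  git push -q --force-with-lease origin main
  pushed=forced
fi
echo "$pushed"</code></pre>
<p>The rejected push prints an error and some hints before the script falls back to the forced push.</p>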
<p>As with <code>git fetch</code>, you pass the branches to update to the <code>git push</code> command, but Git provides a good default behavior if you don't.
If you don't specify anything, Git will infer the remote from the <em>upstream</em> of the current branch, so most of the time <code>git push</code> is equivalent to <code>git push origin</code>.
This actually is a shorthand for <code>git push origin main</code> (assuming the current branch is <code>main</code>), itself a shorthand for <code>git push origin main:main</code>, shorthand for <code>git push origin refs/heads/main:refs/heads/main</code>, meaning to push the local <code>refs/heads/main</code> to the <code>origin</code> remote's <code>refs/heads/main</code>.
See <a href="/confusing-git-terminology/#refspecs">my previous post</a> for some use cases of specifying <em>refspecs</em> with differing source and destination.</p>
<figure>
<img src=https://lurklurk.org/gitpix/push2.svg width=1052 height=744 alt='A diagram representing a "git push" command, with four git graph diagrams (dots, some labelled, connected by lines) arranged in two lines and two columns; an arrow in between the columns implies that the left column is a "before" state and the right column an "after" state; graphs on the above line are inside a cloud, representing a remote repository, and have two branches, "master" and "other", that diverged from a common ancestor; the bottom left diagram has the same shape as the one above it except the labels are changed to "origin/master" and "origin/other" and each branch has more commits: the "master" branch has two additional commits compared to "origin/master", and "other" has one more commit than "origin/other"; the top right diagram has two more commits in its "master" branch compared to the top left diagram; the bottom right diagram is identical to the bottom left one except "origin/master" now points to the same commit as "master"; in other words, in the "before" state, the remote lacked three commits, and after the "git push" the two commits from the local "master" branch were copied to the remote while "other" was left untouched.'>
<figcaption><code>git push</code> (source: <a href=https://lurklurk.org/gitpix/gitpix.html><cite>Git Visual Reference</cite></a>, David Drysdale)</figcaption>
</figure>
<p>For more details, and visual representations, see the <a href="https://git-scm.com/book/en/v2/Git-Branching-Remote-Branches"><cite>Remote Branches</cite></a>, <a href="https://git-scm.com/book/en/v2/Git-Basics-Working-with-Remotes"><cite>Working with Remotes</cite></a>, and <a href="https://git-scm.com/book/en/v2/Distributed-Git-Contributing-to-a-Project"><cite>Contributing to a Project</cite></a> subchapters of the <cite>Pro Git</cite> book, and the “dealing with remote repositories” diagrams from David Drysdale's <a href="https://lurklurk.org/gitpix/gitpix.html"><cite>Git Visual Reference</cite></a>.
The <cite>Contributing to a Project</cite> chapter of <cite>Pro Git</cite> also touches on contributing to open source projects on platforms like GitHub, where you have to first <em>fork</em> the repository, and contribute through <em>pull requests</em> (or <em>merge requests</em>).</p>
<h2 id="best-practices">Best practices</h2>
<p>Those are directed towards beginners, and hopefully not too controversial.</p>
<p>Try to keep a <em>clean</em> history:</p>
<ul>
<li>use merge commits wisely</li>
<li>clear and high-quality commit messages (see the <a href="https://git-scm.com/book/en/v2/Distributed-Git-Contributing-to-a-Project#_commit_guidelines"><cite>commit guidelines</cite></a> in <cite>Pro Git</cite>)</li>
<li>make <em>atomic</em> commits: each commit should compile and run independently of the commits following it in the history</li>
</ul>
<p>This only applies to the history you share with others.
Locally, do however you want.
For beginners, I'd give the following advice though:</p>
<ul>
<li>don't work directly on <code>main</code> (or <code>master</code>, or any branch that you don't specifically <em>own</em> on the remote as well), create local branches instead;
it helps decouple work on different tasks: about to start working on another bug or feature while waiting for additional details or instructions on the current one? switch to another branch, you'll get back to that later by switching back;
it also makes it easier to update from the remote as you're sure you won't have conflicts if your local branches are simply copies of the remote ones of the same name, without any local change (except when you want to push those changes to that branch)</li>
<li>don't hesitate to rewrite your commit history (<code>git commit --amend</code> and/or <code>git rebase -i</code>), but don't do it too early; it's more than OK to stack many small commits while working, and only rewrite/cleanup the history before you share it</li>
<li>similarly, don't hesitate to rebase your local branches to integrate upstream changes (until you shared that branch, at which point you'll follow the project's own branching workflow)</li>
</ul>
<p>If anything goes wrong and you're lost, my advice is to use <code>gitk</code> or <code>gitk HEAD @{1}</code>, also possibly <code>gitk --all</code> (I'm using <code>gitk</code> here but use whichever tool you prefer), to visualize your Git history and try to understand what happened.
From this, you can roll back to the previous state (<code>git reset @{1}</code>) or try to fix things (cherry-picking a commit, etc.).
And if you're in the middle of a rebase, or possibly a failed merge, you can abort and roll back to the previous state with commands like <code>git rebase --abort</code> or <code>git merge --abort</code>.</p>
<p>To make things even easier, don't hesitate, before any possibly destructive command (<code>git rebase</code>), to create a branch or a tag as a &quot;bookmark&quot; you can easily reset to if things don't go as expected.
And of course, inspect the history and files after such a command to make sure the outcome is the one you expected.</p>
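<p>A sketch of that &quot;bookmark&quot; habit, with a tag named <code>before-rebase</code> (the tag name, the branch and file names, the <code>commit</code> helper and the <code>demo</code> identity are all assumptions for the example):</p>
<pre class="language-shell"><code class="language-shell">set -eu
# Throwaway identity so that commits work even without any Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/main
# Hypothetical helper: one new file and one commit per call.
commit() { touch "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }
commit A
git switch -q -c feature
commit C
git switch -q main
commit B
git switch -q feature
# Bookmark the current state before the possibly destructive command.
git tag before-rebase
git rebase -q main
rebased=$(git rev-parse HEAD)
# Not happy with the outcome? Move the branch back to the bookmark...
git reset -q --hard before-rebase
restored=$(git rev-parse HEAD)
bookmark=$(git rev-parse before-rebase)
# ...and drop the bookmark once it's no longer needed.
git tag -d before-rebase</code></pre>
<p>Until you delete it, the tag keeps the pre-rebase commits reachable under an explicit name, so you don't have to dig them out of the reflog.</p>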
<h2 id="advanced-concepts">Advanced concepts</h2>
<p>Only a few of them, there are many more to explore!</p>
<ul>
<li>Detached <code>HEAD</code>: the <a href="https://git-scm.com/docs/git-checkout#_detached_head"><code>git checkout</code> manpage</a> has a good section on the topic, also see <a href="/confusing-git-terminology/#detached-head-state">my previous post</a>, and for a good visual representation, see the <a href="https://marklodato.github.io/visual-git-guide/index-en.html#detached"><cite>Committing with a Detached HEAD</cite></a> section of Mark Lodato's <cite>A Visual Git Reference</cite>.</li>
<li>Hooks: those are executables (shell scripts most of the time) that Git will run in reaction to operations on a repository; people use them to lint the code before each commit (aborting the commit if that fails), generate or post-process commit messages, or trigger actions on the server after someone pushes to the repository (trigger builds and/or deployments).</li>
<li>A couple of rarely needed commands that can save you hours when you actually need them:
<ul>
<li><code>git bisect</code>: an advanced command to help you pinpoint which commit introduced a bug, by testing several commits (manually or through scripting); with a linear history, this is using bisection and could be done manually, but as soon as you have many merge commits this becomes much more complex and it's good to have <code>git bisect</code> do the heavy lifting.</li>
<li><code>git filter-repo</code>: a <a href="https://github.com/newren/git-filter-repo">third-party command</a> actually, as a replacement for Git's own <code>filter-branch</code>, that allows rewriting the whole history of a repository to remove a mistakenly added file, or help extract part of the repository to another.</li>
</ul>
</li>
</ul>
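<p>To illustrate the scripted form of <code>git bisect</code>, here's a small sketch on a linear history where the &quot;bug&quot; is simply the presence of a <code>bug.txt</code> file (the file names, the <code>known-good</code> tag, the <code>commit</code> helper and the <code>demo</code> identity are made up for the example; a real check would run your tests):</p>
<pre class="language-shell"><code class="language-shell">set -eu
# Throwaway identity so that commits work even without any Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/main
# Hypothetical helper: one new file and one commit per call.
commit() { touch "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }
commit one
git tag known-good
commit two
commit three
commit bug      # pretend this commit introduced the bug (it adds bug.txt)
commit five
commit six
# The bad commit comes first (HEAD), then a commit known to be good.
git bisect start HEAD known-good
# The check exits with 0 on a good commit and 1 on a bad one.
git bisect run sh -c "test ! -e bug.txt"
culprit=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset
echo "first bad commit: $culprit"</code></pre>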
<p>We're done.</p>
<p>With this knowledge, one should be able to map any Git command to how it will modify the <em>directed acyclic graph</em> of commits, and understand how to fix mistakes (ran a merge on the wrong branch? rebased on the wrong branch?).
I'm not saying understanding such things will be <em>easy</em>, but it should at least be possible.</p>

  ]]></content>
</entry>

<entry>
  <title type="html">Confusing git terminology</title>
  <link href="/confusing-git-terminology/" />
  <published>2023-11-12T00:00:00+0000</published>
  <updated>2023-11-12T00:00:00+0000</updated>
  <id>http://blog.ltgt.net/confusing-git-terminology/</id>
  
  
  <content type="html" xml:base="/confusing-git-terminology/"><![CDATA[
    <p>Last week, <a href="https://jvns.ca">Julia Evans</a> published on her blog about <a href="https://jvns.ca/blog/2023/11/01/confusing-git-terminology/">confusing git terminology</a>.
This is an awesome post but not all explanations resonated with me so I thought I'd write my own version (or rather, add my own notes) in case others felt the same
(Julia, feel free to cherry pick from here to your blog 😉).
I'll also reorder them to make it easier to cross-reference without you having to jump around.</p>
<h2 id="my-mental-representation-of-git">My mental representation of git</h2>
<p>First, let me quickly describe how I represent a git repository in my head.</p>
<p>A git repository is a set of <a href="https://en.wikipedia.org/wiki/Directed_acyclic_graph">directed acyclic graphs</a> of commits.
In many cases a repository has only one such graph, but there can actually be multiple (early users of GitHub Pages know about the <code>gh-pages</code> branch, in most cases it's an entirely separate branch, a separate graph not connected in any way to the other branches).</p>
<p>Then to easily reference some of those commits, we put <em>labels</em> on them: those are our branches and tags (among other things).</p>
<p>Each git repository on a machine contains such a set of directed acyclic graphs of commits,
and each time you <code>git clone</code>, <code>git fetch</code> and <code>git push</code> you copy parts of these graphs between repositories.</p>
<p>You can use <code>gitk --all</code> or <code>git log --all --oneline --graph</code> to visualize the graphs known on your machine.</p>
<h2 id="head-and-heads">HEAD and “heads”</h2>
<p><a href="https://jvns.ca/blog/2023/11/01/confusing-git-terminology/#head-and-heads">As Julia says</a>, “heads” are “branches” (contrary to tags that are immutable, those “heads” move along the graph).</p>
<p>The way I see <code>HEAD</code> though is more like “what's been checked out in the working directory”.
It will thus indeed be “the current branch” most of the time, but not always (we'll come to those cases below).</p>
<p>One interesting thing: a remote repository also has a <code>HEAD</code>, it then represents the “default branch” that will be checked out when you clone the repository (unless you tell git to checkout a specific branch).
Actually, git makes no distinction between a repository on a server that everyone will clone from (e.g. on GitHub), and any of these clones: git is decentralized before all.
You can even clone from a repository you already have on your machine, and observe that the branch that will be checked out by default will be that source repository's <code>HEAD</code>.
When you change the “default branch” of your repository on GitHub, what you're actually doing is updating its <code>HEAD</code>.</p>
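<p>You can observe this with two local repositories; in this sketch (the directory and branch names and the <code>demo</code> identity are assumptions), the source repository's <code>HEAD</code> is pointed to a <code>develop</code> branch before cloning:</p>
<pre class="language-shell"><code class="language-shell">set -eu
# Throwaway identity so that commits work even without any Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q source
cd source
git symbolic-ref HEAD refs/heads/main
git commit -q --allow-empty -m "A"
git branch develop
# Change the "default branch" of the source: point its HEAD to another branch.
git symbolic-ref HEAD refs/heads/develop
cd ..
git clone -q source clone
# The clone checked out the branch that the source's HEAD pointed to.
checked_out=$(git -C clone symbolic-ref --short HEAD)
echo "$checked_out"</code></pre>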
<h2 id="reference-symbolic-reference">“reference”, “symbolic reference”</h2>
<p>A reference is any <em>label</em> on a commit in the directed acyclic graph of commits.
It allows you to <em>reference</em> (sic!) a commit by a (somewhat) simple name (much simpler than the commit ID at least).
Those are branches (local and remote), tags, as well as <code>HEAD</code>, <code>FETCH_HEAD</code>, <code>ORIG_HEAD</code>, <code>MERGE_HEAD</code>, etc.</p>
<p>A symbolic reference is a reference that points to another reference, rather than directly to a commit.
This is the case of <code>HEAD</code> when you checkout a branch: it points to the branch so that git knows to move that branch forward when you make a new commit.</p>
<p>Note that <a href="https://jvns.ca/blog/2023/11/01/confusing-git-terminology/#reference-symbolic-reference">as Julia notes</a>,
<code>HEAD^^^</code> is not technically a reference, it's one of <a href="https://git-scm.com/docs/revisions">many different ways</a> of specifying revisions (another name for a commit).</p>
<h2 id="index-staged-cached">“index”, “staged”, “cached”</h2>
<p>I have nothing to add to <a href="https://jvns.ca/blog/2023/11/01/confusing-git-terminology/#index-staged-cached">what Julia wrote</a>.
tl;dr: they're all the same thing, but <code>--cached</code> (or <code>--staged</code> which is a synonym) and <code>--index</code> mean slightly different things.</p>
<h2 id="untracked-files-remote-tracking-branch-track-remote-branch">“untracked files”, “remote-tracking branch”, “track remote branch”</h2>
<p>The word “track” here has three different meanings:</p>
<ul>
<li>
<p>an “untracked file” is a file that's not included in <code>HEAD</code> or the index (technically it could exist in another commit, but when only looking at <code>HEAD</code> and comparing it to the state of your working directory, it only exists in your working directory and not in <code>HEAD</code> or in the index)</p>
</li>
<li>
<p>a “remote-tracking branch” is a reference that corresponds to a branch in a remote repository that you fetched.
Whenever you <code>git fetch</code> (or <code>git clone</code>) from a remote repository, the branches in that remote repository (in <code>refs/heads/</code> there) are copied/updated to your repository under new names, in <code>refs/remotes/&lt;remotename&gt;/</code> rather than <code>refs/heads/</code> (<code>refs/heads/</code> being reserved for <em>local</em> branches).
Those <code>refs/remotes/&lt;remotename&gt;/</code> branches are thus <em>tracking</em> the corresponding <code>refs/heads/</code> from the remote repository.</p>
</li>
<li>
<p>in git, a branch can be configured to “track” another (e.g. using <code>git branch --track</code> when creating a branch, <code>git branch --set-upstream-to=</code> to change a branch); that other branch is then said to be the “upstream” of the former.
Git will use that information in <code>git status</code> to tell you by how many commits the two branches diverge, and in <code>git pull</code> and <code>git push</code> to <em>synchronize</em> the two branches.
The “upstream” branch can be a “remote-tracking branch” or a local branch.
When you <code>git switch</code> (or <code>git checkout</code>) to a local branch that actually doesn't exist but has a name match in a single remote, git will automatically create it from the matching “remote-tracking branch”, and set it up to “track” it
(by extension, the repository you cloned/forked from, and whose branches you'll track, can also be called the “upstream repository”).</p>
</li>
</ul>
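<p>The third meaning can be inspected with the <code>@{upstream}</code> revision suffix; here's a sketch using two throwaway repositories (the directory and branch names and the <code>demo</code> identity are made up for the example):</p>
<pre class="language-shell"><code class="language-shell">set -eu
# Throwaway identity so that commits work even without any Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/main
git -C upstream commit -q --allow-empty -m "A"
git -C upstream branch feature
git clone -q upstream clone
cd clone
# "feature" doesn't exist locally but matches origin/feature: git creates it
# and sets it up to track the remote-tracking branch.
git switch -q feature
git rev-parse --abbrev-ref "feature@{upstream}"
# The upstream of a branch can also be another local branch.
git branch -q --track topic main
git rev-parse --abbrev-ref "topic@{upstream}"</code></pre>
<p>The two <code>git rev-parse</code> calls should print <code>origin/feature</code> and <code>main</code> respectively.</p>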
<h2 id="detached-head-state">“detached HEAD state”</h2>
<p>When the <code>HEAD</code> points to a (local) branch, each new commit will move the branch <em>label</em> to the new commit.</p>
<p>When the <code>HEAD</code> points to anything other than a (local) branch, git won't be able to move the reference to a new commit: you're in a “detached HEAD state”. If you make a new commit, only <code>HEAD</code> will reference it and nothing else, so if you switch to a branch you'll no longer have any reference (<em>label</em>) to that commit.
In other words, you're in a “detached HEAD state” when <code>HEAD</code> is <strong>not</strong> a “symbolic reference” but directly references a commit.</p>
<p>Note that when you check out anything that's not a local branch (in <code>refs/heads/</code>), whether it's a tag or a “remote-tracking branch”, git will resolve it to the commit ID and set up <code>HEAD</code> to point to that ID, so you'll be in a “detached HEAD state”.</p>
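<p>As a sketch, <code>git symbolic-ref</code> can tell the two states apart, and a branch created afterwards can rescue a commit made while detached (the <code>v1</code> tag, the <code>rescued</code> branch and the <code>demo</code> identity are assumptions for the example):</p>
<pre class="language-shell"><code class="language-shell">set -eu
# Throwaway identity so that commits work even without any Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/main
git commit -q --allow-empty -m "A"
git tag v1
# On a branch, HEAD is a symbolic reference to that branch.
git symbolic-ref HEAD
# Checking out a tag resolves it to a commit ID: HEAD is now detached.
git checkout -q v1
if git symbolic-ref -q HEAD; then state=attached; else state=detached; fi
echo "$state"
# A commit made here is only referenced by HEAD...
git commit -q --allow-empty -m "made while detached"
lonely=$(git rev-parse HEAD)
git switch -q main
# ...so put a label on it if you want to keep it.
git branch rescued "$lonely"</code></pre>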
<h2 id="ours-and-theirs-while-merging-or-rebasing">“ours” and “theirs” while merging or rebasing</h2>
<p>“Ours” and “theirs”, or “local” and “remote”, are indeed confusing.</p>
<p>When merging, you merge another branch into the current branch: the current branch is “ours” and the other one is thus “theirs”.</p>
<p>But when rebasing the current branch on top of another branch, you're repeatedly cherry-picking the commits from the current branch on top of the other branch, so the other branch is “ours” or “local”, and the commits from the current branch are “theirs”.
To make things a bit clearer, I like to think of how rebase works (conceptually at least): after determining the list of commits that differ between the branches and need to be rebased, first checkout the other (target) branch, then for each commit in the list cherry-pick it, and finally update the branch to point to the last rebased commit. Because you start by moving to the branch on top of which you want to rebase, it becomes the “ours” or “local”, and the branch you started from becomes the “theirs” or “remote”.</p>
<h2 id="your-branch-is-up-to-date-with-originmain">“Your branch is up to date with ‘origin/main’”</h2>
<p>This is directly derived from the “tracking” of your branch, as seen above:
if your current branch “tracks” <code>refs/remotes/origin/main</code>, then <code>git status</code> will display by how many commits the two branches diverge.
When they don't diverge (i.e. both references point to the exact same commit), then the branch is said to be “up to date” with its “upstream”.</p>
<p>Remember though, <a href="https://jvns.ca/blog/2023/11/01/confusing-git-terminology/#your-branch-is-up-to-date-with-origin-main">as Julia points out</a>,
that <code>refs/remotes/origin/main</code> is only updated when you explicitly fetch from the remote repository (with <code>git fetch</code>, <code>git pull</code>, or <code>git remote update</code>).</p>
<h2 id="can-be-fast-forwarded">“can be fast-forwarded”</h2>
<p>This is another message you can see in the output of <code>git status</code> related to the state of this branch relative to its “upstream” branch.
We've seen that when they both point to the same commit you'll get an “is up-to-date” message; this one is another situation when the branches have not diverged, but they're not identical either.
This happens when the current branch is “behind” its “upstream”: it points to a commit that's part of the “upstream”, but “upstream” actually has more commits.</p>
<pre class="language-text"><code class="language-text">A - B (main)
     \
      C - D (origin/main)</code></pre>
<p>or if you prefer</p>
<pre class="language-text"><code class="language-text">A - B (main) - C - D (origin/main)</code></pre>
<p>This will typically be the case when you did <code>git pull</code> a few days ago to bring your <code>main</code> “up-to-date” with <code>origin/main</code> (at that time, both <code>main</code> and <code>origin/main</code> pointed to commit B) and didn't touch it since then, and things continued moving in the <code>origin</code> remote repository (commits C and D were added).
When you <code>git fetch origin main</code>, you retrieve commits C and D locally into <code>origin/main</code>; now <code>main</code> can be “fast-forwarded” to commit D by just moving the <code>main</code> <em>label</em> along the graph towards <code>origin/main</code>.</p>
<p>In other words, there's no need to create a merge commit when running <code>git merge</code> (or <code>git pull</code>), and there's no risk of merge conflict.
There's hardly any situation safer than a “fast-forward merge”.</p>
<p>Note that such a “fast-forward merge” can actually bring in merge commits (here, <code>main</code> can be fast-forwarded to <code>origin/main</code>, and bring in commits C, D, E, F, and G):</p>
<pre class="language-text"><code class="language-text">A - B (main) - C - D (origin/main)
 \            /
  E -- F --- G (origin/newfeature)</code></pre>
<p>As for the name, I like to imagine those commits as a timeline, or a tape in a tape cassette or VHS.
You were following changes but ⏸️ <em>paused</em> a few days ago at your last <code>git pull</code>.
Git knows that there's <code>origin/main</code> ahead in a “straight line” so you can just press the “⏩ fast forward” button to safely reach that new state.</p>
<p>The other situations you can experience that are neither an “is up to date with” nor a “can be fast-forwarded” are:</p>
<ul>
<li>when your branch has more commits than its “upstream”: git will show “Your branch is ahead of 'origin/main' by N commits”<pre><code>A - B (origin/main)
     \
      C - D (main)
</code></pre>
</li>
<li>when they have diverged: “Your branch and 'origin/main' have diverged, and have M and N different commits each, respectively”<pre><code>A - B (main)
 \
  C - D (origin/main)
</code></pre>
</li>
</ul>
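<p>One way to get similar numbers yourself is <code>git rev-list --left-right --count</code> with the <code>...</code> notation. In this sketch there's no actual remote: the <code>origin/main</code> remote-tracking branch is faked with <code>git update-ref</code> (it's just a label after all), and the <code>demo</code> identity is made up too:</p>
<pre class="language-shell"><code class="language-shell">set -eu
# Throwaway identity so that commits work even without any Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/main
git commit -q --allow-empty -m "A"
# Fake remote-tracking branch for the sketch: a label put on commit A.
git update-ref refs/remotes/origin/main HEAD
git commit -q --allow-empty -m "B"
# Left number: commits only in main; right number: commits only in origin/main.
ahead=$(git rev-list --left-right --count main...origin/main)
# Add commits C and D on the "remote" side to make the branches diverge.
git checkout -q --detach origin/main
git commit -q --allow-empty -m "C"
git commit -q --allow-empty -m "D"
git update-ref refs/remotes/origin/main HEAD
git switch -q main
diverged=$(git rev-list --left-right --count main...origin/main)
echo "ahead: $ahead"
echo "diverged: $diverged"</code></pre>
<p>The two numbers are separated by a tab: 1 and 0 in the first case (ahead by one commit), 1 and 2 once the branches have diverged.</p>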
<h2 id="head-head-head-head-head2-head2">HEAD^, HEAD~, HEAD^^, HEAD~~, HEAD^2, HEAD~2</h2>
<p>When you need to specify commits as parameters to git commands, one way is to use the commit ID, or a reference (branch, tag) name.
But git makes it easier for those commits that are not directly pointed to by a reference: if you know how to find that commit then there's no need to use <code>git log</code> to go search for the commit ID yourself, you can tell git how to get to it from another commit.</p>
<p>That's what the <code>^</code> and <code>~</code> suffixes do (there are <a href="https://git-scm.com/docs/revisions">other notations</a> as well).</p>
<p>So <code>^</code> is actually a shorthand for <code>^1</code> which takes the “first parent” of the commit you apply it to.
Most commits have only a single parent, but merge commits will have at least 2 (yes, at least, you can actually have merge commits with more than 2 parents),
so <code>^</code> or <code>^1</code> will take the first, and <code>^2</code> the second (and <code>^3</code> the third, you got it).</p>
<p><code>HEAD^^</code> actually just applies the <code>^</code> operator to <code>HEAD^</code>, which itself had applied it to <code>HEAD</code>, therefore taking “two commits ago”.</p>
<p>To make it easier to follow the “first parents”, the <code>~</code> operator can be used.
Similarly, <code>~</code> is actually a shorthand for <code>~1</code>.
Directly taken from <a href="https://git-scm.com/docs/revisions">the docs</a>, <code>~3</code> is equivalent to <code>^^^</code> and directly expresses “three commits before” (or “three commits ago” when applied to <code>HEAD</code>).
So “ten commits ago” can be written either <code>HEAD^^^^^^^^^^</code> or <code>HEAD~10</code>, one is easier to read than the other 😉</p>
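<p>Here's a sketch to try those suffixes on a small history with one merge commit (the branch names, commit messages and the <code>demo</code> identity are made up for the example):</p>
<pre class="language-shell"><code class="language-shell">set -eu
# Throwaway identity so that commits work even without any Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/main
git commit -q --allow-empty -m "A"
git commit -q --allow-empty -m "B"
git switch -q -c side
git commit -q --allow-empty -m "C"
git switch -q main
git commit -q --allow-empty -m "D"
# M has two parents: D (where main was) comes first, C (side) second.
git merge -q -m "M" side
git log -1 --format=%s "HEAD^"     # D
git log -1 --format=%s "HEAD^2"    # C
git log -1 --format=%s "HEAD~2"    # B (first parent, twice)
git log -1 --format=%s "HEAD^2^"   # B as well, going through side</code></pre>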
<h2 id="and-">.. and ...</h2>
<p>Those are generally used with <code>git log</code> and <code>git diff</code>.</p>
<p>The notation <code>r1..r2</code> selects all commits reachable from <code>r2</code> that are not reachable from <code>r1</code> (note that <code>r1</code> and <code>r2</code> can be any form of revision: a reference or a commit ID),
whereas <code>r1...r2</code> selects all commits reachable from either <code>r1</code> or <code>r2</code> but not both.</p>
<p>In a typical tree with two diverging branches like this:</p>
<pre><code>A - B (main)
  \ 
    C - D (test)
</code></pre>
<p>the notation <code>main..test</code> will select commits C and D only, whereas <code>main...test</code> will select all of B, C and D (but not A).</p>
<p>Note that the behavior is different with <code>git diff</code>, as <code>git diff</code> is about comparing two points in the graph, not a range of commits!
<code>git diff</code> thus has its own definition for <code>..</code> and <code>...</code>: whereas <code>git diff r1..r2</code> is equivalent to <code>git diff r1 r2</code>, showing the difference between those 2 commits,
<code>git diff r1...r2</code> will however find the last common ancestor of <code>r1</code> and <code>r2</code> (same as <code>git merge-base r1 r2</code>), and diff between that common ancestor and <code>r2</code>.
In other words, <code>git diff main...test</code> will show the changes in <code>test</code> since the point it diverged from <code>main</code> (what changes did I add to my branch, ignoring commits added to the “upstream” since then? or what changes exist in my “upstream” branch since I branched out, ignoring changes in my branch?)</p>
<p>While this might seem the reverse of <code>git log</code> (commit B is <em>taken into account</em> by <code>git log main...test</code> but not <code>git log main..test</code>, and by <code>git diff main..test</code> but not <code>git diff main...test</code>), there is some consistency across the two commands: <code>git log main..test</code> and <code>git diff main...test</code> will both only tell you about commits C and D (notice that this is what GitHub is using when clicking on those <em>compare</em> links).</p>
<p>TL;DR: forget about the <code>..</code> notation with <code>git diff</code>, it's almost never what you want there;
use either <code>...</code> or the space-separated form of <code>git diff</code>.</p>
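<p>To check all this by yourself, here's a sketch that rebuilds the two-branch graph from above, with one file per commit so that <code>git diff</code> has something to show (the file names, the <code>commit</code> helper and the <code>demo</code> identity are assumptions for the example):</p>
<pre class="language-shell"><code class="language-shell">set -eu
# Throwaway identity so that commits work even without any Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/main
# Hypothetical helper: one new file and one commit per call.
commit() { touch "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }
commit A
git switch -q -c test
commit C
commit D
git switch -q main
commit B
git --no-pager log --format=%s main..test    # D then C
git --no-pager log --format=%s main...test   # B, C and D (in some order)
git --no-pager diff --name-only main..test   # B.txt, C.txt, D.txt
git --no-pager diff --name-only main...test  # C.txt, D.txt</code></pre>
<p><code>B.txt</code> shows up in the two-dot diff because, compared to <code>main</code>, it is missing from <code>test</code>.</p>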
<h2 id="refspecs">refspecs</h2>
<p>Refspecs are used by <code>git fetch</code> and <code>git push</code> to determine what to fetch or push, respectively, and the mapping between local references and remote ones (though most of the time one uses those commands without an explicit refspec).
A default refspec can also be configured for a <em>remote</em> (remote repository) for each action (fetch or push); one will generally be configured for fetching.</p>
<p>When you clone a repository, git sets up a remote named <code>origin</code> and configures its default refspec, generally with <code>+refs/heads/*:refs/remotes/origin/*</code> but this can differ depending on the options passed to <code>git clone</code>.</p>
<p>This refspec tells git that when fetching from the remote repository,
all the references inside <code>refs/heads/</code> (due to the <code>*</code> wildcard) will be fetched and stored locally into <code>refs/remotes/origin/</code> (using the same name suffix).
The <code>+</code> is equivalent to passing <code>--force</code> to the commands and will update the destination reference even if the new value is not a “fast-forward” of the current value.
When fetching, this means that if someone force-pushed a branch, git will update the corresponding reference under <code>refs/remotes/</code> on your side to make it match the remote reference; without the <code>+</code>, your “remote-tracking branch” would instead stay desynchronized.</p>
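<p>As a quick check, this sketch clones a throwaway repository and prints the fetch refspec that <code>git clone</code> configured. The directory names are arbitrary, and the result assumes a plain clone without options such as <code>--mirror</code> or <code>--single-branch</code>.</p>

```shell
set -e
work=$(mktemp -d)

# A throwaway "remote" repository with a single commit on main.
git init -q -b main "$work/upstream"
git -C "$work/upstream" -c user.name=Demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "A"

# Cloning sets up the origin remote and its default fetch refspec.
git clone -q "$work/upstream" "$work/clone"
fetch_refspec=$(git -C "$work/clone" config --get remote.origin.fetch)
echo "$fetch_refspec"

# refs/heads/main on the remote is mirrored as refs/remotes/origin/main locally.
upstream_main=$(git -C "$work/upstream" rev-parse refs/heads/main)
tracking_main=$(git -C "$work/clone" rev-parse refs/remotes/origin/main)
echo "$upstream_main $tracking_main"
```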
<p>The <code>--tags</code> flag is actually a shorthand for adding the <code>refs/tags/*:refs/tags/*</code> refspec: tags are synchronized (either fetched or pushed, depending on the command) between repositories (without overwriting existing tags at the destination).</p>
<p>As I said above, you can actually use those refspecs for pushing too.</p>
<p>For example, with <code>git push origin HEAD:test</code> you will update (or possibly create) a <code>test</code> branch on the remote repository (git will expand <code>test</code> to <code>refs/heads/test</code>) to point to the commit that's locally your <code>HEAD</code> (this will send the appropriate commits to the remote to make it possible).
I use this from time to time on side-projects where I'm the only maintainer, to test local commits on a scratch branch and trigger my GitHub Actions; if the build passes, only then will I push to <code>main</code>; all without having to create that <code>test</code> branch locally.</p>
<p>I sometimes also use the form <code>git push origin main^:main</code> to push my <code>main</code> branch, except for its last commit, that I will keep local as it's likely a work in progress.</p>
<p>People working with <a href="https://gerritcodereview.com/">Gerrit</a> will be familiar with <code>git push origin HEAD:refs/for/main</code> to push commits for review (<code>refs/for</code> is a <em>magic</em> namespace in Gerrit to push for review for a target branch), and now you know what it means 😉.</p>
<p>You might sometimes also see things like <code>git push origin :test</code>: this will delete the remote <code>test</code> branch, and is equivalent to <code>git push --delete origin test</code> (and it was the only way to delete a remote branch or tag before the <code>--delete</code> flag was added).</p>
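<p>The push refspecs above can be tried safely against a local bare repository standing in for the remote. This is only a sketch: the paths and commit messages are invented, and <code>main</code> is pushed once up front so that the remote already knows the branch before <code>main^:main</code> is used.</p>

```shell
set -e
work=$(mktemp -d)
git init -q --bare -b main "$work/remote.git"   # stands in for the remote
git init -q -b main "$work/local"
cd "$work/local"
git config user.name Demo
git config user.email demo@example.com
git remote add origin "$work/remote.git"

git commit -q --allow-empty -m "A"
git push -q origin main
git commit -q --allow-empty -m "B"
git commit -q --allow-empty -m "C (work in progress)"

# Create (or update) the remote test branch from the local HEAD.
git push -q origin HEAD:test
remote_test=$(git ls-remote origin refs/heads/test | cut -f1)

# Push main except for its last commit.
git push -q origin main^:main
remote_main=$(git ls-remote origin refs/heads/main | cut -f1)

# An empty source deletes the remote branch.
git push -q origin :test
remote_test_after=$(git ls-remote origin refs/heads/test | cut -f1)

echo "test was at $remote_test, main is at $remote_main"
```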
<h2 id="reset-revert-restore">“reset”, “revert”, “restore”</h2>
<p>Those three terms are all meant to somehow <em>destroy</em> something, but in different ways. Heck, there's even <a href="https://git-scm.com/docs/git#_reset_restore_and_revert">a section of the docs</a> dedicated to disambiguating them!</p>
<ul>
<li>“reset” is meant to <em>move</em> the current branch to another commit (a “fast-forward merge” is actually equivalent to a “reset”), though it can also be used to manipulate the “index” (opposite of <code>git add</code> and equivalent to <code>git restore --staged</code>).
You can tell <code>git reset</code> what to do with your index and working tree with flags such as <code>--hard</code>.</li>
<li>“revert” will create new commits that will undo the effects of previous commits</li>
<li>“restore” is all about files in your working directory or index, to undo changes made to them and restore them to a specific version recorded in some commit or the index.</li>
</ul>
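<p>Here is a small sketch contrasting the three commands in a throwaway repository (the file name and contents are arbitrary): “revert” adds a commit, “reset” moves the branch, and “restore” only touches the file.</p>

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q -b main .
git config user.name Demo
git config user.email demo@example.com
echo one > file.txt; git add file.txt; git commit -q -m "A"
echo two > file.txt; git commit -q -am "B"

# revert: a new commit undoes B, so history grows to three commits.
git revert --no-edit HEAD >/dev/null
count_after_revert=$(git rev-list --count HEAD)
content_after_revert=$(cat file.txt)

# reset: the branch moves back to B, dropping the revert commit.
git reset -q --hard HEAD~1
count_after_reset=$(git rev-list --count HEAD)
content_after_reset=$(cat file.txt)

# restore: the branch stays on B, only the working tree file changes.
git restore --source=HEAD~1 file.txt
count_after_restore=$(git rev-list --count HEAD)
content_after_restore=$(cat file.txt)

echo "$count_after_revert $count_after_reset $count_after_restore"
```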
<h2 id="checkout">checkout</h2>
<p>The <code>git checkout</code> command can do two seemingly unrelated things:</p>
<ul>
<li>“switching” to another branch, and</li>
<li>“restoring” files from a given commit</li>
</ul>
<p>Technically, those are actually quite similar as they're about changing files in your working directory, and in the case of “switching” also changing what <code>HEAD</code> points to.</p>
<p>Nowadays, you should rather use the <code>git switch</code> and <code>git restore</code> commands to the same effect.</p>
<h2 id="tree-ish">“tree-ish”</h2>
<p>In git, each commit is a <em>snapshot</em> of the state of the repository, along with some metadata (among them the commit message, committer, and author).
That <em>snapshot</em> is stored as a <em>tree object</em>.
A “tree-ish” is anything that resolves to a <em>tree object</em>: either the tree ID itself, or a <em>commit-ish</em> (a commit ID, a reference name, possibly using the <code>^</code> or <code>~</code> operators as seen above).</p>
<p>Technically you can also refer to a subtree (directory) of a given tree-ish by suffixing it with <code>:</code> followed by the path of the directory.
While I sometimes use this notation with <code>git show</code> to refer to files (show me the content of the given file inside that commit), I've never ever used it for a subtree (this can apparently be used with <code>git restore --source=</code>, <code>git checkout</code>, and <code>git reset</code>; looks like a very advanced feature to me).</p>
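<p>A minimal sketch of those notations, using a made-up <code>docs/readme.txt</code> file in a throwaway repository:</p>

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q -b main .
git config user.name Demo
git config user.email demo@example.com
mkdir docs
echo hello > docs/readme.txt
git add docs
git commit -q -m "A"

# A commit-ish peeled to the tree object holding its snapshot.
tree_type=$(git cat-file -t 'HEAD^{tree}')

# <tree-ish>:<path> pointing to a file gives its content in that commit.
file_content=$(git show HEAD:docs/readme.txt)

# <tree-ish>:<path> pointing to a directory is itself a tree (a subtree).
subtree_type=$(git cat-file -t HEAD:docs)
subtree_listing=$(git ls-tree --name-only HEAD:docs)

echo "$tree_type / $file_content / $subtree_type / $subtree_listing"
```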
<h2 id="reflog">reflog</h2>
<p>The reflog, or reference log, is kind of an <em>audit log</em> of any change ever done to references in your local repository.</p>
<p>You'll almost never use it but it can save you in some gnarly situations, to recover things you accidentally deleted.</p>
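<p>For instance, here is a sketch of recovering a commit that was dropped by an overly eager <code>git reset --hard</code>; the repository and commit messages are invented for the example.</p>

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q -b main .
git config user.name Demo
git config user.email demo@example.com
git commit -q --allow-empty -m "A"
git commit -q --allow-empty -m "B"
lost=$(git rev-parse HEAD)

# Oops: B is no longer reachable from any branch.
git reset -q --hard HEAD~1
after_reset=$(git log -1 --format=%s)

# The reflog still remembers where HEAD was before the reset.
git reflog
git reset -q --hard 'HEAD@{1}'
recovered=$(git rev-parse HEAD)
```

<p><code>HEAD@{1}</code> means “where <code>HEAD</code> was one move ago”; in a real situation you would read the <code>git reflog</code> output first to find the right entry.</p>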
<h2 id="merge-vs-rebase-vs-cherry-pick">merge vs rebase vs cherry-pick</h2>
<p>I have to say I don't quite understand how those terms are confusing 🤷</p>
<p>I suppose this is due to <em>superficial</em> knowledge of git; knowing mostly git commands and not really having a mental representation of the concepts at hand.
Git core concepts aren't that hard to comprehend, but if nobody explains them to you and you only learned to use git by memorizing a few commands, you can quickly get lost, particularly when told to change your workflow
(fwiw, this is I think the main reason we created internal training sessions at work, starting from those concepts towards the commands that manipulate them, dispensed to all new hires).</p>
<p>The commands can sometimes be confusing to use though:</p>
<ul>
<li>
<p><code>git merge</code> will create a new commit joining two lines of commit history (two branches)</p>
</li>
<li>
<p><code>git rebase</code> will <em>replay</em> your commits on top of another commit (selecting the commits since the last common ancestor). In more advanced use cases, you can also specify exactly which set of commits to rebase, and onto which commit to rebase them (see below).</p>
<p>Because git stores <em>snapshots</em>, and not diffs, it will compute the diff of each commit (similar to <code>git diff</code>) and apply it on the new base.
Julia <a href="https://jvns.ca/blog/2023/11/10/how-cherry-pick-and-revert-work/">has a wonderful post</a> explaining how this all works in detail.</p>
<p><code>git rebase</code> also has some super powers in the form of its interactive mode, where you can tell it to reorder the commits, skip some, squash others into a single commit, etc.
You generally use this form to <em>replay</em> your history without changing your “base”.</p>
</li>
<li>
<p><code>git cherry-pick</code> will also <em>replay</em> a commit, but works kinda the reverse of <code>git rebase</code>: you tell it which commit (from another branch) to <em>replay</em> on top of your current branch; the commits from your current branch don't change, you're creating a new commit that does the same as another commit from another branch.</p>
</li>
</ul>
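<p>The three commands can be compared side by side in a throwaway repository. In this sketch (branch names, file names and the <code>commit</code> helper are made up), <code>main</code> holds A and B while <code>feature</code> holds A, F1 and F2, and each command is tried on its own scratch branch.</p>

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q -b main .
git config user.name Demo
git config user.email demo@example.com

# Hypothetical helper: one commit per file, to avoid conflicts.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit A
git switch -q -c feature
commit F1
commit F2
git switch -q main
commit B
feature_before=$(git rev-parse feature)

# merge: a new commit with two parents joins both histories.
git switch -q -c merged main
git merge -q --no-edit feature
merge_parents=$(git cat-file -p HEAD | grep -c '^parent ')

# rebase: F1 and F2 are replayed, as new commits, on top of main.
git switch -q -c rebased feature
git rebase -q main
rebased_log=$(git log --format=%s rebased | paste -sd ' ' -)

# cherry-pick: only the tip of feature (F2) is replayed on top of main.
git switch -q -c picked main
git cherry-pick feature >/dev/null
picked_log=$(git log --format=%s picked | paste -sd ' ' -)

feature_after=$(git rev-parse feature)
echo "$merge_parents parents / $rebased_log / $picked_log"
```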
<p>The thing to remember: <code>git rebase</code> can be <em>destructive</em>, so use with care and don't hesitate to create a branch as <em>bookmark</em> before you rebase, and/or abort your rebase if you feel like you lose control of it.
That being said, my personal workflow involves rebasing a lot.</p>
<h2 id="git-rebase---onto">git rebase --onto</h2>
<p>When you use <code>git rebase main</code> to rebase your current branch on top of main (e.g. just before merging it, as a “fast-forward merge”, because you like your history to be linear; or just to avoid all those merge commits whenever you want to sync your feature branch with new changes from <code>main</code>), git will first find the last common ancestor between your current branch and <code>main</code>, and get the list of commits in your branch since that point (this is the exact equivalent to <code>git log main..</code> or <code>git log main..HEAD</code> if you remember). It will then <em>replay</em> them on top of <code>main</code>.</p>
<p>So <code>main</code> is used twice here: to find which commits to rebase, and “onto” which base.</p>
<p>Imagine you started working on a new feature, so you branched from <code>main</code> at some point.
Then management decides that the feature becomes a priority and should be released early, without other features that already landed on <code>main</code>.
So a new branch (let's call it <code>release-X</code>) is created from an earlier point of <code>main</code> than you branched from, then possibly a few bugfixes are cherry-picked too.
You would then want to take all the commits from your branch and move them as if you branched from that new branch (or any earlier point from <code>main</code> than you initially branched from): <code>git rebase --onto release-X main</code>.</p>
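<p>That scenario can be replayed in a throwaway repository. This is a sketch with invented commit names: <code>main</code> holds A and B, <code>feature</code> branched from B with F1 and F2, and <code>release-X</code> is created from the earlier commit A.</p>

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q -b main .
git config user.name Demo
git config user.email demo@example.com

# Hypothetical helper: one commit per file, to avoid conflicts.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit A
commit B
git switch -q -c feature
commit F1
commit F2

# The release branch starts from an earlier point of main (commit A).
git branch release-X main~1

# Take the commits in main..feature and replay them onto release-X.
git rebase -q --onto release-X main
feature_log=$(git log --format=%s feature | paste -sd ' ' -)
echo "$feature_log"
```

<p>After the rebase, <code>feature</code> no longer contains B: its history is A, F1, F2, as if it had branched from <code>release-X</code>.</p>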
<h2 id="commit-more-confusing-terms-and-all-the-rest">commit, more confusing terms, and all the rest…</h2>
<p>I'll stop there: I have nothing to add to <a href="https://jvns.ca/blog/2023/11/01/confusing-git-terminology/#commit">what Julia says on “commit”</a>.</p>
<p>I might actually do a followup post with some of the things she left out.
I'd personally add fork vs. clone too.</p>

  ]]></content>
</entry>

<entry>
  <title type="html">Climate-friendly software: don&#39;t fight the wrong battle</title>
  <link href="/climate-friendly-software/" />
  <published>2023-04-30T00:00:00+0000</published>
  <updated>2023-04-30T00:00:00+0000</updated>
  <id>http://blog.ltgt.net/climate-friendly-software/</id>
  
  
    <link rel="replies" href="https://dev.to/tbroyer/climate-friendly-software-dont-fight-the-wrong-battle-19h2/comments" />
  
  <content type="html" xml:base="/climate-friendly-software/"><![CDATA[
    <p>When talking about software ecodesign, green IT, climate-friendly software, the carbon footprint of software, or however you name it,
most of the time people focus on energy efficiency and server-side code,
sometimes going to great lengths measuring and monitoring it.
But what if all this was misguided?</p>
<p>Ok, this is a bit of a bold statement, but don't get me wrong:
I'm not saying you shouldn't care about this.
Let's look at one of the most recent examples I've seen: GitHub's <a href="https://gh.io/AAjpnus">ReadME Project Q&amp;A: Slash your code's carbon footprint</a> newsletter issue.
It's good and I agree with many things in there (go read it if you haven't already),
but it talks almost exclusively about energy efficiency and server-side code,
or in other words it limits actions to the scope 2 of the <a href="https://ghgprotocol.org/">GHG Protocol</a>.</p>
<p>So let's first understand which impacts we're talking about
before I give you my opinion on the low-hanging fruits.</p>
<p><em>Disclaimer: people regarded as experts in green IT trusted me enough to have me contribute to <a href="https://ecoconceptionweb.com/" title="(French) Eco-conception web : les 115 bonnes pratiques">a book on the subject</a> but I'm not myself an expert in the field.</em></p>
<p>Note: this post is written for developers and software architects; there are other actions to lower the climate impact of the digital world that won't be covered here.</p>
<h2 id="stepping-back">Stepping back</h2>
<p>Most software nowadays is client-server: whether web-based or mobile, more and more end-user software talks to servers.
This means there's a huge asymmetry in usage: even for small-scale professional software the end users generally vastly outnumber the servers.
And this implies the impacts of the individual clients need to be much lower than those of the servers.</p>
<figure>
<img src=/image/2023/04/ghg-balance.png width=606 height=237 alt="Data table showing greenhouse gas emissions share broken down by tier and lifecycle stage; all values in user equipment line are red, other values in use phase column are orange; in the total column, user equipment is red, networks orange, and data centers green" aria-describedby=ghg-balance>
<details>
<summary>Data table</summary>
<table id=ghg-balance>
<tr><td></td><th scope=col>Manufacturing</th><th scope=col>Use</th><th scope=col>Total</th></tr>
<tr><th scope=row>User equipment</th><td>40%</td><td>26%</td><td>66%</td></tr>
<tr><th scope=row>Networks</th><td>3%</td><td>16%</td><td>19%</td></tr>
<tr><th scope=row>Data centers</th><td>1%</td><td>14%</td><td>15%</td></tr>
<tr><th scope=row>Total</th><td>44%</td><td>56%</td><td></td></tr>
</table>
</details>
<figcaption>Greenhouse gas emissions balance (<a href="https://www.greenit.fr/wp-content/uploads/2019/11/GREENIT_EENM_etude_EN_accessible.pdf" title="The environmental footprint of the digital world">source</a>, PDF, 533 KB)</figcaption>
</figure>
<p>What <a href="https://en.wikipedia.org/wiki/Life-cycle_assessment" title="Wikipedia: Life-cycle assessment">life-cycle assessments (LCA)</a> for end-users' devices tell us is that manufacturing, transport and disposal summed up immensely outweigh use, ranging from 65% up to nearly 98% of the global warming potential (GWP).
Of course, this depends where the device was manufactured and where it's being used, with the use location's biggest impact being related to the carbon footprint of the electric system, as the use phase is all about charging or powering our smartphones, laptops and desktops.</p>
<figure>
<img src=/image/2023/04/pixel-7.png width=713 height=327 alt="Bar chart of the estimated greenhouse gas (GHG) emissions for the Google Pixel 7; production is 7 times bigger than customer use, itself much bigger than transportation or recycling" aria-describedby=pixel7>
<details>
<summary>Data table</summary>
<table id=pixel7>
<caption>Estimated GHG emissions for Pixel 7 assuming three years of use: 70 kg CO₂e</caption>
<tr><th scope=col>Lifecycle phase</th><th scope=col>Emissions share</th></tr>
<tr><th scope=row>Production</th><td>84%</td></tr>
<tr><th scope=row>Transportation</th><td>3%</td></tr>
<tr><th scope=row>Customer Use</th><td>12%</td></tr>
<tr><th scope=row>Recycling</th><td>1%</td></tr>
</table>
</details>
<figcaption>Estimated Greenhouse Gas (GHG) emissions for a Google Pixel 7 (<a href="https://www.gstatic.com/gumdrop/sustainability/pixel-7-product-environmental-report.pdf" title="Pixel 7 Product Environmental Report">source</a>, PDF, 224 KB)</figcaption>
</figure>
<figure>
<img src=/image/2023/04/precision-3520.png width=572 height=297 alt="Piechart of the estimated carbon footprint for a Dell Precision 3520 broken down by lifecycle phase, with a secondary piechart breaking down the footprint of the manufacturing phase by component; manufacturing is more than 4.5 times bigger than use, itself much bigger than transportation or end of life; components with the biggest impacts are the display, twice as big as the solid state drive, followed by the power supply and mainboard" aria-describedby=precision-3520>
<details>
<summary>Data table</summary>
<table id=precision-3520>
<caption>Carbon footprint for the Dell Precision 3520, assuming four years of use</caption>
<thead>
<tr><th scope=col>Lifecycle phase</th><th scope=col>Component</th><th scope=col>Carbon footprint's share</th></tr>
</thead>
<tbody>
<tr><th scope=rowgroup rowspan=8>Manufacturing</th><th scope=row>Chassis &amp; assembly</th><td>3.3%</td></tr>
<tr><th scope=row>Solid state drive</th><td>17.5%</td></tr>
<tr><th scope=row>Power supply</th><td>11.1%</td></tr>
<tr><th scope=row>Battery</th><td>2.3%</td></tr>
<tr><th scope=row>Mainboard and other boards</th><td>11.9%</td></tr>
<tr><th scope=row>Display</th><td>32.9%</td></tr>
<tr><th scope=row>Packaging</th><td>0.3%</td></tr>
<tr><th scope=row>Total</th><td>79.2%</td></tr>
</tbody>
<tbody>
<tr><th scope=row colspan=2>Transportation</th><td>4.4%</td></tr>
<tr><th scope=row colspan=2>Use</th><td>16.2%</td></tr>
<tr><th scope=row colspan=2>End of life</th><td>0.3%</td></tr>
</tbody>
</table>
</details>
<figcaption>Estimated carbon footprint allocation for my Dell Precision 3520, assuming 4 years of use (I've had mine for more than 5.5 years already): 304 kg CO₂e ± 68 kg CO₂e (<a href="https://i.dell.com/sites/csdocuments/CorpComm_Docs/en/carbon-footprint-precision-3520.pdf" title="Dell Precision 3520 Product Carbon Footprint">source</a>, PDF, 557 KB)</figcaption>
</figure>
<p>I am French, working mainly for French companies with most of their users in France, so I'm ready to admit I'm biased towards a very low use phase weight compared to other regions: go explore data for your users on <a href="https://app.electricitymaps.com/map">Electricity Map</a> and <a href="https://ourworldindata.org/grapher/carbon-intensity-electricity" title="Our World in Data: Carbon intensity of electricity, 2022">Our World in Data</a>.
And yet, that doesn't change the fact that the use phase has a much lower carbon footprint than all three of manufacturing, transport, and disposal as a whole.</p>
<figure>
<a href="https://ourworldindata.org/grapher/carbon-intensity-electricity" title="Our World in Data: Carbon intensity of electricity, 2022; interactive visualization"><img src=https://assets.ourworldindata.org/grapher/exports/carbon-intensity-electricity.svg width=850 height=600 alt="Map of carbon intensity of electricity in 2022, per country; countries whose electricity consumption emits less than 100 g CO₂e per kWh are mainly in Central, Eastern, and Southern Africa, and in Europe"></a>
</figure>
<p>What we can infer from this is that keeping our devices longer will increase the share of use in the whole life-cycle impacts.
Fairphone measured that extending the lifespan of their phones from 3 to 5 years <q>helps reduce the yearly emissions on global warming by 31%, while a further extension to 7 years of use helps reduce the yearly impact by 44%.</q></p>
<figure>
<img src=/image/2023/04/fairphone-4.png width=485 height=293 alt="Barchart of yearly emissions for the Fairphone 4, per baseline scenario" aria-describedby=fairphone-4>
<details>
<summary>Data table</summary>
<table id=fairphone-4>
<caption>Yearly emissions per baseline scenario, in kg CO₂e (numbers are approximations read from the barchart)</caption>
<tr><td></td><th scope=col>3 years</th><th scope=col>5 years</th><th scope=col>7 years</th></tr>
<tr><th scope=row>Production</th><td>11.7</td><td>7.1</td><td>5.5</td></tr>
<tr><th scope=row>Transport</th><td>0.5</td><td>0.3</td><td>0.2</td></tr>
<tr><th scope=row>Use</th><td>2.3</td><td>2.3</td><td>2.3</td></tr>
<tr><th scope=row>End of life</th><td>0.6</td><td>0.3</td><td>0.2</td></tr>
</table>
</details>
<figcaption>Fairphone 4: comparative of yearly emissions per baseline scenario (<a href="https://www.fairphone.com/wp-content/uploads/2022/07/Fairphone-4-Life-Cycle-Assessment-22.pdf" title="Life Cycle Assessment of the Fairphone 4">source</a>, PDF, 1.1 MB)</figcaption>
</figure>
<aside role="doc-pullquote presentation" aria-hidden=true>Extending the lifespan of a smartphone from 3 to 5 years can reduce its yearly global warming impacts by almost a third.</aside>
<p>Things are different for servers though, where the use phase's share varies much more depending on use location: from 4% up to 85%!
As noted in the ReadME Project Q&amp;A linked above, big companies' datacenters are for the most part net-neutral in carbon emissions, so not only the geographic regions of your servers matter, but also the actual datacenters in those regions.
This implies that whatever you do on the server side, its impact will likely be limited (remember what I was saying in the introduction?).
Of course there are exceptions, and there will always be, so please look at this through the prism of your own workloads.</p>
<figure>
<img src=/image/2023/04/dell-r640.png width=486 height=359 alt="Piechart of estimated carbon footprint allocation for a Dell PowerEdge R640, assuming 4 years of use: use is more than 4.5 times bigger than manufacturing, itself an order of magnitude bigger than transportation or end of life." aria-describedby=dell-r640>
<details>
<summary>Data table</summary>
<table id=dell-r640>
<tr><th scope=col>Lifecycle phase</th><th scope=col>Emissions share</th></tr>
<tr><th scope=row>Manufacturing</th><td>16.6%</td></tr>
<tr><th scope=row>Transportation</th><td>0.3%</td></tr>
<tr><th scope=row>Use</th><td>83%</td></tr>
<tr><th scope=row>End of life</th><td>0.1%</td></tr>
</table>
</details>
<figcaption>Estimated carbon footprint for a Dell PowerEdge R640 server, assuming 4 years of use: 7730 kg CO₂e (<a href="https://i.dell.com/sites/csdocuments/CorpComm_Docs/en/carbon-footprint-poweredge-r640.pdf" title="Dell PowerEdge R640 Product Carbon Footprint">source</a>, PDF, 514 KB)</figcaption>
</figure>
<p>Keep in mind the orders of magnitude though: 70 kg CO₂e for a single Pixel 7 (over 3 years) vs. 7730 kg CO₂e for a Dell PowerEdge R640 server (over 4 years), that's 110 smartphones for a server (or an 83:1 ratio when considering yearly emissions): chances are that you'll have many more users than that.
The ratio for laptops (304 kg CO₂e over 4 years for a Dell Precision 3520) would be 25 laptops for a server.
But as seen previously the actual carbon footprint will vary a lot depending on the location; you can explore some data in the <a href="https://dataviz.boavizta.org/manufacturerdata" title="Datavizta: Manufacturer data repository">Boavizta data visualization tool</a> that compiles dozens of LCAs of various manufacturers.
The Dell PowerEdge R640 in France would actually emit 1701 kg CO₂e rather than 7730 kg CO₂e: that's a 4.5:1 ratio!
Comparatively, my Dell Precision 3520 would fall from 304 kg CO₂e to 261 kg CO₂e, only a 1.16:1 ratio.
The laptop to server ratio would thus fall from 25:1 down to 6.5:1, which makes the laptops' impacts comparatively much bigger, relative to the server, than in other regions.</p>
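<p>For those who want to redo the math, here is a small sketch recomputing those ratios from the footprints quoted above; the <code>calc</code> helper is only a convenience wrapper around <code>awk</code> written for this example.</p>

```shell
set -e
# Footprints in kg CO2e, as quoted above.
pixel7=70        # Google Pixel 7, 3 years of use
laptop=304       # Dell Precision 3520, 4 years of use
server=7730      # Dell PowerEdge R640, 4 years of use
laptop_fr=261    # same laptop, used in France
server_fr=1701   # same server, used in France

# calc EXPRESSION FORMAT: evaluate an arithmetic expression with awk.
calc() { LC_ALL=C awk "BEGIN { printf \"$2\", $1 }"; }

phones_per_server=$(calc "$server / $pixel7" "%.0f")
phones_per_server_yearly=$(calc "($server / 4) / ($pixel7 / 3)" "%.0f")
laptops_per_server=$(calc "$server / $laptop" "%.0f")
server_location_ratio=$(calc "$server / $server_fr" "%.1f")
laptop_location_ratio=$(calc "$laptop / $laptop_fr" "%.2f")
laptops_per_server_fr=$(calc "$server_fr / $laptop_fr" "%.1f")

echo "smartphones per server: $phones_per_server (yearly: $phones_per_server_yearly)"
echo "laptops per server: $laptops_per_server (in France: $laptops_per_server_fr)"
echo "location effect: server $server_location_ratio:1, laptop $laptop_location_ratio:1"
```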
<p>Note that there are three tiers: end-users, datacenters, and networks.
Network energy consumption however <a href="https://doi.org/10.1016/j.joule.2021.05.007" title="Does not compute: Avoiding pitfalls assessing the Internet's energy and carbon impacts">doesn't vary proportionally to the amount of data transferred</a>, which means we as users of those networks don't have many levers on their footprint.
That being said, data transmission is <a href="https://developer.android.com/training/connectivity/minimize-effect-regular-updates#:~:text=Requests%20that%20your%20app%20makes%20to%20the%20network%20are%20a%20major%20cause%20of%20battery%20drain%20because%20they%20turn%20on%20power%2Dconsuming%20cellular%20or%20Wi%2DFi%20radios." title="Android Developers: Requests that your app makes to the network are a major cause of battery drain because they turn on power-consuming cellular or Wi-Fi radios.">among the things that will drain the batteries of mobile devices</a>, so reducing the amount of data you exchange on the network could have a more direct impact on the battery life of end-users' smartphones (even though what will drain the battery the most will more likely be the screen).</p>
<h2 id="taking-action">Taking action</h2>
<p>So, what have we learned so far?</p>
<ul>
<li>It's important that end users keep their devices longer,</li>
<li>we can't do much about networks,</li>
<li>the location (geographic region and datacenter) of servers matters a lot, more so than how and how much we use them.</li>
</ul>
<p>Now, what can we do about it?</p>
<p>For servers, it's relatively simple:
if you can, rent servers in energy efficient datacenters, and/or countries with low-carbon electricity;
in addition, or otherwise, then of course optimize your server-side architecture and code.
If you manage your own servers, avoid buying machines to let them sit idle: maximize their utilization.</p>
<aside role="doc-pullquote presentation" aria-hidden=true>Pick servers in carbon-neutral or low-carbon datacenters first, then optimize your architecture and code.</aside>
<p>For the networks, our actions are probably limited to reducing data usage, <q>not because it reduces immediate emissions (it doesn't), but to avoid the need for rapid expansion of the network infrastructure</q> (I'm quoting <a href="https://limited.systems/">Wim Vanderbauwhede</a> here, from a private conversation).</p>
<p>For the end-users' devices, it's more complicated, but not out of reach:
we want users to keep their devices as long as possible so, put differently, we must not be responsible for them to change their devices.
There will always be people changing devices &quot;for the hype&quot; or on some scheduled basis (or just because the vendor stopped pushing security updates, <a href="https://en.wikipedia.org/wiki/Planned_obsolescence#Software_degradation_and_lock-out" title="Wikipedia: Planned Obsolescence: Software degradation and lock-out">some form of planned obsolescence</a>, or can't be repaired; two things laws could alleviate), but there are also many people who keep them as long as possible (because they're eco-conscious or can't afford purchasing a new device, or simply because they don't feel the need for changing something that's still fully functioning.)
For those people, don't be the one to make them change their mind and cross the line.</p>
<aside role="doc-pullquote presentation" aria-hidden=true>Don't be the one that will make your users change their device.</aside>
<p>This is something we won't ever be able to measure, as it depends on how people perceive the overall experience on their device, but it boils down to perceived performance.
So by all means, optimize your mobile apps and web frontends, test on old devices and slow networks (even if only emulated), and monitor their real-user performance (e.g. through <a href="https://web.dev/vitals/">Web Vitals</a>).
As part of performance testing, have a look at electricity use, as it will both be directly associated with emissions to produce that electricity, and be perceptible by the user (battery drain).
And don't forget to account for the app downloads as part of the overall perceived performance: light mobile apps that don't need to be updated every other day, frontend JS and CSS that can be cached and won't update several times a day either (defeating the cache).</p>
<aside role="doc-pullquote">Optimize for the perceived performance and battery life.</aside>
<p>Don't forget about the space taken by your app on the user's device too: users shouldn't have to make a choice between apps due to <em>no space left on device</em>, so when possible prefer a website or progressive web app (PWA) to a native application (you can still publish them to application stores if required, through tiny wrapper native apps).</p>
<aside role="doc-pullquote">When possible, prefer a website or PWA to a native application.</aside>
<h2 id="a-note-to-product-managers">A note to product managers</h2>
<p>The above advice was mostly technical, answering the question <q>What can I do as an architect or developer?</q>
but product managers have their share, and they're actually the ones in power here:
they can choose which features to build or not build, they can shape the features, they can reduce software complexity by limiting the number of features and of levers and knobs.
This will undoubtedly avoid bloat and help you make things leaner and faster.</p>
<p>Avoid <a href="https://en.wikipedia.org/wiki/Feature_creep" title="Wikipedia: Feature creep">feature creep</a> and beware of <a href="https://en.wikipedia.org/wiki/Wirth%27s_law" title="Wikipedia: Wirth's law">Wirth's law</a>.</p>
<aside role="doc-pullquote">Refrain from adding features, reduce software complexity.</aside>
<p>Last, but not least, make sure you really need software!
Sometimes you should embrace <a href="https://en.wikipedia.org/wiki/Low_technology" title="Wikipedia: Low technology">low-tech</a>.
For example, instead of developing a mobile app with accounts to identify the user so you can notify them, maybe you could simply use SMS (assuming you have some out-of-band means of knowing their phone number, and the latency of distribution is acceptable).
And sometimes what you're trying to address with software just isn't worth it, particularly if it involves IoT (remember that we should strive for fewer devices that we keep longer, not more).</p>
<aside role="doc-pullquote">Sometimes, ideas aren't even worth their impacts.</aside>
<p>Conversely, as we'll need to electrify parts of our economy to reduce their carbon footprint, <q>software is one of the few sectors to start with a head start: we get greener at the same rate as the grid without other work needed</q> (I'm quoting <a href="https://infrequently.org/">Alex Russell</a> here, from a private conversation), so please do use software to digitalize and replace more carbon-intensive activities.</p>
<h2 id="other-pitfalls">Other pitfalls</h2>
<p>Besides only evaluating electricity consumption on your servers, another pitfall is trying to attribute emissions to each user or request:
when you have dozens, hundreds or even thousands of concurrent requests, how do you distribute electricity consumption among them?
There's an <a href="https://datatracker.ietf.org/doc/draft-martin-http-carbon-emissions-scope-2/" title="IETF: HTTP Response Header Field: Carbon-Emissions-Scope-2">IETF proposal for an HTTP response header</a> exposing such information, and while it's a commendable idea I doubt it's realistic.
My personal belief is that display of such information is often a sign of <a href="https://en.wikipedia.org/wiki/Greenwashing" title="Wikipedia: Greenwashing">greenwashing</a>.
To my knowledge, data can only be accurate in aggregates.</p>
<p>If you really do want to show how <em>green</em> you are, conduct a life-cycle assessment (LCA): take all three scopes into account, all three tiers, evaluating impacts over more criteria than the global warming potential (GWP) alone.</p>
<p>Here are a couple of resources if you want to go further:</p>
<ul>
<li><a href="https://www.greenit.fr/2023/04/18/quels-pieges-a-eviter-pour-evaluer-lempreinte-environnementale-du-numerique/">Pitfalls to avoid when assessing the environmental footprint of digital technology</a> (in French)</li>
<li><a href="https://learn.greensoftware.foundation/">Learn Green Software</a></li>
</ul>
<p><em>Thanks to <a href="https://infrequently.org/">Alex Russell</a> and <a href="https://limited.systems/">Wim Vanderbauwhede</a> for their feedback.</em></p>

  ]]></content>
</entry>

<entry>
  <title type="html">Naming things is hard, SPA edition</title>
  <link href="/naming-things-is-hard-spa-edition/" />
  <published>2023-03-28T00:00:00+0000</published>
  <updated>2023-03-28T00:00:00+0000</updated>
  <id>http://blog.ltgt.net/naming-things-is-hard-spa-edition/</id>
  
  
    <link rel="replies" href="https://dev.to/tbroyer/naming-things-is-hard-spa-edition-3g41/comments" />
  
  <content type="html" xml:base="/naming-things-is-hard-spa-edition/"><![CDATA[
    <p>During the past few months, social networks have been shaken by a <em>single-page</em> vs <em>multi-page applications</em> (SPA vs MPA) battle, more specifically related to Next.js and React, following, among other things, <a href="https://mobile.twitter.com/rauchg/status/1619492334961569792">a tweet by Guillermo Rauch</a> and <a href="https://github.com/reactjs/reactjs.org/pull/5487#issuecomment-1409720741">a GitHub comment by Dan Abramov</a>.</p>
<p>I've read a few articles and been involved in a few discussions about those, and it appears that we don't all have the same definitions, so I'll give mine here and hope people rally behind them.</p>
<h2 id="navigation">SPA vs MPA: it's about navigation</h2>
<p>It's not that hard: a <em>single-page</em> application means that you load a page (HTML) once, and then do everything in there by manipulating its DOM and browser history, fetching data as needed.
This is the exact same thing as <em>client-side navigation</em> and requires some form of <em>client-side routing</em> to handle navigation (particularly from history, i.e. using the back and forward browser buttons).</p>
<p>Conversely, a <em>multi-page</em> application means that each navigation involves loading a new page.</p>
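<p>To make this more concrete, here's a minimal sketch of what client-side navigation boils down to. It's only an illustration under simplifying assumptions: <code>renderPage</code> is a hypothetical function that fetches whatever data the URL needs and updates the DOM, and modifier keys, <code>target</code> attributes, errors, focus and scroll position are all ignored.</p>
<pre><code class="language-js">// Sketch only: `win` is the browser's `window`; `renderPage` is a hypothetical
// function that fetches data for a URL and updates the DOM accordingly.
function enableClientSideNavigation(win, renderPage) {
  // Intercept clicks on same-origin links instead of letting the browser load a new page
  win.document.addEventListener("click", (event) => {
    const link = event.target.closest("a[href]");
    if (!link || link.origin !== win.location.origin) return;
    event.preventDefault();
    win.history.pushState(null, "", link.href); // updates the URL bar and the session history
    renderPage(link.href);
  });
  // Back and forward buttons: the URL has already changed, only the DOM needs updating
  win.addEventListener("popstate", () => renderPage(win.location.href));
}
</code></pre>
<p>In a browser this would be called as <code>enableClientSideNavigation(window, renderPage)</code>; passing <code>window</code> as an argument only makes the function easier to exercise outside a browser.</p>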
<aside role="doc-pullquote presentation" aria-hidden=true>SPA means you load a page once then navigate by manipulating the DOM and history. MPA means that each navigation involves loading a new page.</aside>
<p>This by itself is a controversial topic: despite SPAs having lots of problems (user experience –aborting navigation, focus management, timing of when to update the URL bar–, <a href="https://nolanlawson.com/2019/11/05/what-ive-learned-about-accessibility-in-spas/" title="Nolan Lawson: What I’ve learned about accessibility in SPAs">accessibility</a>, performance even by not being able to leverage streaming) due to taking responsibility and having to reimplement <a href="https://dev.to/tigt/routing-im-not-smart-enough-for-a-spa-5hki" title="Taylor Hunt: Routing: I’m not smart enough for a SPA">many things</a> from the browser (loading feedback, error handling, focus management, scrolling), some people strongly believe this is <a href="https://twitter.com/dan_abramov/status/1621949445540659201">“one of the first interesting optimizations”</a> and they <a href="https://twitter.com/dan_abramov/status/1617963492908335104">“can’t really seriously consider websites that reload page on every click good UX”</a>
(I've only quoted Dan Abramov from the React team here, but I don't want to single him out: he's far from being alone with this view; others are <a href="https://andy-bell.co.uk/the-extremely-loud-minority/" title="Andy Bell: The (extremely) loud minority">in denial</a> thinking that <a href="https://www.epicweb.dev/the-webs-next-transition#:~:text=This%20is%20the%20strategy%20used%20by%20most%20of%20the%20industry%20today." title="Kent C. Dodds: The Web’s Next Transition; this quote in the section about SPAs">“this is the strategy used by most of the industry today”</a>).
Some of those issues are supposedly (and hopefully) fixed by the new <a href="https://developer.mozilla.org/en-US/docs/Web/API/Navigation_API" title="MDN: Navigation API">navigation API</a> that's currently only implemented in Chromium browsers.
And despite <a href="https://www.zachleat.com/web/single-page-applications/" title="Zach Leatherman: Defaulting on Single Page Applications (SPA)">their many advantages</a>, MPAs aren't free from limitations either, otherwise we probably wouldn't have had SPAs to begin with.</p>
<p>My opinion? There's no one-size-fits-all: most sites and apps could (<a href="https://www.thoughtworks.com/radar/techniques/spa-by-default" title="Thoughtworks Technology Radar: SPA by default">and probably should</a>) be MPAs, and an SPA is a good (and better) fit for others.
It's also OK to use both MPA and SPA in a single application depending on the needs.
Jason Miller published <a href="https://jasonformat.com/application-holotypes/" title="Jason Miller: Application Holotypes: A Guide to Architecture Decisions">a rather good article</a> 4 years ago (I don't agree with everything in there though).
Nolan Lawson has also written <a href="https://nolanlawson.com/2022/06/27/spas-theory-versus-practice/" title="Nolan Lawson: SPAs: theory versus practice">a good and balanced series</a> on MPAs vs SPAs.</p>
<p>And we haven't even talked about where the <em>rendering</em> is done yet!</p>
<h2 id="rendering">Rendering: SSR, ESR, SWSR, and CSR</h2>
<p>Before diving into <em>where</em> it's done, we first need to define <em>what</em> rendering is.</p>
<p>My definition of <em>rendering</em> is applying some form of <em>templating</em> to some <em>data</em>.
This means that getting some HTML fragment from the network and putting it into the page with some form of <code>innerHTML</code> is <strong>not</strong> rendering.
Conversely, getting some <em>virtual DOM</em> as JSON for example and reconstructing the equivalent DOM from it <strong>would</strong> qualify as rendering.</p>
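<p>As an illustration of that second case, here's a sketch of rebuilding DOM from a virtual DOM received as JSON. The <code>{ tag, attrs, children }</code> shape is made up for the example (each library has its own format), and <code>doc</code> stands for the browser's <code>document</code>.</p>
<pre><code class="language-js">// Hypothetical virtual DOM format: a node is either a string (text)
// or an object like { tag: "p", attrs: { class: "intro" }, children: [...] }.
function toDom(vnode, doc) {
  if (typeof vnode === "string") return doc.createTextNode(vnode);
  const element = doc.createElement(vnode.tag);
  for (const [name, value] of Object.entries(vnode.attrs ?? {})) {
    element.setAttribute(name, value);
  }
  for (const child of vnode.children ?? []) {
    element.append(toDom(child, doc));
  }
  return element;
}

// By the definition above, this qualifies as rendering: the DOM is built from data.
//   container.replaceChildren(toDom(await response.json(), document));
// This does not: the HTML fragment was already rendered somewhere else.
//   container.innerHTML = await response.text();
</code></pre>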
<aside role="doc-pullquote presentation" aria-hidden=true>Rendering is applying some form of templating to some data.</aside>
<p>Now that we've defined <em>what</em> rendering is, let's see <em>where</em> it can be done: basically at each and any stage of delivery: the origin server (SSR), edge (ESR), service-worker (SWSR), or client (CSR).</p>
<p>There's also a whole bunch of <em>prerendering</em> techniques: static site generation (SSG), on-demand generation, distributed persistent rendering (DPR), etc.</p>
<p>All these rendering stages, except client-side rendering (CSR), generate HTML to be delivered to the browser engine.
CSR will however directly manipulate the DOM most of the time, but sometimes will also generate HTML to be used with some form of <code>innerHTML</code>; the details here don't really matter.</p>
<p>Rendering at the origin server or at the edge (Cloudflare Workers, Netlify Functions, etc.) can be encompassed under the name server-side rendering (SSR), but depending on the context SSR can refer to the origin server only.
Similarly, rendering in a service worker could be included in client-side rendering (CSR), but most of the time CSR is only about rendering in a browsing context.
I suppose we could use <em>browser-side rendering</em> (BSR) to encompass CSR and SWSR.</p>
<p><img src="/image/2023/03/ssr-csr.png" alt="Schema of SSR, ESR, SWSR and CSR, with grouping representing SSR-in-the-broader-sense (SSR and ESR) vs. BSR (SWSR and CSR), and which generate HTML (SSR, ESR and SWSR) or manipulate the DOM (CSR)"></p>
<p>As noted by Jason Miller and Addy Osmani in their <a href="https://web.dev/rendering-on-the-web/" title="web.dev: Rendering on the Web">Rendering on the Web</a> blog post, applications can leverage several stages of rendering (SSR used in the broader sense here), but like many they conflate SPA and CSR.
Eleventy (and possibly others) also allows <a href="https://www.11ty.dev/docs/plugins/edge/" title="Eleventy Edge: A plugin to run Eleventy in an Edge Function to add dynamic content to your Eleventy sites.">rendering a given page at different stages</a>, with parts of the page prerendered at build-time or rendered on the origin server, while other parts will be rendered at the edge.</p>
<h2 id="implications">What does that imply?</h2>
<p>My main point is that rendering is almost orthogonal to single-page vs multi-page: an SPA doesn't imply CSR.</p>
<aside role="doc-pullquote presentation" aria-hidden=true>SPA doesn't necessarily imply CSR.</aside>
<ul>
<li><a href="https://chromestatus.com/metrics/feature/timeline/popularity/2617" title="Chrome Platform Status: usage metrics of the history.pushState API">Most web sites are MPAs</a> with SSR, sometimes ESR.</li>
<li>Most React/Vue/Angular applications are SPAs with CSR: the HTML page is mostly empty, generally the same for every URL, and the page loads data on <em>boot</em> and renders it (at the time of writing, the <a href="https://angular.io">Angular website</a> is such an SPA+CSR).</li>
<li>Next.js/Gatsby/Remix/Nuxt/Angular Universal/Svelte Kit/Solid Start/îles applications are SPAs with SSR and CSR: data is present as HTML in the page, but navigations then use CSR staying on the same page (and actually, despite the content being present in the HTML page, those frameworks will discard and re-render it client-side on <em>boot</em>).</li>
<li>Qwik City/Astro/Deno Fresh/Enhance/Marko Run applications are MPAs with SSR (and CSR as needed through <a href="https://jasonformat.com/islands-architecture/" title="Jason Miller: Islands Architecture"><em>islands of interactivity</em></a>); Qwik City provides <a href="https://qwik.builder.io/docs/faq/#can-qwik-do-spa" title="Qwik FAQ: Can Qwik do SPA?">an easy way</a> to switch to an SPA with SSR and CSR (though contrary to the above-mentioned frameworks, Qwik City won't re-render on page load).</li>
<li><a href="https://turbo.hotwired.dev/handbook/drive">Hotwire Turbo Drive</a> (literally <em>HTML over the wire</em>; formerly Turbolinks) and <a href="https://htmx.org">htmx</a> applications are SPAs with SSR.</li>
<li>GitHub is known for its use of Turbolinks and is actually both MPA and SPA, depending on pages and sometimes navigation (going from a user profile to a repository loads a new page, but the reverse is a client-side navigation).</li>
</ul>
<p>Some combinations aren't really useful: an MPA with CSR (and without SSR) would mean loading an almost empty HTML page at each navigation to then fetch data (or possibly getting it right from the HTML page) and do the rendering. Imagine the Angular website (which already makes a dubious choice of not including the content in the HTML page, for a documentation site) but where all navigations would load a new (almost empty) page.</p>
<p>Similarly, if you're doing an SPA, there's no real point in doing rendering in a service worker as it could just as well be done in the browsing context; unless maybe you're doing SPA navigation only on some pages/situations (video playing?) and want to leverage SWSR for all pages including MPAs?</p>
<h2 id="other-considerations">Other considerations</h2>
<p>In an application architecture, navigation and rendering locality aren't the only considerations.</p>
<h3 id="inline-updates">Inline updates</h3>
<p>Not every interaction has to be a navigation:
there are many cases where a form submission would <em>return</em> to the same page (reacting to an article on <a href="https://dev.to">Dev.to</a>, posting a comment, updating your shopping cart), in which case progressive enhancement could be used to do an inline update without a full page refresh.</p>
<p>Those are independent from SPAs: you can very well have an MPA and use such inline updates.
Believe it or not, this is exactly what <a href="https://dev.to">Dev.to</a> does for their comment form (most other features like following the author, reacting to the post or a comment, or replying to a comment however won't work at all if JavaScript is somehow broken).</p>
<h3 id="includes">Concatenation and Includes</h3>
<p>Long before we had capable enough JavaScript in the browser to build full-blown applications (in the old times of DHTML, before AJAX), there already were optimization techniques on the servers to help build an HTML page from different pieces, some of which could have been <em>prerendered</em> and/or cached.
Those were <a href="https://en.wikipedia.org/wiki/Server_Side_Includes" title="Wikipedia: Server Side Includes"><em>server-side includes</em></a> and <a href="https://www.w3.org/TR/esi-lang/" title="W3C: ESI Language Specification 1.0"><em>edge-side includes</em></a>.</p>
<p>While they are associated with specific syntaxes, the concepts can be used today <a href="https://blog.cloudflare.com/edge-side-includes-with-cloudflare-workers/" title="Edge-Side-Includes with Cloudflare Workers">in edge functions</a> or <a href="https://philipwalton.com/articles/smaller-html-payloads-with-service-workers/" title="Philip Walton: Smaller HTML Payloads with Service Workers">even in service workers</a>.</p>
<p>The different parts being concatenated/included this way can be themselves static or prerendered, or rendered on-demand.
Actually the above-mentioned feature of Eleventy where parts of a page are server-rendered or prerendered and other parts are rendered at the edge is very similar to those <em>includes</em> as well.</p>

  ]]></content>
</entry>

<entry>
  <title type="html">Migrating from Jekyll to Eleventy</title>
  <link href="/from-jekyll-to-eleventy/" />
  <published>2023-03-12T00:00:00+0000</published>
  <updated>2023-03-12T00:00:00+0000</updated>
  <id>http://blog.ltgt.net/from-jekyll-to-eleventy/</id>
  
  
    <link rel="replies" href="https://dev.to/tbroyer/migrating-from-jekyll-to-eleventy-1g50/comments" />
  
  <content type="html" xml:base="/from-jekyll-to-eleventy/"><![CDATA[
    <p>Yes, this is going to be yet another one of those articles explaining how I migrated this blog from <a href="https://jekyllrb.com/">Jekyll</a> to <a href="https://11ty.dev/">Eleventy</a>. You've been warned.</p>
<h2 id="why">Why?</h2>
<p>I don't really have issues with Jekyll and I've been using it for 10 years now here, but I haven't really <em>chosen</em> Jekyll: it's been more-or-less imposed on me by GitHub Pages.
But GitHub has now added the possibility to <a href="https://docs.github.com/en/pages/getting-started-with-github-pages/configuring-a-publishing-source-for-your-github-pages-site#publishing-with-a-custom-github-actions-workflow">deploy using a custom GitHub Actions workflow</a>, and this is a game-changer!</p>
<p>I could have kept using Jekyll with unlocked possibilities, but I'm not a Rubyist, that's just not a language I'm comfortable with, and I know almost nothing about Gems, so definitely not something I'd be comfortable maintaining going forward.</p>
<p>I also could have just kept using the Jekyll integration built into GitHub Pages, and this is what I would have done if I hadn't found any satisfying alternative. I'm not forced to change, so at least I have a fallback in the form of the <em>status quo</em>.</p>
<p>So what would replace it? Let's evaluate my requirements.</p>
<h3 id="the-requirements">The Requirements</h3>
<ul>
<li>I have articles written in HTML (exports from Posterous) and Markdown, using a bit of Liquid to link to other articles (with the <code>post_url</code> Jekyll tag). The Markdown articles use <a href="https://github.github.com/gfm/">GitHub Flavored Markdown</a>, including syntax-highlighted fenced code blocks, with embedded HTML. Ideally I shouldn't have to update the articles at all.</li>
<li>I only have 4 templates (<code>index.html</code>, <code>rss.xml</code>, and <code>default</code> and <code>post</code> layouts) so migrating to another templating engine wouldn't really be a problem. The <code>index.html</code> uses pagination (even though I still only have a single page). The <code>default</code> layout builds a <a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/CSP">Content Security Policy</a> using flags from the articles' front matter.</li>
<li>I also have a few static files: CSS, JS, and images (and a file to <a href="https://support.google.com/webmasters/answer/9008080?hl=en#html_verification">verify ownership</a> for the Google Search Console).</li>
<li>Of course, because <a href="https://www.w3.org/Provider/Style/URI">cool URIs don't change</a>, the permalinks have to be ported to the new solution.</li>
<li>I hadn't identified it at first, but I actually have an old article that's not published, through Jekyll's <code>published: false</code> in the front matter. In the worst case, I'd just delete it (it'd still be there in the Git history).</li>
<li>Nice to have: I kinda like Jekyll's <code>_drafts</code> folder using the file's last modified date, and <code>_posts</code> folder with the publication date as part of the file name. (I don't commit my drafts, and yes that means I don't have backups; I don't have many drafts, and I'll probably never finish and publish them so 🤷)</li>
<li>Of course I want something I'm comfortable using for the next 10 years, in terms of technology and ecosystem. This means essentially that I'd like a Node-based solution.</li>
<li>Last, but not least, I want the output to be (almost) identical (for now at least) to the Jekyll site: it must be static HTML, with <code>&lt;script&gt;</code>s added by the layouts and possibly right from the articles, with no <em>client-side hydration</em> or upgrading to a Single Page Application.</li>
</ul>
<h3 id="the-choice">The choice</h3>
<p>The <em>HTML-first</em> approach rules out (<em>a priori</em>, correct me if I'm wrong) every React or Vue based approach, or similar.</p>
<p>I've quickly evaluated a couple alternatives, namely <a href="https://astro.build/">Astro</a> and Eleventy.</p>
<p>Astro is fun, but I must say it doesn't really look <em>content oriented</em>, relegating the content into its <code>src/pages</code>, or worse, a subfolder inside <code>src/content/</code>.
I really like the typesafe nature of content collections, but moving everything down to <code>src/content/blog</code> really <em>hides</em> the content away IMO.
Extracting the publication date from the file name <a href="https://github.com/humanwhocodes/astro-jekyll">is possible</a>, but it looks more and more like a <em>development</em> project rather than a <em>content</em> project.
It's great, but not what I'm looking for here.</p>
<p>I then looked at Eleventy. I have to admit my first contacts with the Eleventy documentation months ago left me with a bitter taste as I couldn't really figure out how collections worked and how you were supposed (or not) to organize your files. Looking at <a href="https://github.com/tweetback/tweetback">tweetback</a> more recently didn't really help: absolutely everything is JS, loading content from a SQLite database.</p>
<p>I decided to give it a chance: maybe I misunderstood the documentation the last time(s) I read it.
And indeed it was the case: moving from Jekyll to Eleventy probably couldn't be easier.</p>
<h2 id="how">How?</h2>
<p>I felt my way a bit, so I'll summarize here <a href="https://github.com/tbroyer/blog.ltgt.net/commit/1baabc320ebefbbbaae2e37c6beeceed2c2167cf">what I ended up doing</a>, also describing some things I tried along the way.</p>
<h3 id="getting-started">Getting Started</h3>
<p>Removing Jekyll consists in deleting the <code>_config.yml</code> and possibly <code>Gemfile</code> (I didn't have one).
Adding Eleventy means initializing a new npm package and adding the <code>@11ty/eleventy</code> dependency (and of course adding <code>node_modules</code> to the <code>.gitignore</code>), and creating a <a href="https://www.11ty.dev/docs/config/#default-filenames">configuration file</a> (I chose <code>eleventy.config.cjs</code> rather than the <code>.eleventy.js</code> hidden file).</p>
<p>Because the deployment workflow is different, the <code>CNAME</code> file becomes useless and can be deleted.
A new GitHub Actions workflow also has to be created, using the <code>actions/configure-pages</code>, <code>actions/upload-pages-artifact</code>, and <code>actions/deploy-pages</code> actions. I took inspiration from <a href="https://github.com/actions/starter-workflows/blob/main/pages/astro.yml">the Astro starter workflow</a> and updated it for Eleventy.</p>
<h3 id="markdown">Markdown</h3>
<p>Eleventy supports Markdown out of the box, with all the options I needed, except syntax highlighting and heading anchors for deep linking.
It also automatically <a href="https://www.11ty.dev/docs/dates/">extracts the date from the file name</a>.</p>
<p>Syntax highlighting is as easy as using <a href="https://www.11ty.dev/docs/plugins/syntaxhighlight/">the official plugin</a>, but then the generated HTML markup is different than with the Rouge highlighter in Jekyll, so I had to change the CSS accordingly.
I ended up importing an existing theme: display would be slightly different than before, but actually probably better looking.</p>
<p>Deep linking requires using <a href="https://github.com/valeriangalliat/markdown-it-anchor">the <code>markdown-it-anchor</code> plugin</a>, and to make sure existing deep links wouldn't break I provided my own <code>slugify</code> function mimicking the way CommonMarkGhPages computes the slug from the heading text (I happen to have a few headings with <code>&lt;code&gt;</code> in them, and CommonMarkGhPages would compute the slug from the rendered HTML leading to things like <code>codejavaccode</code>; I chose to break those few links in favor of better-looking anchor slugs).
I also disabled <code>tabIndex</code> to keep the same rendering as previously (I'll read more on the accessibility implications and possibly revert that choice later.)</p>
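<p>For illustration, a <code>slugify</code> function along those lines could look like the following. This is a rough approximation written for this example, not the actual CommonMarkGhPages algorithm nor necessarily the exact function used on this blog.</p>
<pre><code class="language-js">// Approximation (an assumption, not the real CommonMarkGhPages rules):
// lowercase, drop anything that isn't a letter, digit, space, hyphen or underscore,
// then turn spaces into hyphens.
function slugify(headingText) {
  return headingText
    .toLowerCase()
    .replace(/[^\p{L}\p{N} _-]/gu, "")
    .replace(/ /g, "-");
}

// One way to wire it up in the configuration file (amendLibrary needs Eleventy 2.0+):
//   eleventyConfig.amendLibrary("md", (mdLib) =>
//     mdLib.use(require("markdown-it-anchor"), { slugify, tabIndex: false }));
</code></pre>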
<p>I reimplemented the <code>post_url</code> first as a <a href="https://www.11ty.dev/docs/shortcodes/">custom short code</a> but that meant updating all articles to quote the argument (due to how Eleventy wires things up), so I ended up using a <a href="https://www.11ty.dev/docs/custom-tags/">custom tag</a>; that's specific to the Liquid template engine (in case I would want to change later on) but at least I don't have to update the articles.</p>
<p>In terms of rendering, besides syntax highlighting, the only difference is that line breaks are now rendered as <code>&lt;br&gt;</code> rather than <code>&lt;br /&gt;</code> (there's an option in <code>markdown-it</code> but I'll keep the less XHTML-y, more HTML-y syntax).</p>
<p>The <code>rss.xml</code> file wouldn't be treated as a template by default, so I <a href="https://www.11ty.dev/docs/languages/custom/#aliasing-an-existing-template-language">aliased the <code>xml</code> extension</a> to the Liquid engine, and added an explicit <code>permalink:</code> to avoid Eleventy creating an <code>rss.xml/index.html</code> file.
I did the same with the <code>css</code> extension so I could <a href="https://www.11ty.dev/docs/languages/liquid/#supported-features">use an <code>include</code></a> to bring in the syntax-highlighting theme in my <code>style.css</code>.</p>
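<p>In the configuration file, such aliasing can be sketched as follows (a fragment only, the rest of the configuration is left out):</p>
<pre><code class="language-js">module.exports = function (eleventyConfig) {
  // Process *.xml and *.css files as templates, using the Liquid engine
  for (const extension of ["xml", "css"]) {
    eleventyConfig.addTemplateFormats(extension);
    eleventyConfig.addExtension(extension, { key: "liquid" });
  }
};

// And in the front matter of rss.xml, to get /rss.xml rather than /rss.xml/index.html:
//   permalink: /rss.xml
</code></pre>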
<h3 id="liquid-templating">Liquid Templating</h3>
<p>I had to rename my layout files to use a <code>.liquid</code> extension rather than <code>.html</code>.
I didn't want to move them though, so I <a href="https://www.11ty.dev/docs/config/#directory-for-layouts-(optional)">configured a layouts directory</a> instead.</p>
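<p>That setting is part of the object returned by the configuration function; a minimal sketch, assuming the layouts live in Jekyll's conventional <code>_layouts</code> folder:</p>
<pre><code class="language-js">module.exports = function (eleventyConfig) {
  return {
    dir: {
      // Relative to the input directory; "_layouts" is Jekyll's default name
      // and an assumption about the actual folder used here.
      layouts: "_layouts",
    },
  };
};
</code></pre>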
<p>I also had to handle all the Jekyll-specific things I was using: <code>xml_escape</code>, <code>date_to_xmlschema</code>, <code>date_to_string</code>, and <code>date_to_long_string</code> filters, and the <code>site.time</code> and <code>site.github.url</code> variables (we already handled the <code>post_url</code> tag above).</p>
<p>At first, I tried to recreate them in Eleventy (which is easy with <a href="https://www.11ty.dev/docs/shortcodes/">custom shortcodes</a> and <a href="https://www.11ty.dev/docs/data-global/">global data files</a>), but eventually decided that I could replace most with more standard Liquid that would be compatible right away with LiquidJS: <code>xml_escape</code> becomes <code>escape</code>, <code>date_*</code> become <code>date:</code> with the appropriate format (this made it possible to fix my <code>&lt;time&gt;</code> elements erroneously including the time), and <code>site.time</code> becomes <code>&quot;now&quot;</code> or <code>&quot;today&quot;</code> with the <code>date</code> filter.
I put that in a separate commit as that's compatible with Jekyll Liquid as well.
And all that's left is therefore <code>site.github.url</code> that can be put in a global data file (a JS file getting the value out of an environment variable, fed by the <code>actions/configure-pages</code> output in the GitHub Actions workflow).</p>
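<p>Such a global data file can be as small as this sketch; the <code>SITE_URL</code> variable name is hypothetical, and would be set in the workflow from an output of <code>actions/configure-pages</code> (for example its <code>base_url</code>).</p>
<pre><code class="language-js">// _data/site.js: exposed to templates as `site`, so `site.github.url` keeps working.
// SITE_URL is a hypothetical environment variable name.
module.exports = {
  github: {
    url: process.env.SITE_URL ?? "",
  },
};
</code></pre>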
<p>Finally, I actually had to update all templates to use Eleventy's way of <a href="https://www.11ty.dev/docs/pagination/">handling pagination</a>, and looping over collections.</p>
<p>Speaking of collections, I initially used <a href="https://www.11ty.dev/docs/data-template-dir/">directory data files</a> to assign a <code>post</code> tag to all posts in <code>_posts</code> and <code>_drafts</code>.
This didn't handle the <code>published: false</code>, so I used a <a href="https://www.11ty.dev/docs/collections/#advanced-custom-filtering-and-sorting">custom collection</a> in the configuration file instead.
I probably could have also used a <a href="https://www.11ty.dev/docs/data-computed/">computed</a> <a href="https://www.11ty.dev/docs/collections/#how-to-exclude-content-from-collections"><code>eleventyExcludeFromCollections</code></a> to exclude it, but this also helped fix an issue with the sort order and apparently a bug in LiquidJS's <code>for</code> loop with both <code>reversed</code> and <code>limit:</code> where it would limit before reversing whichever way I wrote things, contrary to <a href="https://liquidjs.com/tags/for.html#reversed">what the doc says</a>.</p>
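<p>A custom collection of that kind could be sketched like this (the glob is an assumption; drafts are left out here):</p>
<pre><code class="language-js">module.exports = function (eleventyConfig) {
  eleventyConfig.addCollection("posts", (collectionApi) =>
    collectionApi
      .getFilteredByGlob("_posts/*")
      // Jekyll's `published: false` has no special meaning for Eleventy
      .filter((item) => item.data.published !== false)
      // Items come sorted by ascending date; put the most recent first
      // so that templates don't need `reversed` at all
      .reverse()
  );
};
</code></pre>
<p>Because <code>filter</code> returns a new array, the <code>reverse()</code> call doesn't mutate the array owned by Eleventy.</p>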
<p>One last change I made: update the Content Security Policy to account for the Eleventy dev mode autoreload; I used <code>eleventy.env.runMode != &quot;build&quot;</code> to <a href="https://www.11ty.dev/docs/data-eleventy-supplied/#eleventy-variable">detect when run with autoreload</a>.</p>
<h3 id="static-files">Static Files</h3>
<p>Contrary to Jekyll where any file without front matter is simply copied, static files have to be <a href="https://www.11ty.dev/docs/copy/">explicitly declared</a> with Eleventy.
I also had to <a href="https://www.11ty.dev/docs/ignores/">ignore</a> those HTML files I needed to just copy without processing.</p>
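<p>In the configuration file, this looks something like the following sketch (folder and file names are examples, not necessarily the ones used here):</p>
<pre><code class="language-js">module.exports = function (eleventyConfig) {
  // Copied as-is to the output folder (hypothetical folder and file names)
  eleventyConfig.addPassthroughCopy("image");
  eleventyConfig.addPassthroughCopy("js");
  eleventyConfig.addPassthroughCopy("google-site-verification.html");
  // HTML is a template format by default, so also keep Eleventy from processing that file
  eleventyConfig.ignores.add("google-site-verification.html");
};
</code></pre>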
<h3 id="permalinks">Permalinks</h3>
<p>Permalinks for the <code>rss.xml</code> and <code>style.css</code> are defined right in those files' front matter.
The <code>index.html</code> uses pagination so I <a href="https://www.11ty.dev/docs/pagination/#remapping-with-permalinks">declared a mapping</a> there as well.</p>
<p>Finally I decided to compute the permalink for posts right in the front matter of the <code>post</code> layout, using <code>page.fileSlug</code>, which gives me exactly what I want (the date part has already been removed by Eleventy).
Using a JS front matter allowed me to filter out the <code>published: false</code> article so it <a href="https://www.11ty.dev/docs/permalinks/#skip-writing-to-the-file-system">wouldn't ever be rendered to disk</a> (I already excluded it from the <code>posts</code> collection, but Eleventy would still process and render it).</p>
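<p>The logic itself fits in a few lines. Here it is as a standalone function for illustration; in the layout it would be wired up through the JS front matter (how exactly is left out here), with <code>data</code> being the page's data.</p>
<pre><code class="language-js">// Sketch of the permalink logic only; `data` is the data of the page being rendered.
function postPermalink(data) {
  // `permalink: false` tells Eleventy not to write the page to disk
  if (data.published === false) return false;
  // fileSlug has the date prefix already stripped, e.g. a file named
  // 2023-03-12-from-jekyll-to-eleventy.md gives "from-jekyll-to-eleventy"
  return `/${data.page.fileSlug}/`;
}
</code></pre>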
<h3 id="drafts">Drafts</h3>
<p>To handle drafts, I'm using <a href="https://www.11ty.dev/docs/collections/#getfilteredbyglob(-glob-)">the <code>getFilteredByGlob</code> function</a> when declaring the <code>posts</code> collection, so I can decide whether to include the <code>_drafts</code> folder depending on an environment variable.
This would include the drafts in the <code>posts</code> collection so they would appear in the <code>index.html</code> and <code>rss.xml</code>.</p>
<p>More importantly though, when not including drafts, I have to ignore the <code>_drafts</code> folder, otherwise the drafts are still processed and generated (despite not being linked to as they don't appear in the <code>posts</code> collection).
This is actually not really a problem given that I don't commit drafts to my Git repository, so I would observe this behavior only locally.</p>
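<p>Put together, the drafts handling can be sketched as follows; <code>INCLUDE_DRAFTS</code> is a hypothetical environment variable name, and the filtering and sorting of posts are left out to keep the example short.</p>
<pre><code class="language-js">// INCLUDE_DRAFTS is a hypothetical environment variable name
const includeDrafts = Boolean(process.env.INCLUDE_DRAFTS);

module.exports = function (eleventyConfig) {
  if (!includeDrafts) {
    // Otherwise drafts would still be processed and written to the output folder
    eleventyConfig.ignores.add("_drafts/");
  }
  eleventyConfig.addCollection("posts", (collectionApi) =>
    collectionApi.getFilteredByGlob(includeDrafts ? ["_posts/*", "_drafts/*"] : "_posts/*")
  );
};
</code></pre>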
<h3 id="comparing-the-results">Comparing the results</h3>
<p>To make sure the output was identical to the Jekyll-based version, I built the site once with Jekyll before any modification and backed up the <code>_site</code> folder; then <a href="https://meldmerge.org/">compared it</a> with the output of Eleventy to make sure everything was OK.</p>
<h2 id="conclusion">Conclusion</h2>
<p>As I felt my way and learned about Eleventy, this took me nearly two weekends to complete (not full time, don't worry!).
What took me the most time actually was probably finding (and deciding on) the new syntax-highlighting theme!
Otherwise, things went really smoothly.</p>
<p>I'm very happy with the outcome, so I switched over.
And now that I control the build workflow, I know I could set up an asset pipeline, minify the generated HTML, bring in <a href="https://github.com/11ty/eleventy-plugin-bundle">more Eleventy plugins</a> to split the syntax-highlighting theme out and <a href="https://piaille.fr/@tbroyer/109988938746853236">only send it when there's a code block</a> on the page, etc.</p>
<p>A big <strong>would recommend!</strong></p>

  ]]></content>
</entry>

</feed>
