<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="http://perfectionkills.com/feed.xml" rel="self" type="application/atom+xml" /><link href="http://perfectionkills.com/" rel="alternate" type="text/html" /><updated>2025-10-09T04:52:31+00:00</updated><id>http://perfectionkills.com/feed.xml</id><title type="html">Perfection Kills</title><subtitle>Javascript rants and findings, by kangax</subtitle><entry><title type="html">My Fitness: from spreadsheet to an app</title><link href="http://perfectionkills.com/my-fitness-from-spreadsheet-to-an-app/" rel="alternate" type="text/html" title="My Fitness: from spreadsheet to an app" /><published>2025-10-04T00:00:00+00:00</published><updated>2025-10-04T00:00:00+00:00</updated><id>http://perfectionkills.com/my-fitness-from-spreadsheet-to-an-app</id><content type="html" xml:base="http://perfectionkills.com/my-fitness-from-spreadsheet-to-an-app/"><![CDATA[<h2>My Fitness: from spreadsheet to an app</h2>
<p>My fitness journey—as is the case with many teenagers—began with bodybuilding. I wanted to look good. Soon after, I found <a href="https://stronglifts.com/stronglifts-5x5/">StrongLifts 5x5</a> and got into powerlifting. It became all about numbers: getting bigger bench, bigger squat, bigger press. Yet, I’ve always been drawn to the notion of <a href="https://archive.t-nation.com/training/total-athleticism-the-workout">Total Athleticism</a>, as coined by Max Shank in one of the articles on <a href="https://t-nation.com/">T-Nation</a> that I used to read religiously around 2015<sup id="footnote-anchor-1"><a href="#footnote-1">1</a></sup>. Having a 300 bench was cool but I didn’t want to be one of those powerlifters who had massive numbers yet couldn’t run up the stairs. I wanted to also be good at running, calisthenics, kettlebells. Eventually this brought me to “functional fitness” and, of course, CrossFit, which popularized it circa 2000.</p>
<h3>CrossFit standards</h3>
<p>In my fitness circles, CrossFit was still criticized for its reckless high-skill Olympic movements performed at high intensity. Blame the epic fail videos of someone doing something stupid and the ignorance around the methodology. While I was on the fence about doing actual CrossFit, I loved the “variable movements” concept. I found a couple of “crossfit athlete standards” posters online and made this spreadsheet to track my progress across multiple domains. It immediately exposed all my gaps: I could squat 2x bodyweight but my snatch was at a measly 100lb and all the speed and work capacity tests were barely at level 2:</p>
<figure><img src="/images/my-fitness-from-spreadsheet-to-an-app/my-fitness-from-spreadsheet-to-an-app-1.png"/></figure>
<p>If I wanted to be an all-around developed athlete, these were the things I had to work on. The standards also served as an “objective” benchmark. To consider yourself “advanced”, here’s how many pull-ups you had to be able to do, and here’s how fast your 1 mile run had to be. It gave me a concrete goal to work towards. These spreadsheets became my north star for the following few years. For a challenge junkie like me, they were a perfect long-term obsession.</p>
<figure><img src="/images/my-fitness-from-spreadsheet-to-an-app/my-fitness-from-spreadsheet-to-an-app-2.png"/></figure>
<h3>Strength and Skill</h3>
<p>The spreadsheet overload was real. This wasn’t the first one I used. As far back as 2011, I found <a href="https://exrx.net/Testing/WeightLifting/StrengthStandards">exrx.net Strength Standards</a> and created this view to understand where I stand strength-wise and what to work on:</p>
<figure><img src="/images/my-fitness-from-spreadsheet-to-an-app/my-fitness-from-spreadsheet-to-an-app-3.png"/></figure>
<p>In the last couple years I started tracking my frequency and total-lifetime-session-count of certain movements I wanted to be better at — <a href="https://kangax.substack.com/i/159510772/practice-makes-perfect">a concept I wrote about before</a>:</p>
<figure><img src="/images/my-fitness-from-spreadsheet-to-an-app/my-fitness-from-spreadsheet-to-an-app-4.png"/></figure>
<p>Finally, I tracked my proficiency levels on various CrossFit-specific movements as a way to advance my skill and become fluent in them during WODs.</p>
<figure><img src="/images/my-fitness-from-spreadsheet-to-an-app/my-fitness-from-spreadsheet-to-an-app-5.png"/></figure>
<h3>One app to rule them all</h3>
<p>The year was 2025.<br/>I was a software engineer.<br/>And I would still manually update a spreadsheet with the number of times I’d performed a certain movement that I deemed as “needed practice”.</p>
<p>This was embarrassing.</p>
<p>When I embarked on <a href="https://kangax.substack.com/p/przilla-crossfit-ai-companion">building PRzilla</a>, I realized that perhaps this was the time to ditch manual spreadsheet tracking. I could now build an app that would have <em>all</em> of this baked in:</p>
<ul>
<li>
<p>show your <strong>fitness level</strong> across multiple movement patterns/domains (strength, endurance, gymnastics, work capacity, etc.)</p>
</li>
<li>
<p>show your <strong>raw strength</strong> benchmarks (squat, deadlift, snatch, push press, etc.)</p>
</li>
<li>
<p>show your <strong>skill proficiency</strong> as a “lifetime sessions performed”</p>
<ul>
<li>
<p>if you’ve done ring muscle-ups <em>only</em> 20 times in your life, you’re unlikely to be better at them vs. someone who has done them 120 times</p>
</li>
</ul>
</li>
<li>
<p>show your <strong>skill ownership</strong> as a “max consecutive reps able to do”</p>
<ul>
<li>
<p>being able to do 50 consecutive kipping pull-ups means you own them; this movement is unlikely to be your limiting factor in any WOD that has them</p>
</li>
</ul>
</li>
</ul>
<h3>Whoop, Apple Fitness, and the rise of quantified fitness</h3>
<p>I use <a href="https://whoop.com/">Whoop</a> and I absolutely love how it’s able to distill complex/boring HRV/RHR metrics into simple, quantified scores like recovery and strain. Whoop and Apple Fitness, which is just as big on quantifiable fitness, were a big motivation for this app.</p>
<figure><img src="/images/my-fitness-from-spreadsheet-to-an-app/my-fitness-from-spreadsheet-to-an-app-6.jpeg"/></figure>
<p>On the other hand, there are apps like <a href="https://beyondthewhiteboard.com/">BTWB</a>, which is one of the most extensive CrossFit-style workout tracking tools, but I found its UI unintuitive and its UX clunky:</p>
<figure><img src="/images/my-fitness-from-spreadsheet-to-an-app/my-fitness-from-spreadsheet-to-an-app-7.webp"/></figure>
<h3>A snapshot of your fitness</h3>
<p>And so I turned all my spreadsheets into a simple snapshot consisting of 5 views: your level, balance, strength, ownership, and practice. These could be easily extended in the future with any other <strong>modules</strong>: time domain distribution, specific goals tracking like work capacity or endurance. Or even sport-specific ones like Hyrox.</p>
<figure><img src="/images/my-fitness-from-spreadsheet-to-an-app/my-fitness-from-spreadsheet-to-an-app-8.png"/></figure>
<h3>SugarWOD parser</h3>
<p>In order to turn all my workout data into beautiful charts, I needed to… have that data in the first place. One issue was that it was split between: </p>
<ul>
<li>
<p>SugarWOD — scores of WODs prescribed by my box that I did</p>
</li>
<li>
<p>wodwell.com — common WODs that I did on my own</p>
</li>
<li>
<p>Strong app — traditional strength training workouts that are not WODs (aka sets and reps)</p>
</li>
</ul>
<p>Importing wodwell scores was easy since it was just a map of common WODs (Fran, Murph, etc.) to a score. Strong app export would be a lot more involved since I would have to implement sets and reps tracking (as well as workout sessions, potential rest values, and so on). So I decided to focus on SugarWOD import. And this is where the fun began.<br/><br/><strong>Good news</strong>: SugarWOD allows easy export of all of your workout history.<br/><strong>Bad news</strong>: SugarWOD data is very… unstructured.</p>
<p>Here’s an example of CSV:</p>
<pre><code>09/14/2022,WOD,24 Minute AMRAP:Row 240m12 Lateral Burpees over Back of Rower48 Double Unders24 Alternating Front Foot Elevated Reverse Lunges (53/35)*,3.073,3+73,Rounds + Reps,,"[{""rnds"":3,""reps"":73}]",,SCALED,</code></pre>
<p>As you can see, we have an arbitrary, potentially non-descriptive WOD title like “<em>WOD</em>” plus a jumbled, plain-text description like “<em>24 Minute AMRAP:Row 240m12 Lateral Burpees over Back of Rower48 Double Unders24 Alternating Front Foot Elevated Reverse Lunges (53/35)*</em>” that’s missing basic formatting and newlines.</p>
<p>As humans, we’re able to quickly parse this into:</p>
<pre><code>24 Minute AMRAP:
Row 240m
12 Lateral Burpees over Back of Rower
48 Double Unders
24 Alternating Front Foot Elevated Reverse Lunges (53/35)*</code></pre>
<p>Thank god we live in the age of LLMs, which are capable of reasoning through messy, jammed-up text like this just like we humans do.</p>
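<p>The CSV layer itself is the easy part; it’s the description field that needs an LLM. Here is a minimal Python sketch of reading the sample row above. The column names are my guesses based on that one row (SugarWOD’s real header may differ), and <code>parse_row</code> is a hypothetical helper, not PRzilla’s actual code:</p>

```python
import csv
import io
import json

# Column names are guesses based on the sample row, not SugarWOD's real header.
COLUMNS = [
    "date", "title", "description", "score_raw", "score_display",
    "score_type", "unknown_1", "sets_json", "unknown_2", "rx_or_scaled", "unknown_3",
]

SAMPLE = (
    '09/14/2022,WOD,24 Minute AMRAP:Row 240m12 Lateral Burpees over Back of Rower'
    '48 Double Unders24 Alternating Front Foot Elevated Reverse Lunges (53/35)*,'
    '3.073,3+73,Rounds + Reps,,"[{""rnds"":3,""reps"":73}]",,SCALED,'
)


def parse_row(line: str) -> dict:
    """Split one exported line into named fields and decode the embedded JSON."""
    fields = next(csv.reader(io.StringIO(line)))
    row = dict(zip(COLUMNS, fields))
    # The sets column is a JSON array stored inside a quoted CSV cell.
    row["sets"] = json.loads(row["sets_json"]) if row["sets_json"] else []
    return row


row = parse_row(SAMPLE)
print(row["score_display"], row["sets"], row["rx_or_scaled"])
```

<p>Note that the description comes back as one jammed-up string; nothing at this level can recover the missing newlines.</p>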
<p>Another peculiarity was <strong>load-based entries</strong>. In SugarWOD you can program them in a workout and specify sets and reps, e.g. 5 sets of 3 snatches. Users can then log a value for each of the 5 sets. In order to present your performance on the leaderboard, SugarWOD allows coaches to specify <strong>how to score those sets</strong>: max value (who got the highest weight)? lowest value (who got the fastest row time)? sum of all values (who did the most work overall)? and so on. The export doesn’t expose this scoring criterion, so the parser needs to infer it from the sets data. In the example below, we can see that the scoring used the SUM of 12 sets, so 695 is not the weight the user lifted as a 1RM squat snatch; the real squat snatch values are in the sets field:</p>
<pre><code>05/31/2024,WOD,12 ROUNDS:30 Second CAP:3 Toes to Bar2 Lateral Barbell Burpees1 Squat Snatch*REST 1 Minute.*Increase weight as able.,695,695,Load,,"[{""success"":true,""load"":55},{""success"":true,""load"":55},{""success"":true,""load"":55},{""success"":true,""load"":55},{""success"":true,""load"":55},{""success"":true,""load"":55},{""success"":true,""load"":60},{""success"":true,""load"":60},{""success"":true,""load"":60},{""success"":true,""load"":60},{""success"":true,""load"":60},{""success"":true,""load"":65}]",,RX,</code></pre>
<p>In this case it’s “obvious” that 695 wasn’t a 1RM snatch (current world record is 496lbs) but some cases are much less obvious so the parser needs to be very careful there.</p>
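<p>One way to approach this inference is to compare the exported score against each candidate aggregate of the per-set loads. The sketch below is only an illustration of that idea in Python; <code>infer_scoring</code> and its return labels are hypothetical, and the real parser has to handle more cases than this:</p>

```python
import math


def infer_scoring(score: float, loads: list[float]) -> str:
    """Guess which aggregate of the per-set loads produced the exported score."""
    if not loads:
        return "unknown"
    candidates = {"max": max(loads), "min": min(loads), "sum": sum(loads)}
    matches = [name for name, value in candidates.items() if math.isclose(value, score)]
    if len(matches) == 1:
        return matches[0]
    # A single set (or identical sets) makes several aggregates indistinguishable.
    return "ambiguous" if matches else "unknown"


# Loads from the 05/31/2024 entry above.
loads = [55] * 6 + [60] * 5 + [65]
rule = infer_scoring(695, loads)
heaviest = max(loads)
print(rule, heaviest)
```

<p>For the entry above this reports a sum, so the value to use as the heaviest snatch is the largest load in the sets field (65), not 695. With a single logged set every aggregate matches, which is one of the less obvious cases.</p>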
<h3>LLM-powered pipeline</h3>
<p>And so after many weeks of experimenting, refining, rewriting, and adjusting based on real data (thanks to amazing volunteers in my gym), I now have a pretty smart and capable pipeline that turns unstructured SugarWOD data into structured PRzilla data:</p>
<figure><img src="/images/my-fitness-from-spreadsheet-to-an-app/my-fitness-from-spreadsheet-to-an-app-9.png"/></figure>
<p>One of the biggest findings (and one of the things that slowed me down) was realizing that the giant, monolithic LLM prompt we used to generate a giant JSON with a dozen different fields (gpp, modality, difficulty, benchmarks, classification, etc.) was taking way too much time, was way too expensive, and <strong>often produced errors</strong> as it tried to do too many things at once.</p>
<h3>Parallelization for the win</h3>
<p>I then switched to a series of small, targeted LLM parsers/prompts (as seen on the diagram above), one for each of the metrics, and ran them in parallel. The results were astonishing: faster and cheaper execution and <em>much</em> more accurate results. This also gave me the flexibility to run only specific parsers in specific cases; e.g. when parsing your historic data we want to extract movements (to feed them into our proficiency metrics), performance levels (to understand how your fitness level progressed), time domain (to see a time domain breakdown), and so on. We don’t care about the coaching/scaling/stimulus modules since the workouts are in the past and users don’t need that information! However, those modules are important for <em>new</em> workouts, when using <a href="https://przilla.app/wod/analyze">analyze</a> or <a href="https://przilla.app/wod/generate">generate</a>.</p>
<p>Running the parsers concurrently also meant that a WOD analysis now took <code>time_of_slowest_module</code> (usually ~12-15 sec) rather than the sequential <code>SUM(module1, module2, …)</code>, which would usually take up to 40 sec!</p>
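<p>To make the timing argument concrete, here is a small Python sketch using <code>asyncio.gather</code>. It is a stand-in rather than the actual pipeline: the module names, the latencies, and the <code>MODULES_BY_MODE</code> mapping are all made up, and <code>asyncio.sleep</code> simulates an LLM call:</p>

```python
import asyncio
import time

# Hypothetical module names and latencies; asyncio.sleep stands in for an LLM call.
LATENCY = {"movements": 0.05, "time_domain": 0.10, "performance_levels": 0.15, "coaching": 0.12}

# Historic imports skip the coaching module; analyzing a new workout runs everything.
MODULES_BY_MODE = {
    "import": ["movements", "time_domain", "performance_levels"],
    "analyze": ["movements", "time_domain", "performance_levels", "coaching"],
}


async def run_module(name: str, wod_text: str) -> tuple[str, str]:
    await asyncio.sleep(LATENCY[name])
    return name, f"{name} result for {len(wod_text)} chars"


async def analyze(wod_text: str, mode: str) -> dict[str, str]:
    # All selected modules start together; gather waits for the slowest one.
    tasks = [run_module(name, wod_text) for name in MODULES_BY_MODE[mode]]
    return dict(await asyncio.gather(*tasks))


start = time.perf_counter()
results = asyncio.run(analyze("14 Minute AMRAP: ...", "import"))
elapsed = time.perf_counter() - start
print(sorted(results), f"{elapsed:.2f}s")
```

<p>With these simulated latencies the three modules would take 0.30s back to back, while the concurrent run finishes in roughly the time of the slowest one (0.15s).</p>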
<h3>Unstructured to structured</h3>
<p>The end result is incredible. We’re able to turn a textual mess like this into an <strong>actual workout</strong> and your <strong>actual performance</strong>. Here we see that a 14min AMRAP was properly parsed into movements like “Wall Walk” and “Front Squat”; that it’s an endurance- and stamina-heavy workout (yep!); that it’s classified as “Very Hard”; and that its modality is split equally between “Gymnastics” and “Weightlifting”. Moreover, AI determined that the user’s score of 80 falls right around L3 (which we would likely adjust to L3.5 or L4 due to the workout’s difficulty):</p>
<figure><img src="/images/my-fitness-from-spreadsheet-to-an-app/my-fitness-from-spreadsheet-to-an-app-10.png"/></figure>
<p>Parsing “Wall Walk” as a movement is what allows us to count that towards your practice score! Notice that we now know that you’ve done “Wall Walk” 32 times in your life, with the most recent one being 6 months ago.</p>
<figure><img src="/images/my-fitness-from-spreadsheet-to-an-app/my-fitness-from-spreadsheet-to-an-app-11.png"/></figure>
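<p>Once movements are parsed out of every workout, the practice count boils down to a small aggregation. A minimal Python sketch, with made-up workout data and a hypothetical <code>practice_summary</code> helper (not PRzilla’s actual code):</p>

```python
from collections import defaultdict
from datetime import date

# Made-up parsed workouts: (date performed, movements the parser extracted).
WORKOUTS = [
    (date(2024, 11, 2), ["Wall Walk", "Front Squat"]),
    (date(2025, 1, 15), ["Front Squat", "Double Under"]),
    (date(2025, 3, 30), ["Wall Walk"]),
]


def practice_summary(workouts):
    """Count lifetime sessions per movement and remember the latest date."""
    summary = defaultdict(lambda: {"sessions": 0, "last": None})
    for performed_on, movements in workouts:
        for movement in set(movements):  # one session per workout, even if listed twice
            entry = summary[movement]
            entry["sessions"] += 1
            if entry["last"] is None or performed_on > entry["last"]:
                entry["last"] = performed_on
    return dict(summary)


summary = practice_summary(WORKOUTS)
print(summary["Wall Walk"])
```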
<p>Now that we have this structured data, the possibilities—all of a sudden—are endless. We can easily, and more importantly, <strong>automatically</strong> show your strength levels: powerlifting, weightlifting, crossfit total:</p>
<figure><img src="/images/my-fitness-from-spreadsheet-to-an-app/my-fitness-from-spreadsheet-to-an-app-12.png"/></figure>
<p>We can derive how good you are at Endurance, Stamina, Power and other GPP components based on your scores on WODs that are high in those:</p>
<figure><img src="/images/my-fitness-from-spreadsheet-to-an-app/my-fitness-from-spreadsheet-to-an-app-13.png"/></figure>
<p>Because we’ve determined the time domain of all the WODs you’ve ever done, we can show if you’re leaning towards shorter or longer ones. Yes, parsing 1300 entries is expensive but at least we can marvel at the end result 😅. Here is coach Mike’s real data dating back to 2018. You can see that the early years prioritized short WODs (&lt;12min), whereas in the last couple of years the focus has shifted towards longer, HYROX-style ones:</p>
<figure><img src="/images/my-fitness-from-spreadsheet-to-an-app/my-fitness-from-spreadsheet-to-an-app-14.png"/></figure>
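<p>A chart like this needs only a year and a duration per workout. Below is a rough Python sketch of the bucketing with made-up entries. Only the 12-minute boundary comes from the chart above; the 30-minute cut and the bucket names are arbitrary assumptions:</p>

```python
from collections import Counter, defaultdict

# Made-up (year, duration in minutes) pairs, as a time-domain parser might emit.
ENTRIES = [(2018, 8), (2018, 10), (2018, 25), (2024, 35), (2024, 9), (2024, 45)]


def bucket(minutes: float) -> str:
    # Only the 12-minute boundary comes from the post; 30 is an arbitrary cut.
    if minutes >= 30:
        return "long"
    if minutes >= 12:
        return "medium"
    return "short"


def distribution_by_year(entries):
    """Count workouts per time-domain bucket for each year."""
    per_year = defaultdict(Counter)
    for year, minutes in entries:
        per_year[year][bucket(minutes)] += 1
    return {year: dict(counts) for year, counts in per_year.items()}


dist = distribution_by_year(ENTRIES)
print(dist)
```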
<p>And, of course, we have the ability to see all the WODs for any given movement (which is why it’s so important to parse those for all the custom WODs and create proper associations). Here you can see that Mike has done 232 lifetime front squat sessions over 8 years, 157 as dedicated lifts and 75 as part of WODs:</p>
<figure><img src="/images/my-fitness-from-spreadsheet-to-an-app/my-fitness-from-spreadsheet-to-an-app-15.png"/></figure>
<p>In the interest of brevity, I’ll stop right here. There are other things powering this pipeline which I’m still refining and perhaps can talk about later: male vs. female benchmarks, age-based adjustment of strength and fitness metrics, smart movement aggregation for practice skill screen, logging import errors like movements that don’t match in our DB, or a smart system of retrying LLM when parsers fail.</p>
<h3>End goal</h3>
<p>Now that I’ve gotten here, I can’t help but wonder: <strong>what’s next</strong>? and <strong>what’s the end goal</strong>? I can now replace spreadsheets with this app but it doesn’t solve all of my use cases. The dream would be to have an app that <strong>can track all of my workouts</strong>. This means:</p>
<ul>
<li>
<p>It needs to be a native (mobile) app</p>
<ul>
<li>
<p>Web apps are great but when in the gym and on the go—let’s be honest—we all prefer native apps.</p>
</li>
</ul>
</li>
</ul>
<ul>
<li>
<p>Replace SugarWOD completely?</p>
<ul>
<li>
<p>PRzilla is able to do this by parsing previous data but what about future data?</p>
</li>
<li>
<p>I would need to either:</p>
<ul>
<li>
<p>Implement SugarWOD API integration that’s tied to a box directly.</p>
</li>
<li>
<p>Implement some sort of image recognition of a WOD (snap a TV in your box) that can then be logged directly into our system.</p>
</li>
</ul>
</li>
</ul>
</li>
<li>
<p>Allow custom sets and reps logging</p>
<ul>
<li>
<p>This is a big one… and it would allow me to switch completely away from Strong app.</p>
</li>
<li>
<p>But first… I’ll need to port Strong app data into our system (perhaps more on that in later posts!)</p>
</li>
</ul>
</li>
</ul>
<div class="footnote" id="footnote-1"><sup>1</sup>
<p>Alex Viada came out with <a href="https://www.goodreads.com/author/list/13851793.Alex_Viada">Hybrid Athlete</a> around the same time.</p>
</div>]]></content><author><name></name></author><category term="other" /><summary type="html"><![CDATA[My Fitness: from spreadsheet to an app My fitness journey—as is the case with many teenagers—began with bodybuilding. I wanted to look good. Soon after, I found StrongLifts 5x5 and got into powerlifting. It became all about numbers: getting bigger bench, bigger squat, bigger press. Yet, I’ve always been drawn to the notion of Total Athleticism, as coined by Max Shank in one of the articles on T-Nation that I used to read religiously around 20151. Having a 300 bench was cool but I didn’t want to be one of those powerlifters who had massive numbers yet couldn’t run up the stairs. I wanted to also be good at running, calisthenics, kettlebells. Eventually this brought me to “functional fitness” and, of course, CrossFit, which popularized it circa 2000. CrossFit standards In my fitness circles, CrossFit was still criticized for its reckless high-skill olympic movements performed at high intensity. Blame the epic fail videos of someone doing something stupid and the ignorance around the methodology. While I was on the offense about doing actual CrossFit, I loved the “variable movements” concept. I found a couple of “crossfit athlete standards” posters online and made this spreadsheet to track my progress across multiple domains. It immediately exposed all my gaps: I could squat 2x bodyweight but my snatch was at a measly 100lb and all the speed and work capacity tests were barely at level 2: If I wanted to be an all-around developed athlete, these were the things I had to work on. The standards also served as an “objective” benchmark. To consider yourself “advanced” here’s how many pull-ups you had to be able to do; and this is how fast your 1 mile run would have to be. It gave me a concrete goal to work towards. These spreadsheets became my north star for the following few years. For a challenge junky like me, they were a perfect long-term obsession. 
Strength and Skill The spreadsheet overload was real. This wasn’t the first one I used. As far back as 2011, I found exrx.net Strength Standards and created this view to understand where I stand strength-wise and what to work on: In the last couple years I started tracking my frequency and total-lifetime-session-count of certain movements I wanted to be better at — a concept I wrote about before: Finally, I tracked my proficiency levels on various CrossFit -specific movements as a way to advance my skill and become fluent in them during WOD’s. One app to rule them all The year was 2025. I was a software engineer.And I would still manually update a spreadsheet with the number of times I’ve performed a certain movement that I deemed as “needed practice“. This was embarrassing. When I embarked on building PRzilla, I realized that perhaps this was the time to ditch manual spreadsheet tracking. I could now build an app that would have all of this backed in: show your fitness level across multiple movement patterns/domains (strength, endurance, gymnastics, work capacity, etc.) show your raw strength benchmarks (squat, deadlift, snatch, push press, etc.) show your skill proficiency as a “lifetime sessions performed” if you’ve done ring muscle-ups only 20 times in your life, you’re unlikely to be better at them vs. someone who has done them 120 times show your skill ownership as a “max consecutive reps able to do“ being able to do 50 consecutive kipping pull-ups means you own them; this movement is unlikely to be your limiting factor in any WOD that has them Whoop, Apple Fitness, and the rise of quantified fitness I use Whoop and I absolutely love how it’s able to distill complex/boring HRV/RHR metrics into simple, quantified scores like recovery and strain. Whoop and Apple Fitness—that’s just as big on quantifiable fitness—were a big motivation for this app. 
On the other hand there are apps like BTWB which is one of the most extensive Crossfit-style workout tracking tools, but I found its UI unintuitive and UX clunky: A snapshot of your fitness And so I turned all my spreadsheets into a simple snapshot consisting of 5 views: your level, balance, strength, ownership, and practice. These could be easily extended in the future with any other modules: time domain distribution, specific goals tracking like work capacity or endurance. Or even sport-specific ones like Hyrox. SugarWOD parser In order to turn all my workout data into beautiful charts, I needed to… have that data in the first place. One issue was that it was split between: SugarWOD — scores of WODs prescribed by my box that I did wodwell.com — common WODs that I did on my own Strong app — traditional strength training workouts that are not WODs (aka sets and reps) Importing wodwell scores was easy since it was just a map of common wod (fran, murph, etc.) to a score. Strong app export would be a lot more involved since I would have to implement sets and reps tracking (as well as workout sessions, potential rest values, and so on). So I decided to focus on SugarWOD import. And this is where the fun began.Good news: SugarWOD allows easy export of all of you workout history.Bad news: SugarWOD data is very… unstructured. Thanks for reading Juriy’s Substack! Subscribe for free to receive new posts and support my work. Here’s an example of CSV: 09/14/2022,WOD,24 Minute AMRAP:Row 240m12 Lateral Burpees over Back of Rower48 Double Unders24 Alternating Front Foot Elevated Reverse Lunges (53/35)*,3.073,3+73,Rounds + Reps,,"[{""rnds"":3,""reps"":73}]",,SCALED, As you can see, we have an arbitrary, potentially non-descriptive wod title like “WOD” plus gobbled up, plain-text description like “24 Minute AMRAP:Row 240m12 Lateral Burpees over Back of Rower48 Double Unders24 Alternating Front Foot Elevated Reverse Lunges (53/35)*” that’s missing basic formatting / newlines. 
As humans, we’re able to quickly parse this into: 24 Minute AMRAP: Row 240m 12 Lateral Burpees over Back of Rower 48 Double Unders 24 Alternating Front Foot Elevated Reverse Lunges (53/35)* Thank god we live in the age of LLM’s which are capable of reasoning through a messy jammed up text like this just like we—humans—do. Another peculiarity was load-based entries. In SugarWOD you can program them in a workout and specify sets and reps, e.g. 5 sets of 3 snatches. Users can then log a value for each of the 5 sets. In order to present your performance on the leaderboard, SugarWOD allows coaches to specify how to score those sets — max value (who got the highest weight)? lowest value (who got the fastest row time)? sum of all values (who did the most work overall)? and so on. The export doesn’t expose this scoring criteria so parser needs to infer it based on the sets data. In the example below, we can see that the scoring was using SUM of 12 sets and so 695 is not the weight user did as a 1RM squat snatch; the real squat snatch values are in the sets field: 05/31/2024,WOD,12 ROUNDS:30 Second CAP:3 Toes to Bar2 Lateral Barbell Burpees1 Squat Snatch*REST 1 Minute.*Increase weight as able.,695,695,Load,,"[{""success"":true,""load"":55},{""success"":true,""load"":55},{""success"":true,""load"":55},{""success"":true,""load"":55},{""success"":true,""load"":55},{""success"":true,""load"":55},{""success"":true,""load"":60},{""success"":true,""load"":60},{""success"":true,""load"":60},{""success"":true,""load"":60},{""success"":true,""load"":60},{""success"":true,""load"":65}]",,RX, In this case it’s “obvious” that 695 wasn’t a 1RM snatch (current world record is 496lbs) but some cases are much less obvious so the parser needs to be very careful there. 
LLM-powered pipeline And so after many weeks of experimenting, refining, rewriting, and adjusting based on real data (thanks to amazing volunteers in my gym), I now have a pretty smart and capable pipeline that turns unstructured SugarWOD data into a structured PRzilla data: One of the biggest findings—and things that slowed me down—was realizing that a giant, monolithic LLM prompt we used to generate giant JSON with a dozen of different fields (gpp, modality, difficulty, benchmarks, classification, etc.) was taking way too much time, was way too expensive, and often produced errors as it tried to do too many things at once. Parallelization for the win I then switched to a series of small, targeted LLM parsers/prompts—as seen on the diagram above—for each of the metrics and ran them in parallel. The results were astonishing: faster and cheaper execution and much more accurate results. This also gave me flexibility to run only specific parsers in specific cases; e.g. when parsing your historic data we want to extract movements (to feed it into our proficiency metrics), performance levels (to understand how your fitness level progressed), time domain (to see a time domain breakdown), and so on. We don’t care about coaching/scaling/stimulus module since the workouts are in the past and users don’t need to know that! However, those modules are important for new workouts, when using analyze or generate. Finally, it allowed me to run these parsers in parallel which meant that a WOD analysis was now taking time_of_slowest_module (usually ~12-15sec) rather than sequential SUM(module1, module2, …) that would usually take up to 40 sec! Unstructured to structured The end result is incredible. We’re able to turn a textual mess like this, into an actual workout and your actual performance. 
Here we see that 14min AMRAP was properly parsed into movements like “Wall Walk” and “Front Squat”; that it’s an endurance and stamina -heavy workout (yep!), that it’s classified as “Very Hard” and its modality are equally “Gymnastics” and “Weighlifting”. Moreover, AI determined that user’s score of 80 falls right around L3 (which we would likely adjust to L3.5 or L4 due to workout’s difficulty): Parsing “Wall Walk” as a movement is what allows us to count that towards your practice score! Notice that we now know that you’ve done “Wall Walk” 32 times in your life with the recent one being 6 months ago. Now that we have this structured data, the possibilities—all of a sudden—are endless. We can easily, and more importantly, automatically show your strength levels: powerlifting, weightlifting, crossfit total: We can derive how good you are at Endurance, Stamina, Power and other GPP components based on your scores on WODs that are high in those: Because we’ve determined time domain of all the wods you’ve ever done, we can show if you’re leaning towards shorter or longer ones. Yes, parsing 1300 entries is expensive but at least we can marvel at the end result 😅. Here is coach Mike’s real data dating back to 2018. You can see that early years prioritize short WODs (&lt;12min) whereas last couple years the focus has shifted towards longer, HYROX-style ones: And, of course, we have ability to see all the WODs for any given movement (why it’s so important to parse those for all the custom WODs and create proper associations). Here you can see that Mike has done over 232 lifetime front squat sessions over 8 years, 157 as dedicated lifts and 75 as part of WODs: In the interest of brevity, I’ll stop right here. There are other things powering this pipeline which I’m still refining and perhaps can talk about later: male vs. 
female benchmarks, age-based adjustment of strength and fitness metrics, smart movement aggregation for practice skill screen, logging import errors like movements that don’t match in our DB, or a smart system of retrying LLM when parsers fail. End goal Now that I’ve gotten here, I can’t help but wonder: what’s next? and what’s the end goal? I can now replace spreadsheets with this app but it doesn’t solve all of my use cases. The dream would be to have an app that can track all of my workouts. This means: It needs to be a native (mobile) app Web apps are great but when in the gym and on the go—let’s be honest—we all prefer native apps. Replace SugarWOD completely? PRzilla is able to do this by parsing previous data but what about future one? I would need to either: Implement SugarWOD API integration that’s tied to a box directly. Implement some sort of image recognition of a WOD (snap a TV in your box) that can then be logged directly into our system. Allow custom sets and reps logging This is a big one… and it would allow me to switch completely away from Strong app. But first… I’ll need to port Strong app data into our system (perhaps more on that in later posts!) 1 Alex Viada came out with Hybrid Athlete around the same time.]]></summary></entry><entry><title type="html">PRzilla: CrossFit AI companion</title><link href="http://perfectionkills.com/przilla-crossfit-ai-companion/" rel="alternate" type="text/html" title="PRzilla: CrossFit AI companion" /><published>2025-09-10T00:00:00+00:00</published><updated>2025-09-10T00:00:00+00:00</updated><id>http://perfectionkills.com/przilla-crossfit-ai-companion</id><content type="html" xml:base="http://perfectionkills.com/przilla-crossfit-ai-companion/"><![CDATA[<h2>PRzilla: CrossFit AI companion</h2>

<h2>Why</h2>

<p>When I left LinkedIn, I itched to build something in the space dear to my heart — fitness and CrossFit specifically. I also wanted a challenge of building a full-stack app, something I’ve never done before. The app would solve my pain points but I wanted to release it out there for anyone to use. This meant database, auth, users, and production-level user experience. It would be the biggest project I’ve ever done. With the rise of AI-assisted coding, it was a perfect time.</p>

<h2>Problem</h2>

<p>I’ve been using <a href="https://www.sugarwod.com/">SugarWOD</a> to track scores for CrossFit workouts (<a href="https://www.crossfit.com/essentials/what-is-a-crossfit-workout">WODs</a>) prescribed by my gym. But SugarWOD was never designed to be a standalone tracker: it’s missing many WODs, those that are there don’t have any info, and it doesn’t have a way to discover new ones. So I supplemented it with <a href="https://wodwell.com">wodwell.com</a> to find more workouts and track their scores. </p>

<p>Wodwell has its own issues: full of ads, a clunky UI, and it's slow. More importantly, I wanted to be in control of my data and wodwell has no export. It also wasn’t great that my workout history and performance data was split between two platforms.</p>

<h2>What</h2>

<p>All of this sounded like a perfect opportunity to build just that: a full-stack app with an incredibly easy and fast search through the 1,000 most popular WODs. It would let you log scores for any of them, track your progression over time, and favorite WODs for later. As I started coding with AI, I quickly realized I could go <em>even further</em>: we could get insights into WODs via AI analysis (time domain, difficulty, L1-10 benchmarks, etc.).</p>

<div class="video-wrapper">
  <iframe width="560" height="315" src="https://www.youtube.com/embed/FZZZuicIssY" title="PRzilla walkthrough" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
</div>

<h2>How</h2>

<p>Before I set out on this journey I wanted to define a few foundational tenets that were non-negotiable.</p>

<h4>AI-driven</h4>

<p>“Vibe coding” exploded as I was starting this project. My LinkedIn feed was full of “this is incredible” and “this will never work” posts. I came across <a href="https://addyo.substack.com/p/why-i-use-cline-for-ai-engineering">Addy’s article on Cline</a> and decided to build this app entirely with AI as a matter of principle. <strong>No manual coding</strong>. It would be a perfect experiment, since this app was not a trivial one-pager vibe-coded in a day.</p>

<h4>Mobile-ready</h4>

<p>Always a fun UI challenge, and certainly a must these days unless you provide a native app. In the context of CrossFit, you often need to look things up or log your scores while in the gym. Every page needed to be responsive and every UI concept needed to be adapted to small and large screens.</p>

<h4>Dark mode</h4>

<p>Not a terribly complicated constraint, and largely solved by using the right foundational abstractions, but it does add cognitive complexity, especially if you’re working with AI, as you need to ensure it complies and uses the right tokens.</p>

<h4>Stateful</h4>

<p>An often overlooked aspect, but it’s what separates a polished, predictable app from a clunky, frustrating experience. URLs are the source of truth: every important UI state change needs to be reflected in them. Now you have the power to reload a page, bookmark it, share it, go back, and so on.</p>
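
<p>A minimal sketch of treating the URL as the source of truth, assuming a hypothetical <code>WodFilters</code> shape (the field names and both helpers are illustrative, not taken from PRzilla’s code). State is serialized into a query string and parsed back, so reload, bookmark, share, and back all restore the same view:</p>

```typescript
// Hypothetical filter state for a WOD list page; field names are assumptions.
type WodFilters = {
  search: string;
  category: string | null;
  page: number;
};

const DEFAULT_FILTERS: WodFilters = { search: "", category: null, page: 1 };

// Write only non-default values so URLs stay short and shareable.
function filtersToQuery(filters: WodFilters): string {
  const params = new URLSearchParams();
  if (filters.search !== DEFAULT_FILTERS.search) params.set("search", filters.search);
  if (filters.category !== null) params.set("category", filters.category);
  if (filters.page !== DEFAULT_FILTERS.page) params.set("page", String(filters.page));
  return params.toString();
}

// Parse a query string back into state, falling back to defaults on bad input.
function queryToFilters(query: string): WodFilters {
  const params = new URLSearchParams(query);
  const page = Number.parseInt(params.get("page") ?? "", 10);
  return {
    search: params.get("search") ?? DEFAULT_FILTERS.search,
    category: params.get("category"),
    page: Number.isInteger(page) ? Math.max(page, 1) : DEFAULT_FILTERS.page,
  };
}
```

<p>Keeping defaults out of the query string is a design choice: the plain page URL stays clean, and any link that does carry parameters describes exactly what differs from the default view.</p>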

<h4>Fast</h4>

<p>Next.js is known for SSR support out of the box; this means fast server-driven apps. This was a great opportunity for me to learn and experiment with these concepts.</p>

<blockquote>

<p>Big lesson I took from introducing these in the beginning: each constraint is a liability, <strong>another dimension</strong> to your product surface. Be careful with creating too many from the start. Think of the iPhone and its lack of copy-paste in its first few years.<br/><br/>A feature alone is a single point or a line (1D).</p>

<ul>

<li>Add a "mobile-ready" constraint, and that line now exists on a 2D plane (feature x device). You have to test both states.</li>

<li>Add "dark mode," and the plane becomes a 3D cube (feature x device x theme).</li>

<li>Add "SSR-ready," and you're now in a 4D space.</li>

</ul>

</blockquote>
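
<p>The arithmetic behind that growth can be sketched by enumerating the test matrix. The dimension values below are illustrative assumptions, and <code>combinations</code> is a hypothetical helper rather than anything from the app:</p>

```typescript
type Dimensions = { [name: string]: string[] };
type Scenario = { [name: string]: string };

// Each constraint adds one dimension; the values are assumptions for illustration.
const dimensions: Dimensions = {
  device: ["mobile", "desktop"],
  theme: ["light", "dark"],
  rendering: ["ssr", "client"],
};

// Build every combination (cartesian product) of the dimension values.
function combinations(dims: Dimensions): Scenario[] {
  let scenarios: Scenario[] = [{}];
  for (const [name, values] of Object.entries(dims)) {
    scenarios = scenarios.flatMap((partial) =>
      values.map((value) => ({ ...partial, [name]: value })),
    );
  }
  return scenarios;
}

const matrix = combinations(dimensions);
console.log(matrix.length); // 2 * 2 * 2 = 8 scenarios per feature
```

<p>Every additional two-valued constraint doubles the number of states each feature has to be checked in.</p>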

<h2>From Zero to SaaS in 150 Days</h2>

<p>I’ve now spent about 5 months working on this daily-ish. I <a href="https://kangax.substack.com/p/building-an-app-with-cline-claude">learned a ton</a> <a href="https://kangax.substack.com/p/gemini-25-pro-and-the-meta-engineering">about AI assisted coding</a> <a href="https://kangax.substack.com/p/memory-bank-and-the-many-failures">and wrote about</a> <a href="https://kangax.substack.com/p/fast-cheap-or-smart-pick-three-the">most of it</a>. The lessons never stop and I post them <a href="https://www.linkedin.com/in/juriyzaytsev/recent-activity/all/">weekly on LinkedIn</a>. </p>

<p>What started as a simple way to see the most common WODs quickly turned into a powerful UI that lets you find just the right workout. With the power of AI, <a href="https://kangax.substack.com/p/using-ai-to-accurately-predict-crossfit">I’ve gone deep on classifying workouts</a> to create helpful data that doesn’t exist anywhere else: difficulty, modality, training stimulus, time domain, and workout characteristics via tags.</p>

<p>Here is what I get when I ask AI to summarize the complexity<sup id="footnote-anchor-1"><a href="#footnote-1">1</a></sup> of the app now:</p>

<blockquote>
<p>PRZilla is a large-scale production web application with 123,000 lines of TypeScript code across 818 files, featuring 19 database tables, ~109 React components in 304 TSX files, and 67 tRPC API procedures. The codebase includes 1,532 test cases with 256 E2E tests across 40 test files ensuring critical user journeys, 58 service modules handling complex business logic including 6 AI-powered features, and manages 922 predefined workouts with sophisticated scoring algorithms. This represents approximately 2-3 years of full-time development effort, comparable in complexity to a mid-sized SaaS product.</p>
</blockquote>

<p>It’s incredible to see the kind of power you wield with AI. The breadth and depth of functionality certainly feels like it would have taken me 2-3 years. I haven’t written any of this code and honestly can’t imagine having to ever write code manually again. </p>

<p>Cutting wood by hand is slow. Using an electric saw freehand is fast, but it’s how you get a crooked cut. The real leverage comes when you bolt the saw in place at a precise angle, set the exact speed, and let it execute a perfect cut in a minute.</p>

<p>That is exactly how I build software now. I don’t write code manually. And I don't just hand a task to an AI. Instead, I architect the system, protect it with guardrails so it stays the course, and give it specific instructions so it knows exactly the path to follow.</p>

<p>My role has changed: <strong>I am the architect and the guardrail engineer.</strong></p>

<h2>The Hard Part is Still the Hard Part</h2>

<p>Having spent a good amount of time not only developing new features but also refactoring, redesigning UI, and fixing bugs, I can say with confidence: <strong>your app will not fall apart</strong>. AI is capable of handling 95% of the work. The remaining 5% are complex cases that usually reside at the edges of larger system integrations OR are just complex in nature. Those would also be complex for a human, likely even more so.</p>

<p>For example, I struggled to implement well-working lazy loading of WOD cards on the main page. There was already complex state management for various filters that all had to work in unison and support SSR; introducing lazy loading created an X^Y^Z level of state management complexity, and AI struggled to keep everything together without small bugs popping up here and there.</p>

<p>These are the fundamentally hard issues inherent to engineering. AI offers no magic wand for challenges like:</p>

<ul>

<li>

<p>The "dependency hell" of npm packages.</p>

</li>

<li>

<p>The chaos of flaky end-to-end tests.</p>

</li>

<li>

<p>Navigating features with no documentation.</p>

</li>

</ul>

<p>AI also can’t make your app stable if the underlying structure is rotten: fragmented state, logic duplication, complex branches with subtle bugs. But it’s surprisingly good at finding those and fixing them in a heartbeat.</p>

<h2>Code != Product</h2>

<p>When I look at the app right now I feel like it would have taken me <em>much less time</em> to build the “final” version. Yet, the reality is that development works like this in non-trivial apps:</p>

<figure><img src="/images/przilla-crossfit-ai-companion/przilla-crossfit-ai-companion-1.png"/></figure>

<div>

<p>AI allows you to travel that curvy path much faster. You have to be careful, though: without proper guardrails you can start swinging too far left and right, creating too much code, running too many experiments, and pushing things to prod too fast, all of which adds up to too much liability.</p>

</div>

<h2>Production-ready</h2>

<p>You develop a feature, you have 1 problem.<br/>You decide to release it into production, now you have 10 problems.</p>

<p>Besides the app looking “good” and working “smooth”, the most important production-level aspect is making sure you don’t break things. In the last 10 years I’ve worked at big companies where, despite often being on call, you always have dedicated SRE help. You also have a well-oiled infra machine to detect errors in prod and notify you.</p>

<p>Thankfully, for small full-stack apps like mine, platforms like <a href="https://posthog.com/">PostHog</a> &amp; <a href="https://sentry.io/">Sentry</a> are incredible and provide all-in-one solutions for error monitoring (and more) with generous basic tiers.</p>

<h4>No broken windows</h4>

<p>I followed a pretty standard, tiered approach to release things safely:</p>

<ul>

<li>

<p>TypeScript must pass</p>

</li>

<li>

<p>Linter must pass</p>

</li>

<li>

<p>Unit tests must pass</p>

</li>

<li>

<p>E2E tests must pass</p>

</li>

<li>

<p>Test locally to ensure things work</p>

</li>

<li>

<p>Always push to a branch in production (<a href="https://vercel.com/">Vercel</a> makes it easy). This is basically your staging environment since it’s hitting production DB. </p>

</li>

<li>

<p>Manually test the feature in the prod branch and merge into main if it works well. An even safer option would be to introduce feature flags with gradual rollouts but I didn’t want that complexity <em>just yet</em>.</p>

</li>

<li>

<p>Finally, watch out for spikes in errors following the rollout of a commit.</p>

</li>

</ul>

<blockquote>

<p>Big takeaway here was not to trust AI with E2E tests. I didn’t pay too much attention to <em>all</em> of the assertions at first, then quickly discovered that bugs weren’t being caught. It turned out the quality of E2E assertions was subpar: tests relied only on visibility checks, many used vague assertions or hard-coded values, and almost none validated data against the database. Tests were slow and flaky due to <code>waitForTimeout</code> calls and text-based or CSS-class selectors. I ended up adding lint rules (via <a href="https://www.npmjs.com/package/eslint-plugin-playwright">eslint-plugin-playwright</a>) to ensure AI doesn’t break this in the future.</p>

</blockquote>
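
<p>A minimal sketch of such a guardrail, assuming a flat ESLint config and tests living under a hypothetical <code>e2e/</code> folder. The rule selection is illustrative rather than the exact set used in PRzilla, and a TypeScript parser is assumed to be configured elsewhere in the file:</p>

```typescript
// eslint.config.mjs (sketch; the file glob and rule selection are assumptions)
import playwright from "eslint-plugin-playwright";

export default [
  {
    // Start from the plugin's recommended flat preset.
    ...playwright.configs["flat/recommended"],
    // Assumed location of the E2E tests.
    files: ["e2e/**/*.ts"],
    rules: {
      ...playwright.configs["flat/recommended"].rules,
      // Fixed sleeps make tests slow and flaky.
      "playwright/no-wait-for-timeout": "error",
      // Prefer auto-retrying assertions over one-shot checks.
      "playwright/prefer-web-first-assertions": "error",
      // Discourage raw CSS or text selectors in favor of user-facing locators.
      "playwright/no-raw-locators": "error",
    },
  },
];
```

<p>Lint rules like these only catch the mechanical problems (sleeps, brittle selectors); whether an assertion checks the right data still needs a human review.</p>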

<h4>Good design</h4>

<p>I struggled with design at first but later found that it’s often a matter of the right prompt. For example, this was prompted to look like Apple Fitness / Whoop in 2025 with Sonnet 4 (which translates to clean, modern and minimal UI with oversized elements):</p>

<figure><img src="/images/przilla-crossfit-ai-companion/przilla-crossfit-ai-companion-2.png"/></figure>

<p>Compare to the old one:</p>

<figure><img src="/images/przilla-crossfit-ai-companion/przilla-crossfit-ai-companion-3.png"/></figure>

<p>To summarize, I think at least 80% of my time was spent on making things polished: figuring out UI/UX, refining UI/UX… endlessly, testing various permutations of the app, thinking through edge cases, ensuring it’s tested well, ensuring it’s feature-complete yet not over-engineered, documenting it well, deploying it correctly, and so on.</p>

<p>In the next post, I’ll dive deeper into some of the fitness-heavy concepts I’ve implemented in the app. We’ll talk more about that colorful “My Fitness” page and the complex LLM-powered pipeline that powers it!</p>

<div class="footnote" id="footnote-1"><sup>1</sup>

<p>Code metrics don't tell the whole story, but they do provide a rough idea of this app's scale. Recognizing that AI can introduce bloat, I carefully reviewed and streamlined all committed code. I estimate the result is a lean codebase with no more than 15-20% potential cruft.</p>

</div>]]></content><author><name></name></author><category term="other" /><summary type="html"><![CDATA[PRzilla: CrossFit AI companion]]></summary></entry><entry><title type="html">Using AI to accurately predict CrossFit workout difficulty and performance</title><link href="http://perfectionkills.com/using-ai-to-accurately-predict-crossfit/" rel="alternate" type="text/html" title="Using AI to accurately predict CrossFit workout difficulty and performance" /><published>2025-05-31T00:00:00+00:00</published><updated>2025-05-31T00:00:00+00:00</updated><id>http://perfectionkills.com/using-ai-to-accurately-predict-crossfit</id><content type="html" xml:base="http://perfectionkills.com/using-ai-to-accurately-predict-crossfit/"><![CDATA[<h2>Using AI to accurately predict CrossFit workout difficulty and performance</h2>

<figure><img src="/images/using-ai-to-accurately-predict-crossfit/using-ai-to-accurately-predict-crossfit-1.png"/></figure>

<p>One of the things I’ve geeked out on recently was using AI to assign difficulty and performance bands to a <a href="https://www.crossfit.com/essentials/what-is-a-crossfit-workout">WOD</a>. Not just one but 907 of them (and counting). </p>

<p>I’m building <a href="https://przilla.app">PRzilla.app</a> which allows you to log scores for various WODs, track your performance, and see all kinds of cool charts about it: how you’re progressing, what biases and weaknesses there are, and movement prioritization.</p>

<h3>Performance levels</h3>

<p>In order to measure athlete performance, we need to compare it against a set of “objective” levels. For example, it is commonly accepted that <a href="https://www.crossfit.com/231103">Fran</a> can be done within 3 minutes if you’re an elite CrossFit athlete, <a href="https://wodwell.com/wod/fran/">with the rest of the bands looking like this</a>:</p>

<blockquote>

<p>What is a good score for the “Fran” workout?</p>

<ul>

<li>Beginner: 7-9 minutes</li>

<li>Intermediate: 6-7 minutes</li>

<li>Advanced: 4-6 minutes</li>

<li>Elite: &lt;3 minutes</li>

</ul>

</blockquote>

<p><a href="https://wodwell.com/wods/?sort=popular&amp;ref=headernav">Wodwell</a> is likely the biggest database of WODs and it shows bands for some of the most common ones but <strong>not all of them</strong>. I decided to bring these bands to PRzilla and extend them to <em>all workouts</em>.</p>

<p>But how do we measure all of them?</p>

<p>Ideally, we’d have a dedicated panel of experts going over thousands of WODs to figure all of this out. Thankfully, current top-tier AI models are trained on a sufficient volume of CrossFit data and have strong enough reasoning capabilities to do this in much less time.</p>

<h3>Subjectivity</h3>

<p>Here’s the thing: <strong>absolute scores are bound to be subjective</strong> and context-dependent!</p>

<p>Even though Fran times are “commonly accepted” as &lt;3, 4-6, 6-7, 7-9, there can also be decent variation among them when adjusted for male vs. female, year/decade measured (CF performance is <a href="https://games.crossfit.com/article/crossfit-open-workout-252-analysis">usually trending upwards</a>), general population vs. experienced CrossFitters, CrossFitters vs. specialized athletes (runners/weightlifters/calisthenic warriors), country/area or a specific gym (Mayhem vs. your typical box), and many more.</p>

<p>This makes calculations tricky but I think it’s still possible to create a range that resembles an averaged-out, close-enough representation. Our “5k run” results are likely more lax than the ones actual runners would use. But for most workouts, it’s possible to use reasoning to tell that a 6-minute Fran is roughly at the top of intermediate or the bottom of advanced.</p>

<p>More importantly, as long as our scoring system is consistent across the board, it’s great for measuring <strong>relative performance</strong>: either against yourself over time, or against others.</p>

<h3>CrossFit specific AI analysis</h3>

<p>Top-tier AI models are already trained on enough CrossFit data to be able to determine most of these bands. But we need a few extra layers of careful orchestration to create a solid system at large:</p>

<ol>

<li>

<p><strong>SOTA thinking model(s)</strong><br/>I used mostly Gemini 2.5 Pro (sometimes Sonnet 3.7) since those were the best reasoning models at the moment.</p>

</li>

<li>

<p><strong>Prompt and context</strong><br/>Make it act like a CrossFit coach / exercise scientist. “Use your extensive CrossFit knowledge”. Give it extra data to consult and base off of to narrow the scope and context, e.g. Community Cup tiers and measurements.</p>

</li>

<li>

<p><strong>Memory bank and examples</strong><br/>In practice, models are <a href="https://kangax.substack.com/i/161351606/the-real-context-window">limited by a 100-200K context window</a> so we can’t send our entire JSON consisting of millions of tokens. For relative stability across batches, we need to constantly orient our model to return relatively similar calculations. I used a combination of <a href="https://kangax.substack.com/p/memory-bank-and-the-many-failures">memory bank</a> + specific detailed documentation for this plan/feature + a few examples of already existing calculations and reasoning across varied workouts. Reasoning was performed in batches to prevent hallucinations and tackle cost (this was expensive as is).</p>

</li>

<li>

<p><strong>Internal error correction</strong><br/>At the end of each calculation, the model needs to double-check its own analysis of the specific WOD for correctness.</p>

</li>

<li>

<p><strong>External error correction</strong><br/>At the end of each batch of calculations, the model takes a few random <em>existing</em> scores and compares them to the current calculations; this ensures relative stability across many batches.</p>

</li>

</ol>
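
<p>The batching and cross-batch spot checks from the steps above can be sketched roughly as follows. The <code>Wod</code> shape and both helper names are hypothetical, and the actual model calls are left out entirely:</p>

```typescript
// Hypothetical shape; the real data also carries difficulty, type, and levels.
type Wod = { name: string };

// Split the full list into fixed-size batches that fit a context window.
function toBatches(wods: Wod[], batchSize: number): Wod[][] {
  if (!(batchSize >= 1)) throw new RangeError("batchSize must be at least 1");
  const batches: Wod[][] = [];
  for (let start = 0; start < wods.length; start += batchSize) {
    batches.push(wods.slice(start, start + batchSize));
  }
  return batches;
}

// Pick up to `count` already-scored WODs to re-check against the current batch.
// `random` must return values in [0, 1); it is injectable so sampling can be
// made deterministic in tests.
function sampleAnchors(scored: Wod[], count: number, random: () => number = Math.random): Wod[] {
  const pool = [...scored];
  const picked: Wod[] = [];
  const target = Math.min(count, pool.length);
  while (picked.length < target) {
    const index = Math.floor(random() * pool.length);
    picked.push(pool.splice(index, 1)[0]);
  }
  return picked;
}
```

<p>Each batch would then be sent to the model together with the sampled anchors, so that new levels are calibrated against ones that already exist.</p>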

<h3>Percentiles</h3>

<figure><img src="/images/using-ai-to-accurately-predict-crossfit/using-ai-to-accurately-predict-crossfit-2.jpeg"/></figure>

<p>CrossFit <a href="https://games.crossfit.com/article/crossfit-open-workout-253-analysis">popularized percentile-based scores</a> during the Open, and — as part of <a href="https://games.crossfit.com/article/guide-2025-community-cup">Community Cup</a> — they recently rolled out a scoring system that consists of 5 levels that map directly to percentiles — Rookie (&lt;21%), Novice (22-43%), Intermediate (44-65%), Advanced (66-87%), and Pro (&gt;88%).</p>
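
<p>As a sketch, the tier lookup could be written like this. The lower bounds come from the ranges quoted above; treating each tier as starting at its lower bound (which closes the one-point gaps between the published ranges) is an assumption, as are the names <code>TIERS</code> and <code>tierForPercentile</code>:</p>

```typescript
// Tiers ordered from highest to lowest; `from` is the lower bound in percent.
const TIERS: { name: string; from: number }[] = [
  { name: "Pro", from: 88 },
  { name: "Advanced", from: 66 },
  { name: "Intermediate", from: 44 },
  { name: "Novice", from: 22 },
  { name: "Rookie", from: 0 },
];

// Return the first tier whose lower bound the percentile reaches.
function tierForPercentile(percentile: number): string {
  if (!(percentile >= 0) || percentile > 100) {
    throw new RangeError("percentile must be between 0 and 100");
  }
  const tier = TIERS.find((t) => percentile >= t.from);
  return tier ? tier.name : "Rookie";
}
```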

<p>I initially went with Wodwell-inspired 4 tiers of Beginner, Intermediate, Advanced, Elite, then realized that there’s not enough granularity. When working on <a href="https://library.crossfit.com/premium/pdf/CFJ_Williams_WeakBias.pdf">GPP</a> charts, the scale was 1-10 and plotting “Advanced” on it was washing out the result too much. 60% and 80% could both be considered Advanced but going from one to the other might take you a few years! Similarly with the <a href="https://www.przilla.app/charts">GPP wheel chart</a>: if your stamina is at 60% and strength is at 80%, you would want to see that reflected on a chart as unevenness.</p>

<h3>Example: Frelen</h3>

<p>Here is a <a href="https://gist.github.com/kangax/032f6335b11cce9324e73568363bc784">raw example</a> of one of the calculations and the model’s reasoning. I already had a 4-tier system/data that was generated using a similar heuristic; each WOD had a <em>difficulty</em>, <em>difficultyExplanation</em> (the one the model derived from its reasoning before), <em>type</em>, and <em>levels</em>.</p>

<p>I then used a model to derive 10 levels by giving it the existing framework of how we derived those 4 levels + the new system of 10 levels + examples.</p>

<div>

<p>Notice how it analyzes the WOD step by step: it understands that it’s similar to “Helen” and “Eva” since both follow a similar triplet pattern of run, x, pull-ups, with this one being closer to “Eva” in terms of volume; it calculates rough times for the run and thrusters while accounting for fatigue and number of rounds; it widens the edges of the 10-tier scale beyond those of the 4-tier one; and it even considers that, because it’s a “hard” WOD, the beginner level is to be extended by 7min.</p>

</div>

<p>This now allows us to see where our scores stand for <em>any</em> WOD, such as this L7 that falls within 4:00-5:00 for <a href="https://www.przilla.app/?search=Diane">Diane</a>.</p>

<figure><img src="/images/using-ai-to-accurately-predict-crossfit/using-ai-to-accurately-predict-crossfit-3.png"/></figure>

<p>Before doing a workout, you can take a look at the performance guide and have a better idea of which time to shoot for to get into a certain percentile.</p>

<figure><img src="/images/using-ai-to-accurately-predict-crossfit/using-ai-to-accurately-predict-crossfit-4.png"/></figure>

<h3>AI-derived difficulty</h3>

<p>Using similar training and reasoning, I was also able to create difficulty levels for all WODs. You’ve already seen “Frelen” categorized as “Hard” earlier.</p>

<p>Here’s the <a href="https://gist.github.com/kangax/e84b6d5f5eae1d1a8105fe87b97992d1">actual documentation used</a> when orienting AI to work with this data. AI uses this as a framework to understand general structure of a workout, and then adjusts difficulty based on modifiers like volume, skill, and load.</p>

<h3>Difficulty examples</h3>

<p>This produces strikingly accurate results. Note AI’s explanation for <em>why</em> difficulty is set a certain way:</p>

<ul>

<li>

<p><a href="http://przilla.app/?search=1k row">1k row</a> — <strong>Easy</strong>, <em>“A standard benchmark test of 1000 meter rowing speed.“</em></p>

</li>

<li>

<p><a href="http://przilla.app/?search=baseline">Baseline</a><sup id="footnote-anchor-1"><a href="#footnote-1">1</a></sup> — <strong>Easy</strong>, <em>“A classic CrossFit introductory benchmark testing basic rowing and bodyweight movement capacity.”</em></p>

</li>

<li>

<p><a href="http://przilla.app/?search=annie">Annie</a> <sup id="footnote-anchor-2"><a href="#footnote-2">2</a></sup> — <strong>Medium</strong>, <em>“Girl WOD (Ladder Couplet). Tests double-under skill proficiency and core endurance in a fast-paced descending rep scheme (50-40-30-20-10).”</em></p>

</li>

<li>

<p><a href="http://przilla.app/?search=wittman">Wittman</a> <sup id="footnote-anchor-3"><a href="#footnote-3">3</a></sup> — <strong>Medium</strong>, <em>“Hero WOD (Triplet). 7 rounds combining moderate KB swings, light power cleans, and box jumps. Tests moderate power endurance/conditioning.”</em></p>

</li>

<li>

<p><a href="https://www.przilla.app/?search=Dork">Dork</a> <sup id="footnote-anchor-4"><a href="#footnote-4">4</a></sup> — <strong>Hard</strong>, <em>“Hero WOD (Triplet). 6 rounds combining DUs, heavy KB swings (70lb), and burpees. Tests conditioning, skill, and endurance over significant volume.”</em></p>

</li>

<li>

<p><a href="https://www.przilla.app/?search=Kelly">Kelly</a> <sup id="footnote-anchor-5"><a href="#footnote-5">5</a></sup> — <strong>Hard</strong>, <em>“Girl WOD (Triplet). 5 rounds: run, high-vol box jumps, high-vol wall balls. Tests high-volume conditioning/endurance.”</em></p>

</li>

<li>

<p><a href="https://www.przilla.app/?search=Maggie">Maggie</a> <sup id="footnote-anchor-6"><a href="#footnote-6">6</a></sup> — <strong>Very Hard</strong>, <em>“Five rounds of high-volume, high-skill gymnastics movements (HSPU, Pull-ups, Pistols). Tests advanced gymnastics capacity and endurance.”</em></p>

</li>

<li>

<p><a href="https://www.przilla.app/?search=The Seven">The Seven</a> <sup id="footnote-anchor-7"><a href="#footnote-7">7</a></sup> — <strong>Very Hard</strong>, <em>“Hero WOD. 7 rounds of 7 reps: HSPU, heavy thrusters (135lb), KTE, heavy DL (245lb), burpees, heavy KB swings (70lb), pull-ups. Extremely demanding strength/skill/volume across 7 movements.“</em></p>

</li>

<li>

<p><a href="https://www.przilla.app/?search=Atalanta">Atalanta</a> <sup id="footnote-anchor-8"><a href="#footnote-8">8</a></sup> — <strong>Extremely Hard</strong>, <em>“Long Murph-style chipper with vest, high volume gymnastics.“</em></p>

</li>

<li>

<p><a href="https://www.przilla.app/?search=2007 Reload">2007 Reload</a> <sup id="footnote-anchor-9"><a href="#footnote-9">9</a></sup> — <strong>Extremely Hard</strong>, <em>“Long row followed by high-skill gymnastics and heavy shoulder-to-overheads demand elite capacity and strength.“</em></p>

</li>

</ul>

<p>Fun fact: the “Extremely Hard” category did not exist until I introduced CrossFit Games workouts, at which point AI proactively came up with it. It made sense, as the relative difficulty was objectively higher in those! Only 15 out of 907 are currently categorized as such.</p>

<h3>Effort vs Complexity</h3>

<p>Some of you will certainly scoff at a “1k row” categorized as easy. A simple movement like that can <em>absolutely</em> be made into a grueling test of strength, grit, endurance and stamina. The difficulty in PRzilla is not about how hard something <em>can</em> be made but how demanding it is on skill/strength/endurance. 1k row is easy in the sense that it can be performed by almost any person and can be completed with little effort <strong>as prescribed</strong>. You can’t say the same about <a href="https://www.przilla.app/?category=Girl&amp;search=Amanda">Amanda</a>, which will have you do 21 ring muscle-ups together with 21 squat snatches at 135lb: feats that can take you years to master individually, not to mention being able to superset them.</p>

<h3>Extreme skills</h3>

<p>Speaking of extremely hard tests, it was interesting to see how AI estimates something like “<a href="https://www.przilla.app/?category=Girl&amp;search=Triple+unders%3A+max+reps">Triple unders: max reps</a>” or “<a href="https://www.przilla.app/?category=Girl&amp;search=Free+standing+handstand+push-ups">Free standing handstand push-ups: max reps</a>”:</p>

<blockquote>

<p>This is a "Very Hard" test of max unbroken triple-unders. This is an extremely high-skill movement. Even a single rep is a significant achievement for many.</p>

<p>High-skill jump rope variation requiring exceptional timing, coordination, and wrist speed.</p>

<ul>

<li>L1: Cannot complete 0 reps (effectively)</li>

<li>L2: 0 reps</li>

<li>L3: 0 reps</li>

<li>L4: 0 reps</li>

<li>L5: 1-4 reps</li>

<li>L6: 5-10 reps</li>

<li>L7: 11-15 reps</li>

<li>L8: 16-21 reps</li>

<li>L9: 22-36 reps</li>

<li>L10: &gt;=36 reps</li>

</ul>

</blockquote>

<p>Because we maintain relative difficulty, even an intermediate score on such tests is a great achievement. And the model understands that beginners (up until level 5) are unlikely to complete even 1.</p>

<h3>Timeline and adjusted performance</h3>

<p>Once we know your performance levels on all the WODs, it’s easy to plot them over time for a chart like this that <a href="https://www.przilla.app/charts">shows “fitness level” progression and trend</a>. And here’s something even more fun: because we have each WOD’s difficulty, we can adjust your score to be more representative of real-life performance (meaning that getting “Intermediate” in a “Very Hard” WOD is closer to getting “Advanced” in a “Hard” one):</p>

<pre><code>adjustedLevel = cap(scoreLevel + difficultyBonus, 0, 10)</code></pre>

<p>…where difficultyBonus is something simple like:</p>

<p>Easy: -0.5, Medium: +0.0, Hard: +0.5, Very Hard: +1.0, Extremely Hard: +1.5</p>
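
<p>A runnable TypeScript version of the formula above, as a sketch: <code>cap</code> is read here as a clamp to the 0-10 range, the bonus values are the ones listed, and the type and constant names are assumptions:</p>

```typescript
type Difficulty = "Easy" | "Medium" | "Hard" | "Very Hard" | "Extremely Hard";

// Bonus per difficulty, as listed above.
const DIFFICULTY_BONUS: { [D in Difficulty]: number } = {
  Easy: -0.5,
  Medium: 0,
  Hard: 0.5,
  "Very Hard": 1.0,
  "Extremely Hard": 1.5,
};

// Clamp a value into the inclusive range [min, max].
function cap(value: number, min: number, max: number): number {
  return Math.min(Math.max(value, min), max);
}

// Shift the raw level by the difficulty bonus, then keep it within 0-10.
function adjustedLevel(scoreLevel: number, difficulty: Difficulty): number {
  return cap(scoreLevel + DIFFICULTY_BONUS[difficulty], 0, 10);
}

console.log(adjustedLevel(5, "Very Hard")); // 6
console.log(adjustedLevel(9.5, "Extremely Hard")); // 10 (capped)
```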

<figure><img src="/images/using-ai-to-accurately-predict-crossfit/using-ai-to-accurately-predict-crossfit-5.png"/></figure>

<h3>Work in progress</h3>

<p>Give these estimates a try: do they feel right? Could anything be improved? I’m planning to refine these in PRzilla for an even deeper understanding of workout stimulus; similar to Community Cup, we could be better at gender and age group adjustments. There are also gaps right now with certain WODs that have a timecap and so are a hybrid of time (if completed within the timecap) and reps/load (if stopped at the timecap).</p>
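
<p>One possible way to model such hybrid scores, sketched under the assumption that any finish ranks above any capped result. This is not how PRzilla currently stores scores, the sketch covers reps only (not load), and the type and function names are hypothetical:</p>

```typescript
// A time-capped "for time" WOD yields either a finish time or reps at the cap.
type CappedScore =
  | { kind: "finished"; seconds: number }
  | { kind: "capped"; reps: number };

// Comparator for Array.prototype.sort: better scores come first.
// Finishes rank by lower time, capped results by more reps.
function compareScores(a: CappedScore, b: CappedScore): number {
  if (a.kind === "finished" && b.kind === "finished") return a.seconds - b.seconds;
  if (a.kind === "capped" && b.kind === "capped") return b.reps - a.reps;
  return a.kind === "finished" ? -1 : 1;
}
```

<p>With a single ordering like this, performance levels for capped WODs could be expressed as a list of thresholds that starts in reps and continues in time.</p>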

<p>In the future, I’m planning to add an option to input any custom WOD and get an estimate of its difficulty and performance levels.</p>


<div class="footnote" id="footnote-1"><sup>1</sup>

<p>For Time: 500 meter Row, 40 Air Squats, 30 Sit-Ups, 20 Push-Ups, 10 Pull-Ups</p>

</div>

<div class="footnote" id="footnote-2"><sup>2</sup>

<p>50-40-30-20-10 Reps For Time: Double-Unders, Sit-Ups</p>

</div>

<div class="footnote" id="footnote-3"><sup>3</sup>

<p>7 Rounds For Time: 15 Kettlebell Swings (1.5/1 pood), 15 Power Cleans (95/65 lb), 15 Box Jumps (24/20 in)</p>

</div>

<div class="footnote" id="footnote-4"><sup>4</sup>

<p>6 Rounds For Time: 60 Double-Unders, 30 Kettlebell Swings (1.5/1 pood), 15 Burpees</p>

</div>

<div class="footnote" id="footnote-5"><sup>5</sup>

<p>5 Rounds For Time: 400 meter Run, 30 Box Jumps (24/20 in), 30 Wall Ball Shots (20/14 lb)</p>

</div>

<div class="footnote" id="footnote-6"><sup>6</sup>

<p>5 Rounds for Time: 20 Handstand Push-Ups, 40 Pull-Ups, 60 Pistols (Alternating Legs)</p>

</div>

<div class="footnote" id="footnote-7"><sup>7</sup>

<p>7 Rounds for Time: 7 Handstand Push-Ups, 7 Thrusters (135/95 lb), 7 Knees-to-Elbows, 7 Deadlifts (245/165 lb), 7 Burpees, 7 Kettlebell Swings (2/1.5 pood), 7 Pull-Ups</p>

</div>

<div class="footnote" id="footnote-8"><sup>8</sup>

<p>For Time: 1 mile Run, 100 Handstand Push-Ups, 200 Alternating Pistols, 300 Pull-Ups, 1 mile Run. Wear a Weight Vest (20/14 lb)</p>

</div>

<div class="footnote" id="footnote-9"><sup>9</sup>

<p>For Time: 1,500 meter Row, then 5 Rounds of: 10 Bar Muscle-Ups, 7 Shoulder-to-Overheads (235/145 lb)</p>

</div>]]></content><author><name></name></author><category term="other" /><summary type="html"><![CDATA[Using AI to accurately predict CrossFit workout difficulty and performance]]></summary></entry><entry><title type="html">Javascript quiz. ES6 edition.</title><link href="http://perfectionkills.com/javascript-quiz-es6/" rel="alternate" type="text/html" title="Javascript quiz. ES6 edition." /><published>2015-11-04T00:00:00+00:00</published><updated>2015-11-04T00:00:00+00:00</updated><id>http://perfectionkills.com/javascript-quiz-es6</id><content type="html" xml:base="http://perfectionkills.com/javascript-quiz-es6/"><![CDATA[<h2>Javascript quiz. ES6 edition.</h2>

<div id="javascript-quiz-es6">

  <p>
    Remember that <a href="http://perfectionkills.com/javascript-quiz/">crazy Javascript quiz</a> from 6 years ago? Craving to solve another set of mind-bending snippets no sensible developer would ever use in their code? Looking for a new installment of the most ridiculous Javascript interview questions?
  </p>

  <p>Look no further! The "ECMAScript Two Thousand Fifteen" installment of good old Javascript Quiz is finally here.</p>

  <p>The rules are as usual:</p>

  <ul>
    <li>Assuming <a href="http://www.ecma-international.org/ecma-262/6.0">ECMA-262 6th Edition</a></li>
    <li>Implementation quirks do not count (assuming standard behavior only)</li>
    <li>Every snippet is run as global code (not in eval, function, or module contexts)</li>
    <li>There are no other variables declared (and host environment is not extended with anything beyond what's defined in a spec)</li>
    <li>Answer should correspond to exact return value of entire expression/statement (or last line)</li>
    <li>"Error" in answer indicates that overall snippet results in a compile or runtime error</li>
    <li>Cheating with <a href="http://babeljs.io/repl/">Babel</a> doesn't count (and there could even be bugs!)</li>
  </ul>

  <p>
    The quiz goes over such ES6 topics as: <em>classes, computed properties, spread operator, generators, template strings, and shorthand properties</em>. It's relatively easy, but still tricky. It tries to cover various ES6 features — a little bit of this, a little bit of that — but it's certainly still only a tiny subset.
  </p>
  <p>
    If you can think of other silly riddle ideas to break one's head against, please post them in the comments. For a slightly harder version, feel free to explore <a href="https://github.com/kangax/compat-table/blob/gh-pages/data-es6.js#L8157-L8161">some</a> of the <a href="https://github.com/kangax/compat-table/blob/gh-pages/data-es6.js#L1149-L1160">tests</a> in <a href="kangax.github.io/compat-table/es6/">our compat table</a> or perhaps something from TC39 <a href="https://github.com/tc39/test262/tree/master/test/language">official test suite</a>.
  </p>

  <p>Ready? Here we go.</p>

  <ol class="quiz" style="margin-top: 40px">
    <li>

      <p style="margin-bottom: -20px">&nbsp;</p>

      <div>
        <script src="https://gist.github.com/kangax/775877f5e8833629d2dd.js"> </script>
      </div>

      <ul class="answers">
        <li>
          <input type="radio" name="question-1" id="answer-1-1">
          <label for="answer-1-1">[2, 1, 1]</label>
        </li>
        <li>
          <input type="radio" name="question-1" id="answer-1-2">
          <label for="answer-1-2">[2, undefined, 1]</label>
        </li>
        <li>
          <input type="radio" name="question-1" id="answer-1-3">
          <label for="answer-1-3">[2, 1, 2]</label>
        </li>
        <li>
          <input type="radio" name="question-1" id="answer-1-4">
          <label for="answer-1-4">[2, undefined, 2]</label>
        </li>
      </ul>
    </li>
    <li>

      <p style="margin-bottom: -20px">&nbsp;</p>

      <script src="https://gist.github.com/kangax/b4c7ec851cdeaa1d89cd.js"> </script>

      <ul class="answers">
        <li>
          <input type="radio" name="question-2" id="answer-2-1">
          <label for="answer-2-1">['inner', 'outer']</label>
        </li>
        <li>
          <input type="radio" name="question-2" id="answer-2-2">
          <label for="answer-2-2">['outer', 'outer']</label>
        </li>
        <li>
          <input type="radio" name="question-2" id="answer-2-3">
          <label for="answer-2-3">[undefined, undefined]</label>
        </li>
        <li>
          <input type="radio" name="question-2" id="answer-2-4">
          <label for="answer-2-4">Error</label>
        </li>
      </ul>
    </li>
    <li>
      <p style="margin-bottom: -20px">&nbsp;</p>

      <script src="https://gist.github.com/kangax/934a12f3e5f5d1f6a41b.js"> </script>

      <ul class="answers">
        <li>
          <input type="radio" name="question-3" id="answer-3-1">
          <label for="answer-3-1">undefined</label>
        </li>
        <li>
          <input type="radio" name="question-3" id="answer-3-2">
          <label for="answer-3-2">1</label>
        </li>
        <li>
          <input type="radio" name="question-3" id="answer-3-3">
          <label for="answer-3-3">{ x: 1 }</label>
        </li>
        <li>
          <input type="radio" name="question-3" id="answer-3-4">
          <label for="answer-3-4">Error</label>
        </li>
      </ul>
    </li>
    <li>
      <p style="margin-bottom: -20px">&nbsp;</p>

      <script src="https://gist.github.com/kangax/8aba5d9488594e942e87.js"> </script>

      <ul class="answers">
        <li>
          <input type="radio" name="question-4" id="answer-4-1">
          <label for="answer-4-1">["function", "undefined"]</label>
        </li>
        <li>
          <input type="radio" name="question-4" id="answer-4-2">
          <label for="answer-4-2">["function", "function"]</label>
        </li>
        <li>
          <input type="radio" name="question-4" id="answer-4-3">
          <label for="answer-4-3">["undefined", "undefined"]</label>
        </li>
        <li>
          <input type="radio" name="question-4" id="answer-4-4">
          <label for="answer-4-4">Error</label>
        </li>
      </ul>
    </li>
    <li>
      <p style="margin-bottom: -20px">&nbsp;</p>

      <script src="https://gist.github.com/kangax/8b37c698093d9ac196df.js"> </script>

      <ul class="answers">
        <li>
          <input type="radio" name="question-5" id="answer-5-1">
          <label for="answer-5-1">"function"</label>
        </li>
        <li>
          <input type="radio" name="question-5" id="answer-5-2">
          <label for="answer-5-2">"object"</label>
        </li>
        <li>
          <input type="radio" name="question-5" id="answer-5-3">
          <label for="answer-5-3">"undefined"</label>
        </li>
        <li>
          <input type="radio" name="question-5" id="answer-5-4">
          <label for="answer-5-4">Error</label>
        </li>
      </ul>
    </li>
    <li>
      <p style="margin-bottom: -20px">&nbsp;</p>

      <script src="https://gist.github.com/kangax/4d0deaad37316eccf6b3.js"> </script>

      <ul class="answers">
        <li>
          <input type="radio" name="question-6" id="answer-6-1">
          <label for="answer-6-1">"function"</label>
        </li>
        <li>
          <input type="radio" name="question-6" id="answer-6-2">
          <label for="answer-6-2">"object"</label>
        </li>
        <li>
          <input type="radio" name="question-6" id="answer-6-3">
          <label for="answer-6-3">"undefined"</label>
        </li>
        <li>
          <input type="radio" name="question-6" id="answer-6-4">
          <label for="answer-6-4">Error</label>
        </li>
      </ul>
    </li>
    <li>
      <p style="margin-bottom: -20px">&nbsp;</p>

      <script src="https://gist.github.com/kangax/bb2dc76e683f5b239d78.js"> </script>

      <ul class="answers">
        <li>
          <input type="radio" name="question-7" id="answer-7-1">
          <label for="answer-7-1">1</label>
        </li>
        <li>
          <input type="radio" name="question-7" id="answer-7-2">
          <label for="answer-7-2">3</label>
        </li>
        <li>
          <input type="radio" name="question-7" id="answer-7-3">
          <label for="answer-7-3">6</label>
        </li>
        <li>
          <input type="radio" name="question-7" id="answer-7-4">
          <label for="answer-7-4">Error</label>
        </li>
      </ul>
    </li>
    <li>
      <p style="margin-bottom: -20px">&nbsp;</p>

      <script src="https://gist.github.com/kangax/d86c9641a9e9e6d70025.js"> </script>

      <ul class="answers">
        <li>
          <input type="radio" name="question-8" id="answer-8-1">
          <label for="answer-8-1">"function"</label>
        </li>
        <li>
          <input type="radio" name="question-8" id="answer-8-2">
          <label for="answer-8-2">"generator"</label>
        </li>
        <li>
          <input type="radio" name="question-8" id="answer-8-3">
          <label for="answer-8-3">"object"</label>
        </li>
        <li>
          <input type="radio" name="question-8" id="answer-8-4">
          <label for="answer-8-4">Error</label>
        </li>
      </ul>
    </li>
    <li>
      <p style="margin-bottom: -20px">&nbsp;</p>

      <script src="https://gist.github.com/kangax/a72fad89b34f2918724e.js"> </script>

      <ul class="answers">
        <li>
          <input type="radio" name="question-9" id="answer-9-1">
          <label for="answer-9-1">"function"</label>
        </li>
        <li>
          <input type="radio" name="question-9" id="answer-9-2">
          <label for="answer-9-2">"undefined"</label>
        </li>
        <li>
          <input type="radio" name="question-9" id="answer-9-3">
          <label for="answer-9-3">"object"</label>
        </li>
        <li>
          <input type="radio" name="question-9" id="answer-9-4">
          <label for="answer-9-4">Error</label>
        </li>
      </ul>
    </li>
    <li>
      <p style="margin-bottom: -20px">&nbsp;</p>

      <script src="https://gist.github.com/kangax/483d85070c5deacfda80.js"> </script>

      <ul class="answers">
        <li>
          <input type="radio" name="question-10" id="answer-10-1">
          <label for="answer-10-1">"function"</label>
        </li>
        <li>
          <input type="radio" name="question-10" id="answer-10-2">
          <label for="answer-10-2">"undefined"</label>
        </li>
        <li>
          <input type="radio" name="question-10" id="answer-10-3">
          <label for="answer-10-3">"object"</label>
        </li>
        <li>
          <input type="radio" name="question-10" id="answer-10-4">
          <label for="answer-10-4">Error</label>
        </li>
      </ul>
    </li>
    <li>
      <p style="margin-bottom: -20px">&nbsp;</p>

      <script src="https://gist.github.com/kangax/327b3da882d027893e88.js"> </script>

      <ul class="answers">
        <li>
          <input type="radio" name="question-11" id="answer-11-1">
          <label for="answer-11-1">1</label>
        </li>
        <li>
          <input type="radio" name="question-11" id="answer-11-2">
          <label for="answer-11-2">3</label>
        </li>
        <li>
          <input type="radio" name="question-11" id="answer-11-3">
          <label for="answer-11-3">[1,2,3]</label>
        </li>
        <li>
          <input type="radio" name="question-11" id="answer-11-4">
          <label for="answer-11-4">Error</label>
        </li>
      </ul>
    </li>
    <li>
      <p style="margin-bottom: -20px">&nbsp;</p>

      <script src="https://gist.github.com/kangax/f0b8ae23e9cc858a0bde.js"> </script>

      <ul class="answers">
        <li>
          <input type="radio" name="question-12" id="answer-12-1">
          <label for="answer-12-1">[2, { x: 1 }, 2, 2, 2, { y }]</label>
        </li>
        <li>
          <input type="radio" name="question-12" id="answer-12-2">
          <label for="answer-12-2">[{ x: 1 }, 2, { y }]</label>
        </li>
        <li>
          <input type="radio" name="question-12" id="answer-12-3">
          <label for="answer-12-3">[1, undefined, 2, undefined, 2, undefined]</label>
        </li>
        <li>
          <input type="radio" name="question-12" id="answer-12-4">
          <label for="answer-12-4">Error</label>
        </li>
      </ul>
    </li>
    <li>
      <p style="margin-bottom: -20px">&nbsp;</p>

      <script src="https://gist.github.com/kangax/4fa8a01567cebc7dbd7b.js"> </script>

      <ul class="answers">
        <li>
          <input type="radio" name="question-13" id="answer-13-1">
          <label for="answer-13-1">"function"</label>
        </li>
        <li>
          <input type="radio" name="question-13" id="answer-13-2">
          <label for="answer-13-2">"undefined"</label>
        </li>
        <li>
          <input type="radio" name="question-13" id="answer-13-3">
          <label for="answer-13-3">"object"</label>
        </li>
        <li>
          <input type="radio" name="question-13" id="answer-13-4">
          <label for="answer-13-4">Error</label>
        </li>
      </ul>
    </li>
  </ol>

  <button type="button" id="submitter">Let's see the score!</button>

  <p id="quiz-result">
    Here be quiz result
  </p>

  <p>I hope you enjoyed it. I'll try to write up an explanation for these in the near future.</p>

  <a href="https://news.ycombinator.com/submit" class="hn-button" data-title="J" data-url="http://perfectionkills.com/javascript-quiz-es6/" data-count="horizontal">Vote on Hacker News</a>
  <script>var HN=[];HN.factory=function(e){return function(){HN.push([e].concat(Array.prototype.slice.call(arguments,0)))};},HN.on=HN.factory("on"),HN.once=HN.factory("once"),HN.off=HN.factory("off"),HN.emit=HN.factory("emit"),HN.load=function(){var e="hn-button.js";if(document.getElementById(e))return;var t=document.createElement("script");t.id=e,t.src="//hn-button.herokuapp.com/hn-button.js";var n=document.getElementsByTagName("script")[0];n.parentNode.insertBefore(t,n)},HN.load();</script>

  <a href="https://twitter.com/share" class="twitter-share-button" data-url="http://perfectionkills.com/javascript-quiz-es6/" data-via="kangax">Tweet</a>
  <script>!function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0],p=/^http:/.test(d.location)?'http':'https';if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src=p+'://platform.twitter.com/widgets.js';fjs.parentNode.insertBefore(js,fjs);}}(document, 'script', 'twitter-wjs');</script>

  <script src="//www.redditstatic.com/button/button1.js"></script>

  <script src="/js/quiz-es6.js"></script>

</div>]]></content><author><name></name></author><category term="js" /><summary type="html"><![CDATA[Javascript quiz. ES6 edition.]]></summary></entry><entry><title type="html">The poor, misunderstood innerText</title><link href="http://perfectionkills.com/the-poor-misunderstood-innerText/" rel="alternate" type="text/html" title="The poor, misunderstood innerText" /><published>2015-04-01T00:00:00+00:00</published><updated>2015-04-01T00:00:00+00:00</updated><id>http://perfectionkills.com/the-poor-misunderstood-innerText</id><content type="html" xml:base="http://perfectionkills.com/the-poor-misunderstood-innerText/"><![CDATA[<h1 id="the-poor-misunderstood-innertext">The poor, misunderstood innerText</h1>

<div class="innertext">

Few things are as misunderstood and misused on the web as the <code>innerText</code> property.

That quirky, non-standard way of element's <em>text retrieval</em>, [introduced by Internet Explorer](https://msdn.microsoft.com/en-us/library/ie/ms533899%28v=vs.85%29.aspx) and later "copied" by both WebKit/Blink and Opera for web-compatibility reasons. It's usually seen in combination with <code>textContent</code>, as a cross-browser way of using a standard property followed by a proprietary one:

<script src="https://gist.github.com/kangax/21b031672fcce0810e6f.js"> </script>

Or as the main webcompat offender in [numerous Mozilla tickets](https://bugzilla.mozilla.org/show_bug.cgi?id=264412#c24) — since Mozilla is one of the only major browsers refusing to add this non-standard property — when someone doesn't know what they're doing, skipping <code>textContent</code> "fallback" altogether:

<script src="https://gist.github.com/kangax/84462c2c36f7db8ad8a3.js"> </script>

<code>innerText</code> is pretty much always frowned upon. After all, why would you want to use a non-standard property that does the "same" thing as a standard one? Very few people venture to actually check the differences, and on the surface it certainly appears as if there are none. Those curious enough to investigate further usually <em>do</em> find them, but only slight ones, and only <b>when retrieving text, not setting it</b>.

Back in 2009, I did just that. And I even wrote [this StackOverflow answer](http://stackoverflow.com/a/1359822/130652) on the exact differences — slight whitespace deviations, things like inclusion of &lt;script&gt; contents by <code>textContent</code> (but not <code>innerText</code>), differences in interface (<code>Node</code> vs. <code>HTMLElement</code>), and so on.

All this time I was strongly convinced that there isn't much else to know about <code>textContent</code> vs. <code>innerText</code>. Just steer away from <code>innerText</code>, use this "combo" for cross-browser code, keep in mind slight differences, and you're golden.

Little did I know that I was merely looking at the tip of the iceberg and that my perception of <code>innerText</code> would change drastically. What you're about to hear is the story of Internet Explorer getting something right, the real differences between these properties, and how we probably want to standardize this red-headed stepchild.

<h3 id="the-real-difference">The real difference</h3>

A little while ago, I was helping someone with the implementation of a text editor in a browser. This is when I realized just how ridiculously important these seemingly insignificant whitespace deviations between <code>textContent</code> and <code>innerText</code> are.

Here's a simple example:

<p data-height="268" data-theme-id="0" data-slug-hash="gbEWvR" data-default-tab="result" data-user="kangax" class="codepen">See the Pen <a href="http://codepen.io/kangax/pen/gbEWvR/">gbEWvR</a> by Juriy Zaytsev (<a href="http://codepen.io/kangax">@kangax</a>) on <a href="http://codepen.io">CodePen</a>.</p>

Notice how <code>innerText</code> represents almost <b>exactly how text appears on the page</b>. <code>textContent</code>, on the other hand, does something strange: it ignores newlines created by &lt;br&gt; and around styled-as-block elements (&lt;span&gt; in this case), but it preserves spaces as they are defined in the markup. What does it actually do?

Looking at the [spec](http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/core.html#Node3-textContent), we get this:

<blockquote>
  This attribute returns the text content of this node and its descendants. [...]
  <br /><br />
  On getting, no serialization is performed, the returned string does not contain any markup. No <b>whitespace normalization is performed</b> and the returned string does not contain the white spaces in element content (see the attribute Text.isElementContentWhitespace). [...]
  <br /><br />
  The string returned is made of the text content of this node depending on its type, as defined below:
  <br /><br />
  For <b>ELEMENT_NODE</b>, ATTRIBUTE_NODE, ENTITY_NODE, ENTITY_REFERENCE_NODE, DOCUMENT_FRAGMENT_NODE:
  <br /><br />
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<b>concatenation of the textContent attribute value of every child node</b>, excluding COMMENT_NODE and PROCESSING_INSTRUCTION_NODE nodes. This is the empty string if the node has no children.
  <br /><br />
  For <b>TEXT_NODE</b>, CDATA_SECTION_NODE, COMMENT_NODE, PROCESSING_INSTRUCTION_NODE
  <br /><br />
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<b>nodeValue</b>
</blockquote>

In other words, <code>textContent</code> returns the concatenated text of all text nodes, which is almost like taking the markup (i.e. <code>innerHTML</code>) and stripping it of its tags. Notice that no whitespace normalization is performed; the text and whitespace are essentially spit out the <b>same way they're defined in the markup</b>. If you have a giant chunk of newlines in the HTML source, you'll have them as part of <code>textContent</code> as well.
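
To make the quoted rules concrete, here is a minimal sketch of that concatenation. It runs on plain node-like objects rather than a real DOM, and the `getTextContent` helper and the sample `tree` are hypothetical names made up for illustration. It also treats every other node type as a container, which simplifies the spec (documents, for instance, are not handled).

```javascript
// Node type constants, using the numeric values defined by the DOM.
const ELEMENT_NODE = 1;
const TEXT_NODE = 3;
const CDATA_SECTION_NODE = 4;
const PROCESSING_INSTRUCTION_NODE = 7;
const COMMENT_NODE = 8;

// Hypothetical helper: follows the textContent rules quoted above
// for plain objects shaped like { nodeType, nodeValue, childNodes }.
function getTextContent(node) {
  switch (node.nodeType) {
    case TEXT_NODE:
    case CDATA_SECTION_NODE:
    case COMMENT_NODE:
    case PROCESSING_INSTRUCTION_NODE:
      // Leaf nodes simply return their own value.
      return node.nodeValue;
    default: {
      // Containers concatenate their children, skipping comments
      // and processing instructions. No whitespace is normalized.
      let text = '';
      for (const child of node.childNodes || []) {
        if (child.nodeType === COMMENT_NODE ||
            child.nodeType === PROCESSING_INSTRUCTION_NODE) {
          continue;
        }
        text += getTextContent(child);
      }
      return text;
    }
  }
}

// Tiny builders for the sample tree (assumptions, not DOM APIs).
const el = (...childNodes) => ({ nodeType: ELEMENT_NODE, childNodes });
const text = (nodeValue) => ({ nodeType: TEXT_NODE, nodeValue });
const comment = (nodeValue) => ({ nodeType: COMMENT_NODE, nodeValue });

// Roughly: <div>\n  <p>Hello</p><!-- note -->\n  <span>world</span>\n</div>
const tree = el(
  text('\n  '),
  el(text('Hello')),
  comment(' note '),
  text('\n  '),
  el(text('world')),
  text('\n')
);

console.log(JSON.stringify(getTextContent(tree)));
// prints "\n  Hello\n  world\n"
```

The source indentation and newlines survive untouched, while the element boundaries leave no trace at all.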

While investigating these issues, I came across a [fantastic blog post by Mike Wilcox](http://clubajax.org/plain-text-vs-innertext-vs-textcontent/) from 2010, and pretty much the only place where someone tries to bring attention to this issue. In it, Mike takes a stab at the same things I'm describing here, saying these true-to-the-bone words:

<blockquote>
  Internet Explorer implemented innerText in version 4.0, and it’s a useful, if misunderstood feature. [...]
  <br /><br />
  The most common usage for these properties is while working on a rich text editor, when you need to “get the plain text” or for other functional reasons. [...]
  <br /><br />
  Because “no whitespace normalization is performed”, what textContent is essentially doing is acting like a PRE element. The markup is stripped, but otherwise what we get is exactly what was in the HTML document — including tabs, spaces, lack of spaces, and line breaks. It’s getting the source code from the HTML! What good this is, I really don’t know.
</blockquote>

Knowing these differences, we can see just how potentially misleading (and dangerous) a typical <code>textContent || innerText</code> retrieval is. It's pretty much like saying:

<script src="https://gist.github.com/kangax/1afbc0d166ac220e2cac.js"> </script>

<h3 id="case-for-innerText">The case for innerText</h3>

Coming back to a text editor...

Let's say we have a [contenteditable](http://html5demos.com/contenteditable) area in which a user is writing something. And we'd like to have our own spelling correction for the text in that area. In order to do that, we really want to analyze the text <b>the way it appears in the browser</b>, not in the markup. We'd like to know if there are newlines or spaces typed by the user, and not those that are in the markup, so that we can correct the text accordingly.

This is just one use-case of plain text retrieval. Perhaps you might want to <b>convert written text to another format</b> (PDF, SVG, image via canvas, etc.) in which case it has to look exactly as it was typed. Or maybe you need to <b>know the cursor position in the text</b> (or its entire length), so you need to operate on the text the way it's presented.

I'm sure there are more scenarios.

A good way to think about <code>innerText</code> is as if the text was selected and copied off the page. In fact, this is exactly what WebKit/Blink does — it [uses the same code](http://lists.w3.org/Archives/Public/public-html/2011Jul/0133.html) for <code>Selection#toString</code> serialization and <code>innerText</code>!

Speaking of that — if <code>innerText</code> is essentially the same thing as stringified selection, shouldn't it be possible to emulate it via <code>Selection#toString</code>?

It sure is, but as you can imagine, the performance of such a thing [leaves much to be desired](http://jsperf.com/innertext-vs-selection-tostring/4): we need to save the current selection, then change the selection to contain the entire element contents, get its string representation, then restore the original selection:

<script src="https://gist.github.com/kangax/05c89595c0d02b3d49bf.js"> </script>

The problems with this frankenstein of a workaround are performance, complexity, and clarity. It shouldn't be so hard to get "plain text" representation of an element. Especially when there's an already "implemented" property that does just that.

<img src="/images/innerText_emulation.png" />

Internet Explorer got this right: <code>textContent</code> and <code>Selection#toString</code> are poor contenders in cases like this; <code>innerText</code> is exactly what we need. Except that it's non-standard, and unsupported by one major browser. Thankfully, at least Chrome (Blink) and Safari (WebKit) were considerate enough to imitate it. One would hope there are no deviations among their implementations. Or are there?

<h3 id="diff-with-textContent">Differences with textContent</h3>

Once I realized the significance of <code>innerText</code>, I wanted to see the differences between the 2 engines. Since there was nothing like this out there, I set out to explore it. In true ["cross-browser madness" traditions](http://unixpapa.com/js/key.html), what I found was not for the faint of heart.

<img src="/images/innerText_tests.png" />

I started with the (now extinct) [test suite by Aryeh Gregor](https://web.archive.org/web/20110205234444/http://aryeh.name/spec/innertext/test/innerText.html) and [added a few more things](http://kangax.github.io/jstests/innerText/) to it. I also searched the WebKit/Blink bug trackers and included [whatever](https://code.google.com/p/chromium/issues/detail?id=96839) [relevant](https://bugs.webkit.org/show_bug.cgi?id=14805) [things](https://bugs.webkit.org/show_bug.cgi?id=17830) I found there.

The table above (and in the test suite) shows all the gory details, but a few things are worth mentioning. First, the good news: Internet Explorer &lt;=9 are identical in their behavior :) Now the bad: everything else diverges. Even IE changes with each new version; 9, 10, 11, and Tech Preview (the unreleased version of IE that's currently in the making) are all different. What's also interesting is how WebKit copied some of the old-IE traits, such as not including contents of &lt;script&gt; and &lt;style&gt; elements, and then when IE changed, they naturally drifted apart. Currently, some of the WebKit/Blink behavior is like old-IE and some isn't. But even compared to the original versions, WebKit did a poor job copying this feature, or rather, it seems like they've tried to make it <em>better</em>!

Unlike IE, WebKit/Blink insert tabs between table cells — that kind of makes sense! They also preserve upper/lower-cased text, which is arguably better. They don't include hidden elements ("display:none", "visibility:hidden"), which makes sense too. And they don't include contents of &lt;select&gt; elements and &lt;canvas&gt;/&lt;video&gt; fallback — perhaps a questionable aspect — but reasonable as well.

Ok, there's more good news.

Notice that IE Tech Preview (Spartan) is now much closer to WebKit/Blink. There are only 9 aspects they differ in (compared to 10-11 in earlier versions). That's still a lot but there's at least <em>some</em> hope for convergence. Most notably, IE <em>again</em> stopped including &lt;script&gt; and &lt;style&gt; contents, and, for the first time ever, stopped including "display:none" elements (but not "visibility:hidden" ones; more on that later).

<h3 id="opera-mess">Opera mess</h3>

You might have caught the lack of Opera in the table. It's not just because Opera is now using the Blink engine (essentially having WebKit behavior). It's also due to the fact that when it wasn't on Blink, it was <b>reaaaally naughty</b> when it comes to <code>innerText</code>. To sustain web compatibility, Opera simply went ahead and "aliased" <code>innerText</code> to <code>textContent</code>. That's right, in Opera, <code>innerText</code> would return nothing close to what we see in IE or WebKit. There's simply no point including it in the table; it would diverge in every single aspect, and we can just consider it as never implemented.

<h3 id="note-on-perf">Note on performance</h3>

Another difference lurks behind <code>textContent</code> and <code>innerText</code> — performance.

You can find dozens of [tests on jsperf.com comparing innerText and textContent](http://jsperf.com/search?q=innerText), and <code>innerText</code> is often dozens of times slower.

<a href="http://jsperf.com/innertext-vs-textcontent-and-various-markup">
  <img src="/images/innerText_vs_textContent.png" />
</a>

In [this blog post](http://www.kellegous.com/j/2013/02/27/innertext-vs-textcontent/), Kelly Norton talks about <code>innerText</code> being up to 300x slower (although that seems like a particularly rare case) and advises against using it entirely.

Knowing the underlying concepts of both properties, this shouldn't come as a surprise. After all, <code>innerText</code> requires knowledge of layout and [anything that touches layout is expensive](http://gent.ilcore.com/2011/03/how-not-to-trigger-layout-in-webkit.html).

So for all intents and purposes, <code>innerText</code> is significantly slower than <code>textContent</code>. And if all you need is to retrieve the text of an element without any kind of style awareness, you should, by all means, use <code>textContent</code> instead. However, this style awareness of <code>innerText</code> is <em>exactly</em> what we need when retrieving text "as presented"; and that comes with a price.

<h3 id="what-about-jquery">What about jQuery?</h3>

You're probably familiar with jQuery's <code>text()</code> method. But how exactly does it work and what does it use — <code>textContent || innerText</code> combo or something else? Turns out, jQuery [takes a safe route](https://github.com/jquery/jquery/blob/7602dc708dc6d9d0ae9982aadb9fa4615a9c49fa/external/sizzle/dist/sizzle.js#L942-L971) — it either returns <code>textContent</code> (if available), or manually does what <code>textContent</code> is supposed to do — iterates over all children and concatenates their <code>nodeValue</code>'s. Apparently, at one point jQuery **did** use <code>innerText</code>, but then [ran into good old whitespace differences](http://bugs.jquery.com/ticket/11153) and decided to ditch it altogether.

So if we wanted to use jQuery to get real text representation (à la <code>innerText</code>), we can't use jQuery's <code>text()</code> since it's basically a cross-browser <code>textContent</code>. We would need to roll our own solution.

<h3 id="standardization-attempts">Standardization attempts</h3>

Hopefully by now I've convinced you that <code>innerText</code> is pretty damn useful; we went over the underlying concept, browser differences, performance implications and saw how even the almighty jQuery is of no help.

You would think that by now this property is standardized or at least making its way into the standard.

Well, not so fast.

Back in 2010, Adam Barth (of Google) [proposes to spec innerText](http://lists.w3.org/Archives/Public/public-whatwg-archive/2010Aug/0455.html) on the WHATWG mailing list. Funny enough, all Adam wants is to set <em>pure text</em> (not markup!) of an element in a secure way. He also doesn't know about <code>textContent</code>, which would certainly be a preferred (standard) way of doing that. Fortunately, Mike Wilcox, whose blog post I mentioned earlier, chimes in with:

<blockquote>
In addition to Adam's comments, there is no standard, stable way of *getting* the text from a series of nodes. textContent returns everything, including tabs, white space, and even script content. [...] innerText is one of those things IE got right, just like innerHTML. Let's please consider making that a standard instead of removing it.
</blockquote>

In the same thread, Robert O'Callahan (of Mozilla) [doubts usefulness of innerText](http://lists.w3.org/Archives/Public/public-whatwg-archive/2010Aug/0477.html) but also adds:

<blockquote>
But if Mike Wilcox or others want to make the case that innerText is actually a useful and needed feature, we should hear it. Or if someone from Webkit or Opera wants to explain why they added it, that would be useful too.
</blockquote>

Ian Hickson is open to adding it to a spec if it's needed for web compatibility. While Rob O'Callahan considers this a redundant feature, Maciej Stachowiak (of WebKit/Apple) hits the nail on the head with [this fantastic reply](http://lists.w3.org/Archives/Public/public-whatwg-archive/2010Aug/0480.html):

<blockquote>
Is it a genuinely useful feature? Yes, the ability to get plaintext content as rendered is a useful feature and annoying to implement from scratch. To give one very marginal data point, it's used by our regression text framework to output the plaintext version of a page, for tests where layout is irrelevant. A more hypothetical use would be a rich text editor that has a "convert to plaintext" feature. textContent is not as useful for these use cases, since it doesn't handle line breaks and unrendered whitespace properly.
<br />[...]<br />
These factors would tend to weigh against removing it.
</blockquote>

To which Rob gives a reasonable reply:

<blockquote>
There are lots of ways people might want to do that. For example, "convert to plaintext" features often introduce characters for list bullets (e.g. '*') and item numbers. (E.g., Mac TextEdit does.) Safari 5 doesn't do either. [...] Satisfying more than a small number of potential users with a single attribute may be difficult.
</blockquote>

And the conversation dies out.

<h3 id="is-innerText-useful">Is innerText really useful?</h3>

As Rob points out, "convert to plaintext" could certainly be an ambiguous task. In fact, we can easily create a test markup that looks nothing like its "plain text" version:

<p data-height="268" data-theme-id="0" data-slug-hash="emXMKZ" data-default-tab="result" data-user="kangax" class="codepen">See the Pen <a href="http://codepen.io/kangax/pen/emXMKZ/">emXMKZ</a> by Juriy Zaytsev (<a href="http://codepen.io/kangax">@kangax</a>) on <a href="http://codepen.io">CodePen</a>.</p>

Notice that "opacity: 0" elements are not displayed, yet they are part of <code>innerText</code>. Ditto with the infamous "text-indent: -999px" hiding technique. The bullets from the list are not accounted for and neither is dynamically generated content (via the ::after pseudo-element). Paragraphs only create 1 newline, even though in reality they could have gigantic margins.

But I think that's OK.

If you think of <code>innerText</code> as text copied from the page, most of these "artifacts" make perfect sense. Just because a chunk of text is given "opacity: 0" doesn't mean that it shouldn't be part of output. It's a purely presentational concern, just like bullets, space between paragraphs or indented text. What matters is **structural preservation** — block-styled elements should create newlines, inline ones should be inline.
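
As a toy model of that structural rule (and nothing more), the sketch below walks plain objects instead of a real DOM. The node shape (`{ text }` for text, `{ display, children }` for elements) and the `renderPlainText` name are assumptions invented for this example; no browser implements `innerText` this way. Block elements start and end a line, inline ones stay in the current line, whitespace runs collapse, and only "display: none" removes content.

```javascript
// Hypothetical node shape (not a DOM API):
//   text node: { text: '...' }
//   element:   { display: 'block' | 'inline' | 'none', children: [...] }
function renderPlainText(root) {
  const lines = [''];
  const current = () => lines[lines.length - 1];

  (function walk(node) {
    if (typeof node.text === 'string') {
      // Collapse runs of whitespace, as normal CSS text flow would.
      lines[lines.length - 1] += node.text.replace(/\s+/g, ' ');
      return;
    }
    // "display: none" content is dropped entirely.
    if (node.display === 'none') return;

    const isBlock = node.display === 'block';
    // A block element begins on a fresh line...
    if (isBlock && current() !== '') lines.push('');
    for (const child of node.children) walk(child);
    // ...and whatever follows it begins on a fresh line too.
    if (isBlock && current() !== '') lines.push('');
  })(root);

  return lines
    .map((line) => line.trim())
    .filter((line) => line !== '')
    .join('\n');
}

const sample = {
  display: 'block',
  children: [
    { text: '\n  ' },
    { display: 'inline', children: [{ text: 'Hello ' }] },
    { display: 'inline', children: [{ text: 'world' }] },
    { display: 'block', children: [{ text: 'Second   line' }] },
    { display: 'none', children: [{ text: 'hidden' }] },
    // Purely presentational styling does not remove text from the output.
    { display: 'inline', opacity: 0, children: [{ text: 'still here' }] },
  ],
};

console.log(renderPlainText(sample));
// prints:
// Hello world
// Second line
// still here
```

The "opacity: 0" text stays in the output because the sketch, like the argument above, treats it as a presentational concern rather than a structural one.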

One iffy aspect is probably "text-transform". Should capitalized or uppercased text be preserved? WebKit/Blink think it should; Internet Explorer doesn't. Is it part of the text itself or merely styling?

Another one is "visibility: hidden". Similar to "opacity: 0" (and unlike "display: none"), the text is still part of the flow, it just can't be seen. Common sense would suggest that it <b>should still be part of the output</b>. And while Internet Explorer does just that, WebKit/Blink disagree (also being curiously inconsistent with their "opacity: 0" behavior).

Elements that aren't known to a browser pose an additional problem. For example, WebKit/Blink recently started supporting &lt;template&gt; element. That element is not displayed, and so it is not part of <code>innerText</code>. To Internet Explorer, however, it's nothing but an unknown inline element, and of course it outputs its contents.

<h3 id="standardization-2">Standardization, take 2</h3>

In 2011, another <code>innerText</code> proposal [is posted to the WHATWG mailing list](http://lists.w3.org/Archives/Public/public-html/2011Jul/0133.html), this time by Aryeh Gregor. Aryeh proposes to either:

<ol>
  <li>Drop <code>innerText</code> entirely</li>
  <li>Spec <code>innerText</code> to be like <code>textContent</code></li>
  <li>Actually spec <code>innerText</code> according to whatever IE/WebKit are doing</li>
</ol>

Similar to previous discussions, Mozilla opposes the 3rd option (standardizing it), whereas Microsoft and Opera oppose the 1st one (dropping it).

In the same thread, Aryeh expresses his concerns about standardizing <code>innerText</code>:

<blockquote>
The problem with (3) is that it would be very hard to spec; it would be even harder to spec in a way that all browsers can implement; and any spec would probably have to be quite incompatible anyway with the existing implementations that follow the general approach. [...]
</blockquote>

Indeed, as we've seen from the tests, compatibility proves to be a serious issue. If we were to standardize <code>innerText</code>, which of the 2 behaviors should we put in a spec?

Another problem is reliance on <code>Selection.toString()</code> (as expressed by Boris Zbarsky):

<blockquote>
It's not clear whether the latter is in fact an option; that depends on  how Selection.toString gets specified and whether UAs are willing to do the same for innerText as they do for Selection.toString....
<br /><br />
So far the only proposal I've seen for Selection.toString is "do what the copy operation does", which is neither well-defined nor acceptable for innerText.  In my opinion.
</blockquote>

In the end, we're left with [this WHATWG ticket by Aryeh](https://www.w3.org/Bugs/Public/show_bug.cgi?id=13145) on specifying <code>innerText</code>. Things look rather grim, as evidenced in one of the comments:

<blockquote>
I've been told in no uncertain terms that it's <b>not practical for non-Gecko browsers to remove</b>. Depending on the rendering tree to the extent WebKit does, on the other hand, is insanely complicated to spec in terms of standard stuff like DOM and CSS. Also, it potentially breaks for detached nodes (WebKit behaves totally differently in that case). [...] But <b>Gecko people seemed pretty unhappy about this kind of complexity and rendering dependence in a DOM property</b>.  And on the other hand, I got the impression <b>WebKit is reluctant to rewrite their innerText implementation</b> at all.  So I'm figuring that the spec that will be implemented by the most browsers possible is one that's as simple as possible, basically just a compat shim.  If multiple implementers actually want to implement something like the innerText spec I started writing, I'd be happy to resume work on it, but that wasn't my impression.
</blockquote>

We can't remove it, can't change it, can't spec it to depend on rendering, and spec'ing it would be quite difficult :)

<h3 id="tunnel">Light at the end of a tunnel?</h3>

Could there still be some hope for <code>innerText</code> or will it forever stay an unspecified legacy with 2 different implementations?

My hope is that the test suite and compatibility table are the first step in making things better. We need to know exactly how engines differ, and we need to have a good understanding of what to include in a spec. I'm sure this doesn't cover all cases, but it's a start (other aspects worth exploring: shadow DOM, detached nodes).

I think this test suite should be enough to write a 90%-complete spec of <code>innerText</code>. The biggest issue is <b>deciding which behavior to choose</b> between IE and WebKit/Blink.

The plan could be:

1. Write a spec
2. Try to converge IE and WebKit/Blink behavior
3. Implement spec'd behavior in Firefox

Seeing [how amazing Microsoft has been](https://status.modern.ie/) recently, I really hope we can make this happen.

<h3 id="naive-spec">The naive spec</h3>

I took a stab at a relatively simple version of <code>innerText</code>:

<script src="https://gist.github.com/kangax/94ea9cade0cebfb16c02.js"> </script>

A couple of important tasks here:

1. Checking if a text node is within a "formatted" context (i.e. inside a "white-space: pre-*" node), in which case its contents should be concatenated as is; otherwise each run of whitespace is collapsed into a single space.

2. Checking if a node is block-styled ("block", "list-item", "table", etc.), in which case it has to be surrounded by newlines; otherwise, it's inline and its contents are output as is.

Then there are things like ignoring &lt;script&gt;, &lt;style&gt;, etc. nodes and inserting a tab ("\t") between &lt;td&gt; elements (to follow WebKit/Blink's lead).

This is still a <b>very minimal and naive implementation</b>. For one, it doesn't collapse newlines between block elements, which is quite an important aspect. In order to do that, we need to <b>keep track of more state</b>, i.e. to know the previous node's style. It also doesn't normalize whitespace in a "true" manner: a text node with leading and trailing spaces, for example, should have those spaces stripped if it is (the only node?) in a block element.

This needs more work, but it's a decent start.
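
To make the two checks above concrete, here is a rough sketch over plain objects rather than real DOM nodes. The node shape (`text`, `display`, `whiteSpace`, `children`) and the `naiveInnerText` name are assumptions made up for illustration, standing in for DOM nodes and their computed style. It is not the gist above, and like the gist it doesn't collapse adjacent newlines.

```javascript
// Hypothetical mock nodes: { text } for text nodes,
// { display, whiteSpace, children } for elements.
const BLOCK_DISPLAYS = ['block', 'list-item', 'table'];

function naiveInnerText(node, inheritedWhiteSpace = 'normal') {
  if (typeof node.text === 'string') {
    // Task 1: keep text as is in a "white-space: pre-*" context,
    // otherwise collapse each run of whitespace into a single space.
    return inheritedWhiteSpace.startsWith('pre')
      ? node.text
      : node.text.replace(/\s+/g, ' ');
  }
  // white-space is inherited unless the element overrides it.
  const whiteSpace = node.whiteSpace || inheritedWhiteSpace;
  const inner = node.children
    .map(child => naiveInnerText(child, whiteSpace))
    .join('');
  // Task 2: block-styled nodes are surrounded by newlines, inline ones are not.
  return BLOCK_DISPLAYS.includes(node.display) ? '\n' + inner + '\n' : inner;
}

const tree = {
  display: 'block',
  children: [
    { text: 'foo   \n bar' },
    { display: 'inline', children: [{ text: ' baz' }] },
    { display: 'block', whiteSpace: 'pre', children: [{ text: 'a  b' }] }
  ]
};

const result = naiveInnerText(tree);
console.log(JSON.stringify(result));
```

The doubled newline at the end of the result is exactly the kind of artifact that tracking the previous node's style would remove.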

It would also be a good idea to write an <code>innerText</code> implementation in Javascript, with unit tests for each of the "features" in the compat table. Perhaps even supporting 2 modes: IE and WebKit/Blink. An implementation like this could then be simply integrated into non-supporting engines (or used as a proper polyfill).

I'd love to hear your thoughts, ideas, experiences, criticism. I hope (with all of your help) we can make some improvement in this direction. And even if nothing changes, at least some light was shed on this very misunderstood ancient feature.

<h3 id="update">Update: half a year later</h3>

It's been half a year since I wrote this post and a few things have changed for the better!

First of all, [Robert O'Callahan](http://robert.ocallahan.org/) of Mozilla made an awesome effort: he decided to [spec out innerText](https://github.com/rocallahan/innerText-spec) and then implemented it in Firefox. The idea was to create something simple but sensible. The proposed spec is now, after only about 11 years, [implemented in Firefox 45](https://bugzilla.mozilla.org/show_bug.cgi?id=264412) :)

I've added FF45 results to [the compat table](http://kangax.github.io/jstests/innerText/) and aside from a couple of differences, FF is pretty close to Chrome's implementation. I'm also planning to add more tests to find any other differences among Chrome, FF, and Edge.

<img src="/images/innerText_updated.png" />

The spec already revealed a few bugs in Chrome, which I'm hoping to file tickets for and see resolved. If we can then also get Edge to converge, we'll be very close to having all 3 biggest browsers behave similarly, making `innerText` a viable feature in the near future.

<script async="" src="//assets.codepen.io/assets/embed/ei.js"></script>
</div>]]></content><author><name></name></author><category term="js" /><summary type="html"><![CDATA[The poor, misunderstood innerText]]></summary></entry><entry><title type="html">Know thy reference</title><link href="http://perfectionkills.com/know-thy-reference/" rel="alternate" type="text/html" title="Know thy reference" /><published>2014-12-11T00:00:00+00:00</published><updated>2014-12-11T00:00:00+00:00</updated><id>http://perfectionkills.com/know-thy-reference</id><content type="html" xml:base="http://perfectionkills.com/know-thy-reference/"><![CDATA[<h1 id="know-thy-reference">Know thy reference</h1>
<h3 id="abusing-leaky-abstractions-for-a-better-understanding-of-this">Abusing leaky abstractions for a better understanding of “this”</h3>

<p>It was a sunny Monday morning that I woke up to <a href="https://news.ycombinator.com/item?id=8713270">an article on HackerNews</a>, simply named <a href="http://bjorn.tipling.com/all-this">“This in Javascript”</a>. Curious to see what all the attention was about, I started skimming through. As expected, there were mentions of <code class="language-plaintext highlighter-rouge">this</code> in global scope, <code class="language-plaintext highlighter-rouge">this</code> in function calls, <code class="language-plaintext highlighter-rouge">this</code> in constructor instantiation, and so on. It was a long article. And the more I looked through, the more I realized just how <strong>overwhelming</strong> this topic might seem to folks unfamiliar with the intricacies of <code class="language-plaintext highlighter-rouge">this</code>, especially when thrown into a myriad of examples with seemingly random behavior.</p>

<p>It made me remember a moment from a few years ago when I first read <a href="http://www.amazon.com/JavaScript-Good-Parts-Douglas-Crockford/dp/0596517742">Crockford’s Good Parts</a>. In it, Douglas succinctly laid out a piece of information that immediately made everything much clearer in my head:</p>

<blockquote>
  The <code class="language-plaintext highlighter-rouge">this</code> parameter is very important in object oriented programming, and its value is <b>determined by the invocation pattern</b>. There are <b>four patterns of invocation</b> in JavaScript: the <b>method invocation</b> pattern, the <b>function invocation</b> pattern, the <b>constructor invocation</b> pattern, and the <b>apply invocation</b> pattern. The patterns differ in how the bonus parameter <code class="language-plaintext highlighter-rouge">this</code> is initialized.
</blockquote>

<p>Determined by invocation and only 4 cases? Well, that’s certainly pretty simple.</p>

<p>With this thought in mind, I went back to HackerNews, wondering if anyone else thought the subject was presented as something way too complicated. I wasn’t the only one. Lots of folks chimed in with explanations similar to the one from Good Parts, like <a href="https://news.ycombinator.com/item?id=8715373">this one</a>:</p>

<blockquote>
  Even more simply, I'd just say:<br />
  1) The keyword "this" refers to whatever is left of the dot at call-time.<br />
  2) If there's nothing to the left of the dot, then "this" is the root scope (e.g. Window).<br />
  3) A few functions change the behavior of "this"—bind, call and apply<br />
  4) The keyword "new" binds this to the object just created
</blockquote>
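
<p>Those four rules are easy to check in a few lines. This is only a quick sketch: <code class="language-plaintext highlighter-rouge">whoAmI</code>, <code class="language-plaintext highlighter-rouge">other</code> and <code class="language-plaintext highlighter-rouge">Thing</code> are made-up names, and the function opts into strict mode, so rule 2 yields <code class="language-plaintext highlighter-rouge">undefined</code> rather than the root scope mentioned in the comment.</p>

```javascript
// Strict-mode function, so a plain call leaves `this` as undefined
// (in non-strict code it would be the global object instead).
function whoAmI() {
  'use strict';
  return this;
}

const foo = { name: 'foo', whoAmI: whoAmI };
const other = { name: 'other' };

// 1) Whatever is left of the dot at call time.
const viaMethod = foo.whoAmI();

// 2) Nothing to the left of the dot.
const viaPlainCall = whoAmI();

// 3) call, apply and bind set `this` explicitly.
const viaCall = whoAmI.call(other);
const viaApply = whoAmI.apply(other);
const viaBind = whoAmI.bind(other)();

// 4) `new` binds `this` to the object just created.
function Thing() {
  this.created = true;
}
const viaNew = new Thing();

console.log(viaMethod === foo, viaPlainCall, viaCall === other, viaNew.created);
```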

<p>Great and simple breakdown. But one point caught my attention — <i>“whatever is left of the dot at call-time”</i>. Seems pretty self-explanatory. For <code class="language-plaintext highlighter-rouge">foo.bar()</code>, <code class="language-plaintext highlighter-rouge">this</code> would refer to <code class="language-plaintext highlighter-rouge">foo</code>; for <code class="language-plaintext highlighter-rouge">foo.bar.baz()</code>, <code class="language-plaintext highlighter-rouge">this</code> would refer to <code class="language-plaintext highlighter-rouge">foo.bar</code>, and so on. But what about something like <code class="language-plaintext highlighter-rouge">(f = foo.bar)()</code>? After all, it <i>seems</i> that “whatever is left of the dot at call time” is <code class="language-plaintext highlighter-rouge">foo.bar</code>. Would that make <code class="language-plaintext highlighter-rouge">this</code> refer to <code class="language-plaintext highlighter-rouge">foo</code>?</p>

<p>Eager to save the world from unusual results in obscure cases, I rushed to leave a prompt comment on how the concept of “left of the dot” could be hairy, and that for best results, one should understand the concept of references and their base values.</p>

<p>It is then that I shockingly realized that this concept of references actually hasn’t been covered all that much! In fact, searching for “javascript reference” yielded anything from cheatsheets to “pass-by-reference vs. pass-by-value” discussions, and not at all what I wanted. It had to be fixed.</p>

<p>And so this brings me here.</p>

<p>I’ll try to explain what these mysterious References are in Javascript (by which, of course, I mean ECMAScript) and how fun it is to learn <code class="language-plaintext highlighter-rouge">this</code> behavior through them. Once you understand References, you’ll also notice that reading ECMAScript spec is much easier.</p>

<p>But before we continue, a quick disclaimer on the excerpt from Good Parts.</p>

<h3 id="good-parts-20">Good Parts 2.0</h3>

<p>The book was written in the times when <a href="http://www.ecma-international.org/publications/files/ECMA-ST-ARCH/ECMA-262,%203rd%20edition,%20December%201999.pdf">ES3</a> roamed the prairies, and now we’re in a full state of <a href="https://es5.github.io">ES5</a>.</p>

<p>What changed? Not much.</p>

<p>There are 2 additions, or rather sub-points to the list of 4:</p>

<ol>
  <li>method invocation</li>
  <li>function invocation
    <ul>
      <li><span style="color: green">“use strict” mode (<i>new in ES5</i>)</span></li>
    </ul>
  </li>
  <li>constructor invocation</li>
  <li>apply invocation
    <ul>
      <li><span style="color: green">Function.prototype.bind (<i>new in ES5</i>)</span></li>
    </ul>
  </li>
</ol>

<p>Function invocation that happens in strict mode now has its <code class="language-plaintext highlighter-rouge">this</code> value set to <code class="language-plaintext highlighter-rouge">undefined</code>. Actually, it would be more correct to say that it does NOT have its <code class="language-plaintext highlighter-rouge">this</code> “coerced” to the global object. That’s what was happening in ES3 and what happens in ES5-non-strict. Strict mode simply <a href="https://es5.github.io/#x10.4.3">avoids that extra step</a>, letting <code class="language-plaintext highlighter-rouge">undefined</code> propagate through.</p>
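
<p>A small sketch of the difference. The names are made up, and the non-strict function is built with the <code class="language-plaintext highlighter-rouge">Function</code> constructor only so that it stays non-strict no matter where this snippet is pasted; <code class="language-plaintext highlighter-rouge">globalThis</code> is a later addition to the language, used here just to name the global object.</p>

```javascript
// Function constructor bodies are non-strict unless they opt in themselves.
const sloppy = new Function('return this;');

function strict() {
  'use strict';
  return this;
}

const sloppyThis = sloppy(); // undefined gets coerced to the global object
const strictThis = strict(); // undefined propagates through untouched

console.log(sloppyThis === globalThis, strictThis);
```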

<p>And then there’s good old <code class="language-plaintext highlighter-rouge">Function.prototype.bind</code> which is hard to even call an addition. It’s essentially call/apply wrapped in a function, permanently binding <code class="language-plaintext highlighter-rouge">this</code> value to whatever was passed to <code class="language-plaintext highlighter-rouge">bind()</code>. It’s in the same bracket as <code class="language-plaintext highlighter-rouge">call</code> and <code class="language-plaintext highlighter-rouge">apply</code>, except for its “static” nature.</p>
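
<p>To illustrate the “call/apply wrapped in a function” idea, here is a deliberately simplified stand-in. <code class="language-plaintext highlighter-rouge">simpleBind</code> is a hypothetical helper, not the spec algorithm: it ignores partial application, <code class="language-plaintext highlighter-rouge">new</code> support and the <code class="language-plaintext highlighter-rouge">length</code> property.</p>

```javascript
// Hypothetical helper: returns a wrapper that always applies fn to thisValue.
function simpleBind(fn, thisValue) {
  return function () {
    return fn.apply(thisValue, arguments);
  };
}

function getName() {
  return this.name;
}

const foo = { name: 'foo' };
const baz = { name: 'baz' };

const boundBySketch = simpleBind(getName, foo);
const boundNatively = getName.bind(foo);

// The "static" nature: a later call() cannot re-target a bound function.
const fromSketch = boundBySketch.call(baz);
const fromNative = boundNatively.call(baz);

console.log(boundBySketch(), fromSketch, fromNative);
```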

<p>Alright, on to the References.</p>

<h3 id="reference-specification-type">Reference Specification Type</h3>

<p>To be honest, I wasn’t <i>that</i> surprised to find very little information on References in Javascript. After all, it’s not part of the language per se. References are only a <b>mechanism</b>, <a href="https://es5.github.io/#x8.7">used to describe certain behaviors in ECMAScript</a>. They’re not really “visible” to the outside world. They are vital for engine implementors, and users of the language don’t need to know about them.</p>

<p>Except when understanding them brings a whole new level of clarity.</p>

<p>Coming back to my original “obscure” example:</p>

<script src="https://gist.github.com/kangax/9a19b45da97a522701ab.js"> </script>

<p>How do we know that the 1st one’s <code class="language-plaintext highlighter-rouge">this</code> references <code class="language-plaintext highlighter-rouge">foo</code>, but the 2nd one’s references the global object (or <code class="language-plaintext highlighter-rouge">undefined</code>)?</p>

<p>Astute readers will rightfully notice — <i>“well, the expression to the left of <code class="language-plaintext highlighter-rouge">()</code> evaluates to <code class="language-plaintext highlighter-rouge">f</code>, right after assignment; and so it’s the same as calling <code class="language-plaintext highlighter-rouge">f()</code>, making this function invocation rather than method invocation.”</i></p>

<p>Alright, and what about this:</p>

<script src="https://gist.github.com/kangax/3667b73fce9a793b7ec5.js"> </script>

<p><i>“Oh, that’s just grouping operator! It evaluates from left to right so it must be the same as foo.bar(), making <code class="language-plaintext highlighter-rouge">this</code> reference <code class="language-plaintext highlighter-rouge">foo</code>”</i></p>

<script src="https://gist.github.com/kangax/1499fcaa72dcc8f18c09.js"> </script>

<p><i>“Strange”</i></p>

<p>And how about this:</p>

<script src="https://gist.github.com/kangax/89efa9d5b02215b24f8a.js"> </script>

<p><i>“Well… considering the last example, it must be <code class="language-plaintext highlighter-rouge">undefined</code> as well then? There must be something about those parentheses”</i></p>

<script src="https://gist.github.com/kangax/300d61151c5e94230834.js"> </script>

<p><i>“Ok, I’m confused”</i></p>

<h3 id="theory">Theory</h3>

<p>ECMAScript defines Reference as a “resolved name binding”. It’s an abstract entity that consists of three components — base, name, and strict flag. The first 2 are what’s important for us at the moment.</p>

<p>There are 2 cases when Reference is created: in the process of <b>Identifier resolution</b> and during <b>property access</b>. In other words, <code class="language-plaintext highlighter-rouge">foo</code> creates a Reference and <code class="language-plaintext highlighter-rouge">foo.bar</code> (or <code class="language-plaintext highlighter-rouge">foo['bar']</code>) creates a Reference. Neither literals — <code class="language-plaintext highlighter-rouge">1</code>, <code class="language-plaintext highlighter-rouge">"foo"</code>, <code class="language-plaintext highlighter-rouge">/x/</code>, <code class="language-plaintext highlighter-rouge">{ }</code>, <code class="language-plaintext highlighter-rouge">[ 1,2,3 ]</code>, etc., nor function expressions — <code class="language-plaintext highlighter-rouge">(function(){})</code> — create references.</p>

<p>Here’s a simple cheat sheet:</p>

<h4 id="cheat-sheet">Cheat sheet</h4>

<table style="font-family: Courier New, Courier, monospace; text-align: left; border-spacing: 0; border-collapse: collapse">
  <thead>
    <tr>
      <th style="width: 250px; font-weight: normal; background: #888; color: #fff; padding: 5px">
        Example
      </th>
      <th style="width: 200px; font-weight: normal; background: #888; color: #fff; padding: 5px">
        Reference?
      </th>
      <th style="width: 400px; font-weight: normal; background: #888; color: #fff; padding: 5px">
        Notes
      </th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td style="border: 1px solid #ccc; padding: 5px">"foo"</td>
      <td style="border: 1px solid #ccc; padding: 5px; background: #fcc">No</td>
      <td style="border: 1px solid #ccc; padding: 5px"></td>
    </tr>
    <tr>
      <td style="border: 1px solid #ccc; padding: 5px">123</td>
      <td style="border: 1px solid #ccc; padding: 5px; background: #fcc">No</td>
      <td style="border: 1px solid #ccc; padding: 5px"></td>
    </tr>
    <tr>
      <td style="border: 1px solid #ccc; padding: 5px">/x/</td>
      <td style="border: 1px solid #ccc; padding: 5px; background: #fcc">No</td>
      <td style="border: 1px solid #ccc; padding: 5px"></td>
    </tr>
    <tr>
      <td style="border: 1px solid #ccc; padding: 5px">({})</td>
      <td style="border: 1px solid #ccc; padding: 5px; background: #fcc">No</td>
      <td style="border: 1px solid #ccc; padding: 5px"></td>
    </tr>
    <tr>
      <td style="border: 1px solid #ccc; padding: 5px">(function(){})</td>
      <td style="border: 1px solid #ccc; padding: 5px; background: #fcc">No</td>
      <td style="border: 1px solid #ccc; padding: 5px"></td>
    </tr>
    <tr>
      <td style="border: 1px solid #ccc; padding: 5px">foo</td>
      <td style="border: 1px solid #ccc; padding: 5px; background: #cfc">Yes</td>
      <td style="border: 1px solid #ccc; padding: 5px">Could be an unresolvable reference if foo is not defined</td>
    </tr>
    <tr>
      <td style="border: 1px solid #ccc; padding: 5px">foo.bar</td>
      <td style="border: 1px solid #ccc; padding: 5px; background: #cfc">Yes</td>
      <td style="border: 1px solid #ccc; padding: 5px">Property reference</td>
    </tr>
    <tr>
      <td style="border: 1px solid #ccc; padding: 5px">(123).toString</td>
      <td style="border: 1px solid #ccc; padding: 5px; background: #cfc">Yes</td>
      <td style="border: 1px solid #ccc; padding: 5px">Property reference</td>
    </tr>
    <tr>
      <td style="border: 1px solid #ccc; padding: 5px">(function(){}).toString</td>
      <td style="border: 1px solid #ccc; padding: 5px; background: #cfc">Yes</td>
      <td style="border: 1px solid #ccc; padding: 5px">Property reference</td>
    </tr>
    <tr>
      <td style="border: 1px solid #ccc; padding: 5px">(1,foo.bar)</td>
      <td style="border: 1px solid #ccc; padding: 5px; background: #fcc">No</td>
      <td style="border: 1px solid #ccc; padding: 5px">Already evaluated, BUT see grouping operator exception</td>
    </tr>
    <tr>
      <td style="border: 1px solid #ccc; padding: 5px">(f = foo.bar)</td>
      <td style="border: 1px solid #ccc; padding: 5px; background: #fcc">No</td>
      <td style="border: 1px solid #ccc; padding: 5px">Already evaluated, BUT see grouping operator exception</td>
    </tr>
    <tr>
      <td style="border: 1px solid #ccc; padding: 5px">(foo)</td>
      <td style="border: 1px solid #ccc; padding: 5px; background: #cfc">Yes</td>
      <td style="border: 1px solid #ccc; padding: 5px">Grouping operator does not evaluate reference</td>
    </tr>
    <tr>
      <td style="border: 1px solid #ccc; padding: 5px">(foo.bar)</td>
      <td style="border: 1px solid #ccc; padding: 5px; background: #cfc">Yes</td>
      <td style="border: 1px solid #ccc; padding: 5px">Ditto with property reference</td>
    </tr>

  </tbody>
</table>

<p>Don’t worry about the last 4 for now; we’ll take a look at those shortly.</p>

<p>Every time a Reference is created, its components — “base”, “name”, “strict” — are set to some values. The strict flag is easy — it’s there to denote if code is in strict mode or not. The “name” component is set to identifier or property name that’s being resolved, and the base is set to either property object or environment record.</p>

<p>It might help to think of References as <b>plain JS objects with a null [[Prototype]]</b> (i.e. with no “prototype chain”), containing only “base”, “name”, and “strict” properties; this is how we’ll illustrate them below.</p>

<p>When Identifier <code class="language-plaintext highlighter-rouge">foo</code> is resolved, a Reference is created like so:</p>

<script src="https://gist.github.com/kangax/f910ea9f7c0fc83ff1ec.js"> </script>

<p>and this is what’s created for property accessor <code class="language-plaintext highlighter-rouge">foo.bar</code>:</p>

<script src="https://gist.github.com/kangax/21accb720a8786346382.js"> </script>

<p>This is a so-called “Property Reference”.</p>

<p>There’s also a 3rd scenario — Unresolvable Reference. When an Identifier can’t be found anywhere in the scope chain, a Reference is returned with base value set to <code class="language-plaintext highlighter-rouge">undefined</code>:</p>

<script src="https://gist.github.com/kangax/4500d751162c23f6b682.js"> </script>

<p>As you probably know, Unresolvable References could blow up if not “properly used”, resulting in an infamous ReferenceError (“foo is not defined”).</p>

<p>Essentially, References are a simple mechanism of representing name bindings; it’s a way to abstract both object-property resolution and variable resolution into a unified data structure — base + name — whether that base is a regular JS object (as in property access) or an Environment Record (a link in a “scope chain”, as in identifier resolution).</p>
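
<p>The three kinds of Reference described above can be modeled in ordinary Javascript. This is purely an illustration: real References are internal to the spec, <code class="language-plaintext highlighter-rouge">makeReference</code> and <code class="language-plaintext highlighter-rouge">getValue</code> are made-up helpers, and a plain object stands in for an Environment Record.</p>

```javascript
// Illustrative model only: a Reference as a prototype-less object.
function makeReference(base, name, strict) {
  const ref = Object.create(null);
  ref.base = base;
  ref.name = name;
  ref.strict = strict;
  return ref;
}

// Simplified take on GetValue: unresolvable References throw,
// everything else reads the name off the base.
function getValue(ref) {
  if (ref.base === undefined) {
    throw new ReferenceError(ref.name + ' is not defined');
  }
  return ref.base[ref.name];
}

const foo = { bar: 42 };
const env = { foo: foo }; // stand-in for an Environment Record

const identifierRef = makeReference(env, 'foo', false);          // foo
const propertyRef = makeReference(foo, 'bar', false);            // foo.bar
const unresolvableRef = makeReference(undefined, 'nope', false); // nope

console.log(getValue(identifierRef) === foo, getValue(propertyRef));
```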

<p>So what’s the use of all this? Now that we know what ECMAScript does under the hood, how does this apply to <code class="language-plaintext highlighter-rouge">this</code> behavior, <code class="language-plaintext highlighter-rouge">foo()</code> vs. <code class="language-plaintext highlighter-rouge">foo.bar()</code> vs. <code class="language-plaintext highlighter-rouge">(f = foo.bar)()</code> and all that?</p>

<h3 id="function-call">Function call</h3>

<p>What do <code class="language-plaintext highlighter-rouge">foo()</code>, <code class="language-plaintext highlighter-rouge">foo.bar()</code>, and <code class="language-plaintext highlighter-rouge">(f = foo.bar)()</code> all have in common? They’re function calls.</p>

<p>If we take a look at <a href="https://es5.github.io/#x11.2.3">what happens when Function Call takes place</a>, we’ll see something very interesting:</p>

<script src="https://gist.github.com/kangax/07c8ed3d6309dba23648.js"> </script>

<p>Notice highlighted step 6, which basically explains both #1 and #2 from Crockford’s list of 4.</p>

<p>We take expression before <code class="language-plaintext highlighter-rouge">()</code>. Is it a property reference? (<code class="language-plaintext highlighter-rouge">foo.bar()</code>) Then use its base value as <code class="language-plaintext highlighter-rouge">this</code>. And what’s a base value of <code class="language-plaintext highlighter-rouge">foo.bar</code>? We already know that it’s <code class="language-plaintext highlighter-rouge">foo</code>. Hence <code class="language-plaintext highlighter-rouge">foo.bar()</code> is called with <code class="language-plaintext highlighter-rouge">this=foo</code>.</p>

<p>Is it NOT a property reference? Ok, then it must be a regular reference with an Environment Record as its base, as in <code class="language-plaintext highlighter-rouge">foo()</code>. In that case, use ImplicitThisValue as <code class="language-plaintext highlighter-rouge">this</code> (and ImplicitThisValue of a declarative Environment Record is <a href="https://es5.github.io/#x10.2.1.1.6">always set to <code class="language-plaintext highlighter-rouge">undefined</code></a>; an object Environment Record returns its binding object only inside a <code class="language-plaintext highlighter-rouge">with</code> statement). Hence <code class="language-plaintext highlighter-rouge">foo()</code> is called with <code class="language-plaintext highlighter-rouge">this=undefined</code>.</p>

<p>Finally, if it’s NOT a reference at all — <code class="language-plaintext highlighter-rouge">(function(){})()</code> — use <code class="language-plaintext highlighter-rouge">undefined</code> as <code class="language-plaintext highlighter-rouge">this</code> value again.</p>
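
<p>The three branches above can be written down as a tiny model. Everything here is a hypothetical stand-in rather than engine code: the <code class="language-plaintext highlighter-rouge">Reference</code> and <code class="language-plaintext highlighter-rouge">EnvironmentRecord</code> classes and the <code class="language-plaintext highlighter-rouge">thisValueFor</code> function are invented for the example, and a property reference is simplified to “a Reference whose base is not an Environment Record”.</p>

```javascript
// Hypothetical stand-ins for the spec's internal types.
class Reference {
  constructor(base, name) {
    this.base = base;
    this.name = name;
  }
}

class EnvironmentRecord {
  // Declarative records always answer undefined.
  implicitThisValue() {
    return undefined;
  }
}

// Decide `this` from whatever sits before the ().
function thisValueFor(callee) {
  if (!(callee instanceof Reference)) {
    return undefined; // not a Reference at all
  }
  if (callee.base instanceof EnvironmentRecord) {
    return callee.base.implicitThisValue(); // identifier reference
  }
  return callee.base; // property reference
}

const foo = { bar: function () {} };
const scope = new EnvironmentRecord();

const methodThis = thisValueFor(new Reference(foo, 'bar'));     // foo.bar()
const functionThis = thisValueFor(new Reference(scope, 'foo')); // foo()
const valueThis = thisValueFor(function () {});                 // (function(){})()

console.log(methodThis === foo, functionThis, valueThis);
```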

<p>Are you feeling like this right now?</p>

<p><img src="/images/matrix.jpg" /></p>

<h3 id="assignment-comma-and-grouping-operators">Assignment, comma, and grouping operators</h3>

<p>Armed with this knowledge, let’s see if we can explain <code class="language-plaintext highlighter-rouge">this</code> behavior of <code class="language-plaintext highlighter-rouge">(f = foo.bar)()</code>, <code class="language-plaintext highlighter-rouge">(1,foo.bar)()</code>, and <code class="language-plaintext highlighter-rouge">(foo.bar)()</code> in terms more robust than “whatever is left of the dot”.</p>

<p>Let’s start with the first one. The expression in question is known as Simple Assignment (=). <code class="language-plaintext highlighter-rouge">foo = 1</code>, <code class="language-plaintext highlighter-rouge">g = function(){}</code>, and so on. If we look at the steps taken to evaluate <a href="https://es5.github.io/#x11.13.1">Simple Assignment</a>, we’ll see one important detail:</p>

<script src="https://gist.github.com/kangax/c59daa515bee3c9e98f0.js"> </script>

<p>Notice that the expression on the right is passed through the internal <code class="language-plaintext highlighter-rouge">GetValue()</code> before assignment. <code class="language-plaintext highlighter-rouge">GetValue()</code>, in turn, <b>transforms the <code class="language-plaintext highlighter-rouge">foo.bar</code> Reference into an actual function object</b>. And of course we then proceed to the usual Function Call with something that is NOT a reference, which results in <code class="language-plaintext highlighter-rouge">this=undefined</code>. As you can see, <code class="language-plaintext highlighter-rouge">(f = foo.bar)()</code> only looks similar to <code class="language-plaintext highlighter-rouge">foo.bar()</code> but is actually “closer” to <code class="language-plaintext highlighter-rouge">(function(){})()</code> in the sense that it’s an (evaluated) expression rather than an (untouched) Reference.</p>

<p>The same story happens with comma operator:</p>

<script src="https://gist.github.com/kangax/19508cf03171137a6234.js"> </script>

<p><code class="language-plaintext highlighter-rouge">(1,foo.bar)()</code> is evaluated as a function object and Function Call with NOT a reference results in <code class="language-plaintext highlighter-rouge">this=undefined</code>.</p>

<p>Finally, what about grouping operator? Does it also evaluate its expression?</p>

<script src="https://gist.github.com/kangax/f4e49e362db9e2a7f7eb.js"> </script>

<p>And here we’re in for a surprise!</p>

<p>Even though it’s so similar to <code class="language-plaintext highlighter-rouge">(1,foo.bar)()</code> and <code class="language-plaintext highlighter-rouge">(f = foo.bar)()</code>, the grouping operator does NOT evaluate its expression. The spec even says so, plain and simple: it may return a reference; no evaluation happens. This is why <code class="language-plaintext highlighter-rouge">foo.bar()</code> and <code class="language-plaintext highlighter-rouge">(foo.bar)()</code> are absolutely identical, having <code class="language-plaintext highlighter-rouge">this</code> set to <code class="language-plaintext highlighter-rouge">foo</code> since a Reference is created and passed to a Function call.</p>
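
<p>All of this can be observed directly. A minimal check, assuming a made-up <code class="language-plaintext highlighter-rouge">foo</code> whose method is strict so that a lost Reference shows up as <code class="language-plaintext highlighter-rouge">undefined</code> rather than the global object:</p>

```javascript
const foo = {
  bar: function () {
    'use strict';
    return this;
  }
};
let f;

const plain = foo.bar();          // Reference reaches the call: this is foo
const grouped = (foo.bar)();      // grouping keeps the Reference: this is foo
const assigned = (f = foo.bar)(); // assignment went through GetValue
const comma = (1, foo.bar)();     // so did the comma operator

console.log(plain === foo, grouped === foo, assigned, comma);
```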

<h3 id="returning-references">Returning References</h3>

<p>It’s worth mentioning that ES5 spec technically <a href="https://es5.github.io/#x8.7">allows function calls to return a reference</a>. However, this is only reserved for host objects, and none of the built-in (or user-defined) functions do that.</p>

<p>An example of this (non-existent, but permitted) behavior is something like this:</p>

<script src="https://gist.github.com/kangax/36c3ae7b3e446c394197.js"> </script>

<p>Of course, the current behavior is that non-Reference is passed to a Function call, resulting in this=undefined/global object (unless <code class="language-plaintext highlighter-rouge">bar</code> was already bound to <code class="language-plaintext highlighter-rouge">foo</code> earlier).</p>

<h3 id="typeof-operator">typeof operator</h3>

<p>Now that we understand References, we can take a look at a few other places for a better understanding. Take, for example, the <a href="https://es5.github.io/#x11.4.3">typeof operator</a>:</p>

<script src="https://gist.github.com/kangax/1b0c46898331cf92b540.js"> </script>

<p>Here is the “secret” for why we can pass an unresolvable reference to <code class="language-plaintext highlighter-rouge">typeof</code> and not have it blow up.</p>

<p>On the other hand, if we were to use an unresolvable reference without <code class="language-plaintext highlighter-rouge">typeof</code>, as a <a href="https://es5.github.io/#x12.4">plain statement</a> somewhere in code:</p>

<script src="https://gist.github.com/kangax/b2c453a43b4687225ed8.js"> </script>

<p>Notice how Reference is passed to GetValue() which is then responsible for stopping execution if Reference is an unresolvable one. It all starts to make sense.</p>
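
<p>Both behaviors are easy to reproduce. In this small sketch, <code class="language-plaintext highlighter-rouge">iDontExist</code> is assumed to be an identifier that is not declared anywhere:</p>

```javascript
// typeof short-circuits on an unresolvable Reference, so nothing throws.
const kind = typeof iDontExist;

// A plain expression statement hands the Reference to GetValue, which throws.
let caught = null;
try {
  iDontExist;
} catch (err) {
  caught = err;
}

console.log(kind, caught instanceof ReferenceError);
```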

<h3 id="delete-operator">delete operator</h3>

<p>Finally, what about good old <a href="https://es5.github.io/#x11.4.1">delete operator</a>?</p>

<script src="https://gist.github.com/kangax/3ca8c2a1e141ee4f3c3d.js"> </script>

<p>What might have looked like mumbo-jumbo is now pretty nice and clear:</p>

<ul>
  <li>If it’s not a reference, return true (<code class="language-plaintext highlighter-rouge">delete 1</code>, <code class="language-plaintext highlighter-rouge">delete /x/</code>)</li>
  <li>If it’s an unresolvable reference (<code class="language-plaintext highlighter-rouge">delete iDontExist</code>)
    <ul>
      <li>if in strict mode, throw SyntaxError</li>
      <li>if not in strict mode, return true</li>
    </ul>
  </li>
  <li>If it’s a property reference, actually try to delete a property (<code class="language-plaintext highlighter-rouge">delete foo.bar</code>)</li>
  <li>If it’s a reference with Environment Record as base (<code class="language-plaintext highlighter-rouge">delete foo</code>)
    <ul>
      <li>if in strict mode, throw SyntaxError</li>
      <li>if not in strict mode, attempt to delete it (further algorithm follows)</li>
    </ul>
  </li>
</ul>
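<p>The branches above can be exercised with a short sketch. It assumes <code class="language-plaintext highlighter-rouge">iDontExist</code> is declared nowhere, and it uses <code class="language-plaintext highlighter-rouge">new Function()</code> only as a convenient way to run non-strict and strict code side by side; the variable names are made up for illustration.</p>

```javascript
// Function bodies created with new Function() are non-strict unless they
// opt in with "use strict", so both modes can be tried from one script.

// Not a reference: returns true.
var notAReference = new Function('return delete 1;')();

// Unresolvable reference, non-strict mode: returns true.
var unresolvableSloppy = new Function('return delete iDontExist;')();

// Same expression in strict mode: a SyntaxError, raised when the code is parsed.
var unresolvableStrict = null;
try {
  new Function('"use strict"; return delete iDontExist;');
} catch (e) {
  unresolvableStrict = e;
}

// Property reference: the property is actually removed.
var foo = { bar: 1 };
var propertyDeleted = delete foo.bar;

// Environment Record as base, non-strict mode: deletion is attempted,
// but a binding created with `var` is not deletable, so the result is false.
var declaredVariable = new Function('var local = 1; return delete local;')();

console.log(notAReference, unresolvableSloppy);         // true true
console.log(unresolvableStrict instanceof SyntaxError); // true
console.log(propertyDeleted, 'bar' in foo);             // true false
console.log(declaredVariable);                          // false
```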

<h3 id="summary">Summary</h3>

<p>And that’s a wrap!</p>

<p>Hopefully you now understand the underlying mechanism of References in JavaScript; how they’re used in various places and how we can “utilize” them to explain <code class="language-plaintext highlighter-rouge">this</code> behavior even in non-trivial constructs.</p>

<p>Note that everything I mentioned in this post was <b>based on ES5</b>, it being the current standard and the most widely implemented one at the moment. <a href="http://people.mozilla.org/~jorendorff/es6-draft.html">ES6</a> might have some changes, but that’s a story for another day.</p>

<p>If you’re curious to know more — check out <a href="https://es5.github.io/#x8.7">section 8.7 of ES5 spec</a>, including internal methods <code class="language-plaintext highlighter-rouge">GetValue()</code>, <code class="language-plaintext highlighter-rouge">PutValue()</code>, and more.</p>

<p>P.S. Big thanks to <a href="https://twitter.com/rwaldron">Rick Waldron</a> for review and suggestions!</p>]]></content><author><name></name></author><category term="js" /><summary type="html"><![CDATA[Know thy reference Abusing leaky abstractions for a better understanding of “this”]]></summary></entry><entry><title type="html">Refactoring single page app</title><link href="http://perfectionkills.com/refactoring-single-page-app/" rel="alternate" type="text/html" title="Refactoring single page app" /><published>2014-09-01T00:00:00+00:00</published><updated>2014-09-01T00:00:00+00:00</updated><id>http://perfectionkills.com/refactoring-single-page-app</id><content type="html" xml:base="http://perfectionkills.com/refactoring-single-page-app/"><![CDATA[<h1 id="refactoring-single-page-app">Refactoring single page app</h1>
<h3 id="a-tale-of-reducing-complexity-and-exploring-client-side-mvc"><em>A tale of reducing complexity and exploring client-side MVC</em></h3>

<p>Skip straight to <a href="#tldr">TL;DR</a>.</p>

<p><a href="http://fabricjs.com/kitchensink">Kitchensink</a> is your usual behemoth app.</p>

<p>I created it a couple of years ago to showcase <em>everything</em> that <a href="http://fabricjs.com">Fabric.js</a> (a full-blown &lt;canvas&gt; library) is capable of. We already had some <a href="http://fabricjs.com/demos">demos</a> illustrating this and that functionality, but kitchensink was meant to be kind of a general sandbox.</p>

<p>You could quickly try things out — add simple shapes or images or SVG’s or text; move them around, scale, rotate, delete, group, change colors, opacity; experiment with locking or z-index properties; serialize canvas into image or JSON or SVG; and so on.</p>

<p><img src="/images/kitchensink.png" width="400px" /></p>

<p>And so there was a good old, <em>single</em> <strong>kitchensink.js</strong> file (accompanied by kitchensink.html and kitchensink.css) — just a bunch of procedural commands and conditions, really. Pressed that button? Add a rectangle to the canvas. Pressed another one? Load an image. Was object selected on canvas? Enable that button and update its text. You get the idea.</p>

<p>But things change, and over time the app grew and grew until the once-simple kitchensink.js became too big for its own good. I was starting to notice more and more repetition, and problems with navigating and maintaining the code. The small weird glitches that plague apps <strong>without an authoritative data source</strong> showed up as well.</p>

<p>I was looking at a 1000+ LOC JS file, realizing it’s time to refactor.</p>

<p>But there was a bit of a pickle. You see, kitchensink is all about managing <strong>&lt;canvas&gt;</strong>, through Fabric, and frankly I had no idea how to best approach an app like this. If this was your usual “User” or “Collection” or “TodoItem” data coming from a server or elsewhere, I’d quickly throw together a few <code class="language-plaintext highlighter-rouge">Backbone.Model</code>s and call it a day. But with Fabric, we have an object model on top of &lt;canvas&gt;, so there’s just a collection of abstract visual objects and a way to operate on those objects.</p>

<p>Is it possible to shoehorn MVC onto all this? What exactly would become a model or the views? Is it even a good idea?</p>

<p>The following is my step-by-step refactoring path, including a close look at some MVC-ish solutions. You can use it to get ideas on revamping your own spaghetti app, and/or to see how to approach the design of a &lt;canvas&gt;-based app, specifically. Each step is made as a separate commit in the <a href="https://github.com/kangax/fabricjs.com/commits/gh-pages?page=3">fabricjs.com repo on github</a>.</p>

<h2 id="complexity_vs_maintainability">Complexity vs. Maintainability</h2>

<p><img src="/images/plato.png" /></p>

<p>Before changing anything, I decided to do a little experiment and <strong>statically analyze the complexity</strong> of the app. Not to tell me that it was in a shitty state; that I already knew. I wanted to see how it <strong>changes</strong> based on different solutions.</p>

<p>There are a few ways to analyze JS code at the moment. There’s the <a href="https://github.com/philbooth/complexity-report">complexity-report</a> npm package, as well as <a href="http://jscomplexity.org">jscomplexity.org</a> (both rely on <a href="https://github.com/philbooth/escomplex">escomplex</a>). There’s <a href="https://github.com/es-analysis/plato">Plato</a>, which provides visual tracking of complexity (based on complexity-report). And there’s good old <a href="http://www.jshint.com/">JSHint</a>; it has its own cyclomatic complexity calculation.</p>

<p>I used <code class="language-plaintext highlighter-rouge">complexity-report</code> because it has more granular analysis and has this useful metric — “Maintainability”. What exactly is it and why should we care about it?</p>

<p>Here’s a simple example:</p>

<script src="https://gist.github.com/kangax/b6c6b8a2e97e2889e2e4.js"> </script>

<p>This chunk of code has cyclomatic complexity (CC) of 1. It’s just a single function call. No conditional operators, no loops. Yet, it’s pretty scary (<a href="https://github.com/kangax/fabric.js/blob/master/src/shapes/itext.class.js#L765-L769">actual code from Fabric.js, btw</a>; shame on me).</p>

<p>Now look at this code:</p>

<script src="https://gist.github.com/kangax/e574391fbb51d881ddfa.js"> </script>

<p>It also has cyclomatic complexity of 1. But it’s clearly <em>significantly</em> easier to understand and maintain.</p>

<p>Maintainability, on the other hand, is reported as <strong>151</strong> for the first one and <strong>159</strong> for the second (the higher the better; 171 being the highest). It’s still not a big difference but it’s definitely more representative of overall state of the code, unlike cyclomatic complexity.</p>

<p><code class="language-plaintext highlighter-rouge">complexity-report</code> <a href="http://jscomplexity.org/complexity">defines maintainability</a> as a function of not just cyclomatic complexity but also <strong>lines of code</strong> and <strong>overall volume</strong> of operators &amp; operands (effort):</p>

<script src="https://gist.github.com/kangax/22a35e5327f0e9ea413b.js"> </script>

<p>Suffice it to say, it gives a more accurate picture of code simplicity and maintainability.</p>
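<p>To make the relationship between the three inputs concrete, here is a rough sketch of a maintainability index in the commonly cited form. The coefficients and the function name are assumptions on my part, not a copy of what <code class="language-plaintext highlighter-rouge">complexity-report</code> ships; the only thing taken from the text above is that 171 is the ceiling and that effort, cyclomatic complexity and lines of code all pull the score down.</p>

```javascript
// Hypothetical helper: a maintainability index in its commonly cited form.
// The coefficients (3.42, 0.23, 16.2) are assumed, not read from the tool.
function maintainability(effort, cyclomatic, sloc) {
  return 171
    - 3.42 * Math.log(effort)      // Halstead effort (operators and operands)
    - 0.23 * Math.log(cyclomatic)  // cyclomatic complexity
    - 16.2 * Math.log(sloc);       // logical lines of code
}

// With every input at 1, all logarithms are 0 and the score is the ceiling.
console.log(maintainability(1, 1, 1)); // 171

// Same complexity and size, but ten times the effort: the score drops.
var terse = maintainability(1000, 1, 10);
var verbose = maintainability(10000, 1, 10);
console.log(verbose < terse); // true
```

<p>This also shows why two snippets with identical cyclomatic complexity can score differently: the effort and size terms still vary.</p>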

<h2 id="vanilla-js">Vanilla JS</h2>

<p>It all started with this one <a href="https://github.com/kangax/fabricjs.com/blob/a28e2c79218f75d27b3c14a9ade38daebd0075d6/js/kitchensink.js">big, linear, 1057 LOC JS file</a>. Kitchensink never had any complex DOM/AJAX interactions or animations, so I never even used jQuery in there. Just plain vanilla JS.</p>

<script src="https://gist.github.com/kangax/d7d45920fcc6fe4c799b.js"> </script>

<h2 id="introducing-jquery">Introducing jQuery</h2>

<p>I started by <a href="https://github.com/kangax/fabricjs.com/blob/9da7aded2bf1c6e066483a9ef344ad5b088b29b8/js/kitchensink.js">porting</a> all existing DOM interactions to jQuery. I wasn’t expecting any great improvements in code; jQuery can’t help with architectural changes. But it did remove some repetitive code - class handling, element removal, event handling, etc.</p>

<p>It would have also provided a <strong>good foundation</strong> for further improvements, in case I decided to go with Backbone or any other higher-level tools.</p>

<script src="https://gist.github.com/kangax/a78627df78f63a3378f1.js"> </script>

<p>Notice how it shaved off ~50 lines of code and even improved complexity from 132 to <strong>116</strong> (mainly removing some DOM handling conditions: think <code class="language-plaintext highlighter-rouge">toggleClass</code>, etc.).</p>

<p><img src="/images/refactoring/jquery.png" /></p>

<h2 id="backbone-angular-ember">Backbone? Angular? Ember?</h2>

<p>With easy stuff out of the way, I tried to figure out what to do next. I’ve used Backbone in the past, and I’ve been meaning to try out Angular and/or Ember — 2 of the most popular higher-level solutions. This would be a perfect way to learn them.</p>

<p>Still unsure of how to proceed, I decided to do something <strong>very simple</strong>. Instead of figuring out which library is the be-all-end-all, I went on to fix the most obvious issue — tight coupling of view logic and all the other (let’s call it “business”) logic.</p>

<h2 id="separating-presentation-and-business-logic">Separating presentation and “business” logic</h2>

<p>I broke kitchensink.js into 3 files: <em>model.js</em>, <em>view.js</em>, and <em>utils.js</em>.</p>

<p>Utils were just some language-level methods used by the app (like <code class="language-plaintext highlighter-rouge">getRandomColor</code> or <code class="language-plaintext highlighter-rouge">supportsColorpicker</code>). View was to contain <strong>purely UI code</strong>, and it would reach out to model.js for any <strong>state or actions on that state</strong>. I called it model.js but really it was a combination of model and <strong>controller</strong>. The bottom line was that it had nothing to do with the presentation logic.</p>

<p>So this kind of mess (in previous code):</p>

<script src="https://gist.github.com/kangax/687f74aba5871a39c0e2.js"> </script>

<p>was now separated into view concern:</p>

<script src="https://gist.github.com/kangax/05f1414f216819bb6f7a.js"> </script>

<p>and model/controller concern:</p>

<script src="https://gist.github.com/kangax/2c8793ac550ccad93125.js"> </script>

<p>Separating presentation logic from everything else had a <strong>dramatic effect</strong> on the state of the app.</p>

<p>Yes, there was an inevitable increase in lines of code (714 -&gt; 829) but both complexity and maintainability improved dramatically. Overall CC went down from 116 to <strong>98</strong>, but more importantly, it was significantly lower per file. The biggest chunk was now in the model (cc=70) and the view became thin and easy to follow (cc=26).</p>

<p>Maintainability rose from 104 to ~<strong>125</strong>.</p>

<script src="https://gist.github.com/kangax/564310b2b1b601e43ae7.js"> </script>

<p><img src="/images/refactoring/ui_business.png" /></p>

<h2 id="introducing-convention">Introducing convention</h2>

<p>Looking at the code revealed a few more possible optimizations. One of them was to use <strong>convention</strong> when enabling/disabling buttons representing canvas object actions. Instead of keeping references to them in the code, then disabling/enabling them through those references, I gave them all a specific class name (“btn-object-action”) right in the markup, then toggled their state with the help of jQuery.</p>

<p>The changes weren’t very impressive, but complexity of the view went down from 26 to <strong>21</strong>, SLOC went from 829 to <strong>805</strong>. Not bad.</p>

<p><img src="/images/refactoring/convention.png" /></p>

<h2 id="backbone">Backbone</h2>

<p>At this point, the app complexity was concentrated in model/controller file. There wasn’t much I could do about it since it was all pure “business” logic: creating objects, manipulating objects, keeping their state, etc.</p>

<p>However, there was still some room for improvement in the view corner.</p>

<p>I decided to start with <strong>Backbone</strong>. I only needed a fraction of its capabilities, but Backbone is relatively “lean” and provides a nice, declarative abstraction of certain common view operations, such as event handling. Changing the plain kitchensink view object to a <code class="language-plaintext highlighter-rouge">Backbone.View</code> would allow me to take advantage of that.</p>

<p>Instead of assigning event handlers manually, there was now this:</p>

<script src="https://gist.github.com/kangax/dfe30290375f72da9122.js"> </script>

<h3 id="first-application-of-backbone-esque-mvc">First application of (Backbone-esque) MVC</h3>

<p>At the same time, model-controller was now implemented as <code class="language-plaintext highlighter-rouge">Backbone.Model</code> and was letting views know when to update themselves. This was an <strong>important change</strong> towards a different architecture. View was now <strong>observing model-controller</strong> for changes, and re-rendering itself accordingly. And model-controller fired change event whenever something would change on canvas itself.</p>

<p>In model-controller:</p>

<script src="https://gist.github.com/kangax/b9b0a3a76bd9dc68c25f.js"> </script>

<p>Remember I mentioned abstract canvas state and interactions?</p>

<p>Notice the bridge between canvas and model/controller object: “object:selected”, “object:added”, “selection:cleared” <strong>canvas/Fabric events were all forwarded</strong> as controller’s “change” one.</p>

<p>In view:</p>

<script src="https://gist.github.com/kangax/0967c79584ce29277d5b.js"> </script>

<p>As an example, when the user selected an object on canvas, model-controller would now trigger a change event and the view would re-render itself. Then during render, the view would ask model-controller (<em>is there an active object?</em>) and, depending on the answer, render the corresponding buttons in either enabled or disabled state, with one text or the other.</p>
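<p>Stripped of Backbone and Fabric, the flow described above looks roughly like the sketch below. Everything in it is a stand-in invented for illustration: <code class="language-plaintext highlighter-rouge">Emitter</code> replaces the real event systems, and <code class="language-plaintext highlighter-rouge">hasActiveObject</code> and <code class="language-plaintext highlighter-rouge">removeButtonDisabled</code> are hypothetical names.</p>

```javascript
// Tiny hypothetical event emitter, standing in for Fabric and Backbone events.
function Emitter() {
  this.handlers = {};
}
Emitter.prototype.on = function (name, fn) {
  (this.handlers[name] = this.handlers[name] || []).push(fn);
};
Emitter.prototype.trigger = function (name) {
  (this.handlers[name] || []).forEach(function (fn) { fn(); });
};

var canvas = new Emitter(); // stands in for the Fabric canvas
canvas.activeObject = null;

var model = new Emitter();  // stands in for the model-controller
model.hasActiveObject = function () {
  return canvas.activeObject !== null;
};

// The bridge: several canvas events are forwarded as one "change" event.
['object:selected', 'object:added', 'selection:cleared'].forEach(function (name) {
  canvas.on(name, function () { model.trigger('change'); });
});

// The view re-renders on every change and asks the model for current state.
var view = {
  removeButtonDisabled: true,
  render: function () {
    this.removeButtonDisabled = !model.hasActiveObject();
  }
};
model.on('change', function () { view.render(); });

// Simulate the user selecting an object, then clearing the selection.
canvas.activeObject = { type: 'rect' };
canvas.trigger('object:selected');
var disabledAfterSelect = view.removeButtonDisabled;

canvas.activeObject = null;
canvas.trigger('selection:cleared');
var disabledAfterClear = view.removeButtonDisabled;

console.log(disabledAfterSelect, disabledAfterClear); // false true
```

<p>The view never listens to the canvas directly; it only knows about the single “change” event, which is what keeps it thin.</p>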

<p>This felt like a good improvement in the right direction, architecture-wise.</p>

<p>Views became more declarative and easier to follow. SLOC went down a bit (787 -&gt; <strong>772</strong>), and view complexity was now even less (from 21 to <strong>16</strong>). Unfortunately, maintainability of model went slightly down.</p>

<script src="https://gist.github.com/kangax/d15e75e3ee13a2bcee4f.js"> </script>

<p><img src="/images/refactoring/backbone.png" /></p>

<h2 id="backboneunclassified">Backbone.unclassified</h2>

<p>Backbone made views more declarative, but there was still some <strong>repetition</strong> I wasn’t happy with:</p>

<script src="https://gist.github.com/kangax/9e06a3b1d4ef4c87e1b7.js"> </script>

<p>Notice how the “#lock-horizontally” selector is repeated twice. This is bad both for maintenance (my main concern) and performance. In the past, I’ve used a tiny <a href="https://github.com/willurd/backbone.unclassified.js">backbone.unclassified</a> extension to alleviate this problem, and so I went with it again:</p>

<script src="https://gist.github.com/kangax/26d146d1ed398e20234f.js"> </script>

<p>Notice how we create an “<strong>identifier</strong>” for an element in <code class="language-plaintext highlighter-rouge">ui</code> “map”, and then use that identifier in <em>both</em> events “map” and in the rendering code.</p>

<p>This made views even more declarative, albeit at the expense of slightly more cruft overall. Complexity and maintainability stayed more or less the same.</p>

<h2 id="breaking-up-the-view">Breaking up the view</h2>

<p>The <a href="https://github.com/kangax/fabricjs.com/blob/571811739790490133da0ec1ec0803dd4bfb1f0e/js/kitchensink_view.js">KitchensinkView</a> was already clean and beautiful. Half of it was simple declarative one-liners (clicked this button? call that model method) and the rest was pretty simple and linear rendering logic.</p>

<p>But there was something else.</p>

<p>Entire view logic/rendering of an app was stuffed in <strong>one file</strong>. The declarative “events” hash, for example, was spanning ~200 lines and was becoming daunting to look through. More importantly, this one file included multiple concerns: object controls section, section for adding objects, global canvas controls section, text handling section, and so on. Yes, these are all view concerns but they’re also logically distinct view concerns.</p>

<p>What to do? Break it into multiple views!</p>

<script src="https://gist.github.com/kangax/6ab7548b3be00317817a.js"> </script>

<p>The code size obviously increased once again, but look what happened with the views’ maintainability. It went from 132 to <strong>145</strong>! A significant and <strong>expected</strong> improvement.</p>

<p>Of course I didn’t need complexity report to tell me that things got better. I was now looking at 5 beautiful concise view files, each with its own rendering logic and behavior. As a nice side effect, some of the views (e.g. <code class="language-plaintext highlighter-rouge">AddCommandsView</code>) <a href="https://github.com/kangax/fabricjs.com/blob/089cd6a93d05f4bf4c9b09a5c235f4010e08c545/js/kitchensink/add_commands_view.js">became <strong>entirely declarative</strong></a>.</p>

<p><img src="/images/refactoring/multiple_views.png" /></p>

<h2 id="2-way-binding">2-way binding</h2>

<p>At this point, I was fully satisfied with the way things turned out.</p>

<p>Backbone (with unclassified extension) and multiple views made for a pretty clean app. Backbone felt almost perfect here as there was none of the more complicated logic of nested views/collections, animations/transition, routing, etc. I knew that adding new functionality or changing existing one would be straightforward; multiple views meant easy scaling and easy addition of new ones.</p>

<p>What could possibly be better…</p>

<p>Determined to continue further and see where it takes me, I took another look at the views:</p>

<script src="https://gist.github.com/kangax/d201c9f647be7c97666c.js"> </script>

<p>This is <code class="language-plaintext highlighter-rouge">ObjectControlsView</code> and I’m only showing 2 functionalities here: lock toggling button and opacity slider. Notice how both behaviors <strong>have something in common</strong>. There’s an event (“click” or “change”) that maps to a model action, and then there’s rendering logic: updating button text or updating slider value.</p>

<p>Don’t you find the cruft inside <code class="language-plaintext highlighter-rouge">render</code> just a bit too repetitive and unnecessary? Wouldn’t it be great if we could just update “opacity” or toggle lock value on a model, not caring about rendering of corresponding control? So that opacity slider automatically knew to update itself, once opacity on a model changed. Ditto for toggling button.</p>

<p>Did someone say… <strong>data binding</strong>?</p>

<p>Of course! I just had to see what introducing <a href="http://en.wikipedia.org/wiki/UI_data_binding">data-binding</a> would do to an app. Unfortunately, Backbone doesn’t have it built-in, unlike other MV* solutions — Knockout, Angular, Ember, etc.</p>

<p>I wanted to stick to Backbone for now, instead of trying something completely different, which meant using an addon of some sort.</p>

<h3 id="backbonestickit">backbone.stickit</h3>

<p>I tried <a href="https://github.com/NYTimes/backbone.stickit">backbone.stickit</a> first, but <strong>couldn’t get it to work at all</strong> with kitchensink’s model/controller methods.</p>

<p>You see, binding view to a regular Backbone model is easy with “stickit”. Just define a hash with selector ↔ attribute mapping:</p>

<script src="https://gist.github.com/kangax/d084b10ecff4784e343a.js"> </script>

<p>Unfortunately, our model is &lt;canvas&gt;-based and all the state needs to be <strong>set &amp; retrieved via a proxy</strong>. This means using methods, not properties.</p>

<p>We can’t just map the opacity slider to an “opacity” attribute on a model. We need to map it to <code class="language-plaintext highlighter-rouge">canvas.getActiveObject().opacity</code> (possibly checking that <code class="language-plaintext highlighter-rouge">getActiveObject()</code> returns an object in the first place) via custom getters/setters.</p>

<h3 id="epoxy">Epoxy</h3>

<p>Next there was <a href="http://epoxyjs.org/">Epoxy.js</a>, which defines bindings like so:</p>

<script src="https://gist.github.com/kangax/33f9ec0406142d509f07.js"> </script>

<p>Again, easy with plain attributes. Not so much with methods. I tried to implement it via computed properties but without much success.</p>

<h3 id="rivetsjs">Rivets.js</h3>

<p>Next there was <a href="http://rivetsjs.com">Rivets.js</a> and, although I was expecting another painful “adaptation”, it surprisingly <strong>just worked</strong> out of the box!</p>

<p>Rivets turned out to be pretty low-level, but also very flexible. Docs quickly revealed how to use methods instead of properties. The binding could be initialized like so:</p>

<script src="https://gist.github.com/kangax/32881609c10e50bf6c41.js"> </script>

<p>And the markup would then be parsed for any “rv-…” attributes (prefix could be changed). For example:</p>

<script src="https://gist.github.com/kangax/f72eafcca9cf5203532e.js"> </script>

<p>The great thing was that I could just write <code class="language-plaintext highlighter-rouge">app.getBgColor</code> and it would call that method on <code class="language-plaintext highlighter-rouge">kitchensink</code> since that’s what was passed to <code class="language-plaintext highlighter-rouge">rivets.bind()</code> as an <code class="language-plaintext highlighter-rouge">app</code>. No limitations of only working with <code class="language-plaintext highlighter-rouge">Backbone.Model</code> attributes. While this worked for one-way binding, with 2-way binding (where the view also needs to update the model), I would need to write a custom adapter…</p>

<p>It sounded daunting but turned out to be rather straightforward:</p>

<script src="https://gist.github.com/kangax/651adca01fdb169b63f6.js"> </script>

<p>Now, I could add this in markup (notice the use of special <code class="language-plaintext highlighter-rouge">^</code> separator, instead of default <code class="language-plaintext highlighter-rouge">.</code>):</p>

<script src="https://gist.github.com/kangax/68ebe292ce6839b9b32c.js"> </script>

<p>..and it would use a nice convention of calling <code class="language-plaintext highlighter-rouge">getCanvasBgColor</code> as a getter and <code class="language-plaintext highlighter-rouge">setCanvasBgColor</code> as a setter, when changing the colorpicker value.</p>
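<p>The heart of that convention is just a name transformation plus a method call. The sketch below shows only that part, as a plain object with <code class="language-plaintext highlighter-rouge">get</code> and <code class="language-plaintext highlighter-rouge">set</code> functions; the object shape, the <code class="language-plaintext highlighter-rouge">accessorName</code> helper and the sample <code class="language-plaintext highlighter-rouge">app</code> are assumptions for illustration, and a real adapter would also have to subscribe to model changes.</p>

```javascript
// Hypothetical helper: "canvasBgColor" becomes "getCanvasBgColor" or "setCanvasBgColor".
function accessorName(prefix, keypath) {
  return prefix + keypath.charAt(0).toUpperCase() + keypath.slice(1);
}

// Only the reading and writing half of an adapter, written as a plain object.
// It is not registered with any library here.
var methodAdapter = {
  get: function (obj, keypath) {
    return obj[accessorName('get', keypath)]();
  },
  set: function (obj, keypath, value) {
    obj[accessorName('set', keypath)](value);
  }
};

// Stand-in for the kitchensink model-controller.
var app = {
  bgColor: '#ffffff',
  getCanvasBgColor: function () { return this.bgColor; },
  setCanvasBgColor: function (value) { this.bgColor = value; }
};

var before = methodAdapter.get(app, 'canvasBgColor');
methodAdapter.set(app, 'canvasBgColor', '#ff0000');
var after = methodAdapter.get(app, 'canvasBgColor');

console.log(before, after); // #ffffff #ff0000
```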

<p>There was no longer a need for manual (even if declarative) event listeners:</p>

<script src="https://gist.github.com/kangax/a9a99541625ae9ef8c89.js"> </script>

<h4 id="downsides-of-markup-based-bindings">Downsides of markup-based bindings</h4>

<p>I didn’t exactly like this whole setup.</p>

<p>I’d prefer to have bindings right in the code, to have a “bird’s-eye” understanding of which elements map to which behavior. It would also be easier and more understandable to map <strong>multiple elements</strong>. If I wanted a set of buttons to toggle their enabled/disabled state according to a certain state of canvas (and I did want that) I couldn’t just do something like:</p>

<script src="https://gist.github.com/kangax/dda2af750317cf0747f2.js"> </script>

<p>I had to write a custom binder instead, and that’s certainly more obscure and harder to understand. Speaking of custom binders…</p>

<p>Rivets makes it easy to create them. Binders are those “rv-…” directives we saw earlier. There are a few built-in ones (“rv-value”, “rv-checked”, “rv-on-click”) and it’s easy to define your own.</p>

<p>In order to toggle the buttons’ state, I wrote this simple 1-way binder:</p>

<script src="https://gist.github.com/kangax/8edbebc51b1b6d80c778.js"> </script>

<p>It was now possible to use “rv-enable” on a parent element to enable or disable descendant buttons:</p>

<script src="https://gist.github.com/kangax/3d6ac26b18a09ba0fda0.js"> </script>

<p>But imagine reading unknown markup like this, trying to understand which directive controls what, and <strong>how far it spans</strong>…</p>

<p>Another binder I added was “rv-val”, as an alternative to “rv-value” (with the exception of observing “keyup” rather than “change” event on an element):</p>

<script src="https://gist.github.com/kangax/6271c8531d80edfcdd39.js"> </script>

<p>You can see that adding binders is simple, they’re easy to read, and you can even reuse existing behavior (<code class="language-plaintext highlighter-rouge">rivets.binders.value.routine</code> in this case).</p>

<p>Finally, there’s a convenient support for formatting, which was <strong>just perfect</strong> for changing toggleable text on some elements:</p>

<script src="https://gist.github.com/kangax/e49d830bf6c05aa3ec70.js"> </script>

<p>Notice how “rv-text” contents include <code class="language-plaintext highlighter-rouge">| toggle smth smth</code>. This is a custom formatter, defined like this:</p>

<script src="https://gist.github.com/kangax/3f929be7f26519783625.js"> </script>

<p>The button text was now determined according to <code class="language-plaintext highlighter-rouge">app^horizontalLock</code> (which desugars to <code class="language-plaintext highlighter-rouge">app.getHorizontalLock()</code>) and when passed to <code class="language-plaintext highlighter-rouge">toggle</code> formatter, would come out either as one or the other. Unfortunately, formatter falls a bit short; it seems that its values can’t be strings, which makes things much less convenient.</p>

<p>Unlike with behavior, keeping alternative UI text directly in HTML felt perfect. Text stays where text should be — in markup; it makes localization easier; it’s easy to follow.</p>

<p>On the other hand, I didn’t like keeping model/controller actions right in the markup:</p>

<script src="https://gist.github.com/kangax/b846c51b329a4b454dd9.js"> </script>

<p>It’s especially bad when some of the view behavior is somewhere in a JS-based view/controller, and some — in the markup. YMMV.</p>

<p>So what happened to the code?</p>

<p>After moving app logic from JS views to HTML (via Rivets’ “rv-“ attributes), all that was left from the views were these 3 lines:</p>

<script src="https://gist.github.com/kangax/972c3157c1f4de279ed2.js"> </script>

<p>Amazing, right? Or not so much?</p>

<p>Yes, we practically eliminated JS-based view, moving logic/behavior to markup and/or model-controller. But let’s look at the stats:</p>

<script src="https://gist.github.com/kangax/f454a187cba9b817be45.js"> </script>

<p>There was now additional (32 SLOC) <a href="https://github.com/kangax/fabricjs.com/blob/bd8db011a2a46ee5eed3b42429f2d6677921ab96/js/kitchensink/data_binding_adapter.js"><em>data_binding_adapter.js</em></a> file which included all the customizations and additions for Rivets.js. Still, there was a dramatic reduction of SLOC (830 -&gt; <strong>715</strong>); expected, since a lot of logic was moved to the markup. View’s maintainability was still ~<strong>145</strong> but model-controller surprisingly went from 116 to <strong>125</strong>! Even though more code moved to model-controller, that code was now <em>simpler</em> — usually a pair of getter/setter’s for particular state.</p>

<p>So how does this compare to the very first step, the monolithic spaghetti code?</p>

<script src="https://gist.github.com/kangax/88ed82322a07e08ff6b6.js"> </script>

<p>Improvement across the board. And what about HTML, where so much logic was moved to?</p>

<script src="https://gist.github.com/kangax/04057452dd3d33ae76e1.js"> </script>

<p><img src="/images/refactoring/rivets.png" /></p>

<p>Ok, 100 lines longer, and only 3KB heavier. Doesn’t seem too bad.</p>

<p>But was this really an improvement? All the HTML declarations and all the abstraction felt like 1 step forward, 2 steps back. It seemed harder to understand and likely harder to maintain. While the complexity tool showed improvement, it was only improvement on the JS side, and of course it couldn’t give a <strong>holistic analysis</strong>.</p>

<p>I wanted to take a step back and try something else.</p>

<h2 id="breaking-up-controller">Breaking up controller</h2>

<p>Aside from markup contamination, the problem was <strong>model-controller becoming too fat</strong>; that one file that was still sitting at 70 complexity.</p>

<p>What if I could keep <em>Rivets.js</em> for now, but break model-controller into multiple controllers, each for a distinct behavior? A very thin model would then serve as a proxy between &lt;canvas&gt; and controller actions. After some experimentation and pondering on the best way to organize something like that, I ended up with this:</p>

<p>The model was now &lt;canvas&gt; itself! There were no JS-based views, and all the logic was in <a href="https://github.com/kangax/fabricjs.com/tree/62ae965b2e2e9e646c1e15e7292a78e9552ed932/js/kitchensink">5 distinct controllers</a>. But how was this possible? Shouldn’t canvas actions go through some kind of proxy to normalize all the <code class="language-plaintext highlighter-rouge">canvas.getActiveObject().{get|set}Something()</code> voodoo? Yes, it was still needed, but all the proxying was now happening <strong>in controller itself</strong>.</p>

<p>I created <code class="language-plaintext highlighter-rouge">CanvasController</code>, inheriting from <code class="language-plaintext highlighter-rouge">Backbone.Model</code> (to have event managing), and gave it very minimal generic behavior (<code class="language-plaintext highlighter-rouge">getStyle</code>, <code class="language-plaintext highlighter-rouge">setStyle</code>, <code class="language-plaintext highlighter-rouge">triggerChange</code>). Those methods are what <strong>served as proxy</strong> between canvas and controllers. Controllers implemented specific getters/setters <strong>via those methods</strong> (inherited from a parent <code class="language-plaintext highlighter-rouge">CanvasController</code> “class”).</p>
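<p>A rough sketch of that arrangement, without Backbone or Fabric, might look like this. The fake <code class="language-plaintext highlighter-rouge">canvas</code>, the <code class="language-plaintext highlighter-rouge">changeCount</code> counter and the <code class="language-plaintext highlighter-rouge">ObjectControlsController</code> name are all made up for the example; only <code class="language-plaintext highlighter-rouge">getStyle</code>, <code class="language-plaintext highlighter-rouge">setStyle</code> and <code class="language-plaintext highlighter-rouge">triggerChange</code> come from the description above, and their bodies here are guesses.</p>

```javascript
// Hypothetical stand-in for a Fabric canvas with an optional active object.
var canvas = {
  activeObject: null,
  getActiveObject: function () { return this.activeObject; }
};

// Base "class": the only place that talks to the canvas directly.
function CanvasController(canvas) {
  this.canvas = canvas;
  this.changeCount = 0; // a real version would fire a "change" event instead
}
CanvasController.prototype.getStyle = function (name) {
  var object = this.canvas.getActiveObject();
  return object ? object[name] : undefined;
};
CanvasController.prototype.setStyle = function (name, value) {
  var object = this.canvas.getActiveObject();
  if (!object) return; // nothing selected, nothing to update
  object[name] = value;
  this.triggerChange();
};
CanvasController.prototype.triggerChange = function () {
  this.changeCount += 1;
};

// A specific controller builds its getters/setters on the inherited proxy methods.
function ObjectControlsController(canvas) {
  CanvasController.call(this, canvas);
}
ObjectControlsController.prototype = Object.create(CanvasController.prototype);
ObjectControlsController.prototype.constructor = ObjectControlsController;
ObjectControlsController.prototype.getOpacity = function () {
  return this.getStyle('opacity');
};
ObjectControlsController.prototype.setOpacity = function (value) {
  this.setStyle('opacity', value);
};

var controls = new ObjectControlsController(canvas);
var opacityWithoutSelection = controls.getOpacity();

canvas.activeObject = { opacity: 1 };
controls.setOpacity(0.5);

console.log(opacityWithoutSelection); // undefined
console.log(controls.getOpacity());   // 0.5
console.log(controls.changeCount);    // 1
```

<p>The null check for the active object lives in one place, so the specific controllers stay as one-line getters and setters.</p>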

<p>How did this all look complexity-wise?</p>

<script src="https://gist.github.com/kangax/ab8d47418b69b672fcaf.js"> </script>

<p>SLOC stayed the same but what happened to complexity? Not only did it go down to a total of <strong>68</strong>, the max complexity per file was now only <strong>18</strong>! There was no longer a big business logic file with cc=70, but small controller files with cc&lt;=20. Definitely an improvement.</p>

<p>Unfortunately, maintainability went slightly down (to <strong>128</strong>), likely due to all the additional cruft.</p>

<p><img src="/images/refactoring/multiple_controllers.png" /></p>

<p>Even though this was the <strong>best case complexity-wise</strong>, I still wasn’t <em>too happy</em> with this solution. There were still bindings in HTML and canvas controllers felt a bit too abstracted (i.e. it would take some time to understand how the app works, and how to change or extend it).</p>

<h2 id="angular">Angular</h2>

<p>Multiple controllers reminded me of what I’ve seen in Angular tutorials. It seemed natural to try and see how Angular compares to the last (Backbone + Rivets) solution, since it looked so similar.</p>

<p>Angular’s learning curve is definitely steeper. It took me ~2 days to understand and get comfortable with Rivets data-binding. It took ~2 weeks to understand and get comfortable with Angular data-binding (watches, digest cycle, directives, etc.).</p>

<p>Overall, implementing kitchensink via Angular felt <em>very</em> similar to Backbone + Rivets combo. But, as with everything, there were pros and cons.</p>

<h3 id="the-good">The Good</h3>

<p>In Angular, there’s no need to <code class="language-plaintext highlighter-rouge">Function#bind</code> methods to a model (when calling them from within attribute values). For example, <code class="language-plaintext highlighter-rouge">rv-on-click="app.foo"</code> calls <code class="language-plaintext highlighter-rouge">app.foo()</code> in the context of the element, whereas Angular’s <code class="language-plaintext highlighter-rouge">ng-click="foo()"</code> calls foo <strong>in the context of $scope</strong>. This proves to be more convenient.</p>
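<p>The context problem is plain JavaScript and easy to reproduce without either library. In this sketch the <code class="language-plaintext highlighter-rouge">element</code> object is a stand-in for a DOM node and the <code class="language-plaintext highlighter-rouge">name</code> properties exist only to make the receiver visible.</p>

```javascript
var app = {
  name: 'kitchensink',
  foo: function () { return this.name; }
};

// Stand-in for the DOM element that a click handler would be invoked on.
var element = { name: 'button' };

// A handler invoked in the element's context sees the element as `this`.
var unbound = app.foo.call(element);

// Function#bind pins `this` to the model, whatever the caller passes.
var boundFoo = app.foo.bind(app);
var bound = boundFoo.call(element);

console.log(unbound); // "button"
console.log(bound);   // "kitchensink"
```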

<p>Using the same example of <code class="language-plaintext highlighter-rouge">rv-on-click="app.foo"</code> vs. <code class="language-plaintext highlighter-rouge">ng-click="foo()"</code>, the parentheses after the name make it more <strong>clear that it’s a function call</strong>.</p>

<p>Function calls are also <strong>more concise</strong>. For example, <code class="language-plaintext highlighter-rouge">rv-show="app.getSelected"</code> vs. <code class="language-plaintext highlighter-rouge">ng-show="getSelected()"</code>. There’s no need to specify <code class="language-plaintext highlighter-rouge">app</code> since <code class="language-plaintext highlighter-rouge">getSelected</code> is looked up automatically on <code class="language-plaintext highlighter-rouge">$scope</code>.</p>

<p>Mostly syntactic preference, but
<code class="language-plaintext highlighter-rouge">&lt;button&gt;{{ ... }}&lt;/button&gt;</code> (in Angular) is easier to read/understand than <code class="language-plaintext highlighter-rouge">&lt;button rv-text&gt;&lt;/button&gt;</code>.</p>

<h3 id="the-not-so-good">The not so Good</h3>

<p>The biggest drawback was getting started and <strong>understanding how to plug</strong> kitchensink’s <em>unique</em> “model” into Angular. I was also unlucky to have run into an issue with {{ … }} conflicting with <a href="http://jekyllrb.com/">Jekyll’s</a> {{ … }}. Took quite some time to figure out why in the world Angular was not “initializing”…</p>

<p>It’s a bit annoying that Angular’s methods start with <code class="language-plaintext highlighter-rouge">$</code> and “interfere” with a common convention of referencing jQuery objects via <code class="language-plaintext highlighter-rouge">$xxx</code>. Just a minor <strong>additional cognitive burden</strong> if you’re used to that notation.</p>

<p>There were some minor things like Angular’s <code class="language-plaintext highlighter-rouge">$element.find()</code> limiting lookup by tagName <em>even when jQuery was available</em>. Weird.</p>

<p>Most importantly, <strong>custom 2-way binding was non-trivial</strong>, unlike with Rivets, whose documentation made it very clear. With Angular, it’s pretty much impossible to use custom accessors in attribute values. We can’t do that elegant Rivets trick of <code class="language-plaintext highlighter-rouge">app^selected</code> desugaring to <code class="language-plaintext highlighter-rouge">app.getSelected()</code> and <code class="language-plaintext highlighter-rouge">app.setSelected()</code>. Of course Angular’s directives kind of solve this, but it’s not the same.</p>

<p>Why? Because in Rivets, you can plug this custom adapter <em>anywhere</em>, including Rivets’ “native” binders!</p>
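<p>The desugaring itself is simple to picture. The following is only a rough sketch of the idea in plain JavaScript, not Rivets’ actual adapter code; the helper names <code class="language-plaintext highlighter-rouge">accessorNames</code>, <code class="language-plaintext highlighter-rouge">read</code> and <code class="language-plaintext highlighter-rouge">write</code> are made up for illustration.</p>

```javascript
// Turn a property name such as "selected" into accessor method names.
function accessorNames(key) {
  const suffix = key.charAt(0).toUpperCase() + key.slice(1);
  return { getter: 'get' + suffix, setter: 'set' + suffix };
}

// Read through the getter: read(app, 'selected') calls app.getSelected().
function read(model, key) {
  return model[accessorNames(key).getter]();
}

// Write through the setter: write(app, 'selected', v) calls app.setSelected(v).
function write(model, key, value) {
  model[accessorNames(key).setter](value);
}

// Hypothetical model that exposes explicit accessors.
const model = {
  _fontSize: 12,
  getFontSize() { return this._fontSize; },
  setFontSize(value) { this._fontSize = value; }
};

write(model, 'fontSize', 24);
const currentFontSize = read(model, 'fontSize');
console.log(currentFontSize); // 24
```

<p>Because the lookup lives in one place, any binder that reads or writes a value can go through it, which is what makes the adapter usable “anywhere”.</p>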

<p>Take this radio group, use built-in <code class="language-plaintext highlighter-rouge">rv-checked</code> attribute, and it just works:</p>

<script src="https://gist.github.com/kangax/a488c81f2765d55ab571.js"> </script>

<p>This cannot be done in Angular, so we need to implement our own “radio group” handling via a directive. Directives are somewhat similar to Rivets’ binders, although of course much more powerful.</p>

<p>To implement accessors, I created a <code class="language-plaintext highlighter-rouge">bindValueTo</code> directive, to be used like this:</p>

<script src="https://gist.github.com/kangax/765fee873786ef61f315.js"> </script>

<p>Now, the slider would call <code class="language-plaintext highlighter-rouge">getFontSize()</code> to retrieve the value, and <code class="language-plaintext highlighter-rouge">setFontSize(value)</code> to set it. Once I understood directives, it was fairly straightforward:</p>

<script src="https://gist.github.com/kangax/7db74120526a4404ee71.js"> </script>

<p>Notice the additional <code class="language-plaintext highlighter-rouge">$element[0].type === 'radio'</code> branch for that radio group case I mentioned earlier.</p>

<h3 id="clarity-vs-abstraction">Clarity vs. Abstraction</h3>

<script src="https://gist.github.com/kangax/f320246a4cdbee07ac0c.js"> </script>

<p>When it comes to Angular, I feel it’s important to strike a balance between <strong>abstraction &amp; clarity</strong>. Take this toggle button, for example. A common chunk of functionality in kitchensink, used a dozen times.</p>

<script src="https://gist.github.com/kangax/e529b36d0da088e997d3.js"> </script>

<p>Now, this is a fairly <strong>understandable</strong> piece of markup — putting the issue of mixed content/behavior aside — accessor methods toggling the state, element class and text updating accordingly. Yet, this is a common functionality. So to avoid repetition, it could be abstracted away into its own directive.</p>

<p>Imagine it being written like this:</p>

<script src="https://gist.github.com/kangax/f0a0d2eaefeae758de26.js"> </script>

<p>Certainly cleaner and easier to read, but is it easier to understand? As with any abstraction, there’s now an extra level underneath, so it’s not immediately clear what’s going on.</p>

<p>So how did porting to Angular affect complexity/maintainability scores?</p>

<script src="https://gist.github.com/kangax/f7f02f276aabc82998dc.js"> </script>

<p>Compared to the previous Backbone/Rivets combo, SLOC went from 715 to <strong>660</strong>, complexity from 68 to <strong>65</strong>, and maintainability from 128 to <strong>126</strong>. Interesting.</p>

<p>The reduction in SLOC was expected, knowing Angular’s nature of controller “entries” right in markup. Complexity and maintainability, on the other hand, practically stayed the same.</p>

<p><img src="/images/refactoring/angular.png" /></p>

<h2 id="html-size">HTML size</h2>

<p>If you’re wondering how this refactoring affected size of the main HTML file, the picture is very simple and straightforward.</p>

<p><img src="/images/refactoring/html_size.png" /></p>

<p>As expected, it’s been continuously growing little by little, with the spike from markup-based solutions like Rivets and Angular. Curiously, while Angular resulted in higher SLOC, it was actually fewer KB compared to Rivets.</p>

<h2 id="to-summarize">To summarize</h2>

<p>Unfortunately, other MV* libraries (Ember, Knockout, etc.) didn’t make it into my exploration. I was constrained on time, and I’d already come to a much more maintainable solution. I do hope to try something else in the near future. It’ll be interesting to see how yet another concept ties into the app. Stay tuned for part 2.</p>

<p>My final conclusion was that Backbone+Rivets and Angular provided relatively similar benefits, with almost identical complexity/maintainability scores, and only a different distribution of logic (attributes in markup vs. methods in JS “controller”). The pros/cons I mentioned earlier are what constituted the main difference.</p>

<h2 id="tldr">TLDR and Takeaway points</h2>

<ul>
  <li>
    <p>Path of exploration: <strong>Vanilla JS</strong> (initial mess) -&gt; <strong>jQuery</strong> (cleaner) -&gt; <strong>UI/business logic separation</strong> (much cleaner) -&gt; <strong>Backbone</strong> (slightly better) -&gt; <strong>Backbone.unclassified</strong> (slightly better) -&gt; <strong>Backbone &amp; multiple views</strong> (significantly better) -&gt; <strong>Rivets</strong> (better or worse?) -&gt; <strong>Multiple controllers</strong> (possibly better) -&gt; <strong>Angular.js</strong> (better or same?)</p>
  </li>
  <li>
    <p>An MVC framework is <strong>not always necessary</strong> when refactoring or creating a small/medium-sized client-side app. <strong>Separating presentation logic from “business” logic</strong> is often enough to produce a clean and maintainable architecture.</p>
  </li>
  <li>
    <p><em>Backbone</em> is great but almost always comes out a bit <strong>too low-level</strong>. <em>Backbone.unclassified</em> is a great addition to <strong>remove some repetition</strong> in the views.</p>
  </li>
  <li>
    <p><em>Rivets.js</em> is a nice <strong>library-agnostic data-binding tool</strong>, that could be used on top of <em>Backbone</em> to remove lots of repetitive view logic.</p>
  </li>
  <li>
    <p>Complexity tools like <code class="language-plaintext highlighter-rouge">complexity-report</code> or <code class="language-plaintext highlighter-rouge">JSHint</code> can <strong>aid with refactoring</strong> but shouldn’t be followed blindly. <strong>Use common sense</strong> and time-tested principles (SRP, DRY, separate presentation logic) when refactoring/designing an app.</p>
  </li>
  <li>
    <p>Don’t forget to <strong>look at the big picture</strong>. When the size of JS goes down, what happens to the markup? It could be that you’re just shifting things around without any significant improvements.</p>
  </li>
</ul>]]></content><author><name></name></author><category term="js" /><summary type="html"><![CDATA[Refactoring single page app A tale of reducing complexity and exploring client-side MVC]]></summary></entry><entry><title type="html">HTML minifier revisited</title><link href="http://perfectionkills.com/html-minifier-revisited/" rel="alternate" type="text/html" title="HTML minifier revisited" /><published>2014-07-28T00:00:00+00:00</published><updated>2014-07-28T00:00:00+00:00</updated><id>http://perfectionkills.com/html-minifier-revisited</id><content type="html" xml:base="http://perfectionkills.com/html-minifier-revisited/"><![CDATA[<h1 id="html-minifier-revisted">HTML minifier revisited</h1>

<p>4 years ago I <a href="http://perfectionkills.com/experimenting-with-html-minifier/">wrote about</a> and released <a href="http://kangax.github.io/html-minifier/">HTMLMinifier</a>. Back then, there were <a href="http://perfectionkills.com/optimizing-html/#tools">almost no tools</a> for proper HTML minification; unless you considered things like <a href="http://www.alentum.com/ahc/">“Absolute HTML Compressor”</a> for Windows 95/98/XP/2000 or <a href="https://code.google.com/p/htmlcompressor/">Java-based HTMLCompressor</a>.</p>

<p>I haven’t been working on it all that much, but occasionally would add a feature, fix a bug, add some tests, refactor, or pull someone’s generous contribution.</p>

<p>Fast forward to these days, and HTMLMinifier is no longer a simple experimental piece of Javascript. With over <a href="http://kangax.github.io/html-minifier/tests/">400 tests</a>, running on Node.js and <a href="https://www.npmjs.org/package/html-minifier">packaged on NPM</a> (with 120+ dependents), having CLI, <a href="https://github.com/gruntjs/grunt-contrib-htmlmin">grunt</a>/<a href="https://github.com/jonschlinkert/gulp-htmlmin">gulp</a> modules, benchmarking suite, and a number of improvements over the years, it became a rather viable tool for someone looking to squeeze the most out of front-end performance.</p>

<p>Seeing how the minifier has gained quite a few new additions over the years, I thought I’d give a quick rundown of what changed and what it’s now capable of.</p>

<h3 id="better-html5-conformance">Better HTML5 conformance</h3>

<p>We still rely on <a href="http://ejohn.org/blog/pure-javascript-html-parser/">John Resig’s HTML parser</a> but it is now heavily tweaked to conform to HTML5 and to provide more flexible parsing.</p>

<p>A common problem was the inability to “properly” recognize block elements within inline ones.</p>

<!-- <pre><code>
&lt;a href="...">
  &lt;div>test&lt;/div>
&lt;/a>
</code></pre> -->

<script src="https://gist.github.com/kangax/fade61c2025870588f63.js"> </script>

<p>This was not allowed in HTML4 but is now OK in HTML5.</p>

<p>Another issue was with custom elements (e.g. <code class="language-plaintext highlighter-rouge">&lt;my-component&gt;test&lt;/my-component&gt;</code>). While technically not part of HTML5, such elements are tolerated by browsers, and now by the minifier too.</p>

<h3 id="keeping-closing-slash-and-case-sensitivity-xhtml-svg-etc">Keeping closing slash and case sensitivity (XHTML, SVG, etc.)</h3>

<p>Two other commonly requested features were <strong>keeping end tag closing slash</strong> and <strong>case-sensitivity</strong>. Both of these are useful when minifying SVG (or XHTML) documents. Having an HTML4 parser at heart, and considering that in 99% of the cases <a href="http://www.webdevout.net/articles/beware-of-xhtml">trailing slashes serve no purpose</a>, the minifier would always drop them from the output. It still does, but you can now turn this behavior off.</p>

<p>Ditto for case-sensitivity — there’s an option for those looking to have finer control.</p>

<h3 id="ignoring-custom-comments-and-">Ignoring custom comments and &lt;!--!</h3>

<p>With the rise of client-side MVC frameworks, HTML comments became more than just comments. In <a href="http://knockoutmvc.com/">Knockout</a>, for example, there’s a thing called <em>containerless control flow syntax</em>, where you can have something like this:</p>

<!-- <pre><code>
&lt;!-- ko if: someExpression -->
<!--  &lt;li>Only present when someExpression is true&lt;/li> -->
<!-- &lt;!-- /ko -->
<!-- </code></pre> -->

<script src="https://gist.github.com/kangax/6bc8ce4d686994d6a198.js"> </script>

<p>It’s useful to be able to ignore such comments, while removing “regular” ones, so minifier now allows for exactly that:</p>

<!-- <pre><code>
minify(input, {
  removeComments: true,
  // ignore knockout comments
  ignoreCustomComments: [
    /^\s+ko/,
    /\/ko\s+$/
  ]
});
</code></pre> -->

<script src="https://gist.github.com/kangax/e5b518e97374741f659f.js"> </script>
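<p>To see what those two Knockout patterns actually match, here is a small standalone sketch. It assumes each pattern is tested against the text between <code class="language-plaintext highlighter-rouge">&lt;!--</code> and <code class="language-plaintext highlighter-rouge">--&gt;</code>; the <code class="language-plaintext highlighter-rouge">isIgnoredComment</code> helper is hypothetical and is not part of the minifier’s API.</p>

```javascript
// The two Knockout patterns passed as ignoreCustomComments.
const ignoreCustomComments = [/^\s+ko/, /\/ko\s+$/];

// Hypothetical helper: true when a comment body matches any pattern
// and should therefore be kept in the output.
function isIgnoredComment(text, patterns) {
  return patterns.some((pattern) => pattern.test(text));
}

const opening = isIgnoredComment(' ko if: someExpression ', ignoreCustomComments);
const closing = isIgnoredComment(' /ko ', ignoreCustomComments);
const regular = isIgnoredComment(' just a note ', ignoreCustomComments);

console.log(opening, closing, regular); // true true false
```

<p>Note that the first pattern would also keep any comment whose text merely starts with “ko”, so stricter regexes may be worth considering for real templates.</p>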

<p>Relatedly, we’ve also added support for generic ignored comments: those starting with <code class="language-plaintext highlighter-rouge">&lt;!--!</code>. You might recognize this pattern from the de facto standard among Javascript libraries, where comments starting with <code class="language-plaintext highlighter-rouge">/*!</code> are ignored by minifiers and are often used for licenses.</p>

<p>If you’d like to ignore an <em>entire chunk</em> of markup from minification, you can now simply wrap it with <code class="language-plaintext highlighter-rouge">&lt;!-- htmlmin:ignore --&gt;</code> and it’ll stay untouched.</p>

<p>Finally, we now ignore anything surrounded by <code class="language-plaintext highlighter-rouge">&lt;%...%&gt;</code> and <code class="language-plaintext highlighter-rouge">&lt;?...?&gt;</code> which is often useful when working with server-side templates, etc.</p>

<h3 id="custom-attributes">Custom attributes</h3>

<p>Another <del>bastardization</del> twist on your regular HTML we can see in client-side MVC frameworks is non-standard attribute names, values and everything in between.</p>

<p>Example of Handlebars’ dynamic attributes:</p>

<!-- 
<pre><code>
&lt;button {{#if disabled}}disabled{{/if}}>...&lt;/button>
</code></pre>
 -->

<script src="https://gist.github.com/kangax/6d1646fb8c9ab694c9fa.js"> </script>

<p>Most of the HTML4/5 parsers will fail here, choking on <code class="language-plaintext highlighter-rouge">{</code> in <code class="language-plaintext highlighter-rouge">{{#if</code> as an invalid attribute name character.</p>

<p>We worked around this by adding support for <code class="language-plaintext highlighter-rouge">customAttrSurround</code> option, in which you can specify an array of regexes to match anything surrounding attributes:</p>

<!-- <pre><code>
minify(input, {
  customAttrSurround: [
    [ /\{\{#if\s+\w+\}\}/, /\{\{\/if\}\}/ ],
    [ /\{\{#unless\s+\w+\}\}/, /\{\{\/unless\}\}/ ]
  ]
});
</code></pre> -->

<script src="https://gist.github.com/kangax/fb3bf464cc41b92b8715.js"> </script>
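<p>Each entry in <code class="language-plaintext highlighter-rouge">customAttrSurround</code> is an opening/closing pair. The sketch below only exercises the regexes from the example on sample strings; <code class="language-plaintext highlighter-rouge">findSurroundPair</code> is a made-up helper for illustration and says nothing about how the parser applies these patterns internally.</p>

```javascript
// The [opening, closing] pairs from the customAttrSurround example.
const customAttrSurround = [
  [/\{\{#if\s+\w+\}\}/, /\{\{\/if\}\}/],
  [/\{\{#unless\s+\w+\}\}/, /\{\{\/unless\}\}/]
];

// Hypothetical helper: index of the first pair whose patterns match the
// text before and after an attribute, or -1 when no pair matches.
function findSurroundPair(before, after, pairs) {
  return pairs.findIndex(([open, close]) => (open.test(before) ? close.test(after) : false));
}

const ifPair = findSurroundPair('{{#if disabled}}', '{{/if}}', customAttrSurround);
const unlessPair = findSurroundPair('{{#unless enabled}}', '{{/unless}}', customAttrSurround);
const eachPair = findSurroundPair('{{#each items}}', '{{/each}}', customAttrSurround);

console.log(ifPair, unlessPair, eachPair); // 0 1 -1
```

<p>Any block helper not covered by a pair (the <code class="language-plaintext highlighter-rouge">each</code> case here) would need its own entry in the array.</p>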

<p>But wait, there’s more! Attribute names are not the only offenders.</p>

<p>Here’s <a href="https://github.com/Polymer/polymer/issues/650">an example</a> from <a href="http://www.polymer-project.org/">Polymer</a>; notice <code class="language-plaintext highlighter-rouge">?=</code> as the attribute assignment characters:</p>

<!-- 
<pre><code>
&lt;div flex?="{{mode !== 'cover'}}">&lt;/div>
</code></pre>
 -->

<script src="https://gist.github.com/kangax/d28913334b66d38381cf.js"> </script>

<p>Only a few days ago we added support for the <code class="language-plaintext highlighter-rouge">customAttrAssign</code> option, similar to <code class="language-plaintext highlighter-rouge">customAttrSurround</code> (thanks <a href="https://github.com/duncanbeevers">Duncan Beevers</a>!), which you can use like so:</p>

<!-- <pre><code>
minify(input, {
  customAttrAssign: [/\?=/]
});
</code></pre> -->

<script src="https://gist.github.com/kangax/6346757252e78ea7642a.js"> </script>

<h3 id="scripts-as-templates">Scripts as templates</h3>

<p>Continuing on the topic of MVC frameworks, we’ve also added support for an often-used pattern of <a href="http://ejohn.org/blog/javascript-micro-templating/">scripts-as-templates</a>:</p>

<p>AngularJS:</p>

<!-- <pre><code>
&lt;script type="text/ng-template" id="tempTest">
  &lt;div>
    &lt;span>Properly Inserted&lt;/span>
  &lt;/div>
&lt;/script>
</code></pre> -->

<script src="https://gist.github.com/kangax/32aff374cfb8fee4b13a.js"> </script>

<p>Ember.js:</p>

<!-- <pre><code>
&lt;script type="text/x-handlebars">
  Hello, &lt;strong> &lt;/strong>!
&lt;/script>
</code></pre> -->

<script src="https://gist.github.com/kangax/ea63a1f5688a155dc834.js"> </script>

<p>There’s no reason not to minify contents of such scripts, and you can now do this via the <code class="language-plaintext highlighter-rouge">processScripts</code> option:</p>

<!-- <pre><code>
minify(input, {
  collapseWhitespace: true,
  removeComments: true,
  processScripts: [ 'text/ng-template' ]
});
</code></pre> -->

<script src="https://gist.github.com/kangax/646e9ccc81608827ca5d.js"> </script>

<h3 id="jscss-minification">JS/CSS minification</h3>

<p>Now, what about “regular” scripts?</p>

<p>We decided to go a step further, providing a way to minify contents of &lt;script&gt; elements and event handler attributes (“onclick”, “onload”, etc.). This is delegated to the excellent <a href="https://github.com/mishoo/UglifyJS2">UglifyJS2</a>.</p>

<p>CSS isn’t left behind either; we can now pass contents of style elements and style attributes through <a href="https://github.com/GoalSmashers/clean-css">clean-css</a>, which happens to be the <a href="http://goalsmashers.github.io/css-minification-benchmark/">best CSS compressor at the moment</a>.</p>

<p>Both of these features are optional.</p>

<h3 id="conservative-whitespace-collapse">Conservative whitespace collapse</h3>

<p>If you’d like to play it safe and make minifier always leave at least 1 whitespace where it would otherwise completely remove it, there’s now an option for that — <code class="language-plaintext highlighter-rouge">conservativeCollapse</code>.</p>

<p>This could come in useful if your page layout/rendering depends on whitespace, such as in this example:</p>

<!-- <pre><code>
&lt;style>
  div { display: inline-block }
&lt;/style>

&lt;div>test&lt;/div> &lt;input type="checkbox">
</code></pre> -->

<script src="https://gist.github.com/kangax/24e094c8d1ae47b9ab57.js"> </script>

<p>The minifier doesn’t know that the element preceding the input is rendered as <strong>inline-block</strong>; it doesn’t know that <strong>whitespace around it is significant</strong>. Removing the whitespace would render the checkbox too close (squished) to its “label”.</p>

<p><img src="../images/inline_block.png" style="box-shadow: 1px 1px 1px rgba(0,0,0,0.5)" /></p>

<p>This is when “conservativeCollapse” (and that extra space) comes in useful.</p>

<h3 id="max-line-length">Max line length</h3>

<p>Another recently-introduced customization is maximum line length. An interesting use case is that <a href="https://github.com/kangax/html-minifier/issues/203">some email servers automatically add a new line after 1000 characters</a>, which breaks (minified) HTML. You can now specify line length to add newlines at valid breakpoints.</p>

<h3 id="benchmarks">Benchmarks</h3>

<p>We also have a benchmark suite now that goes over a number of “source” files (front pages of popular websites), minifies them, then reports size comparison and time spent on minification.</p>

<p><img src="../images/minifier_benchmarks.png" /></p>

<p>How does HTMLMinifier compare <sup><a href="#benchmarks">[1]</a></sup> to the other solutions out there (<a href="http://www.willpeavy.com/minifier/">Will Peavy’s online minifier</a> and a Java-based <a href="http://htmlcompressor.com">HTMLCompressor</a>)?</p>

<table style="border-spacing: 10px; border-collapse: separate;">
<thead><tr>
<th>Site</th>
<th align="center">Original size <em>(KB)</em>
</th>
<th align="right">HTMLMinifier <em>(KB)</em>
</th>
<th align="right">Will Peavy <em>(KB)</em>
</th>
<th align="right">htmlcompressor.com <em>(KB)</em>
</th>
</tr></thead>
<tbody>
<tr>
<td><a href="https://github.com/kangax/html-minifier">HTMLMinifier page</a></td>
<td align="center">48.8</td>
<td align="right"><b>37.3</b></td>
<td align="right">43.3</td>
<td align="right">41.9</td>
</tr>
<tr>
<td><a href="http://kangax.github.io/es5-compat-table/es6/">ES6 table</a></td>
<td align="center">117.9</td>
<td align="right"><b>79.9</b></td>
<td align="right">92</td>
<td align="right">91.9</td>
</tr>
<tr>
<td><a href="http://msn.com">MSN</a></td>
<td align="center">156.6</td>
<td align="right"><b>133</b></td>
<td align="right">145</td>
<td align="right">138.3</td>
</tr>
<tr>
<td><a href="http://stackoverflow.com">Stackoverflow</a></td>
<td align="center">200.4</td>
<td align="right"><b>159.5</b></td>
<td align="right">168.3</td>
<td align="right">163.3</td>
</tr>
<tr>
<td><a href="http://amazon.com">Amazon</a></td>
<td align="center">245.9</td>
<td align="right"><b>206.3</b></td>
<td align="right">225</td>
<td align="right">218.5</td>
</tr>
<tr>
<td><a href="http://en.wikipedia.org/wiki/President_of_the_United_States">Wikipedia</a></td>
<td align="center">401.4</td>
<td align="right"><b>380.6</b></td>
<td align="right">396.3</td>
<td align="right">n/a</td>
</tr>
<tr>
<td><a href="http://eloquentjavascript.net/print.html">Eloquent Javascript</a></td>
<td align="center">869.5</td>
<td align="right"><b>830</b></td>
<td align="right">872</td>
<td align="right">n/a</td>
</tr>
</tbody>
</table>

<p>Not too bad!</p>

<p>Notice remarkable savings (~40KB) on large static files such as a one-page <a href="http://eloquentjavascript.net/print.html">Eloquent Javascript</a>.</p>
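<p>For a quick look at both absolute and relative savings, the HTMLMinifier column of the table can be run through a few lines of JavaScript. The numbers are copied from the table; the <code class="language-plaintext highlighter-rouge">savingsPercent</code> helper and the report shape are just illustrative choices.</p>

```javascript
// Original and HTMLMinifier sizes (KB), copied from the table above.
const results = [
  { site: 'HTMLMinifier page', original: 48.8, minified: 37.3 },
  { site: 'ES6 table', original: 117.9, minified: 79.9 },
  { site: 'MSN', original: 156.6, minified: 133 },
  { site: 'Stackoverflow', original: 200.4, minified: 159.5 },
  { site: 'Amazon', original: 245.9, minified: 206.3 },
  { site: 'Wikipedia', original: 401.4, minified: 380.6 },
  { site: 'Eloquent Javascript', original: 869.5, minified: 830 }
];

// Share of the original size that minification removed, in percent.
function savingsPercent(original, minified) {
  return ((original - minified) / original) * 100;
}

// Round both figures to one decimal place for display.
const report = results.map(({ site, original, minified }) => ({
  site,
  savedKB: (original - minified).toFixed(1),
  savedPercent: savingsPercent(original, minified).toFixed(1)
}));

for (const row of report) {
  console.log(row.site + ': ' + row.savedKB + 'KB (' + row.savedPercent + '%)');
}
```

<p>Seen this way, the largest files give the biggest savings in KB, while smaller, markup-heavy pages tend to shrink more in relative terms.</p>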

<h3 id="future-plans">Future plans</h3>

<p>Minifier has come a long way, but there’s always room for improvement.</p>

<p>There are a few more bugs to squash and a few features to add. I also believe there are more optimizations we could perform to get the best savings, whether it’s reordering attributes to aid gzip compression or more aggressive content removal (spaces, attributes, values, etc.).</p>

<p>One concern I have is how long it takes to minify large (500KB+) files. While it’s unlikely that someone uses minifier in real-time (rather, as a one time compilation step) it’s still unacceptable for minification to take more than 1-2 minutes. This is something we could try fixing in the future.</p>

<p>We can also monitor performance stats — both size (as well as gzipped?) and time taken — on each commit, to get a good picture of whether things change for the better or worse.</p>

<p>As always, I welcome you to try minifier in your projects, report any bugs/suggestions, and help with whatever you can. Huge thanks go to <a href="https://github.com/kangax/html-minifier/graphs/contributors">all the contributors</a> without whom we wouldn’t have come this far!</p>

<p><sup id="benchmarks">[1] Benchmarks performed on OS X 10.9.4 (2.3GHz Core i7).</sup></p>]]></content><author><name></name></author><category term="html" /><summary type="html"><![CDATA[HTML minifier revisited]]></summary></entry><entry><title type="html">JSCritic</title><link href="http://perfectionkills.com/jscritic/" rel="alternate" type="text/html" title="JSCritic" /><published>2014-03-27T00:00:00+00:00</published><updated>2014-03-27T00:00:00+00:00</updated><id>http://perfectionkills.com/jscritic</id><content type="html" xml:base="http://perfectionkills.com/jscritic/"><![CDATA[<h1 id="jscritic">JSCritic</h1>

<p>Choosing a good piece of Javascript is hard.</p>

<p>Every time I come across a newly-released, shiny plugin or library, I wonder what’s going on underneath. Yes, it looks pretty and convenient but what does underlying code look like? Does it browser sniff or extend the DOM? Does it pollute global scope? What about compatibility with older browsers; could it be that it utilizes, say, ES5 getters/setters making it unusable in IE&lt;9?</p>

<p>I always wished there was a way to <b>quickly check</b> how well a certain script behaves. Not like we did <a href="https://groups.google.com/forum/?hl=en#!msg/comp.lang.javascript/PZDouKgwFGI/XKd8LYURyzcJ">back in the days</a>.</p>

<p>The best tool for a code quality test like this is undoubtedly JSHint <sup><a href="#jshint">[1]</a></sup>. It can answer most of those questions and many more. Unfortunately, the “many more” part is a bit of a problem. Plugging script code into <a href="http://jshint.com">jshint.com</a> usually yields tons of issues, not just with browser compatibility or global variables but also code style. These checks are a must for your own scripts, but for 3rd party code, I don’t really care about missing semicolons (despite my love of them), whether constructors begin with uppercase, or if assignments happen in conditional statements. I only wish to know how well a script behaves <b>on the outside</b>. Now, a sloppy code style can certainly be an indication of poor script quality overall. But more often than not it’s a preference, not a problem.</p>

<p>A few days ago, I decided to hack something together; something simple, that would allow me to quickly plug in a script and see the big picture.</p>

<p>So I made <a href="http://jscritic.com">JSCritic</a>.</p>

<p>Plug in script code and it answers some of the more pressing questions.</p>

<p><a href="http://jscritic.com">
  <img src="/images/jscritic.png" style="width: 850px" />
</a></p>

<p>I tried using <a href="http://esprima.org">Esprima</a> at first, but quickly realized that most of the checks I care about are already in JSHint. So why not piggy back on that? <a href="http://github.com/kangax/jscritic">JSCritic</a> turned out to be a simple wrapper on top of it. I originally wrote it in Node, to be able to pass it filename and quickly see the results, then ported it to run in a browser.</p>

<!-- <pre lang="shell"><code>
> node jscritic.js fabric.js

- Does it browser sniff?              Nope

- Does it extend native objects?      Yep (String)

- Does it use `document.write`?       Nope

- Does it use eval?                   Yep
    eval("var callback =" + js);

- Does it use ES6 features?           Nope

- Does it use Mozilla-only features?  Nope

- Does it have IE incompatibilities?  Yep (Extra comma, get/set are ES5 features)

- How many global variables?          9 (line, column, GSS, GSS_CONFIG, selector, type, callback, c, ShadowDOMPolyfill)

- How many unused variables?          47 (require, exports, escape, idPrefix, offset, flatten, _varsCache, statements, result, id, _id1, _id2, s, connector, module, props, vflFooter, col, heights, k, h, io, c, setVariable, coeff, expr, medium, strong, required, match, e, vars, tracker, exp, op, w, root, e2, e1, namesssss, self, _this, bridgessssss, names, trackersss, ifffff, node)

Total size:                           872.99KB
Minified size:                        250.75KB
</code></pre> -->

<script src="https://gist.github.com/kangax/26e20fb726cbcbd27087.js"> </script>

<p>You can still run it in both.</p>

<p>Another thing I wanted to see is <b>minified script size</b>. Some plugins have minified versions, some don’t, some use better minifiers, some worse. I decided to minify content through <a href="https://github.com/mishoo/UglifyJS2">UglifyJS</a> — a de facto standard of minification at the moment — to get an objective overview of code size. Unfortunately, browser version of UglifyJS seems to be choking more often than Node one, so it might be safer to use the latter.</p>

<p>I have to say that JSCritic is more of a prototype at the moment. Static analysis has its limitations, as does JSHint. I haven’t had much time to polish it, but I’m hoping to improve it in the near future, or with the help of ever-awesome contributors. One thing to emphasize is that for best results you should <b>use non-minified source code</b> (you’ll see exactly why below)!</p>

<p>If you want to know more about tests, implementation details, and drawbacks, read on. Otherwise, hope you find it as useful as I do.</p>

<h3 id="globals">Global variables</h3>

<p>Let’s first take a look at global variables detection. Unfortunately, it seems to be very simplistic in JSHint, failing to catch cases other than plain variable/function declarations in top level code.</p>

<!-- <pre lang="javascript"><code>
var foo = 1;

function bar() {
  function baz () { }
  qux = 123;
}
</code></pre> -->
<script src="https://gist.github.com/kangax/b7ee9179f1648a2282a2.js"> </script>

<p>Here it catches <code>foo</code>, <code>bar</code>, and <code>qux</code> as expected, but fails with all of these:</p>

<!-- <pre lang="javascript"><code>
(function(){ window.foo = 1; })();
(function(){ this.foo = 1; })();
(function(){ self.foo = 1; })();
(function(){ var global = this; global.foo = 1; })();
(function(){ var global = this; global.foo = 1; }).call(this);
</code></pre> -->

<script src="https://gist.github.com/kangax/18e89bb4cba0b86e6409.js"> </script>

<p>Granted, detecting globals via static analysis is hard. A more robust solution would be to actually execute code and compare global object “signature”, just like I did in <a href="http://perfectionkills.com/detecting-global-variable-leaks/">detect-global bookmarklet</a> back in 2009 (based on a script by Remy Sharp). Unfortunately, executing script is also not always easy, and global properties could be exported from various places (e.g. methods that need to be called explicitly); we have no idea which places those are.</p>
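<p>The “signature” comparison can be sketched in a few lines. This is only an illustration of the idea, not the code of the bookmarklet: <code class="language-plaintext highlighter-rouge">findAddedProperties</code> and <code class="language-plaintext highlighter-rouge">leakyScript</code> are hypothetical names, and <code class="language-plaintext highlighter-rouge">globalThis</code> is used here in place of <code class="language-plaintext highlighter-rouge">window</code> so the sketch also runs outside a browser.</p>

```javascript
// Return the property names that `run` added to `target`.
function findAddedProperties(target, run) {
  const before = new Set(Object.getOwnPropertyNames(target));
  run();
  return Object.getOwnPropertyNames(target).filter((name) => !before.has(name));
}

// Hypothetical "script" that leaks a global from inside a function,
// similar to the cases static analysis misses.
function leakyScript() {
  (function () { globalThis.leakedFoo = 1; })();
}

const leaked = findAddedProperties(globalThis, leakyScript);
console.log(leaked); // the list includes 'leakedFoo'

// Clean up so the demonstration itself does not leave a global behind.
delete globalThis.leakedFoo;
```

<p>The limitation mentioned above still applies: a global that is only exported when some method is called later will not show up in the diff.</p>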

<p>Still, JSHint catches a good number of globals and accidental leaks like these:</p>

<!-- <pre lang="javascript"><code>
var foo = 1;
    bar = 2;
</code></pre> -->

<script src="https://gist.github.com/kangax/bbb236b2fe1bc427d75b.js"> </script>

<p>It gives a decent overview, but you should still look through variables carefully as some of them might be false positives. I’m hoping this will be made more robust in the future JSHint versions (or we could try using hybrid detection method — both via static analysis and through global object signature).</p>

<h3 id="natives">Extended natives</h3>

<p>Detecting native object extensions has a few limitations as well. While it catches both Array and String in an example like this:</p>

<!-- <pre lang="javascript"><code>
(function(){
  Array.prototype.foo = function(){ };
  String.prototype.bar = 123;
})();
</code></pre> -->

<script src="https://gist.github.com/kangax/d272abd2b954d29e844b.js"> </script>

<p>..it fails with all of these:</p>

<!-- <pre lang="javascript"><code>
(function(s) {

  Object.myKeys = function(){ };

  var proto = String.prototype;
  proto.bar = 123;

  Array['prototype'].foo = 'xyz';

  s.prototype.blah = 'blah';

})(String);
</code></pre> -->

<script src="https://gist.github.com/kangax/caa7e3cce423f958b1a2.js"> </script>

<p>As you can see, it’s also simplistic and could have false negatives. There’s an <a href="https://github.com/jshint/jshint/issues/1316">open JSHint issue</a> for this.</p>

<h3 id="eval">eval &amp; document.write</h3>

<p>Just like with other checks, there are false positives and false negatives. Here are some of them, just to give an idea of what to expect:</p>

<!-- <pre lang="javascript"><code>
/* false negative

    Issues: https://github.com/jshint/jshint/issues/738
            https://github.com/jshint/jshint/issues/1204

*/
schemaEvaluator.eval(experimentId, schema);
</code></pre> -->

<script src="https://gist.github.com/kangax/6a60374ada10d4b95b0d.js"> </script>

<p>and with <code>document.write</code>:</p>

<!-- <pre lang="javascript"><code>
(function(d) {

  // catches
  document.write(1);

  // doesn't catch
  d.write(1);
  document['write'](1);

})(document);
</code></pre> -->

<script src="https://gist.github.com/kangax/f7be5ebdbad1e1b1c409.js"> </script>

<h3 id="compatibility">Browser compatibility</h3>

<p>I included 3 checks for browser/engine compatibility — Mozilla-only extensions (let expressions, <a href="/a-closer-look-at-expression-closures">expression closures</a>, multiple catch blocks, etc.), things IE chokes on (e.g. extra comma), and ES6 additions (array comprehensions, generators, imports, etc.). All of these things could affect cross-browser support.</p>

<h3 id="sniffing">Browser sniffing</h3>

<p>To detect browser sniffing, we first check statically for an occurrence of the <code class="language-plaintext highlighter-rouge">navigator</code> implied global (via JSHint), then check the source for an occurrence of <code class="language-plaintext highlighter-rouge">navigator.userAgent</code>. This covers a lot of cases, but obviously won’t catch any obscurities, so be careful. To make things easier, a chunk of code surrounding <code class="language-plaintext highlighter-rouge">navigator.userAgent</code> is pasted for inspection purposes. You can quickly check what it’s there for (is it for non-critical enhancement purposes or could it cause subtle bugs and/or full breakage?)</p>
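<p>The “chunk of surrounding code” step could be sketched roughly like this. <code class="language-plaintext highlighter-rouge">findContexts</code> and its <code class="language-plaintext highlighter-rouge">radius</code> parameter are hypothetical names chosen for illustration, and the sample source is made up; the real implementation may well differ.</p>

```javascript
// Hypothetical helper: return snippets of `source` surrounding each
// occurrence of `needle`, with `radius` characters of context on each side.
function findContexts(source, needle, radius) {
  var snippets = [];
  var index = source.indexOf(needle);
  while (index !== -1) {
    var start = Math.max(0, index - radius);
    var end = Math.min(source.length, index + needle.length + radius);
    snippets.push(source.slice(start, end));
    index = source.indexOf(needle, index + needle.length);
  }
  return snippets;
}

// Made-up input resembling a script that sniffs the user agent.
var source = [
  'var isIE = /MSIE/.test(navigator.userAgent);',
  'if (isIE) { applyWorkaround(); }'
].join('\n');

var snippets = findContexts(source, 'navigator.userAgent', 20);
console.log(snippets);
```

<p>A plain substring search like this shares the blind spots mentioned above: it will not notice <code class="language-plaintext highlighter-rouge">navigator['userAgent']</code> or an aliased <code class="language-plaintext highlighter-rouge">navigator</code>.</p>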

<h3 id="unused">Unused variables</h3>

<p>Finally, I included the unused variables check from JSHint. While not exactly an indication of external script behavior, seeing lots of those could be an indication of sloppy (and potentially buggy) code. I put it all the way at the end, as this is the least important check.</p>

<p>So there it is. The set of rules can definitely be made larger (does it use ES5 features? does it use browser-sniffing-like inference? does it extend the DOM?) and more accurate in the future. For now you can use JSCritic as a quick first look under the hood.</p>

<p class="footnote" id="jshint">
  <sup>[1]</sup> and perhaps ESLint, but I haven't had a chance to look into it.
</p>]]></content><author><name></name></author><category term="js" /><summary type="html"><![CDATA[JSCritic]]></summary></entry><entry><title type="html">State of function decompilation in Javascript</title><link href="http://perfectionkills.com/state-of-function-decompilation-in-javascript/" rel="alternate" type="text/html" title="State of function decompilation in Javascript" /><published>2014-01-14T00:00:00+00:00</published><updated>2014-01-14T00:00:00+00:00</updated><id>http://perfectionkills.com/state-of-function-decompilation-in-javascript</id><content type="html" xml:base="http://perfectionkills.com/state-of-function-decompilation-in-javascript/"><![CDATA[<h1 id="state-of-function-decompilation-in-javascript">State of function decompilation in Javascript</h1>

<p><img src="/images/decompilation2.png" style="box-shadow:rgba(0,0,0,0.5) 1px 1px 1px" /></p>

<p>It’s always fun to see something described as “magic” in the Javascript world.</p>

<p>One such example I came across recently was AngularJS <a href="http://en.wikipedia.org/wiki/Dependency_injection">dependency injection</a> mechanism. I’ve never been familiar with the concept, but seeing it in practice looked clever and convenient. Not very magical though.</p>

<p>What is it about? In short: defining required “modules” via function parameters. Like so:</p>

<!-- ```
angular.module('App', [ ])
  .controller('Ctrl', function($scope, $timeout, $http) {
    ...
  });
``` -->

<script src="https://gist.github.com/kangax/e627faf9aeeb05a93498.js"> </script>

<p>Notice the <code class="language-plaintext highlighter-rouge">$scope</code>, <code class="language-plaintext highlighter-rouge">$timeout</code>, <code class="language-plaintext highlighter-rouge">$http</code> identifiers.</p>

<p>Aha. So instead of passing them as strings or vars or whatever, they’re defined as <em>part of the source</em>. And of course to “read” the source there could only be one thing involved…</p>

<p><a href="https://github.com/angular/angular.js/blob/dde1b2949727c297e214c99960141bfad438d7a4/src/auto/injector.js#L63-L96">Function decompilation</a>.</p>

<p>The kind that we used in Prototype.js <a href="http://prototypejs.org/2007/08/15/prototype-1-6-0-release-candidate/">to implement $super</a> back in 2007? Yep, that one. Later making its way to Resig’s <a href="http://ejohn.org/blog/simple-javascript-inheritance/">simple inheritance</a> (used in a <em>safe</em> fashion) and other places.</p>

<p>Seeing a modern framework like Angular use function decompilation surprised me. Even though it wasn’t something Angular <a href="http://docs.angularjs.org/guide/di#dependency-injection_dependency-annotation_-annotation">relied</a> <a href="http://docs.angularjs.org/guide/di#dependency-injection_dependency-annotation_inline-annotation">on</a> <em>exclusively</em>, this black magic has been somewhat frowned upon for years. I <a href="/those-tricky-functions/">wrote about some of the problems</a> associated with it back in 2009.</p>

<p>Something so <strong>inherently non-standard</strong> and so <strong>varying among implementations</strong> could only be compared to user agent sniffing.</p>

<p>But is it, really? Could it be that things are not nearly as bad <em>these days</em>? I last investigated this 4 years ago — a significant chunk of time. Could it be that implementations came to some kind of unification, when it comes to function string representation? Am I completely outdated?</p>

<p>Curious, I decided to take a look at the current state of affairs. Could function decompilation be relied on right now? What exactly could we rely on?</p>

<p>But first..</p>

<h2 id="theory">Theory</h2>

<p>Put simply, function decompilation is the process of accessing function code as a string (then parsing its body or extracting arguments or whatever).</p>

<p>In Javascript, this is done via <code class="language-plaintext highlighter-rouge">toString()</code> of function objects, so <code class="language-plaintext highlighter-rouge">fn.toString()</code> or <code class="language-plaintext highlighter-rouge">String(fn)</code> or <code class="language-plaintext highlighter-rouge">fn + ''</code> or anything else that delegates to <code class="language-plaintext highlighter-rouge">Function.prototype.toString</code>.</p>
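<p>A quick illustration that these forms all end up in the same place, assuming the function has no <code class="language-plaintext highlighter-rouge">toString</code> of its own:</p>

```javascript
function foo(x, y) {
  return x + y;
}

var viaMethod = foo.toString();
var viaString = String(foo);
var viaConcat = foo + '';

console.log(viaMethod === viaString); // true
console.log(viaString === viaConcat); // true
console.log(typeof viaMethod);        // 'string'
```

<p>What the string actually contains is a separate question, and that is where engines start to disagree.</p>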

<p>The reason this is deemed unreliable in Javascript is due to its <strong>non-standard nature</strong>. A <a href="http://es5.github.io/#x15.3.4.2">famous quote from ES5 spec</a> states:</p>

<blockquote>
  <p>15.3.4.2 <strong>Function.prototype.toString( )</strong></p>

  <p>An implementation-dependent representation of the function is returned. This representation has the syntax of a FunctionDeclaration. Note in particular that the use and placement of white space, line terminators, and semicolons within the representation String is implementation-dependent.</p>
</blockquote>

<p>Of course when something is <strong>implementation-dependent</strong>, it’s bound to deviate in all kinds of ways imaginable.</p>

<h2 id="practice">Practice</h2>

<p>..and it does. You would think that a function like this:</p>

<!-- ```
function foo(x, y) {
  return x + y;
}
``` -->

<script src="https://gist.github.com/kangax/e0642a97f08df1b380c1.js"> </script>

<p>.. would be serialized to a string like this:</p>

<!-- ```
"function foo(x, y) {\n  return x + y;\n }"
``` -->

<script src="https://gist.github.com/kangax/f4d65ddfd4789982d45b.js"> </script>

<p>And it almost does. Except when some engines omit newlines. And others omit comments. And others omit “dead code”. And others include comments around (!) function. And others hide source completely…</p>

<p>Back in the day, things were <em>really</em> bad. Safari &lt;=2.x, for example, didn’t even conform to valid Function Declaration syntax. It would go wild with things like “<strong>(Internal Function)</strong>” or “<strong>[function]</strong>” or drop identifiers from <a href="http://kangax.github.io/nfe/">NFEs</a>, just because.</p>

<p>Back in the day, some of the <em>mobile</em> browsers (Blackberry, Opera Turbo) hid the code completely (replacing it with a polite “<strong>/* source code not available */</strong>” comment or <a href="https://prototype.lighthouseapp.com/projects/8886/tickets/537-ajax-functionality-on-opera-mobile">similar</a>), supposedly to “save” on memory. A fair optimization.</p>

<h2 id="modern-days">Modern days</h2>

<p>But what about today? Surely, things must have gotten better. There’s a convergence of engines, domination of relatively sane WebKit, lots of standardization, and a tremendous increase in engine performance.</p>

<p>And indeed, things are looking good. But it’s not nearly all nice and peachy yet, and there’s more “fun” on the horizon.</p>

<p><a href="http://kangax.github.io/jstests/function-decompilation/">
  <img src="/images/decompilation.png" style="width: 100%" />
</a></p>

<p>I made <a href="http://kangax.github.io/jstests/function-decompilation/">a simple test page</a>, checking various cases of functions and their string representations. Then I tested it on desktop browsers, including pretty “old” ones (IE6+, FF3+, Safari4+, Opera 9.6+, Chrome), as well as a <a href="http://www.browserstack.com/screenshots/bfc89b1d22472a5a2c25626c9c99ade9084b235b">slew of mobiles</a>, and looked at common patterns.</p>

<h3 id="decompilation-purpose">Decompilation purpose</h3>

<p>It’s important to understand <strong>different purposes</strong> of function decompilation in Javascript.</p>

<p>Serializing native, built-in functions is different from serializing user-defined ones. In case of Angular, for example, we’re talking about <strong>user-defined functions</strong>, so we don’t have to concern ourselves with the way native functions are serialized. Moreover, if we’re talking about <strong>retrieving arguments only</strong>, there are definitely fewer deviations to deal with than if we wanted to “parse” the source code.</p>

<p>Some things are more reliable; others — less so.</p>
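<p>To make the “retrieving arguments only” case concrete, here is a minimal sketch. <code class="language-plaintext highlighter-rouge">getParamNames</code> is a hypothetical helper and deliberately simplistic; it is not Angular’s actual implementation, and it assumes the classic <code class="language-plaintext highlighter-rouge">function (...) { }</code> shape with no comments or default values in the parameter list.</p>

```javascript
// Hypothetical helper, not Angular's actual implementation: read parameter
// names out of a classic `function (...) { ... }` string representation.
function getParamNames(fn) {
  var source = Function.prototype.toString.call(fn);
  var match = source.match(/^function\s*[^(]*\(([^)]*)\)/);
  if (!match) {
    return null; // not a classic function form (arrow, method, class, ...)
  }
  return match[1]
    .split(',')
    .map(function (name) { return name.trim(); })
    .filter(function (name) { return name.length > 0; });
}

var names = getParamNames(function ($scope, $timeout, $http) {});
console.log(names); // [ '$scope', '$timeout', '$http' ]
```

<p>Only the text between the first pair of parentheses is inspected, which is why whitespace differences in the function body don’t matter much for this particular use.</p>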

<h3 id="user-defined-functions">User-defined functions</h3>

<p>When it comes to user-defined functions, things are pretty uniform.</p>

<p>The exceptions are oddball and dying environments like IE&lt;9, which sometimes includes comments (and even parens) around functions in their string representation, and Konqueror, which omits function body brackets from <code class="language-plaintext highlighter-rouge">new Function</code>-generated functions.</p>

<p>Most of the deviations are in <strong>whitespace</strong> (and newlines). Some browsers (e.g. Firefox &lt;17) strip all comments from source code, and remove “dead”, unreachable code.</p>

<p>But don’t get too excited, as we’ll talk about what the future holds in just a bit…</p>

<h3 id="function-constructor">Function constructor</h3>

<p>Things are also a bit hectic in <strong>generated functions</strong> (using <code class="language-plaintext highlighter-rouge">new Function(...)</code>), but not by much. While most of the engines create a function with an “anonymous” identifier, the spacing and newlines are inconsistent. Chrome also inserts an extra comment after the parameter list (an extra comment never hurts, right?).</p>

<!-- ```
new Function('x, y', 'return x + y')
``` -->

<script src="https://gist.github.com/kangax/5350c77fef6044ade46e.js"> </script>

<p>becomes:</p>

<!-- ```
function anonymous(x, y
/**/) {
return x + y
}
``` -->

<script src="https://gist.github.com/kangax/1ded5df3de37726acaff.js"> </script>

<h3 id="bound-functions">Bound functions</h3>

<p>Every single supporting engine that I’ve tested represents bound (via <code class="language-plaintext highlighter-rouge">Function.prototype.bind</code>) functions the same way as native functions. Yes, that means bound functions <strong>“lose” their source</strong> from string representation.</p>

<!-- ```
function () { [native code] }
``` -->

<script src="https://gist.github.com/kangax/e4cdbd5ef242d6e9c62c.js"> </script>

<p>Arguably this is a reasonable thing to do; although a bit of a “wat?” when you first see it — why not use <em>“[bound code]”</em> instead?</p>

<p>Curiously, some engines (e.g. latest WebKit) <em>preserve function’s original identifier</em> and some don’t.</p>
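<p>A small sketch to try this in your own engine. The function and values are made up for illustration; whether the original identifier shows up in the bound representation depends on the engine, so the example only checks for the body and the native marker.</p>

```javascript
function greet(name) {
  return 'Hello, ' + name;
}

var bound = greet.bind(null, 'world');

console.log(bound()); // 'Hello, world'

// The original function exposes its body...
console.log(String(greet).indexOf('Hello') !== -1); // true

// ...while the bound one looks like a native function.
console.log(String(bound).indexOf('Hello') !== -1);
console.log(String(bound).indexOf('[native code]') !== -1);
```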

<h3 id="non-standard">Non-standard</h3>

<p>What about non-standard extensions? Like <a href="http://perfectionkills.com/a-closer-look-at-expression-closures/">Mozilla’s expression closures</a>.</p>

<!-- ```
var expressionClosure = function(x, y) x + y
``` -->

<script src="https://gist.github.com/kangax/cba6aa25148dfa356aa0.js"> </script>

<p>Yep, those are still represented as they’re written; without function body brackets (technically, a violation of Function Declaration syntax, which <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Function/toString">MDN page on Function.prototype.toString</a> doesn’t even mention; something to fix!).</p>

<h3 id="es6-additions">ES6 additions</h3>

<p>I was almost done writing a test case, when a sudden thought crossed my mind. Hold on a second… What about <a href="http://kangax.github.io/es5-compat-table/es6/">EcmaScript 6</a>?!</p>

<p>All those new additions to the language; new syntax that changes the way functions look — classes, generators, rest params, default params, arrow functions. Won’t they affect function representation as well?</p>

<p>A quick test shed some light: they do. Of course. Firefox 24+, leading the ES6 brigade, reveals the string representation of these new constructs:</p>

<!-- ```
// Arrow functions
var fn = () => 5; // "() => 5"

// Rest params
function fn(...args) { } // "function fn(...args) { }"

// Default params
function fn(foo=1) { } // "function fn(foo=1) { }"

// Generators
(function *(){ yield 1 }); // "function *() { yield 1 }"
``` -->

<script src="https://gist.github.com/kangax/708a49da3f898f91da1e.js"> </script>

<p>Examining ES6 spec <a href="https://people.mozilla.org/~jorendorff/es6-draft.html#sec-function.prototype.tostring">confirms this further</a>:</p>

<blockquote>
  <p>An implementation-dependent String source code representation of the this object is returned. This representation has the syntax of a FunctionDeclaration, FunctionExpression, GeneratorDeclaration, GeneratorExpession, ClassDeclaration, ClassExpression, ArrowFunction, MethodDefinition, or GeneratorMethod depending upon the actual characteristics of the object. In particular that the use and placement of white space, line terminators, and semicolons within the representation String is implementation-dependent.</p>
</blockquote>

<blockquote>
  <p>If the object was defined using ECMAScript code and the returned string representation is in the form of a FunctionDeclaration FunctionExpression, GeneratorDeclaration, GeneratorExpession, ClassDeclaration, ClassExpression, or ArrowFunction then the representation must be such that if the string is evaluated, using eval in a lexical context that is equivalent to the lexical context used to create the original object, it will result in a new functionally equivalent object. The returned source code must not mention freely any variables that were not mentioned freely by the original function’s source code, even if these “extra” names were originally in scope. If the source code string does meet these criteria then it must be a string for which eval will throw a SyntaxError exception.</p>
</blockquote>

<p>Notice how ES6 still leaves function representation <strong>implementation-dependent</strong> although clarifying that it no longer conforms to <em>just</em> FunctionDeclaration syntax. Also notice an interesting additional requirement — “returned source code must not mention freely any variables that were not mentioned freely by the original function’s source code” (bonus points if you understood this in less than 7 tries).</p>

<p>I’m unclear on how this will affect future engines and their representation. But one thing is certain. With the rise of ES6, function representation is no longer just an optional identifier followed by parameters and function body. There’s a <strong>whole lot of new stuff</strong> coming.</p>

<p>Regexes will, once again, have to be updated to account for all the changes (did I say it’s similar to UA sniffing? hint, hint).</p>
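<p>To illustrate, here is a sketch of a pattern written with only the old function shape in mind. The regex is a made-up example, not one taken from any particular library, and the results assume an engine that returns the newer forms as they were written.</p>

```javascript
// A pattern that assumes the pre-ES6 shape: `function name(params) {`.
var CLASSIC_FUNCTION = /^function\s*[^(]*\(([^)]*)\)/;

var classic = function (x, y) { return x + y; };
var arrow = (x, y) => x + y;
var method = ({ add(x, y) { return x + y; } }).add;

console.log(CLASSIC_FUNCTION.test(String(classic))); // true
console.log(CLASSIC_FUNCTION.test(String(arrow)));   // false
console.log(CLASSIC_FUNCTION.test(String(method)));  // false
```

<p>All three are perfectly callable functions taking <code class="language-plaintext highlighter-rouge">x</code> and <code class="language-plaintext highlighter-rouge">y</code>, yet only the first one is visible to code that expects the representation to start with <code class="language-plaintext highlighter-rouge">function</code>.</p>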

<h3 id="minifiers--preprocessors">Minifiers &amp; Preprocessors</h3>

<p>I should also mention a couple of old chestnuts that never quite sit well with function decompilation: minifiers and <a href="https://github.com/jashkenas/coffee-script/wiki/List-of-languages-that-compile-to-JS">preprocessors</a>.</p>

<p>Minifiers like UglifyJS, and preprocessors/compilers like <a href="https://code.google.com/p/google-caja/">Caja</a> tend to tweak the hell out of source code and rename parameters. This is why Angular’s dependency injection <a href="http://docs.angularjs.org/tutorial/step_05#controller_a-note-on-minification">doesn’t work with minifiers</a> unless <a href="https://github.com/btford/ngmin">alternative methods</a> are used.</p>

<p>Perhaps not a big deal, but still a relevant issue and definitely something to keep in mind.</p>
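<p>Here is a toy demonstration of the problem and of the array-style workaround. Everything in it (<code class="language-plaintext highlighter-rouge">services</code>, <code class="language-plaintext highlighter-rouge">paramNames</code>, <code class="language-plaintext highlighter-rouge">inject</code>) is hypothetical and only loosely modeled on the idea of inline annotation; it is not Angular’s injector.</p>

```javascript
// Hypothetical mini-injector, for illustration only.
var services = { $http: 'http service', $timeout: 'timeout service' };

function paramNames(fn) {
  var match = String(fn).match(/^function\s*[^(]*\(([^)]*)\)/);
  var list = match ? match[1] : '';
  return list.split(',').map(function (s) { return s.trim(); }).filter(Boolean);
}

// Accepts either a bare function (names inferred from its source) or an
// array of explicit names followed by the function.
function inject(target) {
  var isAnnotated = Array.isArray(target);
  var fn = isAnnotated ? target[target.length - 1] : target;
  var names = isAnnotated ? target.slice(0, -1) : paramNames(fn);
  return fn.apply(null, names.map(function (name) { return services[name]; }));
}

// As written by the author: inference works.
var original = inject(function ($http, $timeout) { return [$http, $timeout]; });

// After a minifier renames parameters: inference looks up "a" and "b".
var minified = inject(function (a, b) { return [a, b]; });

// Explicit string names survive renaming, since strings are left alone.
var annotated = inject(['$http', '$timeout', function (a, b) { return [a, b]; }]);

console.log(original);  // [ 'http service', 'timeout service' ]
console.log(minified);  // [ undefined, undefined ]
console.log(annotated); // [ 'http service', 'timeout service' ]
```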

<h2 id="tldr--conclusions">TL;DR &amp; Conclusions</h2>

<p>To sum things up: it appears that function decompilation is becoming safer but, depending on your parsing needs, it might still be unwise to <em>rely on it exclusively</em>.</p>

<p>Thinking of using it in your app/library?</p>

<p>Remember that:</p>

<ul>
  <li>It’s still <strong>not standard</strong></li>
  <li><strong>User-defined functions</strong> are generally looking sane</li>
  <li>There are <strong>oddball engines</strong> (especially when it comes to <a href="http://kangax.github.io/jstests/function-decompilation/">source code placement, whitespaces, comments, dead code</a>)</li>
  <li>There might be <strong>future oddball engines</strong> (particularly mobile or <em>unusual</em> devices with conservative memory/power consumption)</li>
  <li><strong>Bound functions</strong> don’t show their original source (but do preserve identifier… <em>sometimes</em>)</li>
  <li>You could run into <strong>non-standard extensions</strong> (like Mozilla’s expression closures)</li>
  <li><del>Winter</del> <strong>ES6 is coming</strong>, and functions can now look <em>very</em> different than they used to</li>
  <li><strong>Minifiers/preprocessors</strong> are not your friend</li>
</ul>

<p>P.S. Functions with overwritten <code class="language-plaintext highlighter-rouge">toString</code> methods and/or <code class="language-plaintext highlighter-rouge">Proxy.createFunction</code> are a different kind of beast; we can consider those a special case that would require a special consideration.</p>
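<p>As a small illustration of the first case (the function and the replacement string are made up), an own <code class="language-plaintext highlighter-rouge">toString</code> changes what <code class="language-plaintext highlighter-rouge">String(fn)</code> reports, while calling <code class="language-plaintext highlighter-rouge">Function.prototype.toString</code> directly still reaches the engine’s representation:</p>

```javascript
function add(x, y) {
  return x + y;
}

// An own `toString` shadows the inherited one.
add.toString = function () {
  return 'nothing to see here';
};

console.log(String(add)); // 'nothing to see here'

// Bypass the override by borrowing the built-in method.
var real = Function.prototype.toString.call(add);
console.log(real.indexOf('return x + y') !== -1); // true
```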

<p>Special thanks to <a href="http://webreflection.blogspot.com/">Andrea Giammarchi</a> for providing some of the mobile tests (not available on BrowserStack).</p>]]></content><author><name></name></author><category term="js" /><summary type="html"><![CDATA[State of function decompilation in Javascript]]></summary></entry></feed>