<?xml version='1.0' encoding='UTF-8'?><rss xmlns:atom="http://www.w3.org/2005/Atom" xmlns:openSearch="http://a9.com/-/spec/opensearchrss/1.0/" xmlns:blogger="http://schemas.google.com/blogger/2008" xmlns:georss="http://www.georss.org/georss" xmlns:gd="http://schemas.google.com/g/2005" xmlns:thr="http://purl.org/syndication/thread/1.0" version="2.0"><channel><atom:id>tag:blogger.com,1999:blog-1401515658801150010</atom:id><lastBuildDate>Sat, 11 Apr 2026 22:20:46 +0000</lastBuildDate><category>python</category><category>big data</category><category>concurrency</category><category>functional programming</category><category>java</category><category>scala</category><category>machine learning</category><category>coursera</category><category>javascript</category><category>parallelism</category><category>algorithms</category><category>devops</category><category>nodejs</category><category>books</category><category>cloud</category><category>linear algebra</category><category>non-blocking I/O</category><category>nosql</category><category>optimization</category><category>rust</category><category>MVC</category><category>bindings</category><category>blockchain</category><category>clojure</category><category>graph theory</category><category>nlp</category><category>simulation</category><category>statistics</category><category>time series</category><category>APIs</category><category>C#</category><category>JEE 7</category><category>feature engineering</category><category>go</category><category>math</category><category>mongodb</category><category>probability</category><category>quantum</category><category>rdms</category><category>security</category><category>testing</category><category>visualization</category><category>web socket</category><title>programming opiethehokie</title><description></description><link>http://www.programmingopiethehokie.com/</link><managingEditor>noreply@blogger.com 
(opiethehokie)</managingEditor><generator>Blogger</generator><openSearch:totalResults>53</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><item><guid isPermaLink="false">tag:blogger.com,1999:blog-1401515658801150010.post-6526108765702851622</guid><pubDate>Tue, 01 Jul 2025 00:05:00 +0000</pubDate><atom:updated>2025-07-04T11:34:01.874-04:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">blockchain</category><category domain="http://www.blogger.com/atom/ns#">machine learning</category><category domain="http://www.blogger.com/atom/ns#">rust</category><title>Zero-Knowledge Proofs: Verifying Computation and Preserving Privacy</title><description>&lt;h3 style=&quot;text-align: left;&quot;&gt;Expanding Verifiability Beyond Merkle Trees in the Age of AI&lt;/h3&gt;&lt;p&gt;From my previous post, you&#39;re already familiar with &lt;a href=&quot;https://www.programmingopiethehokie.com/2025/02/merkle-trees.html&quot; target=&quot;_blank&quot;&gt;Merkle Trees&lt;/a&gt; as a powerful data structure for efficient and secure validation of contents. You know they achieve this by hashing data and then hashing those hashes up a tree, allowing for Merkle proofs that can verify data inclusion or consistency without revealing the entire dataset. This capability is becoming even more crucial as AI-generated content makes verifiable content more imperative.&amp;nbsp;&lt;/p&gt;&lt;p&gt;But what if you need to prove something more complex than just data inclusion? What if you need to prove a computation was performed correctly, or that you know a secret, without revealing that secret? 
This is where Zero-Knowledge Proofs (ZKPs) come into play, offering new dimensions of verifiability and privacy.&lt;/p&gt;&lt;h3 style=&quot;text-align: left;&quot;&gt;What are Zero-Knowledge Proofs?&lt;/h3&gt;&lt;p&gt;A Zero-Knowledge Proof is a cryptographic protocol where a prover can convince a verifier that a statement is true, without revealing any information beyond the truth of the statement itself. Think of it like proving you&#39;re over 18 without showing your ID or revealing your name and address. ZKPs bring two main &quot;primitives&quot; or building blocks:&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;ul style=&quot;text-align: left;&quot;&gt;&lt;li&gt;&lt;b&gt;Computational Integrity (Succinctness)&lt;/b&gt;: They allow you to create proofs of computations that are significantly easier and faster to verify than to perform the original computation. This means the proof itself remains small, regardless of how complex the computation being proven is. Just as a Merkle proof is small compared to the original data, a ZKP is small compared to the computation it verifies.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Zero-Knowledge (Privacy)&lt;/b&gt;: They provide the option to hide parts of the computation (like sensitive inputs or even parts of the model) while still proving its correctness.&lt;/li&gt;&lt;/ul&gt;&lt;p style=&quot;text-align: left;&quot;&gt;While generating ZK proofs can be very computationally intensive, advancements in cryptography, hardware, and distributed systems are making them feasible for increasingly complex computations. This expansion of capabilities opens up a vast &quot;design space for new applications&quot;.&lt;/p&gt;&lt;h3 style=&quot;text-align: left;&quot;&gt;Programming ZKPs: A Shift in Mindset&lt;/h3&gt;&lt;p style=&quot;text-align: left;&quot;&gt;Unlike traditional programming, which focuses on &lt;i&gt;how&lt;/i&gt; to compute, programming ZKPs (often called circuits) focuses on defining a set of constraints. 
These constraints are mathematical rules that the computation must satisfy. For example, you might constrain that two secret numbers multiplied together equal a public number, without ever revealing the secret numbers.&lt;/p&gt;&lt;div&gt;The typical workflow for building a ZKP involves:&lt;/div&gt;&lt;div&gt;&lt;ol style=&quot;text-align: left;&quot;&gt;&lt;li&gt;&lt;b&gt;Writing the circuit&lt;/b&gt;: Defining the constraints of your computation.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Building the circuit&lt;/b&gt;: Compiling it into a binary form and &lt;a href=&quot;https://www.programmingopiethehokie.com/2020/12/rust-to-webassembly-and-fibonacci.html&quot; target=&quot;_blank&quot;&gt;WebAssembly&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Trusted Setup&lt;/b&gt;: A crucial pre-processing step that generates a proving key (for the prover) and a verification key (for the verifier).&amp;nbsp;&lt;/li&gt;&lt;li&gt;&lt;b&gt;Generating the proof&lt;/b&gt;: Using your private inputs (the &quot;witness&quot;), the compiled circuit, and the proving key.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Verifying the proof&lt;/b&gt;: Using the verification key, the public output, and the generated proof.&lt;/li&gt;&lt;/ol&gt;&lt;p style=&quot;text-align: left;&quot;&gt;Concepts like hash functions are fundamental in ZKPs, just as they are in Merkle Trees. However, ZKPs often use &quot;ZK-Friendly hash functions&quot; like &lt;a href=&quot;https://www.poseidon-hash.info/&quot; target=&quot;_blank&quot;&gt;Poseidon&lt;/a&gt;, which are optimized for use within ZKP circuits, offering significant performance gains compared to traditional hashes like SHA-256 due to their arithmetic-based implementation.&amp;nbsp;&lt;/p&gt;&lt;p style=&quot;text-align: left;&quot;&gt;Commitments, a cryptographic primitive allowing you to &quot;commit&quot; to a secret value without revealing it, are also crucial, often built using these hash functions. 
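A hash-based commitment can be sketched in a few lines of plain Python (an illustration only — real ZKP circuits would use a ZK-friendly hash like Poseidon over field elements rather than SHA-256 over bytes):

```python
import hashlib
import secrets

def commit(value: bytes) -> tuple[bytes, bytes]:
    # Commit to a secret value by hashing it together with a random nonce.
    # The nonce (a "blinding factor") prevents brute-forcing small values.
    nonce = secrets.token_bytes(32)
    digest = hashlib.sha256(nonce + value).digest()
    return digest, nonce

def verify(digest: bytes, value: bytes, nonce: bytes) -> bool:
    # Opening the commitment reveals the value and nonce; anyone can
    # recompute the hash and check that it matches the earlier digest.
    return hashlib.sha256(nonce + value).digest() == digest

digest, nonce = commit(b"secret ballot: yes")
assert verify(digest, b"secret ballot: yes", nonce)
assert not verify(digest, b"secret ballot: no", nonce)
```

The hiding property comes from the random nonce; the binding property comes from the collision resistance of the hash.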
These are key building blocks for applications like digital signatures and more advanced concepts like group signatures, where you can prove you are part of a group without revealing your specific identity.&lt;/p&gt;&lt;h3 style=&quot;text-align: left;&quot;&gt;Programming ZKPs: An Example&lt;/h3&gt;&lt;p style=&quot;text-align: left;&quot;&gt;Let&#39;s walk through a basic example of proving that we know two numbers whose product is 36 without revealing what those numbers are.&lt;/p&gt;&lt;p style=&quot;text-align: left;&quot;&gt;Write the circuit using &lt;a href=&quot;https://github.com/iden3/circom&quot; target=&quot;_blank&quot;&gt;Circom&lt;/a&gt;.&amp;nbsp;&lt;i&gt;c &amp;lt;== a * b;&lt;/i&gt; is the constraint: two numbers multiplied together equal a third number.&lt;/p&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/c3d1615a505608ad96f67d8cb9b1bd09.js&quot;&gt;&lt;/script&gt;&lt;p style=&quot;text-align: left;&quot;&gt;Circom compiles it into a Wasm file, which we&#39;ll use to generate a witness specifying our private inputs when creating the proof, and a Rank 1 Constraint System binary file mathematically defining our single constraint.&lt;/p&gt;&lt;p style=&quot;text-align: left;&quot;&gt;Then we perform a trusted setup. The generated Common Reference String (CRS) consists of a proving key and a verification key. These keys can then be used every time we want to generate and verify proofs, respectively. 
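Before moving on, the Rank 1 Constraint System idea is worth making concrete. As a rough plain-Python illustration (real systems work over a large prime finite field, which this sketch ignores), the constraint c = a * b becomes one row of selector vectors A, B, C applied to a witness vector w = [1, a, b, c], with the rule (A·w)(B·w) = (C·w):

```python
def dot(row, w):
    # Inner product of a selector row with the witness vector.
    return sum(r * x for r, x in zip(row, w))

# Witness vector: [1, a, b, c] with secret a, b and public output c.
w = [1, 9, 4, 36]

# One R1CS row encoding the single constraint c == a * b.
A = [0, 1, 0, 0]  # selects a
B = [0, 0, 1, 0]  # selects b
C = [0, 0, 0, 1]  # selects c

assert dot(A, w) * dot(B, w) == dot(C, w)  # constraint satisfied

# A tampered public output (like editing public.json) fails the check.
w_bad = [1, 9, 4, 35]
assert dot(A, w_bad) * dot(B, w_bad) != dot(C, w_bad)
```

The proof system establishes that such an identity holds without revealing a and b, using the keys produced by the setup.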
They can be shared publicly and I provide mine here as part of the example:&lt;/p&gt;&lt;div style=&quot;text-align: left;&quot;&gt;&lt;ul style=&quot;text-align: left;&quot;&gt;&lt;li&gt;&lt;a href=&quot;https://drive.google.com/file/d/1XiS8L-JatnaFO_9xyWMhstl0IWe9nLsi/view?usp=sharing&quot;&gt;example1_verification_key.json&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://drive.google.com/file/d/1-VTmq8FmxsNcoc7Mx_Z2odk4nlG8hQKk/view?usp=sharing&quot;&gt;example1_0001.zkey&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;&lt;p style=&quot;text-align: left;&quot;&gt;Finally, we generate the proof using&amp;nbsp;&lt;a href=&quot;https://github.com/iden3/snarkjs&quot; target=&quot;_blank&quot;&gt;snarkjs&lt;/a&gt;&amp;nbsp;with the Wasm file, proving key and private input that might be something like &lt;i&gt;9&lt;/i&gt; and &lt;i&gt;4&lt;/i&gt;.&lt;/p&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/6d165364984f0239661571b783a72cb9.js&quot;&gt;&lt;/script&gt;&lt;p style=&quot;text-align: left;&quot;&gt;We get proof and public output JSON files.&lt;/p&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/0295a999fcfd89bc6cb73adcee4f684d.js&quot;&gt;&lt;/script&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/803d15c5b99afda755c8b1b16e1dea61.js&quot;&gt;&lt;/script&gt;&lt;p style=&quot;text-align: left;&quot;&gt;We&#39;ve proven that&amp;nbsp;we know two secret values, a and b, whose product is 36. You can verify the proof (assuming you trust my verification key) with snarkjs.&lt;/p&gt;&lt;p style=&quot;text-align: left;&quot;&gt;&lt;span style=&quot;color: #666666;&quot;&gt;$ snarkjs groth16 verify example1_verification_key.json public.json proof.json&lt;br /&gt;[INFO]&amp;nbsp; snarkJS: OK!&lt;/span&gt;&lt;/p&gt;&lt;p style=&quot;text-align: left;&quot;&gt;If you change public.json to contain a different number the proof will no longer be valid. 
I no longer have a proof that I know the factors of this new number.&lt;/p&gt;&lt;h3 style=&quot;text-align: left;&quot;&gt;ZKPs, Blockchains, and Machine Learning (ZKML)&lt;/h3&gt;&lt;p style=&quot;text-align: left;&quot;&gt;The convergence of ZKPs, blockchains (Web3), and machine learning is a rapidly advancing area with significant potential.&lt;/p&gt;&lt;/div&gt;&lt;div&gt;Blockchain use cases include:&lt;/div&gt;&lt;div&gt;&lt;ul style=&quot;text-align: left;&quot;&gt;&lt;li&gt;&lt;b&gt;Scaling Blockchains&lt;/b&gt;: Public blockchains have limited computational power. ZKPs enable computations to be executed off-chain, with only a small ZK proof verified on-chain. This scales blockchains without sacrificing decentralization or security. Examples include ZK rollups like &lt;a href=&quot;https://docs.polygon.technology/zkEVM/&quot; target=&quot;_blank&quot;&gt;Polygon zkEVM&lt;/a&gt;&amp;nbsp;and &lt;a href=&quot;https://www.zksync.io/&quot; target=&quot;_blank&quot;&gt;zkSync&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Privacy-Preserving Applications&lt;/b&gt;: The zero-knowledge property is ideal for creating applications that protect users&#39; privacy and personal data when making cryptographic attestations. 
&lt;a href=&quot;https://aztec.network/&quot; target=&quot;_blank&quot;&gt;Aztec Network&lt;/a&gt;, for instance, uses a ZK rollup for Ethereum where users&#39; balances and transactions are completely hidden.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Identity Primitives and Data Provenance&lt;/b&gt;: Projects like &lt;a href=&quot;https://world.org/world-id&quot; target=&quot;_blank&quot;&gt;WorldID&lt;/a&gt; use ZKPs for privacy-preserving proof-of-personhood protocols, allowing a person to prove they are a unique human without revealing their identity.&lt;/li&gt;&lt;/ul&gt;&lt;p style=&quot;text-align: left;&quot;&gt;ZKML is about applying ZK proofs to machine learning models, specifically focusing on the inference step.&amp;nbsp;The core motivations for ZKML include:&lt;/p&gt;&lt;div&gt;&lt;ul style=&quot;text-align: left;&quot;&gt;&lt;li&gt;&lt;b&gt;Verifying AI-Generated Content&lt;/b&gt;: With AI content becoming indistinguishable from human-created content, ZKPs can help determine that a particular piece of content was produced by applying a specific model to a given input.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Privacy-Preserving Inference&lt;/b&gt;: ZKPs allow you to apply an ML model to sensitive data, where a user can get the result of the model&#39;s inference without revealing their input to any third party.&lt;/li&gt;&lt;/ul&gt;&lt;p style=&quot;text-align: left;&quot;&gt;While proving something as large as current LLMs with ZKPs is not currently feasible, there&#39;s significant progress on creating proofs for smaller models. 
Teams are actively working on improving ZK technology, including specialized hardware and proof system architectures, to allow proving bigger models on less powerful machines in less time.&lt;/p&gt;&lt;/div&gt;&lt;/div&gt;&lt;h3 style=&quot;text-align: left;&quot;&gt;Summary&lt;/h3&gt;&lt;p style=&quot;text-align: left;&quot;&gt;While Merkle Trees excel at verifying data inclusion and consistency, ZKPs extend this idea to verifying computations and knowledge with the added benefit of privacy. This makes them incredibly powerful for building the next generation of scalable and private applications on blockchains, especially as AI-generated content and privacy concerns continue to grow. The future of verifiable content, whether data or computation, is increasingly intertwined with these advanced cryptographic proofs.&lt;/p&gt;&lt;h3 style=&quot;text-align: left;&quot;&gt;Update&lt;/h3&gt;&lt;p style=&quot;text-align: left;&quot;&gt;A few days after I published this I saw&amp;nbsp;&lt;a href=&quot;https://blog.google/technology/safety-security/opening-up-zero-knowledge-proof-technology-to-promote-privacy-in-age-assurance/&quot; target=&quot;_blank&quot;&gt;Opening up ‘Zero-Knowledge Proof’ technology to promote privacy in age assurance&lt;/a&gt; from Google showing that some well-known players are active in this space as well.&lt;/p&gt;&lt;h3 style=&quot;text-align: left;&quot;&gt;&lt;span&gt;Sources&lt;/span&gt;&lt;/h3&gt;&lt;p style=&quot;text-align: left;&quot;&gt;&lt;a href=&quot;https://zkintro.com/articles/programming-zkps-from-zero-to-hero&quot;&gt;https://zkintro.com/articles/programming-zkps-from-zero-to-hero&lt;/a&gt;&lt;/p&gt;&lt;p style=&quot;text-align: left;&quot;&gt;&lt;a href=&quot;https://world.org/blog/engineering/intro-to-zkml&quot;&gt;https://world.org/blog/engineering/intro-to-zkml&lt;/a&gt;&lt;/p&gt;</description><link>http://www.programmingopiethehokie.com/2025/06/zero-knowledge-proofs-verifying.html</link><author>noreply@blogger.com 
(opiethehokie)</author></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-1401515658801150010.post-74624703257000525</guid><pubDate>Mon, 17 Feb 2025 16:44:00 +0000</pubDate><atom:updated>2026-04-11T13:22:16.490-04:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">big data</category><category domain="http://www.blogger.com/atom/ns#">blockchain</category><category domain="http://www.blogger.com/atom/ns#">python</category><title>Merkle Trees</title><description>&lt;div&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Merkle_tree&quot; target=&quot;_blank&quot;&gt;Merkle Trees&lt;/a&gt;&amp;nbsp;are a data structure that allows for efficient and secure validation of contents. They are typically implemented as binary trees. Given N pieces of data, the tree has about 2N nodes (2N - 1 when N is a power of two) and log(N) height. Each leaf node is the hash of a piece of data and every non-leaf node is a hash of its children&#39;s hashes. With only a hash stored at each node, a Merkle tree has a small, predictable size compared to the underlying data, even for large N.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;https://commons.wikimedia.org/wiki/File:Hash_Tree.svg&quot; title=&quot;Azaghal, CC0, via Wikimedia Commons&quot;&gt;&lt;img alt=&quot;Graphical representation of the Merkle Tree. Illustration by David Göthberg&quot; height=&quot;255&quot; src=&quot;https://upload.wikimedia.org/wikipedia/commons/thumb/9/95/Hash_Tree.svg/960px-Hash_Tree.svg.png&quot; width=&quot;400&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Examples of Merkle tree usage include detecting data inconsistencies between replicas in NoSQL databases, ensuring file integrity in distributed storage systems, and verifying blockchain transactions. 
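The core hashing scheme fits in a few lines of plain Python with hashlib (a toy sketch: it duplicates the last node on odd-sized levels and skips the leaf/interior domain separation that real implementations add to resist second-preimage attacks):

```python
import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves: list[bytes]) -> bytes:
    # Hash each piece of data to form the leaf level.
    level = [sha256(leaf) for leaf in leaves]
    # Repeatedly hash pairs of children until one root hash remains.
    while len(level) > 1:
        if len(level) % 2 == 1:          # duplicate the last node when a
            level.append(level[-1])      # level has an odd number of hashes
        level = [sha256(left + right)
                 for left, right in zip(level[::2], level[1::2])]
    return level[0]

root = merkle_root([b"tx1", b"tx2", b"tx3", b"tx4"])
print(root.hex())  # one 32-byte root summarizing all four entries
```

Changing any single leaf changes the root, which is what makes the root a compact fingerprint of the whole dataset.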
We can easily create one with &lt;a href=&quot;https://pymerkle.readthedocs.io/en/latest/index.html&quot; target=&quot;_blank&quot;&gt;pymerkle&lt;/a&gt;.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/4f7787c3cec74d515e81cec044d2c5ae.js&quot;&gt;&lt;/script&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;└─9f42c047...&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; ├──50fcd75a...&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; │&amp;nbsp; &amp;nbsp;├──4c4b77fe...&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; │&amp;nbsp; &amp;nbsp;│&amp;nbsp; &amp;nbsp;├──e8bcd97e...&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; │&amp;nbsp; &amp;nbsp;│&amp;nbsp; &amp;nbsp;│&amp;nbsp; &amp;nbsp;├──2215e8ac...&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; │&amp;nbsp; &amp;nbsp;│&amp;nbsp; &amp;nbsp;│&amp;nbsp; &amp;nbsp;└──fa61e3de...&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; │&amp;nbsp; &amp;nbsp;│&amp;nbsp; &amp;nbsp;└──9c769ac2...&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; │&amp;nbsp; &amp;nbsp;│&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;├──906c5d24...&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; │&amp;nbsp; &amp;nbsp;│&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;└──11e1f558...&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; │&amp;nbsp; 
&amp;nbsp;└──fed7af7d...&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; │&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;├──2b15ae18...&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; │&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;│&amp;nbsp; &amp;nbsp;├──53304f5e...&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; │&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;│&amp;nbsp; &amp;nbsp;└──3bf9c81c...&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; │&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;└──8007dd69...&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; │&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;├──797427cf...&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; │&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;└──195f58bc...&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; └──555da077...&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; ├──85224a5c...&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; └──5c889ef4...&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A &quot;Merkle proof&quot; is a list of intermediate hashes from the path between a leaf node representing the data you want to prove and the root of the tree. Generating the proof is like a modified DFS. 
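That traversal can be sketched in plain Python (a toy version — pymerkle's actual proof format, shown below, also carries metadata and traversal rules). At each level we record the sibling of the node on the path, then move up to the parent:

```python
import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def inclusion_proof(leaves: list[bytes], index: int) -> list[bytes]:
    # Walk from the target leaf up to the root, recording the sibling
    # hash needed at each level to recompute the parent.
    level = [sha256(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])      # pad odd levels, as when building
        proof.append(level[index ^ 1])   # sibling: flip the lowest bit
        index //= 2                      # position within the parent level
        level = [sha256(left + right)
                 for left, right in zip(level[::2], level[1::2])]
    return proof

proof = inclusion_proof([b"tx1", b"tx2", b"tx3", b"tx4"], index=2)
print(len(proof))  # log2(4) = 2 sibling hashes, far smaller than the tree
```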
The beauty of this is that anyone can verify the data is included in the tree without the whole tree being revealed.&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/6b9f5f4820ac8bb1340a76d5cc84a88b.js&quot;&gt;&lt;/script&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;{&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &quot;metadata&quot;: {&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &quot;algorithm&quot;: &quot;sha256&quot;,&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &quot;security&quot;: true,&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &quot;size&quot;: 10&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; },&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &quot;rule&quot;: [&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 0,&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 0,&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 1,&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 0,&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 0&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: 
courier;&quot;&gt;&amp;nbsp; &amp;nbsp; ],&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &quot;subset&quot;: [],&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &quot;path&quot;: [&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &quot;53304f5e3fd4bcd20b39abdef2fe118031cc5ae8217bcea008dea7e27869348a&quot;,&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &quot;3bf9c81c231cae70b678d3f3038f9f4f6d6b9d7adcf9b378f25919ae53d17686&quot;,&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &quot;8007dd69b92a67ea6410098635fa8ba53c44a5994c7e5d92b99e27f0711c626f&quot;,&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &quot;4c4b77fe3fc6cfb92e4d3c90b5ade42f059a1f112a49827f07edbb7bd4540e7b&quot;,&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &quot;555da077fcadba1f23e0f2bfac8793e6a3c79a0d605902df34ab43d3e0fb487c&quot;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; ]&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Verifying the proof requires only calculating the root hash from the provided proof and the data being verified. If the calculated root matches the known root of the tree then the data is present in the tree. 
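That recomputation is a short fold over the sibling hashes. A plain-Python sketch (toy conventions: the leaf's index supplies the left/right ordering at each step, information pymerkle appears to carry in its &quot;rule&quot; list):

```python
import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def verify_inclusion(leaf: bytes, index: int,
                     proof: list[bytes], root: bytes) -> bool:
    # Recompute the root from the leaf and its sibling hashes; the
    # index's low bit says whether the current node is left or right.
    node = sha256(leaf)
    for sibling in proof:
        node = sha256(sibling + node) if index % 2 else sha256(node + sibling)
        index //= 2
    return node == root

# Tiny two-leaf tree built by hand: root = H(H(a) + H(b))
h_a, h_b = sha256(b"a"), sha256(b"b")
root = sha256(h_a + h_b)

assert verify_inclusion(b"a", 0, [h_b], root)      # sibling to the right
assert verify_inclusion(b"b", 1, [h_a], root)      # sibling to the left
assert not verify_inclusion(b"c", 0, [h_b], root)  # absent data fails
```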
We are using less space and less compute than if we were iterating a list to check if data is present.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/7ad6a43ff995cd09264c74a95e06b15d.js&quot;&gt;&lt;/script&gt;&lt;/div&gt;&lt;div&gt;That was an inclusion or audit proof. We can also do a consistency proof. In an append-only tree we can verify earlier versions of the tree against later versions to make sure no tampering has occurred. The later version must include everything in the earlier version, in the same order, and all new entries come after old entries.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/7ba4dfd0817082c006f53865afdae85a.js&quot;&gt;&lt;/script&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;{&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &quot;metadata&quot;: {&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &quot;algorithm&quot;: &quot;sha256&quot;,&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &quot;security&quot;: true,&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &quot;size&quot;: 10&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; },&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &quot;rule&quot;: [&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 1,&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; 
&amp;nbsp; &amp;nbsp; 0,&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 0,&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 0,&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 0&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; ],&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &quot;subset&quot;: [&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 0,&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 1,&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 0,&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 0,&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 0&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; ],&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &quot;path&quot;: [&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &quot;fa61e3dec3439589f4784c893bf321d0084f04c572c7af2b68e3f3360a35b486&quot;,&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 
&quot;2215e8ac4e2b871c2a48189e79738c956c081e23ac2f2415bf77da199dfd920c&quot;,&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &quot;9c769ac26f8d61ff40859e5201537845555136f0fd7ab604f7033180fbe76af9&quot;,&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &quot;fed7af7d64bf0a73fcad018df1219928dbafa4d96b5d78f8a5e9be66ff0ada38&quot;,&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &quot;555da077fcadba1f23e0f2bfac8793e6a3c79a0d605902df34ab43d3e0fb487c&quot;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;&amp;nbsp; &amp;nbsp; ]&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Given how powerful and pervasive Merkle trees are, I&#39;m surprised they aren&#39;t discussed more along with other common data structures. 
It seems that with AI-generated content making verifiable content more imperative, their usage will only increase going forward.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;</description><link>http://www.programmingopiethehokie.com/2025/02/merkle-trees.html</link><author>noreply@blogger.com (opiethehokie)</author></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-1401515658801150010.post-7937670990297527603</guid><pubDate>Tue, 16 Jan 2024 02:37:00 +0000</pubDate><atom:updated>2025-04-05T14:04:04.825-04:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">parallelism</category><category domain="http://www.blogger.com/atom/ns#">python</category><title>CUDA Kernels: Speeding up Matrix Multiplication</title><description>&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: left;&quot;&gt;The Python ecosystem benefits greatly from being able to use libraries written in C/C++ and Rust (see my &lt;a href=&quot;https://www.programmingopiethehokie.com/2023/07/rust-to-python-and-fibonacci-numbers.html&quot; target=&quot;_blank&quot;&gt;previous post&lt;/a&gt;) to increase performance. Increasingly, though, I&#39;ve been running code on GPUs instead. Libraries like PyTorch simplify this, but how do they do it? Previously I could only answer something like &quot;it&#39;s CUDA&quot; without knowing what that really means. In this post we&#39;ll dive deeper and see what it takes to create our own CUDA kernel. 
We&#39;ll (re-)implement matrix multiplication, run it on a GPU, and compare to numpy and PyTorch performance.&lt;br /&gt;&lt;/div&gt;&lt;p&gt;&lt;/p&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhAgXJqQN-Nr5Xs7HtnG8VUha4_PI7q-An4Tnrz3EskuFl_FXptHqbke3ctnZJ77HarswT23Hnsj1nSEv36MMDas1QaZd3QfrWYEA2nv3WxwUzp5LkRRtaowUjqmf-1K-BM7ZBiBTqgJLrvAUbnphw9vA5RWFPOM8IVk_b6kdNBXic3pxtbR48Dl3hV65LA/s1792/DALL%C2%B7E%202024-01-10%2018.47.10%20-%20Wide%20illustrations%20for%20a%20computer%20science%20blog,%20focusing%20on%20immense%20number%20crunching%20in%20CUDA%20kernel%20for%20matrix%20multiplication,%20with%20an%20&#39;Icy&#39;%20color%20the.png&quot; style=&quot;clear: left; float: left; margin-bottom: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;1024&quot; data-original-width=&quot;1792&quot; height=&quot;229&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhAgXJqQN-Nr5Xs7HtnG8VUha4_PI7q-An4Tnrz3EskuFl_FXptHqbke3ctnZJ77HarswT23Hnsj1nSEv36MMDas1QaZd3QfrWYEA2nv3WxwUzp5LkRRtaowUjqmf-1K-BM7ZBiBTqgJLrvAUbnphw9vA5RWFPOM8IVk_b6kdNBXic3pxtbR48Dl3hV65LA/w630-h229/DALL%C2%B7E%202024-01-10%2018.47.10%20-%20Wide%20illustrations%20for%20a%20computer%20science%20blog,%20focusing%20on%20immense%20number%20crunching%20in%20CUDA%20kernel%20for%20matrix%20multiplication,%20with%20an%20&#39;Icy&#39;%20color%20the.png&quot; width=&quot;630&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;First, what is CUDA?&amp;nbsp;&lt;a href=&quot;https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html&quot; target=&quot;_blank&quot;&gt;CUDA&lt;/a&gt;&amp;nbsp;is a general 
purpose parallel computing platform and API. It gives direct access to the NVIDIA GPU&#39;s virtual instruction set and parallel compute elements. It works with C, C++, and Fortran, has Java and Python wrappers, and is supposed to be easier to use than earlier APIs like &lt;a href=&quot;https://developer.nvidia.com/opencl&quot; target=&quot;_blank&quot;&gt;OpenCL&lt;/a&gt;. CUDA allows us to execute kernels, which are functions compiled for the GPU and run separately from the main program. This is how the majority of deep learning today is executed.&lt;/p&gt;&lt;p&gt;Second, what makes GPUs so special?&amp;nbsp;GPUs are generally slower than CPUs at a single operation but excel at operations that can be parallelized. They can have thousands of cores and higher memory bandwidth. If we want to tackle problems that are more than embarrassingly parallel, like matrix multiplication, we need a &lt;a href=&quot;https://en.wikipedia.org/wiki/Matrix_multiplication_algorithm#Parallel_and_distributed_algorithms&quot; target=&quot;_blank&quot;&gt;parallel algorithm&lt;/a&gt;&amp;nbsp;to make use of the GPU.&lt;/p&gt;&lt;p&gt;Third, how do we access it from Python? It turns out there are several Python wrappers to choose from. I chose to go with&amp;nbsp;&lt;a href=&quot;https://numba.readthedocs.io/en/stable/user/5minguide.html&quot; target=&quot;_blank&quot;&gt;Numba&lt;/a&gt;, a just-in-time (JIT) compiler that can target both CPUs and GPUs. It lets you write kernels directly using a subset of Python. It does not implement the complete CUDA API, but supports enough to tackle many problems. There are other (uninvestigated) options like CuPy and PyCUDA as well.&lt;/p&gt;&lt;p&gt;Lastly, before we get to the code, there is the topic of writing efficient kernels. While this is beyond the scope of my post, I can say that I&#39;ve at least learned it requires understanding several additional concepts beyond general concurrent programming. CUDA kernels are not for the faint of heart.
It&#39;s not something you can just pick up in a couple of hours or from following one tutorial. One of the things you&#39;ll first notice, as a very simple example, is the need to explicitly move data back and forth between CPUs and GPUs. And kernels can&#39;t return values, you can only pass inputs and outputs.&lt;/p&gt;&lt;p&gt;For a better intro, I recommend reading the four-part &lt;a href=&quot;https://towardsdatascience.com/cuda-by-numba-examples-1-4-e0d06651612f&quot; target=&quot;_blank&quot;&gt;CUDA by Numba Examples&lt;/a&gt; series (&lt;a href=&quot;https://towardsdatascience.com/cuda-by-numba-examples-215c0d285088&quot; target=&quot;_blank&quot;&gt;part 2&lt;/a&gt;, &lt;a href=&quot;https://towardsdatascience.com/cuda-by-numba-examples-7652412af1ee&quot; target=&quot;_blank&quot;&gt;part 3&lt;/a&gt;, &lt;a href=&quot;https://towardsdatascience.com/cuda-by-numba-examples-c583474124b0&quot; target=&quot;_blank&quot;&gt;part 4&lt;/a&gt;).&lt;/p&gt;&lt;p&gt;Our baseline for this experiment will be numpy&#39;s highly-optimized matmul function. It supports &lt;a href=&quot;https://www.programmingopiethehokie.com/2014/08/exploring-vectorization.html&quot; target=&quot;_blank&quot;&gt;vectorized array operations&lt;/a&gt; and some multi-threading by releasing the GIL per &lt;a href=&quot;https://scipy-cookbook.readthedocs.io/items/ParallelProgramming.html&quot; target=&quot;_blank&quot;&gt;Parallel Programming with numpy and scipy&lt;/a&gt;. The underlying &lt;a href=&quot;https://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprograms&quot; target=&quot;_blank&quot;&gt;BLAS&lt;/a&gt; routines will be optimized for your hardware. This method of matrix multiplication has been tuned over the past several decades.&lt;/p&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/225c2f77f6e5b95e37d2168a5b748bb0.js&quot;&gt;&lt;/script&gt;&lt;p&gt;Here we&#39;ll try our own kernel. 
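&lt;/p&gt;&lt;p&gt;To make the idea concrete: in a CUDA matmul kernel, each GPU thread computes one output element as an independent dot product. The sketch below is illustrative pure Python (not the gist&#39;s Numba code) showing that per-thread work on a CPU:&lt;/p&gt;

```python
# Illustration of the work a single CUDA thread would do in a naive
# matrix multiplication kernel: compute one (row, col) output element.
def matmul_element(A, B, row, col):
    # An independent dot product -- no output element depends on another,
    # which is why one GPU thread can be launched per output element.
    total = 0.0
    for k in range(len(B)):
        total += A[row][k] * B[k][col]
    return total

# On a CPU we loop over the grid; on a GPU these iterations run in parallel.
def matmul(A, B):
    return [[matmul_element(A, B, i, j) for j in range(len(B[0]))]
            for i in range(len(A))]
```

&lt;p&gt;A real Numba kernel adds the CUDA-specific pieces: copying inputs to device memory, deriving each thread&#39;s (row, col) from its grid position, and writing into a preallocated output array.&lt;/p&gt;&lt;p&gt;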
Our kernel moves the computation from the CPU to the GPU, but it likely lacks many optimizations that a battle-tested library has. Still, for medium-sized, two-dimensional matrices, we get some performance improvement.&lt;/p&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/1ccf9c03a8f75e81a9a52f54c021e6a2.js&quot;&gt;&lt;/script&gt;&lt;p&gt;Now we&#39;ll use PyTorch&#39;s matmul function. It&#39;s highly optimized like numpy and has the GPU benefit like the kernel. It&#39;s amazingly fast and works with larger, higher-dimensional matrices.&lt;/p&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/8adbb1bfcef7c377ef45d4f51ba47eef.js&quot;&gt;&lt;/script&gt;&lt;p&gt;NVIDIA has a profiling tool called &lt;a href=&quot;https://developer.nvidia.com/nsight-systems&quot; target=&quot;_blank&quot;&gt;Nsight Systems&lt;/a&gt; that we can use to see GPU utilization. GPUs are expensive, so we want them to be fully utilized. From the reports, I see that the PyTorch implementation used more threads, which is consistent with higher parallelism. It also seems to have a higher ratio of memory operations vs. kernel operations.
I&#39;m not sure I fully understand the significance of that ratio, but the kernel operations are sgemm, BLAS&#39;s single-precision general matrix multiply routine, so like numpy on the CPU, PyTorch is ultimately relying on a tuned BLAS-style implementation.&lt;/p&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgnRl4wMC9TCnU19fqtezi6ttlxklw_uehEn5oZsb73laV2QLjt8q-Ce8YUVThyGsV1h2pRLaeOanNqk8FGT29AIFPHqihyphenhyphenVbjYgR88x9Kp9cuLcugplVnZNAfsgBNg0Ts1m6s_enEzpKsEjGZ4jKzLO3h3l4yGOPWW9FQDsL1GIniopUmIbTGRN8oHtXCM/s1275/Screenshot%202024-01-15%20210010.png&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;542&quot; data-original-width=&quot;1275&quot; height=&quot;170&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgnRl4wMC9TCnU19fqtezi6ttlxklw_uehEn5oZsb73laV2QLjt8q-Ce8YUVThyGsV1h2pRLaeOanNqk8FGT29AIFPHqihyphenhyphenVbjYgR88x9Kp9cuLcugplVnZNAfsgBNg0Ts1m6s_enEzpKsEjGZ4jKzLO3h3l4yGOPWW9FQDsL1GIniopUmIbTGRN8oHtXCM/w400-h170/Screenshot%202024-01-15%20210010.png&quot; width=&quot;400&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgwpH2xoWLQGNayWLBJF8QwkMyUXk_9KOLBQLFoAqSbrd9GiZzKr_7ABle1Xi4qOd9Z7evJr24bA-U-g1SURDw7iM7Pl0DHsIsAI2VCx9CNBjR1tS3FQYtp4k2JtcGRK3YGODzjKBZyW3-lJVgq_8GBMH4VCEYGaAyXMX8x3y8_cEbGRj1iRzwpfJi4eCf2/s1439/Screenshot%202024-01-15%20210620.png&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;537&quot; data-original-width=&quot;1439&quot; height=&quot;149&quot;
src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgwpH2xoWLQGNayWLBJF8QwkMyUXk_9KOLBQLFoAqSbrd9GiZzKr_7ABle1Xi4qOd9Z7evJr24bA-U-g1SURDw7iM7Pl0DHsIsAI2VCx9CNBjR1tS3FQYtp4k2JtcGRK3YGODzjKBZyW3-lJVgq_8GBMH4VCEYGaAyXMX8x3y8_cEbGRj1iRzwpfJi4eCf2/w400-h149/Screenshot%202024-01-15%20210620.png&quot; width=&quot;400&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;br /&gt;&lt;/div&gt;&lt;p&gt;Creating a CUDA kernel has become accessible enough that I could do it in a couple of hours on a laptop, yet my implementation remains far from the best ones. Matrix multiplication is obviously common and great libraries exist. For less common operations,&amp;nbsp;even if there&#39;s a known parallel algorithm, I would hesitate to go the custom kernel route again. It&#39;s not a one-off thing you can casually try in order to improve performance; it requires a way of thinking and deep optimization knowledge that most software developers don&#39;t have. The underlying libraries used by PyTorch are, for example, optimizing use of memory caches, shared vs.
global memory accesses, thread utilization, and probably tuning kernel parameters for my specific GPU.&lt;/p&gt;&lt;p&gt;UPDATE:&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://thenewstack.io/nvidia-finally-adds-native-python-support-to-cuda/&quot; target=&quot;_blank&quot;&gt;NVIDIA Finally Adds Native Python Support to CUDA&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;</description><link>http://www.programmingopiethehokie.com/2024/01/cuda-kernels-speeding-up-matrix.html</link><author>noreply@blogger.com (opiethehokie)</author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhAgXJqQN-Nr5Xs7HtnG8VUha4_PI7q-An4Tnrz3EskuFl_FXptHqbke3ctnZJ77HarswT23Hnsj1nSEv36MMDas1QaZd3QfrWYEA2nv3WxwUzp5LkRRtaowUjqmf-1K-BM7ZBiBTqgJLrvAUbnphw9vA5RWFPOM8IVk_b6kdNBXic3pxtbR48Dl3hV65LA/s72-w630-h229-c/DALL%C2%B7E%202024-01-10%2018.47.10%20-%20Wide%20illustrations%20for%20a%20computer%20science%20blog,%20focusing%20on%20immense%20number%20crunching%20in%20CUDA%20kernel%20for%20matrix%20multiplication,%20with%20an%20&#39;Icy&#39;%20color%20the.png" height="72" width="72"/></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-1401515658801150010.post-3298953494316628245</guid><pubDate>Wed, 12 Jul 2023 12:19:00 +0000</pubDate><atom:updated>2023-07-12T08:19:39.108-04:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">bindings</category><category domain="http://www.blogger.com/atom/ns#">parallelism</category><category domain="http://www.blogger.com/atom/ns#">python</category><category domain="http://www.blogger.com/atom/ns#">rust</category><title>Rust to Python and Fibonacci Numbers</title><description>&lt;p&gt;While Python continues to be the language of choice for ML projects, I&#39;m increasingly seeing mentions of &lt;a href=&quot;https://www.rust-lang.org/&quot; target=&quot;_blank&quot;&gt;Rust&lt;/a&gt; in packages I use. 
Hugging Face fast&amp;nbsp;&lt;a href=&quot;https://github.com/huggingface/tokenizers&quot; target=&quot;_blank&quot;&gt;tokenizers&lt;/a&gt;&amp;nbsp;are written in Rust to speed up training and tokenization. They say it &quot;Takes less than 20 seconds to tokenize a GB of text on a server&#39;s CPU.&quot; I recently used &lt;a href=&quot;https://www.pola.rs/&quot; target=&quot;_blank&quot;&gt;Polars&lt;/a&gt;, which is also written in Rust,&amp;nbsp;for a quick task where I wanted to run a SQL query on 20 GB of CSV files. It only took about 60 seconds on my laptop. Rust is a way of bypassing Python&#39;s GIL and writing code that is parallelizable. This is in contrast to packages like numpy that traditionally have been written in C/C++ for performance reasons.&amp;nbsp;&lt;/p&gt;&lt;p&gt;How does one turn Rust code into something you can call from Python? It turns out I&#39;ve already done something similar,&amp;nbsp;&lt;a href=&quot;https://www.programmingopiethehokie.com/2020/12/rust-to-webassembly-and-fibonacci.html&quot; target=&quot;_blank&quot;&gt;compiling Rust into WebAssembly and using it in a JavaScript project&lt;/a&gt;. For Python, the process is much the same. I followed&amp;nbsp;&lt;a href=&quot;http://saidvandeklundert.net/learn/2021-11-18-calling-rust-from-python-using-pyo3/&quot; target=&quot;_blank&quot;&gt;Calling Rust from Python using PyO3&lt;/a&gt; and re-created the Fibonacci numbers experiment (apparently I&#39;m not the only one with this idea) from my previous post. It&#39;s a toy example, just intended to show the ease with which Rust can be leveraged. It ignores more complex data types and any actual multi-threaded code.
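&lt;/p&gt;&lt;p&gt;For reference, the pure-Python side of this kind of experiment is essentially the naive recursive definition (a sketch; the gist may differ in details):&lt;/p&gt;

```python
# Naive recursive Fibonacci -- deliberately exponential-time, which is
# what makes it a useful stress test when comparing Python to Rust.
def fib(n):
    if n in (0, 1):
        return n
    return fib(n - 1) + fib(n - 2)
```

&lt;p&gt;Timing fib(35) on both sides, e.g. with time.perf_counter, is what produces the numbers quoted later in the post.&lt;/p&gt;&lt;p&gt;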
Perhaps that will be the topic of a future post.&lt;/p&gt;&lt;p&gt;The function in Rust looks like this:&lt;/p&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/e34557e8eaf5dc3bb6a8057e656bdcc2.js&quot;&gt;&lt;/script&gt;&lt;p&gt;And the Python code that calls it and times it:&lt;/p&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/90515356433e580aa4d12b70d9470532.js&quot;&gt;&lt;/script&gt;&lt;p&gt;Using the same n=35 as in my JavaScript experiment, I&#39;m seeing about .05 seconds for Rust vs. 7.31 seconds for pure Python. The likelihood of wanting/needing this in a Python project seems greater than with JavaScript. My guess is that the trend continues.&lt;/p&gt;</description><link>http://www.programmingopiethehokie.com/2023/07/rust-to-python-and-fibonacci-numbers.html</link><author>noreply@blogger.com (opiethehokie)</author></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-1401515658801150010.post-7189641275265986791</guid><pubDate>Tue, 31 Jan 2023 03:17:00 +0000</pubDate><atom:updated>2024-01-15T17:00:41.803-05:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">devops</category><category domain="http://www.blogger.com/atom/ns#">python</category><category domain="http://www.blogger.com/atom/ns#">simulation</category><title>Queueing Simulations</title><description>&lt;p&gt;Waiting in lines is something we&#39;ve all experienced. But how long will the wait for a table really be? How many checkouts should the grocery store have open?&amp;nbsp;&lt;a href=&quot;https://en.wikipedia.org/wiki/Queueing_theory&quot; target=&quot;_blank&quot;&gt;Queueing theory&lt;/a&gt; gives us the tools to be able to answer questions like these and more. It&#39;s also relevant to understanding software under heavy load. 
I&#39;ve seen it come up recently in The Amazon Builder&#39;s Library &lt;a href=&quot;https://aws.amazon.com/builders-library/avoiding-insurmountable-queue-backlogs&quot; target=&quot;_blank&quot;&gt;Avoiding Insurmountable Queue Backlogs&lt;/a&gt; and Site Reliability Engineering&#39;s &lt;a href=&quot;https://sre.google/sre-book/addressing-cascading-failures/&quot; target=&quot;_blank&quot;&gt;Addressing Cascading Failures&lt;/a&gt; chapter.&amp;nbsp;&lt;/p&gt;&lt;p&gt;Since you are reading this blog it is assumed that you are already familiar with queues and the idea of using them in software systems. To summarize, queues can improve availability and throughput at the expense of latency and resource consumption. Here we will see what the field of queueing theory can teach us about how and when to best use them.&amp;nbsp;&lt;/p&gt;&lt;p&gt;The Builder&#39;s Library article warns us of the bi-modal behavior of queue-based systems: the behavior can be very different based on whether there is a backlog or not. Recovery time after an outage can be dramatically increased and time can be wasted doing work that is no longer useful. We then get the first hints of how to manage this. Using more than one queue helps shape traffic. LIFO-ish behavior can be more desirable than FIFO. The SRE book discusses throttling via small queue sizes so requests are rejected when the incoming request rate is too high to be sustained. Traffic patterns, queue sizes, and the number of threads removing work from the queue are all intertwined. Can we better quantify these recommendations though?&lt;/p&gt;&lt;p&gt;I first learned of queueing theory in &lt;a href=&quot;https://github.com/VividCortex/ebooks/blob/master/queueing-theory.pdf&quot; target=&quot;_blank&quot;&gt;The Essential Guide to Queueing Theory&lt;/a&gt; e-book from VividCortex and wanted to revisit it in the context of the SRE reading. 
Right from the beginning we are warned that while queues are linear data structures, queueing systems behave non-linearly. Queueing theory is probabilistic. We won&#39;t understand the behavior exactly, but will know about wait times on average or a distribution of wait times.&lt;/p&gt;&lt;p&gt;The first thing they suggest we must understand is that queueing happens even when there is enough capacity to do the work. This is because work arrives to the queue in irregular sizes and at irregular intervals. Queueing gets worse at high utilization, when there is high variability, and when fewer workers are servicing the queues.&lt;/p&gt;&lt;p&gt;Any system can be decomposed into networks of queues and workers. These are the parameters and metrics commonly used to understand these systems:&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;ul style=&quot;text-align: left;&quot;&gt;&lt;li&gt;arrival rate to the queue (&lt;i&gt;λ&lt;/i&gt; or &lt;i&gt;A&lt;/i&gt;) - in a stable system this is the same as throughput&lt;/li&gt;&lt;li&gt;average wait time or residence time in the queue (&lt;i&gt;W&lt;/i&gt; or&amp;nbsp;&lt;i&gt;W&lt;/i&gt;&lt;i&gt;&lt;sub&gt;q&lt;/sub&gt;&lt;/i&gt;)&lt;/li&gt;&lt;li&gt;average post-queue service time (&lt;i&gt;S&lt;/i&gt;)&lt;/li&gt;&lt;li&gt;service rate (&lt;i&gt;μ&lt;/i&gt;) - inverse of service time&lt;/li&gt;&lt;li&gt;latency or residence time (&lt;i&gt;R&lt;/i&gt;) - sum of wait time and service time&lt;/li&gt;&lt;li&gt;worker utilization (&lt;i&gt;ρ&lt;/i&gt; or &lt;i&gt;U&lt;/i&gt;) - 0 to 1&lt;/li&gt;&lt;li&gt;average number of requests waiting or in service concurrently (&lt;i&gt;L &lt;/i&gt;or&lt;i&gt;&amp;nbsp;&lt;/i&gt;&lt;i&gt;L&lt;sub&gt;q&lt;/sub&gt;&lt;/i&gt;)&lt;/li&gt;&lt;li&gt;number of workers (&lt;i&gt;M&lt;/i&gt;)&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Little%27s_law&quot; target=&quot;_blank&quot;&gt;Little&#39;s Law&lt;/a&gt; states that concurrency is arrival rate times residence 
time:&amp;nbsp;&lt;i&gt;L = λR&lt;/i&gt;&amp;nbsp;and&amp;nbsp;&lt;i&gt;L&lt;sub&gt;q =&amp;nbsp;&lt;/sub&gt;&lt;/i&gt;&lt;i&gt;λ&lt;/i&gt;&lt;i&gt;W&lt;/i&gt;&lt;i&gt;&lt;sub&gt;q&lt;/sub&gt;&lt;/i&gt;&lt;/p&gt;&lt;p&gt;The Utilization Law states that utilization is throughput times service time:&amp;nbsp;&lt;i&gt;ρ = λS&lt;/i&gt; or&amp;nbsp;&lt;i&gt;ρ = λ/μ&lt;/i&gt; or&amp;nbsp;with &lt;i&gt;M &lt;/i&gt;workers&amp;nbsp;&lt;i&gt;ρ = λS/M&lt;/i&gt;&lt;/p&gt;&lt;p&gt;Kendall&#39;s Notation is a shorthand for describing queue systems. Normally we&#39;ll be working with M/M/m systems. The first M says events arrive randomly and independently (memoryless), meaning they are generated by a &lt;a href=&quot;https://mathworld.wolfram.com/PoissonProcess.html&quot; target=&quot;_blank&quot;&gt;Poisson process&lt;/a&gt;. The second M says the service times are &lt;a href=&quot;https://en.wikipedia.org/wiki/Exponential_distribution&quot; target=&quot;_blank&quot;&gt;exponentially distributed&lt;/a&gt;. The last m is the number of workers, &lt;i&gt;M&lt;/i&gt; from above. These are safe assumptions unless you know otherwise.&lt;/p&gt;&lt;p&gt;For an M/M/1 system&amp;nbsp;&lt;i&gt;R = S / (1 - ρ)&lt;/i&gt;, which is a fast-growing curve (the non-linearity mentioned above). Residence time is inversely proportional to idle capacity. By rearranging the equations above we can solve for other parameters as well. For systems beyond M/M/1 it gets more complicated. For M/M/m, for example, we need to bring in &lt;a href=&quot;https://en.wikipedia.org/wiki/Erlang_(unit)#Erlang_C_formula&quot; target=&quot;_blank&quot;&gt;Erlang&#39;s C-formula&lt;/a&gt;.&lt;/p&gt;&lt;div&gt;To do that, and to see these laws in action, it&#39;s time to write some code. There appear to be several Python packages for simulating queueing systems.
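&lt;/div&gt;&lt;div&gt;Before reaching for a simulator, the formulas above can be computed directly. This helper is a sketch of the analytic M/M/m math using Erlang&#39;s C-formula (the example numbers are assumptions for illustration, not the parameters used later):&lt;/div&gt;

```python
import math

def mmm_metrics(lam, S, m):
    """Analytic M/M/m metrics: utilization, mean queue wait, residence time.

    lam = arrival rate, S = mean service time (1/mu), m = number of workers.
    Assumes a stable system, i.e. lam * S is smaller than m.
    """
    a = lam * S            # offered load in Erlangs
    rho = a / m            # per-worker utilization (the Utilization Law)
    # Erlang C: probability an arriving request has to wait in the queue
    top = (a ** m) / (math.factorial(m) * (1.0 - rho))
    bottom = sum((a ** k) / math.factorial(k) for k in range(m)) + top
    erlang_c = top / bottom
    Wq = erlang_c * S / (m * (1.0 - rho))  # mean wait in queue
    R = Wq + S                             # residence = wait + service time
    return rho, Wq, R
```

&lt;div&gt;For m = 1 this collapses to the &lt;i&gt;R = S&lt;/i&gt; / (1 -&amp;nbsp;&lt;i&gt;ρ&lt;/i&gt;) curve, and Little&#39;s Law then gives the average number of waiting requests as&amp;nbsp;&lt;i&gt;λ&lt;/i&gt;&lt;i&gt;W&lt;/i&gt;&lt;i&gt;&lt;sub&gt;q&lt;/sub&gt;&lt;/i&gt;.&lt;/div&gt;&lt;div&gt;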
I will use &lt;a href=&quot;https://github.com/CiwPython/Ciw.git&quot; target=&quot;_blank&quot;&gt;Ciw&lt;/a&gt;&amp;nbsp;for this simulation. We&#39;ll assume we have 1,000 requests per minute, several servers pulling work from a queue, and need enough capacity such that 2 servers could be offline without the system falling behind.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/ae76f6284cd3180b182a30d44cb8f1b1.js&quot;&gt;&lt;/script&gt;&lt;/div&gt;&lt;table border=&quot;1&quot;&gt;&lt;tbody&gt;&lt;tr&gt;&lt;th&gt;&amp;nbsp;servers&amp;nbsp;&lt;/th&gt;&lt;th&gt;&amp;nbsp;utilization %&amp;nbsp;&lt;/th&gt;&lt;th&gt;&amp;nbsp;latency (sec)&amp;nbsp;&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;3&lt;/td&gt;&lt;td&gt;83&lt;/td&gt;&lt;td&gt;.35&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;4&lt;/td&gt;&lt;td&gt;63&lt;/td&gt;&lt;td&gt;.18&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;5&lt;/td&gt;&lt;td&gt;50&lt;/td&gt;&lt;td&gt;.15&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;We can use this simulation to figure out what &quot;levers to pull&quot; to get the behavior we want. It could even become a &lt;a href=&quot;https://www.programmingopiethehokie.com/2018/07/linear-programming.html&quot; target=&quot;_blank&quot;&gt;Linear Programming&lt;/a&gt;&amp;nbsp;optimization problem. Based on the arrival rate and service rate it&#39;s easy to see we need at least 3 servers, but not as easy to see how the latency would change going from 3 to 5 or what the minimum queue size is.&lt;/div&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;While not always intuitive, especially with multiple queues, there are ways to understand how queue-based systems will behave. If you are interested in the topic, I suggest reading the linked resources as they go into much more detail. 
If you have other examples of queueing theory in action, please share in the comments.&lt;/p&gt;</description><link>http://www.programmingopiethehokie.com/2023/01/queueing-simulations.html</link><author>noreply@blogger.com (opiethehokie)</author></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-1401515658801150010.post-7081082265291221472</guid><pubDate>Sun, 23 Oct 2022 16:17:00 +0000</pubDate><atom:updated>2022-10-23T12:17:43.568-04:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">java</category><title>Java Updates Versions 9-19</title><description>&lt;p&gt;The past few years I haven&#39;t written much Java code and when I did, it was Java 8. Many projects, it seems, have stuck with Java 8, which was released back in 2014. Per the &lt;a href=&quot;https://www.oracle.com/java/technologies/java-se-support-roadmap.html&quot; target=&quot;_blank&quot;&gt;roadmap&lt;/a&gt;, Java 8 is designated as LTS, but so are Java 11 and Java 17. In fact, Java 19 is available as of last month and many interesting features have been introduced in the past 8 years. This post is an overview of what&#39;s changed: the highlights, in my opinion, so we&#39;re up-to-date. I think it&#39;s enough to be interesting but not so much that it can&#39;t be picked up quickly if you have experience with older Java versions.&lt;/p&gt;&lt;p&gt;First, some name conventions. &lt;a href=&quot;https://www.baeldung.com/java-enterprise-evolution&quot; target=&quot;_blank&quot;&gt;Java EE is now Jakarta EE&lt;/a&gt;. Definitely don&#39;t call it J2EE anymore. And since Java 11, &lt;a href=&quot;https://www.baeldung.com/oracle-jdk-vs-openjdk&quot; target=&quot;_blank&quot;&gt;Oracle JDK and OpenJDK&lt;/a&gt; are basically the same.&lt;/p&gt;&lt;h4 style=&quot;text-align: left;&quot;&gt;Tooling Updates&lt;/h4&gt;&lt;p style=&quot;text-align: left;&quot;&gt;I won&#39;t go into too much detail here.
If we are interested in using any of these they are explained and documented well elsewhere. For awareness:&lt;/p&gt;&lt;ul style=&quot;text-align: left;&quot;&gt;&lt;li&gt;Java got a REPL called &lt;a href=&quot;https://docs.oracle.com/javase/9/jshell/introduction-jshell.htm&quot; target=&quot;_blank&quot;&gt;JShell&lt;/a&gt; for learning and prototyping interactively (9)&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.baeldung.com/java-9-modularity&quot; target=&quot;_blank&quot;&gt;Java Module System&lt;/a&gt; as a new package abstraction (9) - my impression is that this more for the JDK itself and some libraries while OSGi continues to be applicable for regular, modular apps&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://openjdk.org/jeps/330&quot; target=&quot;_blank&quot;&gt;Single-file programs can be executed directly&lt;/a&gt; and don&#39;t need the intermediate javac step (11)&lt;/li&gt;&lt;li&gt;Multiple new garbage collection options for improved and more consistent performance (11)&lt;/li&gt;&lt;ul&gt;&lt;li&gt;&lt;a href=&quot;https://www.baeldung.com/jvm-garbage-collectors#6-z-garbage-collector&quot; target=&quot;_blank&quot;&gt;ZGC&lt;/a&gt;&amp;nbsp;for low latency&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.baeldung.com/jvm-epsilon-gc-garbage-collector&quot; target=&quot;_blank&quot;&gt;Epsilon GC&lt;/a&gt;&amp;nbsp;no-op&lt;/li&gt;&lt;/ul&gt;&lt;li&gt;&lt;a href=&quot;https://developers.redhat.com/articles/2022/04/19/java-17-whats-new-openjdks-container-awareness#&quot; target=&quot;_blank&quot;&gt;Linux container awareness&lt;/a&gt; for GC, thread pool sizes, etc. 
(11)&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://docs.oracle.com/en/java/javase/17/vm/class-data-sharing.html&quot; target=&quot;_blank&quot;&gt;Application class data sharing (CDS)&lt;/a&gt; for improved startup times and lower memory footprints (12) - this also seems to have &lt;a href=&quot;https://developer.ibm.com/articles/eclipse-openj9-class-sharing-in-docker-containers/&quot; target=&quot;_blank&quot;&gt;implications for apps running in containers&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;/p&gt;&lt;h4 style=&quot;text-align: left;&quot;&gt;API Updates&lt;/h4&gt;&lt;p style=&quot;text-align: left;&quot;&gt;We want to start incorporating these into our code where applicable, so I created examples to help get used to some of these updates.&lt;/p&gt;&lt;p style=&quot;text-align: left;&quot;&gt;&lt;a href=&quot;https://www.baeldung.com/java-interface-private-methods&quot; target=&quot;_blank&quot;&gt;Private interface methods&lt;/a&gt; (9). Helps to encapsulate code in default methods and create more reusable code.&lt;/p&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/0c57be2d07ad1b63ba2c78fb6ea20a4a.js&quot;&gt;&lt;/script&gt;&lt;p style=&quot;text-align: left;&quot;&gt;Variables can have implicit types, including in lambdas, to reduce the verbosity of code (10 and 11).&lt;/p&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/901038cc674a0a1b9b92dd886594af88.js&quot;&gt;&lt;/script&gt;&lt;p style=&quot;text-align: left;&quot;&gt;&lt;a href=&quot;https://mkyong.com/java/java-13-switch-expressions/&quot; target=&quot;_blank&quot;&gt;Switch expressions&lt;/a&gt; to simplify code and prepare for pattern matching in the future (12).&lt;/p&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/1f05f9745924326ffdc1b54e0e998e55.js&quot;&gt;&lt;/script&gt;&lt;p style=&quot;text-align: left;&quot;&gt;&lt;a href=&quot;https://www.baeldung.com/java-text-blocks&quot; target=&quot;_blank&quot;&gt;Text blocks&lt;/a&gt;&amp;nbsp;as way to 
simplify code with multi-line strings (13).&lt;/p&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/047984b71fbca3b65affc903ff7d4a23.js&quot;&gt;&lt;/script&gt;&lt;p&gt;Simple&amp;nbsp;&lt;a href=&quot;https://www.baeldung.com/java-switch-pattern-matching&quot; target=&quot;_blank&quot;&gt;pattern matching&lt;/a&gt;&amp;nbsp;(14). I was introduced to pattern matching when programming in Scala and this seems to continue a trend of Scala features making their way, in some form, to Java. It looks like&amp;nbsp;&lt;a href=&quot;https://openjdk.org/projects/amber/&quot; target=&quot;_blank&quot;&gt;more is coming&lt;/a&gt;&amp;nbsp;in terms of pattern matching options.&lt;/p&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/929cfd782d96b54d72e903ac52a8d100.js&quot;&gt;&lt;/script&gt;&lt;p style=&quot;text-align: left;&quot;&gt;&lt;a href=&quot;https://www.baeldung.com/java-record-keyword&quot; target=&quot;_blank&quot;&gt;Record keyword&lt;/a&gt; for immutable data classes (14). Getters, a public constructor, plus &lt;i&gt;equals&lt;/i&gt;, &lt;i&gt;hashCode&lt;/i&gt;, and &lt;i&gt;toString&lt;/i&gt; methods are generated automatically. Lombok is still more flexible, but this is nice for simple cases.&lt;/p&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/cab6d6d448090daeea848ac46f19b592.js&quot;&gt;&lt;/script&gt;&lt;p style=&quot;text-align: left;&quot;&gt;&lt;a href=&quot;https://www.baeldung.com/java-sealed-classes-interfaces&quot; target=&quot;_blank&quot;&gt;Sealed classes&lt;/a&gt;&amp;nbsp;for fine-grained inheritance control (15). 
Super-classes that are widely accessible but not widely extensible.&lt;/p&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/cccb69002b1135af26a751e78665dc5e.js&quot;&gt;&lt;/script&gt;&lt;div&gt;&lt;h4 style=&quot;text-align: left;&quot;&gt;Paradigm Updates&lt;/h4&gt;&lt;p style=&quot;text-align: left;&quot;&gt;For lack of a better name I&#39;ll call these paradigm updates as they relate more to programming models.&amp;nbsp;&lt;/p&gt;&lt;div&gt;&lt;a href=&quot;https://www.baeldung.com/rxjava-vs-java-flow-api&quot; target=&quot;_blank&quot;&gt;Flow API&lt;/a&gt; as an implementation of the &lt;a href=&quot;http://www.reactive-streams.org/&quot; target=&quot;_blank&quot;&gt;Reactive Streams Specification&lt;/a&gt;&amp;nbsp;(9). This seems to be a way to get the specification interfaces into the JDK but not necessarily replace libraries with better implementations.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;a href=&quot;https://openjdk.org/jeps/426&quot; target=&quot;_blank&quot;&gt;Vector API&lt;/a&gt;&amp;nbsp;introduces vectorization to Java (16). It looks like it doesn&#39;t happen automatically and requires special code, so to be more useful I think it needs to be included in a common library like NumPy in Python.&lt;p&gt;&lt;a href=&quot;https://openjdk.org/jeps/425&quot; target=&quot;_blank&quot;&gt;Virtual threads&lt;/a&gt; and &lt;a href=&quot;https://openjdk.org/jeps/428&quot; target=&quot;_blank&quot;&gt;structured concurrency&lt;/a&gt; (19). The one-to-one kernel to user thread mapping is broken enabling easier asynchronous programming. 
Read/watch &lt;a href=&quot;https://www.infoq.com/presentations/loom-java-concurrency/&quot; target=&quot;_blank&quot;&gt;Project Loom: Revolution in Java Concurrency or Obscure Implementation Detail?&lt;/a&gt;&amp;nbsp;The tl;dr is we&#39;ll still need a higher level of abstraction like reactive programming unless you want to relearn all the low-level concurrency structures.&lt;/p&gt;&lt;/div&gt;</description><link>http://www.programmingopiethehokie.com/2022/10/java-updates-versions-9-19.html</link><author>noreply@blogger.com (opiethehokie)</author></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-1401515658801150010.post-7011132372207040185</guid><pubDate>Fri, 15 Apr 2022 18:43:00 +0000</pubDate><atom:updated>2024-05-12T22:20:42.914-04:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">algorithms</category><category domain="http://www.blogger.com/atom/ns#">machine learning</category><category domain="http://www.blogger.com/atom/ns#">optimization</category><category domain="http://www.blogger.com/atom/ns#">python</category><category domain="http://www.blogger.com/atom/ns#">statistics</category><title>Probabilistic Graphical Models</title><description>&lt;p&gt;Deep learning and neural networks get a lot of (deserved) attention, but there is another class of ML models called Probabilistic Graphical Models (PGMs) that can also be used for inference and prediction. They have applications in fields such as medical diagnosis, image understanding, and speech recognition. Think decision making based on incomplete or insufficient knowledge.&lt;/p&gt;&lt;p&gt;More formally, PGMs use graphs to encode&amp;nbsp;&lt;a href=&quot;https://en.wikipedia.org/wiki/Joint_probability_distribution&quot; target=&quot;_blank&quot;&gt;joint probability distributions&lt;/a&gt;&amp;nbsp;as opposed to the more traditional ML approach of learning a function that directly maps input to a target variable. This post isn&#39;t a technical introduction though. 
Rather, it is more of an introduction-by-example and a summary of&amp;nbsp;&lt;a href=&quot;https://github.com/pgmpy/pgmpy_notebook/blob/master/notebooks&quot; target=&quot;_blank&quot;&gt;pgmpy&#39;s excellent notebooks&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;Given a simple graph for flower type:&lt;/p&gt;&lt;p style=&quot;text-align: center;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgTlcXd7vtAdkeqNYz3nc0DZrKlm-H4oNB8mB1acj436Ge5-awaHhXNOg14lgMm43yotK2dfRMAA1_UwO9KQHSrDe1aXbVkuiWDOyrrtKX4c_fFwXK0Re0Ah09CMHmdDoMiPtZiL14ZApTr/s435/iris.png&quot; style=&quot;clear: left; margin-bottom: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;316&quot; data-original-width=&quot;435&quot; height=&quot;145&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgTlcXd7vtAdkeqNYz3nc0DZrKlm-H4oNB8mB1acj436Ge5-awaHhXNOg14lgMm43yotK2dfRMAA1_UwO9KQHSrDe1aXbVkuiWDOyrrtKX4c_fFwXK0Re0Ah09CMHmdDoMiPtZiL14ZApTr/w200-h145/iris.png&quot; width=&quot;200&quot; /&gt;&lt;/a&gt;&lt;/p&gt;&lt;p style=&quot;text-align: left;&quot;&gt;Our two approaches would look something like this:&lt;/p&gt;&lt;p style=&quot;text-align: left;&quot;&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/7ef9e0e458c1548f3c9ef1a98225f286.js&quot;&gt;&lt;/script&gt;&lt;/p&gt;&lt;p style=&quot;text-align: left;&quot;&gt;&lt;b&gt;Bayesian networks&lt;/b&gt;&lt;/p&gt;&lt;p style=&quot;text-align: left;&quot;&gt;In this section I&#39;ll use a more complex graph for student grades:&lt;/p&gt;&lt;p style=&quot;text-align: center;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEirRwtTTuZgZpnBpKsFh4RwmaMS11LhjLqObfbfN0HlWOPoPO0vMVRU2sRprzRh-bC3XctirdnqitBx0MaIVW51x5ImwcB5Unu_Et6MDeM17ihODC8ERTEGiMGODpNtCnmnQnwqLgLCmyTP/s1564/student.png&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;1023&quot; 
data-original-width=&quot;1564&quot; height=&quot;210&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEirRwtTTuZgZpnBpKsFh4RwmaMS11LhjLqObfbfN0HlWOPoPO0vMVRU2sRprzRh-bC3XctirdnqitBx0MaIVW51x5ImwcB5Unu_Et6MDeM17ihODC8ERTEGiMGODpNtCnmnQnwqLgLCmyTP/w320-h210/student.png&quot; width=&quot;320&quot; /&gt;&lt;/a&gt;&lt;/p&gt;&lt;p style=&quot;text-align: left;&quot;&gt;For problems with many features and/or high cardinality features, inference will be difficult because the size of the joint probability distribution increases exponentially. PGMs can compactly represent it by exploiting&amp;nbsp;&lt;a href=&quot;https://en.wikipedia.org/wiki/Conditional_independence&quot; target=&quot;_blank&quot;&gt;conditional independence&lt;/a&gt;.&amp;nbsp;They provide us efficient methods for doing inference over these joint distributions.&lt;/p&gt;&lt;p style=&quot;text-align: left;&quot;&gt;In this graph we have cardinalities of 2 for each node except Letter which is 3. The joint distribution would require storing 48 values (2*2*2*2*3) while the PGM only requires 26 (see notebook 1 for details).&lt;/p&gt;&lt;p style=&quot;text-align: left;&quot;&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/14bb8eb43a0f7799381d5ef5bdd8294e.js&quot;&gt;&lt;/script&gt;&lt;/p&gt;&lt;p style=&quot;text-align: left;&quot;&gt;This is what&#39;s known as a &lt;a href=&quot;https://en.wikipedia.org/wiki/Bayesian_network&quot; target=&quot;_blank&quot;&gt;Bayesian network&lt;/a&gt;, which is always represented as a directed acyclic graph. Each node is parameterized by a &lt;a href=&quot;https://en.wikipedia.org/wiki/Conditional_probability_distribution&quot; target=&quot;_blank&quot;&gt;conditional probability distribution&lt;/a&gt; (CPD) like P(node|parents). For example, the Grade node has the CPD P(G|D,I). 
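The compactness claim above (48 values versus 26) can be checked with a quick parameter count. Here is a sketch on a hypothetical five-node chain rather than the student network (whose exact CPD sizes are in the notebooks), so the numbers differ, but the idea is the same:

```python
from math import prod

def factored_size(card, parents):
    """Values stored when the joint factorizes into one CPD per node:
    card(node) * product of the parents' cardinalities."""
    return sum(card[n] * prod(card[p] for p in parents[n]) for n in card)

# Hypothetical chain X1 -> X2 -> X3 -> X4 -> X5, all binary.
card = {"X1": 2, "X2": 2, "X3": 2, "X4": 2, "X5": 2}
parents = {"X1": [], "X2": ["X1"], "X3": ["X2"], "X4": ["X3"], "X5": ["X4"]}

joint_size = prod(card.values())         # full joint table: 2**5 = 32
cpd_size = factored_size(card, parents)  # CPD tables: 2 + 4 + 4 + 4 + 4 = 18
```

The gap widens quickly: doubling the chain length squares the size of the joint table while the CPD total only grows linearly.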
Bayesian networks are used when you want to represent causal relationships between random variables.&amp;nbsp;&lt;a href=&quot;https://en.wikipedia.org/wiki/Naive_Bayes_classifier&quot; target=&quot;_blank&quot;&gt;Naive Bayes&lt;/a&gt;&amp;nbsp;is a special case where the features are assumed to be conditionally independent of each other given the target variable, so each feature node connects only to the target.&lt;/p&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgzVOLotwq817DnicVNfDlmpsFcgbKDjOESCV53SVMCQtUlYraplJWJBMwNnfqDFHHJ9n_kCW_cZ-QR1FezD89f222xHkhHRqftJkrUKl5JFX3C88n-Vb9q8rKKCVLqy-FeBWuz006qTRtf/s1111/cpds.png&quot; style=&quot;margin-left: 1em; margin-right: 1em; text-align: center;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;716&quot; data-original-width=&quot;1111&quot; height=&quot;258&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgzVOLotwq817DnicVNfDlmpsFcgbKDjOESCV53SVMCQtUlYraplJWJBMwNnfqDFHHJ9n_kCW_cZ-QR1FezD89f222xHkhHRqftJkrUKl5JFX3C88n-Vb9q8rKKCVLqy-FeBWuz006qTRtf/w400-h258/cpds.png&quot; width=&quot;400&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;p style=&quot;text-align: left;&quot;&gt;&lt;/p&gt;&lt;p style=&quot;text-align: left;&quot;&gt;Given tabular data and a graph structure, CPDs can be estimated using &lt;a href=&quot;https://en.wikipedia.org/wiki/Maximum_likelihood_estimation&quot; target=&quot;_blank&quot;&gt;Maximum Likelihood Estimation&lt;/a&gt; (MLE). It&#39;s similar to what was done with the Iris data in the first code block above. It&#39;s also fragile because it is so dependent on the amount and quality of the observed data (see notebook 10 for details). This explains why that code breaks with some random seeds.&amp;nbsp;&lt;/p&gt;&lt;p style=&quot;text-align: left;&quot;&gt;A better solution is Bayesian Parameter Estimation.
There you start with CPDs based on your prior beliefs (or uniform priors) and update them based on the observed data.&lt;/p&gt;&lt;p style=&quot;text-align: left;&quot;&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/93a825a0aaaa7eb222a1e2baca824c91.js&quot;&gt;&lt;/script&gt;&lt;/p&gt;&lt;p style=&quot;text-align: left;&quot;&gt;One method of exact inference in PGMs is variable elimination. It efficiently avoids computing the entire joint probability distribution (see notebooks 2 and 5 for details). For larger graphs there are other, approximate algorithms because an exact solution would be intractable.&lt;/p&gt;&lt;p style=&quot;text-align: left;&quot;&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/10d7759ae548730fe3e264cea67394fa.js&quot;&gt;&lt;/script&gt;&lt;/p&gt;&lt;p style=&quot;text-align: left;&quot;&gt;Making predictions is similar. Instead of getting a distribution, we get the most probable state.&lt;/p&gt;&lt;p style=&quot;text-align: left;&quot;&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/f55b4838b85846eadc65cbd06cfc869b.js&quot;&gt;&lt;/script&gt;&lt;/p&gt;&lt;p style=&quot;text-align: left;&quot;&gt;&lt;b&gt;Markov networks&lt;/b&gt;&lt;/p&gt;&lt;p style=&quot;text-align: left;&quot;&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Markov_random_field&quot; target=&quot;_blank&quot;&gt;Markov networks&lt;/a&gt; are represented by undirected graphs.&amp;nbsp;They represent non-causal relationships. They can, however, represent dependencies that a Bayesian model can&#39;t, like cycles and bi-directional dependencies. Factors describe connected variable affinity, or how much two nodes agree with each other.
The joint probability distribution is the product of all factors.&lt;/p&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/6f8ff78421ccc70fb420ac044c40ad1a.js&quot;&gt;&lt;/script&gt;&lt;p&gt;&lt;i&gt;A quick note because the names sound similar.&amp;nbsp;&lt;a href=&quot;https://en.wikipedia.org/wiki/Markov_chain&quot; target=&quot;_blank&quot;&gt;Markov chains&lt;/a&gt;&amp;nbsp;are not PGMs because the nodes are not random variables. They can, however, be represented as Bayesian networks, which makes the PGM algorithms available.&lt;/i&gt;&lt;/p&gt;&lt;p style=&quot;text-align: left;&quot;&gt;&lt;b&gt;Sampling&lt;/b&gt;&lt;/p&gt;&lt;p style=&quot;text-align: left;&quot;&gt;Sampling algorithms approximate exact inference by generating a large number of samples that will converge to the original distribution. One of these is Hamiltonian Monte Carlo. It is a&amp;nbsp;Markov chain Monte Carlo&amp;nbsp;(MCMC) algorithm that proposes future states in the Markov chain using Hamiltonian dynamics from physics (see notebook 8 for details). Other MCMC algorithms you may encounter are &lt;a href=&quot;https://en.wikipedia.org/wiki/Metropolis%E2%80%93Hastings_algorithm&quot; target=&quot;_blank&quot;&gt;Metropolis-Hastings&lt;/a&gt; and &lt;a href=&quot;https://en.wikipedia.org/wiki/Gibbs_sampling&quot; target=&quot;_blank&quot;&gt;Gibbs sampling&lt;/a&gt;.
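To make the MCMC idea concrete, here is a minimal random-walk Metropolis-Hastings sampler in plain Python. The standard-normal target and the step size are assumptions for illustration; pgmpy and the probabilistic programming libraries mentioned below provide real implementations.

```python
import math
import random

def metropolis_hastings(log_p, n, step=1.0, x0=0.0, seed=0):
    """Random-walk Metropolis-Hastings: propose x' = x + Normal(0, step)
    and accept with probability min(1, p(x') / p(x))."""
    rng = random.Random(seed)
    x, samples = x0, []
    for _ in range(n):
        proposal = x + rng.gauss(0.0, step)
        # Symmetric proposal, so the acceptance ratio is just p(x') / p(x).
        if math.log(rng.random()) < log_p(proposal) - log_p(x):
            x = proposal
        samples.append(x)  # on rejection the chain repeats the current state
    return samples

# Target: a standard normal, via its log density up to a constant.
chain = metropolis_hastings(lambda x: -0.5 * x * x, 20000)
kept = chain[5000:]  # discard burn-in
mean = sum(kept) / len(kept)
```

With enough samples the empirical mean and variance of `kept` converge to the target's 0 and 1, even though we only ever evaluated an unnormalized density.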
See&amp;nbsp;&lt;a href=&quot;https://towardsdatascience.com/monte-carlo-approximation-methods-which-one-should-you-choose-and-when-886a379fb6b&quot; target=&quot;_blank&quot;&gt;Monte Carlo Approximation Methods: Which one should you choose and when?&lt;/a&gt;&amp;nbsp;for a comparison of these methods.&lt;/p&gt;&lt;p style=&quot;text-align: left;&quot;&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/364c76d85ea9735404d13c7597a83a42.js&quot;&gt;&lt;/script&gt;&lt;/p&gt;&lt;p style=&quot;text-align: left;&quot;&gt;Another interesting find that fits in at this point is the&amp;nbsp;&lt;a href=&quot;https://docs.pymc.io/&quot; target=&quot;_blank&quot;&gt;PyMC3&lt;/a&gt;&amp;nbsp;library and the&amp;nbsp;&lt;a href=&quot;https://camdavidsonpilon.github.io/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/&quot; target=&quot;_blank&quot;&gt;Probabilistic Programming and Bayesian Methods for Hackers&lt;/a&gt;&amp;nbsp;open source book.&amp;nbsp;&lt;/p&gt;&lt;p style=&quot;text-align: left;&quot;&gt;I also think this is a nice writeup on&amp;nbsp;&lt;a href=&quot;https://towardsdatascience.com/bayesian-logistic-regression-in-python-9fae6e6e3e6a&quot; target=&quot;_blank&quot;&gt;Bayesian Logistic Regression&lt;/a&gt;&amp;nbsp;using&amp;nbsp;&lt;a href=&quot;https://pyro.ai/&quot; target=&quot;_blank&quot;&gt;Pyro&lt;/a&gt;, another probabilistic programming library, and MCMC.&lt;/p&gt;&lt;p style=&quot;text-align: left;&quot;&gt;&lt;b&gt;Learning networks&lt;/b&gt;&lt;/p&gt;&lt;p style=&quot;text-align: left;&quot;&gt;Learning a Bayesian network can be done as an optimization problem by scoring networks on how well they fit a data set, and searching through the space of all possible models. 
For non-trivial graphs where an exhaustive search is not possible,&amp;nbsp;&lt;a href=&quot;https://en.wikipedia.org/wiki/Hill_climbing&quot; target=&quot;_blank&quot;&gt;hill climbing&lt;/a&gt;&amp;nbsp;can be used (see notebook 11 for details).&lt;/p&gt;&lt;p style=&quot;text-align: left;&quot;&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/dbc00c4dbe97578fe0fb9a13ec40d3e6.js&quot;&gt;&lt;/script&gt;&lt;/p&gt;&lt;p style=&quot;text-align: left;&quot;&gt;&lt;b&gt;Wrap-up&lt;/b&gt;&lt;/p&gt;&lt;div&gt;I&#39;ve only scratched the surface here, but I think it&#39;s a more intuitive introduction to the topic than most of the material in this space. We could build up to more complex graphs and problems from here.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a href=&quot;https://towardsdatascience.com/hidden-markov-models-explained-with-a-real-life-example-and-python-code-2df2a7956d65&quot; target=&quot;_blank&quot;&gt;Hidden Markov Models Explained with a Real Life Example and Python code&lt;/a&gt;&amp;nbsp;is a nice extension, for example.&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;And to bring things full circle on where PGMs fit in to the ML landscape, here is an &lt;a href=&quot;https://quorasessionwithiangoodfellow.quora.com/Given-the-rise-of-Neural-Networks-what-is-the-future-of-probabilistic-graphical-models-1&quot; target=&quot;_blank&quot;&gt;opinion from well-known ML researcher Ian Goodfellow&lt;/a&gt;:&lt;/span&gt;&lt;/div&gt;&lt;blockquote&gt;&lt;div&gt;&lt;span style=&quot;background-color: white;&quot;&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;&lt;span style=&quot;color: #282829;&quot;&gt;The two aren’t mutually exclusive. Most applications of neural nets can be considered graphical models that use neural nets to provide some of the conditional probability distributions. 
You could argue that the graphical model perspective is growing less useful because so many recent neural models have such simple graph structure &lt;/span&gt;&lt;span style=&quot;color: #282829;&quot;&gt;…&lt;/span&gt;&lt;span style=&quot;color: #282829;&quot;&gt;&amp;nbsp;These graphs are not very structured compared to neural models that were popular a few years ago like …&amp;nbsp;But there are some recent models that make a little bit of use of graph structure, like VAEs with auxiliary variables.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/blockquote&gt;&lt;p style=&quot;text-align: left;&quot;&gt;Plus a tweet from the Stanford NLP group:&lt;/p&gt;&lt;blockquote class=&quot;twitter-tweet&quot;&gt;&lt;p dir=&quot;ltr&quot; lang=&quot;en&quot; style=&quot;text-align: left;&quot;&gt;There is increasing convergence between this decade’s neural models and last decade’s probabilistic graphical models… &lt;a href=&quot;https://t.co/MKlIP4Sayx&quot;&gt;https://t.co/MKlIP4Sayx&lt;/a&gt;&lt;/p&gt;&lt;div style=&quot;text-align: left;&quot;&gt;— Stanford NLP Group (@stanfordnlp) &lt;a href=&quot;https://twitter.com/stanfordnlp/status/996449546497441792?ref_src=twsrc%5Etfw&quot;&gt;May 15, 2018&lt;/a&gt;&lt;/div&gt;&lt;/blockquote&gt; &lt;script async=&quot;&quot; charset=&quot;utf-8&quot; src=&quot;https://platform.twitter.com/widgets.js&quot;&gt;&lt;/script&gt;&lt;p style=&quot;text-align: left;&quot;&gt;Thus it would seem that knowing these concepts will continue to be useful even if we don&#39;t use PGMs directly or exclusively.&lt;/p&gt;</description><link>http://www.programmingopiethehokie.com/2022/04/probabilistic-graphical-models.html</link><author>noreply@blogger.com (opiethehokie)</author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgTlcXd7vtAdkeqNYz3nc0DZrKlm-H4oNB8mB1acj436Ge5-awaHhXNOg14lgMm43yotK2dfRMAA1_UwO9KQHSrDe1aXbVkuiWDOyrrtKX4c_fFwXK0Re0Ah09CMHmdDoMiPtZiL14ZApTr/s72-w200-h145-c/iris.png" height="72" width="72"/></item>
<item><guid isPermaLink="false">tag:blogger.com,1999:blog-1401515658801150010.post-235040763904237946</guid><pubDate>Fri, 19 Feb 2021 13:08:00 +0000</pubDate><atom:updated>2025-07-09T20:58:38.378-04:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">big data</category><category domain="http://www.blogger.com/atom/ns#">machine learning</category><category domain="http://www.blogger.com/atom/ns#">parallelism</category><category domain="http://www.blogger.com/atom/ns#">python</category><title>ML at Scale Part 3: distributed compute</title><description>&lt;p&gt;In &lt;a href=&quot;http://www.programmingopiethehokie.com/2021/02/ml-at-scale-part-2-memory.html&quot; target=&quot;_blank&quot;&gt;part 2&lt;/a&gt; I focused on ML when your data won&#39;t fit in memory. This post will move on to slow, compute-bound ML instead. I&#39;ll continue to use Dask and explore how it can help us.&lt;/p&gt;&lt;p&gt;Dask leverages multiple CPU cores to enable efficient parallel computation on a single machine. It can also run on a thousand-machine cluster, breaking up computations and routing them efficiently. The &lt;a href=&quot;https://docs.dask.org/en/latest/spark.html&quot; target=&quot;_blank&quot;&gt;Comparison to Spark&lt;/a&gt; documentation is a great reference for understanding Dask in the context of an older tool and that older post.
Perhaps the most interesting difference is that Spark is essentially an extension of the MapReduce paradigm, while Dask is built on generic task scheduling, which lets it implement more sophisticated algorithms.&lt;/p&gt;&lt;p&gt;Sticking with the scikit-learn examples, here is replacing a parallel algorithm&#39;s &lt;a href=&quot;https://joblib.readthedocs.io/en/latest/&quot; target=&quot;_blank&quot;&gt;Joblib&lt;/a&gt;&amp;nbsp;backend with Dask to (potentially) spread work out across a cluster.&lt;/p&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/d89173366923ee33f4b7c73494f67fba.js&quot;&gt;&lt;/script&gt;&lt;p&gt;The Dask documentation is quick to point out in their &lt;a href=&quot;https://docs.dask.org/en/latest/best-practices.html&quot; target=&quot;_blank&quot;&gt;best practices&lt;/a&gt; that not everyone needs distributed ML as it has some overhead. Compiling code with Numba or Cython could help, as could intelligently sampling some of your data. In &lt;a href=&quot;https://www.programmingopiethehokie.com/2017/02/machine-learning-for-ncaa-basketball.html&quot; target=&quot;_blank&quot;&gt;this post&lt;/a&gt; I got huge speedups by vectorizing some code that was looping through large matrices.&lt;/p&gt;&lt;p&gt;Across this three-part series we&#39;ve now seen how to speed up reading large datasets, work with datasets that don&#39;t fit entirely in memory, and distribute processing across multiple machines. There&#39;s obviously a lot more to this, but I wanted to develop a better intuition for how to approach these types of issues and at least know where to start. Hopefully you learned something too.&lt;/p&gt;&lt;p&gt;UPDATE:&lt;/p&gt;&lt;p&gt;A specific tool isn&#39;t supposed to be the focus of this post. Dask was used here to illustrate the idea and show how simple it can be, but there are other options in this space.
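The reason the backend swap is nearly free is that the units of work are independent. A stand-in sketch using only the standard library; the toy `evaluate` function and hyperparameter grid are made up for illustration:

```python
from concurrent.futures import ThreadPoolExecutor

def evaluate(params):
    """Stand-in for fitting a model and returning a validation score;
    a real version would train an estimator with these hyperparameters."""
    c, gamma = params
    return -((c - 1.0) ** 2 + (gamma - 0.1) ** 2)

grid = [(c, g) for c in (0.1, 1.0, 10.0) for g in (0.01, 0.1, 1.0)]

# Each candidate is scored independently, so the executor can be local
# threads, local processes, or a cluster scheduler -- the surrounding
# code doesn't change. That's the idea behind swapping Joblib's backend
# for Dask.
with ThreadPoolExecutor(max_workers=4) as pool:
    scores = list(pool.map(evaluate, grid))

best = grid[max(range(len(grid)), key=scores.__getitem__)]
```

The same shape of code works for cross-validation folds, ensemble members, or any other embarrassingly parallel training loop.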
Here are a couple of other examples that I&#39;ve come across:&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;ul style=&quot;text-align: left;&quot;&gt;&lt;li&gt;&lt;a href=&quot;https://github.com/ray-project/tune-sklearn&quot; target=&quot;_blank&quot;&gt;tune-sklearn&lt;/a&gt;, which is part of the &lt;a href=&quot;https://docs.ray.io/en/master/&quot; target=&quot;_blank&quot;&gt;Ray&lt;/a&gt; framework for building distributed applications, can do distributed hyperparameter tuning&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.tensorflow.org/guide/distributed_training&quot; target=&quot;_blank&quot;&gt;Distributed training with TensorFlow&lt;/a&gt;&amp;nbsp;shows how you can distribute model training within the TensorFlow ecosystem&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;&lt;p&gt;UPDATE 2:&lt;/p&gt;&lt;p&gt;Both&amp;nbsp;&lt;a href=&quot;https://pandas.pydata.org/docs/dev/whatsnew/v2.0.0.html&quot; target=&quot;_blank&quot;&gt;Pandas 2.0&lt;/a&gt;&amp;nbsp;and&amp;nbsp;&lt;a href=&quot;https://www.pola.rs/&quot; target=&quot;_blank&quot;&gt;Polars&lt;/a&gt;&amp;nbsp;now use Apache Arrow as a memory model. Polars, a relatively new entrant to this space, is implemented in Rust and exposes a Python API. It is created specifically for fast data processing, not ML, but overlaps enough with this series that it is worth checking out.&lt;/p&gt;&lt;p&gt;UPDATE 3:&lt;/p&gt;&lt;p&gt;Since I mentioned Numba earlier, I&#39;ll also mention &lt;a href=&quot;https://github.com/jax-ml/jax&quot; target=&quot;_blank&quot;&gt;JAX&lt;/a&gt;, another, newer library in that space. JAX has a NumPy-like interface, works with GPUs, offers JIT compilation, and supports automatic differentiation, vectorization, and parallelization. Check out their&amp;nbsp;&lt;a href=&quot;https://docs.jax.dev/en/latest/notebooks/Neural_Network_and_Data_Loading.html#&quot; target=&quot;_blank&quot;&gt;simple NN example&lt;/a&gt; to see it in action.
It&#39;s not replacing distributed compute or even competing with Dask; the point is more that there are now a lot of amazing tools that make working with data easier and faster.&lt;/p&gt;&lt;/div&gt;&lt;p&gt;&lt;/p&gt;</description><link>http://www.programmingopiethehokie.com/2021/02/ml-at-scale-part-3-distributed-compute.html</link><author>noreply@blogger.com (opiethehokie)</author></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-1401515658801150010.post-1821860980486358031</guid><pubDate>Mon, 15 Feb 2021 22:28:00 +0000</pubDate><atom:updated>2021-12-30T11:10:50.714-05:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">big data</category><category domain="http://www.blogger.com/atom/ns#">machine learning</category><category domain="http://www.blogger.com/atom/ns#">python</category><title>ML at Scale Part 2: memory</title><description>&lt;p&gt;When data we want to train a machine learning model on becomes too big to fit in memory, we need to find a way to work on subsets of the data. I hinted at this in &lt;a href=&quot;https://www.programmingopiethehokie.com/2021/02/ml-at-scale-part-1-io.html&quot; target=&quot;_blank&quot;&gt;part 1&lt;/a&gt;. We can use Pandas to read chunks of a file, but that is fairly primitive and slow.&lt;/p&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/b13ab626ff99cb1a527a673180e9da96.js&quot;&gt;&lt;/script&gt;&lt;p&gt;Libraries like &lt;a href=&quot;https://github.com/vaexio/vaex&quot; target=&quot;_blank&quot;&gt;Vaex&lt;/a&gt; and&lt;a href=&quot;https://dask.org/&quot; target=&quot;_blank&quot;&gt; Dask&lt;/a&gt; attempt to abstract this away.&amp;nbsp;&lt;/p&gt;&lt;p&gt;Vaex provides lazy, out-of-core (not all in memory at once) DataFrames via memory-mapping. Pre-processing and feature engineering are more efficient, and memory is freed up for model training.
It also has a &lt;a href=&quot;https://vaex.io/docs/tutorial_ml.html&quot;&gt;vaex.ml&lt;/a&gt; package which provides a scikit-learn wrapper.&lt;/p&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/878cc1e19f84785e09665975a4052490.js&quot;&gt;&lt;/script&gt;&lt;p&gt;Dask provides large, parallel DataFrames composed of smaller Pandas DataFrames. This helps with data too big to fit in memory because the individual Pandas DataFrames can be stored on disk. &lt;a href=&quot;https://ml.dask.org/index.html&quot; target=&quot;_blank&quot;&gt;DaskML&lt;/a&gt; provides estimators designed to work with Dask DataFrames.&lt;/p&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/44ff311da0dbc2d772907a2143bd9a54.js&quot;&gt;&lt;/script&gt;&lt;p&gt;The accuracies both came out to 96%. Similar ideas, different implementation. In fact, these DataFrames remind me a little bit of the persistent data structures covered in my &lt;a href=&quot;http://www.programmingopiethehokie.com/2014/03/exploring-immutability.html&quot; target=&quot;_blank&quot;&gt;Exploring Immutability&lt;/a&gt; post.&lt;/p&gt;&lt;p&gt;Even with these fancy DataFrames, many machine learning algorithms are designed to train on all the data at once. If our data is too big to fit in memory then that&#39;s going to be a problem. As in the code above, we need to use online learning or &lt;a href=&quot;https://scikit-learn.org/0.15/modules/scaling_strategies.html#incremental-learning&quot; target=&quot;_blank&quot;&gt;incremental algorithms&lt;/a&gt;&amp;nbsp;to solve this problem.
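A minimal sketch of the online-learning idea in plain Python: logistic regression trained one mini-batch at a time, so only the current batch is ever in memory. The synthetic data and learning rate are assumptions for illustration; DaskML and scikit-learn's partial_fit-style estimators do this for real.

```python
import math
import random

def partial_fit(w, batch, lr=0.5):
    """One SGD pass over a single mini-batch of (features, label) pairs --
    the core of incremental learners exposed via partial_fit-style APIs."""
    for x, y in batch:
        z = sum(wi * xi for wi, xi in zip(w, x))
        p = 1.0 / (1.0 + math.exp(-z))  # predicted probability
        for i, xi in enumerate(x):
            w[i] -= lr * (p - y) * xi   # gradient step for one example
    return w

rng = random.Random(0)

def make_batch(n=50):
    """Synthetic stream: label is 1 when x1 + x2 > 0; the constant 1.0
    feature folds the bias term into the weights."""
    xs = [(rng.uniform(-1, 1), rng.uniform(-1, 1), 1.0) for _ in range(n)]
    return [(x, 1 if x[0] + x[1] > 0 else 0) for x in xs]

w = [0.0, 0.0, 0.0]
for _ in range(40):  # 40 batches "streamed" in, never all in memory at once
    w = partial_fit(w, make_batch())

test = make_batch(200)
accuracy = sum(
    (sum(wi * xi for wi, xi in zip(w, x)) > 0) == (y == 1) for x, y in test
) / len(test)
```

Nothing about the update rule changes whether the batches come from a list, a file on disk, or a network stream, which is why this pattern scales past memory limits.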
The &lt;i&gt;Incremental&lt;/i&gt; and &lt;i&gt;IncrementalPredictor&lt;/i&gt;&amp;nbsp;classes handle &quot;streaming&quot; the data (in batches) and we specify upfront the possible classes.&lt;/p&gt;&lt;p&gt;DaskML adds several &lt;a href=&quot;https://ml.dask.org/glm.html&quot; target=&quot;_blank&quot;&gt;generalized linear model&lt;/a&gt; implementations.&amp;nbsp;&lt;a href=&quot;https://scikit-multiflow.github.io/&quot; target=&quot;_blank&quot;&gt;scikit-multiflow&lt;/a&gt; is designed for actual streaming data and adds several other online learning algorithms. Neural networks are also trained in this manner, often being fed &quot;mini-batches&quot;, so they are good candidates for datasets that don&#39;t fit in memory as well.&lt;/p&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/893a9f984d16b678dda4f777dd2d68b6.js&quot;&gt;&lt;/script&gt;&lt;p&gt;Stay tuned for part 3 where I&#39;ll get into being compute-constrained instead of memory-constrained.&lt;/p&gt;&lt;p&gt;UPDATE:&lt;/p&gt;&lt;p&gt;There&#39;s a lot of I/O and memory-related work coming out of the TensorFlow community as well. 
Check out &lt;a href=&quot;https://www.tensorflow.org/guide/data_performance?utm_source=pocket_mylist#overview&quot; target=&quot;_blank&quot;&gt;Better performance with the tf.data API&lt;/a&gt; as it overlaps nicely with Parts 1 and 2 of this series.&lt;/p&gt;</description><link>http://www.programmingopiethehokie.com/2021/02/ml-at-scale-part-2-memory.html</link><author>noreply@blogger.com (opiethehokie)</author></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-1401515658801150010.post-5180214187342057319</guid><pubDate>Sun, 07 Feb 2021 19:48:00 +0000</pubDate><atom:updated>2021-12-10T15:30:34.348-05:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">big data</category><category domain="http://www.blogger.com/atom/ns#">machine learning</category><category domain="http://www.blogger.com/atom/ns#">python</category><title>ML at Scale Part 1: I/O</title><description>&lt;p&gt;I recently came across a somewhat large dataset from a Kaggle competition where the data was provided as an approximately 6 GB CSV file. A frequent comment in the discussion forum was how long it took just to read this file. A few GBs is large enough where this starts to become noticeable, but it&#39;s not really that big. If a laptop can have a 1+ TB drive and 32 GB memory then this isn&#39;t even in the realm of &quot;big data&quot;. That&#39;s good, though, because it means there are some simple tricks we can use to cut down on that read time.&lt;/p&gt;&lt;p&gt;The pandas read_csv() method takes about 63 seconds for me. That&#39;s our baseline.&lt;/p&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/d858442828b82ebadf269fa42fc3abb6.js&quot;&gt;&lt;/script&gt;&lt;p&gt;First we try reducing the precision. This gets us to 59 seconds. Not great.&lt;/p&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/2627d15f07eada59f6caa1beec2c9801.js&quot;&gt;&lt;/script&gt;&lt;p&gt;Next we try reading the file in chunks.
This is actually slower, but the technique could help us if the file didn&#39;t fit entirely in memory. More on that in future posts.&lt;/p&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/b13ab626ff99cb1a527a673180e9da96.js&quot;&gt;&lt;/script&gt;&lt;p&gt;Then we try &lt;a href=&quot;https://dask.org/&quot; target=&quot;_blank&quot;&gt;Dask&lt;/a&gt;, which spreads the work across multiple processors. 30 seconds. Better, but I still don&#39;t want to wait that long. More on Dask in future posts as well.&lt;/p&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/7e11019ba1307aa6456da99b307b3623.js&quot;&gt;&lt;/script&gt;&lt;p&gt;Finally we convert the CSV file to a different format. I tried &lt;a href=&quot;https://arrow.apache.org/docs/python/parquet.html&quot; target=&quot;_blank&quot;&gt;Apache Parquet&lt;/a&gt; but there are others. It&#39;s a binary columnar format (remember &lt;a href=&quot;http://www.programmingopiethehokie.com/2020/06/column-oriented-database-basics.html&quot; target=&quot;_blank&quot;&gt;Column-oriented Database Basics&lt;/a&gt;?). Stored in this manner the data is 2.5 GB. And this gets us to just 3 seconds for reading the whole file!&lt;/p&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/c32a1551236369b6694a245499ac9eec.js&quot;&gt;&lt;/script&gt;&lt;p&gt;Converting our data to the binary file format and possibly reducing the precision or using Dask as well would really shorten our feedback loop while training an ML model.
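For reference, the chunked-reading pattern mentioned above is simple enough to sketch with just the standard library; a tiny in-memory CSV stands in for the 6 GB file:

```python
import csv
import io
from itertools import islice

def chunks(reader, size):
    """Yield successive lists of up to `size` rows -- the same shape of
    iteration as pandas.read_csv(..., chunksize=size)."""
    while chunk := list(islice(reader, size)):
        yield chunk

# Stand-in for a file too big to load all at once.
data = io.StringIO("value\n" + "\n".join(str(i) for i in range(10)))

total = 0
for chunk in chunks(csv.DictReader(data), size=4):  # chunks of 4, 4, 2 rows
    total += sum(int(row["value"]) for row in chunk)  # running aggregate
```

Because only one chunk is materialized at a time, peak memory is bounded by the chunk size rather than the file size.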
When the data is this size, it would seem that any cleaning or preprocessing we can do ahead of time makes sense to do once, before converting the file format.&lt;/p&gt;&lt;p&gt;UPDATE:&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://arrow.apache.org/docs/index.html&quot; target=&quot;_blank&quot;&gt;Apache Arrow&lt;/a&gt; is another project to check out in this space along with &lt;a href=&quot;https://en.wikipedia.org/wiki/Memory-mapped_file&quot; target=&quot;_blank&quot;&gt;memory-mapped files&lt;/a&gt;. Reading this data in the Feather file format is even faster than Parquet.&lt;/p&gt;</description><link>http://www.programmingopiethehokie.com/2021/02/ml-at-scale-part-1-io.html</link><author>noreply@blogger.com (opiethehokie)</author></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-1401515658801150010.post-1369398867423112900</guid><pubDate>Sun, 27 Dec 2020 02:20:00 +0000</pubDate><atom:updated>2023-07-09T09:53:39.628-04:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">bindings</category><category domain="http://www.blogger.com/atom/ns#">javascript</category><category domain="http://www.blogger.com/atom/ns#">rust</category><title>Rust to WebAssembly and Fibonacci Numbers</title><description>&lt;p&gt;&lt;a href=&quot;https://developer.mozilla.org/en-US/docs/WebAssembly/Concepts&quot; target=&quot;_blank&quot;&gt;WebAssembly&lt;/a&gt; (wasm) is a virtual assembly language. It&#39;s a binary instruction format that can be executed at near-native speeds by JavaScript engines like V8, which means it can run in a browser or Node.js app. Not intended as a JavaScript replacement, it instead works with it for performance-critical pieces of code. You can make calls from JavaScript to WebAssembly and vice versa.
It can be useful for games, VR, and AR, for example.&amp;nbsp;&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://www.rust-lang.org/&quot; target=&quot;_blank&quot;&gt;Rust&lt;/a&gt; is a fast and memory-efficient programming language with interesting type and thread-safety characteristics. I&#39;ve never used it, but have wanted to check it out for a while. Along with Go and C, it can easily be compiled into WebAssembly. This post just touches the surface of Rust.&lt;/p&gt;&lt;p&gt;First, to learn how to use Rust in a JavaScript app, I followed &lt;a href=&quot;https://developer.mozilla.org/en-US/docs/WebAssembly/Rust_to_wasm&quot; target=&quot;_blank&quot;&gt;Compiling from Rust to WebAssembly&lt;/a&gt;. They guide you through compiling Rust code, generating a JavaScript wrapper package, then using npm and webpack to run it. The end user only needs a modern browser and they can execute the code originally written in Rust none the wiser.&lt;/p&gt;&lt;p&gt;Building on that, as a simple experiment to see how much faster Rust code compiled to WebAssembly can be versus vanilla JavaScript, I compared calculating the nth Fibonacci number in each. I used the inefficient recursive algorithm intentionally, wanting it to be slow.&lt;/p&gt;&lt;p&gt;The function in Rust looks like this:&lt;/p&gt;&lt;p&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/1e20fe003dd7b8bc1c332077341aede9.js&quot;&gt;&lt;/script&gt;&lt;/p&gt;&lt;p&gt;And the JavaScript code that calls it, times it, and compares it:&lt;/p&gt;&lt;p&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/fc2bcac26d18c8a89a580715296f0d79.js&quot;&gt;&lt;/script&gt;&lt;/p&gt;&lt;p&gt;Surprisingly, although maybe it shouldn&#39;t be, the WebAssembly version of this simple function is orders of magnitude faster when executed in the browser. 
At n=35, where they really start to diverge, I&#39;m seeing about 0.06 seconds versus 3.5 seconds.&amp;nbsp;&lt;/p&gt;&lt;p&gt;I don&#39;t often have to write CPU-intensive code like this in JavaScript, but with such a stark performance difference possible it&#39;s a good tool to have in the toolbox.&amp;nbsp;&lt;/p&gt;</description><link>http://www.programmingopiethehokie.com/2020/12/rust-to-webassembly-and-fibonacci.html</link><author>noreply@blogger.com (opiethehokie)</author></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-1401515658801150010.post-1924246420871301413</guid><pubDate>Wed, 04 Nov 2020 22:43:00 +0000</pubDate><atom:updated>2020-11-04T17:43:13.433-05:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">APIs</category><category domain="http://www.blogger.com/atom/ns#">cloud</category><category domain="http://www.blogger.com/atom/ns#">nodejs</category><title>GraphQL API Gateway Prototype</title><description>&lt;p&gt;I&#39;ve been meaning to check out &lt;a href=&quot;https://graphql.org/&quot; target=&quot;_blank&quot;&gt;GraphQL&lt;/a&gt; for a while. At work I&#39;m seeing calls to REST APIs being made to get one small piece of data, most of the response discarded. Or, multiple HTTP requests, usually sequential, whose responses must be combined to do something useful with the retrieved data. Their granularity is wrong for the use case, but maybe not for someone else&#39;s. Can GraphQL help with this? It seems like it can. Model the business domain as a graph, like our mental models and object-oriented programming. Engines for many languages. A &lt;a href=&quot;https://graphql.org/learn/schema/&quot; target=&quot;_blank&quot;&gt;type system&lt;/a&gt; to enable good developer tools. It sounds great.&lt;/p&gt;&lt;p&gt;At the risk of adding another network hop, another moving part, another layer, it would be nice to be able to call something to get exactly the data I need without changing the existing APIs.
Mobile devices or slow internet connections could benefit from the reduced number of round trips. This extra layer could abstract away the different (read: poor) uses of HTTP status codes and different response body styles. What I think I really want is a GraphQL API gateway.&lt;/p&gt;&lt;p&gt;The &lt;a href=&quot;https://microservices.io/patterns/apigateway.html&quot; target=&quot;_blank&quot;&gt;API gateway&lt;/a&gt;&amp;nbsp;is a single entry point for all clients (the backends for frontends variation of the pattern has a different gateway optimized for each type of client). Requests can be proxied straight through to a single microservice or fanned out to several microservices. Responses can be aggregated and/or modified. This is the overlap point with GraphQL and why I think they would go well together. The API gateway can also centralize several cross-cutting concerns like throttling, routing, circuit-breaking, input validation, authentication (authorization stays in the business logic), etc.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://www.apollographql.com/&quot; target=&quot;_blank&quot;&gt;Apollo&lt;/a&gt; is the GraphQL implementation I went with to prototype this. From &lt;a href=&quot;https://www.apollographql.com/docs/tutorial/introduction/&quot; target=&quot;_blank&quot;&gt;their tutorial&lt;/a&gt; I started with the rocket launch API and extended it with a made-up weather API to see how the two could be chained together. To the client, it&#39;s seamless. Both &quot;services&quot; are part of the same graph.&lt;/p&gt;&lt;p&gt;This is a lot of code to show in one shot, but I&#39;ll explain it below and then show some example GraphQL queries.&lt;/p&gt;
&lt;script src=&quot;https://gist.github.com/opiethehokie/9e0e44790c16af5d417ee54baacafe0a.js&quot;&gt;&lt;/script&gt;
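Before walking through the gist, here is a rough model of the key idea in plain Python rather than Apollo: field resolvers receive their parent's result, so a Weather field can be resolved from the site of its parent Launch. All names and data here are made up for illustration.

```python
# Hypothetical, minimal model of resolver chaining in a GraphQL
# gateway (plain Python, not Apollo; all names are illustrative).

def fetch_launch(launch_id):
    # stand-in for a call to the rocket launch REST API
    return {"id": launch_id, "site": "KSC LC 39A"}

def fetch_weather(site):
    # stand-in for a call to the weather REST API
    return {"site": site, "forecast": "sunny"}

# resolvers keyed by (type, field), mirroring a GraphQL resolver map;
# each resolver gets the parent object's result plus field arguments
resolvers = {
    ("Query", "launch"): lambda parent, args: fetch_launch(args["id"]),
    ("Launch", "weather"): lambda parent, args: fetch_weather(parent["site"]),
}

def resolve(type_name, field, parent=None, args=None):
    return resolvers[(type_name, field)](parent, args or {})

# resolving something like { launch(id: 42) { weather { forecast } } }
launch = resolve("Query", "launch", args={"id": 42})
weather = resolve("Launch", "weather", parent=launch)
```

The point of the sketch is only the chaining: the weather resolver never makes its own decision about which site to use, it inherits it from the launch that was resolved above it.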
&lt;p&gt;First, &lt;i&gt;typeDefs&lt;/i&gt; defines the schema. You can query a list of rocket launches or a single launch by ID. &lt;i&gt;dataSources&lt;/i&gt; specifies, of course, where the data is coming from, like a database or REST API. &lt;i&gt;resolvers&lt;/i&gt; stitches these together. Notice how the Weather type takes a site from its parent, a Launch. This is how they are linked, or chained together.&lt;/p&gt;&lt;p&gt;With the Apollo server running you can try it out at http://localhost:4000/ in a browser. The GraphQL queries are on the left and the responses are on the right.&lt;/p&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEivh2E6BvNZr_ZuFj2V2wanBsxpkZrmf6-qKZ1rrBvDZG0jdOFMW2W6ZeIFOumJlHtmCZ0Jv7AR8EsJDJyd_pIvGx7MGv9Wxa-ZijVg8efMSKzvlQCYaWrv105AfZFfyEtBWJAEsf1qD9xF/s821/getLaunchById.PNG&quot; imageanchor=&quot;1&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;277&quot; data-original-width=&quot;821&quot; height=&quot;216&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEivh2E6BvNZr_ZuFj2V2wanBsxpkZrmf6-qKZ1rrBvDZG0jdOFMW2W6ZeIFOumJlHtmCZ0Jv7AR8EsJDJyd_pIvGx7MGv9Wxa-ZijVg8efMSKzvlQCYaWrv105AfZFfyEtBWJAEsf1qD9xF/w640-h216/getLaunchById.PNG&quot; width=&quot;640&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh4O_p3JDPIci83AFZYvuZ8IZEY6w8jGMvx5IEtBlCQa6Ryyfozem0wPZzKKL4JjcWfbGyEut_-ixEZ1O8Yr1dZCQ-7JuYTtLGzjXUCo3hpHFQyM-Z2q1dh8xMHtt00CxLQN5Hr7ADnAJyE/s1011/getLaunches.PNG&quot; imageanchor=&quot;1&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;666&quot; data-original-width=&quot;1011&quot; 
height=&quot;422&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh4O_p3JDPIci83AFZYvuZ8IZEY6w8jGMvx5IEtBlCQa6Ryyfozem0wPZzKKL4JjcWfbGyEut_-ixEZ1O8Yr1dZCQ-7JuYTtLGzjXUCo3hpHFQyM-Z2q1dh8xMHtt00CxLQN5Hr7ADnAJyE/w640-h422/getLaunches.PNG&quot; width=&quot;640&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;p&gt;It&#39;s cool to see the different responses without having to write any code specifically to handle them. A natural extension to this prototype would be to add &lt;a href=&quot;https://www.apollographql.com/docs/tutorial/mutation-resolvers/&quot; target=&quot;_blank&quot;&gt;mutation resolvers&lt;/a&gt;&amp;nbsp;so clients could also update the graph.&lt;/p&gt;&lt;p&gt;Finally, the &lt;a href=&quot;https://www.thoughtworks.com/radar#graphql-grandiosity&quot; target=&quot;_blank&quot;&gt;ThoughtWorks Tech Radar&lt;/a&gt; cautions against trying to create a universal, canonical, centralized data model. I think the &lt;a href=&quot;https://martinfowler.com/bliki/BoundedContext.html&quot; target=&quot;_blank&quot;&gt;bounded context&lt;/a&gt; ideas from DDD would apply. In their &lt;a href=&quot;https://www.thoughtworks.com/radar/techniques?blipid=202005092&quot; target=&quot;_blank&quot;&gt;zero trust architecture blurb&lt;/a&gt;&amp;nbsp;they also mention that a network perimeter isn&#39;t a security boundary anymore. That makes me question thinking of the API gateway as a place to shift all those cross-cutting concerns to. Do users have to go through the gateway or can they hit microservices directly? 
They mention &lt;a href=&quot;https://istio.io/latest/docs/concepts/what-is-istio/&quot; target=&quot;_blank&quot;&gt;service mesh&lt;/a&gt; as a solution and that would seem to be an API gateway alternative, but &lt;a href=&quot;https://blog.christianposta.com/microservices/do-i-need-an-api-gateway-if-i-have-a-service-mesh/&quot; target=&quot;_blank&quot;&gt;they aren&#39;t mutually exclusive&lt;/a&gt;&amp;nbsp;either.&lt;/p&gt;</description><link>http://www.programmingopiethehokie.com/2020/11/graphql-api-gateway-prototype.html</link><author>noreply@blogger.com (opiethehokie)</author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEivh2E6BvNZr_ZuFj2V2wanBsxpkZrmf6-qKZ1rrBvDZG0jdOFMW2W6ZeIFOumJlHtmCZ0Jv7AR8EsJDJyd_pIvGx7MGv9Wxa-ZijVg8efMSKzvlQCYaWrv105AfZFfyEtBWJAEsf1qD9xF/s72-w640-h216-c/getLaunchById.PNG" height="72" width="72"/></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-1401515658801150010.post-6870051027254321630</guid><pubDate>Sat, 13 Jun 2020 21:06:00 +0000</pubDate><atom:updated>2020-06-13T17:06:41.620-04:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">big data</category><category domain="http://www.blogger.com/atom/ns#">nosql</category><title>Column-oriented Database Basics</title><description>&lt;div&gt;This is a short post on &lt;a href=&quot;https://en.wikipedia.org/wiki/Column-oriented_DBMS&quot; target=&quot;_blank&quot;&gt;column-oriented databases&lt;/a&gt;. I&#39;ll barely scratch the surface, but among the types of NoSQL databases—document, key-value, column-oriented and graph—I&#39;ve always thought column-oriented was the most difficult to wrap my head around. Hopefully we can get past that initial hurdle here, run a few queries, and they will seem like less of a mystery.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A relational database is optimized for retrieving rows of data. 
This works well for &lt;a href=&quot;https://en.wikipedia.org/wiki/Online_transaction_processing&quot; target=&quot;_blank&quot;&gt;transactional applications&lt;/a&gt;. A column-oriented database is optimized for retrieving columns of data. This works well for &lt;a href=&quot;https://en.wikipedia.org/wiki/Online_analytical_processing&quot; target=&quot;_blank&quot;&gt;analytical applications&lt;/a&gt; and some queries, like aggregations, become really fast because much less data needs to be read from disk to retrieve the whole column. There seems to be a lot of overlap between column-oriented databases and data warehouses.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A relational database would store 3 rows of data like this:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;i&gt;1:a,b,c;2:d,e,f;3:g,h,i&lt;/i&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;While a column-oriented database would store the same data like this:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;i&gt;a:1,d:2,g:3;b:1,e:2,h:3;c:1,f:2,i:3&lt;/i&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;For a low-cardinality column, compression algorithms work very well. Something like &lt;i&gt;a:2,a:3&lt;/i&gt; becomes &lt;i&gt;a:2,3&lt;/i&gt;. In some ways this is like &lt;a href=&quot;https://en.wikipedia.org/wiki/Database_normalization&quot; target=&quot;_blank&quot;&gt;normalizing a relational database&lt;/a&gt;&amp;nbsp;to reduce data duplication, but I don&#39;t think that enables the same level of compression or gives you the same data locality benefits.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Of course column-oriented databases aren&#39;t good for all workloads. They aren&#39;t optimized for queries that touch many fields. Writes can also be slow since they aren&#39;t just appending to the end of a file.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So where do they fit into a big data architecture? 
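As a toy illustration of the row versus column layouts above, here is why a single-column aggregate favors columnar storage; the field names are made up for the example.

```python
# Row layout: whole records kept together, like a relational database.
rows = [
    {"id": 1, "event": "concert", "qtysold": 2},
    {"id": 2, "event": "show", "qtysold": 4},
    {"id": 3, "event": "concert", "qtysold": 1},
]

# Column layout: each field's values kept together and contiguous.
columns = {
    "id": [1, 2, 3],
    "event": ["concert", "show", "concert"],
    "qtysold": [2, 4, 1],
}

# Row store: every full record must be scanned to sum one field.
total_rows = sum(r["qtysold"] for r in rows)

# Column store: only the one column's values are read.
total_cols = sum(columns["qtysold"])
```

In memory the two totals are of course the same; the difference on disk is that the column store reads a fraction of the bytes for this query.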
When thinking about this I remembered the article &lt;a href=&quot;https://martinfowler.com/articles/data-monolith-to-mesh.html&quot; target=&quot;_blank&quot;&gt;How to Move Beyond a Monolithic Data Lake to a Distributed Data Mesh&lt;/a&gt;. It avoids mentioning specific products, but the way I interpret it is that the first generation was dedicated data warehouse products where you did &lt;a href=&quot;https://en.wikipedia.org/wiki/Extract,_transform,_load&quot; target=&quot;_blank&quot;&gt;ETL&lt;/a&gt;. The second generation was more &lt;a href=&quot;https://en.wikipedia.org/wiki/Extract,_load,_transform&quot; target=&quot;_blank&quot;&gt;ELT&lt;/a&gt;, maybe using Hadoop and Spark. The third generation, the eponymous data mesh, unifies batch and stream processing, perhaps adding Kafka to the above mix. Using the &lt;a href=&quot;https://www.confluent.io/blog/event-sourcing-cqrs-stream-processing-apache-kafka-whats-connection/&quot; target=&quot;_blank&quot;&gt;CQRS pattern&lt;/a&gt;, for example, a column-oriented database could fulfill some of the Q (query) operations as a sort of data warehouse.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;For my first column-oriented database adventure, after that background research, I chose Amazon Redshift and I followed their &lt;a href=&quot;https://docs.aws.amazon.com/redshift/latest/gsg/getting-started.html&quot; target=&quot;_blank&quot;&gt;Getting Started with Amazon Redshift&lt;/a&gt; guide. It only took about an hour to spin up a Redshift cluster, upload their sample data to an S3 bucket, copy that into Redshift, then run a few SQL queries.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The big surprise and takeaway was that this felt just like using a relational database. The data I uploaded was in text files (basically CSV files) and was used to create tables. The queries I ran were SQL queries that joined tables, selected different fields and aggregated the results. 
The details of how the database works under the covers are interesting, but they don&#39;t inform its usage. There isn&#39;t much mystery after all.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;color: #16191f; font-family: Monaco, Menlo, Consolas, &amp;quot;Courier Prime&amp;quot;, Courier, &amp;quot;Courier New&amp;quot;, monospace; font-size: 14.88px; white-space: pre;&quot;&gt;SELECT * FROM event;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgh2aXHBERgpUZrBpl-etZpmAI2tlbritIdfKJA9jpdBNcwLe-gTB0zwIpNNjtyDpfWPFZhbVW5QoElbU11-V13-XLXWAxESesXtjpNZ2gSGFci9Z9wMy1je5mBECl_AfbOoiW1RyCqFfuN/s1289/selectstar.PNG&quot; imageanchor=&quot;1&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;456&quot; data-original-width=&quot;1289&quot; height=&quot;226&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgh2aXHBERgpUZrBpl-etZpmAI2tlbritIdfKJA9jpdBNcwLe-gTB0zwIpNNjtyDpfWPFZhbVW5QoElbU11-V13-XLXWAxESesXtjpNZ2gSGFci9Z9wMy1je5mBECl_AfbOoiW1RyCqFfuN/w640-h226/selectstar.PNG&quot; width=&quot;640&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;color: #16191f; font-family: Monaco, Menlo, Consolas, &amp;quot;Courier Prime&amp;quot;, Courier, &amp;quot;Courier New&amp;quot;, monospace; font-size: 14.88px; white-space: pre;&quot;&gt;SELECT firstname, lastname, total_quantity 
FROM   (SELECT buyerid, sum(qtysold) total_quantity
        FROM  sales
        GROUP BY buyerid
        ORDER BY total_quantity desc limit 10) Q, users
WHERE Q.buyerid = userid
ORDER BY Q.total_quantity desc;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjawYGeUXNQQCjCNa8BSCp_SGfPoOdFyZiGwh8c1-bW1M-l8MM4p0T7k10AGxUtmFoRcg9QMOA3dtNlcAsMR0thzhD5zjWZ0kOqY2Yki55VP1vFcqu0pir50cF07oV3GYeN78ZlALFfQgdz/s979/topevents.PNG&quot; imageanchor=&quot;1&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;411&quot; data-original-width=&quot;979&quot; height=&quot;268&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjawYGeUXNQQCjCNa8BSCp_SGfPoOdFyZiGwh8c1-bW1M-l8MM4p0T7k10AGxUtmFoRcg9QMOA3dtNlcAsMR0thzhD5zjWZ0kOqY2Yki55VP1vFcqu0pir50cF07oV3GYeN78ZlALFfQgdz/w640-h268/topevents.PNG&quot; width=&quot;640&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A couple of other column-oriented databases you may have heard of are Cassandra and HBase. NoSQL databases are built to scale horizontally, so you also need to consider how the &lt;a href=&quot;https://en.wikipedia.org/wiki/CAP_theorem&quot; target=&quot;_blank&quot;&gt;CAP Theorem&lt;/a&gt; applies to your situation when choosing among them, as they each make different trade-offs in addition to their unique features. 
The best choice will be highly data and workload specific.&lt;/div&gt;</description><link>http://www.programmingopiethehokie.com/2020/06/column-oriented-database-basics.html</link><author>noreply@blogger.com (opiethehokie)</author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgh2aXHBERgpUZrBpl-etZpmAI2tlbritIdfKJA9jpdBNcwLe-gTB0zwIpNNjtyDpfWPFZhbVW5QoElbU11-V13-XLXWAxESesXtjpNZ2gSGFci9Z9wMy1je5mBECl_AfbOoiW1RyCqFfuN/s72-w640-h226-c/selectstar.PNG" height="72" width="72"/></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-1401515658801150010.post-5061767332458460362</guid><pubDate>Sat, 25 Apr 2020 15:34:00 +0000</pubDate><atom:updated>2020-04-25T11:34:02.046-04:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">algorithms</category><category domain="http://www.blogger.com/atom/ns#">python</category><title>Dynamic Programming</title><description>Optimal substructure means that the optimal solution can be constructed from optimal solutions of sub-problems. Think recursion. When we combine optimal solutions to non-overlapping sub-problems the algorithmic technique is called divide and conquer. Think of a merge sort where each sub-sort can happen independently and they are combined at the end. When we combine optimal solutions to overlapping sub-problems the algorithmic technique we can use is &lt;a href=&quot;https://en.wikipedia.org/wiki/Dynamic_programming&quot; target=&quot;_blank&quot;&gt;dynamic programming&lt;/a&gt;. Because the sub-problems overlap, we can improve the running time by remembering the results of already computed sub-problems and not computing them again.&lt;br /&gt;
&lt;br /&gt;
An easy way to visualize this is in calculating Fibonacci numbers. The standard recursive solution &lt;a href=&quot;https://www.programmingopiethehokie.com/2014/04/exploring-memoization.html&quot; target=&quot;_blank&quot;&gt;can be memoized&lt;/a&gt; to significantly improve time complexity at the cost of using more space to store sub-problem results. This is top-down dynamic programming—recursion and memoization. Top-down dynamic programming can be easier to program and anecdotally it is more commonly used.&lt;br /&gt;
&lt;br /&gt;
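A sketch of the top-down approach, using the standard library's memoization decorator (not necessarily the exact code in the linked post):

```python
from functools import lru_cache

# Top-down dynamic programming: plain recursion plus memoization.
# lru_cache remembers sub-problem results so each fib(k) is computed
# only once, turning exponential time into linear time.
@lru_cache(maxsize=None)
def fib(n):
    return n if n in (0, 1) else fib(n - 1) + fib(n - 2)
```

The recursion tree is unchanged; the cache simply prunes every branch that has already been solved.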
Bottom-up dynamic programming is instead iterative and we solve sub-problems first, building them into bigger sub-problems. Without the overhead of recursive calls we don&#39;t necessarily increase space complexity, but still have the decrease in time complexity. For calculating the nth Fibonacci number this looks like:&lt;br /&gt;
&lt;br /&gt;
&lt;script src=&quot;https://gist.github.com/opiethehokie/c751a2335d76125aa35d153243c0b5f7.js&quot;&gt;&lt;/script&gt;

This is time complexity O(n) and space complexity O(1), a huge improvement on the naive recursive solution that&#39;s O(2&lt;sup&gt;n&lt;/sup&gt;) time and O(n) space. For details on big-O of recursive algorithms read &lt;a href=&quot;https://www.programmingopiethehokie.com/2016/05/big-o-notation-for-recursive-algorithms.html&quot; target=&quot;_blank&quot;&gt;this&lt;/a&gt;.&lt;br /&gt;
&lt;br /&gt;
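The bottom-up version boils down to something like the following sketch (not necessarily the gist's exact code): only the two most recent sub-problem results are kept.

```python
# Bottom-up Fibonacci: build up from the base cases, keeping only the
# last two values, for O(n) time and O(1) space.
def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a
```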
Some other interesting applications of dynamic programming are:&lt;br /&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Dijkstra%27s_algorithm&quot; target=&quot;_blank&quot;&gt;Dijkstra&#39;s algorithm&lt;/a&gt;&amp;nbsp;(shortest paths)&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Tower_of_Hanoi&quot; target=&quot;_blank&quot;&gt;Tower of Hanoi&lt;/a&gt;&amp;nbsp;(teaching game)&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Knapsack_problem&quot; target=&quot;_blank&quot;&gt;Knapsack problem&lt;/a&gt;&amp;nbsp;(combinatorial optimization)&lt;/li&gt;
&lt;/ul&gt;
&lt;br /&gt;
The knapsack problem, for example, has a top-down solution but I think the bottom-up solution is especially appealing. Using a matrix to store sub-problem solutions we can make the O(2&lt;sup&gt;n&lt;/sup&gt;) time recursive algorithm O(nW) time and space, where W is the knapsack capacity:&lt;br /&gt;
&lt;br /&gt;
&lt;script src=&quot;https://gist.github.com/opiethehokie/9dfaf7db1ef7691acf71a9c1504f0e52.js&quot;&gt;&lt;/script&gt;

But wait, there&#39;s more. We only need to remember a part of the matrix for the next iteration, which means we don&#39;t even need the matrix. It can be further optimized to only use O(W) space:&lt;br /&gt;
&lt;br /&gt;
&lt;script src=&quot;https://gist.github.com/opiethehokie/2b132e9b6607097a12c5b3cfb7fd6ab0.js&quot;&gt;&lt;/script&gt;

Not, in my opinion, an obvious solution by any means, but a very elegant one.
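That O(W)-space idea can be sketched like this; the weights and values are made up for the example and may not match the gist:

```python
# Space-optimized 0/1 knapsack: a single O(W) array updated per item.
def knapsack(capacity, weights, values):
    best = [0] * (capacity + 1)  # best[c] = max value within capacity c
    for w, v in zip(weights, values):
        # iterate capacities downward so each item is used at most once
        for c in range(capacity, w - 1, -1):
            best[c] = max(best[c], best[c - w] + v)
    return best[capacity]

result = knapsack(10, [5, 4, 6, 3], [10, 40, 30, 50])
```

The downward iteration is the subtle part: it guarantees that `best[c - w]` still refers to the previous item's row, which is exactly the slice of the matrix we needed to remember.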
</description><link>http://www.programmingopiethehokie.com/2020/04/dynamic-programming.html</link><author>noreply@blogger.com (opiethehokie)</author></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-1401515658801150010.post-7271250970468002514</guid><pubDate>Sat, 11 Apr 2020 19:32:00 +0000</pubDate><atom:updated>2021-12-30T08:41:57.800-05:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">algorithms</category><category domain="http://www.blogger.com/atom/ns#">python</category><category domain="http://www.blogger.com/atom/ns#">quantum</category><title>Quantum Teleportation</title><description>I recently checked out the &lt;a href=&quot;https://quantum-computing.ibm.com/&quot; target=&quot;_blank&quot;&gt;IBM Quantum Experience&lt;/a&gt; and was really impressed with the content available for getting started with quantum computing. There are simulators for executing quantum circuits, runnable both locally and in the cloud, but the highlight is being able to submit jobs to run on a real quantum computer. To create these circuits we use the Python DSL &lt;a href=&quot;https://qiskit.org/documentation/index.html&quot; target=&quot;_blank&quot;&gt;Qiskit&lt;/a&gt;.&lt;br /&gt;
&lt;br /&gt;
For gaining a better understanding of Qiskit I recommend starting with the &lt;a href=&quot;https://www.youtube.com/playlist?list=PLOFEBzvs-Vvp2xg9-POLJhQwtVktlYGbY&quot; target=&quot;_blank&quot;&gt;Coding with Qiskit&lt;/a&gt;&amp;nbsp;video series. Following along with the videos and coding my first circuit then led down a rabbit hole of trying to understand what I was really building and how it could be useful.&lt;br /&gt;
&lt;br /&gt;
&lt;a href=&quot;https://en.wikipedia.org/wiki/Quantum_computing&quot; target=&quot;_blank&quot;&gt;Quantum computing&lt;/a&gt; uses phenomena from quantum mechanics like superposition and entanglement to perform computation. In classical computing we have bits which can either be on or off, 1 or 0. In quantum computing we have &lt;a href=&quot;https://en.wikipedia.org/wiki/Qubit&quot; target=&quot;_blank&quot;&gt;qubits&lt;/a&gt;&amp;nbsp;which can be on or off at the same time, or somewhere in between. This is superposition. I like the coin analogy from &lt;a href=&quot;https://www.wired.co.uk/article/quantum-computing-explained&quot; target=&quot;_blank&quot;&gt;Quantum computers will change the world (if they work)&lt;/a&gt;&amp;nbsp;of this being a coin flip (classical) versus a spinning coin (quantum).&lt;br /&gt;
&lt;br /&gt;
A qubit is represented as a vector of two &lt;a href=&quot;https://en.wikipedia.org/wiki/Complex_number&quot; target=&quot;_blank&quot;&gt;complex numbers&lt;/a&gt; (amplitudes) with length 1, where the squared magnitudes of the amplitudes give the probabilities of measuring a 0 or a 1. Measurement destroys quantum state, collapsing the qubit to a classical bit, and it cannot be reversed. A quantum state also cannot be copied the way a classical bit&#39;s state can. Instead, quantum teleportation is the transfer of state from one qubit to another via a linkage known as entanglement. Even when moved far apart, entangled particles produce correlated measurement outcomes. One coin flip mirroring another, linked flip.&lt;br /&gt;
&lt;br /&gt;
The probabilistic nature of quantum computing lets a computation act on many states at once, making some intractable problems tractable. &lt;a href=&quot;https://en.wikipedia.org/wiki/Quantum_algorithm&quot; target=&quot;_blank&quot;&gt;Quantum algorithms&lt;/a&gt; use superposition and entanglement to solve problems like factorization or search faster than classical algorithms. Qiskit has some amazing Jupyter notebooks showing how these can be used. Three that looked the most interesting to me:&lt;br /&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/Qiskit/qiskit-iqx-tutorials/blob/master/qiskit/advanced/aqua/machine_learning/qsvm_classification.ipynb&quot; target=&quot;_blank&quot;&gt;quantum-enhanced support vector machines&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/Qiskit/qiskit-iqx-tutorials/blob/master/qiskit/advanced/aqua/optimization/max_cut_and_tsp.ipynb&quot; target=&quot;_blank&quot;&gt;traveling salesman problem&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/Qiskit/qiskit-iqx-tutorials/blob/master/qiskit/advanced/aqua/finance/optimization/portfolio_diversification.ipynb&quot; target=&quot;_blank&quot;&gt;portfolio diversification&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
My first circuit is much simpler, of course, but I feel good about getting it running. Based on the videos linked above, I teleport a 1 state from one qubit to another and then measure it to confirm. I simulate this locally, in the cloud, and then run it on one of IBM&#39;s quantum computers. The run on the actual quantum computer includes some noise, which comes from the imperfect nature of today&#39;s hardware.&lt;br /&gt;
&lt;br /&gt;
&lt;script src=&quot;https://gist.github.com/opiethehokie/cff5f5c086d259b9f10ea2592e1daaa8.js&quot;&gt;&lt;/script&gt;
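The gist requires Qiskit and an IBM account to run, so here is a rough standalone sketch of the same teleportation circuit simulated with plain numpy (the qubit ordering, the gate construction, and the use of deferred-measurement corrections in place of real measurements are my own assumptions, not the gist's code):

```python
import numpy as np

# single-qubit gates and projectors
I2 = np.eye(2)
X = np.array([[0, 1], [1, 0]])
Z = np.diag([1, -1])
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
P0 = np.diag([1, 0])  # projects onto 0
P1 = np.diag([0, 1])  # projects onto 1

def op(mats):
    # Kronecker product of one 2x2 matrix per qubit, qubit 0 leftmost
    out = mats[0]
    for m in mats[1:]:
        out = np.kron(out, m)
    return out

def single(u, k, n=3):
    mats = [I2] * n
    mats[k] = u
    return op(mats)

def controlled(u, ctrl, targ, n=3):
    # apply u on targ only when ctrl is 1
    a = [I2] * n
    a[ctrl] = P0
    b = [I2] * n
    b[ctrl] = P1
    b[targ] = u
    return op(a) + op(b)

state = np.zeros(8)
state[0] = 1.0                       # start in 000
state = single(X, 0) @ state         # put the message qubit q0 in state 1
state = single(H, 1) @ state         # entangle q1 and q2 into a Bell pair
state = controlled(X, 1, 2) @ state
state = controlled(X, 0, 1) @ state  # Bell-basis change on q0 and q1
state = single(H, 0) @ state
state = controlled(X, 1, 2) @ state  # deferred-measurement corrections
state = controlled(Z, 0, 2) @ state
# probability that the receiving qubit q2 measures 1
p_one = sum(abs(state[i]) ** 2 for i in range(8) if i % 2 == 1)
print(p_one)  # the teleported state measures 1 with certainty
```

With a perfect (noiseless) simulation the receiving qubit always measures 1, which is what the real hardware run only approximates.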

&lt;br /&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg2nL7P4u2XlVM8uHtc4JFTHGOu7nCIMbwkge6vl2HeSb7biBNFfCb8WmZaOF3L0ygYQH-cLkcfPs-NhCyN_aZH1DPK-mTFtzOBrno7Ln0w4zAbC190n5WkO_s9pqsYEtp14a3P9C1WfR5t/s1600/quantum.png&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;298&quot; data-original-width=&quot;371&quot; height=&quot;257&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg2nL7P4u2XlVM8uHtc4JFTHGOu7nCIMbwkge6vl2HeSb7biBNFfCb8WmZaOF3L0ygYQH-cLkcfPs-NhCyN_aZH1DPK-mTFtzOBrno7Ln0w4zAbC190n5WkO_s9pqsYEtp14a3P9C1WfR5t/s320/quantum.png&quot; width=&quot;320&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: left;&quot;&gt;UPDATE:&lt;/div&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: left;&quot;&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: left;&quot;&gt;&lt;a href=&quot;https://www.tensorflow.org/quantum&quot; target=&quot;_blank&quot;&gt;TensorFlow Quantum&lt;/a&gt; is another library that could be used if you want to leverage Google&#39;s quantum computing resources.&lt;/div&gt;
&lt;br /&gt;</description><link>http://www.programmingopiethehokie.com/2020/04/quantum-teleportation.html</link><author>noreply@blogger.com (opiethehokie)</author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg2nL7P4u2XlVM8uHtc4JFTHGOu7nCIMbwkge6vl2HeSb7biBNFfCb8WmZaOF3L0ygYQH-cLkcfPs-NhCyN_aZH1DPK-mTFtzOBrno7Ln0w4zAbC190n5WkO_s9pqsYEtp14a3P9C1WfR5t/s72-c/quantum.png" height="72" width="72"/></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-1401515658801150010.post-3291283279022722242</guid><pubDate>Sat, 28 Mar 2020 21:59:00 +0000</pubDate><atom:updated>2023-07-23T15:26:00.442-04:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">python</category><category domain="http://www.blogger.com/atom/ns#">time series</category><title>Geometric Brownian Motion</title><description>&lt;a href=&quot;https://en.wikipedia.org/wiki/Geometric_Brownian_motion&quot; target=&quot;_blank&quot;&gt;Geometric Brownian Motion&lt;/a&gt;&amp;nbsp;is a stochastic process that can be used to model stock prices. It&#39;s related to random walks and Markov chains. This is a much different way to look at time series than what I explored in my&amp;nbsp;&lt;a href=&quot;https://www.programmingopiethehokie.com/2020/03/time-series-predictions.html&quot; target=&quot;_blank&quot;&gt;Time Series Predictions&lt;/a&gt;&amp;nbsp;post and given the recent market volatility it seems especially timely to take a closer look at it.&lt;br /&gt;
&lt;br /&gt;
There are detailed explanations of the math and theory elsewhere. I&#39;ve just written some code to see it in action. This generates 100 simulations for the S&amp;amp;P 500 ETF with the ticker SPY, modeling the past year:&lt;br /&gt;
&lt;br /&gt;
&lt;script src=&quot;https://gist.github.com/opiethehokie/0ac4305cbd1eac9e36dfd5d3b07c8285.js&quot;&gt;&lt;/script&gt;
What&#39;s especially interesting today is that the current value of SPY is on the extreme lower end of what we see simulated. Even after increasing the number of simulations from 100 to 1,000 or more, the current value still lies on a rare path.&lt;br /&gt;
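For reference, the GBM update rule the simulation relies on can be sketched in a few lines of numpy (the drift, volatility, and starting price below are made-up illustrative values, not fitted SPY parameters):

```python
import numpy as np

rng = np.random.default_rng(42)
s0, mu, sigma = 280.0, 0.05, 0.2   # start price, annual drift, annual volatility
days, sims = 252, 100              # one trading year, 100 paths
dt = 1 / days

# GBM step: S(t+dt) = S(t) * exp((mu - sigma^2 / 2) dt + sigma sqrt(dt) Z)
# swap in rng.laplace(0, 1 / np.sqrt(2), size=(sims, days)) for fatter tails
z = rng.standard_normal((sims, days))
steps = np.exp((mu - 0.5 * sigma ** 2) * dt + sigma * np.sqrt(dt) * z)
paths = s0 * np.cumprod(steps, axis=1)  # each row is one simulated year

print(paths[:, -1].mean())  # average terminal price across paths
```

The Laplace variant mentioned later is the same code with the commented-out draw substituted for the normal one (the scale 1/√2 keeps unit variance so the two runs are comparable).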
&lt;br /&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg6peDysXudTCVo1XdkOMJVVq9sdzAGXqhOovB4rUMx9Cv8iFb0GSJnruFhE-r0P-X21NE0WFm1EgcXJuTit1PxPuoTUHBghzgfrLME_TFTk9SDTQnPFwg2IaBCq-SjxvcrBJq0XqUI_ze7/s1600/gbm.PNG&quot; imageanchor=&quot;1&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;416&quot; data-original-width=&quot;565&quot; height=&quot;292&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg6peDysXudTCVo1XdkOMJVVq9sdzAGXqhOovB4rUMx9Cv8iFb0GSJnruFhE-r0P-X21NE0WFm1EgcXJuTit1PxPuoTUHBghzgfrLME_TFTk9SDTQnPFwg2IaBCq-SjxvcrBJq0XqUI_ze7/s400/gbm.PNG&quot; width=&quot;400&quot; /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;br /&gt;
There are some known problems with the model that may help explain this. First, volatility isn&#39;t really constant in financial markets. Second, the randomness in GBM is normally distributed, but we know that stock returns are not: they have fatter tails, or higher kurtosis. Stock prices also react to specific geopolitical events that are not random at all, sometimes even opening at a different level than the previous day&#39;s close.&lt;br /&gt;
&lt;br /&gt;
Out of curiosity, and because the second issue is the easiest to tweak, I replaced the normal distribution with a &lt;a href=&quot;https://docs.scipy.org/doc/numpy-1.14.0/reference/generated/numpy.random.laplace.html&quot; target=&quot;_blank&quot;&gt;Laplace distribution&lt;/a&gt; and saw a slightly wider dispersion of results (in both directions). In reality, though, the 52-week range is 218.26 - 339.08, so we still aren&#39;t capturing the extremes witnessed.&lt;br /&gt;
&lt;br /&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhPTMzPjNS8APalMbUzur1r5BLrBnhymQHXYk4fS_K6kR28dBGqLFIC_isZ5NWBBeW-jVfTEZniqMcZ-233sxan-fZVchM1Si6PkyCxewT2FSQ1HzOOVUfIxf7v3WS2c6LyBu1iu0lr-_8z/s1600/lgbm.PNG&quot; imageanchor=&quot;1&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;424&quot; data-original-width=&quot;559&quot; height=&quot;302&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhPTMzPjNS8APalMbUzur1r5BLrBnhymQHXYk4fS_K6kR28dBGqLFIC_isZ5NWBBeW-jVfTEZniqMcZ-233sxan-fZVchM1Si6PkyCxewT2FSQ1HzOOVUfIxf7v3WS2c6LyBu1iu0lr-_8z/s400/lgbm.PNG&quot; width=&quot;400&quot; /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;br /&gt;
Any ideas on what&#39;s going on here? It must have something to do with the massive increase in volatility at the end of what was otherwise a calm year. Please comment.</description><link>http://www.programmingopiethehokie.com/2020/03/geometric-brownian-motion.html</link><author>noreply@blogger.com (opiethehokie)</author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg6peDysXudTCVo1XdkOMJVVq9sdzAGXqhOovB4rUMx9Cv8iFb0GSJnruFhE-r0P-X21NE0WFm1EgcXJuTit1PxPuoTUHBghzgfrLME_TFTk9SDTQnPFwg2IaBCq-SjxvcrBJq0XqUI_ze7/s72-c/gbm.PNG" height="72" width="72"/></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-1401515658801150010.post-4870900363894110932</guid><pubDate>Sat, 14 Mar 2020 21:38:00 +0000</pubDate><atom:updated>2021-09-16T17:09:03.621-04:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">feature engineering</category><category domain="http://www.blogger.com/atom/ns#">machine learning</category><category domain="http://www.blogger.com/atom/ns#">python</category><category domain="http://www.blogger.com/atom/ns#">time series</category><title>Time Series Predictions</title><description>Time series data is just a series of observations ordered in time. As simple as it sounds, there are important differences when analyzing time series data vs. cross-sectional data. This post will attempt to cover enough basics, from statistics and machine learning, to get to a point where we can forecast future observations.&lt;br /&gt;
&lt;br /&gt;
First, some terminology. Data is &lt;a href=&quot;https://en.wikipedia.org/wiki/Autocorrelation&quot; target=&quot;_blank&quot;&gt;autocorrelated&lt;/a&gt; when there are similarities between an observation and previous observations. &lt;a href=&quot;https://en.wikipedia.org/wiki/Seasonality&quot; target=&quot;_blank&quot;&gt;Seasonality&lt;/a&gt; is when the similarities occur at regular intervals. Trend is a long-term upward or downward movement. And data is &lt;a href=&quot;https://en.wikipedia.org/wiki/Stationary_process&quot; target=&quot;_blank&quot;&gt;stationary&lt;/a&gt; when its statistical properties, like mean and variance, do not change over time.&lt;br /&gt;
&lt;br /&gt;
I&#39;ll generate data with these characteristics to use for the rest of the post:&lt;br /&gt;
&lt;br /&gt;
&lt;script src=&quot;https://gist.github.com/opiethehokie/c2dcfd0ae7f1f55e279564f74c7cd692.js&quot;&gt;&lt;/script&gt;

&lt;br /&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEic_N72kuxGsbpLs5gju_wHdtA2NGeWQEG201sIUhjus3oGNwxF2zYt3Lmsh1itTK-8V4LgEVCgAxcCTXT1U0F-6OKRq5IBh0tMOGfn5S8dZ6vDzZ1eTjWIobD0eT4KahoOeZq_V9B5UOkM/s1600/fig1.PNG&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;474&quot; data-original-width=&quot;627&quot; height=&quot;301&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEic_N72kuxGsbpLs5gju_wHdtA2NGeWQEG201sIUhjus3oGNwxF2zYt3Lmsh1itTK-8V4LgEVCgAxcCTXT1U0F-6OKRq5IBh0tMOGfn5S8dZ6vDzZ1eTjWIobD0eT4KahoOeZq_V9B5UOkM/s400/fig1.PNG&quot; width=&quot;400&quot; /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;br /&gt;
&lt;h3&gt;
Detecting stationarity&lt;/h3&gt;
While time series data is usually not stationary, stationarity is important because most statistical models and tests have that assumption. The &lt;a href=&quot;https://en.wikipedia.org/wiki/Augmented_Dickey%E2%80%93Fuller_test&quot; target=&quot;_blank&quot;&gt;Augmented Dickey-Fuller (ADF) test&lt;/a&gt;&amp;nbsp;can be used on normally distributed data to detect stationarity. The null hypothesis is that the data is not stationary, thus you are looking to reject it with a certain level of confidence.&lt;br /&gt;
&lt;br /&gt;
&lt;script src=&quot;https://gist.github.com/opiethehokie/5c1847e78d24e3190243b60d738ea52b.js&quot;&gt;&lt;/script&gt;
There are other (non-parametric) stationarity tests without the normally distributed data assumption that are beyond the scope of this post.&lt;br /&gt;
&lt;br /&gt;
&lt;h3&gt;
Transformations&lt;/h3&gt;
By applying different transformations to our data we can make non-stationary data stationary. One approach is to subtract the rolling mean, or a weighted rolling mean favoring more recent observations, from the data. Another approach is called differencing: subtract from each observation the value from some fixed period earlier, like a week or a month.&lt;br /&gt;
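Both transformations can be sketched with plain numpy on a synthetic series (the trend, seasonal period, and window length below are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(120)
# hypothetical series: linear trend + seasonality (period 12) + noise
x = 0.5 * t + 10 * np.sin(2 * np.pi * t / 12) + rng.normal(size=120)

# approach 1: subtract a rolling mean (window of one full season)
window = 12
rolling_mean = np.convolve(x, np.ones(window) / window, mode="valid")
detrended = x[window - 1:] - rolling_mean

# approach 2: seasonal differencing, subtract the value from one period ago
differenced = x[window:] - x[:-window]

print(x.var(), detrended.var(), differenced.var())
```

Both transformed series have far less variance than the original, since the trend (and, for the differenced series, the seasonality) has been removed.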
&lt;br /&gt;
&lt;script src=&quot;https://gist.github.com/opiethehokie/790e27a3210efcc0394ba08857e83aee.js&quot;&gt;&lt;/script&gt;

&lt;br /&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgH6xPCJMTRV7rCNafgWBZ-mMMl5mjiLD_O2FsUtN9Ph3hyphenhyphenf683Vk58gfxfVGoxpvThz_y31uRoBw4bD-uSyjDNBR16OklkpEdLtfYVBDlQdGa6ojcMowQ1R3wRZjx8Oa-cdab4VJkZ_OH9/s1600/fig2.PNG&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;430&quot; data-original-width=&quot;564&quot; height=&quot;303&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgH6xPCJMTRV7rCNafgWBZ-mMMl5mjiLD_O2FsUtN9Ph3hyphenhyphenf683Vk58gfxfVGoxpvThz_y31uRoBw4bD-uSyjDNBR16OklkpEdLtfYVBDlQdGa6ojcMowQ1R3wRZjx8Oa-cdab4VJkZ_OH9/s400/fig2.PNG&quot; width=&quot;400&quot; /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;h3&gt;
Forecasting&lt;/h3&gt;
&lt;br /&gt;
Special care must be taken when splitting time series data into a training and a test set. The order must be preserved; the data cannot be reshuffled. For cross-validation, it is also important to evaluate the model only on future observations, so a &lt;a href=&quot;https://scikit-learn.org/stable/modules/cross_validation.html#cross-validation-of-time-series-data&quot; target=&quot;_blank&quot;&gt;variation of k-fold&lt;/a&gt; is needed.&lt;br /&gt;
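An expanding-window split, similar in spirit to scikit-learn's TimeSeriesSplit, can be sketched in a few lines (my own illustration, not the library's implementation):

```python
# each fold trains on everything before the split point and tests only
# on the observations immediately after it, so order is preserved
def time_series_splits(n, folds):
    fold = n // (folds + 1)
    for k in range(1, folds + 1):
        train = list(range(0, k * fold))             # past observations
        test = list(range(k * fold, (k + 1) * fold))  # future observations
        yield train, test

for train, test in time_series_splits(12, 3):
    print(len(train), len(test))  # training window grows, test window slides
```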
&lt;br /&gt;
&lt;h4&gt;
SARIMA&lt;/h4&gt;
&lt;a href=&quot;https://en.wikipedia.org/wiki/Autoregressive_integrated_moving_average#Variations_and_extensions&quot; target=&quot;_blank&quot;&gt;Seasonal autoregressive integrated moving average (SARIMA)&lt;/a&gt; is a model that can be fitted via a &lt;a href=&quot;https://en.wikipedia.org/wiki/Kalman_filter&quot; target=&quot;_blank&quot;&gt;Kalman filter&lt;/a&gt; to time series data. It accounts for seasonality and trend by differencing the data; however, it is a linear model, so each observation needs to be a linear combination of past observations. A log or square root transform, for example, might help make the time series linear.&lt;br /&gt;
&lt;br /&gt;
&lt;script src=&quot;https://gist.github.com/opiethehokie/7ef68746308c57ceb4d9a256c37c3db3.js&quot;&gt;&lt;/script&gt;

&lt;br /&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjaUswaTdWgiTsdAP7tTi_nyR2phHwgdD7eTxgcN7SqwhumlQKIb4Y_O2OmZ4la2rZrAg806RcZzqrWLNpoxMJjQgKBr_cbmvgo3ckl4MwB4NSA1FcB1eUGGNpaVssa8ptzL4R5l4Dium5x/s1600/fig3.PNG&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;430&quot; data-original-width=&quot;566&quot; height=&quot;303&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjaUswaTdWgiTsdAP7tTi_nyR2phHwgdD7eTxgcN7SqwhumlQKIb4Y_O2OmZ4la2rZrAg806RcZzqrWLNpoxMJjQgKBr_cbmvgo3ckl4MwB4NSA1FcB1eUGGNpaVssa8ptzL4R5l4Dium5x/s400/fig3.PNG&quot; width=&quot;400&quot; /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;br /&gt;
&lt;h4&gt;
&lt;b&gt;RNN&lt;/b&gt;&lt;/h4&gt;
A recurrent neural network (RNN) with &lt;a href=&quot;https://en.wikipedia.org/wiki/Long_short-term_memory&quot; target=&quot;_blank&quot;&gt;long short-term memory (LSTM)&lt;/a&gt; is an alternative to SARIMA for modeling time series data. At the cost of complexity, it can handle non-linear data or data that isn&#39;t normally distributed.&lt;br /&gt;
&lt;br /&gt;
&lt;script src=&quot;https://gist.github.com/opiethehokie/fc2202f56dddac077a4ab4d4bf7594d1.js&quot;&gt;&lt;/script&gt;

&lt;br /&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjgqrSlW-HCMn0E6d_e7sRUFPlqOcoR41geD7AROO_v5PEgwekxJ5pT9ZCqiX9S4xq9brVAg0ZOMZQ4jeCyckGRcFoC7bdPY79h5mYu1B-5itGL4jphlWY_UCsoyQ-nY91gO-dPBuaH2J-g/s1600/fig4.PNG&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;417&quot; data-original-width=&quot;564&quot; height=&quot;295&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjgqrSlW-HCMn0E6d_e7sRUFPlqOcoR41geD7AROO_v5PEgwekxJ5pT9ZCqiX9S4xq9brVAg0ZOMZQ4jeCyckGRcFoC7bdPY79h5mYu1B-5itGL4jphlWY_UCsoyQ-nY91gO-dPBuaH2J-g/s400/fig4.PNG&quot; width=&quot;400&quot; /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: left;&quot;&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: left;&quot;&gt;
I didn&#39;t put a lot of effort into tuning these models, or coming up with additional features, and they aren&#39;t perfect, but we can start to get a feel for how they work. The SARIMA model looks underfit. It did, however, nicely ignore the randomness in the data. The RNN model clearly overfits the data and more work would be needed to get a smoother curve.&lt;/div&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: left;&quot;&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: left;&quot;&gt;
This was my first attempt at working with SARIMAX and RNNs so any feedback is appreciated.&lt;/div&gt;
&lt;br /&gt;</description><link>http://www.programmingopiethehokie.com/2020/03/time-series-predictions.html</link><author>noreply@blogger.com (opiethehokie)</author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEic_N72kuxGsbpLs5gju_wHdtA2NGeWQEG201sIUhjus3oGNwxF2zYt3Lmsh1itTK-8V4LgEVCgAxcCTXT1U0F-6OKRq5IBh0tMOGfn5S8dZ6vDzZ1eTjWIobD0eT4KahoOeZq_V9B5UOkM/s72-c/fig1.PNG" height="72" width="72"/></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-1401515658801150010.post-8696849874476996777</guid><pubDate>Sat, 06 Apr 2019 22:35:00 +0000</pubDate><atom:updated>2023-07-23T15:26:19.571-04:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">probability</category><category domain="http://www.blogger.com/atom/ns#">python</category><category domain="http://www.blogger.com/atom/ns#">simulation</category><title>Probability Problems Programmatically</title><description>&lt;b&gt;Birthday Problem&lt;/b&gt;&lt;br /&gt;
&lt;br /&gt;
How big a group of people do you need so that there is a 50% chance that 2 of them share a birthday? How about 99.9%? Of course, 100% is 366 people by the &lt;a href=&quot;https://en.wikipedia.org/wiki/Pigeonhole_principle&quot;&gt;pigeonhole principle&lt;/a&gt;, but how small the group can be for lower probabilities is surprising.&lt;br /&gt;
&lt;br /&gt;
The key is that birthday comparisons are made between every pair of individuals and not just between one individual and everyone else. Then the solution becomes p(n) = 1 - &lt;sub&gt;365&lt;/sub&gt;P&lt;sub&gt;n&lt;/sub&gt;/365&lt;sup&gt;n&lt;/sup&gt;. At n=23, the probability that at least two people share a birthday is about 50.7%.&lt;br /&gt;
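The closed-form answer is quick to check directly (a short sketch of the formula above):

```python
# p(n) = 1 - 365Pn / 365**n: one minus the chance all n birthdays differ
def p_shared(n):
    p_distinct = 1.0
    for i in range(n):
        p_distinct *= (365 - i) / 365
    return 1 - p_distinct

print(round(p_shared(23), 4))  # about 0.5073
```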
&lt;br /&gt;
If we didn&#39;t know how to arrive at this solution, we could use a simple &lt;a href=&quot;https://en.wikipedia.org/wiki/Monte_Carlo_method&quot;&gt;Monte Carlo&lt;/a&gt; simulation to estimate the probability:&lt;br /&gt;
&lt;br /&gt;
&lt;script src=&quot;https://gist.github.com/opiethehokie/8b18902e6748b4c0b964cb89c58968f2.js&quot;&gt;&lt;/script&gt;

&lt;b&gt;Monty Hall Problem&lt;/b&gt; &lt;br /&gt;
You are on a game show and are given the opportunity to select one of 3 doors. Behind one door, known to the host, is a major prize. Behind the other 2 doors are goats. Once you&#39;ve made a selection the host will open a door showing a goat. Now you have the opportunity to select a different door. Should you switch or keep your original guess?&lt;br /&gt;
&lt;br /&gt;
This problem is very counter-intuitive and seems to cause disagreement among otherwise intelligent people even after the correct solution is known. By switching doors, you dramatically increase your chance of winning the prize from 1/3 to 2/3.&lt;br /&gt;
&lt;br /&gt;
Again, instead of worrying too much about how to arrive at the correct solution we can simulate it:&lt;br /&gt;
&lt;br /&gt;
&lt;script src=&quot;https://gist.github.com/opiethehokie/2ef777397a11ed00e9d443b237aca87f.js&quot;&gt;&lt;/script&gt;
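For reference, a standalone version of the switching simulation looks something like this (my own sketch, with an arbitrary seed and trial count, not the gist's code):

```python
import random

random.seed(7)
wins = 0
trials = 10_000
for _ in range(trials):
    doors = {0, 1, 2}
    prize = random.choice([0, 1, 2])
    pick = random.choice([0, 1, 2])
    # the host opens a goat door that is neither the pick nor the prize
    opened = random.choice(list(doors - {pick, prize}))
    # always switch to the one remaining unopened door
    switched = (doors - {pick, opened}).pop()
    wins += (switched == prize)

print(wins / trials)  # close to 2/3
```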

&lt;b&gt; Determining if a coin is fair&lt;/b&gt;&lt;br /&gt;
&lt;br /&gt;
Out of 1000 coin flips, how many heads can you get and still assume with 95% confidence that it&#39;s a fair coin?&lt;br /&gt;
&lt;br /&gt;
Coin flips follow a binomial distribution, but it can be approximated as a normal distribution based on the &lt;a href=&quot;https://en.wikipedia.org/wiki/Central_limit_theorem&quot;&gt;central limit theorem&lt;/a&gt;. A 95% confidence interval is ± 1.96 standard deviations. The standard deviation of the number of heads in 1000 flips is √(1000 · 0.5 · 0.5) ≈ 15.8, so 530 heads is (530-500)/15.8, or about 1.9 standard deviations.&lt;br /&gt;
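The interval itself takes only a few lines to compute (a sketch of the arithmetic above):

```python
import math

flips = 1000
# standard deviation of the head count for a fair coin: sqrt(n * p * (1 - p))
sd = math.sqrt(flips * 0.5 * 0.5)  # about 15.8 heads
margin = 1.96 * sd                 # 95% two-sided interval
low, high = flips / 2 - margin, flips / 2 + margin
print(round(low), round(high))     # roughly 469 to 531 heads
```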
&lt;br /&gt;
&lt;script src=&quot;https://gist.github.com/opiethehokie/39449c73ef1934b7676174186018ae33.js&quot;&gt;&lt;/script&gt;

&lt;b&gt; Rain in Seattle Problem&lt;/b&gt;&lt;br /&gt;
&lt;br /&gt;
You call 3 friends and ask them if it&#39;s raining. Each friend has a 1/3 chance of lying and 2/3 chance of telling the truth. All 3 friends say it&#39;s raining. What is the probability it&#39;s actually raining?&lt;br /&gt;
If we say it&#39;s raining 10% of the time, then by &lt;a href=&quot;https://en.wikipedia.org/wiki/Bayes%27_theorem&quot;&gt;Bayes&#39; theorem&lt;/a&gt; our probability is .1 * (2/3)&lt;sup&gt;3&lt;/sup&gt; / (.1 * (2/3)&lt;sup&gt;3&lt;/sup&gt; + .9 * (1/3)&lt;sup&gt;3&lt;/sup&gt;). The answer is about 47%.&lt;br /&gt;
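The calculation is easy to check directly (a sketch of the formula above, using the assumed 10% prior):

```python
prior = 0.1           # assumed base rate of rain
truth = (2 / 3) ** 3  # all three friends tell the truth
lie = (1 / 3) ** 3    # all three friends lie

# Bayes' theorem: P(rain | all say rain)
p_rain = prior * truth / (prior * truth + (1 - prior) * lie)
print(round(p_rain, 2))  # 0.47
```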
&lt;br /&gt;
&lt;script src=&quot;https://gist.github.com/opiethehokie/d68a104f19aae09b6b5b9eef1bfaa019.js&quot;&gt;&lt;/script&gt;

These simple Monte Carlo simulations are a way to get an approximate answer when we don&#39;t know how to (or don&#39;t want to) calculate the exact answer. These were all just a few lines of code, took a few minutes to write, and run nearly instantly. It&#39;s a great way to verify the above answers as well. And to get a more precise answer, you simply trade the quick run time for more than 10,000 simulations.
</description><link>http://www.programmingopiethehokie.com/2019/04/probability-problems-programmatically.html</link><author>noreply@blogger.com (opiethehokie)</author></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-1401515658801150010.post-5460194318111350708</guid><pubDate>Wed, 11 Jul 2018 00:53:00 +0000</pubDate><atom:updated>2022-12-29T19:07:45.936-05:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">algorithms</category><category domain="http://www.blogger.com/atom/ns#">optimization</category><category domain="http://www.blogger.com/atom/ns#">python</category><title>Linear Programming</title><description>&lt;a href=&quot;https://en.wikipedia.org/wiki/Linear_programming&quot; target=&quot;_blank&quot;&gt;Linear programming&lt;/a&gt;&amp;nbsp;(LP) is a technique for the optimization of a linear objective function. Integer programming (IP) has the additional constraint of the unknown variables being integers. I vaguely remember these from school, but they are terms I see every so often so I wanted to code something up and get reacquainted.&lt;br /&gt;
&lt;br /&gt;
Many LP solvers exist and a quick search for a Python library turned me onto &lt;a href=&quot;https://github.com/coin-or/pulp&quot; target=&quot;_blank&quot;&gt;PuLP&lt;/a&gt;. They already had a Sudoku example, which is what I wanted to do, so I&#39;ve turned that into a Jupyter notebook (getting one of these embedded in a post is something else I&#39;ve been wanting to try) to make it a bit easier to follow.&lt;br /&gt;
&lt;br /&gt;
&lt;script src=&quot;https://gist.github.com/opiethehokie/0f64a5e18246d0827ae32af74d02e9d2.js&quot;&gt;&lt;/script&gt;

LP has polynomial time solutions (see &lt;a href=&quot;https://en.wikipedia.org/wiki/Karmarkar%27s_algorithm&quot; target=&quot;_blank&quot;&gt;Karmarkar&#39;s algorithm&lt;/a&gt;), while IP is &lt;a href=&quot;https://en.wikipedia.org/wiki/NP-hardness&quot; target=&quot;_blank&quot;&gt;NP-hard&lt;/a&gt;&amp;nbsp;(see &lt;a href=&quot;https://en.wikipedia.org/wiki/Branch_and_bound&quot; target=&quot;_blank&quot;&gt;branch-and-bound&lt;/a&gt;). With Sudoku being a relatively small problem it doesn&#39;t seem to matter. These puzzles are easily solvable in under 1 second.&lt;br /&gt;
&lt;br /&gt;
The linear programming I remember was graphing several lines and getting a solution where they intersect. Applying the technique to Sudoku is way more fun. And there are obviously other more useful applications like &lt;a href=&quot;https://blog.thinknewfound.com/2018/08/trade-optimization/&quot; target=&quot;_blank&quot;&gt;portfolio optimization&lt;/a&gt;, airline crew scheduling, or vehicle routing.&lt;br /&gt;
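As a tiny code illustration of an LP, here is a toy problem solved with scipy's linprog rather than PuLP (the objective and constraints are made up for the example):

```python
from scipy.optimize import linprog

# toy LP: maximize x + 2y subject to x + y ≤ 4 and x ≤ 3, with x, y ≥ 0
# (linprog minimizes, so we negate the objective coefficients)
res = linprog(c=[-1, -2], A_ub=[[1, 1], [1, 0]], b_ub=[4, 3])
print(res.x, -res.fun)  # optimum at x=0, y=4 with objective value 8
```

This is the "graphing several lines" version of LP in code form: the optimum lands on a vertex of the feasible region.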
&lt;br /&gt;
While working on this I also came across &lt;a href=&quot;https://developers.google.com/optimization/&quot; target=&quot;_blank&quot;&gt;Google Optimization Tools&lt;/a&gt;, which I hadn&#39;t heard of before. I don&#39;t know how popular PuLP is, so this could be another, perhaps better supported, alternative.&lt;br /&gt;
&lt;br /&gt;
UPDATE:&lt;br /&gt;
&lt;br /&gt;
&lt;a href=&quot;https://www.cvxpy.org/&quot; target=&quot;_blank&quot;&gt;CVXPY&lt;/a&gt; also looks promising as demonstrated in &lt;a href=&quot;https://towardsdatascience.com/optimization-with-python-how-to-make-the-most-amount-of-money-with-the-least-amount-of-risk-1ebebf5b2f29&quot; target=&quot;_blank&quot;&gt;Optimization with Python: How to make the most amount of money with the least amount of risk&lt;/a&gt;.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a href=&quot;https://towardsdatascience.com/integer-programming-vs-linear-programming-in-python-f1be5bb4e60e&quot; target=&quot;_blank&quot;&gt;Integer vs Linear Programming in Python&lt;/a&gt;&amp;nbsp;has a nice comparison of ML to linear programming that will help you decide when to use each.&lt;br /&gt;&lt;/div&gt;</description><link>http://www.programmingopiethehokie.com/2018/07/linear-programming.html</link><author>noreply@blogger.com (opiethehokie)</author></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-1401515658801150010.post-6906121265191374031</guid><pubDate>Sun, 13 Aug 2017 21:43:00 +0000</pubDate><atom:updated>2023-07-09T09:50:00.365-04:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">concurrency</category><category domain="http://www.blogger.com/atom/ns#">parallelism</category><category domain="http://www.blogger.com/atom/ns#">python</category><title>Python 3 asyncio</title><description>A while back I wrote a few posts about asynchronous programming:&lt;br /&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;http://www.programmingopiethehokie.com/2014/03/asynchronous-non-blocking-io-java-echo.html&quot; target=&quot;_blank&quot;&gt;Asynchronous Non-blocking I/O Java Echo Server&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;http://www.programmingopiethehokie.com/2014/04/fixing-my-asynchronous-non-blocking-io.html&quot; target=&quot;_blank&quot;&gt;Fixing My Asynchronous Non-blocking I/O Callback Hell With Monads&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;http://www.programmingopiethehokie.com/2014/10/scala-asynchronous-io-echo-server.html&quot; target=&quot;_blank&quot;&gt;Scala Asynchronous IO Echo Server&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;http://www.programmingopiethehokie.com/2015/12/reactive-actors.html&quot; target=&quot;_blank&quot;&gt;Reactive Actors&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
So when I learned that Python 3.5 has added async and await operations I knew I had to check it out to see how it compares. &lt;a href=&quot;https://docs.python.org/3/library/asyncio.html&quot; target=&quot;_blank&quot;&gt;asyncio&lt;/a&gt; describes itself as &lt;i&gt;&quot;infrastructure for writing single-threaded concurrent code using coroutines, multiplexing I/O access over sockets and other resources, running network clients and servers, and other related primitives&quot;&lt;/i&gt;.
&lt;br /&gt;
&lt;br /&gt;
&lt;a href=&quot;https://en.wikipedia.org/wiki/Coroutine&quot; target=&quot;_blank&quot;&gt;Coroutines&lt;/a&gt; in Python are similar to generators, but coroutines (a function definition using &lt;span style=&quot;font-family: Courier New, Courier, monospace;&quot;&gt;async def&lt;/span&gt;) can control where execution continues after the &lt;span style=&quot;font-family: Courier New, Courier, monospace;&quot;&gt;yield&lt;/span&gt; (replaced by &lt;span style=&quot;font-family: Courier New, Courier, monospace;&quot;&gt;await&lt;/span&gt;). You await another coroutine or a future. The coroutine approach can replace callbacks.&lt;br /&gt;
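Since the gists don't render in the feed, here is a minimal standalone sketch of the same idea, written with the newer asyncio.run API rather than the event-loop boilerplate of the era (the sleep durations are illustrative):

```python
import asyncio
import time

# a coroutine that simulates a slow I/O-bound operation
async def slow(seconds):
    await asyncio.sleep(seconds)
    return seconds

async def main():
    # run all three concurrently; control switches at each await point
    return await asyncio.gather(slow(1), slow(2), slow(3))

start = time.monotonic()
results = asyncio.run(main())
elapsed = time.monotonic() - start
print(results, round(elapsed))  # takes about 3 seconds total, not 6
```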
&lt;br /&gt;
&lt;script src=&quot;https://gist.github.com/opiethehokie/4551dd63ab0043749ae2a7977f4842d7.js&quot;&gt;&lt;/script&gt;

If we check the time it takes for the program to run, it&#39;s 3 seconds (the longest slow operation) and not 6 seconds (the sum of all the slow operations).
&lt;br /&gt;
&lt;br /&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;div class=&quot;p1&quot;&gt;Slow operation sleep 1 complete&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;Slow operation sleep 2 complete&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;Slow operation sleep 3 complete&lt;/span&gt;&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;&lt;span class=&quot;s1&quot;&gt;Completed in 3.02 seconds&lt;/span&gt;&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;&lt;span class=&quot;s1&quot;&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;&lt;span class=&quot;s1&quot;&gt;It&#39;s similar if we want to get results from each task. We&#39;re even able to get them as they are available.&amp;nbsp;&lt;/span&gt;&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;&lt;span class=&quot;s1&quot;&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;&lt;span class=&quot;s1&quot;&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/f2b57a9c42d4136178ef396bddf76d88.js&quot;&gt;&lt;/script&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;Slow operation sleep 1 complete&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;Got result 1&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;Slow operation sleep 2 complete&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;Got result 2&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;Slow operation sleep 3 complete&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;Got result 3&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;Completed in 3.00 seconds&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;&lt;span class=&quot;s1&quot;&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;&lt;span class=&quot;s1&quot;&gt;With Python&#39;s &lt;a href=&quot;https://wiki.python.org/moin/GlobalInterpreterLock&quot; target=&quot;_blank&quot;&gt;GIL&lt;/a&gt; you&#39;ve never really been able to run multiple threads in parallel. You&#39;ve had to run concurrent code in multiple processes to leverage multiple CPU cores. 
With asyncio we are able to at least make single-process IO-bound tasks execute faster, because the event loop switches between tasks while they wait on I/O, sidestepping GIL contention.&lt;/span&gt;&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;&lt;span class=&quot;s1&quot;&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;&lt;span class=&quot;s1&quot;&gt;For CPU-bound code you still have to use multiple processes to parallelize your code. Python&#39;s &lt;a href=&quot;https://docs.python.org/3/library/multiprocessing.html&quot; target=&quot;_blank&quot;&gt;parallel API&lt;/a&gt; limits your ability to use results from these tasks as they become available, as we did in the example above. Waiting for a process to finish and getting a future result both block. asyncio gives us a way to unify the concurrent and parallel APIs.&lt;/span&gt;&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;&lt;span class=&quot;s1&quot;&gt;&lt;div class=&quot;p1&quot;&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/f625f61f8f2fb8bb00e8db7f365e8e83.js&quot;&gt;&lt;/script&gt;&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;&lt;div class=&quot;p1&quot;&gt;&lt;div class=&quot;p1&quot;&gt;Slow operation sum 10000000 complete&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;Got result 49999995000000&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;Slow operation sum 20000000 complete&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;Got result 199999990000000&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;Slow operation sum 30000000 complete&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;Got result 449999985000000&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;Completed in 2.91 seconds&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;There is some overhead in creating the processes, but running the 1-, 2-, and 3-second tasks in parallel is only slightly slower than a single 3-second task.&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;&lt;div class=&quot;p1&quot;&gt;Slow operation sum 30000000 complete&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;Got result 449999985000000&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;Completed in 2.66 seconds&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;This hints at another use. Because asyncio runs in a single thread, too many IO-bound tasks, or any task that consumes too much CPU, can overwhelm it. For such a situation it is possible to create multiple processes, each with its own asyncio event loop.&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;&lt;script src=&quot;https://gist.github.com/opiethehokie/560e9d755473fcc1fd5524aa930d8353.js&quot;&gt;&lt;/script&gt;&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;1 process and 3 tasks:&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;&lt;div class=&quot;p1&quot;&gt;Got 3 results in 6.20 seconds&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;1 process and 15 tasks:&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;Got 15 results in 23.94 seconds&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;The simulated mix of IO-bound and CPU-bound code is interesting. It&#39;s slower than just the CPU-bound code on its own, but faster than the sum of the IO-bound and CPU-bound code. 
We see some asyncio benefit up to around 15 tasks, and then it levels off.&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;1 process and 30 tasks:&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;Got 30 results in 46.58 seconds&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;4 processes and 15*4 tasks:&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;Got 60 results in 34.11 seconds&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;8 processes and 15*8 tasks:&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;Got 120 results in 52.81 seconds&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;We similarly see some benefit from parallelizing the code up to the number of CPU cores I have; beyond that the time increases linearly, as we would expect.&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;&lt;div class=&quot;p1&quot;&gt;16 processes and 15*16 tasks:&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;Got 120 results in 105.98 seconds&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;If you are looking for more details, the comprehensive&amp;nbsp;&lt;a href=&quot;https://qtalen.medium.com/list/python-concurrency-2c979347da3b&quot; target=&quot;_blank&quot;&gt;Python Concurrency&lt;/a&gt; series of articles has additional examples like this and goes into more detail on some of the underlying concepts.&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class=&quot;p1&quot;&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/span&gt;&lt;/div&gt;
&lt;/div&gt;
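To make the unified pattern concrete, here is a rough, self-contained sketch of it. It uses a ThreadPoolExecutor so it runs anywhere as-is; a concurrent.futures ProcessPoolExecutor drops into the same spot for true CPU parallelism (the gists above show the actual code from this post):

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

def slow_sum(n):
    # stands in for the CPU-bound work; with a ProcessPoolExecutor this
    # would run on another core instead of contending for the GIL
    return sum(range(n))

async def run_sums():
    loop = asyncio.get_running_loop()
    # swap in concurrent.futures.ProcessPoolExecutor for true parallelism
    with ThreadPoolExecutor() as pool:
        futures = [loop.run_in_executor(pool, slow_sum, n)
                   for n in (10_000_000, 20_000_000, 30_000_000)]
        results = []
        # consume each result as its task finishes, like the coroutine examples
        for future in asyncio.as_completed(futures):
            result = await future
            print('Got result', result)
            results.append(result)
        return results

if __name__ == '__main__':
    asyncio.run(run_sums())
```

run_in_executor wraps each pool future in an asyncio future, which is why the same asyncio.as_completed loop works for both coroutines and executor tasks.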
</description><link>http://www.programmingopiethehokie.com/2017/08/python-3-asyncio.html</link><author>noreply@blogger.com (opiethehokie)</author></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-1401515658801150010.post-7320208392506590703</guid><pubDate>Thu, 23 Feb 2017 13:01:00 +0000</pubDate><atom:updated>2023-07-23T15:27:02.535-04:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">linear algebra</category><category domain="http://www.blogger.com/atom/ns#">optimization</category><category domain="http://www.blogger.com/atom/ns#">python</category><title>Vectorization and Eigenvectors: Sports Rating Examples</title><description>While calculating team ratings for a machine learning-based March Madness prediction,&amp;nbsp;I ran into a couple of situations where my code got slow as I expanded it to include all the teams over all the seasons. By slow I mean several hours, and that was longer than I was willing to wait. I needed to be able to recalculate ratings on-demand, in a few minutes at most.&lt;br /&gt;
&lt;br /&gt;
For my offensive-defensive rating, I started with an implementation similar to the one in&amp;nbsp;&lt;a href=&quot;http://biasvariance.blogspot.com/2015/07/blog-post.html&quot;&gt;Offensive and defensive team ratings for the Premier League 2014-2015&lt;/a&gt;:&lt;br /&gt;
&lt;br /&gt;
&lt;script src=&quot;https://gist.github.com/opiethehokie/c554bf6750b6cff66b55d69680c0a8a8.js&quot;&gt;&lt;/script&gt;

It loops over all the rows and columns in the matrix many times. With college basketball having over three hundred teams, this wasn&#39;t going to work for me. I figured out that it could be &lt;a href=&quot;http://www.programmingopiethehokie.com/2014/08/exploring-vectorization.html&quot;&gt;vectorized&lt;/a&gt;&amp;nbsp;and that numpy could handle it more efficiently than I could:&lt;br /&gt;
&lt;br /&gt;
&lt;script src=&quot;https://gist.github.com/opiethehokie/0dd205f73e497d58a333bb94b23e40d8.js&quot;&gt;&lt;/script&gt;

Not only is it faster, but I would argue the more concise code is easier to understand as well.&lt;br /&gt;
&lt;br /&gt;
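As an illustration of the kind of rewrite involved, here is a toy vectorized offense-defense iteration (my own sketch, not the gist above; the score-matrix layout is an assumption):

```python
import numpy as np

def offense_defense(scores, iterations=20):
    # scores[i, j] = points team j scored against team i (hypothetical layout)
    defense = np.ones(scores.shape[0])
    for _ in range(iterations):
        # rate every team's offense against every defense in one matrix product
        offense = scores.T @ (1.0 / defense)
        # then rate every team's defense against the updated offenses
        defense = scores @ (1.0 / offense)
    return offense, defense
```

With a single iteration the offense vector is just each team's column sum of points scored; further iterations let strength of schedule feed back in, with no explicit loops over teams.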
For my Markov Chain ranking, I started with an implementation similar to the one in &lt;a href=&quot;http://biasvariance.blogspot.com/2015/07/premier-league-2014-15-season-rankings.html&quot;&gt;A Markov Chain ranking of Premier League teams (14/15 season)&lt;/a&gt;:&lt;br /&gt;
&lt;br /&gt;
&lt;script src=&quot;https://gist.github.com/opiethehokie/8d5a4a4782e684a455228d0c2b97b4f7.js&quot;&gt;&lt;/script&gt;

I don&#39;t mean to disparage these two posts in any way. They are awesome and really helped me. I just had a different situation and had to worry about performance, and here I figured out that I could get rid of both loops:&lt;br /&gt;
&lt;br /&gt;
&lt;script src=&quot;https://gist.github.com/opiethehokie/f57df7b9f08ab5d4158216e20eba59b3.js&quot;&gt;&lt;/script&gt;
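In miniature, the eigenvector shortcut looks something like this for a toy transition matrix (a hypothetical sketch assuming numpy; the gist above has the real code):

```python
import numpy as np

def stationary_distribution(transition):
    # transition is column-stochastic: transition[i, j] = P(state j -> state i)
    values, vectors = np.linalg.eig(transition)
    # the eigenvector paired with the largest (about 1) eigenvalue is stationary
    vector = vectors[:, np.argmax(values.real)].real
    return vector / vector.sum()  # scale so the entries sum to 1
```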

Eigenvectors to the rescue. The eigenvector for the largest eigenvalue is the stationary distribution we are trying to find, so we don&#39;t need to do the 100,000-step random walk.</description><link>http://www.programmingopiethehokie.com/2017/02/machine-learning-for-ncaa-basketball.html</link><author>noreply@blogger.com (opiethehokie)</author></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-1401515658801150010.post-3611918183516956797</guid><pubDate>Tue, 31 May 2016 02:09:00 +0000</pubDate><atom:updated>2023-06-26T07:34:55.307-04:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">algorithms</category><title>Big-O notation for recursive algorithms</title><description>&lt;span style=&quot;font-family: inherit;&quot;&gt;&lt;i&gt;&lt;br /&gt;&lt;/i&gt;I once learned of a way to figure out the big-O&lt;sup&gt;*&lt;/sup&gt; for recursive algorithms and in today&#39;s post I just wanted to go back and learn it again. Apparently it&#39;s called the &lt;a href=&quot;https://en.wikipedia.org/wiki/Master_theorem&quot; target=&quot;_blank&quot;&gt;master theorem&lt;/a&gt;.&lt;/span&gt;&lt;br /&gt;
&lt;span style=&quot;font-family: inherit;&quot;&gt;&lt;br /&gt;&lt;/span&gt;
&lt;span style=&quot;font-family: inherit;&quot;&gt;Basically you take the recursive algorithm and let&amp;nbsp;&lt;i&gt;a&lt;/i&gt;&amp;nbsp;be the number of sub-problems in the recursion,&amp;nbsp;&lt;i&gt;b&lt;/i&gt;&amp;nbsp;the factor by which the problem size shrinks (each sub-problem has size&amp;nbsp;&lt;i&gt;n/b&lt;/i&gt;), and&amp;nbsp;&lt;i&gt;f&lt;/i&gt;(&lt;i&gt;n&lt;/i&gt;) the work done outside the recursive calls.&lt;/span&gt;&lt;br /&gt;
&lt;span style=&quot;font-family: inherit;&quot;&gt;&lt;i&gt;&lt;br /&gt;&lt;/i&gt;&lt;/span&gt;
&lt;span style=&quot;font-family: inherit;&quot;&gt;Formulate&amp;nbsp;&lt;i&gt;T&lt;/i&gt;(&lt;i&gt;n&lt;/i&gt;) as&amp;nbsp;&lt;i&gt;aT&lt;/i&gt;(&lt;i&gt;n/b&lt;/i&gt;) +&amp;nbsp;&lt;i&gt;f&lt;/i&gt;(&lt;i&gt;n&lt;/i&gt;). &lt;i&gt;T&lt;/i&gt;(&lt;i&gt;n&lt;/i&gt;)&amp;nbsp;can&#39;t be something like sin &lt;i&gt;n&lt;/i&gt; (must be &lt;a href=&quot;https://en.wikipedia.org/wiki/Monotonic_function&quot; target=&quot;_blank&quot;&gt;monotonic&lt;/a&gt;)&amp;nbsp;and&amp;nbsp;&lt;i&gt;f&lt;/i&gt;(&lt;i&gt;n&lt;/i&gt;) must be polynomially bounded.&lt;/span&gt;&lt;br /&gt;
&lt;span style=&quot;font-family: inherit;&quot;&gt;&lt;span style=&quot;font-family: Times, Times New Roman, serif;&quot;&gt;&lt;br /&gt;&lt;/span&gt;
&lt;span style=&quot;font-family: Times, Times New Roman, serif;&quot;&gt;If &lt;i&gt;f&lt;/i&gt;(&lt;i&gt;n&lt;/i&gt;) ∈ &lt;i&gt;O&lt;/i&gt;(&lt;i&gt;n&lt;sup&gt;d&lt;/sup&gt;&lt;/i&gt;) then the big-O for &lt;i&gt;T&lt;/i&gt;&amp;nbsp;is:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;&lt;i&gt;O&lt;/i&gt;(&lt;i&gt;n&lt;sup&gt;d&lt;/sup&gt;&lt;/i&gt;) if &lt;i&gt;a&lt;/i&gt; &amp;lt; &lt;i&gt;b&lt;sup&gt;d&lt;/sup&gt;&lt;/i&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;&lt;i&gt;O&lt;/i&gt;(&lt;i&gt;n&lt;sup&gt;d&lt;/sup&gt;&lt;/i&gt; log &lt;i&gt;n&lt;/i&gt;)&amp;nbsp;if &lt;i&gt;a&lt;/i&gt; =&amp;nbsp;&lt;i&gt;b&lt;sup&gt;d&lt;/sup&gt;&lt;/i&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;&lt;i&gt;O&lt;/i&gt;(&lt;i&gt;n&lt;/i&gt;&lt;sup&gt;log&lt;sub style=&quot;font-style: italic;&quot;&gt;b&lt;/sub&gt;&lt;i&gt; a&lt;/i&gt;&lt;/sup&gt;)&amp;nbsp;if &lt;i&gt;a&lt;/i&gt; &amp;gt;&amp;nbsp;&lt;i&gt;b&lt;sup&gt;d&lt;/sup&gt;&lt;/i&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;div&gt;
&lt;span style=&quot;font-family: inherit;&quot;&gt;As an example, let&#39;s apply this to the &lt;a href=&quot;https://en.wikipedia.org/wiki/Merge_sort&quot; target=&quot;_blank&quot;&gt;merge sort&lt;/a&gt;. There are two sub-problems so &lt;i&gt;a&lt;/i&gt; = 2. The size of each sub-problem is half of &lt;i&gt;n&lt;/i&gt;, so &lt;i&gt;b&lt;/i&gt; = 2. Outside of the recursive calls there is one pass through the data, so &lt;i&gt;d&lt;/i&gt; = 1. This leaves us with&amp;nbsp;&lt;i&gt;T&lt;/i&gt;(&lt;i&gt;n&lt;/i&gt;) = 2&lt;i&gt;T&lt;/i&gt;(&lt;i&gt;n/2&lt;/i&gt;) + &lt;i&gt;n&lt;/i&gt;&amp;nbsp;which is &lt;i&gt;O&lt;/i&gt;(&lt;i&gt;n&lt;/i&gt; log &lt;i&gt;n&lt;/i&gt;).&lt;/span&gt;&lt;br /&gt;
&lt;span style=&quot;font-family: inherit;&quot;&gt;&lt;br /&gt;&lt;/span&gt;
&lt;span style=&quot;font-family: inherit;&quot;&gt;For &lt;a href=&quot;https://en.wikipedia.org/wiki/Binary_search_algorithm&quot; target=&quot;_blank&quot;&gt;binary search&lt;/a&gt;, it would be a = 1, b = 2, and d = 0. Therefore&amp;nbsp;&lt;i&gt;O&lt;/i&gt;(log&amp;nbsp;&lt;i&gt;n&lt;/i&gt;).&lt;/span&gt;&lt;br /&gt;
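The three cases are small enough to encode in a helper that reproduces both results (my own sketch):

```python
import math

def master_theorem(a, b, d):
    """Big-O for T(n) = a*T(n/b) + O(n^d), following the three cases above."""
    critical = b ** d
    if a == critical:
        return "O(log n)" if d == 0 else f"O(n^{d} log n)"
    if a > critical:
        exponent = math.log(a, b)  # log base b of a
        return f"O(n^{exponent:g})"
    return f"O(n^{d})"  # remaining case: a is smaller than b**d
```

master_theorem(2, 2, 1) gives the merge sort answer and master_theorem(1, 2, 0) the binary search one.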
&lt;span style=&quot;font-family: inherit;&quot;&gt;&lt;br /&gt;&lt;/span&gt;
&lt;span style=&quot;font-family: inherit;&quot;&gt;&lt;br /&gt;&lt;/span&gt;
&lt;span style=&quot;font-family: inherit;&quot;&gt;* Technically I think this should be &lt;a href=&quot;http://stackoverflow.com/questions/471199/what-is-the-difference-between-%CE%98n-and-on&quot; target=&quot;_blank&quot;&gt;big-theta&lt;/a&gt; but informally people only seem to use big-O so I&#39;m sticking with that.&lt;/span&gt;&lt;/div&gt;
&lt;sup&gt;&lt;br /&gt;&lt;/sup&gt;</description><link>http://www.programmingopiethehokie.com/2016/05/big-o-notation-for-recursive-algorithms.html</link><author>noreply@blogger.com (opiethehokie)</author></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-1401515658801150010.post-335003763204455382</guid><pubDate>Fri, 18 Mar 2016 03:27:00 +0000</pubDate><atom:updated>2016-03-17T23:27:12.389-04:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">C#</category><category domain="http://www.blogger.com/atom/ns#">cloud</category><category domain="http://www.blogger.com/atom/ns#">devops</category><title>Twelve-Factor ASP.NET Core Apps</title><description>I had never done any .NET development until this past year when I started to work with ASP.NET Core. Microsoft says it&#39;s cloud-ready and I agree, but I haven&#39;t seen anyone describe what a &lt;a href=&quot;http://12factor.net/&quot; target=&quot;_blank&quot;&gt;twelve-factor&lt;/a&gt; ASP.NET Core app looks like yet. In the Cloud Foundry (CF) and Heroku communities twelve-factor apps are a popular topic. This is my take on it for ASP.NET Core.&lt;br /&gt;
&lt;br /&gt;
The twelve factors are best practices for anyone who works on a software-as-a-service app. They help your app be portable, continuously deployed, horizontally scaled, and easier to develop. Basically cloud-native. I&#39;m most familiar with the CF platform-as-a-service, so assume that&#39;s the execution environment of the app in my examples.&lt;br /&gt;
&lt;ul&gt;
&lt;li&gt;&lt;b&gt;Codebase:&lt;/b&gt;&amp;nbsp;The app is tracked in version control.&lt;/li&gt;
&lt;ul&gt;
&lt;li&gt;Store each app in its own git repository&lt;/li&gt;
&lt;li&gt;Factor shared code into libraries that become dependencies&lt;/li&gt;
&lt;/ul&gt;
&lt;li&gt;&lt;b&gt;Dependencies:&lt;/b&gt;&amp;nbsp;The app declares all of its dependencies and never relies on the implicit existence of system-wide libraries or tools.&lt;/li&gt;
&lt;ul&gt;
&lt;li&gt;Use NuGet package manager for distribution&lt;/li&gt;
&lt;li&gt;Declare dependencies in project.json&lt;/li&gt;
&lt;li&gt;Dependency isolation is provided by the automatic creation of project.lock.json&lt;/li&gt;
&lt;/ul&gt;
&lt;li&gt;&lt;b&gt;Config:&lt;/b&gt;&amp;nbsp;Config for anything that varies between deploys should be stored in environment variables.&lt;/li&gt;
&lt;ul&gt;
&lt;li&gt;Cloud Foundry provides service credentials in the VCAP_SERVICES environment variable&lt;/li&gt;
&lt;li&gt;Use environment variables for other config like whether the environment is dev or prod, or logging levels&lt;/li&gt;
&lt;/ul&gt;
&lt;li&gt;&lt;b&gt;Backing services:&lt;/b&gt;&amp;nbsp;There is no distinction between local and remote services; they are all just resources.&lt;/li&gt;
&lt;ul&gt;
&lt;li&gt;Based on the config, add services to the DI container in the ConfigureServices method of Startup.cs&lt;/li&gt;
&lt;/ul&gt;
&lt;li&gt;&lt;b&gt;Build, release, run:&lt;/b&gt;&amp;nbsp;A build transforms the code repository into an executable bundle, release takes the build output and combines it with the config, and run starts the app.&lt;/li&gt;
&lt;ul&gt;
&lt;li&gt;I think the build could be a dnu publish or the&amp;nbsp;&lt;a href=&quot;https://devcenter.heroku.com/articles/buildpack-api#bin-compile&quot; target=&quot;_blank&quot;&gt;buildpack compile&lt;/a&gt;&amp;nbsp;phase&lt;/li&gt;
&lt;li&gt;I think the release could be pushing the app to CF or the&amp;nbsp;&lt;a href=&quot;https://devcenter.heroku.com/articles/buildpack-api#bin-release&quot; target=&quot;_blank&quot;&gt;buildpack release&lt;/a&gt;&amp;nbsp;phase&lt;/li&gt;
&lt;li&gt;The run (regardless of how you prefer to think of build and release) is CF starting the app and creating a route&lt;/li&gt;
&lt;li&gt;Provide a rollback mechanism by using &lt;a href=&quot;https://docs.pivotal.io/pivotalcf/devguide/deploy-apps/blue-green.html&quot; target=&quot;_blank&quot;&gt;blue-green deploys&lt;/a&gt; and &lt;a href=&quot;http://martinfowler.com/bliki/CanaryRelease.html&quot; target=&quot;_blank&quot;&gt;canary testing&lt;/a&gt;&amp;nbsp;(there are CF CLI plugins for this)&lt;/li&gt;
&lt;/ul&gt;
&lt;li&gt;&lt;b&gt;Processes:&lt;/b&gt;&amp;nbsp;The app is stateless and doesn&#39;t rely on sticky sessions.&amp;nbsp;&lt;/li&gt;
&lt;ul&gt;
&lt;li&gt;Use a backing service like Redis for HTTP sessions&lt;/li&gt;
&lt;li&gt;Don&#39;t write to the local filesystem because it&#39;s ephemeral and not shared between the multiple instances of the app&lt;/li&gt;
&lt;/ul&gt;
&lt;li&gt;&lt;b&gt;Port binding:&lt;/b&gt;&amp;nbsp;The app is self-contained. It listens on a port specified by the execution environment.&lt;/li&gt;
&lt;ul&gt;
&lt;li&gt;Declare a dependency on Kestrel&lt;/li&gt;
&lt;li&gt;Kestrel can be configured with the --server.urls command-line option to run on a specific host and port&lt;/li&gt;
&lt;/ul&gt;
&lt;li&gt;&lt;b&gt;Concurrency:&lt;/b&gt;&amp;nbsp;The app scales horizontally because nothing is shared.&amp;nbsp;&lt;/li&gt;
&lt;ul&gt;
&lt;li&gt;Manual CF CLI scale and auto-scaling based on different metrics&lt;/li&gt;
&lt;li&gt;CF restarts app on crashes and provides an API to start and stop the app&lt;/li&gt;
&lt;li&gt;Run at least 3 app instances to keep the app running during data center and CF updates&lt;/li&gt;
&lt;/ul&gt;
&lt;li&gt;&lt;b&gt;Disposability:&lt;/b&gt;&amp;nbsp;The app starts quickly, shuts down gracefully, and handles unexpected non-graceful terminations.&lt;/li&gt;
&lt;ul&gt;
&lt;li&gt;When an application is deployed the dependencies including the runtime are cached and re-used to create additional instances of the application&lt;/li&gt;
&lt;li&gt;Implement a &lt;a href=&quot;http://stackoverflow.com/questions/35257287/kestrel-shutdown-function-in-startup-cs-in-asp-net-core&quot;&gt;shutdown handler&lt;/a&gt; that can run when CF stops your app&lt;/li&gt;
&lt;li&gt;Use message queues to communicate between applications and keep track of work for non-graceful terminations&lt;/li&gt;
&lt;/ul&gt;
&lt;li&gt;&lt;b&gt;Dev/prod parity:&lt;/b&gt;&amp;nbsp;The development, staging, and production environments are as similar as possible.&lt;/li&gt;
&lt;ul&gt;
&lt;li&gt;Can do development on Linux to match OS in CF&lt;/li&gt;
&lt;li&gt;Can run ASP.NET Core apps in Docker containers even using the &lt;a href=&quot;https://hub.docker.com/r/cloudfoundry/cflinuxfs2/&quot;&gt;cflinuxfs2&lt;/a&gt; image to get really close to CF, for example when running tests&lt;/li&gt;
&lt;li&gt;Can use Kestrel &lt;a href=&quot;http://stackoverflow.com/questions/33748192/how-do-i-start-kestrel-from-a-test/33771514#33771514&quot;&gt;web server in integration tests&lt;/a&gt; and production&lt;/li&gt;
&lt;/ul&gt;
&lt;li&gt;&lt;b&gt;Logs:&lt;/b&gt;&amp;nbsp;Logs are a stream of events and can be consumed by a data warehouse, log indexing, or log analysis service.&lt;/li&gt;
&lt;ul&gt;
&lt;li&gt;CF apps can log to STDOUT/STDERR and the log messages are buffered in memory where they can be tailed or dumped&lt;/li&gt;
&lt;li&gt;Log messages can be persisted by &lt;a href=&quot;http://docs.cloudfoundry.org/devguide/services/log-management.html#create&quot;&gt;providing a syslog URL&lt;/a&gt; for a log service and they will be drained there&lt;/li&gt;
&lt;/ul&gt;
&lt;li&gt;&lt;b&gt;Admin processes:&lt;/b&gt;&amp;nbsp;One-off admin processes are executed in the same environment and with the same configuration as the app.&lt;/li&gt;
&lt;ul&gt;
&lt;li&gt;Control the app&#39;s start command in a manifest.yml file that is version controlled with the app, like &lt;i&gt;dnx ef database update &amp;amp;&amp;amp; dnx web ...&lt;/i&gt;&lt;/li&gt;
&lt;li&gt;Employ worker apps to run your admin processes and don&#39;t assign a route to them&lt;/li&gt;
&lt;/ul&gt;
&lt;/ul&gt;
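Many of these factors are language-agnostic. Config, for example, boils down to parsing an environment variable no matter the platform: Cloud Foundry hands every app its bound-service credentials as JSON in VCAP_SERVICES. A sketch in Python for brevity (the "redis" label and credential fields are hypothetical, matching whatever services the app was bound to):

```python
import json
import os

def bound_credentials(label):
    # VCAP_SERVICES maps a service label to a list of bound instances,
    # each carrying a credentials dictionary
    services = json.loads(os.environ.get("VCAP_SERVICES", "{}"))
    instances = services.get(label, [])
    return instances[0]["credentials"] if instances else None
```

In ASP.NET Core the same lookup would feed the DI container in ConfigureServices.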
When you dig into these you realize that many of them can actually be handled by a PaaS like Cloud Foundry or by an application&#39;s runtime like ASP.NET Core. That&#39;s good because then the app can focus on the problem it&#39;s trying to solve. The app developers just need to be aware of and know how to leverage what&#39;s provided. These folks should study the twelve-factor site as it has much more detail and examples than included here.&lt;br /&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Check out the &lt;a href=&quot;https://github.com/cloudfoundry-community/asp.net5-buildpack&quot; target=&quot;_blank&quot;&gt;ASP.NET Core Cloud Foundry buildpack&lt;/a&gt;&amp;nbsp;mentioned above (full disclosure: I am a contributor to this).&lt;br /&gt;
&lt;br /&gt;
Something else to keep an eye on is&amp;nbsp;&lt;a href=&quot;https://github.com/SteelToeOSS&quot; target=&quot;_blank&quot;&gt;Steel Toe OSS&lt;/a&gt;. The twelve factors&amp;nbsp;don&#39;t cover some microservice patterns like service discovery, or latency and fault tolerance patterns like circuit breakers and bulkheads. The .NET world seems to currently be far behind Java in this area, so it&#39;s good to see something in the works.&lt;/div&gt;
</description><link>http://www.programmingopiethehokie.com/2016/03/twelve-factor-aspnet-core-apps.html</link><author>noreply@blogger.com (opiethehokie)</author></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-1401515658801150010.post-3541277328341508218</guid><pubDate>Mon, 28 Dec 2015 21:54:00 +0000</pubDate><atom:updated>2023-11-04T10:42:32.255-04:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">books</category><category domain="http://www.blogger.com/atom/ns#">concurrency</category><category domain="http://www.blogger.com/atom/ns#">coursera</category><category domain="http://www.blogger.com/atom/ns#">scala</category><title>Reactive Actors</title><description>I&#39;ve been meaning to revisit&amp;nbsp;&lt;a href=&quot;https://en.wikipedia.org/wiki/Reactive_programming&quot; target=&quot;_blank&quot;&gt;reactive programming&lt;/a&gt;&amp;nbsp;and the&amp;nbsp;&lt;a href=&quot;https://en.wikipedia.org/wiki/Actor_model&quot; target=&quot;_blank&quot;&gt;actor model&lt;/a&gt;&amp;nbsp;for a while now. I first learned about them in the&amp;nbsp;&lt;a href=&quot;https://www.coursera.org/course/reactive&quot; target=&quot;_blank&quot;&gt;Principles of Reactive Programming&lt;/a&gt;&amp;nbsp;Coursera class and then actors came up again in the&amp;nbsp;&lt;a href=&quot;https://pragprog.com/book/pb7con/seven-concurrency-models-in-seven-weeks&quot; target=&quot;_blank&quot;&gt;Seven Concurrency Models in Seven Weeks&lt;/a&gt;&amp;nbsp;book. The Scala I picked up is quickly being forgotten and I haven&#39;t done a post with code in a while, so here I&#39;ll get back into that and create a simple application using&amp;nbsp;&lt;a href=&quot;http://akka.io/&quot; target=&quot;_blank&quot;&gt;Akka&lt;/a&gt;&amp;nbsp;and&amp;nbsp;&lt;a href=&quot;http://reactivex.io/rxscala/&quot; target=&quot;_blank&quot;&gt;RxScala&lt;/a&gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;b&gt;Actor model&lt;/b&gt;&lt;br /&gt;
&lt;br /&gt;
The developerWorks article&amp;nbsp;&lt;a href=&quot;https://www.ibm.com/developerworks/library/j-jvmc5/&quot; target=&quot;_blank&quot;&gt;JVM Concurrency: Acting asynchronously with Akka&lt;/a&gt;&amp;nbsp;gives a good introduction to the actor model:&lt;br /&gt;
&lt;blockquote cite=&quot;https://www.ibm.com/developerworks/library/j-jvmc5/&quot;&gt;
The actor model for concurrent computations builds up systems based on primitives called&amp;nbsp;&lt;i&gt;actors&lt;/i&gt;. Actors take actions in response to inputs called&amp;nbsp;&lt;i&gt;messages&lt;/i&gt;. Actions can include changing the actor&#39;s own internal state as well as sending off other messages and even creating other actors. All messages are delivered asynchronously, thereby decoupling message senders from receivers. Because of this decoupling, actor systems are inherently concurrent: Any actors that have input messages available can be executed in parallel, without restriction.&lt;/blockquote&gt;
Then&amp;nbsp;&lt;a href=&quot;http://www.ibm.com/developerworks/opensource/library/j-jvmc6/index.html&quot; target=&quot;_blank&quot;&gt;JVM Concurrency: Building actor applications with Akka&lt;/a&gt;&amp;nbsp;goes on to explain the advantages of this approach:&lt;br /&gt;
&lt;blockquote cite=&quot;http://www.ibm.com/developerworks/opensource/library/j-jvmc6/index.html&quot;&gt;
If you compose your actors and messages correctly, you end up with a system in which most things happen asynchronously. Asynchronous operation is harder to understand than a linear approach, but it pays off in scalability. Highly asynchronous programs are better able to use increased system resources (for example, memory and processors) either to accomplish a particular task more quickly or to handle more instances of the task in parallel. With Akka, you can even extend this scalability across multiple systems, by using remoting to work with distributed actors.&lt;/blockquote&gt;
At first the actor model may sound the same as what I described in my&amp;nbsp;&lt;a href=&quot;http://www.programmingopiethehokie.com/2015/05/communicating-sequential-processes.html&quot; target=&quot;_blank&quot;&gt;Communicating Sequential Processes post&lt;/a&gt;&amp;nbsp;because both involve message passing, but the two concurrency models have several differences:&lt;br /&gt;
&lt;ul&gt;
&lt;li&gt;Actors have identities while CSPs are anonymous&lt;/li&gt;
&lt;li&gt;Actors transmit messages to named actors while CSPs transmit messages using channels&lt;/li&gt;
&lt;li&gt;Actors transmit messages asynchronously while CSPs can&#39;t transmit a message until the receiver is ready to accept it&lt;/li&gt;
&lt;/ul&gt;
&lt;div&gt;
My impression, and I could be wrong, is that actors more naturally extend beyond a single machine to a distributed system since the sending and receiving of messages is decoupled. A quick search does turn up&amp;nbsp;&lt;a href=&quot;https://code.google.com/p/pycsp/wiki/Getting_Started_With_Parallel&quot; target=&quot;_blank&quot;&gt;distributed channels in pycsp&lt;/a&gt;, though, so it seems that both can be distributed.&lt;br /&gt;
&lt;br /&gt;
&lt;b&gt;Reactive applications&lt;/b&gt;&lt;br /&gt;
&lt;br /&gt;
The&amp;nbsp;&lt;a href=&quot;http://www.reactivemanifesto.org/&quot; target=&quot;_blank&quot;&gt;Reactive Manifesto&lt;/a&gt;&amp;nbsp;details four qualities of reactive applications:&lt;br /&gt;
&lt;ul&gt;
&lt;li&gt;responsive - the system responds in a timely manner if at all possible&lt;/li&gt;
&lt;li&gt;resilient - the system stays responsive in the face of failure&lt;/li&gt;
&lt;li&gt;elastic - the system stays responsive under varying workloads&lt;/li&gt;
&lt;li&gt;message driven - the system relies on asynchronous message passing between components&lt;/li&gt;
&lt;/ul&gt;
Where Akka describes itself as a toolkit and&amp;nbsp;&lt;i&gt;runtime&lt;/i&gt;, RxScala only claims to be a library for composing asynchronous and event-based programs using observable sequences. To me it&#39;s not clear how it helps us achieve all four qualities (or if it even intends to). &amp;nbsp;Nevertheless, the&amp;nbsp;&lt;a href=&quot;http://reactivex.io/intro.html&quot; target=&quot;_blank&quot;&gt;ReactiveX introduction&lt;/a&gt;&amp;nbsp;explains their advantages:&lt;br /&gt;
&lt;blockquote&gt;
The ReactiveX Observable model allows you to treat streams of asynchronous events with the same sort of simple, composable operations that you use for collections of data items like arrays. It frees you from tangled webs of callbacks, and thereby makes your code more readable and less prone to bugs.&lt;/blockquote&gt;
This means that the methods returning Observables can be implemented using thread pools, non-blocking I/O, actors, or anything else. This is how ReactiveX and Akka will be used together: Actors are the concurrency implementation for services communicating with asynchronous messages.&lt;br /&gt;
&lt;br /&gt;
&lt;b&gt;Combining Akka and RxScala&lt;/b&gt;&lt;br /&gt;
&lt;br /&gt;
I came up with the following short example. First I wrote a couple of methods returning an Observable to get a feel for it, then added the&amp;nbsp;stockQuote()&amp;nbsp;method, which also uses an actor in its implementation:&lt;br /&gt;
&lt;br /&gt;
&lt;script src=&quot;https://gist.github.com/opiethehokie/2ffbe732f92c21681b80.js&quot;&gt;&lt;/script&gt;

&lt;br /&gt;
Running it produces the expected output, something like:&lt;br /&gt;
&lt;br /&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;&lt;i&gt;6&lt;/i&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;&lt;i&gt;8&lt;/i&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;&lt;i&gt;broken service&lt;/i&gt;&lt;/span&gt;&lt;/div&gt;
&lt;i&gt;GOOG: 253.22&lt;/i&gt;&lt;br /&gt;
&lt;br /&gt;
I can really see the potential in the Observable model, especially after reading more about it at&amp;nbsp;&lt;a href=&quot;http://techblog.netflix.com/2013/02/rxjava-netflix-api.html&quot; target=&quot;_blank&quot;&gt;The Netflix Tech Blog&lt;/a&gt;.&amp;nbsp;If you were already using actors maybe combining them like this could make sense. I also need to check out Akka Streams, which seems like a similar idea.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;UPDATE&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a href=&quot;https://docs.ray.io/en/latest/ray-overview/index.html&quot; target=&quot;_blank&quot;&gt;Ray&lt;/a&gt; is a framework for parallelizing ML workloads. They use the &lt;a href=&quot;https://docs.ray.io/en/latest/ray-core/actors.html#actor-guide&quot; target=&quot;_blank&quot;&gt;actor model&lt;/a&gt; as a way of coordinating work and maintaining state.&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
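Since the update points at Ray's Python actors, the essentials of the model (an identity, private state, and an asynchronous mailbox) are small enough to sketch with just the standard library. A toy, not Akka or Ray:

```python
import queue
import threading

class CounterActor:
    """A toy actor: an identity, private state, and a mailbox."""

    def __init__(self):
        self._mailbox = queue.Queue()  # senders never block on the receiver
        self._count = 0
        threading.Thread(target=self._run, daemon=True).start()

    def send(self, message, reply_to=None):
        # asynchronous: returns immediately, the actor processes it later
        self._mailbox.put((message, reply_to))

    def _run(self):
        while True:
            message, reply_to = self._mailbox.get()
            if message == "increment":
                self._count += 1  # only this thread ever touches the state
            elif message == "get" and reply_to is not None:
                reply_to.put(self._count)  # replies go out as messages too

counter = CounterActor()
for _ in range(3):
    counter.send("increment")
replies = queue.Queue()
counter.send("get", reply_to=replies)
print(replies.get(timeout=5))  # 3
```

Because only the mailbox thread mutates the count, there are no locks in sight, which is the property that lets actor systems scale out so naturally.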
</description><link>http://www.programmingopiethehokie.com/2015/12/reactive-actors.html</link><author>noreply@blogger.com (opiethehokie)</author></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-1401515658801150010.post-860763885828341144</guid><pubDate>Fri, 04 Sep 2015 22:17:00 +0000</pubDate><atom:updated>2023-07-23T15:28:39.580-04:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">java</category><category domain="http://www.blogger.com/atom/ns#">math</category><title>Bitwise operations 101</title><description>Whenever I see lists of interview questions it seems like bitwise operations are on there. I&#39;ve never used these at work and I don&#39;t even remember needing to do it in school. I wanted to see if I could answer a few of these. After a quick review of what each operation does this is what I came up with.&lt;br /&gt;
&lt;br /&gt;
&lt;script src=&quot;https://gist.github.com/opiethehokie/034335312e2c6a2df5d4.js&quot;&gt;&lt;/script&gt;
The first three methods--nthBitSet, countBits, and isPalindrome--are fairly intuitive and I feel like they could be reasonable interview questions.&lt;br /&gt;
&lt;br /&gt;
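For comparison, here are sketches of those same three ideas in Python (my own take, not the gist above):

```python
def nth_bit_set(x, n):
    # shift the bit of interest into position 0 and mask the rest away
    return (x >> n) & 1 == 1

def count_bits(x):
    count = 0
    while x:
        x &= x - 1  # clears the lowest set bit on every pass
        count += 1
    return count

def is_bit_palindrome(x):
    # build the bits in reverse order, then compare with the original
    original, reversed_bits = x, 0
    while x:
        reversed_bits = (reversed_bits << 1) | (x & 1)
        x >>= 1
    return reversed_bits == original
```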
For adding and multiplying I cheated and looked up the algorithms. They sort of make sense, like the repeated additions for multiplication, but it would have taken a long time to come up with that on my own.</description><link>http://www.programmingopiethehokie.com/2015/09/bitwise-operations-101.html</link><author>noreply@blogger.com (opiethehokie)</author></item></channel></rss>