<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:media="http://search.yahoo.com/mrss/"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Towards AI</title>
	<atom:link href="https://towardsai.com/feed" rel="self" type="application/rss+xml" />
	<link>https://towardsai.com</link>
	<description>Making AI accessible to all</description>
	<lastBuildDate>Thu, 25 Jun 2026 11:14:15 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>

<image>
	<url>https://towardsai.com/wp-content/uploads/2019/05/cropped-towards-ai-square-circle-png-32x32.png</url>
	<title>Towards AI</title>
	<link>https://towardsai.com</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>I Deleted Every Static Claude API Key I Owned. Here’s the Keyless Migration, Provider by Provider.</title>
		<link>https://towardsai.com/p/machine-learning/i-deleted-every-static-claude-api-key-i-owned-heres-the-keyless-migration-provider-by-provider</link>
		
		<dc:creator><![CDATA[Anup Karanjkar]]></dc:creator>
		<pubDate>Thu, 25 Jun 2026 11:50:15 +0000</pubDate>
				<category><![CDATA[Latest]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Towards AI - Medium]]></category>
		<guid isPermaLink="false">https://towardsai.com/?p=51899</guid>

					<description><![CDATA[Author(s): Anup Karanjkar Originally published on Towards AI. Workload Identity Federation just hit GA — the per-provider setup, and the precedence trap that cost me two quiet days Last Tuesday I went looking for every static Claude API key I owned, and stopped counting at eleven. The author recounts migrating from long-lived static Claude API keys to keyless authentication using Workload Identity Federation (WIF), emphasizing that federation doesn’t truly “delete” the secret—it moves trust and credentials upstream to the identity provider. They explain how the system works (issuer, service account, federation rule; runtime JWT exchange to short-lived access tokens), then share the critical migration gotcha: the SDK’s credential precedence chain means that if an environment variable like ANTHROPIC_API_KEY is still present anywhere, it will silently override WIF and make the migration appear successful while doing nothing. The post provides a reliable no-downtime cutover sequence (configure federation in parallel, verify with ant auth status, remove the key everywhere, confirm federation wins, then revoke), and gives guidance for setting tight match conditions per provider (GitHub Actions, Kubernetes, AWS, GCP, Entra/Okta) to avoid wildcard rules. Finally, it stresses what WIF doesn’t solve—upstream IdP misconfiguration, lack of attestation for runtime workload identity, and limited auditability across governance frameworks—so “keyless” must be paired with proper IdP security and auditing of the trust hop you can’t see. Read the full blog for free on Medium. Join thousands of data leaders on the AI newsletter. Join over 80,000 subscribers and keep up to date with the latest developments in AI. From research to projects and ideas. If you are building an AI startup, an AI-related product, or a service, we invite you to consider becoming a sponsor. Published via Towards AI]]></description>
		
		
		
		<media:content url="https://miro.medium.com/v2/resize:fit:1000/1*wUqvE_Z40kNav9b6_DWEuw.png" medium="image"></media:content>
            	</item>
		<item>
		<title>I Replaced ChatGPT With Local AI for 30 Days. Here’s What Actually Happened.</title>
		<link>https://towardsai.com/p/machine-learning/i-replaced-chatgpt-with-local-ai-for-30-days-heres-what-actually-happened</link>
		
		<dc:creator><![CDATA[MayhemCode]]></dc:creator>
		<pubDate>Thu, 25 Jun 2026 11:44:38 +0000</pubDate>
				<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Latest]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Towards AI - Medium]]></category>
		<guid isPermaLink="false">https://towardsai.com/?p=51901</guid>

					<description><![CDATA[Author(s): MayhemCode Originally published on Towards AI. Why Local AI Is Not a Fringe Thing Anymore My ChatGPT Plus subscription was costing me $20 a month. That’s $240 a year. For someone who uses AI every single day for drafting, coding help, and summarizing long PDFs, that number started to bother me. Not because it’s too expensive in absolute terms, but because I kept hearing people say local models had gotten good enough to replace it. I wanted to find out if that was actually true. After running local AI for 30 days with Ollama and Open WebUI on a desktop and a MacBook, the author found that today’s models are genuinely capable for everyday work (especially writing, summarizing, brainstorming, and many “80% of the time” knowledge tasks), often producing results close enough to ChatGPT to be hard to tell apart. Qwen3 32B became the main choice for quality, while smaller or different models (like DeepSeek for reasoning-style tasks and Gemma for lightweight summarization and quick Q&#38;A) served specific use cases. Local AI’s biggest wins were privacy (prompts never leave the machine) and cost for high-volume batch text processing, where local inference can be far cheaper and faster for repetitive jobs. The main frustrations were long-context multi-step reasoning failures, limited or absent image understanding for most local setups, slower response speeds on CPU for big models, and the real time/effort required to troubleshoot local configuration and model selection. Overall, the author concludes that local AI isn’t a full replacement for the best cloud models, but it can replace most cloud usage, making a hybrid workflow (local for the bulk, cloud for the hardest 10–15%) the most practical approach; they end by recommending starter models based on hardware and emphasizing that even when switching back to cloud, the privacy instinct learned during the experiment made the process feel different. 
Read the full blog for free on Medium.]]></description>
		
		
		
		<media:content url="https://miro.medium.com/v2/resize:fit:1000/1*876meYFzhqT_AMbDmbYuSA.jpeg" medium="image"></media:content>
            	</item>
		<item>
		<title>A Practical Guide to Evaluating a Cloud Migration Partner</title>
		<link>https://towardsai.com/p/machine-learning/a-practical-guide-to-evaluating-a-cloud-migration-partner</link>
		
		<dc:creator><![CDATA[Datafortune Inc]]></dc:creator>
		<pubDate>Thu, 25 Jun 2026 11:44:13 +0000</pubDate>
				<category><![CDATA[Cloud Computing]]></category>
		<category><![CDATA[Latest]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Towards AI - Medium]]></category>
		<guid isPermaLink="false">https://towardsai.com/?p=51903</guid>

					<description><![CDATA[Author(s): Datafortune Inc Originally published on Towards AI. Should we move to AWS, Azure, or GCP? Do we need a hybrid architecture? Is multicloud the right long-term strategy? How quickly can we modernize legacy workloads? These are important questions. Yet they often overshadow a decision that can have just as much impact on the outcome of a migration: choosing the partner who will help execute it. When organizations look back on migrations that exceeded budgets or missed deadlines, the story is rarely about a lack of cloud capability. More often, it’s about whether the people leading the migration understood the environment they were moving into, the business they were supporting, and the operational realities waiting on the other side of go-live. Scrutinizing the platform far more closely than the partner is a risky imbalance. A cloud migration partner does more than move workloads from one environment to another. Their decisions influence migration timelines, governance models, cost visibility, operational readiness, and the experience of the teams that inherit the environment after launch. If you’re evaluating partners for an upcoming migration, there are a few signals worth paying attention to long before a contract is signed. Why Cloud Migrations Go Wrong Before Migration Begins The cloud platforms themselves are mature, proven, and used at massive scale. But if you ask a room full of IT leaders about failed cloud migrations, you’ll hear familiar explanations. The timeline was too aggressive. Dependencies surfaced late. The application architecture was more complex than expected. Compliance requirements appeared halfway through the project. Teams discovered that critical applications were more interconnected than anyone realized. These explanations are often symptoms rather than root causes. Problems usually emerge in the gaps between planning and execution. Many migration challenges can be traced back to decisions made before the migration begins. 
Specifically, decisions about how the migration is planned, who is responsible for it, and how success is defined. The first and most common mistake is evaluating a partner primarily through certifications. Cloud certifications matter, as they demonstrate expertise with a platform’s services, tools, and best practices. What they don’t reveal is whether a team has experience migrating an environment that resembles yours. For example, a manufacturing company moving an ERP platform faces a very different set of challenges than a software company migrating customer-facing applications. Another mistake emerges when migration planning focuses almost exclusively on infrastructure. The conversation becomes centered on servers, storage, networking, and timelines, while business processes receive less attention. Unfortunately, business processes are often where the most expensive surprises are hiding. An application exchanges data with other systems, supports multiple departments, and often serves workflows that have evolved over many years. When those relationships aren’t fully understood, migration teams discover them in the middle of execution, usually when changes become significantly more expensive. Three Signals You’re Evaluating the Wrong Things Over the years, a few patterns tend to show up when organizations focus on the wrong evaluation criteria. Signal #1: Every conversation revolves around tools and technologies They should absolutely be part of the discussion. The problem arises when it’s the only discussion. If every meeting centers on cloud services, migration tools, and platform capabilities, you’re only seeing part of the picture. A migration is ultimately a business initiative supported by technology, not the other way around. A partner should be asking questions about operational dependencies, critical business processes, reporting requirements, regulatory obligations, and acceptable downtime windows. 
Those conversations often reveal more about migration complexity than the technical architecture diagram. Signal #2: Nobody discusses operational ownership Many migration projects are planned around a finish line. The workloads are migrated, and the project is officially complete; nobody talks about what happens after go-live. The first few months after a migration are often when organizations discover optimization opportunities, integration issues, user adoption challenges, and operational adjustments that weren’t visible during planning. A partner’s role during that period can be just as important as their role during the migration itself. If post-migration ownership remains vague throughout the evaluation process, it’s worth digging deeper before moving forward. Signal #3: Compliance appears late in the discussion Not all cloud environments are built for the same purpose. A company adopting a hybrid architecture faces different operational considerations than one pursuing a multicloud strategy. Governance models, networking requirements, security controls, and workload placement decisions can vary significantly depending on the environment being built. Yet many evaluation discussions treat cloud migration as though every destination follows the same blueprint. Understanding the target environment should shape the migration strategy from the beginning. Questions to Ask Every Cloud Migration Partner Once the conversation moves beyond certifications, case studies, and platform expertise, the quality of the evaluation often depends on the questions being asked. The goal isn’t to put a potential partner under pressure. It’s to understand how they think when complexity appears, priorities conflict, and decisions have to be made with incomplete information. Here are a few questions worth bringing into the discussion. Q1. Have You Migrated Workloads Similar to Ours? Experience is most valuable when it is relevant. 
A partner may have completed dozens of migrations and still have limited experience with the specific challenges your organization faces. Ask for examples that resemble your environment, not just your industry. Pay attention to how they describe the challenges they encountered and how those challenges were resolved. Specific answers tend to reveal genuine experience. Q2. How Do You Identify and Manage Dependencies? Dependencies are responsible for a surprising number of migration delays. Applications exchange data with other systems, rely on shared services, support business processes, and interact with users across multiple departments. The more interconnected the environment, the more important dependency mapping becomes. A strong partner should be able to explain how they discover, document, validate, and monitor dependencies before migration work begins. The methodology matters as much as the final architecture. Q3. What Happens if Something Doesn’t Go According to Plan? Every migration plan includes assumptions. Some of those assumptions will prove accurate. Others won’t. What creates risk is the absence of a structured response when unexpected issues emerge. Ask how [&#8230;]]]></description>
		
		
		
		<media:content url="https://miro.medium.com/v2/resize:fit:700/1*4LKPfmHxCakXCZ0G7ICjwQ.png" medium="image"></media:content>
            	</item>
		<item>
		<title>AsyncIO in Python: What It Actually Is and Why Your ‘Async’ Code Might Not Be Async</title>
		<link>https://towardsai.com/p/machine-learning/asyncio-in-python-what-it-actually-is-and-why-your-async-code-might-not-be-async</link>
		
		<dc:creator><![CDATA[Rizwanhoda]]></dc:creator>
		<pubDate>Thu, 25 Jun 2026 11:43:12 +0000</pubDate>
				<category><![CDATA[Latest]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Towards AI - Medium]]></category>
		<guid isPermaLink="false">https://towardsai.com/?p=51905</guid>

					<description><![CDATA[Author(s): Rizwanhoda Originally published on Towards AI. First: What Problem Does AsyncIO Solve? Adding async and await to your code doesn&#39;t make it asynchronous. It makes it eligible to be asynchronous. There&#39;s a big difference, and it bites almost everyone the first time. Photo by Árpád Czapp on Unsplash. The article explains that AsyncIO is designed to improve performance for I/O-bound workloads by using cooperative multitasking: while tasks are waiting, the event loop can run other pending work rather than blocking a single thread. It walks through how the event loop schedules coroutines and why yielding only happens at proper await points. It also clarifies common failure modes: using sequential awaits when concurrency is needed, accidentally blocking the event loop with synchronous libraries or CPU-heavy work, forgetting to actually run the event loop, and mixing sync/async incorrectly. Through a real FastAPI “before vs after” example and a mental model, the piece shows that async/await are signaling mechanisms, not speed buttons, and that actual concurrency requires scheduling multiple coroutines together (e.g., with asyncio.gather or create_task). Read the full blog for free on Medium.]]></description>
		
		
		
		<media:content url="https://miro.medium.com/v2/resize:fit:700/0*5gnep8V6kYVgCaAE" medium="image"></media:content>
            	</item>
		<item>
		<title>Building Long-Running Claude Managed Agents: Why State Matters More Than Compute</title>
		<link>https://towardsai.com/p/machine-learning/building-long-running-claude-managed-agents-why-state-matters-more-than-compute</link>
		
		<dc:creator><![CDATA[Divy Yadav]]></dc:creator>
		<pubDate>Thu, 25 Jun 2026 11:43:01 +0000</pubDate>
				<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Data Science]]></category>
		<category><![CDATA[Latest]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Towards AI - Medium]]></category>
		<guid isPermaLink="false">https://towardsai.com/?p=51907</guid>

					<description><![CDATA[Author(s): Divy Yadav Originally published on Towards AI. At 9:03 am on a Tuesday, my research agent said hello and stared at an empty /workspace/. Six hours of analysis from the night before. Gone. The cloned repository. The installed packages. The notes it had spent hours writing. Gone. I had assumed that if an agent stopped working for the night, it could simply continue the next morning. That was wrong. Over the next three weeks, I rebuilt the same workflow on Tensorlake, Cloudflare, and Daytona to figure out what had happened. The hardest part of running Claude Managed Agents isn’t the model. It’s everything underneath it. This is the exact code I ran, the things that broke, and the mistake that cost me two weeks to understand. If you want more information like this about AI, consider subscribing to my newsletter, where you will get noise-free AI information every week. What Claude Managed Agents is, before anything else If you’ve never built with Claude Managed Agents, the architecture needs a minute. Skip this if you already know it. Anthropic runs the reasoning. You run the execution. The agent loop, session state, work queue, and retry logic all live on Anthropic’s infrastructure. You configure a Self-hosted Environment in the Claude Console. When your application starts a session, Anthropic queues the work, your orchestrator picks it up, spins up a sandbox, and the model starts issuing tool calls into that sandbox. Every bash, read, write, grep, and edit call executes inside an environment you own. Anthropic never touches it. You decide what that environment looks like, what it can access, and what happens between sessions. Anthropic’s intelligence is fixed. Your engineering determines whether that intelligence has a stable, stateful environment to work in, or a clean slate that forgets everything the moment it goes idle. 
What I was building and why it mattered I needed an agent that could do real deep-work research on a codebase: clone a repository, read through the module structure, build an understanding of how the pieces fit together, write notes, and propose refactoring strategies. The kind of work that takes a senior engineer a full day and an AI agent about six hours. The key constraint: the agent couldn’t do this all at once. Sometimes I’d kick off a session at 8pm, let it run until midnight, and pick it back up the next morning. The filesystem it had built during that first session (the analysis notes, the installed tools, the half-read source files) had to be there when the next session started. Rebuilding from scratch each time wasn’t viable. That constraint is what drove every provider decision I made. The requirements I didn’t know I had At the start, I thought I needed a Linux environment that could run Claude Managed Agents. By the end, I realized I actually needed three things. I found them all in one place, but not until I had looked in two others first. (1) A filesystem that survived between work sessions. (2) Near-zero cost while the agent was idle. (3) The ability to branch from an already-completed analysis state. I did not discover all three requirements on day one. I discovered them one mistake at a time. How a session actually starts: the code before the sandbox You drive a session through the reference orchestrator using a simple command:
make session PROMPT=&#34;Clone the repository at github.com/tensorlakeai/tensorlake. \
Read through the module structure. Write a summary to /workspace/analysis.md. \
Note any components that look like they could be simplified.&#34;
The orchestrator sends this prompt to Anthropic as a new session. Anthropic picks it up, starts the agent loop, and immediately begins issuing tool calls. Those tool calls arrive at your sandbox. The agent reads files, runs bash commands, writes notes. 
The session runs until the task is complete or you stop it. The agent stream looks roughly like this as it runs:
[thinking] The repository appears to be a Python SDK for…
[bash] git clone https://github.com/tensorlakeai/tensorlake
[bash] ls -la /workspace/tensorlake/
[read] /workspace/tensorlake/tensorlake/sandbox.py
[write] /workspace/analysis.md
[thinking] The Sandbox class handles…
Each bracketed event is a tool call going into your sandbox. The session accumulates state inside /workspace/ across all those calls. By the end of a six-hour session, that directory contains the cloned repo, installed packages, analysis files, and intermediate notes. That’s the state that needs to survive overnight. Build 1: Cloudflare My first assumption was that I needed a platform that could efficiently run Claude Managed Agents. Cloudflare is optimized for high-concurrency execution. My problem turned out to be different. The agent I was building accumulated hours of filesystem state between bursts of work. Notes, cloned repositories, installed dependencies, and intermediate analysis all needed to survive overnight. Cloudflare’s execution model wasn’t designed around that requirement. That was the first time I realized I wasn’t looking for compute. I was looking for persistent state. Build 2: Daytona The second build solved part of the problem. The agent could accumulate state throughout a session, which initially felt like progress. Then I wanted to test three different refactoring strategies starting from the same six-hour analysis. Instead of branching from that state, I found myself repeating the setup work each time: rebuilding context, reinstalling dependencies, and re-running analysis before I could begin the actual experiment. That was when I discovered my second requirement. Preserving state wasn’t enough. I also needed a way to branch from an existing state without repeating hours of work. 
Build 3: Tensorlake The first thing that caught my attention was not a feature. It was an architectural decision. Most platforms preserve state by keeping compute alive. This one treated compute and state as separate problems. The docs described a suspended sandbox that could preserve its state and resume in approximately 0.6 seconds. That was the first time I saw a design that directly addressed the problem I’d been running into. I wanted to know whether it actually worked. I started with [&#8230;]]]></description>
		
		
		
		<media:content url="https://miro.medium.com/v2/resize:fit:700/1*dFGAIvcuYh47KDmw_2WpQQ.png" medium="image"></media:content>
            	</item>
		<item>
		<title>The Building Blocks of LangGraph (Part 0)</title>
		<link>https://towardsai.com/p/machine-learning/the-building-blocks-of-langgraph-part-0</link>
		
		<dc:creator><![CDATA[Bessie Delight Kekeli]]></dc:creator>
		<pubDate>Thu, 25 Jun 2026 11:42:38 +0000</pubDate>
				<category><![CDATA[Latest]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Towards AI - Medium]]></category>
		<guid isPermaLink="false">https://towardsai.com/?p=51909</guid>

					<description><![CDATA[Author(s): Bessie Delight Kekeli Originally published on Towards AI. The Building Blocks of LangGraph (Part 0) For other parts of the series: Part 0, Part 1, Part 2, Part 3. As Large Language Models (LLMs) have become more capable, developers have moved beyond simple chatbots and begun building systems that can reason, make decisions, use tools, retrieve information, interact with APIs, and collaborate with other AI agents. Building these systems introduces a new challenge: How do we coordinate and manage the flow of intelligence? This is the problem that LangGraph was created to solve. At its core, LangGraph is a framework for building stateful, controllable, and production-ready AI workflows. It allows developers to define how AI agents think, make decisions, communicate with tools, and move through complex tasks. If LangChain helps you connect AI components together, LangGraph helps you orchestrate how those components behave over time. LangGraph is an orchestration framework built by the team behind LangChain. It allows developers to model AI applications as a graph. The Simplest Graph (agent flow) Let’s build a simple graph with 3 nodes and one conditional edge. The easiest way to understand nodes, edges, and state is to imagine a food delivery process. A node is simply a task or action that does something. For example: Receive Order is a node. Prepare Food is another node. Deliver Food is another node. Every time some work is performed, you are at a node. An edge is the path that tells the system where to go next. For example: Receive Order ↓ Prepare Food ↓ Deliver Food. Those arrows are the edges. The edge is not doing any work itself. It simply says: “After this step finishes, go to that step.” Think of an edge as a road connecting two cities. The cities are the nodes, and the road is the edge. A state is the information that travels through the entire process. 
Imagine a customer orders: Pizza. Address: 123 Main Street. Customer: John. When the order is received, that information enters the system. As the order moves from Receive Order ↓ Prepare Food ↓ Deliver Food, the information moves along with it. That information is the state. Let’s build our first simple agent. State Think of state as the graph’s shared memory. It is the information that travels through the workflow as it moves from one node to another. Every node can read the state, update it, and pass the updated version to the next node. In this example, the state contains a single piece of information called graph_state. First, define the State of the graph. The State schema serves as the input schema for all Nodes and Edges in the graph. Let’s use the TypedDict class as our schema, which provides type hints for the keys. It is imported here from typing_extensions and is also available in Python&#39;s typing module.
from typing_extensions import TypedDict

class State(TypedDict):
    graph_state: str

Nodes A node is simply a function that performs some work. When a node runs, it receives the current state, does something with it, and returns an updated state. You can think of a node as a worker in a factory. The worker receives a package (the state), modifies it, and then passes it along. The first positional argument is the state, as defined above. Because the state is a TypedDict with schema as defined above, each node can access the key, graph_state, with state[&#39;graph_state&#39;]. Each node returns a new value of the state key graph_state. By default, the new value returned by each node will override the prior state value. 
def node_1(state):
    print(&#34;---Node 1---&#34;)
    return {&#34;graph_state&#34;: state[&#39;graph_state&#39;] + &#34;I am&#34;}

def node_2(state):
    print(&#34;---Node 2---&#34;)
    return {&#34;graph_state&#34;: state[&#39;graph_state&#39;] + &#34;happy!&#34;}

def node_3(state):
    print(&#34;---Node 3---&#34;)
    return {&#34;graph_state&#34;: state[&#39;graph_state&#39;] + &#34;sad!&#34;}

Edges An edge is simply a connection between nodes. It tells the graph where to go after a node finishes its work. A normal edge is a fixed path. After one node completes, the graph always moves to the same next node. For example, if a workflow has “Collect Data” followed by “Analyze Data,” the graph will always move from the first node to the second. A conditional edge is a decision point. Instead of always following the same path, the graph looks at the current state and decides where to go next. For example, after analyzing data, the graph might ask: “Do I have enough information?” If the answer is yes, it moves to “Generate Report.” If the answer is no, it moves back to “Collect More Data.” Conditional edges are implemented as functions that return the next node to visit based on some logic.
import random
from typing import Literal

def decide_mood(state) -&#62; Literal[&#34;node_2&#34;, &#34;node_3&#34;]:
    # Often, we will use state to decide on the next node to visit
    user_input = state[&#39;graph_state&#39;]
    # Here, let&#39;s just do a 50 / 50 split between nodes 2, 3
    if random.random() &#60; 0.5:
        # 50% of the time, we return Node 2
        return &#34;node_2&#34;
    # 50% of the time, we return Node 3
    return &#34;node_3&#34;

Graph Construction Now, we build the graph from our components defined above. The StateGraph class is the graph class that we can use. First, we initialize a StateGraph with the State class we defined above. Then, we add our nodes and edges. We use the START Node, a special node that sends user input to the graph, to indicate where to start our graph. 
The END Node is a special node that represents a terminal node. Finally, we compile our graph to perform a few basic checks on the graph structure. We can visualize the graph as a Mermaid diagram.
from IPython.display import Image, display
from langgraph.graph import StateGraph, START, END

# Build graph: pass the State schema class, not an instance
builder = StateGraph(State)
builder.add_node(&#34;node_1&#34;, node_1)
builder.add_node(&#34;node_2&#34;, node_2)
builder.add_node(&#34;node_3&#34;, node_3)

# Logic
builder.add_edge(START, &#34;node_1&#34;)
builder.add_conditional_edges(&#34;node_1&#34;, decide_mood)
builder.add_edge(&#34;node_2&#34;, END)
builder.add_edge(&#34;node_3&#34;, END)

# Compile
graph = builder.compile()

# View
display(Image(graph.get_graph().draw_mermaid_png()))

Graph Invocation The compiled graph implements the runnable protocol. This provides a standard way to execute LangChain components. invoke is one of the standard methods in this interface. The input is a dictionary {&#34;graph_state&#34;: &#34;Hi, this is lance.&#34;}, which sets the initial value for our graph state dict. When invoke is called, the graph starts execution [&#8230;]]]></description>
		
		
		
		<media:content url="https://miro.medium.com/v2/resize:fit:700/1*Q5UozakGUGtRpzd2jPVHpA.png" medium="image"></media:content>
            	</item>
		<item>
		<title>Five Ways Claude Code Runs Multi-Step Work. The Two Questions That Pick the Right One.</title>
		<link>https://towardsai.com/p/machine-learning/five-ways-claude-code-runs-multi-step-work-the-two-questions-that-pick-the-right-one</link>
		
		<dc:creator><![CDATA[Anup Karanjkar]]></dc:creator>
		<pubDate>Thu, 25 Jun 2026 11:42:21 +0000</pubDate>
				<category><![CDATA[Latest]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Towards AI - Medium]]></category>
		<guid isPermaLink="false">https://towardsai.com/?p=51911</guid>

					<description><![CDATA[Author(s): Anup Karanjkar Originally published on Towards AI. Single agent, subagents, skills, agent teams, dynamic workflows — a builder’s map, and the one that isn’t really orchestration On May 28, Claude Code got its fifth way to run a multi-step job, and I watched a room of good engineers immediately reach for the wrong one. The article argues that choosing between Claude Code’s multi-step primitives is not primarily about how many agents you want to spawn—agent count is an output, not an input. It presents five ways to run work (single agent, subagents, skills, agent teams, and dynamic workflows), clarifying that skills are orthogonal because they package know-how (e.g., via SKILL.md) and don’t orchestrate or spawn agents. It then sorts the orchestration options with two key questions: who holds the plan (model-held vs code-held, where dynamic workflows move the plan into JavaScript for determinism, repeatability, and verifiable coordination) and how many memories/contexts the task needs (single context vs isolated subagent contexts vs peer coordination via agent teams using shared codebases and hub-and-spoke coordination). Finally, it emphasizes using these questions in order, defaulting to the simplest option that fits, and watching the first run to avoid the “thirty-times tax” from overusing complex orchestration when the job doesn’t require it. Read the full blog for free on Medium.]]></description>
		
		
		
		<media:content url="https://miro.medium.com/v2/resize:fit:1000/1*blxKx3yfV4k4ZK_QyNycOQ.png" medium="image"></media:content>
            	</item>
		<item>
		<title>Choose Wisely: Models Should Follow Your Use Case.</title>
		<link>https://towardsai.com/p/machine-learning/choose-wisely-models-should-follow-your-use-case</link>
		
		<dc:creator><![CDATA[Dhanush Kandhan]]></dc:creator>
		<pubDate>Thu, 25 Jun 2026 11:42:08 +0000</pubDate>
				<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Latest]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Towards AI - Medium]]></category>
		<guid isPermaLink="false">https://towardsai.com/?p=51913</guid>

					<description><![CDATA[Author(s): Dhanush Kandhan Originally published on Towards AI. Choose Wisely: Models Should Follow Your Use Case. By Dhanush Kandhan. A guy in my builder’s Discord group blew his entire Codex subscription in eleven days. Two weeks into the month, nothing left. You know what he was building? A billing feature in his SaaS. Not a compiler. Not an operating system kernel. Not a real-time physics simulation. A billing page with subscriptions, invoices, and a Dodo Payments webhook that doesn’t send duplicate emails. He said it with the exhausted pride of someone who just pushed to prod at 2 AM (we devs are Batmans, right?). I nodded. I didn’t say anything. But inside I was doing the mental math. I run my full AI stack (coding agent, agent workflows, browser automation, speech to text) for around $10 to $15 a month. And I ship. Regularly (my GitHub is proof of that). With billing features and everything. That conversation is what this post is about. The Benchmark Theater We All Fell For Let me describe a pattern you’ve probably noticed. A big AI company or lab drops a new model or version. The announcement lands. Within hours, everyone on X is posting about it. “Our model built a C compiler from scratch.” “Our model achieved gold on the International Math Olympiad.” “Our model solved problems that researchers said required human-level reasoning.” [Image credit: Faiapp Meme Creator] The posts get thousands of likes. Engineers screenshot the benchmark charts. Someone puts together a thread comparing it to the previous generation. Replies flood in from founders saying they’re switching immediately. Then someone from Chennai quietly tries it on their actual codebase and reports back that it’s roughly the same as before for their use case. That tweet gets eleven likes. I’m not mocking the benchmark results. Building a C compiler is impressive. Scoring on the IMO is legitimately hard. 
These results tell you something real about what the model is capable of in controlled settings. But here is the question nobody asks loudly enough: when was the last time your actual work required an AI to build a C compiler? Look at what you built last week. Probably a REST endpoint. A React component that talks to it. Some data validation logic. An email template. A webhook handler. A cron job that moves rows between two database tables. Maybe a RAG pipeline if you’re in the AI space. Something with auth. Something with payments. You are not building compiler infrastructure. You are building software for users. Web apps. Mobile apps. Developer tools. Internal automation. The kind of work where each piece, taken individually, looks boring on a benchmark slide but collectively represents most of the software being written on earth today. The benchmark score tells you the ceiling of what a model can achieve on curated academic tasks. It does not tell you whether the model is the right tool for your Monday morning standup’s ticket queue. I learned this slowly. And expensively. What “Open Source” Actually Means Here (It’s Not One Thing) Before I get into the specific models, I need to clear up something that trips up engineers constantly. When someone says a model is “open source,” they usually mean one of two very different things, and conflating them leads to bad decisions. The first is open weights. The actual model parameters, the billions of floating-point numbers that encode what the model knows, are publicly available. You can download them. You can run them on your own hardware. You can fine-tune them on your own data. You can deploy them inside your own VPC and never send a single token to anyone else’s server. You can modify the architecture and release derivatives. Models like GLM-5.2, DeepSeek V4, Kimi K2.6, and Nemotron from NVIDIA are all open-weight models. The weights live on Hugging Face. 
Most of them ship under MIT licenses, which means you can use them commercially without paying anyone a licensing fee. The second is what most of the subscription-based coding tools are: API access. You get to call their endpoint. The model runs on their servers. Their data retention policy applies to your prompts. Their pricing can change next quarter. If their infrastructure has issues on the day you have a demo, that is your problem too. You never see the weights. You cannot run it locally. The model is theirs; you are renting access. The practical difference matters more than most engineers realize until they’ve felt it. With open weights, your inference cost is literally your compute. You can run through OpenRouter or Together AI and pay per token with no monthly subscription, switching to a better model the day it ships. You can cache aggressively. You can self-host if the data sensitivity requires it. You are not locked into anyone’s pricing model. There is also a comfortable middle path, which is what I run: open-weight models accessed through inference providers. Pay per token, no subscription, full flexibility to switch, and the per-token cost is typically a fraction of what the closed-model APIs charge. The Stack. For Real. I’ve read too many “why I use open source models” posts that are basically just “open source good, closed source bad” with a Hugging Face link at the bottom. Useless. Let me be specific. GLM-5.2 for Coding via OpenCode When GLM-5.2 dropped from Z.ai, the Beijing-based lab that used to be called Zhipu AI, the X (Twitter) reaction was something. Aravind Srinivas posted about it. Guillermo Rauch appreciated it. The Artificial Analysis Intelligence Index ranked it at 51 points, which put it above DeepSeek V4 Pro, Kimi K2.6, and even some Google models. On their GDPval-AA v2 metric, which is their best approximation of real agentic task performance, GLM-5.2 roughly matched GPT-5.5. But you know how it goes. 
X (Twitter) energy is its own genre. I do not make infra decisions based on who gets quote-tweeted by whom. So I used it. On a $10/month OpenCode Go plan, using it daily. The billing feature I [&#8230;]]]></description>
		
		
		
		<media:content url="https://miro.medium.com/v2/resize:fit:1000/1*gqvfYFyucCmPzOXodOKADA.png" medium="image"></media:content>
            	</item>
		<item>
		<title>You Do Not Need 50 Diffusion Steps. Here Is What Nvidia Proved at GTC.</title>
		<link>https://towardsai.com/p/machine-learning/you-do-not-need-50-diffusion-steps-here-is-what-nvidia-proved-at-gtc</link>
		
		<dc:creator><![CDATA[Siddhant Nitin Patil]]></dc:creator>
		<pubDate>Thu, 25 Jun 2026 11:39:55 +0000</pubDate>
				<category><![CDATA[Latest]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Towards AI - Medium]]></category>
		<guid isPermaLink="false">https://towardsai.com/?p=51915</guid>

					<description><![CDATA[Author(s): Siddhant Nitin Patil Originally published on Towards AI. You Do Not Need 50 Diffusion Steps. Here Is What Nvidia Proved at GTC. The video diffusion industry has had the same conversation for two years. Better model. More parameters. Higher resolution. Longer clips. Richer motion. And underneath all of it, the same silent constraint that nobody advertises: generating a single second of 720p video still takes long enough to make most real-time use cases a fantasy. At GTC 2026 in San Jose, Nvidia’s Ziv Ilan from the AI Labs team in Paris gave a 20-minute talk that reframed the problem entirely. The title: You Might Not Need 50 Diffusion Steps. The argument was not about a new model. It was about what happens when you stop treating the step count as a fixed constraint and start treating it as an engineering variable. Why Step Count Is the Real Bottleneck Diffusion models generate images and videos through iterative denoising. Random noise gets progressively cleaned up across a series of steps, each step moving the output closer to the final result. Standard production models run 20 to 50 denoising steps. Each step is a full forward pass through a model that, in the case of modern video diffusion architectures, can have 20 to 40 billion parameters. The math compounds fast. A single 1,328 x 1,328 image generated with Qwen-Image involves approximately 12,900 TFLOPs of computation, producing a latency of up to 127 seconds per image on an Nvidia H20 GPU. For video, where you need consistent quality across frames with temporal coherence, the compute demand grows faster than linearly with resolution and duration. This is why Adobe’s Firefly video generation model, before optimization, was architecturally capable but commercially constrained. State-of-the-art image diffusion already took tens of seconds per image. 
Video diffusion with a 50-step process at production resolution was simply not viable for interactive or real-time applications. The path forward was not a bigger model. It was a smarter inference stack. The Three-Technique Stack Ilan’s talk organized the solution space into three composable techniques: quantization, caching, and distillation. Critically, these are not alternatives. They are stackable. You deploy them in combination, and each one adds a multiplier to the performance gains of the others. Quantization: Making Each Step Cheaper Quantization reduces the numerical precision of the model’s weights and activations from 16-bit or 32-bit floating point to lower-precision formats: INT8, FP8, or even FP4 in the latest research. For LLMs, the impact of quantization is well understood and well documented. Diffusion models present a more complex picture because they are attention-heavy in ways that LLMs are not. The multi-head attention mechanisms in transformer-based diffusion architectures (DiT models) are more sensitive to precision loss than the feed-forward layers in autoregressive models. This means that naive quantization approaches developed for LLMs often produce measurable quality degradation in diffusion models even at INT8 precision. The solution Nvidia has deployed in production, demonstrated through their collaboration with Black Forest Labs on Flux 2, uses dynamic quantization rather than static quantization. Static quantization pre-computes the activation range across a calibration dataset and applies fixed scaling factors at inference time. Dynamic quantization computes activation ranges on the fly per batch, adapting to the actual data distribution being processed. For diffusion models where the latent space evolves significantly across denoising steps, dynamic quantization maintains quality that static approaches cannot match. The hardware layer amplifies this further. 
Nvidia’s Blackwell architecture introduced NVFP4 support, a 4-bit floating point format that, combined with Blackwell’s dedicated FP4 tensor cores, delivers performance gains that dwarf what FP8 achieved on Hopper. In ComfyUI benchmarks, NVFP4 optimizations on RTX 50-series cards delivered up to 3x performance boosts over FP16 baselines. For Stable Diffusion 3.5 Large, FP8 quantization alone cuts the VRAM requirement from 18GB to 11GB, opening up mid-range 12GB GPUs for a model that previously required 24GB. The Adobe Firefly case is the most concrete enterprise data point. Using TensorRT with mixed FP8 and BF16 precision on Hopper GPUs via AWS EC2 P5 instances: 60% latency reduction, 40% total cost of ownership reduction, serving more users with fewer GPUs. This is not a research result. It is a production deployment that is live today. One important note from Ilan on diffusion-specific quantization considerations: because these models are more attention-heavy than LLMs, the memory savings from quantization are less dramatic than in the LLM world. The performance gains still matter, but the ratio of memory benefit to compute benefit is different. Quantization should be treated as the entry-point optimization, the lowest-friction gain available, rather than the primary strategy. Quantization gets you into the field. Caching and distillation win the game. Caching: Skipping the Computation You Already Did The second technique exploits a property of diffusion that is counterintuitive until you see it: adjacent denoising steps are highly redundant. When a diffusion model runs 50 steps to generate a video frame, the feature representations in the model’s internal layers do not change dramatically between step 23 and step 24. The high-level structure, the composition, the semantic layout, these are largely determined in the early steps. The middle steps refine. The late steps clean up residual noise and adjust texture. 
Large swaths of the computation happening in steps 24 through 48 are recalculating values that changed very little from the previous step. This is the same insight that motivated KV caching in LLMs: if you have already computed something and it has not changed meaningfully, do not recompute it. In the autoregressive case, KV cache is straightforward because you are generating one token at a time and the previously computed keys and values are definitionally unchanged. In diffusion, the cache mechanics are more complex because you are denoising across a full latent space simultaneously, but the redundancy is real and measurable. T-cache, the approach Ilan referenced in his talk, operates at the full pixel or latent space level. It computes a similarity metric between the current denoising step’s output and the previous step’s output. If the change [&#8230;]]]></description>
		
		
		
		<media:content url="https://miro.medium.com/v2/resize:fit:700/1*cvmV_zrvO0oGQpf2mOuwow.jpeg" medium="image"></media:content>
            	</item>
		<item>
		<title>Understanding Reinforcement Learning — A Primer</title>
		<link>https://towardsai.com/p/machine-learning/understanding-reinforcement-learning-a-primer</link>
		
		<dc:creator><![CDATA[Ayo Akinkugbe]]></dc:creator>
		<pubDate>Thu, 25 Jun 2026 11:36:05 +0000</pubDate>
				<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Data Science]]></category>
		<category><![CDATA[Latest]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Towards AI - Medium]]></category>
		<guid isPermaLink="false">https://towardsai.com/?p=51917</guid>

					<description><![CDATA[Author(s): Ayo Akinkugbe Originally published on Towards AI. Understanding Reinforcement Learning: A Primer [Photo by Girl with red hat on Unsplash] Introduction: Learning by Trial and Error Imagine teaching a dog to fetch a ball. You don’t hand the dog a manual titled “The Complete Guide to Ball Retrieval.” Instead, you throw the ball, and when the dog brings it back, you give it a treat. When the dog gets distracted and wanders off, you withhold the treat. Over dozens of repetitions, the dog learns that bringing the ball back leads to rewards, while ignoring the ball doesn’t. This process of learning through interaction, experimentation, and feedback is exactly what reinforcement learning does for artificial intelligence. [Figure: Teaching a dog to fetch a ball] A Different Type of Learning: Supervised, Unsupervised, Reinforced Reinforcement learning is fundamentally different from the other types of machine learning you might be familiar with. In supervised learning, we show the algorithm thousands of examples with correct answers, like showing a child flashcards where one side has a picture of an apple and the other side has the word “apple.” In unsupervised learning, we give the algorithm data without answers and ask it to find patterns, like asking someone to organize a messy drawer without telling them how. But in reinforcement learning, we do something more interesting: we place an agent in an environment, give it a goal, and let it figure out how to achieve that goal through experimentation. The agent doesn’t know the right answer in advance. It doesn’t have a dataset of correct moves to learn from. Instead, it takes actions, observes what happens, receives rewards or penalties, and gradually learns which actions tend to lead to good outcomes and which ones don’t. This is how DeepMind’s AlphaGo learned to beat world champions at Go, how robotic arms learn to grasp objects, and how autonomous vehicles learn to navigate roads. 
The agent learns by doing, making mistakes, and slowly improving its strategy based on the consequences of its actions. “In reinforcement learning, the agent doesn’t know the right answer in advance. It doesn’t have a dataset of correct moves to learn from. Instead, it takes actions, observes what happens, receives rewards or penalties and gradually learns which actions tend to lead to good outcomes and which ones don’t.” The Core Components of Reinforcement Learning At the heart of every reinforcement learning problem are 5 fundamental components that work together in a continuous loop. Understanding each of these components and how they interact is essential to grasping how reinforcement learning actually works. Agent The agent is the learner or decision-maker. In our dog example, the dog is the agent. In a video game, the agent might be the character you control. In a self-driving car, the agent is the AI system making decisions about steering, acceleration, and braking. The agent exists to make decisions, and its entire purpose is to learn which decisions lead to the best outcomes. The agent doesn’t start out knowing anything; it begins with a blank slate and learns entirely from experience. Environment The environment is everything the agent interacts with. It’s the world in which the agent operates. For the dog, the environment includes the room, the ball, you as the trainer, and all the physical laws that govern how balls bounce and roll. For a chess-playing agent, the environment is the chessboard and the rules of chess. For a trading algorithm, the environment is the stock market with all its complexity, volatility, and rules. The environment responds to the agent’s actions and provides feedback. It’s important to note that the agent doesn’t control the environment; it can only influence it through its actions. 
“The agent doesn’t control the environment; it can only influence it through its actions.” State A state represents a specific situation or configuration of the environment at a particular moment in time. When you’re teaching the dog to fetch, one state might be “ball has just been thrown and is in the air,” another state might be “ball has landed fifteen feet away,” and another might be “dog has ball in mouth and is five feet from owner.” States capture all the relevant information the agent needs to make a decision. In a video game, the state might include the positions of all characters, their health levels, available items, and the current score. The quality of the state representation is key: if you don’t include important information in your state, the agent won’t be able to make good decisions. “A state represents a specific situation or configuration of the environment at a particular moment in time.” Action An action is something the agent can do to interact with the environment. Actions are the agent’s way of influencing its world. For the dog, actions might include “run toward ball,” “pick up ball,” “run toward owner,” or “lie down and take a nap.” For a chess agent, actions are the legal moves available given the current board position. For a robot learning to walk, actions are the specific motor commands sent to each joint and actuator. The set of available actions can change depending on the current state. In chess, the legal moves change with every move made. In the fetch example, the dog can’t pick up the ball if the ball isn’t within reach. “An action is how an agent interacts with or influences the environment.” Reward The reward is the feedback signal that tells the agent whether its action was good or bad. Rewards are numbers: positive numbers for good outcomes and negative numbers (penalties) for bad outcomes. When the dog brings the ball back, it gets a positive reward (the treat, which we might represent as +10). 
When it ignores the ball, it gets zero or even a small negative reward (no treat, perhaps represented as -1 or 0). The reward is the only way the environment communicates value to the agent. The agent’s entire learning process is driven by a single objective: maximize [&#8230;]]]></description>
		
		
		
		<media:content url="https://miro.medium.com/v2/resize:fit:700/0*JO-GYduy74Q4iDum" medium="image"></media:content>
            	</item>
	</channel>
</rss>
