<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:media="http://search.yahoo.com/mrss/"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Towards AI</title>
	<atom:link href="https://towardsai.net/feed" rel="self" type="application/rss+xml" />
	<link>https://towardsai.net</link>
	<description>Making AI accessible to all</description>
	<lastBuildDate>Fri, 10 Apr 2026 12:55:45 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>

<image>
	<url>https://towardsai.net/wp-content/uploads/2019/05/cropped-towards-ai-square-circle-png-32x32.png</url>
	<title>Towards AI</title>
	<link>https://towardsai.net</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>The L1 Loss Gradient, Explained From Scratch</title>
		<link>https://towardsai.net/p/machine-learning/the-l1-loss-gradient-explained-from-scratch</link>
		
		<dc:creator><![CDATA[Utkarsh Mittal]]></dc:creator>
		<pubDate>Fri, 10 Apr 2026 07:44:07 +0000</pubDate>
				<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Latest]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Towards AI - Medium]]></category>
		<guid isPermaLink="false">https://towardsai.net/p/artificial-intelligence/the-l1-loss-gradient-explained-from-scratch</guid>

					<description><![CDATA[Last Updated on April 10, 2026 by Editorial Team Author(s): Utkarsh Mittal Originally published on Towards AI. A complete, step-by-step walkthrough of how gradient descent works with absolute-value loss — with diagrams you can actually follow. If you’ve ever read a deep learning tutorial and hit a derivative that seems to appear from nowhere, this article is for you. We’re going to break down one of the simplest — yet most instructive — gradient calculations in machine learning: the gradient of L1 (absolute-value) loss with respect to a single weight. The article explains the gradient calculation of L1 loss through a structured approach, starting with a simple regression model and discussing its components, the loss function, and how to derive the gradient with respect to a weight. It emphasizes clarity by using concrete examples and progressively builds understanding through the chain rule in calculus. The synopsis concludes by contrasting L1 loss&#8217;s insensitivity to outliers with L2 loss&#8217;s responsiveness to error magnitude, ultimately offering guidance on when to use each loss function effectively. Read the full blog for free on Medium. Join thousands of data leaders on the AI newsletter. Join over 80,000 subscribers and keep up to date with the latest developments in AI. From research to projects and ideas. If you are building an AI startup, an AI-related product, or a service, we invite you to consider becoming a sponsor. Published via Towards AI]]></description>
		
		
		
		<media:content url="https://miro.medium.com/v2/resize:fit:700/1*Mawr0dvpEL2hKhMR8F-q3w.png" medium="image"></media:content>
            	</item>
		<item>
		<title>Your Postcode Is Deciding Your Care. I Built a Pipeline to Prove It.</title>
		<link>https://towardsai.net/p/machine-learning/your-postcode-is-deciding-your-care-i-built-a-pipeline-to-prove-it</link>
		
		<dc:creator><![CDATA[Yusuf Ismail]]></dc:creator>
		<pubDate>Fri, 10 Apr 2026 07:03:31 +0000</pubDate>
				<category><![CDATA[Data Engineering]]></category>
		<category><![CDATA[Data Science]]></category>
		<category><![CDATA[Latest]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Towards AI - Medium]]></category>
		<guid isPermaLink="false">https://towardsai.net/p/artificial-intelligence/your-postcode-is-deciding-your-care-i-built-a-pipeline-to-prove-it</guid>

					<description><![CDATA[Last Updated on April 10, 2026 by Editorial Team Author(s): Yusuf Ismail Originally published on Towards AI. Picture this. It’s 2 am. You’re on a trolley in a hospital corridor. Not a ward. A corridor. Fluorescent lights, the smell of disinfectant, the sound of a ward that’s full somewhere behind a set of doors that won’t open for you yet. A doctor saw you three hours ago. She decided you needed to be admitted. She wrote it down, made the call, and moved on to the next patient. The bed she ordered for you doesn’t exist yet. So you wait. On the trolley. In the corridor. While the clock runs. This isn’t a worst case scenario. This is Tuesday. I’ve been building data pipelines on NHS performance data for the past few months: outpatient records, referral-to-treatment waiting times, and now A&#38;E. The deeper I go into this data, the more I realise that what I’m looking at isn’t a system under pressure. It’s a system that has quietly normalised something that should never be normal. 1,280 patients. Every single day. Stuck in corridors after a doctor had already decided they needed a bed. I needed to see that number. So I built something to track it properly. The target nobody is hitting The NHS has a target. It’s been there since 2000. 95% of patients arriving at A&#38;E should be seen, treated, and either admitted, discharged or transferred within four hours of arrival. Simple enough. Measurable. Published every month by NHS England for anyone to download. So I downloaded it. All of it. 36 months. April 2022 to March 2025. Every NHS Trust in England, every month, their numbers sitting in CSV files on a government website that most people will never visit. I built a Bronze→Silver→Gold medallion pipeline in Python — ingesting the raw files, cleaning and classifying them, then calculating the metrics that matter. 36 files. 7,238 rows of provider-level data after transformation. Three Gold tables: national, regional, trust-level. 
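The Bronze→Silver→Gold shape described above can be sketched in a few lines of plain Python. The column names here (trust, region, attendances, over_4h) are invented for illustration and do not match the real NHS England CSV schema:

```python
# Illustrative medallion steps: Bronze = raw rows, Silver = cleaned/typed,
# Gold = aggregated metrics. Column names are invented, not the NHS schema.
def silver(bronze_rows):
    # Clean: drop rows with missing or non-numeric counts, coerce types.
    out = []
    for r in bronze_rows:
        try:
            out.append({"trust": r["trust"].strip(), "region": r["region"].strip(),
                        "attendances": int(r["attendances"]), "over_4h": int(r["over_4h"])})
        except (KeyError, ValueError):
            continue
    return out

def gold_by(rows, key):
    # Aggregate to a 4-hour performance percentage per group (trust or region).
    totals = {}
    for r in rows:
        a, o = totals.get(r[key], (0, 0))
        totals[r[key]] = (a + r["attendances"], o + r["over_4h"])
    return {k: round(100 * (a - o) / a, 1) for k, (a, o) in totals.items()}
```

The same `gold_by` function gives the trust-level and regional Gold tables just by changing the grouping key.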
Then I ran the numbers. 0 out of 36 months hit the 95% target. Not one. Not even close. The best month in three years was April 2023, at 65.5%. The worst was December 2023, at 54.3%. The average across the entire period was 59.8%. The NHS is, on average, 35 percentage points below its own target. Every single month. 19.5 million people waited over four hours in a major A&#38;E between April 2022 and March 2025. That’s 19.5 million moments where someone sat in pain, or fear, or exhaustion, past the point the system promised they wouldn’t have to. NHS A&#38;E 4-hour performance vs 95% target, April 2022 — March 2025. Source: NHS England. The corridor care numbers Underneath the four-hour headline sits a number that doesn’t get enough attention — the 12-hour wait. Not 12 hours to be seen. 12 hours after a doctor has already decided you need to be admitted. You’re in the corridor. The system just can’t find anywhere to put you. Across three years, 1,381,891 people experienced this. And it’s getting worse: 2022/23: 410,029 corridor waits of 12 hours or more; 2023/24: 439,411; 2024/25: 532,451. That’s a 30% increase in three years. December is consistently the worst month — an average of 50,879 people waiting 12 hours or more, every December, like clockwork. Winter is not a surprise. It happens every year. The system knows it’s coming. Monthly 12-hour corridor waits after decision to admit, England. Source: NHS England. Your postcode is the variable I was on a call earlier this year with my assessor. Somewhere in the conversation, I mentioned what I’d been building — the NHS pipeline, the waiting times data, the postcode lottery finding starting to emerge from the numbers. She paused. Then she told me about where she lives in rural Wales. If she phones her GP in the morning, she gets seen the same day. Her father, who lives in England, has to go online, fill in a form, and hope the algorithm decides he’s urgent enough. She told me about ambulances. Where she lives, they come. 
In England, she said, they ask you first if you’re breathing — because the triage system is so pressured that the questions you answer on the phone determine whether help arrives at all. She mentioned delays in care for her father during a serious health episode. I didn’t probe further. She mentioned that once, in her part of Wales, the coastguard came when she called for an ambulance. Because that’s who could get there first. I went back to my data. Because what she described — that gap, that difference in what the system gives you depending on where you happen to live — is exactly what the numbers show. The best performing region in England over three years is the South East, at an average of 63.9%. The worst is the North West, at 55.2%. But the regional gap is nothing compared to the trust-level gap. Best performing major A&#38;E trust: Sheffield Children’s NHS Foundation Trust — 90.2% average. Worst: United Lincolnshire Hospitals NHS Trust — 40.5%. 49.7 percentage points between them. Same country. Same target. Same NHS. Different postcode. Average A&#38;E 4-hour performance by NHS England region, April 2022 — March 2025. Source: NHS England. The one that hit different I live in Telford. When I pulled the trust-level data and sorted it, I wasn’t looking for anything specific. I was just reading down the worst performers list. There it was. The Shrewsbury and Telford Hospital NHS Trust. Third worst in England. 43.6% average four-hour performance over 36 months. 825 patients per month waiting 12 hours or more after a decision to admit. This is the hospital that serves the town I live in. The A&#38;E my family would go to if something went wrong tonight. 43.6%. I don’t say that to attack the staff. Anyone who has worked in or around the NHS knows that the people on the floor are doing everything they can. But the data is the data. [&#8230;]]]></description>
		
		
		
		<media:content url="https://miro.medium.com/v2/resize:fit:700/1*T_toyoOcJ4KMGpcD4-xXbg.png" medium="image"></media:content>
            	</item>
		<item>
		<title>I Directed AI Agents to Build a Tool That Stress-Tests Incentive Designs. Here’s What It Found.</title>
		<link>https://towardsai.net/p/machine-learning/i-directed-ai-agents-to-build-a-tool-that-stress-tests-incentive-designs-heres-what-it-found</link>
		
		<dc:creator><![CDATA[Selfradiance]]></dc:creator>
		<pubDate>Fri, 10 Apr 2026 07:03:15 +0000</pubDate>
				<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Latest]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Towards AI - Medium]]></category>
		<guid isPermaLink="false">https://towardsai.net/p/artificial-intelligence/i-directed-ai-agents-to-build-a-tool-that-stress-tests-incentive-designs-heres-what-it-found</guid>

					<description><![CDATA[Last Updated on April 10, 2026 by Editorial Team Author(s): Selfradiance Originally published on Towards AI. Incentive Wargame I don’t write code. I have zero programming experience. What I do is direct AI coding agents — Claude Code, Codex — to build open-source tools, and then I test them until they either work or break in interesting ways. Agent 006 is the sixth tool in that series, and it’s the one that surprised me. It takes a natural-language description of an economic system — a set of rules about who can do what, with what resources, under what constraints — and runs AI-generated adversarial agents against those rules to see what fails. Not a formal verifier. Not a replacement for game theory. An exploratory stress-test that can surface boundary conditions and failure modes before you commit to a design. Here’s what it found, how the pipeline works, and where it falls short. How It Works You write a spec in plain English — a text file describing your scenario’s resources, actions, constraints, and win conditions. Here’s the public goods spec from the repo (trimmed for length): There are 5 agents. Each round, each agent decides how much of their private balance to contribute to a public fund — anywhere from 0 to their current balance. The fund is multiplied by 1.5 and distributed equally. Agents start with 100 tokens. The game runs for 30 rounds. If total contributions drop below 5 tokens for 3 consecutive rounds, the system collapses. The pipeline then makes four Claude API calls in sequence: an extractor structures your spec into a normalized scenario (and flags ambiguities for your review), an economy generator produces a sandboxed JavaScript simulation engine, an archetype generator creates adversarial agent personalities tailored to your specific rules, and a strategy generator writes executable decision logic for each agent. Then the simulation runs for N rounds, checking invariants along the way. 
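The economy that spec describes is simple enough to sketch by hand. This is an illustration of the rules themselves, not the tool’s generated engine, and the strategy lambdas below are invented stand-ins for the generated archetypes:

```python
# Hand-written sketch of the public goods spec: contributions are pooled,
# multiplied by 1.5, and split equally; the system collapses if total
# contributions stay below 5 tokens for 3 consecutive rounds.
def run_public_goods(strategies, rounds=30, start=100.0, multiplier=1.5,
                     floor=5, streak_limit=3):
    balances = [start] * len(strategies)
    streak = 0
    for rnd in range(rounds):
        # Each agent picks a contribution between 0 and its current balance.
        contribs = [max(0.0, min(s(b, rnd), b)) for s, b in zip(strategies, balances)]
        share = sum(contribs) * multiplier / len(balances)
        balances = [b - c + share for b, c in zip(balances, contribs)]
        streak = streak + 1 if sum(contribs) < floor else 0
        if streak >= streak_limit:
            return balances, rnd + 1, True   # collapsed early
    return balances, rounds, False

# Invented archetypes: four full cooperators and one pure free-rider.
cooperate = lambda balance, rnd: balance
defect = lambda balance, rnd: 0.0
balances, played, collapsed = run_public_goods([cooperate] * 4 + [defect])
```

With these strategies the fund never collapses, but the free-rider ends up richest — exactly the kind of incentive gap the adversarial archetypes are meant to probe.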
A reporter analyzes the results. One CLI command:

npx tsx src/cli.ts --spec my-scenario.txt

A caveat worth stating up front: the generated economy is Claude’s best interpretation of your spec, not a guaranteed-faithful implementation. Results are non-deterministic — the same spec can produce different outcomes across runs. And the tool currently handles only simultaneous-move, single-action-per-round games. This is an exploratory tool for surfacing issues early, not a validation framework. The Public Goods Finding Notice what the spec above doesn’t say: it doesn’t specify a maximum contribution per round. It says agents can contribute “between 0 and their current balance.” That ambiguity is deliberate — it’s the kind of thing a real spec might leave underspecified. When I first ran this scenario, the LLM extractor interpreted “between 0 and their current balance” and generated a hard cap of 100 tokens per contribution — matching the starting balance, but not scaling with wealth. The economy code enforced that cap rigidly. Here’s what happened: as agents’ balances grew past 100 tokens (thanks to the multiplier compounding wealth over rounds), they tried to contribute more than the generated cap allowed. The system rejected those contributions as invalid — 25 invalid decisions across 30 rounds. Agents couldn’t participate in the economy they were succeeding in. The reporter flagged this as a parameter design flaw: the contribution cap breaks as agent wealth grows. The interesting part: when I ran the same spec again weeks later, the extractor made a different choice — it set the cap at 1,000 tokens and added dynamic clamping to agent balances. Zero invalid decisions. Clean run. Same spec, different interpretation, different outcome. This is non-determinism working as designed. The spec had an ambiguity. One run surfaced it as a design flaw. Another run resolved it silently. 
Both are useful information — the first tells you your spec has a gap, the second shows one way to close it. I wrote that spec and didn’t catch the ambiguity. The tool did — on one run. That’s what a pre-flight stress test does: surface the issues that are invisible at design time. If you’re prototyping a token economy, a bonus structure, or a resource-sharing policy, running it through this kind of check multiple times is cheaper than finding the boundary condition in production. The Ultimatum Bug: What the Investigation Loop Looks Like The second scenario worth discussing is the ultimatum game — not because the result was impressive, but because it shows what happens when the tool’s own generated code fails. The spec: each round, a proposer offers a split of a pot. A responder accepts or rejects. Agents rotate roles. Standard ultimatum game. First run: total collapse after 5 of 40 rounds. Zero payouts. 0% acceptance rate. Every agent failed, including the cooperative ones. When I dug into the generated economy code, I found the problem: the tick() function swapped proposer/responder roles before processing decisions. Every agent&#39;s decision was evaluated against the wrong role. Every decision was silently discarded. The cooperative agents were doing everything right, but the economy couldn&#39;t see it. This is worth calling out because it’s a failure mode I hadn’t expected from LLM-generated code: execution-order assumptions that aren’t in the spec can go wrong silently. The generated economy wasn’t violating any invariants — it was just processing things in a sequence that made all decisions invisible. No error, no crash, just silent failure. If you’re using LLMs to generate simulation logic, this is one bug pattern worth watching for. The fix was a one-line prompt constraint: process decisions against current roles before making state transitions. After the fix, the re-run completed all 50 rounds — 67% acceptance rate, 7,500 total wealth. 
The Hardliner archetype dragged down efficiency while cooperative agents absorbed losses to keep the market alive. This kind of tool requires an investigation loop: run, observe anomalous results, inspect the generated code, identify the root cause, fix the generator, re-run. You don’t run it once and trust the output. You use it to poke at your design and see what falls over. Under the Hood The generated economy and strategy [&#8230;]]]></description>
		
		
		
		<media:content url="https://miro.medium.com/v2/resize:fit:700/1*VhDG0Qmf0sykpbSuAfQzPg.jpeg" medium="image"></media:content>
            	</item>
		<item>
		<title>Long-Term vs Short-Term Memory for AI Agents: A Practical Guide Without the Hype</title>
		<link>https://towardsai.net/p/machine-learning/long-term-vs-short-term-memory-for-ai-agents-a-practical-guide-without-the-hype</link>
		
		<dc:creator><![CDATA[Andrii Tkachuk]]></dc:creator>
		<pubDate>Fri, 10 Apr 2026 06:41:30 +0000</pubDate>
				<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Latest]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Towards AI - Medium]]></category>
		<guid isPermaLink="false">https://towardsai.net/p/artificial-intelligence/long-term-vs-short-term-memory-for-ai-agents-a-practical-guide-without-the-hype</guid>

					<description><![CDATA[Last Updated on April 10, 2026 by Editorial Team Author(s): Andrii Tkachuk Originally published on Towards AI. Over the past year, memory has become one of the most overused — and misunderstood — concepts in AI agent design. But before I start, I want to add a few words: most of us building AI agents today didn’t start as “AI engineers”. We come from backend engineering, data engineering, or data science. That background shapes how we think about systems: scalability, reliability, clear lifecycles, and predictable failure modes. And when we bring LLMs and agents into production, we still care about the same things: we don’t want state explosions, we don’t want hidden coupling, and we definitely don’t want to create systems that make life harder for backend engineers and architects down the line. This article is written from that mindset, not “what sounds impressive in demos”, but what leads to a reasonable trade-off between AI capabilities, backend architecture, and long-term system health. You hear phrases like long-term memory, short-term memory, context engineering, persistent agents, and stateful conversations everywhere. But if you look closely at most real implementations, many teams either: don’t actually use memory at all, or use it in ways that introduce serious scalability and reliability issues. This article aims to cut through the hype and explain, in practical terms, how memory for AI agents actually works, which approaches exist today, and what trade-offs they come with. Photo by dianne clifford on Unsplash Before we start!&#x1F9BE; If this piece gives you something practical you can take into your own system:&#x1F44F; leave 50 claps (yes, you can!) — Medium’s algorithm favors this, increasing visibility to others who then discover the article.&#x1F514; Follow me on Medium and LinkedIn for more deep dives into agentic systems, LLM architecture, and production-grade AI engineering. 
First, Let’s Define the Terms Clearly Long-Term Memory (LTM) Long-term memory is anything that persists across sessions, restarts, and disconnections (it includes the agent’s past behaviors and thoughts that need to be retained and recalled over an extended period of time, and it often leverages an external vector store with fast, scalable retrieval to provide relevant information to the agent as needed). Typical characteristics: Stored in databases, object storage, or vector stores Survives process restarts Not necessarily injected into the model on every request Common forms of LTM: Full chat history stored in a relational database Events or messages stored in an append-only log Vector embeddings of conversations or summaries User preferences, profiles, or behavioral facts Think of long-term memory as durable knowledge, not working context. Short-Term Memory (STM) / Working Memory Short-term memory (often called working memory or execution state) holds context about the agent’s current situation; it is typically realized through in-context learning, which makes it short and finite due to context-window constraints. It is: Ephemeral Session-scoped Typically stored in RAM Used during active interaction In practice, what we call “short-term memory” in agents usually combines: conversational state (messages) execution state (tool outputs, intermediate results) control flow metadata Short-term memory exists to reduce overhead and improve reasoning continuity, not to replace persistence. Approach #1 — The Legacy Stateless Approach (Still Very Common) The most widespread approach today is actually stateless. 
How it works For every user request: Fetch chat history from a persistent data store Truncate or limit it Inject it into the prompt Run the agent Repeat on the next request

history = db.load_last_messages(user_id, limit=20)
prompt = build_prompt(history, user_message)
response = llm(prompt)

Pros Extremely simple Easy to reason about No RAM management concerns Works well in serverless environments Cons Database is hit on every request Context is always injected, even when not needed Hard limits must be enforced aggressively Becomes expensive and slow at scale This approach does not use short-term memory at all. Each request is fully independent. Approach #2 — Short-Term Memory via In-Memory State (LangGraph-Style) A more advanced approach introduces explicit short-term memory. This is the model used by frameworks like LangGraph. Core idea Load long-term memory once Keep a mutable state object in RAM Update it as messages arrive Use it throughout the agent flow Dispose of it when the session ends Conceptually:

class ChatState(TypedDict):
    user_id: str
    messages: list[dict]

Typical flow (e.g., with WebSockets or Socket.IO) Socket.IO is one of the most common and well-known frameworks for building chat-based applications. On connect Load chat history from the database Store it in an in-memory state object On each message Read state from RAM Update messages Run the agent On disconnect Optionally persist summary Remove state from memory Pros No database calls on every message Much faster per interaction Natural conversational continuity Clean separation between LTM and STM Cons (and they are important) RAM usage grows with: number of concurrent users length of conversations Requires: strict size limits trimming or summarization TTL / garbage collection Socket-based systems have edge cases: dropped connections multiple tabs per user missing disconnect events This approach can be production-ready, but only if memory management is treated as a first-class concern. 
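That lifecycle can be sketched with a plain dict-backed store. The SessionStore class and its method names are invented for illustration — this is not a LangGraph or Socket.IO API, just the connect/message/disconnect shape described above with the size limits and TTL sweep treated as first-class:

```python
import time

class SessionStore:
    """In-RAM short-term memory: one mutable state object per connected session."""

    def __init__(self, max_messages=50, ttl_seconds=1800):
        self._sessions = {}
        self.max_messages = max_messages
        self.ttl = ttl_seconds

    def connect(self, user_id, history):
        # Load long-term history once, keeping only the most recent window.
        self._sessions[user_id] = {
            "messages": list(history)[-self.max_messages:],
            "last_seen": time.time(),
        }

    def on_message(self, user_id, message):
        # Read and update state in RAM; no database call per message.
        state = self._sessions[user_id]
        state["messages"].append(message)
        del state["messages"][:-self.max_messages]   # strict size limit
        state["last_seen"] = time.time()
        return state["messages"]

    def disconnect(self, user_id):
        # Hand the final state back so the caller can persist a summary.
        return self._sessions.pop(user_id, {"messages": []})["messages"]

    def sweep(self):
        # TTL garbage collection for sessions that never sent a disconnect event.
        cutoff = time.time() - self.ttl
        stale = [u for u, s in self._sessions.items() if s["last_seen"] < cutoff]
        for u in stale:
            del self._sessions[u]
        return stale
```

The `sweep` method is the part most implementations forget: without it, dropped connections and missing disconnect events leak RAM indefinitely.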
Context Variables: What They Are (and What They Are Not) Many implementations add context variables (for example, ContextVar in Python) to avoid passing state through every function. This is useful — but limited. Context variables: &#x2714;️ Improve code readability &#x2714;️ Allow access to state “from anywhere” in the execution flow &#x274C; Do NOT persist state across events &#x274C; Do NOT replace an in-memory store They are an access pattern, not a memory strategy. What context variables are good for Avoiding passing state through dozens of function calls Accessing the current execution state inside deep agent logic Improving code readability

state = get_current_state()
state["messages"].append(new_message)

What they do not do They do not persist memory across events They do not replace an in-memory store They do not solve session lifecycle problems Context variables are a convenience layer, not a memory system. Approach #3 — Memory as a Tool (The New Emerging Pattern) A newer and increasingly popular approach is Memory as a Tool. Before dismissing this approach as “too complex”, I would [&#8230;]]]></description>
		
		
		
		<media:content url="https://miro.medium.com/v2/resize:fit:700/0*HlWi3rYCiRXBiYIR" medium="image"></media:content>
            	</item>
		<item>
		<title>Your System Prompt Is the Product — Not the Feature</title>
		<link>https://towardsai.net/p/machine-learning/your-system-prompt-is-the-product-not-the-feature</link>
		
		<dc:creator><![CDATA[Nagaraj]]></dc:creator>
		<pubDate>Fri, 10 Apr 2026 06:30:25 +0000</pubDate>
				<category><![CDATA[Latest]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Towards AI - Medium]]></category>
		<guid isPermaLink="false">https://towardsai.net/p/artificial-intelligence/your-system-prompt-is-the-product-not-the-feature</guid>

					<description><![CDATA[Last Updated on April 10, 2026 by Editorial Team Author(s): Nagaraj Originally published on Towards AI. Complete control over everything in one string System prompts are the architecture of every good AI app. Learn to design them precisely and build consistent, role-appropriate Claude integrations. Source: Nagaraj. The article discusses the importance of system prompts in AI application design, highlighting how well-designed prompts create effective user interactions. It emphasizes the need to establish clear roles and avoid vague instructions, as these can lead to generic responses from AI models. The discussion also covers the impact of different prompt structures on the AI&#8217;s performance, suggesting that a defined role can guide the AI’s understanding and responses, resulting in more meaningful interactions. Read the full blog for free on Medium. Join thousands of data leaders on the AI newsletter. Join over 80,000 subscribers and keep up to date with the latest developments in AI. From research to projects and ideas. If you are building an AI startup, an AI-related product, or a service, we invite you to consider becoming a sponsor. Published via Towards AI]]></description>
		
		
		
		<media:content url="https://miro.medium.com/v2/resize:fit:700/1*CJiZqlGosSd-UAeV-vSVOw.png" medium="image"></media:content>
            	</item>
		<item>
		<title>The LLM Wiki Trend Has a Retention Problem Nobody Mentions</title>
		<link>https://towardsai.net/p/machine-learning/the-llm-wiki-trend-has-a-retention-problem-nobody-mentions</link>
		
		<dc:creator><![CDATA[Mayank Bohra]]></dc:creator>
		<pubDate>Fri, 10 Apr 2026 02:01:01 +0000</pubDate>
				<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Latest]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Towards AI - Medium]]></category>
		<guid isPermaLink="false">https://towardsai.net/p/artificial-intelligence/the-llm-wiki-trend-has-a-retention-problem-nobody-mentions</guid>

					<description><![CDATA[Last Updated on April 10, 2026 by Editorial Team Author(s): Mayank Bohra Originally published on Towards AI. The viral LLM Knowledge Base workflow looks productive, but EEG studies show that outsourced note-taking weakens memory and critical thinking. Here is the fix. The LLM Wiki trend is a workflow where you dump raw documents into a folder, point an LLM at it, and let the model build a structured wiki of summaries, backlinks, and concept pages you never edit yourself. A viral post in early April 2026 from an OpenAI co-founder hit sixteen million views within two days, and a wave of build-your-own-wiki tutorials followed. The wiki gets smarter. The reader does not. Banner created with Nano Banana Pro. The article discusses the LLM Wiki trend and its drawbacks, highlighting how outsourcing note-taking to AI can impair memory retention and critical thinking. It presents evidence from research indicating that cognitive offloading—relying on external systems for memory—weakens the brain&#8217;s capacity to encode information, leading to poorer long-term recall. The author shares their experience and suggests a new approach that retains cognitive engagement by involving the user actively in summarizing and relating ideas, thereby enhancing retention of knowledge rather than merely accumulating it. Read the full blog for free on Medium. Join thousands of data leaders on the AI newsletter. Join over 80,000 subscribers and keep up to date with the latest developments in AI. From research to projects and ideas. If you are building an AI startup, an AI-related product, or a service, we invite you to consider becoming a sponsor. Published via Towards AI]]></description>
		
		
		
		<media:content url="https://miro.medium.com/v2/resize:fit:700/1*cHjrrZFGfpLI2LnhebtZ6w.png" medium="image"></media:content>
            	</item>
		<item>
		<title>Top 20 Data Preparation Interview Questions and Answers (Part 2 of 2)</title>
		<link>https://towardsai.net/p/machine-learning/top-20-data-preparation-interview-questions-and-answers-part-2-of-2</link>
		
		<dc:creator><![CDATA[Shahidullah Kawsar]]></dc:creator>
		<pubDate>Thu, 09 Apr 2026 21:01:01 +0000</pubDate>
				<category><![CDATA[Latest]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Towards AI - Medium]]></category>
		<guid isPermaLink="false">https://towardsai.net/p/artificial-intelligence/top-20-data-preparation-interview-questions-and-answers-part-2-of-2</guid>

					<description><![CDATA[Last Updated on April 10, 2026 by Editorial Team Author(s): Shahidullah Kawsar Originally published on Towards AI. Machine Learning Interview Preparation Part 25 Data preparation is the foundation of every successful machine learning project. Before algorithms can learn, raw data must be collected, cleaned, understood, and transformed into a form that models can use effectively. This process involves handling missing values, reducing noise, engineering meaningful features, and ensuring data quality and consistency. In this blog, we’ll explore why data preparation matters, the key steps involved, and best practices that help turn messy data into a strong, reliable input for building accurate, robust, and scalable machine learning models. Source: This image was generated by ChatGPT. This article discusses the critical aspects of data preparation in machine learning, emphasizing its importance in creating effective models. It covers various methodologies for handling data issues such as duplicates, missing values, and necessary transformations to ensure that datasets are clean and usable. The article also elaborates on the significance of uniform preprocessing and highlights key techniques to enhance data quality, ultimately leading to improved model accuracy and reliability. Read the full blog for free on Medium. Join thousands of data leaders on the AI newsletter. Join over 80,000 subscribers and keep up to date with the latest developments in AI. From research to projects and ideas. If you are building an AI startup, an AI-related product, or a service, we invite you to consider becoming a sponsor. Published via Towards AI]]></description>
		
		
		
		<media:content url="https://miro.medium.com/v2/resize:fit:700/1*SMBmIB2Vi10D77FPJ8DQMA.png" medium="image"></media:content>
            	</item>
		<item>
		<title>LAI #122: Word Embeddings Started in 1948, Not With Word2Vec</title>
		<link>https://towardsai.net/p/machine-learning/lai-122-word-embeddings-started-in-1948-not-with-word2vec</link>
		
		<dc:creator><![CDATA[Towards AI Editorial Team]]></dc:creator>
		<pubDate>Thu, 09 Apr 2026 19:01:06 +0000</pubDate>
				<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Latest]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Towards AI - Medium]]></category>
		<guid isPermaLink="false">https://towardsai.net/p/artificial-intelligence/lai-122-word-embeddings-started-in-1948-not-with-word2vec</guid>

					<description><![CDATA[Last Updated on April 10, 2026 by Editorial Team Author(s): Towards AI Editorial Team Originally published on Towards AI. Good morning, AI enthusiasts! This week, we’re covering what happens when AI labs sit across the table from governments, why most AI-generated writing still sounds the same (and how to fix it), and whether open models like Gemma 4 are ready to be ranked at all. We also cover: A modular framework that goes from raw text to a knowledge graph in one command. The full eight-layer architectural evolution behind systems like ChatGPT, from stateless prompts to agentic assistants. Why traditional XAI methods fall short for multi-agent systems and what to build instead. Every major positional encoding method computed by hand, all the way up to RoPE. Why word embeddings trace back to Shannon’s 1948 information theory, not neural networks. Let’s get into it! What’s AI Weekly As AI capabilities mature, the relationship between AI labs and governments is getting complicated, fast. You’ve probably seen the headline version of this story: Anthropic reportedly drew a line on mass surveillance and autonomous weapons, faced pushback from the U.S. government, and OpenAI stepped in to fill the gap. But the real story is more nuanced than that. Both companies were already doing defense work. Both were already in conversations with government agencies. The conflict wasn’t about whether AI companies should work with governments. It was about what happens when a government asks for broader terms and one company says no. This week, I break down what actually happened, what most coverage got wrong, and why this moment sets a precedent that goes well beyond one news cycle. Watch the full video here. AI Tip of the Day When tuning your RAG pipeline, chunk overlap is one of the most skipped parameters. Most implementations set it to zero or a fixed default. Overlap controls how much content is repeated between adjacent chunks. 
Without it, retrieval can miss context that spans a chunk boundary: the first half of an explanation lands in one chunk, the second half in the next, and neither is retrieved in full. The model still returns an answer, but it is built on an incomplete context. Too much overlap, on the other hand, inflates your index size and slows retrieval without proportional gains in recall. A good starting point is generally an overlap of 10 to 20 percent of your chunk size. Before scaling, evaluate retrieval recall on real queries from your domain. This tip comes directly from our Full Stack AI Engineering course. If you want to build a complete RAG pipeline and go deeper into chunking, overlap tuning, and the full retrieval stack for production RAG, you can check out the course here (the first 6 lessons are available as a free preview). — Louis-François Bouchard, Towards AI Co-founder &#38; Head of Community If you’ve ever used AI to write an email, a blog post, or a project update and spent more time editing the output than it would have taken to write it yourself, this is for you. After 3+ years of editing the same AI slop out of every piece of content at Towards AI, we turned our pattern recognition into a reusable prompt template and are releasing it for free. The Anti-Slop AI Writing Guide has 50+ banned AI phrases, style constraints, and a two-model workflow that catches slop before you ever read the draft. Paste it into any LLM, fill in your topic, and it works across emails, reports, blog posts, proposals, and more. Download the guide and let the prompt do what you’ve been doing manually. &#x1F449; Get it free here Learn AI Together Community Section! Featured Community post from the Discord Augmnt_sh has built AOP, an open protocol for real-time observability of autonomous AI agents. 
It is agent-native, i.e., events are emitted from inside the agent, capturing reasoning and intent, and all events are fire-and-forget HTTP POSTs with a 500ms timeout, so it doesn’t slow down or crash your agent. It is also vendor-neutral and local-first. Check it out on GitHub and support a fellow community member. If you have any feedback, share it in the thread! AI poll of the week! Almost half of you picked “too early to say,” while the rest are split between Top 3 and mid-tier, which shows that the question has shifted from “is Gemma 4 the best?” to “are we ready to trust a ranking yet?” But Gemma 4 matters more for what it enables than for taking the crown: the last year of Chinese-lab dominance has produced some outstanding open models, but many are huge MoE systems that are awkward to self-host, costly to run cleanly, and for some Western enterprises, just complicated from a compliance standpoint. Gemma 4 gives those teams a credible alternative: US-origin, Apache 2.0-licensed, and practical to deploy on a single GPU, making it a real option for regulated sectors, air-gapped setups, edge devices, and anyone who needs control over data retention and customization. When you choose a model stack, where do you personally fall on the spectrum: “I’ll trade some capability for control (self-hosting, data retention, offline)” vs. “I’ll trade control for capability (hosted APIs, fastest frontier models)”? Let’s talk in the thread! Collaboration Opportunities The Learn AI Together Discord community is overflowing with collaboration opportunities. If you are excited to dive into applied AI, want a study partner, or even want to find a partner for your passion project, join the collaboration channel! Keep an eye on this section, too — we share cool opportunities every week! 1. Canvas123 is looking for a peer or mentor to collaborate on projects involving machine learning, astrophysics, and general mathematics. 
If this is your field of expertise, connect with them in the thread! 2. Tanners1406 is building an orchestration platform and needs developers and early testers for the project. If this sounds interesting, reach out to them in the thread! 3. Jojosef6192 is specializing in Data [&#8230;]]]></description>
		
		
		
		<media:content url="https://miro.medium.com/v2/resize:fit:700/1*wcI-xWJpBsMkktDuOjD6zQ.png" medium="image"></media:content>
            	</item>
		<item>
		<title>Top 15 Computer Vision Datasets [2026]</title>
		<link>https://towardsai.net/p/machine-learning/top-15-computer-vision-datasets-2026</link>
		
		<dc:creator><![CDATA[Asad Iqbal]]></dc:creator>
		<pubDate>Thu, 09 Apr 2026 09:18:14 +0000</pubDate>
				<category><![CDATA[Computer Vision]]></category>
		<category><![CDATA[Latest]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Towards AI - Medium]]></category>
		<guid isPermaLink="false">https://towardsai.net/p/artificial-intelligence/top-15-computer-vision-datasets-2026</guid>

					<description><![CDATA[Last Updated on April 10, 2026 by Editorial Team Author(s): Asad Iqbal Originally published on Towards AI. An ML engineer’s guide to top image datasets. Learn about ImageNet, COCO, and more, and understand how data annotation and benchmarks drive AI model development. If you are not a premium Medium member, read the full guide FREE here and consider joining Medium to read more such guides. Figure 1: Example of (a) iconic object images, (b) iconic scene images, and (c) non-iconic images. This article provides an in-depth exploration of various computer vision datasets, highlighting their significance in training artificial intelligence models. It begins by defining what computer vision datasets are and discusses crucial aspects such as data quality, diversity, and annotation accuracy. The article details influential datasets like COCO, ImageNet, and Open Images, emphasizing their roles in the AI landscape. It also examines key concepts related to dataset usage, including annotation formats, data augmentation techniques, and best practices for leveraging these datasets in real-world applications to ensure optimal AI model performance. Read the full blog for free on Medium. Join thousands of data leaders on the AI newsletter. Join over 80,000 subscribers and keep up to date with the latest developments in AI. From research to projects and ideas. If you are building an AI startup, an AI-related product, or a service, we invite you to consider becoming a sponsor. Published via Towards AI]]></description>
		
		
		
		<media:content url="https://miro.medium.com/v2/resize:fit:700/1*e9tj4kRR7dH_IV8topwfdw.png" medium="image"></media:content>
            	</item>
		<item>
		<title>40 Generative AI Interview Questions That Actually Get Asked in 2026 (With Answers)</title>
		<link>https://towardsai.net/p/machine-learning/40-generative-ai-interview-questions-that-actually-get-asked-in-2026-with-answers</link>
		
		<dc:creator><![CDATA[Darshandagaa]]></dc:creator>
		<pubDate>Thu, 09 Apr 2026 09:02:12 +0000</pubDate>
				<category><![CDATA[Data Science]]></category>
		<category><![CDATA[Latest]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Towards AI - Medium]]></category>
		<guid isPermaLink="false">https://towardsai.net/p/artificial-intelligence/40-generative-ai-interview-questions-that-actually-get-asked-in-2026-with-answers</guid>

					<description><![CDATA[Last Updated on April 10, 2026 by Editorial Team Author(s): Darshandagaa Originally published on Towards AI. A practitioner’s guide to cracking senior GenAI/LLM engineering roles — from RAG pipelines to multi-agent orchestration I’ve been in AI/ML for eight years. In the last two, almost every interview I’ve sat in — whether for senior data science, ML engineering, or AI product roles — has shifted toward Generative AI. The questions aren’t theoretical anymore. Interviewers want to know if you’ve actually built something: a RAG pipeline that didn’t hallucinate, a multi-agent system that didn’t deadlock, an LLM evaluation suite that caught regressions before production did. This article compiles the 40 questions I’ve encountered most frequently — grouped by topic, with concise but precise answers. If you’re preparing for a senior GenAI role, bookmark this. If you’re already in one, use it to pressure-test your mental model. Section 1: LLM Fundamentals Q1. What is the difference between a base model and an instruction-tuned model? A base model is trained purely on next-token prediction over large corpora. It can complete text but won’t follow instructions reliably. An instruction-tuned model (e.g., GPT-4, Claude) is further fine-tuned on curated instruction-response pairs — often using RLHF or RLAIF — to align outputs to user intent. In production, you almost always use instruction-tuned variants unless you’re doing a very specific fine-tuning task from scratch. Q2. Explain the attention mechanism in transformers and why it matters for LLMs. Attention allows each token to “attend” to all other tokens in the sequence and compute a weighted sum of their value vectors. The key innovation is that the weights (attention scores) are learned through Query-Key dot products. This enables long-range dependencies that RNNs couldn’t capture efficiently. 
For LLMs, self-attention is what allows the model to resolve pronoun references, track context across thousands of tokens, and perform multi-step reasoning. Q3. What is the context window, and what are the practical challenges of a large one? The context window is the maximum number of tokens the model can process in a single forward pass. Larger windows (128k+ in GPT-4o, Claude 3.7) improve in-context learning but come with quadratic attention complexity — O(n²) in memory and compute. Practically, models also exhibit a “lost in the middle” problem [1], where retrieval accuracy degrades for information positioned in the center of a long context. Q4. What is temperature, and how does it affect generation? Temperature scales the logits before the softmax. At temperature = 0, the model always picks the highest-probability token (greedy). At temperature = 1, probabilities are unchanged. Above 1, the distribution flattens and outputs become more random. For factual tasks, use low temperature (0.0–0.3). For creative tasks, 0.7–1.0 is appropriate. Q5. What is the difference between top-k and top-p (nucleus) sampling? Top-k restricts sampling to the k highest-probability tokens. Top-p samples from the smallest set of tokens whose cumulative probability exceeds p. Top-p is generally preferred because it dynamically adapts the candidate set to the entropy of the distribution — at low-entropy moments, it considers fewer tokens; at high-entropy moments, more. This produces more coherent and contextually appropriate outputs. Section 2: Retrieval-Augmented Generation (RAG) Q6. What problem does RAG solve, and what are its core components? LLMs have a knowledge cutoff and can hallucinate on specific facts. RAG grounds generation in retrieved documents, combining the LLM’s language ability with real-time or domain-specific knowledge. 
Core components: (1) a document ingestion pipeline with chunking and embedding, (2) a vector store for similarity search, (3) a retriever, and (4) the LLM generator that synthesizes a response from retrieved context. Q7. How do you choose a chunking strategy? This depends on document type and query nature. Fixed-size chunking (e.g., 512 tokens with 50-token overlap) is simple but ignores semantic boundaries. Semantic chunking groups sentences by embedding similarity. Hierarchical chunking creates parent-child relationships — retrieving a small chunk but sending the parent for full context. For legal or structured documents, structure-aware chunking that respects section headers usually outperforms token-based approaches [2]. Q8. What is hybrid search, and when does it outperform pure vector search? Hybrid search combines dense (vector) retrieval with sparse (BM25/TF-IDF) retrieval, then re-ranks using Reciprocal Rank Fusion or a learned reranker. Pure vector search excels at semantic similarity but struggles with keyword-exact queries (e.g., product codes, names, IDs). Hybrid search outperforms both individually when your query distribution is mixed — which is almost always in enterprise settings. Q9. Explain the difference between a reranker and a bi-encoder. A bi-encoder encodes the query and document independently into fixed vectors and computes similarity via dot product — fast but coarse. A reranker (cross-encoder) takes the concatenated query+document pair and scores it jointly using cross-attention — much slower but significantly more accurate. Best practice: use a bi-encoder for fast candidate retrieval from a large corpus, then apply a cross-encoder reranker to the top-k results. Q10. How do you evaluate a RAG pipeline? Using the RAGAS framework [3], you evaluate across four dimensions: (1) Faithfulness — are the claims in the answer grounded in the retrieved context? (2) Answer Relevance — does the answer actually address the question? 
(3) Context Precision — is the retrieved context relevant? (4) Context Recall — does the retrieved context contain the needed information? In production, I track faithfulness and context precision most closely since those catch hallucinations and retrieval drift. Q11. What is the “lost in the middle” problem in RAG? Research by Liu et al. [1] showed that LLMs are better at using information that appears at the beginning or end of the context window. Information in the middle of a long context is disproportionately ignored. This matters enormously for RAG when you stuff many chunks into the prompt. Mitigations: rerank chunks to put the most relevant ones first, use a “stuffing with boundary tokens” approach, or reduce the number of retrieved chunks. Q12. What are the failure modes of a naive RAG pipeline in production? (1) Chunk granularity mismatch — chunks too large dilute signal; too small lose context. (2) [&#8230;]]]></description>
		
		
		
		<media:content url="https://miro.medium.com/v2/resize:fit:700/1*0736h39bKuBI0LwUgOkMAQ.png" medium="image"></media:content>
            	</item>
	</channel>
</rss>
