<?xml version="1.0" encoding="UTF-8" standalone="no"?><rss xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" version="2.0">

<channel>
	<title>Jesse Liberty - Silverlight Geek</title>
	<atom:link href="https://jesseliberty.com/feed/" rel="self" type="application/rss+xml"/>
	<link>https://jesseliberty.com</link>
	<description>More Signal - Less Noise</description>
	<lastBuildDate>Sun, 14 Jun 2026 18:02:41 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>
	<item>
		<title>Creating a multi-agent application – Part 4</title>
		<link>https://jesseliberty.com/2026/06/14/creating-a-multi-agent-application-part-4/</link>
		
		<dc:creator><![CDATA[Jesse Liberty]]></dc:creator>
		<pubDate>Sun, 14 Jun 2026 18:02:36 +0000</pubDate>
				<category><![CDATA[Agents]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Python]]></category>
		<guid isPermaLink="false">https://jesseliberty.com/?p=13319</guid>

					<description><![CDATA[In part 3 we looked at creating the researcher. As promised, today we&#8217;ll look at the author. You&#8217;ll notice in the following code a great deal of similarity to what we&#8217;ve seen before. The goal is to create a code &#8230; <a href="https://jesseliberty.com/2026/06/14/creating-a-multi-agent-application-part-4/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[
<p>In <a href="https://jesseliberty.com/2026/06/12/creating-a-multi-agent-application-part-3/">part 3</a> we looked at creating the researcher. As promised, today we&#8217;ll look at the author.</p>



<figure class="wp-block-image size-large is-resized"><img fetchpriority="high" decoding="async" width="770" height="800" src="https://jesseliberty.com/wp-content/uploads/2026/06/author-770x800.jpg" alt="" class="wp-image-13320" style="aspect-ratio:0.9625140291806958;width:284px;height:auto" srcset="https://jesseliberty.com/wp-content/uploads/2026/06/author-770x800.jpg 770w, https://jesseliberty.com/wp-content/uploads/2026/06/author-289x300.jpg 289w, https://jesseliberty.com/wp-content/uploads/2026/06/author-144x150.jpg 144w, https://jesseliberty.com/wp-content/uploads/2026/06/author-768x798.jpg 768w, https://jesseliberty.com/wp-content/uploads/2026/06/author.jpg 925w" sizes="(max-width: 770px) 100vw, 770px" /></figure>



<p>You&#8217;ll notice in the following code a great deal of similarity to what we&#8217;ve seen before. The goal is to create a code &#8220;template&#8221; that we can follow as we create any agent; departing only for the agent&#8217;s special requirements and abilities.</p>



<p>As usual, we start with the factory method:</p>



<pre class="wp-block-code"><code>def create_author_chain():
    """Creates the author chain."""
    def author_invoke(state):
        research = state.get("research_findings", &#91;])
        research_text = "\n\n".join(research) if research else "No research available."

        prompt = author_prompt_template.format(
            main_task=state.get("main_task", ""),
            research_findings=research_text,
            draft=state.get("draft", ""),
            review_notes=state.get("review_notes", "")
        )

        try:
            response = llm.invoke(prompt)
            content = response.content if hasattr(response, 'content') else str(response)
            return content if content else "Draft in progress..."
        except Exception as e:
            print(f"Author error: {e}")
            return "Error generating draft. Please try again."

    return author_invoke

# Creating a callable object
author_chain = create_author_chain()</code></pre>



<span id="more-13319"></span>



<p>We start by getting the research findings as a collection. We join all the entries into a single text string, or if there are no findings, we create the string &#8220;No research available.&#8221;</p>
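<p>Here is a minimal sketch of that join-with-fallback step on its own. The helper name join_findings is hypothetical; the original code does this inline.</p>

```python
def join_findings(findings):
    """Join research findings into one block of text, with a fallback string."""
    return "\n\n".join(findings) if findings else "No research available."

print(join_findings(["First finding.", "Second finding."]))
print(join_findings([]))
```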



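<p>A local variable, prompt, is created from the author_prompt_template and passed to the invoke method of the LLM. The local variable content holds the response&#8217;s content attribute if the response has one; otherwise it holds the response converted to a string. If that text is empty, or if the call raises an exception, the function returns a placeholder message instead.</p>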



<p>The structure, so far, is identical to what we&#8217;ve seen before.</p>



<p>Next, we create the author_prompt_template used above:</p>



<pre class="wp-block-code"><code>author_prompt_template = """
You are a professional blogger.

Main Task: {main_task}

Research Findings:
{research_findings}

Current Draft: {draft}

Review Notes: {review_notes}

Instructions:
- If this is the first draft (no current draft), create a comprehensive post based on the findings
- If there is a current draft and review notes, revise the draft to address all feedback
- Use professional tone
- Make the post concise (aim for 250-500 words)

Write the complete post now:
"""</code></pre>



<p>This is pretty self-explanatory. The interpolation variables (e.g., {draft}) are filled in each time author_invoke, the function returned by create_author_chain, is called.</p>
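<p>As a rough illustration of how the placeholders get filled, here is a sketch using a shortened, hypothetical template (demo_template is not part of the application). It also shows that literal braces in a template passed to str.format must be doubled.</p>

```python
# Hypothetical, shortened template used only for this illustration.
demo_template = """Main Task: {main_task}
Current Draft: {draft}
Reply as JSON: {{"status": "..."}}"""

# Named placeholders are replaced; doubled braces come out as single braces.
filled = demo_template.format(main_task="Explain closures", draft="")
print(filled)
```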



<p>Finally, we create the author_node:</p>



<pre class="wp-block-code"><code>def author_node(state: ResearchState) -> dict:
    """Author node that creates or revises draft."""
    print("\n>>>Author")

    draft = author_chain(state)
    print(f"Draft created: {len(draft)} characters")

    return {
        "draft": draft,
        "revision_number": state.get("revision_number", 0) + 1
    }</code></pre>
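<p>Notice that the node returns only the keys it changes; LangGraph treats a node&#8217;s return value as a partial update to the state. The sketch below mimics that with a stubbed chain so it runs without an LLM. The names fake_author_chain and author_node_sketch are hypothetical, and the dictionary merge is a simplification for illustration.</p>

```python
# Stand-in for the real author_chain so the sketch runs without an LLM (assumption).
def fake_author_chain(state):
    return f"Draft about {state.get('main_task', '')}"

def author_node_sketch(state, chain=fake_author_chain):
    """Mirror of author_node with the chain injected for testing."""
    draft = chain(state)
    return {
        "draft": draft,
        "revision_number": state.get("revision_number", 0) + 1,
    }

state = {"main_task": "closures in Python", "revision_number": 1}
update = author_node_sketch(state)
state = {**state, **update}  # simple overwrite merge, for illustration only
print(state["revision_number"])
```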



<h2 class="wp-block-heading">Reviewer</h2>



<p>The reviewer follows the same pattern, so we might as well look at it here. We start with the factory:</p>



<pre class="wp-block-code"><code>def create_reviewer_chain():
    """Creates the reviewer chain."""
    def reviewer_invoke(state):
        draft = state.get("draft", "")
        revision_num = state.get("revision_number", 0)

        if len(draft.strip()) &lt; 100:
            return "APPROVED - Draft is minimal but acceptable."

        if revision_num >= 4:
            return "APPROVED - Maximum revisions reached. The report is satisfactory."

        prompt = reviewer_prompt_template.format(
            main_task=state.get("main_task", ""),
            draft=draft
        )

        try:
            response = llm.invoke(prompt)
            content = response.content if hasattr(response, 'content') else str(response)
            return content if content else "APPROVED"
        except Exception as e:
            print(f"Review error: {e}")
            return "APPROVED - Error in review, proceeding with current draft."

    return reviewer_invoke

# Creating a callable object
reviewer_chain = create_reviewer_chain()</code></pre>



<p>The reviewer will either approve the draft or kick it back directly to the author for revision. The only other part here is creating the node for the reviewer.</p>



<pre class="wp-block-code"><code>def reviewer_node(state: ResearchState) -> dict:
    """Node that reviews the draft."""
    print("\n>>REVIEWER")

    review = reviewer_chain(state)
    print(f"Review: {review&#91;:100]}...")

    is_approved = "APPROVED" in review.upper()

    if is_approved:
        print("✓ Draft APPROVED")
        return {
            "review_notes": "APPROVED",
            "next_step": "END"
        }
    else:
        print("✗ Revisions needed")
        return {
            "review_notes": review,
            "next_step": "author"
        }</code></pre>
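<p>One thing to watch: the approval test is a substring check, so a review containing the phrase &#8220;not approved&#8221; would also count as approved. The reviewer prompt is not shown in this post, so whether the model would ever phrase a review that way is an assumption. The sketch below compares the check as written with a hypothetical stricter variant.</p>

```python
def is_approved_substring(review: str) -> bool:
    """The check used in reviewer_node: any occurrence of APPROVED counts."""
    return "APPROVED" in review.upper()

def is_approved_strict(review: str) -> bool:
    """Hypothetical stricter variant: the review must begin with APPROVED."""
    return review.strip().upper().startswith("APPROVED")

for text in ["APPROVED - looks good", "Not approved: tighten the intro"]:
    print(text, is_approved_substring(text), is_approved_strict(text))
```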



<p>In the next post we&#8217;ll finally see how the nodes are used in creating a workflow.</p>



<p></p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Creating a multi-agent application — Part 3</title>
		<link>https://jesseliberty.com/2026/06/12/creating-a-multi-agent-application-part-3/</link>
		
		<dc:creator><![CDATA[Jesse Liberty]]></dc:creator>
		<pubDate>Fri, 12 Jun 2026 21:16:22 +0000</pubDate>
				<category><![CDATA[AI]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Python]]></category>
		<guid isPermaLink="false">https://jesseliberty.com/?p=13316</guid>

					<description><![CDATA[In the previous post, we examined how to load the libraries we need and how to create the Blogger agent. In this post, we&#8217;ll examine the Research agent. You&#8217;ll no doubt notice the pattern of defining the template, the agent &#8230; <a href="https://jesseliberty.com/2026/06/12/creating-a-multi-agent-application-part-3/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[
<p>In the <a href="https://jesseliberty.com/2026/06/12/creating-a-multi-agent-application-part-2/">previous post</a>, we examined how to load the libraries we need and how to create the Blogger agent. In this post, we&#8217;ll examine the Research agent. You&#8217;ll no doubt notice the pattern of defining the template, the agent and the node. This will carry through for all the agents we&#8217;ll create.</p>



<pre class="wp-block-code"><code>researcher_prompt_template = """You are a researcher for a technical blog 
focused on .NET and AI with examples in C# and Python

Research Topic: {task}

Your goal is to find relevant, up-to-date insights for developers. Focus on:
- Key trends, challenges, or innovations
- Real-world use cases
- Supporting data or quotes from credible sources
- Simple explanations
- Short code examples in C# or Python

Summarize your findings concisely.
"""</code></pre>



<p>In this template we start by telling the researcher what role it will play. We then provide a goal and narrow that goal to a series of topics to focus on and how to present that data.</p>



<span id="more-13316"></span>



<pre class="wp-block-code"><code>def create_researcher_agent():
    """Creates a researcher agent that uses Tavily search."""

    def researcher_invoke(input_dict):
        """Execute research using Tavily search."""
        query = input_dict.get("input", "")

        try:
            search_response = tavily_tool.invoke({"query": query})
            results = search_response.get('results', &#91;])
            formatted_results = &#91;]

            if results:
                for result in results&#91;:3]:
                    title = result.get('title', 'Untitled')
                    url = result.get('url', 'N/A')
                    content = result.get('content', '')
                    formatted_results.append(f">>{title}\nSource: {url}\n{content&#91;:300]}...\n")

                raw_output = "\n".join(formatted_results)
            else:
                raw_output = "No results found"

            # Summarize with LLM
            summary_prompt = f"""Based on these search results about '{query}',
            provide a concise summary of key findings:
            {raw_output}
            """

            summary_response = llm.invoke(summary_prompt)
            summary = summary_response.content if hasattr(summary_response, 'content') else str(summary_response)

            return {
                "output": summary if summary else raw_output,
                "input": query
            }

        except Exception as e:
            print(f"Research error: {e}")
            return {
                "output": f"Research completed on: {query}. Key information has been gathered from web sources.",
                "input": query
            }

    return researcher_invoke

researcher_agent = create_researcher_agent()</code></pre>



<p>Using the same pattern we used with the Blogger, we create a factory to create the researcher agent. Note that we flag that we&#8217;ll be using Tavily for searching the web. </p>



<p>We take the raw information obtained and feed it to the LLM asking for a concise summary. This line:</p>



<pre class="wp-block-code"><code>summary = summary_response.content if hasattr(summary_response, 'content') else str(summary_response)</code></pre>



<p>&#8230;uses the content attribute if it exists. Otherwise, it falls back to the string form of summary_response.</p>
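<p>To see that fallback in isolation, here is a small sketch. FakeMessage is a hypothetical stand-in for the object a chat model returns, and extract_text is a helper name invented for this example.</p>

```python
class FakeMessage:
    """Stand-in for a chat model response that carries a content attribute."""
    def __init__(self, content):
        self.content = content

def extract_text(response):
    """Return response.content when present, otherwise the string form."""
    return response.content if hasattr(response, "content") else str(response)

print(extract_text(FakeMessage("summary text")))
print(extract_text("plain string reply"))
```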



<p>Don&#8217;t be confused by the inner and outer functions. Remember, indentation is critical in Python.</p>



<p>Finally, we&#8217;ll create the research node, just as we did with Blogger. </p>



<pre class="wp-block-code"><code>def research_node(state: ResearchState) -> dict:
    """Research node that gathers information."""
    print("\n>>>RESEARCHER")

    sub_task = state.get("current_sub_task", state.get("main_task"))
    print(f"Researching: {sub_task}")

    try:
        result = researcher_agent({"input": sub_task})
        findings = result.get("output", "Research completed")
        print(f"Found: {str(findings)&#91;:100]}...")
    except Exception as e:
        print(f"Research error: {e}")
        findings = f"Research on {sub_task} - information gathered"

    return {
        "research_findings": &#91;findings]
    }</code></pre>



<p>In the next post, we&#8217;ll examine the Author agent.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Creating a multi-agent application. Part 2</title>
		<link>https://jesseliberty.com/2026/06/12/creating-a-multi-agent-application-part-2/</link>
		
		<dc:creator><![CDATA[Jesse Liberty]]></dc:creator>
		<pubDate>Fri, 12 Jun 2026 07:00:00 +0000</pubDate>
				<category><![CDATA[Essentials]]></category>
		<guid isPermaLink="false">https://jesseliberty.com/?p=13308</guid>

					<description><![CDATA[In my previous post, I showed the output of a multi-agent application I wrote to create blog posts (not to worry, it is for demonstration purposes only). In this post, I will begin the process of working through the code, &#8230; <a href="https://jesseliberty.com/2026/06/12/creating-a-multi-agent-application-part-2/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[
<p>In my <a href="https://jesseliberty.com/2026/06/11/creating-a-multi-agent-application/">previous post,</a> I showed the output of a multi-agent application I wrote to create blog posts (not to worry, it is for demonstration purposes only). In this post, I will begin the process of working through the code, line by line.</p>



<p>This application is written in Python, in a Colab notebook, using (among other things) LangChain and LangGraph. To follow along you will need to obtain an API key from OpenAI and a key from <a href="https://www.tavily.com/">Tavily</a>. </p>



<p>If you are a C# programmer with little or no Python experience, <em>don&#8217;t panic!</em> Python is pretty readable, and I&#8217;ll explain any part that is potentially obscure or confusing.</p>



<p>This will be a multi-agent application. The agents we&#8217;ll create will be:</p>



<ul class="wp-block-list">
<li><strong>Blogger</strong> which will orchestrate the others</li>



<li><strong>Researcher</strong>, which will search the web for relevant information</li>



<li><strong>Author</strong>, which will write drafts of the blog post</li>



<li><strong>Reviewer</strong>, which will evaluate the drafts and suggest improvements</li>
</ul>



<p>As a general rule, I try to limit the number of agents to 3-5. Any more than that can get terribly complicated with diminishing returns. Your mileage may vary.</p>



<span id="more-13308"></span>



<h2 class="wp-block-heading">Set Up</h2>



<p>The program begins by installing the necessary libraries.</p>



<pre class="wp-block-code"><code>!pip install -q openai==1.66.3 \
                langchain==0.3.20 \
                langchain-openai==0.3.9 \
                langchain_experimental==0.3.4 \
                langchain-tavily==0.2.4 \
                tavily-python==0.5.0 \
                langgraph==0.3.21 \
                langgraph-supervisor==0.0.18
</code></pre>



<p>You then load your OpenAI API key and Tavily API key from either a configuration file (which is what I do here) or from the environment.</p>



<pre class="wp-block-code"><code>import json
import os
from pprint import pprint

file_name = 'config.json'
with open(file_name, 'r') as file:
    config = json.load(file)
    os.environ&#91;'OPENAI_API_KEY'] = config.get("API_KEY")
    os.environ&#91;"OPENAI_BASE_URL"] = config.get("OPENAI_API_BASE")
    os.environ&#91;"TAVILY_API_KEY"] = config.get("TAVILY_API_KEY")</code></pre>
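<p>If you would rather read the keys from the environment, something along these lines should work. This is a sketch, not the code used in the notebook: load_setting is a hypothetical helper, and the demo value is set only so the example runs anywhere.</p>

```python
import json
import os

def load_setting(name, config_path="config.json"):
    """Return a setting from the environment, else from the JSON config file."""
    value = os.environ.get(name)
    if value:
        return value
    if os.path.exists(config_path):
        with open(config_path, "r") as file:
            return json.load(file).get(name)
    return None

# Demo value so the sketch runs anywhere; a real key would come from your shell.
os.environ["TAVILY_API_KEY"] = "demo-key"
print(load_setting("TAVILY_API_KEY"))
```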



<p>Next, we set up the model. </p>



<pre class="wp-block-code"><code>from langchain_openai import ChatOpenAI

model_name = 'gpt-4o-mini'

llm = ChatOpenAI(
    model = model_name,
    temperature=0,
    max_tokens=4096
)</code></pre>



<p>I&#8217;ve opted for the gpt-4o-mini model, as it&#8217;s the most cost-effective option. I&#8217;ve set the temperature to 0 to get the most consistent results. </p>



<p>Tavily is a tool for searching the web, so let&#8217;s set that up as well:</p>



<pre class="wp-block-code"><code>from langchain_tavily import TavilySearch

tavily_tool = TavilySearch(
    max_results=5,
    topic="general",
    include_answer=False,
    include_raw_content=False,
    search_depth="basic"
)</code></pre>



<p>You can find all the options for this on the <a href="https://tavily.com">Tavily website</a>.</p>



<p>I&#8217;m going to want a state object so that I can pass it among the agents. </p>



<pre class="wp-block-code"><code>from typing import TypedDict, Annotated, List
from langgraph.graph import StateGraph, END
import operator

class ResearchState(TypedDict):
    """State for the research workflow."""
    main_task: str
    research_findings: Annotated&#91;List&#91;str], operator.add]
    draft: str
    review_notes: str
    revision_number: int
    next_step: str
    current_sub_task: str</code></pre>



<p><strong>Note</strong>, the line <em>research_findings: Annotated&#91;List&#91;str], operator.add]</em> is a Python type annotation. In short, Annotated attaches metadata to a type, in this case the function operator.add. LangGraph treats that function as a reducer: when a node returns new research_findings, they are appended to the existing list rather than replacing it. </p>
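<p>One way to picture what operator.add does to the state is the sketch below. It simulates the idea in plain Python and is not how LangGraph is implemented internally; apply_update and the reducers dictionary are hypothetical names.</p>

```python
import operator

def apply_update(state, update, reducers):
    """Merge a node's partial update into state, using a reducer where one is declared."""
    merged = dict(state)
    for key, value in update.items():
        if key in reducers and key in merged:
            merged[key] = reducers[key](merged[key], value)  # e.g. list + list
        else:
            merged[key] = value  # no reducer: the new value overwrites the old
    return merged

reducers = {"research_findings": operator.add}
state = {"research_findings": ["first finding"], "draft": "old"}
state = apply_update(state, {"research_findings": ["second finding"], "draft": "new"}, reducers)
print(state)
```

<p>Keys with a reducer accumulate; keys without one are simply overwritten. That is why research_node returns its findings wrapped in a list.</p>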



<h2 class="wp-block-heading">Blogger</h2>



<p>To get started, we&#8217;ll create the (non-trivial) blogger agent. This is actually the most powerful and thus the most complex of the agents.</p>



<p>We begin with the prompt template (this is, essentially, the system prompt):</p>



<pre class="wp-block-code"><code>blogger_prompt_template = """You are a blogger managing a blog post creation workflow.

Current Task: {main_task}

Current State:
- Research Findings: {research_findings}
- Blog Draft: {draft}
- Reviewer Feedback: {review_notes}
- Revision Number: {revision_number}

Your goal is to ensure a clear, engaging, and valuable blog post targeted at software developers.

Decide the next step and respond only with a JSON object (no extra text):
{{
  "next_step": "researcher" or "author" or "END",
  "task_description": "Brief description of what needs to be done next"
}}

Decision Rules:
- If no research exists, choose "researcher"
- If research exists but no draft, choose "author"
- If draft exists and reviewer says "APPROVED", choose "END"
- If draft needs revision, choose "author"
- If revision_number >= 4, choose "END"
"""</code></pre>



<p>Take a moment to read this over; it sets the parameters and goals of the program. The decision rules are critical: they control the flow, and they set a limit on revisions (in this case 4). </p>



<p>Having told the Blogger what we are trying to accomplish and the general tone of the output, we&#8217;re ready to create the decision tree that constitutes the workflow for the Blogger. This is pretty long, but much of it is self-explanatory.</p>



<pre class="wp-block-code"><code>def create_blogger_chain():
    """Creates the blogger decision chain."""

    def blogger_invoke(state):
        research = state.get("research_findings", &#91;])
        research_text = "\n".join(research) if research else "No research yet."
        revision = state.get("revision_number", 0)
        has_research = len(research) > 0
        has_draft = bool(state.get("draft", "").strip())
        review = state.get("review_notes", "")

        if "APPROVED" in review.upper() and has_draft:
            print("Blogger: Draft approved, ending workflow")
            return {
                "next_step": "END",
                "task_description": "Report approved and complete"
            }

        if not has_research:
            print("Blogger: No research yet, directing to researcher")
            return {
                "next_step": "researcher",
                "task_description": f"Research the topic: {state.get('main_task', '')}"
            }


        if has_research and not has_draft:
            print("Blogger: Have research, creating first draft")
            return {
                "next_step": "author",
                "task_description": "Write the first draft based on research findings"
            }

        if has_draft and not review:
            print("Blogger: Have draft, sending to reviewer")
            return {
                "next_step": "reviewer",
                "task_description": "Prepare draft for review"
            }

        if review and "APPROVED" not in review.upper() and revision &lt; 4:
            print(f"Blogger: Revision {revision}, sending back to author")
            return {
                "next_step": "author",
                "task_description": "Revise the draft based on review feedback"
            }

        if revision >= 4:
            print("Blogger: Max revisions reached! Ending")
            return {
                "next_step": "END",
                "task_description": "Maximum revisions reached! Finalizing report"
            }

        # LLM as fallback
        prompt = blogger_prompt_template.format(
            main_task=state.get("main_task", ""),
            research_findings=research_text,
            draft=state.get("draft", "No draft yet."),
            review_notes=review if review else "No review yet.",
            revision_number=revision
        )

        try:
            response = llm.invoke(prompt)
            content = response.content if hasattr(response, 'content') else str(response)

            text = content.strip()
            if text.startswith("```"):
                lines = text.split("\n")
                text = "\n".join(&#91;l for l in lines if not l.strip().startswith("```")])
            text = text.strip()

            decision = json.loads(text)

            if "next_step" in decision:
                return decision

        except Exception as e:
            print(f"LLM parsing error: {e}")

        # Final fallback 
        print("Blogger: Using final fallback - continuing with author")
        return {
            "next_step": "author",
            "task_description": "Continue with draft creation"
        }

    return blogger_invoke

# Creating a callable object
blogger_chain = create_blogger_chain()</code></pre>



<p>We begin with a nested function.</p>



<pre class="wp-block-code"><code>def create_blogger_chain():
    """Creates the blogger decision chain."""

    def blogger_invoke(state):
        ...</code></pre>



<p>When create_blogger_chain is called, Python defines the inner function and returns it; the inner function itself runs only when the returned callable is invoked. The inner function does the real work.</p>



<p>We define a second function <strong>inside</strong> <code>create_blogger_chain</code>. This creates a closure: the inner function can access variables from the scopes that enclose it. This is a common construct when working with LangChain. The inner function has access to the llm without it being passed on every call. In short, <em>create_blogger_chain</em> is a factory function that constructs and configures a callable and then returns it.</p>
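<p>If closures are new to you, this stripped-down sketch shows the same factory shape. The greeter names are hypothetical and exist only for this illustration.</p>

```python
def create_greeter_chain(greeting):
    """Factory: configures and returns a callable, like create_blogger_chain."""
    def greeter_invoke(state):
        # greeting is captured from the enclosing function (the closure)
        return f"{greeting}, {state.get('name', 'world')}!"
    return greeter_invoke

greeter_chain = create_greeter_chain("Hello")
print(greeter_chain({"name": "Ada"}))  # the inner function runs only here
```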



<p>We next set up our variables based on starting values in the state object. With that done, we&#8217;re ready to progress through a series of possible conditions. These follow the rules established in the template.</p>
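<p>When none of the conditions match, the code falls back to the LLM and parses its reply as JSON, first dropping any code-fence lines the model may have wrapped around it. Here is that parsing step as a standalone sketch; parse_decision is a hypothetical helper name, and the sample reply is made up.</p>

```python
import json

FENCE = "`" * 3  # three backticks, built this way to keep the example readable

def parse_decision(raw):
    """Strip optional code-fence lines, then parse the remaining text as JSON."""
    text = raw.strip()
    if text.startswith(FENCE):
        lines = text.split("\n")
        text = "\n".join(line for line in lines if not line.strip().startswith(FENCE))
    return json.loads(text.strip())

# A made-up reply wrapped in a fence, as models sometimes return.
reply = FENCE + 'json\n{"next_step": "author", "task_description": "Revise"}\n' + FENCE
print(parse_decision(reply)["next_step"])
```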



<p>All that&#8217;s left for the blogger is to create a node where the decision will be implemented. We&#8217;ll use this node, and the others we&#8217;ll create, when we implement the workflow (after we define all the agents).</p>



<pre class="wp-block-code"><code>def blogger_node(state: ResearchState) -&gt; dict:
    """Blogger decides the next step."""
    print("\n&gt;&gt;&gt;Blogger")

    decision = blogger_chain(state)

    next_step = decision.get("next_step", "researcher")
    task_desc = decision.get("task_description", "Continue work")

    print(f"Decision: {next_step}")
    print(f"Task: {task_desc}")

    return {
        "next_step": next_step,
        "current_sub_task": task_desc,
    }</code></pre>



<p><strong>Note:</strong> The variable <em>decision</em> is assigned the result of calling the chain we just reviewed. With that in hand, we call <em>get</em> on <em>decision</em>, asking for the next step. If none is returned, we default to researcher.</p>



<p>That&#8217;s it for blogger. We&#8217;ll review the other agents in the next posting.</p>



<p></p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Creating a multi-agent application – Part 1</title>
		<link>https://jesseliberty.com/2026/06/11/creating-a-multi-agent-application/</link>
		
		<dc:creator><![CDATA[Jesse Liberty]]></dc:creator>
		<pubDate>Thu, 11 Jun 2026 16:24:42 +0000</pubDate>
				<category><![CDATA[AI]]></category>
		<guid isPermaLink="false">https://jesseliberty.com/?p=13301</guid>

					<description><![CDATA[The following text was created by a multi-agent application designed to create blog posts. In my next post we&#8217;ll take the application apart, step by step. For now, here is a test run with the prompt Use of multiagents in &#8230; <a href="https://jesseliberty.com/2026/06/11/creating-a-multi-agent-application/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[
<p><strong>The following text was created by a multi-agent application designed to create blog posts. In my next post we&#8217;ll take the application apart, step by step. For now, here is a test run with the prompt <em>Use of multiagents in writing a C# application</em>.</strong></p>



<p>Draft created: 2653 characters<br />{&#8216;author&#8217;: {&#8216;draft&#8217;: &#8216;# Harnessing Multi-Agent Systems in C# Applications\n&#8217;<br />&#8216;\n&#8217;<br />&#8216;In the evolving landscape of software development, &#8216;<br />&#8216;multi-agent systems (MAS) have emerged as a powerful &#8216;<br />&#8216;paradigm, particularly in enhancing the functionality of &#8216;<br />&#8216;applications. However, the integration of these systems &#8216;<br />&#8216;into C# applications comes with its own set of &#8216;<br />&#8216;challenges and considerations. This post explores the &#8216;<br />&#8216;key aspects of implementing multi-agent systems in C#, &#8216;<br />&#8216;drawing from recent research findings.\n&#8217;<br />&#8216;\n&#8217;</p>



<p><br />&#8216;## Understanding Multi-Agent Systems\n&#8217;<br />&#8216;\n&#8217;<br />&#8216;At its core, a multi-agent system consists of multiple &#8216;<br />&#8216;autonomous agents that interact with one another to &#8216;<br />&#8216;achieve specific goals. These agents can be designed to &#8216;<br />&#8216;perform tasks collaboratively, leading to improved &#8216;<br />&#8216;efficiency and problem-solving capabilities. However, as &#8216;<br />&#8216;highlighted by Elliot One, simply increasing the number &#8216;<br />&#8216;of agents does not guarantee better outcomes. In fact, &#8216;<br />&#8216;it can complicate the debugging process, making it more &#8216;<br />&#8216;difficult to trace failures and understand system &#8216;<br />&#8216;behavior. This underscores the importance of thoughtful &#8216;<br />&#8216;design and implementation when developing multi-agent &#8216;<br />&#8216;systems.\n&#8217;<br />&#8216;\n&#8217;</p>



<span id="more-13301"></span>



<p><br />&#8216;## Multi-Agent Architecture in .NET\n&#8217;<br />&#8216;\n&#8217;<br />&#8216;For developers looking to implement multi-agent systems &#8216;<br />&#8216;in C#, the Microsoft Agent Framework provides a robust &#8216;<br />&#8216;foundation. A recent tutorial video introduces the &#8216;<br />&#8216;concept of multi-agent orchestration and workflows, &#8216;<br />&#8216;offering insights into how these systems can be &#8216;<br />&#8216;structured within .NET applications. This resource is &#8216;<br />&#8216;invaluable for developers seeking to grasp the &#8216;<br />&#8216;architectural considerations necessary for effective &#8216;<br />&#8216;multi-agent implementation.\n&#8217;<br />&#8216;\n&#8217;<br />&#8216;## Practical Implementation\n&#8217;<br />&#8216;\n&#8217;<br />&#8216;Building a multi-agent system requires a structured &#8216;<br />&#8216;approach. A Codelabs resource outlines a step-by-step &#8216;<br />&#8216;guide for developers interested in practical &#8216;<br />&#8216;implementation. This guide covers prerequisites, &#8216;<br />&#8216;essential components, and best practices for creating a &#8216;<br />&#8216;functional multi-agent system. By following this &#8216;<br />&#8216;structured methodology, developers can mitigate common &#8216;<br />&#8216;pitfalls and enhance the overall quality of their &#8216;<br />&#8216;applications.\n&#8217;<br />&#8216;\n&#8217;<br />&#8216;## Conclusion\n&#8217;<br />&#8216;\n&#8217;<br />&#8216;While multi-agent systems hold significant potential for &#8216;<br />&#8216;enhancing C# applications, developers must navigate the &#8216;<br />&#8216;complexities associated with their implementation. 
By &#8216;<br />&#8216;leveraging available resources and adhering to best &#8216;<br />&#8216;practices, it is possible to create robust multi-agent &#8216;<br />&#8216;systems that improve application functionality without &#8216;<br />&#8216;succumbing to the common challenges of increased &#8216;<br />&#8216;complexity. As the field continues to evolve, staying &#8216;<br />&#8216;informed and educated on the latest developments will be &#8216;<br />&#8216;crucial for developers aiming to harness the full power &#8216;<br />&#8216;of multi-agent systems in their applications.&#8217;</p>



<p>&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212; <br />&gt;&gt;REVIEWER Review: APPROVED &#8211; The draft effectively introduces the concept of multi-agent systems in C# applications an&#8230; ✓ Draft APPROVED {&#8216;reviewer&#8217;: {&#8216;next_step&#8217;: &#8216;END&#8217;, &#8216;review_notes&#8217;: &#8216;APPROVED&#8217;}}<br /> &#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212; <br />&gt;&gt;&gt;Blogger Blogger: Draft approved, ending workflow Decision: END Task: Report approved and complete {&#8216;blogger&#8217;: {&#8216;current_sub_task&#8217;: &#8216;Report approved and complete&#8217;, &#8216;next_step&#8217;: &#8216;END&#8217;}}</p>



<p>We take this apart beginning in <a href="https://jesseliberty.com/2026/06/12/creating-a-multi-agent-application-part-2/">part 2</a>.</p>



<p></p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>ReAct and Agents in AI</title>
		<link>https://jesseliberty.com/2026/05/22/react-and-agents-in-ai/</link>
		
		<dc:creator><![CDATA[Jesse Liberty]]></dc:creator>
		<pubDate>Fri, 22 May 2026 11:22:31 +0000</pubDate>
				<category><![CDATA[Agents]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[CoT]]></category>
		<category><![CDATA[ReAct]]></category>
		<guid isPermaLink="false">https://jesseliberty.com/?p=13295</guid>

					<description><![CDATA[In the previous post, we looked at the use of Chain of Thought (CoT) reasoning in the context of LLMs. For an LLM to take action in the world, however, it needs agents. The paradigm for this is called ReAct—that &#8230; <a href="https://jesseliberty.com/2026/05/22/react-and-agents-in-ai/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[
<p>In the <a href="https://jesseliberty.com/2026/05/18/ai-reasoning-and-planning/">previous post,</a> we looked at the use of Chain of Thought (CoT) reasoning in the context of LLMs. For an LLM to take action in the world, however, it needs agents. The paradigm for this is called ReAct—that is, REason and ACT. </p>



<p>In order to interact with the world, the agent will use tools (such as code that accesses APIs, searches the Internet, etc.). This creates a dynamic cycle: </p>



<figure class="wp-block-image size-full"><img decoding="async" width="257" height="211" src="https://jesseliberty.com/wp-content/uploads/2026/05/loop.jpg" alt="" class="wp-image-13296" srcset="https://jesseliberty.com/wp-content/uploads/2026/05/loop.jpg 257w, https://jesseliberty.com/wp-content/uploads/2026/05/loop-150x123.jpg 150w" sizes="(max-width: 257px) 100vw, 257px" /></figure>



<p><strong>Think</strong>—the LLM reasons and decides what tool to use<br /><strong>ACT</strong>—the LLM uses the tool to take action in the world<br /><strong>Observe</strong>—the LLM observes the result of the action and adjusts accordingly, refining its plan</p>



<p>The cycle ends when the LLM has its final answer.</p>



<span id="more-13295"></span>



<h2 class="wp-block-heading">Prompts for ReAct</h2>



<p>The prompt must specify the available tools <em>and their descriptions</em>. It will instruct the LLM to use the cycle above and provide examples of the cycle. Finally, it will constrain the output into a <em>machine-readable format</em>. </p>



<p>The prompt must be very clear and precise. The basic template is to tell the agent what it is (&#8220;you are a medical assistant&#8221;), what tools it has access to, how to use the tool, what the returned observation will look like (structured data such as JSON) and how to provide the interim and final answers.</p>



<p>The output after each tool use is machine-readable so that it can be fed back into the cycle or to a tool or internal prompt.</p>
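


<p>To make the template concrete, here is a minimal Python sketch of how such a prompt might be assembled. The tool names, their descriptions, and the JSON field names are all invented for illustration; they are assumptions, not part of any particular framework.</p>



<pre class="wp-block-code"><code>import json

# Hypothetical tool registry: names and descriptions are invented for illustration.
TOOLS = {
    "search_web": "Search the Internet. Input: a query string.",
    "get_lab_result": "Look up a lab result. Input: a patient id.",
}

def build_react_prompt(role, tools):
    """Assemble a ReAct-style system prompt from a role and a tool registry."""
    tool_lines = "\n".join(f"- {name}: {desc}" for name, desc in tools.items())
    # Examples of the machine-readable shapes we ask the model to emit.
    action_example = json.dumps({"thought": "...", "action": "search_web", "input": "..."})
    final_example = json.dumps({"thought": "...", "final_answer": "..."})
    return (
        f"You are {role}.\n"
        f"You have access to these tools:\n{tool_lines}\n\n"
        "Work in a loop of Think, Act, Observe.\n"
        f"To use a tool, reply with JSON only, like: {action_example}\n"
        "You will then receive an observation as JSON.\n"
        f"When you know the answer, reply with JSON only, like: {final_example}\n"
    )

prompt = build_react_prompt("a medical assistant", TOOLS)
print(prompt)</code></pre>



<p>Keeping the tool descriptions in one dictionary means the prompt and the code that dispatches tool calls can be generated from the same source.</p>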



<p>With CoT (Chain of Thought), the LLM was limited to its existing knowledge and could not deviate from its initial plan. With ReAct, the LLM can adapt to the observed results and use tools to extend its knowledge. If an error is encountered, the LLM can adjust its plan to avoid the failure (e.g., by using a different tool).</p>



<p>We can tell the LLM to output its plan and observations as it goes. This can be enormously helpful for debugging.</p>



<p>The &#8220;think&#8221; phase brings us back to CoT. Here the LLM generates an initial plan and then uses the tools to execute it. After each step, the LLM observes the results and uses them to adjust the plan.</p>
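


<p>The whole cycle can be sketched in a few lines of Python. This is only an illustration under stated assumptions: fake_llm is a scripted stand-in for a real model, lookup_capital is a made-up tool, and the JSON field names are invented.</p>



<pre class="wp-block-code"><code>import json

def lookup_capital(country):
    """Hypothetical tool: a tiny lookup table standing in for a real API call."""
    return {"France": "Paris", "Japan": "Tokyo"}.get(country, "unknown")

TOOLS = {"lookup_capital": lookup_capital}

def fake_llm(history):
    """Stand-in for a real model: picks the next step from the transcript so far."""
    if not any(step["role"] == "observation" for step in history):
        return json.dumps({"thought": "I need the capital of France.",
                           "action": "lookup_capital", "input": "France"})
    last = history[-1]["content"]
    return json.dumps({"thought": "The tool gave me the answer.",
                       "final_answer": f"The capital is {last}."})

def react_loop(question, llm, tools, max_steps=5):
    """Run Think, Act, Observe until the model returns a final answer."""
    history = [{"role": "question", "content": question}]
    for _ in range(max_steps):
        reply = json.loads(llm(history))                  # Think: machine-readable reply
        history.append({"role": "thought", "content": reply["thought"]})
        if "final_answer" in reply:                       # the cycle ends here
            return reply["final_answer"], history
        result = tools[reply["action"]](reply["input"])   # Act: run the chosen tool
        history.append({"role": "observation", "content": result})  # Observe
    return None, history

answer, trace = react_loop("What is the capital of France?", fake_llm, TOOLS)
print(answer)</code></pre>



<p>The returned trace holds every thought and observation in order, which is the record you would inspect when debugging.</p>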



<p>Interacting with the tools brings us to the Model Context Protocol (MCP), a topic for an upcoming blog post.</p>



]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>AI Reasoning and Planning</title>
		<link>https://jesseliberty.com/2026/05/18/ai-reasoning-and-planning/</link>
		
		<dc:creator><![CDATA[Jesse Liberty]]></dc:creator>
		<pubDate>Mon, 18 May 2026 22:01:19 +0000</pubDate>
				<category><![CDATA[Agents]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[Essentials]]></category>
		<category><![CDATA[Agentics]]></category>
		<category><![CDATA[Chain-of-thought]]></category>
		<guid isPermaLink="false">https://jesseliberty.com/?p=13289</guid>

					<description><![CDATA[Until very recently, it was observed that LLMs had a very hard time with complex problems. Context was lost, memory of previous steps was distorted, and so forth. This led to unreliable results (hallucinations) and, consequently, to a lack of &#8230; <a href="https://jesseliberty.com/2026/05/18/ai-reasoning-and-planning/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[
<p>Until very recently, it was observed that LLMs had a very hard time with complex problems. Context was lost, memory of previous steps was distorted, and so forth. This led to unreliable results (hallucinations) and, consequently, to a lack of trust in the technology.</p>



<figure class="wp-block-image size-full"><img decoding="async" width="197" height="275" src="https://jesseliberty.com/wp-content/uploads/2026/05/robot-404.jpg" alt="" class="wp-image-13290" srcset="https://jesseliberty.com/wp-content/uploads/2026/05/robot-404.jpg 197w, https://jesseliberty.com/wp-content/uploads/2026/05/robot-404-107x150.jpg 107w" sizes="(max-width: 197px) 100vw, 197px" /></figure>



<p>Recent research has shown that LLMs are, in fact, quite good at reasoning and planning if the problem is broken into a series of steps as a result of the right prompts. This reasoning and planning greatly improves the accuracy of the LLM’s output.</p>



<span id="more-13289"></span>



<h2 class="wp-block-heading">Chain of Thought</h2>



<p>One of the great breakthroughs in the field of LLMs was the discovery that appending &#8220;think step by step&#8221; to the prompt had a profound effect on the accuracy of the LLM&#8217;s output. Prompting the LLM in this way forces the model to decompose the problem and to generate an &#8220;inner monologue&#8221; of the steps it is taking to solve the problem.</p>



<p>This has two significant advantages: </p>



<ul class="wp-block-list">
<li>It makes the LLM&#8217;s reasoning process more transparent.</li>



<li>It allows the model to check its own work as it goes.</li>
</ul>



<p>For example, we might ask, &#8220;If a plane crashes on the border of the north field and the south field, where will the survivors be buried?&#8221;</p>



<p>Without the reasoning prompt, it is entirely possible that the LLM would pick one of the fields at random. However, if we tell it to think step-by-step, it will examine the parts of the question and realize that <em>survivors</em> are not buried at all.</p>



<p><em>Note that today&#8217;s LLMs have a degree of Chain-of-Thought built into them and won&#8217;t get this wrong.</em></p>



<p>Three key factors explain why chain-of-thought reasoning works:</p>



<ul class="wp-block-list">
<li>The problem is decomposed into smaller intermediate steps.</li>



<li>The model can keep track of its work and remember intermediate results.</li>



<li>Typically, more tokens are allocated to the reasoning, and thus the model can &#8220;think&#8221; longer.</li>
</ul>



<h3 class="wp-block-heading">Debugging</h3>



<p>Because we&#8217;ve asked the LLM to think step-by-step, it can tell us each step in its reasoning, and we can examine those steps to see where the LLM went off the rails. This takes a process that might otherwise be opaque and makes it transparent, greatly enhancing the debugging process.</p>
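


<p>As a rough sketch of how this might look in code, the helper below appends a step-by-step instruction to a question, and a second helper splits a numbered response into steps you can inspect. The function names and the sample response text are invented for illustration; a real response would come from your LLM.</p>



<pre class="wp-block-code"><code>import re

def with_cot(question):
    """Append a step-by-step instruction and ask for numbered steps."""
    return (question.strip()
            + "\n\nThink step by step. Number each step, then give the final "
              "answer on a line starting with 'Answer:'.")

def split_steps(response):
    """Return (steps, answer) parsed from a numbered response."""
    steps = [text.strip() for _, text in
             re.findall(r"^\s*(\d+)[.)]\s*(.+)$", response, flags=re.MULTILINE)]
    match = re.search(r"^Answer:\s*(.+)$", response, flags=re.MULTILINE)
    return steps, (match.group(1).strip() if match else None)

# Invented response text, used only to exercise the parser.
sample = """1. The question asks where survivors are buried.
2. Survivors are alive, so they are not buried.
Answer: Survivors are not buried."""

steps, answer = split_steps(sample)
for number, step in enumerate(steps, start=1):
    print(number, step)
print("Final:", answer)</code></pre>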



<h2 class="wp-block-heading">Next up: ReAct&#8230;</h2>



]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>PEAS for Agent AI</title>
		<link>https://jesseliberty.com/2026/05/11/peas-for-agent-ai/</link>
		
		<dc:creator><![CDATA[Jesse Liberty]]></dc:creator>
		<pubDate>Mon, 11 May 2026 11:52:00 +0000</pubDate>
				<category><![CDATA[Agents]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Mini-Tutorial]]></category>
		<category><![CDATA[PEAS]]></category>
		<guid isPermaLink="false">https://jesseliberty.com/?p=13283</guid>

					<description><![CDATA[A classic AI framework to define an agent&#8217;s task environment is PEAS. It stands for: Performance Performance defines success for our agent (the objective and measurable criteria for evaluating the agent&#8217;s behavior). A good performance measure will evaluate the state &#8230; <a href="https://jesseliberty.com/2026/05/11/peas-for-agent-ai/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[
<p>A classic AI framework to define an agent&#8217;s task environment is PEAS. It stands for:<br /></p>



<ul class="wp-block-list">
<li><strong>P</strong>erformance </li>



<li><strong>E</strong>nvironment</li>



<li><strong>A</strong>ctuators</li>



<li><strong>S</strong>ensors</li>
</ul>



<span id="more-13283"></span>



<h2 class="wp-block-heading">Performance</h2>



<p>Performance defines success for our agent (the objective and measurable criteria for evaluating the agent&#8217;s behavior). A good performance measure will evaluate the state of the environment, not the agent&#8217;s internal state.</p>



<p>Designing for performance is typically the hardest part of designing an agent. Thinking deeply about what you want to accomplish, <em>before you start coding</em>, is the key to creating a successful agent.</p>



<p>For example, a naïve performance measure for an automated vehicle might be &#8220;Get me to my destination.&#8221; A more robust performance measure might include:</p>



<ul class="wp-block-list">
<li>Speed </li>



<li>Comfort</li>



<li>Fuel consumption</li>



<li>Following traffic laws</li>



<li>Safety</li>
</ul>



<p>Some of these conflict (e.g., speed vs. safety).</p>



<p>To ensure that performance matches your goals, you will need to trade off competing values. You might end up with something like this:</p>



<ul class="wp-block-list">
<li>Value safety over speed</li>



<li>Value speed over comfort</li>



<li>Value speed over fuel consumption</li>



<li>Value comfort over fuel consumption</li>



<li>Value traffic laws over comfort or fuel consumption</li>
</ul>



<p>Even this is a bit simplistic. Another approach is to give weights to the various factors and see how they balance out. As you can see, deep thinking is required to get this right.</p>
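


<p>Here is a minimal sketch of the weighted approach. The weights and the scores (each on a scale from 0 to 1) are invented for illustration; choosing real values is exactly the hard design work described above.</p>



<pre class="wp-block-code"><code># Hypothetical weights (they sum to 1.0); every number here is invented.
WEIGHTS = {"safety": 0.4, "traffic_laws": 0.25, "speed": 0.15,
           "comfort": 0.1, "fuel": 0.1}

def performance(scores, weights=WEIGHTS):
    """Weighted sum of per-factor scores, each score between 0 and 1."""
    return sum(weights[name] * scores[name] for name in weights)

# Two imaginary driving styles scored on each factor.
cautious = {"safety": 0.9, "traffic_laws": 1.0, "speed": 0.5,
            "comfort": 0.8, "fuel": 0.7}
aggressive = {"safety": 0.5, "traffic_laws": 0.6, "speed": 1.0,
              "comfort": 0.4, "fuel": 0.4}

print(round(performance(cautious), 3), round(performance(aggressive), 3))</code></pre>



<p>Changing a single weight and re-running the comparison is a quick way to see how sensitive the ranking is to your priorities.</p>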



<h2 class="wp-block-heading">Environment</h2>



<p>This refers to the &#8220;world&#8221; the agent lives in—that is, everything external to the agent. In the case above the roads, streets, traffic lights, pedestrians, etc., constitute the environment.</p>



<h2 class="wp-block-heading">Actuators</h2>



<p>These are the things that change the environment. In our self-driving car these are the steering wheel, the brakes, and the accelerator. In a robot these might be legs, arms, and fingers. And, most relevant to most of us, in code these are methods that call APIs, functions that change values, etc.</p>



<h2 class="wp-block-heading">Sensors</h2>



<p>Sensors are how the agent perceives its environment. Like humans, the only way the agent can know about the world it moves in is through its sensors. For our self-driving car this might include:</p>



<ul class="wp-block-list">
<li>GPS</li>



<li>LiDAR</li>



<li>Cameras</li>



<li>Tire pressure sensors</li>
</ul>
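


<p>One way to keep the four parts in front of you while designing is to write them down as data. The sketch below records the self-driving car example in a small Python dataclass; the class name and field layout are my own assumptions rather than a standard API.</p>



<pre class="wp-block-code"><code>from dataclasses import dataclass, field

@dataclass
class PEAS:
    """A plain record of an agent's task environment."""
    performance: list = field(default_factory=list)
    environment: list = field(default_factory=list)
    actuators: list = field(default_factory=list)
    sensors: list = field(default_factory=list)

car = PEAS(
    performance=["safety", "following traffic laws", "speed",
                 "comfort", "fuel consumption"],
    environment=["roads", "streets", "traffic lights", "pedestrians"],
    actuators=["steering wheel", "brakes", "accelerator"],
    sensors=["GPS", "LiDAR", "cameras", "tire pressure sensors"],
)

print(car.sensors)</code></pre>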



]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>The R in RAG</title>
		<link>https://jesseliberty.com/2026/04/26/the-r-in-rag/</link>
		
		<dc:creator><![CDATA[Jesse Liberty]]></dc:creator>
		<pubDate>Sun, 26 Apr 2026 10:51:43 +0000</pubDate>
				<category><![CDATA[AI]]></category>
		<category><![CDATA[Essentials]]></category>
		<category><![CDATA[RAG]]></category>
		<guid isPermaLink="false">https://jesseliberty.com/?p=13278</guid>

					<description><![CDATA[In my previous post we looked at saving to the vector store. In this short post we&#8217;ll look at retrieving that information. The simple search is a good starting point and depends on writing a good prompt, but we can &#8230; <a href="https://jesseliberty.com/2026/04/26/the-r-in-rag/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[
<p>In my <a href="https://jesseliberty.com/2026/04/25/deeper-into-rag/">previous post</a> we looked at saving to the vector store. In this short post we&#8217;ll look at retrieving that information.</p>



<p>The simple search is a good starting point and depends on writing a good prompt, but we can do better.</p>



<figure class="wp-block-image size-full is-resized"><img loading="lazy" decoding="async" width="395" height="252" src="https://jesseliberty.com/wp-content/uploads/2026/04/treasure-map.jpg" alt="" class="wp-image-13279" style="width:354px;height:auto" srcset="https://jesseliberty.com/wp-content/uploads/2026/04/treasure-map.jpg 395w, https://jesseliberty.com/wp-content/uploads/2026/04/treasure-map-300x191.jpg 300w, https://jesseliberty.com/wp-content/uploads/2026/04/treasure-map-150x96.jpg 150w" sizes="auto, (max-width: 395px) 100vw, 395px" /></figure>



<ol class="wp-block-list">
<li>One problem in searching is that we often retrieve redundant data. For example, the basic search might retrieve ten very similar chunks. <strong>Maximal Marginal Relevance</strong> (MMR) first fetches a set of candidates that are relevant to the query. It then builds the result set one document at a time, each time choosing the candidate that is still relevant but as different as possible from the documents already chosen. As you might expect, this gives you a diverse set of relevant documents.</li>
</ol>
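


<p>The idea behind Maximal Marginal Relevance can be sketched in plain Python. This is a simplified illustration, not the code any particular library uses: the two-dimensional vectors are invented, and lam is an assumed name for the knob that trades relevance against diversity.</p>



<pre class="wp-block-code"><code>import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def mmr(query, candidates, k=2, lam=0.5):
    """Pick k candidate indexes, trading relevance to the query against
    similarity to what has already been picked (lam=1.0 is pure relevance)."""
    selected = []
    remaining = list(range(len(candidates)))

    def score(i):
        relevance = cosine(query, candidates[i])
        redundancy = max((cosine(candidates[i], candidates[j]) for j in selected),
                         default=0.0)
        return lam * relevance - (1 - lam) * redundancy

    while remaining and len(selected) != k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy 2-D "embeddings", invented for illustration. The first two are near-duplicates.
query = [1.0, 0.0]
chunks = [[0.9, 0.1], [0.89, 0.11], [0.7, -0.7]]

picked = mmr(query, chunks, k=2, lam=0.5)
print(picked)</code></pre>



<p>With lam set to 1.0 the function falls back to plain relevance ranking, so the near-duplicate is returned instead of the more distinct chunk.</p>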



<span id="more-13278"></span>



<p>2. You will remember that we stored metadata with our documents. This gives us a very powerful tool in searching. We can search for documents based on the metadata, and then within those results, we can search semantically.</p>



<p>For example, we might say, &#8220;Give me documents that were created in the past month,&#8221; and then from that reduced set we can ask for those that provide information on sales. Because the semantic search runs over a smaller set, this is typically faster and often more accurate than searching the whole collection and, of course, it can be combined with search methods that reduce redundancy.</p>
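


<p>A small sketch of this two-stage search, written in plain Python so it runs without a vector database. The documents, dates, and two-dimensional vectors are invented; in a real system the metadata filter would be pushed down to the vector store.</p>



<pre class="wp-block-code"><code>from datetime import date

# Invented documents: metadata plus a toy, unit-length embedding for each.
DOCS = [
    {"text": "Q2 sales summary", "created": date(2026, 4, 20), "vec": [1.0, 0.0]},
    {"text": "Old sales report", "created": date(2025, 1, 5), "vec": [1.0, 0.0]},
    {"text": "Office party plan", "created": date(2026, 4, 22), "vec": [0.0, 1.0]},
]

def dot(a, b):
    """Dot product; equals cosine similarity for unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def filtered_search(docs, query_vec, today, max_age_days=30, k=1):
    # Step 1: cheap metadata filter (created within the last max_age_days days).
    recent = [d for d in docs
              if (today - d["created"]).days in range(max_age_days + 1)]
    # Step 2: semantic ranking, but only over the reduced set.
    ranked = sorted(recent, key=lambda d: dot(query_vec, d["vec"]), reverse=True)
    return [d["text"] for d in ranked[:k]]

hits = filtered_search(DOCS, query_vec=[1.0, 0.0], today=date(2026, 4, 26))
print(hits)</code></pre>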



<p>3. Let the LLM improve your search. The LLM can rewrite your prompt to be more effective, which will get you a better set of results.</p>



<p>4. Interestingly, the order of returned documents matters. After you get your initial results, reorder them so that the most important documents sit at the very beginning or the very end (the middle tends to get lost!). You can then search this subset for your most relevant answers.</p>
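


<p>One simple way to do that reordering is sketched below. It assumes the input list is already sorted from most to least relevant; the function name is my own, and this is only one of several reasonable schemes.</p>



<pre class="wp-block-code"><code>def reorder_for_context(ranked):
    """ranked is most-relevant first. Alternate items between the front and
    the back so the strongest land at the edges and the weakest in the middle."""
    front, back = [], []
    for position, doc in enumerate(ranked):
        if position % 2 == 0:
            front.append(doc)
        else:
            back.append(doc)
    return front + back[::-1]

# d1 is the most relevant document, d5 the least.
ordered = reorder_for_context(["d1", "d2", "d3", "d4", "d5"])
print(ordered)</code></pre>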



]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Deeper into RAG</title>
		<link>https://jesseliberty.com/2026/04/25/deeper-into-rag/</link>
		
		<dc:creator><![CDATA[Jesse Liberty]]></dc:creator>
		<pubDate>Sat, 25 Apr 2026 20:36:22 +0000</pubDate>
				<category><![CDATA[Essentials]]></category>
		<guid isPermaLink="false">https://jesseliberty.com/?p=13273</guid>

					<description><![CDATA[In the previous post we walked through creating a RAG example, line by line. Let&#8217;s take a closer conceptual look at the steps involved in creating a RAG The first step is to load your document. Here you are taking &#8230; <a href="https://jesseliberty.com/2026/04/25/deeper-into-rag/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[
<p>In the <a href="https://jesseliberty.com/2026/04/21/rag-in-detail/">previous post</a> we walked through creating a RAG example, line by line. Let&#8217;s take a closer conceptual look at the steps involved in creating a RAG.</p>



<ul class="wp-block-list">
<li>Load your source documents, being careful to keep the metadata</li>



<li>Split your document into semantically meaningful chunks</li>



<li>Convert your text to vector representations using an embedding model</li>



<li>Store your vector representations in a vector database</li>



<li>Retrieve the data you need</li>
</ul>



<span id="more-13273"></span>



<p>The first step is to load your document. Here you are taking the raw content and its metadata (e.g., creation date) and loading it into your application. Next, it is time to split the text, which is called chunking. It turns out that this is critical to creating a useful and fast RAG. Chunks are the atomic units used for retrieval, and to do this well, you&#8217;ll want to break your large documents into smaller semantically meaningful pieces.</p>



<p>Here the Goldilocks approach is critical. You don&#8217;t want your chunk to be so small that it has no context, nor do you want it to be so big that it covers more than one concept.</p>



<p>There are a few strategies for creating chunks. A simple approach is to split your document by paragraphs. If that is impossible (for example, because the paragraphs are too long), you split by sentences, and failing that, by lines.</p>



<p>If you are lucky enough to have structured data (e.g., headings, hierarchy, outline, etc.), then the splitter can use that structure to create chunks. If you are working with code, you might have the splitter chunk by classes, etc.</p>



<p>The next step is embedding. This is the most conceptually challenging task, though the tools will do the work for you. Conceptually you are creating a multi-dimensional map where the distance between the various chunks corresponds to how similar they are. This enables searching by semantic meaning rather than just keywords. </p>



<figure class="wp-block-image size-full is-resized"><img loading="lazy" decoding="async" width="276" height="235" src="https://jesseliberty.com/wp-content/uploads/2026/04/semantic-vectors.jpg" alt="" class="wp-image-13275" style="width:248px;height:auto" srcset="https://jesseliberty.com/wp-content/uploads/2026/04/semantic-vectors.jpg 276w, https://jesseliberty.com/wp-content/uploads/2026/04/semantic-vectors-150x128.jpg 150w" sizes="auto, (max-width: 276px) 100vw, 276px" /></figure>



<p><em>Notice that &#8220;speaking&#8221; and &#8220;speech&#8221; are close to each other, while &#8220;dog&#8221; and &#8220;keyboard&#8221; are further apart.</em></p>
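


<p>To see what &#8220;close&#8221; means numerically, here is a small sketch that compares vectors with cosine similarity, a common measure for embeddings. The three-dimensional vectors are invented for illustration; real embedding models produce hundreds or thousands of dimensions.</p>



<pre class="wp-block-code"><code>import math

# Invented 3-D vectors standing in for real embeddings.
VECTORS = {
    "speaking": [0.9, 0.1, 0.0],
    "speech": [0.85, 0.15, 0.05],
    "dog": [0.1, 0.9, 0.1],
    "keyboard": [0.0, 0.1, 0.95],
}

def cosine_similarity(a, b):
    """1.0 means same direction; values near 0 mean unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

close = cosine_similarity(VECTORS["speaking"], VECTORS["speech"])
far = cosine_similarity(VECTORS["dog"], VECTORS["keyboard"])
print(round(close, 3), round(far, 3))</code></pre>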



<p>There are a number of embedding tools, some proprietary and some open source. The open-source models may be harder to set up, but they are free and can run locally.</p>



<p>Finally, we need to store the vectors in a vector database. The vectors are indexed, and searching typically uses Approximate Nearest Neighbor, taking advantage of the multi-dimensional model we created above.</p>



<p>That covers the ingestion workflow. Next is retrieval, which I&#8217;ll leave for the next blog post.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>RAG In Detail</title>
		<link>https://jesseliberty.com/2026/04/21/rag-in-detail/</link>
		
		<dc:creator><![CDATA[Jesse Liberty]]></dc:creator>
		<pubDate>Tue, 21 Apr 2026 18:48:10 +0000</pubDate>
				<category><![CDATA[AI]]></category>
		<category><![CDATA[Essentials]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[RAG]]></category>
		<guid isPermaLink="false">https://jesseliberty.com/?p=13255</guid>

					<description><![CDATA[In my previous post I walked through a RAG example but glossed over the details. In this post I&#8217;ll back up and walk through the program line by line. The key steps in RAG are Let&#8217;s walk through the steps &#8230; <a href="https://jesseliberty.com/2026/04/21/rag-in-detail/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[
<p>In my <a href="https://jesseliberty.com/2026/04/19/rag-a-quick-example/">previous post </a>I walked through a RAG example but glossed over the details. In this post I&#8217;ll back up and walk through the program line by line.</p>



<figure class="wp-block-image size-full is-resized"><img loading="lazy" decoding="async" width="486" height="354" src="https://jesseliberty.com/wp-content/uploads/2026/04/325A41F5-C76D-4EE5-8C63-BD578176F1A7_4_5005_c.jpeg" alt="" class="wp-image-13269" style="width:233px;height:auto" srcset="https://jesseliberty.com/wp-content/uploads/2026/04/325A41F5-C76D-4EE5-8C63-BD578176F1A7_4_5005_c.jpeg 486w, https://jesseliberty.com/wp-content/uploads/2026/04/325A41F5-C76D-4EE5-8C63-BD578176F1A7_4_5005_c-300x219.jpeg 300w, https://jesseliberty.com/wp-content/uploads/2026/04/325A41F5-C76D-4EE5-8C63-BD578176F1A7_4_5005_c-150x109.jpeg 150w" sizes="auto, (max-width: 486px) 100vw, 486px" /></figure>



<p>The key steps in RAG are:</p>



<ul class="wp-block-list">
<li>Load the data</li>



<li>Split the text into smaller chunks to fit within context limits</li>



<li>Create a Document object</li>



<li>Embed the document in vectors that represent semantic meaning</li>



<li>Store the document—typically in vector stores. These are databases designed to store embeddings and provide fast semantic retrieval</li>



<li>Invoke a retriever to query the back end to return the most relevant Document object</li>



<li>Create a prompt for the LLM</li>
</ul>



<span id="more-13255"></span>



<p>Let&#8217;s walk through the steps shown in the previous post with these in mind.</p>



<h2 class="wp-block-heading">Loading the document</h2>



<p>First, we need to identify and load the documents. In our case, this consists only of a single text file with an excerpt from Romeo and Juliet. In most real-world scenarios you&#8217;ll have multiple data sources.</p>



<pre class="wp-block-code"><code>from langchain_community.document_loaders import TextLoader
loader = TextLoader("RomeoAndJuliet.txt", encoding="utf-8")
docs = loader.load()</code></pre>



<p>Notice that we are using the langchain_community document loader to do the text loading. LangChain will be the principal framework we&#8217;ll be working with, and it can load many types of data.</p>



<h2 class="wp-block-heading">Splitting the text</h2>



<p>We saw how to chunk that data in the previous post. We begin by using a text splitter to break large text into overlapping chunks using token-based splitting (not characters). In our case, we will set each chunk to about 1,000 tokens with 200 tokens of overlap. The overlap ensures that nothing is lost.</p>



<pre class="wp-block-code"><code>from langchain.text_splitter import RecursiveCharacterTextSplitter 
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    encoding_name='cl100k_base',
    chunk_size=1000,
    chunk_overlap=200
)
chunks = loader.load_and_split(text_splitter)</code></pre>



<p>The splitter measures chunk size with tiktoken, OpenAI&#8217;s tokenizer library. cl100k_base is the encoding used by OpenAI models such as gpt-4, gpt-3.5-turbo and text-embedding-ada-002. The 200-token overlap prevents losing meaning at the boundaries of the chunks and helps embeddings preserve context.</p>



<p>We use a recursive text splitter because it splits text intelligently, splitting by paragraphs when possible, then by lines if the paragraphs are too big, then by words and finally by characters.</p>
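


<p>To see what overlap does, here is a deliberately simplified sketch. It is not how RecursiveCharacterTextSplitter works internally: whitespace-separated words stand in for tokens, and the windows are a fixed size rather than following paragraph boundaries. The function name is invented.</p>



<pre class="wp-block-code"><code>def sliding_chunks(tokens, chunk_size, overlap):
    """Fixed-size windows; each chunk repeats the last `overlap` tokens
    of the chunk before it."""
    step = chunk_size - overlap
    last_start = max(len(tokens) - overlap, 1)
    return [tokens[i:i + chunk_size] for i in range(0, last_start, step)]

words = "But soft what light through yonder window breaks".split()
chunks = sliding_chunks(words, chunk_size=4, overlap=2)
for chunk in chunks:
    print(" ".join(chunk))</code></pre>



<p>Each chunk starts with the final two words of its predecessor, so a phrase that straddles a boundary still appears whole in at least one chunk.</p>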



<h2 class="wp-block-heading">Embedding in a vector store</h2>



<pre class="wp-block-code"><code>from langchain_openai import OpenAIEmbeddings  # assumes the langchain-openai package
embedding_model = OpenAIEmbeddings(model="text-embedding-ada-002")</code></pre>



<p>The embedding_model knows how to take text, send it to OpenAI, and get back a vector embedding (a list of numbers). Each chunk you pass into Chroma (see below) will be embedded using this model.</p>



<p>Our next task is to build the vector store, using the chunks we created above.</p>



<pre class="wp-block-code"><code>from langchain_community.vectorstores import Chroma  # assumed import path
vectorstore = Chroma.from_documents(
    chunks,
    embedding_model,
    collection_name="RomeoAndJuliet"
)</code></pre>



<p>Here we embed each of the chunks. Chroma passes the chunk texts to embedding_model.embed_documents, which produces one vector per chunk.</p>



<p>For each chunk, Chroma will store the vector embedding, the original text and the metadata such as the source file, etc. This is used for similarity search (see below).</p>



<p>The final value passed in is the collection_name. The vector store is saved under that name.</p>



<h2 class="wp-block-heading">Getting the retriever</h2>



<p>As noted in the previous post, the next step is to create the retriever, which we do from the vector store, telling it that we want the search_type to be similarity and telling it how many of the most relevant chunks to return.</p>



<p>You get back a list of LangChain Document objects, each containing a text chunk and its metadata.</p>
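


<p>Conceptually, a similarity retriever ranks the stored chunks against the query and returns the top few. The sketch below shows that idea in plain Python with an invented in-memory store and toy two-dimensional vectors; it is an illustration of the concept, not the code LangChain or Chroma actually runs.</p>



<pre class="wp-block-code"><code># Invented store: text, metadata, and a toy unit-length embedding per chunk.
STORE = [
    {"text": "But soft, what light through yonder window breaks?",
     "metadata": {"source": "RomeoAndJuliet.txt"}, "vec": [1.0, 0.0]},
    {"text": "A plague on both your houses!",
     "metadata": {"source": "RomeoAndJuliet.txt"}, "vec": [0.0, 1.0]},
    {"text": "It is the east, and Juliet is the sun.",
     "metadata": {"source": "RomeoAndJuliet.txt"}, "vec": [0.8, 0.6]},
]

def retrieve(query_vec, store, k=2):
    """Return the k entries whose embeddings are most similar to the query."""
    def similarity(item):
        # For unit-length vectors the dot product equals cosine similarity.
        return sum(q * v for q, v in zip(query_vec, item["vec"]))
    return sorted(store, key=similarity, reverse=True)[:k]

results = retrieve([1.0, 0.0], STORE, k=2)
for item in results:
    print(item["metadata"]["source"], "|", item["text"])</code></pre>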



<h2 class="wp-block-heading">Instantiating the LLM</h2>



<p>The next section in the previous post is self-explanatory until we instantiate the LLM. </p>



<pre class="wp-block-code"><code>llm = ChatOpenAI(
    model="gpt-4o-mini",                      
    temperature=0,                
    max_tokens=10000,                 
    top_p=0.95,
    frequency_penalty=1.2,
    stop_sequences=&#91;'INST']
)</code></pre>



<p>Here we are using the OpenAI gpt-4o-mini LLM &#8211; a popular and inexpensive LLM for RAG. </p>



<p>We set the temperature, which is a value that determines randomness in the answer. A value of 0 makes the output as deterministic and repeatable as the model allows.</p>
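


<p>To build some intuition for temperature, the sketch below applies it the way it is commonly described: the model&#8217;s raw scores (logits) are divided by the temperature before being turned into probabilities. The logit values are invented, and a real API treats a temperature of exactly 0 as a special case rather than dividing by zero.</p>



<pre class="wp-block-code"><code>import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)                       # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]                     # invented scores for three candidate tokens
warm = softmax_with_temperature(logits, 1.0)
cold = softmax_with_temperature(logits, 0.1)
print([round(p, 3) for p in warm])
print([round(p, 3) for p in cold])</code></pre>



<p>At the lower temperature almost all of the probability lands on the top-scoring token, which is why low settings give repeatable answers.</p>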



<p>max_tokens sets the upper bounds on how long the model&#8217;s response can be.</p>



<p>top_p=0.95 is tricky. This says that the model should sample only from the most likely tokens whose combined probability adds up to 95%. However, with temperature set to 0, this has little or no effect. If you tinker with temperature, however, this can be useful.</p>



<p>frequency_penalty penalizes tokens according to how often they have already appeared in the output. We&#8217;re using 1.2, which is a strong penalty that encourages concise, non-repetitive answers.</p>



<p>stop_sequences tells the model to stop generating when it outputs INST. This just prevents the model from &#8220;leaking&#8221; into the next instruction.</p>



<p>That&#8217;s it! Together with the previous post, you are now fully equipped to implement your RAG. Enjoy!</p>
]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>