<?xml version="1.0" encoding="UTF-8" standalone="no"?><rss xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" version="2.0">

<channel>
	<title>Jesse Liberty - Silverlight Geek</title>
	<atom:link href="https://jesseliberty.com/feed/" rel="self" type="application/rss+xml"/>
	<link>https://jesseliberty.com</link>
	<description>More Signal - Less Noise</description>
	<lastBuildDate>Sun, 26 Apr 2026 13:35:31 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>
	<item>
		<title>The R in RAG</title>
		<link>https://jesseliberty.com/2026/04/26/the-r-in-rag/</link>
		
		<dc:creator><![CDATA[Jesse Liberty]]></dc:creator>
		<pubDate>Sun, 26 Apr 2026 10:51:43 +0000</pubDate>
				<category><![CDATA[AI]]></category>
		<category><![CDATA[Essentials]]></category>
		<category><![CDATA[RAG]]></category>
		<guid isPermaLink="false">https://jesseliberty.com/?p=13278</guid>

					<description><![CDATA[In my previous post we looked at saving to the vector store. In this short post we&#8217;ll look at retrieving that information. The simple search is a good starting point and depends on writing a good prompt, but we can &#8230; <a href="https://jesseliberty.com/2026/04/26/the-r-in-rag/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[
<p>In my <a href="https://jesseliberty.com/2026/04/25/deeper-into-rag/">previous post</a> we looked at saving to the vector store. In this short post we&#8217;ll look at retrieving that information.</p>



<p>The simple search is a good starting point and depends on writing a good prompt, but we can do better.</p>



<figure class="wp-block-image size-full is-resized"><img fetchpriority="high" decoding="async" width="395" height="252" src="https://jesseliberty.com/wp-content/uploads/2026/04/treasure-map.jpg" alt="" class="wp-image-13279" style="width:354px;height:auto" srcset="https://jesseliberty.com/wp-content/uploads/2026/04/treasure-map.jpg 395w, https://jesseliberty.com/wp-content/uploads/2026/04/treasure-map-300x191.jpg 300w, https://jesseliberty.com/wp-content/uploads/2026/04/treasure-map-150x96.jpg 150w" sizes="(max-width: 395px) 100vw, 395px" /></figure>



<ol class="wp-block-list">
<li>One problem in searching is that we often retrieve redundant data. For example, the basic search might retrieve ten very similar chunks. <strong>Maximal Marginal Relevance</strong> (MMR) addresses this: it first retrieves a candidate set of relevant documents, then iteratively selects from that set, preferring documents that are relevant to the query but as different as possible from those already chosen. The result is a diverse set of relevant documents.</li>
</ol>
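<p>To make MMR concrete, here is a minimal, self-contained sketch (not LangChain&#8217;s implementation): the relevance scores and the toy similarity function are made up, standing in for real vector math.</p>

```python
# Toy Maximal Marginal Relevance: trade off relevance against redundancy.
def mmr(candidates, relevance, similarity, k=3, lambda_mult=0.5):
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def score(doc):
            # Penalize a candidate by its closest match among already-picked docs
            max_sim = max((similarity(doc, s) for s in selected), default=0.0)
            return lambda_mult * relevance[doc] - (1 - lambda_mult) * max_sim
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical chunks: two near-duplicates about sales, one distinct chunk.
rel = {"sales-q1": 0.95, "sales-q1-copy": 0.94, "returns-policy": 0.80}
sim = lambda a, b: 0.99 if a.split("-")[0] == b.split("-")[0] else 0.1

print(mmr(rel.keys(), rel, sim, k=2))  # the near-duplicate is skipped
```

<p>With plain similarity search, the near-duplicate would crowd out the distinct document; MMR keeps one copy and moves on to something different.</p>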



<span id="more-13278"></span>



<p>2. You will remember that we stored metadata with our documents. This gives us a very powerful tool in searching. We can search for documents based on the metadata, and then within those results, we can search semantically.</p>



<p>For example, we might say, &#8220;Give me documents that were created in the past month,&#8221; and then from that reduced set we can ask for those that provide information on sales. This is faster and more accurate than other search methods and, of course, can be combined with search methods that reduce redundancy.</p>
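<p>Here is a toy sketch of that two-step search, with invented documents and a simple keyword check standing in for the semantic ranking step (a real system would score with embeddings):</p>

```python
# Metadata filter first, then rank the survivors.
from datetime import date

docs = [
    {"text": "Q1 sales rose 12%", "created": date(2026, 4, 10)},
    {"text": "2019 holiday schedule", "created": date(2019, 12, 1)},
    {"text": "April sales forecast", "created": date(2026, 4, 2)},
]

def search(docs, newer_than, keyword):
    recent = [d for d in docs if d["created"] >= newer_than]   # metadata filter
    return sorted(recent,
                  key=lambda d: keyword in d["text"].lower(),  # then rank
                  reverse=True)

hits = search(docs, date(2026, 4, 1), "sales")
print([d["text"] for d in hits])  # the 2019 document never gets scored
```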



<p>3. Let the LLM improve your search. The LLM can rewrite your prompt to be more effective, which will get you a better set of results.</p>



<p>4. Interestingly, the order of returned documents matters. After you get your initial results, reorder them with the most important documents at the very beginning or very end (the middle tends to get lost!). You can then search this subset for your most relevant answers.</p>
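<p>One simple way to do that reordering, similar in spirit to LangChain&#8217;s LongContextReorder though this is only an illustrative sketch: given documents ranked from most to least relevant, alternate them toward the two ends so the weakest land in the middle.</p>

```python
# "Lost in the middle" mitigation: strongest documents go to the two ends.
def reorder(docs_by_relevance):
    front, back = [], []
    for i, doc in enumerate(docs_by_relevance):
        (front if i % 2 == 0 else back).append(doc)
    return front + back[::-1]

ranked = ["doc1", "doc2", "doc3", "doc4", "doc5"]  # doc1 = most relevant
print(reorder(ranked))  # doc1 leads, doc2 closes, doc5 sits in the middle
```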



]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Deeper into RAG</title>
		<link>https://jesseliberty.com/2026/04/25/deeper-into-rag/</link>
		
		<dc:creator><![CDATA[Jesse Liberty]]></dc:creator>
		<pubDate>Sat, 25 Apr 2026 20:36:22 +0000</pubDate>
				<category><![CDATA[Essentials]]></category>
		<guid isPermaLink="false">https://jesseliberty.com/?p=13273</guid>

					<description><![CDATA[In the previous post we walked through creating a RAG example, line by line. Let&#8217;s take a closer conceptual look at the steps involved in creating a RAG The first step is to load your document. Here you are taking &#8230; <a href="https://jesseliberty.com/2026/04/25/deeper-into-rag/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[
<p>In the <a href="https://jesseliberty.com/2026/04/21/rag-in-detail/">previous post</a> we walked through creating a RAG example, line by line. Let&#8217;s take a closer conceptual look at the steps involved in creating a RAG.</p>



<ul class="wp-block-list">
<li>Load your source documents, being careful to keep the metadata</li>



<li>Split your document into semantically meaningful chunks</li>



<li>Convert your text to vector representations using an embedding model</li>



<li>Store your vector representations in a vector database</li>



<li>Retrieve the data you need</li>
</ul>



<span id="more-13273"></span>



<p>The first step is to load your document. Here you are taking the raw content and its metadata (e.g., creation date) and loading it into your application. Next, it is time to split the text, which is called chunking. It turns out that this is critical to creating a useful and fast RAG. Chunks are the atomic units used for retrieval, and to do this well, you&#8217;ll want to break your large documents into smaller, semantically meaningful pieces.</p>



<p>Here the Goldilocks approach is critical. You don&#8217;t want your chunk to be so small that it has no context, nor do you want it to be so big that it covers more than one concept.</p>



<p>There are a few strategies for creating chunks. A simple example is to split your document by paragraphs. If that is impossible (the paragraphs are too long, for example), you split by sentences, and failing that, by lines.</p>
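<p>A toy version of that fallback strategy, with an arbitrary length threshold (real splitters count tokens and handle many more cases):</p>

```python
# Toy fallback splitter: paragraphs first; oversized paragraphs fall back
# to sentence splitting. The 80-character threshold is arbitrary.
def split(text, max_len=80):
    chunks = []
    for para in text.split("\n\n"):
        if len(para) <= max_len:
            chunks.append(para)
        else:
            chunks.extend(s.strip() + "." for s in para.split(".") if s.strip())
    return chunks

doc = "Short paragraph.\n\n" + "A long paragraph. " * 6
print(len(split(doc)))  # 1 short paragraph + 6 sentences
```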



<p>If you are lucky enough to have structured data (e.g., headings, hierarchy, outline, etc.), then the splitter can use that structure to create chunks. If you are working with code, you might have the splitter chunk by classes, etc.</p>



<p>The next step is embedding. This is the most conceptually challenging task, though the tools will do the work for you. Conceptually you are creating a multi-dimensional map where the distance between the various chunks corresponds to how similar they are. This enables searching by semantic meaning rather than just keywords. </p>



<figure class="wp-block-image size-full is-resized"><img decoding="async" width="276" height="235" src="https://jesseliberty.com/wp-content/uploads/2026/04/semantic-vectors.jpg" alt="" class="wp-image-13275" style="width:248px;height:auto" srcset="https://jesseliberty.com/wp-content/uploads/2026/04/semantic-vectors.jpg 276w, https://jesseliberty.com/wp-content/uploads/2026/04/semantic-vectors-150x128.jpg 150w" sizes="(max-width: 276px) 100vw, 276px" /></figure>



<p><em>Notice that &#8220;speaking&#8221; and &#8220;speech&#8221; are close to each other, while &#8220;dog&#8221; and &#8220;keyboard&#8221; are further apart.</em></p>
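<p>A hand-made sketch of that idea: tiny 2-D vectors stand in for real embeddings (which have hundreds of dimensions), compared with cosine similarity.</p>

```python
# Toy embedding map: similar words get nearby vectors; cosine similarity
# measures the angle between them (1.0 = identical direction).
import math

vectors = {
    "speaking": (0.9, 0.1),
    "speech":   (0.85, 0.2),
    "dog":      (0.1, 0.9),
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

print(cosine(vectors["speaking"], vectors["speech"]) >
      cosine(vectors["speaking"], vectors["dog"]))  # True: closer in meaning
```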



<p>There are a number of embedding tools, some proprietary and some open source. The open-source models may be harder to set up, but they are free and can run locally.</p>



<p>Finally, we need to store the vectors in a vector database. The vectors are indexed, and searching typically uses Approximate Nearest Neighbor, taking advantage of the multi-dimensional model we created above.</p>
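<p>For intuition, here is the exact (brute-force) nearest-neighbor search that ANN indexes approximate; the store holds toy 2-D vectors rather than real embeddings.</p>

```python
# Exact k-nearest-neighbor search over a toy store. ANN indexes (HNSW,
# IVF, etc.) approximate exactly this ranking to stay fast at scale.
import math

store = {
    "chunk-a": (0.1, 0.9),
    "chunk-b": (0.8, 0.2),
    "chunk-c": (0.75, 0.3),
}

def nearest(query, k=2):
    return sorted(store, key=lambda name: math.dist(query, store[name]))[:k]

print(nearest((0.8, 0.25)))  # the two chunks closest to the query vector
```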



<p>That covers the ingestion workflow. Next is retrieval, which I&#8217;ll leave for the next blog post.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>RAG In Detail</title>
		<link>https://jesseliberty.com/2026/04/21/rag-in-detail/</link>
		
		<dc:creator><![CDATA[Jesse Liberty]]></dc:creator>
		<pubDate>Tue, 21 Apr 2026 18:48:10 +0000</pubDate>
				<category><![CDATA[AI]]></category>
		<category><![CDATA[Essentials]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[RAG]]></category>
		<guid isPermaLink="false">https://jesseliberty.com/?p=13255</guid>

					<description><![CDATA[In my previous post I walked through a RAG example but glossed over the details. In this post I&#8217;ll back up and walk through the program line by line. The key steps in RAG are Let&#8217;s walk through the steps &#8230; <a href="https://jesseliberty.com/2026/04/21/rag-in-detail/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[
<p>In my <a href="https://jesseliberty.com/2026/04/19/rag-a-quick-example/">previous post </a>I walked through a RAG example but glossed over the details. In this post I&#8217;ll back up and walk through the program line by line.</p>



<figure class="wp-block-image size-full is-resized"><img decoding="async" width="486" height="354" src="https://jesseliberty.com/wp-content/uploads/2026/04/325A41F5-C76D-4EE5-8C63-BD578176F1A7_4_5005_c.jpeg" alt="" class="wp-image-13269" style="width:233px;height:auto" srcset="https://jesseliberty.com/wp-content/uploads/2026/04/325A41F5-C76D-4EE5-8C63-BD578176F1A7_4_5005_c.jpeg 486w, https://jesseliberty.com/wp-content/uploads/2026/04/325A41F5-C76D-4EE5-8C63-BD578176F1A7_4_5005_c-300x219.jpeg 300w, https://jesseliberty.com/wp-content/uploads/2026/04/325A41F5-C76D-4EE5-8C63-BD578176F1A7_4_5005_c-150x109.jpeg 150w" sizes="(max-width: 486px) 100vw, 486px" /></figure>



<p>The key steps in RAG are</p>



<ul class="wp-block-list">
<li>Load the data</li>



<li>Split the text into smaller chunks to fit within context limits</li>



<li>Create a Document object</li>



<li>Embed the document in vectors that represent semantic meaning</li>



<li>Store the document—typically in vector stores. These are databases designed to store embeddings and provide fast semantic retrieval</li>



<li>Invoke a retriever to query the back end to return the most relevant Document object</li>



<li>Create a prompt for the LLM</li>
</ul>



<span id="more-13255"></span>



<p>Let&#8217;s walk through the steps shown in the previous post with these in mind.</p>



<h2 class="wp-block-heading">Loading the document</h2>



<p>First, we need to identify and load the documents. In our case, this consists only of a single text file with an excerpt from Romeo and Juliet. In most real-world scenarios you&#8217;ll have multiple data sources.</p>



<pre class="wp-block-code"><code>from langchain_community.document_loaders import TextLoader
loader = TextLoader("RomeoAndJuliet.txt", encoding="utf-8")
docs = loader.load()</code></pre>



<p>Notice that we are using the langchain_community document loader to do the text loading. LangChain will be the principal framework we&#8217;ll be working with, and it can load many types of data.</p>



<h2 class="wp-block-heading">Splitting the text</h2>



<p>We saw how to chunk that data in the previous post. We begin by using a text splitter to break large text into overlapping chunks using token-based splitting (not characters). In our case, we will set each chunk to about 1,000 tokens with 200 tokens of overlap. The overlap ensures that nothing is lost.</p>



<pre class="wp-block-code"><code>from langchain.text_splitter import RecursiveCharacterTextSplitter 
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    encoding_name='cl100k_base',
    chunk_size=1000,
    chunk_overlap=200
)
chunks = loader.load_and_split(text_splitter)</code></pre>



<p>We tokenize with cl100k_base, the encoding used by OpenAI&#8217;s chat models and by text-embedding-ada-002. The 200-token overlap prevents losing meaning at the boundaries of the chunks and helps the embeddings preserve context.</p>



<p>We use a recursive text splitter because it splits text intelligently, splitting by paragraphs when possible, then by sentences if the paragraphs are too big, then by words and finally by characters.</p>
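<p>The overlap arithmetic is worth seeing: with chunk_size=1000 and chunk_overlap=200, each window advances by 800 tokens. A naive sliding-window sketch (the real splitter is smarter, respecting paragraph and sentence boundaries):</p>

```python
# Sliding-window chunking: stride = chunk_size - chunk_overlap.
# "Tokens" here are just integers standing in for real token IDs.
def window_chunks(tokens, chunk_size, chunk_overlap):
    stride = chunk_size - chunk_overlap
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), stride)]

tokens = list(range(2000))  # a 2,000-token document
chunks = window_chunks(tokens, chunk_size=1000, chunk_overlap=200)
print(len(chunks))          # windows start at tokens 0, 800, and 1600
```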



<h2 class="wp-block-heading">Embedding in a vector store</h2>



<pre class="wp-block-code"><code>embedding_model = OpenAIEmbeddings(model="text-embedding-ada-002")</code></pre>



<p>The embedding_model knows how to take text, send it to OpenAI, and get back a vector embedding (a list of numbers). Each chunk you pass into Chroma (see below) will be embedded using this model.</p>



<p>Our next task is to build the vector store, using the chunks we created above.</p>



<pre class="wp-block-code"><code>vectorstore = Chroma.from_documents(
    chunks,
    embedding_model,
    collection_name="RomeoAndJuliet"
)</code></pre>



<p>Here we embed each of the chunks. Under the hood, Chroma calls embedding_model.embed_documents on the chunks, which produces the vectors.</p>



<p>For each chunk, Chroma will store the vector embedding, the original text, and the metadata (such as the source file). These are used for similarity search (see below).</p>



<p>The final value passed in is the collection_name. The vector store is saved under that name.</p>



<h2 class="wp-block-heading">Getting the retriever</h2>



<p>As noted in the previous post, the next step is to create the retriever, which we do from the vector store, telling it that we want the search_type to be similarity and telling it how many of the most relevant chunks to return.</p>



<p>You get back a LangChain Document with the text chunk and the metadata.</p>



<h2 class="wp-block-heading">Instantiating the LLM</h2>



<p>The next section in the previous post is self-explanatory until we instantiate the LLM. </p>



<pre class="wp-block-code"><code>llm = ChatOpenAI(
    model="gpt-4o-mini",                      
    temperature=0,                
    max_tokens=10000,                 
    top_p=0.95,
    frequency_penalty=1.2,
    stop_sequences=&#91;'INST']
)</code></pre>



<p>Here we are using the OpenAI gpt-4o-mini LLM &#8211; a popular and inexpensive LLM for RAG.</p>



<p>We set the temperature, which determines randomness in the answer. A temperature of 0 makes the output essentially deterministic and repeatable.</p>



<p>max_tokens sets the upper bound on how long the model&#8217;s response can be.</p>



<p>top_p=0.95 enables nucleus sampling: the model samples only from the smallest set of tokens whose cumulative probability reaches 95%. With temperature set to 0 this has no practical effect, but it becomes useful if you raise the temperature.</p>
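<p>As a toy illustration of that top-p cutoff (nucleus sampling), with an invented next-token distribution:</p>

```python
# Nucleus (top-p) sampling: keep the smallest set of highest-probability
# tokens whose cumulative probability reaches p; sample only from that set.
def nucleus(probs, p=0.95):
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, total = [], 0.0
    for token, prob in ranked:
        kept.append(token)
        total += prob
        if total >= p:
            break
    return kept

probs = {"the": 0.60, "a": 0.30, "my": 0.06, "zebra": 0.04}  # invented
print(nucleus(probs))  # the unlikely tail ("zebra") is cut
```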



<p>frequency_penalty discourages the model from reusing tokens it has already generated. We&#8217;re using 1.2, a strong penalty that produces concise, non-repetitive answers.</p>
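<p>Conceptually, the penalty lowers a token&#8217;s logit in proportion to how many times it has already appeared in the output. A sketch of the frequency portion of OpenAI&#8217;s documented adjustment, with invented scores:</p>

```python
# Frequency penalty: subtract (count so far x penalty) from each token's
# pre-softmax score, so repeated tokens become progressively less likely.
from collections import Counter

def penalize(logits, generated, frequency_penalty=1.2):
    counts = Counter(generated)
    return {tok: logit - counts[tok] * frequency_penalty
            for tok, logit in logits.items()}

logits = {"very": 2.0, "quite": 1.5}  # invented pre-softmax scores
print(penalize(logits, generated=["very", "very"]))  # "very" is penalized twice
```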



<p>stop_sequences tells the model to stop generating when it outputs INST. This just prevents the model from &#8220;leaking&#8221; into the next instruction.</p>



<p>That&#8217;s it! Together with the previous post, you are now fully equipped to implement your RAG. Enjoy!</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>RAG – A Quick Example</title>
		<link>https://jesseliberty.com/2026/04/19/rag-a-quick-example/</link>
		
		<dc:creator><![CDATA[Jesse Liberty]]></dc:creator>
		<pubDate>Sun, 19 Apr 2026 19:04:05 +0000</pubDate>
				<category><![CDATA[AI]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[RAG]]></category>
		<guid isPermaLink="false">https://jesseliberty.com/?p=13251</guid>

					<description><![CDATA[In the previous blog post, we imported a few Python modules and configured our AI key, using Colab. In this blog post we&#8217;ll use Retrieval-Augmented Generation (RAG) to extend an LLM that we&#8217;ll get from OpenAI. I&#8217;ll use a number &#8230; <a href="https://jesseliberty.com/2026/04/19/rag-a-quick-example/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[
<p>In <a href="https://jesseliberty.com/2026/04/17/creating-our-python-ai-project/#more-13247">the previous blog post</a>, we imported a few Python modules and configured our AI key, using Colab. </p>



<p>In this blog post we&#8217;ll use Retrieval-Augmented Generation (RAG) to extend an LLM that we&#8217;ll get from OpenAI. I&#8217;ll use a number of features from the libraries we imported with only a cursory explanation and will come back to them in upcoming blog posts to examine them in more depth. But I want to get to RAG right away because it is rapidly becoming central to AI and because it is cool.</p>



<p>LLMs are incredibly expensive to create and train, and it isn&#8217;t feasible to train them on everything. Besides that, much data is proprietary. It may be that you want an LLM that handles (to use the canonical case) your HR policies. Clearly no commercial LLM knows about those policies, nor should they. And equally clearly, you&#8217;re not going to train an LLM from scratch. What you want to do is to combine your own corpus of data (HR policy papers, etc.) with an existing LLM, and that is exactly what RAG is for.</p>



<p>In this simple example, we&#8217;re going to take a scene or two from Romeo and Juliet and feed it to <em>gpt-4o-mini</em>, one of many LLMs available for use at minimal cost (we&#8217;ll get into how cost is computed in an upcoming post).</p>



<p>The first thing we&#8217;ll do <a href="https://jesseliberty.com/2026/04/17/creating-our-python-ai-project/#more-13247">after configuring the OPENAI_API_KEY</a> will be to get a TextLoader to import the text file with the scenes from Romeo and Juliet.</p>



<div class="wp-block-file"><a id="wp-block-file--media-753e000b-e51b-4237-b63b-51eccd0a474c" href="https://jesseliberty.com/wp-content/uploads/2026/04/RomeoAndJuliet.txt">RomeoAndJuliet</a><a href="https://jesseliberty.com/wp-content/uploads/2026/04/RomeoAndJuliet.txt" class="wp-block-file__button wp-element-button" download aria-describedby="wp-block-file--media-753e000b-e51b-4237-b63b-51eccd0a474c">Download</a></div>



<span id="more-13251"></span>



<p>To do that, we&#8217;ll use the TextLoader from langchain_community.document_loaders (again, we&#8217;ll examine this and the other referenced modules in upcoming blog posts). We do this in three steps:</p>



<ol class="wp-block-list">
<li>Write the import statement</li>



<li>Point the TextLoader to our file</li>



<li>Load the file</li>
</ol>



<pre class="wp-block-code"><code>from langchain_community.document_loaders import TextLoader
loader = TextLoader("RomeoAndJuliet.txt", encoding="utf-8")
docs = loader.load()</code></pre>



<p>Next, we need to divide the text into chunks that the LLM can work with. We do that with a RecursiveCharacterTextSplitter from langchain. We&#8217;ll use the cl100k_base encoder and set the chunk_size to 1000 (that is, 1,000 of those mysterious tokens that words are divided into). To ensure that nothing is dropped, we set chunk_overlap to 200.</p>



<pre class="wp-block-code"><code>from langchain.text_splitter import RecursiveCharacterTextSplitter 
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    encoding_name='cl100k_base',
    chunk_size=1000,
    chunk_overlap=200
)
chunks = loader.load_and_split(text_splitter)</code></pre>



<p>In this particular example, we get six chunks:</p>



<pre class="wp-block-code"><code>len(chunks)
6</code></pre>



<p>By now you are getting annoyed that so much is going by that I&#8217;m not explaining. As promised, however, all will be clarified in <a href="https://jesseliberty.com/2026/04/21/rag-in-detail/">the next blog post</a>. In fact, we&#8217;ll go back through this line by line and explain what each step is doing. But for now, let&#8217;s continue&#8230;</p>



<p>We need an embedding model, which we&#8217;ll use to create our vector store (the place we hold onto our chunks). The vector store also holds the metadata about each chunk and the vectors: numerical embeddings of the chunks. The vectors are the key part; each is a long list of numbers representing the semantic meaning of its chunk, which will be used below in a similarity search.</p>



<pre class="wp-block-code"><code>from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma
embedding_model = OpenAIEmbeddings(model="text-embedding-ada-002")
vectorstore = Chroma.from_documents(
    chunks,
    embedding_model,
    collection_name="RomeoAndJuliet"
)</code></pre>



<p>Now that we have the vector store, we need a way to conduct the search, for which we need a retriever. When we instantiate it, we&#8217;ll tell the retriever to use a similarity search.</p>



<pre class="wp-block-code"><code>retriever = vectorstore.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 10}
)</code></pre>



<p>The kwargs are additional keyword arguments you can pass in. In this case, we&#8217;re telling the retriever to return 10 results.</p>



<p>Now! We are ready to create our user message and our retrieval query. The retrieval query is the search text we send to the vector store to fetch relevant chunks. The user message is the natural-language question we will ultimately ask the LLM, answered against those retrieved chunks.</p>



<pre class="wp-block-code"><code>userMessage = "Give me every line having the word swear in it" 
retrievalQuery = "the play 'Romeo and Juliet'"</code></pre>



<p>We can now extract the relevant chunks, iterate through them and create a long string of the resulting context chunks.</p>



<pre class="wp-block-code"><code>relevantChunks = retriever.invoke(retrievalQuery)
contextChunks = &#91;d.page_content for d in relevantChunks]
contextString = ". ".join(contextChunks)</code></pre>



<p>We need to give the LLM context to work from. One great way to do that is to assign a role to the LLM (e.g., &#8220;you are a human resource assistant&#8221;). In this case, we&#8217;ll use a reviewer who knows about plays. This is also a good place to provide explicit directions on how you want the LLM to respond.</p>



<pre class="wp-block-code"><code>qna_system_message = """
You are a play reviewer using the RAG to combine the text of the play with your knowledge of plays in general.
You will review RomeoAndJuliet.txt and provide appropriate answers from the context.
The user input will have the context required by you and will begin with the token: ###Context.
The user questions will begin with the token: ###Question.
Please answer only using the context provided and do not mention anything about the context in your answer.
If the answer is not found in the context, respond "I don't know."
"""</code></pre>



<p>We just need a way to tell the LLM how the context and question will appear, for which we create a template.</p>



<pre class="wp-block-code"><code>qna_user_message_template = """
###Context
{context}

###Question
{question}
"""</code></pre>



<p>Let&#8217;s create the final userQuery by combining the context with the user message we created above.</p>



<pre class="wp-block-code"><code>userQuery = qna_user_message_template.format(
    context=contextString,
    question=userMessage
)</code></pre>



<p>Finally, we&#8217;re ready to create the prompt that we&#8217;ll feed to the LLM.</p>



<pre class="wp-block-code"><code>prompt = f"""
&#91;INST]{qna_system_message}

{userQuery}
&#91;/INST]
"""</code></pre>



<p>Next, we instantiate our LLM, filling in some parameters that, again, we&#8217;ll review in an upcoming blog post.</p>



<pre class="wp-block-code"><code>from langchain_openai import ChatOpenAI
llm = ChatOpenAI(
    model="gpt-4o-mini",                      
    temperature=0,                
    max_tokens=10000,                 
    top_p=0.95,
    frequency_penalty=1.2,
    stop_sequences=&#91;'INST']
)</code></pre>



<p>And we are now, at last, ready to feed our prompt to the LLM, which will incorporate the RAG we created from the Romeo and Juliet text. Remember that we asked for every line with the word swear in it.</p>



<pre class="wp-block-code"><code>response = llm.invoke(prompt)
response.content

ROMEO.  
O, then, dear saint, let lips do what hands do:  
They pray, grant thou, lest faith turn to despair.  

JULIET.  
Saints do not move, though grant for prayers’ sake.  

ROMEO.  
Then move not while my prayer’s effect I take.  
Thus from my lips, by thine my sin is purg’d.

JULIET.  
O swear not by the moon, th’inconstant moon,  
That monthly changes in her circled orb,  
Lest that thy love prove likewise variable.

ROMEO.  
What shall I swear by?

JULIET.   
Do not swear at all.   
Or if thou wilt, swear by thy gracious self,
Which is the god of my idolatry,
And I’ll believe thee.

ROMEO.
If my heart’s dear love,—</code></pre>



<p>One thing to note is that the LLM interpreted &#8220;swear&#8221; liberally. For example, in the first verse Romeo says &#8220;They pray,&#8221; which is pretty close to &#8220;swear.&#8221; </p>



<p>We need a way to ask more questions. Let&#8217;s create a method that takes a user message, retrieves the relevant chunks, builds the prompt, and invokes the LLM with it.</p>



<pre class="wp-block-code"><code>def UseRag(userMessage):
    """
    Args:
    userMessage: Takes a user input for which the response should be retrieved from the vectorDB.
    Returns:
    the LLM's answer, generated from the retrieved context.
    """
    chunks = retriever.invoke(userMessage)
    contextContent = &#91;d.page_content for d in chunks]
    contextString = ". ".join(contextContent)

    prompt = f"""&#91;INST]{qna_system_message}\n
                {'user'}: {qna_user_message_template.format(context=contextString, question=userMessage)}
                &#91;/INST]"""

    # Querying the LLM
    try:
        response = llm.invoke(prompt)
    except Exception as e:
        # Return the error text directly; a plain string has no .content
        return f'Sorry, I encountered the following error: \n {e}'

    return response.content</code></pre>



<p>To prove that we&#8217;re getting our answers from the RAG, let&#8217;s ask a question about text that is not in our excerpt but that would be known by anyone (anything?) that is familiar with the play.</p>



<pre class="wp-block-code"><code>print(UseRag("What town does Romeo live in?"))

I don't know.</code></pre>



<p>Finally, let&#8217;s have a bit of fun.</p>



<pre class="wp-block-code"><code>UseRag("Write a 10 line poem in the style of 'Romeo and Juliet'")

In shadows deep where whispered secrets lie,  
Two hearts entwined beneath the moonlit sky.  
A glance exchanged, a spark ignites the night,  
Forbidden love that dances out of sight.  

O sweet Juliet, with beauty rare and bright,  
Your name a curse yet brings my soul delight.  
Though feuding kin may seek to tear apart,  
Our love shall bloom within each beating heart.  

For in this world of strife and bitter woe,  
Together we shall rise; our passion's glow.  
</code></pre>



<p>I think that is actually pretty good.</p>



<p>OK, that was a lot, and it went by fast. I look forward to going back through it, line by line, and exploring what each line is doing.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Creating Our Python AI Project</title>
		<link>https://jesseliberty.com/2026/04/17/creating-our-python-ai-project/</link>
		
		<dc:creator><![CDATA[Jesse Liberty]]></dc:creator>
		<pubDate>Fri, 17 Apr 2026 13:54:29 +0000</pubDate>
				<category><![CDATA[AI]]></category>
		<category><![CDATA[colab]]></category>
		<category><![CDATA[getting started]]></category>
		<category><![CDATA[Python]]></category>
		<guid isPermaLink="false">https://jesseliberty.com/?p=13247</guid>

					<description><![CDATA[As noted in a previous blog post, we&#8217;ll be building our project on two platforms: Python and .NET (C#). For Python, we&#8217;ll build on Colab. For now, you can use a free account. The first step is to get an &#8230; <a href="https://jesseliberty.com/2026/04/17/creating-our-python-ai-project/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[
<p>As noted in a previous blog post, we&#8217;ll be building our project on two platforms: Python and .NET (C#). For Python, we&#8217;ll build on Colab. For now, you can use a free account.</p>



<p>The first step is to get an OpenAI key, as described in this <a href="https://jesseliberty.com/2026/04/04/microsoft-agent-framework/#more-13227">previous blog post</a> (scroll to the bottom). Note, these are not free, but we&#8217;re talking a few dollars for this entire series of posts.</p>



<span id="more-13247"></span>



<p>With that in hand, create a config.json file that looks like this:<br /></p>



<pre class="wp-block-code"><code>{
    "API_KEY": "gl-U2FsdGVkX1/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/MA8peF"
}</code></pre>



<p>Next, open a browser to <em><a href="https://colab.research.google.com">colab.research.google.com</a></em>. Create a new notebook.</p>



<p>Click on the folder icon on the left, and when you see a number of folders, drag your config file to that window&#8230;</p>



<figure class="wp-block-image size-full is-resized"><img loading="lazy" decoding="async" width="593" height="652" src="https://jesseliberty.com/wp-content/uploads/2026/04/image-2.png" alt="" class="wp-image-13248" style="aspect-ratio:0.9095070338150358;width:178px;height:auto" srcset="https://jesseliberty.com/wp-content/uploads/2026/04/image-2.png 593w, https://jesseliberty.com/wp-content/uploads/2026/04/image-2-273x300.png 273w, https://jesseliberty.com/wp-content/uploads/2026/04/image-2-136x150.png 136w" sizes="auto, (max-width: 593px) 100vw, 593px" /></figure>



<p>We&#8217;ll need some modules, so in the first cell enter:<br /></p>



<pre class="wp-block-code"><code>!pip install langchain==0.3.25 \
                langchain-core==0.3.65 \
                langchain-openai==0.3.24 \
                chromadb==1.3.4 \
                langchain-community==0.3.20 \
                pypdf==5.4.0</code></pre>



<p>A <em>lot</em> of lines will follow as the correct modules are loaded, and then you will be instructed to restart the runtime. Do this <em>once</em>.</p>



<p>Now let&#8217;s add that config file. In the next cell enter</p>



<pre class="wp-block-code"><code>import json
import os

# Load the JSON file and extract values
file_name = "config.json"
with open(file_name, 'r') as file:
    config = json.load(file)
    os.environ&#91;'OPENAI_API_KEY'] = config.get("API_KEY") # Loading the API Key
 </code></pre>



<p>After you run it (click on the triangle), you will see a green check mark next to the cell (on the left) and the time it took to run your command.</p>



<figure class="wp-block-image size-full is-resized"><img loading="lazy" decoding="async" width="459" height="153" src="https://jesseliberty.com/wp-content/uploads/2026/04/image-3.png" alt="" class="wp-image-13249" style="aspect-ratio:3.000204269226841;width:201px;height:auto" srcset="https://jesseliberty.com/wp-content/uploads/2026/04/image-3.png 459w, https://jesseliberty.com/wp-content/uploads/2026/04/image-3-300x100.png 300w, https://jesseliberty.com/wp-content/uploads/2026/04/image-3-150x50.png 150w" sizes="auto, (max-width: 459px) 100vw, 459px" /></figure>



<p>You are all set! In the next blog post we&#8217;ll instantiate our first LLM.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Mads Torgersen</title>
		<link>https://jesseliberty.com/2026/04/17/mads-torgersen/</link>
		
		<dc:creator><![CDATA[Jesse Liberty]]></dc:creator>
		<pubDate>Fri, 17 Apr 2026 12:59:41 +0000</pubDate>
				<category><![CDATA[Essentials]]></category>
		<guid isPermaLink="false">https://jesseliberty.com/?p=13241</guid>

					<description><![CDATA[Mads (Lead Designer of C#) joins me to discuss C# and AI as well as what to expect in C# 15. PodcastVideo]]></description>
										<content:encoded><![CDATA[
<p>Mads (Lead Designer of C#) joins me to discuss C# and AI as well as what to expect in C# 15.</p>



<figure class="wp-block-image size-full is-resized"><img loading="lazy" decoding="async" width="646" height="716" src="https://jesseliberty.com/wp-content/uploads/2026/04/image.jpg" alt="" class="wp-image-13245" style="width:432px;height:auto" srcset="https://jesseliberty.com/wp-content/uploads/2026/04/image.jpg 646w, https://jesseliberty.com/wp-content/uploads/2026/04/image-271x300.jpg 271w, https://jesseliberty.com/wp-content/uploads/2026/04/image-135x150.jpg 135w" sizes="auto, (max-width: 646px) 100vw, 646px" /></figure>



<p class="has-medium-font-size"><a href="https://jesseliberty.fireside.fm">Podcast</a><br /><a href="https://youtube.com/jesseliberty">Video</a></p>



<p></p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Distributed Computing &amp; Docker</title>
		<link>https://jesseliberty.com/2026/04/07/distributed-computing-docker/</link>
		
		<dc:creator><![CDATA[Jesse Liberty]]></dc:creator>
		<pubDate>Tue, 07 Apr 2026 23:11:35 +0000</pubDate>
				<category><![CDATA[Essentials]]></category>
		<guid isPermaLink="false">https://jesseliberty.com/?p=13238</guid>

					<description><![CDATA[Joe Dluzen joins me to discuss, in depth, distributed computing and Docker. The podcast is here and the video is here.]]></description>
										<content:encoded><![CDATA[
<p>Joe Dluzen joins me to discuss, in depth, distributed computing and Docker.</p>



<p>The podcast is <a href="https://jesseliberty.fireside.fm/">here </a>and the video is <a href="https://youtube.com/jesseliberty">here</a>.</p>



<p></p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Microsoft Agent Framework – Part 0</title>
		<link>https://jesseliberty.com/2026/04/04/microsoft-agent-framework/</link>
		
		<dc:creator><![CDATA[Jesse Liberty]]></dc:creator>
		<pubDate>Sat, 04 Apr 2026 11:01:58 +0000</pubDate>
				<category><![CDATA[Essentials]]></category>
		<guid isPermaLink="false">https://jesseliberty.com/?p=13227</guid>

					<description><![CDATA[I&#8217;ve been looking at a number of different ways to build Agents. I&#8217;ve settled on two and will be documenting what I learn as I go: The advantage of the first is that you understand the underlying mechanisms in more &#8230; <a href="https://jesseliberty.com/2026/04/04/microsoft-agent-framework/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[
<p>I&#8217;ve been looking at a number of different ways to build Agents. I&#8217;ve settled on two and will be documenting what I learn as I go:</p>



<ul class="wp-block-list">
<li>Building from first principles based on my course on Agentics at Johns Hopkins</li>



<li>Building using Microsoft&#8217;s new Agent Framework</li>



</ul>



<figure class="wp-block-image size-large is-resized"><img loading="lazy" decoding="async" width="800" height="446" src="https://jesseliberty.com/wp-content/uploads/2026/04/neteverythingelse-800x446.jpg" alt="" class="wp-image-13228" style="aspect-ratio:1.793766994352646;width:389px;height:auto" srcset="https://jesseliberty.com/wp-content/uploads/2026/04/neteverythingelse-800x446.jpg 800w, https://jesseliberty.com/wp-content/uploads/2026/04/neteverythingelse-300x167.jpg 300w, https://jesseliberty.com/wp-content/uploads/2026/04/neteverythingelse-150x84.jpg 150w, https://jesseliberty.com/wp-content/uploads/2026/04/neteverythingelse-768x428.jpg 768w, https://jesseliberty.com/wp-content/uploads/2026/04/neteverythingelse.jpg 875w" sizes="auto, (max-width: 800px) 100vw, 800px" /></figure>



<p>The advantage of the first is that you understand the underlying mechanisms in more depth; the advantage of the second is that a lot of the plumbing is done for you and you become more productive more quickly. </p>



<p>I will do the .NET work in C#, and probably do all the other work in Python. See my blog post on <a href="https://jesseliberty.com/2026/02/20/python/">why Python</a>.</p>



<p>I will, to a degree, be documenting what I learn as I learn it, without infringing on copyright, of course.</p>



<span id="more-13227"></span>



<h2 class="wp-block-heading">Project 1 &#8211; Jupyter</h2>



<p>The work for Johns Hopkins is done in a Jupyter notebook. These are very convenient files that contain runnable cells: you put a code snippet in a cell and run it, receiving either a result or extensive error information. To get started, open Visual Studio Code and click New File. In the dropdown at the top, choose Jupyter Notebook:</p>



<figure class="wp-block-image size-full is-resized"><img loading="lazy" decoding="async" width="598" height="147" src="https://jesseliberty.com/wp-content/uploads/2026/04/NewFile.jpg" alt="" class="wp-image-13229" style="width:431px;height:auto" srcset="https://jesseliberty.com/wp-content/uploads/2026/04/NewFile.jpg 598w, https://jesseliberty.com/wp-content/uploads/2026/04/NewFile-300x74.jpg 300w, https://jesseliberty.com/wp-content/uploads/2026/04/NewFile-150x37.jpg 150w" sizes="auto, (max-width: 598px) 100vw, 598px" /></figure>



<p>Choose your Python environment, and put code into the first &#8220;cell.&#8221; Then click the run button on the left; the cell runs and the output is displayed:</p>



<figure class="wp-block-image size-full is-resized"><img loading="lazy" decoding="async" width="253" height="107" src="https://jesseliberty.com/wp-content/uploads/2026/04/image.png" alt="" class="wp-image-13230" style="width:220px;height:auto" srcset="https://jesseliberty.com/wp-content/uploads/2026/04/image.png 253w, https://jesseliberty.com/wp-content/uploads/2026/04/image-150x63.png 150w" sizes="auto, (max-width: 253px) 100vw, 253px" /></figure>



<p>You can add markdown cells, and you can even ask it to generate code from a description of what you want. It is also possible to export these cells as a proper Python program.</p>






<h2 class="wp-block-heading">Open AI Endpoint</h2>



<p>Whether you are working in the Jupyter notebook or in Agent Framework, you will want an OpenAI endpoint. To get one, go to https://platform.openai.com, sign in, go to API Keys, and create a new key. Your endpoint is the base URL used for API calls.</p>



<p>Or&#8230; if you are working in the Microsoft ecosystem, search for Azure OpenAI in the Azure Portal and create a resource. Then go to Resource Management -> Keys and Endpoint and copy the Endpoint and Key 1 (your API key). Deploy a model under Resource Management -> Model Deployments. Your endpoint will look like this:</p>



<p>https://&lt;your-resource-name&gt;.openai.azure.com/<br /><br />Copilot can set all this up for you.</p>
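<p>To make the pattern concrete, here is a tiny sketch of how that endpoint URL is composed (the resource name below is a made-up example, not a real deployment):</p>

```python
# Illustrative only: the Azure OpenAI endpoint is your resource name
# plugged into a fixed URL template. "my-openai-resource" is hypothetical.
resource_name = "my-openai-resource"
endpoint = f"https://{resource_name}.openai.azure.com/"
print(endpoint)
```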



<p>Next up: setting up your environment.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>AI: The Near Term</title>
		<link>https://jesseliberty.com/2026/03/30/ai-the-near-term/</link>
		
		<dc:creator><![CDATA[Jesse Liberty]]></dc:creator>
		<pubDate>Mon, 30 Mar 2026 11:49:06 +0000</pubDate>
				<category><![CDATA[Essentials]]></category>
		<guid isPermaLink="false">https://jesseliberty.com/?p=13222</guid>

					<description><![CDATA[As promised, I&#8217;ll be posting slides and commentary from my recent user-group presentation on AI (Boston Code Camp). One of my first slides talked about the near-term evolution of AI, defined as either 1-2 years or 6 months, depending on &#8230; <a href="https://jesseliberty.com/2026/03/30/ai-the-near-term/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[
<p>As promised, I&#8217;ll be posting slides and commentary from my recent user-group presentation on AI (<a href="https://www.bostoncodecamp.com/CC40/info">Boston Code Camp)</a>. One of my first slides talked about the near-term evolution of AI, defined as either 1-2 years or 6 months, depending on who you believe.</p>



<p>I divided the slide into two parts: good news and bad. The <strong>good news </strong>is the promise of enormously increased productivity, which may well lead to a higher standard of living across the board. Further, as AI progresses, we may see accelerated breakthroughs in many fields, most notably medicine and genetics.</p>



<figure class="wp-block-image size-large is-resized"><img loading="lazy" decoding="async" width="800" height="543" src="https://jesseliberty.com/wp-content/uploads/2026/03/image-2-800x543.png" alt="" class="wp-image-13223" style="aspect-ratio:1.4727297705728808;width:296px;height:auto" srcset="https://jesseliberty.com/wp-content/uploads/2026/03/image-2-800x543.png 800w, https://jesseliberty.com/wp-content/uploads/2026/03/image-2-300x204.png 300w, https://jesseliberty.com/wp-content/uploads/2026/03/image-2-150x102.png 150w, https://jesseliberty.com/wp-content/uploads/2026/03/image-2-768x521.png 768w, https://jesseliberty.com/wp-content/uploads/2026/03/image-2.png 810w" sizes="auto, (max-width: 800px) 100vw, 800px" /></figure>



<span id="more-13222"></span>



<p>The bad news is that along with these changes will almost certainly come massive displacement. As a society, we&#8217;ve seen this kind of displacement before: the advent of the tractor displaced well over 50% of farm workers, and ultimately mechanization displaced over 95% of agricultural labor!</p>



<p>The key difference, however, is that in the past technology displaced the least skilled workers; AI is set to affect many of the most educated and skilled. I&#8217;m not sure that society knows how to handle this. Reskilling has a poor track record, and sending highly paid college-educated programmers back to two-year training programs in robotics does not seem auspicious.</p>



<p>In the very short term, AI presents some real challenges, not least of which is the environmental impact of massive energy-consuming data centers. Add to that the potential economic impact if the AI boom goes bust, and there is plenty to worry about. But here is one more&#8230;</p>



<p>Right now we are learning to adjust to AI, but research continues on AGI (Artificial General Intelligence). The next step after that is Artificial Super Intelligence: when AI becomes meaningfully smarter than humans. No one knows quite what to expect when that happens, but I commend to you the book <a href="https://bookshop.org/p/books/if-anyone-builds-it-everyone-dies-why-superhuman-ai-would-kill-us-all-eliezer-yudkowsky/2da88520a671d222?ean=9780316595643&amp;next=t"><em>If Anyone Builds It, Everyone Dies</em></a> by <a href="https://www.amazon.com/Eliezer-Yudkowsky/e/B00J6XXP9K/ref=dp_byline_cont_ebooks_1">Eliezer Yudkowsky</a> and <a href="https://www.amazon.com/Nate-Soares/e/B0FSF2C9PK/ref=dp_byline_cont_ebooks_2">Nate Soares</a>. These are not crackpot writers &#8212; they are recognized experts in AI and their thesis is chilling.</p>



<p>In coming postings I will divide my attention between two aspects of AI:</p>



<ul class="wp-block-list">
<li>Using AI in developing your application</li>



<li>Creating applications that incorporate AI</li>
</ul>



<p>Along the way I will talk about AI in the .NET world (specifically the Azure workflow tools) and outside of .NET (time to brush up on Python). </p>



<p>Next up: Assisted Coding&#8230;</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>MCP In Depth</title>
		<link>https://jesseliberty.com/2026/03/22/mcp-in-depth/</link>
		
		<dc:creator><![CDATA[Jesse Liberty]]></dc:creator>
		<pubDate>Sun, 22 Mar 2026 14:47:51 +0000</pubDate>
				<category><![CDATA[Essentials]]></category>
		<guid isPermaLink="false">https://jesseliberty.com/?p=13214</guid>

					<description><![CDATA[In a special videoCast, Lance McCarthy of Progress Software dives deep into MCP, not only explaining what it is for and how it works, but demonstrating, in code, how it is done. MCP (Model, Context, Protocol) is an open standard &#8230; <a href="https://jesseliberty.com/2026/03/22/mcp-in-depth/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[
<p>In a special <a href="https://youtube.com/jesseliberty">videoCast</a>, Lance McCarthy of Progress Software dives deep into MCP, not only explaining what it is for and how it works, but demonstrating, in code, how it is done. </p>



<p>MCP (Model Context Protocol) is an open standard that lets AI models (Copilot, ChatGPT, etc.) connect to external tools. These tools can be databases, code you write, code you connect to, other AIs, and more. It is the universal API for agents.</p>



<figure class="wp-block-image size-large is-resized"><img loading="lazy" decoding="async" width="800" height="303" src="https://jesseliberty.com/wp-content/uploads/2026/03/image-1-800x303.png" alt="" class="wp-image-13216" style="aspect-ratio:2.6403940886699506;width:473px;height:auto" srcset="https://jesseliberty.com/wp-content/uploads/2026/03/image-1-800x303.png 800w, https://jesseliberty.com/wp-content/uploads/2026/03/image-1-300x114.png 300w, https://jesseliberty.com/wp-content/uploads/2026/03/image-1-150x57.png 150w, https://jesseliberty.com/wp-content/uploads/2026/03/image-1-768x291.png 768w, https://jesseliberty.com/wp-content/uploads/2026/03/image-1.png 1045w" sizes="auto, (max-width: 800px) 100vw, 800px" /></figure>
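<p>Under the hood, MCP messages are JSON-RPC 2.0, so a tool invocation from a client to a server is just a small JSON message. A sketch of the shape (the tool name and arguments below are made up for illustration):</p>

```python
import json

# Illustrative sketch of the MCP wire format: a tools/call request is a
# JSON-RPC 2.0 message naming the tool and passing its arguments.
# "get_forecast" and its arguments are hypothetical example values.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "get_forecast",
        "arguments": {"city": "Boston"},
    },
}
wire = json.dumps(request)
print(wire)
```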



<p><a href="https://github.com/LanceMcCarthy/McpServers">Here&#8217;s a link to the code</a>.</p>



<p></p>
]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>