
<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>GauravMantri.com</title>
	<atom:link href="http://gauravmantri.com/feed/" rel="self" type="application/rss+xml" />
	<link>https://gauravmantri.com</link>
	<description>Gaurav Mantri&#039;s Personal Blog</description>
	<lastBuildDate>Fri, 15 Nov 2024 16:24:09 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=5.4.1</generator>
	<item>
		<title>Keeping Your Azure OpenAI App Running Smoothly During Service Interruptions</title>
		<link>https://gauravmantri.com/2024/11/15/keeping-your-azure-openai-app-running-smoothly-during-service-interruptions/</link>
					<comments>https://gauravmantri.com/2024/11/15/keeping-your-azure-openai-app-running-smoothly-during-service-interruptions/#respond</comments>
		
		<dc:creator><![CDATA[Gaurav]]></dc:creator>
		<pubDate>Fri, 15 Nov 2024 16:24:08 +0000</pubDate>
				<category><![CDATA[Azure]]></category>
		<category><![CDATA[Azure OpenAI]]></category>
		<category><![CDATA[Failover]]></category>
		<category><![CDATA[Geo-Redundancy]]></category>
		<guid isPermaLink="false">https://gauravmantri.com/?p=1978</guid>

					<description><![CDATA[<p>In this post, I’ll walk you through a simple yet effective approach we use at Purple Leaf to ensure our application stays online, even when the Azure OpenAI service faces throttling or downtime. By deploying Azure OpenAI in multiple regions and implementing a smart failover strategy, we’re able to provide a seamless experience for our users, [&#8230;]</p>
<p>The post <a rel="nofollow" href="https://gauravmantri.com/2024/11/15/keeping-your-azure-openai-app-running-smoothly-during-service-interruptions/">Keeping Your Azure OpenAI App Running Smoothly During Service Interruptions</a> appeared first on <a rel="nofollow" href="https://gauravmantri.com">GauravMantri.com</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>In this post, I’ll walk you through a simple yet effective approach we use at <a rel="noreferrer noopener" href="https://purpleleaf.ai" target="_blank">Purple Leaf</a> to ensure our application stays online, even when the Azure OpenAI service faces throttling or downtime. By deploying Azure OpenAI in multiple regions and implementing a smart failover strategy, we’re able to provide a seamless experience for our users, regardless of unexpected service disruptions.</p>



<p>Here’s how it works:</p>



<h3>1. Deploy Azure OpenAI Across Multiple Regions </h3>



<p>We deploy the Azure OpenAI service in at least two regions — one as the primary and another as the secondary.</p>



<h3>2. Failover Logic</h3>



<p>With both regions set up, our application needs a way to switch smoothly between them. For example, if the primary region fails to process a request, the system should automatically retry the request against the secondary region. This approach ensures that, even if one service instance is down or throttled, the application can continue running without significant interruptions.</p>



<p>By deploying the Azure OpenAI service across multiple regions and integrating this failover strategy, we protect our application from unexpected downtime and keep our users happy.</p>



<p>In the next sections, I’ll provide sample code to demonstrate how to implement this approach effectively.</p>



<h3>Code</h3>


<pre class="brush: jscript; title: ; notranslate">
import { AzureOpenAI } from &quot;openai&quot;;

export default class AzureOpenAIHelper {
  private readonly _azureOpenAIPrimaryClient: AzureOpenAI;
  private readonly _azureOpenAISecondaryClient: AzureOpenAI;

  constructor() {
    this._azureOpenAIPrimaryClient = new AzureOpenAI({
      endpoint: &quot;primary-end-point&quot;,
      apiKey: &quot;primary-end-point-api-key&quot;,
      deployment: &quot;primary-deployment-id&quot;,
      apiVersion: &quot;primary-api-version&quot;,
    });
    this._azureOpenAISecondaryClient = new AzureOpenAI({
      endpoint: &quot;secondary-end-point&quot;,
      apiKey: &quot;secondary-end-point-api-key&quot;,
      deployment: &quot;secondary-deployment-id&quot;,
      apiVersion: &quot;secondary-api-version&quot;,
    });
  }

  async processChatRequest(systemMessage: string, userMessage: string, isPrimary: boolean = true, attempt: number = 1) {
    try {
      // Pick the client for this attempt: primary first, then alternate.
      const client = isPrimary
        ? this._azureOpenAIPrimaryClient
        : this._azureOpenAISecondaryClient;
      const messages = [
        { role: 'system', content: systemMessage },
        { role: 'user', content: userMessage },
      ];
      const result = await client.chat.completions.create({
        messages,
        model: &quot;model-deployment-id&quot;,
        response_format: { type: 'text' },
      });
      return result;
    } catch (error: unknown) {
      if (attempt &lt; 4) {
        // The call failed: retry the same request against the other region.
        return await this.processChatRequest(
          systemMessage,
          userMessage,
          !isPrimary,
          attempt + 1,
        );
      } else {
        throw error;
      }
    }
  }
}
</pre>


<p>In the class constructor, we create client instances for both the primary and secondary Azure OpenAI services.</p>



<p>The <code>processChatRequest</code> method makes the call to the Azure OpenAI service. By default, this method connects to the primary service.</p>



<p>If the call fails for any reason, we execute the same method again, this time connecting to the secondary service. If that call also fails, we fall back to the primary.</p>



<p>We alternate like this a limited number of times (four attempts in total) before we give up and rethrow the error.</p>
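<p>The same alternating retry pattern can also be expressed as a small standalone helper, shown here as an illustrative sketch (the names <code>withFailover</code>, <code>primary</code>, and <code>secondary</code> are hypothetical and not part of the class above):</p>

```typescript
// Generic alternating-failover helper: tries `primary` first, then
// `secondary`, switching regions on every failure, until `maxAttempts`
// calls have been made; the last error is rethrown when all attempts fail.
async function withFailover<T>(
  primary: () => Promise<T>,
  secondary: () => Promise<T>,
  maxAttempts = 4,
): Promise<T> {
  let usePrimary = true;
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await (usePrimary ? primary() : secondary());
    } catch (error) {
      lastError = error;
      usePrimary = !usePrimary; // alternate regions on each failure
    }
  }
  throw lastError;
}
```

<p>Factoring the retry loop out like this keeps the failover policy in one place, so the chat call itself stays free of recursion.</p>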



<h3>Summary</h3>



<p>When working with services like Azure OpenAI, throttling and downtime can potentially bring your entire application down. That’s why having a plan for redundancy isn’t just nice to have — it’s essential.</p>



<p>By deploying Azure OpenAI in multiple regions and using a simple failover strategy, you’re not only adding a layer of resilience but also ensuring your app keeps running smoothly, even when things go wrong.</p>



<p>This setup has been super helpful for us at Purple Leaf, enabling us to deliver a stable, reliable experience without leaving our users hanging.</p>



<p>I hope you have found this blog post useful. Please do share your thoughts on how you are handling Azure OpenAI service issues.</p>
<p>The post <a rel="nofollow" href="https://gauravmantri.com/2024/11/15/keeping-your-azure-openai-app-running-smoothly-during-service-interruptions/">Keeping Your Azure OpenAI App Running Smoothly During Service Interruptions</a> appeared first on <a rel="nofollow" href="https://gauravmantri.com">GauravMantri.com</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://gauravmantri.com/2024/11/15/keeping-your-azure-openai-app-running-smoothly-during-service-interruptions/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Is SaaS Dead? I Don’t Think So!</title>
		<link>https://gauravmantri.com/2024/10/02/is-saas-dead-i-dont-think-so/</link>
					<comments>https://gauravmantri.com/2024/10/02/is-saas-dead-i-dont-think-so/#respond</comments>
		
		<dc:creator><![CDATA[Gaurav]]></dc:creator>
		<pubDate>Wed, 02 Oct 2024 03:14:39 +0000</pubDate>
				<category><![CDATA[AI]]></category>
		<category><![CDATA[Generative AI]]></category>
		<category><![CDATA[SaaS]]></category>
		<category><![CDATA[Software-as-a-Service]]></category>
		<guid isPermaLink="false">https://gauravmantri.com/?p=1975</guid>

					<description><![CDATA[<p>Writing software has become incredibly easy today, thanks to the proliferation of AI-powered code-generation tools like GitHub Copilot, Cursor, Vercel&#8217;s V0, and GPT Engineer. Combine that with the rise of low-code/no-code platforms, and it&#8217;s no surprise that debates about the future of Software-as-a-Service (SaaS) have emerged. Many pundits are now proclaiming the death of the [&#8230;]</p>
<p>The post <a rel="nofollow" href="https://gauravmantri.com/2024/10/02/is-saas-dead-i-dont-think-so/">Is SaaS Dead? I Don’t Think So!</a> appeared first on <a rel="nofollow" href="https://gauravmantri.com">GauravMantri.com</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>Writing software has become incredibly easy today, thanks to the proliferation of AI-powered code-generation tools like GitHub Copilot, Cursor, Vercel&#8217;s V0, and GPT Engineer. Combine that with the rise of low-code/no-code platforms, and it&#8217;s no surprise that debates about the future of Software-as-a-Service (SaaS) have emerged.</p>



<p>Many pundits are now proclaiming the death of the SaaS industry, fueled by the idea of &#8220;<strong>why rent when you can build?</strong>&#8221; They argue that businesses, armed with these new technologies, can build and customize software in-house with fewer resources, bypassing the need for external SaaS providers.</p>



<p>High-profile examples, like <strong>Klarna decommissioning some SaaS services from its enterprise</strong>, only add fuel to this fire.</p>



<p>So, is SaaS dead?</p>



<p><strong>I don’t think so.</strong></p>



<p>While it’s true that AI and no-code platforms have made development more accessible, they haven’t eliminated the need for comprehensive, secure, and continuously evolving services. </p>



<p>SaaS isn’t just about the software—it’s an ecosystem that offers businesses instant access to powerful tools, enterprise-grade security, and seamless scalability, without the heavy lifting required to build, maintain, and update those solutions in-house.</p>



<h2>Why Did You Choose SaaS in the First Place?</h2>



<p>Let me ask you this: Why did you choose a SaaS service in the first place? My guess is that the following were some of the primary reasons you considered:</p>



<h3>Focus on Core Competencies</h3>



<p>Unless you&#8217;re a Payment Tech SaaS company, it makes zero sense for you to write your own payment processing solution—simply because <strong>that’s not your core competency</strong>. Choosing a payment processing SaaS solution (like Stripe) allows you to <strong>focus on what you do best</strong>.</p>



<p>Instead of sinking time and resources into building, maintaining, and securing software, you can <strong>focus on your core competencies, driving innovation and growth where it matters most</strong>. SaaS takes the burden of software management off your shoulders, letting you focus on strategy and execution.</p>



<p>While AI tools can undeniably assist with software development, they don’t replace the ongoing responsibility of maintaining, securing, and evolving these systems. The time and effort saved by AI in development can quickly be negated by the resources still required to manage infrastructure, updates, and security. SaaS providers continue to handle all of this for you, allowing businesses to truly focus on their core mission.</p>



<h3>Move Fast and Avoid the Complexities of Deployment and Infrastructure Management</h3>



<p>When you chose SaaS, a major factor was likely the need to move fast and avoid the technical complexities that come with deployment and infrastructure management. SaaS provides immediate access to powerful, industry-standard tools without the need for lengthy development cycles or complex setups. Whether onboarding a small team or scaling across multiple departments, SaaS solutions enable rapid deployment with minimal disruption, allowing your business to hit the ground running.</p>



<p>While AI can accelerate coding and development, it doesn’t simplify the challenges of integrating custom-built solutions into existing ecosystems. Deployment is often fraught with testing, troubleshooting, and infrastructure configuration. SaaS solutions, by contrast, come pre-configured and battle-tested, offering businesses a faster, hassle-free path to productivity without the infrastructure headaches.</p>



<h3>Security and Compliance? No Problem.</h3>



<p>When you chose SaaS, it wasn’t just about convenience—it was about trusting experts with the critical aspects of security and compliance. You were fully aware of the complexities involved but didn’t have the resources, expertise, or desire to manage them in-house. SaaS providers invest heavily in enterprise-grade security, regulatory compliance, and continuous monitoring, offering a level of protection that would be difficult and costly to replicate internally.</p>



<p>While AI can certainly assist with certain security tasks, it doesn’t provide the full-scale infrastructure, 24/7 monitoring, and regulatory expertise needed to ensure compliance and prevent breaches. SaaS solutions come with these built-in safeguards, giving you peace of mind that your business is secure and compliant with the latest industry standards.</p>



<h3>Cost-Efficiency and Scalability</h3>



<p>One of the main drivers behind your decision to choose SaaS was the need for a <strong>cost-efficient and scalable solution</strong>. SaaS provides a flexible pricing model—allowing you to pay only for what you use—while avoiding the large upfront costs associated with building and maintaining custom software. As your business grows, SaaS effortlessly scales with you, ensuring that you have the right amount of resources at any given time without the headache of managing infrastructure.</p>



<p>While AI can help streamline development, it doesn’t automatically lower the costs associated with infrastructure, scaling, and maintenance. Custom-built software, even AI-assisted, still requires continuous monitoring and resource allocation to handle growth. SaaS, by contrast, offers seamless scalability without the hidden expenses, making it an ideal long-term solution for businesses looking to grow efficiently.</p>



<p>Some folks may bring up the famous <strong>“Bell Curve,”</strong> where once you reach a certain scale, it becomes more efficient to do things in-house—and quite honestly, I really look forward to that.</p>



<p>However, for most businesses, reaching that point is a long way off. Until then, the cost-efficiency and scalability that SaaS offers provide the perfect balance of affordability, flexibility, and convenience, allowing companies to grow without unnecessary complications. SaaS gives you the tools to thrive now, and when that scale comes, you&#8217;ll be in a strong position to evaluate whether it’s time to take things in-house or continue reaping the benefits of SaaS solutions.</p>



<h3>Conclusion</h3>



<p>In a world where AI and low-code/no-code platforms are rapidly transforming how we build software, it’s tempting to question the future of SaaS. Yet, while these technologies have made development more accessible, they haven&#8217;t replaced the need for SaaS solutions. From allowing businesses to focus on their core competencies, to avoiding the headaches of deployment and infrastructure management, to providing security and compliance at scale, SaaS continues to deliver unmatched value.</p>



<p>Moreover, SaaS offers the cost-efficiency and scalability that growing businesses need to thrive. While some may eventually reach the point where in-house solutions become more practical, for the majority of organizations, SaaS remains the most efficient and flexible path forward. AI may accelerate innovation, but SaaS is the proven foundation that allows businesses to move fast, stay secure, and scale with confidence.</p>



<p>SaaS isn’t dead—far from it. It’s evolving alongside new technologies and continues to be the backbone of countless business operations.</p>
<p>The post <a rel="nofollow" href="https://gauravmantri.com/2024/10/02/is-saas-dead-i-dont-think-so/">Is SaaS Dead? I Don’t Think So!</a> appeared first on <a rel="nofollow" href="https://gauravmantri.com">GauravMantri.com</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://gauravmantri.com/2024/10/02/is-saas-dead-i-dont-think-so/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Smart To-Do Creator: Combining the ease of Azure Logic Apps and the power of Generative AI</title>
		<link>https://gauravmantri.com/2024/03/18/smart-to-do-creator-combining-the-ease-of-azure-logic-apps-and-the-power-of-generative-ai/</link>
					<comments>https://gauravmantri.com/2024/03/18/smart-to-do-creator-combining-the-ease-of-azure-logic-apps-and-the-power-of-generative-ai/#respond</comments>
		
		<dc:creator><![CDATA[Gaurav]]></dc:creator>
		<pubDate>Sun, 17 Mar 2024 18:52:42 +0000</pubDate>
				<category><![CDATA[Azure]]></category>
		<category><![CDATA[Generative AI]]></category>
		<category><![CDATA[Azure Logic Apps]]></category>
		<category><![CDATA[Prompt Engineering]]></category>
		<category><![CDATA[Semantic Kernel]]></category>
		<guid isPermaLink="false">https://gauravmantri.com/?p=1964</guid>

					<description><![CDATA[<p>In this post, I am going to talk about how I built a smart to-do creator using Azure Logic Apps and Generative AI (Azure OpenAI service). I recently took a course on LinkedIn Learning about Azure Logic Apps (called Azure Logic Apps &#8211; Building solutions for data and integration) and thought I would put my [&#8230;]</p>
<p>The post <a rel="nofollow" href="https://gauravmantri.com/2024/03/18/smart-to-do-creator-combining-the-ease-of-azure-logic-apps-and-the-power-of-generative-ai/">Smart To-Do Creator: Combining the ease of Azure Logic Apps and the power of Generative AI</a> appeared first on <a rel="nofollow" href="https://gauravmantri.com">GauravMantri.com</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>In this post, I am going to talk about how I built a smart to-do creator using Azure Logic Apps and Generative AI (Azure OpenAI service). I recently took a course on LinkedIn Learning about Azure Logic Apps (called <a rel="noreferrer noopener" href="https://www.linkedin.com/learning/azure-logic-apps/building-solutions-for-data-and-integration" target="_blank">Azure Logic Apps &#8211; Building solutions for data and integration</a>) and thought I would put my learning to good use by building something small yet useful (the best way to learn something new is by building with it, right?), and that&#8217;s how I ended up creating this tool.</p>



<h2>What is Smart To-Do Creator?</h2>



<p>Simply put, <strong>this tool creates tasks out of your emails</strong>. It reads the contents of an email, infers that the sender wants you to do a task for them, creates the task, and then saves it in the Microsoft To-Do application.</p>



<h2>Architecture</h2>



<p>The architecture for this application is fairly simple and is shown in the picture below.</p>



<figure class="wp-block-image size-large"><img src="https://gauravmantri.com/wp-content/uploads/2024/03/Untitled-Diagram.drawio-2.png" alt="" class="wp-image-1967" srcset="https://gauravmantri.com/wp-content/uploads/2024/03/Untitled-Diagram.drawio-2.png 602w, https://gauravmantri.com/wp-content/uploads/2024/03/Untitled-Diagram.drawio-2-300x269.png 300w" sizes="(max-width: 602px) 100vw, 602px" /></figure>



<p>It makes use of Azure Logic Apps and Azure OpenAI Service.</p>



<p>The Azure Logic App is connected to your Office 365 account using the <strong>Outlook connector</strong>. As soon as an email comes in, it extracts metadata about the email (like sender, subject, and body) and sends it to an HTTP connector.</p>



<p>The <strong>HTTP connector</strong> is nothing but an HTTP-triggered Azure Function. When it is triggered, it sends the input data to Azure OpenAI and asks it to understand the email and determine whether a task can be created from it.</p>
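<p>The payload the workflow posts to the function carries just the email body, sender, and subject. For illustration, that contract can be sketched like this (a TypeScript sketch with a hypothetical <code>validateMessage</code> helper; the actual function in the Code section below is written in C#):</p>

```typescript
// Shape of the payload the Logic App's HTTP action sends to the function:
// the email body, sender, and subject extracted by the Outlook connector.
interface IncomingMessage {
  body: string;
  from: string;
  subject: string;
}

// Hypothetical guard mirroring the function's check that all three
// fields are present and non-empty before calling the model.
function validateMessage(msg: IncomingMessage): IncomingMessage {
  if (!msg.from.trim() || !msg.subject.trim() || !msg.body.trim()) {
    throw new Error("from, subject and body are all required");
  }
  return msg;
}
```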



<p>It relies on a Large Language Model’s (LLM) text comprehension and reasoning capabilities. Through careful prompt engineering, it outputs a JSON object containing the task details (the task subject, description, and due date) and feeds that to a To-Do connector.</p>
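<p>Per the <code>Parse_JSON</code> schema in the workflow definition below, the model is prompted to return an object with <code>title</code>, <code>task</code>, and <code>dueDate</code> fields. A minimal sketch of validating such a response (the <code>parseTaskDetails</code> helper is hypothetical, shown in TypeScript for illustration):</p>

```typescript
// Shape of the task object the prompt asks the model to emit,
// matching the Parse_JSON schema in the workflow definition.
interface TaskDetails {
  title: string;
  task: string;
  dueDate: string; // ISO 8601 date-time, e.g. "2024-03-25T00:00:00Z"
}

// Hypothetical validator: parses the model's raw text output and
// checks the required fields before handing it to the To-Do connector.
function parseTaskDetails(raw: string): TaskDetails {
  const obj = JSON.parse(raw);
  for (const field of ["title", "task", "dueDate"] as const) {
    if (typeof obj[field] !== "string") {
      throw new Error(`Missing or invalid field: ${field}`);
    }
  }
  return obj as TaskDetails;
}
```

<p>Validating the model's output before the To-Do step guards against the occasional malformed or incomplete JSON response.</p>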



<p>The <strong>To-Do connector</strong> takes this data and creates a task for the user.</p>



<h2>Test</h2>



<p>To test it, I sent myself an email asking me to get some stuff (for me <img src="https://s.w.org/images/core/emoji/12.0.0-1/72x72/1f602.png" alt="😂" class="wp-smiley" style="height: 1em; max-height: 1em;" />) during my upcoming trip to India and surprisingly it worked really well. When the workflow finished, I had a task assigned to me with a meaningful title, a concise description of the task and an expected end date (even though I did not specify an exact date).</p>



<figure class="wp-block-image size-large"><img src="https://gauravmantri.com/wp-content/uploads/2024/03/Screenshot-2024-03-17-at-1.26.50%20PM.png" alt="" class="wp-image-1968" srcset="https://gauravmantri.com/wp-content/uploads/2024/03/Screenshot-2024-03-17-at-1.26.50%20PM.png 861w, https://gauravmantri.com/wp-content/uploads/2024/03/Screenshot-2024-03-17-at-1.26.50%20PM-300x114.png 300w, https://gauravmantri.com/wp-content/uploads/2024/03/Screenshot-2024-03-17-at-1.26.50%20PM-768x292.png 768w" sizes="(max-width: 861px) 100vw, 861px" /></figure>



<figure class="wp-block-image size-large"><img src="https://gauravmantri.com/wp-content/uploads/2024/03/Screenshot-2024-03-17-at-1.28.01%20PM.png" alt="" class="wp-image-1969" srcset="https://gauravmantri.com/wp-content/uploads/2024/03/Screenshot-2024-03-17-at-1.28.01%20PM.png 341w, https://gauravmantri.com/wp-content/uploads/2024/03/Screenshot-2024-03-17-at-1.28.01%20PM-173x300.png 173w" sizes="(max-width: 341px) 100vw, 341px" /></figure>



<p>Pretty neat, huh!</p>



<h2>Code</h2>



<p>I stitched together the whole solution in less than 4 hours (of which I spent about an hour getting the Logic App service configured correctly). I am pretty sure this code can be improved considerably, but I am sharing it nonetheless.</p>



<h3>Workflow Code</h3>


<pre class="brush: plain; title: ; notranslate">
{
    &quot;definition&quot;: {
        &quot;$schema&quot;: &quot;https://schema.management.azure.com/providers/Microsoft.Logic/schemas/2016-06-01/workflowdefinition.json#&quot;,
        &quot;actions&quot;: {
            &quot;Add_a_to-do_(V3)&quot;: {
                &quot;inputs&quot;: {
                    &quot;body&quot;: {
                        &quot;body&quot;: {
                            &quot;content&quot;: &quot;&lt;p&gt;@{body('Parse_JSON')['task']}&lt;/p&gt;&quot;,
                            &quot;contentType&quot;: &quot;html&quot;
                        },
                        &quot;dueDateTime&quot;: {
                            &quot;dateTime&quot;: &quot;@body('Parse_JSON')['dueDate']&quot;,
                            &quot;timeZone&quot;: &quot;UTC&quot;
                        },
                        &quot;reminderDateTime&quot;: {
                            &quot;timeZone&quot;: &quot;UTC&quot;
                        },
                        &quot;title&quot;: &quot;@body('Parse_JSON')['title']&quot;
                    },
                    &quot;host&quot;: {
                        &quot;connection&quot;: {
                            &quot;referenceName&quot;: &quot;todo&quot;
                        }
                    },
                    &quot;method&quot;: &quot;post&quot;,
                    &quot;path&quot;: &quot;/lists/@{encodeURIComponent('Some Base64 encoded string')}/tasks&quot;
                },
                &quot;runAfter&quot;: {
                    &quot;Parse_JSON&quot;: [
                        &quot;SUCCEEDED&quot;
                    ]
                },
                &quot;type&quot;: &quot;ApiConnection&quot;
            },
            &quot;HTTP&quot;: {
                &quot;inputs&quot;: {
                    &quot;body&quot;: {
                        &quot;body&quot;: &quot;@{triggerBody()}&quot;,
                        &quot;from&quot;: &quot;@{triggerBody()?['from']}&quot;,
                        &quot;subject&quot;: &quot;@{triggerBody()?['subject']}&quot;
                    },
                    &quot;method&quot;: &quot;POST&quot;,
                    &quot;uri&quot;: &quot;https://myfunctionapp.azurewebsites.net/api/HttpTrigger1&quot;
                },
                &quot;runAfter&quot;: {},
                &quot;runtimeConfiguration&quot;: {
                    &quot;contentTransfer&quot;: {
                        &quot;transferMode&quot;: &quot;Chunked&quot;
                    }
                },
                &quot;type&quot;: &quot;Http&quot;
            },
            &quot;Parse_JSON&quot;: {
                &quot;inputs&quot;: {
                    &quot;content&quot;: &quot;@body('HTTP')&quot;,
                    &quot;schema&quot;: {
                        &quot;$schema&quot;: &quot;http://json-schema.org/draft-04/schema#&quot;,
                        &quot;properties&quot;: {
                            &quot;dueDate&quot;: {
                                &quot;type&quot;: &quot;string&quot;
                            },
                            &quot;task&quot;: {
                                &quot;type&quot;: &quot;string&quot;
                            },
                            &quot;title&quot;: {
                                &quot;type&quot;: &quot;string&quot;
                            }
                        },
                        &quot;required&quot;: [
                            &quot;title&quot;,
                            &quot;task&quot;,
                            &quot;dueDate&quot;
                        ],
                        &quot;type&quot;: &quot;object&quot;
                    }
                },
                &quot;runAfter&quot;: {
                    &quot;HTTP&quot;: [
                        &quot;SUCCEEDED&quot;
                    ]
                },
                &quot;type&quot;: &quot;ParseJson&quot;
            }
        },
        &quot;contentVersion&quot;: &quot;1.0.0.0&quot;,
        &quot;outputs&quot;: {},
        &quot;triggers&quot;: {
            &quot;When_a_new_email_arrives_(V3)&quot;: {
                &quot;inputs&quot;: {
                    &quot;fetch&quot;: {
                        &quot;method&quot;: &quot;get&quot;,
                        &quot;pathTemplate&quot;: {
                            &quot;template&quot;: &quot;/v3/Mail/OnNewEmail&quot;
                        },
                        &quot;queries&quot;: {
                            &quot;fetchOnlyWithAttachment&quot;: false,
                            &quot;folderPath&quot;: &quot;Inbox&quot;,
                            &quot;from&quot;: &quot;email@domain.com&quot;,
                            &quot;importance&quot;: &quot;Any&quot;,
                            &quot;includeAttachments&quot;: false
                        }
                    },
                    &quot;host&quot;: {
                        &quot;connection&quot;: {
                            &quot;referenceName&quot;: &quot;office365&quot;
                        }
                    },
                    &quot;subscribe&quot;: {
                        &quot;body&quot;: {
                            &quot;NotificationUrl&quot;: &quot;@{listCallbackUrl()}&quot;
                        },
                        &quot;method&quot;: &quot;post&quot;,
                        &quot;pathTemplate&quot;: {
                            &quot;template&quot;: &quot;/GraphMailSubscriptionPoke/$subscriptions&quot;
                        },
                        &quot;queries&quot;: {
                            &quot;fetchOnlyWithAttachment&quot;: false,
                            &quot;folderPath&quot;: &quot;Inbox&quot;,
                            &quot;importance&quot;: &quot;Any&quot;
                        }
                    }
                },
                &quot;splitOn&quot;: &quot;@triggerBody()?['value']&quot;,
                &quot;type&quot;: &quot;ApiConnectionNotification&quot;
            }
        }
    },
    &quot;kind&quot;: &quot;Stateful&quot;
}
</pre>


<h3>Function Code</h3>



<p>Here&#8217;s the code for Azure Function. It&#8217;s really crappy code so please do not use it as is <img src="https://s.w.org/images/core/emoji/12.0.0-1/72x72/1f600.png" alt="😀" class="wp-smiley" style="height: 1em; max-height: 1em;" />.</p>


<pre class="brush: plain; title: ; notranslate">
using System.Collections.Generic;
using System.Net;
using Azure;
using Azure.AI.OpenAI;
using Microsoft.Azure.Functions.Worker;
using Microsoft.Azure.Functions.Worker.Http;
using Microsoft.Extensions.Logging;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;
using Microsoft.SemanticKernel.PromptTemplates.Handlebars;
using Newtonsoft.Json;

namespace ToDoListFunctionApp;

public static class HttpTrigger1
{
    private const string _azureOpenAIEndpoint = &quot;https://xxx.openai.azure.com/&quot;;
    private const string _azureOpenAIKey = &quot;your-api-key&quot;;
    private const string _azureOpenAIDeploymentId = &quot;gpt-4-32k&quot;;
    
    [Function(&quot;HttpTrigger1&quot;)]
    public static async Task&lt;HttpResponseData&gt; Run([HttpTrigger(AuthorizationLevel.Function, &quot;get&quot;, &quot;post&quot;)] HttpRequestData req,
        FunctionContext executionContext)
    {
        var logger = executionContext.GetLogger(&quot;HttpTrigger1&quot;);
        logger.LogInformation(&quot;C# HTTP trigger function processed a request.&quot;);

        IncomingMessage message = null;
        using (var streamReader = new StreamReader(req.Body))
        {
            var requestBody = await streamReader.ReadToEndAsync();
            message = JsonConvert.DeserializeObject&lt;IncomingMessage&gt;(requestBody);
        }

        if (string.IsNullOrWhiteSpace(message.From) || string.IsNullOrWhiteSpace(message.Subject) ||
            string.IsNullOrWhiteSpace(message.Body))
        {
            throw new InvalidOperationException();
        }

        var kernel = GetKernel();
        var path = Path.Combine(Directory.GetCurrentDirectory(), &quot;Prompt.yaml&quot;);
        var function = kernel.CreateFunctionFromPromptYaml(await File.ReadAllTextAsync(path),
            new HandlebarsPromptTemplateFactory());
        var openAIPromptSettings = new OpenAIPromptExecutionSettings()
        {
            Temperature = 0
        };        
        var kernelArguments = new KernelArguments(openAIPromptSettings)
        {
            [&quot;current_date&quot;] = DateTime.UtcNow.Date.ToString(&quot;yyyy-MM-ddTHH:mm:ssZ&quot;),
            [&quot;sender&quot;] = message.From,
            [&quot;subject&quot;] = message.Subject,
            [&quot;body&quot;] = message.Body,
        };
        var result = (await kernel.InvokeAsync(function, kernelArguments)).ToString();
        var response = req.CreateResponse(HttpStatusCode.OK);
        response.Headers.Add(&quot;Content-Type&quot;, &quot;text/plain; charset=utf-8&quot;);

        response.WriteString(result);

        return response;
        
    }
    
    private static Kernel GetKernel()
    {
        var azureOpenAIClient =
            new OpenAIClient(new Uri(_azureOpenAIEndpoint), new AzureKeyCredential(_azureOpenAIKey));
        var kernelBuilder = Kernel.CreateBuilder();
        kernelBuilder.AddAzureOpenAIChatCompletion(_azureOpenAIDeploymentId, azureOpenAIClient);
        var kernel = kernelBuilder.Build();
        return kernel;
    }
}

class IncomingMessage
{
    [JsonProperty(PropertyName = &quot;from&quot;)]
    public string From { get; set; }

    [JsonProperty(PropertyName = &quot;subject&quot;)]
    public string Subject { get; set; }
    
    [JsonProperty(PropertyName = &quot;body&quot;)]
    public string Body { get; set; }
}

class TaskDetails
{
    [JsonProperty(PropertyName = &quot;title&quot;)]
    public string Title { get; set; }
    
    [JsonProperty(PropertyName = &quot;task&quot;)]
    public string Task { get; set; }
    
    [JsonProperty(PropertyName = &quot;dueDate&quot;)]
    public string DueDate { get; set; }
}
</pre>


<p>It makes use of Microsoft Semantic Kernel.</p>



<h3>Prompt</h3>



<p>This is the heart and soul of this entire application.</p>


<pre class="brush: plain; title: ; notranslate">
name: ToDoCreator
description: Use this function to analyze the user's ask and create a task for the user.
template_format: handlebars
template: |
  &lt;message role=&quot;system&quot;&gt;
  Grounding Rules:
  ================
  - You are an AI assistant specializing in comprehending text and extracting meaningful information from it based on the user's ask.
  - Maintain honesty. If uncertain of an answer, respond with, &quot;I apologize, but I currently lack sufficient information to accurately answer your question.&quot;.
  - Uphold user privacy. Do not ask for, store, or share personal data without explicit permission.  
  - Promote inclusivity and respect. Do not engage in or tolerate hate speech, discrimination, or bigotry of any form. Treat all users equally, irrespective of race, ethnicity, religion, gender, age, nationality, or disability.  
  - Respect copyright laws and intellectual property rights. Do not share, reproduce, or distribute copyrighted material without the appropriate authorization.  
  - Provide precise and concise responses. Maintain a respectful and professional tone in all interactions. 
  - Wait for the user's question before providing information. Stay within your domain of expertise - text comprehension and extracting meaningful information from that text.  
  - Ensure responses are up-to-date and accessible. Avoid unnecessary jargon and technical language when possible.
  
  Rules of Engagement:
  ====================
  - User has provided you the information from the email they have received. The information contains three things:
    - 1. Sender information: Name of the person who has sent the email.
    - 2. Subject: Subject of the email.
    - 3. Body: Body of the email.
  - User is expecting you to analyze the body and the subject of the email and identify if the sender wants the user to do something in that email (task).
  - Your job is to create a task based on the information provided to you.
  - Please do not assume anything and only consider the information provided to you below.
  - ALWAYS produce the output in JSON format. The output JSON format should be: {&quot;title&quot;: &quot;task title&quot;, &quot;task&quot;: &quot;task details&quot;, &quot;dueDate&quot;: &quot;task due date&quot;}.
  - You will be provided today's date. Based on that and the task, infer a due date for the task. If a due date cannot be inferred from the task, use a date one week from today's date as the task due date.
  - Task due date must always be outputted in ISO-8601 format (yyyy-MM-ddTHH:mm:ssZ). For example, 2024-01-01T15:00:00Z.
  - MOST IMPORTANTLY, if no task can be created based on the information provided, you must return an empty response. Do not make up a task.
  &lt;/message&gt;
  
  &lt;message role=&quot;user&quot;&gt;
  Today's Date
  ============
  {{current_date}}
  
  Sender
  ======
  {{sender}}
  
  Subject
  =======
  {{subject}}
  
  Body
  ====
  {{body}}
  &lt;/message&gt;
  
  &lt;message role=&quot;system&quot;&gt;
  Considering the information provided to you, please create a task and output it in the following format: {&quot;title&quot;: &quot;task title&quot;, &quot;task&quot;: &quot;task details&quot;, &quot;dueDate&quot;: &quot;task due date&quot;}. If no task can be created, simply return an empty response.
  &lt;/message&gt;
input_variables:
  - name: current_date
    description: current date in yyyy-MM-ddTHH:mm:ssZ format.
    is_required: true
  - name: sender
    description: email sender information.
    is_required: true
  - name: subject
    description: email subject
    is_required: true
  - name: body
    description: email body
    is_required: true
execution_settings:
  default:
    temperature: 0
</pre>


<h2>Summary</h2>



<p>That&#8217;s it for this post. This was the first time I used Azure Logic Apps, and I am genuinely impressed! For the first time, I have realized the convenience of a No Code/Low Code platform. Combine that with the power of an LLM, and we have ourselves a winner!</p>



<p>I will continue to explore this more and share my learnings.</p>



<p>Till then, be well and happy (low/no) coding! </p>
<p>The post <a rel="nofollow" href="https://gauravmantri.com/2024/03/18/smart-to-do-creator-combining-the-ease-of-azure-logic-apps-and-the-power-of-generative-ai/">Smart To-Do Creator: Combining the ease of Azure Logic Apps and the power of Generative AI</a> appeared first on <a rel="nofollow" href="https://gauravmantri.com">GauravMantri.com</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://gauravmantri.com/2024/03/18/smart-to-do-creator-combining-the-ease-of-azure-logic-apps-and-the-power-of-generative-ai/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Azure Sidekick – An AI Assistant to Answer Questions About Your Azure Resources (Part III – Lessons Learned)</title>
		<link>https://gauravmantri.com/2024/02/16/azure-sidekick-an-ai-assistant-to-answer-questions-about-your-azure-resources-part-iii-lessons-learned/</link>
					<comments>https://gauravmantri.com/2024/02/16/azure-sidekick-an-ai-assistant-to-answer-questions-about-your-azure-resources-part-iii-lessons-learned/#respond</comments>
		
		<dc:creator><![CDATA[Gaurav]]></dc:creator>
		<pubDate>Fri, 16 Feb 2024 00:28:44 +0000</pubDate>
				<category><![CDATA[Azure]]></category>
		<category><![CDATA[Generative AI]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Azure Sidekick]]></category>
		<category><![CDATA[Prompt Engineering]]></category>
		<category><![CDATA[Prompt Patterns]]></category>
		<category><![CDATA[Semantic Kernel]]></category>
		<guid isPermaLink="false">https://gauravmantri.com/?p=1960</guid>

					<description><![CDATA[<p>Best way to learn a new technology is by building something (regardless of how big or small it is) with it. This was my primary intent behind building Azure Sidekick. I had so much fun building this and learnt a lot of things along the way. Not only I learnt more about Microsoft Semantic Kernel [&#8230;]</p>
<p>The post <a rel="nofollow" href="https://gauravmantri.com/2024/02/16/azure-sidekick-an-ai-assistant-to-answer-questions-about-your-azure-resources-part-iii-lessons-learned/">Azure Sidekick – An AI Assistant to Answer Questions About Your Azure Resources (Part III – Lessons Learned)</a> appeared first on <a rel="nofollow" href="https://gauravmantri.com">GauravMantri.com</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>The best way to learn a new technology is by building something with it (regardless of how big or small). That was my primary intent behind building Azure Sidekick.</p>



<p>I had so much fun building this and learnt a lot of things along the way. Not only did I learn more about Microsoft Semantic Kernel and Prompt Engineering, I also picked up a thing or two in .Net as well (<code><a rel="noreferrer noopener" href="https://learn.microsoft.com/en-us/dotnet/api/microsoft.extensions.dependencyinjection.servicecollectionserviceextensions.addkeyedsingleton?view=dotnet-plat-ext-8.0" target="_blank">AddKeyedSingleton</a></code> and <code><a rel="noreferrer noopener" href="https://learn.microsoft.com/en-us/dotnet/api/system.collections.generic.iasyncenumerable-1?view=net-7.0" target="_blank">IAsyncEnumerable&lt;T></a></code>, to name a few).</p>



<p>In my previous posts about this, I talked about the features and how I built it. You can read those posts <a rel="noreferrer noopener" href="https://gauravmantri.com/2024/02/11/azure-sidekick-an-ai-assistant-to-answer-questions-about-your-azure-resources-introduction/" target="_blank">here</a> and <a rel="noreferrer noopener" href="https://gauravmantri.com/2024/02/14/azure-sidekick-an-ai-assistant-to-answer-questions-about-your-azure-resources-part-ii-prompt-patterns-more/" target="_blank">here</a>.</p>



<p>In this post I will talk about the lessons learned.</p>



<p>So let&#8217;s begin!</p>



<h2>Lessons Learned</h2>



<p>Here are some of the lessons I learned while developing this tool.</p>



<h3>Writing prompts is hard</h3>



<p>Writing great prompts is really, really hard. An LLM is a black box, and we do not know how it is going to respond to our queries. The only solution to this problem is to write really good prompts. That requires subject matter expertise as well as the ability to clearly communicate the ask, which unfortunately not many of us (including yours truly) possess.</p>



<p>Luckily, we can make use of LLMs to help us write great prompts. I wrote about how I used this approach while building this tool in one of my recent posts, which you can read <a href="https://gauravmantri.com/2024/01/29/writing-prompts-is-hard-luckily-theres-an-easy-way-out/" target="_blank" rel="noreferrer noopener">here</a>.</p>



<p> You really need to work with the LLM to get the right prompt. I spent countless hours just to get a single prompt working. </p>



<p>For example, when I was working on the intent recognition prompt, just to get the &#8220;Storage&#8221; intent returned correctly, I started by asking a question like &#8220;Given a question about Azure, how would you know that the question is about Azure Storage?&#8221;. The LLM gave me an answer, and I then asked follow-up questions about that answer (&#8220;what about this&#8221;, &#8220;what about that&#8221;, etc.) until it gave me something I felt really comfortable with.</p>



<p>This is what it finally looks like:</p>


<pre class="brush: plain; title: ; notranslate">
&lt;intent&gt;
    &lt;name&gt;Storage&lt;/name&gt;
    &lt;description&gt;
    Return this intent if the question:
    - Keywords: Has keywords such as 'Azure Storage', 'Blob Storage', 'Queue Storage', 'File Storage', 'Table Storage', 'Disk Storage', 'Data Lake Storage', 'Backup', or 'Azure Storage Explorer'.
    - Context: Has context regarding Azure's storage capabilities, types of storage, data management, or other aspects specifically related to Azure Storage.
    - Features: Mention of specific features such as security, scalability, redundancy, data replication, or pricing within the Azure Storage ecosystem.
    - Operation or Commands: Asking for details about operations or commands in Azure Storage, like creating a storage account, uploading or downloading data, managing data, etc.
    - Troubleshooting: Looking for troubleshooting tips, specifically seeking solutions for issues or errors encountered while using Azure Storage.
    - Comparisons: Seeking comparison between Azure Storage and other cloud storage services like AWS S3, Google Cloud Storage, etc. and also comparison between various Azure Storage services.
    - Rules: Looking for rules, restrictions and naming conventions applicable to storage accounts, blob containers &amp;amp; blobs, queues &amp;amp; messages, tables &amp;amp; entities and file shares, directories &amp;amp; files.
    - Storage resources: Finding information about storage accounts, blob containers &amp;amp; blobs, queues &amp;amp; messages, tables &amp;amp; entities and file shares, directories &amp;amp; files in an Azure subscription.
    - &quot;How to&quot; questions related to Azure Storage.
    &lt;/description&gt;
&lt;/intent&gt;
</pre>


<h3>Stop thinking like a developer</h3>



<p>We developers are very rational in our thinking. Somehow (through experience or otherwise) it is hardwired into our brains that if we write code to add two numbers (and wrote it correctly), it will always give us the correct result. This is probably because we know how programming languages work, so we implicitly trust them.</p>



<p>Unfortunately, that is not the case when you are building an AI application. We do not know anything about the underlying LLM or how it will process our question, and that creates a problem for us developers.</p>



<p>To solve this dilemma, stop thinking like a developer. </p>



<p>Do not assume that because you can write code that adds two numbers and reliably gives the proper result, an LLM can do the same.</p>



<p>Provide as much detail as possible in the prompt, even for the most trivial tasks. Do not leave anything out.</p>



<p>When you are writing prompts, put on your &#8220;Business Analyst&#8221; hat and write down each and every thing. Think of a prompt as a requirements document for another developer who is going to write an application for you.</p>



<h3>Streaming responses are always better</h3>



<p>After I built the very first version, I made a small video about the application to share with some folks. At that time, the application waited to receive the complete response from the LLM before sending it to the user. As I was recording the video, I immediately realized what a poor user experience (UX) that was.</p>



<p>My first priority after that was to return responses in a streaming fashion. The UX was much, much better, as the user starts seeing the response from the LLM immediately.</p>



<p>If you are building a chatbot kind of application, please opt for streamed responses wherever possible. Find a hack if you have to in order to achieve this. You will make your users very happy!</p>
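<p>To make this concrete, here is a minimal, self-contained sketch of streaming with <code>IAsyncEnumerable&lt;T></code>. This is illustrative only (the chunks are simulated), not the actual Azure Sidekick code; in the real application, each chunk would come from the LLM&#8217;s streaming API.</p>

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Simulates chunks arriving from an LLM's streaming API.
static async IAsyncEnumerable<string> GetStreamingAnswerAsync()
{
    string[] chunks = { "Azure ", "Lighthouse ", "enables ", "cross-tenant management." };
    foreach (var chunk in chunks)
    {
        await Task.Delay(10); // stand-in for network latency between chunks
        yield return chunk;
    }
}

// The caller renders each chunk as soon as it arrives instead of
// waiting for the complete response.
await foreach (var chunk in GetStreamingAnswerAsync())
{
    Console.Write(chunk);
}
Console.WriteLine();
```

<p>The nice part of this shape is that the consumer (a console renderer, a web response, etc.) does not need to know anything about the producer; it just iterates with <code>await foreach</code>.</p>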



<h3>Don&#8217;t make it a chore</h3>



<p>It&#8217;s more a piece of advice than a lesson, but have loads of fun building whatever you are building (I know I did). Don&#8217;t make it a chore!</p>



<p>Remember, you are building something not because you have to but because you want to! There is a huge difference between &#8220;have to&#8221; and &#8220;want to&#8221;.</p>



<p>Because you are learning along the way, don&#8217;t be afraid of making mistakes. Keep correcting your mistakes and keep improving as you learn. For example, when I first implemented streaming responses, I was not aware of &#8220;IAsyncEnumerable&#8221; and implemented them using events. When I came to know about it, I changed the code to make use of it, and the code was much cleaner that way.</p>



<h2>Summary</h2>



<p>That&#8217;s it for this post and this series about Azure Sidekick. By no means am I done with this project (I will be implementing planners shortly). As I make significant changes or learn new things, I will share them here.</p>



<p>In the next post, I am going to talk about a nifty little trick with Semantic Kernel I learned recently that could potentially lower your Azure OpenAI bill.</p>



<p>Keep learning and keep building!</p>
<p>The post <a rel="nofollow" href="https://gauravmantri.com/2024/02/16/azure-sidekick-an-ai-assistant-to-answer-questions-about-your-azure-resources-part-iii-lessons-learned/">Azure Sidekick – An AI Assistant to Answer Questions About Your Azure Resources (Part III – Lessons Learned)</a> appeared first on <a rel="nofollow" href="https://gauravmantri.com">GauravMantri.com</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://gauravmantri.com/2024/02/16/azure-sidekick-an-ai-assistant-to-answer-questions-about-your-azure-resources-part-iii-lessons-learned/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Azure Sidekick – An AI Assistant to Answer Questions About Your Azure Resources (Part II &#8211; Prompt Patterns &#038; More)</title>
		<link>https://gauravmantri.com/2024/02/14/azure-sidekick-an-ai-assistant-to-answer-questions-about-your-azure-resources-part-ii-prompt-patterns-more/</link>
					<comments>https://gauravmantri.com/2024/02/14/azure-sidekick-an-ai-assistant-to-answer-questions-about-your-azure-resources-part-ii-prompt-patterns-more/#respond</comments>
		
		<dc:creator><![CDATA[Gaurav]]></dc:creator>
		<pubDate>Wed, 14 Feb 2024 01:47:46 +0000</pubDate>
				<category><![CDATA[Azure]]></category>
		<category><![CDATA[Generative AI]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Azure Sidekick]]></category>
		<category><![CDATA[Prompt Engineering]]></category>
		<category><![CDATA[Prompt Patterns]]></category>
		<category><![CDATA[Semantic Kernel]]></category>
		<guid isPermaLink="false">https://gauravmantri.com/?p=1951</guid>

					<description><![CDATA[<p>In my previous post about Azure Sidekick, I gave a general introduction about the tool and its capabilities. If you have not read that post so far, I would strongly encourage you to read that first. You can read that post here. In this post, I will talk about the general flow of the application, [&#8230;]</p>
<p>The post <a rel="nofollow" href="https://gauravmantri.com/2024/02/14/azure-sidekick-an-ai-assistant-to-answer-questions-about-your-azure-resources-part-ii-prompt-patterns-more/">Azure Sidekick – An AI Assistant to Answer Questions About Your Azure Resources (Part II &#8211; Prompt Patterns &#038; More)</a> appeared first on <a rel="nofollow" href="https://gauravmantri.com">GauravMantri.com</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>In my previous post about Azure Sidekick, I gave a general introduction about the tool and its capabilities. If you have not read that post so far, I would strongly encourage you to read that first. You can read that post <a rel="noreferrer noopener" href="https://gauravmantri.com/2024/02/11/azure-sidekick-an-ai-assistant-to-answer-questions-about-your-azure-resources-introduction/" target="_blank">here</a>.</p>



<p>In this post, I will talk about the general flow of the application, prompt patterns used there and some other things.</p>



<p>So, let&#8217;s start!</p>



<h2>Application Flow</h2>



<p>From the outside, it looks like a very simple application: you ask it a question (about Azure or Storage accounts in your subscription) and then it answers it. However, a lot is happening behind the scenes and the application goes through multiple steps to answer your question.</p>



<p><strong>1. Rephrase/Reword Step</strong>: When you ask a question, the first thing the application does is try to rephrase/reword it by sending your question to the LLM. The LLM takes the grounding rules, context, and chat history (if available) into consideration and replies with a rephrased question.</p>



<p><strong>2. Intent Recognition Step</strong>: Next, the rephrased question is sent to the LLM to recognize the intent of the question. This is a very important step, because routing happens based on its result; it also serves as an &#8220;input&#8221; guardrail.</p>



<p><strong>3. Answering Step</strong>: This is the final step where, based on the intent, the appropriate plugins and tools are loaded and the LLM provides an answer to the question.</p>



<p>However, depending on the intent, this step may itself involve multiple sub-steps.</p>



<p>For example, if the question is about storage accounts, then the application will fetch the list of storage accounts from the selected subscription and then send that data to the LLM to generate an answer.</p>



<p>Similarly, if the question is about a specific storage account, the application will first try to extract the storage account name from the question, fetch its properties, and then send that data to the LLM to generate an answer.</p>



<p>I must confess that the code currently does not utilize the full potential of LLMs, especially their reasoning and planning capabilities. This is something I have started looking into recently, and hopefully I will incorporate those in the next few releases.</p>
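<p>The three steps above can be sketched as a simple pipeline. Everything below is an illustrative mock: the function names and canned responses are hypothetical stand-ins, and in the real application each step is an LLM call.</p>

```csharp
using System;

// Step 1: rephrase/reword the question (the real application sends it to
// the LLM along with grounding rules, context and chat history).
static string Rephrase(string question) =>
    question.Equals("Lighthouse", StringComparison.OrdinalIgnoreCase)
        ? "Please tell me about Azure Lighthouse and its features."
        : question;

// Step 2: recognize the intent of the rephrased question.
static string RecognizeIntent(string question) =>
    question.Contains("storage account", StringComparison.OrdinalIgnoreCase)
        ? "Storage"
        : question.Contains("Azure", StringComparison.OrdinalIgnoreCase)
            ? "Azure"
            : "Other";

// Step 3: route based on intent; "Other" doubles as an input guardrail.
static string Answer(string question, string intent) => intent switch
{
    "Storage" => $"[storage plugins loaded] {question}",
    "Azure"   => $"[general Azure knowledge] {question}",
    _         => "I can only answer questions about Azure."
};

var rephrased = Rephrase("Lighthouse");
var intent = RecognizeIntent(rephrased);
Console.WriteLine(Answer(rephrased, intent));
// prints: [general Azure knowledge] Please tell me about Azure Lighthouse and its features.
```

<p>Note how anything that falls into the &#8220;Other&#8221; intent gets a canned refusal instead of ever reaching the answering step.</p>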



<h2>Prompt Patterns</h2>



<p>Now that you know how the application works, let&#8217;s talk about some of the prompt patterns I have used to build this application. I wrote a long post about prompt patterns sometime back which you can read <a rel="noreferrer noopener" href="https://gauravmantri.com/2023/10/05/prompt-patterns-every-generative-ai-developer-must-know/" target="_blank">here</a>.</p>



<p>Here are the prompt patterns I have used so far:</p>



<h3>Rephrase/Reword</h3>



<h4>What it does</h4>



<p>As the name suggests, it rephrases or rewords a user&#8217;s question so that it becomes easier for an LLM to understand and process. It makes use of grounding rules, context and chat history (if available) to rephrase a question.</p>



<h4>Why it is needed</h4>



<p>During the course of development, I found many reasons as to why it is needed:</p>



<ul><li><strong>Incomplete question</strong>: A user may have asked an incomplete question. For example, the user may just write &#8220;Lighthouse&#8221; (when they wanted to ask about Azure Lighthouse). The rephrase prompt will produce an output like &#8220;Please tell me about Azure Lighthouse and its features&#8221;.</li></ul>



<ul><li><strong>Lacking context</strong>: When we converse amongst ourselves, we can somehow understand the context of a question based on the previous questions; however, LLMs lack that capability. Chat history helps, but relying on chat history alone is just not sufficient; using this prompt pattern helps. For example, let&#8217;s say the user has first asked a question like &#8220;How many storage accounts are there in my subscription?&#8221; and then asks &#8220;Can you name them?&#8221;. The rephrase prompt will produce an output like &#8220;Please give me the names of the storage accounts that are there in my subscription&#8221;.</li></ul>



<h3>Intent Recognition</h3>



<h4>What it does</h4>



<p>This prompt pattern is used to recognize the intent of the question, i.e. to find out what the question is about. Whether it is about Azure, Azure Storage, or something else is what is determined using this prompt pattern.</p>



<h4>Why it is needed</h4>



<p>There are many use cases for this pattern. In the context of this application, it is used for the following things:</p>



<ul><li><strong>Routing</strong>: A user may be asking a general question about Azure, a question concerning their storage accounts, or a question related to a specific storage account. The user&#8217;s question may also be ambiguous or not even related to Azure. This prompt pattern helps the application identify the intent of the question and route it accordingly.</li><li><strong>Input Guardrails</strong>: This pattern also helps us enforce input guardrails. For example, let&#8217;s say you ask a general question like &#8220;What is the capital of India?&#8221;. Even though the LLM knows the answer, because of this pattern the user will be informed that the purpose of this tool is to answer questions about Azure only. This becomes really important if you are building an AI application for your enterprise: you would want to constrain the application to answer only questions related to your enterprise. If you need proof of how badly things can go otherwise, see this news story about the UK-based parcel delivery company DPD: <a rel="noreferrer noopener" href="https://www.bbc.com/news/technology-68025677" target="_blank">https://www.bbc.com/news/technology-68025677</a>.</li><li><strong>Relevant chat history</strong>: Another use we found very valuable in our application is filtering the chat history based on the intent of the question and passing only that to the prompt.</li></ul>



<h3>Named Entity Recognition/Extraction</h3>



<h4>What it does</h4>



<p>This pattern is used to recognize and extract the named entities in a question. For example, if the user asks a question like &#8220;Can I host a static website in xyz storage account?&#8221;, this pattern will recognize that the question is about the &#8220;xyz&#8221; storage account and extract &#8220;xyz&#8221; from it.</p>



<h4>Why it is needed</h4>



<ul><li><strong>Native code execution</strong>: In the context of this application, we need this information to execute native code (to get the properties of a storage account).</li></ul>



<h2>Grounding Rules</h2>



<p>Next, I want to talk about setting the grounding rules. Grounding rules are very important in any AI application, as they set the overall constraints for your application. You can use these grounding rules to:</p>



<ul><li>Give your application a persona;</li><li>Set the boundaries;</li><li>Enforce legal restrictions;</li><li>Set the overall tone of the responses; etc.</li></ul>



<p>For example, here are the grounding rules we use in the application:</p>


<pre class="brush: plain; title: ; notranslate">
- You are Azure Sidekick, an AI assistant specializing in Azure, tasked with providing accurate and knowledgeable responses to user inquiries about Azure.
- Maintain honesty. If uncertain of an answer, respond with, &quot;I apologize, but I currently lack sufficient information to accurately answer your question.&quot;
- Uphold user privacy. Do not ask for, store, or share personal data without explicit permission. 
- Promote inclusivity and respect. Do not engage in or tolerate hate speech, discrimination, or bigotry of any form. Treat all users equally, irrespective of race, ethnicity, religion, gender, age, nationality, or disability.  
- Respect copyright laws and intellectual property rights. Do not share, reproduce, or distribute copyrighted material without the appropriate authorization.  
- Provide precise and concise responses. Maintain a respectful and professional tone in all interactions. 
- Wait for the user's question before providing information. Stay within your domain of expertise - Azure and related services.
- Ensure responses are up-to-date and accessible. Avoid unnecessary jargon and technical language when possible.
</pre>


<h2>Chat History</h2>



<p>Now, let&#8217;s talk about chat history. Simply put, chat history is the history of the conversation between the user and the AI application. Usually, it contains the question asked by the user and the answer given by the application.</p>



<p>Chat history is an important part of any LLM application, as it helps set a proper context for the question being asked and thus increases the chances of getting a better answer to the question.</p>



<p>There are a few things to keep in mind when using chat history in the prompts:</p>



<ul><li><strong>Use context-sensitive chat history</strong>: What I mean by that is use only the portion of chat history that is relevant to the question being asked. For example, if the user is asking a question about storage accounts, only include chat history related to storage in the prompt.</li><li><strong>Avoid unnecessary chat history</strong>: Even with context-sensitive chat history, you may end up with many items relevant to that context. However, that does not mean you pass everything to your prompt. Just pass the last 3, 5, or 10 context-sensitive chat history items to the prompt.</li></ul>



<p>Following these guidelines will ensure that you are not unnecessarily bloating your prompts, and thus save on input prompt tokens :), without compromising on functionality.</p>
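<p>Both guidelines can be sketched in a few lines. The names below are hypothetical illustrations, not the actual code: filter the history down to the current question&#8217;s intent, then keep only the most recent few items.</p>

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Keep only the history items matching the current intent, then just the
// most recent few, so the prompt is not bloated with irrelevant turns.
static IReadOnlyList<(string Intent, string Question, string Answer)> SelectHistoryForPrompt(
    IEnumerable<(string Intent, string Question, string Answer)> history,
    string currentIntent, int maxItems = 5) =>
    history
        .Where(item => item.Intent == currentIntent) // context-sensitive history
        .TakeLast(maxItems)                          // drop all but the latest N
        .ToList();

var history = new List<(string Intent, string Question, string Answer)>
{
    ("Storage", "How many storage accounts are in my subscription?", "Three."),
    ("Azure",   "What is Azure Lighthouse?", "Azure Lighthouse enables..."),
    ("Storage", "Can you name them?", "acct1, acct2 and acct3."),
};

var selected = SelectHistoryForPrompt(history, currentIntent: "Storage", maxItems: 2);
Console.WriteLine(selected.Count); // prints: 2
```

<p>Only the two storage-related turns survive the filter; the unrelated Azure Lighthouse turn never reaches the prompt.</p>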



<h2>Goal Reminder</h2>



<p>This is an interesting thing I have found: when I reminded the LLM of its goal, I got better results. The best way to do so is to include a summarized goal at the end of the prompt.</p>



<p>For example, this is how I am ending the rephrase prompt:</p>


<pre class="brush: plain; title: ; notranslate">
Considering the information provided to you, 
please rephrase or reword the current question 
to increase its clarity and specificity for a 
language model. Consider identifying the 
key elements or concepts within the question, 
ensuring the language is precise, and avoiding 
any ambiguity or overly complex language. 
Remember to incorporate the context provided by 
the previous questions and answers. 
Your goal is to create a revised question that 
maintains the original intent, but is more 
easily understood by an LLM when considering 
the conversation history.
</pre>


<h2>Summary</h2>



<p>That&#8217;s it for this post. It turned out to be somewhat lengthy but I couldn&#8217;t help it as I had so much to share :). I hope you have enjoyed this post.</p>



<p>In the next (and concluding) post in this series, I will talk about the lessons learned while building this. Stay tuned for that.</p>



<p>Until then, be well!</p>
<p>The post <a rel="nofollow" href="https://gauravmantri.com/2024/02/14/azure-sidekick-an-ai-assistant-to-answer-questions-about-your-azure-resources-part-ii-prompt-patterns-more/">Azure Sidekick – An AI Assistant to Answer Questions About Your Azure Resources (Part II &#8211; Prompt Patterns &#038; More)</a> appeared first on <a rel="nofollow" href="https://gauravmantri.com">GauravMantri.com</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://gauravmantri.com/2024/02/14/azure-sidekick-an-ai-assistant-to-answer-questions-about-your-azure-resources-part-ii-prompt-patterns-more/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Azure Sidekick &#8211; An AI Assistant to Answer Questions About Your Azure Resources (Introduction)</title>
		<link>https://gauravmantri.com/2024/02/11/azure-sidekick-an-ai-assistant-to-answer-questions-about-your-azure-resources-introduction/</link>
					<comments>https://gauravmantri.com/2024/02/11/azure-sidekick-an-ai-assistant-to-answer-questions-about-your-azure-resources-introduction/#respond</comments>
		
		<dc:creator><![CDATA[Gaurav]]></dc:creator>
		<pubDate>Sun, 11 Feb 2024 01:55:06 +0000</pubDate>
				<category><![CDATA[Azure]]></category>
		<category><![CDATA[Generative AI]]></category>
		<category><![CDATA[Azure Sidekick]]></category>
		<category><![CDATA[Azure Storage]]></category>
		<category><![CDATA[Semantic Kernel]]></category>
		<guid isPermaLink="false">https://gauravmantri.com/?p=1947</guid>

					<description><![CDATA[<p>I am pleased to present to you Azure Sidekick, an AI assistant that can answer questions about the resources running in your Azure Subscriptions. Unlike my other posts which are quite code heavy, in these series of posts about this tool, I will take you through the journey I undertook while building this. I will [&#8230;]</p>
<p>The post <a rel="nofollow" href="https://gauravmantri.com/2024/02/11/azure-sidekick-an-ai-assistant-to-answer-questions-about-your-azure-resources-introduction/">Azure Sidekick &#8211; An AI Assistant to Answer Questions About Your Azure Resources (Introduction)</a> appeared first on <a rel="nofollow" href="https://gauravmantri.com">GauravMantri.com</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>I am pleased to present to you <strong>Azure Sidekick</strong>, an AI assistant that can answer questions about the resources running in your Azure Subscriptions.</p>



<p>Unlike my other posts, which are quite code heavy, in this series of posts about the tool I will take you through the journey I undertook while building it. I will talk about my motivations behind building it and some things I learned along the way.</p>



<p>In this post, I will talk about what this AI assistant is and what it can do today. I will talk about why I built it, and then about the tools and technologies used to build it.</p>



<p>So, let&#8217;s begin. </p>



<h2>What is Azure Sidekick?</h2>



<p>As mentioned above, Azure Sidekick is an AI assistant that can (<strong>eventually</strong>) answer questions about the resources running in your Azure Subscriptions using natural language.</p>



<p>Currently, the following scenarios are supported:</p>



<ul><li><strong>General Azure Questions</strong>: It can answer your general Azure questions. For example, you can ask it to explain Azure Lighthouse to you, or ask it to compare the different messaging services available in Azure. For this, it relies on the knowledge of the LLM (so the information may be outdated). In future, I might integrate Bing search so that the most current information is available when answering general Azure questions.</li><li><strong>Questions about Storage Accounts</strong>: It can answer questions about the storage accounts in an Azure Subscription. For example, you can ask it to find storage accounts that are tagged/not tagged/tagged with specific name/value pairs, or find storage accounts of a certain kind, etc. Basically, you can ask questions about the storage accounts in your Azure Subscription based on their properties.</li><li><strong>Questions about a specific Storage Account</strong>: It can answer questions about a specific storage account in an Azure Subscription. For example, you can ask it whether a specific storage account can host a static website, or ask for its location, etc. Basically, you can ask questions about a specific storage account in your Azure Subscription based on its properties.</li><li><strong>General Azure Storage Questions</strong>: Lastly, you can use it to get answers to general questions related to Azure Storage. For example, you can ask it for the naming rules for a storage account, or the limitations on the number of blob containers you can create, etc.</li></ul>



<p>Currently, the scope of this assistant is very limited, but one thing to understand is that Azure is vast, with over 200 services; support for more services will come in due course.</p>



<p>I prepared a small video (nothing fancy by any means) about it that you can watch.</p>



<figure class="wp-block-embed-youtube wp-block-embed is-type-video is-provider-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio"><div class="wp-block-embed__wrapper">
<iframe title="Azure Sidekick - An AI assistant to answer questions about your Azure resources." width="500" height="281" src="https://www.youtube.com/embed/LcUdn8bNCng?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>
</div></figure>



<h2>Why did I build it?</h2>



<p>Simple answer, <strong>to scratch the itch to build an AI assistant from start to end</strong> :). I have been reading a lot about Generative AI and wanted to go through the &#8220;pain&#8221; of building an AI assistant to apply my learnings and to learn new things.</p>



<p>Before building this tool, I had some beliefs (like, what&#8217;s the big deal with prompts?) and I was happy to see that some of them were quite misplaced :). I also got an opportunity to test some of my hypotheses.</p>



<p>Building this tool taught me a lot; some of those learnings are reflected in my recent posts, and I will share more in future posts as well.</p>



<h2>Tools &amp; Technologies </h2>



<p>In its current form, Azure Sidekick is a console application written in <strong>.Net 8</strong>. For Generative AI, it uses <strong>Azure OpenAI</strong> (I used a GPT-4 model) with <strong>Microsoft Semantic Kernel</strong> for AI orchestration.</p>



<p>Some time back I wrote an introductory post about Microsoft Semantic Kernel, and at that time I wondered what the need for it was when more robust and feature-rich orchestration engines like LangChain were available (one of the beliefs that got shattered :)). </p>



<p>After using it for some time now, I am pleasantly surprised with the progress it has made and how robust it is. If you are building a Gen AI application using .Net, please give it a serious try. I think you will not be disappointed.</p>



<h2>Source Code</h2>



<p>I am very happy to say that the tool is open source and is available on GitHub: </p>



<p><a rel="noreferrer noopener" href="https://github.com/gmantri/azure-sidekick" target="_blank">https://github.com/gmantri/azure-sidekick</a></p>



<p>My humble request is that you try it out and let me know your feedback. It would be even more awesome if we could collaborate on this.</p>



<h2>Conclusion</h2>



<p>That&#8217;s it for this post. I sincerely hope that you will use this. In the next post, I will talk about some things I learned while building this, so stay tuned for that.</p>



<p>So long!</p>
<p>The post <a rel="nofollow" href="https://gauravmantri.com/2024/02/11/azure-sidekick-an-ai-assistant-to-answer-questions-about-your-azure-resources-introduction/">Azure Sidekick &#8211; An AI Assistant to Answer Questions About Your Azure Resources (Introduction)</a> appeared first on <a rel="nofollow" href="https://gauravmantri.com">GauravMantri.com</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://gauravmantri.com/2024/02/11/azure-sidekick-an-ai-assistant-to-answer-questions-about-your-azure-resources-introduction/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Microsoft Semantic Kernel &#8211; Some Tips &#038; Tricks To Get Prompt &#038; Completion Tokens</title>
		<link>https://gauravmantri.com/2024/02/07/microsoft-semantic-kernel-some-tips-tricks-to-get-prompt-completion-tokens/</link>
					<comments>https://gauravmantri.com/2024/02/07/microsoft-semantic-kernel-some-tips-tricks-to-get-prompt-completion-tokens/#respond</comments>
		
		<dc:creator><![CDATA[Gaurav]]></dc:creator>
		<pubDate>Wed, 07 Feb 2024 01:38:52 +0000</pubDate>
				<category><![CDATA[Azure]]></category>
		<category><![CDATA[Generative AI]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Prompt]]></category>
		<category><![CDATA[Prompt Engineering]]></category>
		<category><![CDATA[Semantic Kernel]]></category>
		<guid isPermaLink="false">https://gauravmantri.com/?p=1940</guid>

					<description><![CDATA[<p>In my previous post, I talked about how you can get rendered prompts. In this post, I am going to talk about ways to get prompt and completion tokens when using Microsoft Semantic Kernel. What are Tokens? Let&#8217;s first begin with what tokens really are. In very simple terms, a token is the smallest unit [&#8230;]</p>
<p>The post <a rel="nofollow" href="https://gauravmantri.com/2024/02/07/microsoft-semantic-kernel-some-tips-tricks-to-get-prompt-completion-tokens/">Microsoft Semantic Kernel &#8211; Some Tips &#038; Tricks To Get Prompt &#038; Completion Tokens</a> appeared first on <a rel="nofollow" href="https://gauravmantri.com">GauravMantri.com</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>In my previous <a rel="noreferrer noopener" href="https://gauravmantri.com/2024/02/05/microsoft-semantic-kernel-some-tips-tricks-to-get-rendered-prompts/" target="_blank">post</a>, I talked about how you can get rendered prompts. In this post, I am going to talk about ways to get prompt and completion tokens when using Microsoft Semantic Kernel.</p>



<h2>What are Tokens?</h2>



<p>Let&#8217;s first begin with what tokens really are. In very simple terms, <strong>a token is the smallest unit of data that a Large Language Model (LLM) can understand and process</strong>. The data could be text, an image, a video, a sound clip or any other data.</p>



<p>When it comes to text, a token could be a single character (like &#8220;a&#8221;), a partial word (like &#8220;happiness&#8221; in &#8220;unhappiness&#8221;), a full word (like &#8220;apple&#8221;) or even a combination of words. The key thing to remember is that a token is the smallest unit of data that an LLM can understand.</p>
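<p>As a rough illustration of the relationship between text length and token count, a common rule of thumb is that English text averages about four characters per token. The sketch below (written in Python for brevity) uses that heuristic only; it is not a real tokenizer, and the actual BPE tokenizers used by LLMs will split text differently.</p>

```python
# Back-of-the-envelope token estimate for English text: ~4 characters per token.
# This is a rough heuristic only; real tokenizers (e.g. BPE-based ones) differ.
def estimate_tokens(text: str) -> int:
    return max(1, round(len(text) / 4))

# Even a single word like "unhappiness" can span multiple tokens.
print(estimate_tokens("unhappiness"))
```

<p>Use estimates like this only for quick sanity checks; for accurate counts, use the tokenizer that matches your model.</p>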



<p>When dealing with tokens, you will come across two terms: Prompt tokens and Completion tokens.</p>



<ul><li><strong>Prompt tokens</strong> are the tokens representing the input prompt (i.e. the data being fed to an LLM).</li><li><strong>Completion tokens</strong> are the tokens representing the output generated by an LLM in response to the prompt.</li></ul>



<h2>Why should you care about Tokens?</h2>



<p>So, the next question is why you should care about them. The simple answer is: you pay for these tokens :). This is how the service providers make money. You are charged for both prompt and completion tokens. The pricing varies based on the service provider and the model you are using.</p>
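<p>To make the cost angle concrete, here is a minimal sketch (in Python, with made-up prices; always check your provider&#8217;s pricing page for real numbers) showing how prompt and completion tokens translate into a per-request cost. Prompt and completion tokens are typically billed at different rates, quoted per 1,000 tokens.</p>

```python
# Illustrative prices only (assumed, per 1,000 tokens); real prices vary
# by provider and model. Completion tokens usually cost more than prompt tokens.
PROMPT_PRICE_PER_1K = 0.03
COMPLETION_PRICE_PER_1K = 0.06

def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    # Each token class is billed at its own rate.
    return (prompt_tokens / 1000) * PROMPT_PRICE_PER_1K + (
        completion_tokens / 1000
    ) * COMPLETION_PRICE_PER_1K

# e.g. a request with 1,200 prompt tokens and 400 completion tokens
print(round(request_cost(1200, 400), 4))
```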



<p>Because you are paying for the tokens, you have to be extra careful with them. </p>



<p>You should ensure that your prompts are complete (otherwise you will not get a proper result) but concise and to the point so that you don&#8217;t end up breaking the bank. </p>



<p>Through your prompts, you should also put some constraints on the desired output. Be very explicit in your prompts about the size and kind of data you expect an LLM to produce. For example, if you expect that your output should not be more than 500 characters, explicitly mention that.</p>



<h2>How to get prompt and completion tokens?</h2>



<p>So, how do you get prompt and completion tokens in your LLM application built using Microsoft Semantic Kernel? </p>



<p>I will show you four ways to accomplish that. So, let&#8217;s start!</p>



<h3>1. Hooking into Kernel Events (Obsolete, Not Recommended)</h3>



<p>First way is by hooking into (Semantic) Kernel events. <code><a rel="noreferrer noopener" href="https://learn.microsoft.com/en-us/dotnet/api/microsoft.semantickernel.kernel?view=semantic-kernel-dotnet&amp;viewFallbackFrom=netstandard-2.0" target="_blank">Kernel</a></code> in Semantic Kernel exposes a <code><a rel="noreferrer noopener" href="https://learn.microsoft.com/en-us/dotnet/api/microsoft.semantickernel.kernel.functioninvoked?view=semantic-kernel-dotnet&amp;viewFallbackFrom=netstandard-2.0" target="_blank">FunctionInvoked</a></code>  event which gets fired when a function is invoked. You can consume this event to get the consumed tokens.</p>



<p>My code would be something like the following:</p>


<pre class="brush: plain; title: ; notranslate">
private Kernel GetKernel()
{
    var kernelBuilder = Kernel.CreateBuilder();
    var deploymentId = &quot;your-azure-openai-deployment-id&quot;;
    AzureOpenAIClient client = GetAzureOpenAIClientSomehow();        
    kernelBuilder.AddAzureOpenAIChatCompletion(deploymentId, client);
    var kernel = kernelBuilder.Build();
    kernel.FunctionInvoked += (sender, args) =&gt;
    {
        var metadata = args.Metadata;
        if (metadata == null || !metadata.ContainsKey(&quot;Usage&quot;)) return;
        var usage = (CompletionsUsage)metadata[&quot;Usage&quot;];
        if (usage == null) return;
        var promptTokens = usage.PromptTokens;
        var completionTokens = usage.CompletionTokens;
        // do something with the tokens
    };
    return kernel;
}

</pre>


<p><strong>However, you should not be using this approach as it has been marked as obsolete in the latest version</strong>. In fact, if you use this approach with version 1.3.0 of Semantic Kernel (the most current version at the time of writing this post), you will get a warning telling you not to use it.</p>



<h3>2. Using Filters (Experimental)</h3>



<p>This is another approach that you can take. I believe this feature was introduced recently, and it is recommended over hooking into kernel events.</p>



<p>Using filters is really easy. You basically create a custom filter class that implements the <code>IFunctionFilter</code> interface and then implement the <code>OnFunctionInvoking</code> and <code>OnFunctionInvoked</code> methods to suit your requirements. For example, I could simply write the prompt and completion tokens to the console.</p>



<p>So my code would be something like:</p>


<pre class="brush: plain; title: ; notranslate">
private class FunctionFilter : IFunctionFilter
{
    public void OnFunctionInvoking(FunctionInvokingContext context)
    {
    }

    public void OnFunctionInvoked(FunctionInvokedContext context)
    {
        var metadata = context.Result.Metadata;
        if (metadata == null || !metadata.ContainsKey(&quot;Usage&quot;)) return;
        var usage = (CompletionsUsage)metadata[&quot;Usage&quot;];
        if (usage == null) return;
        var promptTokens = usage.PromptTokens;
        var completionTokens = usage.CompletionTokens;
        // do something with the tokens
    }
}
</pre>


<p>And this is how I would wire up the filter in the Kernel:</p>


<pre class="brush: plain; title: ; notranslate">
kernel.FunctionFilters.Add(new FunctionFilter());
</pre>


<p>My complete code for kernel would be:</p>


<pre class="brush: plain; title: ; notranslate">
private Kernel GetKernel()
{
    var kernelBuilder = Kernel.CreateBuilder();
    var deploymentId = &quot;your-azure-openai-deployment-id&quot;;
    AzureOpenAIClient client = GetAzureOpenAIClientSomehow();        
    kernelBuilder.AddAzureOpenAIChatCompletion(deploymentId, client);
    var kernel = kernelBuilder.Build();
    kernel.FunctionFilters.Add(new FunctionFilter());
    return kernel;
}
</pre>


<p>Please note that this is still experimental and may change (or even be removed) in future versions.</p>



<h3>3. Using Function Result</h3>



<p>If you are invoking functions in a non-streaming way i.e. you are waiting for the complete result to come back, you can make use of <code><a rel="noreferrer noopener" href="https://learn.microsoft.com/en-us/dotnet/api/microsoft.semantickernel.functionresult?view=semantic-kernel-dotnet&amp;viewFallbackFrom=netstandard-2.0" target="_blank">FunctionResult</a></code> to get the tokens.</p>



<p>So my code would be something like:</p>


<pre class="brush: plain; title: ; notranslate">
var function = GetKernelFunctionSomehow();
var kernelArguments = GetKernelArgumentsSomehow();
var result = await kernel.InvokeAsync(function, kernelArguments);
var metadata = result.Metadata;
if (metadata == null || !metadata.ContainsKey(&quot;Usage&quot;)) return;
var usage = (CompletionsUsage)metadata[&quot;Usage&quot;];
if (usage == null) return;
promptTokens = usage.PromptTokens;
completionTokens = usage.CompletionTokens;
// do something with tokens
</pre>


<h3>4. Using 3rd Party Library</h3>



<p>Approach 3 above works great if you are waiting for the entire response to come back to your application; however, what I have noticed is that the user experience (UX) is not great in this case.</p>



<p>LLMs send their response in a streaming fashion, that is, they emit partial responses as they become available, and if possible, you should stream that response back to the user.</p>



<p>However, approach 3 above would not work in that case. From what I am told, Azure OpenAI does not even return the token usage as part of its response in this case, and hence Semantic Kernel also does not provide this information.</p>



<p>In this case, you can make use of a 3rd party library. One such library that I have used is <code><a rel="noreferrer noopener" href="https://github.com/tryAGI/Tiktoken" target="_blank">Tiktoken</a></code>, which you can use to calculate token consumption. There are many other libraries like it available.</p>



<p>So my code would be something like:</p>


<pre class="brush: plain; title: ; notranslate">
var prompt = GetPromptSomehow();
var function = GetKernelFunctionSomehow();
var kernelArguments = GetKernelArgumentsSomehow();
var result = kernel.InvokeStreamingAsync(function, kernelArguments);
StringBuilder responseStringBuilder = new StringBuilder();
await foreach (var item in result)
{
    var response = item.ToString();
    // store the partial response. We will use it in the end to calculate prompt and completion tokens.
    responseStringBuilder.Append(response);
}
var answer = responseStringBuilder.ToString();
var encodingForModel = Tiktoken.Encoding.TryForModel(&quot;model type e.g. gpt-4, gpt-3.5-turbo or gpt-35-turbo&quot;);
promptTokens = encodingForModel.CountTokens(prompt);
completionTokens = encodingForModel.CountTokens(answer);
// do something with tokens
</pre>


<h2>Summary</h2>



<p>That&#8217;s it for this post. I hope you have found it useful. My recommendation would be to use approach #3 or #4 based on your scenario (non-streaming vs. streaming). You can use #2, but definitely stay away from #1.</p>



<p>I would like to end this post with the same bit of warning I gave in my previous post: Semantic Kernel (and AI tools in general) is changing very rapidly (quite evident from the fact that kernel events were deprecated within a few minor releases). I would highly recommend referencing the official documentation for the most current functionality.</p>



<p>In the next post I will talk about the AI assistant I have been building. A lot of my learning came from building that AI assistant. I am pretty excited about it and can&#8217;t wait to share more with you.</p>



<p>Until then, Happy Coding!</p>
<p>The post <a rel="nofollow" href="https://gauravmantri.com/2024/02/07/microsoft-semantic-kernel-some-tips-tricks-to-get-prompt-completion-tokens/">Microsoft Semantic Kernel &#8211; Some Tips &#038; Tricks To Get Prompt &#038; Completion Tokens</a> appeared first on <a rel="nofollow" href="https://gauravmantri.com">GauravMantri.com</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://gauravmantri.com/2024/02/07/microsoft-semantic-kernel-some-tips-tricks-to-get-prompt-completion-tokens/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Microsoft Semantic Kernel &#8211; Some Tips &#038; Tricks To Get Rendered Prompts</title>
		<link>https://gauravmantri.com/2024/02/05/microsoft-semantic-kernel-some-tips-tricks-to-get-rendered-prompts/</link>
					<comments>https://gauravmantri.com/2024/02/05/microsoft-semantic-kernel-some-tips-tricks-to-get-rendered-prompts/#respond</comments>
		
		<dc:creator><![CDATA[Gaurav]]></dc:creator>
		<pubDate>Mon, 05 Feb 2024 04:01:49 +0000</pubDate>
				<category><![CDATA[Azure]]></category>
		<category><![CDATA[Generative AI]]></category>
		<category><![CDATA[Prompt Engineering]]></category>
		<category><![CDATA[Semantic Kernel]]></category>
		<guid isPermaLink="false">https://gauravmantri.com/?p=1934</guid>

					<description><![CDATA[<p>When you start building a new AI application, most likely you start with a very simple prompt where you write everything you need to do in that prompt only. However, as the application grows, you write more prompts and that&#8217;s when you start templatizing your prompts by extracting things that are common across all your [&#8230;]</p>
<p>The post <a rel="nofollow" href="https://gauravmantri.com/2024/02/05/microsoft-semantic-kernel-some-tips-tricks-to-get-rendered-prompts/">Microsoft Semantic Kernel &#8211; Some Tips &#038; Tricks To Get Rendered Prompts</a> appeared first on <a rel="nofollow" href="https://gauravmantri.com">GauravMantri.com</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>When you start building a new AI application, most likely you start with a very simple prompt where you write everything you need to do in that prompt only.</p>



<p>However, as the application grows, you write more prompts and that&#8217;s when you start templatizing your prompts by extracting things that are common across all your prompts and pass them as template variables to the prompts. </p>



<p>This leads to better manageability of the prompts, but you lose clarity as to what gets sent to the LLM.</p>



<p>Because the prompt is now a template with a number of template parameters, simply looking at the prompt text will not tell you what is actually being sent to the LLM. Furthermore, you may want to log the prompts (and the responses from the LLM) somewhere in your system so that you can analyze the performance of these prompts. For these reasons, you would want access to the prompts that are being sent to the LLM.</p>



<p>In this post, I will show you three ways in Microsoft Semantic Kernel to find out the exact prompt that is being sent to the LLM for processing.</p>



<h2>Sample Prompt</h2>



<p>Let&#8217;s consider the following prompt that I wrote for an application I am building. BTW, I wrote this prompt with the help of an LLM <img src="https://s.w.org/images/core/emoji/12.0.0-1/72x72/1f642.png" alt="🙂" class="wp-smiley" style="height: 1em; max-height: 1em;" /> (you can read all about it <a href="https://gauravmantri.com/2024/01/29/writing-prompts-is-hard-luckily-theres-an-easy-way-out/" target="_blank" rel="noreferrer noopener">here</a>). I am writing my prompts in YAML format.</p>


<pre class="brush: plain; title: ; notranslate">
name: Rephrase
description: Use this function to reword an unclear question, considering previous context, for better comprehension.
template_format: handlebars
template: |
  &lt;message role=&quot;system&quot;&gt;
  Grounding Rules:
  ================
  {{#each grounding_rules}}
  - {{this}}
  {{/each}}
  
  The user has asked a question that may not be clear in its current form and may rely on the context of multiple 
  previous questions and answers. Your task is to rephrase or reword the question, taking into account the conversation history if available, 
  to improve its clarity for a Language Model (LLM) to answer it.
  
  Conversation History:
  ====================
  {{#each chat_history}}
  Question: {{Question}}
  Answer: {{Answer}}
  {{/each}}
  &lt;/message&gt;
  
  Current Question:
  =================
  &lt;message role=&quot;user&quot;&gt;{{question}}&lt;/message&gt;
  
  &lt;message role=&quot;system&quot;&gt;
  Considering the information provided to you, please rephrase or reword the current question to increase its clarity and 
  specificity for a language model. Consider identifying the key elements or concepts within the question, ensuring the 
  language is precise, and avoiding any ambiguity or overly complex language. Remember to incorporate the context provided 
  by the previous questions and answers. Your goal is to create a revised question that maintains the original intent, 
  but is more easily understood by an LLM when considering the conversation history.
  &lt;/message&gt;
input_variables:
  - name: question
    description: user question
    is_required: true
  - name: grounding_rules
    description: grounding rules for AI model to behave
    is_required: true
  - name: chat_history
    description: chat history
    is_required: true
execution_settings:
  default:
    temperature: 0
</pre>


<p>As you can see, my prompt template contains some template variables like <code>question</code>, <code>grounding_rules</code>, and <code>chat_history</code> that I am passing to the prompt. Semantic Kernel in turn parses the YAML, replaces these template variables with values that I pass in and then sends that prompt to the LLM.</p>
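<p>Conceptually, rendering a prompt template is just substituting the supplied argument values into the template text before it is sent to the LLM. The toy sketch below (Python, using the standard library&#8217;s <code>string.Template</code> rather than Semantic Kernel&#8217;s actual Handlebars renderer) shows the idea:</p>

```python
# Toy illustration of prompt-template rendering; Semantic Kernel's Handlebars
# renderer is far more capable (loops, message tags, nested function calls).
from string import Template

template = Template("Question: $question\nGrounding rules: $grounding_rules")
rendered = template.substitute(
    question="How many storage accounts?",
    grounding_rules="be concise",
)
print(rendered)
```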



<h2>Solution</h2>



<p>So, how do we get the prompts? As I mentioned above, I will show you three ways by which you can get this information.</p>



<h3>1. Hooking into Kernel Events (Obsolete, Not Recommended)</h3>



<p>First way is by hooking into (Semantic) Kernel events. <code><a rel="noreferrer noopener" href="https://learn.microsoft.com/en-us/dotnet/api/microsoft.semantickernel.kernel?view=semantic-kernel-dotnet&amp;viewFallbackFrom=netstandard-2.0" target="_blank">Kernel</a></code> in Semantic Kernel exposes a <code><a rel="noreferrer noopener" href="https://learn.microsoft.com/en-us/dotnet/api/microsoft.semantickernel.kernel.promptrendered?view=semantic-kernel-dotnet" target="_blank">PromptRendered</a></code> event which gets fired when a prompt is rendered. You can consume this event to get the rendered prompt.</p>



<p>Your code would be something like the following:</p>


<pre class="brush: plain; title: ; notranslate">
private Kernel GetKernel()
{
    var kernelBuilder = Kernel.CreateBuilder();
    var deploymentId = &quot;your-azure-openai-deployment-id&quot;;
    AzureOpenAIClient client = GetAzureOpenAIClientSomehow();        
    kernelBuilder.AddAzureOpenAIChatCompletion(deploymentId, client);
    var kernel = kernelBuilder.Build();
    kernel.PromptRendered += (sender, args) =&gt;
    {
        Console.WriteLine($&quot;Rendered prompt: {args.RenderedPrompt}&quot;);
    };
    return kernel;
}
</pre>


<p><strong>However, you should not be using this approach as it has been marked as obsolete in the latest version</strong>. In fact, if you use this approach with version 1.3.0 of Semantic Kernel (the most current version at the time of writing this post), you will get a warning telling you not to use it.</p>



<h3>2. Use Filters (Experimental)</h3>



<p>This is another approach that you can take. I believe this feature was introduced recently, and it is recommended over hooking into kernel events.</p>



<p>Using filters is really easy. You basically create a custom filter class that implements the <code>IPromptFilter</code> interface and then implement the <code>OnPromptRendering</code> and <code>OnPromptRendered</code> methods to suit your requirements. For example, I could simply write the rendering and rendered prompts to the console.</p>



<p>So my code would be something like:</p>


<pre class="brush: plain; title: ; notranslate">
private class PromptFilter : IPromptFilter
{
    public void OnPromptRendering(PromptRenderingContext context)
    {
    }

    public void OnPromptRendered(PromptRenderedContext context)
    {
        var prompt = context.RenderedPrompt;
        Console.WriteLine($&quot;Rendered prompt: ${prompt}&quot;);
    }
}
</pre>


<p>And this is how I would wire up the filter in the kernel:</p>


<pre class="brush: plain; title: ; notranslate">
kernel.PromptFilters.Add(new PromptFilter());
</pre>


<p>My complete code for kernel would be:</p>


<pre class="brush: plain; title: ; notranslate">
private Kernel GetKernel()
{
    var kernelBuilder = Kernel.CreateBuilder();
    var deploymentId = &quot;your-azure-openai-deployment-id&quot;;
    AzureOpenAIClient client = GetAzureOpenAIClientSomehow();        
    kernelBuilder.AddAzureOpenAIChatCompletion(deploymentId, client);
    var kernel = kernelBuilder.Build();
    kernel.PromptFilters.Add(new PromptFilter());
    return kernel;
}
</pre>


<p>Please note that this is still experimental and may change (or even be removed) in future versions.</p>



<h3>3. Manual Way</h3>



<p>The above two approaches work great (though use only the second approach, not the first); however, at times you may want to get the prompt inline in your application flow and not outside of it.</p>



<p>For example, the application I am building required me to calculate the prompt and completion tokens and send those back to the user as part of the response.</p>



<p>If your application has this kind of requirement, you can manually create the prompt from the prompt template by passing the arguments.</p>



<p>Here&#8217;s the code to do so:</p>


<pre class="brush: plain; title: ; notranslate">
var promptFileContents = await File.ReadAllTextAsync(promptFilePath);
var promptTemplateConfig = KernelFunctionYaml.ToPromptTemplateConfig(promptFileContents);
var factory = new HandlebarsPromptTemplateFactory();
if (!factory.TryCreate(promptTemplateConfig, out var promptTemplate)) throw new InvalidOperationException(&quot;Unable to create prompt template.&quot;);
var prompt = await promptTemplate.RenderAsync(kernel, kernelArguments);
</pre>


<p>Here, what I am doing is first reading the entire prompt template YAML file and then creating a <code><a rel="noreferrer noopener" href="https://learn.microsoft.com/en-us/dotnet/api/microsoft.semantickernel.prompttemplateconfig?view=semantic-kernel-dotnet&amp;viewFallbackFrom=netstandard-2.0" target="_blank">PromptTemplateConfig</a></code> from it. Because my prompt template uses handlebars templating, I am creating a <code>HandlebarsPromptTemplateFactory</code> and extracting the prompt template out of my prompt template configuration. I would then render the prompt by passing the kernel and the arguments.</p>



<p>The advantage of this approach is that I can get my prompt inline with my code flow and then use it any way I see fit.</p>



<p>However, because this approach parses the raw YAML file, it will not work if your prompt template calls other functions (for example, a native function) inside it. So, please use this approach cautiously.</p>



<h2>Summary</h2>



<p>That&#8217;s it for this post. I hope you have found the information useful. Semantic Kernel (and AI tools in general) is changing very rapidly (quite evident from the fact that kernel events were deprecated within a few minor releases), so I would highly recommend referencing the official documentation for the most current functionality.</p>



<p>Happy Coding!</p>
<p>The post <a rel="nofollow" href="https://gauravmantri.com/2024/02/05/microsoft-semantic-kernel-some-tips-tricks-to-get-rendered-prompts/">Microsoft Semantic Kernel &#8211; Some Tips &#038; Tricks To Get Rendered Prompts</a> appeared first on <a rel="nofollow" href="https://gauravmantri.com">GauravMantri.com</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://gauravmantri.com/2024/02/05/microsoft-semantic-kernel-some-tips-tricks-to-get-rendered-prompts/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Writing prompts is hard. Luckily, there&#8217;s an easy way out!</title>
		<link>https://gauravmantri.com/2024/01/29/writing-prompts-is-hard-luckily-theres-an-easy-way-out/</link>
					<comments>https://gauravmantri.com/2024/01/29/writing-prompts-is-hard-luckily-theres-an-easy-way-out/#respond</comments>
		
		<dc:creator><![CDATA[Gaurav]]></dc:creator>
		<pubDate>Mon, 29 Jan 2024 00:06:18 +0000</pubDate>
				<category><![CDATA[Generative AI]]></category>
		<category><![CDATA[Large Language Models]]></category>
		<category><![CDATA[Prompt Engineering]]></category>
		<category><![CDATA[LLM]]></category>
		<category><![CDATA[Prompt]]></category>
		<category><![CDATA[Prompt Patterns]]></category>
		<guid isPermaLink="false">https://gauravmantri.com/?p=1927</guid>

					<description><![CDATA[<p>In any Generative AI application, prompts are the heart and soul of the application. To get the most out of an LLM, every Gen AI developer must write effective prompts. Problem But writing prompts is hard! Believe me, it is hard :). In fact, if you want to torture a developer, ask them to write [&#8230;]</p>
<p>The post <a rel="nofollow" href="https://gauravmantri.com/2024/01/29/writing-prompts-is-hard-luckily-theres-an-easy-way-out/">Writing prompts is hard. Luckily, there&#8217;s an easy way out!</a> appeared first on <a rel="nofollow" href="https://gauravmantri.com">GauravMantri.com</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>In any Generative AI application, prompts are the heart and soul of the application. To get the most out of an LLM, every Gen AI developer must write effective prompts.</p>



<h2>Problem</h2>



<p>But writing prompts is hard! Believe me, it is hard :). In fact, if you want to torture a developer, ask them to write prompts :D.</p>



<p>In order to write effective prompts, as a prompt writer:</p>



<ul><li>You must have subject matter expertise.</li><li>You must think very clearly, and</li><li>You must be able to communicate your thoughts just as clearly in your prompts.</li></ul>



<p>And therein lies the problem.</p>



<p>I read a book on UX some time back, and something mentioned there stayed with me:</p>



<blockquote class="wp-block-quote"><p>The problem with developers designing the UX of an application is that if they think they can use it, everybody else can.</p></blockquote>



<p>This applies to us developers as well when we are building Gen AI applications. When we write prompts, we think as if we ourselves were going to answer the user&#8217;s questions, and that assumption gets reflected in our prompts.</p>



<p>Recently, I was building an application where I wanted an LLM to rephrase the question asked by the user. As a developer, this is what I came up with:</p>


<pre class="brush: plain; title: ; notranslate">
You are an AI assistant expert in Azure and language comprehension.

Please rephrase the following question so that it is clear for you to answer.

While rephrasing, please remember the following things:
- It is ok to return the original question if it is already clear.
- Please use chat history given below to better rephrase the question.
 Chat history contains previous questions asked by the user.
- Do not answer the question while rephrasing.

Begin Chat History
==================
Question: User question
Response: LLM response
End Chat History
==================
User question: &quot;How many storage accounts?&quot;
</pre>


<p>As a developer, I thought this should be good enough for an LLM to answer. Unfortunately, that was not the case. At times it would work, but more often than not I saw random text outputted by the LLM.</p>



<p>So, how do we solve this?</p>



<h2>Solution: Use LLM to create the prompt</h2>



<p>Then it dawned on me: why not ask the LLM to create the prompt? It made sense to me for multiple reasons:</p>



<ul><li>LLMs are the subject matter experts, as they are trained on all kinds of data that we (as developers) may not be aware of.</li><li>When properly instructed and controlled, they can not only think clearly but also communicate clearly.</li><li>Most importantly, it is ultimately the LLM that is going to answer the prompt, so it is only logical for it to write the prompt in language it understands, including all the necessary details, for when it comes time to execute the prompt.</li></ul>



<h2>Steps</h2>



<p>Here are the steps I took to get an LLM to come up with a prompt.</p>



<h3>Persona Setup</h3>



<p>I started by giving my LLM a persona (and stroked its ego a bit :)). This is what I did:</p>


<pre class="brush: plain; title: ; notranslate">
You are the world's best prompt engineer. 
In fact, no one can write prompts like you do.
</pre>


<h3>Describe the Ask</h3>



<p>Then, I clearly described the ask. At first, I provided a very basic instruction like &#8220;<em>Write a prompt to rephrase a question</em>&#8221;, and it came up with something like &#8220;<em>Craft a prompt that ingeniously transforms a direct query into an artful, indirect expression</em>&#8221;.</p>



<p>So, I put some more thought into it and came up with the following:</p>


<pre class="brush: plain; title: ; notranslate">
You are helping an AI developer with writing prompts. 
AI developer wants to write a prompt that would rephrase a 
question asked by the user of the application. 
User would ask a question that may or may not be clear. 
What user wants to do is use an LLM to rephrase/reword 
the question so that it becomes clear for the LLM to answer the rephrased/reworded question. 
Please be as comprehensive as possible when writing a prompt for this.
</pre>


<p>and this is what LLM came up with:</p>


<pre class="brush: plain; title: ; notranslate">
Title: Reframing and Clarifying User Questions for Improved 
LLM Understanding

Prompt:

An application user has asked a question that may not be clear in 
its current form. Your task is to rephrase or reword the question 
in a way that improves its clarity for a Language Model (LLM) 
to answer it.

Given Question: &quot;{user_question}&quot;

Please rephrase or reword this question to increase its clarity 
and specificity for a language model. Consider identifying the 
key elements or concepts within the question, ensuring the 
language is precise, and avoiding any ambiguity or overly 
complex language. Your goal is to create a revised question 
that maintains the original intent, but is more easily 
understood by an LLM.

Revised Question: &quot;&quot;
</pre>


<p>Much better, right?</p>



<p>However, I forgot to mention the chat history, so I modified the prompt again like this:</p>


<pre class="brush: plain; title: ; notranslate">
Now let's say the question is based on some previous questions. 
For example, the user may have previously asked 
&quot;What is the capital of India&quot; to which LLM has 
responded with &quot;New Delhi&quot;. 
Now the user has asked &quot;How many states are in there&quot;. 
This question in itself is not complete and the previous 
question will be included in the chat history. 
Based on this information, can you rewrite the prompt?
</pre>


<p>The LLM replied with the following:</p>


<pre class="brush: plain; title: ; notranslate">
Title: Contextual Rewording of User Questions for LLM Understanding

Prompt:

An application user has asked a question that may not be clear in 
its current form and also relies on the context of previous 
questions and answers. Your task is to rephrase or reword the question, taking into account the previous conversation history, 
to improve its clarity for a Language Model (LLM) to answer it.

Previous Question and Answer:
Question: &quot;{previous_question}&quot;
Answer: &quot;{previous_answer}&quot;

Current Question: &quot;{user_question}&quot;

Given this context, please rephrase or reword the current question 
to increase its clarity and specificity for a language model. 
Consider identifying the key elements or concepts within the 
question, ensuring the language is precise, and avoiding any 
ambiguity or overly complex language. Remember to incorporate 
the context provided by the previous question and answer. 
Your goal is to create a revised question that maintains the 
original intent, but is more easily understood by an LLM 
when considering the previous conversation.

Revised Question: &quot;&quot;
</pre>


<p>I realized that there can be multiple questions in the chat history, which I had not mentioned in the previous prompt, so I submitted another prompt:</p>


<pre class="brush: plain; title: ; notranslate">
Now assume there are many previous questions and answers. Please revise the prompt based on this information.
</pre>


<p>The LLM graciously obliged with the following:</p>


<pre class="brush: plain; title: ; notranslate">
Title: Contextual Rewording of User Questions Based on Conversation History for LLM Understanding

Prompt:

An application user has asked a question that may not be clear in its current form and relies on the context of multiple previous questions and answers. Your task is to rephrase or reword the question, taking into account the conversation history, to improve its clarity for a Language Model (LLM) to answer it.

Conversation History:
&quot;{previous_questions_and_answers}&quot;

Current Question: &quot;{user_question}&quot;

Given this context, please rephrase or reword the current question to increase its clarity and specificity for a language model. Consider identifying the key elements or concepts within the question, ensuring the language is precise, and avoiding any ambiguity or overly complex language. Remember to incorporate the context provided by the previous questions and answers. Your goal is to create a revised question that maintains the original intent, but is more easily understood by an LLM when considering the conversation history.

Revised Question: &quot;&quot;
</pre>


<p>So, so much better than my original prompt :).</p>
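<p>To plug this generated prompt into an application, the two placeholders need to be filled at runtime. Here is a minimal sketch of that step (Python, with a hypothetical <code>build_rephrase_prompt</code> helper and an abbreviated copy of the prompt text); the actual call to the LLM is out of scope here.</p>

```python
# Abbreviated copy of the LLM-written prompt, keeping its two placeholders.
REPHRASE_PROMPT = (
    'An application user has asked a question that may not be clear in its '
    'current form and relies on the context of multiple previous questions '
    'and answers. Rephrase or reword the question, taking into account the '
    'conversation history, to improve its clarity for an LLM to answer it.\n'
    '\n'
    'Conversation History:\n'
    '"{previous_questions_and_answers}"\n'
    '\n'
    'Current Question: "{user_question}"\n'
    '\n'
    'Revised Question: ""'
)

def build_rephrase_prompt(history, user_question):
    # history: list of (question, answer) tuples from earlier turns.
    turns = []
    for question, answer in history:
        turns.append(f"Question: {question}")
        turns.append(f"Answer: {answer}")
    return REPHRASE_PROMPT.format(
        previous_questions_and_answers="\n".join(turns),
        user_question=user_question,
    )

print(build_rephrase_prompt(
    [("What is the capital of India?", "New Delhi")],
    "How many states are in there?",
))
```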



<h3>Test</h3>



<p>I was quite satisfied with the prompt. However, before I could use it in the application, I wanted to see if the prompt actually worked well.</p>



<p>So I used the LLM again: I fed it some questions and asked it to answer those questions using the prompt it had written.</p>



<p>Here&#8217;s what I asked:</p>


<pre class="brush: plain; title: ; notranslate">
Now let's test the prompt above with the following questions. 
Your task is to take each question and run the prompt and 
output both the prompt and the result. 
With each question, include the previous questions and 
responses in the chat history. Here are the questions:  
1. Does xyz storage account has tags?  
2. what is its type?  
3. Where it is located?  
4. What are the services supported by it?  
5. Please give me the storage endpoints. 
6. Does abc storage account has tags?  
7. what is its type?  
8. Where it is located?  
9. What are the services supported by it?
</pre>


<p>And this is how the LLM responded:</p>


<pre class="brush: plain; title: ; notranslate">
Sure, I'll provide you with the prompts and the expected results for all 9 questions. I'll make assumptions for LLM responses to complete the exercise.

1.
Conversation History:
&quot;&quot;
Current Question: &quot;Does xyz storage account has tags?&quot;

Prompt:
Does the storage account named 'xyz' have any assigned tags?

2.
Conversation History:
&quot;Question: Does xyz storage account has tags?
Answer: Yes, the 'xyz' storage account has tags.&quot;
Current Question: &quot;what is its type?&quot;

Prompt:
What is the type of the 'xyz' storage account that has tags?

3.
Conversation History:
&quot;Question: Does xyz storage account has tags?
Answer: Yes, the 'xyz' storage account has tags.
Question: what is its type?
Answer: The 'xyz' storage account is a 'BlockBlobStorage' type.&quot;
Current Question: &quot;Where it is located?&quot;

Prompt:
Where is the 'BlockBlobStorage' type 'xyz' storage account, which has tags, located?

4.
Conversation History:
&quot;Question: Does xyz storage account has tags?
Answer: Yes, the 'xyz' storage account has tags.
Question: what is its type?
Answer: The 'xyz' storage account is a 'BlockBlobStorage' type.
Question: Where it is located?
Answer: The 'xyz' storage account is located in East US.&quot;
Current Question: &quot;What are the services supported by it?&quot;

Prompt:
What are the services supported by the 'xyz' storage account, which is a 'BlockBlobStorage' type and located in East US?

5.
Conversation History:
&quot;Question: Does xyz storage account has tags?
Answer: Yes, the 'xyz' storage account has tags.
Question: what is its type?
Answer: The 'xyz' storage account is a 'BlockBlobStorage' type.
Question: Where it is located?
Answer: The 'xyz' storage account is located in East US.
Question: What are the services supported by it?
Answer: The 'xyz' storage account supports Blob, File, Queue, Table, and Data Lake Gen2 services.&quot;
Current Question: &quot;Please give me the storage endpoints.&quot;

Prompt:
Could you provide the storage endpoints for the 'xyz' storage account which is a 'BlockBlobStorage' type, located in East US, and supports Blob, File, Queue, Table, and Data Lake Gen2 services?

6.
Conversation History:
&quot;&quot;
Current Question: &quot;Does abc storage account has tags?&quot;

Prompt:
Does the storage account named 'abc' have any assigned tags?

7.
Conversation History:
&quot;Question: Does abc storage account has tags?
Answer: Yes, the 'abc' storage account has tags.&quot;
Current Question: &quot;what is its type?&quot;

Prompt:
What is the type of the 'abc' storage account that has tags?

8.
Conversation History:
&quot;Question: Does abc storage account has tags?
Answer: Yes, the 'abc' storage account has tags.
Question: what is its type?
Answer: The 'abc' storage account is a 'BlockBlobStorage' type.&quot;
Current Question: &quot;Where it is located?&quot;

Prompt:
Where is the 'BlockBlobStorage' type 'abc' storage account, which has tags, located?

9.
Conversation History:
&quot;Question: Does abc storage account has tags?
Answer: Yes, the 'abc' storage account has tags.
Question: what is its type?
Answer: The 'abc' storage account is a 'BlockBlobStorage' type.
Question: Where it is located?
Answer: The 'abc' storage account is located in West US.&quot;
Current Question: &quot;What are the services supported by it?&quot;

Prompt:
What are the services supported by the 'abc' storage account, which is a 'BlockBlobStorage' type and located in West US?

</pre>


<p>Not bad, not bad at all!</p>



<h2>Summary</h2>



<p>That&#8217;s it for this post. I hope you have found it useful. Please share your thoughts by providing comments.</p>



<p>Happy Prompting!</p>
<p>The post <a rel="nofollow" href="https://gauravmantri.com/2024/01/29/writing-prompts-is-hard-luckily-theres-an-easy-way-out/">Writing prompts is hard. Luckily, there&#8217;s an easy way out!</a> appeared first on <a rel="nofollow" href="https://gauravmantri.com">GauravMantri.com</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://gauravmantri.com/2024/01/29/writing-prompts-is-hard-luckily-theres-an-easy-way-out/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Using OpenAI Function Calling with Microsoft Semantic Kernel</title>
		<link>https://gauravmantri.com/2023/12/31/using-openai-function-calling-with-microsoft-semantic-kernel/</link>
					<comments>https://gauravmantri.com/2023/12/31/using-openai-function-calling-with-microsoft-semantic-kernel/#respond</comments>
		
		<dc:creator><![CDATA[Gaurav]]></dc:creator>
		<pubDate>Sun, 31 Dec 2023 12:22:55 +0000</pubDate>
				<category><![CDATA[Azure]]></category>
		<category><![CDATA[Generative AI]]></category>
		<category><![CDATA[Azure Open AI]]></category>
		<category><![CDATA[OpenAI]]></category>
		<category><![CDATA[Semantic Kernel]]></category>
		<guid isPermaLink="false">https://gauravmantri.com/?p=1922</guid>

					<description><![CDATA[<p>In this post we are going to see how we can use OpenAI&#8217;s Function Calling feature with Microsoft Semantic Kernel. Context To explain the concepts in this post, let&#8217;s set the context. Let&#8217;s say that you are building an AI application that helps users answer basic questions about Maths (e.g. what is 20% of 100). [&#8230;]</p>
<p>The post <a rel="nofollow" href="https://gauravmantri.com/2023/12/31/using-openai-function-calling-with-microsoft-semantic-kernel/">Using OpenAI Function Calling with Microsoft Semantic Kernel</a> appeared first on <a rel="nofollow" href="https://gauravmantri.com">GauravMantri.com</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>In this post we are going to see how we can use OpenAI&#8217;s Function Calling feature with Microsoft Semantic Kernel.</p>



<h2>Context</h2>



<p>To explain the concepts in this post, let&#8217;s set the context. Let&#8217;s say that you are building an AI application that helps users answer basic questions about Maths (e.g. what is 20% of 100). For the sake of argument, let&#8217;s assume that our AI model is not capable of answering such questions.</p>



<p>So what we are going to do is write some functions that perform mathematical operations like Add, Subtract, Multiply, Divide, etc. In Semantic Kernel lingo, we are creating some native functions. In OpenAI lingo, we are creating some tools.</p>



<p>Now the challenge is how do we invoke these tools or functions based on the user prompts (which are in natural language, BTW). This is where OpenAI Function Calling comes in handy and we will see how we can use Semantic Kernel for that purpose.</p>



<h2>What is OpenAI Function Calling?</h2>



<p>Let&#8217;s first briefly talk about OpenAI Function Calling. This is a feature which was recently released by OpenAI.</p>



<p>The way I understand it is that you give a prompt in natural language and a list of functions (tools) to OpenAI. OpenAI then tries to find the function(s) (tools) best suited to fulfill the user&#8217;s request and returns those function(s) back to the calling program. Not only does it return the function(s), it also tries to extract from the prompt the arguments that are needed to execute that function.</p>



<p>Considering our context, let&#8217;s say the user asks &#8220;<strong>what is 2 + 12</strong>&#8221; in our AI application. We pass this prompt and the list of functions (Add, Subtract, Multiply, and Division etc.) to OpenAI and it returns <strong>Add</strong> back to the calling application. If our Add function has 2 arguments say number1 and number2, OpenAI will also return back these arguments with value for number1 argument as 2 and the value for number2 argument as 12. Your application can then execute that function and return the result (14) to the user.</p>



<p>The way it works (I think) is that OpenAI extracts the intent of the prompt and then semantically compares it with the description of the functions.</p>



<p>Here&#8217;s the code for our Add function:</p>


<pre class="brush: csharp; title: ; notranslate">
    [KernelFunction, Description(&quot;Add two numbers&quot;)]
    public static double Add(
        [Description(&quot;The first number to add&quot;)] double number1,
        [Description(&quot;The second number to add&quot;)] double number2
    )
    {
        return number1 + number2;
    }
</pre>


<p>Here, the description of the function is very important. You must be concise yet very clear about what a function does for this whole thing to work properly.</p>



<p>Now when OpenAI sees the prompt (what is 2 + 12), it somehow infers that the user wants to add 2 numbers and then compares that with the description of the functions and determines that Add function is the most suited function to answer user&#8217;s prompt. It also then maps &#8220;2&#8221; with &#8220;number1&#8221; argument and &#8220;12&#8221; with &#8220;number2&#8221; argument and returns this information.</p>
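<p>To make this concrete, here is a sketch of what that exchange looks like on the wire. The tool definition follows OpenAI&#8217;s chat completions &#8220;tools&#8221; format; in our case Semantic Kernel generates this JSON for us from the <code>[KernelFunction]</code> and <code>[Description]</code> attributes, so the hand-written version below is for illustration only.</p>

```python
import json

# What the Add function looks like as a tool definition sent to OpenAI.
# (Hand-written sketch; Semantic Kernel produces this from the attributes.)
tool_definition = {
    "type": "function",
    "function": {
        "name": "Add",
        "description": "Add two numbers",
        "parameters": {
            "type": "object",
            "properties": {
                "number1": {"type": "number", "description": "The first number to add"},
                "number2": {"type": "number", "description": "The second number to add"},
            },
            "required": ["number1", "number2"],
        },
    },
}

# The model does not execute anything; for "what is 2 + 12" it replies with
# the chosen tool and the extracted arguments, serialized as a JSON string.
model_reply = {
    "tool_calls": [
        {
            "type": "function",
            "function": {"name": "Add", "arguments": '{"number1": 2, "number2": 12}'},
        }
    ]
}

# The calling application parses the arguments and runs the function itself.
args = json.loads(model_reply["tool_calls"][0]["function"]["arguments"])
print(args["number1"] + args["number2"])  # prints 14
```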



<p>You can read more about OpenAI Function Calling here: <a href="https://platform.openai.com/docs/guides/function-calling" target="_blank" rel="noreferrer noopener">https://platform.openai.com/docs/guides/function-calling</a>.</p>



<h2>What is Semantic Kernel?</h2>



<p>Simply put, Semantic Kernel is an open-source SDK that helps you build AI applications. You can use C#, Python or Java to write AI applications using Semantic Kernel. </p>



<p>I wrote an overview post on Microsoft Semantic Kernel a few months back, when it was first released, which you can read here: <a rel="noreferrer noopener" href="https://gauravmantri.com/2023/09/03/microsoft-semantic-kernel-an-overview/" target="_blank">https://gauravmantri.com/2023/09/03/microsoft-semantic-kernel-an-overview/</a>. </p>



<p>You can read more about Semantic Kernel here: <a href="https://learn.microsoft.com/en-us/semantic-kernel/overview/" target="_blank" rel="noreferrer noopener">https://learn.microsoft.com/en-us/semantic-kernel/overview/</a>.</p>



<p>Semantic Kernel makes it super easy to make use of OpenAI Function Calling and that&#8217;s what we will see in this post.</p>



<h2>Code</h2>



<p>Let&#8217;s look at the code! I have built a simple console application. </p>



<p>For explanation, I will break the code into small chunks, and then we will put the entire code together in the end.</p>



<h3>Plugin</h3>



<p>Let&#8217;s first write our plugin that will contain our functions (tools). Since our AI app deals with Maths, let&#8217;s call it <code>MathPlugin</code>. Here&#8217;s the code for that which I took from <a rel="noreferrer noopener" href="https://github.com/MicrosoftDocs/semantic-kernel-docs/blob/main/samples/dotnet/08-Creating-Functions-For-AI/plugins/MathPlugin.cs" target="_blank">here</a>:</p>


<pre class="brush: csharp; title: ; notranslate">
using System.ComponentModel;
using Microsoft.SemanticKernel;

namespace FunctionCallingWithSemanticKernel.Plugins;

public class MathPlugin
{
    [KernelFunction, Description(&quot;Take the square root of a number&quot;)]
    public static double Sqrt(
        [Description(&quot;The number to take a square root of&quot;)] double number1
    )
    {
        return Math.Sqrt(number1);
    }

    [KernelFunction, Description(&quot;Add two numbers&quot;)]
    public static double Add(
        [Description(&quot;The first number to add&quot;)] double number1,
        [Description(&quot;The second number to add&quot;)] double number2
    )
    {
        return number1 + number2;
    }

    [KernelFunction, Description(&quot;Subtract two numbers&quot;)]
    public static double Subtract(
        [Description(&quot;The first number to subtract from&quot;)] double number1,
        [Description(&quot;The second number to subtract away&quot;)] double number2
    )
    {
        return number1 - number2;
    }

    [KernelFunction, Description(&quot;Multiply two numbers. When increasing by a percentage, don't forget to add 1 to the percentage.&quot;)]
    public static double Multiply(
        [Description(&quot;The first number to multiply&quot;)] double number1,
        [Description(&quot;The second number to multiply&quot;)] double number2
    )
    {
        return number1 * number2;
    }

    [KernelFunction, Description(&quot;Divide two numbers&quot;)]
    public static double Divide(
        [Description(&quot;The first number to divide from&quot;)] double number1,
        [Description(&quot;The second number to divide by&quot;)] double number2
    )
    {
        return number1 / number2;
    }

    [KernelFunction, Description(&quot;Raise a number to a power&quot;)]
    public static double Power(
        [Description(&quot;The number to raise&quot;)] double number1,
        [Description(&quot;The power to raise the number to&quot;)] double number2
    )
    {
        return Math.Pow(number1, number2);
    }

    [KernelFunction, Description(&quot;Take the log of a number&quot;)]
    public static double Log(
        [Description(&quot;The number to take the log of&quot;)] double number1,
        [Description(&quot;The base of the log&quot;)] double number2
    )
    {
        return Math.Log(number1, number2);
    }

    [KernelFunction, Description(&quot;Round a number to the target number of decimal places&quot;)]
    public static double Round(
        [Description(&quot;The number to round&quot;)] double number1,
        [Description(&quot;The number of decimal places to round to&quot;)] double number2
    )
    {
        return Math.Round(number1, (int)number2);
    }

    [KernelFunction, Description(&quot;Take the absolute value of a number&quot;)]
    public static double Abs(
        [Description(&quot;The number to take the absolute value of&quot;)] double number1
    )
    {
        return Math.Abs(number1);
    }

    [KernelFunction, Description(&quot;Take the floor of a number&quot;)]
    public static double Floor(
        [Description(&quot;The number to take the floor of&quot;)] double number1
    )
    {
        return Math.Floor(number1);
    }

    [KernelFunction, Description(&quot;Take the ceiling of a number&quot;)]
    public static double Ceiling(
        [Description(&quot;The number to take the ceiling of&quot;)] double number1
    )
    {
        return Math.Ceiling(number1);
    }

    [KernelFunction, Description(&quot;Take the sine of a number&quot;)]
    public static double Sin(
        [Description(&quot;The number to take the sine of&quot;)] double number1
    )
    {
        return Math.Sin(number1);
    }

    [KernelFunction, Description(&quot;Take the cosine of a number&quot;)]
    public static double Cos(
        [Description(&quot;The number to take the cosine of&quot;)] double number1
    )
    {
        return Math.Cos(number1);
    }

    [KernelFunction, Description(&quot;Take the tangent of a number&quot;)]
    public static double Tan(
        [Description(&quot;The number to take the tangent of&quot;)] double number1
    )
    {
        return Math.Tan(number1);
    }

    [KernelFunction, Description(&quot;Take the arcsine of a number&quot;)]
    public static double Asin(
        [Description(&quot;The number to take the arcsine of&quot;)] double number1
    )
    {
        return Math.Asin(number1);
    }

    [KernelFunction, Description(&quot;Take the arccosine of a number&quot;)]
    public static double Acos(
        [Description(&quot;The number to take the arccosine of&quot;)] double number1
    )
    {
        return Math.Acos(number1);
    }

    [KernelFunction, Description(&quot;Take the arctangent of a number&quot;)]
    public static double Atan(
        [Description(&quot;The number to take the arctangent of&quot;)] double number1
    )
    {
        return Math.Atan(number1);
    }
}
</pre>


<h3>OpenAI Client</h3>



<p>Next, we will create an instance of the OpenAI client. You will need the endpoint of your Azure OpenAI service and the key, both of which you can get from the Azure Portal (you can also use Azure AD credentials instead of a key if you want).</p>


<pre class="brush: csharp; title: ; notranslate">
const string AZURE_OPEN_AI_ENDPOINT = &quot;https://xyz.openai.azure.com/&quot;;
const string AZURE_OPEN_AI_KEY = &quot;00000000000000000000000&quot;;
const string AZURE_OPEN_AI_MODEL_ID = &quot;gpt-4-32k&quot;;

// create an instance of OpenAIClient.
var openAIClient = new OpenAIClient(new Uri(AZURE_OPEN_AI_ENDPOINT), new Azure.AzureKeyCredential(AZURE_OPEN_AI_KEY));
</pre>


<h3>Get Kernel</h3>


<pre class="brush: csharp; title: ; notranslate">
Kernel GetKernel()
{
    var kernelBuilder = Kernel.CreateBuilder()
        .AddAzureOpenAIChatCompletion(AZURE_OPEN_AI_MODEL_ID, openAIClient);
	
    var kernel = kernelBuilder.Build();
    kernel.Plugins.AddFromType&lt;MathPlugin&gt;();
    return kernel;
}
</pre>


<p>What we are doing above is creating an instance of Semantic Kernel and then adding our plugin (with all the functions/tools) to that kernel. </p>



<h3>OpenAI Prompt Execution Settings</h3>



<p>Next, we will set the OpenAI prompt execution settings.</p>


<pre class="brush: csharp; title: ; notranslate">
var promptExecutionSettings = new OpenAIPromptExecutionSettings()
{
    ToolCallBehavior = ToolCallBehavior.EnableKernelFunctions,
    Temperature = 0
};
</pre>


<p>There are two things I want to highlight in the code above:</p>



<ol><li><strong>Temperature</strong>: The Temperature parameter controls the creativity of the model. Since we want our model to be constrained and give predictable results, we set this to 0. Try removing this parameter and you will get some really funny results back!</li><li><strong>ToolCallBehavior</strong>: This parameter controls how tools will be called. By setting the value to <code>ToolCallBehavior.EnableKernelFunctions</code>, we want Semantic Kernel to just pass the registered tools to OpenAI and return the matching tool. If we want Semantic Kernel to automatically invoke the matching tool returned by OpenAI, we would change this setting to <code>ToolCallBehavior.AutoInvokeKernelFunctions</code>. There are other options for this parameter that you can read about <a href="https://learn.microsoft.com/en-us/dotnet/api/microsoft.semantickernel.connectors.openai.toolcallbehavior?view=semantic-kernel-dotnet" target="_blank" rel="noreferrer noopener">here</a>.</li></ol>



<h3>Get Tool</h3>



<p>We are going to focus on invoking the function manually, so the code below gets the appropriate tool for the input prompt by using OpenAI Function Calling:</p>


<pre class="brush: csharp; title: ; notranslate">
// select the tool best suited to execute our prompt.
async Task&lt;OpenAIFunctionToolCall?&gt; SelectTool(string prompt)
{
    try
    {
        var chatCompletionService = new AzureOpenAIChatCompletionService(AZURE_OPEN_AI_MODEL_ID, openAIClient!);
        var result = await chatCompletionService.GetChatMessageContentAsync(new ChatHistory(prompt),
            promptExecutionSettings, kernel);
        var functionCall = ((OpenAIChatMessageContent)result).GetOpenAIFunctionToolCalls().FirstOrDefault();

        return functionCall;
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex.Message);
        Console.WriteLine(ex.StackTrace);
        return null;
    }
}

</pre>


<p>Here, Semantic Kernel takes our prompt and the functions loaded in the kernel and sends them to OpenAI. It then parses the response from OpenAI and returns the first tool. Simple, isn&#8217;t it?</p>



<p>However, please keep in mind that OpenAI may not find a matching tool. This could happen when a user asks an irrelevant question like &#8220;What is the capital of France?&#8221;.</p>



<h3>Execute Function</h3>



<p>The last step is to execute the function and get the result!</p>


<pre class="brush: csharp; title: ; notranslate">
var function = await SelectTool(prompt);
if (function != null)
{
    // now we try to get the plugin function and the arguments.
    kernel.Plugins.TryGetFunctionAndArguments(function, out KernelFunction? pluginFunction,
        out KernelArguments? arguments);
    Console.WriteLine($&quot;Plugin function: {pluginFunction!.Name}&quot;);
    if (arguments!.Any())
    {
        Console.WriteLine(&quot;Function arguments:&quot;);
        foreach (var argument in arguments!)
        {
            Console.WriteLine($&quot;Argument name: {argument.Key}; Argument value: {argument.Value}&quot;);
        }
    }
    // execute the plugin function.
    var result = await kernel.InvokeAsync(pluginFunction!, arguments);
    Console.WriteLine($&quot;{prompt}: {result.ToString()}&quot;);
}
else
{
    Console.WriteLine(&quot;I'm sorry but I am not able to answer your question. I can only answer simple mathematical questions.&quot;);
}

</pre>


<p>Here, first we are getting the kernel function to invoke along with the arguments (<code>kernel.Plugins.TryGetFunctionAndArguments(function, out KernelFunction? pluginFunction,     out KernelArguments? arguments)</code>) and then executing the function to get the result (<code>var result = await kernel.InvokeAsync(pluginFunction!, arguments)</code>).</p>



<p>That&#8217;s it!</p>



<h3>Complete Code</h3>



<p>Here&#8217;s the complete code:</p>


<pre class="brush: csharp; title: ; notranslate">
using Azure.AI.OpenAI;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;
using Microsoft.SemanticKernel.ChatCompletion;
using FunctionCallingWithSemanticKernel.Plugins;

// Azure OpenAI settings. You can get these settings from portal.
const string AZURE_OPEN_AI_ENDPOINT = &quot;&lt;your-azure-openai-endpoint like https://xyz.openai.azure.com/&gt;&quot;;
const string AZURE_OPEN_AI_KEY = &quot;&lt;your-azure-openai-key like 44444444444444444444444444444444&gt;&quot;;
const string AZURE_OPEN_AI_DEPLOYMENT_ID = &quot;&lt;your-azure-openai-deployment-id like gpt-4-32k&gt;&quot;;

// create an instance of OpenAIClient.
var openAIClient = new OpenAIClient(new Uri(AZURE_OPEN_AI_ENDPOINT), new Azure.AzureKeyCredential(AZURE_OPEN_AI_KEY));

// get the kernel.
var kernel = GetKernel();

// set OpenAI prompt execution settings.
var promptExecutionSettings = new OpenAIPromptExecutionSettings()
{
    ToolCallBehavior = ToolCallBehavior.EnableKernelFunctions,
    Temperature = 0
};
Console.WriteLine(&quot;Hello, I am an AI assistant that can answer simple math questions.&quot;);
Console.WriteLine(&quot;Please ask me questions like \&quot;What is 2 x 2\&quot; or \&quot;What is square root of 3\&quot; etc.&quot;);
Console.WriteLine(&quot;To quit, simply type quit.&quot;);
Console.WriteLine(&quot;&quot;);
Console.WriteLine(&quot;Now ask me a math question, I am waiting!&quot;);
do
{
    var prompt = Console.ReadLine();
    if (!string.IsNullOrWhiteSpace(prompt))
    {
        if (prompt.ToLowerInvariant() == &quot;quit&quot;)
        {
            Console.WriteLine(&quot;Thank you! See you next time.&quot;);
            break;
        }
        else
        {
            // get the tool/function best suited to execute the prompt.
            var function = await SelectTool(prompt);
            if (function != null)
            {
                // now we try to get the plugin function and the arguments.
                kernel.Plugins.TryGetFunctionAndArguments(function, out KernelFunction? pluginFunction,
                    out KernelArguments? arguments);
                // execute the plugin function.
                var result = await kernel.InvokeAsync(pluginFunction!, arguments);
                Console.WriteLine($&quot;{prompt}: {result.ToString()}&quot;);
            }
            else
            {
                Console.WriteLine(&quot;I'm sorry but I am not able to answer your question. I can only answer simple math questions.&quot;);
            }
        }
    }
} while (true);


// select the tool best suited to execute our prompt.
async Task&lt;OpenAIFunctionToolCall?&gt; SelectTool(string prompt)
{
    try
    {
        var chatCompletionService = new AzureOpenAIChatCompletionService(AZURE_OPEN_AI_DEPLOYMENT_ID, openAIClient!);
        var result = await chatCompletionService.GetChatMessageContentAsync(new ChatHistory(prompt),
            new OpenAIPromptExecutionSettings()
            {
                ToolCallBehavior = ToolCallBehavior.EnableKernelFunctions,
                Temperature = 0
            }, kernel);
        var functionCall = ((OpenAIChatMessageContent)result).GetOpenAIFunctionToolCalls().FirstOrDefault();

        return functionCall;
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex.Message);
        Console.WriteLine(ex.StackTrace);
        return null;
    }
}

// create an instance of Kernel and load all plugins and functions in the Kernel.
Kernel GetKernel()
{
    var kernelBuilder = Kernel.CreateBuilder()
        .AddAzureOpenAIChatCompletion(AZURE_OPEN_AI_DEPLOYMENT_ID, openAIClient);
	
    var kernel = kernelBuilder.Build();
    kernel.Plugins.AddFromType&lt;MathPlugin&gt;();
    return kernel;
}
</pre>


<p>You can download the complete solution from GitHub: <a rel="noreferrer noopener" href="https://github.com/gmantri/function-calling-with-semantic-kernel" target="_blank">https://github.com/gmantri/function-calling-with-semantic-kernel</a>.</p>



<h2>Conclusion</h2>



<p>That&#8217;s it for this post. I hope you will find it useful. If you have any feedback or questions, please feel free to provide them.</p>



<p>Happy coding and Happy New Year 2024! </p>
<p>The post <a rel="nofollow" href="https://gauravmantri.com/2023/12/31/using-openai-function-calling-with-microsoft-semantic-kernel/">Using OpenAI Function Calling with Microsoft Semantic Kernel</a> appeared first on <a rel="nofollow" href="https://gauravmantri.com">GauravMantri.com</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://gauravmantri.com/2023/12/31/using-openai-function-calling-with-microsoft-semantic-kernel/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
	</channel>
</rss>
