S Anand

People skills with AI

Sun, 17 May 2026 22:45:57 +0800

I advise people that people skills are important in the AI era.

Now, I’m using AI to help me with people skills.

This morning, I wrote a script to export my WhatsApp conversations from this year. That makes it easy to feed them into AI models.

Then I used my Local MCP connector and asked Claude:

Who are people in my life that most deserve an unreasonable gesture of thanks and what would that be?

It went through my WhatsApp messages – including threads I had not read. Including a group discussing four 90-minute hands-on AI workshops I’m running for IIM alumni in Singapore on Saturday afternoons:

23 May: Context engineering
20 Jun: AI tools & workflows
25 Jul: Agentic analysis
22 Aug: AI strategy

Nayana Jain prepared a series of LinkedIn / WhatsApp posts to promote this workshop series and also created a poster for the workshops - and the best part of it was they were AI-generated. AI didn’t do a great job at the logo, so she asked for and got the IIMPACT logo (which isn’t public) and fed it to the model to re-generate it.

All of this is something I wasn’t even aware of until Claude pointed it out.

I sent a note thanking her.

This was today’s gesture. With AI’s help.

How I use Local MCP

Sat, 16 May 2026 22:24:32 +0800

I’d love for Claude or ChatGPT to answer questions like:

What meetings am I not setting up that I really should be?

or:

Based on my activities since 9 May 2026, what should I blog about?

or:

Who in my professional life most deserves an unreasonable gesture?

From data. My files, emails, calendar, contacts, transcripts, blogs, notes, code, browsing history, logs, random Markdown files I forgot I wrote.

Hence, a Local MCP.

My Local MCP server exposes one tool: bash.

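import subprocess

@mcp.tool()
async def bash(commands: str, ctx: Context) -> str:
    """Runs multiline bash script."""
    # capture_output + text are required: without them stdout and stderr are None
    result = subprocess.run(commands, shell=True, executable="/bin/bash", capture_output=True, text=True)
    return result.stdout + result.stderr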

That’s it. No vector database. No UI. No custom connectors. No “AI knowledge platform.”

Just: run shell commands on my machine.

I run this locally, expose it online (which is slightly scary), and give Claude and ChatGPT this prompt fragment:

Local MCP runs bash and exposes:

- ~/code/talks/README.md - talk transcripts, slides
- ~/code/blog/description.md - 20K files, 5K posts. Search for "- llm" for AI-related posts.
... (etc.)

`gws` can access email, calendar, etc.

In one shot, this gives EVERYTHING I have to the agents.

A common use is meeting prep.

You are a brilliant, brutally honest Chief of Staff. You have full access via Local MCP bash tool to calendar, emails, and past transcripts. Produce a briefing card for each substantive external meeting today.

It checks the calendar via gws. It searches my transcripts. My notes. My AI advice. Then gives me a briefing card with everything I need.

I can’t do this by uploading files manually. The context is not one file: it’s scattered all over.

A human assistant could do this. But agents are faster, cheaper, and I trust them more.

Another common use is relationship intelligence.

What meetings am I not setting up that I really should be?

Claude scans transcripts, contacts, emails, and recent activity to find people I should speak to.

This is where Local MCP is different from a file upload.

In a file upload, I can ask “Where is X?”.

Here, I’m asking “What am I missing?” and the answer depends on recency, relationship history, frequency, how conversations felt, unresolved actions, and so much more.

A third use is mining my own work.

I used Local MCP to ask what I should blog about. It scanned all my content and found themes I haven’t really thought about, like:

Google Meet captions - a code commit I recently made. I wrote about it.
Agents are the new software - a theme I’ve been talking a lot about. I wrote about it.
Local MCPs - that’s this post

… and half a dozen topics I should be writing about soon.

A fourth use is business research.

I have transcripts from sales calls and client conversations. I don’t attend all of them. But Local MCP can.

I can ask:

Which client needs have we heard repeatedly but not converted into demos?

or:

Which solutions have we pitched to one client that another client has explicitly asked for?

This is beyond a CRM search. A HubSpot search finds what people typed in. This finds what people actually said.

Then an email search finds if they acted on it. Calendar search finds what we spent time on.

Across these, I find opportunities that no single system has.

BUT: this is not safe by default. A bash MCP server can delete my files, run commands, read my browser sessions, send emails via gws, and do all sorts of risky things.

So I monitor the commands like a hawk, and give it fairly controlled access, and only when I’m actually running one of these use-cases.
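One way to make that monitoring easier is to log every script before it runs and cap how long it can run. This is a minimal sketch, not the author's server: `run_bash`, the `local-mcp.log` path, and the 60-second timeout are all assumptions made for illustration.

```python
import subprocess
import time


def run_bash(commands: str, log_path: str = "local-mcp.log", timeout: int = 60) -> str:
    """Append the script to an audit log, run it in bash, return stdout + stderr."""
    # Write the log entry BEFORE running, so even a destructive command leaves a trace.
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(f"{time.strftime('%Y-%m-%d %H:%M:%S')}\n{commands}\n---\n")
    try:
        result = subprocess.run(
            commands,
            shell=True,
            executable="/bin/bash",
            capture_output=True,  # without this, stdout and stderr are None
            text=True,
            timeout=timeout,  # hypothetical limit; tune to taste
        )
    except subprocess.TimeoutExpired:
        return f"Timed out after {timeout}s"
    return result.stdout + result.stderr
```

Running `tail -f local-mcp.log` in another terminal then shows each command as the agent issues it. A log is only an audit trail; it does not stop a harmful command.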

I tried OAuth but setting up Auth0, dynamic client registration, callback URLs, scopes, ChatGPT connector errors, … I gave up.

For now, supervised local usage gives me most of the value.

BUT #2: Claude and ChatGPT use Local MCP differently.

Claude uses it beautifully. Smooth. No mistakes. References memory.

ChatGPT is more restrictive. No chat memory accessed, nor saved. Keeps asking for permissions.

So I use ChatGPT less for Local MCP-heavy tasks. But ChatGPT is rigorous. When I want structured analysis, exhaustive lists, or better verification discipline, it is useful.

Local MCP is powerful because it lets AI use all systems I have access to:

local files - across Dropbox, Google Drive, my notes, blog posts, transcripts, slides, …
code - not just reading, but running, rewriting, and generating
email, calendar, contacts
browser history
shell tools - which can be used to access even more systems

Local MCP invites Claude / ChatGPT as a real assistant into my laptop.

And into my 2,700-line TODO archive.

You probably shouldn’t expose a bash tool to an AI. But note the direction I’m going with this:

If your work and transactions are agent-readable, your past work compounds.
If they are trapped in apps, screenshots, and memory, your AI has amnesia.

Google Meet captions as a local transcript recorder

Sat, 16 May 2026 13:32:06 +0800

I’m a man of simple needs. All I want is this: when I’m on Google Meet with captions turned on, I click a bookmarklet and it saves those captions into a local Markdown file. (So that an AI agent can guide me from it.)

Hence, Google Meet Captions. The code is in gmeetcaptions/. Drag the button to your bookmarks bar. Join a Meet. Turn on captions. Click it.

You get a tiny panel with two buttons: Copy and Start Recording.

The bookmarklet writes this kind of Markdown:

# Meeting title

- **Meeting code**: abc-defg-hij
- **Started**: 5/15/2026, 8:00:00 AM
- **Participants**: Alice, Bob, Carol

---

## Alice [0:12]

Good morning everyone.

## Bob [0:18]

Let's get started with the agenda.

That’s it. No server. No extension. No login. No API. Just a bookmarklet page, a script, and local browser APIs.
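The Markdown layout shown above is easy to reproduce in other tools. Here is a minimal Python sketch of it, not the bookmarklet's JavaScript. `render_transcript` and `format_offset` are hypothetical names, and reading `[0:12]` as minutes:seconds since the start is an assumption.

```python
def format_offset(seconds: int) -> str:
    """Format elapsed seconds as m:ss (assumed meaning of the [0:12] stamps)."""
    return f"{seconds // 60}:{seconds % 60:02d}"


def render_transcript(title, code, started, participants, turns):
    """Render a meeting as Markdown. `turns` is a list of (speaker, seconds, text)."""
    lines = [
        f"# {title}",
        "",
        f"- **Meeting code**: {code}",
        f"- **Started**: {started}",
        f"- **Participants**: {', '.join(participants)}",
        "",
        "---",
        "",
    ]
    for speaker, seconds, text in turns:
        # One heading per speaker turn, followed by the caption text
        lines += [f"## {speaker} [{format_offset(seconds)}]", "", text, ""]
    return "\n".join(lines)
```

One heading per turn keeps the file easy to diff and easy for an agent to split by speaker.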

BUT: Google Meet captions are live and unstable.

A sentence may appear as:

mic, so,

Then a second later become:

mic, So that's a new person. Okay.

Then become:

mic, So that's a new person. Okay. Hey. oh, but,

If I simply append every change, the transcript becomes garbage. So the bookmarklet keeps updating the active speaker turn until it becomes stable. The implementation uses a MutationObserver plus a one-second polling fallback. After four unchanged polls, it treats the turn as final.
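The "overwrite until stable" idea can be sketched in a few lines of Python. This is an illustration of the logic only, not the bookmarklet's code (which is JavaScript and also uses a MutationObserver). `TurnTracker` is a hypothetical name, and closing a turn as soon as the speaker changes is an assumption.

```python
class TurnTracker:
    """Keep overwriting the active caption turn; emit it once it stops changing."""

    def __init__(self, stable_polls: int = 4):  # the post: four unchanged polls
        self.stable_polls = stable_polls
        self.current = None  # (speaker, text) of the active turn
        self.unchanged = 0  # consecutive polls that saw no change
        self.final_turns = []

    def poll(self, speaker: str, text: str) -> None:
        """Call once per poll (about once a second) with the live caption."""
        seen = (speaker, text)
        if seen == self.current:
            self.unchanged += 1
            if self.unchanged == self.stable_polls:
                self.final_turns.append(seen)  # emitted exactly once
            return
        # Assumption: a new speaker closes a turn that had not yet stabilized.
        if self.current and self.current[0] != speaker and self.unchanged < self.stable_polls:
            self.final_turns.append(self.current)
        self.current = seen  # overwrite instead of appending
        self.unchanged = 0
```

Fed the three snapshots above followed by four identical polls, the tracker records one turn holding only the last, longest text.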

The tests are in gmeetcaptions.test.js, using an anonymized fixture at __fixtures__/captions-anonymized.html.

BUT #2: Google Meet’s DOM is not a public API. Class names like .nMcdL, .NWpY1d, and .ygicle can vanish overnight.

So the scraper first tries semantic and structural selectors:

[role="region"][aria-label="Captions"] for the captions region
img[data-iml] and googleusercontent.com avatars to identify caption items
the first <span> as the speaker
the last non-image <div> as the caption text

Only then does it fall back to obfuscated class names. That selector strategy is documented in the README.

Boring, but also the difference between “worked once” and “might work tomorrow.”

The weirdest was Chrome writing to a .md.crswap file while recording. The file appears unfinished until I click Stop Recording. Then Chrome finalizes it.

This is good, actually. It means the browser is safely streaming to a local file via the File System Access API. But it also means: stop the recorder before trusting the file!

I captured these bugs and prompts in prompts.md, because future-me will forget. Future-agent, too.

Why bother? Because transcripts are not the output. They are raw material.

Once a meeting is Markdown, I can ask agents to extract decisions, questions, follow-ups, contradictions, reusable prompts, and blog ideas. I can diff it. Search it. Commit it. Feed it to another workflow.

Meetings now become the “context” in context engineering!

Unresolved questions across disciplines

Thu, 07 May 2026 20:45:58 +0800

I asked Claude: “What are the most effective and impactful ways you can help me?” One of its ideas was to ask it:

What are the three questions this field has not resolved, where the disagreement is substantive and not just semantic? Who represents each position most forcefully?

So I posed this question about several subjects. This is a great way to discover the frontiers of knowledge in a field.

Mathematics
- Are mathematical objects (numbers, sets, complex numbers) real, independent of human minds? Or useful fiction? Do we discover or invent them?
- Is there one true set theory or a multiverse of equally valid set theories, e.g. where the continuum hypothesis is true vs false?
- Are theorems true or false, even if we can’t prove them?
Biology
- Does natural selection operate at a gene, individual, or group level?
- Does evolution proceed through gene mutation + selection, or do we need to add epigenetics, niche construction, and developmental bias?
- Which came first - RNA or metabolism?
Physics
- What actually happens when a quantum system is measured?
- When a black hole evaporates, is the information about what fell in preserved (and how) in the Hawking radiation, or destroyed?
- Why does the universe have a strong entropy increase toward the future but not the past?
Computer science
- Is P really equal to NP?
- Can fault-tolerant quantum computers actually be built?
- Can we really achieve AGI with computation?
Finance
- Do asset prices track underlying value or market narrative / noise?
- Why is the historical US equity premium (~6%) so high that it implies a ~30–40x risk aversion?
- Is financial instability inevitable in capitalism, or is it due to external shocks?
Psychology
- Are emotions hard-wired (hence discovered) or constructed from more primitive ingredients?
- Does trauma hide the memory that we recover later, or are these recovered pieces fabricated?
- In psychotherapy, do specific techniques work? Or is it just the relationship?
History
- Were the atomic bombings of Japan militarily necessary?
- Was the Holocaust planned from the start, or did it emerge from bureaucratic chaos?
- Did the Industrial Revolution raise or lower living standards for ordinary workers?
Geography
- Does physical geography cause big changes to human outcomes in the long run?
- Are places about a geographic location or about the capital, people, and power that flow through them?
- Are local, regional, national, … really nested ways of organizing society, or an ineffective way of thinking about geography?

Tracking redirects minimally

Thu, 07 May 2026 16:28:22 +0800

Everyone needs a tracking URL shortener.

Why tracking? I want to know if they opened my email and clicked the link.
Why shortener? I want them to know what the link is about. For example, https://r.s-anand.net/edge-remote-debugging.html is so much more meaningful than https://chatgpt.com/share/68528565-0d34-800c-b9ec-6dccca01c24c

I’ve used redirection services in the past - like t.co, bit.ly, goo.gl, ow.ly, and others. They tend to vanish, start charging, serve ads, etc.

Here’s my solution: use static HTML for redirection.

For example, https://r.s-anand.net/example.html redirects to https://example.com/. Here’s the code:

<script data-goatcounter="https://sanand0.goatcounter.com/count" async src="//gc.zgo.at/count.js"></script>
<meta http-equiv="refresh" content="0; url=https://example.com/">

The first line sets up tracking with GoatCounter, which is my current favorite analytics provider. It might vanish, but I can export the data if required and move to another. While it lasts, I can check how often the link was used.

I can create personalized links. For example, https://r.s-anand.net/example.html?ref=user1 tags user1. I can create unique links for each recipient.

The second redirects to the target page immediately.
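With many links, generating these pages by script saves typing. Below is a small Python sketch that writes the same two lines for any target and builds the personalized `?ref=` links. `redirect_page` and `personalized_link` are hypothetical helper names; the GoatCounter endpoint and the domain come from the examples above.

```python
from html import escape
from urllib.parse import urlencode

GOATCOUNTER = "https://sanand0.goatcounter.com/count"  # endpoint from the post


def redirect_page(target_url: str) -> str:
    """Return a static HTML page that counts the visit, then redirects."""
    safe_url = escape(target_url, quote=True)  # keep & and quotes valid inside the attribute
    return (
        f'<script data-goatcounter="{GOATCOUNTER}" async src="//gc.zgo.at/count.js"></script>\n'
        f'<meta http-equiv="refresh" content="0; url={safe_url}">\n'
    )


def personalized_link(slug: str, ref: str, base: str = "https://r.s-anand.net") -> str:
    """Build a per-recipient link such as .../example.html?ref=user1."""
    return f"{base}/{slug}.html?{urlencode({'ref': ref})}"


if __name__ == "__main__":
    print(redirect_page("https://example.com/"))
    print(personalized_link("example", "user1"))
```

Saving the output of `redirect_page(...)` as `example.html` and uploading it is the whole deployment.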

I have deployed the files on a Cloudflare R2 bucket with a custom domain, which is practically free and fast. But I can switch at any point to any other hosting provider (e.g. GitHub Pages) by just copying the files and changing the DNS settings.

Advantages: No hosting, no fees, no maintenance, no vendor lock-in.

How the Innovation Team works

Sun, 03 May 2026 16:15:15 +0800

Based on 44 meeting recordings from February to late April 2026, here’s how Straive’s small team (3-6 people at any time, mostly freshers and interns) produces a continuous stream of client-facing demos across topics as diverse as image filtering, geospatial analysis, insurance contract verification, NFL medical scoring, OCR benchmarking, and song similarity clustering, often with a 24–48 hour turnaround from assignment to demo.

Here is how the team works:

Build demos, not products. Every task traces to a specific client meeting with a known date. “Done” means good enough to show once to one audience — not production-ready. The moment a demo works, it gets shown; refinement happens only if the client asks for more.
Show output first, always. Start every update by showing the thing — not explaining what you did to build it. If you don’t have output yet, say so in one sentence and then show where you are. Process is for after the food arrives.
One person holds all client context. The team lead attends client meetings, filters what matters, and translates it into specific buildable tasks. Team members build; they don’t need to know why. This keeps work relevant and prevents wasted effort on misaligned output.
Explore broadly, cut ruthlessly. Multiple tracks run simultaneously — robotics, embeddings, OCR, formal logic — but anything that doesn’t demo well gets dropped fast. The sequence is always: assign a loose exploration, see output quickly, deepen what works, kill what doesn’t.
Compress everything. Small files, single HTML pages, 50KB images, 30-line YAMLs. This isn’t aesthetic — demos need to load in bad hotel WiFi, repos need to clone in meetings, and files need to forward over email. Technical choices serve the demo context.
Make pipelines reproducible after they work. Once a demo runs, it needs shell scripts or CLI commands that reproduce it from scratch. No committed data, no manual steps, no “ask X how it works.” If someone can’t clone and run it, it doesn’t exist as an asset.
Operate AI agents, don’t just use them. The team’s primary skill is steering coding agents — Codex, Claude Code, Copilot — toward specific outputs under specific constraints. When stuck, push harder (“drive it to death”), switch agents, or change the prompt. Don’t wait for instruction on how to build something; use the agent to figure it out.
Numbers, not observations. Every story needs a specific number: 252 test cases passed, $22 for 2,300 slides, 40% cheaper per-deck vs. per-slide. Vague quality claims don’t survive a client meeting. If a finding can be quantified, quantify it; if it can’t, find a different finding.
Action over analysis. The output of every demo should answer “what do I do?” not “what do I know?” Sort students by who needs a call today, not by distress score distribution. Surface the critical failure in the warehouse footage, not a compliance percentage. The analysis can be one click away — it should never be the headline.

However, there are many improvements the team needs to make.

The Single Point of Failure Problem. The entire team is dependent on Anand for direction, client context, task assignment, quality review, and stakeholder relationships.
Missing: Taxonomy and Discoverability of Demos. The team produces dozens of demos, benchmarks, and data stories. There’s a vague mention of a “catalog” and a “demo list” that Anand maintains, but it’s personal and opaque.
Missing: Explicit Knowledge Transfer Between Members. Team members work on parallel tracks with little cross-pollination. X’s work on 3D benchmarking uses similar methodology to Y’s UMAP work, but they don’t reference each other’s approaches. When Anand wants them to collaborate, he explicitly engineers it.
The Presentation Quality Gap. Team members consistently make the same presentational errors across many months. Anand corrects these every time, but they recur because the corrections aren’t being internalized or documented.
Client Handoff Documentation Is Missing. Several demos reach clients without clear documentation of what was built, what the inputs were, and how to reproduce or extend it. Straive’s pitch is “here is what AI can do for you”, but without handoff documentation, clients can’t do anything with the demo.
The Timing Problem on Model Choices. The team regularly uses outdated or suboptimal models, then switches when Anand notices. The team doesn’t have a maintained “current best model for X task” reference. Each person relies on whatever they used last time or whatever they happen to know about.
The Right Senior Mentorship for the Right Stage. Anand is simultaneously mentor, product manager, client liaison, and technical reviewer. This works but creates a bottleneck. More importantly, some of the most valuable mentorship happens late. The team would benefit from more structured “senior reviews junior’s plan before execution” moments: not Anand reviewing output after, but Anand reviewing the approach before.

Every point is spot on and totally useful to me. The best part is that it just required me to paste the transcripts and ask it to analyze the team’s structure.

You can analyze your own team meetings too. Paste a dozen transcripts into a good AI agent and ask it:

Based on these transcripts, what are my team’s principles and operating model? How should we improve and why? Share with specific examples from the transcript.

This is like hiring a $100/hr organizational consultant to attend your meetings and give you personalized feedback!

Deploying websites over dinner

Mon, 27 Apr 2026 16:30:50 -0400

Over dinner with Nishka, we were trying to deploy a website. The challenge was: How can we deploy this website, just on mobile, without getting up from the dinner table?

STEP 1: Hosting. On my phone, I dictated to ChatGPT (whose transcription is excellent), and copy-pasted that to Gemini (which is faster):

I want to publish specifically a static HTML web page on my own domain.
I want the easiest way that I can host it, preferably just by copy-pasting from my mobile without needing to muck around with Git and the likes of it.
What are the most robust, reliable hosting providers that I could use? I can sort out the domain name myself as long as they support an option to map a custom domain name to them.
Ideally, I am looking for something that is free, preferably free forever.

Sambar Styles

Sun, 26 Apr 2026 20:21:51 -0400

My wife’s sambar tastes different from my mother’s. And mine, too. When I cooked as a bachelor, my neighbour would pop by, taste the sambar, and exclaim, “Rasam super!”

Surbhi’s Day 5 of the 30-day challenge was about sambar, which inspired me to take her dataset and build a decision tree that predicts which state a sambar recipe is from, based on its ingredients.

ChatGPT started with 68 recipes and built a tree at 41% accuracy. As we added more recipes:

Recipes	Accuracy
68	41%
293	42%
361	55%
406	54%

… the accuracy wasn’t improving all that much.

Here is the classifier script: sambar_fftree.py. You can run it via:

uv run https://files.s-anand.net/blog/2026-04-26-sambar-styles/sambar_fftree.py

But a few ingredients are snipers: rare, precise, devastating.

Kokum or goda masala -> Maharashtra. 32 for 32 in one run. Perfect.
Sesame/gingelly oil -> Tamil Nadu. 28 for 32. Strong.
Coconut oil + shallots -> Kerala. 22 for 29.
Moong dal + no mustard seeds -> Andhra. 31 for 36. Better than garlic.
Byadagi chillies -> Karnataka. 6 for 7. Tiny sample, but clean.

But without some of these strong signals, the sambar could be from anywhere. Better to abstain when unsure.
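As a rough illustration of abstaining, here is a tiny rule-based classifier built only from the signature ingredients listed above. It is not the logic of the author's script. `guess_state`, the normalized ingredient names, and the order in which rules are checked are assumptions.

```python
from typing import Optional


def guess_state(ingredients: set[str]) -> Optional[str]:
    """Return a state only when a signature ingredient is present, else None."""
    if ingredients & {"kokum", "goda masala"}:
        return "Maharashtra"
    if ingredients & {"sesame oil", "gingelly oil"}:
        return "Tamil Nadu"
    if {"coconut oil", "shallots"} <= ingredients:  # both are needed
        return "Kerala"
    if "moong dal" in ingredients and "mustard seeds" not in ingredients:
        return "Andhra"
    if "byadagi chillies" in ingredients:
        return "Karnataka"
    return None  # abstain: no strong signal, so the sambar could be from anywhere


if __name__ == "__main__":
    print(guess_state({"toor dal", "kokum", "tamarind"}))  # Maharashtra
    print(guess_state({"toor dal", "tamarind", "drumstick"}))  # None
```

Returning `None` instead of the most likely state lowers coverage but keeps the answers it does give trustworthy.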

Here is the classifier that allows abstentions: sambar_fftree_abstain.py and the dataset I used. You can run it via:

# Download the files
wget https://files.s-anand.net/blog/2026-04-26-sambar-styles/sambar_recipe_dataset.csv \
     https://files.s-anand.net/blog/2026-04-26-sambar-styles/sambar_fftree.py \
     https://files.s-anand.net/blog/2026-04-26-sambar-styles/sambar_fftree_abstension.py
# Run the script with the data I used
uv run sambar_fftree_abstension.py --no-download

Only about a third of recipes have a clear signal.

Incidentally, coconut alone is not a Kerala signal. It’s more “west coast”.

State	Uses grated coconut
Karnataka	60%
Kerala	58%
Maharashtra	32%
Tamil Nadu	24%
Andhra	14%

Garlic is not Andhra either. In one run, garlic + no coconut was a 50-50 split between Andhra and Maharashtra.

Rather than states, it’s better to think of styles.

Style	Ingredients
Amti	kokum, goda masala
Tamil tiffin-sambar	sesame oil, sambar powder, tamarind
Kerala coconut-tempering	coconut oil, shallots
Andhra pappu/charu	moong dal, less mustard
Karnataka sweet-roasted	byadagi, jaggery, coconut

Maharashtra is easy to identify if it’s amti style. Without kokum or goda masala, it’s generic sambar.

Tamil Nadu sambar has two distinctive styles: sesame/gingelly oil; or sambar powder + tamarind.

With all of this, we could identify the state only about one-third of the time based on ~2,000 recipes. But we can identify the distinctive styles from their ingredients, when those ingredients are present.

Like my bachelor-days sambar, which was missing dal. (No one told me sambar needs dal.) And my neighbour could identify it instantly. As rasam.

Panchayat solves the wrong problem

Sun, 26 Apr 2026 19:02:18 -0400

In Panchayat Season 1 Episode 7 Ladka Tez Hai Lekin…, at around 17:00, Pradhan asks Abhishek to solve problem 42.

42. A takes 5 days more than B to do a certain job and 9 days more than C.
    A and B together can do the job in the same time as C.
    How many days would A take to do it?

    (a) 16 days
    (b) 18 days
    (c) 15 days
    (d) 20 days

The correct answer is (c) 15 days. But interestingly, ChatGPT got it wrong the first time too. It said (a) 15 days instead of (c) 15 days, and required a fact-check to correct itself.
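The answer is easy to check. If A takes a days, B takes a − 5 and C takes a − 9, so 1/a + 1/(a − 5) = 1/(a − 9). That simplifies to a² − 18a + 45 = 0, giving a = 15 or a = 3, and a = 3 is impossible because B and C would need negative days. A short Python check of the four options, using exact fractions (the function name `same_time` is just for illustration):

```python
from fractions import Fraction


def same_time(a: int) -> bool:
    """True if A and B together finish in exactly the time C takes alone."""
    b, c = a - 5, a - 9  # A takes 5 days more than B and 9 days more than C
    return Fraction(1, a) + Fraction(1, b) == Fraction(1, c)


options = {"a": 16, "b": 18, "c": 15, "d": 20}
matches = [label for label, days in options.items() if same_time(days)]
print(matches)  # ['c']
```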

But Abhishek’s steps have nothing to do with the problem - he proceeds to solve a completely different problem and ends up with the supposedly correct answer of 37/23.


(42)

Qty of x in container (100 l) = 40 lts
Qty of y in the container of 100 ltr = 60 ltr

Qty of Milk in final Mix = 100 × 3/4 = 75 ltr
Qty of water in final Mix = 100 − 75 = 25 ltr

[crossed / scratched working]

x = 40 × 2/3 , 40 × 1/3 = 80/3 , 40/3   [circled]

Qty of Milk & water in 60 ltr of Mix

y = (60 × a)/(a + 2) , (60 × 2)/(a + 2)
  = 60a/(a + 2) , 120/(a + 2)

Now, the problem I’m trying to solve is, which of these made the bigger mistake:

Abhishek, who solved a different problem
ChatGPT, who got the answer right but the option wrong
The writers of the show, who failed to correct this obvious error

AI advice for teams

Sat, 25 Apr 2026 12:17:03 -0400

I updated my AI Advice page by:

Transcribing my calls in the last 2 months (Gemini 3.1 Pro, “Transcribe this call recording…”)
Extracting AI advice (Gemini 3 Flash, “Summarize ALL AI-related advice … into 1-sentence bullets”)
Asking Claude, ChatGPT, and Gemini to document what’s new / changed.

I added this request:

But, and this is IMPORTANT, analyze my original writing style, write it exactly in that style, and then verify to make sure it follows the same style (correcting where required.)

This worked surprisingly well with Claude 4.6 Sonnet (adaptive). I used the resulting ai-advice.md almost verbatim.

Here’s the full list of changes it suggested:

The biggest structural update: ai-advice.md is still written as “how individuals use AI better.” Your actual advice has evolved into “how people, teams, and organizations operate in the agent era.” The whole document should eventually be reorganized to reflect this.

INSERT

Build an AI workspace, not just a chat (Very high frequency — 10+ docs) Every serious AI project needs a project folder containing:
- AGENTS.md — folder-specific instructions the agent reads on startup
- prompts.md — all prompts version-controlled as source code
- skills/ — encapsulated successful workflows (see #2)
- Git repository with commits at every checkpoint
- Test fixtures, synthetic datasets, logs, outputs
Treat prompts as the real IP. Code is disposable; prompts, tests, and skills are assets.
Encapsulate successful workflows into reusable skills (Very high frequency) Once an agent succeeds at a task three times, encapsulate it: the prompt, tools used, edge cases, constraints, validation tests. Store in a skill.md file. Skills are the new software libraries — they make workflows deterministically repeatable without re-explaining everything. Use agents to build these skills by asking them to summarize what they learned.
Run coding agents safely: Git + Docker (Very high frequency) Always: (a) work inside a Git repository and instruct the agent to commit as it goes — git checkout is your undo button, (b) run agents inside Docker containers so they cannot touch your actual files, (c) use “YOLO mode” (skip permission prompts) only inside isolated containers. These aren’t optional for anything beyond throwaway prototypes.
“LLMs hallucinate, but code doesn’t” — use code as the truth engine (Very high frequency) Broaden “have it write code to process numbers” significantly. The mantra is: wherever correctness matters, make the AI produce executable code or logic rather than natural language answers. Code either works or fails — it’s binary and auditable. Use domain-specific languages (Prolog-like rule trees, schema validators, policy-as-code) for logic-heavy tasks. This is the primary mechanism for eliminating hallucinations in production.
Build verification into the workflow, not after it (Very high frequency) Verification should be engineered as a product feature, not added as a post-hoc check. Every output should expose: source citations linked to snippets, confidence levels, what’s unverifiable, disagreement signals, and audit logs. Use model disagreement as a routing signal — when models disagree, send to human review; when they agree, lower review priority. Build golden sets to measure actual accuracy on your specific task.
Use AI for exception triage, not blanket automation (High frequency) Let AI classify outputs as red/yellow/green: green = automate fully, yellow = flag for review, red = human required. This is more mature than “80-90% AI, human for last mile.” It says exactly where the human loop belongs, and it scales: automation handles routine volume while humans focus only on high-stakes exceptions.
Use synthetic data deliberately (High frequency) Not just “realistic fake data for prototyping” — generate hypothesis-driven synthetic data that embeds specific behavioral patterns, edge cases, and known failure modes you expect in production. This lets you stress-test before real data arrives, without compliance concerns, and at whatever messiness level you choose.
Treat demos as imagination accelerators (High frequency) Demos are not just proof-of-concept — they are the fastest way to expand what stakeholders think is possible. Use “Hollywood set” demos: working outputs, simulated backends, precomputed workflows, client-specific synthetic data. Only demo live if the task completes in under 10 minutes. Simulate or precompute slow, expensive, or credential-heavy workflows. Show the output first; defend the architecture only if asked.
Maintain a living model radar — don’t freeze model advice (High frequency) Specific model recommendations go stale within months. The durable advice: continuously blind-test frontier models on your exact task, maintain a benchmark set, and route by capability. Current pattern: Claude for coding/aesthetic/style/writing; ChatGPT for rigorous analysis/financial modeling/extended thinking; Gemini for Google Workspace/research/video/speed. But measure this; don’t assume it. Additionally: use LiteLLM or Portkey as open-source gateways for organizational cost observability across models.
The Jevons Paradox applies to knowledge work (High frequency in strategic contexts) AI making cognitive tasks cheaper will increase total demand for cognitive work, not reduce it. Human roles shift from execution to verification and judgment — but there’s a talent crunch coming for verification roles. Hire now for people who can check, certify, and take accountability for AI output.
Use games to teach AI, not slide decks (High frequency) Replace passive L&D with Capture the Flag challenges, treasure hunts, forbidden-word jailbreaks, prompt-injection games, and coding-agent races. Evaluate proficiency by task completion speed with an agent, not syntax recall. Design challenges where using a coding agent is the only practical way to finish in time — this creates binary signal: those who can use agents solve everything; those who can’t solve nothing.
Bifurcate hallucination advice: operational vs. creative (Medium-high frequency) Current advice mixes these. Split explicitly:
- For operations, facts, finance, law, regulated outputs: eliminate hallucinations via multi-agent consensus, code execution, source grounding, and human routing
- For ideation, brainstorming, research: deliberately use hallucinations as stochastic ideation. Run the same prompt multiple times. Use weaker models without extended thinking — “speaking without thinking” produces more imaginative divergence
Move from dashboards to answers and actions (High frequency) Replace static BI dashboards with AI that answers “what should I do?” not just “what happened?” Ask AI to anticipate a stakeholder’s questions and pre-answer them. The endpoint: proactive agents that push insights to individuals rather than passive dashboards that wait to be queried.
Sell outcomes, accountability, and verification — not software (High frequency in business contexts) Software is a depreciating asset; any client can regenerate it tomorrow. Durable value: judgment, trust, domain expertise, data access, and taking responsibility for results. Shift toward outcome-based pricing. The “neck to catch” — human accountability for AI output — is increasingly the product.
Measure AI adoption by behavior, not attendance (Medium frequency) Track: unique days of active use (regularity beats volume), token consumption trends, tool diversity, quality of outputs produced, and business outcomes driven. Usage logs from NetSkope or LLM gateways give better signal than training completion rates.
Assess AI literacy by how people prompt, verify, and recover (High frequency in education contexts) Don’t evaluate final answers — AI can produce those. Evaluate: quality of prompts (specificity, guardrails, constraints), ability to identify and fix hallucinations, recovery from errors, and process discipline. Multiple-choice questions are essentially obsolete for AI-era assessment. Assess the process, not the output.
Use AI-native output formats (High frequency) Stop defaulting to PPT or PDF. AI generates HTML, SVG, JSON, interactive dashboards, podcasts, sketch notes, and games better than it generates static slides. A single source document can auto-generate: podcasts, explainer videos, interactive quizzes, sketch notes, executive summaries, slide decks. Use NotebookLM for audio synthesis from diverse sources.
Audit your own behavior via AI (Medium frequency) Feed your own meeting transcripts, email chains, and call recordings into LLMs to find personal blind spots, biases, and recurring errors. Conduct project post-mortems on email threads. Use adversarial prompting — pit GPT against Claude to stress-test your plans. This expands “mine your digital exhaust” from insight into behavioral coaching.
Optimize content for the agentic web (Medium frequency) More content is now consumed by AI agents than humans. Publish in formats AI can parse, cite, and remix: clean metadata, semantic structure, source links, reusable chunks. Develop MCP connectors to your proprietary content. This is SEO for the agentic era.
Ask AI what it needs before starting (Medium frequency) Don’t guess what context to provide. Ask: “What information, tools, files, and access do you need to do X?” Let the agent specify missing pieces before it starts, not halfway through. Simple and dramatically reduces mid-task derailment.
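As a rough illustration of "unique days of active use" from the adoption-measurement item above, here is a minimal sketch. The log format (a list of user and ISO timestamp pairs) and the user names are assumptions for illustration; real exports from NetSkope or an LLM gateway will look different.

```python
from collections import defaultdict
from datetime import datetime


def active_days(events):
    """Count the distinct calendar days on which each user made at least one AI call."""
    days = defaultdict(set)
    for user, timestamp in events:
        days[user].add(datetime.fromisoformat(timestamp).date())
    return {user: len(dates) for user, dates in days.items()}


# Hypothetical usage log: (user, ISO timestamp) pairs
events = [
    ("asha", "2026-04-01T09:00:00"),
    ("asha", "2026-04-01T17:30:00"),  # same day, counted once
    ("asha", "2026-04-03T11:15:00"),
    ("ben", "2026-04-02T10:00:00"),
]
print(active_days(events))
```

Counting days rather than calls is the point: two calls on one day count once, so regularity shows up and bursts don't.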

UPDATE

Two-Strike Rule → time-box and preserve context before abandoning
- Current: Abandon after two failed fix attempts.
- Update: After two failed repair loops: (a) ask the agent to produce a failure summary, minimal reproduction case, and fresh plan before abandoning, (b) switch to a time limit (2 hours) not an attempt limit, (c) in Docker/YOLO mode, let agents iterate without this limit. The current rule was written for chat-based coding; agentic tools self-correct across many more iterations. Pure restarts discard useful diagnostic context.
Paid subscription → quality + friction, not blanket privacy
- Current: “Your data isn’t used to train the models. This is the best $20/month.”
- Update: Paid subscriptions give better models and less friction. For privacy, the picture is more nuanced: consumer plans (ChatGPT, Claude.ai) have data controls you must check and configure; Enterprise/API plans explicitly exclude training by default. For sensitive work, use Enterprise/API or run locally. Don’t assume consumer paid = private (see the OpenAI Data Controls FAQ). Also: maintain subscriptions to all three major models (~$60-80/month), not one. Heavy users: consider the $100/month tier to eliminate friction during peak experimentation.
Model recommendations — replace frozen Q1 2026 advice with routing logic
- Current: “Claude/Gemini still good at UI. GPT for rigorous testing.”
- Update: See INSERT #9. Add: use a more capable model (Claude) to write scripts and instructions for cheaper models (Codex) to execute. Benchmark on your exact task; these rankings shift quarterly.
“Intern” — expand to multiple mental models by task
- Current: “It’s as smart as a post-graduate intern.”
- Update: The right mental model depends on task:
  - Brilliant but stubborn intern: excellent at fetching/preparing materials, unreliable for precise design or nuanced judgment
  - Fresh MBA who needs full context: give it the same rules, examples, and feedback you’d give a new hire
  - Senior mentor to defer to: for syntax, library knowledge, and coding patterns, AI may know better than you — defer (“Mentor Flip”)
  - Alien intelligence that needs coaching: for novel tasks, it needs explanation, not just instruction
Human-in-the-loop → human-on-the-loop with exception routing
- Current: “Handle 80-90% of effort, human expert for last mile validation.”
- Update: More mature framing — “human-on-the-loop” rather than “human-in-the-loop.” Build a confidence-building period first; validate; then grant autonomy for routine cases. The human’s job is to review exceptions (disagreements, low-confidence, high-stakes), not everything.
“Code is disposable” → prompts and skills are the real assets
- Current: “Code is an AI compilation artifact. Don’t get attached to it.”
- Update: Code is disposable when the workflow is disposable. But prompts, skills, tests, data contracts, and validation logic are permanent assets that compound in value. Preserve these even when you throw away the code.
“Don’t learn to code” → learn logic, not syntax
- Current: “As a non-technical person, build apps. Don’t learn to code.”
- Update: Don’t worship syntax — it’s declining in value. But learn enough conceptual fluency to: specify what you want clearly, write test cases, debug outputs, assess security implications, and judge whether AI-generated code is correct. Syntax is less valuable; understanding is not.
“Buy, don’t build” → buy foundations, build thin orchestration
- Current: “Don’t train models. Build orchestration layers and proprietary data workflows.”
- Update: Don’t build or fine-tune base models (they’re obsolete on arrival). Do build: thin domain-specific orchestration, skills/prompt libraries, verification layers, data pipelines, and MCP connectors. Avoid custom SLMs unless you have strict air-gap, privacy, or cost-at-scale constraints — the “SLM Depreciation Trap” (custom models obsolete before deployment) is real.
“Wait for models to improve” → apply a 1-3 month ROI window
- Current: “Things not possible today will be possible in a few months.”
- Update: Apply a test: if a workaround won’t pay back within 1-3 months, wait. If building creates learning, adoption, or strategic leverage now, prototype anyway. The advice shouldn’t be “wait” or “build” — it should be “calculate the ROI window.”
Data safety → specific operational checklist
- Current: Send schema not data; pick trusted providers; anonymize.
- Update: Add specific controls: set Google Drive access to read-only and Gmail to draft-only for AI; keep a dedicated “AI-only” folder rather than granting full Drive access; use separate browser profiles for work/personal AI; run agents locally (Codex, Claude Code on-machine) for sensitive data; use MCP for restricted, scoped data access. Anonymize before cloud; schema+local-execution for sensitive tabular data.
“Hallucinations can be a great feature” → boundary-condition this
- Current: “Don’t always eliminate them. Use as appropriate.”
- Update: Great for: ideation, research brainstorming, creative divergence, humor. Never acceptable for: facts, finance, law, medicine, safety, or regulated outputs without verification. Be explicit about which mode you’re in.
Skills section — “declining” needs nuance
- Current: “Domain depth” listed as declining.
- Update: Routine versions of domain skills decline; judgment-heavy versions grow. Domain depth matters most for: problem framing, validation design, incentive mapping, ethics, and edge-case recognition. Don’t blanket-advise people to abandon domain expertise.

DELETE

Two-Strike Rule (current form): Outdated for agentic tools; replaced by time-boxing + context preservation
“Paid subscription = privacy” (the simple version): Factually incomplete; needs the consumer/enterprise/API distinction
“If all models agree, accept”: Too strong. Soften to: “agreement lowers review priority; measure on a golden set”
“Claude/Gemini for UI, GPT for rigorous testing” Q1 2026 frozen claim: Goes stale; replace with routing logic + model radar
“Prefer less experienced people” (blunt version): Replace with: “prefer AI-native, humble, high-agency people — could be interns, domain experts, or non-coders; the traits are delegation, verification, and fast learning”
“Wait for the crisis” (adoption section): Reframe to: “watch for urgency windows; arrive prepared with demos, risk framing, and low-friction integration”
“Domain depth is declining” (blanket): Replace with nuanced version from UPDATE #12
“Use AI for validation is safe and effective” (unqualified): Replace with: “use AI to design validation workflows; don’t treat AI output as validation itself”
“Repurpose content and data” TODO placeholder: Fill it in or remove it
“Have it write code to process numbers”: Upgrade to “LLMs hallucinate, but code doesn’t” — write and execute code for correctness

Contradictions

“Stay out of the way” vs. “verify everything”: Risk ladder: low-risk/creative/prototype = get out of the way; high-risk/regulated/persistent = full verification stack
“Code is disposable” vs. “use Git, Docker, versioning”: Code may be disposable; recovery, reproducibility, and auditability are not
“Don’t learn to code” vs. “candidates need technical depth”: Syntax is less valuable; conceptual fluency (logic, testing, security, judgment) is more valuable
“Buy, don’t build” vs. “build custom pipelines”: Don’t build foundation models or heavy platforms. Do build thin orchestration, domain workflows, verification layers, skills
“AI wildly” vs. security/privacy constraints: Overuse for low-risk/personal tasks; use enterprise/local/sandboxed patterns for sensitive work
“Human-in-the-loop” vs. autonomous agents: Exception routing: automate routine cases, route edge cases and disagreements to humans
Live demos vs. simulated demos: Live only if task completes in under 10 minutes; simulate otherwise
“Hallucinations as feature” vs. eliminate hallucinations: Feature for ideation; eliminated for operations/facts/regulated outputs
“AI can do health/finance better than experts” vs. responsibility: AI supports preparation and second opinions; decisions need qualified human accountability
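The exception-routing idea (automate routine cases; send disagreements, low-confidence, and high-stakes cases to humans) might be sketched like this. The `Case` fields, the `route` function, and the 0.8 threshold are all hypothetical names and values chosen for illustration, not something from the original notes.

```python
from dataclasses import dataclass


@dataclass
class Case:
    answers: list[str]   # one answer per model consulted
    confidence: float    # 0..1; how this is scored is an assumption
    high_stakes: bool    # e.g. regulated, financial, or irreversible


def route(case, min_confidence=0.8):
    """Return 'human' for exceptions and 'auto' for routine cases."""
    if case.high_stakes:
        return "human"                 # high-stakes: always reviewed
    if len(set(case.answers)) > 1:
        return "human"                 # models disagree
    if case.confidence < min_confidence:
        return "human"                 # low confidence
    return "auto"                      # routine: agreed, confident, low-stakes


print(route(Case(["approve", "approve"], 0.95, False)))
print(route(Case(["approve", "reject"], 0.95, False)))
```

Agreement only lowers review priority. The "auto" bucket should still be sampled and measured against a golden set before it is trusted.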

10 principles

The scarce skill is not doing the work; it’s choosing the work, feeding the agent context, and verifying the result.
Prompts, skills, tests, and context files are assets. Code is a byproduct.
Use AI wildly in low-risk contexts; use it rigorously in high-risk contexts.
For facts and money, make AI produce evidence. For logic, make it produce code. For operations, make it produce audit trails.
LLMs hallucinate, but code doesn’t. When correctness matters, make AI write and run code.
Treat demos as imagination accelerators — show what’s now possible before arguing about architecture.
Don’t sell software if the client can regenerate it tomorrow. Sell outcomes, accountability, and verification.
AI training should be a game of doing, breaking, checking, and recovering — not a lecture about tools.
Every repeated AI success should become a reusable skill. Every failure is training data if you preserve the prompt, output, rejection reason, and fix.
The risk today is not just hallucination; it’s underuse, insecure overuse, and unverified scale.

Suggested structural reorganization

Your current document answers: “What tips should I follow?” Your actual advice answers: “How do I operate in the agent era?”

LLMs are as energy-efficient as brains

Sat, 25 Apr 2026 08:25:37 -0400

For a typical GDPVal style task, humans take ~7 hours and the brain consumes ~135 Wh. Frontier LLM agents spend 50-500 Wh. So, LLM agents may already be anywhere from ~3x more efficient to ~4x less efficient than the brain.

Roughly in the same ballpark!
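The arithmetic behind that comparison, as a quick sketch. The three input numbers are the ones quoted above; everything else is derived from them.

```python
brain_wh = 135                      # brain energy for the task, from the post
hours = 7                           # human time for the task, from the post
llm_low_wh, llm_high_wh = 50, 500   # frontier LLM agent range, from the post

brain_watts = brain_wh / hours        # implied average brain power
best_case = brain_wh / llm_low_wh     # LLM at the low end uses less energy
worst_case = llm_high_wh / brain_wh   # LLM at the high end uses more energy

print(f"Implied brain power: {brain_watts:.1f} W")
print(f"Best case: LLM is {best_case:.1f}x more efficient")
print(f"Worst case: LLM is {worst_case:.1f}x less efficient")
```

This prints about 19.3 W, 2.7x, and 3.7x. The implied brain power is close to the commonly cited ~20 W figure.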

My food preferences

Fri, 24 Apr 2026 14:09:36 -0400

I use ChatGPT to recommend which restaurant I should eat at and what food I should eat. So often that I decided to share a profile of my eating preferences.

But rather than think about it and type it myself, I asked it to

Efficiently interview me to identify my food preferences. Document it for AI agents to help me pick restaurants. Plan like an expert.

(Knowing ChatGPT, I also had to add “efficiently” - otherwise it would give me a huge list of questions! Which it did anyway…)

That makes it easy. Now, I just have to answer questions about my preferences, e.g. my budget range, restaurant styles, cuisines etc.

I also gave it a dump of my daily food log. (This year I’ve been logging all the food that I eat - for no reason actually - along with which of those I liked and disliked.) This proved to be a good idea, because it picked up something I didn’t realize:

You are unusually tolerant of repeating humble comfort foods at home, …

Very true. I ate curd rice day and night for 22 years at every meal. I eat Kaya Peanut Toast every day at Ya Kun Kaya Toast. (They make it the instant they see me standing at the back of the line, and miss me when I travel.)

So, here’s my tweaked version of ChatGPT’s recommended food spec on how agents should pick for me:

Eggetarian. No meat or seafood; eggs & milk are OK.
Prioritize novel cuisines & known exceptional dishes.
Prefer casual, functional, non-fine dining. SGD $5-20 feels fine.
Any cuisine works: Indian, Italian, Mediterranean, Middle Eastern, ...
Creamy > crunchy > gooey > chewy > airy.
Spicy is good. Heavy, raw, or loud are bad.
Desserts are great! Too sweet is bad.
E.g. creative fusion, chaat, inventive small plates, nutty flavors, hot cookies, textured desserts.
Prefer 10 min travel, 5 min wait. For exceptional food in a new city, an hour is OK.

PS: This is styled quite differently from how ChatGPT writes, but the content was correct.

Using Codex as my OS

Thu, 23 Apr 2026 21:06:28 -0400

Increasingly, I’m using Codex (or other AI coding agents) as the “operating system” to run programs. That is, rather than directly run programs, I have the coding agent run the program.

Advantage: If the program breaks, or needs a configuration change, the coding agent debugs it and fixes it. I don’t need to do anything.

This is particularly useful for installation. For example:

Install demucs and run it against my music folder.

It’s also useful for brittle scripts. When my WhatsApp scraper broke, I told Codex:

The scraper was working fine but today, it fails. Find out what changed in the WhatsApp Web UI and fix it.

Codex researched and replied:

The current scraper still finds the chat area and message rows, but its row-identity parser is now invalid. On March 28 the code assumed each message node had a packed data-id like:
true_<chat-id>@..._<message-id>
false_<chat-id>@..._<message-id>
Today, in the live WhatsApp tab, the visible message nodes still exist under #main [role="row"], but data-id now looks like a bare message ID:
3A4689CE2A81856D4BB0
ACED689BB5C3115D3DD361F3520F9ECE
...
Because parseDataId() only accepts the old packed format, it rejects every row, so whatsappMessages() returns [], and the bookmarklet shows Copy 0 messages.

WhatsApp appears to have split what used to be one overloaded field into separate concerns:

data-id now behaves like messageId

outgoing direction is likely encoded by DOM class (.message-out)

chat/thread id is no longer bundled into the row data-id

… and went on to share a “Recommended hardening plan for future breakages”.
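To make the diagnosis concrete, here is a minimal sketch of a parser that accepts both the old packed format and the new bare message ID. It is a Python illustration, not the actual fix: the real `parseDataId()` lives in a JavaScript bookmarklet that isn't shown here. The regular expressions, the returned field names, the `outgoing_hint` parameter, and the sample chat ID are all assumptions.

```python
import re

# Old packed format: true_<chat-id>_<message-id> or false_<chat-id>_<message-id>
PACKED = re.compile(r"^(true|false)_(.+)_([0-9A-Za-z]+)$")
# New format: a bare message ID (assumed alphanumeric, based on the examples above)
BARE = re.compile(r"^[0-9A-Za-z]+$")


def parse_data_id(data_id, outgoing_hint=None):
    """Parse a data-id in either format; return None if it matches neither.

    outgoing_hint is a hypothetical stand-in for direction read from the DOM
    (e.g. a .message-out class), since the bare ID no longer carries it.
    """
    packed = PACKED.match(data_id)
    if packed:
        return {
            "message_id": packed.group(3),
            "chat_id": packed.group(2),
            "outgoing": packed.group(1) == "true",
        }
    if BARE.match(data_id):
        return {"message_id": data_id, "chat_id": None, "outgoing": outgoing_hint}
    return None


# The chat ID below is made up; the message ID is from Codex's reply
print(parse_data_id("true_12345@c.us_3A4689CE2A81856D4BB0"))
print(parse_data_id("3A4689CE2A81856D4BB0", outgoing_hint=True))
```

Accepting both formats means the scraper degrades gracefully (missing chat ID) instead of silently returning zero messages.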

If coding agents can install, configure, write, run, test, and debug programs, “Agentic Operating Systems” are not nearby. We already have them.

Derived formats with Gemini

Sat, 18 Apr 2026 11:26:48 -0400

The natural capability of Generative AI is to generate stuff - and Gemini’s particularly good with media.

For example, we can take any document, like this MasterCard report on The State of Open Finance 2026, and generate videos, podcasts, sketchnotes, songs, and more from it.

How?

I uploaded the PDF to NotebookLM and created a 20-minute podcast by clicking on Generate Audio Overview - Deep Dive - English - Default.

Listen to the English podcast

It supports multiple languages, so I generated a Chinese and Filipino version as well.

Listen to the Chinese podcast Listen to the Filipino podcast

Clicking on Generate Video Overview - Cinematic led to this video overview:

Video

There are other formats in which we can generate videos. The Cinematic format is new, and the list is growing.

It’s not just NotebookLM that you can use to generate new formats. Gemini itself supports a variety of formats.

For example, I used my Gemini Sketchnote prompt to create a visual summary of the report:

… and, using Lyria via the “Create Music” option to generate a narrative song with this prompt:

Create a narrative summarizing this article.
Narrate it rather than sing it.
Use a voice like Bobby McFerrin's, as if he were narrating rather than singing.
Keep the music minimal, focus on the voice.

Listen to the narrative song

Next, I had Gemini create a slide deck by uploading the report and prompting:

Convert the attached report into a beautiful slide deck that conveys the most important actionable information for the audience.

STYLE:
Write it McKinsey style with action titles. Just reading the titles should give the audience the entire message of the deck.
Follow the pyramid principle. The contents of the slide should prove the title.
Make the slides content rich, i.e. clear and self-explanatory with enough detail to help the audience understand without a narrator.
Use iconography, typography, stock images, etc. as appropriate.
Write as a single page HTML application.

See the slides.

Then, a set of interactive explainers using this prompt:

Convert this report into 3 interactive explainers.
Pick the parts of the report that are best conveyed through interactive explanations. Identify the 3 most suitable ones.
Each explainer should, using animations, interactions, and simulations, explain a core point made in the report.
Render this as a single page HTML canvas.

See the explainers.

Finally, a narrative data story using Claude – which I could do with Gemini, too, but Claude is better at.

See the story.

Where this becomes practical is in:

Proposals. No one pays attention to that company slide or RFP response. A 3-min video or 15-min podcast lets them absorb it during a walk.
Reviews. Skip copy-pasting metrics into PowerPoint. Feed the raw data and ask for a McKinsey-style deck with action titles.
Onboarding. Instead of a 100-page SOP or compliance manual, how about interactive explainers or a localized audio guide in Mandarin or Spanish?
Manuals. How about visual sketchnotes or step-by-step interactive flows from that documentation for call center agents?
Case studies. Text-heavy fails. Maybe a 60-second narrative data story or sketchnote accompanied by an upbeat narrative song?
Reports. No one reads the 10-page competitor analysis. A 5-minute podcast or a single-page visual sketchnote helps the execs.
Training. Create interactive simulations where people make actual decisions. Simsaram is my favorite example: family relationship training/simulation based on an iconic film.
Emails. Why not use illustrations, sketches, flowcharts, etc. to liven up internal / external emails?

When generative AI makes generation easy, why not generate actually interesting stuff?

Travel is exhausting

Thu, 16 Apr 2026 10:11:40 -0700

This is surprising because… well, we’re just sitting and the vehicle’s doing the work, right?

But:

Vehicles accelerate, brake, bump, turn, vibrate, … and our muscles micro-adjust continuously so we sit upright. Over hours, that’s a lot of energy.
We feel like we’re still. But the inner-ear fluids, eyes, etc. constantly get feedback about motion. That mentally drains us (and causes motion sickness).
Noise from vehicles, traffic, … triggers cortisol, a stress hormone. That drains us.
Sitting in one place restricts blood flow and it pools in our legs, making the heart work harder.
In flights, the air pressure is low, lowering oxygen levels. The dehydration thickens our blood, making pumping harder.

What helps is:

Noise-cancelling headphones (reduce the rumbling / traffic sounds)
Compression socks (reduce blood pooling in the feet)
Eye masks
Good back support

… and the usual good habits:

Drink water
Exercise
Fasting / eating light
Breathe slowly

Agent Skills Usage

Mon, 13 Apr 2026 16:16:41 -0700

I have a bunch of coding agent skills I’ve accumulated over the last few months. Here’s how often my sessions use them:

Skill	Claude	Codex	Copilot	Overall
code	6.1%	69.1%	37.5%	51.5%
data-story	48.7%	16.4%	37.5%	28.0%
data-analysis	2.6%	35.2%	7.8%	21.8%
design	25.5%	23.6%	14.1%	21.8%
plan	8.5%	11.8%	14.1%	11.8%
agent-friendly-cli	3.7%	13.8%	11.1%	11.2%
devtools	20.4%	7.3%	9.4%	10.0%
llm	2.5%	8.7%	7.8%	7.4%
pdf	0.0%	7.9%	7.8%	6.6%
linkedin-cdp	14.3%	0.0%	5.6%	5.3%
uv-uvx	0.0%	9.5%	0.0%	4.9%
interactive-storytelling	7.1%	2.7%	7.1%	4.6%
demos	8.5%	2.8%	1.6%	3.5%
cloudflare	0.0%	4.3%	3.1%	3.3%
melt-mlt	0.0%	2.5%	1.6%	1.8%
vector-art	2.5%	2.4%	0.0%	1.7%
vitest-dom	0.0%	2.2%	0.0%	1.4%
memorable-explanations	2.6%	1.6%	0.0%	1.3%
npm-packages	0.0%	0.6%	0.0%	0.3%

Here are my observations, with surprises highlighted as ⁉️

code is the most used skill, by far. About half the sessions use it.
- But Claude doesn’t use it much⁉️
The data-story and data-analysis skills were the most rapidly adopted.
- I use Claude (with Claude Code and Copilot) a lot more for data stories. I use Codex for data analysis.
- Therefore the webapp-testing and devtools skills are used less by Codex.
The design skill is used consistently across agents. It was inspired by Claude’s design skill - but I don’t think it is particularly good, and needs revision.
agent-friendly-cli tool development is mostly with Codex, followed by Copilot, and very little with Claude.
Most pdf sessions are with Copilot / Codex, not Claude⁉️
Codex reads most skills diligently.
- It is the only one diligently reading my uv-uvx skill, even though every agent uses it⁉️
- In fact, it is the only agent to have read every skill except linkedin-cdp (it never needed it.)

Gemini Sketchnotes

Sat, 11 Apr 2026 16:11:22 +0800

I use this prompt to generate sketchnotes on Gemini:

Draw this as a visually rich, intricately detailed, colorful, and funny, sketchnote.

Below that, I paste (or attach) whatever content I want it to draw.

I also turn on “Create Images” and switch the model to “Pro” (for better thinking.)

Here are some examples of how to use it.

Summarize articles. Pick email, report, news, or website.

Here’s a sketchnote for this article: How to use AI for research. I used the prompt above and pasted the article text.

Summarize presentations. Pick any proposal, status update, project report, or meeting agenda.

Here’s a sketchnote we emailed SNHU as pre-read before the meeting. I used the same prompt and attached the presentation. They loved the sketchnote!

Summarize books. Pick any textbook, non-fiction, fiction, or anthology.

This sketchnote summarizes Morgan Housel’s book Same as Ever

Summarize talks. Pick from YouTube, TED, podcast, or even meeting transcripts.

This sketchnote summarizes a talk by Rob Schrauwen of Elsevier. He really liked the sketchnote too!

Explain a workflow. Pick from any slide, architecture diagram, process document, or paper sketches.

This sketchnote documents a proposal submission workflow

NOTE: For complex documents or long transcripts, I add this line so that Gemini thinks first and creates a more structured sketchnote:

Think about the most important points, structure it logically so that the sketchnote is easy to follow, then draw it.

With sketchnotes, you unlock an interesting capability.

It catches attention - it’s colorful and engaging.
It signals simplicity - so people are more likely to read it.
It sets you apart - it’s rare when people try something different.
It’s personal - you can create a distinct version tailored to each audience - in minutes.

It takes just one prompt in front of any content. Try it out:

Draw this as a visually rich, intricately detailed, colorful, and funny, sketchnote.

Workshops help AI adoption

Sat, 11 Apr 2026 13:08:46 +0800

To teach a mindset shift like AI adoption, I’ve tried to:

Workshop: get them to do it. “Let’s try something. Can you share your screen?”
Live-code: show them how. “I’ll share screens and type this.”
Demo: show what’s possible. “Here’s what I built.”
Talk: explain it. “Here’s something we can build.”
Interview: ask them about it. “What do you think?”
Listen: let them yap.

The most effective are on top.

But there is one intervention I didn’t evaluate:

Exercise: give them a problem to solve.

This takes more time and patience on my part, but might have the highest value.

What this means for me is:

Don’t live-code. Workshop it. “Can you share your screen and try that?” Preferably as an open-ended problem.
Don’t demo/talk. Live-code it. “Let me share my screen and try that.”
Prepare micro-experiments. Keep a ready catalog of things to try.

Singing a Vote of Thanks

Thu, 09 Apr 2026 07:56:28 +0530

Lyria (Gemini’s new “Create Song” feature) is helping me in new ways. Earlier this week, it created a jingle for my talk.

Yesterday I ran an AI Workshop for IAS officers. As part of that, I asked Gemini:

Create a soulful vote of thanks (with patriotic Indian music playing in the background) naming each of these people.

… and listed each person in the workshop.

The song began…

(Listen to the song)

… with these lyrics:

The morning sun rises over the secretariat corridors,
illumining the echoes of long nights spent in duty.

… it began. Then, it went on to their names.

To Ms. Vatsala Vasudeva, for steady hands in every storm.
To Shri Shyamal Misra, for the silent strength of leadership.

It was in a western accent, but by now, the audience were looking at each other, pointing out as names were mentioned - with surprise and delight.

To Shri Amit Rathore, for the vision that breaks through the haze.
To Shri V. Shashank Shekhar, for the quiet pursuit of excellence.
To Shri Pankaj Kumar, for building bridges of progress.
To Shri Robert L. Chongthu, for the pulse of the hills in the heart of the city.
To Shri Sanjeev Hans, for the clarity of purpose in complex paths.
To Shri Subodh Kumar Singh, for the resolve that never falters.
To Shri Asheesh Sharma, for the foundation laid with integrity.
To Shri Ramesh Kumar Sudhanshu, for the spark that inspires the team.
To Shri Kaling Tayeng, for the voice of the frontier in our council.
To Shri Manish Thakur, for the wisdom that informs every decision.

We stand as one, guided by the compass of our republic.
Each name a pillar, each service a prayer for the nation.
For the republic we serve, for the people we honor.

For a minute, I thought it was done, but we were only mid-way through.

To Shri Devesh Kumar, for the dedication beyond the call.
To Ms. Aradhana Patnaik, for the grace of leadership under fire.
To Ms. Himani Pande, for the brilliance of thought and action.
To Shri Rahul Sharma, for the steady hand on the rudder.
To Shri Rajeev K. Mital, for the legacy of selfless work.
To Shri Ranjit Singh Deol, for the grit that transforms challenges into victories.
To Dr. Sirra Karuna Raju, for the healing touch of administration.
To Shri Anandrao Vishnu Patil, for the roots that hold us to the soil.
To Shri Sudhir Kumar, for the light that guides through the complexity.
To Shri Subodh Yadav, for the courage of conviction in every file.
To Dr. Richa Bagla, for the wisdom that creates space for growth.
To Ms. Indra Mallo, for the spirit of the Northeast in our hearts.
To Shri Nilkanth S. Avhad, for the depth of knowledge that serves the many.
And to Ms. Mugdha Sinha, for the innovation that shapes the future of our service.

It had covered them all. In well under three minutes.

The comments were personalized and felt accurate, somehow. Not sure if it’s real - it might be like fortune-telling, where we find non-existent patterns, and totally enjoy it!

Thank you for the years of service. May the path forward be as bright as the morning sun.

Jai Hind.

At “Jai Hind”, I had goosebumps. And the audience broke into applause.

We should all do this sort of thing more often. 🙂

UPDATE: 14 Apr 2026. Here are several ideas on where else we might use songs:

Celebrations. Personalized birthday songs. Or anniversary. Or a celebration - like graduation, promotion, closing a deal, delivering a project, thanking team members, etc.
Tough situations. Before a difficult conversation / meeting, an apology, firing someone, a couple’s argument, a protest anthem, etc.
Updates. As an earnings call, status update, project update, contract summary, etc.

I used it to narrate a data journalism process using Gemini with this prompt:

Narrate a summary of this process for creating Statoistics for the Times of India in a light, engaging way, as if you were explaining it to a friend.
Think about the most important points, structure it logically so that the narrative is easy to follow, then generate the song.

… followed by:

Narrate, don’t sing.
Less music, more focus on the voice.
Far simpler narrative - EXPLAIN to someone who doesn’t understand ANYTHING of the context.

The result is pretty good!

UPDATE: 16 Apr 2026. This is proving more useful than I thought.

A colleague shared a song thanking his team for their contributions.
Another colleague converted classroom videos into rap songs summarizing the session - making this more memorable.
I created narrative songs sharing (rhythmic) feedback for teachers based on their transcripts - making this a less scary process.

Speaking unprepared

Tue, 07 Apr 2026 13:14:13 +0800

I deliver about 3-5 talks a month and usually prepare for them.

Thanks to AI (but even otherwise), I have a steady stream of new content. So, I just have to assemble the story.

For example, in my TEDx Whitefield talk “Prisoners of Birth”, I shared the impact of name, gender, lineage, place, and time of birth. I didn’t execute any new analysis. I just cherry-picked disparate analyses into a theme. (Took me three days to plan, though.)

But repetition bores me. I’ve known this since 2002 when I tried coaching CAT students. Conventional teaching isn’t for me. And talks get boring too.

So, often, I ad-lib. Impromptu. Deliberately under-prepared. (Panels are good practice for this. I must try improv next.)

This has a surprising benefit. In the pressure of the moment, inspiration can strike.

For example, yesterday, at an internal Straive fireside chat, I went completely off-script and answered a question on the chat about the benefits of Gemini.

The sort of things that you can do with Gemini is amazing. And what I do … is to poke … what can I do here? What’s in here? What’s in here? Recently, I saw that it can create music. And some of this music is stunning!

Now, I had to think of something interesting to do with the music. Despite weeks of exploring Lyria, I never found a use for it, let alone “stunning”. Now I had 3 seconds.

Then, inspiration struck!

Now, we had music at the beginning of this Zoom call. Why does that need to be something that is unpersonalized?

I went to ChatGPT and asked it to create a personalized jingle for the talk I was delivering, providing it my name, title, talk topic, etc. - something the L&D team could replace their stock jingle with. And it was pretty good, actually.

That’s a novel and a useful idea - something many people can use!

It’s times like this that I really enjoy delivering talks. This is why I live-code. This is why I pause for questions. This is why I’m thrilled when I have no clue of the answer. Pretending to know is fun and all that, but the kick of discovery on stage – that’s something else!