Case study · Side project · 2026 · 9 min read

A 24/7 Booking Concierge for a DFW Photo-Booth Studio

Turned a small photo-booth studio's website into a real concierge that triages every lead, builds a real quote from a single pricing source of truth, and lands a tentative hold on the actual Outlook calendar — at 11pm on a Tuesday, on autopilot.

  • <60s lead → quote + hold + SMS
  • 24/7 lead capture, owner-free
  • 1 pricing source of truth
  • 4 tools, deterministic
Role
Solo designer & engineer (architecture, prompt design, implementation, deployment)
Timeline
Shipped to production in weeks; iterated weekly
Organization
Boothly Events · Dallas–Fort Worth photo booth & 360° video
Stack
  • Claude Sonnet 4.6 (Anthropic SDK, tool use)
  • Next.js 16 App Router
  • React 19
  • TypeScript
  • Tailwind v4
  • Microsoft Graph (Outlook + Mail)
  • Twilio SMS
  • Vercel

Brand marks via Simple Icons (CC0 1.0). Trademarks belong to their respective owners; used here for accurate factual reference to real integrations.

The challenge

Boothly Events was running a small empire on three inboxes and a phone: Instagram DMs, email, the contact form, and the call line. Quotes drifted. "Did we ever reply to her?" was a recurring question. On nights and weekends, when most events are actually planned, the effective response time was the next morning. A photo-booth booking is impulse-shaped: if a host does not get a quote while they are in the planning sprint, they book the next vendor on the list.

The goal was to turn the website into a real concierge that triages every lead, builds a real quote with real pricing, and lands a tentative hold on the actual Outlook calendar, at 11pm on a Tuesday, on autopilot, without pretending to be more capable than it is.

Constraints

  • One operator, no margin for error. A solo owner cannot afford a bot that misquotes, double books, or claims a date is held when it is not. Hallucination at the booking layer is a business-ending failure mode.
  • Real systems of record. Bookings have to live on the actual Outlook calendar the owner already runs. A parallel "AI calendar" is not a feature, it is a second source of truth that will eventually disagree with the first.
  • Voice that does not sound like a form. Most "AI booking assistants" sound like a contact form with sentences glued on. The studio's brand is warm, DFW-rooted, and conversational. The agent had to match.
  • Cost discipline. A side hustle cannot subsidize frontier-model-grade per-conversation costs. The unit economics of a single quote had to stay well under the cost of one paid-ad click.
  • Owner notification is the moat. The owner needed to know about every quote and every hold the moment it happened, on a phone she already carries, without opening a laptop.

My approach

I designed the system around four boring, deterministic tools and one non-deterministic agent that decides when to call them. The agent has no special powers. It has four functions, and it picks which one to use. That is the entire pattern.

  1. Four-tool template. check_calendar_availability reads the Outlook free/busy view. propose_hold writes a tentative event flagged for owner approval. send_quote_email sends the customer a branded quote and the owner a lead-summary email. escalate_to_team texts the owner when the agent is stuck or sees a hot lead. Every tool is plain TypeScript that does exactly one thing. The model decides when to call them and what arguments to pass, but it cannot invent calendar holds or invoice numbers.
  2. One pricing source of truth. Pricing is loaded into the system prompt at build time from a single packages.ts module. To add an add-on or change a price, edit one object; the rebuilt system prompt takes effect on the first request after the deploy. No retraining, no second prompt-engineering session.
  3. Prompt caching on the system prompt. The system prompt is ~140 lines of pricing tables, FAQs, and voice rules. Marking it with cache_control: { type: "ephemeral" } lets follow-up turns read it from cache at a discounted input-token rate, which cuts cost per conversation and shaves first-token latency on every turn after the first while the cache entry is warm.
  4. Voice tuned by example, not by rule. The system prompt forbids the bulleted-checklist energy that most chatbots default to, and includes one good response shown end to end so the model can pattern-match the energy. Every iteration on the prompt, the test was: read the response aloud. If it sounds like a form, rewrite the rule. If it sounds like a person, ship.
  5. Anti-hallucination guard at the prompt level, enforced by the loop. One line of the system prompt: NEVER claim a date is held without ACTUALLY calling propose_hold. If you say "tentatively held" in a message, you MUST have called propose_hold in that same turn. Combined with the multi-iteration tool loop in the API route, this structurally pushes the model toward "call the tool, then announce" rather than "announce, then maybe call."
  6. Owner-notification flow as a first-class feature. A booking concierge that does not tell the human anything is a customer-facing toy. Every hold sends an SMS with a deep link to the calendar entry. Every quote sends a separate lead-summary email and an SMS to the owner. Both internal notifications are wrapped in their own try/catch so a Graph or Twilio hiccup can never make a successful customer send look failed to the agent.
[Architecture diagram: Browser (ChatWidget.tsx) → POST /api/agent (Next.js, tool loop ≤5) → Claude Sonnet 4.6 (cached system prompt, 4 tool definitions) → four tools: check_calendar_availability (Graph: getSchedule), propose_hold (Graph: POST /events + Twilio SMS), send_quote_email (Graph /sendMail × 2 + Twilio SMS), escalate_to_team (Twilio SMS). Vercel edge deploy.]
Server-side everything. The browser holds zero credentials. The API route loops on stop_reason === "tool_use" with a 5-iteration cap; the model picks tools, the tools do real work.
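
The four-tool surface described above might be declared roughly as follows. This is a minimal sketch, not the production schemas: the tool names come from this case study, while the descriptions and every input field (date, customerEmail, packageId, and so on) are illustrative assumptions. The name / description / input_schema shape follows the Anthropic Messages API tool format.

```typescript
// Hypothetical tool definitions in the Anthropic Messages API tool format.
// Tool names are from the case study; every input field is an assumption.
interface ToolDefinition {
  name: string;
  description: string;
  input_schema: {
    type: "object";
    properties: Record<string, { type: string; description: string }>;
    required: string[];
  };
}

// Small helper so each property reads as one line.
const str = (description: string) => ({ type: "string", description });

const tools: ToolDefinition[] = [
  {
    name: "check_calendar_availability",
    description: "Read free/busy for the studio calendar in a time window.",
    input_schema: {
      type: "object",
      properties: {
        date: str("Event date, YYYY-MM-DD"),
        startTime: str("Local start time, HH:mm"),
        endTime: str("Local end time, HH:mm"),
      },
      required: ["date", "startTime", "endTime"],
    },
  },
  {
    name: "propose_hold",
    description: "Write a tentative calendar event flagged for owner approval.",
    input_schema: {
      type: "object",
      properties: {
        date: str("Event date, YYYY-MM-DD"),
        customerName: str("Customer's full name"),
        customerEmail: str("Customer's email address"),
        packageId: str("Package id from the pricing module"),
      },
      required: ["date", "customerName", "customerEmail", "packageId"],
    },
  },
  {
    name: "send_quote_email",
    description: "Email the customer a quote and the owner a lead summary.",
    input_schema: {
      type: "object",
      properties: {
        customerEmail: str("Customer's email address"),
        packageId: str("Package id from the pricing module"),
      },
      required: ["customerEmail", "packageId"],
    },
  },
  {
    name: "escalate_to_team",
    description: "Text the owner when the agent is stuck or sees a hot lead.",
    input_schema: {
      type: "object",
      properties: {
        reason: str("Why the agent is escalating"),
        leadSummary: str("Compact summary of the lead so far"),
      },
      required: ["reason"],
    },
  },
];
```

Keeping each schema narrow is the point of the design: the model chooses a tool and fills in arguments, but it has no field in which to assert that a hold exists.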

Lifecycle of a lead

  1. Customer opens the chat on boothlyevents.com

    One question at a time, contractions everywhere, no bulleted intake form.

    • Next.js
    • React
    • Tailwind
  2. Agent confirms the date is open

    Calls check_calendar_availability, which hits /calendar/getSchedule on the studio's Outlook calendar.

    • Claude tool use
    • Microsoft Graph
  3. Agent builds a real quote

    Line items composed from the single packages.ts source of truth: never guessed, always sourced.

    • TypeScript
    • Claude (prompt-cached)
  4. Agent places a tentative hold

    Calls propose_hold, writes [HOLD - APPROVAL NEEDED] with full lead body and the Boothly Lead category, and texts the owner.

    • Outlook Calendar
    • Twilio SMS
  5. Customer + owner emails go out, owner gets a second SMS

    Branded HTML quote to the customer; lead-summary email to the owner; SMS recap with total + date. Internal sends are isolated from the customer-facing send.

    • Graph Mail
    • Twilio SMS
  6. Anything weird escalates straight to the owner

    Custom asks, distant venues, unusual dates: escalate_to_team texts the owner with the reason and a compact lead summary.

    • Twilio SMS
    • Vercel (always-on)
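
Step 4 above can be sketched as a pure function that builds the request body for Microsoft Graph's create-event call (POST /events). The subject prefix and the Boothly Lead category come from this case study; the HoldRequest fields, the time zone name, and the body layout are assumptions for illustration, not the production code.

```typescript
// Sketch of the event payload propose_hold might POST to Microsoft Graph.
// subject, body, start, end, showAs, and categories are standard Graph event
// properties; the HoldRequest shape is hypothetical.
interface HoldRequest {
  customerName: string;
  customerEmail: string;
  packageName: string;
  startLocal: string; // local wall-clock time, e.g. "2026-06-13T18:00:00"
  endLocal: string;
}

const HOLD_PREFIX = "[HOLD - APPROVAL NEEDED]";
const LEAD_CATEGORY = "Boothly Lead";
const STUDIO_TIME_ZONE = "Central Standard Time"; // assumption: Windows zone name for DFW

function buildHoldEvent(hold: HoldRequest) {
  return {
    subject: `${HOLD_PREFIX} ${hold.customerName} - ${hold.packageName}`,
    body: {
      contentType: "text" as const,
      // Lead details live in the event body so the owner can vet from a phone.
      content: [
        `Customer: ${hold.customerName} <${hold.customerEmail}>`,
        `Package: ${hold.packageName}`,
      ].join("\n"),
    },
    start: { dateTime: hold.startLocal, timeZone: STUDIO_TIME_ZONE },
    end: { dateTime: hold.endLocal, timeZone: STUDIO_TIME_ZONE },
    showAs: "tentative" as const, // the slot stays visibly unconfirmed
    categories: [LEAD_CATEGORY],
  };
}
```

Because the function only builds data, the hold convention can be unit-tested without touching a real calendar.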
Architecture decisions

Tool use over prompt-only. A larger prompt could have "described" pricing and dates well enough to fool a casual reader. Tool use forces the model to commit: capacity is checked against a real calendar, holds are written to a real event, quotes are built from a typed line-item schema. Failures are localizable to the tool that produced them.
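
To make "checked against a real calendar" concrete, here is a hedged sketch of the two pure pieces around a getSchedule call: the request body and the interpretation of the response. The mailbox address, time zone, and 60-minute interval are assumptions; schedules, startTime, endTime, availabilityViewInterval, and availabilityView are the field names Microsoft Graph documents for this action.

```typescript
// Sketch of a free/busy check against Microsoft Graph's calendar getSchedule
// action. The window shape, time zone, and interval are assumptions.
interface ScheduleWindow {
  mailbox: string; // e.g. "events@example.com" (hypothetical address)
  startLocal: string; // "2026-06-13T18:00:00"
  endLocal: string; // "2026-06-13T21:00:00"
}

const TIME_ZONE = "Central Standard Time"; // assumption: Windows zone name for DFW

// Request body for POST .../calendar/getSchedule.
function buildGetScheduleBody(w: ScheduleWindow) {
  return {
    schedules: [w.mailbox],
    startTime: { dateTime: w.startLocal, timeZone: TIME_ZONE },
    endTime: { dateTime: w.endLocal, timeZone: TIME_ZONE },
    availabilityViewInterval: 60, // minutes per slot in availabilityView
  };
}

// Graph returns availabilityView as one digit per slot. "0" means free;
// non-zero digits mark tentative, busy, and other non-free states.
// Only an all-zero view counts as open, so an existing tentative hold
// also blocks the date.
function isWindowOpen(availabilityView: string): boolean {
  return /^0+$/.test(availabilityView);
}
```

Treating tentative slots as taken is a deliberate choice in this sketch: two late-night leads should not both be told the same date is open.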

Tool-use loop capped at 5 iterations. The API route loops until stop_reason is no longer tool_use, with a hard cap. The cap bounds runaway cost and the rare case of the model getting stuck in a tool/think/tool ping-pong. If it ever trips, the user sees a graceful "our team has been notified" fallback and the owner gets an SMS.
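
The capped loop can be sketched as below. This is a minimal sketch under stated assumptions: callModel, runTool, and notifyOwner are injected stand-ins for the Anthropic SDK call, the four tools, and the Twilio SMS, and the fallback wording is invented. Only the stop_reason check, the tool_use / tool_result block types, and the cap of 5 come from the design described above.

```typescript
// Minimal sketch of a capped tool-use loop. callModel, runTool, and
// notifyOwner are injected stand-ins (assumptions), not the real route code.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown };

type ToolResultBlock = { type: "tool_result"; tool_use_id: string; content: string };

interface ModelReply {
  stop_reason: "end_turn" | "tool_use" | "max_tokens";
  content: ContentBlock[];
}

interface ChatTurn {
  role: "user" | "assistant";
  content: string | ContentBlock[] | ToolResultBlock[];
}

const MAX_ITERATIONS = 5;
const FALLBACK = "Something went sideways on our end. Our team has been notified."; // hypothetical copy

async function runAgent(
  history: ChatTurn[],
  callModel: (messages: ChatTurn[]) => Promise<ModelReply>,
  runTool: (name: string, input: unknown) => Promise<string>,
  notifyOwner: (reason: string) => Promise<void>,
): Promise<string> {
  const messages = [...history];
  for (let i = 0; i < MAX_ITERATIONS; i++) {
    const reply = await callModel(messages);
    if (reply.stop_reason !== "tool_use") {
      // Final answer: concatenate the text blocks and stop looping.
      let text = "";
      for (const block of reply.content) {
        if (block.type === "text") text += block.text;
      }
      return text;
    }
    // Echo the assistant turn, then answer each tool_use with a tool_result.
    messages.push({ role: "assistant", content: reply.content });
    const results: ToolResultBlock[] = [];
    for (const block of reply.content) {
      if (block.type === "tool_use") {
        const output = await runTool(block.name, block.input);
        results.push({ type: "tool_result", tool_use_id: block.id, content: output });
      }
    }
    messages.push({ role: "user", content: results });
  }
  // Cap tripped: tell the owner, but never let that failure reach the customer.
  await notifyOwner("tool loop hit the iteration cap").catch(() => undefined);
  return FALLBACK;
}
```

Injecting the model call keeps the loop testable with a scripted fake, which is also how the cap itself can be exercised without spending tokens.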

Pricing in the prompt, not in the model. Pricing lives in src/content/packages.ts and is injected into the system prompt at build time. The model never "knows" prices in any durable sense; it reads them from the prompt on every turn. Price changes ship in a single PR and take effect on the first request after that PR deploys.
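
A hedged sketch of that injection, combined with the cache marker. The package names, prices, and prompt wording here are invented placeholders standing in for packages.ts and the real ~140-line prompt; the part taken from the Anthropic API is the shape of a system text block carrying cache_control: { type: "ephemeral" }.

```typescript
// Hypothetical stand-in for src/content/packages.ts; names and prices are made up.
interface BoothPackage {
  id: string;
  name: string;
  hours: number;
  priceUsd: number;
}

const PACKAGES: BoothPackage[] = [
  { id: "classic", name: "Classic Booth", hours: 3, priceUsd: 500 },
  { id: "video360", name: "360 Video Booth", hours: 3, priceUsd: 750 },
];

// Render the pricing table into prompt text. The wording is illustrative.
function buildSystemPrompt(packages: BoothPackage[]): string {
  const rows = packages.map((p) => `- ${p.name}: $${p.priceUsd} for ${p.hours} hours`);
  return [
    "You are the booking concierge for the studio.",
    "PRICING (quote only from this table):",
    ...rows,
  ].join("\n");
}

// The system prompt as a single text block marked for prompt caching.
function buildSystemBlocks(packages: BoothPackage[]) {
  return [
    {
      type: "text" as const,
      text: buildSystemPrompt(packages),
      cache_control: { type: "ephemeral" as const },
    },
  ];
}
```

Passing the returned array as the system parameter of a Messages API call is what marks the prompt for caching. Prompts shorter than the model's documented minimum cacheable length are simply processed uncached.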

Server-only credentials. The browser never sees Graph or Twilio secrets. The chat widget is a 125-line React component (input box, scrollable transcript, dot-loader) that posts to /api/agent. All intelligence and all secrets live on the server.

Artifacts I authored

  • System prompt: ~140 lines covering brand voice, pricing tables, FAQs, escalation rules, and the anti-hallucination guard, wrapped in ephemeral prompt caching
  • Four tool definitions and their TypeScript implementations against Microsoft Graph and Twilio
  • Tool-use loop in the Next.js API route with a 5-iteration cap and graceful fallback path
  • Pricing source of truth (packages.ts): a single typed file used by both the system prompt and the quote builder
  • Branded customer quote email template (HTML) sent from the studio's events mailbox
  • Internal lead-summary email and Twilio SMS templates for owner notification, isolated in their own try/catch so internal failures never break the customer experience
  • Tentative-hold convention: [HOLD - APPROVAL NEEDED] subject prefix, Boothly Lead Outlook category, lead details in event body
  • Chat widget React component (ChatWidget.tsx): input, transcript, loading state — no streaming, deliberately
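
The pricing module and quote builder listed above could fit together along these lines. This is a sketch under assumptions: the catalog ids, labels, and prices are invented, and the LineItem shape is one plausible typed schema rather than the one in production.

```typescript
// Hypothetical catalog standing in for packages.ts; ids and prices are invented.
interface PriceEntry {
  label: string;
  priceUsd: number;
}

const CATALOG: Record<string, PriceEntry> = {
  classic: { label: "Classic Booth (3 hours)", priceUsd: 500 },
  guestbook: { label: "Guest book add-on", priceUsd: 75 },
};

interface LineItem {
  id: string;
  label: string;
  quantity: number;
  unitPriceUsd: number;
  totalUsd: number;
}

interface QuoteRequestItem {
  id: string;
  quantity: number;
}

// The model supplies ids and quantities only; labels and prices are looked up.
function buildQuote(requested: QuoteRequestItem[]): { items: LineItem[]; totalUsd: number } {
  const items = requested.map(({ id, quantity }): LineItem => {
    const entry = CATALOG[id];
    // Unknown ids fail loudly, so an invented line item never reaches an email.
    if (!entry) throw new Error(`Unknown catalog id: ${id}`);
    if (!Number.isInteger(quantity) || quantity < 1) {
      throw new Error(`Invalid quantity for ${id}: ${quantity}`);
    }
    return {
      id,
      label: entry.label,
      quantity,
      unitPriceUsd: entry.priceUsd,
      totalUsd: entry.priceUsd * quantity,
    };
  });
  const totalUsd = items.reduce((sum, item) => sum + item.totalUsd, 0);
  return { items, totalUsd };
}
```

With this split, arithmetic and prices stay in plain TypeScript, and the model's only contribution to a quote is which catalog entries to include.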

Results

  • <60s lead to quote + hold + SMS
  • 24/7 lead capture, owner-free
  • 1 pricing source of truth
  • 4 deterministic tools
  • 5 max tool-use iterations
  • 0 browser-side credentials

Lead capture is now genuinely 24/7. Late-night planners get a full quote and a tentative hold before the lead has time to comparison-shop the next vendor on the list. Every quote is a paper trail: the owner's inbox becomes a lead pipeline by accident, every quote searchable, replyable, and itemized. SMS pings on every quote and hold mean the owner can vet a hot lead from a venue site visit without opening a laptop. And the "tentatively held" guarantee is structurally backed by an actual Outlook event, not by the model's good intentions.

The agent has no special powers. It has four functions, and it decides when to call them. That is the whole pattern, and it is the part most "AI booking assistants" get wrong.

About this case study

The figures on this page are drawn from internal program reporting I authored or co-authored as the practitioner on the engagement. They are reproduced here in rounded form. They were not produced by an independent third party, and proprietary detail has been omitted where required by the engagement.

Lift figures (CSAT, accuracy, handle time, hallucination rate) reflect pre/post comparisons against a matched baseline using the cohort, time window, and measurement instrument noted in the case study. Volume and adoption figures come from production analytics dashboards. Cost figures reflect either avoided spend or unlocked budget in the named fiscal period.

  • Boothly Events is a side project I designed and shipped solo (architecture, prompt design, implementation, deployment).
  • Latency claim (<60s lead → quote + hold + SMS): measured end-to-end on cached system-prompt turns in production. Conversational turns excluding the initial intake are typically <2s.
  • Cost claim ('well under the cost of a single paid-ad click'): based on Anthropic per-token pricing for Sonnet 4.6 with prompt caching on the system prompt across the typical greeting → details → quote → hold → confirmation arc. Estimate, not audited.
  • Anti-hallucination guarantee: enforced jointly by a system-prompt rule and the tool-use loop in the API route, not by post-hoc filtering. The guard is structural, not statistical.
  • Owner notifications: customer email and internal owner email/SMS are dispatched from separate try/catch blocks so an internal-notification failure never causes the customer-facing send to be reported as failed.
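
The isolation described in the last bullet can be sketched as follows. The sender functions are injected placeholders (assumptions) for the Graph mail and Twilio calls; the point being illustrated is only that each internal notification has its own try/catch while the customer-facing send does not.

```typescript
// Sketch of isolating internal notifications from the customer-facing send.
// The Sender functions are hypothetical stand-ins for Graph mail and Twilio.
type Sender = () => Promise<void>;

interface InternalNotification {
  label: string;
  send: Sender;
}

interface QuoteSendResult {
  customerEmailSent: boolean;
  internalFailures: string[];
}

async function sendQuoteWithNotifications(
  sendCustomerEmail: Sender,
  internal: InternalNotification[],
): Promise<QuoteSendResult> {
  // Customer-facing send: a failure here is real and propagates to the agent.
  await sendCustomerEmail();

  // Internal sends: each one is isolated, and failures are recorded, not thrown.
  const internalFailures: string[] = [];
  for (const notification of internal) {
    try {
      await notification.send();
    } catch {
      internalFailures.push(notification.label);
    }
  }
  return { customerEmailSent: true, internalFailures };
}
```

Recording the failed labels instead of swallowing them silently leaves room to log or retry the owner notification later, which is a choice of this sketch rather than something the case study specifies.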

What I would do differently

Build the evaluation harness alongside the system prompt, not after it. A small set of golden conversations, scored on "did the right tool get called with the right arguments," would have caught two prompt regressions earlier than read-aloud testing did. The other note to self: streaming would have been worth the extra complexity from day one. The agent's first-token latency is fine, but a streamed response feels twice as fast even when it is not, and feel is the product on a booking site.
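
A golden-conversation scorer of the kind described above might start as small as this. It is a sketch under assumptions: the ToolCallRecord shape and the matching rule (expected arguments must match, extra arguments are tolerated) are my choices for illustration, and the recording of actual tool calls from a run is left out.

```typescript
// Sketch of scoring a run against a golden conversation: did the right tools
// get called, in order, with the right arguments? All shapes are hypothetical.
interface ToolCallRecord {
  name: string;
  input: Record<string, unknown>;
}

interface GoldenCase {
  label: string;
  expected: ToolCallRecord[];
}

// Expected arguments must match exactly; extra arguments are tolerated.
function callMatches(expected: ToolCallRecord, actual: ToolCallRecord | undefined): boolean {
  if (!actual || actual.name !== expected.name) return false;
  return Object.keys(expected.input).every(
    (key) => JSON.stringify(actual.input[key]) === JSON.stringify(expected.input[key]),
  );
}

function scoreCase(golden: GoldenCase, actual: ToolCallRecord[]): { label: string; pass: boolean } {
  const pass =
    golden.expected.length === actual.length &&
    golden.expected.every((call, i) => callMatches(call, actual[i]));
  return { label: golden.label, pass };
}

// Fraction of golden cases that passed, for tracking prompt regressions.
function passRate(results: { pass: boolean }[]): number {
  if (results.length === 0) return 0;
  return results.filter((r) => r.pass).length / results.length;
}
```

Scoring tool calls rather than response text keeps the harness stable while the voice rules keep changing, since a reworded reply still passes as long as the same tools fire with the same arguments.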

The reusable pattern is not photo-booth-specific. Any service business with three traits drops into the same shape: schedulable capacity that already lives in a system of record, a pricing model that can be expressed as packages plus add-ons plus rules, and an owner who currently triages leads from a phone. The four-tool template — check capacity, hold capacity, send proposal, escalate — covers wedding photographers, dog trainers, mobile detailers, tutoring services, mobile bartending, event rental, lawn care. Swap the calendar source. Swap the quote schema. Swap the brand voice. The agent loop is the same.

Collaborators

Built solo, in close partnership with the studio owner on brand voice, pricing rules, and the operational guardrails for what the agent should and should not commit to. The owner remains the final authority on every booking — the agent only places tentative holds, every one of which is reviewed and confirmed by a human before it becomes a real booking.

Skills demonstrated

  • Tool-use agent design (Anthropic SDK)
  • Prompt architecture with prompt caching
  • System-level anti-hallucination guardrails
  • Microsoft Graph integration (calendar + mail)
  • Twilio SMS notification design
  • Next.js 16 App Router + server-only secrets
  • Single-source-of-truth pricing modules
  • Brand-voice tuning by example
  • Cost-aware LLM engineering
  • Owner-facing operational design
  • Reusable agent templates for service businesses
