The WhatsApp AI agents I deploy with Malaysian fintechs, insurers, and consumer brands all share a similar architecture in 2026. After dozens of production deployments, the pattern is stable enough to write down. This is the build guide I wish I'd had three years ago.
The agent we'll build handles inbound WhatsApp messages on a real business number. It qualifies leads, answers FAQ, books appointments, and escalates to human staff when out of scope. It runs on the WhatsApp Business Platform Cloud API, n8n as the orchestration layer, and Claude Sonnet for the reasoning.
1. The architecture (in plain English)
Three layers, each with a clear job:
- WhatsApp Business Platform (Cloud API) — the transport. Receives inbound messages via webhook, sends outbound messages via REST. Operated by Meta, billed per conversation.
- n8n — the nervous system. Receives the webhook from Meta, makes routing decisions, calls Claude for reasoning, calls your CRM, logs everything to your audit store, sends the response back through the API.
- Claude Sonnet (via Anthropic API) — the reasoning. Classifies intent, decides actions, drafts responses, calls tools (CRM lookups, appointment availability, FAQ retrieval).
Three other systems sit alongside but aren't in the critical path: the CRM (for customer context and conversation logging), the human-handoff queue (Slack channel or shared inbox), and the kill switch (a single config flag that disables the agent).
2. Setup prerequisites
Before writing a single workflow, three things must exist:
- An approved WhatsApp Business Platform account. Get this through an official BSP (respond.io, Wati, 360dialog, or equivalent). Approval typically takes 1-3 business days for Malaysian businesses with valid SSM registration.
- A verified business display name and a green-tick application in progress. The green tick takes longer (sometimes weeks), but you can start building before it's issued.
- An Anthropic API account with a hard spend cap set. Pick a number you would not panic about — RM 500/month for a starter agent is sensible. This is your circuit breaker.
3. Webhook setup
n8n receives WhatsApp messages via webhook from Meta. The setup:
- In n8n, create a new workflow. Add a Webhook node as the trigger.
- n8n gives you a URL like
https://your-n8n.example.com/webhook/whatsapp-inbound. - In your BSP's management portal (or directly via Meta Business Manager if you're running native API), set this URL as the webhook for inbound messages.
- Set a webhook verification token. Meta uses this to confirm your webhook is legitimately yours.
- Subscribe to message events specifically — not status events, not media events, just user-sent messages.
Test by sending a message to your business number from a personal phone. The webhook should fire in n8n with the message content.
4. The agent prompt
This is where most teams underinvest. The prompt determines half the quality of the agent. Here's the system prompt template I've iterated on across multiple Malaysian deployments:
You are the WhatsApp assistant for [BUSINESS NAME], a [SECTOR] business in Malaysia.
For every incoming message, you will:
1. Determine intent: lead qualification, FAQ, appointment booking, complaint, or out-of-scope.
2. If FAQ, answer using only the approved knowledge base provided. Cite the source if asked.
3. If lead qualification, ask 2-3 short qualifying questions per [SECTOR] best practice. Capture: budget, timeline, contact preference.
4. If appointment booking, look up availability via the bookAppointment tool and confirm slot.
5. If complaint, escalate to human immediately via the escalateToHuman tool. Do not attempt to resolve.
6. If out-of-scope (anything else), politely say you'll have a human get back within 1 business hour, and call escalateToHuman.
Never: promise specific pricing not in the knowledge base, commit to delivery dates, share other customers' data, accept payment instructions, or impersonate human staff.
Always reply in the language the user wrote in (Bahasa Malaysia, English, or Mandarin). Keep replies under 80 words unless quoting from the knowledge base. Use one emoji per reply maximum.
Notice what this prompt does. It's explicit about scope, explicit about refusal, explicit about output format. It tells the model when to escalate, when to refuse, and when to use tools. Boring prompts win.
5. The tools the agent can call
Each tool is a separate n8n workflow that the AI agent can invoke. Keep the tool surface narrow.
- lookupCustomer(phone) — calls your CRM to fetch existing customer record. Returns name, tier, recent interactions. Returns null if not found.
- bookAppointment(date, time, name) — checks availability against your calendar API, books the slot, returns confirmation.
- retrieveFAQ(query) — returns relevant FAQ chunks. Implement with a vector store over your knowledge base (pgvector + Cohere embed-v4 works well in 2026).
- captureLead(name, contact, sector, budget, timeline) — writes the qualified lead to your CRM and notifies sales via Slack.
- escalateToHuman(reason, urgency) — routes the conversation to your human-handoff queue. Sets a flag so the agent stops responding to that specific user.
- logInteraction(intent, decision, output) — writes a row to your audit log with timestamp, conversation ID, intent, decision, output.
The agent should NOT have direct database access, payment system access, or arbitrary code execution. Every action goes through one of these named tools.
6. The PDPA-compliant flow
Three things every Malaysian WhatsApp AI deployment needs:
- Documented opt-in for marketing messages. If the user initiated contact, you can respond without prior opt-in. If you want to send marketing follow-ups outside the 24-hour window, explicit opt-in must be captured (e.g., "Would you like our weekly insurance tips? Reply YES to subscribe, or ignore.")
- Audit trail of every interaction. Inputs, decisions, outputs, escalations, opt-ins. PDPA-aware retention policy — retain only what you need, for as long as you need it.
- One-tap opt-out. "Reply STOP to unsubscribe" should work and must be tested. The agent must recognise STOP, UNSUB, BERHENTI, and similar tokens, and immediately call captureOptOut + escalateToHuman.
7. The five safety rails (non-negotiable)
- Hard spend cap on Anthropic. Already mentioned. Set it, don't skip it.
- Allowlist of tools. Agent can only call the named functions; no arbitrary code execution.
- Human-in-the-loop on consequential actions. Anything affecting money, contracts, or commitments goes to human approval. Especially in the first 90 days.
- Audit logging. Every input, decision, output. Non-negotiable for any Malaysian business in a regulated sector.
- Kill switch. A single config flag that disables the workflow without deleting it. You will need this the first time something goes wrong. Test it monthly.
8. Costs (be specific)
For a Malaysian SME running 5,000-15,000 inbound conversations per month:
- WhatsApp Business Platform conversation pricing — varies by category (utility, marketing, authentication, service) and country. For Malaysian numbers: RM 0.04-0.30 per conversation depending on category. Budget RM 200-1,500/month at typical SME volumes.
- Anthropic API (Claude Sonnet) — for typical agent prompt + response sizes, around RM 0.01-0.04 per conversation. Budget RM 50-600/month.
- n8n self-hosted on a small VPS — RM 30-80/month. Or n8n Cloud at USD 24/month.
- Claude Opus for occasional complex reasoning — optional, only if your agent needs heavy reasoning on a fraction of conversations. Budget RM 100-300/month.
Total for a competent production agent: RM 400-2,500/month for typical SME volumes. Compare to one full-time customer service hire (RM 4,000-7,000 loaded) and the economics are obvious.
9. The 30-day rollout I recommend
- Days 1-7: BSP onboarding. Webhook setup. First end-to-end test message handled by the agent.
- Days 8-14: Knowledge base ingestion. FAQ retrieval working. Initial prompt iteration.
- Days 15-21: Tool integrations: CRM lookup, lead capture, escalation. Audit logging stable.
- Days 22-30: Human-in-the-loop testing. PDPA flows. Kill-switch test. Soft launch on a subset of inbound traffic.
- Day 30+: Full traffic with weekly evaluation cycles. Track: deflection rate, escalation rate, customer-satisfaction (one-tap thumbs after escalation), cost per conversation.
For Malaysian teams ready to put this into practice, our AI Marketing + WhatsApp programme covers the full WhatsApp + n8n + Claude stack hands-on, plus PDPA-compliant flow design and BSP selection. HRDC SBL-KHAS claimable for eligible employers. The Coexistence article covers the architectural choice that comes before this build.