Part 9

Call-level controls

Three small state machines (silence, voicemail, DTMF) plus one hard duration cap govern the call as a whole: what happens when the caller goes quiet, when the call runs too long, when an outbound call hits voicemail, and how to capture keypad digits reliably. These are the knobs behind "it went dead silent then hung up," "it got cut off at ten minutes," and "it can't get my card number right."

Step 49

Silence timeout live

What it is. The amount of quiet (no caller speech) the agent tolerates before it speaks a check-in line. This is the trigger that starts the whole silence state machine: once silence exceeds this threshold, the IdleHandler fires the first check-in attempt.

callTimeoutSettings.silenceTimeout: type int (seconds), validator ≥0, default 0 (validator) / 10 (seed)

Runtime. Read by the idle handler (idle_handler.py:78-123) via core/types.py:425. Node-level overrides win over the call-level value (idle_handler.py:80-85), so a single conversation node can run a shorter or longer leash than the rest of the flow. The dashboard surfaces it through a DurationPicker (maxHours 1).
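The precedence rule above can be sketched in a few lines. This is a minimal illustration, not the code in idle_handler.py: the function name and parameters are hypothetical, and the sketch assumes that only a missing (None) node value counts as "no override". Whether the real handler treats a node-level 0 the same way is not confirmed here.

```python
from typing import Optional


def effective_silence_timeout(call_level: int, node_level: Optional[int] = None) -> int:
    """Return the silence timeout (seconds) that applies at the current node.

    Hypothetical helper: a node-level value, when present, wins over the
    call-level value.
    """
    if node_level is not None:
        return node_level
    return call_level


# Call-level leash of 10 s; one node asks for a longer 25 s leash.
print(effective_silence_timeout(10))      # 10
print(effective_silence_timeout(10, 25))  # 25
```

When auditing, resolve the value this way for the node the caller was actually in, rather than reading the call-level field alone.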

Default mismatch: read this before you trust a blank field

The validator default is 0 but the dashboard seed default is 10 (defaults.ts=10 vs validators.ts=0). A "0" you see in the raw config is not necessarily what a freshly-seeded agent runs. When auditing a customer's silence behavior, confirm the effective value, not the schema default, and remember that a literal 0 means "no tolerance," which behaves very differently from "10 seconds."

Symptom it fixes: "it kept interrupting me with 'are you still there?' when I was just thinking" (raise it) or "it waited forever while I was silent and the call felt frozen" (lower it).

Step 50

Silence check-in attempts (maxRetries) live

What it is. How many times the agent will speak a check-in line before giving up and closing the call. Each elapsed silenceTimeout consumes one attempt; once attempts are exhausted, the agent speaks the closure message and hangs up.

callTimeoutSettings.maxRetries: type int, range 0–10, default 2

Runtime. Read at idle_handler.py:78-85. The handler counts check-ins against this cap. A value of 0 ends the call immediately after the first silence timeout: no check-in line at all, straight to closure/hang-up.

Symptom it fixes: "it nagged me five times then hung up" (lower it) or "it gave up the instant I paused" (raise it, or check it isn't sitting at 0).

Step 51

Check-in vs closure vs silence-timeout prompt live

What it is. The three message slots of the silence machine. checkInMessage is the fixed line spoken on each silence ("Are you still there?"). closureMessage is the final line before hanging up. silenceTimeoutPrompt is a full LLM prompt that replaces the default check-in behavior — instead of a canned line, it hands the LLM control to react in context.

callTimeoutSettings.checkInMessage, callTimeoutSettings.closureMessage, callTimeoutSettings.silenceTimeoutPrompt: type string|null, max ≤2000, default null (all three)

Runtime. idle_handler.py:111-123,138-142; types at core/types.py:872-874. Check-in is read at :111-112, closure at :114-115,138-142, and the prompt path at :117-123.

Mutual exclusion: checkInMessage XOR silenceTimeoutPrompt

checkInMessage and silenceTimeoutPrompt are mutually exclusive: the UI disables one field when the other is set. One is a fixed, compliant line; the other hands the LLM control of the silence reaction. You cannot run both, so pick scripted or contextual. (closureMessage is independent and should almost always be set regardless.)
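A config audit can check the same rule outside the UI. The sketch below is illustrative only: the dataclass, the function, and the snake_case field names are hypothetical, and it assumes the ≤2000 cap counts characters, which the field spec does not state.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_MESSAGE_LENGTH = 2000  # documented cap; assumed here to be in characters


@dataclass
class SilenceMessages:
    check_in_message: Optional[str] = None
    closure_message: Optional[str] = None
    silence_timeout_prompt: Optional[str] = None


def validate_silence_messages(cfg: SilenceMessages) -> List[str]:
    """Return a list of problems; an empty list means the slots are consistent."""
    problems = []
    # Scripted check-in and LLM prompt must not both be set.
    if cfg.check_in_message is not None and cfg.silence_timeout_prompt is not None:
        problems.append("checkInMessage and silenceTimeoutPrompt are mutually exclusive")
    slots = {
        "checkInMessage": cfg.check_in_message,
        "closureMessage": cfg.closure_message,
        "silenceTimeoutPrompt": cfg.silence_timeout_prompt,
    }
    for name, value in slots.items():
        if value is not None and len(value) > MAX_MESSAGE_LENGTH:
            problems.append(f"{name} exceeds {MAX_MESSAGE_LENGTH} characters")
    return problems


scripted = SilenceMessages(check_in_message="Are you still there?",
                           closure_message="Goodbye for now.")
print(validate_silence_messages(scripted))  # []
```

Leaving all three slots at null passes this check, which matches the documented defaults.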

Symptom it fixes: "it just went dead silent then hung up" → set checkInMessage + closureMessage (or a silenceTimeoutPrompt + closureMessage).

Try it — silence-timeout state machine stepper

Set the silence threshold and how many check-in attempts the agent gets, then press Step to walk the machine: caller goes quiet → the timeout elapses → a check-in line plays → (repeat up to maxRetries) → closure line → hang up. With maxRetries = 0 the machine ends immediately after the first timeout, with no check-in.
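The same walk can be reproduced offline as a trace. This is a sketch of the transition order described above, not the IdleHandler itself; the function name is hypothetical. It assumes one further timeout elapses after the final check-in before closure, by analogy with the maxRetries = 0 case; the source does not spell that out.

```python
from typing import List


def silence_machine_trace(max_retries: int) -> List[str]:
    """List the states the silence machine passes through if the caller never speaks."""
    steps = ["caller silent"]
    for attempt in range(1, max_retries + 1):
        steps.append("silenceTimeout elapsed")
        steps.append(f"check-in {attempt}/{max_retries}")
    # Attempts exhausted (or none configured): the next timeout ends the call.
    steps.append("silenceTimeout elapsed")
    steps.append("closure message")
    steps.append("hang up")
    return steps


print(silence_machine_trace(0))
# ['caller silent', 'silenceTimeout elapsed', 'closure message', 'hang up']
print(len(silence_machine_trace(2)))  # 8
```

Under these assumptions the total quiet time before hang-up is roughly silenceTimeout × (maxRetries + 1), plus however long the spoken lines take.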

Step 52

Maximum duration live

What it is. A hard cap on total call length, independent of silence. When the cap is reached the call ends regardless of what's happening in the conversation.

callTimeoutSettings.maximumDuration: type int (seconds), range 0–3600, default 600 (10 min)

Runtime. Implemented by CallTimeoutMonitor (call_timeout_monitor.py:63,90-112), an observer added during boot at boot_steps.py:644-648,2742-2750. It simply sleeps for maximumDuration, then fires an end_call. Because it's a sleep-and-fire observer, the cap is wall-clock exact. (The wrap-up branch on messageMode lives a bit further down at :120-131 — see Step 53.)
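The sleep-and-fire pattern is small enough to show in full. The following is a generic asyncio sketch of that pattern, not the CallTimeoutMonitor code; max_duration_watchdog, end_call, and demo are hypothetical names, and the 0.05 s cap only keeps the demo fast.

```python
import asyncio
from typing import Awaitable, Callable, List


async def max_duration_watchdog(
    maximum_duration: float,
    end_call: Callable[[], Awaitable[None]],
) -> None:
    """Sleep for the cap, then fire end_call exactly once."""
    await asyncio.sleep(maximum_duration)
    await end_call()


async def demo() -> List[str]:
    events: List[str] = []

    async def end_call() -> None:
        events.append("end_call")

    # The watchdog runs alongside the call; nothing in the conversation resets it.
    watchdog = asyncio.create_task(max_duration_watchdog(0.05, end_call))
    events.append("call in progress")
    await watchdog
    return events


print(asyncio.run(demo()))  # ['call in progress', 'end_call']
```

Because the timer never looks at conversation state, a caller mid-sentence at the cap is cut off just like a silent one. In a design like this, the task would also need to be cancelled if the call ends early for another reason.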

Symptom it fixes: "the call got cut off at exactly 10 minutes" → raise maximumDuration.

Step 53

Max-duration end-call behavior live

What it is. How the call ends when the duration cap hits. messageMode picks the wrap-up style: llm (the LLM composes a contextual wrap-up), custom (speak spokenMessage verbatim), or none (hang up immediately, no goodbye). interruptible controls whether the caller can talk over the wrap-up.

maxDurationEndCall.messageMode: enum llm | custom | none, default llm
maxDurationEndCall.spokenMessage
maxDurationEndCall.llmPrompt
maxDurationEndCall.interruptible: bool, default false

Runtime. call_timeout_monitor.py:120-131 branches on messageMode (custom speaks spokenMessage at :120; llm uses llmPrompt). The dashboard reassembles these fields in call-settings-panel.tsx:76-85. triggerTimeout is a hidden/derived field set only when interruptible is on.
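The three-way branch can be pictured as a small dispatch function. This is a sketch under stated assumptions rather than the code at call_timeout_monitor.py:120-131: wrap_up_action and the action labels "speak", "generate", and "hang_up" are hypothetical, chosen only to make the branch visible.

```python
from typing import Optional, Tuple


def wrap_up_action(
    message_mode: str,
    spoken_message: Optional[str] = None,
    llm_prompt: Optional[str] = None,
) -> Tuple[str, Optional[str]]:
    """Map messageMode to what happens when the duration cap is reached."""
    if message_mode == "custom":
        return ("speak", spoken_message)   # spokenMessage, verbatim
    if message_mode == "llm":
        return ("generate", llm_prompt)    # LLM composes the wrap-up
    if message_mode == "none":
        return ("hang_up", None)           # no goodbye at all
    raise ValueError(f"unknown messageMode: {message_mode!r}")


print(wrap_up_action("custom", spoken_message="We're out of time. Goodbye."))
# ('speak', "We're out of time. Goodbye.")
print(wrap_up_action("none"))  # ('hang_up', None)
```

When a customer reports an abrupt cutoff, checking which branch their config lands in is usually the first step.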

Symptom it fixes: "it cut off abruptly with no goodbye at the time limit" → set messageMode to llm or custom (it was on none, or the monitor reached the cap before any closing).

Step 54

Voicemail detection live

What it is. On outbound calls, detect that the line reached a voicemail machine and react deliberately — leave a message or hang up — instead of talking to a beep and a recording. voicemailResponseDelay is the wait after the beep before the agent starts speaking.

voicemailDetection.enabled: bool, default false
voicemailDetection.messageMode: enum none | llm | custom, default none
voicemailDetection.staticMessage: string|null (mode=custom)
voicemailDetection.voicemailResponseDelay: range 0–30s, step 0.5, default 2.0

Runtime. VoicemailDetector / VoicemailGate wired in at boot_steps.py:2057-2063; the message logic lives in voicemail_detector.py:376-385 (delay read at :376, messageMode at :382, staticMessage at :384-385). It works as a race-and-gate: the detector races to classify human-vs-machine, the gate holds output until the verdict.
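The gate half of race-and-gate can be sketched as a buffer that releases nothing until a verdict arrives. This is a toy model, not the VoicemailGate class: VerdictGate and its methods are hypothetical, and it assumes that on a "machine" verdict the held conversational output is dropped so the voicemail message path can take over. The source does not say what the real gate does with held output.

```python
from typing import List, Optional


class VerdictGate:
    """Hold outgoing agent utterances until a human/machine verdict arrives."""

    def __init__(self) -> None:
        self.verdict: Optional[str] = None
        self._held: List[str] = []
        self.released: List[str] = []

    def send(self, utterance: str) -> None:
        if self.verdict is None:
            self._held.append(utterance)     # verdict pending: hold it
        elif self.verdict == "human":
            self.released.append(utterance)  # gate open: pass straight through
        # verdict == "machine": conversational output is dropped (assumption)

    def resolve(self, verdict: str) -> None:
        """Called by the detector once it has classified the far end."""
        self.verdict = verdict
        if verdict == "human":
            self.released.extend(self._held)
        self._held.clear()


gate = VerdictGate()
gate.send("Hi, this is the clinic calling.")
print(gate.released)  # []
gate.resolve("human")
print(gate.released)  # ['Hi, this is the clinic calling.']
```

The point of the gate is that the agent's greeting never reaches a voicemail greeting or beep while the detector is still deciding.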

Symptom it fixes: "our voicemail recording starts mid-greeting / talks over the beep" → raise voicemailResponseDelay.

Ask Claude Code: "In pipecat-agent, show me where voicemailResponseDelay and voicemailDetection.messageMode are read (voicemail_detector.py around 376-385) and how the voicemail gate decides human-vs-machine, then tell me the validator range and default for the delay from the dashboard."

Try it — voicemail beep → delay → message timeline

The detector classifies the machine and waits for the beep. voicemailResponseDelay shifts when the agent's message begins after that beep — too small and the message rides over the beep; large enough and the recording captures a clean message.

Step 55

DTMF (keypad) input live

What it is. Accept touch-tone digits (codes, menu choices, card numbers) and aggregate the keypresses into one input the LLM can act on. Each keypress resets an inter-digit timeout; the configured terminationDigit flushes immediately; the prefix is prepended to the emitted text so the model knows the input came from the keypad, not from speech.

dtmfConfig.enabled: bool, default false
dtmfConfig.timeout: range 0–30s, step 0.5, default 2.0
dtmfConfig.terminationDigit: enum #, *, 0–9, default #
dtmfConfig.prefix: string ≤100, default "DTMF: "

Runtime. FreyaDTMFAggregator is created and configured at boot_steps.py:2508-2519 (timeout at :2511, prefix at :2512, termination digit mapped to a KeypadEntry at :2513-2516) and inserted as the first real processor in the pipeline at boot_steps.py:2650. The prefix itself is applied in dtmf_aggregator.py:33. timeout is the inter-digit flush window; terminationDigit flushes the buffer the instant it's pressed.
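The aggregation rules (reset on keypress, flush on terminator or timeout, prefix on emit) fit in a short class. This is an illustrative model, not FreyaDTMFAggregator: the class and method names are hypothetical, time is passed in explicitly instead of using real timers, and it assumes the termination digit itself is not included in the emitted text.

```python
from typing import List, Optional


class DtmfAggregator:
    def __init__(self, timeout: float = 2.0, termination_digit: str = "#",
                 prefix: str = "DTMF: ") -> None:
        self.timeout = timeout
        self.termination_digit = termination_digit
        self.prefix = prefix
        self._buffer = ""
        self._last_press: Optional[float] = None
        self.emitted: List[str] = []

    def _flush(self) -> None:
        if self._buffer:
            self.emitted.append(self.prefix + self._buffer)
        self._buffer = ""
        self._last_press = None

    def tick(self, now: float) -> None:
        """Flush if the inter-digit window has elapsed since the last keypress."""
        if self._last_press is not None and now - self._last_press >= self.timeout:
            self._flush()

    def press(self, key: str, now: float) -> None:
        self.tick(now)  # a keypress after the window starts a new input
        if key == self.termination_digit:
            self._flush()  # terminator flushes immediately
            return
        self._buffer += key
        self._last_press = now  # each keypress restarts the window


agg = DtmfAggregator()
for t, key in [(0.0, "4"), (0.5, "2"), (1.0, "#")]:
    agg.press(key, t)
agg.press("7", 5.0)
agg.press("1", 6.0)
agg.tick(8.0)  # 2.0 s after the last press: timeout flush
print(agg.emitted)  # ['DTMF: 42', 'DTMF: 71']
```

The first input is emitted by the terminator, the second by the timeout; both reach the LLM as a single prefixed string rather than digit by digit.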

Symptom it fixes: "it can't get my card number right by voice" → switch that step to DTMF capture. This is often the real fix for digit-accuracy complaints, more than any amount of STT tuning.

Try it — DTMF aggregation

Press keys to fill the buffer. The configured prefix is prepended so the LLM knows it's keypad input. Pressing the terminationDigit flushes the buffer to the output line and clears it. The inter-digit timeout here is simulated: each keypress restarts a single timer, and if you stop typing for the configured window, the buffer auto-flushes (the prior timer is cleared on every press so timers never stack up).

Step 56

The silence / voicemail / DTMF state machines live

What it is. A summary of the three machines this part controls. Each is small, but knowing the exact transition order is what lets you predict behavior and diagnose customer reports.

Silence

quiet > silenceTimeout → speak checkInMessage (or run silenceTimeoutPrompt) → repeat up to maxRetries → speak closureMessage → hang up. With maxRetries = 0, the first timeout goes straight to closure/hang-up.

Voicemail

detect machine → wait voicemailResponseDelay after the beep → speak per messageMode (none/llm/custom) → hang up. Detector races, gate holds output until the verdict.

DTMF

each keypress resets the timeout → on terminationDigit or timeout, flush the accumulated digits (with prefix) as one input to the LLM.

Runtime. Silence/idle handled by IdleHandler (idle_handler.py:78-123; node-level overrides win, :80-85). Max-duration is an observer that sleeps then fires end_call (call_timeout_monitor.py:63,90-112; wrap-up branch :120-131). Voicemail is detect-and-gate (voicemail_detector.py:376-385; wired at boot_steps.py:2057-2063). The DTMF aggregator is the first real processor (boot_steps.py:2508-2519,2650).

Symptom it fixes: any "why did the call do X at the end" report — trace the relevant machine's transitions and you'll find the knob.

Ask Claude Code: "Walk me through idle_handler.py:78-123 in pipecat-agent and confirm the exact order: does maxRetries = 0 skip the check-in entirely and go straight to closureMessage? Show the lines that count attempts and the line that fires the hang-up."

Checkpoint

A customer collecting 16-digit card numbers reports constant misrecognition in their noisy call center. What is the single highest-leverage change?

Enable DTMF for that step (dtmfConfig.enabled = true, with a sensible timeout and terminationDigit). Keypad tones are immune to background noise and STT errors, so they capture digits far more reliably than speech. All the STT and keyterm tuning in the world will not beat DTMF for digit capture in a noisy environment.

(Re-ask would have been the soft fallback, but it is dead — see Part 8. So DTMF is not just the best option here; for digits in noise it is effectively the only robust one.)