Silence timeout live
What it is. The amount of quiet (no caller speech) the agent tolerates before it speaks a check-in line. This is the trigger that starts the whole silence state machine: once silence exceeds this threshold, the IdleHandler fires the first check-in attempt.
Runtime. Read by the idle handler (idle_handler.py:78-123) via core/types.py:425. Node-level overrides win over the call-level value (idle_handler.py:80-85), so a single conversation node can run a shorter or longer leash than the rest of the flow. The dashboard surfaces it through a DurationPicker (maxHours 1).
- When to change: shorten for crisp IVR-style flows where dead air should be nudged quickly; lengthen for reflective calls (e.g. the caller is reading a document) where premature check-ins feel pushy.
The schema default is 0, but the dashboard seed default is 10 (defaults.ts=10 vs validators.ts=0). A "0" you see in the raw config is not necessarily what a freshly seeded agent runs. When auditing a customer's silence behavior, confirm the effective value, not the schema default, and remember that a literal 0 means "no tolerance," which behaves very differently from "10 seconds."
Symptom it fixes: "it kept interrupting me with 'are you still there?' when I was just thinking" (raise it) or "it waited forever while I was silent and the call felt frozen" (lower it).
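The precedence rule above (node-level override beats call-level value) can be sketched as follows. This is a minimal illustration, not the code in idle_handler.py: the function name and the convention that an unset override is `None` are assumptions, and the values are taken to be seconds.

```python
from typing import Optional


def effective_silence_timeout(call_level: float,
                              node_override: Optional[float] = None) -> float:
    """Return the silence timeout actually in effect for a node.

    Hypothetical helper: assumes an unset node override is None.
    """
    # Compare against None rather than truthiness: an explicit override of 0
    # is a real value ("no tolerance"), not a missing one.
    if node_override is not None:
        return node_override
    return call_level
```

When auditing, this is the value to compare against the customer's report: `effective_silence_timeout(10, 3)` gives 3 for that node even though the call-level setting says 10.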
Silence check-in attempts (maxRetries) live
What it is. How many times the agent will speak a check-in line before giving up and closing the call. Each elapsed silenceTimeout consumes one attempt; once attempts are exhausted, the agent speaks the closure message and hangs up.
Runtime. Read at idle_handler.py:78-85. The handler counts check-ins against this cap. A value of 0 means end immediately after the first silence timeout — no check-in line at all, straight to closure/hang-up.
- When to change: raise to give a distracted caller more chances to come back; set to 0 for unattended flows where a single timeout should just end the call.
Symptom it fixes: "it nagged me five times then hung up" (lower it) or "it gave up the instant I paused" (raise it, or check it isn't sitting at 0).
Check-in vs closure vs silence-timeout prompt live
What it is. The three message slots of the silence machine. checkInMessage is the fixed line spoken on each silence ("Are you still there?"). closureMessage is the final line before hanging up. silenceTimeoutPrompt is a full LLM prompt that replaces the default check-in behavior — instead of a canned line, it hands the LLM control to react in context.
Runtime. idle_handler.py:111-123,138-142; types at core/types.py:872-874. Check-in is read at :111-112, closure at :114-115,138-142, and the prompt path at :117-123.
checkInMessage and silenceTimeoutPrompt are mutually exclusive: the UI disables one field when the other is set. One is a fixed, compliant line; the other hands the LLM control of the silence reaction. You cannot run both: pick scripted or contextual. (closureMessage is independent and should almost always be set regardless.)
- When to change: use checkInMessage for a scripted, audit-friendly nudge; use silenceTimeoutPrompt when you want the LLM to reference what was just discussed. Always set closureMessage for a graceful goodbye instead of a silent drop.
Symptom it fixes: "it just went dead silent then hung up" → set checkInMessage + closureMessage (or a silenceTimeoutPrompt + closureMessage).
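A quick audit of the three slots might look like the sketch below. It only encodes the rules stated above (the two check-in fields are mutually exclusive; closureMessage should almost always be set). The function name and the wording of the findings are hypothetical, and it is an assumption that an empty string counts as unset.

```python
from typing import List, Optional


def audit_silence_messages(check_in_message: Optional[str],
                           silence_timeout_prompt: Optional[str],
                           closure_message: Optional[str]) -> List[str]:
    """Return human-readable findings for a silence-message config."""
    findings: List[str] = []
    # Scripted line and LLM prompt cannot both drive the check-in.
    if check_in_message and silence_timeout_prompt:
        findings.append(
            "checkInMessage and silenceTimeoutPrompt are both set; pick one")
    # Neither set: worth flagging when the report is "it went dead silent".
    if not check_in_message and not silence_timeout_prompt:
        findings.append(
            "neither checkInMessage nor silenceTimeoutPrompt is set")
    # closureMessage is independent of the other two.
    if not closure_message:
        findings.append("closureMessage is empty; the call may end without a goodbye")
    return findings
```

An empty list means the config is consistent with the guidance in this section; it does not prove the runtime behaves as intended.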
Try it — silence-timeout state machine stepper
Set the silence threshold and how many check-in attempts the agent gets, then press Step to walk the machine: caller goes quiet → the timeout elapses → a check-in line plays → (repeat up to maxRetries) → closure line → hang up. With maxRetries = 0 the machine ends immediately after the first timeout, with no check-in.
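The same walk can be written as a few lines of Python. This is a sketch under one stated assumption: after the last check-in, one more timeout has to elapse before closure fires. The source describes the order only loosely, so confirm against idle_handler.py before relying on it. The function name and event labels are hypothetical.

```python
from typing import List


def silence_machine(max_retries: int) -> List[str]:
    """List the events emitted while the caller stays silent throughout."""
    events: List[str] = []
    attempts = 0
    while True:
        events.append("timeout")            # silenceTimeout elapsed
        if attempts >= max_retries:         # attempts exhausted (or cap is 0)
            events.append("closure")        # speak closureMessage
            events.append("hang_up")
            return events
        attempts += 1
        events.append("check_in")           # speak checkInMessage / run prompt
```

With `max_retries=0` the trace is timeout, closure, hang_up: no check-in is ever spoken, which is the case to rule out first when a customer says the agent "gave up instantly".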
Maximum duration live
What it is. A hard cap on total call length, independent of silence. When the cap is reached the call ends regardless of what's happening in the conversation.
Runtime. Implemented by CallTimeoutMonitor (call_timeout_monitor.py:63,90-112), an observer added during boot at boot_steps.py:644-648,2742-2750. It simply sleeps for maximumDuration, then fires an end_call. Because it's a sleep-and-fire observer, the cap is wall-clock exact. (The wrap-up branch on messageMode lives a bit further down at :120-131 — see Step 53.)
- When to change: raise for genuinely long support calls that legitimately exceed ten minutes; lower to contain cost and abuse on simple, short flows.
Symptom it fixes: "the call got cut off at exactly 10 minutes" → raise maximumDuration.
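The sleep-and-fire pattern is easy to picture with plain asyncio. The sketch below is not CallTimeoutMonitor itself: the function names, the `end_call` callback signature, and the cancel-on-early-hangup behavior are all assumptions made for illustration.

```python
import asyncio
from typing import Awaitable, Callable, List


async def max_duration_watchdog(maximum_duration: float,
                                end_call: Callable[[str], Awaitable[None]]) -> None:
    """Sleep for the cap, then end the call regardless of conversation state."""
    await asyncio.sleep(maximum_duration)
    await end_call("maximum duration reached")


async def demo_cap_reached() -> List[str]:
    ended: List[str] = []

    async def end_call(reason: str) -> None:
        ended.append(reason)

    # A tiny cap so the demo finishes quickly.
    await max_duration_watchdog(0.05, end_call)
    return ended


async def demo_call_ended_early() -> List[str]:
    ended: List[str] = []

    async def end_call(reason: str) -> None:
        ended.append(reason)

    watchdog = asyncio.create_task(max_duration_watchdog(10.0, end_call))
    await asyncio.sleep(0)      # let the watchdog start sleeping
    watchdog.cancel()           # assumed: the call ended before the cap
    try:
        await watchdog
    except asyncio.CancelledError:
        pass
    return ended
```

Because the watchdog never looks at conversation state, nothing the caller or the LLM does can extend the call; only a larger maximumDuration can.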
Max-duration end-call behavior live
What it is. How the call ends when the duration cap hits. messageMode picks the wrap-up style: llm (the LLM composes a contextual wrap-up), custom (speak spokenMessage verbatim), or none (hang up immediately, no goodbye). interruptible controls whether the caller can talk over the wrap-up.
Runtime. call_timeout_monitor.py:120-131 branches on messageMode (custom speaks spokenMessage at :120; llm uses llmPrompt). The dashboard reassembles these fields in call-settings-panel.tsx:76-85. triggerTimeout is a hidden/derived field set only when interruptible is on.
- When to change: custom for a compliant fixed closing; none for hard cutoffs where any extra speech is unwanted; turn on interruptible so the caller can respond to the wrap-up instead of being talked over.
Symptom it fixes: "it cut off abruptly with no goodbye at the time limit" → set messageMode to llm or custom (it was on none, or the monitor reached the cap before any closing).
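The three-way branch on messageMode can be sketched as a small dispatch function. The mode names and the spokenMessage / llmPrompt fields come from the text above; the function name and the returned action labels are hypothetical and do not mirror call_timeout_monitor.py.

```python
from typing import Optional, Tuple


def wrap_up_action(message_mode: str,
                   spoken_message: Optional[str] = None,
                   llm_prompt: Optional[str] = None) -> Tuple[str, Optional[str]]:
    """Decide what happens when the duration cap hits."""
    if message_mode == "custom":
        return ("speak", spoken_message)     # verbatim closing line
    if message_mode == "llm":
        return ("generate", llm_prompt)      # LLM composes the wrap-up
    if message_mode == "none":
        return ("hang_up", None)             # no goodbye at all
    raise ValueError(f"unknown messageMode: {message_mode!r}")
```

An "abrupt cutoff" report maps directly to the third branch, so checking the stored messageMode is usually the first diagnostic step.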
Voicemail detection live
What it is. On outbound calls, detect that the line reached a voicemail machine and react deliberately — leave a message or hang up — instead of talking to a beep and a recording. voicemailResponseDelay is the wait after the beep before the agent starts speaking.
Runtime. VoicemailDetector / VoicemailGate wired in at boot_steps.py:2057-2063; the message logic lives in voicemail_detector.py:376-385 (delay read at :376, messageMode at :382, staticMessage at :384-385). It works as a race-and-gate: the detector races to classify human-vs-machine, the gate holds output until the verdict.
- When to change: enable on outbound campaigns; tune voicemailResponseDelay so the message starts after the beep clears, not over it.
Symptom it fixes: "our voicemail recording starts mid-greeting / talks over the beep" → raise voicemailResponseDelay.
voicemailResponseDelay and voicemailDetection.messageMode are read (voicemail_detector.py around 376-385) and how the voicemail gate decides human-vs-machine, then tell me the validator range and default for the delay from the dashboard."
Try it — voicemail beep → delay → message timeline
The detector classifies the machine and waits for the beep. voicemailResponseDelay shifts when the agent's message begins after that beep — too small and the message rides over the beep; large enough and the recording captures a clean message.
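The timeline arithmetic can be sketched as below. It rests on an assumption the source does not state outright: that the delay timer starts when the beep is detected (its onset), so a delay shorter than the beep itself puts the message on top of it. Times are assumed to be in seconds, and the function and key names are hypothetical.

```python
from typing import Dict, Union


def voicemail_timeline(beep_detected_at: float,
                       beep_duration: float,
                       response_delay: float) -> Dict[str, Union[float, bool]]:
    """Compute when the message starts relative to the end of the beep."""
    beep_end = beep_detected_at + beep_duration
    # Assumed: the delay is counted from beep detection, not from beep end.
    message_start = beep_detected_at + response_delay
    return {
        "beep_end": beep_end,
        "message_start": message_start,
        "clean": message_start >= beep_end,  # False means it talks over the beep
    }
```

For a one-second beep detected at t=5.0, a delay of 0.5 starts the message at 5.5 (over the beep), while a delay of 1.5 starts it at 6.5 (clean).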
DTMF (keypad) input live
What it is. Accept touch-tone digits (codes, menu choices, card numbers) and aggregate the keypresses into one input the LLM can act on. Each keypress resets an inter-digit timeout; the configured terminationDigit flushes immediately; the prefix is prepended to the emitted text so the model knows the input came from the keypad, not from speech.
Runtime. FreyaDTMFAggregator is created and configured at boot_steps.py:2508-2519 (timeout at :2511, prefix at :2512, termination digit mapped to a KeypadEntry at :2513-2516) and inserted as the first real processor in the pipeline at boot_steps.py:2650. The prefix itself is applied in dtmf_aggregator.py:33. timeout is the inter-digit flush window; terminationDigit flushes the buffer the instant it's pressed.
- When to change: enable for any flow collecting numeric codes. DTMF tones are immune to background noise and STT errors, so keypad capture is far more reliable than speech for digits. Lower timeout for fast typists; set terminationDigit to match the customer's existing IVR convention.
Symptom it fixes: "it can't get my card number right by voice" → switch that step to DTMF capture. This is often the real fix for digit-accuracy complaints, more than any amount of STT tuning.
Try it — DTMF aggregation
Press keys to fill the buffer. The configured prefix is prepended so the LLM knows it's keypad input. Pressing the terminationDigit flushes the buffer to the output line and clears it. The inter-digit timeout here is simulated: each keypress restarts a single timer, and if you stop typing for the configured window, the buffer auto-flushes (the prior timer is cleared on every press so timers never stack up).
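The aggregation rules can be sketched with a clock passed in explicitly, which keeps the example deterministic. This is not FreyaDTMFAggregator: the class name, the "DTMF: " prefix string, and the choice to leave the termination digit out of the emitted text are assumptions for illustration.

```python
from typing import List, Optional


class DtmfBuffer:
    """Collect keypresses and flush them as one prefixed string."""

    def __init__(self, timeout: float, termination_digit: str = "#",
                 prefix: str = "DTMF: ") -> None:
        self.timeout = timeout
        self.termination_digit = termination_digit
        self.prefix = prefix
        self._digits: List[str] = []
        self._deadline: Optional[float] = None  # single timer, never stacked

    def _flush(self) -> Optional[str]:
        self._deadline = None
        if not self._digits:
            return None
        text = self.prefix + "".join(self._digits)
        self._digits.clear()
        return text

    def press(self, key: str, now: float) -> Optional[str]:
        """Register a keypress; returns flushed text on the termination digit."""
        if key == self.termination_digit:
            return self._flush()
        self._digits.append(key)
        self._deadline = now + self.timeout  # each press restarts the window
        return None

    def tick(self, now: float) -> Optional[str]:
        """Call periodically; flushes once the inter-digit window has elapsed."""
        if self._deadline is not None and now >= self._deadline:
            return self._flush()
        return None
```

Usage: with `timeout=2.0`, pressing "1" at t=0.0 and "2" at t=1.5 moves the deadline to 3.5, so `tick(3.0)` returns nothing and `tick(3.5)` returns "DTMF: 12".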
The silence / voicemail / DTMF state machines live
What it is. A summary of the three machines this part controls. Each is small, but knowing the exact transition order is what lets you predict behavior and diagnose customer reports.
Silence: quiet > silenceTimeout → speak checkInMessage (or run silenceTimeoutPrompt) → repeat up to maxRetries → speak closureMessage → hang up. With maxRetries = 0, the first timeout goes straight to closure/hang-up.
Voicemail: detect machine → wait voicemailResponseDelay after the beep → speak per messageMode (none/llm/custom) → hang up. Detector races, gate holds output until the verdict.
DTMF: each keypress resets the timeout → on terminationDigit or timeout, flush the accumulated digits (with prefix) as one input to the LLM.
Runtime. Silence/idle handled by IdleHandler (idle_handler.py:78-123; node-level overrides win, :80-85). Max-duration is an observer that sleeps then fires end_call (call_timeout_monitor.py:63,90-112; wrap-up branch :120-131). Voicemail is detect-and-gate (voicemail_detector.py:376-385; wired at boot_steps.py:2057-2063). The DTMF aggregator is the first real processor (boot_steps.py:2508-2519,2650).
- When to change: reach for these descriptions when a customer's complaint doesn't match a single knob. Usually the answer is "the machine reached a transition you didn't expect" (e.g. closure fired because maxRetries was 0, or voicemail spoke over the beep because the delay was too small).
Symptom it fixes: any "why did the call do X at the end" report — trace the relevant machine's transitions and you'll find the knob.
idle_handler.py:78-123 in pipecat-agent and confirm the exact order: does maxRetries = 0 skip the check-in entirely and go straight to closureMessage? Show the lines that count attempts and the line that fires the hang-up."
Checkpoint
A customer collecting 16-digit card numbers reports constant misrecognition in their noisy call center. What is the single highest-leverage change?
Enable DTMF for that step (dtmfConfig.enabled = true, with a sensible timeout and terminationDigit). Keypad tones are immune to background noise and STT errors, so they capture digits far more reliably than speech. All the STT and keyterm tuning in the world will not beat DTMF for digit capture in a noisy environment.
(Re-ask would have been the soft fallback, but it is dead — see Part 8. So DTMF is not just the best option here; for digits in noise it is effectively the only robust one.)