Part 3

TTS: start & stop speaking plans

Panel: TTS › Advanced › Start Speaking Plan / Stop Speaking Plan. This is the cluster for barge-in tuning — every "it interrupts me", "it talks over me", and "I say stop and it keeps going" complaint lands here. Strategies are built in create_user_turn_params (base_service.py:1430-1576) and attached to the user context aggregator.

Step 10

Start Speaking Plan: wait seconds (live)

What it is. The answer delay: how long the agent waits after it believes the caller has finished before it starts speaking its reply. Lower feels snappy; higher gives pause-heavy callers room to keep going.

ttsConfig.startSpeakingPlan.waitSeconds: range 0–5, step 0.1, default 0.4

Runtime. Feeds the user_speech_timeout of the stop strategy (base_service.py:1497-1503,1537). In chat mode, the userTurnStopTimeout safety net is clamped to waitSeconds + 0.1 (see Part 7).

Footgun: runtime fallback is 0.6, not 0.4. The schema default is 0.4, but if an agent's config omits this field entirely, the runtime falls back to 0.6 (base_service.py:1497). You will observe 0.6 s behavior, not 0.4 s. Always set waitSeconds explicitly when tuning instead of relying on the default.
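The mismatch can be made concrete with a minimal sketch. The helper name resolve_wait_seconds and the dict-shaped config are hypothetical, chosen for illustration; only the 0.4 schema default and the 0.6 runtime fallback come from the text above. Treating an explicit null like a missing field is also an assumption.

```python
from typing import Any, Mapping

SCHEMA_DEFAULT_WAIT_SECONDS = 0.4    # what the schema (and the slider) shows
RUNTIME_FALLBACK_WAIT_SECONDS = 0.6  # what the runtime uses when the field is absent


def resolve_wait_seconds(tts_config: Mapping[str, Any]) -> float:
    """Return the answer delay that would effectively apply at call time.

    Hypothetical helper: it mirrors the behavior described above and is
    not the actual code in base_service.py.
    """
    plan = tts_config.get("startSpeakingPlan") or {}
    value = plan.get("waitSeconds")
    if value is None:
        # Assumption: a null value behaves like an omitted field.
        return RUNTIME_FALLBACK_WAIT_SECONDS
    return float(value)
```

Calling resolve_wait_seconds({}) yields the 0.6 fallback, while a config that sets waitSeconds to 0.4 yields 0.4, which is why the explicit value is worth writing even when it equals the schema default.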

Symptom it fixes: "awkward gap before the bot replies" → lower it. "the bot jumps in before I finish" → raise it (and check Part 7 VAD stop delay).

Step 11

Stop Speaking Plan: number of words (live)

What it is. The barge-in threshold: how many words the caller must say before their speech interrupts the bot mid-utterance. This is the single most important knob for noisy call centers.

ttsConfig.stopSpeakingPlan.numberOfWords: range 0–10, step 1, default 1

Runtime. Selects the start strategy (base_service.py:1471-1490): 0 → DynamicVADUserTurnStartStrategy (pure-VAD barge-in: any detected speech interrupts); ≥1 → DynamicMinWordsUserTurnStartStrategy(min_words=N) (needs N words before interrupting).
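The selection rule can be sketched as follows. The two dataclasses are stand-ins named after the strategies mentioned above; they are not the real classes, whose constructors may differ, and pick_start_strategy is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class VADStart:
    """Stand-in for DynamicVADUserTurnStartStrategy (not the real class)."""


@dataclass(frozen=True)
class MinWordsStart:
    """Stand-in for DynamicMinWordsUserTurnStartStrategy (not the real class)."""

    min_words: int


def pick_start_strategy(number_of_words: int) -> Union[VADStart, MinWordsStart]:
    """Map numberOfWords to a start strategy, as described above."""
    if not 0 <= number_of_words <= 10:
        raise ValueError("numberOfWords must be in the range 0-10")
    if number_of_words == 0:
        # Pure-VAD barge-in: any detected speech interrupts.
        return VADStart()
    # The caller must say at least N words before interrupting.
    return MinWordsStart(min_words=number_of_words)
```

The key point is that 0 is not "a very low threshold" but a different strategy altogether, so moving from 1 to 0 changes the barge-in mechanism rather than just its sensitivity.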

Symptom it fixes: "background noise keeps stopping the bot mid-sentence" → raise it. "I say 'stop' and it keeps talking" → lower it (or add an interruption phrase, Step 13).

Step 12

Acknowledgement phrases (live)

What it is. Short backchannels ("evet", "hı hı", "tamam") that should not interrupt the bot — the caller is just signalling they are listening, not trying to take the turn.

ttsConfig.stopSpeakingPlan.acknowledgementPhrases: type string[]|null, ≤200 items, each ≤100 chars, default []

Runtime. base_service.py:1473,1488. Only a whole-utterance match counts as an acknowledgement (it will not interrupt). Disabled in the UI when numberOfWords = 0, because pure-VAD mode does no word matching at all.

Symptom it fixes: "when I say 'okay'/'evet' to show I'm listening, it stops talking" → add those words here.

Step 13

Interruption phrases (live)

What it is. Phrases that force an immediate interruption even before the word threshold is met ("dur", "bekle", "hayır hayır"). This is the hard stop-word path.

ttsConfig.stopSpeakingPlan.interruptionPhrases: type string[]|null, ≤200 items, each ≤100 chars, default []

Runtime. base_service.py:1474,1489. A match bypasses numberOfWords entirely and stops the bot immediately. A one-word "dur" interrupts even when numberOfWords is 3.
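How the word threshold, acknowledgement phrases, and interruption phrases combine can be sketched in a few lines. This is an illustrative model, not the runtime code: should_interrupt and normalize are hypothetical names, the case- and whitespace-insensitive comparison is an assumption, and so is checking the interruption list before the acknowledgement list.

```python
from typing import Iterable


def normalize(text: str) -> str:
    # Assumption: matching ignores case and extra whitespace.
    return " ".join(text.casefold().split())


def should_interrupt(
    utterance: str,
    number_of_words: int,
    acknowledgement_phrases: Iterable[str] = (),
    interruption_phrases: Iterable[str] = (),
) -> bool:
    """Decide whether the caller's utterance stops the bot mid-sentence."""
    if number_of_words == 0:
        # Pure-VAD mode: any detected speech interrupts, no word matching.
        return True
    spoken = normalize(utterance)
    if spoken in {normalize(p) for p in interruption_phrases}:
        # Hard stop-word path: bypasses the word threshold.
        return True
    if spoken in {normalize(p) for p in acknowledgement_phrases}:
        # Whole-utterance backchannel: the bot keeps talking.
        return False
    return len(spoken.split()) >= number_of_words
```

For example, with numberOfWords set to 3 and "dur" on the interruption list, should_interrupt("dur", 3, interruption_phrases=["dur"]) is true, while a two-word utterance that is on neither list stays below the threshold and does not interrupt.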

Symptom it fixes: "I have to repeat 'stop' three times before it listens" → add those stop-words here.

Step 14

Voice seconds (dead)

What it is. Nominally, how much continuous voiced audio counts toward stopping the bot.

ttsConfig.stopSpeakingPlan.voiceSeconds: range 0–0.5, step 0.1, default 0.4 (validator) / 0.3 (defaults.ts)

No runtime consumer found. The slider is visible in the UI, but no pipecat-agent code reads voiceSeconds, so changing it does nothing at call time. Note also the default mismatch: the validator says 0.4 while defaults.ts says 0.3. Do not promise a customer a barge-in fix via this knob; use numberOfWords + interruption phrases instead. (Verify with your pair: it may be wired in a newer branch.)

Symptom it fixes: none it can fix today — included here only so you do not waste a tuning cycle on it.

Step 15

Back off seconds (dead + hidden)

What it is. Nominally a cooldown after an interruption.

ttsConfig.stopSpeakingPlan.backOffSeconds: range 0–3, step 0.1, default 0.4

No runtime consumer found. Dead on both sides: no pipecat-agent code reads backOffSeconds, and the UI slider is rendered inside a hidden div (tts-config-panel.tsx:1076), so it is not even visible in the dashboard. Do not tune it today.

Symptom it fixes: none.
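One way to avoid wasting a tuning cycle on the two dead knobs is a small config check. The sketch below is hypothetical tooling, not part of pipecat-agent; dead_knob_warnings is an invented name, and the list of dead keys reflects only what is reported above and should be re-verified against the current branch.

```python
from typing import Any, List, Mapping

# Keys that, per the notes above, have no runtime consumer today.
DEAD_STOP_PLAN_KEYS = ("voiceSeconds", "backOffSeconds")


def dead_knob_warnings(tts_config: Mapping[str, Any]) -> List[str]:
    """List stopSpeakingPlan keys that are set but not read at call time."""
    stop_plan = tts_config.get("stopSpeakingPlan") or {}
    return [
        f"ttsConfig.stopSpeakingPlan.{key} is set but has no runtime effect"
        for key in DEAD_STOP_PLAN_KEYS
        if key in stop_plan
    ]
```

Running it over a config that sets voiceSeconds returns one warning for that key; a config that only sets numberOfWords returns an empty list.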

Try it — barge-in simulator

Does the caller interrupt the bot?

The bot speaks for the band below. The caller starts talking somewhere inside it. Whether the bot actually stops depends on three things: how many words the caller spoke, the numberOfWords threshold, and whether the caller's phrase is on the interruption-phrases list (which bypasses the threshold entirely).

Ask Claude Code: "In pipecat-agent, show me base_service.py:1471-1490 and explain how numberOfWords picks between DynamicVADUserTurnStartStrategy and DynamicMinWordsUserTurnStartStrategy, and where the interruption-phrase shortcut bypasses the word count."

Try it — answer-delay (waitSeconds)

Gap between caller-ends and bot-starts

The caller finishes at the marker. The bot waits waitSeconds before it starts replying. Watch the gap.

Reminder: if this field were absent from the config, the runtime would use 0.6 (base_service.py:1497), not whatever the slider shows. Set it explicitly.

Try it — acknowledgement vs interruption matcher

Type a phrase and see what the bot does

Interruption list: dur, bekle, stop, hayır hayır. Acknowledgement list: evet, tamam, hı hı, okay. Matching is whole-utterance.

Step 16

Tying barge-in to symptoms (recipe)

The fast lookup from "what the customer says" to "which knob to turn". When two knobs are needed they do not conflict.

Symptom | Knob move
Bot interrupted by call-center background chatter | numberOfWords 1 → 2 or 3
Caller can't interrupt at all | numberOfWords → 0, or add interruption phrases
"Okay"/"evet" stops the bot | add to acknowledgement phrases
Want noise immunity and instant stop on command | keep numberOfWords high + add interruption phrases
Awkward gap before bot replies | lower waitSeconds
Bot jumps in before caller finishes | raise waitSeconds (and see Part 7 VAD)

Checkpoint

Noisy call center: callers complain the bot both gets cut off by noise and ignores them when they say "dur". One change to fix both?

You can't do it with one knob, but you can with two that don't conflict: raise numberOfWords to 2–3 (noise immunity) and add "dur" to interruptionPhrases (instant stop bypasses the word threshold). The interruption-phrase path ignores numberOfWords entirely, so you keep noise resistance while still giving the caller a reliable hard stop-word.
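The two-knob answer can be written down as a config fragment together with a check against the documented limits. This is a sketch: the dict mirrors the ttsConfig.stopSpeakingPlan keys named in this part, the value 3 is one choice within the suggested 2–3 range, and validate_stop_plan is a hypothetical helper, not the dashboard's validator.

```python
from typing import Any, Dict, List

# Noise immunity (word threshold) plus a hard stop-word (interruption phrase).
NOISY_CALL_CENTER_STOP_PLAN: Dict[str, Any] = {
    "numberOfWords": 3,
    "interruptionPhrases": ["dur"],
}


def validate_stop_plan(plan: Dict[str, Any]) -> List[str]:
    """Check a stopSpeakingPlan fragment against the limits listed above."""
    errors: List[str] = []
    words = plan.get("numberOfWords", 1)
    if not isinstance(words, int) or not 0 <= words <= 10:
        errors.append("numberOfWords must be an integer in 0-10")
    for key in ("acknowledgementPhrases", "interruptionPhrases"):
        phrases = plan.get(key) or []  # null is allowed by the schema
        if len(phrases) > 200:
            errors.append(f"{key} allows at most 200 items")
        if any(len(p) > 100 for p in phrases):
            errors.append(f"{key} items must be at most 100 characters")
    return errors
```

validate_stop_plan(NOISY_CALL_CENTER_STOP_PLAN) returns an empty list, meaning the fragment stays within the documented ranges.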

Ask Claude Code: "Confirm whether ttsConfig.stopSpeakingPlan.voiceSeconds and backOffSeconds have any consumer in pipecat-agent (grep both keys), and show me the hidden div at tts-config-panel.tsx:1076 that wraps the back-off slider."