Master Simulator — Freya Voice Settings

Caller & scene

Scene inputs (the caller and environment) — not dashboard settings. Hover any row for a one-line explanation.

Background noise: 0.20

Caller loudness: 0.80

Mid-sentence pause (s): 0.30

Caller barges in at (s): 1.5

Caller words spoken: 3

Caller's phrase

Audio filters — dashboard: STT › Advanced (Part 6)

WebRTC Audio Processing [webrtcApmEnabled]: live

DeepFilterNet Noise Cancellation [deepFilterEnabled]

NC_OPT_URL set (env)

AIC Filter [aicEnabled]: dead

VAD / endpointing — dashboard: STT › Voice Activity Detection (Part 7)

Confidence [vadConfidence]: 0.70

Minimum Volume [vadMinVolume]: 0.40

Start Delay [vadStartSecs]: 0.20

Stop Delay [vadStopSecs]: 0.20

Turn Stop Timeout [userTurnStopTimeout]: 1.0

Smart Turn [smartTurnEnabled]: live

Turn-taking & barge-in — dashboard: TTS › Speaking Plans + STT echo (Part 3 / 7)

Wait seconds [waitSeconds]: 0.4

Number of words [numberOfWords]: 1

Echo Suppression (grace) [botSpeechGraceSecs]: 0.00

Timeline legend: Bot speech | Caller speech (heard) | Cut-off / false trigger | Noise floor | Grace / wait window

Adjust the knobs.

Pipeline trace — every decision, with its code reference

How to read it

This simulator is a teaching model, not the real DSP. The numbers and thresholds match the validator ranges and the runtime defaults exactly (see phase0-report.md), and the branch logic mirrors base_service.py, but real audio is messier. Use it to build intuition for which knob moves which outcome, then confirm a real change with a test call and the VAD boot log line (base_service.py:1552), per Part 10.

Endpointing mode shows one caller utterance with a mid-sentence pause. Watch how vadStopSecs decides whether that pause ends the turn early (the classic cut-off), how smartTurn sidesteps it, and how vadConfidence/vadMinVolume plus the filters decide whether the caller is even heard over the noise.
Barge-in mode shows the bot speaking while the caller tries to cut in. Watch how numberOfWords sets the interruption bar, how an interruption phrase bypasses it, how an acknowledgement never interrupts, and how botSpeechGraceSecs swallows the bot's own echo at the start of its utterance.
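
The two modes above can be sketched as a pair of small predicates. Like the simulator itself, this is a teaching sketch and not the logic in base_service.py: the comparison operators, the order of the checks, the example phrase lists, and every class and function name below are assumptions made for illustration. Smart Turn in particular is reduced to a crude stand-in.

```python
from dataclasses import dataclass


@dataclass
class EndpointingKnobs:
    vad_stop_secs: float = 0.20       # Stop Delay [vadStopSecs]
    smart_turn_enabled: bool = False  # Smart Turn [smartTurnEnabled]


def pause_ends_turn(pause_secs: float, knobs: EndpointingKnobs) -> bool:
    """Assumed rule: a silence at least as long as vad_stop_secs closes the turn.

    Smart Turn is modelled as simply refusing to close the turn on a
    mid-sentence pause; the real feature is not specified in this document.
    """
    if knobs.smart_turn_enabled:
        return False
    return pause_secs >= knobs.vad_stop_secs


@dataclass
class BargeInKnobs:
    number_of_words: int = 1            # Number of words [numberOfWords]
    bot_speech_grace_secs: float = 0.0  # Echo Suppression (grace) [botSpeechGraceSecs]


# Hypothetical phrase lists, for illustration only.
INTERRUPTION_PHRASES = {"stop", "wait"}
ACKNOWLEDGEMENT_PHRASES = {"okay", "uh huh"}


def caller_interrupts(
    phrase: str,
    words_spoken: int,
    barge_in_at_secs: float,
    knobs: BargeInKnobs,
) -> bool:
    """Return True if the caller's speech would interrupt the bot.

    The order of the checks below is an assumption.
    """
    text = phrase.strip().lower()
    # Speech inside the grace window is treated as the bot's own echo.
    if barge_in_at_secs < knobs.bot_speech_grace_secs:
        return False
    # An acknowledgement never interrupts.
    if text in ACKNOWLEDGEMENT_PHRASES:
        return False
    # An interruption phrase bypasses the word-count bar.
    if text in INTERRUPTION_PHRASES:
        return True
    # Otherwise the caller must reach the word-count bar.
    return words_spoken >= knobs.number_of_words
```

With the default scene inputs (a 0.30 s pause against a 0.20 s Stop Delay), pause_ends_turn returns True, which is the cut-off case; enabling Smart Turn or raising Stop Delay above the pause length flips it to False.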

Not in this simulator on purpose: voiceSeconds and backOffSeconds are absent because they are dead knobs (verified: no runtime consumer in pipecat-agent@dev). Including sliders for them would teach a fiction. aicEnabled is present but, as in production, does nothing when ticked.

STT streaming vs batch — what changes

Freya's on-prem STT can run in two modes via Streaming Transcription (sttConfig.additionalSettings.streaming; picks FreyaSTTStreamingService vs the default batch FreyaSTTService, base_service.py:639). It is not a control in this simulator, because flipping it does not change what any slider above does. Here is the honest mapping so you can reason about it.

Every VAD / turn slider behaves identically in both modes. Confidence, Minimum Volume, Start Delay, Stop Delay, Wait seconds, Number of words, interruption/acknowledgement phrases, Smart Turn, and the audio filters are unchanged — Freya runs VAD upstream of STT either way, so every Endpointing and Barge-in visual here is valid for streaming and batch alike.
Low-confidence re-ask only works with streaming. The streaming service emits per-word confidences (word_confidences) that the re-ask gate needs; batch does not. So the whole confidenceReask group is live on streaming and inert on batch. This is the single biggest behavioral difference (see Part 8).
Batch adds latency and leans harder on Turn Stop Timeout. Batch transcribes the whole VAD-delimited segment after the turn closes, so the reply gap is larger and userTurnStopTimeout is more load-bearing as the backstop. The dashboard's own help even says: "Increase for batch STT with higher latency." Streaming transcribes continuously, so it is lower-latency.
Same knobs, slightly different role. In batch, the VAD stop boundary is also where the audio is sliced and handed to STT, so the endpointing knobs additionally decide what gets transcribed. In streaming, STT runs continuously and VAD mainly governs turn-taking.
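
The mapping above can be condensed into a short lookup. This is a hedged sketch, not the code at base_service.py:639: it assumes the sttConfig object arrives as a plain dict, and the function names and the keys of the returned summary are hypothetical. Only the setting path (additionalSettings.streaming) and the two service class names come from the text.

```python
def pick_stt_service(stt_config: dict) -> str:
    """Return the service class name implied by the streaming flag.

    Batch is the default, so a missing or falsy flag selects FreyaSTTService.
    """
    additional = stt_config.get("additionalSettings") or {}
    streaming = bool(additional.get("streaming", False))
    return "FreyaSTTStreamingService" if streaming else "FreyaSTTService"


def stt_mode_effects(stt_config: dict) -> dict:
    """Summarise what the mode changes, per the list above (key names are illustrative)."""
    service = pick_stt_service(stt_config)
    streaming = service == "FreyaSTTStreamingService"
    return {
        "service": service,
        # The re-ask gate needs per-word confidences, which only streaming emits.
        "confidence_reask_live": streaming,
        # Batch transcribes after the turn closes, so it leans on userTurnStopTimeout.
        "leans_on_turn_stop_timeout": not streaming,
        # VAD runs upstream of STT in both modes, so no slider changes behavior.
        "vad_sliders_changed": False,
    }
```

For example, stt_mode_effects({"additionalSettings": {"streaming": True}}) reports the streaming service with the re-ask gate live, while an empty config falls back to batch.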

Bottom line: Streaming vs batch does not move any slider's behavior. It flips two things: (1) whether the low-confidence re-ask gate can fire at all (streaming only), and (2) end-to-end latency / how much you rely on Turn Stop Timeout (batch leans on it more).