Part 6

STT: audio filters (the noise/echo pipeline)

STT › Advanced has four toggles meant to clean the caller's audio before it reaches the speech model, but two of them behave surprisingly at runtime: one is hard-disabled, and one does nothing unless an environment variable is set. This part separates the live filters from the traps.

Read this before promising a customer a fix

Of the four "audio filter" toggles, only WebRTC APM and Smart turn work unconditionally. AIC is read but hard-disabled in code. DeepFilterNet is a no-op unless the deployment has NC_OPT_URL set. Flipping a toggle in the dashboard is not the same as the filter running.

Step 28

AIC filter [dead]

What it is. An "AI noise cancellation" input filter that was meant to sit at audio-in alongside the WebRTC processor. The toggle still exists in the UI and the zod schema, and the agent still reads the value at boot.

sttConfig.additionalSettings.aicEnabled (type: bool, default: false)

Runtime. Read at base_service.py:1395-1408 — but at :1407-1408 the code logs that AIC is hard disabled due to crate instability and never instantiates the filter. The boolean has no effect: checked or unchecked, the audio path is identical.

Hard-disabled at runtime: never promise a fix via this toggle

Even with aicEnabled: true, no AIC filter is ever placed in the chain. The dashboard switch is decorative. Use WebRTC APM (Step 30) for cheap CPU-side cleanup, or DeepFilterNet (Step 29) where a GPU noise-cancel service exists.

Symptom it does NOT fix: "there's background noise" — flipping AIC will appear to do something in the UI and change nothing on the call. Reach for WebRTC APM or DeepFilterNet instead.
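
The read-but-ignored behavior can be pictured with a small Python sketch. This is an illustration under stated assumptions, not the code at base_service.py:1395-1408: `build_aic_filter` is a hypothetical name, and the log wording is invented for the example.

```python
import logging
from typing import Mapping, Optional

logger = logging.getLogger("aic_sketch")


def build_aic_filter(additional_settings: Mapping[str, object]) -> Optional[object]:
    """Hypothetical stand-in for the AIC branch: the flag is read, no filter is built."""
    aic_enabled = bool(additional_settings.get("aicEnabled", False))
    # The value is read and reported, but nothing downstream depends on it.
    logger.info("aic_enabled=%s; AIC is hard disabled, no filter instantiated", aic_enabled)
    return None


# Checked or unchecked, the audio path gets no AIC filter.
assert build_aic_filter({"aicEnabled": True}) is None
assert build_aic_filter({"aicEnabled": False}) is None
```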

Step 29

DeepFilterNet noise cancellation [env-gated]

What it is. A neural denoiser that strips steady and transient background noise. Critically, it is not an audio-processor-chain filter — it does not sit between transport-in and STT. Instead it wraps the VAD analyzer: the Silero VAD is swapped for a DeepFilterNetVADAnalyzer that denoises internally before deciding speech/no-speech.

sttConfig.additionalSettings.deepFilterEnabled (type: bool, default: false, cost: high, GPU + WebSocket)

Runtime. Built in base_service.py:1557-1567. The toggle alone is not enough: the code checks for the NC_OPT_URL environment variable. If it is unset it logs nc_opt_url_set=False and falls back to plain SileroVADAnalyzer — the denoiser never runs. Only when both the toggle is on and NC_OPT_URL points at a running nc-opt service does the VAD become a DeepFilterNetVADAnalyzer.

No-op unless NC_OPT_URL is set

With deepFilterEnabled: true on a deployment that has no NC_OPT_URL, you get plain Silero VAD and zero denoising. It is an env+toggle pair, not a toggle alone. This is the single most common "I turned it on and nothing changed" trap (see the checkpoint at the end of this part).

Symptom it fixes: "the noise behind me makes it mishear / cuts me off" — but only when the env var is wired up.
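
The env+toggle gate described in this step can be sketched as a single decision function. This is a minimal sketch, not the code at base_service.py:1557-1567: `select_vad_analyzer` is a hypothetical helper, the analyzers are represented by their class names as strings, and the environment is passed in as a mapping so the logic can be exercised without touching the real process environment.

```python
import os
from typing import Mapping, Optional


def select_vad_analyzer(
    additional_settings: Mapping[str, object],
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the name of the VAD analyzer that would be used (hypothetical helper)."""
    env = os.environ if env is None else env
    toggle_on = bool(additional_settings.get("deepFilterEnabled", False))
    # An unset or empty NC_OPT_URL both count as "not set" in this sketch.
    nc_opt_url_set = bool(env.get("NC_OPT_URL"))
    if toggle_on and nc_opt_url_set:
        return "DeepFilterNetVADAnalyzer"
    # Either half of the pair missing means plain Silero, with no denoising.
    return "SileroVADAnalyzer"


print(select_vad_analyzer({"deepFilterEnabled": True}, env={}))
```

Only the combination of both conditions yields the denoising analyzer; the sketch does not check that the URL actually points at a running nc-opt service, which the real deployment also needs.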

Step 30

WebRTC Audio Processing (APM) [live]

What it is. The WebRTC Audio Processing Module: noise suppression (NS), automatic gain control (AGC), acoustic echo cancellation (AEC), and a high-pass filter — a cheap, robust, CPU-side cleanup that sits right at audio-in.

sttConfig.additionalSettings.webrtcApmEnabled (type: bool, default: false, cost: low, CPU)

Runtime. The filter class is wired at boot_steps.py:158, placed on the transport's audio-in. The AEC reverse tap — feeding the bot's own TTS output back as the echo reference signal — is added only when echo cancellation is active, at boot_steps.py:2542-2554.

Symptom it fixes: "quiet callers aren't heard" (AGC), "there's echo" (AEC), "constant hiss" (NS) — WebRTC APM addresses all three at once.
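
The two-part wiring (filter at audio-in, plus a conditional reverse tap) can be sketched as a plan builder. This is a rough sketch under assumptions: `plan_apm_wiring` and the `echo_cancellation_active` flag are hypothetical, the stage labels are descriptive strings rather than real class names, and the sketch assumes the reverse tap is only meaningful when the APM filter itself is enabled.

```python
from typing import List


def plan_apm_wiring(webrtc_apm_enabled: bool, echo_cancellation_active: bool) -> List[str]:
    """List the wiring steps that would be applied (hypothetical helper)."""
    wiring: List[str] = []
    if not webrtc_apm_enabled:
        return wiring
    # The filter itself sits on the transport's audio-in.
    wiring.append("audio-in: WebRTC APM filter")
    if echo_cancellation_active:
        # AEC needs the bot's own TTS output as the echo reference signal.
        wiring.append("tts-out: AEC reverse tap (echo reference)")
    return wiring


print(plan_apm_wiring(True, True))
```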

Step 31

Smart turn (AI end-of-turn) [live]

What it is. Replaces pure-silence endpointing with an AI model that predicts when the caller has actually finished, from prosody (intonation, trailing pitch) rather than from a fixed silence timer.

sttConfig.additionalSettings.smartTurnEnabled (type: bool, default: false, cost: medium, per-turn inference)

Runtime. base_service.py:1525-1535 swaps the turn-stop strategy to TurnAnalyzerUserTurnStopStrategy(LocalSmartTurnAnalyzerV3()) instead of the timeout-based SpeechTimeoutUserTurnStopStrategy. It changes when the turn ends, not the audio content.

Symptom it fixes: "it interrupts when I pause to think / read a number off my card."
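
The strategy swap can be sketched as a one-branch selector. This is an illustration only, not the code at base_service.py:1525-1535: `select_turn_stop_strategy` is a hypothetical helper, and the strategies are returned as descriptive strings built from the class names quoted above rather than as real objects.

```python
from typing import Mapping


def select_turn_stop_strategy(additional_settings: Mapping[str, object]) -> str:
    """Return a description of the turn-stop strategy in effect (hypothetical helper)."""
    if bool(additional_settings.get("smartTurnEnabled", False)):
        # AI end-of-turn: a model decides when the caller has finished.
        return "TurnAnalyzerUserTurnStopStrategy(LocalSmartTurnAnalyzerV3())"
    # Default: the turn ends after a fixed stretch of silence.
    return "SpeechTimeoutUserTurnStopStrategy"


print(select_turn_stop_strategy({"smartTurnEnabled": True}))
```

Note that nothing in this branch touches the audio itself, which is why smart turn does not help with noise or echo complaints.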

Ask Claude Code: "In pipecat-agent, show me base_service.py:1395-1408 and confirm whether aicEnabled ever instantiates a filter, and show me base_service.py:1557-1567 and confirm deepFilterEnabled falls back to plain Silero when NC_OPT_URL is unset."
Step 32

Order in the chain & cost/latency tradeoffs

These four toggles do not all live in the same place. WebRTC APM is a true audio-in filter; DeepFilterNet lives inside the VAD; smart turn changes the turn-stop decision after STT; AIC is nowhere (hard-disabled). The effective runtime order is:

audio-in
   |
   v
[ WebRTC APM ]       (NS + AGC + AEC + high-pass, CPU, boot_steps.py:158)
   |
   v
[ DeepFilterNet ]    (VAD-internal denoise; ONLY if NC_OPT_URL set, base_service.py:1557-1567)
   |
   v
[ Silero VAD ]       (speech / no-speech)
   |
   v
[ STT ]              (speech -> text)
   |
   v
[ smart-turn stop ]  (AI end-of-turn, base_service.py:1525-1535)

(AIC: never in the chain -- hard-disabled, base_service.py:1407-1408)
(Diagram shows the fully-enabled path; the DeepFilterNet box only exists when NC_OPT_URL is set -- try the live visualizer below.)

Filter         | Cost                         | Enable when
WebRTC APM     | low (CPU)                    | almost always for telephony noise/echo; start here
DeepFilterNet  | high (GPU, WebSocket)        | heavy noise and the nc-opt service (NC_OPT_URL) is deployed
AIC            | n/a                          | dead; skip (hard-disabled in code)
Smart turn     | medium (per-turn inference)  | pause-heavy callers, clean-ish audio; off when latency-critical

Try it — filter chain visualizer

Toggle filters and the env var; the chain redraws faithfully (AIC is never active, DeepFilterNet is a no-op without NC_OPT_URL).
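
As a static stand-in, the toggle-to-chain mapping can be sketched in Python. This is a minimal sketch under stated assumptions, not the agent's code: `effective_chain` and `render_chain` are hypothetical helpers, stage names are plain strings, and the VAD stage follows the prose in Step 29 (the analyzer is swapped) rather than drawing DeepFilterNet as a separate box.

```python
from typing import List, Mapping


def effective_chain(settings: Mapping[str, object], env: Mapping[str, str]) -> List[str]:
    """Compute the stages that would actually run (hypothetical helper)."""
    chain = ["audio-in"]
    if settings.get("webrtcApmEnabled"):
        chain.append("WebRTC APM")
    # aicEnabled is deliberately never consulted: AIC is hard-disabled.
    dfn_live = bool(settings.get("deepFilterEnabled")) and bool(env.get("NC_OPT_URL"))
    chain.append("DeepFilterNetVADAnalyzer" if dfn_live else "SileroVADAnalyzer")
    chain.append("STT")
    chain.append("smart-turn stop" if settings.get("smartTurnEnabled") else "speech-timeout stop")
    return chain


def render_chain(chain: List[str]) -> str:
    """Draw the chain top to bottom in the same ASCII style as the diagram."""
    return "\n   |\n   v\n".join(f"[ {stage} ]" for stage in chain)


all_on = {
    "aicEnabled": True,
    "deepFilterEnabled": True,
    "webrtcApmEnabled": True,
    "smartTurnEnabled": True,
}
# Every toggle on, but no NC_OPT_URL: the VAD stays plain Silero and AIC is absent.
print(render_chain(effective_chain(all_on, env={})))
```

Passing `env={"NC_OPT_URL": "..."}` with the same settings is the only change that turns the VAD stage into the denoising analyzer.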
Step 33

Noisy-call-center recipe

For a loud bank call center, layer cheap-to-expensive:

Try it — cost/latency picker

Pick a scenario; get a recommended filter set with one-line rationale each.
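
A picker of this kind can be approximated from the cost table in Step 32 alone. The sketch below is an assumption-laden illustration: `recommend_filters` and its four boolean inputs are hypothetical, the rationales paraphrase the table's "Enable when" column, and the "clean-ish audio" condition for smart turn is left to the operator's judgment rather than modeled.

```python
from typing import Dict


def recommend_filters(
    heavy_noise: bool,
    nc_opt_deployed: bool,
    pause_heavy: bool,
    latency_critical: bool,
) -> Dict[str, str]:
    """Map a scenario to toggles worth enabling, cheapest first (hypothetical helper)."""
    # WebRTC APM is the low-cost starting point for telephony noise and echo.
    rec = {"webrtcApmEnabled": "low CPU cost; start here for telephony noise/echo"}
    if heavy_noise and nc_opt_deployed:
        rec["deepFilterEnabled"] = "heavy noise and the nc-opt service (NC_OPT_URL) is deployed"
    if pause_heavy and not latency_critical:
        rec["smartTurnEnabled"] = "pause-heavy callers; per-turn inference cost is acceptable"
    # aicEnabled is never recommended: it is hard-disabled in code.
    return rec


for toggle, why in recommend_filters(True, False, True, False).items():
    print(f"{toggle}: {why}")
```

With heavy noise but no nc-opt service deployed, the sketch leaves DeepFilterNet out, since enabling the toggle there would be a silent no-op.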
Ask Claude Code: "For the Garanti on-prem deployment, is NC_OPT_URL set in the agent env, and is deepFilterEnabled true in the published STT config? If the toggle is on but the env var is missing, DeepFilterNet is silently a no-op."
Checkpoint: you enable DeepFilterNet for a noisy customer and nothing improves. First thing to check?

Whether NC_OPT_URL is set on that deployment. The toggle is a no-op without it (base_service.py:1557-1567 logs nc_opt_url_set=False and falls back to plain Silero VAD). It is an env+toggle pair, not a toggle alone.