Voice Input

The Amurg UI supports voice dictation with two backends. The default mode uses the browser's built-in Web Speech API and requires no setup. For private, offline speech recognition, you can connect a local Whisper server.

Voice Modes

| Mode | Backend | Setup | Privacy |
| --- | --- | --- | --- |
| Browser (default) | Web Speech API (Chrome, Edge, Safari) | None; works out of the box | Audio may be sent to the browser vendor's cloud service |
| Local Whisper | Self-hosted Whisper ASR server via WebSocket | Run a Whisper server and configure its URL in settings | Fully local; audio never leaves your machine |

Switch modes via the gear icon next to the microphone button. Settings are saved in localStorage under the key amurg-voice.

How to Use

The microphone button supports two interaction styles:

| Gesture | Action |
| --- | --- |
| Hold to talk | Press and hold the mic button for more than 200ms. Recording stops when you release. |
| Tap to toggle | Quick-tap (under 200ms) to start recording, tap again to stop. Useful on mobile. |
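The two gestures can be told apart by measuring how long the button was held. The following is a minimal sketch, not Amurg's actual implementation: the class name MicGestureController and its methods are hypothetical, and it assumes recording starts on press so that hold-to-talk captures audio immediately. The source does not say how a press of exactly 200ms is treated; here it counts as a tap.

```typescript
// Threshold from the table above: presses longer than this count as "hold".
const HOLD_THRESHOLD_MS = 200;

type MicAction = "start" | "stop" | "none";

// Hypothetical controller; names are illustrative only.
class MicGestureController {
  private recording = false;
  private pressedAt: number | null = null;
  private wasRecordingAtPress = false;

  // Call on pointer down with a millisecond timestamp.
  press(nowMs: number): MicAction {
    this.pressedAt = nowMs;
    this.wasRecordingAtPress = this.recording;
    if (this.recording) {
      // A tap-to-toggle session is already running; decide on release.
      return "none";
    }
    // Assumption: start right away so a hold records from the first moment.
    this.recording = true;
    return "start";
  }

  // Call on pointer up with a millisecond timestamp.
  release(nowMs: number): MicAction {
    if (this.pressedAt === null) return "none";
    const heldMs = nowMs - this.pressedAt;
    this.pressedAt = null;

    if (this.wasRecordingAtPress) {
      // Second tap of a toggle session: stop.
      this.recording = false;
      return "stop";
    }
    if (heldMs > HOLD_THRESHOLD_MS) {
      // Hold to talk: releasing ends the recording.
      this.recording = false;
      return "stop";
    }
    // Quick tap: keep recording until the next tap.
    return "none";
  }

  isRecording(): boolean {
    return this.recording;
  }
}
```

Timestamps are passed in rather than read from a clock, which keeps the logic easy to test without timers.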

Edit before send

Transcribed text is appended to the message input field and is never sent automatically. You can review, edit, or add to it before pressing send. While you speak, a real-time interim preview is shown above the input field.

Visual Feedback

| Indicator | Meaning |
| --- | --- |
| Red mic button | Recording is active |
| Pulsing ring around button | Audio level visualization (scales with input volume) |
| Italic text above input | Interim transcription (partial, live as you speak) |
| Green ring on input field | Final transcription received (flashes briefly) |

Browser Mode

The default mode uses the browser's SpeechRecognition API, falling back to the prefixed webkitSpeechRecognition where that is the only name exposed (for example, Safari). It requires no server or configuration.

| Browser | Support |
| --- | --- |
| Chrome / Edge | Full support |
| Safari (iOS 14.5+, macOS) | Full support |
| Firefox | Not supported (mic button hidden) |

Language Detection

The browser mode uses navigator.language for speech recognition language, falling back to en-US. This means it automatically matches your browser's language setting.
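The feature detection and language fallback described in this section can be sketched as follows. This is an illustration under assumptions, not the UI's actual code: the function names and the SpeechGlobals and RecognitionLike types are hypothetical, and the interfaces only mirror the parts of the Web Speech API used here so that the sketch does not depend on DOM typings.

```typescript
// Minimal structural types covering the SpeechRecognition properties used below.
interface RecognitionLike {
  lang: string;
  interimResults: boolean;
}
type RecognitionCtor = new () => RecognitionLike;

// The two global names a browser may expose.
interface SpeechGlobals {
  SpeechRecognition?: RecognitionCtor;
  webkitSpeechRecognition?: RecognitionCtor;
}

// Prefer the standard name, then the prefixed one; null means unsupported.
function findRecognitionCtor(globals: SpeechGlobals): RecognitionCtor | null {
  return globals.SpeechRecognition ?? globals.webkitSpeechRecognition ?? null;
}

// Use the browser language, falling back to en-US when it is missing or empty.
function resolveLanguage(navigatorLanguage: string | undefined): string {
  return navigatorLanguage || "en-US";
}

function createRecognizer(
  globals: SpeechGlobals,
  navigatorLanguage: string | undefined,
): RecognitionLike | null {
  const Ctor = findRecognitionCtor(globals);
  if (Ctor === null) {
    // Unsupported browser (e.g. Firefox): the caller would hide the mic button.
    return null;
  }
  const recognizer = new Ctor();
  recognizer.lang = resolveLanguage(navigatorLanguage);
  // Assumption: interim results are enabled to drive the live preview.
  recognizer.interimResults = true;
  return recognizer;
}
```

In a browser, the first argument would be the window object (cast to SpeechGlobals) and the second navigator.language.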

Local Whisper Mode

For private, offline speech recognition, you can run a Whisper-compatible ASR server and point the UI at it. As long as that server runs on your own machine, audio never leaves it.

Setup

  1. Run a Whisper ASR server that accepts WebSocket connections and receives audio/webm chunks.
  2. Click the gear icon next to the microphone button in the Amurg UI.
  3. Select Local Whisper.
  4. Enter the WebSocket URL (e.g. ws://localhost:8000/asr).

Protocol

The UI streams audio to the Whisper server in 250ms chunks using MediaRecorder with audio/webm;codecs=opus format. The server is expected to respond with JSON messages containing transcription results.
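A rough sketch of the streaming side is shown below. It is not the UI's actual code: streamChunks, RecorderLike, and SocketLike are hypothetical names, and the two interfaces only mirror the parts of MediaRecorder and WebSocket that the sketch relies on, so it can be exercised with fakes outside a browser.

```typescript
// Chunk length from the text above.
const CHUNK_MS = 250;
// Numeric value of WebSocket.OPEN.
const SOCKET_OPEN = 1;

// Mirrors the parts of MediaRecorder used here.
interface RecorderLike {
  ondataavailable: ((event: { data: { size: number } }) => void) | null;
  start(timesliceMs?: number): void;
  stop(): void;
}

// Mirrors the parts of WebSocket used here.
interface SocketLike {
  readyState: number;
  send(data: unknown): void;
}

// Forwards every non-empty chunk to the socket and returns a stop function.
function streamChunks(recorder: RecorderLike, socket: SocketLike): () => void {
  recorder.ondataavailable = (event) => {
    // Skip empty chunks and avoid sending on a closed or connecting socket.
    if (event.data.size > 0 && socket.readyState === SOCKET_OPEN) {
      socket.send(event.data);
    }
  };
  // Passing a timeslice makes the recorder emit a chunk roughly every CHUNK_MS.
  recorder.start(CHUNK_MS);
  return () => recorder.stop();
}

// In a browser, the recorder would be created along these lines:
//   const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
//   const recorder = new MediaRecorder(stream, { mimeType: "audio/webm;codecs=opus" });
//   const socket = new WebSocket("ws://localhost:8000/asr");
```

Dropping chunks while the socket is not open is a design choice of this sketch; buffering them until the connection opens would be an alternative.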

Expected server responses

The UI looks for a text or transcript field in the JSON response. It also recognizes partial/interim results via buffer, segments, is_final, and type: "partial" fields.

// Partial transcription (shown as interim preview)
{"type": "partial", "text": "hello wor"}

// Final transcription (appended to input field)
{"text": "hello world", "is_final": true}

// Alternative field names also accepted
{"transcript": "hello world"}
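To illustrate how the three message shapes above could be interpreted, here is a hedged sketch of a parser. The function name parseWhisperMessage and the Transcription type are hypothetical. It assumes that a message without type or is_final is treated as final, and it leaves out the buffer and segments fields because their shape is not specified here.

```typescript
interface Transcription {
  text: string;
  isFinal: boolean;
}

// Parses one WebSocket text frame; returns null if no transcription is found.
function parseWhisperMessage(raw: string): Transcription | null {
  let msg: unknown;
  try {
    msg = JSON.parse(raw);
  } catch {
    return null; // not valid JSON
  }
  if (typeof msg !== "object" || msg === null) return null;
  const fields = msg as Record<string, unknown>;

  // Accept either field name, preferring "text".
  const text =
    typeof fields.text === "string"
      ? fields.text
      : typeof fields.transcript === "string"
        ? fields.transcript
        : null;
  if (text === null) return null;

  // An explicit partial marker wins over everything else.
  if (fields.type === "partial") return { text, isFinal: false };
  // Otherwise honor is_final when the server provides it.
  if (typeof fields.is_final === "boolean") {
    return { text, isFinal: fields.is_final };
  }
  // Assumption: messages with no marker are treated as final.
  return { text, isFinal: true };
}
```

A caller would show results with isFinal set to false as the interim preview and append the rest to the input field.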

Compatible Servers

Any Whisper ASR server that accepts WebSocket audio streaming and returns JSON with a text or transcript field will work. Popular options include whisper_streaming and WhisperLive.

Settings Storage

Voice settings are persisted in localStorage under the key amurg-voice as a JSON object:

{
  "mode": "browser",
  "whisperUrl": "ws://localhost:8000/asr"
}

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| mode | string | "browser" | "browser" or "whisper" |
| whisperUrl | string | "" | WebSocket URL for the Whisper server |
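Reading and writing this object could look like the sketch below. The key and the defaults come from this section; the function names and the StorageLike interface are hypothetical. StorageLike mirrors the two localStorage methods used, so the code also runs with an in-memory stand-in. Falling back to defaults on malformed data is an assumption, not documented behavior.

```typescript
interface VoiceSettings {
  mode: "browser" | "whisper";
  whisperUrl: string;
}

const STORAGE_KEY = "amurg-voice";
const DEFAULTS: VoiceSettings = { mode: "browser", whisperUrl: "" };

// Mirrors the parts of the Web Storage API used here.
interface StorageLike {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Loads settings, substituting defaults for missing or invalid values.
function loadVoiceSettings(storage: StorageLike): VoiceSettings {
  const raw = storage.getItem(STORAGE_KEY);
  if (raw === null) return { ...DEFAULTS };
  try {
    const parsed = JSON.parse(raw) as Record<string, unknown> | null;
    const mode = parsed?.mode;
    const url = parsed?.whisperUrl;
    return {
      mode: mode === "whisper" ? "whisper" : "browser",
      whisperUrl: typeof url === "string" ? url : "",
    };
  } catch {
    // Assumption: corrupt JSON is treated like no saved settings.
    return { ...DEFAULTS };
  }
}

function saveVoiceSettings(storage: StorageLike, settings: VoiceSettings): void {
  storage.setItem(STORAGE_KEY, JSON.stringify(settings));
}
```

In a browser, window.localStorage would be passed as the storage argument.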

Troubleshooting

| Problem | Solution |
| --- | --- |
| No microphone button visible | Your browser does not support the Web Speech API (e.g. Firefox). Switch to Chrome, Edge, or Safari. |
| "Microphone access denied" toast | Grant microphone permission in your browser settings. On mobile, check app-level permissions too. |
| Recognition stops after ~60 seconds | Some mobile browsers kill long-running speech recognition. The UI auto-restarts it. Tap the mic again if needed. |
| Whisper mode shows no transcription | Check that the Whisper server is running and the WebSocket URL is correct. Open browser dev tools to inspect WebSocket frames. |
| Whisper mode: "Connection failed" | Ensure the Whisper server accepts WebSocket connections. If the UI is served over HTTPS, the Whisper URL must use wss://, because browsers block mixed content. |
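The URL problems listed above can be caught before a connection attempt. The sketch below is only an illustration: checkWhisperUrl is a hypothetical helper, and it applies the mixed-content rule exactly as stated in the table, returning a short description of the problem or null when none is found.

```typescript
// pageProtocol is the protocol of the page hosting the UI, e.g. "https:".
function checkWhisperUrl(pageProtocol: string, whisperUrl: string): string | null {
  let parsed: URL;
  try {
    parsed = new URL(whisperUrl);
  } catch {
    return "not a valid URL";
  }
  // Only WebSocket schemes make sense for the Whisper server.
  if (parsed.protocol !== "ws:" && parsed.protocol !== "wss:") {
    return "URL must start with ws:// or wss://";
  }
  // Rule from the table: an HTTPS page needs a secure WebSocket.
  if (pageProtocol === "https:" && parsed.protocol === "ws:") {
    return "page is served over HTTPS, so the URL must use wss://";
  }
  return null;
}
```

In a browser, window.location.protocol would supply the first argument.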