Voice WebSocket

Voice sessions use a WebSocket URL returned by POST /sessions. The protocol is based on Pipecat RTVI events.

Text-only sessions use a different transport (HTTP Server-Sent Events) with a simplified event shape; see the Chat API reference for that surface. Event names and payloads do not match between the two transports.

Endpoints

URL	Modality
`wss://api.hyponema.ai/v1/sessions/{session_id}/ws?token=...`	Voice or voice+text
`wss://api.hyponema.ai/v1/sessions/{session_id}/chat?token=...`	Text-only WebSocket

The session_id in the path must match the session returned by POST /sessions. A mismatch closes the connection with code `4001`.
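
As a sanity check before connecting, a client can split the signed URL and compare its session ID with the one it expects. The following is a minimal sketch using only the Python standard library. The helper names `parse_session_url` and `matches_session` are hypothetical, and it assumes the path has exactly the shape shown in the table above.

```python
from urllib.parse import parse_qs, urlsplit


def parse_session_url(url: str) -> dict:
    """Split a session WebSocket URL into session_id, endpoint, and token."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    # Assumed path shape: /v1/sessions/{session_id}/{ws|chat}
    if len(segments) != 4 or segments[:2] != ["v1", "sessions"]:
        raise ValueError(f"unexpected path: {parts.path}")
    if segments[3] not in ("ws", "chat"):
        raise ValueError(f"unexpected endpoint: {segments[3]}")
    token = parse_qs(parts.query).get("token", [""])[0]
    return {"session_id": segments[2], "endpoint": segments[3], "token": token}


def matches_session(url: str, expected_session_id: str) -> bool:
    """Return True when the URL path carries the expected session ID."""
    return parse_session_url(url)["session_id"] == expected_session_id
```

Checking this on the client only catches mistakes early; the server remains the authority on whether the token and session match.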

Client startup

1. Create a session from your backend.
2. Pass the signed URL to the browser or client app.
3. Open the WebSocket.
4. Send `{"type":"client-ready","data":{}}`.
5. Stream audio or text events.
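
The last two steps can be sketched without committing to a particular WebSocket library. In this hedged example, `send` is a hypothetical callable that writes one text frame to an already open socket, and `start_session` is an illustrative name rather than part of any SDK.

```python
import json
from typing import Callable, Iterable


def client_ready_frame() -> str:
    """Serialize the client-ready message exactly as shown in the steps above."""
    return json.dumps({"type": "client-ready", "data": {}}, separators=(",", ":"))


def start_session(send: Callable[[str], None], events: Iterable[dict]) -> int:
    """Send client-ready first, then forward each event; return the event count."""
    send(client_ready_frame())
    count = 0
    for event in events:
        send(json.dumps(event))
        count += 1
    return count
```

With a real client, `send` would typically be the socket's send method. Whether the client should wait for `bot-ready` before streaming is not specified here, so the sketch does not wait.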

Audio format

Use PCM 16-bit signed little-endian mono audio at 16 kHz. Send audio chunks as base64-encoded bytes inside client-audio-data events.

{
  "type": "client-audio-data",
  "data": {
    "audio": "..."
  }
}
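
A minimal sketch of producing these events from raw samples, using only the standard library. The 20 ms chunk duration is an assumption for illustration (no chunk size is specified here), and the function names are hypothetical.

```python
import base64
import json
import struct
from typing import Iterator, Sequence

SAMPLE_RATE_HZ = 16_000  # 16 kHz mono, as required above
BYTES_PER_SAMPLE = 2     # 16-bit signed PCM


def samples_to_pcm(samples: Sequence[int]) -> bytes:
    """Pack integer samples as 16-bit signed little-endian PCM."""
    return struct.pack(f"<{len(samples)}h", *samples)


def audio_events(pcm: bytes, chunk_ms: int = 20) -> Iterator[str]:
    """Yield client-audio-data events, one per chunk of PCM audio.

    chunk_ms is an assumed value; tune it for your latency needs.
    """
    if len(pcm) % BYTES_PER_SAMPLE:
        raise ValueError("PCM byte length must be a multiple of 2")
    chunk_bytes = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * chunk_ms // 1000
    for start in range(0, len(pcm), chunk_bytes):
        chunk = pcm[start:start + chunk_bytes]
        payload = base64.b64encode(chunk).decode("ascii")
        yield json.dumps({"type": "client-audio-data", "data": {"audio": payload}})
```

At 16 kHz and 2 bytes per sample, a 20 ms chunk is 640 bytes before base64 encoding; the final chunk may be shorter.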

Useful server events

Event	Meaning
`bot-ready`	Pipeline initialized.
`bot-llm-text`	Assistant text chunk.
`bot-tts-audio`	TTS audio chunk.
`user-transcription`	STT transcript update.
`function-call`	LLM invoked a tool.
`function-call-result`	Tool result returned.
`metrics-data`	Provider and runtime metrics.
`error`	Runtime error.
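
One way to consume these events is a small dispatcher keyed on the event name. This sketch assumes server events use the same `{"type": ..., "data": ...}` envelope as the client events above, which is not confirmed here; payload fields are passed through untouched because their shapes are not documented in this section. `make_dispatcher` is a hypothetical helper.

```python
import json
from typing import Callable, Dict, Optional

Handler = Callable[[dict], None]


def make_dispatcher(
    handlers: Dict[str, Handler],
    on_unknown: Optional[Handler] = None,
) -> Callable[[str], str]:
    """Build a function that routes one raw JSON frame to its handler."""

    def dispatch(raw: str) -> str:
        event = json.loads(raw)
        event_type = event.get("type", "")
        # Fall back to on_unknown for events without a registered handler.
        handler = handlers.get(event_type, on_unknown)
        if handler is not None:
            handler(event.get("data", {}))
        return event_type

    return dispatch
```

Registering handlers only for the events you care about (for example `bot-llm-text` and `error`) keeps the client tolerant of event types it does not recognize.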

Close codes

Code	Meaning
`4001`	Invalid token, expired token, or session mismatch.
`4003`	Origin denied.
`4404`	Agent, version, persona, user, or session not found.
`4500`	Voice runtime disabled.
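
A client can map close codes to a next action. The policy below is an assumption, not documented behavior: it treats the four listed codes as not worth retrying with the same URL, treats 1000 (the standard WebSocket normal-closure code) as a finished session, and treats anything else as a candidate for reconnecting.

```python
# Application close codes listed in the table above.
DOCUMENTED_CLOSE_CODES = {
    4001: "Invalid token, expired token, or session mismatch.",
    4003: "Origin denied.",
    4404: "Agent, version, persona, user, or session not found.",
    4500: "Voice runtime disabled.",
}

NORMAL_CLOSURE = 1000  # standard WebSocket close code (RFC 6455)


def describe_close(code: int) -> str:
    """Return the documented meaning, or a generic note for unlisted codes."""
    return DOCUMENTED_CLOSE_CODES.get(code, f"Unlisted close code {code}.")


def classify_close(code: int) -> str:
    """Assumed policy: 'done', 'fatal', or 'retry' for a given close code."""
    if code == NORMAL_CLOSURE:
        return "done"
    if code in DOCUMENTED_CLOSE_CODES:
        return "fatal"
    return "retry"
```

For a `4001`, a reasonable follow-up under this policy is to ask your backend for a fresh session URL rather than reconnecting with the old one.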

Reconnect

If the WebSocket drops before a graceful close, reconnect with the same session ID and token within the configured resume window. Hyponema can restore the parked LLM context for that session.
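
A minimal sketch of a reconnect loop bounded by the resume window. The window length is deployment-specific, so `resume_window_s` is a caller-supplied parameter; `connect` is a hypothetical callable that reopens the WebSocket with the same session ID and token and raises `ConnectionError` on failure. The exponential backoff values are illustrative.

```python
import time
from typing import Callable


def reconnect_within_window(
    connect: Callable[[], None],
    resume_window_s: float,
    base_delay_s: float = 0.5,
    max_delay_s: float = 8.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Retry connect() with backoff until it succeeds or the window elapses."""
    deadline = clock() + resume_window_s
    delay = base_delay_s
    while True:
        try:
            connect()
            return True
        except ConnectionError:
            remaining = deadline - clock()
            if remaining <= 0:
                return False  # window elapsed; a new session is needed
            # Never sleep past the deadline.
            sleep(min(delay, remaining))
            delay = min(delay * 2, max_delay_s)
```

The `clock` and `sleep` parameters exist so the loop can be tested without real waiting. When the function returns False, the parked context should be assumed gone and the client should start over with POST /sessions.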