Whatomate supports WhatsApp voice calling with WebRTC-based audio bridging. Incoming calls are handled by an Interactive Voice Response (IVR) system that plays greetings, collects DTMF input, and routes callers to agent teams. Agents answer calls from the browser — no phone hardware required.
IVR Menus
Multi-level menus with audio greetings and DTMF digit routing
Call Transfers
Route callers to agent teams with hold music
Call Recording
Record agent-caller audio as OGG/Opus, stored in S3
Outgoing Calls
Agents can place outbound calls to contacts from the chat view
IVR flows are configured from Settings > IVR Flows in the admin UI.
Each flow has:
Each menu level has:
| Action | Description |
|---|---|
| Transfer | Route the caller to a team queue. An agent from the team picks up the call. |
| Submenu | Navigate to a nested menu with its own greeting and options. |
| Go to flow | Jump to a different IVR flow (for modular menu trees). |
| Go back | Return to the parent menu. |
| Repeat | Replay the current menu greeting. |
| Hang up | Terminate the call. |
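As a rough illustration of how the actions in the table combine, the sketch below models a menu tree and resolves a sequence of DTMF digits against it. It is not Whatomate's implementation: the `Option` and `Menu` classes, the `route` function, the action identifiers, and the choice to ignore unmapped digits are all assumptions made for this example. The sample tree mirrors the Acme example in this section.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Option:
    # Hypothetical action identifiers standing in for the table's actions:
    # "transfer", "submenu", "goto_flow", "go_back", "repeat", "hangup"
    action: str
    target: Optional[str] = None  # team, submenu, or flow name (hypothetical field)


@dataclass
class Menu:
    name: str
    greeting: str
    options: Dict[str, Option] = field(default_factory=dict)
    submenus: Dict[str, "Menu"] = field(default_factory=dict)


def route(root: Menu, digits: str) -> Tuple[str, Optional[str]]:
    """Walk the menu tree for a sequence of DTMF digits.

    Returns (outcome, target). The outcome is "transfer", "goto_flow",
    "hangup", or "waiting" when the digits run out before a final action;
    in the "waiting" case the target is the name of the current menu.
    """
    stack = [root]  # path from the main menu down to the current menu
    for digit in digits:
        menu = stack[-1]
        option = menu.options.get(digit)
        if option is None or option.action == "repeat":
            # Assumption: an unmapped digit is treated like Repeat,
            # so the caller stays on the same menu.
            continue
        if option.action == "submenu":
            stack.append(menu.submenus[option.target])
        elif option.action == "go_back":
            if len(stack) > 1:
                stack.pop()
        elif option.action in ("transfer", "goto_flow"):
            return option.action, option.target
        elif option.action == "hangup":
            return "hangup", None
    return "waiting", stack[-1].name


support = Menu("Support", "Support menu", options={
    "1": Option("transfer", "Technical Support"),
    "2": Option("transfer", "Billing"),
    "9": Option("go_back"),
})
main = Menu("Main Menu", "Welcome to Acme Corp", options={
    "1": Option("transfer", "Sales"),
    "2": Option("submenu", "Support"),
    "0": Option("hangup"),
}, submenus={"Support": support})

print(route(main, "21"))   # ('transfer', 'Technical Support')
print(route(main, "291"))  # ('transfer', 'Sales')
```

Keeping the path as a stack makes "Go back" a simple pop, and a "Go to flow" result can be handled by calling `route` again with the root menu of the target flow.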
Example menu tree:

    Main Menu (greeting: "Welcome to Acme Corp")
      Press 1 → Transfer to "Sales" team
      Press 2 → Submenu: "Support"
        Press 1 → Transfer to "Technical Support" team
        Press 2 → Transfer to "Billing" team
        Press 9 → Go back to Main Menu
      Press 0 → Hang up

When a caller selects a transfer action:
When enabled, calls are recorded during the agent-caller bridge phase. Recordings are saved as OGG/Opus files and uploaded to S3.
To enable, add to your config.toml:

    [calling]
    recording_enabled = true

    [storage]
    s3_bucket = "your-bucket"
    s3_region = "us-east-1"
    s3_key = "AKIA..."
    s3_secret = "..."

Recordings are accessible from the call log detail view, which generates time-limited presigned URLs for playback.
All calls (incoming and outgoing) are logged with:
Filter logs by status, direction, account, or IVR flow.
Add to your config.toml:

    [calling]
    audio_dir = "./audio"                # Directory for IVR audio files
    hold_music_file = "hold-music.ogg"   # Hold music file (relative to audio_dir)
    ringback_file = "ringback.ogg"       # Ringback tone for outgoing calls
    max_call_duration = 3600             # Max call duration in seconds
    transfer_timeout_secs = 120          # Seconds to wait for agent to accept
    recording_enabled = false            # Enable call recording to S3
    udp_port_min = 10000                 # WebRTC UDP port range start
    udp_port_max = 10100                 # WebRTC UDP port range end
    public_ip = ""                       # Public IP for NAT (required on cloud/AWS)
    relay_only = false                   # Force all media through TURN relay

    # ICE servers (STUN/TURN) for WebRTC connectivity
    [[calling.ice_servers]]
    urls = ["stun:stun.l.google.com:19302"]

    [[calling.ice_servers]]
    urls = ["turn:your-turn-server:3478"]
    username = "user"
    credential = "pass"

Calling is enabled per-organization in the database. Set calling_enabled = true on the organization record to allow calls for that org.
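Several of the `[calling]` settings constrain each other, so it can help to sanity-check them before starting the server. The following is a minimal sketch, assuming the TOML has already been parsed into a plain dictionary. The function name `check_calling_config` and the specific checks are illustrative assumptions, and Whatomate's own validation may differ. The fallback port values are taken from the example config and are not confirmed defaults.

```python
import ipaddress
from typing import Any, Dict, List


def check_calling_config(calling: Dict[str, Any]) -> List[str]:
    """Return human-readable problems found in a parsed [calling] table."""
    problems: List[str] = []

    # Assumption: fall back to the values shown in the example config.
    lo = calling.get("udp_port_min", 10000)
    hi = calling.get("udp_port_max", 10100)
    if not (1 <= lo <= hi <= 65535):
        problems.append(f"invalid UDP port range {lo}-{hi}")

    for key in ("max_call_duration", "transfer_timeout_secs"):
        value = calling.get(key)
        if value is not None and value <= 0:
            problems.append(f"{key} must be positive, got {value}")

    public_ip = calling.get("public_ip", "")
    if public_ip:
        try:
            ipaddress.ip_address(public_ip)
        except ValueError:
            problems.append(f"public_ip is not a valid IP address: {public_ip!r}")

    has_turn = False
    for i, server in enumerate(calling.get("ice_servers", [])):
        urls = server.get("urls", [])
        if not urls:
            problems.append(f"ice_servers[{i}] has no urls")
        for url in urls:
            if url.startswith(("turn:", "turns:")):
                has_turn = True
                # Assumption: TURN entries always carry credentials.
                if not server.get("username") or not server.get("credential"):
                    problems.append(f"ice_servers[{i}]: {url} needs username and credential")
            elif not url.startswith(("stun:", "stuns:")):
                problems.append(f"ice_servers[{i}]: unrecognized URL scheme in {url}")

    if calling.get("relay_only") and not has_turn:
        problems.append("relay_only is set but no TURN server is configured")
    return problems


sample = {
    "udp_port_min": 10000,
    "udp_port_max": 10100,
    "max_call_duration": 3600,
    "transfer_timeout_secs": 120,
    "public_ip": "",
    "relay_only": False,
    "ice_servers": [
        {"urls": ["stun:stun.l.google.com:19302"]},
        {"urls": ["turn:your-turn-server:3478"], "username": "user", "credential": "pass"},
    ],
}
print(check_calling_config(sample))  # []
```

Returning a list of messages rather than raising on the first failure lets an operator see every problem in one pass.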
Whatomate uses Piper for offline text-to-speech generation. When admins type greeting text in the IVR flow editor, the server generates OGG/Opus audio files using Piper + opusenc. This is optional — you can also upload pre-recorded audio files directly.
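The two-step pipeline described above (Piper writes a WAV file, then opusenc converts it to OGG/Opus) can be sketched as follows. This is an illustration, not the server's code: the function names are hypothetical, and the flags are the ones used in the manual test commands later on this page (`--model`, `--output_file`, and `opusenc --bitrate 24`).

```python
import subprocess
from pathlib import Path
from typing import List, Tuple


def build_tts_commands(model: str, wav_path: str, ogg_path: str,
                       piper_binary: str = "piper",
                       opusenc_binary: str = "opusenc",
                       bitrate_kbps: int = 24) -> Tuple[List[str], List[str]]:
    """Build the argument lists for the Piper and opusenc steps."""
    piper_cmd = [piper_binary, "--model", model, "--output_file", wav_path]
    opusenc_cmd = [opusenc_binary, "--bitrate", str(bitrate_kbps), wav_path, ogg_path]
    return piper_cmd, opusenc_cmd


def synthesize_greeting(text: str, model: str, ogg_path: str, **kwargs) -> Path:
    """Run both steps; requires piper and opusenc to be installed."""
    wav_path = str(Path(ogg_path).with_suffix(".wav"))
    piper_cmd, opusenc_cmd = build_tts_commands(model, wav_path, ogg_path, **kwargs)
    # Piper reads the text to speak from standard input.
    subprocess.run(piper_cmd, input=text.encode("utf-8"), check=True)
    subprocess.run(opusenc_cmd, check=True)
    Path(wav_path).unlink()  # remove the intermediate WAV file
    return Path(ogg_path)
```

Passing argument lists to `subprocess.run` instead of a shell string avoids quoting problems when the greeting text or file paths contain spaces or quotes.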
Piper requires the espeak-ng shared library at runtime, and opusenc is needed to convert WAV output to OGG/Opus:

    # Debian/Ubuntu
    sudo apt install espeak-ng opus-tools

    # Fedora
    sudo dnf install espeak-ng opus-tools

    # Download Piper binary (Linux x86_64)
    wget https://github.com/rhasspy/piper/releases/download/2023.11.14-2/piper_linux_x86_64.tar.gz
    tar xf piper_linux_x86_64.tar.gz
    sudo mv piper/piper /usr/local/bin/

Piper voices are available at huggingface.co/rhasspy/piper-voices (mirrors at OHF-Voice). Each voice has a .onnx model file and a .onnx.json config file; both are required.
Choosing a voice:

- Quality levels are low, medium, and high; medium is a good balance of quality and speed.
- en_US-lessac-medium is recommended (~60MB).

Create a model directory and download the voice files:

    mkdir -p /opt/piper/models

    # Download model and config
    wget https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/lessac/medium/en_US-lessac-medium.onnx \
      -O /opt/piper/models/en_US-lessac-medium.onnx
    wget https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/lessac/medium/en_US-lessac-medium.onnx.json \
      -O /opt/piper/models/en_US-lessac-medium.onnx.json

Add to your config.toml:

    [tts]
    piper_binary = "/usr/local/bin/piper"
    piper_model = "/opt/piper/models/en_US-lessac-medium.onnx"
    # opusenc_binary = "opusenc"  # defaults to finding in PATH

To test the setup from the command line:

    echo "Press 1 for sales, press 2 for support." | piper \
      --model /opt/piper/models/en_US-lessac-medium.onnx \
      --output_file test.wav
    opusenc --bitrate 24 test.wav test.ogg
    # Play: aplay test.wav OR ffplay test.ogg

For WebRTC to work, ensure the following ports are open:
| Port | Protocol | Purpose |
|---|---|---|
| 10000–10100 | UDP | WebRTC media (configurable via udp_port_min/udp_port_max) |
| 3478 | TCP/UDP | TURN server (if using relay) |
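Before opening the media range in a firewall, it can be useful to confirm that nothing else on the host already occupies those UDP ports. The sketch below tries to bind each port in a range and reports the ones that fail. The function name `unavailable_udp_ports` is hypothetical. The check covers local availability only, not reachability from outside, and should be run while the server is stopped, since ports the server itself has bound will show up as unavailable.

```python
import socket
from typing import List


def unavailable_udp_ports(port_min: int, port_max: int, host: str = "0.0.0.0") -> List[int]:
    """Return the ports in [port_min, port_max] that cannot be bound on this host."""
    busy: List[int] = []
    for port in range(port_min, port_max + 1):
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        try:
            sock.bind((host, port))
        except OSError:
            # Already in use, or not permitted for this user.
            busy.append(port)
        finally:
            sock.close()
    return busy


if __name__ == "__main__":
    # Range taken from udp_port_min / udp_port_max in the example config.
    print(unavailable_udp_ports(10000, 10100))
```

An empty list means every port in the range could be bound at the time of the check.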