"Just Give Me The Audio File"

No AI lock-in. No prompt migration. No vendor TTS.
Just HTTP webhooks with WAV files. You control the AI. We control the phone call.

Already solved AI? Just add phone calls
n8n/Make workflows work as-is
BYOC - keep your carrier
Your SIP Provider (Twilio / Telnyx / Flowroute)
→ SIP URI Route (sip:your-slug@api.monkeydial.com)
→ MonkeyDial (VAD + Conference + Control)
→ Your Webhooks (POST /webhook/audio)

Most Voice AI Platforms Control Your Stack

We just give you the audio. You already solved the AI part.

Typical Voice AI Platform

Call comes in
Their STT (locked in)
Their LLM (prompts don't transfer)
Their TTS (limited voices)
Audio plays back

Can't reuse your existing AI workflows

Prompts are platform-specific

Migrate everything or start over

MonkeyDial

Call comes in
WAV file webhook → your endpoint
Your STT, LLM, TTS (whatever you want)
Send WAV URL back (we play it)
Audio plays back

Use your existing n8n/Make workflows

Prompts work anywhere (it's your code)

Just add 3 nodes to what you have

Our Philosophy

You've already figured out STT, TTS, and LLM prompting. You've got n8n workflows, Make.com scenarios, or custom API integrations that work perfectly.

Why rebuild all that just to add phone calls?

We handle telephony (SIP, VAD, DTMF, conferencing). You handle AI. Simple HTTP webhooks connect the two. That's it.

API Reference

Call control, audio webhooks (WAV segments), VAD modes, and conference bridges

CALL CONTROL
POST /dial Originate outbound call
Request
POST /v1/dial
Authorization: Bearer YOUR_API_KEY

{
  "to": "+13105551234",
  "from": "+17025559999"
}
Response
{
  "success": true,
  "message": "Call initiated successfully",
  "call_uuid": "86fd19ef-d17d-482f-bbda-65287272ac22",
  "details": {
    "to_number": "+13105551234",
    "from_did": "+17025559999",
    "customer_slug": "your-company-abc123"
  },
  "response_time_ms": 154
}
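
A minimal Python sketch of the dial call above, using only the standard library. It assumes the endpoint lives on the same https://api.monkeydial.com host that the cURL examples in this reference use, and that the JSON body is sent with an explicit Content-Type header. The helper names build_dial_request and parse_dial_response are illustrative, not part of any MonkeyDial SDK.

```python
import json
import urllib.request

# Assumption: same host as the cURL examples elsewhere in this reference.
API_BASE = "https://api.monkeydial.com/v1"


def build_dial_request(api_key: str, to: str, from_: str) -> urllib.request.Request:
    """Build (but do not send) the POST /v1/dial request."""
    body = json.dumps({"to": to, "from": from_}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/dial",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            # Assumption: the API expects an explicit JSON content type.
            "Content-Type": "application/json",
        },
    )


def parse_dial_response(raw: bytes) -> str:
    """Return call_uuid from a successful response, or raise on failure."""
    data = json.loads(raw)
    if not data.get("success"):
        raise RuntimeError(data.get("message", "dial failed"))
    return data["call_uuid"]
```

To place the call, pass the request to urllib.request.urlopen and hand the response body to parse_dial_response. Keeping the builder separate from the network call makes it easy to unit-test.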
GET /calls List active calls
Request
GET /v1/calls
Authorization: Bearer YOUR_API_KEY
Response
{
  "success": true,
  "count": 1,
  "calls": [
    {
      "call_id": "bc1473dd-73b9-42c6-8a88-11518ff615c6",
      "status": "active",
      "direction": "outbound",
      "caller_id": "17029860828",
      "caller_name": "unknown",
      "did": "+17029860828",
      "started_at": "2025-11-05 17:00:28"
    }
  ]
}
GET /call/{call_uuid} Get call details
Request
GET /v1/call/{call_uuid}
Authorization: Bearer YOUR_API_KEY
Response
{
  "success": true,
  "call_id": "bc1473dd-73b9-42c6-8a88-11518ff615c6",
  "status": "completed",
  "direction": "outbound",
  "caller_id": "17029860828",
  "caller_name": "unknown",
  "did": "+17029860828",
  "initiated_at": "2025-11-05 17:00:28",
  "hangup_at": "2025-11-05 17:00:30",
  "duration": 2
}
POST /call/{call_uuid}/hangup End active call
Request
POST /v1/call/{call_uuid}/hangup
Authorization: Bearer YOUR_API_KEY
Response
{
  "success": true,
  "message": "Hangup request sent",
  "call_uuid": "86fd19ef-d17d-482f-bbda-65287272ac22"
}
POST /pre-answer/{call_uuid} Control pre-answer screening
Request
POST /v1/pre-answer/{call_uuid}
Authorization: Bearer YOUR_API_KEY

{
  "action": "answer"
}
Response
{
  "success": true,
  "message": "Pre-answer action set to answer",
  "call_uuid": "86fd19ef-d17d-482f-bbda-65287272ac22",
  "action": "answer"
}

action: "answer" (accept the call) or "reject" (hang up without answering). Send this during the pre-answer webhook timeout window, before the call is answered. Only effective for inbound calls with pre-answer screening enabled.

POST /calls/join Bridge two active calls
Request
POST /v1/calls/join
Authorization: Bearer YOUR_API_KEY

{
  "call_ids": [
    "call_550e8400-e29b-41d4-a716-446655440000",
    "call_550e8401-e29c-41d4-a716-446655440001"
  ]
}
Response
{
  "success": true,
  "message": "Bridge established",
  "bridge_type": "same-server",
  "response_time_ms": 45
}

Connect two active calls. Works across servers automatically. Both calls must be active.

POST /calls/unjoin Disconnect bridged calls
Request
POST /v1/calls/unjoin
Authorization: Bearer YOUR_API_KEY

{
  "call_ids": [
    "call_550e8400-e29b-41d4-a716-446655440000",
    "call_550e8401-e29c-41d4-a716-446655440001"
  ]
}
Response
{
  "success": true,
  "message": "Cross-server unjoin complete",
  "unjoin_type": "cross-server",
  "call1": {
    "unjoined": true,
    "status": "active"
  },
  "call2": {
    "unjoined": true,
    "status": "active"
  },
  "response_time_ms": 78
}

Disconnect two bridged calls and return them to AudioSocket. Both calls must be active.

POST /call/{call_uuid}/record/start Start recording call
cURL
curl -X POST https://api.monkeydial.com/v1/call/{call_uuid}/record/start \
  -H "X-API-Key: YOUR_API_KEY"
Node.js
const response = await fetch(
  `https://api.monkeydial.com/v1/call/${callUuid}/record/start`,
  {
    method: 'POST',
    headers: { 'X-API-Key': 'YOUR_API_KEY' }
  }
);
const data = await response.json();
Python
import requests

response = requests.post(
    f"https://api.monkeydial.com/v1/call/{call_uuid}/record/start",
    headers={"X-API-Key": "YOUR_API_KEY"}
)
data = response.json()
Response
{
  "success": true,
  "recording_id": "call_rec_550e8400_2511071430_1",
  "call_uuid": "550e8400-e29b-41d4-a716-446655440000",
  "status": "recording",
  "hub_server": "audiobot-asterisk-1"
}

Starts recording an active call. Multiple recordings can be started per call (sequence numbers: _1, _2, _3). Recording captures both sides (caller + system audio) mixed into a single MP3 file. Recording automatically stops when call ends.

POST /call/{call_uuid}/record/stop Stop recording call
cURL
curl -X POST https://api.monkeydial.com/v1/call/{call_uuid}/record/stop \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"recording_id": "call_rec_550e8400_2511071430_1"}'
Node.js
const response = await fetch(
  `https://api.monkeydial.com/v1/call/${callUuid}/record/stop`,
  {
    method: 'POST',
    headers: {
      'X-API-Key': 'YOUR_API_KEY',
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      recording_id: 'call_rec_550e8400_2511071430_1' // Optional
    })
  }
);
const data = await response.json();
Python
import requests

response = requests.post(
    f"https://api.monkeydial.com/v1/call/{call_uuid}/record/stop",
    headers={"X-API-Key": "YOUR_API_KEY"},
    json={"recording_id": "call_rec_550e8400_2511071430_1"}  # Optional
)
data = response.json()
Response
{
  "success": true,
  "recording_id": "call_rec_550e8400_2511071430_1",
  "status": "processing"
}

Stops active recording and triggers MP3 conversion. If recording_id is not provided, stops the most recent recording. Processing typically completes within 1-2 seconds. Webhook "call_recording_ready" is sent when MP3 is available.

GET /call/recording/{recording_id} Download recording
cURL
curl -X GET https://api.monkeydial.com/v1/call/recording/{recording_id} \
  -H "X-API-Key: YOUR_API_KEY" \
  -o recording.mp3
Node.js
import fs from 'node:fs';

// Fetch the MP3 and write it to disk
const response = await fetch(
  `https://api.monkeydial.com/v1/call/recording/${recordingId}`,
  { headers: { 'X-API-Key': 'YOUR_API_KEY' } }
);
const arrayBuffer = await response.arrayBuffer();
const buffer = Buffer.from(arrayBuffer);
fs.writeFileSync('recording.mp3', buffer);
Python
import requests

response = requests.get(
    f"https://api.monkeydial.com/v1/call/recording/{recording_id}",
    headers={"X-API-Key": "YOUR_API_KEY"}
)
with open('recording.mp3', 'wb') as f:
    f.write(response.content)
Response Headers
Content-Type: audio/mpeg
Content-Disposition: attachment; filename="{recording_id}.mp3"
Content-Length: 1480000

Downloads MP3 recording file. Only available when status is "ready" (check webhook or poll status). Returns 404 if recording is still processing or not found.
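
Because the endpoint answers 404 while a recording is still processing, a client that does not use the webhook may want to retry. The sketch below is one way to do that in Python; the retry count, the delay, and the function names are assumptions for illustration, and a 404 for a recording that genuinely does not exist will simply exhaust the retries.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Optional


def fetch_recording(recording_id: str, api_key: str) -> Optional[bytes]:
    """Return the MP3 bytes, or None if the API answers 404."""
    req = urllib.request.Request(
        f"https://api.monkeydial.com/v1/call/recording/{recording_id}",
        headers={"X-API-Key": api_key},
    )
    try:
        with urllib.request.urlopen(req, timeout=30) as resp:
            return resp.read()
    except urllib.error.HTTPError as err:
        if err.code == 404:  # still processing, or not found
            return None
        raise


def wait_for_recording(
    fetch: Callable[[], Optional[bytes]],
    attempts: int = 5,
    delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Call fetch() until it returns data, sleeping between attempts."""
    for attempt in range(attempts):
        data = fetch()
        if data is not None:
            return data
        if attempt < attempts - 1:
            sleep(delay_s)
    raise TimeoutError(f"recording not ready after {attempts} attempts")
```

A typical call would be wait_for_recording(lambda: fetch_recording(recording_id, api_key)). Where possible, reacting to the call_recording_ready webhook avoids polling altogether.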

WEBHOOK call_recording_ready Recording available for download
Webhook Payload
{
  "event": "call_recording_ready",
  "call_uuid": "550e8400-e29b-41d4-a716-446655440000",
  "recording_id": "call_rec_550e8400_2511071430_1",
  "customer_id": "acme-corp",
  "duration_seconds": 185,
  "file_size_bytes": 1480000,
  "format": "mp3",
  "download_url": "https://api.monkeydial.com/v1/call/recording/call_rec_550e8400_2511071430_1",
  "started_at": "2025-11-07T14:30:00Z",
  "stopped_at": "2025-11-07T14:33:05Z"
}

Sent when MP3 conversion completes and file is ready for download. Triggered after manual stop or automatic stop on call hangup.

WEBHOOK call_start New call initiated
{
  "event": "call_start",
  "call_id": "86fd19ef-d17d-482f-bbda-65287272ac22",
  "customer_id": "acme-corp-x7k2mn",
  "did": "+17029860828",
  "caller_id": "+13105551234",
  "caller_name": "John Doe",
  "direction": "inbound",
  "timestamp": "2025-11-05T19:14:06+00:00"
}
WEBHOOK call_end Call terminated
{
  "event": "call_end",
  "call_uuid": "86fd19ef-d17d-482f-bbda-65287272ac22",
  "customer_id": "acme-corp-001",
  "did": "+17029860828",
  "caller_id": "+13105551234",
  "caller_name": "John Doe",
  "status": "completed",
  "error_reason": null,
  "duration": 120,
  "timestamp": "2025-11-05T10:32:00+00:00"
}

status: "completed" or "failed" | duration: total call duration in seconds
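
The sample payloads in this reference name the call identifier differently per event (call_start uses call_id, call_end uses call_uuid, dtmf uses uuid). A receiving endpoint may therefore want to normalize the field before routing on event. The Python sketch below shows one way; WebhookRouter and extract_call_id are hypothetical helpers, not part of MonkeyDial.

```python
from typing import Any, Callable, Dict, Optional

Payload = Dict[str, Any]

# Field names seen in the sample payloads of this reference.
CALL_ID_KEYS = ("call_uuid", "call_id", "uuid")


def extract_call_id(payload: Payload) -> Optional[str]:
    """Return the call identifier regardless of which key carries it."""
    for key in CALL_ID_KEYS:
        value = payload.get(key)
        if value:
            return str(value)
    return None


class WebhookRouter:
    """Tiny registry mapping the "event" field to a handler function."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[Payload], Any]] = {}

    def on(self, event: str):
        def register(handler: Callable[[Payload], Any]):
            self._handlers[event] = handler
            return handler
        return register

    def dispatch(self, payload: Payload) -> Any:
        handler = self._handlers.get(str(payload.get("event")))
        # Unknown events are ignored rather than treated as errors.
        return handler(payload) if handler else None


router = WebhookRouter()


@router.on("call_end")
def on_call_end(payload: Payload) -> str:
    call_id = extract_call_id(payload)
    if payload.get("status") == "failed":
        return f"{call_id} failed: {payload.get('error_reason')}"
    return f"{call_id} completed after {payload.get('duration')}s"
```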

WEBHOOK call_pre_answer Pre-answer call screening
Webhook Payload
{
  "event": "call_pre_answer",
  "timestamp": "2025-11-10T04:28:56Z",
  "call_uuid": "4ea15232-5384-41dd-9291-805f18995211",
  "customer_slug": "acme-corp-x7k2mn",
  "did": "+17029860828",
  "caller_id": "+14073461117",
  "caller_name": "John Doe",
  "direction": "inbound"
}

Fired before answering inbound calls when pre-answer screening is enabled. Gives you a configurable timeout window (1-60 seconds) to decide whether to accept or reject the call. Use POST /v1/pre-answer/{call_uuid} with {"action": "answer"} or {"action": "reject"} to control the call. If no action is taken within the timeout, the configured default action (answer or reject) is applied. Perfect for spam filtering, business hours validation, or caller screening.
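
As a rough illustration of spam filtering plus business-hours validation, the Python sketch below turns a call_pre_answer payload into the body for POST /v1/pre-answer/{call_uuid}. The blocklist, the 09:00-17:00 window, and the function name screen_call are assumptions made up for this example.

```python
import datetime as dt
from typing import Any, Dict, Iterable


def screen_call(
    payload: Dict[str, Any],
    blocked_callers: Iterable[str],
    now: dt.datetime,
    open_at: dt.time = dt.time(9, 0),    # hypothetical opening time
    close_at: dt.time = dt.time(17, 0),  # hypothetical closing time
) -> Dict[str, str]:
    """Return the JSON body to send to POST /v1/pre-answer/{call_uuid}."""
    if payload.get("event") != "call_pre_answer":
        raise ValueError("not a call_pre_answer payload")
    # Reject known spam numbers first.
    if payload.get("caller_id") in set(blocked_callers):
        return {"action": "reject"}
    # Reject calls outside business hours.
    if not (open_at <= now.time() < close_at):
        return {"action": "reject"}
    return {"action": "answer"}
```

The decision has to reach the API inside the configured timeout window, so the screening logic should stay fast; anything slower can rely on the configured default action.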

VAD AND PLAYBACK
POST /vad/{call_uuid}/playback_audio Upload and play audio (queues)
Request
POST /v1/vad/{call_uuid}/playback_audio
Authorization: Bearer YOUR_API_KEY
Content-Type: multipart/form-data

audio_file: [binary audio file]
Response
{
  "success": true,
  "message": "Audio routed successfully",
  "call_uuid": "call_550e8400-e29b..."
}

Uploads an audio file for playback. Multiple uploads are queued. Supports WAV, MP3, μ-law.

POST /vad/{call_uuid}/playback_audio/replace Replace current playback
Request
POST /v1/vad/{call_uuid}/playback_audio/replace
Authorization: Bearer YOUR_API_KEY
Content-Type: multipart/form-data

audio_file: [binary audio file]
Response
{
  "success": true,
  "message": "Audio routed successfully",
  "call_uuid": "call_550e8400-e29b..."
}

Stops current playback and plays immediately.

POST /vad/{call_uuid}/playback_audio/stop Stop playback
Request
POST /v1/vad/{call_uuid}/playback_audio/stop
Authorization: Bearer YOUR_API_KEY
Response
{
  "success": true,
  "message": "Playback stopped",
  "call_uuid": "call_550e8400..."
}

Stops current playback and clears queue.

POST /vad/{call_uuid}/vad_mode Change voice detection mode
Request
POST /v1/vad/{call_uuid}/vad_mode
Authorization: Bearer YOUR_API_KEY

{
  "vad_mode": "block"
}
block: VAD enabled - capture and send audio segments (default listening mode)
allow: Always capture and send audio (continuous mode)
off: VAD runs but doesn't capture/send audio
interrupt: Stop playback when caller speaks

DTMF detection works in all VAD modes. VAD automatically switches to mode 2 (blocked) during playback to prevent echo.
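
A small Python sketch for switching modes, validating the value client-side against the four modes listed above before building the request. The host is assumed to match the cURL examples in this reference, the Content-Type header is an assumption, and build_vad_mode_request is an illustrative name.

```python
import json
import urllib.request

# The four vad_mode values documented above.
VAD_MODES = ("block", "allow", "off", "interrupt")


def build_vad_mode_request(api_key: str, call_uuid: str, mode: str) -> urllib.request.Request:
    """Build (but do not send) POST /v1/vad/{call_uuid}/vad_mode."""
    if mode not in VAD_MODES:
        raise ValueError(f"unknown vad_mode {mode!r}; expected one of {VAD_MODES}")
    return urllib.request.Request(
        f"https://api.monkeydial.com/v1/vad/{call_uuid}/vad_mode",
        data=json.dumps({"vad_mode": mode}).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",  # assumption
        },
    )
```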

POST /vad/{call_uuid}/record Start timed recording
Request
POST /v1/vad/{call_uuid}/record
Authorization: Bearer YOUR_API_KEY

{
  "duration": 30
}
Response
{
  "success": true,
  "message": "Recording started",
  "uuid": "call_550e8400...",
  "duration": 30
}

Record 1-60 seconds. Webhook sent when complete.

WEBHOOK audio Receive voice segments
You receive
POST https://your-app.com/webhook
Content-Type: application/json

{
  "event": "audio",
  "call_id": "call_550e8400-e29b...",
  "audio_url": "https://storage.md.com/seg_001.wav",
  "duration_ms": 1500,
  "caller_id": "+13105551234",
  "did": "+17025559999"
}

Audio segments delivered in real-time as VAD detects speech. WAV format, 8kHz-48kHz.
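
The round trip (receive a segment, run your own STT, LLM, and TTS, then play the reply) could be wired up roughly as below. This Python sketch keeps every stage as an injected callable, so nothing here is a MonkeyDial or vendor API; the VoicePipeline name and the stage signatures are assumptions. The play stage would typically upload the synthesized audio to POST /v1/vad/{call_uuid}/playback_audio.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class VoicePipeline:
    download: Callable[[str], bytes]     # fetch WAV bytes from audio_url
    transcribe: Callable[[bytes], str]   # your STT
    respond: Callable[[str], str]        # your LLM
    synthesize: Callable[[str], bytes]   # your TTS, returning audio bytes
    play: Callable[[str, bytes], None]   # send audio back to the call

    def handle_audio_event(self, payload: Dict[str, Any]) -> str:
        """Process one "audio" webhook payload and return the reply text."""
        if payload.get("event") != "audio":
            raise ValueError("not an audio payload")
        wav = self.download(payload["audio_url"])
        text = self.transcribe(wav)
        reply = self.respond(text)
        self.play(payload["call_id"], self.synthesize(reply))
        return reply
```

Injecting the stages keeps the handler independent of any particular provider and lets each stage be replaced with a stub in tests.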

WEBHOOK dtmf Touch-tone pressed
{
  "event": "dtmf",
  "uuid": "86fd19ef-d17d-482f-bbda-65287272ac22",
  "digit": "1",
  "customer_id": "acme-corp-001",
  "did": "+17029860828",
  "caller_id": "+13105551234",
  "timestamp": "2025-11-05T10:31:30+00:00"
}

Supports digits 0-9, *, #, A-D
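
A simple IVR menu can be driven from this payload. In the Python sketch below, the MENU mapping and the destination names are invented for illustration; only the set of valid digits comes from the line above.

```python
from typing import Any, Dict, Mapping

# Hypothetical menu: which digit leads where.
MENU = {"1": "sales", "2": "support", "0": "operator"}

# Digits the dtmf webhook can report: 0-9, *, #, A-D.
VALID_DIGITS = set("0123456789*#ABCD")


def route_dtmf(
    payload: Dict[str, Any],
    menu: Mapping[str, str] = MENU,
    default: str = "repeat_menu",
) -> str:
    """Map a dtmf webhook payload to a destination name."""
    if payload.get("event") != "dtmf":
        raise ValueError("not a dtmf payload")
    digit = payload.get("digit", "")
    if digit not in VALID_DIGITS:
        raise ValueError(f"unexpected digit {digit!r}")
    # Valid digits without a menu entry fall back to the default action.
    return menu.get(digit, default)
```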

WEBHOOK recording Recording completed
Multipart/form-data
event: recording
uuid: 86fd19ef-d17d-482f-bbda-65287272ac22
customer_id: acme-corp-001
did: +17029860828
duration: 5
actual_duration: 5.2
partial: false
audio_data: [WAV file - 8kHz, 16-bit, mono]

Sent when timed recording completes. Contains full audio as WAV.
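
Once the audio_data part has been extracted from the multipart body (how you do that depends on your web framework), Python's standard wave module can confirm the file matches the documented 8kHz, 16-bit, mono format before it goes to an STT engine. The function names below are illustrative.

```python
import io
import wave
from typing import Any, Dict


def describe_wav(audio_data: bytes) -> Dict[str, Any]:
    """Read the WAV header and return its basic properties."""
    with wave.open(io.BytesIO(audio_data), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "sample_rate_hz": rate,
            "bits_per_sample": wav.getsampwidth() * 8,
            "channels": wav.getnchannels(),
            "duration_s": frames / rate,
        }


def matches_documented_format(info: Dict[str, Any]) -> bool:
    """True if the file is 8kHz, 16-bit, mono as described above."""
    return (
        info["sample_rate_hz"] == 8000
        and info["bits_per_sample"] == 16
        and info["channels"] == 1
    )
```

Comparing duration_s with the actual_duration field is a cheap sanity check that the upload arrived intact.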

CONFERENCE BRIDGES
POST /conference/create Create conference room
Request
POST /v1/conference/create
Authorization: Bearer YOUR_API_KEY

{
  "name": "support-call-123",
  "webhook_url": "https://yourdomain.com/webhooks/conference"
}
Response
{
  "success": true,
  "conference_id": "conf_550e8400-e29b-41d4-a716-446655440000",
  "name": "support-call-123",
  "created_at": "2025-11-06T21:27:56+00:00"
}
POST /conference/{id}/join Add calls to conference
Request
POST /v1/conference/{conf_id}/join
Authorization: Bearer YOUR_API_KEY

{
  "call_ids": ["call_550e8400-e29b...", "call_abc123..."]
}
Response
{
  "success": true,
  "conference_id": "conf_550e8400-e29b-41d4-a716-446655440000",
  "joined": [
    {"call_id": "call_550e8400-e29b...", "status": "joined"}
  ],
  "failed": [],
  "participant_count": 2
}
GET /conference/{conf_id}/list Get conference details and participants
Request
GET /v1/conference/{conf_id}/list
Authorization: Bearer YOUR_API_KEY
Response
{
  "success": true,
  "conference": {
    "conference_id": "conf_550e8400-e29b...",
    "name": "Sales Call",
    "status": "active",
    "created_at": "2025-11-06T21:27:56Z",
    "max_participants": 10,
    "participant_count": 2,
    "hub": "audiobot-asterisk-2"
  },
  "participants": [
    {
      "call_uuid": "call_abc123",
      "caller_id": "+13105551234",
      "did": "+17025559999",
      "joined_at": "2025-11-06T21:28:15Z",
      "duration_in_conference": 145,
      "is_muted": false
    }
  ]
}

Get conference metadata and real-time participant list with join times and durations. Perfect for dashboards.

POST /conference/{conf_id}/leave/{call_uuid} Remove single call from conference
Request
POST /v1/conference/{conf_id}/leave/{call_uuid}
Authorization: Bearer YOUR_API_KEY

{
  "post_conference_action": "VAD"
}
Response
{
  "success": true,
  "message": "Participant redirected to VAD",
  "call_id": "call_550e8400-e29b-41d4-a716-446655440000",
  "conference_id": "conf_550e8400-e29b-41d4-a716-446655440000",
  "action": "VAD"
}
VAD: Return to voice detection mode (default)
HANGUP: Terminate the call immediately

Remove a single participant from conference. Choose whether to return to VAD mode or hang up.

POST /conference/{conf_id}/kick/{call_uuid} Kick participant (force hangup)
Request
POST /v1/conference/{conf_id}/kick/{call_uuid}
Authorization: Bearer YOUR_API_KEY
Response
{
  "success": true,
  "message": "Participant kicked from conference",
  "call_id": "call_550e8400-e29b-41d4-a716-446655440000",
  "conference_id": "conf_550e8400-e29b-41d4-a716-446655440000"
}

Forcibly remove and hang up a participant. Use /leave to return to VAD instead.

POST /conference/{conf_id}/record/start Start recording conference
POST /v1/conference/{conf_id}/record/start
Headers: x-api-key: YOUR_API_KEY

Response:
{
  "success": true,
  "conference_id": "conf_550e8400-e29b-41d4-a716-446655440000",
  "recording_id": "conf_rec_a1b2c3d4_2511071030",
  "status": "recording",
  "hub_server": "audiobot-asterisk-2"
}

Start recording conference audio. Only one recording per conference at a time. Supports multiple segments (stop/start creates separate files). MP3 format, ~190kbps VBR.

POST /conference/{conf_id}/record/stop Stop recording & convert to MP3
POST /v1/conference/{conf_id}/record/stop
Headers: x-api-key: YOUR_API_KEY

Response:
{
  "success": true,
  "recording_id": "conf_rec_a1b2c3d4_2511071030",
  "conference_id": "conf_550e8400-e29b-41d4-a716-446655440000",
  "status": "ready",
  "file_size_bytes": 1810000,
  "duration_seconds": 1205,
  "format": "mp3"
}

Stop recording and convert WAV to MP3. Triggers conference_recording_ready webhook. File available for download immediately.

GET /conference/recording/{recording_id} Download recording file
GET /v1/conference/recording/{recording_id}
Headers: x-api-key: YOUR_API_KEY

Response:
Content-Type: audio/mpeg
Content-Disposition: attachment; filename="conf_rec_a1b2c3d4_2511071030.mp3"

[Binary MP3 data]

Download conference recording MP3 file. Customer-isolated access. Supports Range requests for partial downloads.
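
Since Range requests are supported, an interrupted download can be resumed rather than restarted. The Python sketch below only builds the request; the host is taken from the download_url shown in the conference_recording_ready payload, and build_range_request is an illustrative name.

```python
import urllib.request
from typing import Optional


def build_range_request(
    recording_id: str,
    api_key: str,
    start: int,
    end: Optional[int] = None,
) -> urllib.request.Request:
    """Build a GET request for bytes start..end (inclusive) of a recording."""
    if start < 0 or (end is not None and end < start):
        raise ValueError("invalid byte range")
    # "bytes=N-" means from N to the end of the file.
    byte_range = f"bytes={start}-" + ("" if end is None else str(end))
    return urllib.request.Request(
        f"https://api.monkeydial.com/v1/conference/recording/{recording_id}",
        headers={"x-api-key": api_key, "Range": byte_range},
    )
```

To resume, set start to the number of bytes already on disk and append the response body to the existing file. A server that honors the range normally answers with status 206.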

POST /conference/{conf_id}/play Play audio to all participants
POST /v1/conference/{conf_id}/play
Headers: x-api-key: YOUR_API_KEY
Content-Type: multipart/form-data

Body:
  file: [audio file - MP3, WAV, OGG, M4A]

Response:
{
  "success": true,
  "playback_id": "conf_playback_690e6ea1261050_72120897",
  "conference_id": "conf_abc123",
  "duration_seconds": 29.57
}

Play audio file into active conference. All participants hear the audio. Supports MP3, WAV, OGG, M4A formats. Auto-converts to Asterisk format (16kHz mono). Temporary channel joins conference, plays audio, then leaves.

WEBHOOK conference_created Conference room created
{
  "event": "conference_created",
  "conference_id": "conf_550e8400-e29b-41d4-a716-446655440000",
  "name": "support-call-123",
  "customer_id": "acme-corp-x7k2mn",
  "max_participants": 10,
  "created_at": "2025-11-06T21:27:56+00:00"
}

Sent when a new conference room is created.

WEBHOOK conference_participant_joined Participant joined
{
  "event": "conference_participant_joined",
  "conference_id": "conf_550e8400-e29b-41d4-a716-446655440000",
  "call_id": "call_abc123",
  "caller_id": "+13105551234",
  "participant_count": 2,
  "timestamp": "2025-11-06T21:28:15+00:00"
}

Sent when a participant joins the conference.

WEBHOOK conference_participant_left Participant left
{
  "event": "conference_participant_left",
  "conference_id": "conf_550e8400-e29b-41d4-a716-446655440000",
  "call_id": "call_abc123",
  "reason": "left",
  "duration_in_conference": 295,
  "participant_count": 1,
  "timestamp": "2025-11-06T21:33:10+00:00"
}
left: Removed via /leave endpoint
kicked: Removed via /kick endpoint
hangup: Call ended naturally
conference_ended: Conference deleted

Sent when a participant leaves the conference. Includes reason and duration.

WEBHOOK conference_ended Conference ended
{
  "event": "conference_ended",
  "conference_id": "conf_550e8400-e29b-41d4-a716-446655440000",
  "reason": "deleted",
  "duration": 610,
  "total_participants": 3,
  "timestamp": "2025-11-06T21:38:06+00:00"
}

Sent when conference is deleted or ends. Includes total duration and participant count.
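
For a dashboard, the four conference lifecycle webhooks above are enough to keep a local participant count without polling the /list endpoint. The Python sketch below is one possible shape; ConferenceTracker is a hypothetical helper, and it trusts the participant_count carried in each payload rather than counting events itself.

```python
from typing import Any, Dict


class ConferenceTracker:
    """Keeps conference_id -> current participant count from webhooks."""

    def __init__(self) -> None:
        self.counts: Dict[str, int] = {}

    def apply(self, payload: Dict[str, Any]) -> None:
        event = payload.get("event")
        conf_id = payload.get("conference_id")
        if conf_id is None:
            return
        if event == "conference_created":
            self.counts[conf_id] = 0
        elif event in ("conference_participant_joined", "conference_participant_left"):
            # The payload already carries the authoritative count.
            self.counts[conf_id] = payload["participant_count"]
        elif event == "conference_ended":
            self.counts.pop(conf_id, None)
```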

WEBHOOK conference_recording_ready Recording converted & ready
{
  "event": "conference_recording_ready",
  "conference_id": "conf_550e8400-e29b-41d4-a716-446655440000",
  "recording_id": "conf_rec_a1b2c3d4_2511071030",
  "customer_id": "acme-corp-x7k2mn",
  "duration_seconds": 1205,
  "file_size_bytes": 1810000,
  "format": "mp3",
  "download_url": "https://api.monkeydial.com/v1/conference/recording/conf_rec_a1b2c3d4_2511071030",
  "started_at": "2025-11-07T10:30:00Z",
  "stopped_at": "2025-11-07T10:50:05Z"
}

Sent after recording is stopped and MP3 conversion completes. File ready for download. Use for archival, transcription, or analytics.

What You Can Build

Inbound + Outbound + VAD + Conference + DTMF = Endless possibilities

Inbound AI Agents

Customer calls your Twilio number
We send audio segments to your webhook
You process with OpenAI/Claude/Groq
Send back TTS audio URL
We play response (with barge-in support)

Warm Transfers

AI answers call
Create conference bridge
Join customer to conference
Dial agent (auto-joins on answer)
Transfer complete - AI drops off
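
The first steps of that warm transfer could be chained as in the Python sketch below. It uses only endpoints from the API reference (conference create, conference join, dial) and takes the HTTP layer as an injected api callable, so the function name, the api signature, and the default conference name are all assumptions. How the agent leg auto-joins on answer is not documented on this page, so the sketch stops after dialing and returns both identifiers.

```python
from typing import Any, Callable, Dict, Optional

# api(method, path, json_body) -> parsed JSON response
ApiCall = Callable[[str, str, Optional[Dict[str, Any]]], Dict[str, Any]]


def start_warm_transfer(
    api: ApiCall,
    customer_call_id: str,
    agent_number: str,
    from_number: str,
    name: str = "warm-transfer",          # hypothetical conference name
    webhook_url: Optional[str] = None,
) -> Dict[str, str]:
    """Create a conference, move the customer in, then dial the agent."""
    create_body: Dict[str, Any] = {"name": name}
    if webhook_url:
        create_body["webhook_url"] = webhook_url
    conf = api("POST", "/v1/conference/create", create_body)
    conf_id = conf["conference_id"]

    # Put the customer's existing call into the conference.
    api("POST", f"/v1/conference/{conf_id}/join", {"call_ids": [customer_call_id]})

    # Dial the human agent; the join mechanism for this leg is not shown here.
    agent = api("POST", "/v1/dial", {"to": agent_number, "from": from_number})
    return {"conference_id": conf_id, "agent_call_uuid": agent["call_uuid"]}
```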

IVR + DTMF

Play menu prompt (VAD: block)
Receive DTMF webhook (customer presses 1)
Route based on selection
Handle with AI or transfer to human

Bring Your Own Carrier

Works with your existing SIP provider

Flowroute (Tested): Route Management → SIP URI
Twilio (Coming Soon): TwiML Bin → SIP URI
Telnyx (Coming Soon): Call Control → SIP Connection
Bandwidth
Any SIP provider with URI routing

How to Connect:

  1. Get your unique SIP URI from MonkeyDial dashboard
  2. Configure inbound route in your SIP provider
  3. Start receiving webhooks

No migration. No downtime. No risk.

What's Included

  • Inbound call handling via SIP URI
  • Outbound dialing (uses your carrier credentials)
  • Real-time audio webhooks (WAV files)
  • VAD modes (block, allow, off, interrupt)
  • Conference bridges
  • DTMF detection
  • Call control (hangup, transfer)

What's Not Included (Yet)

  • Phone number provisioning (use your carrier)
  • SMS/MMS (voice calls only)
  • Call recording downloads (coming soon)
  • Real-time audio streaming (on roadmap)
  • AMD - Answering Machine Detection (on roadmap)

On the Roadmap

  • Enhanced VAD (Gaussian-based, offered as option)
  • Real-time audio streaming (WebSocket)
  • Call recording downloads
  • AMD for outbound campaigns

Frequently Asked Questions

Do I need to switch carriers?
No! Keep your existing Twilio/Telnyx/Flowroute account. MonkeyDial layers on top via SIP URI routing.

What does it cost?
You pay your carrier directly for call costs. MonkeyDial is currently in early access with pricing to be determined.

What if I don't have a SIP provider?
We can recommend one based on your volume and region (we're not affiliated with any provider). Flowroute is our current test carrier.

Can I use more than one carrier?
Yes! Each DID can route from a different provider. Mix and match as needed.

How fast are the audio webhooks?
Audio segments are delivered in real time as VAD detects speech (typically 1-3 seconds after the speaker stops talking). Perfect for conversational AI.

Are call recordings available?
Not yet. Audit logs and webhook history only. Recording downloads coming soon.