Guide: Multi-Agent Orchestration
Build multi-step workflows with tool use, memory, and reliability controls on top of /v1/chat/completions.
Everything in this guide is optional. If you omit orchestration fields, requests run in default auto mode.
1) How orchestration is activated
The gateway classifies each request from your payload (messages, tools, response format, metadata). You can let this be automatic or provide explicit hints.
Auto mode (recommended first)
{
  "model": "auto",
  "messages": [{"role": "user", "content": "Summarize this report."}]
}
Explicit workflow hint (xantly.workflow_type)
Accepted values:
- single_turn
- execution_task
- multi_step_conversational
- long_horizon_autonomous
- voice_simple
- voice_complex
- creative
"xantly": {
  "workflow_type": "long_horizon_autonomous"
}
If an unrecognized value is supplied, the gateway falls back to automatic classification.
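Because the gateway silently falls back to automatic classification on an unrecognized value, a typo in the hint is easy to miss. The sketch below checks the hint on the client before it is attached. `with_workflow_hint` is a hypothetical helper, not part of the gateway API, and dropping unknown values client-side is a design choice assumed here.

```python
from typing import Optional

# Values listed under "Accepted values" in this guide.
WORKFLOW_TYPES = frozenset({
    "single_turn",
    "execution_task",
    "multi_step_conversational",
    "long_horizon_autonomous",
    "voice_simple",
    "voice_complex",
    "creative",
})


def with_workflow_hint(payload: dict, workflow_type: Optional[str]) -> dict:
    """Return a copy of payload with xantly.workflow_type set if it is recognized.

    Hypothetical helper: an unknown or missing hint is left out, so the
    request relies on automatic classification on purpose.
    """
    result = dict(payload)
    if workflow_type in WORKFLOW_TYPES:
        xantly = dict(result.get("xantly", {}))
        xantly["workflow_type"] = workflow_type
        result["xantly"] = xantly
    return result
```

The input payload is not modified, so the same base request can be reused with different hints.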
2) Chain controls
Continue a chain
Use xantly.chain_id (UUID recommended) to signal continuation:
"xantly": {
  "chain_id": "7f2c8d45-3d7f-4b4b-8d0f-a3e84fd8d6b2"
}
Limit chain depth and runtime
"xantly": {
  "max_chain_steps": 12,
  "chain_timeout_secs": 180
}
Important: these limits are applied only when the resolved workflow is long_horizon_autonomous.
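The chain fields above can be assembled in one place, as in this minimal sketch. `chain_fields` is a hypothetical helper. It assumes the client generates the UUID for the first step and reuses it afterward, and it only knows the explicit workflow hint, not the workflow the gateway finally resolves.

```python
import uuid
from typing import Optional

LONG_HORIZON = "long_horizon_autonomous"


def chain_fields(
    workflow_type: str,
    chain_id: Optional[str] = None,
    max_chain_steps: Optional[int] = None,
    chain_timeout_secs: Optional[int] = None,
) -> dict:
    """Build the xantly chain fields for one request (hypothetical helper)."""
    fields = {
        "workflow_type": workflow_type,
        # Reuse the caller's ID to continue a chain; otherwise create a new
        # UUID (assumption: the client owns ID generation).
        "chain_id": chain_id or str(uuid.uuid4()),
    }
    # The limits apply only to long_horizon_autonomous, so they are left out
    # for other hints rather than sent with no effect.
    if workflow_type == LONG_HORIZON:
        if max_chain_steps is not None:
            fields["max_chain_steps"] = max_chain_steps
        if chain_timeout_secs is not None:
            fields["chain_timeout_secs"] = chain_timeout_secs
    return fields
```

A first step might call `chain_fields("long_horizon_autonomous", max_chain_steps=12, chain_timeout_secs=180)` and pass the returned `chain_id` into every later call.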
Sticky vs mixed chain routing
Set this under routing_hints:
"routing_hints": {
  "chain_routing": "sticky"
}
- sticky: preserve continuation consistency.
- mixed: allow per-step re-routing (disables sticky continuation behavior).
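To show how the routing mode and the chain ID travel together, the sketch below builds one payload per step of a chain. `chain_step_payloads` is a hypothetical helper. Real chains usually carry accumulated messages forward, which is omitted here to keep the example short.

```python
CHAIN_ROUTING_MODES = ("sticky", "mixed")


def chain_step_payloads(prompts, chain_id, chain_routing="sticky"):
    """Build one request payload per prompt, all tied to the same chain.

    Hypothetical helper: every step repeats the same chain_id and the same
    routing_hints.chain_routing value.
    """
    if chain_routing not in CHAIN_ROUTING_MODES:
        raise ValueError(f"chain_routing must be one of {CHAIN_ROUTING_MODES}")
    return [
        {
            "model": "auto",
            "messages": [{"role": "user", "content": prompt}],
            "routing_hints": {"chain_routing": chain_routing},
            "xantly": {"chain_id": chain_id},
        }
        for prompt in prompts
    ]
```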
3) Planning mode
You can steer planning style with xantly.planning_mode:
"xantly": {
  "planning_mode": "planact"
}
Accepted values:
- preact
- planact
If omitted, tenant defaults and heuristics are used.
4) Memory, cache, and conversation continuity
Persistent conversation scope
"xantly": {
  "conversation_id": "acct-42-onboarding",
  "enable_memory": true,
  "enable_cache": true
}
- conversation_id scopes continuity.
- enable_memory controls L1/L2 persistence for this request path.
- enable_cache controls cache eligibility for this request.
Defaults:
- enable_memory: true
- enable_cache: true
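To make the defaults concrete, the sketch below resolves the flags a request would effectively run with when some are omitted. `effective_flags` is a hypothetical client-side helper, not part of the gateway API. It only mirrors the defaults listed above.

```python
# Defaults as stated in this guide.
DEFAULT_FLAGS = {"enable_memory": True, "enable_cache": True}


def effective_flags(payload: dict) -> dict:
    """Return the memory and cache flags a payload implies, with defaults applied."""
    xantly = payload.get("xantly", {})
    return {name: xantly.get(name, default) for name, default in DEFAULT_FLAGS.items()}
```

For example, a payload whose `xantly` block sets only `"enable_cache": false` resolves to memory enabled and cache disabled.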
5) Reliability and output verification
Reliability level
"xantly": {
  "reliability_level": "high"
}
Supported values:
- standard
- high
- critical
Verification strategy override
"xantly": {
  "output_verification": "schema"
}
Supported values:
- none
- native
- schema
- cross_model
If omitted, strategy is auto-selected from request and tenant settings.
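This guide does not say how the gateway treats an unsupported value for these two fields, so a client-side check before sending can be useful. The sketch below is one way to do it. `check_reliability_fields` is a hypothetical helper, and the allowed sets are copied from the lists above.

```python
# Supported values as listed in this guide.
SUPPORTED_VALUES = {
    "reliability_level": ("standard", "high", "critical"),
    "output_verification": ("none", "native", "schema", "cross_model"),
}


def check_reliability_fields(xantly: dict) -> list:
    """Return one message per field whose value is outside the documented set."""
    problems = []
    for field, allowed in SUPPORTED_VALUES.items():
        # Omitted fields are not an error; they are left to the gateway.
        if field in xantly and xantly[field] not in allowed:
            problems.append(
                f"{field}={xantly[field]!r} is not one of: {', '.join(allowed)}"
            )
    return problems
```

An empty list means both fields are either absent or set to a documented value.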
6) Voice workflows
Signal voice behavior with xantly.voice_mode, commonly set to the string "true" to select voice path handling:
"xantly": {
  "voice_mode": "true"
}
For streaming voice requests, responses are returned as SSE chat.completion.chunk events.
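A streamed response can be reassembled on the client by reading the SSE lines and joining the text deltas. The sketch below assumes OpenAI-style framing, which this guide does not spell out: each event is a single `data: <json>` line, the stream ends with `data: [DONE]`, and the text lives in `choices[].delta.content`. `iter_chunks` and `collect_text` are hypothetical helper names.

```python
import json
from typing import Iterable, Iterator


def iter_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield parsed chat.completion.chunk objects from raw SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # assumed end-of-stream sentinel
            break
        chunk = json.loads(data)
        if chunk.get("object") == "chat.completion.chunk":
            yield chunk


def collect_text(lines: Iterable[str]) -> str:
    """Join the content deltas of a stream into one string."""
    parts = []
    for chunk in iter_chunks(lines):
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")  # assumed field layout
            if content:
                parts.append(content)
    return "".join(parts)
```

`lines` can be any iterable of decoded text lines, for example the line iterator of an HTTP client's streaming response.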
7) End-to-end examples
Autonomous multi-step workflow with guardrails
{
  "model": "auto",
  "messages": [
    {"role": "system", "content": "You are an ops agent. Execute safely and report results."},
    {"role": "user", "content": "Create a remediation plan for recurring API timeouts and include milestones."}
  ],
  "routing_hints": {
    "mode": "balanced",
    "chain_routing": "mixed",
    "task_complexity": "complex"
  },
  "xantly": {
    "workflow_type": "long_horizon_autonomous",
    "max_chain_steps": 8,
    "chain_timeout_secs": 120,
    "reliability_level": "high",
    "output_verification": "cross_model",
    "conversation_id": "incident-2026-03-09",
    "enable_memory": true,
    "enable_cache": true
  }
}
Structured tool workflow (execution task)
{
  "model": "auto",
  "messages": [
    {"role": "user", "content": "Extract the top 5 action items from this incident transcript."}
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "fetch_incident_transcript",
        "parameters": {
          "type": "object",
          "properties": {"incident_id": {"type": "string"}},
          "required": ["incident_id"]
        }
      }
    }
  ],
  "response_format": {"type": "json_object"},
  "xantly": {
    "workflow_type": "execution_task",
    "output_verification": "schema"
  }
}
8) Edge cases to plan for
- xantly.max_chain_steps and xantly.chain_timeout_secs are conditional (long-horizon only).
- xantly.chain_id should be UUID-formatted; invalid values are ignored for chain classification.
- Some orchestration fields are currently accepted but reserved (cache_ttl_secs, compress_context, enable_tool_reranking, xantly.chain_routing).
- For chain routing behavior today, prefer routing_hints.chain_routing over xantly.chain_routing.
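Several of these edge cases can be caught before a request leaves the client. The sketch below is a hypothetical pre-flight check, `lint_payload`, that returns a cleaned copy of the payload plus a list of warnings. It only sees the explicit workflow hint, so the warning about limits is advisory: the gateway may still resolve the request as long-horizon on its own. Note that Python's `uuid.UUID` parser is lenient (it also accepts forms without hyphens), so the gateway's own check may be stricter.

```python
import copy
import uuid

LONG_HORIZON = "long_horizon_autonomous"
CONDITIONAL_FIELDS = ("max_chain_steps", "chain_timeout_secs")


def is_uuid(value) -> bool:
    """Return True if value parses as a UUID."""
    try:
        uuid.UUID(str(value))
    except ValueError:
        return False
    return True


def lint_payload(payload: dict):
    """Return (cleaned_copy, warnings) for a request payload (hypothetical helper)."""
    cleaned = copy.deepcopy(payload)
    xantly = cleaned.get("xantly", {})
    warnings = []

    # Limits are conditional on the long-horizon workflow.
    if xantly.get("workflow_type") != LONG_HORIZON:
        for field in CONDITIONAL_FIELDS:
            if field in xantly:
                warnings.append(f"xantly.{field} only applies to {LONG_HORIZON}")

    # Invalid chain IDs are ignored for chain classification.
    if "chain_id" in xantly and not is_uuid(xantly["chain_id"]):
        warnings.append("xantly.chain_id is not UUID-formatted and will be ignored")

    # xantly.chain_routing is reserved; routing_hints.chain_routing is preferred.
    if "chain_routing" in xantly:
        value = xantly.pop("chain_routing")
        cleaned.setdefault("routing_hints", {}).setdefault("chain_routing", value)
        warnings.append("moved xantly.chain_routing to routing_hints.chain_routing")

    return cleaned, warnings
```

The original payload is left untouched, and an existing `routing_hints.chain_routing` value takes precedence over the moved one.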