
When to use it
- Incoming requests could be answered by multiple skills—agents with tools, direct MCP servers, or lightweight functions.
- You want dynamic dispatch instead of a maze of `if/else` statements or handcrafted prompts.
- You need confidence scores and rationale so a human (or another workflow) can make the final decision.
- You want to fall back to a generalist agent when no high-confidence match is found.
Destinations and scoring
`create_router_llm(...)` builds an `LLMRouter` that instantiates a classifier LLM, inspects the candidates, and returns ranked `LLMRouterResult` objects. Each result contains:
- `category`: `"agent"`, `"server"`, or `"function"`.
- `result`: the routed object (an `Agent`/`AugmentedLLM`, a server name, or a callable).
- `confidence`: `"high"`, `"medium"`, or `"low"`, computed from the model’s probability.
- `reasoning`: the model’s natural-language justification.
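As an illustration only, the four result fields listed above can be mirrored with a plain dataclass. `RouteResult`, `bucket_confidence`, and `rank_candidates` are hypothetical names, and the 0.8 / 0.5 thresholds are assumptions made for this sketch; the source does not say how the library maps a probability to a label.

```python
from dataclasses import dataclass
from typing import Any


def bucket_confidence(probability: float) -> str:
    """Map a probability to a label. Thresholds are illustrative assumptions."""
    if probability >= 0.8:
        return "high"
    if probability >= 0.5:
        return "medium"
    return "low"


@dataclass
class RouteResult:
    """Hypothetical stand-in for the fields described above; not the library class."""
    category: str    # "agent", "server", or "function"
    result: Any      # the routed object, a server name, or a callable
    confidence: str  # "high", "medium", or "low"
    reasoning: str   # natural-language justification


def rank_candidates(scored):
    """Turn (category, target, probability, reasoning) tuples into ranked results."""
    ordered = sorted(scored, key=lambda item: item[2], reverse=True)
    return [RouteResult(c, t, bucket_confidence(p), r) for c, t, p, r in ordered]


ranked = rank_candidates([
    ("server", "filesystem", 0.55, "Request mentions reading a file."),
    ("agent", "finance_agent", 0.91, "Request asks about a stock price."),
    ("function", "log_request", 0.12, "Weak keyword match only."),
])
for item in ranked:
    print(item.category, item.result, item.confidence)
```

The ranked list puts the strongest candidate first, so downstream code can inspect `ranked[0]` or walk the whole list when a human needs to choose.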
For embedding-based routing, use `create_router_embedding(...)`, which compares embeddings via `EmbeddingRouter`.
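To make the idea of comparing embeddings concrete, here is a minimal sketch of similarity-based routing. It does not use `EmbeddingRouter`; the vectors are hand-written toy values rather than real model embeddings, and `cosine_similarity` and `route_by_embedding` are hypothetical helper names.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as "no similarity"
    return dot / (norm_a * norm_b)


def route_by_embedding(query_vec, destinations, top_k=1):
    """Return the top_k (name, score) pairs, highest similarity first."""
    scored = [(name, cosine_similarity(query_vec, vec))
              for name, vec in destinations.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 3-dimensional "embeddings" (assumed values, for illustration only).
destinations = {
    "finance_agent": [0.9, 0.1, 0.0],
    "filesystem": [0.0, 0.2, 0.9],
    "log_request": [0.1, 0.9, 0.1],
}
query = [0.8, 0.2, 0.1]

top = route_by_embedding(query, destinations, top_k=2)
print(top)
```

Because the score is a pure function of the vectors, the same query always yields the same ranking, which is the determinism that makes this approach attractive.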
Quick start
Configuration knobs
- `top_k`: expose the top k candidates to give humans (or downstream logic) choices.
- `routing_instruction`: prime the classifier with a custom rubric; defaults to a generic prompt that lists every destination, its description, and available tools.
- `provider` / `model`: choose the model that performs routing (`openai` or `anthropic` today). You can also pass `request_params` for temperature, stop sequences, or strict JSON mode.
- `server_names`: include raw MCP servers. The router pulls descriptions from the server registry so the model knows what each server can do.
- `functions`: register local Python callables. Handy for telemetry, logging, or immediate fallbacks.
- `route_to_agent` / `route_to_server` / `route_to_function`: skip the multi-category prompt when you already know the desired destination type.
- `create_router_embedding`: swap in embedding similarity when you prefer deterministic scoring or offline model execution.
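The sketch below shows roughly what a generic routing prompt that lists every destination, its description, and its tools might look like, and how a custom instruction could replace the default header. The library's actual prompt text is not shown in the source, so `build_routing_prompt`, the wording, and the dictionary keys are all assumptions for illustration.

```python
def build_routing_prompt(request, destinations, top_k=1, routing_instruction=None):
    """Assemble a classifier prompt; wording and layout are hypothetical."""
    header = routing_instruction or "Pick the best destination(s) for the request below."
    lines = [header, "", "Destinations:"]
    for dest in destinations:
        tools = ", ".join(dest.get("tools", [])) or "none"
        lines.append(
            f"- [{dest['category']}] {dest['name']}: {dest['description']} (tools: {tools})"
        )
    lines += ["", f"Request: {request}", f"Return up to {top_k} destination(s)."]
    return "\n".join(lines)


destinations = [
    {"category": "agent", "name": "finance_agent",
     "description": "Answers market questions", "tools": ["fetch"]},
    {"category": "server", "name": "filesystem",
     "description": "Reads and writes local files"},
]

default_prompt = build_routing_prompt("What is the price of ACME?", destinations, top_k=2)
custom_prompt = build_routing_prompt(
    "What is the price of ACME?",
    destinations,
    routing_instruction="Prefer agents over raw servers when both apply.",
)
print(default_prompt)
```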
Guardrails and observability
- Use the `confidence` signal to decide when to short-circuit or escalate. For example, enforce `confidence == "high"` before allowing automated actions.
- The router records detailed spans (`router.route`, candidate reasoning, chosen categories) when tracing is enabled, making it easy to debug ambiguous decisions in Jaeger or another OTLP backend.
- Pair with the Intent Classifier for two-stage routing: first map the request to an intent, then feed the intent into the router for fine-grained dispatch.
- Wrap the router itself in the Evaluator-Optimizer pattern if you want an automated supervisor to veto low-quality routing rationales.
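The confidence gate from the first guardrail, combined with a generalist fallback, could be sketched as follows. Results are modeled as plain dictionaries with the same four fields as a router result; `choose_destination` and the `"generalist_agent"` name are hypothetical and not part of the library.

```python
def choose_destination(results, fallback):
    """Return (target, automated).

    Only a top-ranked result with confidence == "high" is acted on
    automatically; anything else goes to the fallback for review.
    """
    if results and results[0]["confidence"] == "high":
        return results[0]["result"], True
    return fallback, False


confident = [
    {"category": "agent", "result": "finance_agent",
     "confidence": "high", "reasoning": "Asks about a stock price."},
]
ambiguous = [
    {"category": "server", "result": "filesystem",
     "confidence": "medium", "reasoning": "Might involve a file."},
]

print(choose_destination(confident, "generalist_agent"))
print(choose_destination(ambiguous, "generalist_agent"))
print(choose_destination([], "generalist_agent"))
```

Returning the `automated` flag alongside the target lets the caller log or escalate every request that did not clear the gate.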
Example projects
- workflow_router – routes across agents, MCP servers, and plain functions with confidence/rationale logging.
- workflow_intent_classifier – classifies intent first, then routes to specialised handlers.
- Temporal router – demonstrates durable routing inside Temporal workflows.