How to Build Dynamic AI Routing Rules
Create routing rules that send each request to the optimal model based on content.
Jay Banlasan
The AI Systems Guy
A dynamic routing rules engine sends each request to the right AI model without you thinking about it. Short classification tasks go to the cheapest model. Complex analysis goes to the strongest. Code generation goes to whichever model writes the best code.
I built this after realizing I was wasting money sending simple tasks to expensive models. The router cut my API costs by 40% in the first month.
What You Need Before Starting
- API keys for at least two AI providers
- Python 3.8+ with `anthropic` and `openai` installed
- A config file defining your routing rules
- Basic understanding of your task categories
Step 1: Define Your Task Categories
Map every type of request you send to AI into categories:
```python
TASK_CATEGORIES = {
    "classification": {"complexity": "low", "needs_reasoning": False},
    "summarization": {"complexity": "low", "needs_reasoning": False},
    "ad_copy": {"complexity": "medium", "needs_reasoning": True},
    "data_analysis": {"complexity": "high", "needs_reasoning": True},
    "code_generation": {"complexity": "high", "needs_reasoning": True},
    "translation": {"complexity": "low", "needs_reasoning": False},
}
```
Step 2: Create the Model Registry
```python
MODEL_REGISTRY = {
    "claude-haiku": {"provider": "anthropic", "model": "claude-3-5-haiku-20241022", "cost_per_1k": 0.001, "strength": "speed"},
    "claude-sonnet": {"provider": "anthropic", "model": "claude-sonnet-4-20250514", "cost_per_1k": 0.003, "strength": "balanced"},
    "gpt4o-mini": {"provider": "openai", "model": "gpt-4o-mini", "cost_per_1k": 0.00015, "strength": "speed"},
    "gpt4o": {"provider": "openai", "model": "gpt-4o", "cost_per_1k": 0.005, "strength": "quality"},
}
```
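To see why the `cost_per_1k` field matters, here is a small sketch (a hypothetical helper, using two entries copied from the registry above) that converts it into a rough per-request dollar estimate:

```python
# Hypothetical cost helper: convert the registry's cost_per_1k figure
# into a rough per-request dollar estimate (input tokens only).
MODEL_REGISTRY = {
    "gpt4o-mini": {"cost_per_1k": 0.00015},
    "gpt4o": {"cost_per_1k": 0.005},
}

def estimate_cost(model_key, token_count):
    return MODEL_REGISTRY[model_key]["cost_per_1k"] * token_count / 1000.0

# A 2,000-token request: $0.0003 on gpt4o-mini vs $0.01 on gpt4o --
# a 33x gap that compounds across thousands of requests per day.
```

That gap is the entire economic case for routing: every low-complexity request sent to the expensive model is money left on the table.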
Step 3: Build the Routing Engine
```python
def classify_request(prompt):
    word_count = len(prompt.split())
    lower = prompt.lower()
    has_code = any(kw in lower for kw in ["function", "code", "script", "python", "javascript"])
    has_data = any(kw in lower for kw in ["analyze", "data", "numbers", "calculate", "report"])
    if has_code:
        return "code_generation"
    if has_data:
        return "data_analysis"
    if word_count < 50:
        return "classification"
    return "ad_copy"

def route_request(prompt, prefer_cost=True):
    category = classify_request(prompt)
    task = TASK_CATEGORIES[category]
    if task["complexity"] == "low" and prefer_cost:
        return MODEL_REGISTRY["gpt4o-mini"]
    if task["complexity"] == "high":
        return MODEL_REGISTRY["gpt4o"]  # highest-quality model for complex work
    return MODEL_REGISTRY["claude-sonnet"]  # medium tasks, or low tasks when cost isn't the priority
```
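To sanity-check the router, here is a standalone copy of the keyword classifier from Step 3 run against a few sample prompts (illustrative only; your categories will differ):

```python
# Standalone copy of the Step 3 keyword classifier, for illustration.
def classify_request(prompt):
    word_count = len(prompt.split())
    lower = prompt.lower()
    if any(kw in lower for kw in ["function", "code", "script", "python", "javascript"]):
        return "code_generation"
    if any(kw in lower for kw in ["analyze", "data", "numbers", "calculate", "report"]):
        return "data_analysis"
    if word_count < 50:
        return "classification"
    return "ad_copy"

print(classify_request("Write a Python script to rename files"))  # code_generation
print(classify_request("Analyze last month's sales numbers"))     # data_analysis
print(classify_request("Is this email spam or legitimate?"))      # classification
```

Keyword matching is crude but cheap. If your traffic is more varied, swap this function for a call to the cheapest model in your registry and let it do the classification.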
Step 4: Add the Execution Layer
```python
import anthropic
import openai

def execute_routed(prompt, prefer_cost=True):
    model_config = route_request(prompt, prefer_cost)
    provider = model_config["provider"]
    model = model_config["model"]
    if provider == "anthropic":
        client = anthropic.Anthropic()
        resp = client.messages.create(
            model=model,
            max_tokens=1024,
            messages=[{"role": "user", "content": prompt}],
        )
        return {"text": resp.content[0].text, "model": model, "provider": provider}
    elif provider == "openai":
        client = openai.OpenAI()
        resp = client.chat.completions.create(
            model=model,
            max_tokens=1024,
            messages=[{"role": "user", "content": prompt}],
        )
        return {"text": resp.choices[0].message.content, "model": model, "provider": provider}
    raise ValueError(f"Unknown provider: {provider}")
```
Step 5: Add Rules That Adapt Over Time
Log every routing decision and its outcome. Use this data to adjust rules:
```python
import sqlite3

def log_routing(db_path, prompt_hash, category, model_used, quality_score, latency, cost):
    conn = sqlite3.connect(db_path)
    conn.execute("""CREATE TABLE IF NOT EXISTS routing_log (
        id INTEGER PRIMARY KEY, prompt_hash TEXT, category TEXT, model TEXT,
        quality REAL, latency REAL, cost REAL, timestamp DATETIME DEFAULT CURRENT_TIMESTAMP
    )""")
    conn.execute(
        "INSERT INTO routing_log (prompt_hash, category, model, quality, latency, cost) VALUES (?,?,?,?,?,?)",
        (prompt_hash, category, model_used, quality_score, latency, cost),
    )
    conn.commit()
    conn.close()
```
Review the routing log weekly. If a cheaper model handles a category well, update the rules to route there by default.
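That weekly review can be a single query over the log. A sketch, using an in-memory database seeded with hypothetical rows in the schema from the logging function above:

```python
import sqlite3

# Weekly review sketch: average quality and cost per category/model,
# over an in-memory routing_log seeded with hypothetical rows.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE routing_log (
    id INTEGER PRIMARY KEY, prompt_hash TEXT, category TEXT, model TEXT,
    quality REAL, latency REAL, cost REAL)""")
conn.executemany(
    "INSERT INTO routing_log (prompt_hash, category, model, quality, latency, cost) VALUES (?,?,?,?,?,?)",
    [
        ("h1", "summarization", "gpt-4o", 0.95, 1.2, 0.0100),
        ("h2", "summarization", "gpt-4o-mini", 0.93, 0.6, 0.0003),
        ("h3", "summarization", "gpt-4o-mini", 0.91, 0.5, 0.0003),
    ],
)

for category, model, avg_q, avg_c in conn.execute(
    """SELECT category, model, AVG(quality), AVG(cost)
       FROM routing_log GROUP BY category, model ORDER BY AVG(cost)"""):
    print(f"{category:15s} {model:12s} quality={avg_q:.2f} cost=${avg_c:.4f}")
# If the cheap model's average quality is within your tolerance,
# make it the default for that category.
```

In the sample data, gpt-4o-mini handles summarization at 0.92 average quality for 3% of the cost, which is exactly the signal that should trigger a rule change.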
What to Build Next
Add a fallback chain so if the primary model fails or is slow, the router automatically tries the next best option. Then connect this to your cost alert system so you get notified when routing shifts spending patterns.
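The fallback chain can be sketched as a loop over an ordered list of models; the stub `call_model` here simulates a provider outage and stands in for the real `execute_routed` from Step 4 (all names illustrative):

```python
# Fallback chain sketch: try each model in order until one succeeds.
# call_model is a stub; in practice it wraps the real provider calls.
def call_model(model_key, prompt):
    if model_key == "claude-sonnet":  # simulate the primary model timing out
        raise TimeoutError("provider timed out")
    return f"[{model_key}] response to: {prompt}"

def execute_with_fallback(prompt, chain=("claude-sonnet", "gpt4o", "gpt4o-mini")):
    errors = []
    for model_key in chain:
        try:
            return {"text": call_model(model_key, prompt), "model": model_key}
        except Exception as exc:  # record the failure and try the next model
            errors.append((model_key, str(exc)))
    raise RuntimeError(f"all models failed: {errors}")

result = execute_with_fallback("Summarize this report")
print(result["model"])  # gpt4o -- the primary timed out, the fallback answered
```

Keep the chain ordered by preference, not just cost, and log every fallback event: a sudden spike in fallbacks is itself a signal worth alerting on.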
Related Reading
- The Automation Decision Tree - deciding what to automate first
- Input, Process, Output: The Universal AI Framework - the core framework behind every AI system
- Why AI Operations, Not AI Tools - building systems instead of collecting tools
Want this system built for your business?
Get a free assessment. We will map every system your business needs and show you the ROI.
Get Your Free Assessment