
How to Build Dynamic AI Routing Rules

Create routing rules that send each request to the optimal model based on content.

Jay Banlasan

The AI Systems Guy

A dynamic routing rules engine sends each AI request to the right model without you having to think about it. Short classification tasks go to the cheapest model. Complex analysis goes to the strongest. Code generation goes to the one that writes the best code.

I built this after realizing I was wasting money sending simple tasks to expensive models. The router cut my API costs by 40% in the first month.

What You Need Before Starting

- Python 3 with the anthropic and openai packages installed
- API keys for Anthropic and OpenAI set in your environment
- A rough inventory of the request types you currently send to AI models

Step 1: Define Your Task Categories

Map every type of request you send to AI into categories:

TASK_CATEGORIES = {
    "classification": {"complexity": "low", "needs_reasoning": False},
    "summarization": {"complexity": "low", "needs_reasoning": False},
    "ad_copy": {"complexity": "medium", "needs_reasoning": True},
    "data_analysis": {"complexity": "high", "needs_reasoning": True},
    "code_generation": {"complexity": "high", "needs_reasoning": True},
    "translation": {"complexity": "low", "needs_reasoning": False},
}

Step 2: Create the Model Registry

MODEL_REGISTRY = {
    "claude-haiku": {"provider": "anthropic", "model": "claude-3-5-haiku-20241022", "cost_per_1k": 0.001, "strength": "speed"},
    "claude-sonnet": {"provider": "anthropic", "model": "claude-sonnet-4-20250514", "cost_per_1k": 0.003, "strength": "balanced"},
    "gpt4o-mini": {"provider": "openai", "model": "gpt-4o-mini", "cost_per_1k": 0.00015, "strength": "speed"},
    "gpt4o": {"provider": "openai", "model": "gpt-4o", "cost_per_1k": 0.005, "strength": "quality"},
}
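Since each registry entry carries a cost_per_1k figure (the numbers above are illustrative; always check current provider pricing), a small helper can estimate what a request will cost before you route it. This is a sketch for illustration; estimate_cost is an assumed helper name, and the registry is repeated so the snippet runs standalone:

```python
# Registry repeated here so the snippet is self-contained; cost figures
# are illustrative, not current pricing.
MODEL_REGISTRY = {
    "claude-haiku": {"provider": "anthropic", "model": "claude-3-5-haiku-20241022", "cost_per_1k": 0.001, "strength": "speed"},
    "claude-sonnet": {"provider": "anthropic", "model": "claude-sonnet-4-20250514", "cost_per_1k": 0.003, "strength": "balanced"},
    "gpt4o-mini": {"provider": "openai", "model": "gpt-4o-mini", "cost_per_1k": 0.00015, "strength": "speed"},
    "gpt4o": {"provider": "openai", "model": "gpt-4o", "cost_per_1k": 0.005, "strength": "quality"},
}

def estimate_cost(model_key, token_count):
    """Rough dollar cost of a request, given the per-1k-token rate."""
    return MODEL_REGISTRY[model_key]["cost_per_1k"] * token_count / 1000

# Routing a 2,000-token request to gpt4o-mini instead of gpt4o:
print(round(estimate_cost("gpt4o", 2000), 6))       # 0.01
print(round(estimate_cost("gpt4o-mini", 2000), 6))  # 0.0003
```

Running this kind of comparison across your registry makes the savings from routing concrete before you wire anything up.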

Step 3: Build the Routing Engine

def classify_request(prompt):
    """Bucket a prompt into a task category using cheap keyword heuristics."""
    text = prompt.lower()
    word_count = len(prompt.split())
    has_code = any(kw in text for kw in ["function", "code", "script", "python", "javascript"])
    has_data = any(kw in text for kw in ["analyze", "data", "numbers", "calculate", "report"])

    if has_code:
        return "code_generation"
    if has_data:
        return "data_analysis"
    if "summarize" in text or "summary" in text:
        return "summarization"
    if "translate" in text:
        return "translation"
    if word_count < 50:
        return "classification"
    return "ad_copy"  # default for longer creative prompts

def route_request(prompt, prefer_cost=True):
    """Map a prompt's task category to a model config from the registry."""
    category = classify_request(prompt)
    task = TASK_CATEGORIES[category]

    if task["complexity"] == "low":
        # prefer_cost picks the cheapest model; otherwise stay in the fast tier
        return MODEL_REGISTRY["gpt4o-mini" if prefer_cost else "claude-haiku"]
    if task["complexity"] == "medium":
        return MODEL_REGISTRY["claude-sonnet"]
    # High complexity: send to the quality model, not the balanced one
    return MODEL_REGISTRY["gpt4o"]
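A quick way to sanity-check the rules is a decision table that maps each category straight to a model key. This is a condensed, standalone variant of the routing logic above (pick_model is an assumed helper name; high-complexity tasks go to the registry's "quality" model here):

```python
# Condensed decision table mirroring the routing rules; standalone so it
# can be run and inspected without the full engine.
COMPLEXITY = {
    "classification": "low", "summarization": "low", "translation": "low",
    "ad_copy": "medium", "data_analysis": "high", "code_generation": "high",
}

def pick_model(category, prefer_cost=True):
    level = COMPLEXITY[category]
    if level == "low":
        # Cheapest model by default; otherwise stay in the fast tier
        return "gpt4o-mini" if prefer_cost else "claude-haiku"
    if level == "medium":
        return "claude-sonnet"
    return "gpt4o"  # "quality" tier for high-complexity work

for category in COMPLEXITY:
    print(f"{category:16} -> {pick_model(category)}")
```

Printing the full table before deploying makes it easy to spot categories that are silently routing to the wrong tier.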

Step 4: Add the Execution Layer

import anthropic
import openai

def execute_routed(prompt, prefer_cost=True):
    """Route the prompt, then call the chosen provider's API."""
    model_config = route_request(prompt, prefer_cost)
    provider = model_config["provider"]
    model = model_config["model"]

    if provider == "anthropic":
        client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
        resp = client.messages.create(model=model, max_tokens=1024, messages=[{"role": "user", "content": prompt}])
        return {"text": resp.content[0].text, "model": model, "provider": provider}
    elif provider == "openai":
        client = openai.OpenAI()  # reads OPENAI_API_KEY from the environment
        resp = client.chat.completions.create(model=model, max_tokens=1024, messages=[{"role": "user", "content": prompt}])
        return {"text": resp.choices[0].message.content, "model": model, "provider": provider}
    raise ValueError(f"Unknown provider: {provider}")
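Step 5 below logs a prompt hash, latency, and cost for every call, so it helps to capture those values at execution time. A minimal sketch of such a wrapper (run_and_measure is an assumed helper name; it takes the execute function as a parameter so it works with any provider):

```python
import hashlib
import time

def run_and_measure(prompt, execute):
    """Call an execute function, capturing latency and a prompt hash for logging."""
    start = time.perf_counter()
    result = execute(prompt)
    latency = time.perf_counter() - start
    # Short stable hash so the log never stores raw prompt text
    prompt_hash = hashlib.sha256(prompt.encode("utf-8")).hexdigest()[:16]
    return result, latency, prompt_hash

# Usage with any callable, e.g.:
#   result, latency, prompt_hash = run_and_measure("Summarize this", execute_routed)
```

Hashing the prompt instead of storing it keeps the log small and avoids persisting sensitive request content.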

Step 5: Add Rules That Adapt Over Time

Log every routing decision and its outcome. Use this data to adjust rules:

import sqlite3

def log_routing(db_path, prompt_hash, category, model_used, quality_score, latency, cost):
    """Append one routing decision and its outcome to the SQLite log."""
    conn = sqlite3.connect(db_path)
    conn.execute("""CREATE TABLE IF NOT EXISTS routing_log (
        id INTEGER PRIMARY KEY, prompt_hash TEXT, category TEXT, model TEXT,
        quality REAL, latency REAL, cost REAL, timestamp DATETIME DEFAULT CURRENT_TIMESTAMP
    )""")
    conn.execute("INSERT INTO routing_log (prompt_hash, category, model, quality, latency, cost) VALUES (?,?,?,?,?,?)",
                 (prompt_hash, category, model_used, quality_score, latency, cost))
    conn.commit()
    conn.close()

Review the routing log weekly. If a cheaper model handles a category well, update the rules to route there by default.
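That weekly review can be a single aggregate query over the routing_log table. A sketch, assuming the schema above (category_report is an assumed helper name; the demo writes a throwaway database so the snippet runs standalone):

```python
import os
import sqlite3
import tempfile

def category_report(db_path):
    """Average quality and cost per (category, model) pair in the routing log."""
    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        "SELECT category, model, ROUND(AVG(quality), 2), ROUND(AVG(cost), 5), COUNT(*) "
        "FROM routing_log GROUP BY category, model ORDER BY category, model"
    ).fetchall()
    conn.close()
    return rows

# Demo against a throwaway database with two logged decisions:
path = os.path.join(tempfile.mkdtemp(), "routing.db")
conn = sqlite3.connect(path)
conn.execute("""CREATE TABLE routing_log (
    id INTEGER PRIMARY KEY, prompt_hash TEXT, category TEXT, model TEXT,
    quality REAL, latency REAL, cost REAL, timestamp DATETIME DEFAULT CURRENT_TIMESTAMP
)""")
conn.executemany(
    "INSERT INTO routing_log (prompt_hash, category, model, quality, latency, cost) VALUES (?,?,?,?,?,?)",
    [("a1", "summarization", "gpt-4o-mini", 0.9, 0.8, 0.0003),
     ("b2", "summarization", "gpt-4o-mini", 0.8, 0.7, 0.0004)],
)
conn.commit()
conn.close()

print(category_report(path))  # one aggregated row for summarization / gpt-4o-mini
```

A category where a cheap model averages high quality is a candidate for rerouting by default.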

What to Build Next

Add a fallback chain so if the primary model fails or is slow, the router automatically tries the next best option. Then connect this to your cost alert system so you get notified when routing shifts spending patterns.
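A minimal sketch of that fallback chain, assuming each model is wrapped in a callable (the function names here are hypothetical; in practice each callable would hold provider code like execute_routed above):

```python
def execute_with_fallback(prompt, chain):
    """Try each (name, callable) pair in order; return the first success."""
    last_error = None
    for name, call in chain:
        try:
            return {"text": call(prompt), "model": name}
        except Exception as err:  # rate limit, timeout, provider outage
            last_error = err
    raise RuntimeError(f"all models in the fallback chain failed: {last_error}")

# Hypothetical stand-ins for real provider calls:
def flaky_primary(prompt):
    raise TimeoutError("primary model timed out")

def stable_backup(prompt):
    return f"answer to: {prompt}"

result = execute_with_fallback("classify this", [("gpt4o", flaky_primary), ("claude-sonnet", stable_backup)])
print(result["model"])  # claude-sonnet
```

Ordering the chain by preference means the router degrades gracefully instead of failing outright when one provider has an outage.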
