
How to Optimize Batch AI Processing for Cost

Process large AI workloads at a fraction of the cost using batch APIs.

Jay Banlasan

The AI Systems Guy

I had a workflow enriching 2,000 leads every night with AI-generated summaries. Running them synchronously was costing $180/month and taking 45 minutes. After switching to the batch API, the same job costs $90/month, and I don't care how long it takes because it runs overnight. Half the cost, zero impact on speed where it matters.

Batch processing is the biggest underused lever in AI operations. Most providers offer 50% discounts on batch API calls because they can schedule them during off-peak hours. The only requirement is that you can tolerate a delay (usually under 24 hours). For nightly enrichment, weekly reporting, and bulk classification tasks, that's almost always fine.

Step 1: Identify What Qualifies for Batch Processing

Not everything should go to batch. Use this decision rule:

Real-time required (< 5 seconds)    → synchronous API
Background OK (minutes to hours)    → async queue workers
Overnight OK (up to 24 hours)       → batch API (50% cheaper)

Good batch candidates: lead enrichment, document classification, content generation, email personalization at scale, weekly summaries, SEO meta descriptions, product tag generation.

Bad batch candidates: customer support replies, live tool calls, anything inside a user-facing request loop.
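
The decision rule above can be sketched as a small router. This is a minimal illustration; `route_task`, the `Route` enum, and the latency thresholds are my own stand-ins, not part of any SDK or the original workflow:

```python
from enum import Enum

class Route(Enum):
    SYNC = "synchronous API"
    ASYNC_QUEUE = "async queue workers"
    BATCH = "batch API (50% cheaper)"

def route_task(max_latency_seconds: float) -> Route:
    """Pick a processing path based on how long the caller can wait."""
    if max_latency_seconds < 5:
        return Route.SYNC          # user-facing: must be real-time
    if max_latency_seconds < 4 * 3600:
        return Route.ASYNC_QUEUE  # background: minutes to a few hours
    return Route.BATCH            # overnight: take the discount
```

The exact cutoffs matter less than asking the question at all; most teams never ask and default everything to synchronous.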

Step 2: Build the Batch Submission Function

Anthropic's Message Batches API accepts up to 10,000 requests in one submission.

import anthropic
import json
from pathlib import Path
from anthropic.types.message_create_params import MessageCreateParamsNonStreaming
from anthropic.types.messages.batch_create_params import Request

client = anthropic.Anthropic()

def submit_batch(tasks: list[dict], batch_name: str) -> str:
    """
    tasks = list of dicts with 'id' and 'prompt' keys
    Returns batch_id for polling later
    """
    # Build batch requests with custom IDs so you can match results back
    batch_requests = [
        Request(
            custom_id=task["id"],
            params=MessageCreateParamsNonStreaming(
                model="claude-3-haiku-20240307",
                max_tokens=512,
                messages=[{"role": "user", "content": task["prompt"]}]
            )
        )
        for task in tasks
    ]
    
    batch = client.messages.batches.create(requests=batch_requests)
    
    # Save batch metadata locally
    meta = {"batch_id": batch.id, "name": batch_name, "task_count": len(tasks)}
    Path(f"batch_{batch_name}.json").write_text(json.dumps(meta))
    
    print(f"Submitted batch {batch.id} with {len(tasks)} tasks")
    return batch.id
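
If a run ever grows past the 10,000-request ceiling mentioned above, split it into multiple submissions. A minimal sketch, where `chunk_tasks` is a hypothetical helper (not part of the SDK):

```python
def chunk_tasks(tasks: list, max_per_batch: int = 10_000) -> list[list]:
    """Split a task list into batch-sized chunks for separate submissions."""
    return [tasks[i:i + max_per_batch] for i in range(0, len(tasks), max_per_batch)]
```

Each chunk then goes through `submit_batch` on its own, giving you one batch ID per chunk to poll.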

Step 3: Build the Lead Enrichment Task Generator

Here's a real example: enriching a list of leads with AI-generated company summaries.

def build_lead_tasks(leads: list[dict]) -> list[dict]:
    tasks = []
    for lead in leads:
        prompt = f"""You are a B2B research assistant.
Summarize this company in 2 sentences for a sales rep.
Focus on: what they do, their likely pain points, and who buys from them.

Company: {lead['company']}
Industry: {lead.get('industry', 'unknown')}
Website: {lead.get('website', 'unknown')}
Employee count: {lead.get('employees', 'unknown')}

Be specific. No filler."""

        tasks.append({
            "id": f"lead_{lead['id']}",
            "prompt": prompt
        })
    return tasks

# Example usage
leads = [
    {"id": "001", "company": "Acme Corp", "industry": "Construction", 
     "website": "acme.com", "employees": "50"},
    # ... up to 10,000
]

tasks = build_lead_tasks(leads)
batch_id = submit_batch(tasks, "lead_enrichment_2024_07_28")
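
Before submitting a large run, it helps to sanity-check the spend. A rough sketch: `estimate_batch_cost` is a hypothetical helper, and the per-million-token rates are assumptions that should be checked against current pricing:

```python
def estimate_batch_cost(n_tasks: int, avg_input_tokens: int, avg_output_tokens: int,
                        input_per_mtok: float = 0.25, output_per_mtok: float = 1.25,
                        batch_discount: float = 0.5) -> float:
    """Rough dollar estimate for a batch run (rates are assumed, not authoritative)."""
    input_cost = n_tasks * avg_input_tokens / 1_000_000 * input_per_mtok
    output_cost = n_tasks * avg_output_tokens / 1_000_000 * output_per_mtok
    return (input_cost + output_cost) * batch_discount
```

For 2,000 leads at a few hundred tokens each, the estimate lands well under a dollar per run, which is why the monthly bill halves instead of exploding.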

Step 4: Poll for Completion

Batches can complete in minutes or up to 24 hours. Poll every 5 minutes and process when done.

import time

def wait_for_batch(batch_id: str, poll_interval: int = 300) -> list:
    """Polls until batch is complete. Returns list of results."""
    print(f"Polling batch {batch_id}...")
    
    while True:
        batch = client.messages.batches.retrieve(batch_id)
        status = batch.processing_status
        
        if status == "ended":
            print(f"Batch complete. Processing results...")
            return collect_results(batch_id)
        
        pending = batch.request_counts.processing + batch.request_counts.in_progress
        print(f"Status: {status} | Pending: {pending} | "
              f"Succeeded: {batch.request_counts.succeeded} | "
              f"Errored: {batch.request_counts.errored}")
        
        time.sleep(poll_interval)

def collect_results(batch_id: str) -> list:
    results = []
    for result in client.messages.batches.results(batch_id):
        if result.result.type == "succeeded":
            results.append({
                "id": result.custom_id,
                "text": result.result.message.content[0].text,
                "input_tokens": result.result.message.usage.input_tokens,
                "output_tokens": result.result.message.usage.output_tokens
            })
        else:
            results.append({
                "id": result.custom_id,
                "text": None,
                "error": result.result.type  # "errored", "canceled", or "expired"
            })
    return results
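
Failed results don't have to be lost; they can be resubmitted in a follow-up batch. A minimal sketch, assuming you still have the original task list in memory (`build_retry_tasks` is a hypothetical helper):

```python
def build_retry_tasks(results: list[dict], original_tasks: list[dict]) -> list[dict]:
    """Collect the tasks whose batch results failed, ready for resubmission."""
    failed_ids = {r["id"] for r in results if r.get("text") is None}
    return [t for t in original_tasks if t["id"] in failed_ids]
```

Feeding the output back through `submit_batch` gives you a cheap one-shot retry without reprocessing the tasks that already succeeded.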

Step 5: Write Results Back to Your Database

Match results by custom_id back to your original records.

import sqlite3

def save_enrichment_results(results: list, db_path: str = "leads.db"):
    conn = sqlite3.connect(db_path)
    success = 0
    failed = 0
    
    for result in results:
        lead_id = result["id"].replace("lead_", "")
        if result["text"]:
            conn.execute("""
                UPDATE leads SET ai_summary = ?, enriched_at = datetime('now')
                WHERE id = ?
            """, (result["text"], lead_id))
            success += 1
        else:
            conn.execute("""
                UPDATE leads SET enrichment_error = ? WHERE id = ?
            """, (result.get("error", "failed"), lead_id))
            failed += 1
    
    conn.commit()
    conn.close()
    print(f"Saved: {success} successful, {failed} failed")

Step 6: Schedule as a Nightly Cron Job

Put the whole pipeline in one script and schedule it.

# enrich_leads_batch.py
import sqlite3
from datetime import date

# submit_batch, build_lead_tasks, wait_for_batch, and save_enrichment_results
# from the previous steps live in this same file

if __name__ == "__main__":
    # 1. Pull unenriched leads from DB
    conn = sqlite3.connect("leads.db")
    rows = conn.execute(
        "SELECT id, company, industry, website, employees FROM leads "
        "WHERE ai_summary IS NULL LIMIT 2000"
    ).fetchall()
    conn.close()
    
    leads = [{"id": r[0], "company": r[1], "industry": r[2],
              "website": r[3], "employees": r[4]} for r in rows]
    
    if not leads:
        print("No leads to enrich. Exiting.")
        exit(0)
    
    # 2. Submit batch
    tasks = build_lead_tasks(leads)
    batch_id = submit_batch(tasks, f"nightly_{date.today().isoformat()}")
    
    # 3. Wait and collect
    results = wait_for_batch(batch_id, poll_interval=300)
    
    # 4. Save
    save_enrichment_results(results)
# crontab entry — runs at 2am daily
0 2 * * * /usr/bin/python3 /root/scripts/enrich_leads_batch.py >> /var/log/lead_enrichment.log 2>&1
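
Because the script blocks while polling (potentially for hours), a nightly cron run can overlap with a previous one that hasn't finished. One way to guard against that, assuming `flock` is installed; the lock file path here is a hypothetical choice:

```shell
# crontab entry with a non-blocking lock: skip tonight's run if the last one is still going
0 2 * * * /usr/bin/flock -n /tmp/lead_enrichment.lock /usr/bin/python3 /root/scripts/enrich_leads_batch.py >> /var/log/lead_enrichment.log 2>&1
```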
