How to Build an AI-Powered Headline Testing System
Test and optimize content headlines using AI scoring and A/B testing.
Jay Banlasan
The AI Systems Guy
Most content teams pick headlines by gut feel or by whoever speaks loudest in the meeting. This AI headline testing and optimization system gives you a scoring framework that runs before you publish, plus a structure for tracking which headline types actually win. I use it to score every headline against seven criteria before it goes live, and I log the results so the system can get smarter over time as real click data comes in.
The ROI is straightforward. A headline that pulls 2x the clicks at the same ranking position doubles the traffic to that page without touching your SEO spend. This system makes headline optimization a process, not a guess.
What You Need Before Starting
- Python 3.10 or higher
- Anthropic API key
- SQLite (built into Python, no install needed)
```bash
pip install anthropic python-dotenv
```
Step 1: Set Up the Scoring Database
You need somewhere to store headlines and their scores so you can track patterns over time:
```python
import sqlite3
import os

def init_db(db_path: str = "headlines.db"):
    """Create the headlines table if it does not already exist."""
    conn = sqlite3.connect(db_path)
    cursor = conn.cursor()
    cursor.execute("""
        CREATE TABLE IF NOT EXISTS headlines (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            original_headline TEXT NOT NULL,
            variant TEXT NOT NULL,
            topic TEXT,
            audience TEXT,
            clarity_score INTEGER,
            curiosity_score INTEGER,
            specificity_score INTEGER,
            benefit_score INTEGER,
            urgency_score INTEGER,
            emotion_score INTEGER,
            keyword_score INTEGER,
            total_score INTEGER,
            ai_notes TEXT,
            actual_ctr REAL,
            created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
        )
    """)
    conn.commit()
    conn.close()
    print(f"Database ready at {db_path}")
```
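Once a few batches are stored, a query like the one below could be one way to start spotting patterns. Treat it as a minimal sketch: `average_scores_by_topic` is a hypothetical helper that is not part of the original script, and it assumes the `headlines` table created by `init_db` above.

```python
import sqlite3

def average_scores_by_topic(db_path: str = "headlines.db") -> list[tuple]:
    """Return (topic, variant_count, avg_total_score, avg_ctr) per topic, best score first."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            """
            SELECT topic,
                   COUNT(*) AS variants,
                   ROUND(AVG(total_score), 1) AS avg_total,
                   AVG(actual_ctr) AS avg_ctr  -- NULL until real CTR data is recorded
            FROM headlines
            GROUP BY topic
            ORDER BY avg_total DESC
            """
        ).fetchall()
    finally:
        conn.close()
    return rows
```

The `avg_ctr` column comes back as `None` for any topic where no `actual_ctr` has been filled in yet, so an empty value there means "no data", not "zero clicks".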
Step 2: Build the AI Scorer
This function sends a headline to Claude and gets back structured scores across seven dimensions:
```python
import os
import json

import anthropic
from dotenv import load_dotenv

load_dotenv()  # Loads ANTHROPIC_API_KEY from a local .env file

client = anthropic.Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))

def score_headline(headline: str, topic: str, audience: str) -> dict:
    prompt = f"""Score this headline across 7 dimensions. Return JSON only, no other text.

HEADLINE: {headline}
TOPIC: {topic}
TARGET AUDIENCE: {audience}

Score each dimension from 1-10:
1. clarity: Does the reader instantly know what the article is about?
2. curiosity: Does it make you want to click without being clickbait?
3. specificity: Does it contain numbers, names, timeframes, or specific claims?
4. benefit: Does it communicate what the reader will gain?
5. urgency: Does it imply timeliness or importance?
6. emotion: Does it trigger a feeling (fear, hope, frustration, excitement)?
7. keyword: Does it naturally contain a searchable phrase?

Also include:
- total: sum of all scores (max 70)
- grade: "A" (60-70), "B" (50-59), "C" (40-49), "D" (below 40)
- top_strength: one sentence on its best quality
- top_weakness: one sentence on its biggest problem
- suggested_fix: rewrite of the headline that scores higher

Return format:
{{
  "clarity": 0,
  "curiosity": 0,
  "specificity": 0,
  "benefit": 0,
  "urgency": 0,
  "emotion": 0,
  "keyword": 0,
  "total": 0,
  "grade": "",
  "top_strength": "",
  "top_weakness": "",
  "suggested_fix": ""
}}"""

    message = client.messages.create(
        model="claude-opus-4-5",
        max_tokens=600,
        messages=[{"role": "user", "content": prompt}]
    )

    # Assumes the reply is bare JSON, as the prompt requests
    raw = message.content[0].text.strip()
    return json.loads(raw)
```
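One thing to watch: the scorer trusts the model to return bare JSON and to add up its own scores correctly. A small parsing step could guard against both. This is a sketch under my own assumptions, not part of the original system: `parse_scores` and `DIMENSIONS` are hypothetical names, and the grade thresholds simply mirror the ones in the prompt.

```python
import json

# The seven dimensions requested in the scoring prompt
DIMENSIONS = ("clarity", "curiosity", "specificity", "benefit",
              "urgency", "emotion", "keyword")

def parse_scores(raw: str) -> dict:
    """Parse a model reply into a score dict with a locally computed total and grade."""
    # Keep only the outermost {...} so code fences or stray prose around it are ignored.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model reply")
    scores = json.loads(raw[start:end + 1])

    # Clamp each dimension to the 1-10 range the prompt asks for.
    for dim in DIMENSIONS:
        scores[dim] = max(1, min(10, int(scores[dim])))

    # Recompute total and grade instead of trusting the model's arithmetic.
    total = sum(scores[dim] for dim in DIMENSIONS)
    scores["total"] = total
    if total >= 60:
        scores["grade"] = "A"
    elif total >= 50:
        scores["grade"] = "B"
    elif total >= 40:
        scores["grade"] = "C"
    else:
        scores["grade"] = "D"
    return scores
```

If you adopt something like this, `score_headline` could end with `return parse_scores(raw)` in place of `return json.loads(raw)`.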
Step 3: Generate Headline Variants
Instead of scoring just one headline, generate a batch of variants and score them all:
```python
def generate_variants(topic: str, audience: str, keyword: str, count: int = 8) -> list:
    prompt = f"""Generate {count} headline variants for this content piece.

TOPIC: {topic}
TARGET AUDIENCE: {audience}
PRIMARY KEYWORD: {keyword}

Use these formats, one each:
1. How to [achieve outcome] in [timeframe]
2. [Number] ways to [solve problem]
3. Why [common belief] is wrong (and what to do instead)
4. The [adjective] guide to [topic]
5. [Specific result]: how [audience] can [replicate it]
6. What [authority] knows about [topic] that you don't
7. Stop [bad habit]. Do this instead.
8. [Number]-step system for [outcome]

Return as a JSON array of strings only."""

    message = client.messages.create(
        model="claude-opus-4-5",
        max_tokens=800,
        messages=[{"role": "user", "content": prompt}]
    )

    raw = message.content[0].text.strip()
    return json.loads(raw)
```
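Every variant costs one scoring call, so it may be worth filtering the generated list before scoring it. The sketch below is one possible approach: `clean_variants` is a hypothetical helper, not something the original script defines, and it assumes case-insensitive matches count as duplicates.

```python
def clean_variants(variants: list, original: str = "") -> list[str]:
    """Drop non-strings, blanks, and case-insensitive duplicates (including the original)."""
    seen = {original.strip().lower()} if original.strip() else set()
    cleaned = []
    for item in variants:
        if not isinstance(item, str):
            continue  # Skip anything the model returned that is not a headline string
        headline = " ".join(item.split())  # Collapse stray whitespace and newlines
        key = headline.lower()
        if headline and key not in seen:
            seen.add(key)
            cleaned.append(headline)
    return cleaned
```

Passing the original headline as `original` keeps a near-identical variant from being scored twice when the original is added back to the list later.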
Step 4: Score All Variants and Store Results
```python
def score_and_store(original: str, topic: str, audience: str, keyword: str, db_path: str = "headlines.db"):
    variants = generate_variants(topic, audience, keyword)
    variants.insert(0, original)  # Always include the original

    results = []
    conn = sqlite3.connect(db_path)
    cursor = conn.cursor()

    try:
        for variant in variants:
            print(f"Scoring: {variant[:60]}...")
            scores = score_headline(variant, topic, audience)

            cursor.execute("""
                INSERT INTO headlines
                (original_headline, variant, topic, audience, clarity_score, curiosity_score,
                 specificity_score, benefit_score, urgency_score, emotion_score, keyword_score,
                 total_score, ai_notes)
                VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
            """, (
                original, variant, topic, audience,
                scores["clarity"], scores["curiosity"], scores["specificity"],
                scores["benefit"], scores["urgency"], scores["emotion"], scores["keyword"],
                scores["total"],
                f"{scores['top_strength']} | Fix: {scores['suggested_fix']}"
            ))

            results.append({"headline": variant, "scores": scores})
    finally:
        # Keep the rows scored so far and release the connection, even if an API call fails
        conn.commit()
        conn.close()

    # Highest total score first
    results.sort(key=lambda x: x["scores"]["total"], reverse=True)
    return results
```
Step 5: Print a Ranked Report
```python
def print_report(results: list):
    print("\n" + "="*60)
    print("HEADLINE SCORING REPORT")
    print("="*60)

    for i, item in enumerate(results, 1):
        s = item["scores"]
        print(f"\n#{i} [{s['grade']}] Score: {s['total']}/70")
        print(f"   {item['headline']}")
        print(f"   Strength: {s['top_strength']}")
        print(f"   Weakness: {s['top_weakness']}")

if __name__ == "__main__":
    init_db()

    results = score_and_store(
        original="Content Brief Generator Tool",
        topic="Building an AI-powered content brief generator for SEO teams",
        audience="Content managers and SEO specialists at agencies",
        keyword="ai content brief generator"
    )

    print_report(results)
    print(f"\nWinner: {results[0]['headline']}")
```
What to Build Next
- Add a feedback loop that records actual CTR from your CMS and updates the actual_ctr field, then trains future scoring on real data
- Build a Slack slash command that lets any team member score a headline on demand without touching the terminal
- Export weekly reports showing which headline formats perform best for your specific audience
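The feedback loop is the piece I would start with. As a rough sketch of its storage half, assuming the `headlines` table from Step 1: `record_ctr` and `scores_vs_ctr` are hypothetical names, and how you pull impressions and clicks out of your CMS or analytics tool is left out entirely.

```python
import sqlite3

def record_ctr(headline_id: int, impressions: int, clicks: int,
               db_path: str = "headlines.db") -> float:
    """Store the observed click-through rate for one stored variant and return it."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    ctr = clicks / impressions
    conn = sqlite3.connect(db_path)
    try:
        cur = conn.execute(
            "UPDATE headlines SET actual_ctr = ? WHERE id = ?", (ctr, headline_id)
        )
        if cur.rowcount == 0:
            raise LookupError(f"no headline with id {headline_id}")
        conn.commit()
    finally:
        conn.close()
    return ctr

def scores_vs_ctr(db_path: str = "headlines.db") -> list[tuple]:
    """Return (variant, total_score, actual_ctr) for rows with real data, best CTR first."""
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute(
            """
            SELECT variant, total_score, actual_ctr
            FROM headlines
            WHERE actual_ctr IS NOT NULL
            ORDER BY actual_ctr DESC
            """
        ).fetchall()
    finally:
        conn.close()
```

Reading the output of `scores_vs_ctr` side by side is a quick sanity check: if high AI scores keep landing next to low CTRs, the scoring prompt needs adjusting before you lean on it.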
Related Reading
- How to Build an AI Blog Post Generator - Once you have the best headline, generate the full article
- How to Build an AI Product Description Generator - Apply the same scoring logic to product copy hooks
- How to Create an AI-Powered FAQ Generator - Optimize FAQ question phrasing using the same scoring system