How to Set Up Cohere API for Text Classification
Use Cohere models for automated text classification and categorization.
Jay Banlasan
The AI Systems Guy
Cohere's API is purpose-built for enterprise text tasks, and text classification setup is fast to get running. Where GPT-4 and Claude are general-purpose, Cohere's Classify endpoint is specifically designed to sort text into categories with high accuracy and at scale. I use it for routing incoming emails, categorizing support tickets, labeling leads, and flagging content for review. At volume, it is cheaper than running classification prompts through a large general model.
Cohere also offers a free trial tier with enough requests to build and test a complete classification pipeline before you spend anything.
What You Need Before Starting
- A Cohere account at cohere.com
- Python 3.9 or higher
- Cohere Python SDK
- Training examples for your specific categories (at least 5 per category; 15+ is better)
- A clear set of mutually exclusive categories for your use case
Step 1: Get Your API Key
Sign up at cohere.com. Go to the Dashboard and click "API Keys" in the left panel. Copy the default trial key or create a production key.
Add to .env:
COHERE_API_KEY=your-cohere-key-here
Install the SDK:
pip install cohere python-dotenv
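Before creating the client, it pays to fail loudly when the key is missing rather than get a cryptic auth error later. A minimal sketch (require_cohere_key is my own helper name, not part of the Cohere SDK):

```python
import os

def require_cohere_key() -> str:
    """Return COHERE_API_KEY from the environment, failing loudly if absent."""
    key = os.environ.get("COHERE_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "COHERE_API_KEY is not set. Add it to .env and load it with "
            "python-dotenv before creating the client."
        )
    return key
```

Call this once at startup and pass the result to `cohere.Client`.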
Step 2: Use the Generate API for Basic Classification
The simplest approach: send a prompt and ask Cohere to classify:
import os
import cohere
from dotenv import load_dotenv

load_dotenv()
co = cohere.Client(api_key=os.getenv("COHERE_API_KEY"))

def classify_text_prompt(text: str, categories: list[str]) -> str:
    """
    Classify text using a generation prompt.
    Good for quick tests and small volumes.
    """
    categories_str = ", ".join(categories)
    response = co.generate(
        model="command",
        prompt=f"""Classify the following text into exactly one category.

Categories: {categories_str}

Return only the category name, nothing else.

Text: {text}
Category:""",
        max_tokens=20,
        temperature=0.0,
        stop_sequences=["\n"]
    )
    return response.generations[0].text.strip()

# Test
categories = ["billing", "technical_support", "account_management", "general_inquiry"]
test_texts = [
    "I was charged twice this month",
    "The app keeps crashing when I open the reports section",
    "I need to change the email on my account",
    "When did you launch your product?"
]

for text in test_texts:
    result = classify_text_prompt(text, categories)
    print(f"'{text[:50]}' -> {result}")
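One caveat with the generation approach: the model returns free text, so the category can come back with stray whitespace, capitalization, or trailing punctuation. A small normalization helper (my own addition, not part of the Cohere SDK) keeps downstream routing safe:

```python
def normalize_category(raw: str, categories: list[str],
                       fallback: str = "general_inquiry") -> str:
    """Map a model's raw completion onto one of the allowed categories.

    Trims whitespace, lowercases, strips trailing periods, and falls back
    to a default label when the model returns something unexpected.
    """
    cleaned = raw.strip().lower().rstrip(".")
    for category in categories:
        if cleaned == category.lower():
            return category
    return fallback
```

Wrap the return value of classify_text_prompt with this before acting on it.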
Step 3: Use the Classify API for Production Volume
For high-volume classification, the dedicated Classify endpoint is faster and more accurate. It requires training examples:
from cohere import ClassifyExample

def build_email_classifier() -> callable:
    """
    Build a classifier for incoming business emails with training examples.
    Returns a function that classifies new emails.
    """
    # Training examples: at least 5 per category, 15+ is better
    examples = [
        # BILLING examples
        ClassifyExample(text="I was charged twice this month", label="billing"),
        ClassifyExample(text="My invoice shows the wrong amount", label="billing"),
        ClassifyExample(text="How do I update my payment method?", label="billing"),
        ClassifyExample(text="I need a refund for last month's charge", label="billing"),
        ClassifyExample(text="Can I change my plan to annual billing?", label="billing"),
        ClassifyExample(text="My card was declined and I can't update it", label="billing"),
        # TECHNICAL SUPPORT examples
        ClassifyExample(text="The application won't load on my browser", label="technical"),
        ClassifyExample(text="I'm getting an error when I try to save", label="technical"),
        ClassifyExample(text="The API is returning 500 errors", label="technical"),
        ClassifyExample(text="My data exports are corrupted", label="technical"),
        ClassifyExample(text="Integration with Salesforce stopped working", label="technical"),
        ClassifyExample(text="Password reset link is expired", label="technical"),
        # ACCOUNT examples
        ClassifyExample(text="I need to change the email on my account", label="account"),
        ClassifyExample(text="Can I transfer my account to someone else?", label="account"),
        ClassifyExample(text="How do I add a team member?", label="account"),
        ClassifyExample(text="I want to close my account", label="account"),
        ClassifyExample(text="How do I change my username?", label="account"),
        # SALES examples
        ClassifyExample(text="What are your enterprise pricing options?", label="sales"),
        ClassifyExample(text="We have a team of 50 people, what would that cost?", label="sales"),
        ClassifyExample(text="Can we schedule a demo?", label="sales"),
        ClassifyExample(text="Do you offer nonprofit discounts?", label="sales"),
        ClassifyExample(text="What integrations do you support?", label="sales"),
    ]

    def classify(texts: list[str]) -> list[dict]:
        """
        Classify a list of email subjects or body snippets.

        Args:
            texts: List of strings to classify

        Returns:
            List of dicts with 'text', 'label', and 'confidence'
        """
        response = co.classify(
            model="embed-english-v3.0",
            inputs=texts,
            examples=examples
        )
        results = []
        for item in response.classifications:
            results.append({
                "text": item.input,
                "label": item.prediction,
                "confidence": item.confidence,
                # item.labels maps each label name to its score object
                "all_scores": {label: value.confidence
                               for label, value in item.labels.items()}
            })
        return results

    return classify

# Build and use the classifier
classify_email = build_email_classifier()

new_emails = [
    "My subscription renewed but I wanted to cancel",
    "Dashboard not loading in Chrome",
    "We have 200 users, do you have volume pricing?",
    "How do I invite colleagues to my workspace?"
]

results = classify_email(new_emails)
for r in results:
    print(f"[{r['confidence']:.0%} {r['label']}] {r['text'][:60]}")
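The Classify endpoint caps how many inputs a single request can carry (96 per call at the time of writing; check the current docs before relying on that number). For larger batches, a chunking wrapper keeps call sites simple. classify_in_batches is my own helper, not an SDK function:

```python
from typing import Callable

def classify_in_batches(
    texts: list[str],
    classify_fn: Callable[[list[str]], list[dict]],
    batch_size: int = 96,
) -> list[dict]:
    """Split texts into endpoint-sized chunks and concatenate the results."""
    results: list[dict] = []
    for start in range(0, len(texts), batch_size):
        # Each slice stays within the endpoint's per-request input limit
        results.extend(classify_fn(texts[start:start + batch_size]))
    return results
```

Pass the function returned by build_email_classifier as classify_fn.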
Step 4: Route Classified Inputs to Handlers
from typing import Callable

# Define handler functions for each category
def handle_billing(text: str) -> str:
    return f"BILLING TEAM: {text} | Priority: check payment records"

def handle_technical(text: str) -> str:
    return f"TECH SUPPORT: {text} | Priority: check error logs"

def handle_account(text: str) -> str:
    return f"ACCOUNT OPS: {text} | Priority: verify identity first"

def handle_sales(text: str) -> str:
    return f"SALES TEAM: {text} | Priority: respond within 2 hours"

def handle_unknown(text: str) -> str:
    return f"GENERAL INBOX: {text} | Priority: review manually"

HANDLERS: dict[str, Callable] = {
    "billing": handle_billing,
    "technical": handle_technical,
    "account": handle_account,
    "sales": handle_sales
}

CONFIDENCE_THRESHOLD = 0.7

def route_incoming_email(subject: str, body_snippet: str) -> dict:
    """
    Classify and route an incoming email.

    Args:
        subject: Email subject line
        body_snippet: First 200 characters of email body

    Returns:
        Routing decision with handler output
    """
    combined = f"Subject: {subject}. {body_snippet}"
    results = classify_email([combined])
    classification = results[0]

    label = classification["label"]
    confidence = classification["confidence"]

    # If confidence is too low, route to manual review
    if confidence < CONFIDENCE_THRESHOLD:
        label = "unknown"

    handler = HANDLERS.get(label, handle_unknown)
    routing = handler(combined)

    return {
        "original": combined,
        "label": label,
        "confidence": confidence,
        "routing": routing,
        "auto_routed": confidence >= CONFIDENCE_THRESHOLD
    }

# Test the routing system
test_emails = [
    ("Billing question", "Hi, I noticed two charges on my credit card this month"),
    ("App not working", "Since the last update the dashboard refuses to load"),
    ("Enterprise inquiry", "We're a 500-person company evaluating your platform"),
]

for subject, body in test_emails:
    result = route_incoming_email(subject, body)
    status = "AUTO" if result["auto_routed"] else "MANUAL"
    print(f"[{status}] [{result['confidence']:.0%} {result['label']}] {result['routing'][:80]}")
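The 0.7 threshold is a starting point, not a constant of nature. Once you have a batch of classified results, check what fraction would auto-route at different thresholds before committing to one. A quick sketch (auto_route_rate is a hypothetical helper, not part of any SDK):

```python
def auto_route_rate(confidences: list[float], threshold: float) -> float:
    """Fraction of items that would be auto-routed at the given threshold."""
    if not confidences:
        return 0.0
    return sum(c >= threshold for c in confidences) / len(confidences)

# Sweep a few candidate thresholds against observed confidence scores
observed = [0.95, 0.88, 0.72, 0.64, 0.51]
for t in (0.5, 0.7, 0.9):
    print(f"threshold {t:.1f}: {auto_route_rate(observed, t):.0%} auto-routed")
```

Pick the highest threshold that still auto-routes enough volume to be worth automating.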
Step 5: Add a Feedback Loop to Improve Accuracy
import json
from collections import Counter
from datetime import datetime

class ClassifierWithFeedback:
    """
    Classification system that logs corrections and can be improved over time.
    """

    def __init__(self, classifier_fn: callable, log_file: str = "classification_feedback.jsonl"):
        self.classify = classifier_fn
        self.log_file = log_file

    def predict(self, text: str) -> dict:
        results = self.classify([text])
        return results[0]

    def record_correction(self, text: str, predicted: str, correct: str) -> None:
        """Log when a classification was wrong."""
        entry = {
            "timestamp": datetime.now().isoformat(),
            "text": text,
            "predicted": predicted,
            "correct": correct
        }
        with open(self.log_file, "a") as f:
            f.write(json.dumps(entry) + "\n")
        print(f"Correction logged: {predicted} -> {correct}")

    def get_accuracy_report(self) -> dict:
        """Analyze correction logs to find where the model struggles."""
        try:
            with open(self.log_file) as f:
                corrections = [json.loads(line) for line in f]
        except FileNotFoundError:
            return {"total_corrections": 0}

        errors = Counter([(c["predicted"], c["correct"]) for c in corrections])
        return {
            "total_corrections": len(corrections),
            "common_errors": errors.most_common(5)
        }
What to Build Next
- Feed correction logs back into your examples list to improve accuracy over time
- Add a confidence score UI so reviewers know when to trust auto-classification
- Connect the routing system to your CRM or helpdesk via API to close the automation loop
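For the first bullet, the correction log from Step 5 already contains everything needed to grow the examples list. A sketch of the conversion (corrections_to_examples is my own helper name; wrap each returned pair in cohere.ClassifyExample before appending it to the list used by build_email_classifier):

```python
import json

def corrections_to_examples(log_file: str) -> list[tuple[str, str]]:
    """Read the JSONL correction log and return deduplicated (text, label) pairs."""
    pairs: list[tuple[str, str]] = []
    seen: set[tuple[str, str]] = set()
    try:
        with open(log_file) as f:
            for line in f:
                entry = json.loads(line)
                # The corrected label is the ground truth for retraining
                pair = (entry["text"], entry["correct"])
                if pair not in seen:
                    seen.add(pair)
                    pairs.append(pair)
    except FileNotFoundError:
        pass  # no corrections logged yet
    return pairs
```

Rebuild the classifier with the expanded examples list on a schedule, and the model improves exactly where it was getting things wrong.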
Related Reading
- How to Configure Claude for JSON Output Mode - Alternative classification approach for lower volume tasks
- How to Use OpenAI Embeddings for Text Search - Embeddings-based similarity for when categories are fuzzy
- How to Handle AI API Rate Limits Gracefully - Batch classification at scale needs rate limit handling
Want this system built for your business?
Get a free assessment. We will map every system your business needs and show you the ROI.
Get Your Free Assessment