
How to Set Up Application Performance Monitoring

Monitor application performance with APM tools for proactive optimization.

Jay Banlasan

The AI Systems Guy

Application performance monitoring (APM) is something I set up on every production system I manage. Without it, you find out about slow endpoints when clients complain. With it, you catch problems before anyone notices.

I use a lightweight custom APM layer built on Python and SQLite. It tracks response times, error rates, and throughput for every endpoint. No expensive SaaS required.

What You Need Before Starting

- Python 3 with FastAPI installed
- SQLite (ships with Python's standard library)
- Cron access on the server for the alert job
- A Slack webhook URL (or any other alerting endpoint)

Step 1: Create the Metrics Collector

Build a simple module that captures timing data on every request:

import time
import sqlite3
from datetime import datetime

def init_metrics_db(db_path="metrics.db"):
    conn = sqlite3.connect(db_path)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS request_metrics (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            endpoint TEXT,
            method TEXT,
            status_code INTEGER,
            response_time_ms REAL,
            timestamp TEXT
        )
    """)
    conn.commit()
    conn.close()

def log_request(endpoint, method, status_code, response_time_ms, db_path="metrics.db"):
    conn = sqlite3.connect(db_path)
    conn.execute(
        "INSERT INTO request_metrics (endpoint, method, status_code, response_time_ms, timestamp) VALUES (?, ?, ?, ?, ?)",
        # Use a space separator, not the default "T": the stored string must
        # compare correctly against SQLite's datetime('now', ...) format
        # in the queries that come later.
        (endpoint, method, status_code, response_time_ms,
         datetime.utcnow().isoformat(sep=" ", timespec="seconds"))
    )
    conn.commit()
    conn.close()

Step 2: Add Middleware to Your App

For FastAPI, wrap every request with timing logic:

import time

from fastapi import FastAPI, Request
# (plus init_metrics_db and log_request from Step 1)

app = FastAPI()
init_metrics_db()

@app.middleware("http")
async def track_performance(request: Request, call_next):
    start = time.time()
    response = await call_next(request)
    elapsed_ms = (time.time() - start) * 1000
    log_request(
        endpoint=request.url.path,
        method=request.method,
        status_code=response.status_code,
        response_time_ms=round(elapsed_ms, 2)
    )
    return response

Every request now gets timed and logged. The collector itself needs nothing beyond the standard library.
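If your app is not on FastAPI, the same timing pattern works as a plain decorator. A minimal sketch, where the in-memory `records` list stands in for `log_request` and the names are illustrative, not part of the code above:

```python
import time
from functools import wraps

records = []  # stand-in for the SQLite sink

def timed(endpoint):
    """Record elapsed wall-clock time for each call, like the middleware above."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.time()
            result = fn(*args, **kwargs)
            elapsed_ms = (time.time() - start) * 1000
            records.append({"endpoint": endpoint,
                            "response_time_ms": round(elapsed_ms, 2)})
            return result
        return wrapper
    return decorator

@timed("/reports")
def build_report():
    time.sleep(0.01)  # simulate work
    return "ok"

build_report()
```

Same idea, no framework hook required: every call through the decorator lands one timing record.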

Step 3: Build the Query Layer

Create functions that pull actionable insights from your metrics:

def get_slow_endpoints(threshold_ms=500, hours=24, db_path="metrics.db"):
    conn = sqlite3.connect(db_path)
    rows = conn.execute("""
        SELECT endpoint, AVG(response_time_ms) as avg_ms, COUNT(*) as hits
        FROM request_metrics
        WHERE timestamp > datetime('now', ?)
        GROUP BY endpoint
        HAVING avg_ms > ?
        ORDER BY avg_ms DESC
    """, (f"-{hours} hours", threshold_ms)).fetchall()
    conn.close()
    return rows

def get_error_rate(hours=24, db_path="metrics.db"):
    conn = sqlite3.connect(db_path)
    row = conn.execute("""
        SELECT 
            COUNT(CASE WHEN status_code >= 500 THEN 1 END) * 100.0 / COUNT(*) as error_pct
        FROM request_metrics
        WHERE timestamp > datetime('now', ?)
    """, (f"-{hours} hours",)).fetchone()
    conn.close()
    return round(row[0], 2) if row[0] else 0
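The intro promised throughput tracking too, and a matching helper fits the same pattern. A sketch (`get_throughput` is my name, not something defined above); the throwaway temp database at the bottom just gives the sketch a schema to run against:

```python
import os
import sqlite3
import tempfile
from datetime import datetime

def get_throughput(hours=24, db_path="metrics.db"):
    """Average requests per hour over the window, in the style of the other helpers."""
    conn = sqlite3.connect(db_path)
    row = conn.execute("""
        SELECT COUNT(*) FROM request_metrics
        WHERE timestamp > datetime('now', ?)
    """, (f"-{hours} hours",)).fetchone()
    conn.close()
    return round(row[0] / hours, 2)

# Standalone demo: a throwaway DB with the same schema and two recent rows.
db_path = os.path.join(tempfile.mkdtemp(), "metrics.db")
conn = sqlite3.connect(db_path)
conn.execute("""
    CREATE TABLE request_metrics (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        endpoint TEXT, method TEXT, status_code INTEGER,
        response_time_ms REAL, timestamp TEXT
    )
""")
now = datetime.utcnow().isoformat(sep=" ", timespec="seconds")
for _ in range(2):
    conn.execute(
        "INSERT INTO request_metrics (endpoint, method, status_code, response_time_ms, timestamp) "
        "VALUES (?, ?, ?, ?, ?)", ("/api", "GET", 200, 12.0, now))
conn.commit()
conn.close()
print(get_throughput(hours=1, db_path=db_path))  # 2 requests in the last hour -> 2.0
```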

Step 4: Set Up Alerting

Run a cron job every 5 minutes that checks for problems:

import requests
# (import the Step 3 query helpers here when this runs as a standalone script)

def check_and_alert():
    slow = get_slow_endpoints(threshold_ms=1000, hours=1)
    error_rate = get_error_rate(hours=1)
    
    alerts = []
    if slow:
        endpoints = ", ".join([f"{s[0]} ({s[1]:.0f}ms)" for s in slow[:3]])
        alerts.append(f"Slow endpoints: {endpoints}")
    if error_rate > 5:
        alerts.append(f"Error rate: {error_rate}%")
    
    if alerts:
        message = "APM Alert:\n" + "\n".join(alerts)
        requests.post("YOUR_SLACK_WEBHOOK", json={"text": message})

if __name__ == "__main__":
    check_and_alert()

Add to crontab:

*/5 * * * * python3 /path/to/apm_alert.py

Step 5: Add a Summary Endpoint

Expose a simple dashboard endpoint for quick checks:

@app.get("/metrics/summary")
def metrics_summary():
    return {
        "slow_endpoints": get_slow_endpoints(),
        "error_rate_24h": get_error_rate(hours=24),
        "error_rate_1h": get_error_rate(hours=1)
    }

What to Build Next

Add percentile tracking (p95, p99) for response times. That catches tail latency problems that averages hide. You can also pipe these metrics into a Grafana dashboard if you want visual trends over time.
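SQLite has no built-in percentile function, so the simplest route is to pull the sorted times and index into them in Python. A nearest-rank sketch, assuming the same `request_metrics` table; it takes an open connection rather than a path so it can demo against an in-memory database, and a reduced two-column table is enough for the demo:

```python
import sqlite3
from datetime import datetime

def percentile(sorted_values, pct):
    """Nearest-rank percentile on an already-sorted list."""
    if not sorted_values:
        return None
    k = max(0, int(round(pct / 100 * len(sorted_values))) - 1)
    return sorted_values[k]

def get_percentiles(conn, hours=24):
    rows = conn.execute("""
        SELECT response_time_ms FROM request_metrics
        WHERE timestamp > datetime('now', ?)
        ORDER BY response_time_ms
    """, (f"-{hours} hours",)).fetchall()
    times = [r[0] for r in rows]
    return {"p95": percentile(times, 95), "p99": percentile(times, 99)}

# Demo: 100 requests with response times of 1..100 ms.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE request_metrics (response_time_ms REAL, timestamp TEXT)")
now = datetime.utcnow().isoformat(sep=" ", timespec="seconds")
for ms in range(1, 101):
    conn.execute("INSERT INTO request_metrics VALUES (?, ?)", (float(ms), now))
conn.commit()
print(get_percentiles(conn, hours=1))  # {'p95': 95.0, 'p99': 99.0}
```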

Want this system built for your business?

Get a free assessment. We will map every system your business needs and show you the ROI.
