Systems Library / Operations & Admin / How to Automate Payroll Data Collection
Operations & Admin hr people

How to Automate Payroll Data Collection

Collect and validate payroll data from timesheets and systems automatically.

Jay Banlasan

Jay Banlasan

The AI Systems Guy

Chasing down timesheets every pay period is a waste of everyone's time. I built a system to automate payroll data collection from timesheets that pulls hours from tracking tools, validates the data, flags discrepancies, and packages it for your payroll provider.

No more spreadsheet wrangling. No more missed entries. The system collects and checks everything before payroll runs.

What You Need Before Starting

Step 1: Pull Timesheet Data from Your Tracking Tool

Using the Toggl API as an example:

import requests
from datetime import datetime, timedelta

def fetch_toggl_entries(api_token, start_date, end_date):
    url = "https://api.track.toggl.com/api/v9/me/time_entries"
    params = {
        "start_date": start_date.isoformat() + "Z",
        "end_date": end_date.isoformat() + "Z"
    }
    response = requests.get(url, auth=(api_token, "api_token"), params=params)
    return response.json()

Step 2: Normalize the Data

Different tracking tools return different formats. Normalize everything:

def normalize_entries(raw_entries):
    normalized = []
    for entry in raw_entries:
        normalized.append({
            "employee_id": entry.get("user_id"),
            "date": entry.get("start", "")[:10],
            "hours": round(entry.get("duration", 0) / 3600, 2),
            "project": entry.get("project_name", "Unassigned"),
            "description": entry.get("description", ""),
            "billable": entry.get("billable", False)
        })
    return normalized

Step 3: Validate Against Business Rules

PAYROLL_RULES = {
    "max_daily_hours": 12,
    "max_weekly_hours": 50,
    "required_days": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"],
    "min_daily_hours": 4
}

def validate_entries(entries, rules):
    issues = []
    by_employee = {}

    for entry in entries:
        emp = entry["employee_id"]
        if emp not in by_employee:
            by_employee[emp] = []
        by_employee[emp].append(entry)

    for emp, emp_entries in by_employee.items():
        daily_totals = {}
        for e in emp_entries:
            daily_totals[e["date"]] = daily_totals.get(e["date"], 0) + e["hours"]

        weekly_total = sum(daily_totals.values())

        for date, hours in daily_totals.items():
            if hours > rules["max_daily_hours"]:
                issues.append({"employee": emp, "issue": f"Over {rules['max_daily_hours']}h on {date}: {hours}h"})
            if hours < rules["min_daily_hours"]:
                issues.append({"employee": emp, "issue": f"Under {rules['min_daily_hours']}h on {date}: {hours}h"})

        if weekly_total > rules["max_weekly_hours"]:
            issues.append({"employee": emp, "issue": f"Weekly total {weekly_total}h exceeds {rules['max_weekly_hours']}h"})

    return issues

Step 4: Package for Payroll Provider

import csv
from pathlib import Path

def export_for_payroll(entries, output_path="payroll_export.csv"):
    by_employee = {}
    for entry in entries:
        emp = entry["employee_id"]
        if emp not in by_employee:
            by_employee[emp] = {"regular": 0, "overtime": 0}
        by_employee[emp]["regular"] += min(entry["hours"], 8)
        by_employee[emp]["overtime"] += max(entry["hours"] - 8, 0)

    with open(output_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Employee ID", "Regular Hours", "Overtime Hours"])
        for emp, hours in by_employee.items():
            writer.writerow([emp, round(hours["regular"], 2), round(hours["overtime"], 2)])

    return output_path

Step 5: Run the Full Pipeline

def run_payroll_collection(api_token, pay_period_start, pay_period_end):
    raw = fetch_toggl_entries(api_token, pay_period_start, pay_period_end)
    entries = normalize_entries(raw)
    issues = validate_entries(entries, PAYROLL_RULES)

    if issues:
        print(f"Found {len(issues)} validation issues:")
        for issue in issues:
            print(f"  - {issue['employee']}: {issue['issue']}")

    export_path = export_for_payroll(entries)
    print(f"Payroll export ready: {export_path}")
    return {"entries": len(entries), "issues": len(issues), "export": export_path}

What to Build Next

Add approval routing so managers sign off on their team's hours before the export runs. Connect the validation issues to Slack so flagged employees get a DM to fix their entries before the deadline.

Related Reading

Want this system built for your business?

Get a free assessment. We will map every system your business needs and show you the ROI.

Get Your Free Assessment

Related Systems