
How to Set Up Perplexity API for Research Automation

Connect Perplexity search-augmented AI for automated research tasks.

Jay Banlasan

The AI Systems Guy

Setting up the Perplexity API for research automation gives you something no standard LLM can do: grounded answers with real-time web access and citations. When I run competitive research, market analysis, or prospect intelligence for clients, I route those tasks through Perplexity. The setup is one of the fastest in this series because Perplexity exposes an OpenAI-compatible endpoint, so you can swap it in with minimal code changes.

The business case is straightforward. Standard GPT-4 or Claude models answer from training data that may be 6 to 12 months stale; Perplexity answers from what is live on the web right now. For anything time-sensitive, market-related, or factual, Perplexity is the right tool.
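That routing decision can be captured in a small dispatcher. This is a hypothetical sketch (the function name and keyword list are my own, not from any SDK): tasks that mention live, market, or factual data go to Perplexity, everything else stays on your default model.

```python
def pick_backend(task: str) -> str:
    """Route time-sensitive or factual tasks to Perplexity, the rest to a standard LLM.

    The keyword list is illustrative; tune it to your own workload.
    """
    live_signals = ("market", "news", "price", "pricing", "latest", "current", "competitor")
    if any(word in task.lower() for word in live_signals):
        return "perplexity"
    return "standard"

print(pick_backend("What is the latest market size for AI automation?"))  # perplexity
print(pick_backend("Rewrite this email to sound friendlier"))             # standard
```

Because both backends speak the same chat-completions format, the only thing this switch changes downstream is which client object you call.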

What You Need Before Starting

You need Python 3.10 or later, the openai and python-dotenv packages (pip install openai python-dotenv), and a Perplexity account with API access enabled.

Step 1: Get Your API Key

Go to perplexity.ai. Click your profile icon and navigate to "API." Click "Generate" to create a new API key. Copy it.

Add to your .env file:

PERPLEXITY_API_KEY=pplx-your-key-here

Perplexity bills per token, similar in shape to OpenAI's pricing, and the sonar models are priced competitively for research tasks. Online models may also carry a per-request fee, so check the current pricing page before scaling up.

Step 2: Make Your First Research Request

Perplexity's API is OpenAI-compatible, so you use the openai SDK with a custom base URL:

import os
from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()

perplexity_client = OpenAI(
    api_key=os.getenv("PERPLEXITY_API_KEY"),
    base_url="https://api.perplexity.ai"
)

response = perplexity_client.chat.completions.create(
    model="llama-3.1-sonar-large-128k-online",  # model names change over time; check Perplexity's docs for current options
    messages=[
        {
            "role": "system",
            "content": "You are a market research analyst. Provide specific, factual information with sources."
        },
        {
            "role": "user",
            "content": "What are the top 3 AI automation tools for small businesses in 2024? Include pricing information."
        }
    ]
)

print(response.choices[0].message.content)

The online suffix in the model name means it has real-time web access; the chat variants (for example, llama-3.1-sonar-large-128k-chat) answer from the model alone, without web search. Perplexity retires and renames models periodically, so check its model documentation for the current list.

Step 3: Extract Citations from Responses

Perplexity responses include source citations. Here is how to extract and display them:

def research_with_citations(query: str, model: str = "llama-3.1-sonar-large-128k-online") -> dict:
    """
    Run a research query and return both the answer and its sources.
    
    Returns:
        Dict with 'answer' and 'citations' keys
    """
    response = perplexity_client.chat.completions.create(
        model=model,
        messages=[
            {
                "role": "system",
                "content": "You are a research assistant. Answer based on current information. Cite your sources."
            },
            {
                "role": "user",
                "content": query
            }
        ]
    )
    
    answer = response.choices[0].message.content
    
    # Extract citations if present in the response object
    citations = []
    if hasattr(response, 'citations'):
        citations = response.citations
    
    return {
        "query": query,
        "answer": answer,
        "citations": citations,
        "model": response.model
    }


# Test it
result = research_with_citations("What is the current market size of AI in business automation?")
print(f"Answer:\n{result['answer']}\n")
if result['citations']:
    print("Sources:")
    for i, cite in enumerate(result['citations'], 1):
        print(f"{i}. {cite}")

Step 4: Build a Competitor Research Automation

This is one of the highest-value uses: automate the research you would do manually in a browser:

from datetime import datetime


def research_competitor(company_name: str, industry: str) -> dict:
    """
    Research a competitor and return structured intelligence.
    """
    
    queries = [
        f"What products and services does {company_name} offer? Include pricing if available.",
        f"What are the main customer complaints about {company_name}? Check recent reviews.",
        f"What is {company_name}'s main marketing message and target audience?",
        f"Has {company_name} made any recent announcements, funding rounds, or product launches in 2024?"
    ]
    
    research = {
        "company": company_name,
        "industry": industry,
        "researched_at": datetime.now().isoformat(),
        "findings": {}
    }
    
    sections = ["products_and_pricing", "customer_complaints", "marketing_positioning", "recent_news"]
    
    for query, section in zip(queries, sections):
        print(f"Researching: {section}...")
        result = research_with_citations(query)
        research["findings"][section] = {
            "summary": result["answer"],
            "sources": result.get("citations", [])
        }
    
    return research


def format_competitor_report(research: dict) -> str:
    """Format competitor research into a readable report."""
    
    report = f"""COMPETITOR INTELLIGENCE REPORT
Company: {research['company']}
Industry: {research['industry']}
Generated: {research['researched_at']}
{'='*60}

"""
    
    section_titles = {
        "products_and_pricing": "Products and Pricing",
        "customer_complaints": "Customer Complaints",
        "marketing_positioning": "Marketing Positioning",
        "recent_news": "Recent News"
    }
    
    for key, title in section_titles.items():
        if key in research["findings"]:
            report += f"## {title}\n\n"
            report += research["findings"][key]["summary"]
            
            sources = research["findings"][key].get("sources", [])
            if sources:
                report += f"\n\nSources: {', '.join(sources[:3])}"
            
            report += "\n\n" + "-"*40 + "\n\n"
    
    return report


# Run it
report_data = research_competitor("HubSpot", "CRM and Marketing Software")
report = format_competitor_report(report_data)
print(report)

# Save to file
with open(f"competitor_research_{report_data['company'].lower().replace(' ', '_')}.txt", "w") as f:
    f.write(report)
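Four back-to-back research calls means four chances for a transient API error to kill the whole run. A small retry wrapper (my own sketch, not part of the openai SDK) makes the loop in research_competitor more resilient; wrap each research_with_citations call in it:

```python
import time

def with_retry(fn, attempts: int = 3, base_delay: float = 2.0):
    """Call fn(), retrying with exponential backoff on any exception.

    Raises the last exception if every attempt fails.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))

# Inside the research loop:
# result = with_retry(lambda: research_with_citations(query))
```

Exponential backoff (2s, 4s, 8s...) also keeps you under rate limits when several research jobs run at once.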

Step 5: Automate Prospect Research

def research_prospect(company_name: str, website: str | None = None) -> dict:
    """
    Research a sales prospect before an outreach or meeting.
    """
    
    # Include the website in the queries when we have it, to disambiguate the company
    context = company_name
    if website:
        context += f" (website: {website})"
    
    questions = {
        "business_overview": f"What does {context} do? What market do they serve? Approximate size?",
        "pain_points": f"What common challenges do companies like {context} in their industry face with operations or marketing?",
        "tech_stack": f"What software tools does {context} appear to use? Check job listings and reviews.",
        "recent_activity": f"What has {context} been doing recently? Hiring, launching, funding, partnerships?"
    }
    
    results = {}
    
    for key, question in questions.items():
        resp = perplexity_client.chat.completions.create(
            model="llama-3.1-sonar-small-128k-online",  # Use smaller model for cost efficiency
            messages=[
                {"role": "system", "content": "Answer concisely in 3-5 sentences. Focus on facts."},
                {"role": "user", "content": question}
            ]
        )
        results[key] = resp.choices[0].message.content
    
    return results
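The raw dict from research_prospect is awkward to paste into a CRM note. A small formatter in the same spirit as format_competitor_report turns it into a readable pre-call brief; this helper is my own addition, not from the original setup:

```python
def format_prospect_brief(company_name: str, results: dict[str, str]) -> str:
    """Turn research_prospect output into a readable pre-call brief."""
    titles = {
        "business_overview": "Business Overview",
        "pain_points": "Likely Pain Points",
        "tech_stack": "Tech Stack Signals",
        "recent_activity": "Recent Activity",
    }
    lines = [f"PROSPECT BRIEF: {company_name}", "=" * 40, ""]
    for key, text in results.items():
        lines.append(f"## {titles.get(key, key)}")  # fall back to the raw key for unknown sections
        lines.append(text)
        lines.append("")
    return "\n".join(lines)

# brief = format_prospect_brief("Acme Corp", research_prospect("Acme Corp", "acme.com"))
```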

Step 6: Schedule Research to Run Automatically

import time
from datetime import datetime

def daily_market_brief(topics: list[str], output_file: str = "daily_brief.md") -> None:
    """
    Generate a daily market research brief on specified topics.
    Run this as a scheduled job.
    """
    
    brief = f"# Daily Market Brief\n{datetime.now().strftime('%Y-%m-%d %H:%M')}\n\n"
    
    for topic in topics:
        print(f"Researching: {topic}")
        
        response = perplexity_client.chat.completions.create(
            model="llama-3.1-sonar-large-128k-online",
            messages=[
                {"role": "system", "content": "Provide a 3-4 sentence current summary with specific facts and numbers where available."},
                {"role": "user", "content": f"What are the latest developments in: {topic}?"}
            ]
        )
        
        brief += f"## {topic}\n\n{response.choices[0].message.content}\n\n---\n\n"
        time.sleep(1)  # Be kind to the API
    
    with open(output_file, "w") as f:
        f.write(brief)
    
    print(f"Brief saved to {output_file}")


# Example usage
daily_market_brief([
    "AI automation for small business",
    "Meta Ads performance trends",
    "Marketing agency industry news"
])
