How to Set Up AWS Bedrock for Enterprise AI
Deploy Claude and other models through AWS Bedrock for enterprise compliance.
Jay Banlasan
The AI Systems Guy
When a client needs AI but their legal team will not sign off on data leaving their AWS environment, AWS Bedrock is the answer. Bedrock gives you access to Claude, Titan, Llama, and Mistral through a managed AWS endpoint, so all inference stays within the customer's cloud account and compliance boundary. I set this up for a financial services client who needed Claude's reasoning but could not send data to Anthropic's API directly.
The tradeoff is setup complexity versus compliance coverage. Direct API calls to Anthropic are faster to build. Bedrock is the right call when you need VPC isolation, CloudTrail audit logs, IAM access control, and data residency in a specific AWS region.
What You Need Before Starting
- An AWS account with Bedrock access enabled (not available in all regions; use `us-east-1` or `us-west-2`)
- AWS CLI configured (`aws configure`) with a user or role that has Bedrock permissions
- Python 3.10+ with `boto3` (`pip install boto3`)
- Model access granted in the Bedrock console (you must explicitly request access per model)
Step 1: Enable Model Access in the Bedrock Console
Bedrock does not give you access to all models by default. You must request it manually.
- Go to AWS Console, navigate to Amazon Bedrock
- Click "Model access" in the left sidebar
- Click "Manage model access"
- Check the boxes for the models you need. For Claude: select Anthropic Claude 3 Haiku, Claude 3 Sonnet, and Claude 3 Opus
- Click "Save changes"
Access is usually granted within a few minutes for Claude models. Some third-party models take longer or require an agreement.
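Once access shows as granted, you can confirm it programmatically. A minimal sketch using the `bedrock` control-plane client (the `list_foundation_models` call is a real boto3 API; the helper names are mine):

```python
def is_claude_3(model_id: str) -> bool:
    """Claude 3 model IDs on Bedrock share the anthropic.claude-3 prefix."""
    return model_id.startswith("anthropic.claude-3")

def list_claude_model_ids(region: str = "us-east-1") -> list[str]:
    """Return the Anthropic model IDs visible to this account in the region."""
    import boto3  # note: this is the "bedrock" control-plane client,
    # not "bedrock-runtime", which is the one used for inference calls
    client = boto3.client("bedrock", region_name=region)
    response = client.list_foundation_models(byProvider="Anthropic")
    return [m["modelId"] for m in response["modelSummaries"]]
```

If a model you requested does not show up in this list, access has not actually been granted yet.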
Step 2: Set Up IAM Permissions
Create a policy that allows your application to call Bedrock. Attach it to the IAM role or user your app runs as.
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": [
        "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0",
        "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-haiku-20240307-v1:0",
        "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-opus-20240229-v1:0"
      ]
    },
    {
      "Effect": "Allow",
      "Action": "bedrock:ListFoundationModels",
      "Resource": "*"
    }
  ]
}
```

Note that `bedrock:ListFoundationModels` is a list action that does not support resource-level permissions, so it needs its own statement with `Resource: "*"`; the invoke actions are the ones you lock down to specific model ARNs.
Lock the resource ARNs to specific models. Do not use * on the invoke actions in production. If your security team requires private connectivity, configure an interface VPC endpoint for com.amazonaws.region.bedrock-runtime so invocation traffic never leaves the AWS network.
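If you go the VPC endpoint route, it can be created from code as well. A sketch using the EC2 `create_vpc_endpoint` API; the function name is mine, and the VPC, subnet, and security group IDs are placeholders for your own network setup:

```python
def bedrock_runtime_service_name(region: str) -> str:
    """Service name for the Bedrock runtime interface endpoint."""
    return f"com.amazonaws.{region}.bedrock-runtime"

def create_bedrock_vpc_endpoint(vpc_id: str, subnet_ids: list[str],
                                security_group_ids: list[str],
                                region: str = "us-east-1") -> str:
    """Create an interface endpoint so Bedrock calls never leave the VPC."""
    import boto3
    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.create_vpc_endpoint(
        VpcEndpointType="Interface",
        VpcId=vpc_id,
        ServiceName=bedrock_runtime_service_name(region),
        SubnetIds=subnet_ids,
        SecurityGroupIds=security_group_ids,
        PrivateDnsEnabled=True,  # lets boto3 resolve the endpoint transparently
    )
    return response["VpcEndpoint"]["VpcEndpointId"]
```

With private DNS enabled, your application code does not change; boto3 resolves the regional Bedrock hostname to the endpoint's private IPs.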
Step 3: Make Your First Bedrock API Call
The Bedrock runtime client uses a different request format than Anthropic's direct SDK. The payload is JSON-encoded and varies by model provider.
```python
import boto3
import json

def call_claude_bedrock(
    prompt: str,
    model_id: str = "anthropic.claude-3-sonnet-20240229-v1:0",
    region: str = "us-east-1",
    max_tokens: int = 1024
) -> str:
    client = boto3.client("bedrock-runtime", region_name=region)

    # Claude on Bedrock uses the Messages API format
    body = {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "messages": [
            {
                "role": "user",
                "content": prompt
            }
        ]
    }

    response = client.invoke_model(
        modelId=model_id,
        body=json.dumps(body),
        contentType="application/json",
        accept="application/json"
    )

    response_body = json.loads(response["body"].read())
    return response_body["content"][0]["text"]

# Test it
result = call_claude_bedrock("Summarize the benefits of VPC isolation for AI inference in two sentences.")
print(result)
```
The anthropic_version field is required for Claude on Bedrock. Leave it exactly as shown.
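The parsed response body also carries token usage, which is worth capturing for cost tracking. A small helper, assuming the Messages API response shape (a content list plus a usage object); the function name is mine:

```python
def parse_claude_response(response_body: dict) -> tuple[str, int, int]:
    """Extract text, input tokens, and output tokens from a parsed response."""
    text = "".join(
        block["text"]
        for block in response_body.get("content", [])
        if block.get("type") == "text"
    )
    usage = response_body.get("usage", {})
    return text, usage.get("input_tokens", 0), usage.get("output_tokens", 0)
```

Joining all text blocks instead of indexing `content[0]` also keeps you safe if a response ever contains more than one content block.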
Step 4: Add a System Prompt
For most business applications you need system prompt support. Here is how to include it:
```python
def call_claude_with_system(
    system_prompt: str,
    user_message: str,
    model_id: str = "anthropic.claude-3-sonnet-20240229-v1:0",
    region: str = "us-east-1",
    max_tokens: int = 1024
) -> str:
    client = boto3.client("bedrock-runtime", region_name=region)

    body = {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "system": system_prompt,
        "messages": [
            {
                "role": "user",
                "content": user_message
            }
        ]
    }

    response = client.invoke_model(
        modelId=model_id,
        body=json.dumps(body),
        contentType="application/json",
        accept="application/json"
    )

    response_body = json.loads(response["body"].read())
    return response_body["content"][0]["text"]
```
Step 5: Enable Streaming for Long Responses
For responses over a few hundred tokens, streaming gives users visible progress instead of a blank wait.
```python
def stream_claude_bedrock(
    prompt: str,
    model_id: str = "anthropic.claude-3-sonnet-20240229-v1:0",
    region: str = "us-east-1"
):
    client = boto3.client("bedrock-runtime", region_name=region)

    body = {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 2048,
        "messages": [{"role": "user", "content": prompt}]
    }

    response = client.invoke_model_with_response_stream(
        modelId=model_id,
        body=json.dumps(body),
        contentType="application/json",
        accept="application/json"
    )

    for event in response["body"]:
        chunk = json.loads(event["chunk"]["bytes"])
        if chunk["type"] == "content_block_delta":
            delta = chunk.get("delta", {})
            if delta.get("type") == "text_delta":
                print(delta["text"], end="", flush=True)

    print()  # newline after stream completes
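The chunk-parsing logic is easy to get subtly wrong, because stop events and metadata chunks arrive on the same stream as text deltas. It helps to isolate it in a helper you can reuse in the loop above or in a generator that yields fragments to a UI; the function name is mine:

```python
def extract_text_delta(chunk: dict) -> str:
    """Return the text fragment from one streamed chunk, or "" for non-text events."""
    if chunk.get("type") == "content_block_delta":
        delta = chunk.get("delta", {})
        if delta.get("type") == "text_delta":
            return delta.get("text", "")
    return ""  # message_start, message_stop, metadata, etc. carry no text
```

A pure function like this is also trivially unit-testable, which the streaming loop itself is not.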
Step 6: Set Up CloudTrail Logging for Compliance
This is the main reason enterprises choose Bedrock. Every model invocation gets logged automatically to CloudTrail if you have it enabled. To make those logs queryable:
```python
import boto3
from datetime import datetime, timedelta, timezone

def check_bedrock_invocations(hours: int = 24):
    cloudtrail = boto3.client("cloudtrail", region_name="us-east-1")

    end_time = datetime.now(timezone.utc)
    start_time = end_time - timedelta(hours=hours)

    response = cloudtrail.lookup_events(
        LookupAttributes=[
            {
                "AttributeKey": "EventSource",
                "AttributeValue": "bedrock.amazonaws.com"
            }
        ],
        StartTime=start_time,
        EndTime=end_time,
        MaxResults=50
    )

    for event in response["Events"]:
        print(f"{event['EventTime']} | {event['EventName']} | {event.get('Username', 'unknown')}")
```
For full prompt/response logging, enable "Model Invocation Logging" in the Bedrock console and point it to an S3 bucket. Be careful here: prompt content will be stored in S3, so apply bucket policies accordingly.
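Invocation logging can also be enabled from code via the control-plane API. A sketch using `put_model_invocation_logging_configuration` (a real boto3 call on the `bedrock` client); double-check the config keys against the current boto3 docs, and the bucket name is yours to supply:

```python
def enable_invocation_logging(bucket_name: str, key_prefix: str = "bedrock-logs/",
                              region: str = "us-east-1") -> None:
    """Turn on full prompt/response logging to S3 for this account and region."""
    import boto3
    client = boto3.client("bedrock", region_name=region)
    client.put_model_invocation_logging_configuration(
        loggingConfig={
            "s3Config": {"bucketName": bucket_name, "keyPrefix": key_prefix},
            "textDataDeliveryEnabled": True,   # prompts and completions
            "imageDataDeliveryEnabled": False,
            "embeddingDataDeliveryEnabled": False,
        }
    )
```

The setting is account-wide per region, so treat it as infrastructure configuration rather than something your application toggles at runtime.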
What to Build Next
- Add a wrapper class that handles model ID selection by capability tier (Haiku for simple tasks, Sonnet for complex, Opus for critical)
- Implement request retry logic with exponential backoff for throttling errors (Bedrock throttles at the account level)
- Set up Bedrock Guardrails for content filtering without managing it in application code
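The retry item from the list above can be sketched with full-jitter exponential backoff. The wrapper takes any zero-argument callable, so it works with either of the invoke functions from earlier; the names and the backoff parameters are mine:

```python
import random
import time

def backoff_delay(attempt: int, base: float = 0.5, cap: float = 8.0) -> float:
    """Full-jitter backoff: a random delay in [0, min(cap, base * 2^attempt)]."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))

def invoke_with_retries(invoke, max_attempts: int = 5):
    """Call invoke() and retry on Bedrock's account-level throttling."""
    from botocore.exceptions import ClientError  # ships with boto3
    for attempt in range(max_attempts):
        try:
            return invoke()
        except ClientError as err:
            code = err.response["Error"]["Code"]
            if code != "ThrottlingException" or attempt == max_attempts - 1:
                raise
            time.sleep(backoff_delay(attempt))
```

Usage would look like `invoke_with_retries(lambda: call_claude_bedrock("..."))`. Jitter matters here: without it, every throttled client retries on the same schedule and keeps colliding.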
Related Reading
- How to Set Up Claude Code for Business Operations
- Setting Up a Data Pipeline for Your Business
- AI in Legal and Compliance
Want this system built for your business?
Get a free assessment. We will map every system your business needs and show you the ROI.
Get Your Free Assessment