Amazon Bedrock is AWS's fully managed service for building generative AI applications. Instead of training your own models or managing ML infrastructure, Bedrock gives you API access to leading foundation models from Anthropic, Meta, Mistral, Cohere, and Amazon.
This guide covers what Bedrock offers, how pricing works, and how to get started building AI-powered applications.
## What Is AWS Bedrock?
Amazon Bedrock is a serverless service that provides:
- Foundation model access - Claude, Llama, Mistral, and more via API
- Fine-tuning capabilities - Customize models with your data
- RAG support - Connect models to your knowledge bases
- Agents - Build autonomous AI workflows
- Guardrails - Control model outputs for safety
Think of Bedrock as a "model marketplace with infrastructure"—you choose which AI models to use, while AWS handles scaling, security, and availability.
## Why Use Bedrock Instead of Direct APIs?
You could use Anthropic's API directly, so why go through AWS?
| Factor | Direct API | AWS Bedrock |
|---|---|---|
| Billing | Separate vendor | Consolidated AWS billing |
| Data residency | Varies by provider | Stays in your AWS region |
| VPC integration | Requires configuration | PrivateLink available |
| IAM integration | API keys only | Native IAM policies |
| Multiple models | Multiple accounts/APIs | Single API, many models |
| Compliance | Varies | AWS compliance (HIPAA, SOC2, etc.) |
| Model switching | Code changes | Configuration changes |
Key benefit: If you're already on AWS and need enterprise controls (VPC, IAM, compliance), Bedrock simplifies integration significantly.
## Available Foundation Models
Bedrock offers models from multiple providers:
### Text Generation Models
| Provider | Model | Context Window | Best For |
|---|---|---|---|
| Anthropic | Claude 3.5 Sonnet | 200K tokens | Complex reasoning, coding |
| Anthropic | Claude 3 Haiku | 200K tokens | Fast, cost-effective tasks |
| Meta | Llama 3.1 70B | 128K tokens | Open-weight alternative |
| Mistral | Mistral Large | 128K tokens | Multilingual, coding |
| Amazon | Titan Text | 8K tokens | Basic text tasks |
| Cohere | Command R+ | 128K tokens | RAG applications |
### Image Generation Models
| Provider | Model | Use Case |
|---|---|---|
| Stability AI | SDXL 1.0 | High-quality image generation |
| Amazon | Titan Image Generator | Text-to-image, editing |
### Embedding Models
| Provider | Model | Dimensions |
|---|---|---|
| Amazon | Titan Embeddings V2 | 1024 |
| Cohere | Embed | 1024 |
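Embedding models are invoked through the same runtime API as text models. A minimal sketch with Titan Embeddings V2 (the request and response field names follow Titan's `inputText`/`embedding` format):

```python
import json

import boto3

bedrock = boto3.client('bedrock-runtime', region_name='us-east-1')

# Embed a piece of text with Titan Embeddings V2
response = bedrock.invoke_model(
    modelId='amazon.titan-embed-text-v2:0',
    body=json.dumps({'inputText': 'What is AWS Bedrock?'})
)

embedding = json.loads(response['body'].read())['embedding']
print(len(embedding))  # 1024 dimensions by default
```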
## Bedrock Pricing Explained
Bedrock uses token-based pricing—you pay per input and output token processed. Pricing varies significantly by model.
### Text Model Pricing (per 1,000 tokens)
| Model | Input Price | Output Price |
|---|---|---|
| Claude 3.5 Sonnet | $0.003 | $0.015 |
| Claude 3 Haiku | $0.00025 | $0.00125 |
| Llama 3.1 70B | $0.00099 | $0.00099 |
| Mistral Large | $0.004 | $0.012 |
| Titan Text Express | $0.0002 | $0.0006 |
### Example Cost Calculation
Processing a customer support request (500 input tokens, 200 output tokens) with Claude 3.5 Sonnet:

```
Input:  500 / 1000 × $0.003 = $0.0015
Output: 200 / 1000 × $0.015 = $0.003
Total per request: $0.0045

1,000 requests/day = $4.50/day = ~$135/month
```
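This arithmetic is easy to wrap in a small helper when comparing models; a minimal sketch using the per-1,000-token prices from the table above (verify against current AWS pricing):

```python
# Per-1,000-token (input, output) prices from the table above
PRICES = {
    'claude-3-5-sonnet': (0.003, 0.015),
    'claude-3-haiku': (0.00025, 0.00125),
    'llama-3-1-70b': (0.00099, 0.00099),
}

def request_cost(model, input_tokens, output_tokens):
    """Estimated cost in dollars for a single request."""
    input_price, output_price = PRICES[model]
    return input_tokens / 1000 * input_price + output_tokens / 1000 * output_price

# The support-request example: 500 input tokens, 200 output tokens
cost = request_cost('claude-3-5-sonnet', 500, 200)
print(f"${cost:.4f}/request, ~${cost * 1000 * 30:.0f}/month at 1,000 requests/day")
```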
### Provisioned Throughput
For predictable, high-volume workloads, Bedrock offers Provisioned Throughput:
- Commit to model units for 1 or 6 months
- Get guaranteed capacity
- Potentially lower per-token costs at scale
- Prices vary by model and commitment term
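Provisioned capacity is purchased through the `bedrock` control-plane client; a minimal sketch with placeholder names (model availability and commitment terms vary by region):

```python
import boto3

bedrock = boto3.client('bedrock', region_name='us-east-1')

# Name and model ID below are placeholders
response = bedrock.create_provisioned_model_throughput(
    provisionedModelName='support-bot-capacity',
    modelId='anthropic.claude-3-haiku-20240307-v1:0',
    modelUnits=1,
    commitmentDuration='SixMonths'  # or 'OneMonth'; omit for no-commitment hourly
)

print(response['provisionedModelArn'])
```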
## Getting Started: Your First Bedrock Call
### Prerequisites
- AWS account with Bedrock access enabled
- Request model access in the Bedrock console (some models require approval)
- IAM permissions for `bedrock:InvokeModel`
### Python Example with Boto3

```python
import boto3
import json

# Create a Bedrock runtime client
bedrock = boto3.client(
    service_name='bedrock-runtime',
    region_name='us-east-1'
)

# Prepare the request for Claude
body = json.dumps({
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 1024,
    "messages": [
        {
            "role": "user",
            "content": "Explain what AWS Bedrock is in 2 sentences."
        }
    ]
})

# Invoke the model
response = bedrock.invoke_model(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",
    contentType="application/json",
    accept="application/json",
    body=body
)

# Parse the response
response_body = json.loads(response['body'].read())
print(response_body['content'][0]['text'])
```
### Using the Converse API (Recommended)
The Converse API provides a unified interface across all models:

```python
import boto3

bedrock = boto3.client('bedrock-runtime', region_name='us-east-1')

response = bedrock.converse(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",
    messages=[
        {
            "role": "user",
            "content": [{"text": "What is AWS Bedrock?"}]
        }
    ],
    inferenceConfig={
        "maxTokens": 1024,
        "temperature": 0.7
    }
)

print(response['output']['message']['content'][0]['text'])
```
Benefit: Switch models by changing `modelId` without changing code structure.
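For example, the same request can target a different provider's model just by swapping the identifier (a sketch; the Llama model ID is illustrative and availability varies by region):

```python
import boto3

bedrock = boto3.client('bedrock-runtime', region_name='us-east-1')

# Same converse() call shape for every provider; only modelId changes
for model_id in [
    "anthropic.claude-3-sonnet-20240229-v1:0",
    "meta.llama3-1-70b-instruct-v1:0",  # example ID; check your region's model list
]:
    response = bedrock.converse(
        modelId=model_id,
        messages=[{"role": "user", "content": [{"text": "What is AWS Bedrock?"}]}],
        inferenceConfig={"maxTokens": 256}
    )
    print(model_id, '->', response['output']['message']['content'][0]['text'][:100])
```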
## Bedrock Knowledge Bases (RAG)
Knowledge Bases let you connect foundation models to your own data using Retrieval-Augmented Generation (RAG).
### How It Works

```
┌──────────────────────────────────────────────────────────────┐
│                        User Question                         │
└──────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌──────────────────────────────────────────────────────────────┐
│                    Bedrock Knowledge Base                    │
│  1. Convert question to embedding                            │
│  2. Search vector database for relevant chunks               │
│  3. Pass chunks + question to foundation model               │
│  4. Return grounded answer                                   │
└──────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌──────────────────────────────────────────────────────────────┐
│                    Answer with Citations                     │
└──────────────────────────────────────────────────────────────┘
```
### Supported Data Sources
- Amazon S3 (PDF, TXT, MD, HTML, DOC, CSV)
- Web crawlers
- Confluence
- Salesforce
- SharePoint
### Creating a Knowledge Base

```python
import boto3

bedrock_agent = boto3.client('bedrock-agent', region_name='us-east-1')

# Create knowledge base
response = bedrock_agent.create_knowledge_base(
    name='company-docs-kb',
    roleArn='arn:aws:iam::123456789:role/BedrockKBRole',
    knowledgeBaseConfiguration={
        'type': 'VECTOR',
        'vectorKnowledgeBaseConfiguration': {
            'embeddingModelArn': 'arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0'
        }
    },
    storageConfiguration={
        'type': 'OPENSEARCH_SERVERLESS',
        'opensearchServerlessConfiguration': {
            'collectionArn': 'arn:aws:aoss:us-east-1:123456789:collection/abc123',
            'vectorIndexName': 'bedrock-kb-index',
            'fieldMapping': {
                'vectorField': 'embedding',
                'textField': 'text',
                'metadataField': 'metadata'
            }
        }
    }
)
```
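After the knowledge base has ingested a data source, you can query it through the `bedrock-agent-runtime` client. A minimal sketch; the knowledge base ID and question are placeholders:

```python
import boto3

agent_runtime = boto3.client('bedrock-agent-runtime', region_name='us-east-1')

# Knowledge base ID below is a placeholder from your own setup
response = agent_runtime.retrieve_and_generate(
    input={'text': 'What is our refund policy?'},
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': 'KB12345678',
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0'
        }
    }
)

print(response['output']['text'])      # grounded answer
print(response.get('citations', []))   # chunks the answer was grounded in
```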
## Bedrock Agents
Agents enable foundation models to take actions by connecting to external tools and APIs.
### Example: Customer Service Agent
An agent can:
- Look up customer orders in your database
- Check inventory status via API
- Create support tickets
- Send confirmation emails
### Agent Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                         User Input                          │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                        Bedrock Agent                        │
│  ┌───────────────────────────────────────────────────────┐  │
│  │               Foundation Model (Claude)               │  │
│  │          Reasons about what actions to take           │  │
│  └───────────────────────────────────────────────────────┘  │
│                              │                              │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐       │
│  │ Action Group │  │ Action Group │  │  Knowledge   │       │
│  │   (Lambda)   │  │ (API Schema) │  │     Base     │       │
│  └──────────────┘  └──────────────┘  └──────────────┘       │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                       Final Response                        │
└─────────────────────────────────────────────────────────────┘
```
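At runtime, a prepared agent is called through the `bedrock-agent-runtime` client and streams its answer back. A minimal sketch; the agent and alias IDs are placeholders:

```python
import uuid

import boto3

agent_runtime = boto3.client('bedrock-agent-runtime', region_name='us-east-1')

# Agent and alias IDs below are placeholders from your own deployment
response = agent_runtime.invoke_agent(
    agentId='AGENT123456',
    agentAliasId='ALIAS123456',
    sessionId=str(uuid.uuid4()),  # reuse the same ID to continue a conversation
    inputText='Where is order 1042?'
)

# The completion arrives as a stream of chunk events
for event in response['completion']:
    if 'chunk' in event:
        print(event['chunk']['bytes'].decode('utf-8'), end='')
```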
## Guardrails: Control Model Outputs
Guardrails let you define policies to filter harmful content and protect sensitive data.
### Guardrail Capabilities
| Feature | Description |
|---|---|
| Content filters | Block hate, violence, sexual content, etc. |
| Denied topics | Prevent discussion of specific topics |
| Word filters | Block specific words or phrases |
| PII detection | Mask or block personal information |
| Contextual grounding | Reduce hallucinations with source verification |
### Creating a Guardrail

```python
import boto3

bedrock = boto3.client('bedrock', region_name='us-east-1')

response = bedrock.create_guardrail(
    name='customer-support-guardrail',
    description='Guardrail for customer support chatbot',
    contentPolicyConfig={
        'filtersConfig': [
            {'type': 'HATE', 'inputStrength': 'HIGH', 'outputStrength': 'HIGH'},
            {'type': 'VIOLENCE', 'inputStrength': 'HIGH', 'outputStrength': 'HIGH'},
        ]
    },
    sensitiveInformationPolicyConfig={
        'piiEntitiesConfig': [
            {'type': 'CREDIT_DEBIT_CARD_NUMBER', 'action': 'BLOCK'},
            {'type': 'US_SOCIAL_SECURITY_NUMBER', 'action': 'BLOCK'},
        ]
    },
    blockedInputMessaging='I cannot process this request.',
    blockedOutputsMessaging='I cannot provide this information.'
)
```
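The guardrail is then enforced at inference time by passing its identifier and version, for example with the Converse API (the ID below is a placeholder; `DRAFT` targets the working draft version):

```python
import boto3

bedrock_runtime = boto3.client('bedrock-runtime', region_name='us-east-1')

response = bedrock_runtime.converse(
    modelId='anthropic.claude-3-sonnet-20240229-v1:0',
    messages=[{"role": "user", "content": [{"text": "What is your refund policy?"}]}],
    guardrailConfig={
        'guardrailIdentifier': 'your-guardrail-id',  # placeholder
        'guardrailVersion': 'DRAFT'
    }
)

print(response['output']['message']['content'][0]['text'])
```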
## Model Fine-Tuning
Bedrock supports fine-tuning select models with your own data to improve performance on specific tasks.
### Supported Models for Fine-Tuning
- Amazon Titan Text
- Cohere Command
- Meta Llama 2 (select variants)
### Fine-Tuning Process
1. Prepare training data - JSONL format with prompt/completion pairs
2. Upload to S3 - Training data in your bucket
3. Create fine-tuning job - Specify base model and hyperparameters
4. Deploy custom model - Use via Provisioned Throughput
### Training Data Format

```json
{"prompt": "Summarize this support ticket:", "completion": "Customer reports login issue..."}
{"prompt": "Summarize this support ticket:", "completion": "User cannot reset password..."}
```
## Best Practices
### 1. Start with Smaller Models
Use Claude 3 Haiku or Titan Text for development and testing. Move to larger models only when needed for production quality.
### 2. Implement Caching
Cache common responses to reduce costs:
```python
import hashlib

import boto3

bedrock = boto3.client('bedrock-runtime', region_name='us-east-1')

def get_cached_or_call(prompt, cache, model_id="anthropic.claude-3-haiku-20240307-v1:0"):
    # Key the cache on a hash of the prompt text
    cache_key = hashlib.md5(prompt.encode()).hexdigest()
    if cache_key in cache:
        return cache[cache_key]
    response = bedrock.converse(
        modelId=model_id,
        messages=[{"role": "user", "content": [{"text": prompt}]}]
    )
    cache[cache_key] = response
    return response
```
### 3. Use Streaming for Long Responses

```python
# Stream tokens as they are generated instead of waiting for the full reply
response = bedrock.converse_stream(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",
    messages=[{"role": "user", "content": [{"text": "Write a long story"}]}]
)

for event in response['stream']:
    if 'contentBlockDelta' in event:
        print(event['contentBlockDelta']['delta']['text'], end='', flush=True)
```
### 4. Monitor Costs
Set up CloudWatch alarms for Bedrock metrics:
- `Invocations` - Track usage volume
- `InvocationLatency` - Monitor response times
- `InputTokenCount` / `OutputTokenCount` - Watch token consumption
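As a sketch, a boto3 alarm on output token volume might look like this (the alarm name, model ID, and threshold are example values):

```python
import boto3

cloudwatch = boto3.client('cloudwatch', region_name='us-east-1')

# Alarm name, model ID, and threshold are example values; tune to your budget
cloudwatch.put_metric_alarm(
    AlarmName='bedrock-daily-output-tokens',
    Namespace='AWS/Bedrock',
    MetricName='OutputTokenCount',
    Dimensions=[{'Name': 'ModelId', 'Value': 'anthropic.claude-3-sonnet-20240229-v1:0'}],
    Statistic='Sum',
    Period=86400,               # one-day window
    EvaluationPeriods=1,
    Threshold=5000000,          # alert past 5M output tokens/day
    ComparisonOperator='GreaterThanThreshold'
)
```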
### 5. Use VPC Endpoints for Security

```bash
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-123456 \
  --service-name com.amazonaws.us-east-1.bedrock-runtime \
  --vpc-endpoint-type Interface
```
## Bedrock vs OpenAI vs Direct Anthropic
| Factor | AWS Bedrock | OpenAI API | Anthropic Direct |
|---|---|---|---|
| Model variety | Multiple providers | OpenAI only | Claude only |
| AWS integration | Native | Manual | Manual |
| Enterprise compliance | Strong | Developing | Developing |
| Pricing | Comparable | Comparable | Often cheaper |
| Fine-tuning | Limited models | GPT-3.5/4 | Not available |
| Setup complexity | AWS knowledge needed | Simple API key | Simple API key |
## Getting Started Checklist
- Enable Bedrock in your AWS account
- Request model access for the models you need
- Set up IAM permissions for your users/services
- Start with Converse API for model-agnostic code
- Test with Haiku/Titan before using expensive models
- Add Guardrails for production deployments
- Monitor costs with CloudWatch and billing alerts
AWS Bedrock lowers the barrier to building production AI applications while providing enterprise-grade security and compliance. Start experimenting with the playground in the AWS console, then move to the API for production workloads.
