
AWS Bedrock: Getting Started with Generative AI on AWS

Learn what AWS Bedrock is, how pricing works, which foundation models are available, and how to build your first generative AI application. Complete guide with code examples.

By InventiveHQ Team

Amazon Bedrock is AWS's fully managed service for building generative AI applications. Instead of training your own models or managing ML infrastructure, Bedrock gives you API access to leading foundation models from Anthropic, Meta, Mistral, Cohere, and Amazon.

This guide covers what Bedrock offers, how pricing works, and how to get started building AI-powered applications.

What Is AWS Bedrock?

Amazon Bedrock is a serverless service that provides:

  • Foundation model access - Claude, Llama, Mistral, and more via API
  • Fine-tuning capabilities - Customize models with your data
  • RAG support - Connect models to your knowledge bases
  • Agents - Build autonomous AI workflows
  • Guardrails - Control model outputs for safety

Think of Bedrock as a "model marketplace with infrastructure": you choose which AI models to use, and AWS handles scaling, security, and availability.

Why Use Bedrock Instead of Direct APIs?

You could use Anthropic's API directly, so why go through AWS?

| Factor | Direct API | AWS Bedrock |
|---|---|---|
| Billing | Separate vendor | Consolidated AWS billing |
| Data residency | Varies by provider | Stays in your AWS region |
| VPC integration | Requires configuration | PrivateLink available |
| IAM integration | API keys only | Native IAM policies |
| Multiple models | Multiple accounts/APIs | Single API, many models |
| Compliance | Varies | AWS compliance (HIPAA, SOC 2, etc.) |
| Model switching | Code changes | Configuration changes |

Key benefit: If you're already on AWS and need enterprise controls (VPC, IAM, compliance), Bedrock simplifies integration significantly.

Available Foundation Models

Bedrock offers models from multiple providers:

Text Generation Models

| Provider | Model | Context Window | Best For |
|---|---|---|---|
| Anthropic | Claude 3.5 Sonnet | 200K tokens | Complex reasoning, coding |
| Anthropic | Claude 3 Haiku | 200K tokens | Fast, cost-effective tasks |
| Meta | Llama 3.1 70B | 128K tokens | Open-weight alternative |
| Mistral | Mistral Large | 128K tokens | Multilingual, coding |
| Amazon | Titan Text | 8K tokens | Basic text tasks |
| Cohere | Command R+ | 128K tokens | RAG applications |

Image Generation Models

| Provider | Model | Use Case |
|---|---|---|
| Stability AI | SDXL 1.0 | High-quality image generation |
| Amazon | Titan Image Generator | Text-to-image, editing |

Embedding Models

| Provider | Model | Dimensions |
|---|---|---|
| Amazon | Titan Embeddings V2 | 1024 |
| Cohere | Embed | 1024 |

Bedrock Pricing Explained

Bedrock uses token-based pricing—you pay per input and output token processed. Pricing varies significantly by model.

Text Model Pricing (per 1,000 tokens)

| Model | Input Price | Output Price |
|---|---|---|
| Claude 3.5 Sonnet | $0.003 | $0.015 |
| Claude 3 Haiku | $0.00025 | $0.00125 |
| Llama 3.1 70B | $0.00099 | $0.00099 |
| Mistral Large | $0.004 | $0.012 |
| Titan Text Express | $0.0002 | $0.0006 |

Example Cost Calculation

Processing a customer support request (500 input tokens, 200 output tokens) with Claude 3.5 Sonnet:

Input: 500 / 1000 × $0.003 = $0.0015
Output: 200 / 1000 × $0.015 = $0.003
Total per request: $0.0045

1,000 requests/day = $4.50/day = ~$135/month
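
The same arithmetic as a small helper, if you want to budget before committing (the rates are the per-1,000-token prices from the table above; verify against the current Bedrock pricing page):

def estimate_cost(input_tokens, output_tokens, input_price, output_price):
    # Prices are quoted per 1,000 tokens
    return (input_tokens / 1000) * input_price + (output_tokens / 1000) * output_price

# Claude 3.5 Sonnet: 500 input tokens, 200 output tokens
per_request = estimate_cost(500, 200, 0.003, 0.015)
print(f"${per_request:.4f} per request")             # $0.0045
print(f"~${per_request * 1000 * 30:.2f} per month")  # ~$135.00 at 1,000 requests/day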

Provisioned Throughput

For predictable, high-volume workloads, Bedrock offers Provisioned Throughput:

  • Commit to model units for 1 or 6 months
  • Get guaranteed capacity
  • Potentially lower per-token costs at scale
  • Prices vary by model and commitment term

Getting Started: Your First Bedrock Call

Prerequisites

  1. AWS account with Bedrock access enabled
  2. Request model access in the Bedrock console (some models require approval)
  3. IAM permissions for bedrock:InvokeModel (a minimal policy sketch follows this list)
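
A minimal identity policy, expressed as a Python dict for consistency with the rest of the examples. The two action names are real Bedrock IAM actions; in production, scope Resource to specific model ARNs rather than "*":

import json

# Minimal policy sketch for invoking Bedrock models
bedrock_invoke_policy = json.dumps({
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": [
            "bedrock:InvokeModel",
            "bedrock:InvokeModelWithResponseStream"
        ],
        "Resource": "*"  # narrow to specific model ARNs in production
    }]
})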

Python Example with Boto3

import boto3
import json

# Create Bedrock runtime client
bedrock = boto3.client(
    service_name='bedrock-runtime',
    region_name='us-east-1'
)

# Prepare the request for Claude
body = json.dumps({
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 1024,
    "messages": [
        {
            "role": "user",
            "content": "Explain what AWS Bedrock is in 2 sentences."
        }
    ]
})

# Invoke the model
response = bedrock.invoke_model(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",
    contentType="application/json",
    accept="application/json",
    body=body
)

# Parse response
response_body = json.loads(response['body'].read())
print(response_body['content'][0]['text'])

Using the Converse API

The Converse API provides a unified interface across all models:

import boto3

bedrock = boto3.client('bedrock-runtime', region_name='us-east-1')

response = bedrock.converse(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",
    messages=[
        {
            "role": "user",
            "content": [{"text": "What is AWS Bedrock?"}]
        }
    ],
    inferenceConfig={
        "maxTokens": 1024,
        "temperature": 0.7
    }
)

print(response['output']['message']['content'][0]['text'])

Benefit: Switch models by changing modelId without changing code structure.
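
For example, a thin wrapper (reusing the client from the snippet above) makes the model a parameter. The Llama identifier below is, to our knowledge, the Bedrock model ID for Llama 3.1 70B Instruct; confirm availability in your region's model catalog:

def ask(model_id, prompt):
    # Same request shape for every Converse-compatible model
    response = bedrock.converse(
        modelId=model_id,
        messages=[{"role": "user", "content": [{"text": prompt}]}]
    )
    return response['output']['message']['content'][0]['text']

print(ask("anthropic.claude-3-sonnet-20240229-v1:0", "What is AWS Bedrock?"))
print(ask("meta.llama3-1-70b-instruct-v1:0", "What is AWS Bedrock?"))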

Bedrock Knowledge Bases (RAG)

Knowledge Bases let you connect foundation models to your own data using Retrieval-Augmented Generation (RAG).

How It Works

┌──────────────────────────────────────────────────────────────┐
│                      User Question                            │
└──────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌──────────────────────────────────────────────────────────────┐
│                    Bedrock Knowledge Base                     │
│  1. Convert question to embedding                             │
│  2. Search vector database for relevant chunks                │
│  3. Pass chunks + question to foundation model                │
│  4. Return grounded answer                                    │
└──────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌──────────────────────────────────────────────────────────────┐
│                   Answer with Citations                       │
└──────────────────────────────────────────────────────────────┘

Supported Data Sources

  • Amazon S3 (PDF, TXT, MD, HTML, DOC, CSV)
  • Web crawlers
  • Confluence
  • Salesforce
  • SharePoint

Creating a Knowledge Base

import boto3

bedrock_agent = boto3.client('bedrock-agent', region_name='us-east-1')

# Create knowledge base
response = bedrock_agent.create_knowledge_base(
    name='company-docs-kb',
    roleArn='arn:aws:iam::123456789012:role/BedrockKBRole',
    knowledgeBaseConfiguration={
        'type': 'VECTOR',
        'vectorKnowledgeBaseConfiguration': {
            'embeddingModelArn': 'arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0'
        }
    },
    storageConfiguration={
        'type': 'OPENSEARCH_SERVERLESS',
        'opensearchServerlessConfiguration': {
            'collectionArn': 'arn:aws:aoss:us-east-1:123456789012:collection/abc123',
            'vectorIndexName': 'bedrock-kb-index',
            'fieldMapping': {
                'vectorField': 'embedding',
                'textField': 'text',
                'metadataField': 'metadata'
            }
        }
    }
)
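
Once a data source is attached and synced, you query the knowledge base through the bedrock-agent-runtime client. A minimal sketch, assuming a hypothetical knowledge base ID (KB123EXAMPLE):

import boto3

agent_runtime = boto3.client('bedrock-agent-runtime', region_name='us-east-1')

# Retrieve relevant chunks and generate a grounded answer in one call
response = agent_runtime.retrieve_and_generate(
    input={'text': 'What is our refund policy?'},
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': 'KB123EXAMPLE',  # hypothetical ID
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0'
        }
    }
)

print(response['output']['text'])  # grounded answer
print(response['citations'])      # source attributions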

Bedrock Agents

Agents enable foundation models to take actions by connecting to external tools and APIs.

Example: Customer Service Agent

An agent can:

  1. Look up customer orders in your database
  2. Check inventory status via API
  3. Create support tickets
  4. Send confirmation emails

Agent Architecture

┌─────────────────────────────────────────────────────────────┐
│                        User Input                            │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                      Bedrock Agent                           │
│  ┌─────────────────────────────────────────────────────┐    │
│  │              Foundation Model (Claude)               │    │
│  │         Reasons about what actions to take           │    │
│  └─────────────────────────────────────────────────────┘    │
│                              │                               │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐       │
│  │ Action Group │  │ Action Group │  │ Knowledge    │       │
│  │ (Lambda)     │  │ (API Schema) │  │ Base         │       │
│  └──────────────┘  └──────────────┘  └──────────────┘       │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                    Final Response                            │
└─────────────────────────────────────────────────────────────┘
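
Invoking a deployed agent also goes through the bedrock-agent-runtime client. A sketch with hypothetical agent and alias IDs; the completion comes back as an event stream:

import boto3

agent_runtime = boto3.client('bedrock-agent-runtime', region_name='us-east-1')

response = agent_runtime.invoke_agent(
    agentId='AGENT123',        # hypothetical agent ID
    agentAliasId='ALIAS123',   # hypothetical alias ID
    sessionId='session-001',   # reuse to preserve conversation context
    inputText='Where is order #1042?'
)

# The agent's answer streams back in chunks
for event in response['completion']:
    if 'chunk' in event:
        print(event['chunk']['bytes'].decode(), end='')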

Guardrails: Control Model Outputs

Guardrails let you define policies to filter harmful content and protect sensitive data.

Guardrail Capabilities

| Feature | Description |
|---|---|
| Content filters | Block hate, violence, sexual content, etc. |
| Denied topics | Prevent discussion of specific topics |
| Word filters | Block specific words or phrases |
| PII detection | Mask or block personal information |
| Contextual grounding | Reduce hallucinations with source verification |

Creating a Guardrail

import boto3

bedrock = boto3.client('bedrock', region_name='us-east-1')

response = bedrock.create_guardrail(
    name='customer-support-guardrail',
    description='Guardrail for customer support chatbot',
    contentPolicyConfig={
        'filtersConfig': [
            {'type': 'HATE', 'inputStrength': 'HIGH', 'outputStrength': 'HIGH'},
            {'type': 'VIOLENCE', 'inputStrength': 'HIGH', 'outputStrength': 'HIGH'},
        ]
    },
    sensitiveInformationPolicyConfig={
        'piiEntitiesConfig': [
            {'type': 'CREDIT_DEBIT_CARD_NUMBER', 'action': 'BLOCK'},
            {'type': 'US_SOCIAL_SECURITY_NUMBER', 'action': 'BLOCK'},
        ]
    },
    blockedInputMessaging='I cannot process this request.',
    blockedOutputsMessaging='I cannot provide this information.'
)
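
Once created and versioned, a guardrail is applied at inference time. With the Converse API this is a single extra parameter; the identifier and version below are hypothetical:

import boto3

bedrock_runtime = boto3.client('bedrock-runtime', region_name='us-east-1')

response = bedrock_runtime.converse(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",
    messages=[{"role": "user", "content": [{"text": "My SSN is 123-45-6789"}]}],
    guardrailConfig={
        'guardrailIdentifier': 'gr-abc123',  # hypothetical guardrail ID
        'guardrailVersion': '1'
    }
)

# Blocked requests return the blockedInputMessaging text instead
print(response['output']['message']['content'][0]['text'])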

Model Fine-Tuning

Bedrock supports fine-tuning select models with your own data to improve performance on specific tasks.

Supported Models for Fine-Tuning

  • Amazon Titan Text
  • Cohere Command
  • Meta Llama 2 (select variants)

Fine-Tuning Process

  1. Prepare training data - JSONL format with prompt/completion pairs
  2. Upload to S3 - Training data in your bucket
  3. Create fine-tuning job - Specify base model and hyperparameters
  4. Deploy custom model - Use via Provisioned Throughput

Training Data Format

{"prompt": "Summarize this support ticket:", "completion": "Customer reports login issue..."}
{"prompt": "Summarize this support ticket:", "completion": "User cannot reset password..."}

Best Practices

1. Start with Smaller Models

Use Claude 3 Haiku or Titan Text for development and testing. Move to larger models only when needed for production quality.

2. Implement Caching

Cache common responses to reduce costs:

import hashlib
import boto3

bedrock = boto3.client('bedrock-runtime', region_name='us-east-1')

def get_cached_or_call(prompt, cache, model_id="anthropic.claude-3-sonnet-20240229-v1:0"):
    # Key the cache on a hash of the prompt text
    cache_key = hashlib.md5(prompt.encode()).hexdigest()
    if cache_key in cache:
        return cache[cache_key]

    response = bedrock.converse(
        modelId=model_id,
        messages=[{"role": "user", "content": [{"text": prompt}]}]
    )
    text = response['output']['message']['content'][0]['text']
    cache[cache_key] = text
    return text

3. Use Streaming for Long Responses

response = bedrock.converse_stream(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",
    messages=[{"role": "user", "content": [{"text": "Write a long story"}]}]
)

for event in response['stream']:
    if 'contentBlockDelta' in event:
        print(event['contentBlockDelta']['delta']['text'], end='')

4. Monitor Costs

Set up CloudWatch alarms for Bedrock metrics:

  • Invocations - Track usage volume
  • InvocationLatency - Monitor response times
  • InputTokenCount / OutputTokenCount - Watch token consumption
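
For example, an alarm on daily invocation volume (the alarm name and threshold are illustrative; attach an SNS action to actually get notified):

import boto3

cloudwatch = boto3.client('cloudwatch', region_name='us-east-1')

cloudwatch.put_metric_alarm(
    AlarmName='bedrock-high-invocations',  # illustrative name
    Namespace='AWS/Bedrock',
    MetricName='Invocations',
    Dimensions=[{'Name': 'ModelId', 'Value': 'anthropic.claude-3-sonnet-20240229-v1:0'}],
    Statistic='Sum',
    Period=86400,        # one day
    EvaluationPeriods=1,
    Threshold=10000,     # illustrative threshold
    ComparisonOperator='GreaterThanThreshold'
)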

5. Use VPC Endpoints for Security

aws ec2 create-vpc-endpoint \
  --vpc-id vpc-123456 \
  --service-name com.amazonaws.us-east-1.bedrock-runtime \
  --vpc-endpoint-type Interface

Bedrock vs OpenAI vs Direct Anthropic

| Factor | AWS Bedrock | OpenAI API | Anthropic Direct |
|---|---|---|---|
| Model variety | Multiple providers | OpenAI only | Claude only |
| AWS integration | Native | Manual | Manual |
| Enterprise compliance | Strong | Developing | Developing |
| Pricing | Comparable | Comparable | Often cheaper |
| Fine-tuning | Limited models | GPT-3.5/4 | Not available |
| Setup complexity | AWS knowledge needed | Simple API key | Simple API key |

Getting Started Checklist

  1. Enable Bedrock in your AWS account
  2. Request model access for the models you need
  3. Set up IAM permissions for your users/services
  4. Start with Converse API for model-agnostic code
  5. Test with Haiku/Titan before using expensive models
  6. Add Guardrails for production deployments
  7. Monitor costs with CloudWatch and billing alerts

AWS Bedrock lowers the barrier to building production AI applications while providing enterprise-grade security and compliance. Start experimenting with the playground in the AWS console, then move to the API for production workloads.

Is your cloud secure? Find out free.

Get a complimentary cloud security review. We'll identify misconfigurations, excess costs, and security gaps across AWS, GCP, or Azure.