
AWS S3 Glacier Backup Guide: Long-Term Archive and Compliance

Complete guide to AWS S3 Glacier for backups and archives. Learn Glacier tiers, retrieval options, compliance features, and cost optimization for long-term data retention.

By Inventive HQ Team

Every organization accumulates data that must be retained but rarely accessed: compliance records, old backups, legal holds, and historical archives. Storing this in standard S3 wastes money. AWS S3 Glacier provides archive storage at a fraction of the cost—as low as $1 per TB per month—while maintaining the same 11 nines durability as standard S3.

Understanding Glacier Tiers

AWS offers three Glacier storage classes, each balancing cost against retrieval speed:

| Tier | Storage Cost | Retrieval Time | Min Duration | Best For |
|------|--------------|----------------|--------------|----------|
| Glacier Instant Retrieval | $0.004/GB | Milliseconds | 90 days | Quarterly-access archives |
| Glacier Flexible Retrieval | $0.0036/GB | 1 min - 12 hours | 90 days | Annual-access archives |
| Glacier Deep Archive | $0.00099/GB | 12-48 hours | 180 days | 7+ year compliance |

All tiers provide 99.999999999% durability across multiple Availability Zones.
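
Objects report their class in their metadata, so you can confirm where something is stored with head-object (the bucket and key here are placeholders):

# Check an object's storage class
# Returns e.g. "DEEP_ARCHIVE"; the field is omitted for STANDARD objects
aws s3api head-object \
  --bucket my-archive \
  --key backups/db.tar.gz \
  --query 'StorageClass'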

Glacier Instant Retrieval

Despite the "Glacier" name, this tier provides millisecond access, the same as S3 Standard. The trade-off is a higher retrieval fee ($0.03/GB) and a 90-day minimum storage duration.

Ideal for:

  • Medical imaging archives (accessed for patient visits)
  • Media libraries (occasional streaming)
  • Quarterly financial reports
  • Data accessed 1-4 times per year

# Upload directly to Glacier Instant Retrieval
aws s3 cp patient-scan.dcm s3://medical-archive/ --storage-class GLACIER_IR

Glacier Flexible Retrieval

The classic Glacier tier with configurable retrieval speeds. Choose between fast/expensive or slow/cheap based on urgency.

Retrieval options:

| Option | Time | Cost per GB | Use Case |
|--------|------|-------------|----------|
| Expedited | 1-5 minutes | $0.03 | Urgent requests |
| Standard | 3-5 hours | $0.01 | Normal operations |
| Bulk | 5-12 hours | $0.0025 | Large batch restores |

# Upload to Glacier Flexible
aws s3 cp annual-backup.tar.gz s3://backups/ --storage-class GLACIER

Glacier Deep Archive

The lowest cost storage in AWS, designed for data retained 7-10+ years with rare access. Retrieval takes 12-48 hours.

Ideal for:

  • Regulatory compliance (HIPAA, SOX, SEC Rule 17a-4)
  • Legal hold documents
  • Historical records
  • Tape archive replacement

# Upload to Deep Archive
aws s3 cp legal-documents-2019.tar s3://compliance/ --storage-class DEEP_ARCHIVE

Setting Up Glacier Backups

Direct Upload

Upload directly to Glacier when you know data won't be accessed soon:

# Single file
aws s3 cp database-backup.sql.gz s3://backups/db/ --storage-class GLACIER

# Directory with Deep Archive
aws s3 sync ./old-logs/ s3://archive/logs/2024/ \
  --storage-class DEEP_ARCHIVE \
  --exclude "*.tmp"

Lifecycle Policy Transitions

The recommended approach: store in Standard initially, then transition automatically:

{
  "Rules": [
    {
      "ID": "BackupLifecycle",
      "Status": "Enabled",
      "Filter": {
        "Prefix": "backups/"
      },
      "Transitions": [
        {
          "Days": 30,
          "StorageClass": "STANDARD_IA"
        },
        {
          "Days": 90,
          "StorageClass": "GLACIER"
        },
        {
          "Days": 365,
          "StorageClass": "DEEP_ARCHIVE"
        }
      ],
      "Expiration": {
        "Days": 2555
      }
    }
  ]
}

Apply the policy:

aws s3api put-bucket-lifecycle-configuration \
  --bucket my-backup-bucket \
  --lifecycle-configuration file://backup-lifecycle.json
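
To verify the rules registered as expected, read the configuration back:

aws s3api get-bucket-lifecycle-configuration \
  --bucket my-backup-bucket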

Cross-Region Replication to Glacier

Replicate backups to another region for disaster recovery, storing the replica in Glacier:

{
  "Role": "arn:aws:iam::123456789012:role/replication-role",
  "Rules": [
    {
      "ID": "DR-Replication",
      "Status": "Enabled",
      "Priority": 1,
      "Filter": {"Prefix": "critical/"},
      "Destination": {
        "Bucket": "arn:aws:s3:::dr-bucket-us-west-2",
        "StorageClass": "GLACIER"
      },
      "DeleteMarkerReplication": {"Status": "Disabled"}
    }
  ]
}
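
A sketch of applying it, assuming the JSON above is saved as replication.json and that versioning is already enabled on both buckets (replication requires it); the bucket name is a placeholder:

aws s3api put-bucket-replication \
  --bucket my-source-bucket \
  --replication-configuration file://replication.json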

Cost Calculation

Storage Costs

For 10 TB retained for 7 years:

| Storage Class | Monthly | 7-Year Total |
|---------------|---------|--------------|
| S3 Standard | $235 | $19,740 |
| S3 Standard-IA | $128 | $10,752 |
| Glacier Flexible | $37 | $3,108 |
| Deep Archive | $10 | $840 |

Deep Archive saves $18,900 (95%) compared to Standard over 7 years.
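
These figures are simple unit math; a quick sketch of the monthly calculation, using the published per-GB prices ($0.023 for Standard, $0.00099 for Deep Archive):

# Monthly cost = size in GB × per-GB price (10 TB ≈ 10,240 GB)
awk 'BEGIN {
  gb = 10240
  printf "S3 Standard:  $%.2f/month\n", gb * 0.023
  printf "Deep Archive: $%.2f/month\n", gb * 0.00099
}'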

Retrieval Cost Scenarios

Restoring 1 TB from each tier:

| Tier | Bulk | Standard | Expedited |
|------|------|----------|-----------|
| Glacier Flexible | $2.50 | $10 | $30 |
| Deep Archive | $2.50 | $20 | N/A |

Best practice: Use Bulk retrieval unless urgent. The cost difference compounds with large restores.

Minimum Duration Penalties

Objects deleted before minimum duration incur pro-rated charges:

Deep Archive object (180-day minimum):
- Stored: 60 days, then deleted on day 60
- Charged: 120 additional days (the remainder of the 180-day minimum)
- Penalty: $0.00099/GB-month × (120 ÷ 30) months = $0.00396/GB

Retrieving from Glacier

Initiate Restore

Glacier objects aren't directly accessible. You must initiate a restore, which creates a temporary copy in Standard storage:

# Restore with Standard retrieval (3-5 hours)
aws s3api restore-object \
  --bucket my-bucket \
  --key archived/backup-2023.tar.gz \
  --restore-request '{"Days":7,"GlacierJobParameters":{"Tier":"Standard"}}'

# Restore with Bulk retrieval (5-12 hours, cheaper)
aws s3api restore-object \
  --bucket my-bucket \
  --key archived/backup-2023.tar.gz \
  --restore-request '{"Days":7,"GlacierJobParameters":{"Tier":"Bulk"}}'

Check Restore Status

# Check if restore is complete
aws s3api head-object \
  --bucket my-bucket \
  --key archived/backup-2023.tar.gz

# Response includes:
# "Restore": "ongoing-request=\"false\", expiry-date=\"Sun, 20 Jan 2026 00:00:00 GMT\""

Download After Restore

Once restored, download normally:

# Wait for restore to complete
while true; do
  STATUS=$(aws s3api head-object --bucket my-bucket --key archived/backup.tar.gz \
    --query 'Restore' --output text 2>/dev/null)
  if [[ "$STATUS" == *"ongoing-request=\"false\""* ]]; then
    echo "Restore complete"
    break
  fi
  echo "Waiting for restore..."
  sleep 300
done

# Download
aws s3 cp s3://my-bucket/archived/backup.tar.gz ./

Batch Restore Script

Restore multiple objects efficiently:

#!/bin/bash
BUCKET="my-archive-bucket"
PREFIX="backups/2023/"
TIER="Bulk"
DAYS=7

# List all objects under prefix
aws s3api list-objects-v2 \
  --bucket "$BUCKET" \
  --prefix "$PREFIX" \
  --query 'Contents[].Key' \
  --output text | tr '\t' '\n' | while IFS= read -r KEY; do

  echo "Initiating restore for: $KEY"
  aws s3api restore-object \
    --bucket "$BUCKET" \
    --key "$KEY" \
    --restore-request "{\"Days\":$DAYS,\"GlacierJobParameters\":{\"Tier\":\"$TIER\"}}" \
    2>/dev/null || echo "Already restoring or not archived: $KEY"
done

echo "Restore requests submitted. Check status in 12-48 hours."

Compliance and Data Protection

Object Lock for Immutability

Object Lock prevents deletion or modification for a specified retention period. Combined with Glacier, this creates tamper-proof archives:

# Enable Object Lock on bucket (must be done at creation)
aws s3api create-bucket \
  --bucket compliance-archive \
  --object-lock-enabled-for-bucket

# Set default retention
aws s3api put-object-lock-configuration \
  --bucket compliance-archive \
  --object-lock-configuration '{
    "ObjectLockEnabled": "Enabled",
    "Rule": {
      "DefaultRetention": {
        "Mode": "COMPLIANCE",
        "Years": 7
      }
    }
  }'

Lock modes:

  • Governance: Users with the s3:BypassGovernanceRetention permission can shorten or remove retention
  • Compliance: No one, including the root account, can delete or overwrite the object until retention expires
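
Retention can also be applied per object rather than as a bucket default; a sketch with placeholder bucket, key, and date:

# Lock a single object in Governance mode until a fixed date
aws s3api put-object-retention \
  --bucket compliance-archive \
  --key records/audit-2025.tar \
  --retention '{"Mode":"GOVERNANCE","RetainUntilDate":"2032-01-01T00:00:00Z"}'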

Place a legal hold on objects involved in litigation:

# Apply legal hold
aws s3api put-object-legal-hold \
  --bucket my-bucket \
  --key evidence/document.pdf \
  --legal-hold '{"Status":"ON"}'

# Objects with legal hold cannot be deleted regardless of retention
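
To audit hold status later, query it directly:

# Check whether a legal hold is in place
aws s3api get-object-legal-hold \
  --bucket my-bucket \
  --key evidence/document.pdf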

Versioning for Recovery

Enable versioning to protect against accidental deletion:

# Enable versioning
aws s3api put-bucket-versioning \
  --bucket my-backup-bucket \
  --versioning-configuration Status=Enabled

# Delete markers don't remove data—previous versions remain
# Lifecycle can transition old versions to Glacier
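
A sketch of that last point: a rule with NoncurrentVersionTransitions moves superseded versions to Glacier. Note that put-bucket-lifecycle-configuration replaces any existing configuration, so merge this with your other rules:

aws s3api put-bucket-lifecycle-configuration \
  --bucket my-backup-bucket \
  --lifecycle-configuration '{
    "Rules": [{
      "ID": "OldVersionsToGlacier",
      "Status": "Enabled",
      "Filter": {"Prefix": ""},
      "NoncurrentVersionTransitions": [
        {"NoncurrentDays": 30, "StorageClass": "GLACIER"}
      ],
      "NoncurrentVersionExpiration": {"NoncurrentDays": 365}
    }]
  }'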

Encryption

All Glacier data should be encrypted:

# Server-side encryption with S3-managed keys (default)
aws s3 cp backup.tar s3://bucket/ --storage-class GLACIER --sse AES256

# Server-side encryption with KMS
aws s3 cp backup.tar s3://bucket/ --storage-class GLACIER \
  --sse aws:kms \
  --sse-kms-key-id alias/backup-key

# Client-side encryption (you manage keys)
# Encrypt locally before upload
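
One way to do that, sketched with GPG symmetric encryption (filenames are illustrative, and you are responsible for safeguarding the passphrase):

# Encrypt locally, upload only the ciphertext
gpg --symmetric --cipher-algo AES256 backup.tar   # writes backup.tar.gpg
aws s3 cp backup.tar.gpg s3://bucket/ --storage-class DEEP_ARCHIVE

# After restore and download, decrypt
gpg --decrypt backup.tar.gpg > backup.tar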

Automation Patterns

Daily Database Backup to Glacier

#!/bin/bash
# backup-to-glacier.sh
set -euo pipefail

DATE=$(date +%Y-%m-%d)
BUCKET="database-backups"

# Dump database
pg_dump mydb | gzip > "/tmp/db-$DATE.sql.gz"

# Upload to S3 (Standard initially for verification)
aws s3 cp "/tmp/db-$DATE.sql.gz" "s3://$BUCKET/daily/"

# Lifecycle policy transitions to Glacier after 7 days,
# Deep Archive after 90 days, and expires after 7 years

# Cleanup local copy
rm "/tmp/db-$DATE.sql.gz"
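
To run it nightly, a crontab entry along these lines would work (paths are placeholders):

# Run at 02:00 every day, logging output
0 2 * * * /opt/scripts/backup-to-glacier.sh >> /var/log/backup-to-glacier.log 2>&1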

Log Archival Pipeline

# Archive logs older than 7 days directly to Glacier
find /var/log/app/ -name "*.log" -mtime +7 -exec \
  aws s3 mv {} s3://log-archive/$(date +%Y/%m)/ \
    --storage-class GLACIER \;

# Or sync entire directory
aws s3 sync /var/log/archive/ s3://log-archive/ \
  --storage-class DEEP_ARCHIVE \
  --delete

Event-Driven Archival with Lambda

Trigger archival when objects are created:

import boto3
from urllib.parse import unquote_plus

def lambda_handler(event, context):
    s3 = boto3.client('s3')

    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        # Event keys arrive URL-encoded; decode before using them
        key = unquote_plus(record['s3']['object']['key'])

        # Copy to archive bucket with Glacier class
        # (copy_object handles objects up to 5 GB; larger objects
        # need a multipart copy)
        s3.copy_object(
            CopySource={'Bucket': bucket, 'Key': key},
            Bucket='archive-bucket',
            Key=f'archived/{key}',
            StorageClass='GLACIER',
            MetadataDirective='COPY'
        )

        # Delete from source only after the copy succeeds
        s3.delete_object(Bucket=bucket, Key=key)
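
Wiring this up means granting S3 permission to invoke the function, then pointing bucket notifications at it; a sketch with placeholder function name, bucket, and account ID:

# Allow S3 to invoke the function
aws lambda add-permission \
  --function-name archive-on-create \
  --statement-id s3-invoke \
  --action lambda:InvokeFunction \
  --principal s3.amazonaws.com \
  --source-arn arn:aws:s3:::source-bucket

# Fire the function on every object creation
aws s3api put-bucket-notification-configuration \
  --bucket source-bucket \
  --notification-configuration '{
    "LambdaFunctionConfigurations": [{
      "LambdaFunctionArn": "arn:aws:lambda:us-east-1:123456789012:function:archive-on-create",
      "Events": ["s3:ObjectCreated:*"]
    }]
  }'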

Common Backup Patterns

Pattern 1: Grandfather-Father-Son (GFS)

Traditional backup rotation adapted for the cloud. Each rule requires a Status field, and S3 rejects transitions to Standard-IA earlier than day 30, so short-lived daily backups simply expire from Standard:

{
  "Rules": [
    {
      "ID": "Daily-Expire",
      "Status": "Enabled",
      "Filter": {"Prefix": "backups/daily/"},
      "Expiration": {"Days": 30}
    },
    {
      "ID": "Weekly-to-Glacier",
      "Status": "Enabled",
      "Filter": {"Prefix": "backups/weekly/"},
      "Transitions": [{"Days": 30, "StorageClass": "GLACIER"}],
      "Expiration": {"Days": 365}
    },
    {
      "ID": "Monthly-to-DeepArchive",
      "Status": "Enabled",
      "Filter": {"Prefix": "backups/monthly/"},
      "Transitions": [{"Days": 90, "StorageClass": "DEEP_ARCHIVE"}],
      "Expiration": {"Days": 2555}
    }
  ]
}

Pattern 2: Compliance Archive

For regulated industries with strict retention:

# Structure
compliance-bucket/
├── hipaa/          → 6-year retention, Deep Archive
├── sox/            → 7-year retention, Deep Archive + Object Lock
├── sec-17a4/       → 6-year retention, WORM compliance
└── legal-hold/     → Indefinite, Object Lock + Legal Hold

Pattern 3: Ransomware-Resistant Backups

Create immutable copies that attackers cannot delete:

# 1. Create bucket with Object Lock
aws s3api create-bucket \
  --bucket immutable-backups \
  --object-lock-enabled-for-bucket

# 2. Enable versioning (required for Object Lock)
aws s3api put-bucket-versioning \
  --bucket immutable-backups \
  --versioning-configuration Status=Enabled

# 3. Set compliance retention
aws s3api put-object-lock-configuration \
  --bucket immutable-backups \
  --object-lock-configuration '{
    "ObjectLockEnabled": "Enabled",
    "Rule": {
      "DefaultRetention": {
        "Mode": "COMPLIANCE",
        "Years": 1
      }
    }
  }'

# 4. Upload backups (automatically locked)
aws s3 sync ./critical-backups/ s3://immutable-backups/ \
  --storage-class GLACIER
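
To confirm uploads actually picked up the default lock, spot-check an object's retention (the key is a placeholder):

# Verify the retention applied by the bucket default
aws s3api get-object-retention \
  --bucket immutable-backups \
  --key critical-backups/db-backup.tar.gz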

Best Practices

1. Test Restores Regularly

Don't discover retrieval issues during an incident:

# Monthly restore test
aws s3api restore-object \
  --bucket archive \
  --key test/restore-test.tar \
  --restore-request '{"Days":1,"GlacierJobParameters":{"Tier":"Standard"}}'

# Verify data integrity after restore
sha256sum restored-file.tar
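
For that checksum to prove anything, record it at upload time; one approach is to stash it in object metadata (the sha256 metadata key is an arbitrary choice):

# At upload: store the checksum alongside the object
SUM=$(sha256sum restore-test.tar | awk '{print $1}')
aws s3 cp restore-test.tar s3://archive/test/ \
  --storage-class GLACIER \
  --metadata sha256="$SUM"

# After restore: fetch the stored value and compare
aws s3api head-object --bucket archive --key test/restore-test.tar \
  --query 'Metadata.sha256' --output text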

2. Use Bulk Retrieval for Large Restores

Expedited retrieval for 1 TB costs $30; Bulk costs $2.50. Plan ahead.

3. Set Up Restore Notifications

Get notified when restores complete:

# Configure S3 event notification
aws s3api put-bucket-notification-configuration \
  --bucket my-bucket \
  --notification-configuration '{
    "TopicConfigurations": [{
      "TopicArn": "arn:aws:sns:us-east-1:123456789012:restore-complete",
      "Events": ["s3:ObjectRestore:Completed"]
    }]
  }'

4. Document Retention Policies

Maintain clear documentation of what's archived and why:

# Tag objects with retention metadata
aws s3api put-object-tagging \
  --bucket compliance \
  --key records/2024/audit.tar \
  --tagging 'TagSet=[{Key=RetentionPolicy,Value=SOX-7Year},{Key=DeleteAfter,Value=2031-12-31}]'

5. Monitor Archive Growth

Track Glacier usage to predict costs:

# Get storage metrics
aws cloudwatch get-metric-statistics \
  --namespace AWS/S3 \
  --metric-name BucketSizeBytes \
  --dimensions Name=BucketName,Value=my-archive Name=StorageType,Value=GlacierStorage \
  --start-time 2026-01-01T00:00:00Z \
  --end-time 2026-01-17T00:00:00Z \
  --period 86400 \
  --statistics Average
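
Note that each storage class reports under its own StorageType dimension; Deep Archive usage, for example, appears under StorageType=DeepArchiveStorage rather than GlacierStorage, so query each class you use.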

Summary

AWS S3 Glacier transforms long-term data retention from a cost burden to a manageable expense. Key takeaways:

  1. Choose the right tier: Instant for quarterly access, Flexible for yearly, Deep Archive for compliance
  2. Use lifecycle policies: Automate transitions based on data age
  3. Plan for retrieval time: Bulk takes anywhere from 5 to 48 hours depending on tier, but cuts retrieval costs by up to 90%
  4. Enable immutability: Object Lock + Compliance mode for ransomware protection
  5. Test restores: Verify your recovery process works before you need it

For building S3 commands with proper Glacier flags, use our AWS S3 Command Generator. For a complete overview of all storage classes and when to use each, see our S3 Storage Classes Guide.

Frequently Asked Questions

How long does Glacier retrieval take?

Retrieval time depends on the tier and option chosen. Glacier Instant Retrieval: milliseconds. Glacier Flexible Retrieval: 1-5 minutes (Expedited), 3-5 hours (Standard), or 5-12 hours (Bulk). Glacier Deep Archive: 12 hours (Standard) or 48 hours (Bulk).
