AWS Macie is a data security service that uses machine learning and pattern matching to discover and protect sensitive data stored in Amazon S3. This guide covers enabling Macie, configuring data discovery jobs, creating custom identifiers, and managing findings.
This article is part of our comprehensive Cloud Security Tips for 2026 guide covering essential practices for protecting your cloud environment.
What Macie Detects
| Category | Examples |
|---|---|
| Personal Identifiers | Names, addresses, phone numbers, dates of birth |
| Government IDs | SSN, passport, driver's license, tax IDs |
| Financial Data | Credit cards, bank accounts, financial statements |
| Credentials | AWS keys, passwords, API tokens, private keys |
| Healthcare | PHI, medical record numbers, health insurance IDs |
| Custom | Organization-specific patterns you define |
Enable AWS Macie
Using AWS Console
- Open the Macie Console
- Click Get started
- Review the service-linked role permissions
- Click Enable Macie
Using AWS CLI
# Enable Macie
aws macie2 enable-macie
# Verify Macie is enabled
aws macie2 get-macie-session
# Check bucket inventory status
aws macie2 describe-buckets \
--query 'buckets[*].[bucketName,classifiableSizeInBytes]' \
--output table
Multi-Account Setup
For AWS Organizations, designate a delegated Macie administrator account:
# From management account: Enable delegated admin
aws macie2 enable-organization-admin-account \
--admin-account-id 111122223333
# From delegated admin: Enable for member accounts
aws macie2 create-member \
--account '{
"accountId": "444455556666",
"email": "member-account@example.com"
}'
# Auto-enable for new organization accounts
aws macie2 update-organization-configuration \
--auto-enable
# List member accounts
aws macie2 list-members
Analyze S3 Bucket Security
Macie automatically analyzes S3 bucket security posture:
# Get bucket inventory
aws macie2 describe-buckets \
--query 'buckets[*].[bucketName,publicAccess.effectivePermission,sharedAccess]' \
--output table
# Filter for public buckets
aws macie2 describe-buckets \
--criteria '{
"publicAccess.effectivePermission": {
"eq": ["PUBLIC"]
}
}'
# Get bucket statistics
aws macie2 get-bucket-statistics \
--account-id 123456789012
Bucket Security Findings
| Finding Type | Description |
|---|---|
| Policy:IAMUser/S3BucketPublic | Bucket has public access via policy |
| Policy:IAMUser/S3BucketSharedExternally | Bucket shared with external accounts |
| Policy:IAMUser/S3BucketReplicatedExternally | Bucket replicates to external account |
| Policy:IAMUser/S3BlockPublicAccessDisabled | Public access block not enabled |
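The finding types in the table can be used as filter values when listing findings. The following is a minimal sketch, assuming the `type` field is filtered with an `eq` condition in the same way as the other criteria in this guide. `policy_type_criteria` is a hypothetical helper name, not part of the AWS CLI.

```shell
# Hypothetical helper: builds a --finding-criteria JSON document
# that matches a single Macie finding type.
policy_type_criteria() {
  printf '{"criterion":{"type":{"eq":["%s"]}}}\n' "$1"
}

# Print the criteria for public-bucket findings.
policy_type_criteria 'Policy:IAMUser/S3BucketPublic'

# Example call (needs AWS credentials and Macie enabled, so it is left commented out):
# aws macie2 list-findings \
#   --finding-criteria "$(policy_type_criteria 'Policy:IAMUser/S3BucketPublic')"
```

Building the JSON in one place keeps the quoting consistent when the same filter is reused across several commands.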
Create Sensitive Data Discovery Job
One-Time Scan
# Create discovery job for specific buckets
aws macie2 create-classification-job \
--name "PII-Discovery-Production" \
--description "Scan production buckets for PII" \
--job-type ONE_TIME \
--s3-job-definition '{
"bucketDefinitions": [{
"accountId": "123456789012",
"buckets": ["prod-data-bucket", "prod-reports-bucket"]
}]
}' \
--managed-data-identifier-selector ALL \
--tags Environment=Production
# Check job status
aws macie2 describe-classification-job \
--job-id abc123def456
Scheduled Scan
# Create weekly scheduled job
aws macie2 create-classification-job \
--name "Weekly-PII-Scan" \
--description "Weekly scan for sensitive data" \
--job-type SCHEDULED \
--schedule-frequency '{
"weeklySchedule": {
"dayOfWeek": "SUNDAY"
}
}' \
--s3-job-definition '{
"bucketDefinitions": [{
"accountId": "123456789012",
"buckets": ["customer-data-bucket"]
}],
"scoping": {
"includes": {
"and": [{
"simpleScopeTerm": {
"comparator": "STARTS_WITH",
"key": "OBJECT_KEY",
"values": ["reports/", "exports/"]
}
}]
}
}
}' \
--managed-data-identifier-selector ALL
Scan with Sampling
# Sample 10% of objects (cost optimization)
aws macie2 create-classification-job \
--name "Sampled-Scan" \
--job-type ONE_TIME \
--s3-job-definition '{
"bucketDefinitions": [{
"accountId": "123456789012",
"buckets": ["large-data-lake"]
}],
"scoping": {
"includes": {
"and": [{
"simpleScopeTerm": {
"comparator": "GT",
"key": "OBJECT_SIZE",
"values": ["0"]
}
}]
}
}
}' \
--sampling-percentage 10 \
--managed-data-identifier-selector ALL
Create Custom Data Identifiers
Create custom identifiers for organization-specific data:
# Create custom identifier for employee IDs
aws macie2 create-custom-data-identifier \
--name "EmployeeID" \
--description "Internal employee ID format: EMP-XXXXX" \
--regex "EMP-[0-9]{5}" \
--keywords '["employee", "emp id", "staff"]' \
--maximum-match-distance 50 \
--tags Type=Internal
# Create identifier for internal project codes
aws macie2 create-custom-data-identifier \
--name "ProjectCode" \
--description "Internal project code format" \
--regex "PROJ-[A-Z]{3}-[0-9]{4}" \
--keywords '["project", "initiative", "program"]'
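The two regular expressions above can be sanity-checked locally before they are registered with Macie. The sketch below is only an approximation: it uses `grep -E`, whose regex dialect is close to but not necessarily identical with the one Macie supports, and it ignores the keyword proximity rule. `count_matches` is a hypothetical helper name.

```shell
# Patterns copied from the custom data identifiers in this section.
EMPLOYEE_ID_REGEX='EMP-[0-9]{5}'
PROJECT_CODE_REGEX='PROJ-[A-Z]{3}-[0-9]{4}'

# Hypothetical helper: prints how many times a pattern occurs in a text.
# $1 = extended regex, $2 = sample text
count_matches() {
  printf '%s\n' "$2" | grep -oE -- "$1" | wc -l | tr -d ' '
}

count_matches "$EMPLOYEE_ID_REGEX" 'employee EMP-12345 and staff EMP-67890'   # prints 2
count_matches "$PROJECT_CODE_REGEX" 'project PROJ-ABC-2024 kickoff'            # prints 1
```

The AWS CLI also provides `aws macie2 test-custom-data-identifier`, which evaluates a regex against sample text using Macie itself. That is the more reliable check once credentials are available.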
# List custom identifiers
aws macie2 list-custom-data-identifiers \
--query 'items[*].[name,id]' \
--output table
Use Custom Identifiers in Jobs
# Create job with custom and managed identifiers
aws macie2 create-classification-job \
--name "Complete-Scan" \
--job-type ONE_TIME \
--s3-job-definition '{
"bucketDefinitions": [{
"accountId": "123456789012",
"buckets": ["internal-docs"]
}]
}' \
--custom-data-identifier-ids "id1" "id2" \
--managed-data-identifier-selector ALL
Manage Findings
View Findings
# List all findings
aws macie2 list-findings \
--sort-criteria '{
"attributeName": "severity.score",
"orderBy": "DESC"
}'
# Get finding details
aws macie2 get-findings \
--finding-ids "finding-id-1" "finding-id-2"
# Filter for high severity findings
aws macie2 list-findings \
--finding-criteria '{
"criterion": {
"severity.description": {
"eq": ["High"]
}
}
}'
# Filter by finding type
aws macie2 list-findings \
--finding-criteria '{
"criterion": {
"category": {
"eq": ["CLASSIFICATION"]
},
"classificationDetails.result.sensitiveData.detections.type": {
"eq": ["CREDIT_CARD_NUMBER"]
}
}
}'
Archive Findings
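# Note: the macie2 CLI has no archive-findings or unarchive-findings command
# (those belong to GuardDuty). Macie archives findings through suppression
# rules, which are findings filters whose action is ARCHIVE.
# List findings that have already been archived
aws macie2 list-findings \
--finding-criteria '{"criterion": {"archived": {"eq": ["true"]}}}'
Create Suppression Rules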
# Suppress findings for test data bucket
aws macie2 create-findings-filter \
--name "Suppress-Test-Bucket" \
--description "Suppress findings from test data bucket" \
--action ARCHIVE \
--finding-criteria '{
"criterion": {
"resourcesAffected.s3Bucket.name": {
"eq": ["test-data-bucket"]
}
}
}'
# Suppress specific data type findings
aws macie2 create-findings-filter \
--name "Suppress-Employee-IDs" \
--description "Expected employee IDs in HR bucket" \
--action ARCHIVE \
--finding-criteria '{
"criterion": {
"resourcesAffected.s3Bucket.name": {
"eq": ["hr-documents"]
},
"classificationDetails.result.customDataIdentifiers.detections.name": {
"eq": ["EmployeeID"]
}
}
}'
# List suppression rules
aws macie2 list-findings-filters
Set Up Notifications
EventBridge Integration
# Create SNS topic
aws sns create-topic --name macie-alerts
# Create EventBridge rule for high severity findings
aws events put-rule \
--name "MacieHighSeverity" \
--event-pattern '{
"source": ["aws.macie"],
"detail-type": ["Macie Finding"],
"detail": {
"severity": {
"description": ["High"]
}
}
}'
# Add SNS target
aws events put-targets \
--rule MacieHighSeverity \
--targets Id=1,Arn=arn:aws:sns:us-east-1:123456789012:macie-alerts
# Configure SNS access
aws sns set-topic-attributes \
--topic-arn arn:aws:sns:us-east-1:123456789012:macie-alerts \
--attribute-name Policy \
--attribute-value '{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {"Service": "events.amazonaws.com"},
"Action": "sns:Publish",
"Resource": "arn:aws:sns:us-east-1:123456789012:macie-alerts"
}]
}'
Security Hub Integration
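When both services are enabled, Macie publishes findings to Security Hub automatically. By default, only policy findings are published; sensitive data findings require publishClassificationFindings to be enabled in the publication configuration. To verify that findings arrive: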
# Verify findings in Security Hub
aws securityhub get-findings \
--filters '{
"ProductName": [{"Value": "Macie", "Comparison": "EQUALS"}]
}'
Export Findings
# Choose which finding categories Macie publishes to Security Hub
aws macie2 put-findings-publication-configuration \
--security-hub-configuration '{
"publishClassificationFindings": true,
"publishPolicyFindings": true
}'
# Export finding statistics
aws macie2 get-finding-statistics \
--group-by severity.description \
--finding-criteria '{
"criterion": {
"category": {"eq": ["CLASSIFICATION"]}
}
}'
# Get aggregated findings by bucket
aws macie2 get-finding-statistics \
--group-by resourcesAffected.s3Bucket.name \
--sort-criteria '{
"attributeName": "count",
"orderBy": "DESC"
}'
Cost Optimization
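# Look up the classification scope ID for this account
SCOPE_ID=$(aws macie2 list-classification-scopes \
--query 'classificationScopes[0].id' \
--output text)
# Exclude buckets from automated sensitive data discovery
aws macie2 update-classification-scope \
--id "$SCOPE_ID" \
--s3 '{
"excludes": {
"bucketNames": ["logs-bucket", "cloudtrail-bucket"],
"operation": "ADD"
}
}'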
# Use sampling for large buckets
# Set sampling percentage when creating jobs
# Review job costs
aws macie2 get-usage-statistics \
--filter-by '[{"comparator":"EQ","key":"accountId","values":["123456789012"]}]' \
--time-range MONTH_TO_DATE
# Get usage totals by type
aws macie2 get-usage-totals
Best Practices
| Practice | Recommendation |
|---|---|
| Coverage | Enable for all accounts via Organizations |
| Scheduling | Run weekly scans on critical data buckets |
| Custom Identifiers | Create identifiers for organization-specific data |
| Exclusions | Exclude log and audit buckets to reduce costs |
| Sampling | Use 10-20% sampling for large data lakes |
| Alerts | Configure notifications for high severity findings |
| Remediation | Establish SLAs for finding remediation by severity |
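To get a feel for what the sampling recommendation means in practice, the number of objects a sampled job may analyze can be estimated with simple arithmetic. This is a rough sketch: `estimate_sampled_objects` is a hypothetical helper, the object count is an invented example, and the result is an upper bound because Macie selects objects at random up to the given percentage.

```shell
# Hypothetical helper, not an AWS CLI feature.
# $1 = total classifiable object count, $2 = sampling percentage (1-100)
estimate_sampled_objects() {
  echo $(( $1 * $2 / 100 ))
}

# Example: a data lake with 2,000,000 classifiable objects
estimate_sampled_objects 2000000 10   # prints 200000
estimate_sampled_objects 2000000 20   # prints 400000
```

The object count for a real bucket can be taken from the `describe-buckets` output, assuming its `classifiableObjectCount` field is populated for the bucket in question.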
Remediation Actions
When sensitive data is discovered:
- Verify finding - Review the actual data detected
- Assess risk - Determine exposure level and compliance impact
- Remediate:
  - Delete unnecessary sensitive data
  - Encrypt unencrypted sensitive data
  - Restrict bucket access
  - Move to appropriate storage tier
- Document - Record findings and actions taken
- Prevent - Update policies to prevent recurrence
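The "Restrict bucket access" step above could be scripted along the following lines. This is a sketch under stated assumptions rather than a complete remediation workflow: `run` and `restrict_bucket_access` are hypothetical helper names, the bucket name is a placeholder, and the script defaults to a dry run so that nothing changes until `DRY_RUN=0` is set explicitly.

```shell
# Hypothetical wrapper: with DRY_RUN=1 (the default) it prints the command
# instead of executing it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "DRY RUN: $*"
  else
    "$@"
  fi
}

# Turn on all four S3 Block Public Access settings for one bucket.
# $1 = bucket name
restrict_bucket_access() {
  run aws s3api put-public-access-block \
    --bucket "$1" \
    --public-access-block-configuration \
    "BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true"
}

# Placeholder bucket name; prints the command because DRY_RUN defaults to 1.
restrict_bucket_access example-flagged-bucket
```

Defaulting to a dry run lets the command be reviewed, and recorded for the documentation step, before it is applied to a production bucket.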