Google Kubernetes Engine (GKE) cost allocation provides detailed insights into resource consumption and costs at the namespace, deployment, and pod level. This guide walks you through enabling cost allocation metering to track and attribute Kubernetes costs accurately across teams and applications.
Prerequisites
Before you begin, ensure you have:
- A GKE cluster (Standard or Autopilot mode)
- Kubernetes Engine Admin or Kubernetes Engine Cluster Admin role
- kubectl command-line tool installed and configured
- gcloud CLI installed and authenticated
- BigQuery billing exports enabled (recommended for full cost visibility)
- Basic understanding of Kubernetes namespaces, pods, and resources
Understanding GKE Cost Allocation
GKE cost allocation helps you answer questions like:
- Which teams are consuming the most resources? - Track costs by namespace
- What does each application cost? - Measure resource consumption per workload
- How can we optimize costs? - Identify over-provisioned or idle resources
- How do we chargeback cloud costs? - Attribute costs to specific business units
Key features:
- Namespace-level breakdowns: See costs per namespace
- Resource utilization tracking: CPU, memory, storage, and network usage
- Pod-level granularity: Drill down to individual pods and containers
- Integration with BigQuery: Export data for custom analysis
- GKE usage metering: Track compute, storage, and network costs separately
Step-by-Step Guide
Phase 1: Enable GKE Usage Metering (Resource Consumption)
GKE usage metering collects CPU, memory, and storage consumption data at the pod and namespace level.
For New Clusters (During Creation)
Using Google Cloud Console
1. Navigate to GKE
   - Open the Google Cloud Console
   - Go to Kubernetes Engine > Clusters
   - Click CREATE
2. Configure Cluster Basics
   - Choose Standard or Autopilot mode
   - Set cluster name, location, and other basic settings
3. Enable Cost Allocation
   - Expand the Features section
   - Under "Resource usage export", check Enable usage metering
   - Dataset: Select or create a BigQuery dataset for usage data
   - Consumption metering: Check to enable (tracks CPU/memory usage)
   - Network egress metering: Check to enable (tracks network costs)
4. Create Cluster
   - Complete the remaining configuration steps
   - Click CREATE
Using gcloud CLI
# Create cluster with usage metering enabled
gcloud container clusters create my-cluster \
  --zone=us-central1-a \
  --resource-usage-bigquery-dataset=PROJECT_ID.DATASET_NAME \
  --enable-network-egress-metering \
  --enable-resource-consumption-metering
Example:
gcloud container clusters create production-cluster \
  --zone=us-central1-a \
  --num-nodes=3 \
  --machine-type=n1-standard-2 \
  --resource-usage-bigquery-dataset=finops-billing-prod.gke_usage_metering \
  --enable-network-egress-metering \
  --enable-resource-consumption-metering
For Existing Clusters (Enable After Creation)
Using Google Cloud Console
1. Navigate to Your Cluster
   - Go to Kubernetes Engine > Clusters
   - Click on your cluster name
2. Edit Cluster Settings
   - Click the EDIT button at the top
   - Scroll to "Resource usage export"
3. Enable Usage Metering
   - Check Enable usage metering
   - BigQuery dataset: Select a dataset in the format PROJECT_ID.DATASET_NAME
   - Check Enable resource consumption metering
   - Check Enable network egress metering (optional but recommended)
4. Save Changes
   - Click SAVE
   - Changes take effect immediately
Using gcloud CLI
# Enable usage metering on an existing cluster
gcloud container clusters update CLUSTER_NAME \
  --zone=ZONE \
  --resource-usage-bigquery-dataset=PROJECT_ID.DATASET_NAME \
  --enable-network-egress-metering \
  --enable-resource-consumption-metering
Example:
gcloud container clusters update production-cluster \
  --zone=us-central1-a \
  --resource-usage-bigquery-dataset=finops-billing-prod.gke_usage_metering \
  --enable-network-egress-metering \
  --enable-resource-consumption-metering
Verify Usage Metering is Enabled
# Check cluster configuration
gcloud container clusters describe CLUSTER_NAME \
  --zone=ZONE \
  --format="value(resourceUsageExportConfig)"
Expected output:
bigqueryDestination:
  datasetId: gke_usage_metering
consumptionMeteringConfig:
  enabled: true
enableNetworkEgressMetering: true
Phase 2: Create BigQuery Dataset for Usage Data
If you haven't already created a dataset:
# Create a dataset for GKE usage metering
bq mk \
  --dataset \
  --location=US \
  --description="GKE usage metering and cost allocation data" \
  PROJECT_ID:gke_usage_metering
Phase 3: Label Resources for Cost Attribution
Apply labels to namespaces, pods, and services for better cost tracking:
Label Namespaces
# Label namespaces with team and environment
kubectl label namespace production \
  team=backend \
  environment=prod \
  cost-center=engineering

kubectl label namespace staging \
  team=backend \
  environment=staging \
  cost-center=engineering
Label Deployments
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: production
  labels:
    app: my-app
    team: backend
    cost-center: engineering
    environment: prod
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
        team: backend
        cost-center: engineering
    spec:
      containers:
        - name: my-app
          image: gcr.io/my-project/my-app:latest
          resources:
            requests:
              cpu: "500m"
              memory: "512Mi"
            limits:
              cpu: "1000m"
              memory: "1Gi"
Apply with:
kubectl apply -f deployment.yaml
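The deployment above requests 500m CPU and 512Mi memory per replica across 3 replicas. To get a rough feel for what those requests translate to in dollars, here is a back-of-envelope sketch in Python; the hourly rates and the 730-hour month are illustrative assumptions (the same placeholder rates used in the BigQuery queries later in this guide), not GCP list prices:

```python
# Rough monthly cost estimate from a deployment's resource requests.
# The $/hour rates below are illustrative assumptions, not GCP list prices.
CPU_RATE_PER_CORE_HOUR = 0.0327
MEM_RATE_PER_GB_HOUR = 0.0044
HOURS_PER_MONTH = 730  # assumed average month length

def parse_cpu(cpu: str) -> float:
    """Convert a Kubernetes CPU quantity ('500m' or '1') to cores."""
    return float(cpu[:-1]) / 1000 if cpu.endswith("m") else float(cpu)

def parse_memory_gb(mem: str) -> float:
    """Convert a Kubernetes memory quantity ('512Mi', '1Gi') to GiB."""
    units = {"Mi": 1 / 1024, "Gi": 1.0}
    for suffix, factor in units.items():
        if mem.endswith(suffix):
            return float(mem[: -len(suffix)]) * factor
    raise ValueError(f"unsupported unit: {mem}")

def monthly_cost(cpu_request: str, mem_request: str, replicas: int) -> float:
    """Estimate monthly cost of a workload from its per-replica requests."""
    cpu_cost = parse_cpu(cpu_request) * CPU_RATE_PER_CORE_HOUR
    mem_cost = parse_memory_gb(mem_request) * MEM_RATE_PER_GB_HOUR
    return round((cpu_cost + mem_cost) * HOURS_PER_MONTH * replicas, 2)

# my-app: 3 replicas, each requesting 500m CPU and 512Mi memory
print(monthly_cost("500m", "512Mi", 3))  # 40.62
```

Estimates like this are only as good as the rates you plug in; the BigQuery data from usage metering gives you actual consumption rather than requested capacity.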
Phase 4: Set Resource Requests and Limits
For accurate cost allocation, all pods must have resource requests defined:
Example Pod with Resource Requests
apiVersion: v1
kind: Pod
metadata:
  name: cost-tracked-pod
  namespace: production
  labels:
    team: frontend
spec:
  containers:
    - name: app
      image: nginx:latest
      resources:
        requests:
          cpu: "250m"     # Required for cost allocation
          memory: "256Mi" # Required for cost allocation
        limits:
          cpu: "500m"
          memory: "512Mi"
Check Pods Without Resource Requests
# Find pods without CPU requests (won't be tracked accurately)
kubectl get pods --all-namespaces -o json | \
  jq -r '.items[] | select(.spec.containers[].resources.requests.cpu == null) |
    "\(.metadata.namespace)/\(.metadata.name)"'
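The same filter logic can be sketched offline in Python, which is handy if you want to unit-test it before wiring it into CI. The JSON below is an abbreviated, made-up stand-in for real `kubectl get pods -o json` output:

```python
import json

# Abbreviated, made-up sample of `kubectl get pods --all-namespaces -o json`.
pod_list = json.loads("""
{
  "items": [
    {"metadata": {"namespace": "production", "name": "api-7d4f"},
     "spec": {"containers": [{"name": "api",
       "resources": {"requests": {"cpu": "250m", "memory": "256Mi"}}}]}},
    {"metadata": {"namespace": "staging", "name": "worker-9k2c"},
     "spec": {"containers": [{"name": "worker", "resources": {}}]}}
  ]
}
""")

def pods_missing_cpu_requests(pods: dict) -> list[str]:
    """Return namespace/name for pods with any container lacking a CPU request."""
    missing = []
    for pod in pods["items"]:
        for container in pod["spec"]["containers"]:
            cpu = container.get("resources", {}).get("requests", {}).get("cpu")
            if cpu is None:
                meta = pod["metadata"]
                missing.append(f"{meta['namespace']}/{meta['name']}")
                break  # one offending container is enough to flag the pod
    return missing

print(pods_missing_cpu_requests(pod_list))  # ['staging/worker-9k2c']
```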
Phase 5: Query Usage Data in BigQuery
After enabling usage metering, data typically appears in BigQuery within 24 hours (allow up to 48 hours for the first export).
View Usage Data Tables
# List tables in the usage metering dataset
bq ls PROJECT_ID:gke_usage_metering
You'll see tables like:
- gke_cluster_resource_usage - CPU and memory consumption
- gke_cluster_resource_consumption - Detailed resource consumption with labels
- gke_network_egress - Network egress costs (if enabled)
Query Total Cost by Namespace
-- Monthly cost by namespace
SELECT
  cluster_name,
  namespace_name,
  ROUND(SUM(cpu_core_seconds) / 3600, 2) AS cpu_core_hours,
  ROUND(SUM(memory_byte_seconds) / (1024*1024*1024*3600), 2) AS memory_gb_hours,
  -- Estimate cost (adjust rates based on your machine types)
  ROUND((SUM(cpu_core_seconds) / 3600) * 0.0327, 2) AS estimated_cpu_cost,
  ROUND((SUM(memory_byte_seconds) / (1024*1024*1024*3600)) * 0.0044, 2) AS estimated_memory_cost
FROM
  `PROJECT_ID.gke_usage_metering.gke_cluster_resource_consumption`
WHERE
  usage_start_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
GROUP BY
  cluster_name,
  namespace_name
ORDER BY
  estimated_cpu_cost DESC
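To double-check the unit conversions in that query, here is the same arithmetic in Python: core-seconds divided by 3,600 give core-hours, and byte-seconds divided by 2^30 × 3,600 give GiB-hours. The rates are the same illustrative placeholders as in the SQL:

```python
# Sanity-check the unit conversions used in the namespace cost query.
# cpu_core_seconds -> core-hours; memory_byte_seconds -> GiB-hours.

def cpu_core_hours(cpu_core_seconds: float) -> float:
    """3,600 core-seconds equal one core-hour."""
    return cpu_core_seconds / 3600

def memory_gb_hours(memory_byte_seconds: float) -> float:
    """2^30 bytes held for 3,600 seconds equal one GiB-hour."""
    return memory_byte_seconds / (1024 ** 3 * 3600)

# One pod using 2 cores and 4 GiB steadily for 24 hours:
cpu_seconds = 2 * 24 * 3600                    # 172800 core-seconds
mem_byte_seconds = 4 * 1024 ** 3 * 24 * 3600   # 4 GiB for a day

print(cpu_core_hours(cpu_seconds))        # 48.0 core-hours
print(memory_gb_hours(mem_byte_seconds))  # 96.0 GiB-hours
# Same illustrative rates as the SQL ($0.0327/core-hour, $0.0044/GiB-hour):
print(round(48.0 * 0.0327 + 96.0 * 0.0044, 2))  # 1.99
```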
Query Cost by Label (Team Attribution)
-- Cost by team label
SELECT
  cluster_name,
  resource_labels.value AS team,
  ROUND(SUM(cpu_core_seconds) / 3600, 2) AS cpu_core_hours,
  ROUND(SUM(memory_byte_seconds) / (1024*1024*1024*3600), 2) AS memory_gb_hours,
  ROUND((SUM(cpu_core_seconds) / 3600) * 0.0327 +
        (SUM(memory_byte_seconds) / (1024*1024*1024*3600)) * 0.0044, 2) AS estimated_total_cost
FROM
  `PROJECT_ID.gke_usage_metering.gke_cluster_resource_consumption`,
  UNNEST(resource_labels) AS resource_labels
WHERE
  usage_start_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
  AND resource_labels.key = 'team'
GROUP BY
  cluster_name,
  team
ORDER BY
  estimated_total_cost DESC
Query Cost by Pod
-- Top 10 most expensive pods
SELECT
  cluster_name,
  namespace_name,
  pod_name,
  ROUND(SUM(cpu_core_seconds) / 3600, 2) AS cpu_core_hours,
  ROUND(SUM(memory_byte_seconds) / (1024*1024*1024*3600), 2) AS memory_gb_hours,
  ROUND((SUM(cpu_core_seconds) / 3600) * 0.0327 +
        (SUM(memory_byte_seconds) / (1024*1024*1024*3600)) * 0.0044, 2) AS estimated_total_cost
FROM
  `PROJECT_ID.gke_usage_metering.gke_cluster_resource_consumption`
WHERE
  usage_start_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
GROUP BY
  cluster_name,
  namespace_name,
  pod_name
ORDER BY
  estimated_total_cost DESC
LIMIT 10
Phase 6: Integrate with Cloud Billing Data
Combine GKE usage data with actual billing costs for precise cost attribution:
-- Join usage data with actual billing costs
WITH gke_usage AS (
  SELECT
    cluster_name,
    namespace_name,
    SUM(cpu_core_seconds) / 3600 AS cpu_core_hours,
    SUM(memory_byte_seconds) / (1024*1024*1024*3600) AS memory_gb_hours
  FROM
    `PROJECT_ID.gke_usage_metering.gke_cluster_resource_consumption`
  WHERE
    usage_start_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
  GROUP BY
    cluster_name,
    namespace_name
),
billing_costs AS (
  SELECT
    resource.name AS cluster_name,
    SUM(cost) AS total_cost
  FROM
    `PROJECT_ID.billing_data.gcp_billing_export_v1_*`
  WHERE
    service.description = 'Kubernetes Engine'
    AND _TABLE_SUFFIX >= FORMAT_DATE('%Y%m01', DATE_SUB(CURRENT_DATE(), INTERVAL 1 MONTH))
  GROUP BY
    cluster_name
)
SELECT
  u.cluster_name,
  u.namespace_name,
  u.cpu_core_hours,
  u.memory_gb_hours,
  b.total_cost AS cluster_total_cost,
  -- Proportionally allocate cost based on resource usage
  ROUND((u.cpu_core_hours / SUM(u.cpu_core_hours) OVER (PARTITION BY u.cluster_name)) * b.total_cost, 2) AS allocated_cost
FROM
  gke_usage u
JOIN
  billing_costs b
ON
  u.cluster_name = b.cluster_name
ORDER BY
  allocated_cost DESC
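The window-function allocation in that query reduces to simple proportional math: each namespace receives a share of the cluster bill equal to its share of CPU core-hours (memory is ignored in the split). The Python sketch below shows the same calculation with made-up numbers:

```python
# Proportional cost allocation: each namespace gets a share of the
# cluster bill proportional to its CPU core-hours.
# All figures below are made up for illustration.

cluster_total_cost = 1000.00
usage_by_namespace = {           # CPU core-hours per namespace
    "production": 300.0,
    "staging": 150.0,
    "dev": 50.0,
}

total_hours = sum(usage_by_namespace.values())
allocated = {
    ns: round(hours / total_hours * cluster_total_cost, 2)
    for ns, hours in usage_by_namespace.items()
}
print(allocated)  # {'production': 600.0, 'staging': 300.0, 'dev': 100.0}
```

A single-dimension split like this is simple but biases against memory-heavy workloads; a weighted blend of CPU and memory shares is a common refinement.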
Best Practices
Resource Management
- Always set resource requests: Pods without requests won't be accurately tracked
- Set realistic limits: Prevents cost overruns from runaway containers
- Use resource quotas: Enforce namespace-level resource limits
- Monitor utilization: Identify over-provisioned resources
Labeling Strategy
- Consistent labels: Standardize labels across all clusters
- Required labels: team, environment, cost-center, app
- Apply labels at both namespace and workload level: namespace labels do not propagate to pods automatically
- Document label taxonomy: Create a label governance document
Cost Optimization
- Rightsize nodes: Use appropriate machine types for workloads
- Enable autoscaling: HPA for pods, cluster autoscaler for nodes
- Use spot VMs: Reduce costs by 60-90% for fault-tolerant workloads
- Review idle resources: Delete unused deployments and services
Monitoring and Alerting
- Set budget alerts: Notify teams when costs exceed thresholds
- Track trends: Monitor cost changes month-over-month
- Create dashboards: Visualize costs in Looker Studio or Grafana
- Regular reviews: Monthly cost review meetings per team
Troubleshooting
No Data in BigQuery
Problem: Usage metering enabled but no data in BigQuery
Solution:
- Wait 24-48 hours for initial data population
- Verify the dataset exists: bq ls PROJECT_ID:gke_usage_metering
- Check the cluster configuration: gcloud container clusters describe CLUSTER_NAME
- Ensure pods have resource requests defined
- Verify GKE service account has BigQuery Data Editor role
Permission Denied Errors
Problem: Cannot enable usage metering or query data
Solution:
- Verify you have Kubernetes Engine Admin role
- Ensure BigQuery API is enabled in the project
- Grant roles/bigquery.dataEditor to the GKE service account
- Check project-level IAM bindings
Incomplete Cost Data
Problem: Some pods not showing in cost allocation
Solution:
- All pods must have CPU and memory requests defined
- Check for pods without requests: See Phase 4 query above
- Update deployments to include resource requests
- Wait 24 hours for updated data to populate
High Query Costs
Problem: BigQuery queries consuming excessive budget
Solution:
- Use date partitioning: Filter on usage_start_time
- Limit query scope: Query specific namespaces or time ranges
- Create materialized views: Pre-aggregate common queries
- Set query quotas: Limit per-user query costs
Labels Not Appearing
Problem: Resource labels missing from BigQuery data
Solution:
- Verify labels are applied: kubectl describe pod POD_NAME
- Labels must be on the pod template, not just the deployment metadata
- Wait 24 hours for label changes to propagate
- Check label format: Must be lowercase, alphanumeric, and hyphens
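To catch bad labels before they are applied, a quick validation sketch against the rule above (lowercase alphanumeric plus hyphens, starting and ending with an alphanumeric, 63 characters max) might look like:

```python
import re

# Validate label values against the rule described above:
# lowercase alphanumeric and hyphens, alphanumeric at both ends, <= 63 chars.
LABEL_VALUE_RE = re.compile(r"^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$")

def is_valid_label_value(value: str) -> bool:
    """Return True if the value satisfies the label format rule."""
    return bool(LABEL_VALUE_RE.fullmatch(value))

print(is_valid_label_value("backend"))      # True
print(is_valid_label_value("cost-center"))  # True
print(is_valid_label_value("Backend"))      # False: uppercase
print(is_valid_label_value("-prod"))        # False: leading hyphen
```

Note that upstream Kubernetes also accepts underscores and dots in label values; the regex above encodes only the stricter convention this guide recommends.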
Next Steps
After enabling GKE cost allocation:
- Create cost dashboards: Build Looker Studio reports for team visibility
- Implement chargebacks: Allocate costs to business units or projects
- Optimize resources: Identify and eliminate waste
- Set budget alerts: Notify teams of cost overruns
- Automate reporting: Schedule BigQuery queries for regular cost reports
- Enable pod autoscaling: Use HPA and VPA for dynamic resource management