How to Backfill Missing Azure Billing Data in Synapse or Power BI
When setting up Azure Cost Management exports or migrating to new analytics platforms, you may discover gaps in your historical billing data. These gaps can occur due to export configuration changes, storage account migrations, or delayed export setup. This guide explains how to retrieve missing historical billing data and merge it with your existing datasets in Azure Synapse Analytics or Power BI for complete cost analysis and reporting.
Backfilling ensures you have continuous historical data for trend analysis, forecasting, and year-over-year comparisons without gaps that could skew financial reporting.
Prerequisites
Before you begin, ensure you have:
- Billing Account Reader or Billing Account Contributor role on the billing account
- Access to the Azure portal (portal.azure.com)
- Existing Azure Storage account for cost data exports
- Azure Synapse Analytics workspace or Power BI Desktop installed
- Storage Blob Data Contributor role on the storage account
- Understanding of the date range of missing data
- PowerShell 7.0+ or Azure CLI installed for automation scripts
Understanding Data Backfill Options
Azure provides several methods to retrieve historical billing data:
1. Cost Management Portal Download (Manual)
- Range: Up to 13 months of historical data
- Format: CSV files
- Best for: Small one-time backfills
2. Cost Management Export API (Programmatic)
- Range: Up to 13 months for standard subscriptions
- Format: CSV files in Azure Storage
- Best for: Automated backfills and scheduled refreshes
3. Usage Details API (Advanced)
- Range: Up to 13 months (varies by billing agreement)
- Format: JSON response
- Best for: Custom data processing and transformation
4. Backfill via Existing Export Recreation
- Range: Limited to export retention settings
- Format: Same as existing exports
- Best for: Filling small gaps
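Whichever method you choose, it helps to enumerate exactly which months are missing before kicking off downloads. A minimal sketch (plain Python; the `months_to_backfill` helper and the month labels are illustrative, not part of any Azure SDK) that diffs the months you already have against the range you need:

```python
from datetime import date

def months_to_backfill(start: date, end: date, existing: set) -> list:
    """Return 'YYYY-MM' labels in [start, end] that are not already exported."""
    missing = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        label = f"{year:04d}-{month:02d}"
        if label not in existing:
            missing.append(label)
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return missing

# Example: exports exist for February only; plan a Q1 2024 backfill
print(months_to_backfill(date(2024, 1, 1), date(2024, 3, 31), {"2024-02"}))
# → ['2024-01', '2024-03']
```

The resulting list maps directly onto the per-month loops used in Methods 2 and 3 below.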
Step-by-Step Guide
Method 1: Manual Backfill via Cost Management Portal
Step 1: Download Historical Usage Data
- Sign in to the Azure portal
- Navigate to Cost Management + Billing
- Select your billing scope (subscription, resource group, or billing account)
- Click Cost analysis in the left menu
- Configure the date range:
- Click the date selector at the top
- Select Custom date range
- Set start and end dates for the missing period
- Click Download in the toolbar
- Select Usage details (CSV) or Amortized cost
- Click Download to save the CSV file
Step 2: Upload to Azure Storage
# Connect to Azure
Connect-AzAccount
# Define variables
$storageAccountName = "costmgmtstorage"
$resourceGroupName = "finops-rg"
$containerName = "cost-exports"
$localFilePath = "C:\Downloads\usage-data-2024-01.csv"
$blobName = "manual-backfill/usage-2024-01.csv"
# Get storage account context
$ctx = (Get-AzStorageAccount -ResourceGroupName $resourceGroupName -Name $storageAccountName).Context
# Upload the file
Set-AzStorageBlobContent `
-File $localFilePath `
-Container $containerName `
-Blob $blobName `
-Context $ctx `
-Force
Method 2: Automated Backfill Using PowerShell
Create a PowerShell script to download multiple months of historical data:
# Install required modules
Install-Module -Name Az.CostManagement -Force -Scope CurrentUser
Install-Module -Name Az.Storage -Force -Scope CurrentUser
# Connect to Azure
Connect-AzAccount
# Define parameters
$subscriptionId = "12345678-1234-1234-1234-123456789012"
$storageAccountName = "costmgmtstorage"
$resourceGroupName = "finops-rg"
$containerName = "cost-exports"
# Define date range to backfill (e.g., January 2024 to March 2024)
$startDate = Get-Date "2024-01-01"
$endDate = Get-Date "2024-03-31"
# Get storage context
$ctx = (Get-AzStorageAccount -ResourceGroupName $resourceGroupName -Name $storageAccountName).Context
# Function to download usage data for a specific month
function Get-MonthlyUsageData {
    param(
        [DateTime]$Month,
        [string]$Subscription,
        [object]$StorageContext,
        [string]$Container
    )

    $monthStart = Get-Date -Year $Month.Year -Month $Month.Month -Day 1
    $monthEnd = $monthStart.AddMonths(1).AddDays(-1)

    Write-Host "Downloading usage data for $($monthStart.ToString('yyyy-MM'))..."

    $scope = "/subscriptions/$Subscription"
    $blobPath = "backfill/$($monthStart.ToString('yyyy-MM'))"
    $fileName = "usage-$($monthStart.ToString('yyyy-MM')).csv"
    $localPath = Join-Path $env:TEMP $fileName

    # Query usage via the Azure CLI costmanagement extension
    # (more reliable than the portal for historical data)
    $result = az costmanagement query `
        --type "ActualCost" `
        --dataset-granularity "Daily" `
        --dataset-aggregation "totalCost=sum(PreTaxCost)" `
        --timeframe "Custom" `
        --time-period from=$($monthStart.ToString('yyyy-MM-dd')) to=$($monthEnd.ToString('yyyy-MM-dd')) `
        --scope $scope `
        --output json | ConvertFrom-Json

    # The query response returns parallel columns/rows arrays (possibly nested
    # under .properties depending on CLI version); flatten into objects for CSV
    $props = if ($result.PSObject.Properties['properties']) { $result.properties } else { $result }
    $columns = $props.columns.name
    $props.rows | ForEach-Object {
        $row = [ordered]@{}
        for ($i = 0; $i -lt $columns.Count; $i++) { $row[$columns[$i]] = $_[$i] }
        [PSCustomObject]$row
    } | Export-Csv -Path $localPath -NoTypeInformation

    # Upload to storage
    Set-AzStorageBlobContent `
        -File $localPath `
        -Container $Container `
        -Blob "$blobPath/$fileName" `
        -Context $StorageContext `
        -Force

    Write-Host "Uploaded $fileName to $blobPath"

    # Clean up local file
    Remove-Item $localPath -Force
}
# Loop through each month and download
$currentMonth = $startDate
while ($currentMonth -le $endDate) {
Get-MonthlyUsageData -Month $currentMonth -Subscription $subscriptionId -StorageContext $ctx -Container $containerName
$currentMonth = $currentMonth.AddMonths(1)
}
Write-Host "Backfill complete!"
Method 3: Backfill Using Azure CLI
#!/bin/bash
# Set variables
SUBSCRIPTION_ID="12345678-1234-1234-1234-123456789012"
STORAGE_ACCOUNT="costmgmtstorage"
RESOURCE_GROUP="finops-rg"
CONTAINER="cost-exports"
# Date range to backfill
START_DATE="2024-01-01"
END_DATE="2024-03-31"
# Login to Azure
az login
# Function to download monthly data
download_month() {
local year=$1
local month=$2
# Calculate first and last day of month (GNU date; on macOS install coreutils and use gdate)
local first_day="${year}-${month}-01"
local last_day=$(date -d "${first_day} +1 month -1 day" +%Y-%m-%d)
echo "Downloading data for ${year}-${month}..."
# Download usage details
local filename="usage-${year}-${month}.json"
az costmanagement query \
--type "Usage" \
--timeframe "Custom" \
--time-period from="${first_day}" to="${last_day}" \
--dataset-granularity "Daily" \
--scope "/subscriptions/${SUBSCRIPTION_ID}" \
--output json > "${filename}"
# Upload to storage
az storage blob upload \
--account-name "${STORAGE_ACCOUNT}" \
--container-name "${CONTAINER}" \
--name "backfill/${year}-${month}/${filename}" \
--file "${filename}" \
--auth-mode login
# Clean up
rm "${filename}"
echo "Completed ${year}-${month}"
}
# Loop through months
current_date="${START_DATE}"
while [[ "${current_date}" < "${END_DATE}" ]]; do
year=$(date -d "${current_date}" +%Y)
month=$(date -d "${current_date}" +%m)
download_month "${year}" "${month}"
# Move to next month
current_date=$(date -d "${current_date} +1 month" +%Y-%m-01)
done
echo "Backfill complete!"
Method 4: Backfill Using Cost Management Export API
# Create a one-time export for historical data
$exportName = "historical-backfill-2024-Q1"
$scope = "/subscriptions/12345678-1234-1234-1234-123456789012"
$storageAccountId = "/subscriptions/12345678-1234-1234-1234-123456789012/resourceGroups/finops-rg/providers/Microsoft.Storage/storageAccounts/costmgmtstorage"
$containerName = "cost-exports"
$directory = "backfill"
# Define the export configuration
# Parameter names follow the flattened model used by the Az.CostManagement module
$exportParams = @{
    Name                      = $exportName
    Scope                     = $scope
    DefinitionType            = "Usage"
    DefinitionTimeframe       = "Custom"
    TimePeriodFrom            = "2024-01-01T00:00:00Z"
    TimePeriodTo              = "2024-03-31T23:59:59Z"
    DataSetGranularity        = "Daily"
    ConfigurationColumn       = @(
        "Date", "MeterId", "ResourceId", "ResourceLocation",
        "ConsumedService", "ResourceGroup", "InstanceName",
        "Cost", "UnitPrice", "Quantity", "Tags"
    )
    DestinationResourceId     = $storageAccountId
    DestinationContainer      = $containerName
    DestinationRootFolderPath = $directory
    Format                    = "Csv"
    ScheduleStatus            = "Inactive"  # one-time export
}

# Create the export
New-AzCostManagementExport @exportParams

# Execute the export immediately
Invoke-AzCostManagementExecuteExport -Scope $scope -ExportName $exportName
Write-Host "Export created and started. Check the storage account for results."
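A one-time export runs asynchronously, so downstream steps should wait for it to finish before reading the storage container. A generic polling sketch (Python; `get_status` is a stand-in for whatever call you use to read the export's run history, not a real SDK method):

```python
import time

def wait_for_export(get_status, timeout_s=600, interval_s=30):
    """Poll a status callable until it reports a terminal state.

    get_status: callable returning the latest run status string,
    e.g. 'InProgress', 'Completed', 'Failed' (names are assumptions).
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = get_status()
        if status in ("Completed", "Failed"):
            return status
        time.sleep(interval_s)
    raise TimeoutError("export did not finish within the timeout")

# Demo with a stub that completes on the third poll
calls = iter(["InProgress", "InProgress", "Completed"])
print(wait_for_export(lambda: next(calls), interval_s=0))  # Completed
```

The same loop shape applies whether the status comes from the REST run-history endpoint or from listing blobs in the destination folder.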
Merging Backfilled Data in Azure Synapse Analytics
Step 1: Create External Table for Backfill Data
-- Create external file format for CSV
CREATE EXTERNAL FILE FORMAT csv_format
WITH (
FORMAT_TYPE = DELIMITEDTEXT,
FORMAT_OPTIONS (
FIELD_TERMINATOR = ',',
STRING_DELIMITER = '"',
FIRST_ROW = 2,
USE_TYPE_DEFAULT = FALSE,
ENCODING = 'UTF8'
)
);
-- Create external data source
-- (TYPE = HADOOP is required for external tables over blob storage in a dedicated SQL pool)
CREATE EXTERNAL DATA SOURCE backfill_data
WITH (
TYPE = HADOOP,
LOCATION = 'wasbs://cost-exports@costmgmtstorage.blob.core.windows.net/backfill/',
CREDENTIAL = storage_credential
);
-- Create external table for backfill data
CREATE EXTERNAL TABLE ext_backfill_usage (
[Date] DATE,
[MeterId] NVARCHAR(100),
[ResourceId] NVARCHAR(500),
[ResourceLocation] NVARCHAR(100),
[ConsumedService] NVARCHAR(100),
[ResourceGroup] NVARCHAR(200),
[InstanceName] NVARCHAR(500),
[Cost] DECIMAL(18,4),
[UnitPrice] DECIMAL(18,8),
[Quantity] DECIMAL(18,4),
[Tags] NVARCHAR(MAX)
)
WITH (
LOCATION = '/',
DATA_SOURCE = backfill_data,
FILE_FORMAT = csv_format,
REJECT_TYPE = VALUE,
REJECT_VALUE = 100
);
Step 2: Merge with Existing Data
-- Create a unified view that combines existing and backfilled data
CREATE OR ALTER VIEW vw_unified_cost_data
AS
SELECT
[Date],
[MeterId],
[ResourceId],
[ResourceLocation],
[ConsumedService],
[ResourceGroup],
[InstanceName],
[Cost],
[UnitPrice],
[Quantity],
[Tags],
'existing' AS DataSource
FROM dbo.existing_cost_data
UNION ALL
SELECT
[Date],
[MeterId],
[ResourceId],
[ResourceLocation],
[ConsumedService],
[ResourceGroup],
[InstanceName],
[Cost],
[UnitPrice],
[Quantity],
[Tags],
'backfill' AS DataSource
FROM ext_backfill_usage
WHERE [Date] NOT IN (
-- Exclude dates already in existing data to avoid duplicates
SELECT DISTINCT [Date] FROM dbo.existing_cost_data
);
-- Materialize the unified data into a new table
CREATE TABLE dbo.unified_cost_data
WITH (
DISTRIBUTION = HASH([Date]),
CLUSTERED COLUMNSTORE INDEX
)
AS
SELECT *
FROM vw_unified_cost_data;
-- Create statistics for query optimization
CREATE STATISTICS stat_date ON dbo.unified_cost_data([Date]);
CREATE STATISTICS stat_resourceid ON dbo.unified_cost_data([ResourceId]);
CREATE STATISTICS stat_cost ON dbo.unified_cost_data([Cost]);
Step 3: Verify Data Completeness
-- Check for gaps in date coverage
WITH DateSequence AS (
-- TOP caps the row count; a window function cannot be referenced in WHERE
SELECT TOP (DATEDIFF(DAY, '2024-01-01', '2024-12-31') + 1)
DATEADD(DAY, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1, '2024-01-01') AS ExpectedDate
FROM sys.all_columns
),
ActualDates AS (
SELECT DISTINCT [Date] AS ActualDate
FROM dbo.unified_cost_data
)
SELECT
ds.ExpectedDate,
CASE WHEN ad.ActualDate IS NULL THEN 'Missing' ELSE 'Present' END AS Status
FROM DateSequence ds
LEFT JOIN ActualDates ad ON ds.ExpectedDate = ad.ActualDate
WHERE ad.ActualDate IS NULL
ORDER BY ds.ExpectedDate;
-- Compare record counts before and after backfill
SELECT
'Before Backfill' AS Period,
COUNT(*) AS RecordCount,
MIN([Date]) AS FirstDate,
MAX([Date]) AS LastDate,
SUM([Cost]) AS TotalCost
FROM dbo.existing_cost_data
UNION ALL
SELECT
'After Backfill' AS Period,
COUNT(*) AS RecordCount,
MIN([Date]) AS FirstDate,
MAX([Date]) AS LastDate,
SUM([Cost]) AS TotalCost
FROM dbo.unified_cost_data;
Merging Backfilled Data in Power BI
Step 1: Load Backfill Data from Azure Storage
let
// Connect to Azure Storage
Source = AzureStorage.Blobs("https://costmgmtstorage.blob.core.windows.net"),
// Filter to backfill container
BackfillContainer = Source{[Name="cost-exports"]}[Data],
BackfillFolder = BackfillContainer{[Name="backfill"]}[Data],
// Get all CSV files
FilteredFiles = Table.SelectRows(BackfillFolder, each Text.EndsWith([Name], ".csv")),
// Parse each file and promote its own header row before combining,
// so header rows from later files do not end up as data
CombinedCSV = Table.Combine(
List.Transform(
Table.ToRecords(FilteredFiles),
each Table.PromoteHeaders(
Csv.Document([Content], [Delimiter=",", Encoding=65001, QuoteStyle=QuoteStyle.Csv]),
[PromoteAllScalars=true]
)
)
),
// Transform data types
TypedData = Table.TransformColumnTypes(CombinedCSV, {
{"Date", type date},
{"Cost", type number},
{"Quantity", type number},
{"UnitPrice", type number},
{"MeterId", type text},
{"ResourceId", type text},
{"ResourceGroup", type text}
}),
// Add data source tag
AddedSource = Table.AddColumn(TypedData, "DataSource", each "Backfill", type text)
in
AddedSource
Step 2: Merge with Existing Data
let
// Load existing cost data
ExistingData = #"Existing Cost Data",
AddSourceTag = Table.AddColumn(ExistingData, "DataSource", each "Current Export", type text),
// Load backfill data
BackfillData = #"Backfill Data",
// Combine both datasets
CombinedData = Table.Combine({AddSourceTag, BackfillData}),
// Remove duplicates (keep existing data if overlap)
RemovedDuplicates = Table.Distinct(CombinedData, {"Date", "ResourceId", "MeterId"}),
// Sort by date
SortedData = Table.Sort(RemovedDuplicates, {{"Date", Order.Ascending}})
in
SortedData
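The ordering inside Table.Combine is what makes the dedup keep the existing rows: Table.Distinct retains the first occurrence of each key. The same logic as a language-neutral sketch (plain Python; the row dictionaries and field names are illustrative):

```python
def merge_keep_first(existing, backfill, key_fields=("Date", "ResourceId", "MeterId")):
    """Combine two row lists; on a key collision the existing row wins,
    mirroring Table.Distinct keeping the first occurrence it sees."""
    seen = set()
    merged = []
    for row in existing + backfill:  # existing first, so it wins ties
        key = tuple(row[f] for f in key_fields)
        if key not in seen:
            seen.add(key)
            merged.append(row)
    return merged

# The existing row for 2024-01-01 survives; the overlapping backfill row is dropped
existing = [{"Date": "2024-01-01", "ResourceId": "vm1", "MeterId": "m1", "Cost": 1.0}]
backfill = [
    {"Date": "2024-01-01", "ResourceId": "vm1", "MeterId": "m1", "Cost": 9.9},
    {"Date": "2024-01-02", "ResourceId": "vm1", "MeterId": "m1", "Cost": 2.0},
]
print(merge_keep_first(existing, backfill))
```

If backfill data should win instead, reverse the concatenation order rather than changing the distinct logic.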
Step 3: Create Date Validation Table
let
// Create complete date sequence
StartDate = #date(2024, 1, 1),
EndDate = #date(2024, 12, 31),
DateList = List.Dates(StartDate, Duration.Days(EndDate - StartDate) + 1, #duration(1, 0, 0, 0)),
DateTable = Table.FromList(DateList, Splitter.SplitByNothing(), {"ExpectedDate"}, null, ExtraValues.Error),
// Get actual dates from data
ActualDates = Table.Distinct(#"Merged Cost Data", {"Date"}),
RenamedActual = Table.RenameColumns(ActualDates, {{"Date", "ActualDate"}}),
// Left join to find gaps
MergedDates = Table.NestedJoin(DateTable, {"ExpectedDate"}, RenamedActual, {"ActualDate"}, "ActualData", JoinKind.LeftOuter),
ExpandedDates = Table.ExpandTableColumn(MergedDates, "ActualData", {"ActualDate"}, {"ActualDate"}),
// Add status column
AddedStatus = Table.AddColumn(ExpandedDates, "Status", each if [ActualDate] = null then "Missing" else "Present"),
// Filter to show only missing dates
MissingDates = Table.SelectRows(AddedStatus, each ([Status] = "Missing"))
in
MissingDates
Best Practices
1. Validate Before Merging
Always verify the integrity of backfilled data before merging:
- Check date ranges for completeness
- Verify cost totals match known invoices
- Ensure no duplicate records
- Validate data schema consistency
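These checks are easy to script against the downloaded CSVs before anything is merged. A minimal sketch (plain Python; the column names Date, ResourceId, MeterId, and Cost are assumptions — adjust them to your export schema):

```python
from datetime import date, timedelta

def validate_backfill(rows, start, end):
    """Run basic integrity checks on backfilled usage rows before merging.

    rows: iterable of dicts with 'Date' (YYYY-MM-DD), 'ResourceId',
    'MeterId', and 'Cost' keys (assumed field names).
    """
    seen_days = set()
    seen_keys = set()
    duplicates = 0
    total_cost = 0.0
    for r in rows:
        seen_days.add(date.fromisoformat(r["Date"]))
        key = (r["Date"], r["ResourceId"], r["MeterId"])
        if key in seen_keys:
            duplicates += 1
        seen_keys.add(key)
        total_cost += float(r["Cost"])
    expected = {start + timedelta(days=i) for i in range((end - start).days + 1)}
    return {
        "missing_days": sorted(expected - seen_days),
        "duplicates": duplicates,
        "total_cost": round(total_cost, 2),  # compare against the invoice total
    }

rows = [
    {"Date": "2024-01-01", "ResourceId": "vm1", "MeterId": "m1", "Cost": "1.50"},
    {"Date": "2024-01-01", "ResourceId": "vm1", "MeterId": "m1", "Cost": "1.50"},
    {"Date": "2024-01-03", "ResourceId": "vm1", "MeterId": "m1", "Cost": "2.00"},
]
print(validate_backfill(rows, date(2024, 1, 1), date(2024, 1, 3)))
```

Any non-empty `missing_days` or non-zero `duplicates` should be resolved before the merge step, not after.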
2. Maintain Data Lineage
Tag backfilled data with metadata:
-- unified_cost_data already carries a DataSource column from the view,
-- so add only the audit columns
ALTER TABLE dbo.unified_cost_data
ADD
BackfillDate DATETIME2 DEFAULT GETDATE(),
BackfillMethod NVARCHAR(100);
UPDATE dbo.unified_cost_data
SET BackfillMethod = 'Cost Management Portal Download'
WHERE DataSource = 'backfill';
3. Implement Incremental Refresh
Configure Power BI for incremental refresh to handle large datasets:
// Add RangeStart and RangeEnd parameters
let
Source = #"Merged Cost Data",
FilteredRows = Table.SelectRows(Source,
each [Date] >= RangeStart and [Date] < RangeEnd)
in
FilteredRows
4. Archive Original Backfill Files
Keep original backfill files separate from ongoing exports:
# Create archive folder structure
az storage blob directory create \
--account-name costmgmtstorage \
--container-name cost-exports \
--directory-path "backfill-archive/$(date +%Y-%m-%d)" \
--auth-mode login
5. Document the Backfill Process
Create a backfill log table:
CREATE TABLE dbo.backfill_log (
BackfillId INT IDENTITY(1,1) PRIMARY KEY,
BackfillDate DATETIME2 DEFAULT GETDATE(),
DateRangeStart DATE,
DateRangeEnd DATE,
RecordsAdded INT,
TotalCostAdded DECIMAL(18,2),
BackfillMethod NVARCHAR(200),
ExecutedBy NVARCHAR(100),
Notes NVARCHAR(MAX)
);
-- Log backfill operations
INSERT INTO dbo.backfill_log (DateRangeStart, DateRangeEnd, RecordsAdded, TotalCostAdded, BackfillMethod, ExecutedBy, Notes)
SELECT
'2024-01-01',
'2024-03-31',
COUNT(*),
SUM([Cost]),
'PowerShell Script - Cost Management API',
SYSTEM_USER,
'Q1 2024 backfill for new Synapse workspace'
FROM dbo.unified_cost_data
WHERE DataSource = 'backfill';
Troubleshooting
Issue: Downloaded data has different schema than existing exports
Cause: Cost Management export schema changes over time or different export types selected.
Solution:
- Compare schemas:

-- Check existing schema
SELECT COLUMN_NAME, DATA_TYPE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'existing_cost_data';

-- Check backfill schema
EXEC sp_describe_first_result_set N'SELECT * FROM ext_backfill_usage';
- Create a mapping/transformation layer:

CREATE VIEW vw_normalized_backfill AS
SELECT
[Date],
[MeterId],
[ResourceId],
-- Map old column names to new ones
ISNULL([Cost], [PreTaxCost]) AS Cost,
ISNULL([UnitPrice], [EffectivePrice]) AS UnitPrice
FROM ext_backfill_usage;
Issue: Duplicate records after merging
Cause: Overlapping date ranges between backfill and existing data.
Solution:
-- Identify duplicates
SELECT
[Date],
[ResourceId],
[MeterId],
COUNT(*) AS DuplicateCount
FROM dbo.unified_cost_data
GROUP BY [Date], [ResourceId], [MeterId]
HAVING COUNT(*) > 1;
-- Remove duplicates keeping the most recent record
WITH CTE AS (
SELECT *,
ROW_NUMBER() OVER (
PARTITION BY [Date], [ResourceId], [MeterId]
ORDER BY BackfillDate DESC
) AS RowNum
FROM dbo.unified_cost_data
)
DELETE FROM CTE WHERE RowNum > 1;
Issue: Cost Management API returns "Data not available" for historical dates
Cause: Data retention limits or billing account type restrictions.
Solution:
- Check data retention limits:
- Standard subscriptions: 13 months
- Enterprise Agreement: Up to 36 months
- Microsoft Customer Agreement: Up to 13 months
- Use alternate data sources:

# Try the legacy Consumption API
az consumption usage list \
    --start-date 2024-01-01 \
    --end-date 2024-01-31 \
    --output json
- Contact Azure Support for data older than the retention period
Issue: Power BI refresh fails with "Out of Memory" error
Cause: Too much data being loaded at once.
Solution:
- Enable incremental refresh in Power BI:
- Configure RangeStart and RangeEnd parameters
- Set refresh policy (e.g., refresh last 6 months, archive 2 years)
- Use DirectQuery for large datasets:

// Connect Power BI to Synapse and choose DirectQuery as the storage mode
// (storage mode is selected in the connection dialog, not via an M option)
let
    Source = Sql.Database(
        "synapse-workspace.sql.azuresynapse.net",
        "cost_db",
        [Query = "SELECT * FROM dbo.unified_cost_data"]
    )
in
    Source
Issue: Backfill data shows inconsistent costs compared to invoices
Cause: Different cost types (actual vs. amortized) or currency conversion.
Solution:
- Verify the cost type in exports:

# Ensure both datasets use the same cost metric (Usage/ActualCost vs AmortizedCost)
(Get-AzCostManagementExport -Scope $scope -Name $exportName).DefinitionType
- Reconcile with invoice data:

SELECT
YEAR([Date]) AS Year,
MONTH([Date]) AS Month,
SUM([Cost]) AS ExportedCost
FROM dbo.unified_cost_data
GROUP BY YEAR([Date]), MONTH([Date])
ORDER BY Year, Month;
-- Compare with invoice amounts
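To see why the two metrics diverge, consider an upfront reservation purchase: ActualCost books the whole charge on the purchase date, while AmortizedCost spreads it evenly across the reservation term. A toy illustration (Python; the figures are made up):

```python
def actual_vs_amortized(purchase_cost, term_days, day):
    """Cost recorded on a given day (0-indexed from purchase) under each metric
    for an upfront reservation purchase."""
    actual = purchase_cost if day == 0 else 0.0
    amortized = purchase_cost / term_days
    return actual, amortized

# A $730 one-year reservation: actual cost lands entirely on day 0,
# amortized cost shows $2/day for the whole term
print(actual_vs_amortized(730.0, 365, 0))    # (730.0, 2.0)
print(actual_vs_amortized(730.0, 365, 100))  # (0.0, 2.0)
```

Monthly totals under the two metrics only converge over the full term, which is why a single month rarely reconciles against the invoice when the metrics are mixed.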
Next Steps
After successfully backfilling missing billing data, consider these related tasks:
- Merge historical and new data: How to Merge Historical and New Cost Export Data in Azure
- Monitor export data freshness: How to Monitor Cost Export Status and Data Freshness in Azure
- Visualize complete dataset: How to Visualize Azure Billing Data in Power BI
- Set up automated exports: How to Enable Cost Management Export to Azure Storage