How to Backfill Missing Azure Billing Data in Synapse or Power BI

Retrieve and merge historical billing data for complete cost reports

15 min read · Updated January 2025

When setting up Azure Cost Management exports or migrating to new analytics platforms, you may discover gaps in your historical billing data. These gaps can occur due to export configuration changes, storage account migrations, or delayed export setup. This guide explains how to retrieve missing historical billing data and merge it with your existing datasets in Azure Synapse Analytics or Power BI for complete cost analysis and reporting.

Backfilling ensures you have continuous historical data for trend analysis, forecasting, and year-over-year comparisons without gaps that could skew financial reporting.

Prerequisites

Before you begin, ensure you have:

  • Billing Account Reader or Billing Account Contributor role on the billing account
  • Access to the Azure portal (portal.azure.com)
  • Existing Azure Storage account for cost data exports
  • Azure Synapse Analytics workspace or Power BI Desktop installed
  • Storage Blob Data Contributor role on the storage account
  • Understanding of the date range of missing data
  • PowerShell 7.0+ or Azure CLI installed for automation scripts

Understanding Data Backfill Options

Azure provides several methods to retrieve historical billing data:

1. Cost Management Portal Download (Manual)

  • Range: Up to 13 months of historical data
  • Format: CSV files
  • Best for: Small one-time backfills

2. Cost Management Export API (Programmatic)

  • Range: Up to 13 months for standard subscriptions
  • Format: CSV files in Azure Storage
  • Best for: Automated backfills and scheduled refreshes

3. Usage Details API (Advanced)

  • Range: Up to 13 months (varies by billing agreement)
  • Format: JSON response
  • Best for: Custom data processing and transformation

4. Backfill via Existing Export Recreation

  • Range: Limited to export retention settings
  • Format: Same as existing exports
  • Best for: Filling small gaps
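
Because every option above is bounded by a retention window, it helps to check up front which months of your gap are still retrievable before choosing a method. A minimal sketch in Python (the 13-month default and the helper name are illustrative; actual retention depends on your billing agreement):

```python
from datetime import date

def retrievable_months(gap_start: date, gap_end: date,
                       today: date, retention_months: int = 13) -> list:
    """List the YYYY-MM months in [gap_start, gap_end] that still fall
    inside the retention window ending at `today`."""
    # Oldest month still inside the retention window
    idx = today.year * 12 + (today.month - 1) - retention_months
    oldest = date(idx // 12, idx % 12 + 1, 1)

    months, current = [], date(gap_start.year, gap_start.month, 1)
    while current <= gap_end:
        if current >= oldest:
            months.append(current.strftime("%Y-%m"))
        # Advance to the first day of the next month
        current = (date(current.year + 1, 1, 1) if current.month == 12
                   else date(current.year, current.month + 1, 1))
    return months

# Example: a gap from Oct 2023 through Feb 2024, checked on 15 Jan 2025
print(retrievable_months(date(2023, 10, 1), date(2024, 2, 29), date(2025, 1, 15)))
# → ['2023-12', '2024-01', '2024-02']
```

Months that fall outside the window (here, Oct and Nov 2023) can only be recovered from invoices or an Azure Support request.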

Step-by-Step Guide

Method 1: Manual Backfill via Cost Management Portal

Step 1: Download Historical Usage Data

  1. Sign in to the Azure portal
  2. Navigate to Cost Management + Billing
  3. Select your billing scope (subscription, resource group, or billing account)
  4. Click Cost analysis in the left menu
  5. Configure the date range:
    • Click the date selector at the top
    • Select Custom date range
    • Set start and end dates for the missing period
  6. Click Download in the toolbar
  7. Select Usage details (CSV) or Amortized cost
  8. Click Download to save the CSV file

Step 2: Upload to Azure Storage

# Connect to Azure
Connect-AzAccount

# Define variables
$storageAccountName = "costmgmtstorage"
$resourceGroupName = "finops-rg"
$containerName = "cost-exports"
$localFilePath = "C:\Downloads\usage-data-2024-01.csv"
$blobName = "manual-backfill/usage-2024-01.csv"

# Get storage account context
$ctx = (Get-AzStorageAccount -ResourceGroupName $resourceGroupName -Name $storageAccountName).Context

# Upload the file
Set-AzStorageBlobContent `
  -File $localFilePath `
  -Container $containerName `
  -Blob $blobName `
  -Context $ctx `
  -Force

Method 2: Automated Backfill Using PowerShell

Create a PowerShell script to download multiple months of historical data:

# Install required modules
Install-Module -Name Az.CostManagement -Force -Scope CurrentUser
Install-Module -Name Az.Storage -Force -Scope CurrentUser

# Connect to Azure
Connect-AzAccount

# Define parameters
$subscriptionId = "12345678-1234-1234-1234-123456789012"
$storageAccountName = "costmgmtstorage"
$resourceGroupName = "finops-rg"
$containerName = "cost-exports"

# Define date range to backfill (e.g., January 2024 to March 2024)
$startDate = Get-Date "2024-01-01"
$endDate = Get-Date "2024-03-31"

# Get storage context
$ctx = (Get-AzStorageAccount -ResourceGroupName $resourceGroupName -Name $storageAccountName).Context

# Function to download usage data for a specific month
function Get-MonthlyUsageData {
    param(
        [DateTime]$Month,
        [string]$Subscription,
        [object]$StorageContext,
        [string]$Container
    )

    $monthStart = Get-Date -Year $Month.Year -Month $Month.Month -Day 1
    $monthEnd = $monthStart.AddMonths(1).AddDays(-1)

    Write-Host "Downloading usage data for $($monthStart.ToString('yyyy-MM'))..."

    # Destination folder in the storage container for this month
    $blobPath = "backfill/$($monthStart.ToString('yyyy-MM'))"

    # Scope the query to the subscription
    $scope = "/subscriptions/$Subscription"

    # Download usage details via the Azure CLI costmanagement extension
    # (install once with: az extension add --name costmanagement)
    $fileName = "usage-$($monthStart.ToString('yyyy-MM')).csv"
    $localPath = Join-Path $env:TEMP $fileName

    # The query response is a table of `columns` and `rows`, not flat
    # records, so reshape it before writing CSV
    $result = az costmanagement query `
      --type "ActualCost" `
      --dataset-granularity "Daily" `
      --dataset-aggregation (@{ totalCost = @{ name = "PreTaxCost"; function = "Sum" } } | ConvertTo-Json -Compress) `
      --timeframe "Custom" `
      --time-period from=$($monthStart.ToString('yyyy-MM-dd')) to=$($monthEnd.ToString('yyyy-MM-dd')) `
      --scope $scope `
      --output json | ConvertFrom-Json

    $columnNames = $result.columns.name
    $result.rows | ForEach-Object {
        $record = [ordered]@{}
        for ($i = 0; $i -lt $columnNames.Count; $i++) {
            $record[$columnNames[$i]] = $_[$i]
        }
        [pscustomobject]$record
    } | Export-Csv -Path $localPath -NoTypeInformation

    # Upload to storage
    Set-AzStorageBlobContent `
      -File $localPath `
      -Container $Container `
      -Blob "$blobPath/$fileName" `
      -Context $StorageContext `
      -Force

    Write-Host "Uploaded $fileName to $blobPath"

    # Clean up local file
    Remove-Item $localPath -Force
}

# Loop through each month and download
$currentMonth = $startDate
while ($currentMonth -le $endDate) {
    Get-MonthlyUsageData -Month $currentMonth -Subscription $subscriptionId -StorageContext $ctx -Container $containerName
    $currentMonth = $currentMonth.AddMonths(1)
}

Write-Host "Backfill complete!"
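
The query result reshaping above relies on the Cost Management query payload containing a `columns` array (column metadata) and a `rows` array (value lists). If you prefer to do that step outside PowerShell, the same transformation is straightforward in Python; the sketch below assumes that columns/rows shape, and the sample field names are illustrative:

```python
import csv

def query_rows_to_records(payload: dict) -> list:
    """Zip the query result's column names with each row of values."""
    names = [c["name"] for c in payload["columns"]]
    return [dict(zip(names, row)) for row in payload["rows"]]

def write_query_csv(payload: dict, path: str) -> None:
    """Write the reshaped records out as a CSV file."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=[c["name"] for c in payload["columns"]])
        writer.writeheader()
        writer.writerows(query_rows_to_records(payload))

# A payload shaped like a (truncated) query response
sample = {
    "columns": [{"name": "PreTaxCost", "type": "Number"},
                {"name": "UsageDate", "type": "Number"}],
    "rows": [[12.34, 20240101], [56.78, 20240102]],
}
print(query_rows_to_records(sample)[0])
# → {'PreTaxCost': 12.34, 'UsageDate': 20240101}
```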

Method 3: Backfill Using Azure CLI

#!/bin/bash

# Set variables
SUBSCRIPTION_ID="12345678-1234-1234-1234-123456789012"
STORAGE_ACCOUNT="costmgmtstorage"
RESOURCE_GROUP="finops-rg"
CONTAINER="cost-exports"

# Date range to backfill
START_DATE="2024-01-01"
END_DATE="2024-03-31"

# Login to Azure
az login

# Function to download monthly data
download_month() {
    local year=$1
    local month=$2

    # Calculate first and last day of month (GNU date syntax; on macOS
    # install coreutils and use gdate)
    local first_day="${year}-${month}-01"
    local last_day=$(date -d "${first_day} +1 month -1 day" +%Y-%m-%d)

    echo "Downloading data for ${year}-${month}..."

    # Download usage details
    local filename="usage-${year}-${month}.json"

    az costmanagement query \
      --type "Usage" \
      --timeframe "Custom" \
      --time-period from="${first_day}" to="${last_day}" \
      --dataset-granularity "Daily" \
      --scope "/subscriptions/${SUBSCRIPTION_ID}" \
      --output json > "${filename}"

    # Upload to storage
    az storage blob upload \
      --account-name "${STORAGE_ACCOUNT}" \
      --container-name "${CONTAINER}" \
      --name "backfill/${year}-${month}/${filename}" \
      --file "${filename}" \
      --auth-mode login

    # Clean up
    rm "${filename}"

    echo "Completed ${year}-${month}"
}

# Loop through months
current_date="${START_DATE}"
while [[ "${current_date}" < "${END_DATE}" ]]; do
    year=$(date -d "${current_date}" +%Y)
    month=$(date -d "${current_date}" +%m)

    download_month "${year}" "${month}"

    # Move to next month
    current_date=$(date -d "${current_date} +1 month" +%Y-%m-01)
done

echo "Backfill complete!"

Method 4: Backfill Using Cost Management Export API

# Create a one-time export for historical data
$exportName = "historical-backfill-2024-Q1"
$scope = "/subscriptions/12345678-1234-1234-1234-123456789012"
$storageAccountId = "/subscriptions/12345678-1234-1234-1234-123456789012/resourceGroups/finops-rg/providers/Microsoft.Storage/storageAccounts/costmgmtstorage"
$containerName = "cost-exports"
$directory = "backfill"

# Define the export configuration. Note: New-AzCostManagementExport takes
# flattened parameters rather than a nested Definition object; exact names
# can vary by Az.CostManagement module version (see Get-Help New-AzCostManagementExport).
$exportParams = @{
    Name                      = $exportName
    Scope                     = $scope
    DefinitionType            = "Usage"
    DefinitionTimeframe       = "Custom"
    TimePeriodFrom            = [datetime]"2024-01-01T00:00:00Z"
    TimePeriodTo              = [datetime]"2024-03-31T23:59:59Z"
    DataSetGranularity        = "Daily"
    ConfigurationColumn       = @(
        "Date", "MeterId", "ResourceId", "ResourceLocation",
        "ConsumedService", "ResourceGroup", "InstanceName",
        "Cost", "UnitPrice", "Quantity", "Tags"
    )
    DestinationResourceId     = $storageAccountId
    DestinationContainer      = $containerName
    DestinationRootFolderPath = $directory
    Format                    = "Csv"
    ScheduleStatus            = "Inactive"  # One-time export
}

# Create the export
New-AzCostManagementExport @exportParams

# Execute the export immediately
Invoke-AzCostManagementExecuteExport -Scope $scope -ExportName $exportName

Write-Host "Export created and started. Check the storage account for results."
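
If the cmdlets are unavailable, the same execution step can be invoked through the Cost Management REST "Exports - Execute" operation. The sketch below only builds the request URL; the default api-version shown is an assumption (check the current REST docs), and the actual POST, with a bearer token, is left to your HTTP client of choice:

```python
def export_run_url(scope: str, export_name: str,
                   api_version: str = "2023-03-01") -> str:
    """Build the Cost Management 'Exports - Execute' POST URL for a scope."""
    return (f"https://management.azure.com{scope}"
            f"/providers/Microsoft.CostManagement/exports/{export_name}"
            f"/run?api-version={api_version}")

scope = "/subscriptions/12345678-1234-1234-1234-123456789012"
print(export_run_url(scope, "historical-backfill-2024-Q1"))
```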

Merging Backfilled Data in Azure Synapse Analytics

Step 1: Create External Table for Backfill Data

-- Create external file format for CSV
CREATE EXTERNAL FILE FORMAT csv_format
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (
        FIELD_TERMINATOR = ',',
        STRING_DELIMITER = '"',
        FIRST_ROW = 2,
        USE_TYPE_DEFAULT = FALSE,
        ENCODING = 'UTF8'
    )
);

-- Create external data source
CREATE EXTERNAL DATA SOURCE backfill_data
WITH (
    LOCATION = 'wasbs://cost-exports@costmgmtstorage.blob.core.windows.net/backfill/',
    CREDENTIAL = storage_credential
);

-- Create external table for backfill data
CREATE EXTERNAL TABLE ext_backfill_usage (
    [Date] DATE,
    [MeterId] NVARCHAR(100),
    [ResourceId] NVARCHAR(500),
    [ResourceLocation] NVARCHAR(100),
    [ConsumedService] NVARCHAR(100),
    [ResourceGroup] NVARCHAR(200),
    [InstanceName] NVARCHAR(500),
    [Cost] DECIMAL(18,4),
    [UnitPrice] DECIMAL(18,8),
    [Quantity] DECIMAL(18,4),
    [Tags] NVARCHAR(MAX)
)
WITH (
    LOCATION = '/',
    DATA_SOURCE = backfill_data,
    FILE_FORMAT = csv_format,
    REJECT_TYPE = VALUE,
    REJECT_VALUE = 100
);

Step 2: Merge with Existing Data

-- Create a unified view that combines existing and backfilled data
CREATE OR ALTER VIEW vw_unified_cost_data
AS
SELECT
    [Date],
    [MeterId],
    [ResourceId],
    [ResourceLocation],
    [ConsumedService],
    [ResourceGroup],
    [InstanceName],
    [Cost],
    [UnitPrice],
    [Quantity],
    [Tags],
    'existing' AS DataSource
FROM dbo.existing_cost_data

UNION ALL

SELECT
    [Date],
    [MeterId],
    [ResourceId],
    [ResourceLocation],
    [ConsumedService],
    [ResourceGroup],
    [InstanceName],
    [Cost],
    [UnitPrice],
    [Quantity],
    [Tags],
    'backfill' AS DataSource
FROM ext_backfill_usage
WHERE [Date] NOT IN (
    -- Exclude dates already in existing data to avoid duplicates
    SELECT DISTINCT [Date] FROM dbo.existing_cost_data
);

-- Materialize the unified data into a new table
CREATE TABLE dbo.unified_cost_data
WITH (
    DISTRIBUTION = HASH([Date]),
    CLUSTERED COLUMNSTORE INDEX
)
AS
SELECT *
FROM vw_unified_cost_data;

-- Create statistics for query optimization
CREATE STATISTICS stat_date ON dbo.unified_cost_data([Date]);
CREATE STATISTICS stat_resourceid ON dbo.unified_cost_data([ResourceId]);
CREATE STATISTICS stat_cost ON dbo.unified_cost_data([Cost]);

Step 3: Verify Data Completeness

-- Check for gaps in date coverage
WITH DateSequence AS (
    -- Window functions cannot appear in a WHERE clause, so bound the
    -- row count with TOP instead
    SELECT TOP (DATEDIFF(DAY, '2024-01-01', '2024-12-31') + 1)
        DATEADD(DAY, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1, '2024-01-01') AS ExpectedDate
    FROM sys.all_columns
),
ActualDates AS (
    SELECT DISTINCT [Date] AS ActualDate
    FROM dbo.unified_cost_data
)
SELECT
    ds.ExpectedDate,
    CASE WHEN ad.ActualDate IS NULL THEN 'Missing' ELSE 'Present' END AS Status
FROM DateSequence ds
LEFT JOIN ActualDates ad ON ds.ExpectedDate = ad.ActualDate
WHERE ad.ActualDate IS NULL
ORDER BY ds.ExpectedDate;

-- Compare record counts before and after backfill
SELECT
    'Before Backfill' AS Period,
    COUNT(*) AS RecordCount,
    MIN([Date]) AS FirstDate,
    MAX([Date]) AS LastDate,
    SUM([Cost]) AS TotalCost
FROM dbo.existing_cost_data

UNION ALL

SELECT
    'After Backfill' AS Period,
    COUNT(*) AS RecordCount,
    MIN([Date]) AS FirstDate,
    MAX([Date]) AS LastDate,
    SUM([Cost]) AS TotalCost
FROM dbo.unified_cost_data;
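
The same gap check can be run locally against the downloaded files before anything is loaded into Synapse, which catches incomplete backfills early. A stdlib-only sketch:

```python
from datetime import date, timedelta

def missing_dates(actual: set, start: date, end: date) -> list:
    """Return every calendar day in [start, end] absent from `actual`."""
    expected = (start + timedelta(days=i)
                for i in range((end - start).days + 1))
    return [d for d in expected if d not in actual]

# Dates found in the backfill CSVs (e.g. parsed from the Date column)
loaded = {date(2024, 1, 1), date(2024, 1, 2), date(2024, 1, 4)}
print(missing_dates(loaded, date(2024, 1, 1), date(2024, 1, 5)))
# → [datetime.date(2024, 1, 3), datetime.date(2024, 1, 5)]
```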

Merging Backfilled Data in Power BI

Step 1: Load Backfill Data from Azure Storage

let
    // Connect to Azure Storage
    Source = AzureStorage.Blobs("https://costmgmtstorage.blob.core.windows.net"),

    // Filter to the backfill "folder" (blob storage is flat, so folders
    // are just name prefixes on the blobs)
    BackfillContainer = Source{[Name="cost-exports"]}[Data],
    BackfillFolder = Table.SelectRows(BackfillContainer, each Text.StartsWith([Name], "backfill/")),

    // Get all CSV files
    FilteredFiles = Table.SelectRows(BackfillFolder, each Text.EndsWith([Name], ".csv")),

    // Combine all CSV files
    CombinedCSV = Table.Combine(
        List.Transform(
            Table.ToRecords(FilteredFiles),
            each Csv.Document([Content], [Delimiter=",", Encoding=65001, QuoteStyle=QuoteStyle.Csv])
        )
    ),

    // Promote headers
    PromotedHeaders = Table.PromoteHeaders(CombinedCSV, [PromoteAllScalars=true]),

    // Transform data types
    TypedData = Table.TransformColumnTypes(PromotedHeaders, {
        {"Date", type date},
        {"Cost", type number},
        {"Quantity", type number},
        {"UnitPrice", type number},
        {"MeterId", type text},
        {"ResourceId", type text},
        {"ResourceGroup", type text}
    }),

    // Add data source tag
    AddedSource = Table.AddColumn(TypedData, "DataSource", each "Backfill", type text)
in
    AddedSource

Step 2: Merge with Existing Data

let
    // Load existing cost data
    ExistingData = #"Existing Cost Data",
    AddSourceTag = Table.AddColumn(ExistingData, "DataSource", each "Current Export", type text),

    // Load backfill data
    BackfillData = #"Backfill Data",

    // Combine both datasets
    CombinedData = Table.Combine({AddSourceTag, BackfillData}),

    // Remove duplicates (keep existing data if overlap)
    RemovedDuplicates = Table.Distinct(CombinedData, {"Date", "ResourceId", "MeterId"}),

    // Sort by date
    SortedData = Table.Sort(RemovedDuplicates, {{"Date", Order.Ascending}})
in
    SortedData
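
Note that `Table.Distinct` keeps the first row it encounters per key, which is why the existing export is listed first in `Table.Combine` — the ordering, not the function itself, is what prefers current data on overlap. The same keep-first semantics, sketched in Python for clarity:

```python
def distinct_keep_first(rows: list, key_fields: tuple) -> list:
    """Mirror Table.Distinct: keep the first row seen for each key tuple."""
    seen, kept = set(), []
    for row in rows:
        key = tuple(row[f] for f in key_fields)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept

rows = [
    {"Date": "2024-01-15", "ResourceId": "vm1", "MeterId": "m1", "DataSource": "Current Export"},
    {"Date": "2024-01-15", "ResourceId": "vm1", "MeterId": "m1", "DataSource": "Backfill"},
]
print(distinct_keep_first(rows, ("Date", "ResourceId", "MeterId")))
# → only the "Current Export" row survives
```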

Step 3: Create Date Validation Table

let
    // Create complete date sequence
    StartDate = #date(2024, 1, 1),
    EndDate = #date(2024, 12, 31),
    DateList = List.Dates(StartDate, Duration.Days(EndDate - StartDate) + 1, #duration(1, 0, 0, 0)),
    DateTable = Table.FromList(DateList, Splitter.SplitByNothing(), {"ExpectedDate"}, null, ExtraValues.Error),

    // Get actual dates from data
    ActualDates = Table.Distinct(#"Merged Cost Data", {"Date"}),
    RenamedActual = Table.RenameColumns(ActualDates, {{"Date", "ActualDate"}}),

    // Left join to find gaps
    MergedDates = Table.NestedJoin(DateTable, {"ExpectedDate"}, RenamedActual, {"ActualDate"}, "ActualData", JoinKind.LeftOuter),
    ExpandedDates = Table.ExpandTableColumn(MergedDates, "ActualData", {"ActualDate"}, {"ActualDate"}),

    // Add status column
    AddedStatus = Table.AddColumn(ExpandedDates, "Status", each if [ActualDate] = null then "Missing" else "Present"),

    // Filter to show only missing dates
    MissingDates = Table.SelectRows(AddedStatus, each ([Status] = "Missing"))
in
    MissingDates

Best Practices

1. Validate Before Merging

Always verify the integrity of backfilled data before merging:

  • Check date ranges for completeness
  • Verify cost totals match known invoices
  • Ensure no duplicate records
  • Validate data schema consistency
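
These checks can be scripted so they run before every merge rather than as a one-off. A hedged sketch (the field names, tolerance, and helper name are illustrative, not part of any Azure API):

```python
def validate_backfill(rows: list, expected_columns: set,
                      invoice_total: float, tolerance: float = 0.01) -> list:
    """Return a list of human-readable problems found (empty = clean)."""
    problems = []
    # Schema consistency: every row carries the expected columns
    for i, row in enumerate(rows):
        missing = expected_columns - row.keys()
        if missing:
            problems.append(f"row {i} missing columns: {sorted(missing)}")
    # Exact duplicate records
    seen = set()
    for row in rows:
        key = tuple(sorted(row.items()))
        if key in seen:
            problems.append(f"duplicate row: {row}")
        seen.add(key)
    # Cost total vs the known invoice amount
    total = sum(float(r.get("Cost", 0)) for r in rows)
    if abs(total - invoice_total) > tolerance * invoice_total:
        problems.append(f"total {total:.2f} deviates from invoice {invoice_total:.2f}")
    return problems

rows = [{"Date": "2024-01-01", "Cost": "10.0"},
        {"Date": "2024-01-01", "Cost": "10.0"}]
print(validate_backfill(rows, {"Date", "Cost"}, 20.0))
```

An empty result means the file is safe to merge; anything else should be resolved (or logged) first.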

2. Maintain Data Lineage

Tag backfilled data with metadata:

-- The unified table already carries a DataSource tag from the view,
-- so add only the lineage columns that are new
ALTER TABLE dbo.unified_cost_data
ADD
    BackfillDate DATETIME2 DEFAULT GETDATE(),
    BackfillMethod NVARCHAR(100);

UPDATE dbo.unified_cost_data
SET BackfillMethod = 'Cost Management Portal Download'
WHERE DataSource = 'backfill';

3. Implement Incremental Refresh

Configure Power BI for incremental refresh to handle large datasets:

// Add RangeStart and RangeEnd parameters
let
    Source = #"Merged Cost Data",
    FilteredRows = Table.SelectRows(Source,
        each [Date] >= RangeStart and [Date] < RangeEnd)
in
    FilteredRows

4. Archive Original Backfill Files

Keep original backfill files separate from ongoing exports:

# Create archive folder structure. This requires an ADLS Gen2 account with
# hierarchical namespace enabled; on plain blob storage, "folders" are
# implicit in blob names and need no creation step.
az storage fs directory create \
  --account-name costmgmtstorage \
  --file-system cost-exports \
  --name "backfill-archive/$(date +%Y-%m-%d)" \
  --auth-mode login

5. Document the Backfill Process

Create a backfill log table:

CREATE TABLE dbo.backfill_log (
    BackfillId INT IDENTITY(1,1) PRIMARY KEY,
    BackfillDate DATETIME2 DEFAULT GETDATE(),
    DateRangeStart DATE,
    DateRangeEnd DATE,
    RecordsAdded INT,
    TotalCostAdded DECIMAL(18,2),
    BackfillMethod NVARCHAR(200),
    ExecutedBy NVARCHAR(100),
    Notes NVARCHAR(MAX)
);

-- Log backfill operations
INSERT INTO dbo.backfill_log (DateRangeStart, DateRangeEnd, RecordsAdded, TotalCostAdded, BackfillMethod, ExecutedBy, Notes)
SELECT
    '2024-01-01',
    '2024-03-31',
    COUNT(*),
    SUM([Cost]),
    'PowerShell Script - Cost Management API',
    SYSTEM_USER,
    'Q1 2024 backfill for new Synapse workspace'
FROM dbo.unified_cost_data
WHERE DataSource = 'backfill'
  AND [Date] BETWEEN '2024-01-01' AND '2024-03-31';

Troubleshooting

Issue: Downloaded data has different schema than existing exports

Cause: Cost Management export schema changes over time or different export types selected.

Solution:

  1. Compare schemas:

    -- Check existing schema
    SELECT COLUMN_NAME, DATA_TYPE
    FROM INFORMATION_SCHEMA.COLUMNS
    WHERE TABLE_NAME = 'existing_cost_data';
    
    -- Check backfill schema
    EXEC sp_describe_first_result_set N'SELECT * FROM ext_backfill_usage';
    
  2. Create a mapping/transformation layer:

    CREATE VIEW vw_normalized_backfill AS
    SELECT
        [Date],
        [MeterId],
        [ResourceId],
        -- Map old column names to new ones
        ISNULL([Cost], [PreTaxCost]) AS Cost,
        ISNULL([UnitPrice], [EffectivePrice]) AS UnitPrice
    FROM ext_backfill_usage;
    

Issue: Duplicate records after merging

Cause: Overlapping date ranges between backfill and existing data.

Solution:

-- Identify duplicates
SELECT
    [Date],
    [ResourceId],
    [MeterId],
    COUNT(*) AS DuplicateCount
FROM dbo.unified_cost_data
GROUP BY [Date], [ResourceId], [MeterId]
HAVING COUNT(*) > 1;

-- Remove duplicates keeping the most recent record
WITH CTE AS (
    SELECT *,
        ROW_NUMBER() OVER (
            PARTITION BY [Date], [ResourceId], [MeterId]
            ORDER BY BackfillDate DESC
        ) AS RowNum
    FROM dbo.unified_cost_data
)
DELETE FROM CTE WHERE RowNum > 1;

Issue: Cost Management API returns "Data not available" for historical dates

Cause: Data retention limits or billing account type restrictions.

Solution:

  1. Check data retention limits:

    • Standard subscriptions: 13 months
    • Enterprise Agreement: Up to 36 months
    • Microsoft Customer Agreement: Up to 13 months
  2. Use alternate data sources:

    # Try the legacy API
    az consumption usage list \
      --start-date 2024-01-01 \
      --end-date 2024-01-31 \
      --output json
    
  3. Contact Azure Support for data older than retention period

Issue: Power BI refresh fails with "Out of Memory" error

Cause: Too much data being loaded at once.

Solution:

  1. Enable incremental refresh in Power BI:

    • Configure RangeStart and RangeEnd parameters
    • Set refresh policy (e.g., refresh last 6 months, archive 2 years)
  2. Use DirectQuery for large datasets:

    // Connect to the dedicated SQL endpoint; choose DirectQuery as the
    // storage mode when adding the source in Power BI Desktop (there is
    // no M option that forces it)
    let
        Source = Sql.Database("synapse-workspace.sql.azuresynapse.net", "cost_db",
            [Query="SELECT * FROM dbo.unified_cost_data"])
    in
        Source
    

Issue: Backfill data shows inconsistent costs compared to invoices

Cause: Different cost types (actual vs. amortized) or currency conversion.

Solution:

  1. Verify cost type in exports:

    # Ensure the same cost metric is used (the export object exposes a
    # flattened DefinitionType property in recent Az.CostManagement versions)
    Get-AzCostManagementExport -Scope $scope -Name $exportName |
        Select-Object DefinitionType
    
  2. Reconcile with invoice data:

    SELECT
        YEAR([Date]) AS Year,
        MONTH([Date]) AS Month,
        SUM([Cost]) AS ExportedCost
    FROM dbo.unified_cost_data
    GROUP BY YEAR([Date]), MONTH([Date])
    ORDER BY Year, Month;
    
    -- Compare with invoice amounts
    
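
The same monthly roll-up can be computed locally from the raw rows before comparing against invoice line items (field names follow the export schema used above):

```python
from collections import defaultdict
from datetime import date

def monthly_totals(rows: list) -> dict:
    """Sum Cost per (year, month) across usage records."""
    totals = defaultdict(float)
    for r in rows:
        totals[(r["Date"].year, r["Date"].month)] += float(r["Cost"])
    return dict(totals)

rows = [
    {"Date": date(2024, 1, 5), "Cost": "10.50"},
    {"Date": date(2024, 1, 20), "Cost": "4.50"},
    {"Date": date(2024, 2, 1), "Cost": "7.00"},
]
print(monthly_totals(rows))  # → {(2024, 1): 15.0, (2024, 2): 7.0}
```

If a month's total differs from the invoice by more than rounding, check whether one side is actual cost and the other amortized.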

Next Steps

After successfully backfilling missing billing data, consider these related tasks:

  1. Merge historical and new data: How to Merge Historical and New Cost Export Data in Azure
  2. Monitor export data freshness: How to Monitor Cost Export Status and Data Freshness in Azure
  3. Visualize complete dataset: How to Visualize Azure Billing Data in Power BI
  4. Set up automated exports: How to Enable Cost Management Export to Azure Storage

Frequently Asked Questions

If you encounter 'Data not available' errors, first check your data retention limits: standard subscriptions retain 13 months of data, while Enterprise Agreements may retain up to 36 months. For dates still within retention, try the legacy API: `az consumption usage list --start-date <start-date> --end-date <end-date> --output json`. For data older than the retention period, contact Azure Support, which may be able to offer recovery options.
