Home/Blog/How do I convert nested JSON to CSV format?
Data Management

How do I convert nested JSON to CSV format?

Master the techniques for flattening complex nested JSON structures into spreadsheet-compatible CSV format with practical strategies.

By Inventive HQ Team
How do I convert nested JSON to CSV format?

Understanding JSON Nesting and CSV Limitations

JSON is a hierarchical, flexible format that excels at representing complex, nested data structures. Objects can contain other objects, arrays can hold mixed data types, and the format naturally supports relationships and hierarchies. CSV, by contrast, is fundamentally a flat, two-dimensional format—rows and columns where each cell contains a single value. Converting nested JSON to CSV requires collapsing this hierarchy into a tabular structure, which introduces unique challenges not present when converting simple, flat JSON.

The fundamental incompatibility between JSON's hierarchical nature and CSV's flat structure means you cannot preserve all the nuance of nested JSON in CSV format without some loss or transformation of the data structure. However, with the right strategies, you can convert nested JSON into CSV that maintains data integrity and usability.

Flattening Simple Nested Objects

The simplest nested structure involves an object containing properties that themselves are objects. For example:

{
  "id": 1,
  "name": "John Doe",
  "address": {
    "street": "123 Main St",
    "city": "Boston",
    "zip": "02101"
  }
}

To convert this to CSV, you "flatten" the nested object by creating column headers that represent the path to each value. The most common approach uses dot notation: address.street, address.city, address.zip. The resulting CSV would have columns:

id,name,address.street,address.city,address.zip
1,John Doe,123 Main St,Boston,02101

This approach scales to multiple levels of nesting. If you had an even deeper structure with address.location.coordinates.latitude, you'd simply create a column named exactly that. Most CSV-to-JSON converters support this dot notation approach, making it straightforward to work with moderately nested objects.

Handling Arrays in Nested JSON

Arrays within JSON objects present more complex challenges. Consider:

{
  "id": 1,
  "name": "John Doe",
  "phone_numbers": ["555-1234", "555-5678", "555-9999"]
}

You have several options for converting array data to CSV:

Option 1: Multiple columns - Create separate columns for each array element: phone_numbers_1, phone_numbers_2, phone_numbers_3. This works well when arrays have a known, limited size.

Option 2: Semicolon-separated values - Put all array values in a single column, separated by a delimiter (often semicolon or pipe): phone_numbers: "555-1234;555-5678;555-9999". This is compact but requires special handling when re-importing.

Option 3: Multiple rows - Create separate CSV rows for each array element, repeating non-array fields. This "denormalization" creates more rows but maintains pure CSV format without special conventions.

Option 4: String representation - Convert the array to a JSON string within the CSV cell: phone_numbers: "[\"555-1234\",\"555-5678\",\"555-9999\"]". This preserves the exact structure but requires parsing to use the data.

The best choice depends on your data characteristics and intended use. If arrays are small and consistent in size, multiple columns work well. If arrays are variable-length, the semicolon-separated or multi-row approach might be better.

Working with Arrays of Objects

The most complex scenario involves arrays of objects—a common JSON structure:

{
  "id": 1,
  "name": "John Doe",
  "orders": [
    {
      "order_id": "ORD-001",
      "amount": 99.99,
      "date": "2025-01-15"
    },
    {
      "order_id": "ORD-002",
      "amount": 149.99,
      "date": "2025-01-20"
    }
  ]
}

Converting this to CSV requires deciding how to represent the one-to-many relationship:

Denormalization approach - Create multiple CSV rows, repeating the parent object's fields:

id,name,orders.order_id,orders.amount,orders.date
1,John Doe,ORD-001,99.99,2025-01-15
1,John Doe,ORD-002,149.99,2025-01-20

This flattens the data into CSV format but creates data duplication. It's suitable when the array size is small and consistent.

Aggregation approach - If your analysis doesn't require individual array elements, aggregate the data. For example, calculate the total order amount and count:

id,name,total_orders,total_amount
1,John Doe,2,249.98

This loses the detail but produces a cleaner CSV.

Separate tables approach - Create two CSV files: one for customers, one for orders, with a foreign key relationship:

customers.csv:

id,name
1,John Doe

orders.csv:

order_id,customer_id,amount,date
ORD-001,1,99.99,2025-01-15
ORD-002,1,149.99,2025-01-20

This maintains data integrity and structure but requires multiple files and import discipline.

Recursive Nesting and Deeply Nested Structures

Some JSON structures contain recursively nested data—objects containing arrays of objects containing arrays. Converting these requires deciding how deep to flatten:

{
  "id": 1,
  "name": "Department",
  "teams": [
    {
      "team_id": "T1",
      "members": [
        {"name": "Alice", "role": "Lead"},
        {"name": "Bob", "role": "Developer"}
      ]
    }
  ]
}

You might flatten all the way to the deepest level:

  • teams.0.team_id, teams.0.members.0.name, teams.0.members.0.role, etc.

Or you might flatten only partially, keeping deep arrays as JSON strings. The decision depends on your analysis needs and usability requirements. Flattening too much creates a table with overwhelming columns and complexity, while flattening too little maintains JSON within CSV cells.

Handling Null Values and Missing Data

Nested JSON often contains null values and missing properties that need careful handling during conversion:

{
  "id": 1,
  "name": "John",
  "address": null,
  "company": {
    "name": "Acme Corp"
  }
}

When converting, you must decide:

  • Should address: null become an empty CSV cell or the literal text "null"?
  • For company.phone (missing property), should the cell be empty?
  • Should you include columns for all properties from all objects, even if some objects lack certain properties?

Professional converters typically include all possible columns and use empty cells for null or missing values, which is the standard approach. Some allow you to specify how to represent null values—as blank, as "null", as "N/A", or other conventions.

Implementation Strategies

Several approaches work for converting nested JSON to CSV:

Custom scripts - Write code using your preferred language. Python with pandas makes flattening straightforward:

import pandas as pd
import json

with open('data.json') as f:
    data = json.load(f)

df = pd.json_normalize(data)
df.to_csv('output.csv', index=False)

The json_normalize function handles flattening with dot notation automatically.

Online converters - Many free online tools handle nested JSON conversion with configurable options. These are convenient for one-off conversions but may have file size limits.

ETL tools - Enterprise tools like Apache Nifi, Talend, or custom Apache Spark jobs provide robust handling of complex transformations with monitoring and error handling.

Programming libraries - Use established JSON-to-CSV libraries specific to your language ecosystem. These handle edge cases and provide configurable flattening strategies.

Preserving Data Integrity During Conversion

When converting nested JSON to CSV, prioritize data integrity:

Test with sample data - Always test conversion with a small representative sample before processing large datasets. Verify that converted CSV contains all expected fields and values.

Validate row counts - If using denormalization (multiple rows per parent object), ensure the row count matches expectations. Count array elements to verify they're all represented.

Compare values - Spot-check specific values in the converted CSV against the source JSON to ensure nothing was corrupted or truncated.

Document the conversion - Record what flattening strategy you used, how arrays were handled, and what fields map to what columns. This documentation is essential if you need to re-convert or restore data.

Keep source files - Always retain the original nested JSON alongside the converted CSV. If issues emerge later, having the source allows re-conversion with different parameters.

Handling Special Cases

Certain JSON structures require special consideration:

Duplicate keys - Some systems generate JSON with duplicate keys (technically invalid but encountered in practice). Conversion behavior is unpredictable; validate the JSON is well-formed first.

Mixed-type arrays - Arrays containing different data types (strings and numbers, objects and primitives) complicate conversion. Clarify what to do with mixed types or consider cleaning the source JSON first.

Very large arrays - JSON with arrays containing thousands of elements can create impractically large CSV files (if denormalized) or very complex structures (if flattened with array indices). Consider aggregation or filtering before conversion.

Unicode and special characters - Nested JSON might contain Unicode characters, emojis, or special characters. Ensure your conversion tool preserves proper encoding.

Best Practices for Nested JSON Conversion

Choose the right strategy for your data shape - Understand your JSON structure before converting. Different shapes (flat objects vs. arrays of objects vs. deeply nested) require different approaches.

Document your choices - Record whether you denormalized, aggregated, or used separate tables. This documentation guides future work and helps others understand your data decisions.

Validate the output - Never assume conversion succeeded. Spot-check values, verify row counts, and validate structure matches expectations.

Consider the conversion flow - Think about whether you'll need to convert back to JSON later. Some flattening approaches are reversible; others are lossy.

Use appropriate tools - Don't reinvent the wheel with custom string manipulation. Use proper libraries or converters designed for nested JSON, which handle edge cases and RFC standards.

Test with your specific data - Generic advice helps, but your data might have unique characteristics. Test thoroughly with actual data before relying on converted results.

Conclusion

Converting nested JSON to CSV requires understanding your data structure and choosing an appropriate flattening strategy. Whether you flatten completely with dot notation, denormalize arrays into multiple rows, aggregate data, or use separate tables depends on your specific data shape and analytical needs. By understanding your options, testing thoroughly, and documenting your approach, you can reliably convert complex nested JSON into usable CSV format that maintains data integrity and serves your downstream processes effectively.

Need Expert IT & Security Guidance?

Our team is ready to help protect and optimize your business technology infrastructure.