What's actually useful in perfstat output vs noise I can ignore?

Focus on: CPU utilization >80% (indicates overloaded filer), disk utilization >70% per aggregate (performance bottleneck), network throughput approaching interface limits, high latency (>10ms for typical workloads). Ignore: memory stats (NetApp caches aggressively, high memory use is normal), specific protocol stats unless troubleshooting that protocol, detailed per-volume stats (aggregate-level is usually sufficient). Start with dashboard view in perfstat: overall CPU, disk busy %, network throughput. Drill into details only when dashboard shows problems. Most performance issues show up as: high CPU wait time (disk can't keep up), specific aggregate at 100% busy (storage bottleneck), or network interface saturated.

How long should I run perfstat to get meaningful results?

Minimum: 1 hour during problem period (if troubleshooting specific issue), 24 hours for baseline (captures daily patterns). Best: 1 week for comprehensive analysis (includes weekly patterns, weekend vs weekday). Longer captures more patterns but creates huge files (1-week perfstat can be 500MB-2GB). For performance troubleshooting: capture during problem window (if slow every morning 9-10AM, capture 8AM-11AM). For capacity planning: capture 1 week representative period (not during holidays/special events). Perfstat has minimal performance impact (<2% CPU), safe to run continuously. NetApp support often requests 24-hour perfstat for cases—this is their sweet spot for analysis.

Can I analyze perfstat myself or do I need NetApp support?

You can do basic analysis with the perfstat HTML viewer (shows graphs of CPU, disk, network over time). Look for: sudden spikes (problem at a specific time), gradual trends (capacity running out), patterns (performance drops every day at 9AM). Advanced analysis needs tools: NetApp Harvest + Grafana (collects and graphs live counters from the controller rather than reading perfstat files, so it suits ongoing trending), PerfStat Analyzer (third-party tool), or NetApp support (they have internal analysis tools). DIY is sufficient for: identifying obvious bottlenecks (disk at 100%, CPU maxed), correlating performance issues with time (slow at 2PM, perfstat shows spike at 2PM). You need NetApp support for: root cause analysis of complex issues, capacity planning calculations, optimization recommendations. Start with the HTML viewer. If you can't identify the problem, open a case with the perfstat attached.

What should I do if perfstat shows disk utilization at 100%?

First determine which aggregate is bottlenecked (perfstat shows per-aggregate stats). Solutions: 1) Add disks to overloaded aggregate (more spindles = more IOPS), 2) Enable FlexShare (QoS to prevent one workload hogging resources), 3) Move hot volumes to faster aggregate (SSD aggregate for databases), 4) Optimize workload (reduce unnecessary I/O, enable deduplication/compression to reduce actual writes). Quick wins: check for dedup/scrub running during business hours (reschedule to nights), identify largest I/O consumer from perfstat (specific volume hammering storage), enable caching for read-heavy workloads. Before buying more disks: verify aggregate is truly bottlenecked and not just misconfigured (sometimes problem is RAID group layout, not total disk count).

NetApp Perfstat Analysis | 7-Mode Performance Guide

Collecting Perfstat Data

Before you can analyze performance data, you need to collect it properly during the time when performance issues are occurring. The perfstat utility must be downloaded from NetApp’s support site and executed with the correct parameters to gather meaningful data.

Perfstat Collection Command

# Basic perfstat collection command
perfstat -f [hostname] -t 5 -i 1,6 -l [username]:[password] > C:\perfstat.out

# Example with actual values
perfstat -f nas01.company.com -t 5 -i 1,6 -l admin:password123 > C:\perfstat.out

This command collects six 5-minute perfstat samples with 1-minute intervals between each collection. The parameters ensure comprehensive data collection:

-f [hostname]: Target NetApp filer hostname or IP address
-t 5: 5-minute collection interval (recommended by NetApp support)
-i 1,6: 1-minute pause between 6 iterations
-l [credentials]: Username and password for authentication

Important: Keep the collection interval (-t value) relatively short. NetApp support recommends avoiding excessively long iterations as they can produce skewed results that don’t accurately reflect system behavior.

Analyzing CPU Performance

CPU bottlenecks can significantly impact storage system performance. The CPU Statistics section in perfstat provides detailed information about processor utilization and can help you determine if CPU constraints are affecting your storage operations.

Locating CPU Statistics

To find CPU performance data in your perfstat file, search for “CPU Statistics” using your text editor’s search function (Ctrl+F in most editors). Once located, look for the timestamp to understand when this sample was collected.
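The same search can be scripted when the file is too large to scroll comfortably. The sketch below is a minimal illustration, not a NetApp tool: it assumes the section titles appear verbatim somewhere on a line, and the `=-=-=` decoration in the sample data is made up for the demo.

```python
# Section titles named in this guide; the exact header decoration in a real
# perfstat file may differ, so we only look for the title as a substring.
SECTION_NAMES = (
    "CPU Statistics",
    "Network Interface Statistics",
    "Disk Statistics",
)


def find_sections(lines, names=SECTION_NAMES):
    """Return {section title: [1-based line numbers where it appears]}."""
    hits = {name: [] for name in names}
    for lineno, line in enumerate(lines, start=1):
        for name in names:
            if name in line:
                hits[name].append(lineno)
    return hits


# Hypothetical excerpt standing in for a real perfstat file.
sample = [
    "=-=-= CPU Statistics =-=-=",
    "some data",
    "=-=-= Disk Statistics =-=-=",
    "more data",
    "=-=-= CPU Statistics =-=-=",
]
print(find_sections(sample))
```

For a real capture, pass an open file handle instead of the sample list, for example `find_sections(open(path, errors="replace"))`, and jump to the reported line numbers in your editor.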

Time Zone Note: NetApp timestamps are in GMT. Remember to convert to your local time zone when correlating perfstat data with user-reported issues or other system logs.
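The conversion is easy to get wrong by hand, so here is a small sketch of it. The timestamp layout (`Mon Mar 10 14:05:00 GMT 2025`) is an assumption for illustration; adjust `STAMP_FORMAT` to whatever your perfstat file actually prints. The function name `gmt_to_local` is likewise hypothetical.

```python
from datetime import datetime, timedelta, timezone, tzinfo

# Assumed timestamp layout, e.g. "Mon Mar 10 14:05:00 GMT 2025".
STAMP_FORMAT = "%a %b %d %H:%M:%S GMT %Y"


def gmt_to_local(stamp: str, local_tz: tzinfo) -> datetime:
    """Parse a GMT timestamp string and return it in the given time zone."""
    parsed = datetime.strptime(stamp, STAMP_FORMAT)
    return parsed.replace(tzinfo=timezone.utc).astimezone(local_tz)


# Fixed UTC-4 offset used here to keep the example dependency-free.
eastern_daylight = timezone(timedelta(hours=-4))
local = gmt_to_local("Mon Mar 10 14:05:00 GMT 2025", eastern_daylight)
print(local.strftime("%Y-%m-%d %H:%M"))  # 2025-03-10 10:05
```

If your Python installation has time zone data available, passing `zoneinfo.ZoneInfo("America/New_York")` instead of a fixed offset handles daylight saving changes automatically.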

Key CPU Metrics to Monitor

The most critical metric in the CPU Statistics section is the idle time percentage. This indicates what percentage of time the CPU spends waiting for work, which inversely shows CPU utilization:

High idle time (90-95%): CPU utilization is low (5-10%), no CPU bottleneck
Medium idle time (70-80%): Moderate CPU usage (20-30%), monitor closely
Low idle time (below 50%): High CPU utilization (above 50%), potential bottleneck

Analyzing Multiple CPU Samples

Since our collection command gathered 6 iterations, you’ll find 6 separate CPU Statistics sections in your perfstat file. Don’t rely on a single sample—examine all iterations to identify patterns:

Consistent low idle time across multiple samples indicates a sustained CPU bottleneck
Intermittent spikes suggest periodic high-load operations
Generally high idle time rules out CPU as the performance culprit
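The three patterns above can be expressed as a small classifier over the idle percentages you read from each iteration. This is only a sketch: the 50% cut-off comes from the "low idle time" band earlier in this section, while treating "more than half of the samples" as sustained is an assumption you may want to tighten.

```python
LOW_IDLE_PERCENT = 50.0  # "low idle time" cut-off used in this guide


def classify_cpu(idle_percents):
    """Classify a series of CPU idle percentages, one per perfstat iteration."""
    if not idle_percents:
        raise ValueError("need at least one idle sample")
    low = sum(1 for idle in idle_percents if idle < LOW_IDLE_PERCENT)
    if low == 0:
        return "no CPU bottleneck"
    # Assumption: more than half of the samples being low counts as sustained.
    if low * 2 > len(idle_percents):
        return "sustained CPU bottleneck"
    return "intermittent spikes"


# Hypothetical idle values copied by hand from six CPU Statistics sections.
print(classify_cpu([92, 45, 93, 94, 91, 95]))  # intermittent spikes
```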

Network Performance Analysis

Network bottlenecks are common in storage environments, especially when dealing with high-throughput applications. The Network Interface Statistics section provides comprehensive data about network utilization, errors, and performance characteristics.

Network Statistics Column Headers

Search for “Network Interface Statistics” in your perfstat file. The data is presented in columns with these headers:

Column	Description	What to Monitor
iface	Network interface name	Identify which adapter is being monitored
side	Traffic direction (send/receive)	Understand data flow patterns
bytes	Bytes per second	Primary throughput metric
packets	Packets per second	Packet rate and efficiency
errors	Error count	Should always be zero
collisions	Network collisions	Should be minimal or zero
pkt drops	Dropped packets	Critical error indicator

Critical Warning Signs

Errors or packet drops > 0: Indicates network hardware problems or configuration issues
High collision rates: Suggests network congestion or duplex mismatches
Bytes/sec approaching interface limits: Network saturation warning
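The error checks above are mechanical enough to script. The sketch below assumes each data row is whitespace-separated and follows the column order listed in the table (iface, side, bytes, packets, errors, collisions, pkt drops); that layout and the sample row are assumptions, so verify them against your own file before relying on the result.

```python
def check_interface_row(row):
    """Return a list of warnings for one whitespace-separated interface row."""
    # Assumed column order: iface side bytes packets errors collisions drops
    iface, side, bytes_s, packets, errors, collisions, drops = row.split()
    label = f"{iface} {side}"
    warnings = []
    if int(errors) > 0:
        warnings.append(f"{label}: {errors} errors")
    if int(collisions) > 0:
        warnings.append(f"{label}: {collisions} collisions")
    if int(drops) > 0:
        warnings.append(f"{label}: {drops} dropped packets")
    return warnings


# Hypothetical row for illustration only.
print(check_interface_row("e0a recv 48211033 41250 0 0 3"))
```

An empty list means the row shows no errors, collisions, or drops; throughput still needs to be compared against the interface limit separately.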

Network Capacity Planning

Understanding your hardware capabilities is crucial for interpreting network statistics:

1 Gbps ports: Theoretical maximum 125,000,000 bytes/second (network rates are decimal: 10^9 bits/second divided by 8)
10 Gbps ports: Theoretical maximum 1,250,000,000 bytes/second
Virtual interfaces (VIF): Capacity depends on configuration (failover vs. load-balanced)

VIF Configuration Impact: In failover mode, a VIF provides no additional bandwidth, just redundancy. In load-balanced mode, bandwidth combines across member interfaces, doubling aggregate throughput for dual-port configurations, although a single client connection is typically still limited to the speed of one member link.
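To turn a bytes/second reading into a utilization figure, divide it by the capacity of the interface or VIF. The sketch below assumes decimal link rates (1 Gbps = 10^9 bits/second) and ignores protocol overhead, so real-world ceilings will be somewhat lower. The function names and mode strings are hypothetical.

```python
BYTES_PER_SEC_PER_GBPS = 1_000_000_000 // 8  # 125,000,000 bytes/second


def vif_capacity(member_gbps, mode):
    """Theoretical capacity in bytes/second for a VIF.

    member_gbps: list of member link speeds in Gbps.
    mode: "failover" (one active link) or "load-balanced" (all links active).
    """
    if mode == "failover":
        # Assumption: use the slowest member as the conservative figure.
        return min(member_gbps) * BYTES_PER_SEC_PER_GBPS
    if mode == "load-balanced":
        return sum(member_gbps) * BYTES_PER_SEC_PER_GBPS
    raise ValueError(f"unknown VIF mode: {mode!r}")


def utilization(bytes_per_sec, capacity):
    """Fraction of theoretical capacity in use (0.0 to 1.0)."""
    return bytes_per_sec / capacity


capacity = vif_capacity([1, 1], "load-balanced")
print(capacity)                            # 250000000
print(utilization(100_000_000, capacity))  # 0.4
```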

Disk Performance Analysis

Disk bottlenecks are the most common cause of performance issues in storage systems. The Disk Statistics section provides detailed IOPS and utilization metrics that help identify storage subsystem constraints and guide capacity planning decisions.

Disk Statistics Layout

Search for “Disk Statistics” in your perfstat file. The data is organized with column headers followed by aggregate information. Key columns include:

disk ut% xfers ureads-chain-usecs writes-chain-usecs cpreads-chain-usecs greads-chain-usecs gwrites-chain-usecs

Critical Metrics to Monitor

Focus on these two primary metrics for quick performance assessment:

ut% (Utilization Percentage): Time the disk spends busy vs. idle
xfers (Transfers/IOPS): Input/output operations per second

Disk Type Performance Characteristics

Disk Type	IOPS Capacity	Performance Threshold	Utilization Warning
SATA 7200 RPM	~40 IOPS	Performance degrades above 40 IOPS	Monitor when ut% > 50%
SAS 10K RPM	~140 IOPS	Performance degrades above 140 IOPS	Monitor when ut% > 50%
FC 15K RPM	~300 IOPS	Performance degrades above 300 IOPS	Monitor when ut% > 50%
SSD	Varies widely	Typically thousands of IOPS	Monitor when ut% > 70%

Aggregate-Level Analysis

Don’t focus on individual disk performance—analyze the aggregate as a whole. Individual disks with high utilization are normal in a properly functioning RAID group. Instead:

Calculate average utilization across all disks in the aggregate
Look for consistently high utilization across multiple disks
Compare IOPS against the expected capacity for your disk type
Correlate high utilization with user-reported performance issues

Performance Rule: If average disk utilization consistently exceeds 50% across an aggregate, you’re likely experiencing performance degradation. Consider adding more disks or upgrading to faster storage.
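The aggregate-level checks can be sketched as follows. The per-disk IOPS figures and the 50% threshold are taken from the table and rule in this section; the input shape (one `(ut%, xfers)` pair per disk, typed in by hand from the Disk Statistics section) and the function name are assumptions for illustration.

```python
from statistics import mean

# Per-disk IOPS figures from the table in this guide.
EXPECTED_IOPS = {
    "SATA 7200 RPM": 40,
    "SAS 10K RPM": 140,
    "FC 15K RPM": 300,
}
UTIL_WARNING_PERCENT = 50.0


def assess_aggregate(disks, disk_type):
    """Summarise an aggregate from per-disk (ut_percent, xfers) pairs."""
    avg_ut = mean(ut for ut, _ in disks)
    avg_xfers = mean(xfers for _, xfers in disks)
    return {
        "avg_ut": avg_ut,
        "avg_xfers": avg_xfers,
        "over_utilized": avg_ut > UTIL_WARNING_PERCENT,
        "over_iops": avg_xfers > EXPECTED_IOPS[disk_type],
    }


# Hypothetical readings for a three-disk SAS aggregate.
report = assess_aggregate([(62, 150), (58, 145), (66, 155)], "SAS 10K RPM")
print(report)
```

Run it once per perfstat iteration; an aggregate that is flagged in most iterations matches the "consistently exceeds" wording of the rule, whereas a single flagged sample may just be a burst.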

Perfstat Analysis Best Practices

Effective perfstat analysis requires a systematic approach and understanding of how different metrics correlate. Here are proven strategies for getting the most value from your performance data:

Systematic Analysis Approach

Start with disk statistics: Most performance issues stem from storage bottlenecks
Check network utilization: Verify if network capacity matches your storage throughput
Examine CPU as last resort: CPU bottlenecks are less common in storage systems
Correlate timestamps: Match perfstat data with user-reported problem timeframes
Analyze trends: Look at all 6 iterations to identify patterns vs. anomalies

Tools and Environment

Use a capable text editor: TextPad or similar tools that handle large files efficiently
Leverage search functionality: Quickly navigate between different statistics sections
Document your findings: Note baseline performance for future comparisons
Multiple controller analysis: If you have an HA pair, analyze both controllers

Common Performance Scenarios

High disk utilization + low network usage: Storage bottleneck confirmed
Low disk utilization + high network saturation: Network bottleneck
Low CPU idle time + normal disk/network: CPU bottleneck (rare)
Errors in network statistics: Hardware or configuration issues

Pro Tip: Perfstat analysis becomes intuitive with practice. Start with simple scenarios and gradually build your expertise with more complex multi-controller environments and mixed workloads.

If still on 7-Mode (EOL 2015), focus on migration to ONTAP rather than performance tuning—7-Mode has no future. That said, perfstat still works on 7-Mode and is useful for: pre-migration baseline (understand current performance before migration), troubleshooting during migration planning (identify performance issues to fix in new ONTAP cluster), capacity planning for new hardware (how much performance do you actually need?). Don't: invest heavily in 7-Mode performance optimization (spend effort on migration instead), delay migration because of performance analysis (fix performance in ONTAP, not aging 7-Mode). Do: run perfstat on 7-Mode as baseline, use data to properly size ONTAP cluster, troubleshoot migration issues by comparing 7-Mode perfstat to ONTAP performance. If you're still on 7-Mode in 2025, migration should be priority #1.
