NetApp Perfstat Analysis | 7-Mode Performance Guide

Master NetApp 7-mode perfstat analysis to troubleshoot CPU, disk, and network performance issues on your storage systems.

Collecting Perfstat Data

Before you can analyze performance data, you need to collect it properly during the time when performance issues are occurring. The perfstat utility must be downloaded from NetApp’s support site and executed with the correct parameters to gather meaningful data.

Perfstat Collection Command

# Basic perfstat collection command
perfstat -f [hostname] -t 5 -i 1,6 -l [username]:[password] > C:\perfstat.out

# Example with actual values
perfstat -f nas01.company.com -t 5 -i 1,6 -l admin:password123 > C:\perfstat.out

This command collects six 5-minute perfstat samples with 1-minute intervals between each collection. The parameters ensure comprehensive data collection:

  • -f [hostname]: Target NetApp filer hostname or IP address
  • -t 5: 5-minute collection interval (recommended by NetApp support)
  • -i 1,6: 1-minute pause between 6 iterations
  • -l [credentials]: Username and password for authentication

Important: Keep the collection interval (-t value) relatively short. NetApp support recommends avoiding excessively long iterations as they can produce skewed results that don’t accurately reflect system behavior.
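
To estimate how long a capture will run before starting it, the arithmetic can be sketched in Python. This is a minimal sketch that follows the flag reading given above (six 5-minute samples with 1-minute pauses) and assumes pauses occur only between iterations. The function name is hypothetical and not part of perfstat.

```python
def collection_window_minutes(sample_minutes: int, iterations: int, pause_minutes: int) -> int:
    """Total wall-clock minutes for a capture.

    Assumption: every iteration samples for `sample_minutes`, and a pause
    of `pause_minutes` falls between iterations but not after the last one.
    """
    if iterations < 1:
        raise ValueError("iterations must be at least 1")
    return iterations * sample_minutes + (iterations - 1) * pause_minutes


# Values from the example command: -t 5 -i 1,6 as described in this article
print(collection_window_minutes(sample_minutes=5, iterations=6, pause_minutes=1))  # 35
```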

Analyzing CPU Performance

CPU bottlenecks can significantly impact storage system performance. The CPU Statistics section in perfstat provides detailed information about processor utilization and can help you determine if CPU constraints are affecting your storage operations.

Locating CPU Statistics

To find CPU performance data in your perfstat file, search for “CPU Statistics” using your text editor’s search function (Ctrl+F in most editors). Once located, look for the timestamp to understand when this sample was collected.
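
For large files, the same search can be scripted. The following is a minimal Python sketch that lists the line numbers where a section title appears. The sample text is invented for illustration and does not reproduce real perfstat output, and `find_section_lines` is a hypothetical helper name.

```python
def find_section_lines(text: str, title: str) -> list[int]:
    """Return the 1-based line numbers of every line containing `title`."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if title in line
    ]


# Invented stand-in for a perfstat file; real files are far larger.
sample = "\n".join([
    "header line",
    "CPU Statistics",
    "...",
    "Disk Statistics",
    "...",
    "CPU Statistics",
])

print(find_section_lines(sample, "CPU Statistics"))  # [2, 6]
```

With a real capture, read the file contents into `text` first and pass the section title you are looking for.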

Time Zone Note: NetApp timestamps are in GMT. Remember to convert to your local time zone when correlating perfstat data with user-reported issues or other system logs.
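
The conversion can be sketched in Python as follows. The timestamp layout in `ASSUMED_FORMAT` is an assumption made for illustration, so adjust it to match the timestamps in your own file. The sketch uses a fixed UTC offset, which does not account for daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# Assumption: timestamps look like "Mon Mar 10 14:05:00 GMT 2014".
ASSUMED_FORMAT = "%a %b %d %H:%M:%S GMT %Y"


def gmt_to_local(stamp: str, utc_offset_hours: float) -> datetime:
    """Parse a GMT timestamp and shift it to a fixed local UTC offset."""
    parsed = datetime.strptime(stamp, ASSUMED_FORMAT).replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone(timedelta(hours=utc_offset_hours)))


local = gmt_to_local("Mon Mar 10 14:05:00 GMT 2014", utc_offset_hours=-5)
print(local.strftime("%Y-%m-%d %H:%M"))  # 2014-03-10 09:05
```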

Key CPU Metrics to Monitor

The most critical metric in the CPU Statistics section is the idle time percentage. This indicates what percentage of time the CPU spends waiting for work, which inversely shows CPU utilization:

  • High idle time (90-95%): CPU utilization is low (5-10%), no CPU bottleneck
  • Medium idle time (70-80%): Moderate CPU usage (20-30%), monitor closely
  • Low idle time (below 50%): High CPU utilization (above 50%), potential bottleneck
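
The idle-to-utilization relationship can be expressed as a small Python helper. This is a sketch only: the article's bands leave gaps (for example, 80-90% idle), and the cut-offs used here to close those gaps are assumptions. The function name is hypothetical.

```python
def classify_cpu_idle(idle_percent: float) -> str:
    """Convert idle time to utilization and place it in a rough band.

    Assumed cut-offs: 90% idle and above is "low", 50% up to 90% is
    "moderate", and anything below 50% idle is "high".
    """
    if not 0 <= idle_percent <= 100:
        raise ValueError("idle_percent must be between 0 and 100")
    utilization = 100 - idle_percent
    if idle_percent >= 90:
        band = "low utilization, no CPU bottleneck"
    elif idle_percent >= 50:
        band = "moderate utilization, monitor closely"
    else:
        band = "high utilization, potential bottleneck"
    return f"{utilization:.0f}% busy: {band}"


print(classify_cpu_idle(92))  # 8% busy: low utilization, no CPU bottleneck
print(classify_cpu_idle(45))  # 55% busy: high utilization, potential bottleneck
```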

Analyzing Multiple CPU Samples

Since our collection command gathered 6 iterations, you’ll find 6 separate CPU Statistics sections in your perfstat file. Don’t rely on a single sample—examine all iterations to identify patterns:

  • Consistent low idle time across multiple samples indicates a sustained CPU bottleneck
  • Intermittent spikes suggest periodic high-load operations
  • Generally high idle time rules out CPU as the performance culprit
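
Once you have noted the idle percentage from each iteration, the pattern check above can be sketched in Python. The 50% idle cut-off comes from the earlier list. Treating "sustained" as "every sample is below the cut-off" is an assumption, and the function name is hypothetical.

```python
def cpu_pattern(idle_samples: list[float], low_idle: float = 50.0) -> str:
    """Summarize idle percentages collected across perfstat iterations."""
    if not idle_samples:
        raise ValueError("at least one sample is required")
    busy = [idle < low_idle for idle in idle_samples]
    if all(busy):
        return "sustained CPU bottleneck"
    if any(busy):
        return "intermittent spikes"
    return "CPU is unlikely to be the culprit"


print(cpu_pattern([42, 38, 45, 40, 36, 44]))  # sustained CPU bottleneck
print(cpu_pattern([91, 88, 35, 90, 93, 89]))  # intermittent spikes
print(cpu_pattern([91, 88, 95, 90, 93, 89]))  # CPU is unlikely to be the culprit
```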

Network Performance Analysis

Network bottlenecks are common in storage environments, especially when dealing with high-throughput applications. The Network Interface Statistics section provides comprehensive data about network utilization, errors, and performance characteristics.

Network Statistics Column Headers

Search for “Network Interface Statistics” in your perfstat file. The data is presented in columns with these headers:

Column     | Description                          | What to Monitor
iface      | Network interface name               | Identify which adapter is being monitored
side       | Traffic direction (send/receive)     | Understand data flow patterns
bytes      | Bytes per second                     | Primary throughput metric
packets    | Packets per second                   | Packet rate and efficiency
errors     | Error count                          | Should always be zero
collisions | Network collisions                   | Should be minimal or zero
pkt drops  | Dropped packets                      | Critical error indicator

Critical Warning Signs

  • Errors or packet drops > 0: Indicates network hardware problems or configuration issues
  • High collision rates: Suggests network congestion or duplex mismatches
  • Bytes/sec approaching interface limits: Network saturation warning
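
These checks can be sketched in Python for a single interface row. The dictionary keys mirror the column headers above, the sample values are invented, and the 80% saturation ratio is an assumption rather than a NetApp recommendation. The sketch flags any nonzero collision count, which is stricter than "high collision rates".

```python
def network_warnings(row: dict, link_capacity_bytes: float,
                     saturation_ratio: float = 0.8) -> list[str]:
    """Return the warning signs present in one interface row."""
    warnings = []
    if row["errors"] > 0 or row["pkt_drops"] > 0:
        warnings.append("errors or drops: check hardware and configuration")
    if row["collisions"] > 0:
        warnings.append("collisions: check for congestion or duplex mismatch")
    if row["bytes"] >= saturation_ratio * link_capacity_bytes:
        warnings.append("throughput near link capacity: possible saturation")
    return warnings


# Invented sample row for a 1 Gbps port (10**9 bits/s = 125,000,000 bytes/s)
row = {"iface": "e0a", "side": "recv", "bytes": 110_000_000,
       "packets": 80_000, "errors": 0, "collisions": 0, "pkt_drops": 3}

for warning in network_warnings(row, link_capacity_bytes=125_000_000):
    print(warning)
```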

Network Capacity Planning

Understanding your hardware capabilities is crucial for interpreting network statistics:

  • 1 Gbps ports: Theoretical maximum 125,000,000 bytes/second (about 125 MB/s, before protocol overhead)
  • 10 Gbps ports: Theoretical maximum 1,250,000,000 bytes/second (about 1.25 GB/s, before protocol overhead)
  • Virtual interfaces (VIF): Capacity depends on configuration (failover vs. load-balanced)

VIF Configuration Impact: In failover mode, a VIF provides no additional bandwidth, only redundancy. In load-balanced mode, bandwidth combines across member interfaces, so a dual-port VIF can carry up to twice the aggregate throughput, although a single client connection is typically still limited to one member link.
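
The capacity rules above can be sketched in Python. The sketch uses decimal networking units (1 Gbps = 10^9 bits per second), treats the first listed member as the active link in failover mode, and ignores protocol overhead. All three are simplifying assumptions, and the function names are hypothetical.

```python
def link_capacity_bytes(gbps: float) -> float:
    """Theoretical bytes per second for a link, in decimal units."""
    return gbps * 1_000_000_000 / 8


def vif_capacity_bytes(member_gbps: list[float], mode: str) -> float:
    """Theoretical VIF capacity for 'failover' or 'load_balanced' mode."""
    if not member_gbps:
        raise ValueError("a VIF needs at least one member interface")
    capacities = [link_capacity_bytes(g) for g in member_gbps]
    if mode == "failover":
        return capacities[0]  # only one link carries traffic at a time
    if mode == "load_balanced":
        return sum(capacities)  # aggregate across all member links
    raise ValueError(f"unknown mode: {mode}")


print(vif_capacity_bytes([1, 1], "failover"))       # 125000000.0
print(vif_capacity_bytes([1, 1], "load_balanced"))  # 250000000.0
```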

Disk Performance Analysis

Disk bottlenecks are the most common cause of performance issues in storage systems. The Disk Statistics section provides detailed IOPS and utilization metrics that help identify storage subsystem constraints and guide capacity planning decisions.

Disk Statistics Layout

Search for “Disk Statistics” in your perfstat file. The data is organized with column headers followed by aggregate information. Key columns include:

disk ut% xfers ureads-chain-usecs writes-chain-usecs cpreads-chain-usecs greads-chain-usecs gwrites-chain-usecs

Critical Metrics to Monitor

Focus on these two primary metrics for quick performance assessment:

  • ut% (Utilization Percentage): Time the disk spends busy vs. idle
  • xfers (Transfers/IOPS): Input/output operations per second
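
To pull these two values out of a disk row programmatically, a minimal Python sketch might look like the following. It assumes each disk row is whitespace-separated with the disk name, ut%, and xfers as the first three fields, matching the header order above. The sample row is invented, so verify the layout against your own file.

```python
def parse_disk_line(line: str) -> dict:
    """Extract disk name, ut%, and xfers from one whitespace-separated row."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError("expected at least: disk, ut%, xfers")
    return {"disk": fields[0], "ut": float(fields[1]), "xfers": float(fields[2])}


# Invented sample row; the remaining columns are ignored by this sketch.
print(parse_disk_line("0a.16   62   41.50   0.90   4.20   11034"))
```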

Disk Type Performance Characteristics

Disk Type     | IOPS Capacity | Performance Threshold               | Utilization Warning
SATA 7200 RPM | ~40 IOPS      | Performance degrades above 40 IOPS  | Monitor when ut% > 50%
SAS 10K RPM   | ~140 IOPS     | Performance degrades above 140 IOPS | Monitor when ut% > 50%
FC 15K RPM    | ~300 IOPS     | Performance degrades above 300 IOPS | Monitor when ut% > 50%
SSD           | Varies widely | Typically thousands of IOPS         | Monitor when ut% > 70%

Aggregate-Level Analysis

Don’t focus on individual disk performance—analyze the aggregate as a whole. Individual disks with high utilization are normal in a properly functioning RAID group. Instead:

  • Calculate average utilization across all disks in the aggregate
  • Look for consistently high utilization across multiple disks
  • Compare IOPS against the expected capacity for your disk type
  • Correlate high utilization with user-reported performance issues

Performance Rule: If average disk utilization consistently exceeds 50% across an aggregate, you’re likely experiencing performance degradation. Consider adding more disks or upgrading to faster storage.
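
The aggregate-level check can be sketched in Python. The sketch averages ut% and xfers across the disks of one aggregate and compares them with the 50% rule and a per-disk IOPS figure from the table above. The disk values are invented and the function name is hypothetical.

```python
from statistics import mean


def aggregate_summary(disks: list[dict], iops_per_disk: float,
                      ut_threshold: float = 50.0) -> dict:
    """Average ut% and xfers across an aggregate and flag both limits."""
    avg_ut = mean(d["ut"] for d in disks)
    avg_xfers = mean(d["xfers"] for d in disks)
    return {
        "avg_ut": avg_ut,
        "avg_xfers": avg_xfers,
        "over_utilized": avg_ut > ut_threshold,
        "over_iops": avg_xfers > iops_per_disk,
    }


# Invented values for three SATA disks (~40 IOPS each per the table above)
disks = [
    {"disk": "0a.16", "ut": 60.0, "xfers": 45.0},
    {"disk": "0a.17", "ut": 50.0, "xfers": 38.0},
    {"disk": "0a.18", "ut": 70.0, "xfers": 43.0},
]
print(aggregate_summary(disks, iops_per_disk=40))
```

Run the same summary for each iteration: the rule applies when the average stays above the threshold across samples, not when it is exceeded once.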

Perfstat Analysis Best Practices

Effective perfstat analysis requires a systematic approach and understanding of how different metrics correlate. Here are proven strategies for getting the most value from your performance data:

Systematic Analysis Approach

  • Start with disk statistics: Most performance issues stem from storage bottlenecks
  • Check network utilization: Verify if network capacity matches your storage throughput
  • Examine CPU as last resort: CPU bottlenecks are less common in storage systems
  • Correlate timestamps: Match perfstat data with user-reported problem timeframes
  • Analyze trends: Look at all 6 iterations to identify patterns vs. anomalies

Tools and Environment

  • Use a capable text editor: TextPad or similar tools that handle large files efficiently
  • Leverage search functionality: Quickly navigate between different statistics sections
  • Document your findings: Note baseline performance for future comparisons
  • Multiple controller analysis: If you have an HA pair, analyze both controllers

Common Performance Scenarios

  • High disk utilization + low network usage: Storage bottleneck confirmed
  • Low disk utilization + high network saturation: Network bottleneck
  • Low CPU idle time + normal disk/network: CPU bottleneck (rare)
  • Errors in network statistics: Hardware or configuration issues
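
The scenarios above can be combined into a rough decision helper, sketched here in Python. The 50% disk and CPU-idle cut-offs follow the earlier sections, while the 80% network cut-off is an assumption. The function name and its messages are hypothetical.

```python
def diagnose(disk_ut: float, net_utilization: float, cpu_idle: float,
             net_errors: int, *, disk_high: float = 50.0,
             net_high: float = 80.0, cpu_idle_low: float = 50.0) -> list[str]:
    """Map averaged metrics (all in percent) to the scenarios in the list."""
    findings = []
    disk_busy = disk_ut > disk_high
    net_busy = net_utilization > net_high
    if net_errors > 0:
        findings.append("network errors: hardware or configuration issue")
    if disk_busy and not net_busy:
        findings.append("storage bottleneck")
    if net_busy and not disk_busy:
        findings.append("network bottleneck")
    if cpu_idle < cpu_idle_low and not disk_busy and not net_busy:
        findings.append("CPU bottleneck")
    return findings or ["no scenario matched: compare more samples"]


print(diagnose(disk_ut=75, net_utilization=20, cpu_idle=85, net_errors=0))
print(diagnose(disk_ut=20, net_utilization=95, cpu_idle=85, net_errors=2))
```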

Pro Tip: Perfstat analysis becomes intuitive with practice. Start with simple scenarios and gradually build your expertise with more complex multi-controller environments and mixed workloads.

Frequently Asked Questions

Find answers to common questions

Which perfstat metrics should I focus on, and which can I ignore?

Focus on:

  • CPU utilization above 80%: the filer is overloaded
  • Disk utilization above 70% across an aggregate: a storage bottleneck
  • Network throughput approaching interface limits
  • High latency (above 10 ms for typical workloads)

Ignore:

  • Memory statistics: NetApp caches aggressively, so high memory use is normal
  • Protocol-specific statistics, unless you are troubleshooting that protocol
  • Detailed per-volume statistics: aggregate-level data is usually sufficient

Start with the headline numbers (overall CPU, disk busy %, network throughput) and drill into details only when they show a problem. Most performance issues show up as high CPU wait time (the disks can't keep up), a specific aggregate at 100% busy (storage bottleneck), or a saturated network interface.

How long should a perfstat capture run?

  • Minimum: 1 hour during the problem period when troubleshooting a specific issue
  • Baseline: 24 hours, which captures daily patterns
  • Best: 1 week for comprehensive analysis, including weekday vs. weekend patterns

Longer captures reveal more patterns but create huge files (a 1-week perfstat can be 500 MB to 2 GB). Extend a capture by adding iterations rather than lengthening each one, since overly long iterations skew the results. For performance troubleshooting, capture the problem window with a margin on either side: if the system is slow every morning from 9 to 10 AM, capture 8 AM to 11 AM. For capacity planning, capture one representative week, not a holiday period or special event. Perfstat has minimal performance impact (under 2% CPU) and is safe to run continuously. NetApp support often requests a 24-hour perfstat for cases, as this is their sweet spot for analysis.

Can I analyze perfstat data myself, or do I need NetApp support?

You can do basic analysis with the perfstat HTML viewer, which graphs CPU, disk, and network over time. Look for:

  • Sudden spikes: a problem at a specific time
  • Gradual trends: capacity running out
  • Patterns: for example, performance drops every day at 9 AM

Advanced analysis needs additional tools: NetApp Harvest with Grafana (which graphs live performance counters rather than perfstat files), PerfStat Analyzer (a third-party tool), or NetApp support, who have internal analysis tools.

  • DIY is sufficient for: identifying obvious bottlenecks (disk at 100%, CPU maxed) and correlating performance issues with time (slow at 2 PM, perfstat shows a spike at 2 PM)
  • NetApp support is needed for: root cause analysis of complex issues, capacity planning calculations, and optimization recommendations

Start with the HTML viewer. If you can't identify the problem, open a case with the perfstat attached.

How do I fix a disk bottleneck that perfstat has identified?

First determine which aggregate is bottlenecked (perfstat shows per-aggregate statistics). Solutions:

  • Add disks to the overloaded aggregate: more spindles mean more IOPS
  • Enable FlexShare: QoS that prevents one workload from hogging resources
  • Move hot volumes to a faster aggregate, such as an SSD aggregate for databases
  • Optimize the workload: reduce unnecessary I/O, and enable deduplication or compression to reduce actual writes

Quick wins:

  • Check for deduplication or scrub jobs running during business hours and reschedule them to nights
  • Identify the largest I/O consumer from perfstat, such as a specific volume hammering storage
  • Enable caching for read-heavy workloads

Before buying more disks, verify that the aggregate is truly bottlenecked and not just misconfigured. Sometimes the problem is the RAID group layout, not the total disk count.

Is it still worth tuning performance on 7-Mode?

If you are still on 7-Mode (EOL 2015), focus on migrating to ONTAP rather than on performance tuning, because 7-Mode has no future. That said, perfstat still works on 7-Mode and is useful for:

  • Pre-migration baseline: understand current performance before migrating
  • Troubleshooting during migration planning: identify performance issues to fix in the new ONTAP cluster
  • Capacity planning for new hardware: determine how much performance you actually need

Don't:

  • Invest heavily in 7-Mode performance optimization (spend the effort on migration instead)
  • Delay migration because of performance analysis (fix performance in ONTAP, not in aging 7-Mode)

Do:

  • Run perfstat on 7-Mode as a baseline
  • Use the data to size the ONTAP cluster properly
  • Troubleshoot migration issues by comparing 7-Mode perfstat data with ONTAP performance

If you're still on 7-Mode in 2025, migration should be priority #1.
