Preface. AIX Performance Management Structure. Performance Tools Packages and Their Documentation. BEST/1. AIX Performance Diagnostic Tool (PDT). AIX Base Operating System (BOS). Performance Toolbox (PTX). AIX Performance PMR Data Collection Tool (PerfPMR). Overview of Contents. Highlighting. Performance Concepts. How Fast Is That Computer in the Window? First, Understand the Workload. Program Execution Dynamics. System Dynamics. An Introduction to the Performance Tuning Process. Identifying the Workloads. Setting Objectives. Identifying the Critical Resources. Minimizing Critical Resource Requirements. Reflecting Priorities in Resource Allocation. Repeating the Tuning Steps. Applying Additional Resources. Performance Benchmarking: the Inevitable Dirtiness of Performance Data. AIX Resource Management Overview. Performance Overview of the AIX CPU Scheduler. AIX Version 4. Thread Support. Scheduling Policy for Threads with Local or Global Contention Scope. Process and Thread Priority. AIX Scheduler Run Queue. Scheduler CPU Time Slice. Performance Overview of the Virtual Memory Manager (VMM). Real Memory Management. VMM Memory Load Control Facility. Allocation and Reclamation of Paging Space Slots. Performance Overview of AIX Management of Fixed Disk Storage. Sequential-Access Read Ahead. Write Behind. Memory Mapped Files and Write Behind. Disk I/O Pacing. Disk Array. An Introduction to Multiprocessing. Symmetrical Multiprocessor (SMP) Concepts and Architecture. Symmetrical vs Asymmetrical Multiprocessors. Data Serialization. Lock Granularity. Locking Overhead. Cache Coherency. Processor Affinity. Memory and Bus Contention. SMP Performance Issues. Workload Concurrency. Throughput. Response Time. Adapting Programs to an SMP Environment. SMP Workloads. Workload Multiprocessability. Multiprocessor Throughput Scalability. Multiprocessor Response Time. SMP Scheduling. Default Scheduler Processing of Migrated Workloads. Scheduling Algorithm Variables.
Processor Affinity and Binding. Performance-Conscious Planning, Design, and Implementation. Identifying the Components of the Workload. Documenting Performance Requirements. Estimating the Resource Requirements of the Workload. Design and Implementation of Efficient Programs. CPU-Limited Programs. Design and Coding for Effective Use of Caches. Registers and Pipeline. Cache and TLBs. Effective Use of Preprocessors and the XL Compilers. Levels of Optimization. XL C Options for string.h Subroutine Performance. C and C++ Coding Style for Best Performance. Compiler Execution Time. Memory-Limited Programs. Performance-Related Installation Guidelines. AIX Pre-Installation Guidelines. CPU Pre-Installation Guidelines. Memory Pre-Installation Guidelines. Disk Pre-Installation Guidelines. Communications Pre-Installation Guidelines. System Monitoring and Initial Performance Diagnosis. The Case for Continuous Performance Monitoring. Performance Monitoring Using iostat, netstat, vmstat. The Performance Diagnostic Tool. The AIX Performance Toolbox. Inference from the Kind of Performance Problem Reported. A Particular Program Runs Slowly. Everything Runs Slowly at a Particular Time of Day. Everything Runs Slowly at Unpredictable Times. Everything an Individual User Runs Is Slow. A Number of LAN-Connected Systems Slow Down Simultaneously. Everything That Uses a Particular Service or Device Slows Down at Times. Using PerfPMR for Performance Diagnosis. Check before You Change. Identifying the Performance-Limiting Resource. Starting with an Overview of System Performance. Determining the Limiting Factor for a Single Program. Disk or Memory? Workload Management. Monitoring and Tuning CPU Use. Using vmstat to Monitor CPU Use. Using the time Command to Measure CPU Use. time and timex Cautions. Using xmperf to Monitor CPU Use. Using ps to Identify CPU-Intensive Programs. Using tprof to Analyze Programs for CPU Use.
A (Synthetic) Cautionary Example. Detailed Control Flow Analysis with stem. Basic stem Analysis. Restructuring Executables with fdpr. Controlling Contention for the CPU. Controlling the Priority of User Processes. Running a Command at a Nonstandard Priority with nice. Setting a Fixed Priority with the setpri Subroutine. Displaying Process Priority with ps. Modifying the Priority of a Running Process with renice. Clarification of nice/renice Syntax. Tuning the Process Priority Value Calculation with schedtune. Modifying the Scheduler Time Slice. CPU-Efficient User ID Administration. Monitoring and Tuning Memory Use. How Much Memory Is Really Being Used? vmstat. ps. svmon. Example of vmstat, ps, and svmon Output. Memory-Leaking Programs. Analyzing Patterns of Memory Use with BigFoot. Assessing Memory Requirements via the rmss Command. Two Styles of Using rmss. Tuning VMM Memory Load Control. Memory Load Control Tuning - Possible, but Usually Inadvisable. Tuning VMM Page Replacement. Choosing minfree and maxfree Settings. Choosing minperm and maxperm Settings. Monitoring and Tuning Disk I/O. Pre-Installation Planning. Building a Pre-Tuning Baseline. Assessing Disk Performance after Installation. Assessing Physical Placement of Data on Disk. Reorganizing a Logical Volume or Volume Group. Reorganizing a File System. Performance Considerations of Paging Spaces. Measuring Overall Disk I/O with vmstat. Using filemon for Detailed I/O Analysis. Disk-Limited Programs. Expanding the Configuration. Background Information. Tuning Sequential Read Ahead. Use of Disk I/O Pacing. Example. Logical Volume Striping. Designing a Striped Logical Volume. Tuning for Striped Logical Volume I/O. File System Fragment Size. Compression. Asynchronous Disk I/O. Using Raw Disk I/O. Using sync/fsync. Modifying the SCSI Device Driver max_coalesce Parameter. Setting SCSI-Adapter and Disk Device Queue Limits. Non-IBM Disk Drive. Non-IBM Disk Array. Disk Adapter Outstanding Request Limits.
Controlling the Number of System pbufs. Monitoring and Tuning Communications I/O. UDP/TCP/IP Performance Overview. Communication Subsystem Memory (mbuf) Management. Socket Layer. Relative Level of Function in UDP and TCP. IP Layer. IF Layer (Demux Layer in AIX Version). LAN Adapters and Device Drivers. TCP and UDP Performance Tuning. Overall Recommendations. Tuning TCP Maximum Segment Size (MSS). IP Protocol Performance Tuning Recommendations. Ethernet Performance Tuning Recommendations. Token Ring (4Mb) Performance Tuning Recommendations. Token Ring (16Mb) Performance Tuning Recommendations. FDDI Performance Tuning Recommendations. ATM Performance Tuning Recommendations. SOCC Performance Tuning Recommendations. HIPPI Performance Tuning Recommendations. AIX Version 3.2.5 mbuf Pool Performance Tuning. Overview of the mbuf Management Facility. When to Tune the mbuf Pools. How to Tune the mbuf Pools. UDP, TCP/IP, and mbuf Tuning Parameters Summary. thewall. sb_max. rfc1323. udp_sendspace. udp_recvspace. tcp_sendspace. tcp_recvspace. ipqmaxlen. xmt_que_size. rec_que_size. MTU. NFS Tuning. How Many biods and nfsds Are Needed for Good Performance? Performance Implications of Hard or Soft NFS Mounts. Tuning to Avoid Retransmits. Tuning the NFS File Attribute Cache. Disabling Unused NFS ACL Support. Tuning for Maximum Caching of NFS Data. Tuning Other Layers to Improve NFS Performance. Increasing NFS Socket Buffer Size. NFS Server Disk Configuration. Hardware Accelerators. Misuses of NFS That Affect Performance. Serving Diskless Workstations. How a Diskless System Is Different. NFS Considerations. When a Program Runs on a Diskless Workstation. Paging. Resource Requirements of Diskless Workstations. Tuning for Performance. Commands Performance. Case Study 1 - An Office Workload. Case Study 2 - A Software Development Workload. Tuning Asynchronous Connections for High-Speed Transfers. Measurement Objectives and Configurations. Results. The 8/16 Async Port Adapter.
The 64-Port Async Adapter. The 128-Port Async Adapter. Async Port Tuning Techniques. fastport for Fast File Transfers. Using netpmon to Evaluate Network Performance. Using iptrace to Analyze Performance Problems. DFS Performance Tuning. DFS Caching on Disk or Memory? DFS Cache Size. DFS Cache Chunk Size. Number of DFS Cache Chunks. Location of DFS Disk Cache. Cache Status Buffer Size. Effect of Application Read/Write Size. Communications Parameter Settings for DFS. DFS File Server Tuning. DCE LFS Tuning for DFS Performance. Performance Analysis with the Trace Facility. Understanding the Trace Facility. Limiting the Amount of Trace Data Collected. Starting and Controlling Trace. Formatting Trace Data. Viewing Trace Data. An Example of Trace Facility Use. Obtaining a Sample Trace File. Formatting the Sample Trace. Reading a Trace Report. Filtering of the Trace Report. Starting and Controlling Trace from the Command Line. Controlling Trace in Subcommand Mode. Controlling Trace by Commands. Starting and Controlling Trace from a Program. Controlling Trace with Trace Subroutine Calls. Controlling Trace with ioctl Calls. Adding New Trace Events. Possible Forms of a Trace Event Record. Trace Channels. Macros for Recording Trace Events. Use of Event IDs. Examples of Coding and Formatting Events. Syntax for Stanzas in the Trace Format File. Performance Diagnostic Tool (PDT). Structure of PDT. Scope of PDT Analysis. Sample PDT Report. Installing and Enabling PDT. Customizing PDT. Responding to PDT Report Messages. Handling a Possible AIX Performance Bug. Measuring the Baseline. Reporting the Problem. Obtaining and Installing AIX Version 3.2.5 PerfPMR. Installing AIX Version 4.1 PerfPMR. Problem Analysis Data. AIX Performance Monitoring and Tuning Commands. Performance Reporting and Analysis Commands. Performance Tuning Commands. schedtune Command. vmtune Command. pdt_config Script. pdt_report Script. Performance-Related Subroutines. Cache and Addressing Considerations.
Disclaimer. Addressing. Cache Lookup. TLB Lookup. RAM Access. Implications. Efficient Use of the ld Command. Rebindable Executables. Prebound Subroutine Libraries. Examples. Performance of the Performance Tools. filemon. fileplace. iostat. lsattr. lslv. netpmon. netstat. nfsstat. PDT. ps. svmon. tprof. trace. vmstat. Application Memory Management. malloc and realloc. Performance Effects of Shared Libraries. Advantages and Disadvantages of Shared Libraries. How to Build Executables Shared or Nonshared. How to Determine If Nonshared Will Help. Accessing the Processor Timer. POWER-Architecture-Unique Timer Access. Accessing Timer Registers in PowerPC Architecture Systems. Example Use of the second Routine. National Language Support Locale vs Speed. Programming Considerations. Some Simplifying Rules. Controlling Locale. Summary of Tunable AIX Parameters. arpt_killc. biod Count. Disk Adapter Outstanding Requests Limit. Disk Drive Queue Depth. dog_ticks. fork() Retry Interval. ipforwarding. ipfragttl. ipqmaxlen. ipsendredirects. loop_check_sum (3.2.5 only). lowclust (3.2.5 only). lowmbuf (3.2.5 only). maxbuf. max_coalesce. maxfree. maxperm. maxpgahead. maxpin (4.1 only). maxpout. maxttl. mb_cl_hiwat (3.2.5 only). Memory Load Control Parameters. minfree. minperm. minpgahead. minpout. MTU. nfs_chars (3.2.5), nfs_socketsize (4.1). nfsd Count. nfs_gather_threshold (4.1 only). nfs_portmon (3.2.5), portcheck (4.1). nfs_repeat_messages (4.1 only). nfs_setattr_error (4.1 only). nfsudpcksum (3.2.5), udpchecksum (4.1). nonlocsrcroute. npskill (4.1 only). npswarn (4.1 only). numclust (4.1 only). numfsbuf (4.1 only). Paging Space Size. Process Priority Calculation. rec_que_size. rfc1122addrchk. rfc1323. sb_max. subnetsarelocal. syncd Interval. tcp_keepidle. tcp_keepintvl. tcp_mssdflt. tcp_recvspace. tcp_sendspace. tcp_ttl. thewall. Time Slice Expansion Amount. udp_recvspace. udp_sendspace. udp_ttl. xmt_que_size. Bibliography. Glossary. Index.