Friday, September 6, 2013

IBM Power 7 Problem Case: Unexpected paging resolved by disabling memory_affinity

A system running Oracle on IBM Power AIX was seeing a lot of paging space traffic, and I was pulled in.  Even while paging space writes were occurring, vmstat seemed to show enough free server RAM to avoid paging out.  In some ways it was worse than folks originally thought: paging space buffer (psbuf) shortages were causing additional waits in some paging space operations, and iostat showed the paging space disk experiencing sqfull conditions.

This is a big system.  It has 16 memory pools because of the default cpu_scale_memp value of 8: every 8 logical CPUs get a memory pool.  So there are 128 logical CPUs on the system, spread across 32 Power7 cores (4 hardware threads per core).
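As a quick sanity check on those numbers, here is a minimal sketch of the arithmetic.  It assumes the pool count is simply logical CPUs divided by cpu_scale_memp, as described above, and that the logical CPUs divide evenly across the cores; the variable names are mine.

```python
# Figures taken from the post: 16 memory pools, cpu_scale_memp of 8, 32 cores.
memory_pools = 16
cpu_scale_memp = 8   # logical CPUs per memory pool (the default, per the post)
cores = 32

# One pool per cpu_scale_memp logical CPUs implies this many logical CPUs.
logical_cpus = memory_pools * cpu_scale_memp

# Assumption: logical CPUs are spread evenly over the cores.
threads_per_core = logical_cpus // cores

print(f"{logical_cpus} logical CPUs, {threads_per_core} threads per core")
```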

With memory_affinity enabled, even though the NB_PAGES memory pool sizes were fairly well balanced, the NUMFRB free 4k frames per pool were not very well balanced.
The kdb NB_PAGES and NUMFRB values below are in hex.

(0)> memp *
                 VMP MEMP  NB_PAGES  FRAMESETS        NUMFRB
F1000F0009740000  00  000   000C9D00  000 001 002 003 00030A12
F1000F0009740500  00  001   000CA100  004 005 006 007 0000A828
F1000F0009740A00  00  002   000C9E00  008 009 00A 00B 0000B346
F1000F0009740F00  00  003   000C9300  00C 00D 00E 00F 0001EEB8
F1000F0009741400  00  004   000C9D90  010 011 012 013 000138AB
F1000F0009741900  00  005   000CAA00  014 015 016 017 00014807
F1000F0009741E00  00  006   000CA000  018 019 01A 01B 0000346E
F1000F0009742300  00  007   000C9000  01C 01D 01E 01F 0002C69A
F1000F0009742800  01  008   000C9800  020 021 022 023 0001FA32
F1000F0009742D00  01  009   000CA000  024 025 026 027 0000118B
F1000F0009743200  01  00A   000CA000  028 029 02A 02B 00011B04
F1000F0009743700  01  00B   000C9380  02C 02D 02E 02F 000131E8
F1000F0009743C00  01  00C   000C9000  030 031 032 033 00015683
F1000F0009744100  01  00D   000C9000  034 035 036 037 00003573
F1000F0009744600  01  00E   000C9000  038 039 03A 03B 000097D1
F1000F0009744B00  01  00F   000C9000  03C 03D 03E 03F 0002C819
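Since the hex is hard to eyeball, here is a minimal sketch of how the listing could be turned into decimal frame counts and a percent-free figure per pool.  The parser is my own illustration, not an IBM tool: it assumes each data row splits into nine whitespace-separated fields laid out as in the output above, and the sample holds just two rows copied from it.

```python
# Two rows copied from the 'memp *' listing above, plus prompt and header.
SAMPLE = """\
(0)> memp *
                 VMP MEMP  NB_PAGES  FRAMESETS        NUMFRB
F1000F0009740000  00  000   000C9D00  000 001 002 003 00030A12
F1000F0009742D00  01  009   000CA000  024 025 026 027 0000118B
"""

def parse_memp(text):
    """Return one dict per memory pool row; hex fields become ints."""
    pools = []
    for line in text.splitlines():
        fields = line.split()
        # Assumed row layout: address, VMP, MEMP, NB_PAGES, 4 framesets, NUMFRB.
        if len(fields) != 9:
            continue  # skips the prompt, the header, and blank lines
        try:
            nb_pages = int(fields[3], 16)
            numfrb = int(fields[8], 16)
        except ValueError:
            continue  # not a data row after all
        pools.append({"memp": fields[2], "nb_pages": nb_pages, "numfrb": numfrb})
    return pools

for pool in parse_memp(SAMPLE):
    pct_free = 100.0 * pool["numfrb"] / pool["nb_pages"]
    print(f'pool {pool["memp"]}: {pool["numfrb"]} of {pool["nb_pages"]} '
          f'frames free ({pct_free:.1f}%)')
```

For these two rows, pool 000 comes out at roughly 24% free while pool 009 is at roughly 0.5% free.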


I advised them to disable the vmo parameter memory_affinity (which requires a bosboot and a reboot).
This was what the memory pools looked like afterward.

(0)> memp *
                 VMP MEMP  NB_PAGES  FRAMESETS        NUMFRB
F1000F0009740000  00  000   000C9E00  000 001 002 003 000B3848
F1000F0009740500  00  001   000C8700  004 005 006 007 000B2952
F1000F0009740A00  00  002   000C9E00  008 009 00A 00B 000B379D
F1000F0009740F00  00  003   000C8500  00C 00D 00E 00F 000B2895
F1000F0009741400  00  004   000C8C70  010 011 012 013 000B2F53
F1000F0009741900  00  005   000CA000  014 015 016 017 000B3568
F1000F0009741E00  00  006   000CA000  018 019 01A 01B 000B3702
F1000F0009742300  00  007   000CA000  01C 01D 01E 01F 000B34CF
F1000F0009742800  00  008   000C8000  020 021 022 023 000B2E72
F1000F0009742D00  00  009   000C8000  024 025 026 027 000B305F
F1000F0009743200  00  00A   000C9000  028 029 02A 02B 000B32BA
F1000F0009743700  00  00B   000C9000  02C 02D 02E 02F 000B322C
F1000F0009743C00  00  00C   000C9000  030 031 032 033 000B3215
F1000F0009744100  00  00D   000C9000  034 035 036 037 000B3211
F1000F0009744600  00  00E   000C9000  038 039 03A 03B 000B313E
F1000F0009744B00  00  00F   000C9000  03C 03D 03E 03F 000B33C3
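To put a number on how different the two listings are, here is a small sketch that compares the smallest and largest NUMFRB value in each.  The hex values are copied from the two kdb outputs in this post (leading zeros dropped); the max/min ratio is just my own rough measure of imbalance, not anything AIX reports.

```python
# NUMFRB values (hex) for pools 000-00F, copied from the two listings.
BEFORE = ("30A12 A828 B346 1EEB8 138AB 14807 346E 2C69A "
          "1FA32 118B 11B04 131E8 15683 3573 97D1 2C819")
AFTER = ("B3848 B2952 B379D B2895 B2F53 B3568 B3702 B34CF "
         "B2E72 B305F B32BA B322C B3215 B3211 B313E B33C3")

def spread(hex_values):
    """Return (min, max, max/min) of the free-frame counts."""
    frames = [int(value, 16) for value in hex_values.split()]
    return min(frames), max(frames), max(frames) / min(frames)

for label, data in (("before", BEFORE), ("after", AFTER)):
    lowest, highest, ratio = spread(data)
    print(f"{label}: min={lowest} max={highest} max/min={ratio:.2f}")
```

With memory_affinity enabled, the fullest pool had about 44 times fewer free frames than the emptiest one; with it disabled, the ratio is about 1.01.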


The memory pool NB_PAGES sizes became a little more balanced with memory_affinity disabled.  But the NUMFRB numbers became a LOT more balanced.  Excellent.  That was exactly what I wanted.  After two weeks, I was able to confirm that there had been NO paging space traffic at all, based on the "paging space page ins" and "paging space page outs" counters reported by vmstat -s.
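For anyone who wants to script that check, here is a minimal sketch that pulls the two counters out of saved vmstat -s output and compares two snapshots.  It assumes each counter appears as a number followed by its label on one line; the snapshot text below is hypothetical, not data from this system.

```python
import re

# Matches only the two paging space counters, not plain "page ins"/"page outs".
COUNTER = re.compile(r"^\s*(\d+)\s+(paging space page (?:ins|outs))\s*$")

def paging_counters(vmstat_s_text):
    """Return {'paging space page ins': n, 'paging space page outs': n}."""
    counters = {}
    for line in vmstat_s_text.splitlines():
        match = COUNTER.match(line)
        if match:
            counters[match.group(2)] = int(match.group(1))
    return counters

# Hypothetical excerpts from two 'vmstat -s' runs taken two weeks apart.
FIRST = """\
     52310 page ins
     81422 page outs
         0 paging space page ins
         0 paging space page outs
"""
SECOND = """\
   9731054 page ins
  15500812 page outs
         0 paging space page ins
         0 paging space page outs
"""

first, second = paging_counters(FIRST), paging_counters(SECOND)
delta = {name: second[name] - first[name] for name in first}
print(delta)
```

A delta of zero for both counters over the interval is what "no paging space traffic" looks like here.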

The absence of any paging space traffic allowed them to increase their SGA size.  That further benefited performance, significantly in fact.  In Oracle 11gR2, the small table threshold is 10% of the database cache.  Some of the SGA memory uses are fixed size, so increasing the SGA size can disproportionately favor the database cache.  In this case, the database cache was increased enough to bring a whole new set of high concurrency tables underneath the small table threshold.  That discouraged Oracle from performing redundant direct path read full table scans of these tables into PGA, and instead brought more of them into the SGA database cache.

I didn't even try to get more data for the memory pools in this case.  I felt bad enough about asking their administrator to run "echo 'memp *' | kdb" for me as root before and after disabling memory_affinity :).

But I felt pretty good about seeing their performance turn around.

Disabling memory_affinity is certainly not a general recommendation to address unwanted paging space use.  By disabling memory_affinity, you sacrifice a significant amount of optimization in memory latency that can be provided by keeping the memory as close as possible to the threads that use it.  But, truth be told, that optimization becomes less valuable on a large database server, with CPU cores and physical RAM that span sockets.
