Friday, January 16, 2015

HP-UX 11iv3/11.31 disk congestion control/throttling

HP-UX 11.31 includes some new parameters available via scsimgr.  These parameters are for disk congestion control, based on a maximum queue depth for the target port.

This whitepaper has the details on how the congestion control works:
HP-UX 11i v3 Congestion Control Management for Storage
http://www.whitepapercompany.com/pdfs/HP_Tech_WP.pdf

On my production systems, I mainly just want to ensure it's disabled :-)

If this lil awk returns a bunch of nuthin' except target path instance numbers, it's not enabled.

sasquatch dimaggio:/home/sasquatch> for tgtp in `ioscan -kfNCtgtpath | awk 'NR>2 {print $2}'`; do echo $tgtp":\c"; scsimgr -p get_attr -C tgtpath -I $tgtp -a tport_qdepth_enable -a tport_max_qdepth -a tport_qd_dyn ; done
0:::
1:::
2:::
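
If a target path ever does come back with it enabled, something along these lines should shut it back off.  This is only a sketch - I haven't had to run it, and I'm assuming set_attr/save_attr take the same -C tgtpath -I <instance> -a addressing that get_attr does - so check scsimgr(1M) and the whitepaper before trusting it.

scsimgr set_attr -C tgtpath -I 0 -a tport_qdepth_enable=false     # current boot only; instance 0 as an example
scsimgr save_attr -C tgtpath -I 0 -a tport_qdepth_enable=false    # persist the setting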

Tuesday, January 6, 2015

SQLServer:Wait Statistics(Waits started per second)\Log buffer waits - bogus values?

There are 128 in-memory transaction log buffers for each SQL Server database.  If none of them are free (a situation typically encountered only with high write latency and high concurrency of writes to the transaction log), threads in active transactions will accrue wait time for transaction log buffers.

In perfmon, this is accounted for with "SQLServer:Wait Statistics(*)\Log buffer waits".  To gauge if transaction log write performance is a bottleneck for high data rate ETL, I typically use "Waits started per second".  Sometimes I use "Waits started per second" in conjunction with "Average wait time (ms)".
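
For collection I don't need anything fancier than the built-in counter tools.  Here's a sketch with typeperf, 15 second samples against a default instance (a named instance would use MSSQL$INSTANCENAME in place of SQLServer, and the output file name is just a placeholder); it runs until Ctrl+C:

typeperf "\SQLServer:Wait Statistics(Waits started per second)\Log buffer waits" "\SQLServer:Wait Statistics(Average wait time (ms))\Log buffer waits" -si 15 -o waitstats.csv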

Some values are obviously bogus - various flavors of 1.84E+19, such as 18446744073709500000, are among my favorites.

But today, among the useful values in the range from 0 to 301 on a 6 vCPU system, I see an abundance of these values, too:
38894
40103
40108
40111
40404
40711
40714
40736
40737
40738
40739

Anyone else seeing bogus values in the SQL Server wait-related perfmon numbers?  Is there something clever to eliminate these bogus values - maybe ensuring that 64-bit perfmon is launched, or some other trick?

******
5761 values collected, in 15 second increments.  This was the distribution.  I'll look through several more days' worth of data.  The 1.84E+19 value is pretty easy to discard - even more so if it always reports 0 as an average wait time.  All of the bogus values from 38894 to 40739 report 49 ms as the average wait time.  Hmmm... so I may be able to discard values in a range with a particular reported wait time; a first pass at that filter is sketched after the distribution below.


Average Wait Time (ms)   Waits Started in 15 Seconds   Occurrences
0 0 3488
0 1 1
0 1.84467E+19 722
14 4 1
19 24 1
21 60 1
22 3 1
24 13 1
24 301 1
25 8 1
28 14 1
29 3 1
29 4 1
32 152 1
34 3 1
34 12 1
34 25 1
35 6 1
37 4 1
38 4 1
45 7 1
46 8 1
49 38894 718
49 40103 11
49 40108 2
49 40111 3
49 40404 1
49 40711 2
49 40714 103
49 40736 18
49 40737 126
49 40738 7
49 40739 525
53 5 1
55 3 1
56 47 1
58 4 1
62 14 1
63 2 1
67 9 1
71 1 1
106 85 1
115 17 1
119 1 1
134 1 1
166 4 1
177 2 1
184 4 1
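
Based on that distribution, here's a first pass at the filter - a sketch only, run against a typeperf-style CSV like the waitstats.csv above, and assuming (purely an assumption about the export layout) that column 2 is "Waits started per second" and column 3 is "Average wait time (ms)":

awk -F',' 'NR>1 {
    gsub(/"/,"")                                              # typeperf quotes every field
    waits = $2 + 0; avg = $3 + 0
    if (waits > 1e15) next                                    # the 1.84E+19 flavor of bogus
    if (avg == 49 && waits >= 38894 && waits <= 40739) next   # the 49 ms range of bogus
    print
}' waitstats.csv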


When I asked #sqlhelp about this, Paul Randal indicated (rightly so) that there are more reliable methods of gathering this information - such as Extended Events and the SQL Server wait stats DMV, sys.dm_os_wait_stats.  I use the wait DMV pretty extensively, too - usually in 5 minute increments.  But I also have a need for something very lightweight, minimally invasive (particularly requiring low privileges), and easily deployable across numerous systems - so I'll still pursue this a bit.
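
For comparison, the DMV version is easy enough to script from the command line too - though it needs VIEW SERVER STATE, which is exactly the kind of privilege I'd rather not require everywhere.  A sketch (server name and trusted authentication are placeholders), looking at the two log-related wait types of interest here - LOGBUFFER is the DMV-side counterpart of the "Log buffer waits" counter, WRITELOG is the log flush wait:

sqlcmd -S MYSERVER -E -Q "SELECT wait_type, waiting_tasks_count, wait_time_ms, signal_wait_time_ms FROM sys.dm_os_wait_stats WHERE wait_type IN ('LOGBUFFER','WRITELOG');"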

Friday, December 19, 2014

VMware SCSI controller per Windows guest LUN

PS C:\Users\sql_sasquatch> gwmi win32_DiskDrive |
>>   %{gwmi -query "ASSOCIATORS OF {$($_.__RELPATH)} where resultclass = Win32_PnpEntity"} |
>>   %{gwmi -query "ASSOCIATORS OF {$($_.__RELPATH)} where resultclass = Win32_SCSIController"} |
>>   fl __CLASS,__RELPATH,Caption,Description
>>



wmic scsicontroller get DeviceId,Name,MaxDataWidth,MaxNumberControlled,MaxTransferRate
wmic diskdrive get name,scsiport,scsitargetid

Friday, October 10, 2014

IBM Power AIX Memory Accounting: Fun with vmstat... Numperm Surprise!!

I like to account for all of the memory on AIX servers.  I often use vmstat in 30 second increments to do that at a high level: tracking computational memory, filesystem cache, and free memory.
Here are the numbers from a small in-house LPAR.
sasquatch vm-oracle:/mtn_home/sasquatch> vmstat -v
              4194304 memory pages
              3915331 lruable pages
               393725 free pages
                    2 memory pools
               773459 pinned pages
                 95.0 maxpin percentage
                  3.0 minperm percentage
                 90.0 maxperm percentage
                 28.4 numperm percentage
              1114771 file pages
                  0.0 compressed percentage
                    0 compressed pages
                 28.4 numclient percentage
                 90.0 maxclient percentage
              1114771 client pages
                    0 remote pageouts scheduled
                    4 pending disk I/Os blocked with no pbuf
                    0 paging space I/Os blocked with no psbuf
                 2228 filesystem I/Os blocked with no fsbuf
                    0 client filesystem I/Os blocked with no fsbuf
                 5330 external pager filesystem I/Os blocked with no fsbuf
                 64.0 percentage of memory used for computational pages


OK... so that's 64.0% computational memory, 28.4% numperm/numclient, and 393725 free pages.  Let's see...
393725 free pages from 4194304 total pages... that's 9.4%.
  64.0%
+ 28.4%
+  9.4%
-------
 101.8%


Huh?!? I guess I could pass this off as a rounding error if it were between 100% and 101%.  But not if the total of the 3 numbers is 101.8%!
What's going on?

If I look at the total of those numbers over a 12 hour period in 30 second increments, it doesn't get any better.


Well... if there's anything that will make me look like a stooge on a conference call, it's when I can't reliably add 3 numbers to get 100% when it's expected.  But I'm not willing to fudge the numbers, either.

Let's consider just these numbers.

              4194304 memory pages
                 28.4 numperm percentage
              1114771 file pages

                 28.4 numclient percentage

1114771/4194304 = 26.58%

Not 28.4%? Oh.  That seems like a clue.  Is there some level of inaccuracy when numperm/numclient is calculated?  Nope. (It took me a long, long time to figure this out but because this is *my* blog, I get to look smart here.)

What about this number reported by vmstat?
              3915331 lruable pages
Hmmm...
1114771/3915331 =  28.47%
Well, look at that.  The number reported as numperm is not a percentage of *total* memory pages on the server, but rather a percentage of *lruable* pages.  And, rather than being rounded as one might typically expect, the number is just truncated to one decimal place.  I proved that to myself by looking at a bunch of numbers that I won't insert here, because I think this post is already ending up too long.
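
A quick check of both candidate denominators, with more decimal places than vmstat shows:

awk 'BEGIN { printf "vs lruable pages: %.4f%%\n", 100*1114771/3915331
             printf "vs memory pages:  %.4f%%\n", 100*1114771/4194304 }'

28.4719 truncates to the reported 28.4 (rounding would have given 28.5), and 26.5782 is the 26.58% from above.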

So that's one reason my stacked graphs of computational, filesystem, and free memory have had such a jagged top edge like this.  It won't look too jagged when it's this small and with a 0-100 scale... but it's bothered me for a while.  At the very top of this stacked graph, the jagged edge is the same as the jagged blue line displayed above, which flits up and down between 100.5 and 101.6.

 

              4194304 memory pages
              3915331 lruable pages
               393725 free pages
              1114771 file pages
                 64.0 percentage of memory used for computational pages



 393725/4194304 =  9.39% free memory
1114771/4194304 = 26.58% filesystem cache memory

That leaves 64.03% for computational pages.  Pretty good in comparison to the 64.0% reported by vmstat.  Looking at a few other values is enough to convince me that while numperm is computed as a portion of lruable pages (and then truncated rather than rounded), computational memory is computed as a portion of total pages.  But is the reported computational memory truncated or rounded?


Total - (Free + File)    percentage of memory used for computational pages
58.57                    58.6
58.58                    58.6
58.58                    58.6
58.95                    59
58.97                    59
58.93                    58.9

In this case, the numbers are rounded.  So, there you have it.  Numperm is calculated with 'lruable pages' as the denominator, and truncated to one decimal place.  The reported percentage for computational memory is calculated with total pages as the denominator, and is rounded as typically expected.

So... from now on, to get rid of the jagged edge in my stacked graphs, I'll use the page counts and convert each to a percentage with total pages as the denominator: free pages, file pages, and the remaining pages from the total as computational pages.
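
Here's roughly what that looks like - a sketch that just reads the page counts back out of vmstat -v and does the division with total memory pages as the denominator:

vmstat -v | awk '
    / memory pages$/ { total = $1 }
    / free pages$/   { free  = $1 }
    / file pages$/   { file  = $1 }
    END {
        printf "free          %6.2f%%\n", 100 * free / total
        printf "filesystem    %6.2f%%\n", 100 * file / total
        printf "computational %6.2f%%\n", 100 * (total - free - file) / total
    }'

For the LPAR above, that comes out to 9.39% + 26.58% + 64.03% = 100.00%.  Which is the whole point.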

 
 You can't see it... but trust me, the top edge is now straight.  By definition :-) 


Wednesday, September 24, 2014

Trace flag 4199 impact on Optimizer Memory Use; RESOURCE_SEMAPHORE_QUERY_COMPILE wait logjam




It's an ugly query, but I have to show it some love.

This is one of the queries responsible for the RESOURCE_SEMAPHORE_QUERY_COMPILE wait logjam I noticed a few days ago.

On a whim, I asked the performance engineer to disable trace flag 4199.  The logjam disappeared - but system workload performance wasn't great.  I didn't expect it to be: to get the desired query concurrency with the available query memory and to get efficient plans, at least the join re-ordering of trace flag 4101 is needed, and all of the other optimizer fixes rolled into trace flag 4199 probably also (in sum) make a positive contribution.

But with the logjam now associated with trace flag 4199, I've got to figure out why T4199 is resulting in more optimizer memory consumption.

I grabbed an estimated plan from the query with the following tacked on:
OPTION
( QUERYTRACEON 3604,
  QUERYTRACEON 2372,
  QUERYTRACEON 2373,
  QUERYTRACEON 4199 )

Then I cleared the procedure cache and grabbed an estimated plan with the same trace flag list, minus T4199.

Both query plans indicated early abort of the optimizer:

StatementOptmEarlyAbortReason="TimeOut"



With T4199, the diagnostic trace flags 2372 & 2373 spit out 89,932 lines of text, finishing at 7628 for memory.
Without trace flag 4199, they spit out 119,790 lines of text, finishing at 3652 for memory.


Up to line 55,051, the two diagnostic captures are almost identical - the numbers stay within 1 of each other.  After line 55,051, the rules and the memory amounts diverge, as indicated by the graph above.
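
One way to find that divergence point without eyeballing ~90,000 lines: strip the trailing memory figure and diff what's left.  A sketch - the file names are made up, and it assumes the memory amount is the last whitespace-delimited field on each trace line:

diff <(awk '{ $NF=""; print }' tf2372_with_4199.txt) \
     <(awk '{ $NF=""; print }' tf2372_without_4199.txt) | head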

I'm gonna need a bigger boat :-)

Hopefully I can pique Paul White's interest, and maybe he can teach me an efficient way of picking some insight out of the 29,858 text lines after the point where the T4199 version of the query diverges.  Ultimately I'd like to get the query not to time out in the optimizer... but getting it not to cause the RESOURCE_SEMAPHORE_QUERY_COMPILE wait logjam is the first order of business.