Tuesday, October 16, 2012

IBM SVC - Fetch Performance Stats

While the IBM-supported method of gathering statistics is TPC, there are occasions when you may want to gather stats yourself: when you don't have TPC, for example, or when you need to send stats files to IBM for a support issue.
To get the stats, first of all you need to make sure that statistics collection is turned on. To check this, use the lssystem command (or lscluster for code prior to 6.3):
svcinfo lssystem
About 15 lines down you should see the following:
statistics_status on
statistics_frequency 15
If statistics_status is off, then you need to switch it on. The second line is the frequency at which statistics will be collected. This is specified in minutes and can be between 1 and 60. To turn this on, or change the frequency, use the startstats command:
svctask startstats -interval interval
Bear in mind that if you use an interval of 1 minute you will get a lot of stats, but doing so makes it easier to catch short spikes in performance.
Once statistics collection is running, you need to copy the resulting files from the cluster. Note that each node on the cluster will collect stats for that node only, so if you simply copy stats from the cluster, you will only copy them from the configuration node. Note also that each node will contain a maximum of 16 sets of stats files. Further stats collections will overwrite old files.
So bearing this in mind, you need to first copy the files from the non-configuration nodes to the configuration node before copying them from the cluster, and you need to do this frequently enough to collect all the stats. If statistics_frequency is X minutes, the oldest files will be overwritten after at most 16 * X minutes. There is no harm in collecting statistics more frequently, as scp will simply overwrite duplicate files. I generally collect them at the same interval as statistics_frequency in order to ensure I have a copy of the latest stats.
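As a quick sanity check on your copy interval, the window comes straight from the 16-file limit described above. A minimal sketch in plain bash (the variable names are just for illustration; substitute the frequency your cluster reports):

```shell
# Each node keeps at most 16 sets of stats files, so with a collection
# frequency of X minutes the oldest files are overwritten after at most
# 16 * X minutes. Copy the files from the cluster at least that often.
stats_frequency=15                      # minutes, as reported by lssystem
max_window=$((16 * stats_frequency))    # minutes before old files are lost
echo "Copy stats at least every ${max_window} minutes"
```

With the default frequency of 15 minutes this gives a four-hour window, which is why copying once per statistics_frequency interval is comfortably safe.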
The collection needs to be performed on a host, such as the SSPC. You can use a Unix host with a cron job, or even a while loop using the sleep command to set the interval, or use a scheduled task on a Windows host. The script itself is two lines, and can be run as a cmd or shell script.
ssh admin@CLUSTER "svcinfo lsnode -nohdr -filtervalue config_node=no|while read -a node; do svctask cpdumps -prefix /dumps/iostats ${node[0]}; done"
scp admin@CLUSTER:/dumps/iostats/* DESTINATION
Choose suitable values for the cluster address and for the destination directory on the local host.
Note that if you're running this from a local Unix shell, you must escape the $, i.e. use \${node[0]}. If you don't, the shell will try to substitute a local variable, which doesn't exist.
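The difference is easy to see locally without touching a cluster. In this sketch (plain bash; node is a stand-in for the array variable set by read -a on the remote side):

```shell
node="expanded-locally"
# Unescaped: the local shell substitutes $node before ssh ever sees it
unescaped="svctask cpdumps -prefix /dumps/iostats ${node}"
# Escaped: the literal text ${node} survives, to be expanded on the cluster
escaped="svctask cpdumps -prefix /dumps/iostats \${node}"
echo "$unescaped"
echo "$escaped"
```

The first line prints the locally substituted value; the second preserves the literal ${node} for the remote shell to expand.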
Note also that the filter value "config_node" is only implemented in later code levels; you'll need to remove the -filtervalue option on earlier code. You'll then get an error when the cpdumps command runs against the configuration node itself, but the files will all copy OK nevertheless.
Once you have set this up as a cron job or scheduled task, you should find the destination directory filling up with statistics files ready for analysis, which is a more complicated subject.
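For the cron route, assuming the two-line script above is saved somewhere like /usr/local/bin/fetch_svc_stats.sh (the path and log file here are illustrative), a matching crontab entry might look like this:

```shell
# Fetch SVC stats every 15 minutes, matching a statistics_frequency of 15
*/15 * * * * /usr/local/bin/fetch_svc_stats.sh >> /var/log/svc_stats.log 2>&1
```

Adjust the */15 step to whatever statistics_frequency your cluster uses.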
Here's an example of a short cmd script which uses the $clust environment variable on the cluster to fetch the name of the cluster. The files are fetched and added to an appropriately named zip file. I use Cygwin rather than PuTTY for this, but you could easily modify it to use plink and pscp.
set SVC=%1
set ZIP="C:\Program Files\7-zip\7z.exe"

for /f %%i in ('ssh admin@%SVC% echo $clust') do set NAME=%%i

ssh admin@%SVC% "svcinfo lsnode -nohdr |while read -a node; do svctask cpdumps -prefix /dumps/iostats $node; done"
scp admin@%SVC%:/dumps/iostats/* .

%ZIP% u -tzip -mx9 stats_%NAME%.zip *_stats_*
if [%errorlevel%]==[0] del *_stats_*
