Chart your course with sar

Posted in: DBA Lounge, Oracle, Technical Track

The sar (system activity reporter) is a Linux utility that actually started life long ago in System V Unix.

In spite of its ancient origins, sar is still quite useful, and has even been enhanced in recent versions of Linux.

The use of sar is not limited to root – any user can run sar and its utilities.  Most Linux systems have sar installed and enabled by default to collect sar data at 10 minute intervals and retain that data for 10 days.

The reports output by sar will look familiar:

 

[sourcecode language=”bash” padlinenumbers=”true”] jkstill@poirot ~/tmp $ sar -d | head -20
Linux 4.4.0-53-generic (poirot.jks.com) 08/07/17 _x86_64_ (1 CPU)

00:00:01 DEV tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util
00:02:01 dev7-0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
00:02:01 dev7-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
00:02:01 dev7-2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
00:02:01 dev7-3 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
00:02:01 dev7-4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
00:02:01 dev7-5 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
00:02:01 dev7-6 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
00:02:01 dev8-0 0.40 0.00 4.97 12.43 0.00 0.26 0.26 0.01
00:02:01 dev252-0 0.63 0.00 4.90 7.78 0.00 0.16 0.16 0.01
00:02:01 dev252-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
00:04:01 dev7-0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
00:04:01 dev7-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
00:04:01 dev7-2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
00:04:01 dev7-3 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
00:04:01 dev7-4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
00:04:01 dev7-5 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
00:04:01 dev7-6 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
[/sourcecode]

This report format is good for a quick look at different metrics to get an overall idea of system perform, disk metrics in the previously shown example.

However what if you would like to collate this data into something more readily consumed for charting an analysis, such as a CSV file?  The standard output of sar does not readily lend itself to such use without quite a bit of work.

The sar utility is part of the Linux  sysstat package and contains not only sar, but sadf.  If sar is available then so is sadf. What is sadf you say?  sadf is a the sar data formatting utility, and can output data in CSV, XML and other formats.

Using sadf a bash script can be created that will extract sar data for many Linux subsystems and write that data in the format of your choice, CSV in this case.

The asp.sh script does just that, creating CSV files that can then be used for charting and analysis.

 

[sourcecode language=”bash”] #!/bin/bash

# Jared Still – Pythian
# still@pythian.com jkstill@gmail.com
# 2017-07-20

# asp – another sar processor

# tested on Red Hat Enterprise Linux Server release 6.6 (Santiago)
# also tested on Linux Mint

help() {
echo
echo $0 -s source-dir -d dest-dir
echo
}

# variables can be set to identify multiple sets of copied sar files
sarSrcDir=’/var/log/sa’ # RedHat, CentOS …
#sarSrcDir=’/var/log/sysstat’ # Debian, Ubuntu …

sarDstDir="sar-csv"

csvConvertCmd=" sed -e ‘s/;/,/g’ "

while getopts s:d:h arg
do
case $arg in
d) sarDstDir=$OPTARG;;
s) sarSrcDir=$OPTARG;;
h) help; exit 0;;
*) help; exit 1;;
esac
done

cat << EOF

Source: $sarSrcDir
Dest: $sarDstDir

EOF

#exit

mkdir -p $sarDstDir || {

echo
echo Failed to create $sarDstDir
echo
exit 1

}

echo "lastSaDirEl: $lastSaDirEl"

# sar options
# -d activity per block device
# -j LABEL: use label for device if possible. eg. sentryoraredo01 rather than /dev/dm-3
# -b IO and transfer rates
# -q load
# -u cpu
# -r memory utilization
# -R memory
# -B paging
# -S swap space utilization
# -W swap stats
# -n network
# -v kernel filesystem stats
# -w context switches and task creation
#sarDestOptions=( ‘-d -j LABEL -p’ ‘-b’ ‘-q’ ‘-u ALL’ ‘-r’ ‘-R’ ‘-B’ ‘-S’ ‘-W’ ‘-n DEV,EDEV,NFS,NFSD,SOCK,IP,EIP,ICMP,EICMP,TCP,ETCP,UDP’ ‘-v’ ‘-w’)
# break up network into a separate file for each option
# not all options available depending on sar version
# for disk "-d" you may want one of ID, LABEL, PATH or UUID – check the output and the sar docs
sarDestOptions=( ‘-d -j ID -p’ ‘-b’ ‘-q’ ‘-u ALL’ ‘-r’ ‘-R’ ‘-B’ ‘-S’ ‘-W’ ‘-n DEV’ ‘-n EDEV’ ‘-n NFS’ ‘-n NFSD’ ‘-n SOCK’ ‘-n IP’ ‘-n EIP’ ‘-n ICMP’ ‘-n EICMP’ ‘-n TCP’ ‘-n ETCP’ ‘-n UDP’ ‘-v’ ‘-w’)

#sarDestFiles=( sar-disk.csv sar-io.csv sar-load.csv sar-cpu.csv sar-mem-utilization.csv sar-mem.csv sar-paging.csv sar-swap-utilization.csv sar-swap-stats.csv sar-network.csv sar-kernel-fs.csv sar-context.csv)
sarDestFiles=( sar-disk.csv sar-io.csv sar-load.csv sar-cpu.csv sar-mem-utilization.csv sar-mem.csv sar-paging.csv sar-swap-utilization.csv sar-swap-stats.csv sar-net-dev.csv sar-net-ede.csv sar-net-nfs.csv sar-net-nfsd.csv sar-net-sock.csv sar-net-ip.csv sar-net-eip.csv sar-net-icmp.csv sar-net-eicmp.csv sar-net-tcp.csv sar-net-etcp.csv sar-net-udp.csv sar-kernel-fs.csv sar-context.csv)

lastSarOptEl=${#sarDestOptions[@]}
echo "lastSarOptEl: $lastSarOptEl"

#while [[ $i -lt ${#x[@]} ]]; do echo ${x[$i]}; (( i++ )); done;

# initialize files with header row
i=0
while [[ $i -lt $lastSarOptEl ]] do
CMD="sadf -d — ${sarDestOptions[$i]} | head -1 | $csvConvertCmd > ${sarDstDir}/${sarDestFiles[$i]} "
echo CMD: $CMD
eval $CMD
#sadf -d — ${sarDestOptions[$i]} | head -1 | $csvConvertCmd > ${sarDstDir}/${sarDestFiles[$i]}
echo "################"
(( i++ ))
done

#exit

#for sarFiles in ${sarSrcDirs[$currentEl]}/sa??
for sarFiles in $(ls -1dtar ${sarSrcDir}/sa??)
do
for sadfFile in $sarFiles
do

#echo CurrentEl: $currentEl
# sadf options
# -t is for local timestamp
# -d : database semi-colon delimited output

echo Processing File: $sadfFile

i=0
while [[ $i -lt $lastSarOptEl ]] do
CMD="sadf -d — ${sarDestOptions[$i]} $sadfFile | tail -n +2 | $csvConvertCmd >> ${sarDstDir}/${sarDestFiles[$i]}"
echo CMD: $CMD
eval $CMD
if [[ $? -ne 0 ]]; then
echo "#############################################
echo "## CMD Failed"
echo "## $CMD"
echo "#############################################

fi
(( i++ ))
done

done
done

echo
echo Processing complete
echo
echo files located in $sarDstDir
echo

# show the files created
i=0
while [[ $i -lt $lastSarOptEl ]] do
ls -ld ${sarDstDir}/${sarDestFiles[$i]}
(( i++ ))
done

[/sourcecode]

 

Following is an example execution of the script:

 

[sourcecode language=”bash”] jkstill@poirot ~/pythian/blog/sar-tools-blog $ ./asp.sh -s /var/log/sysstat

Source: /var/log/sysstat
Dest: sar-csv

lastSaDirEl:
lastSarOptEl: 23
CMD: sadf -d — -d -j ID -p | head -1 | sed -e ‘s/;/,/g’ > sar-csv/sar-disk.csv
################
CMD: sadf -d — -b | head -1 | sed -e ‘s/;/,/g’ > sar-csv/sar-io.csv
################
CMD: sadf -d — -q | head -1 | sed -e ‘s/;/,/g’ > sar-csv/sar-load.csv
################
CMD: sadf -d — -u ALL | head -1 | sed -e ‘s/;/,/g’ > sar-csv/sar-cpu.csv
################
CMD: sadf -d — -r | head -1 | sed -e ‘s/;/,/g’ > sar-csv/sar-mem-utilization.csv
################
CMD: sadf -d — -R | head -1 | sed -e ‘s/;/,/g’ > sar-csv/sar-mem.csv
################
CMD: sadf -d — -B | head -1 | sed -e ‘s/;/,/g’ > sar-csv/sar-paging.csv
################
CMD: sadf -d — -S | head -1 | sed -e ‘s/;/,/g’ > sar-csv/sar-swap-utilization.csv
################
CMD: sadf -d — -W | head -1 | sed -e ‘s/;/,/g’ > sar-csv/sar-swap-stats.csv
################
CMD: sadf -d — -n DEV | head -1 | sed -e ‘s/;/,/g’ > sar-csv/sar-net-dev.csv
################
CMD: sadf -d — -n EDEV | head -1 | sed -e ‘s/;/,/g’ > sar-csv/sar-net-ede.csv
################
CMD: sadf -d — -n NFS | head -1 | sed -e ‘s/;/,/g’ > sar-csv/sar-net-nfs.csv
################
CMD: sadf -d — -n NFSD | head -1 | sed -e ‘s/;/,/g’ > sar-csv/sar-net-nfsd.csv
################
CMD: sadf -d — -n SOCK | head -1 | sed -e ‘s/;/,/g’ > sar-csv/sar-net-sock.csv
################
CMD: sadf -d — -n IP | head -1 | sed -e ‘s/;/,/g’ > sar-csv/sar-net-ip.csv
Requested activities not available in file /var/log/sysstat/sa07
################
CMD: sadf -d — -n EIP | head -1 | sed -e ‘s/;/,/g’ > sar-csv/sar-net-eip.csv
Requested activities not available in file /var/log/sysstat/sa07
################
CMD: sadf -d — -n ICMP | head -1 | sed -e ‘s/;/,/g’ > sar-csv/sar-net-icmp.csv
Requested activities not available in file /var/log/sysstat/sa07
################
CMD: sadf -d — -n EICMP | head -1 | sed -e ‘s/;/,/g’ > sar-csv/sar-net-eicmp.csv
Requested activities not available in file /var/log/sysstat/sa07
################
CMD: sadf -d — -n TCP | head -1 | sed -e ‘s/;/,/g’ > sar-csv/sar-net-tcp.csv
Requested activities not available in file /var/log/sysstat/sa07
################
CMD: sadf -d — -n ETCP | head -1 | sed -e ‘s/;/,/g’ > sar-csv/sar-net-etcp.csv
Requested activities not available in file /var/log/sysstat/sa07
################
CMD: sadf -d — -n UDP | head -1 | sed -e ‘s/;/,/g’ > sar-csv/sar-net-udp.csv
Requested activities not available in file /var/log/sysstat/sa07
################
CMD: sadf -d — -v | head -1 | sed -e ‘s/;/,/g’ > sar-csv/sar-kernel-fs.csv
################
CMD: sadf -d — -w | head -1 | sed -e ‘s/;/,/g’ > sar-csv/sar-context.csv
################
Processing File: /var/log/sysstat/sa30
CMD: sadf -d — -d -j ID -p /var/log/sysstat/sa30 | tail -n +2 | sed -e ‘s/;/,/g’ >> sar-csv/sar-disk.csv
CMD: sadf -d — -b /var/log/sysstat/sa30 | tail -n +2 | sed -e ‘s/;/,/g’ >> sar-csv/sar-io.csv
CMD: sadf -d — -q /var/log/sysstat/sa30 | tail -n +2 | sed -e ‘s/;/,/g’ >> sar-csv/sar-load.csv
CMD: sadf -d — -u ALL /var/log/sysstat/sa30 | tail -n +2 | sed -e ‘s/;/,/g’ >> sar-csv/sar-cpu.csv
CMD: sadf -d — -r /var/log/sysstat/sa30 | tail -n +2 | sed -e ‘s/;/,/g’ >> sar-csv/sar-mem-utilization.csv
CMD: sadf -d — -R /var/log/sysstat/sa30 | tail -n +2 | sed -e ‘s/;/,/g’ >> sar-csv/sar-mem.csv
CMD: sadf -d — -B /var/log/sysstat/sa30 | tail -n +2 | sed -e ‘s/;/,/g’ >> sar-csv/sar-paging.csv
CMD: sadf -d — -S /var/log/sysstat/sa30 | tail -n +2 | sed -e ‘s/;/,/g’ >> sar-csv/sar-swap-utilization.csv
CMD: sadf -d — -W /var/log/sysstat/sa30 | tail -n +2 | sed -e ‘s/;/,/g’ >> sar-csv/sar-swap-stats.csv
CMD: sadf -d — -n DEV /var/log/sysstat/sa30 | tail -n +2 | sed -e ‘s/;/,/g’ >> sar-csv/sar-net-dev.csv
CMD: sadf -d — -n EDEV /var/log/sysstat/sa30 | tail -n +2 | sed -e ‘s/;/,/g’ >> sar-csv/sar-net-ede.csv
CMD: sadf -d — -n NFS /var/log/sysstat/sa30 | tail -n +2 | sed -e ‘s/;/,/g’ >> sar-csv/sar-net-nfs.csv
CMD: sadf -d — -n NFSD /var/log/sysstat/sa30 | tail -n +2 | sed -e ‘s/;/,/g’ >> sar-csv/sar-net-nfsd.csv
CMD: sadf -d — -n SOCK /var/log/sysstat/sa30 | tail -n +2 | sed -e ‘s/;/,/g’ >> sar-csv/sar-net-sock.csv
CMD: sadf -d — -n IP /var/log/sysstat/sa30 | tail -n +2 | sed -e ‘s/;/,/g’ >> sar-csv/sar-net-ip.csv
Requested activities not available in file /var/log/sysstat/sa30
CMD: sadf -d — -n EIP /var/log/sysstat/sa30 | tail -n +2 | sed -e ‘s/;/,/g’ >> sar-csv/sar-net-eip.csv
Requested activities not available in file /var/log/sysstat/sa30
CMD: sadf -d — -n ICMP /var/log/sysstat/sa30 | tail -n +2 | sed -e ‘s/;/,/g’ >> sar-csv/sar-net-icmp.csv
Requested activities not available in file /var/log/sysstat/sa30

CMD: sadf -d — -v /var/log/sysstat/sa07 | tail -n +2 | sed -e ‘s/;/,/g’ >> sar-csv/sar-kernel-fs.csv
CMD: sadf -d — -w /var/log/sysstat/sa07 | tail -n +2 | sed -e ‘s/;/,/g’ >> sar-csv/sar-context.csv

Processing complete

files located in sar-csv
[/sourcecode]

Note that for some commands the result is Requested activities not available.  What this means is that we have asked for something sar may be capable of collecting, but in this case there is no data. It may be that sar has not been configured to collect that particular metric. These messages can be ignored unless, of course, it is for a metric you would like to see. (sar configuration is outside the scope of this article)

Let’s break down the following sadf command line

[sourcecode language=”bash”] sadf -d — -d -j ID -p /var/log/sysstat/sa05 | tail -n +2 | sed -e ‘s/;/,/g’ >> sar-csv/sar-disk.csv
[/sourcecode]

The sadf ‘-d’ option

This option tells sadf to output data in  CSV format.  As CSV is an acronym for Comma-Separated-Values, the use of a semi-colon as a delimiter may at first seem a little puzzling.  However it makes sense to use something other than a comma, as it may be that a comma could appear in the output.  If that were the case sar would need to then surround the data with quotes.  While many programs can deal with data that may or may not be quoted, it is just simpler to use a semi-colon, which is unlikely to appear in data seen in sar.

The ‘–‘ option

This option doesn’t appear to be too well known. When used on the shell command line, — signifies the end of arguments for the program. A typical use might be if you were using grep to search for ‘-j’ in some text files.  This following command fails due to grep interpreting the text we want to search as an argument. Since there is no ‘-j’ argument for grep, the command fails.

 

[sourcecode language=”bash”] jkstill@poirot ~/pythian/blog/sar-tools-blog $ grep ‘-j’ *.sh
grep: invalid option — ‘j’
Usage: grep [OPTION]… PATTERN [FILE]…
Try ‘grep –help’ for more information.
[/sourcecode]

 

Using ‘–‘ tells the shell to stop processing arguments for the current command, which is grep in this case, and so grep searches for ‘-j’ as intended.

[sourcecode language=”bash”] jkstill@poirot ~/pythian/blog/sar-tools-blog $ grep — ‘-j’ *.sh
# -j LABEL: use label for device if possible. eg. sentryoraredo01 rather than /dev/dm-3
#sarDestOptions=( ‘-d -j LABEL -p’ ‘-b’ ‘-q’ ‘-u ALL’ ‘-r’ ‘-R’ ‘-B’ ‘-S’ ‘-W’ ‘-n DEV,EDEV,NFS,NFSD,SOCK,IP,EIP,ICMP,EICMP,TCP,ETCP,UDP’ ‘-v’ ‘-w’)
sarDestOptions=( ‘-d -j ID -p’ ‘-b’ ‘-q’ ‘-u ALL’ ‘-r’ ‘-R’ ‘-B’ ‘-S’ ‘-W’ ‘-n DEV’ ‘-n EDEV’ ‘-n NFS’ ‘-n NFSD’ ‘-n SOCK’ ‘-n IP’ ‘-n EIP’ ‘-n ICMP’ ‘-n EICMP’ ‘-n TCP’ ‘-n ETCP’ ‘-n UDP’ ‘-v’ ‘-w’)
[/sourcecode]

Why in this case are we using ‘–‘?  It is because the other options following -d are not for sadf, but are sar options that sadf will use when calling sar.

The ‘-d’, ‘-j’ and ‘-p’ options

Be reading the sar man page you can learn what each of these are:

  • -d block device activity
  • -j display persistent device names – options are ID, LABEL, PATH and UUID
  • -p pretty print device names

tail -n +2

Just get the rows after the header row.  asp.sh calls sadf for each day’s sar data.  The first step of creating the header has already been done.

 

sed

sed is being used to change semi-colons to commas.  So far I have not seen any commas in sar data, and using a comma delimiter is just more convenient.

 

Charting in Excel with Perl

 

Having the sar data in the CSV format is convenient, as it will load directly into Excel or Google Sheets. What isn’t convenient is all the work required to load the data, format it, and then create multiple charts from the data.  There could be a number of CSV files this needs to be done with.

Perl to the rescue!  The Excel::Writer::XLSX Perl module can directly create formatted Excel files, complete with charts. I am going to make some basic use of this quite capable module via the dynachart.pl Perl script. This script reads CSV data and creates Excel files, optionally with line charts of selected columns.

If you would like to see some simple examples of using Perl to create Excel files, the documentation for the Perl Module Writer::Excel::XLSX contains several good examples, such as this one:  Example-1

There are several others to be found in the documentation for  Excel::Writer::XLSX

Following is the entire script.  This script and others can be found at the csv-tools github repo.

[sourcecode language=”perl”] #!/usr/bin/env perl

# dynachart.pl
# Jared Still 2017-07-23
# still@pythian.com jkstill@gmail.com

# data is from STDIN

use warnings;
use strict;
use Data::Dumper;
use Pod::Usage;
use Getopt::Long;
use Excel::Writer::XLSX;

my $debug = 0;
my $combinedChart = 0;
my %optctl = ();
my ($help,$man);
my @chartCols;
my $categoryColName=”;
my $categoryColNum=0;

Getopt::Long::GetOptions(
\%optctl,
‘spreadsheet-file=s’,
‘debug!’ => \$debug,
‘chart-cols=s{1,10}’ => \@chartCols,
‘combined-chart!’ => \$combinedChart,
‘worksheet-col=s’, # creates a separate worksheet per value of this column
‘category-col=s’ => \$categoryColName,
‘h|help|?’ => \$help, man => \$man
) or pod2usage(2) ;

pod2usage(1) if $help;
pod2usage(-verbose => 2) if $man;

my $xlFile = defined($optctl{‘spreadsheet-file’}) ? $optctl{‘spreadsheet-file’} : ‘asm-metrics.xlsx’;
my $workSheetCol = defined($optctl{‘worksheet-col’}) ? $optctl{‘worksheet-col’} : 0;

my %fonts = (
fixed => ‘Courier New’,
fixed_bold => ‘Courier New’,
text => ‘Arial’,
text_bold => ‘Arial’,
);

my %fontSizes = (
fixed => 10,
fixed_bold => 10,
text => 10,
text_bold => 10,
);

my $maxColWidth = 50;
my $counter = 0;
my $interval = 100;

# create workbook
my $workBook = Excel::Writer::XLSX->new($xlFile);
die "Problems creating new Excel file $xlFile: $!\n" unless defined $workBook;

# create formats
my $stdFormat = $workBook->add_format(bold => 0,font => $fonts{fixed}, size => $fontSizes{fixed}, color => ‘black’);
my $boldFormat = $workBook->add_format(bold => 1,font => $fonts{fixed_bold}, size => $fontSizes{fixed_bold}, color => ‘black’);
my $wrapFormat = $workBook->add_format(bold => 0,font => $fonts{text}, size => $fontSizes{text}, color => ‘black’);
$wrapFormat->set_align(‘vjustify’);

my $labels=<>;
chomp $labels;
# sadf starts header lines with ‘# ‘ – remove that
$labels =~ s/^#\s+//;
my @labels = split(/,\s*/,$labels);

if ($debug) {

print qq{LABELS:\n};

print join("\n",@labels);

print "\n";

}

# get the X series category
if ( $categoryColName ) {

my $want = $categoryColName;
my $index = 0;
++$index until ($labels[$index] eq $want) or ($index > $#labels);
$categoryColNum = $index;

}

#print join("\n",@labels);

# get the element number of the column used to segregate into worksheets
my $workSheetColPos;
if ($workSheetCol) {
my $i=0;
foreach my $label ( @labels) {
if ($label eq $workSheetCol) {
$workSheetColPos = $i;
last;
}
$i++;
}
}

print "\nworkSheetColPos: $workSheetColPos\n" if $debug;

# validate the columns to be charted
# use as an index into the labels
my @chartColPos=();
{
my $i=0;
foreach my $label ( @labels) {
foreach my $chartCol ( @chartCols ) {
if ($label eq $chartCol) {
push @chartColPos, $i;
last;
}
}
$i++;
}
}

if ($debug) {
print "\nChartCols:\n", Dumper(\@chartCols);
print "\nChartColPos:\n", Dumper(\@chartColPos);
print "\nLabels:\n", Dumper(\@labels);
}

my %lineCount=();
my %workSheets=();

# the first worksheet is a directory
my $directoryName=’Directory’;
my $directory;
my $noDirectory=0;
my $directoryLineCount=0;

unless ($noDirectory) {
$directory = $workBook->add_worksheet($directoryName) ;
$directory->set_column(0,0,30);
$directory->write_row($directoryLineCount++,0,[‘Directory’],$boldFormat);
}

while (<>) {

chomp;
my @data=split(/,/);

my $currWorkSheetName;
if ($workSheetCol) {
$currWorkSheetName=$data[$workSheetColPos];
if (length($currWorkSheetName) > 31 ) {
# cut some out of the middle of the name as the Excel worksheet name has max size of 31
$currWorkSheetName = substr($currWorkSheetName,0,14) . ‘…’ . substr($currWorkSheetName,-14);
}
} else {
$currWorkSheetName=’DynaChart’;
}

print "Worksheet Name: $currWorkSheetName\n" if $debug;

unless (defined $workSheets{$currWorkSheetName}) {
$workSheets{$currWorkSheetName} = $workBook->add_worksheet($currWorkSheetName);
$workSheets{$currWorkSheetName}->write_row($lineCount{$currWorkSheetName}++,0,\@labels,$boldFormat);
# freeze pane at header
$workSheets{$currWorkSheetName}->freeze_panes($lineCount{$currWorkSheetName},0);
}

# setup column widths
#$workSheet->set_column($el,$el,$colWidth);
$workSheets{$currWorkSheetName}->write_row($lineCount{$currWorkSheetName}++,0,\@data, $stdFormat);

}

if ($debug) {
print "Worksheets:\n";
print "$_\n" foreach keys %workSheets;
print Dumper(\%lineCount);
}

# each row consumes about 18 pixels
my $rowHeight=18; # pixels
my $chartHeight=23; # rows

# triple from default width of 480
my $chartWidth = 480 * 3;

=head1 Write the Charts

The default mode is to create a separate chart for each metric
By specifying the command line option –combined-chart the values will be charted in a single chart
Doing so is probably useful only for a limited number of sets of values

Some may question the apparent duplication of code in the sections to combine or not combine the charts
Doing so would be a bit convoluted – this is easier to read and modify

=cut

foreach my $workSheet ( keys %workSheets ) {
print "Charting worksheet: $workSheet\n" if $debug;

my $chartNum = 0;

if ($combinedChart) {
my $chart = $workBook->add_chart( type => ‘line’, name => "Combined" . ‘-‘ . $workSheets{$workSheet}->get_name(), embedded => 1 );
$chart->set_size( width => $chartWidth, height => $chartHeight * $rowHeight);

# each chart consumes about 16 rows
$workSheets{$workSheet}->insert_chart((($chartNum * $chartHeight) + 2),3, $chart);

foreach my $colPos ( @chartColPos ) {
my $col2Chart=$labels[$colPos];
# [ sheet, row_start, row_end, col_start, col_end] $chart->add_series(
name => $col2Chart,
#categories => [$workSheet, 1,$lineCount{$workSheet},2,2],
categories => [$workSheet, 1,$lineCount{$workSheet},$categoryColNum,$categoryColNum],
values => [$workSheet, 1,$lineCount{$workSheet},$colPos,$colPos] );
}

} else {
foreach my $colPos ( @chartColPos ) {
my $col2Chart=$labels[$colPos];
print "\tCharting column: $col2Chart\n" if $debug;
my $chart = $workBook->add_chart( type => ‘line’, name => "$col2Chart" . ‘-‘ . $workSheets{$workSheet}->get_name(), embedded => 1 );
$chart->set_size( width => $chartWidth, height => $chartHeight * $rowHeight);

# each chart consumes about 16 rows
$workSheets{$workSheet}->insert_chart((($chartNum * $chartHeight) + 2),3, $chart);

# [ sheet, row_start, row_end, col_start, col_end] $chart->add_series(
name => $col2Chart,
#categories => [$workSheet, 1,$lineCount{$workSheet},2,2],
categories => [$workSheet, 1,$lineCount{$workSheet},$categoryColNum,$categoryColNum],
values => [$workSheet, 1,$lineCount{$workSheet},$colPos,$colPos] );

$chartNum++;
}
}
}

# write the directory page
my $urlFormat = $workBook->add_format( color => ‘blue’, underline => 1 );
my %sheetNames=();

foreach my $worksheet ( $workBook->sheets() ) {
my $sheetName = $worksheet->get_name();
next if $sheetName eq $directoryName;
$sheetNames{$sheetName} = $worksheet;
}

foreach my $sheetName ( sort keys %sheetNames ) {
$directory->write_url($directoryLineCount++,0, qq{internal:’$sheetName’!A1} ,$urlFormat, $sheetName);
}

__END__

=head1 NAME

dynachart.pl

–help brief help message
–man full documentation
–spreadsheet-file output file name – defaults to asm-metrics.xlsx
–worksheet-col name of column used to segragate data into worksheets
defaults to a single worksheet if not supplied
–chart-cols list of columns to chart

dynachart.pl accepts input from STDIN

This script will read CSV data created by asm-metrics-collector.pl or asm-metrics-aggregator.pl

=head1 SYNOPSIS

dynachart.pl [options] [file …]

Options:
–help brief help message
–man full documentation
–spreadsheet-file output file name – defaults to asm-metrics.xlsx
–worksheet-col name of column used to segragate data into worksheets
defaults to a single worksheet if not supplied
–chart-cols list of columns to chart
–category-col specify the column for the X vector – a timestamp is typically used
the name must exactly match that in the header
–combined-chart create a single chart rather than a chart for each value specified in –chart-cols

dynachart.pl accepts input from STDIN

dynachart.pl –worksheet-col DISKGROUP_NAME < my_input_file.csv

dynachart.pl –spreadsheet-file sar-disk-test.xlsx –combined-chart –worksheet-col DEV –category-col ‘timestamp’ –chart-cols ‘rd_sec/s’ –chart-cols ‘wr_sec/s’ < sar-disk-test.csv

=head1 OPTIONS

=over 8

=item B<-help>

Print a brief help message and exits.

=item B<-man>

Prints the manual page and exits.

=item B<–spreadsheet-file>

The name of the Excel file to create.
The default name is asm-metrics.xlsx

=item B<–worksheet-col>

By default a single worksheet is created.
When this option is used the column supplied as an argument will be used to segragate data into separate worksheets.

=item B<–chart-cols>

List of columns to chart
This should be the last option on the command line if used.

It may be necessary to tell Getopt to stop processing arguments with ‘–‘ in some cases.

eg.

dynachart.pl dynachart.pl –worksheet-col DISKGROUP_NAME –chart-cols READS WRITES — logs/asm-oravm-20150512_01-agg-dg.csv

=item B<–category-col>

Column to use as the category for the X line in the chart – default to the first column
The name must exactly match a column from the CSV file
Typically this line is a timestamp

=back

=head1 DESCRIPTION

B<dynachart.pl> creates an excel file with charts for selected columns>

=head1 EXAMPLES

dynachart.pl accepts data from STDIN

dynachart.pl –worksheet-col DISKGROUP_NAME –spreadsheet-file mywork.xlsx

=cut

[/sourcecode]

 

Creating Multiple Charts with Bash

 

A number of invocations of dynachart.pl can be found in sar-chart.sh as seen in the following example. This allows creating a number of Excel files from sar data for later analysis, all with one command.

 

 

[sourcecode language=”bash” padlinenumbers=”true”] #!/bin/bash

usage() {

cat <<-EOF

usage: $0 destination-directory

example script that charts CSV data that has been generated from sar
Excel XLXS files with charts are produced

applicable to any CSV data

example: sar-chart.sh data-dir

This script is using data generated by asp.sh

EOF

}

while getopts h arg
do
case $arg in
h) usage;exit 0;;
*) usage;exit 1;;
esac
done

destDir=$1

[ -d "$destDir" -a -w "$destDir" ] || {
echo
echo cannot read and/or write directory destDir: $destDir
echo
usage
echo
exit 1
}

### Template
#echo working on sar-
#dynachart.pl –spreadsheet-file ${destDir}/ –worksheet-col hostname –category-col ‘timestamp’
################

# if dynachart.pl is in the path then use it.
# if not then check for local directory copy or link
# otherwise error exit

dynaChart=$(which dynachart.pl)

if [[ -z $dynaChart ]]; then
if [[ -f ./dynachart.pl ]]; then
dynaChart=’./dynachart.pl’
[ -x "$dynaChart" ] || {
echo
echo $dynaChart is not executable
echo
exit 2
}
else
echo
echo "dynachart.pl not found"
echo
exit 1
fi
fi

# default of 1 chart per metric
echo working on sar-disk-default.xlsx
dynachart.pl –spreadsheet-file ${destDir}/sar-disk-default.xlsx –worksheet-col DEV –category-col ‘timestamp’ –chart-cols ‘rd_sec/s’ –chart-cols ‘wr_sec/s’ < sar-disk.csv

# combine metrics into one chart
echo working on sar-disk-combined.xlsx
dynachart.pl –spreadsheet-file ${destDir}/sar-disk-combined.xlsx –combined-chart –worksheet-col DEV –category-col ‘timestamp’ –chart-cols ‘rd_sec/s’ –chart-cols ‘wr_sec/s’ < sar-disk.csv

echo working on sar-network-device.xlsx
dynachart.pl –spreadsheet-file ${destDir}/sar-network-device.xlsx –combined-chart –worksheet-col IFACE –category-col ‘timestamp’ –chart-cols ‘rxkB/s’ –chart-cols ‘txkB/s’ < sar-net-dev.csv

echo working on sar-network-error-device.xlsx
dynachart.pl –spreadsheet-file ${destDir}/sar-network-error-device.xlsx –combined-chart –worksheet-col IFACE –category-col ‘timestamp’ –chart-cols ‘rxerr/s’ –chart-cols ‘txerr/s’ < sar-net-ede.csv

echo working on sar-network-nfs.xlsx
dynachart.pl –spreadsheet-file ${destDir}/sar-network-nfs.xlsx –worksheet-col hostname –category-col ‘timestamp’ –chart-cols ‘call/s’ –chart-cols ‘retrans/’ –chart-cols ‘read/s’ –chart-cols ‘write/s’ –chart-cols ‘access/s’ –chart-cols ‘getatt/s’ < sar-net-nfs.csv

echo working on sar-network-nfsd.xlsx
dynachart.pl –spreadsheet-file ${destDir}/sar-network-nfsd.xlsx –worksheet-col hostname –category-col ‘timestamp’ –chart-cols ‘scall/s’ –chart-cols ‘badcall/s’ –chart-cols ‘packet/s’ –chart-cols ‘udp/s’ –chart-cols ‘tcp/s’ –chart-cols ‘hit/s’ –chart-cols ‘miss/s’ –chart-cols ‘sread/s’ –chart-cols ‘swrite/s’ –chart-cols ‘saccess/s’ –chart-cols ‘sgetatt/s’ < sar-net-nfsd.csv

echo working on sar-network-socket.xlsx
dynachart.pl –spreadsheet-file ${destDir}/sar-network-socket.xlsx –worksheet-col hostname –category-col ‘timestamp’ –chart-cols ‘totsck’ –chart-cols ‘tcpsck’ –chart-cols ‘udpsck’ –chart-cols ‘rawsck’ –chart-cols ‘ip-frag’ –chart-cols ‘tcp-tw’ < sar-net-sock.csv

echo working on sar-context.xlsx
dynachart.pl –spreadsheet-file ${destDir}/sar-context.xlsx –worksheet-col hostname –category-col ‘timestamp’ –chart-cols ‘proc/s’ –chart-cols ‘cswch/s’ < sar-context.csv

echo working on sar-cpu.xlsx
# extracted with -u ALL, so all CPU on one line
dynachart.pl –spreadsheet-file ${destDir}/sar-cpu.xlsx –worksheet-col hostname –category-col ‘timestamp’ –chart-cols ‘%usr’ –chart-cols ‘%nice’ –chart-cols ‘%sys’ –chart-cols ‘%iowait’ –chart-cols ‘%steal’ –chart-cols ‘%irq’ –chart-cols ‘%soft’ –chart-cols ‘%guest’ –chart-cols ‘%idle’ < sar-cpu.csv

echo working on sar-io-default.xlsx
dynachart.pl –spreadsheet-file ${destDir}/sar-io-default.xlsx –worksheet-col hostname –category-col ‘timestamp’ –chart-cols ‘tps’ –chart-cols ‘rtps’ –chart-cols ‘wtps’ –chart-cols ‘bread/s’ –chart-cols ‘bwrtn/s’ < sar-io.csv

echo working on sar-io-tps-combined.xlsx
dynachart.pl –spreadsheet-file ${destDir}/sar-io-tps-combined.xlsx –combined-chart –worksheet-col hostname –category-col ‘timestamp’ –chart-cols ‘tps’ –chart-cols ‘rtps’ –chart-cols ‘wtps’ < sar-io.csv

echo working on sar-io-blks-per-second-combined.xlsx
dynachart.pl –spreadsheet-file ${destDir}/sar-io-blks-per-second-combined.xlsx –combined-chart –worksheet-col hostname –category-col ‘timestamp’ –chart-cols ‘bread/s’ –chart-cols ‘bwrtn/s’ < sar-io.csv

echo working on sar-load-runq-threads.xlsx
dynachart.pl –spreadsheet-file ${destDir}/sar-load-runq-threads.xlsx –combined-chart –worksheet-col hostname –category-col ‘timestamp’ –chart-cols ‘runq-sz’ –chart-cols ‘plist-sz’ –chart-cols ‘ldavg-1’ –chart-cols ‘ldavg-5’ –chart-cols ‘ldavg-15’ < sar-load.csv

echo working on sar-load-runq.xlsx
dynachart.pl –spreadsheet-file ${destDir}/sar-load-runq.xlsx –combined-chart –worksheet-col hostname –category-col ‘timestamp’ –chart-cols ‘runq-sz’ –chart-cols ‘ldavg-1’ –chart-cols ‘ldavg-5’ –chart-cols ‘ldavg-15’ < sar-load.csv

echo working on sar-memory.xlsx
dynachart.pl –spreadsheet-file ${destDir}/sar-memory.xlsx –combined-chart –worksheet-col hostname –category-col ‘timestamp’ –chart-cols ‘frmpg/s’ –chart-cols ‘bufpg/s’ < sar-mem.csv

echo working on sar-paging-rate.xlsx
dynachart.pl –spreadsheet-file ${destDir}/sar-paging-rate.xlsx –combined-chart –worksheet-col hostname –category-col ‘timestamp’ –chart-cols ‘pgpgin/s’ –chart-cols ‘pgpgout/s’ < sar-paging.csv

echo working on sar-swap-rate.xlsx
dynachart.pl –spreadsheet-file ${destDir}/sar-swap-rate.xlsx –combined-chart –worksheet-col hostname –category-col ‘timestamp’ –chart-cols ‘pswpin/s’ –chart-cols ‘pswpout/s’ < sar-swap-stats.csv

[/sourcecode]

Filter, rename and aggregate the data

Perhaps you want to see a subset of some data. Let’s say you wish too see the data for only a select group of devices; those making up the DATA diskgroup in an Oracle database.  In addition you would like the sar report to use names that match those seen by ASM. Rather than /dev/mapper/mp1, mp2,…, you would like to see DATA00, DATA01,…

In addition you would like to see the IOPS, read rate and write rate data for a set of disks aggregated per timestamp.

We can do that. The script remap.sh has an example (real life BTW) of how to map the ASM disk names to the linux device names, and create a new CSV file.

This query was used to map device names to disk names:

 

[sourcecode language=”sql”] select
‘dm-name-‘ || substr(d.path,instr(d.path,’/’,-1)+1,length(d.path) – instr(d.path,’/’,-1)) path
, g.name || lpad(d.disk_number,2,’0′) name
from v$asm_disk d
join v$asm_diskgroup g on g.group_number = d.group_number
order by 2

15:10:17 SYSDBA> /

PATH NAME
—————————— ——————–
dm-name-mpathmp1 DATA00
dm-name-mpathnp1 DATA01
dm-name-mpathop1 DATA02
dm-name-mpathqp1 DATA03
dm-name-mpathrp1 DATA04
dm-name-mpathsp1 DATA05
dm-name-mpathtp1 DATA06
dm-name-mpathvp1 DATA07
dm-name-mpathwp1 DATA08
dm-name-mpathcp1 FRA00

10 rows selected.
[/sourcecode]

 

This information was used to then create a sed command to rewrite the CSV data as needed.

 

[sourcecode language=”bash”]

originalFile=sar-disk-original.csv
newFile=sar-disk.csv

sed \
-e ‘s/dm-name-mpathmp1/DATA00/g’ \
-e ‘s/dm-name-mpathnp1/DATA01/g’ \
-e ‘s/dm-name-mpathop1/DATA02/g’ \
-e ‘s/dm-name-mpathqp1/DATA03/g’ \
-e ‘s/dm-name-mpathrp1/DATA04/g’ \
-e ‘s/dm-name-mpathsp1/DATA05/g’ \
-e ‘s/dm-name-mpathtp1/DATA06/g’ \
-e ‘s/dm-name-mpathvp1/DATA07/g’ \
-e ‘s/dm-name-mpathwp1/DATA08/g’ \
-e ‘s/dm-name-mpathcp1/FRA00/g’ \
< $originalFile > $newFile
[/sourcecode]

Following that csv-aggregator.sh and csv-aggregator.pl were used to filter and aggregate the data.  The result is a much smaller CSV file with data only for the DATA diskgroup, with the disk data rolled up to the disk group level.

Rather than post several hundred more lines of Perl here, I will recommend you to follow the links if you want to see how this aggregation and filtering is done. The command line though is shown here:

 

[sourcecode language=”bash”]

csvFile=sar-csv/sar-disk-test.csv

./csv-aggregator.pl –filter-cols DEV –filter-vals ‘DATA..’ –key-cols hostname –key-cols timestamp –grouping-cols DEV –agg-cols tps –agg-cols ‘rd_sec/s’ –agg-cols ‘wr_sec/s’ –agg-cols ‘avgrq-sz’ –agg-cols ‘avgqu-sz’ < $csvFile

[/sourcecode]

 

This is a chart taken directly from the Excel file created from this CSV data via dynachart.pl.

 

image

 

 

 

Installing Excel::Writer::XLSX

So that the dynachart.pl script will work, the Excel::Writer::XLSX Perl module must be installed.

There are several possible methods by which this may be done, but if you are not already fairly familiar with the Perl environment some of the methods are a little daunting.  As Oracle by default always installs Perl in ORACLE_HOME, you can make use of that existing installation.  These instructions will install the Excel::Writer::XLSX module and it’s dependencies directly into the Oracle Perl installation. Because of that, you would want to do this in a test environment, such as Virtual Machine running on your laptop.  It isn’t too likely that this would break something in the Oracle home, but then again, why take the chance?

An even better option would be to install your own Perl, and not worry about affecting any other software. This is not too difficult to do via Perlbrew. Check out https://perlbrew.pl/ for the simple instructions.

The following example installs Excel::Writer::XLSX in Oracle’s copy of Perl on a test server.  For my own use, I do use Perlbrew, but am just including the following as an example

 

[sourcecode language=”bash” padlinenumbers=”true”] [oracle@oravm01 ~]$ $ORACLE_HOME/perl/bin/perl -MCPAN -e shell

Would you like me to configure as much as possible automatically? [yes] yes

[/sourcecode]

These next commands tell CPAN to automatically answer appropriately rather than prompt you to continually type ‘yes’.
This lasts the duration of this CPAN session.  Following that is the command to install Excel::Writer::XLSX.

 

[sourcecode language=”bash”] cpan> o conf prerequisites_policy ‘follow’
cpan> o conf build_requires_install_policy yes

cpan> install Excel::Writer::XLSX

cpan> exit
Terminal does not support GetHistory.
Lockfile removed.

[/sourcecode]

 

Now test the installation.  This command simply tells Perl the we require version 99 of the module. The failure is normal, as only version 0.95 is installed.  A different message would appear if no version of the module was installed.

[sourcecode language=”bash”] [oracle@oravm01 ~]$ $ORACLE_HOME/perl/bin/perl -MExcel::Writer::XLSX=99
Excel::Writer::XLSX version 99 required–this is only version 0.95 at /u01/app/oracle/product/11.2.0/db_1/perl/lib/5.10.0/Exporter/Heavy.pm line 122.
BEGIN failed–compilation aborted.
[/sourcecode]
email

Interested in working with Jared? Schedule a tech call.

About the Author

Oracle experience: started with Oracle 7.0.13 Programming Experience: Perl, PL/SQL, Shell, SQL Also some other odds and ends that are no longer useful Systems: Networking, Storage, OS to varying degrees. Have fond memories of DG/UX

2 Comments. Leave new

Nice work. But kSAR can visualise sar data using raw sar traces.

Reply

Thank you for your comment. One of the reasons for not using kSAR is that kSAR does not work with newer formats of sar data.
Another reason for not using kSAR is that it does not have the flexibility I desired.
For instance: I don’t believe kSAR has the ability to combine all disks that make up an ASM diskgroup into a single entity.
So yes, I did first try kSAR but concluded it was no longer useful.

Reply

Leave a Reply

Your email address will not be published. Required fields are marked *