Improve Oracle Performance with ASM Filter Driver

Posted in: Oracle, Technical Track

Sometimes the results found in performance testing can be a bit of a surprise. I have recently been preparing for an upcoming presentation on Oracle ASM. As part of the preparation it seemed a good idea to become more familiar with ASM Filter Driver and learn a little about how it works.

ASM Filter Driver was introduced in Oracle 12.1 as an alternative to ASMLib. The Oracle documentation (About ASM Filter Driver) lists the following expected benefits:

– Oracle ASM Filter Driver simplifies the configuration and management of disk devices by eliminating the need to rebind disk devices used with Oracle ASM each time the system is restarted.

– Oracle ASM Filter Driver rejects any I/O requests that are invalid. This action eliminates accidental overwrites of Oracle ASM disks that would cause corruption in the disks and files within the disk group.

The testing for this presentation is being done with Oracle Grid Infrastructure 12.2 and Oracle RDBMS Enterprise Edition 12.2. The platform is Oracle Enterprise Linux 6.9.

Initially the testing was limited to Virtual Machines via Oracle VirtualBox.

ASM was to be compared when disks were physically managed by each of the following in turn:

– udev
– ASMLib
– ASM Filter Driver

For each test, two disk groups were created: DATA and ALIGNED. The DATA disk group held the database installation, while ALIGNED was strictly for testing.

The ALIGNED disk group used normal redundancy across two disks.

Oracle memory was constrained so as to force Oracle to do as much physical IO as possible. The following settings were applied and the instance restarted:

ALTER SYSTEM SET parallel_max_servers = 2 SCOPE=BOTH;
ALTER SYSTEM SET db_cache_size = 64M SCOPE=BOTH;
ALTER SYSTEM SET recyclebin = 'off' SCOPE=SPFILE;
ALTER SYSTEM SET sga_max_size = 512M SCOPE=SPFILE;
ALTER SYSTEM SET sga_target = 512M SCOPE=SPFILE;
ALTER SYSTEM SET job_queue_processes = 0 SCOPE=BOTH;

Each test database was configured as an Oracle 12.2 CDB with a single PDB.

The testing was done with SLOB, a well-known Oracle IO performance testing tool.

The contents of slob.conf for the VirtualBox tests:

UPDATE_PCT=50
SCAN_PCT=0
RUN_TIME=600
SCALE=512M
SCAN_TABLE_SZ=1M
WORK_UNIT=32
REDO_STRESS=LITE
LOAD_PARALLEL_DEGREE=2
AWR_SRC=AWR_PDB
THREADS_PER_SCHEMA=4
DATABASE_STATISTICS_TYPE=awr   # Permitted VALUES: [statspack|awr]
DBA_PRIV_USER="system"
SYSDBA_PASSWD="XXX"
EXTERNAL_SCRIPT=''
DO_HOTSPOT=FALSE
HOTSPOT_MB=8
HOTSPOT_OFFSET_MB=16
HOTSPOT_FREQUENCY=3
HOT_SCHEMA_FREQUENCY=0
THINK_TM_FREQUENCY=0
THINK_TM_MIN=.1
THINK_TM_MAX=.5

Several tests were run in each ASM configuration with increasing numbers of sessions; each test ran for 10 minutes.

For this article I will consider only one set of results per configuration, as the results were consistent between different numbers of sessions.

Let’s first consider the results for the udev testing. As there is nothing to compare them to yet, there is little to discuss.

ASM udev AWR Report

Host Name        Platform                         CPUs Cores Sockets Memory(GB)
---------------- -------------------------------- ---- ----- ------- ----------
ora12cR2-asm.jks Linux x86 64-bit                    2     2       1       7.80
              Snap Id      Snap Time      Sessions Curs/Sess
            --------- ------------------- -------- ---------
Begin Snap:         5 08-Jan-18 11:34:16         9       1.4
  End Snap:         6 08-Jan-18 11:44:19         1       4.0
   Elapsed:               10.04 (mins)
   DB Time:               80.06 (mins)
Load Profile                    Per Second   Per Transaction  Per Exec  Per Call
~~~~~~~~~~~~~~~            ---------------   --------------- --------- ---------
             DB Time(s):               8.0               0.1      0.03     50.03
              DB CPU(s):               0.8               0.0      0.00      5.20
      Redo size (bytes):       1,305,547.7           9,560.6
  Logical read (blocks):          10,029.3              73.5
          Block changes:           8,801.6              64.5
 Physical read (blocks):           7,656.6              56.1
Physical write (blocks):               0.0               0.0
       Read IO requests:           7,656.5              56.1
      Write IO requests:               0.0               0.0
           Read IO (MB):              59.8               0.4
         Executes (SQL):             300.1               2.2
           Transactions:             136.6
Top 10 Foreground Events by Total Wait Time
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                           Total Wait       Avg   % DB Wait
Event                                Waits Time (sec)      Wait   time Class
------------------------------ ----------- ---------- --------- ------ --------
db file sequential read          2,398,817     3925.9    1.64ms   81.7 User I/O
DB CPU                                          499.1             10.4
db file parallel read               63,404        237    3.74ms    4.9 User I/O
free buffer waits                    4,834       32.4    6.70ms     .7 Configur
enq: TX - row lock contention          140        4.3   31.03ms     .1 Applicat
write complete waits                    12        2.7  227.81ms     .1 Configur
latch: cache buffers lru chain       2,597        2.6    1.00ms     .1 Other
latch: cache buffers chains            643        1.1    1.67ms     .0 Concurre
latch: object queue header ope       1,080         .9  850.18us     .0 Other
log file switch (private stran          27         .9   32.79ms     .0 Configur
Wait Classes by Total Wait Time
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                        Total Wait       Wait   % DB   Active
Wait Class                  Waits       Time (sec)       Time   time Sessions
---------------- ---------------- ---------------- ---------- ------ --------
User I/O                2,462,948            4,164     1.69ms   86.7      6.9
DB CPU                                         499              10.4      0.8
Configuration               4,886               36     7.38ms     .8      0.1
Application                   140                4    31.03ms     .1      0.0
Other                       4,525                4   933.36us     .1      0.0
Concurrency                 1,041                3     3.10ms     .1      0.0
System I/O                    234                0     1.20ms     .0      0.0
Commit                          5                0     1.29ms     .0      0.0
Network                        63                0    87.22us     .0      0.0
IO Profile                  Read+Write/Second     Read/Second    Write/Second
~~~~~~~~~~                  ----------------- --------------- ---------------
         Database Requests:           7,656.5         7,656.5             0.0
        Optimized Requests:               0.0             0.0             0.0
             Redo Requests:
                Total (MB):              59.8            59.8             0.0
             Database (MB):              59.8            59.8             0.0
      Optimized Total (MB):               0.0             0.0             0.0
                 Redo (MB):
         Database (blocks):           7,656.6         7,656.6             0.0
 Via Buffer Cache (blocks):           7,656.6         7,656.6             0.0
           Direct (blocks):               0.0             0.0             0.0

Next up was testing with ASMLib. Take a look at the IO metrics as compared to udev.

ASMLib performance is somewhat better than with udev.

Redo size per second is a good indicator of how much DML work was done.

With udev this was 1,305,547.7 bytes per second.

With ASMLib that value jumps to 1,377,287.6 bytes per second. This is approximately a 5% performance improvement.
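That figure is easy to verify from the two load profiles; here is a quick sketch in Python, with the values copied from the AWR reports above:

```python
# Redo bytes/sec from the AWR load profiles (udev vs ASMLib).
redo_udev = 1_305_547.7
redo_asmlib = 1_377_287.6

# Relative improvement of ASMLib over udev.
improvement_pct = (redo_asmlib - redo_udev) / redo_udev * 100
print(f"ASMLib redo improvement over udev: {improvement_pct:.1f}%")  # ~5.5%
```

This works out to about 5.5%, consistent with the rough 5% figure.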

The book Database Cloud Storage: The Essential Guide to Oracle Automatic Storage Management claims that one of the goals of ASMLib was to improve IO performance compared to standard Linux means (filesystem, udev), and that seems to be the case, at least for DML.

The performance advantage in this case does not seem to be due to disk access speeds, as those are quite close when comparing udev to ASMLib.

When comparing transactions per second, however, we can see that ASMLib managed 144.5 per second while udev managed only 136.6 per second.

This is seen also in the ‘Block Changes’ metrics.

The ‘Read IO Requests’ and ‘Physical Read’ metrics, though, tell a different story: udev consistently outperformed ASMLib in this area.

The slight performance advantage of ASMLib held steady over many tests with varying loads. This SLOB test was done with the update percentage set to 50%; the comparison might favor udev with less DML.

ASM ASMLib AWR Report

Host Name        Platform                         CPUs Cores Sockets Memory(GB)
---------------- -------------------------------- ---- ----- ------- ----------
ora12cR2-asm.jks Linux x86 64-bit                    2     2       1       7.80
              Snap Id      Snap Time      Sessions Curs/Sess
            --------- ------------------- -------- ---------
Begin Snap:         1 07-Jan-18 16:10:14         9       1.3
  End Snap:         2 07-Jan-18 16:20:18         2       5.5
   Elapsed:               10.07 (mins)
   DB Time:               80.15 (mins)
Load Profile                    Per Second   Per Transaction  Per Exec  Per Call
~~~~~~~~~~~~~~~            ---------------   --------------- --------- ---------
             DB Time(s):               8.0               0.1      0.02     43.32
              DB CPU(s):               0.9               0.0      0.00      4.67
      Redo size (bytes):       1,377,287.6           9,528.5
  Logical read (blocks):          10,633.8              73.6
          Block changes:           9,323.1              64.5
 Physical read (blocks):           7,239.6              50.1
Physical write (blocks):               0.0               0.0
       Read IO requests:           7,235.5              50.1
      Write IO requests:               0.0               0.0
           Read IO (MB):              56.6               0.4
         Executes (SQL):             323.2               2.2
           Transactions:             144.5
Top 10 Foreground Events by Total Wait Time
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                           Total Wait       Avg   % DB Wait
Event                                Waits Time (sec)      Wait   time Class
------------------------------ ----------- ---------- --------- ------ --------
db file sequential read          2,271,558     3835.6    1.69ms   79.8 User I/O
DB CPU                                          517.9             10.8
db file parallel read               64,721      250.1    3.86ms    5.2 User I/O
free buffer waits                    4,430         26    5.87ms     .5 Configur
write complete waits                    17          4  235.56ms     .1 Configur
latch: cache buffers lru chain       2,837        3.2    1.12ms     .1 Other
enq: TX - row lock contention          138        2.7   19.82ms     .1 Applicat
Wait Classes by Total Wait Time
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                        Total Wait       Wait   % DB   Active
Wait Class                  Waits       Time (sec)       Time   time Sessions
---------------- ---------------- ---------------- ---------- ------ --------
User I/O                2,337,385            4,089     1.75ms   85.0      6.8
DB CPU                                         518              10.8      0.9
Configuration               4,615               38     8.28ms     .8      0.1
Other                       5,414                8     1.39ms     .2      0.0
Concurrency                 1,109                7     6.36ms     .1      0.0
Application                   138                3    19.82ms     .1      0.0
System I/O                  1,045                2     1.51ms     .0      0.0
Commit                          5                0     1.09ms     .0      0.0
Network                        79                0    22.23us     .0      0.0
IO Profile                  Read+Write/Second     Read/Second    Write/Second
~~~~~~~~~~                  ----------------- --------------- ---------------
         Database Requests:           7,235.5         7,235.5             0.0
        Optimized Requests:               0.0             0.0             0.0
             Redo Requests:
                Total (MB):              56.9            56.6             0.3
             Database (MB):              56.6            56.6             0.0
      Optimized Total (MB):               0.0             0.0             0.0
                 Redo (MB):
         Database (blocks):           7,239.6         7,239.6             0.0
 Via Buffer Cache (blocks):           7,237.6         7,237.6             0.0
           Direct (blocks):               2.0             2.0             0.0

Next up is the ASMFD test. This is where it gets interesting.

ASM ASMFD AWR Report

Host Name        Platform                         CPUs Cores Sockets Memory(GB)
---------------- -------------------------------- ---- ----- ------- ----------
ora12cR2-asm.jks Linux x86 64-bit                    2     2       1       7.80
              Snap Id      Snap Time      Sessions Curs/Sess
            --------- ------------------- -------- ---------
Begin Snap:         1 06-Jan-18 12:57:49         9       1.3
  End Snap:         2 06-Jan-18 13:07:52         1       6.0
   Elapsed:               10.05 (mins)
   DB Time:               80.14 (mins)
Load Profile                    Per Second   Per Transaction  Per Exec  Per Call
~~~~~~~~~~~~~~~            ---------------   --------------- --------- ---------
             DB Time(s):               8.0               0.1      0.02     42.55
              DB CPU(s):               1.0               0.0      0.00      5.32
      Redo size (bytes):       1,556,388.0           9,427.8
  Logical read (blocks):          12,134.8              73.5
          Block changes:          10,635.6              64.4
 Physical read (blocks):           8,373.7              50.7
Physical write (blocks):               0.0               0.0
       Read IO requests:           8,371.7              50.7
      Write IO requests:               0.0               0.0
           Read IO (MB):              65.4               0.4
         Executes (SQL):             374.6               2.3
           Transactions:             165.1
Top 10 Foreground Events by Total Wait Time
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                           Total Wait       Avg   % DB Wait
Event                                Waits Time (sec)      Wait   time Class
------------------------------ ----------- ---------- --------- ------ --------
db file sequential read          2,756,045     2331.5  845.96us   48.5 User I/O
free buffer waits                  107,173     1016.7    9.49ms   21.1 Configur
DB CPU                                          600.9             12.5
write complete waits                   415      364.4  877.97ms    7.6 Configur
db file parallel read               42,696       70.3    1.65ms    1.5 User I/O
enq: TX - row lock contention          146       39.2  268.23ms     .8 Applicat
latch: cache buffers lru chain       7,307       10.8    1.48ms     .2 Other
Wait Classes by Total Wait Time
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                        Total Wait       Wait   % DB   Active
Wait Class                  Waits       Time (sec)       Time   time Sessions
---------------- ---------------- ---------------- ---------- ------ --------
User I/O                2,800,364            2,405   858.87us   50.0      4.0
Configuration             108,005            1,431    13.25ms   29.8      2.4
DB CPU                                         601              12.5      1.0
Other                      13,260               44     3.29ms     .9      0.1
Application                   147               40   272.08ms     .8      0.1
Concurrency                 1,750               11     6.35ms     .2      0.0
System I/O                  1,098                2     1.39ms     .0      0.0
Commit                          4                0   557.50us     .0      0.0
Network                        79                0    25.48us     .0      0.0
IO Profile                  Read+Write/Second     Read/Second    Write/Second
~~~~~~~~~~                  ----------------- --------------- ---------------
         Database Requests:           8,371.7         8,371.7             0.0
        Optimized Requests:               0.0             0.0             0.0
             Redo Requests:
                Total (MB):              65.8            65.4             0.4
             Database (MB):              65.4            65.4             0.0
      Optimized Total (MB):               0.0             0.0             0.0
                 Redo (MB):
         Database (blocks):           8,373.7         8,373.7             0.0
 Via Buffer Cache (blocks):           8,371.7         8,371.7             0.0
           Direct (blocks):               2.0             2.0             0.0

Following is a comparison of some key IO metrics. The IO performance shown when using ASM Filter Driver is significantly better than with ASMLib.

Metric                             ASMLib        ASMFD
------------------------------ ------------ ------------
Redo Size                       1,377,287.6  1,556,388.0
Logical read (blocks)              10,633.8     12,134.8
Block changes                       9,323.1     10,635.6
Physical Read (blocks)              7,239.6      8,373.7
Read IO requests                    7,235.5      8,371.7
Read IO (MB)                           56.6         65.4
Executes (SQL)                        323.2        374.6
Transactions                          144.5        165.1
db file sequential read time         1.69ms     845.96us
db file parallel read time           3.86ms       1.65ms
User IO Avg Wait time                1.75ms     858.87us
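The size of the gaps in the table can be computed directly. A quick sketch in Python, with the per-second values copied from the two AWR reports:

```python
# Per-second metrics from the two VirtualBox AWR reports (ASMLib, ASMFD).
metrics = {
    "Redo size (bytes)": (1_377_287.6, 1_556_388.0),
    "Read IO requests":  (7_235.5, 8_371.7),
    "Transactions":      (144.5, 165.1),
}

for name, (asmlib, asmfd) in metrics.items():
    pct = (asmfd - asmlib) / asmlib * 100
    print(f"{name}: ASMFD ahead by {pct:.1f}%")

# Latency tells the same story: 1.69ms vs 845.96us for
# 'db file sequential read' is roughly a 2x speedup.
seq_read_speedup = 1.69 / 0.84596
print(f"db file sequential read speedup: {seq_read_speedup:.1f}x")
```

ASMFD comes out ahead by roughly 13-16% on the throughput metrics, with single-block read latency cut nearly in half.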

Thinking the results might be anomalous, I rebuilt each of the VirtualBox environments and retested. The results were the same.

Even so, I wanted to see this running on something other than my laptop to see if the results would hold. Using a spare PC in my office, I installed Oracle Linux 6.9 and then the same versions of Oracle software as used in the VBox environments.

The Oracle configuration was slightly different due to the disks involved; they were no longer speedy SSDs, but old, slow spinning hard drives.

No matter: the tests could still be performed; there would just be less IO.

First testing ASMFD and then ASMLib, the results showed that ASMFD still had a clear advantage of about 30% better IO performance when compared to ASMLib.

This is certainly not a result I was expecting (i.e., could these be correct?). So I decided to scale the testing up a bit and build a system that looked more like something used in a production environment.

The new test environment was built on Amazon AWS using the following specifications:

– EC2 m4.4xlarge: 16 vCPUs and 64G of RAM
– 10 x 20G EBS volumes for Oracle using default SSD provisioning
– the same Oracle Linux, Grid and RDBMS versions

This time the size of the data would be increased as well as the number of sessions, and the tests would be run for 60 minutes each.

Here’s the slob.conf for the AWS tests:

UPDATE_PCT=30
SCAN_PCT=20
RUN_TIME=3600
SCALE=8192M
SCAN_TABLE_SZ=1M
WORK_UNIT=32
REDO_STRESS=LITE
LOAD_PARALLEL_DEGREE=2
AWR_SRC=AWR_PDB
THREADS_PER_SCHEMA=4
DATABASE_STATISTICS_TYPE=awr   # Permitted VALUES: [statspack|awr]
DBA_PRIV_USER="system"
SYSDBA_PASSWD="grok"
EXTERNAL_SCRIPT=''
DO_HOTSPOT=FALSE
HOTSPOT_MB=8
HOTSPOT_OFFSET_MB=16
HOTSPOT_FREQUENCY=3
HOT_SCHEMA_FREQUENCY=0
THINK_TM_FREQUENCY=0
THINK_TM_MIN=.1
THINK_TM_MAX=.5

The testing was changed a bit with UPDATE_PCT reduced from 50 to 30, and SCAN_PCT increased from 0 to 20.

The number of SLOB schemas was set at 4, and three tests were performed using 16, 32 and 48 sessions.

This article will consider the AWR reports generated for 16 sessions.

The reports for 32 and 48 sessions show similar results in terms of performance comparisons.

First up is the ASMLib report.

Please note the following discrepancy: the number of sessions at Begin Snap time is 113.

This is most likely due to the test starting immediately after the parallel data load, as only 16 sessions were started for the SLOB test. If you look at the average active sessions for User I/O, you will see it is 13.3.

Performance looks reasonably good with 4.71ms average for ‘db file sequential read’ and 28.62ms for ‘db file parallel read’.
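That 13.3 figure can be reproduced from the report itself: average active sessions is simply accumulated time divided by elapsed wall-clock time. A quick check in Python, using values from the AWR report below:

```python
# Average active sessions = time accumulated / wall-clock elapsed.
user_io_wait_sec = 48_095        # User I/O total wait time (sec) from the report
elapsed_min = 60.10              # snapshot interval (minutes)

aas_user_io = user_io_wait_sec / (elapsed_min * 60)
print(f"Avg active sessions in User I/O: {aas_user_io:.1f}")  # ~13.3

# The same arithmetic applied to DB Time gives the overall load:
db_time_min = 1_153.40
print(f"Avg active sessions overall: {db_time_min / elapsed_min:.1f}")  # ~19.2
```

The 19.2 overall figure matches the DB Time(s) per-second value in the load profile, as expected.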

EC2: ASM ASMLib AWR Report

Host Name        Platform                         CPUs Cores Sockets Memory(GB)
---------------- -------------------------------- ---- ----- ------- ----------
              Snap Id      Snap Time      Sessions Curs/Sess
            --------- ------------------- -------- ---------
Begin Snap:         6 16-Jan-18 16:51:35       113        .8
  End Snap:         7 16-Jan-18 17:51:40         1       4.0
   Elapsed:               60.10 (mins)
   DB Time:            1,153.40 (mins)
Load Profile                    Per Second   Per Transaction  Per Exec  Per Call
~~~~~~~~~~~~~~~            ---------------   --------------- --------- ---------
             DB Time(s):              19.2               0.3      0.09     66.93
              DB CPU(s):               0.3               0.0      0.00      0.87
      Redo size (bytes):         919,510.5          16,502.9
  Logical read (blocks):          10,875.3             195.2
          Block changes:           3,963.8              71.1
 Physical read (blocks):           4,804.2              86.2
Physical write (blocks):               0.0               0.0
       Read IO requests:           4,804.1              86.2
      Write IO requests:               0.0               0.0
           Read IO (MB):              37.5               0.7
         Executes (SQL):             214.2               3.8
           Transactions:              55.7
Top 10 Foreground Events by Total Wait Time
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                           Total Wait       Avg   % DB Wait
Event                                Waits Time (sec)      Wait   time Class
------------------------------ ----------- ---------- --------- ------ --------
db file sequential read          7,059,935      33.3K    4.71ms   48.1 User I/O
db file parallel read              518,432      14.8K   28.62ms   21.4 User I/O
free buffer waits                1,128,101        11K    9.76ms   15.9 Configur
write complete waits                   546     2049.9 3754.38ms    3.0 Configur
DB CPU                                          895.6              1.3
Wait Classes by Total Wait Time
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                        Total Wait       Wait   % DB   Active
Wait Class                  Waits       Time (sec)       Time   time Sessions
---------------- ---------------- ---------------- ---------- ------ --------
User I/O                7,583,176           48,095     6.34ms   69.5     13.3
Configuration           1,129,536           13,120    11.62ms   19.0      3.6
DB CPU                                         896               1.3      0.2
Concurrency                12,641              636    50.32ms     .9      0.2
Other                     131,861              196     1.49ms     .3      0.1
Application                    25                1    58.61ms     .0      0.0
System I/O                  3,616                1   374.80us     .0      0.0
Commit                         62                0     1.75ms     .0      0.0
Network                       686                0     1.96us     .0      0.0
IO Profile                  Read+Write/Second     Read/Second    Write/Second
~~~~~~~~~~                  ----------------- --------------- ---------------
         Database Requests:           4,804.1         4,804.1             0.0
        Optimized Requests:               0.0             0.0             0.0
             Redo Requests:
                Total (MB):              37.7            37.5             0.2
             Database (MB):              37.5            37.5             0.0
      Optimized Total (MB):               0.0             0.0             0.0
                 Redo (MB):
         Database (blocks):           4,804.2         4,804.2             0.0
 Via Buffer Cache (blocks):           4,804.2         4,804.2             0.0
           Direct (blocks):               0.0             0.0             0.0

The system was reconfigured to use ASM Filter Driver, and following are the results for the same SLOB tests.

The performance advantage for ASM Filter Driver is holding true, though not quite as dramatic as that seen in earlier tests.

The ‘db file sequential read’ metric has improved from 4.71ms to 3.86ms, a 22% performance improvement over ASMLib.

Similarly, ‘db file parallel read’ has improved from 28.62ms to 23.57ms, a 21% improvement over ASMLib.

EC2: ASM ASMFD AWR Report

Host Name        Platform                         CPUs Cores Sockets Memory(GB)
---------------- -------------------------------- ---- ----- ------- ----------
              Snap Id      Snap Time      Sessions Curs/Sess
            --------- ------------------- -------- ---------
Begin Snap:         3 16-Jan-18 00:07:59        18       1.2
  End Snap:         4 16-Jan-18 01:08:40         2       5.0
   Elapsed:               60.68 (mins)
   DB Time:              960.87 (mins)
Load Profile                    Per Second   Per Transaction  Per Exec  Per Call
~~~~~~~~~~~~~~~            ---------------   --------------- --------- ---------
             DB Time(s):              15.8               0.2      0.07    245.33
              DB CPU(s):               0.3               0.0      0.00      4.22
      Redo size (bytes):       1,016,316.4          14,801.0
  Logical read (blocks):          11,508.6             167.6
          Block changes:           4,442.6              64.7
 Physical read (blocks):           5,889.0              85.8
Physical write (blocks):               0.0               0.0
       Read IO requests:           5,888.9              85.8
      Write IO requests:               0.0               0.0
           Read IO (MB):              46.0               0.7
         Executes (SQL):             231.5               3.4
           Transactions:              68.7
Top 10 Foreground Events by Total Wait Time
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                           Total Wait       Avg   % DB Wait
Event                                Waits Time (sec)      Wait   time Class
------------------------------ ----------- ---------- --------- ------ --------
db file sequential read          8,483,787      32.8K    3.86ms   56.9 User I/O
db file parallel read              558,673      13.2K   23.57ms   22.8 User I/O
write complete waits                    76     2291.6   30.15 s    4.0 Configur
free buffer waits                   91,629     1978.3   21.59ms    3.4 Configur
DB CPU                                          992.1              1.7
log file switch (checkpoint in         256      129.9  507.48ms     .2 Configur
Wait Classes by Total Wait Time
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                        Total Wait       Wait   % DB   Active
Wait Class                  Waits       Time (sec)       Time   time Sessions
---------------- ---------------- ---------------- ---------- ------ --------
User I/O                9,046,694           45,955     5.08ms   79.7     12.6
Configuration              92,917            4,603    49.54ms    8.0      1.3
DB CPU                                         992               1.7      0.3
Concurrency                 1,723              243   141.01ms     .4      0.1
Other                      12,198               91     7.49ms     .2      0.0
System I/O                  3,518                1   370.00us     .0      0.0
Application                    23                1    46.94ms     .0      0.0
Commit                          8                0   602.25us     .0      0.0
Network                       163                0     1.41us     .0      0.0
IO Profile                  Read+Write/Second     Read/Second    Write/Second
~~~~~~~~~~                  ----------------- --------------- ---------------
         Database Requests:           5,888.9         5,888.9             0.0
        Optimized Requests:               0.0             0.0             0.0
             Redo Requests:
                Total (MB):              46.2            46.0             0.2
             Database (MB):              46.0            46.0             0.0
      Optimized Total (MB):               0.0             0.0             0.0
                 Redo (MB):
         Database (blocks):           5,889.0         5,889.0             0.0
 Via Buffer Cache (blocks):           5,889.0         5,889.0             0.0
           Direct (blocks):               0.0             0.0             0.0

Let’s take a look at the same table of metrics seen previously:

ASMLib vs ASMFD IO Metrics

Metric                             ASMLib        ASMFD
------------------------------ ------------ ------------
Redo Size                         919,510.5  1,016,316.4
Logical read (blocks)              10,875.3     11,508.6
Block changes                       3,963.8      4,442.6
Physical Read (blocks)              4,804.2      5,889.0
Read IO requests                    4,804.1      5,888.9
Read IO (MB)                           37.5         46.0
Executes (SQL)                        214.2        231.5
Transactions                           55.7         68.7
db file sequential read time         4.71ms       3.86ms
db file parallel read time          28.62ms      23.57ms
User IO Avg Wait time                6.34ms       5.08ms
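The improvement percentages quoted earlier can be checked against this table. They are computed as how much faster ASMFD services the same call; a quick sketch in Python:

```python
# Average wait times (ms) from the two EC2 AWR reports (ASMLib, ASMFD).
waits = {
    "db file sequential read": (4.71, 3.86),
    "db file parallel read":   (28.62, 23.57),
    "User I/O avg wait":       (6.34, 5.08),
}

for event, (asmlib_ms, asmfd_ms) in waits.items():
    # Speedup expressed as how much faster ASMFD services the same call.
    pct = (asmlib_ms / asmfd_ms - 1) * 100
    print(f"{event}: ASMFD ~{pct:.0f}% faster")
```

This reproduces the 22% and 21% figures for the two read events, and shows roughly a 25% improvement in the overall User I/O average wait.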

Following the multiple tests on the EC2 system I was prepared to believe the results: in this case ASM Filter Driver clearly has better IO performance than ASMLib. No testing was done with udev, as all previous tests showed that ASMLib and udev were quite similar in performance, depending on the type of IO load.

So the question now is “Why is ASMFD performing so much better than ASMLib?”

The ultimate answer to that question will be found among the architects and developers of ASMFD and is not likely to be published. A patent search might turn up some indications, though I have not yet tried that.

We can however get some idea of where the performance gains may be coming from.

Perhaps you have heard of Flamegraphs?

Flamegraphs are a visualization of profiled software that allows the most frequent code paths to be identified quickly.

Profiling was performed on the VBox environments for both ASMLib and ASMFD. The tests were scaled back to a single session for the purpose of simplifying the profile generation. Each test was run for 10 minutes.

The SLOB script slob.sql was modified to call ‘call-perf.sql’.

slob.sql

...
-- SET SERVEROUTPUT OFF ;
SET SERVEROUTPUT ON   ;
SET VERIFY OFF;
 
@@call-perf.sql
 
DECLARE
...

call-perf.sql in turn calls a shell script, run as root via sudo, which starts perf on the session to be profiled.

call-perf.sql

col mypid new_value mypid
 
SELECT
   p.spid mypid
FROM v$session s, v$process p
WHERE p.addr = s.paddr
   AND userenv('SESSIONID') = s.audsid
/
 
host sudo /root/FlameGraph/sp-perf.sh &mypid

sp-perf.sh

#!/bin/bash

# Profile the Oracle session process passed as $1:
# sample call graphs (-g) at 997 Hz until the session exits.
cd /root/FlameGraph

nohup perf record -g -F 997 -p $1 &

When the testing was complete, the SVG files were generated with a command similar to this:

perf script | ./stackcollapse-perf.pl | ./flamegraph.pl > ASM-test.svg
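To give an idea of what the stackcollapse step produces: each captured stack becomes a single semicolon-delimited line, and identical stacks are merged with a sample count, which flamegraph.pl then uses to size the frames. Here is a toy sketch of that aggregation in Python (the frame names are invented for illustration):

```python
from collections import Counter

# Each profiled sample is a stack of frames, innermost last.
# (These frame names are made up; real input comes from perf.)
samples = [
    ["main", "do_io", "pread64"],
    ["main", "do_io", "pread64"],
    ["main", "parse_sql"],
]

# Collapse: join frames with ';' and count identical stacks,
# mimicking the folded-stack format consumed by flamegraph.pl.
folded = Counter(";".join(stack) for stack in samples)
for stack, count in sorted(folded.items()):
    print(stack, count)
```

In the real pipeline, stacks with the same code path merge into one wide frame, which is why frame width corresponds to sample frequency.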

For inclusion in this blog, a smaller version of each file was created to better fit the viewer's screen.

Though we cannot determine from this exactly what is different, we can make some inferences.

Flamegraphs show stack depth on the Y axis and sample population on the X axis. There is no time component, as the X axis is sorted alphabetically. The wider a stack frame, the more samples contained that code path.

Notice the two distinct stacks for both ASM Filter Driver and ASMLib. With ASM Filter Driver the leftmost stack is much narrower than the leftmost stack for ASMLib. It may be that the leftmost stack represents the IO call portion of Oracle processing.

If so, Oracle developers have found a method to move code out of the IO portion of the kernel into another section, reducing the overhead of making IO calls.

Further digging may reveal more about how ASM Filter Driver has managed to outperform ASMLib by 20%. That may be a topic for later research. For the time being I need to get back to preparing my presentations.

So, should you use ASM Filter driver with Grid 12.2?
These results indicate it merits serious consideration and testing in your own environment.

About the Author

Oracle experience: started with Oracle 7.0.13. Programming experience: Perl, PL/SQL, Shell, SQL, plus some other odds and ends that are no longer useful. Systems: networking, storage, OS to varying degrees. Have fond memories of DG/UX.

7 Comments

Your test beds are all virtual guests: following your comment about code being moved from kernel space to user space, I wonder if this bypasses some of the overhead in the virtual environment? I am also curious whether the same benefits extend to bare metal servers. I just started looking into ASMFD, so this is helpful.


Hi Jon, thanks for the comment.
Not all of the test environments were virtual. Following the tests on my laptop I ran them on bare metal on a beefy PC.
Those AWR reports were not reproduced here as the results were similar and did not add any value to the article.
After that test was successful I then ran tests on EC2/EBS.

I would run them on big bare metal HW if available.


Several folks have noticed that the AWR reports do not indicate any write activity.

Hopefully this will clear up the writes issue a bit.

These tests were all performed on Oracle 12.2 with a single PDB.

Why did I choose to use a PDB? Because the legacy non-CDB configuration is deprecated.

Some of you noticed that there were no writes appearing in the AWR report.

I also noticed it, but failed to call it out explicitly.

As there was much redo appearing in the reports, I chalked it up to an AWR reporting bug.
The AWR reporting system is much revised for 12.2 so that metrics can be gathered at the PDB level.

Certainly this is not the first AWR bug. Some of you may recall that many (all?) CPU metrics were missing from AWR in 11.2.

I also confirmed the write activity via iostat. If you are familiar with SLOB then you may recall that an iostat report is run for the period of the test.
Yes, there were plenty of writes.
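For reference, the writes show up in the w/s and wsec/s columns of iostat -x output. An illustrative check with awk (the device name and numbers below are fabricated, not from the actual test):

```shell
# Fabricated iostat -x style output, for illustration only.
cat <<'EOF' > iostat-sample.txt
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sdb 0.00 12.40 6799.80 3449.20 108796.80 55187.20 16.00 4.20 0.41 0.09 92.00
EOF

# w/s is the 5th field; a nonzero value means write requests reached the device.
awk 'NR == 2 { print $1, "w/s=" $5 }' iostat-sample.txt
# prints: sdb w/s=3449.20
```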

I should have explained this before the sharp eyed Oracle-L crowd got hold of it.

Anyway, I have rerun one of the tests, but this time modified the runit.sh script a bit to also create CDB-level snapshots before and after the SLOB run.
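For anyone wanting to do the same, a minimal sketch of the extra snapshot calls (run as a privileged user from CDB$ROOT; the exact placement within runit.sh will vary):

```sql
-- before the SLOB run
exec dbms_workload_repository.create_snapshot

-- ... SLOB run ...

-- after the run; then generate the report
exec dbms_workload_repository.create_snapshot
@?/rdbms/admin/awrrpt.sql
```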

=== PDB Report ===

Host Name Platform CPUs Cores Sockets Memory(GB)
---------------- -------------------------------- ---- ----- ------- ----------
ora12cR2-asm.jks Linux x86 64-bit 2 2 1 7.80

Snap Id Snap Time Sessions Curs/Sess
--------- ------------------- -------- ---------
Begin Snap: 11 26-Jan-18 08:57:20 5 1.6
End Snap: 12 26-Jan-18 09:07:23 1 6.0
Elapsed: 10.06 (mins)
DB Time: 40.04 (mins)

Container DB Id Container Name Open Time
--------------- --------------- ---------------
859891322 P1 24-Jan-18 06:01


Load Profile Per Second Per Transaction Per Exec Per Call
~~~~~~~~~~~~~~~ --------------- --------------- --------- ---------
DB Time(s): 4.0 0.0 0.01 42.90
DB CPU(s): 0.7 0.0 0.00 7.76
Background CPU(s): 0.0 0.0 0.00 0.00
Redo size (bytes): 1,194,250.6 8,774.5
Logical read (blocks): 9,903.1 72.8
Block changes: 8,766.3 64.4
Physical read (blocks): 6,800.2 50.0
Physical write (blocks): 0.0 0.0
Read IO requests: 6,799.8 50.0
Write IO requests: 0.0 0.0
Read IO (MB): 53.1 0.4
Write IO (MB): 0.0 0.0
IM scan rows: 0.0 0.0
Session Logical Read IM: 0.0 0.0
User calls: 0.1 0.0
Parses (SQL): 2.2 0.0
Hard parses (SQL): 0.5 0.0
SQL Work Area (MB): 0.2 0.0
Logons: 0.0 0.0
Executes (SQL): 294.1 2.2
Rollbacks: 0.0 0.0
Transactions: 136.1

=== CDB Report ===

Host Name Platform CPUs Cores Sockets Memory(GB)
---------------- -------------------------------- ---- ----- ------- ----------
ora12cR2-asm.jks Linux x86 64-bit 2 2 1 7.80

Snap Id Snap Time Sessions Curs/Sess PDBs
--------- ------------------- -------- --------- -----
Begin Snap: 210 26-Jan-18 08:57:21 57 .7 1
End Snap: 211 26-Jan-18 09:07:25 53 .6 1
Elapsed: 10.06 (mins)
DB Time: 40.11 (mins)

Top ADDM Findings by Average Active Sessions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Finding Name Avg act Percen Task Name
---------------------------------------- ------- ------ ------------------------
Top SQL Statements 3.98 98.31 ADDM:2577183038_1_211
Top Segments by "User I/O" and "Cluster" 3.98 81.39 ADDM:2577183038_1_211

Load Profile Per Second Per Transaction Per Exec Per Call
~~~~~~~~~~~~~~~ --------------- --------------- --------- ---------
DB Time(s): 4.0 0.0 0.01 2.78
DB CPU(s): 0.7 0.0 0.00 0.50
Background CPU(s): 0.5 0.0 0.00 0.00
Redo size (bytes): 1,280,263.5 9,402.6
Logical read (blocks): 10,080.1 74.0
Block changes: 8,786.3 64.5
Physical read (blocks): 6,806.5 50.0
Physical write (blocks): 3,980.9 29.2
Read IO requests: 6,806.1 50.0
Write IO requests: 3,449.2 25.3
Read IO (MB): 53.2 0.4
Write IO (MB): 31.1 0.2
IM scan rows: 0.0 0.0
Session Logical Read IM: 0.0 0.0
User calls: 1.4 0.0
Parses (SQL): 8.6 0.1
Hard parses (SQL): 1.8 0.0
SQL Work Area (MB): 0.1 0.0
Logons: 0.3 0.0
Executes (SQL): 343.4 2.5
Rollbacks: 0.0 0.0
Transactions: 136.2

Yury Tyapkin
March 21, 2018 1:36 am

Hi,
I've just tested ASMFD on a non-virtual server which is being prepared for production. Storage is accessed through SAN, and multipath is configured. We have never used ASMLIB, so I am comparing udev with ASMFD. After configuring ASMFD I see a slight performance degradation.

WORKLOAD REPOSITORY report for

DB Name DB Id Unique Name DB Role Edition Release RAC CDB
------------ ----------- ----------- ---------------- ------- ---------- --- ---
CLONEMAS 3451810418 clonemaster PRIMARY EE 12.2.0.1.0 NO NO

Instance Inst Num Startup Time
------------ -------- ---------------
clonemaster 1 20-Mar-18 18:33

Host Name Platform CPUs Cores Sockets Memory(GB)
---------------- -------------------------------- ---- ----- ------- ----------
dwh101l.region.v Linux x86 64-bit 80 80 4 1007.59

UDEV:
Load Profile Per Second Per Transaction Per Exec Per Call
~~~~~~~~~~~~~~~ --------------- --------------- --------- ---------
DB Time(s): 63.9 0.0 0.01 55.98
DB CPU(s): 4.3 0.0 0.00 3.79
Background CPU(s): 1.6 0.0 0.00 0.00
Redo size (bytes): 19,951,095.2 9,639.5
Logical read (blocks): 149,767.3 72.4
Block changes: 133,170.4 64.3
Physical read (blocks): 130,092.4 62.9
Physical write (blocks): 63,880.9 30.9
Read IO requests: 130,088.2 62.9
Write IO requests: 61,170.1 29.6
Read IO (MB): 1,016.4 0.5
Write IO (MB): 499.1 0.2
IM scan rows: 0.0 0.0
Session Logical Read IM: 0.0 0.0
User calls: 1.1 0.0
Parses (SQL): 5.3 0.0
Hard parses (SQL): 1.1 0.0
SQL Work Area (MB): 0.1 0.0
Logons: 0.1 0.0
Executes (SQL): 4,334.6 2.1
Rollbacks: 0.0 0.0
Transactions: 2,069.7

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 99.99 Redo NoWait %: 99.97
Buffer Hit %: 13.14 In-memory Sort %: 100.00
Library Hit %: 99.92 Soft Parse %: 78.77
Execute to Parse %: 99.88 Latch Hit %: 99.93
Parse CPU to Parse Elapsd %: 16.38 % Non-Parse CPU: 99.97
Flash Cache Hit %: 0.00

Top 10 Foreground Events by Total Wait Time
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Total Wait Avg % DB Wait
Event Waits Time (sec) Wait time Class
------------------------------ ----------- ---------- --------- ------ --------
db file sequential read 40,176,858 32.9K 818.99us 85.7 User I/O
db file parallel read 1,340,258 3419.1 2.55ms 8.9 User I/O
DB CPU 2596.6 6.8
enq: TX - row lock contention 2,287 27.4 12.00ms .1 Applicat
log file switch (private stran 1,301 14.9 11.43ms .0 Configur
free buffer waits 1,804 14.3 7.95ms .0 Configur
read by other session 5,147 6.3 1.22ms .0 User I/O
cursor: pin S wait on X 608 2.7 4.41ms .0 Concurre
library cache load lock 598 1.9 3.12ms .0 Concurre
log file switch completion 115 1 8.31ms .0 Configur

ASMFD:
Load Profile Per Second Per Transaction Per Exec Per Call
~~~~~~~~~~~~~~~ --------------- --------------- --------- ---------
DB Time(s): 63.9 0.0 0.02 56.64
DB CPU(s): 4.5 0.0 0.00 4.02
Background CPU(s): 1.6 0.0 0.00 0.00
Redo size (bytes): 19,211,564.8 9,630.9
Logical read (blocks): 144,408.6 72.4
Block changes: 128,344.8 64.3
Physical read (blocks): 125,491.3 62.9
Physical write (blocks): 61,573.9 30.9
Read IO requests: 125,487.0 62.9
Write IO requests: 58,972.0 29.6
Read IO (MB): 980.4 0.5
Write IO (MB): 481.1 0.2
IM scan rows: 0.0 0.0
Session Logical Read IM: 0.0 0.0
User calls: 1.1 0.0
Parses (SQL): 4.5 0.0
Hard parses (SQL): 0.9 0.0
SQL Work Area (MB): 0.1 0.0
Logons: 0.1 0.0
Executes (SQL): 4,176.3 2.1
Rollbacks: 0.0 0.0
Transactions: 1,994.8

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 99.99 Redo NoWait %: 99.97
Buffer Hit %: 13.10 In-memory Sort %: 100.00
Library Hit %: 99.93 Soft Parse %: 80.89
Execute to Parse %: 99.89 Latch Hit %: 99.94
Parse CPU to Parse Elapsd %: 18.52 % Non-Parse CPU: 99.97
Flash Cache Hit %: 0.00

Top 10 Foreground Events by Total Wait Time
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Total Wait Avg % DB Wait
Event Waits Time (sec) Wait time Class
------------------------------ ----------- ---------- --------- ------ --------
db file sequential read 38,855,671 32.9K 847.53us 85.7 User I/O
db file parallel read 1,293,193 3515.5 2.72ms 9.2 User I/O
DB CPU 2726.3 7.1
enq: TX - row lock contention 2,102 25.6 12.19ms .1 Applicat
free buffer waits 2,300 18 7.85ms .0 Configur
log file switch (private stran 1,121 11.1 9.88ms .0 Configur
read by other session 5,058 6.4 1.27ms .0 User I/O
log file switch completion 226 2.1 9.35ms .0 Configur
cursor: pin S wait on X 535 2.1 3.84ms .0 Concurre
library cache load lock 488 1.5 3.13ms .0 Concurre


Interesting results. I also tested on a non-virtual box and found consistent performance improvements with ASMFD.
What software was used to perform the tests?


Hi Jared,
I used SLOB with the configuration file from your article and 16 concurrent sessions.
ASMFD was configured according to https://docs.oracle.com/database/121/OSTMG/GUID-06B3337C-07A3-4B3F-B6CD-04F2916C11F6.htm


Hi Yury,

If you were testing on 12.1, that could be the differentiator in our results, as I was testing on 12.2.

