I saw a question on the OTN Exadata forum asking whether or not to use BIGFILE temporary tablespaces in an Exadata system. BIGFILE tablespaces have been around since at least Oracle 10gR2 and extend the 32G datafile size limit, but they restrict to a single datafile per tablespace. Indeed, it trades the simplified management of a single datafile for reduced flexibility.
The default Exadata database “dbm” comes with a single BIGFILE tablespace, and I’ve found this approach to work well in most cases. However, the OTN question reminded me of a situation I encountered last year where i couldn’t use BIGFILE tablespaces for performance reasons, and I thought I’d share some of the details with Pythian blog readers.
The environment was an Exadata environment undergoing pre-production stress testing. I used Real Application Testing to take a highly-concurrent OLTP workload, and replayed the workload with the
synchronization parameter set to
FALSE, effectively increasing concurrency beyond the original test system.
AWR showed a large volume of buffer-busy activity:
Top 5 Timed Foreground Events ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Avg wait % DB Event Waits Time(s) (ms) Time Wait Class ------------------------------ ------------ ----------- ------ ------ ------------- gc buffer busy release 110,700,450 12,544,451 113 58.9 Cluster buffer busy waits 112,565,811 3,950,102 35 18.5 Concurrency latch: cache buffers chains 113,162,321 2,335,055 21 11.0 Concurrency enq: SS - contention 20,026,343 1,208,422 60 5.7 Configuration gc buffer busy acquire 11,244,505 720,003 64 3.4 Cluster
To drill down, I used ASH (active session history) information. Given the raw volume of sessions and data, though, querying dba_hist_active_sess_history is very slow. I therefore took advantage of Exadata’s hybrid columnar compression feature to create a smaller, flattened, and compressed work table containing observations from the RAT replay period. The resulting table is much faster to query.
create table marc_ash nologging compress for query high parallel 128 as select /*+parallel(128)*/ * from dba_hist_active_sess_history where snap_id between 384 and 423;
Then, I identified the P1, P2, and P3 parameters of the top wait events to see where the waits were happening, as well as how much time was spent in each combination. This is probably a good time for me to mention my standard disclaimer regarding dba_hist_active_session_history: The data is composed of samples of the system state, so it doesn’t have the same granularity as, say, session traces. But with a large sample like the one I have here, I can still get a good idea of what’s going on.
select event,p1, p2, p3, count(*), sum(tm_delta_time) from marc_ash group by event,p1,p2,p3 order by sum(tm_delta_time) desc; EVENT P1 P2 P3 COUNT(*) SUM(TM_DELTA_TIME) --------------------------- ------------ --- --- -------- ------------------ gc buffer busy release 501 2 13 2336685 57421543417904 buffer busy waits 501 2 13 736241 16902873986356 latch: cache buffers chains 262188190496 155 0 260182 6197021780073 latch: cache buffers chains 230512806688 155 0 217646 5831016580939 enq: SS - contention 1397948422 41 2 183510 2378750008606 gc buffer busy acquire 501 2 13 130649 3265344853224
What’s interesting here is that a common set of parameters came up in buffer busy waits: P1 = 501, P2 = 2, P3 = 13. P3 lists the block type, and there’s a great summary of values on Charles Hooper’s blog. 13 maps to “file header block”.
P1 is the file number. The number 501 is above my DB_FILES value of 500, indicating that it’s a tempfile, and in this case, the one very large BIGFILE of the TEMP tablespace. P2 is the block number. According to MOS note 109630.1, block 2 is the space management bitmap in a locally-managed tablespace.
My Oracle Support note 465840.1 has some good information about addressing tempfile contention in RAC. It mentions that “buffer busy wait” events on the file header block can be encountered in RAC since each instance allocates space in the same tempfile. It recommends creating at least as many tempfiles as there are instances in the cluster. To accommodate multiple files in a tablespace, we must create a SMALLFILE tablespace, and work around the 32G file limit. In our case, it means creating a new temporary tablespace with 24 32G tempfiles for a total capacity of 768GB, changing the default temporary tablespace, and dropping the old one.
After flashing back the database and re-running the same RAT workload, the wait on tempfile 501 and the SS enqueue disappeared entirely.
What are your experiences, blog readers? Do you normally use BIGFILE temporary tablespaces? Have you encountered similar issues?