I was recently looking at an issue where a large database server was running out of temp space. The issue had occurred in the past, so the goal was to identify what had been causing the system to run out of temp. Although ORA-1652 messages appear in the database alert log (with query text) when a query runs out of temporary space, they only identify the query that was “unlucky” enough to be attempting to get space when the tablespace was full, not the sessions that were actually consuming the temp space.
A quick web search shows that most people seem to be querying current tables. Coskan Gundogar blogged about a historical monitoring script he wrote but only drilled down into a specific session and serial number.
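For reference, the usual "current usage" approach joins v$tempseg_usage to v$session. A minimal sketch (it assumes an 8k block size for the temp tablespace; adjust for yours, and note it only shows usage right now, which is exactly its limitation):

```sql
-- Current (not historical) temp consumers, largest first.
SELECT s.sid,
       s.serial#,
       u.sql_id,
       u.tablespace,
       u.segtype,
       u.blocks * 8192 / 1024 / 1024 AS temp_mb
FROM   v$tempseg_usage u
       JOIN v$session s ON u.session_addr = s.saddr
ORDER  BY u.blocks DESC;
```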
A quick look at AWR showed that v$tempseg_usage had no direct historical view. There is dba_hist_tbspc_usage info for the temp tablespace, but it only shows aggregate usage.
However, dba_hist_active_sess_history, the AWR samples of active session history data, has a new column in 11gR2: TEMP_SPACE_ALLOCATED. From the docs:
Amount of TEMP memory (in bytes) consumed by this session at the time this sample was taken.
Having temp usage in a table like ASH is actually much more useful than a v$tempseg_usage capture since it lets us drill down not only to the query, but also to the exact SQL plan and plan step where the temp space is being used.
Querying out the data requires a bit of elbow grease, though: there is no delta column as there is for the I/O and time-model stats, so we need some fancy analytic functions to compute the delta values ourselves.
dba_hist_active_sess_history itself is a view composed of multiple joins on the back end, so it takes quite a while to query. I generally create a fixed version of the table with the timestamp I want to analyze, in order to speed up the queries later. And since this is an Exadata system, I’ll take advantage of hybrid columnar compression. Adjust the
begin_interval_time range to match the date range you’re looking at.
CREATE TABLE marc_ash COMPRESS FOR QUERY LOW AS
SELECT *
FROM   dba_hist_active_sess_history
WHERE  dbid = (SELECT dbid FROM v$database)
       AND snap_id IN (SELECT snap_id
                       FROM   dba_hist_snapshot
                       WHERE  dbid = (SELECT dbid FROM v$database)
                              AND begin_interval_time BETWEEN
                                    To_date('20-DEC-2012 05:00', 'dd-mon-yyyy hh24:mi')
                                    AND To_date('20-DEC-2012 10:00', 'dd-mon-yyyy hh24:mi'));
(Side note: I’m trying out an online SQL beautifier tool. Readers, do you find this format easier to read?)
Once the base table is created, I can query it:
(Thanks to Randolf Geist for pointing out a missing grouping for parallel execution)
SELECT *
FROM   (SELECT Rank() over (ORDER BY SUM(Nvl(temp_space_delta, 0)) DESC) position,
               sql_id,
               sql_plan_hash_value,
               sql_plan_operation,
               sql_plan_line_id,
               Count(DISTINCT sql_exec_id) total_execs,
               Trunc(SUM(Nvl(temp_space_delta, 0)) / 1024 / 1024)
                 ||'m' temp_usage
        FROM   (SELECT sql_exec_id,
                       sql_id,
                       sql_plan_hash_value,
                       sql_plan_operation,
                       sql_plan_line_id,
                       temp_space_allocated
                         - Nvl(Lag(temp_space_allocated, 1)
                                 over (PARTITION BY sql_exec_id, sql_id, session_id
                                       ORDER BY sample_id), 0) temp_space_delta
                FROM   marc_ash)
        GROUP  BY sql_id, sql_plan_operation, sql_plan_hash_value, sql_plan_line_id)
WHERE  position <= 10
ORDER  BY position;
A bit of explanation is in order: The core of the query selects from marc_ash using the lag() analytic function to get the previous temp space allocation for the same execution, SQL statement, and parallel query process, and subtracts it to get the difference in temp space usage for the current sample. I then take this result, plus some information about the SQL step executing, and group it by SQL ID, plan operation, and line ID. And finally, I sort the results, keeping the top 10, and apply some formatting to the result.
And as is usual with ASH, since I’m only looking at 10-second samples of data, the results don’t reflect exactly what happened. But the longer an execution step runs, the more accurate the end results will be.
Here’s a sample result (with a bit of cleanup of the column headers):
POSITION  SQL_ID         SQL_PLAN_HASH  SQL_PLAN_OP  SQL_PLAN_LINE_ID  TOTAL_EXECS  TEMP_USAGE
--------  -------------  -------------  -----------  ----------------  -----------  ----------
       1  a9hrk7g2ahbtk     1062456949  HASH JOIN                  15            1     142607m
       2  b40ahcxu8nxdq     2741224596  HASH JOIN                  21            1     139959m
       3  b62gn3yqyau4k     3673174370  SORT                        1            4      92378m
       4  5rf3btrz44d6v     3879859951  SORT                        3            1      35128m
       5  5rf3btrz44d6v     3879859951  SORT                        6            1      34095m
       6  8ucjdxhxwz93g     2599798119  HASH                       15            1      10873m
       7  901mxasy6gu2j      801798936  SORT                        2           28       6268m
       8  4b7jatzjaf4z3      270793325  SORT                        3            3       5876m
       9  0h5cvsbs92cyp     1935762178  SORT                        1            3       4522m
      10  84z212am4j987     3445415588  SORT                        1            2       4182m
During this time period, there was high temp usage from the two plan steps of SQL ID 5rf3btrz44d6v. Even though it only ran once, it consumed close to 70G of temp space during this period. The benefit of using ASH data, though, is that I not only know which SQL statement is responsible, but I also have an idea of which plan steps are responsible: two sort operations.
We can plug the SQL ID and plan hash value from the output into dbms_xplan.display_awr to see the plan itself. The results show the full query text as well as the execution plan involved.
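For example, using the top consumer drilled into here (sql_id 5rf3btrz44d6v, plan hash 3879859951):

```sql
-- Retrieve the AWR-captured plan and query text for one statement.
SELECT *
FROM   TABLE(dbms_xplan.display_awr(sql_id          => '5rf3btrz44d6v',
                                    plan_hash_value => 3879859951));
```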
A short snippet of the plan:
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                  | Name     | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                           |          |       |       |       |   100K(100)|          |       |       |        |      |            |
|   1 |  PX COORDINATOR                            |          |       |       |       |            |          |       |       |        |      |            |
|   2 |   PX SEND QC (ORDER)                       | :TQ10009 |   76M |   18G |       |   100K  (1)| 00:20:04 |       |       |  Q1,09 | P->S | QC (ORDER) |
|   3 |    SORT ORDER BY                           |          |   76M |   18G |   21G |   100K  (1)| 00:20:04 |       |       |  Q1,09 | PCWP |            |
|   4 |     PX RECEIVE                             |          |   76M |   18G |       |   100K  (1)| 00:20:04 |       |       |  Q1,09 | PCWP |            |
|   5 |      PX SEND RANGE                         | :TQ10008 |   76M |   18G |       |   100K  (1)| 00:20:04 |       |       |  Q1,08 | P->P | RANGE      |
|   6 |       SORT GROUP BY                        |          |   76M |   18G |   21G |   100K  (1)| 00:20:04 |       |       |  Q1,08 | PCWP |            |
|   7 |        PX RECEIVE                          |          |   76M |   18G |       |   100K  (1)| 00:20:04 |       |       |  Q1,08 | PCWP |            |
|   8 |         PX SEND HASH                       | :TQ10007 |   76M |   18G |       |   100K  (1)| 00:20:04 |       |       |  Q1,07 | P->P | HASH       |
|   9 |          HASH GROUP BY                     |          |   76M |   18G |   21G |   100K  (1)| 00:20:04 |       |       |  Q1,07 | PCWP |            |
|  10 |           VIEW                             | VW_DAG_0 |   76M |   18G |       |  67638  (1)| 00:13:32 |       |       |  Q1,07 | PCWP |            |
|  11 |            HASH GROUP BY                   |          |   76M |   16G |   17G |  67638  (1)| 00:13:32 |       |       |  Q1,07 | PCWP |            |
|  12 |             HASH JOIN                      |          |   76M |   16G |       |  37552  (2)| 00:07:31 |       |       |  Q1,07 | PCWP |            |
|  13 |              BUFFER SORT                   |          |       |       |       |            |          |       |       |  Q1,07 | PCWC |            |
|  14 |               PX RECEIVE                   |          | 2709K |   67M |       |   5287  (1)| 00:01:04 |       |       |  Q1,07 | PCWP |            |
|  15 |                PX SEND HASH                | :TQ10004 | 2709K |   67M |       |   5287  (1)| 00:01:04 |       |       |        | S->P | HASH       |
|  16 |                 TABLE ACCESS STORAGE FULL  | TAB1     | 2709K |   67M |       |   5287  (1)| 00:01:04 |       |       |        |      |            |
A very quick look at this execution plan seems to indicate that it’s returning an absolutely massive dataset to the database client and sorting it twice. It would benefit from more conditions to restrict results to what the client is actually looking for.
Any comments, readers? How are you monitoring for temp space usage? Anyone using resource management to control temp usage?
I always wanted to explore an AFTER SERVERERROR trigger to see if there was anything useful I could glean when an ORA-1652 fires.
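A minimal sketch of that idea, assuming a pre-created logging table (my_temp_err_log is a hypothetical name); ORA-1652 is the temp-exhaustion error the trigger would watch for:

```sql
-- Sketch: snapshot current temp consumers whenever ORA-1652 is raised.
-- my_temp_err_log is a hypothetical table you would create beforehand.
CREATE OR REPLACE TRIGGER trg_log_temp_exhaustion
AFTER SERVERERROR ON DATABASE
BEGIN
  IF ora_is_servererror(1652) THEN
    INSERT INTO my_temp_err_log (logged_at, sid, sql_id, temp_blocks)
      SELECT SYSDATE, s.sid, u.sql_id, u.blocks
      FROM   v$tempseg_usage u
             JOIN v$session s ON u.session_addr = s.saddr;
  END IF;
END;
/
```

One caveat with this design: by the time the error fires, the failing statement may already have released its own temp space, so the snapshot mostly captures the other consumers that filled the tablespace, which is often what you want anyway.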
I like your approach though, I will give it a shot.
It’s good to see someone writing about this new ASH feature. A few comments:
1. I think your query using the LAG function lacks a bit of precision: in the case of a Parallel Execution there will be multiple samples per SAMPLE_ID (one per sampled Parallel Slave), so the ORDER BY SAMPLE_ID in your LAG function is, I think, not deterministic for Parallel Executions.
2. This also makes the approach of building a delta between samples questionable I think, at least for Parallel Execution. After all it is based on sampling – what if the particular session being sampled doesn’t show up afterwards, or only 100 samples later having released the previous TEMP allocation but allocating TEMP for a completely different part of the execution plan? The question here is probably, what kind of problem do you want to address using your query?
3. I don’t think that you can break down the TEMP_SPACE_ALLOCATED information from ASH on SQL plan line operation level. You can see this from your example where “PX RECEIVE” gets TEMP space allocated.
But a “PX RECEIVE” operation doesn’t have a workarea assigned, so why should it show up?
I believe the PGA and TEMP allocation in ASH data is not on operation level, but on session level, so represents the total aggregate for all active workareas of that SQL execution at the moment of sampling.
This is different from Real-Time SQL Monitoring and Rowsource Statistics where you get the allocation data on workarea level.
It looks more like the ASH data is taken from V$PROCESS for PGA and V$TEMPSEG_USAGE for TEMP.
4. I do similar analysis in my XPLAN_ASH tool and I can show there that the TEMP space information from ASH is unreliable and quite often falls short of the actual TEMP space allocation reported by Rowsource Statistics or Real-Time SQL Monitoring, so the information has to be treated carefully and might be misleading (or the way I treat the data in XPLAN_ASH is misleading, but at present I don’t think so).
Thanks for the insightful comments. I went back to my original working version of the query, and I had a “PARTITION BY sql_id, sql_exec_id, session_id” clause in it, and somehow lost the session_id part in the final version. Like you say, the temp space counts are independent for each PX process, and I’ve corrected the query. It also now shows the usage in hashing and sorting operations, which does make a lot more sense than PX receive.
As far as the accuracy of the data, I would expect that it would be driven off the session statistic “temp space allocated (bytes)”. A test on one (otherwise idle) 11gR2 system:
-- Temp space consumer
select count(distinct id) from big_table where rownum < 10000000;

-- While running, in another session
select value/1024/1024
from   v$sesstat
where  statistic# = (select statistic# from v$statname
                     where name = 'temp space allocated (bytes)')
and    sid = 804;

121

select temp_space_allocated/1024/1024
from   v$active_session_history
where  session_id = 804
and    sample_time > sysdate-1/(24*60*60);
select used_blocks*8192/1024/1024 from v$sort_segment;
So, in my simple test at least, the numbers from ASH, v$sort_segment, and v$sesstat do line up at 121m. Are you saying that they’re all inaccurate?
Thanks again, and happy new year!
Thanks for taking up my comments, and Happy New Year!
As I already tried to explain, I don’t think that the ASH TEMP and PGA information can be broken down on SQL execution plan line information, since it seems to reflect the overall PGA / TEMP allocation of the process / session at the time of sampling across all workareas active, so I still believe that your example query showing PGA/TEMP on execution plan line level is a bit misleading.
Regarding the accuracy: I still have to come up with good, reproducible examples, but I regularly run my XPLAN_ASH tool on real-life client queries and I can see quite often that the ASH TEMP information does not correspond to the real TEMP allocation as shown in Real-Time SQL Monitoring.
One particular issue seems to come from BUFFERED operations that are only there in case of Parallel Execution (like HASH JOIN BUFFERED or additional BUFFER SORTs specific to PX), so may be serial executions are OK, and only Parallel Executions using BUFFERED operations might be affected by the problem.
I haven’t had the time yet to investigate further, but I’ve already witnessed numerous inconsistencies.
while working on a completely different matter I came again across this PGA/TEMP ASH stuff, and I have to correct my previous statement, because I mixed things up.
The accuracy of the PGA/TEMP figures in the ASH data seems to be fine.
The problem I’ve stumbled upon regarding PGA/TEMP is the fact that I tried to come up with a PGA/TEMP usage figure over time, and due to the sampling this figure for Parallel Execution isn’t that consistent, as not always every PX Slave involved gets sampled every time, so a sum(PGA/TEMP) per sample time jumps a bit up and down, depending on which PX Slaves were sampled or not.
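That jumpiness is easy to see by summing per sample. A sketch against the marc_ash table from the post (using the example SQL ID from the article; for a parallel query the per-sample sum fluctuates as slaves drop in and out of the samples):

```sql
-- Per-sample total TEMP across whichever PX slaves happened to be sampled.
SELECT sample_id,
       Count(*)                                sampled_sessions,
       SUM(temp_space_allocated) / 1024 / 1024 temp_mb
FROM   marc_ash
WHERE  sql_id = '5rf3btrz44d6v'   -- example SQL ID from the post
GROUP  BY sample_id
ORDER  BY sample_id;
```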
This doesn’t mean however that the PGA/TEMP figures shown in the existing samples are incorrect/unreliable, so my previous statement about the reliability needs to be corrected.
However, and that is what triggered me to mention this unreliability, the delta figures regarding I/O, that are also part of the ASH data since 11.2, are actually sometimes quite incorrect and fall short of the actual number of I/Os and volume of I/O performed.
This I can show quite easily and if I find the time I will blog about it. The same inconsistency can be seen in the Real-Time SQL Monitoring report (“I/O requests” / “I/O Throughput” in the “Metrics” tab), which uses all these figures to populate the “Metrics” tab in the “EM/Active” version of the report.
I’ve mixed these two problems, and thought that the PGA/TEMP figures are also unreliable, but that doesn’t seem to be the case – the delta I/O figures seem to be the ones that are often unreliable.
Your query shows TEMP usage that I do not have in my TEMP_GROUP:

POSITION  SQL_ID         SQL_PLAN_HASH_VALUE  SQL_PLAN_OPERATION  SQL_PLAN_LINE_ID  NR_EXECS  TEMP_MB
--------  -------------  -------------------  ------------------  ----------------  --------  -------
       1  aj5tu2zhpnqdr           3447736756  TABLE ACCESS                       8        13   178131

My TEMP_GROUP is 120 GB, yet the result shows 178 GB used.
And the SQL is an insert into a GTT table:
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | INSERT STATEMENT | | | | 14M(100)| |
| 1 | LOAD TABLE CONVENTIONAL | | | | | |
|* 2 | VIEW | | 1 | 159 | 14M (1)| 00:06:18 |
| 3 | WINDOW SORT | | 1 | 1096 | 14M (1)| 00:06:18 |
| 4 | NESTED LOOPS OUTER | | 1 | 1096 | 14M (1)| 00:06:18 |
| 5 | VIEW | | 1 | 583 | 14M (1)| 00:06:18 |
| 6 | COUNT | | | | | |
|* 7 | FILTER | | | | | |
|* 8 | TABLE ACCESS FULL | RPM_FUTURE_RETAIL_GTT | 512K| 283M| 217K (2)| 00:00:06 |
|* 9 | COLLECTION ITERATOR PICKLER FETCH| | 1 | 2 | 29 (0)| 00:00:01 |
|* 10 | COLLECTION ITERATOR PICKLER FETCH| | 10 | 20 | 4 (0)| 00:00:01 |
|* 11 | TABLE ACCESS BY INDEX ROWID | RPM_FUTURE_RETAIL_GTT | 1 | 513 | 5 (0)| 00:00:01 |
|* 12 | INDEX SKIP SCAN | RPM_FUTURE_RETAIL_GTT_I5 | 63 | | 4 (0)| 00:00:01 |
INSERT INTO RPM_ROLLUP_GTT (ITEM,
FROM (SELECT I.ITEM,
OVER (PARTITION BY I.ITEM,
ORDER BY I.ACTION_DATE)
NVL (P.SELLING_RETAIL, -999), 0,
+ DECODE (I.SELLING_RETAIL_CURRENCY,
NVL (P.SELLING_RETAIL_CURRENCY, -999), 0,
+ DECODE (I.SELLING_UOM,
NVL (P.SELLING_UOM, -999), 0,
+ DECODE (NVL (I.MULTI_UNITS, -999),
NVL (P.MULTI_UNITS, -999), 0,
+ DECODE (NVL (I.MULTI_UNIT_RETAIL, -999),
NVL (P.MULTI_UNIT_RETAIL, -999), 0,
+ DECODE (NVL (I.MULTI_UNIT_RETAIL_CURRENCY, -999),
NVL (P.MULTI_UNIT_RETAIL_CURRENCY, -999), 0,
+ DECODE (NVL (I.MULTI_SELLING_UOM, -999),
NVL (P.MULTI_SELLING_UOM, -999), 0,
+ DECODE (NVL (I.CLEAR_RETAIL, -999),
NVL (P.CLEAR_RETAIL, -999), 0,
+ DECODE (NVL (I.CLEAR_RETAIL_CURRENCY, -999),
NVL (P.CLEAR_RETAIL_CURRENCY, -999), 0,
+ DECODE (NVL (I.CLEAR_UOM, -999),
NVL (P.CLEAR_UOM, -999), 0,
+ DECODE (NVL (I.SIMPLE_PROMO_RETAIL, -999),
NVL (P.SIMPLE_PROMO_RETAIL, -999), 0,
+ DECODE (NVL (I.SIMPLE_PROMO_RETAIL_CURRENCY, -999),
NVL (P.SIMPLE_PROMO_RETAIL_CURRENCY, -999), 0,
+ DECODE (NVL (I.SIMPLE_PROMO_UOM, -999),
NVL (P.SIMPLE_PROMO_UOM, -999), 0,
+ DECODE (I.ON_SIMPLE_PROMO_IND,
NVL (P.ON_SIMPLE_PROMO_IND, -999), 0,
+ DECODE (NVL (I.COMPLEX_PROMO_RETAIL, -999),
NVL (P.COMPLEX_PROMO_RETAIL, -999), 0,
+ DECODE (
NVL (I.COMPLEX_PROMO_RETAIL_CURRENCY, -999),
NVL (P.COMPLEX_PROMO_RETAIL_CURRENCY, -999), 0,
+ DECODE (NVL (I.COMPLEX_PROMO_UOM, -999),
NVL (P.COMPLEX_PROMO_UOM, -999), 0,
+ DECODE (I.ON_COMPLEX_PROMO_IND,
NVL (P.ON_COMPLEX_PROMO_IND, -999), 0,
+ DECODE (NVL (I.PRICE_CHANGE_ID, -999),
NVL (P.PRICE_CHANGE_ID, -999), 0,
+ DECODE (NVL (I.PRICE_CHANGE_DISPLAY_ID, -999),
NVL (P.PRICE_CHANGE_DISPLAY_ID, -999), 0,
+ DECODE (NVL (I.PC_EXCEPTION_PARENT_ID, -999),
NVL (P.PC_EXCEPTION_PARENT_ID, -999), 0,
+ DECODE (NVL (I.PC_CHANGE_TYPE, -999),
NVL (P.PC_CHANGE_TYPE, -999), 0,
+ DECODE (NVL (I.PC_CHANGE_AMOUNT, -999),
NVL (P.PC_CHANGE_AMOUNT, -999), 0,
+ DECODE (NVL (I.PC_CHANGE_CURRENCY, -999),
NVL (P.PC_CHANGE_CURRENCY, -999), 0,
+ DECODE (NVL (I.PC_CHANGE_PERCENT, -999),
NVL (P.PC_CHANGE_PERCENT, -999), 0,
+ DECODE (NVL (I.PC_CHANGE_SELLING_UOM, -999),
NVL (P.PC_CHANGE_SELLING_UOM, -999), 0,
+ DECODE (NVL (I.PC_NULL_MULTI_IND, -999),
NVL (P.PC_NULL_MULTI_IND, -999), 0,
+ DECODE (NVL (I.PC_MULTI_UNITS, -999),
NVL (P.PC_MULTI_UNITS, -999), 0,
+ DECODE (NVL (I.PC_MULTI_UNIT_RETAIL, -999),
NVL (P.PC_MULTI_UNIT_RETAIL, -999), 0,
+ DECODE (
NVL (I.PC_MULTI_UNIT_RETAIL_CURRENCY, -999),
NVL (P.PC_MULTI_UNIT_RETAIL_CURRENCY, -999), 0,
+ DECODE (NVL (I.PC_MULTI_SELLING_UOM, -999),
NVL (P.PC_MULTI_SELLING_UOM, -999), 0,
+ DECODE (NVL (I.PC_PRICE_GUIDE_ID, -999),
NVL (P.PC_PRICE_GUIDE_ID, -999), 0,
+ DECODE (NVL (I.CLEAR_EXCEPTION_PARENT_ID, -999),
NVL (P.CLEAR_EXCEPTION_PARENT_ID, -999), 0,
+ DECODE (NVL (I.CLEARANCE_ID, -999),
NVL (P.CLEARANCE_ID, -999), 0,
+ DECODE (NVL (I.CLEARANCE_DISPLAY_ID, -999),
NVL (P.CLEARANCE_DISPLAY_ID, -999), 0,
+ DECODE (NVL (I.CLEAR_MKDN_INDEX, -999),
NVL (P.CLEAR_MKDN_INDEX, -999), 0,
+ DECODE (NVL (I.CLEAR_START_IND, -999),
NVL (P.CLEAR_START_IND, -999), 0,
+ DECODE (NVL (I.CLEAR_CHANGE_TYPE, -999),
NVL (P.CLEAR_CHANGE_TYPE, -999), 0,
+ DECODE (NVL (I.CLEAR_CHANGE_AMOUNT, -999),
NVL (P.CLEAR_CHANGE_AMOUNT, -999), 0,
+ DECODE (NVL (I.CLEAR_CHANGE_CURRENCY, -999),
NVL (P.CLEAR_CHANGE_CURRENCY, -999), 0,
+ DECODE (NVL (I.CLEAR_CHANGE_PERCENT, -999),
NVL (P.CLEAR_CHANGE_PERCENT, -999), 0,
+ DECODE (NVL (I.CLEAR_CHANGE_SELLING_UOM, -999),
NVL (P.CLEAR_CHANGE_SELLING_UOM, -999), 0,
+ DECODE (NVL (I.CLEAR_PRICE_GUIDE_ID, -999),
NVL (P.CLEAR_PRICE_GUIDE_ID, -999), 0,
+ DECODE (NVL (I.LOC_MOVE_FROM_ZONE_ID, -999),
NVL (P.LOC_MOVE_FROM_ZONE_ID, -999), 0,
+ DECODE (NVL (I.LOC_MOVE_TO_ZONE_ID, -999),
NVL (P.LOC_MOVE_TO_ZONE_ID, -999), 0,
+ DECODE (NVL (I.LOCATION_MOVE_ID, -999),
NVL (P.LOCATION_MOVE_ID, -999), 0,
OVER (PARTITION BY I.ITEM,
FROM RPM_FUTURE_RETAIL_GTT P,
DECODE ( :B14,
DECODE ( :B14,
:B13, NVL (R.DIFF_ID, '-999'),
DECODE ( :B14,
DECODE ( :B14,
FROM RPM_FUTURE_RETAIL_GTT R
WHERE R.CUR_HIER_LEVEL IN (SELECT /*+ no_unnest CARDINALITY(cr, 1) */
FROM TABLE (
:B15 AS OBJ_VARCHAR_ID_TABLE)) CR)
AND R.MAX_HIER_LEVEL IN (SELECT /*+ no_unnest CARDINALITY(mx, 10) */
FROM TABLE (
:B16 AS OBJ_VARCHAR_ID_TABLE)) MX)
AND R.DEPT = :B1
AND NVL (R.TIMELINE_SEQ, 0) != -999
AND ROWNUM > 0) I
WHERE I.DEPT = P.DEPT(+)
AND I.TARGET_ITEM = P.ITEM(+)
AND I.TARGET_DIFF_ID = NVL (P.DIFF_ID(+), '-999')
AND I.TARGET_ZONE_NODE = P.LOCATION(+)
AND I.TARGET_ZONE_NODE_TYPE = P.ZONE_NODE_TYPE(+)
AND I.ACTION_DATE = P.ACTION_DATE(+)
AND I.MAX_HIER_LEVEL = P.MAX_HIER_LEVEL(+)
AND P.CUR_HIER_LEVEL(+) = :B17) INNER
WHERE INNER.VARIATION_COUNT = 0 AND INNER.ROW_CNT = 1
Thanks for this very good info. Can you confirm that the SQL you posted is the updated one, with the grouping fix for parallel execution? :-)
Also, just to clarify: is it correct to assume that the average TEMP usage per execution is therefore TEMP_USAGE / TOTAL_EXECS?
Can we assume whether a SQL is parallel or not by the value of PX_FLAGS?
Any reply much appreciated. Thanks
The plan shows no parallelism.