Tackling time troubles – how to use dates correctly in Oracle

Posted in: Oracle, Technical Track

For the Oracle Code series of events I have delivered an older but still relevant talk called “Tackling Time Troubles.” Half of it is nice-to-know geek trivia and the other half is a few points on how to avoid very common mistakes when dealing with these datatypes in an Oracle database.

If you don’t have the patience to watch the full talk, here is an outtake.

Avoid functions on datetime columns in your predicates – part 1

So this first one is a query against a somewhat large table. I created it by selecting dba_objects into a new table several times and created a regular index on the last_ddl_time column. This results in a “date” column. Unlike MySQL and postgres, “date” in Oracle also stores the hours, minutes and seconds and a common (but far from great) way to find all rows that have a date on a given date is to use the trunc() function which will set all components that are more specific than the day (hours, minutes and seconds) to 00, making the comparison to a single date an equality check.

SQL> SELECT COUNT(*)
FROM bigtab
WHERE TRUNC(last_ddl_time) = to_date('20170908','YYYYMMDD');
 COUNT(*)
----------
 1772
Elapsed: 00:00:07.23

Execution Plan
----------------------------------------------------------
Plan hash value: 2140185107
-----------------------------------------------------------------------------
| Id | Operation        | Name   | Rows | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT  |        | 1    | 8     | 427 (1)    | 00:00:01 |
| 1 | SORT AGGREGATE    |        | 1    | 8     |            |          |
|* 2 | TABLE ACCESS FULL| BIGTAB | 908  | 7264  | 427 (1)    | 00:00:01 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
 2 - filter(TRUNC(INTERNAL_FUNCTION("LAST_DDL_TIME"))=TO_DATE('
 2017-09-08 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

At this point you are wondering what’s wrong with that query. Below I have another one that looks somewhat more complicated and longer and uses “between conditions“. It does the same thing as the query above and returns the same results. But it does it 35x faster.

SQL> SELECT COUNT(*)
FROM bigtab
WHERE last_ddl_time BETWEEN to_date('20170908','YYYYMMDD')
 AND to_date('20170908-23:59:59','YYYYMMDD-HH24:MI:SS');
 COUNT(*)
----------
 1772
Elapsed: 00:00:00.21

Execution Plan
----------------------------------------------------------
Plan hash value: 397841597
--------------------------------------------------------------------------------
| Id | Operation        | Name       | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0  | SELECT STATEMENT |            | 1    | 8     | 7 (0)      | 00:00:01 |
| 1  | SORT AGGREGATE   |            | 1    | 8     |            |          |
|* 2 | INDEX RANGE SCAN | BIGTAB_IDX | 228  | 1824  | 7 (0)      | 00:00:01 |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
 2 - access("LAST_DDL_TIME">=TO_DATE(' 2017-09-08 00:00:00',
 'syyyy-mm-dd hh24:mi:ss') AND "LAST_DDL_TIME"<=TO_DATE(' 2017-09-08
 23:59:59', 'syyyy-mm-dd hh24:mi:ss'))

The explanation for the difference here is that Oracle cannot use a regular index if you wrap the indexed column inside of a function. One may argue that a function based index on trunc(last_ddl_time) would solve this but replacing the index would also mean that all queries would now have to use the trunc() function or do without an index. As a bonus you also have the chance for partition pruning if you are not using a function. If you are lucky enough to be running on Exadata the same is true for smart scans and storage indexes.

Avoid functions on datetime columns in your predicates – part 2

For this second example, let’s consider that there is no index on the column in the predicate and compare the two syntaxes again. First, the good example with “between” taking about 5s.

SQL> SELECT COUNT(*)
FROM bigtab
WHERE created BETWEEN to_date('20170908','YYYYMMDD')
 AND to_date('20170908-23:59:59','YYYYMMDD-HH24:MI:SS');
 COUNT(*)
----------
 408
Elapsed: 00:00:05.13

In this case, the lazier syntax with trunc() took more than twice as long to run even though both queries are performing a (fully cached) full table scan.

SQL> SELECT COUNT(*)
FROM bigtab
WHERE TRUNC(created) = to_date('20110908','YYYYMMDD');
 COUNT(*)
----------
 408
Elapsed: 00:00:12.32

The reason for the difference in execution time here is that with the trunc() syntax Oracle still has to apply the function to every row it scans, taking up extra context switches and CPU cycles even though the amount of logical IOs is the same.

Conclusion

I hope I made the point clearly enough to not use functions like trunc() in the where-clause of your queries. The “between” syntax is more verbose and requires a little more typing but you get rewarded with better performance because Oracle may still use regular indexes, partition pruning, smart scans and at the very least avoid context switching to evaluate a function.

email
Want to talk with an expert? Schedule a call with our team to get the conversation started.

About the Author

Björn Rost is an Oracle ACE Director, and one of Pythian’s top Oracle experts. A popular presenter, he travels the world attending technology conferences, sharing insights, and learning with his wide network of peers. Björn also serves as president of IOUG’s RAC special interest group. He is always challenging himself, personally through physical activities including triathlons, and professionally through his ongoing quest to increase his knowledge of Oracle and other leading technologies.

6 Comments. Leave new

Alex Tokarev
April 3, 2018 8:14 am

I would not recomend to use BETWEEN to simulate TRUNC. I would suggest to use date_field >= ‘your date’ and date_field < 'your date' + 1. It makes your code more stable when someone decides to use timestamps rather than dates.

Reply

The explanation for the difference here is that Oracle cannot use a regular index if you wrap the indexed column inside
of a function. One may argue that a function based index on trunc(last_ddl_time) would solve this but replacing the
index would also mean that all queries would now have to use the trunc() function or do without an index. As a bonus you
also have the chance for partition pruning if you are not using a function. If you are lucky enough to be running on
Exadata the same is true for smart scans and storage indexes.

–//right ?
create index if_emp_hiredate on emp(trunc(hiredate));

select * from emp where hiredate=to_date(‘1980/12/17′,’yyyy-mm-dd’);

EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO
———- ———- ——— ———- ——————- ———- ———- ———-
7369 SMITH CLERK 7902 1980-12-17 00:00:00 800 20

select * from table(dbms_xplan.display_cursor(NULL,NULL,’ALL ALLSTATS LAST PEEKED_BINDS cost partition -projection -outline’));
PLAN_TABLE_OUTPUT
————————————-
SQL_ID 1ct8dum3uyy7u, child number 0
————————————-
select * from emp where hiredate=to_date(‘1980/12/17′,’yyyy-mm-dd’)

Plan hash value: 4059437819

————————————————————————————-
| Id | Operation | Name | E-Rows | Cost (%CPU)|
————————————————————————————-
| 0 | SELECT STATEMENT | | | 2 (100)|
|* 1 | TABLE ACCESS BY INDEX ROWID BATCHED| EMP | 1 | 2 (0)|
|* 2 | INDEX RANGE SCAN | IF_EMP_HIREDATE | 1 | 1 (0)|
————————————————————————————-
Predicate Information (identified by operation id):
—————————————————
1 – filter(“HIREDATE”=TO_DATE(‘ 1980-12-17 00:00:00’, ‘syyyy-mm-dd
hh24:mi:ss’))
2 – access(“EMP”.”SYS_NC00009$”=TRUNC(TO_DATE(‘ 1980-12-17 00:00:00’,
‘syyyy-mm-dd hh24:mi:ss’)))

–//using function index!!

Reply

That’s a great counter-example to my very general statement above. Your example relies on the dates stored in the table following a specific constraint and this only works if the hiredates have no time component to it. (as it should be). Unfortunately more often than not I would see new hires added like this:

insert into emp (empno, ename, hiredate) values (42, ‘Bjoern Rost’, sysdate);

That would break the query as it would return 0 results unless I was lucky enough to run the insert at midnight. One would have to make sure that the application that inserts data already truncates the dates to remove hours, minutes and seconds. Just to be extra cautious and because we cannot trust applications, this would also call for a check constraint on the column.

I believe my general statement of “do not use trunc() and other functions on columns in predicates” still stands.

Of course, if your data is clean enough work without trunc() (ie: hiredates only contain year, month, day), you can easily query for equality in which case index access paths (and as proven by you even function based indexes) can be used.

Reply
Alex Tokarev
April 4, 2018 8:40 am

The intention of the example is clear enough but it shouldn’t be error prone in the light of possible ammendments. My point is that such examples should not only show possible issues but how to implement stable applications.

Reply

[email protected]> update (select * from emp where hiredate=to_date(‘1980/12/17′,’yyyy-mm-dd’)) set hiredate=hiredate+1/86400;
1 row updated.

[email protected]> commit ;
Commit complete.

[email protected]> select EMPNO,ENAME,HIREDATE from emp where empno=7369;
EMPNO ENAME HIREDATE
———- ———- ——————-
7369 SMITH 1980-12-17 00:00:01

select * from emp where hiredate=to_date(‘1980/12/17 00:00:01′,’yyyy-mm-dd hh24:mi:ss’);
..

PLAN_TABLE_OUTPUT
————————————-
SQL_ID 78y07y4vfd9a7, child number 0
————————————-
select * from emp where hiredate=to_date(‘1980/12/17
00:00:01′,’yyyy-mm-dd hh24:mi:ss’)
Plan hash value: 4059437819
——————————————————————————————————–
| Id | Operation | Name | E-Rows |E-Bytes| Cost (%CPU)| E-Time |
——————————————————————————————————–
| 0 | SELECT STATEMENT | | | | 2 (100)| |
|* 1 | TABLE ACCESS BY INDEX ROWID BATCHED| EMP | 1 | 38 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IF_EMP_HIREDATE | 1 | | 1 (0)| 00:00:01 |
——————————————————————————————————–

[email protected]> select * from emp where hiredate between to_date(‘1980/12/17′,’yyyy-mm-dd’) and to_date(‘1980/12/18′,’yyyy-mm-dd’);
..
Plan hash value: 4059437819
——————————————————————————————————–
| Id | Operation | Name | E-Rows |E-Bytes| Cost (%CPU)| E-Time |
——————————————————————————————————–
| 0 | SELECT STATEMENT | | | | 2 (100)| |
|* 1 | TABLE ACCESS BY INDEX ROWID BATCHED| EMP | 1 | 38 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IF_EMP_HIREDATE | 1 | | 2 (0)| 00:00:01 |
——————————————————————————————————–
Query Block Name / Object Alias (identified by operation id):
————————————————————-
1 – SEL$1 / [email protected]$1
2 – SEL$1 / [email protected]$1

Predicate Information (identified by operation id):
—————————————————

1 – filter((“HIREDATE”=TO_DATE(‘ 1980-12-17 00:00:00’, ‘syyyy-mm-dd hh24:mi:ss’)))
2 – access(“EMP”.”SYS_NC00010$”>=TRUNC(TO_DATE(‘ 1980-12-17 00:00:00’, ‘syyyy-mm-dd
hh24:mi:ss’)) AND “EMP”.”SYS_NC00010$”<=TRUNC(TO_DATE(' 1980-12-18 00:00:00', 'syyyy-mm-dd
hh24:mi:ss')))

Reply
Alex Tokarev
April 6, 2018 2:46 am

BETWEEN shouldn’t be used here because it takes 2 days rather than 1. That’s why I suggested

date >= searched date and date < searched date + 1 (without equity sign)

to not use trunc and 59:59:59

Reply

Leave a Reply

Your email address will not be published. Required fields are marked *