One of the lesser known new features that comes as a boon to many developers and DBAs is the provision of implicit handling of large data strings using scalar data types like VARCHAR2 and RAW.
When creating tables, each column must be assigned a data type, which determines the nature of the values that can be inserted into the column. Common data types include number, date, and varchar2. These data types are also used to specify the nature of arguments for PL/SQL programs like functions and procedures.
When choosing a data type, you must carefully consider the data you plan to store and the operations you may want to perform upon it. Making good decisions at the table design stage reduces the potential negative downstream impact on space utilization and performance. Space is a consideration since some data types occupy a fixed length, consuming the same number of bytes, no matter what data is actually stored in it.
In pre-12c databases, long characters strings of more than 4000 bytes had to be handled using creative solutions including: CLOB or LONG data types and multiple columns or variables. These approaches led to inefficient unnecessarily complex designs and added processing overheads.
12c introduced the MAX_STRING_SIZE system parameter that allows string data types to be much larger when the parameter is changed from its default value of STANDARD to EXTENDED. The VARCHAR2 data type, stores variable length character data from 1 to 4000 bytes if MAX_STRING_SIZE=STANDARD or up to 32767 bytes if MAX_STRING_SIZE=EXTENDED.
RAW and NVARCHAR2 data types are affected in similar ways.
Potential issues to consider:
- Internally, extended data types are stored out-of-line using LOBs, but these cannot be manipulated using the DBMS_LOB interface.
- When changing the MAX_STRING_SIZE parameter, objects may be updated invalidating dependent objects, so ideally, change this parameter during a maintenance window in your important databases.
- List partitioning on EDT columns may potentially exceed the 4096 byte limit for the partition bounds. The DEFAULT partition may be used for data values that exceed the 4096 byte limit or a hash function may be used on the data to create unique identifiers smaller than 4096 bytes.
- Indexing EDT columns may fail with “maximum key length exceeded” errors. For example, databases with an 8k default block size support a maximum key length of approximately 6400 bytes. A suggested work-around is to use a virtual column or function-based index to effectively shorten the index key length.
This feature will no doubt be improved and the shortcomings will be dealt with in future releases—but for now, it offers a clean and elegant mechanism for handling large character data within existing applications requiring minimal code changes.
Interested in working with Roopesh? Schedule a tech call.