The Pitfalls of Oracle’s Automatic SGA Management

Posted in: Technical Track

Of late, Oracle has had the tendency to automate a lot of the usual DBA tasks.

I didn’t complain on autoextensible datafiles, I just had to make sure my disk had enough space to handle the maximum potential space requirement. I also didn’t complain too loudly when a file extended to a size that triggered an Oracle bug and caused the database to crash irrecoverably. It was 8i after all, and things have improved since. Or have they?

I was mildly amused by automatic PGA and for the most part it worked well.
The default dba_scheduler job to refresh statistics interfered with my own job to gather statistics, and the schedule was the worst possible for the db I first saw it on, after an upgrade from 9i to 10.1 — but that was a minor irritant and quickly resolved.

However, I have to draw the line at automatic SGA management. I thought it seemed like a good feature and left it at its default in a new 10.2.0.2 database that I recently set up. As production ramps up on the db, I can see how a combination of application issues (non-bound SQL) and automatic SGA can quickly cause problems. The database has been running for about a week now. The application has very little bound SQL and our best efforts to get the developers to start binding SQL have been mostly unsuccessful, since they are hard–pressed to maintain the legacy code, which is mostly spaghetti.

So here’s how the SGA looks after one week of running in production:

SQL> show sga

Total System Global Area 7516192768 bytes
Fixed Size                  2082856 bytes
Variable Size            4513073112 bytes
Database Buffers         2952790016 bytes
Redo Buffers               48246784 bytes
SQL>

Wow… 4.5G of the 7.5G or so is “Variable Size” — I would expect that the db cache should have been bigger than the variable size, but no. The db cache, which started out at 6G+ is down to half its size.

Lets see where the variable size is allocated:

SQL> select component,current_size/1024/1024 as "Size MB" from v$sga_dynamic_components
2  /

COMPONENT                                                           Size MB
---------------------------------------------------------------- ----------
shared pool                                                            4272
large pool                                                               16
java pool                                                                16
streams pool                                                              0
DEFAULT buffer cache                                                   2816
KEEP buffer cache                                                         0
RECYCLE buffer cache                                                      0
DEFAULT 2K buffer cache                                                   0
DEFAULT 4K buffer cache                                                   0
DEFAULT 8K buffer cache                                                   0
DEFAULT 16K buffer cache                                                  0
DEFAULT 32K buffer cache                                                  0
ASM Buffer Cache                                                          0

13 rows selected.

SQL>

4.2G in the shared pool. It was just under 4G yesterday, it’s still growing!
Oracle thinks it can make queries run better by not having to parse them repeatedly, so it grows the shared pool to keep as many queries as it can. Problem: the shared pool is now 4G+. That takes a while to trawl through, which of course adds to the spins on the library cache lock and pushes up CPU usage. And of course I/O usage is high(er) because the db cache is now lower. A lose-lose situation I think. I’m going to get into more detail on this when I have the time, but for this db, I think I’m going to switch to manual SGA and set a hard limit of 500M-1G on the shared pool. There isn’t any point in trying to hold on to what is going to be mostly–unique SQL anyway.

Oracle, please spare me the “automatic” stuff unless you can make it a little more intelligent!

–Raj.

email

Author

Want to talk with an expert? Schedule a call with our team to get the conversation started.

About the Author

Raj has been managing and guiding highly effective teams for over 20 years. Raj combines strong leadership skills with a deeply technical background in IT consulting to bring tailored and efficient technology solutions to Pythian. Additionally, his focus on data security has helped Pythian meet and often exceed client and regulatory requirements. Prior to joining Pythian in 2003, Raj held various leadership positions with a proven record of success. He has a degree in Industrial Engineering and has worked in multiple IT and engineering fields.

4 Comments. Leave new

Raj, there is much worse to come!

Wait until you have to run multiple schemas in same db, with same table names.
If you’re using ANY of the automatic sga and pga management features since 9ir1, it’s odds on that on a loaded system you’ll see the bugs I alert to in my blog.

They are caused by appaling QA of the pga_aggregate code and other auto memory management parameters.

As well: try to use BLOBs on ASSM tablespaces in any 9i or 10g and aside from the very latest – and unproven – patch releases, I can virtually guarantee you’ll get a corrupt LOB segment as soon as the load goes up!

And Oracle wants us to go through all the trouble of installing regular “patch releases” when they can’t fix the most basic of basics? Helllooooo?

All I can say at this stage is: use “auto-anything” with a grain of salt bigger than Ben-Hur…

Reply

Just going through this entry, a question comes up on my mind. You have mentioned that the application uses very little binding. Isn’t that a basic design flaw ? Is it fair for us to expect Oracle to automatically fix mistakes in application design?

Reply

We ran into the issue you described too. The trick is to set minimum values for the pools, especially the default buffer cache, that way when your unbound queries dry up the other pools, they can only dry them up to a point, and over time they’ll balance out again.

We’re using 10.2.0.3, and we’ve loaded many terrabytes of *lobs into ASSM, and we haven’t had any problems. In fact, it stores them more efficently than the OS was storing them…we use less space by putting them in the DB then we did when they were out of the DB. I think this is because the full OS block is allocated in the OS, but ASSM is a little more efficient. I wonder in what version you experienced issues, Noons? We have had the shared cursor issue you mentioned, that’s a scary one…and I know of at least 3 different bugs that cause that. Basicly, Oracle matches the sql text in the shared pool for your query, and doesn’t consider the schema that executed that query…so 2 scheams w/the same table name will get the same result, even if they have different data in the tables.

I agree krisgopala, but the problem is the impression from the users…they don’t know about the unbound sql, they only see the db running slowly and they blame Oracle, and of course, the DBA. Still, by setting minimum values, you can compensate for the bad code…a little bit.

Reply
Syed Shabuddin
November 19, 2015 2:32 am

My Question are… In 11gR2. using Oracle application R12.2.4.

* Can we set manually –>>>> Shared Pool after setting SGA_TARGET dynamically.???
* CPU high in select by custom query can be reduce by Shared Pool.??
* What is the role Buffer Cache parameter during setting the Shared Pool???

Regards,
Syed Shabuddin

Reply

Leave a Reply

Your email address will not be published. Required fields are marked *