Automatic degree of parallelism, or Auto DOP, is a new feature in 11gR2 that promises to help manage systems where large subset of the workload runs with parallel processing. In this post I’ll introduce the feature and give very useful tips I got from Oracle’s Real World Performance expert Greg Rahn on how to use it. So this is worth reading even if you are familiar with the feature.
The problem is fairly well known – you system only has finite amount of resources. Only so many CPUs, only so many disks capable of delivering only so many IO/s and MB/s. A certain query may have amazing performance when running with 32 parallel processes all alone on your test system. When 5 people need to run it at once, and at the same time there are two scheduled jobs running each with its own parallel processes, there are two likely outcomes:
- You will run more parallel processes than your system is capable of serving. Resulting in long queues on the CPU and storage, and overall performance degradation.
- You limit the maximum number of parallel processes to protect the database resources, and some of the queries degrade. If you don’t detect it, the ETL process that should have finished in two hours takes 24, which means that the daily report sent to the CEO is missing some of the data. Ouch.
Margaret Norman gave a presentation at HotSos last year on how to decide on the maximum DOP per workload, to allow several workloads to run concurrently and share resources in a way that maximizes performance. The process is long and tedious, and I’m not convinced that it results in optimal performance. What’s worse, it has to repeat every time the environment changes. It is a brave effort to solve a difficult problem, but it is still a pain.
So Auto DOP is Oracle’s solution, and at least in the white papers and presentations it sounds like a very attractive solution. There are two levels to Auto DOP:
- Limited – when accessing tables and indexes that have been declared with parallel clause, Oracle will decide on the degree of parallelism based on the query and system resources.
- Auto – Oracle will decide on degree of parallelism for every query. In addition two exciting new features are enabled: parallel statement queuing and in-memory parallel execution.
Parallel statement queuing is in itself a reason to use auto-DOP: When Oracle determines that a statement should run with 8 parallel processes, but only 6 are available – Oracle will not degrade the statement. It will queue it and execute when the processes become available. This is a huge deal. The ETL process that took 24 hours because it degraded? Now it can queue for an hour while waiting for that index rebuild to finish, and then run with the optimal DOP, finishing just an hour later than expected and not 22 hours late.
In-memory parallel execution means that Oracle will not automatically use direct path for parallel processing IO, instead, depending on the size of the object, its volatility and the size of the buffer cache, Oracle may decide to load the blocks that are needed by the query to the buffer cache. If this happens on Oracle RAC, the blocks will be divided between the nodes, and Oracle implements node affinity – it prevents cache-fusion from getting these blocks, instead only the parallel process running on the server when the blocks reside can access these blocks. This almost turns Oracle into an im-memory share-nothing architecture, which is a very efficient way to do parallel processing.
Auto DOP is only useful in systems with enough CPUs and disks to make concurrent parallel processing feasible. It was designed mostly for Exadata, or for systems of similar size. 48 CPUs and 84 disks sound like a nice start.
One fateful Friday night, Fahd Mirza and I attempted to implement Auto-DOP for our favorite customer. The resulting experience left Fahd awake for the next 48 hours, and left me a bit scarred as well. Since I keep running into Oracle employees in conferences who keep insisting that Auto DOP is the best thing since sliced bread, I kept pestering them with questions on why things don’t always work as expected.
I caught Greg Rahn at a pre-OOW party, and he was nice enough to answer all my questions about this feature. I suggested he’ll blog about it, but he seems to have better things to do, so I’m summarizing his thoughts for the public. Note that the rest of this blog post is based on notes I hastily sketched the morning after a party where excellent scotch was served. The good ideas are by Greg, the mistakes are mine alone, and as Christo says: Don’t believe me, test and verify.
Without farther ado, very important notes about Auto DOP:
You can’t use Auto DOP without first running DBMS_RESOURCE_MANAGER.CALIBRATE_IO().
Because Oracle needs to know how many resources (IO/s and MB/s) it can allocate when deciding on DOP. This much is in the documentation. CALIBRATE_IO takes two parameters – number of disks available, and the maximum IO latency you are willing to tolerate. Here’s the catch – while most of the time you know the number of disks (you are *not* running your massively parallel data warehouse in the cloud, yes?), no one really knows the maximum IO latency to use in the function. When I say “no one” I mean it. I asked the entire Oak Table and a large number of Oracle employees. So, this procedure is useless in practice, what’s one to do?
So do what Oracle (and everyone else) does: You manually set MAX_PMBPS – “maximum megabytes per second of large I/O requests that can be sustained by a single process'”. Which is apparently the only value auto DOP cares about. To do that:
delete from resource_io_calibrate$;
insert into resource_io_calibrate$
values(current_timestamp, current_timestamp, 0, 0, 200, 0, 0);
And restart the database.
Where did 200 come from? Thats the number Oracle uses internally to test Auto DOP on the Exadata. This is also the number Oracle Support recommends for Exadata customers.
What if you don’t have Exadata? Chances are that if your system is large enough to warrant Auto DOP, it is similar enough to Exadata to behave nicely with this number. But the big question is – how will I know if this was not configured right? Turns out that this number controls the minimum DOP per query. If you set MAX_PMBPS to 200 and you see Auto DOP select too few parallel processes for your tastes, halve the number and you’ll get more processes. What if the DOP are too high? Use PARALLEL_DEGREE_LIMIT to limit the maximum number of parallel processes per query.
Does it sound like tuning the minimum DOP requires a significant number of node bounces? It may. Thats why you use Auto DOP and tune it in test system first. Be reasonable, while it is a simple change in the system – it has significant potential impact. Like changing every single execution plan in your database. Treat it like an upgrade and test it.
Note that Auto DOP can also be set per session, so you can choose to only queue the queries from BI users. This is useful because there are some known bugs regarding Auto DOP (and definitely. For example, DOP is calculated incorrectly for statements that combine insert with select. If you hit a bug, you can still use Auto DOP for users that will not be impacted by it.
Resource manager can be used to limit DOP for a specific group of users. This can be a good idea with or without Auto DOP. Note that resource manager DOP limit is per instance. If you want to limit parallel degree for a user globally, set a service with one node and force the user to connect to this service. Parallel slaves will still run on all nodes, but the limit will be enforced.
Do you have other Auto DOP hints? Want to share your experience with it? There are almost no real world stories about Auto DOP on the web. Its all either Oracle papers and blogs (which are excellent, but don’t replace customer experience) and enterprising bloggers who played with it a bit on their test systems. So, if you use Auto DOP on production – please share how it works for you!