Performance Settings of Concurrent Managers

This is the second article in a series about internals and performance of concurrent managers. In this post, we’ll take a look at three important settings that affect the performance of concurrent managers: number of processes, “sleep seconds”, and “cache size”. This article might be a bit on the theoretical side, but it should provide a good understanding of how these settings actually affect the behavior and performance of concurrent managers. Most of the statements in this article build on information from my previous post, The Internal Workflow of E-Business Suite Concurrent Manager Process, so it may be helpful to read that one before continuing.

Life cycle of a concurrent request

The interesting thing about tuning concurrent managers is the fact that we don’t tune a particular query or a running process, but we actually tune the pending time of concurrent requests. The goal of the tuning is to make sure concurrent requests start executing soon enough after the time they have been scheduled for. Let’s take a look at the life cycle of a concurrent request:

The Life cycle of a Concurrent Request (CR)

Based on the diagram above, the pending time of the request is the interval between the time the request was scheduled to start and the time it actually started. This time can be split in two parts:

  1. Pending for Conflict Resolution Manager (CRM) – Here the CRM checks the incompatibility rules effective for the pending concurrent request against other running requests. The CRM allows the request to execute only when all incompatible requests have completed.
  2. Pending for Concurrent Manager (CM) – This is the time spent waiting for an available concurrent manager process. It also includes the time the CM process takes to fetch the request from the FND_CONCURRENT_REQUESTS table and to start executing it. “Pending for CM” is the interval that can be tuned by altering the number of manager processes, the “sleep seconds”, and the “cache size” settings.
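
To make the two intervals concrete, here is a minimal Python sketch that splits the pending time of a single request. It assumes three timestamps are known: the scheduled start, the moment the CRM released the request, and the actual start. The function and parameter names (split_pending, released_by_crm, and so on) are hypothetical and exist only for this illustration; they are not Oracle columns or APIs.

```python
from datetime import datetime


def split_pending(requested_start, released_by_crm, actual_start):
    """Split a request's pending time (in seconds) into its two parts.

    All three arguments are datetime values. released_by_crm is an assumed
    timestamp: the moment the CRM allowed the request to run.
    """
    # Part 1: waiting for the Conflict Resolution Manager.
    crm_wait = max(0.0, (released_by_crm - requested_start).total_seconds())
    # Part 2: waiting for a free manager process to pick the request up.
    cm_wait = max(0.0, (actual_start - released_by_crm).total_seconds())
    return crm_wait, cm_wait


crm_wait, cm_wait = split_pending(
    datetime(2013, 3, 27, 10, 0, 0),   # scheduled start
    datetime(2013, 3, 27, 10, 0, 5),   # released by the CRM
    datetime(2013, 3, 27, 10, 0, 17),  # actually started
)
print(crm_wait, cm_wait, crm_wait + cm_wait)
```

Only the second value, the “Pending for CM” part, responds to the three settings discussed in this post.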

Requirements

Understanding the requirements is a mandatory step for any tuning attempt. Otherwise, it’s hard to know when to stop tuning, as it can become unclear if the performance is sufficient or not. When you’re tuning concurrent managers, the requirements can be defined by answering a simple question: How long is the request allowed to stay pending after the scheduled start? You should also keep in mind the following items while thinking about the answer:

  • Be realistic – “Requests should start immediately” is not a valid answer. It’s simply impossible because of how concurrent managers work. If there is anything you have to run immediately, concurrent programs are not the correct way of doing it.
  • Think of groups of concurrent programs – If you have similar requirements for a group of requests (e.g. a program printing invoices while the customer is waiting on-site should start executing within 10 seconds of being submitted), a dedicated concurrent manager should be implemented for them.
  • Unsure of requirements? – If the requirements are not known, ask the end users if they have experienced situations when requests stay in the queue for too long. Was it because the managers couldn’t cope with the number of incoming requests? If so, the settings might be too low.
  • Work Shifts – If the requirements differ at different times of the day, concurrent manager Work Shifts can be used to define different settings depending on the time of the day.

Settings

The settings related to concurrent managers are explained in the documentation: Oracle E-Business Suite System Administrator’s Guide – Configuration. However, I find the explanations are often too unclear to tune the concurrent managers effectively. In this section I’ll reveal the basic principles behind each of the three main settings and describe how I use them to control the performance of concurrent managers.

Number of Processes

From the Documentation: “The number of operating system processes you want your work shift to run simultaneously. Each process can run a concurrent request.”

  • It’s important to add that there is no coordination between the processes of a concurrent manager. If all manager processes queried the FND_CONCURRENT_REQUESTS table at exactly the same time, they would all read exactly the same information about pending requests. A simple row-locking mechanism is then used to make sure each concurrent request is executed by a single manager process.
  • Choosing the correct number of concurrent processes is not easy since the workloads are not constant and the number of pending requests can vary. Here are some clues you should consider (I’m also planning a future blog post about measuring the actual utilization levels of concurrent managers):
    • Don’t configure too many processes! Your hardware is limited. If you configure too many processes you’ll exhaust the server resources. For example, if you run lots of big Oracle Reports (CPU-intensive workload) and you have 4 CPUs on your only web/forms/concurrent node, configuring 8 processes for the concurrent manager is risky. If 8 reports were to be generated at the same time, the users could experience web/forms slowness. The same applies to the database tier. Additionally, if too many processes are configured, they can use a significant amount of resources even when they are idle. This is especially important for RAC configurations. (This too will be explained in one of my upcoming blog posts.)
    • Don’t configure too few processes! If the number of processes is insufficient, the requests will start queuing. If that becomes an issue, increase the number of processes slightly. Consider defining a Work Shift with a different number of processes at different times of the day to accommodate your requirements.
    • Don’t be afraid of queuing! Queuing is normal, especially if all requests still manage to start as expected based on the requirements. If you see recurring queuing at particular times, check with the users to see if that is going to be a problem.
    • Start low! If you’re unsure of what setting should be used, start with a low number of processes. Check if any queuing occurs and if the users start complaining.
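
The competition between uncoordinated manager processes can be sketched in a few lines of Python. This is only an analogy, not how the managers are implemented: threads stand in for manager processes, and one threading.Lock per request stands in for the row lock in FND_CONCURRENT_REQUESTS. The names run_managers and claims are made up for the example.

```python
import threading


def run_managers(num_managers, request_ids):
    """Return a list of (request_id, manager_id) pairs, one per executed request."""
    # One lock per request plays the role of the row lock on that request.
    row_locks = {rid: threading.Lock() for rid in request_ids}
    claims = []  # list.append is thread-safe in CPython

    def manager(manager_id):
        # Every manager reads the same pending list; there is no coordination.
        for rid in request_ids:
            # Non-blocking attempt: a request locked by someone else is skipped.
            if row_locks[rid].acquire(blocking=False):
                # The lock is never released, so the request stays taken.
                claims.append((rid, manager_id))

    threads = [threading.Thread(target=manager, args=(m,)) for m in range(num_managers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return claims


claims = run_managers(4, list(range(20)))
print(len(claims))  # one entry per request, whichever manager won the lock
```

Which manager ends up with which request differs from run to run, but every request is claimed exactly once. Adding more managers does not change that; it only increases the number of lock attempts that fail.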

Sleep Seconds

From the Documentation: “The sleep time for your manager during this work shift. Sleep time is the number of seconds your manager waits between checking the list of pending concurrent requests (concurrent requests waiting to be started).”

  • The documentation is inaccurate – It should clearly state that it’s effective for each manager process – i.e. if you have 5 concurrent manager processes for the Standard Manager and the “Sleep Seconds” setting is set to 30, then the average time between checks for pending requests (if all managers are idle) is 30 / 5 = 6 seconds.
  • A manager process sleeps only when there are no pending requests – The manager process checks the requests queue immediately after it has processed the last request it fetched. A common misconception is that if the rate of incoming requests is very high, the “Sleep Seconds” should be low to process all of them quickly. Not true! If the rate of incoming requests is high, there is a good chance some requests will be executing at any given time. So, when they complete, the requests queue will be checked immediately and the new requests will be started.
  • What value to use? – Calculate it! Three parameters are needed to estimate the “Sleep Seconds” (S) setting: the number of manager processes (N), the average utilization level (U) of the concurrent managers (a metric I plan to cover in a future post), and the average time a request is allowed to stay pending (T). As “Sleep Seconds” is effective only for the idle processes, it can be calculated using the following: S = N * (1 – U) * T.
    • Example 1: if N = 5 processes, U = 20%, T = 20 seconds – let’s calculate the “Sleep Seconds” setting: S = 5 * (1 – 0.2) * 20 = 5 * 0.8 * 20 = 80 seconds. It seems high, but think about it – if the average utilization of 5 processes is 20%, then on average there are 4 idle processes at any given time. Each of these will have a sleep interval of 80 seconds, so on average the requests queue will be checked every 80 / 4 = 20 seconds.
    • Example 2: if N = 3 processes, U = 90%, T = 20 seconds: S = 3 * (1 – 0.9) * 20 = 3 * 0.1 * 20 = 6 seconds. This example reveals a problem: the calculated “Sleep Seconds” value is lower than the pending-time requirement itself, which means the requirement can’t be met with this number of processes. Think about it – we have 3 processes, each utilized 90% of the time, so on average only 3 * 0.1 = 0.3 processes are idle. It’s impossible to meet the 20 seconds goal because all managers are busy most of the time. There simply aren’t enough processes to execute the incoming requests. The defined requirements can be reached only if at least one manager process is idle. That is also the “perfect world” scenario: all managers but one are busy, so all new requests are picked up in time and the processing overhead of the idle manager processes is minimal.
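
The calculation above is easy to script. The following Python sketch implements S = N * (1 – U) * T together with the two sanity checks used in the examples: the average interval between queue checks, and the “at least one idle process” condition. The function names are my own for this illustration, and U is passed as a fraction (0.2 for 20%).

```python
def idle_processes(n_processes, utilization):
    """Average number of idle manager processes: N * (1 - U)."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return n_processes * (1.0 - utilization)


def sleep_seconds(n_processes, utilization, max_pending):
    """Sleep Seconds estimate: S = N * (1 - U) * T."""
    return idle_processes(n_processes, utilization) * max_pending


def check_interval(sleep, n_processes, utilization):
    """Average seconds between queue checks across all idle processes."""
    idle = idle_processes(n_processes, utilization)
    if idle <= 0:
        raise ValueError("no idle processes: the queue is only checked after a request completes")
    return sleep / idle


def requirement_reachable(n_processes, utilization):
    """The requirement can be met only if at least one process is idle on average."""
    return idle_processes(n_processes, utilization) >= 1.0


for n, u, t in [(5, 0.2, 20), (3, 0.9, 20)]:
    s = sleep_seconds(n, u, t)
    print(n, u, t, round(s, 2), requirement_reachable(n, u))
```

The first case corresponds to Example 1 (S = 80, reachable) and the second to Example 2 (S = 6, not reachable). check_interval(30, 5, 0.0) reproduces the 30 / 5 = 6 seconds figure for five idle Standard Manager processes.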

Cache Size

From the Documentation: “The number of requests your manager remembers each time it reads which requests to run.”

  • Almost useless setting – Unless you have a manager with only one running process. If multiple manager processes are running, there is a good chance that most of the cached requests will be processed by other manager processes while the first request is running (remember, the processes don’t coordinate the work – they compete for it).
    • Example: There are 10 manager processes, and 10 requests are submitted. One of the manager processes starts executing the 1st request, and the other managers start running the remaining requests. So, by the time the 1st request completes, all the cached requests will be obsolete, but the manager process will try to lock the corresponding rows in the FND_CONCURRENT_REQUESTS table anyway, and will fail for all 9 requests. It will then immediately query the queue to check if more requests are pending.
    • I think it’s best to set the “Cache Size” to 1 so the hardware resources aren’t spent on trying to lock already processed requests, but rather on checking the requests queue.
  • Request priorities are cached too – If you have a cache size greater than 1, keep in mind the request priorities are cached too. If the priority is changed for a cached request, the manager process will not notice it.
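
The 10-managers example can be replayed with a small, deterministic Python simulation. It is a simplified model under one strong assumption: while the first manager runs its first request, every other manager picks up exactly one of the remaining requests. The function name failed_lock_attempts and its parameters are hypothetical.

```python
def failed_lock_attempts(num_managers, num_requests, cache_size):
    """Count the stale cache entries the first manager fails to lock."""
    pending = list(range(1, num_requests + 1))
    cache = pending[:cache_size]  # the first manager reads the queue once
    if not cache:
        return 0
    locked = {cache[0]}  # it locks and runs the first cached request

    # Assumption: meanwhile each other manager locks the next free request.
    for _ in range(num_managers - 1):
        free = [r for r in pending if r not in locked]
        if not free:
            break
        locked.add(free[0])

    # The first manager finishes and walks through the rest of its cache.
    failed = 0
    for request in cache[1:]:
        if request in locked:
            failed += 1          # lock attempt on an already taken request
        else:
            locked.add(request)  # found a free one and starts running it
            break
    return failed


print(failed_lock_attempts(10, 10, 10))  # 10 managers, cache of 10
print(failed_lock_attempts(10, 10, 1))   # same load, cache of 1
print(failed_lock_attempts(1, 10, 10))   # a single manager process
```

With 10 managers and a cache of 10, all 9 remaining cache entries are stale, matching the example above. With a cache of 1 there is nothing stale to retry, and with a single manager process the cache is fully usable.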

Summary

I have to admit that this article turned out to be much more complicated to write than I expected, even though there are just a few settings to describe. The problem is that the concurrent processing environment is changing all the time. At one moment it’s completely idle, and then suddenly there is a spike of incoming requests. That makes it impossible to tune for all situations. There will always be situations where a given configuration works well and others where it makes things worse, but I hope I was able to outline the significance of each configurable parameter so the overall picture is clearer.

As I promised, two other articles are lined up for the future – one to describe the actual utilization levels of concurrent managers and the other to look into overhead of an idle concurrent processing environment. Stay tuned!

About the Author

Maris Elsins is an experienced Oracle Applications DBA currently working as Lead Database Consultant at The Pythian Group. His main areas of expertise are troubleshooting and performance tuning of Oracle Database and e-Business Suite systems. He is a blogger and a frequent speaker at Oracle related conferences such as UKOUG, Collaborate, Oracle OpenWorld, HotSos, and others. Maris is an Oracle ACE, an Oracle Certified Master, and a co-author of “Practical Oracle Database Appliance” (Apress, 2014). He's also a member of the board at Latvian Oracle User Group.

28 Comments

Ajith Narayanan
March 27, 2013 9:37 am

Hi Maris,

Thanks for the wonderful post. Beautiful way of explaining concurrent managers. Just adding a few rules of thumb on concurrent managers. The rules of thumb below relate to the
“Think of groups of concurrent programs” section
and the “Sleep Seconds” values explained in the above post.

Rules of thumb for concurrent queue configuration
Configuration examples of 3 “typical” queues:
(i) Fast Queue
Sleep = 15 (seconds)
Cache Size = 10
Target = 5

(ii) Standard Queue
Sleep = 60 (seconds)
Cache Size = 20
Target = 10

(iii) Slow Queue
Sleep = 60 (seconds)
Cache Size = 10
Target = 5

Target
Make sure the number of targets (processes) doesn’t exceed 20 per queue, and also remember the “rule of thumb” of 3-5 processes per CPU. This rule is in most cases exceeded.

Sleep (this should be checked against the formula in the post)
For a queue (e.g. CUST_FAST) running fast jobs set the Sleep to 15 (seconds).
For a queue (e.g. CUST_SLOW) running slow jobs set the Sleep to 60 (seconds).
For a standard queue (e.g. CUST_STANDARD or STANDARD) set the Sleep to 60 (seconds).
For any other queues set the Sleep to 60 (seconds).

Cache Size
If Cache Size (CS) is not set, then set the cache size equal to the target value. Set CS to 2 times target value for a fast, slow and standard queue.

Note: The queue names Fast, Slow and Standard just represent 3 types of concurrent manager settings that can be used based on the concurrent programs they should be handling. The main thing here is that we need to understand our concurrent programs and try to fit them into one of these groups if possible. It’s time consuming, but it is necessary that we understand our own workload.

Reply

Hi Ajith,

Thanks for reading!

I think the note at the end of your comment is the main thing there :) Let me quote you: “it is necessary that we understand our own workload.” – and it should happen before new custom managers are defined.

In some cases the configuration you suggested could work, but I think it shouldn’t be used as an approach for all systems – understanding of requirements should come first. Maybe the fast/slow queues are not needed at all, maybe the sleep seconds settings are too aggressive, and so on…

Maris

Reply

Nice post Maris.

Reply
Yury Velikanov
March 27, 2013 6:10 pm

Good reading Maris :) Thanks for taking initiative and putting posts together for us. I totally see how those can grow into an eBook :) Just put those together in the right way.

A few comments from my side:
“Choosing the correct number of concurrent processes” – I would mention the fact that there could be cases where a concurrent process is occupied by doing nothing but waiting on a child request to finish. Please correct me if I am wrong. Is it right that if a parent spawns a child request, it uses one concurrent process and just waits on the child to complete? If my understanding is correct, then the planning may be even more complex than just calculating resource utilisation and average pending time.

Another thing that I would try to describe is how to measure how many resources are used by the Concurrent Processing framework, and how to find out that the sleep time is too aggressive for the number of processes. My understanding is that we should aim to find the right balance between cost and value. In this case the cost is the resources used by the CP framework and the value is minimal pending time.

Thanks once again. Very useful stuff, and I’m looking forward to reading the next blog posts :)

Yury

PS You lost me with “S = N * (1 – U) * T”. This is way too smart for me :)

Reply
Maris Elsins
March 28, 2013 2:04 am

Hi Yury,

Good to hear you enjoyed it and thanks for the time you took to comment!

You’re not wrong about parent/child requests, but I think it’s even more complicated.
There are situations when the parent request is put into “paused” state while the child executes. If that is the case, the parent request does not occupy a concurrent manager process while the child process executes.
There are also parent requests that, instead of being put into paused state, simply submit the child request and loop on dbms_lock.sleep until the child completes. In this case the parent request occupies a manager process.

There is a huge amount of these little nuances and they make theoretical planning of the configuration complicated. It always behaves slightly different in practice, and this is the reason why tuning concurrent managers is usually an iterative process – you try something and then check how it works, then adjust it slightly based on the findings and check again. Stop when the configuration is “good enough” and no one complains about it (including yourself).

I agree to what you said about the balance between cost and value regarding the processes and the sleep time settings. This is also one of the topics I’ll describe in my future blog posts.

Maris

> those can grow into an eBook
Ssshhhh, it’s the secret plan you’re revealing here… :)

Reply
Ajith Narayanan
March 28, 2013 5:24 am

Hi Maris,

You said it right, the sequence was wrong: the last line in my reply should have been first. Also, you are right that these are just rules of thumb and cannot be used directly on any environment without analyzing the concurrent workload.

I was also bowled over by the formula S = N * (1 – U) * T. Kudos!!

Regards,
Ajith Narayanan

Reply
Panuganti. Bharath
May 14, 2013 12:17 am

Laudable effort, Maris Elsins. Thanks for demystifying the mysteries that come along with tuning the Concurrent Processing environment. As has been rightly said, there is no ideal configuration as such, because the workload is highly volatile and never going to be the same.

You nicely captured in words the intricacies involved in tuning CM.
Great work. I will be tuned in for the upcoming posts along these lines from you.

Reply
Maris Elsins
May 14, 2013 12:25 am

Hi Panuganti,
Thanks for the feedback and I’m glad this was useful.

Reply
Muthu Nagaraj
June 12, 2013 4:51 am

Many thanks for sharing this information Maris, Fantastic blog and very useful information.

Reply
panuganti. bharath
July 23, 2013 10:33 am

Hi Maris,

Can you help me look beyond the current post, i.e. beyond “resolving the gap” between when the request is scheduled to be picked up and when it actually gets processed by the CM process?

I mean, how about requests that have been running for an abnormally long time, when the same request completed normally yesterday within, say, 5 minutes?

Also, is there any possibility of the same request being run by different processes at the same point in time? If so, can you please elaborate on that?

Reply

Hi Maris,

Thanks for this post. One of the GL posting requests takes around 120-130 minutes when it runs during peak hours. The same request completes in 40-50 minutes when it runs during off hours. Will setting the cache and sleep time work in my case?

Please suggest.

Reply

Hi,
cache and sleep settings will not help you here – they don’t have any impact after the request is picked up and starts executing. You’ll want to look into what else is going on in the system at peak time to see if the overall activity can be reduced or distributed over time differently to let the GL posting request complete faster.

Maris.

Reply

The other requests are resource-consuming queries. They are AP reports.
What I want here is for the Posting to get most of the resources. I don’t care about the other programs. Is there any way to assign some priority so that this request runs with more resources? Oracle Apps must give it more priority than the others.

Thanks

Reply
Andrejs Karpovs
November 7, 2013 8:06 am

Hi Maris,

Excellent post!
One more suggestion or clue – be careful when setting Sleep Time to a very low value (less than 10 seconds) as it might lead to high rate of table scans of FND_CONCURRENT_REQUESTS and wastage of resources.

Regards,
Andrejs.

Reply
Anna María Valtýsdóttir
February 6, 2014 4:22 am

Hi Maris,

Great article and a much better clarification on Concurrent Managers than Oracle has ever provided.
Looking very much forward to your post on measuring the actual utilization levels of concurrent managers.

Thanks,
Anna María

Reply
Srikanth Bolagani
April 20, 2014 11:10 pm

Wonderful Post.

Reply

Trying to understand behavior in a high volume system – 300k requests per day. We have specialized queues and they work as expected, but standard managers (we have 4 in a PCP setup), which process approximately 50% of the load often show running significantly less than actual with large amounts of pending. Managers were set up with 30 processes each (total of 120) with sleep of 50 and cache of 200. I believe after reading your article that cache is way too high, but was set up by someone following the oracle guidelines. Is there a way to evaluate if we are spending too much time in chasing fnd_concurrent_requests locking (select for update)?

Reply

Hi Jeff,

> …significantly less than actual with large amounts of pending…
Are you sure you don’t have incompatibility rules that cause this?
Open the requests form, find one of these pending requests, click on “diagnostic”, and scroll down the text field; it usually displays the incompatible requests, if any.

> Is there a way to evaluate if we are spending too much time in chasing fnd_concurrent_requests locking (select for update)?
You need to find the query that does the “for update” and check how much resources/time it consumes.
It’s going to be similar to “CURSOR #139643743640368” that you’ll find in https://blog.pythian.com/internals-querying-concurrent-requests-queue-revisited-r12-2/

Maris

Reply

Hi, just found this article and it’s really helping us diagnose some issues we are having! Question: when managers “wake up”, they don’t all wake up at the same time, do they? Is it guaranteed that if you have multiple processes defined for a manager, their wake-up times will be distributed over the wake-up interval? Or is it random, or is it based on when managers finish a job (such that they go to sleep, for their sleep time, when their current job is finished)?

Thanks!

Wayne

Reply

Hi Wayne,

I’m glad this blog post is still relevant (and why wouldn’t it be, as nothing much has changed in how this works)!

You may want to review the process diagram that I posted in another blog post here: https://blog.pythian.com/the-internal-workflow-of-e-business-suite-concurrent-manager-process/

The last part of your statement is almost correct. It’s based “on when managers finish a job” and there are no other jobs waiting in the queue (you’ll understand it after reviewing the diagram). So in the long term I’d say the sleeping and waking of individual processes is pretty random. And it’s for sure not coordinated between processes.

regards,
Maris

Reply

Thanks! That helped. Another question. Do you have a script that calculates the Utilization, U, that you mention above, for a given manager?

Wayne

Reply

Yes I have the scripts.

You may also want to look at this very old presentation that I prepared for the UKOUG Tech 2011 conference: https://www.slideshare.net/mariselsins/concurrent-processing-performance-analysis-for-apps-dbas
The scripts are here: https://github.com/MarisElsins/TOOLS/tree/master/SQL/APPS/ConPerfScripts

P.S. I remember that some of the scripts could be quite heavy depending on the volume of requests and size of the involved tables, so please test them in non-prod first!
P.S.2 I didn’t check them on the most recent versions of eBS before uploading, so I hope they still work.

Maris

Reply
Keerthivasan C
March 22, 2018 5:22 am

This is so useful for us in learning about the concurrent manager. Many unanswered questions are answered here. One point of confusion was why we need a concurrent manager for a request; that is explained at the requirements level. Thank you for the wonderful post.

Reply
Maris Elsins
March 22, 2018 5:31 am

Thank you

Reply
Rajesh Thampi
April 30, 2019 10:58 pm

Hello Maris
I’ve kept wondering why the hell my VMs running EBS R12 go mad a few minutes after going live in the lab, and somehow I always found the concurrent jobs to be the villain. After going through your article, it is obvious that my “biased” hatred for the concurrent processes has doubled :)))
Thank you! I have reduced the processes, increased the sleep time etc & my instance is already performing better.
Scenario: Oracle VirtualBox 5.2, R12 12.0.6, 11gR2, 480+GB database.

Thanks once again for the great article buddy.

Reply
Maris Elsins
May 2, 2019 2:02 am

Hi Rajesh,

Thank you for taking the time to provide the feedback. Appreciated!
I’m so happy to hear this old post is still relevant.

Maris

Reply
CHRISTOPHE ORTIZ
May 21, 2019 9:03 am

Hi Maris,
I’m using the scripts you wrote to compute the usage of our CMs (the famous U), but I’m getting usage percentages of around 7500% and even higher…
Where do you think that could come from?

cheers
Chris

Reply
Maris Elsins
May 22, 2019 1:10 am

Hi Chris,

I think you have some concurrent requests still in Running phase (phase_code=’R’) in FND_CONCURRENT_REQUESTS, that are not actually running.
It can also be that there is a specific concurrent program that is submitted in batches and completes very quickly. For example, if a concurrent manager manages to complete 5 requests in a second, it would show 500% utilization. But, seeing that you observe 7500%+, I think the other possible cause is more likely.

Maris

Reply
