Internals of querying the concurrent requests' queue - revisited for R12.2

Maris Elsins

April 19, 2016

Tags: MySQL, Ansible, Technical Track, Open Source, Replication, Devops, Docker, Container, Mariadb, Mysql-Python

Once upon a time I wrote about the Internal Workflow of an E-Business Suite Concurrent Manager Process. Many things have changed since that blog post, the most obvious change being the release of Oracle e-Business Suite R12.2. I decided to check if the way the concurrent manager queues were processed by concurrent manager processes were still the same. My main goal was to see if the manager processes still don't attempt any way of coordination to distribute the requests among them. This is how I did the testing:

I used the VM templates provided by Oracle to build my R12.2.4 test environment. By the way, I didn't expect that the process of getting the environment up would be so simple! Downloading the media files from edelivery.oracle.com was the most time-consuming step, once done - it took me just 1 hour to un-compress everything, import the Virtual Assembly file and bring up the R12.2.4 environment on my laptop.
3 Standard managers are defined by default
Sleep seconds were left as is = 30 seconds
Cache size was increased from 1 to 5.

Identified the 3 DB processes that belong to the Standard managers:

select sid, serial# from v$session where module='e:FND:cp:STANDARD'

I enabled tracing with binds and waits for each of them like this:
```
exec dbms_monitor.session_trace_enable(sid,serial#,true,true);
```
Once that was done I submitted one concurrent program - "Active users" and waited for it to complete.

I disabled the tracing and collected the trace files.

exec dbms_monitor.session_trace_disable(sid,serial#);

Collected the trace files

I found 2 of the trace files to be very interesting. To make things more simple, the manager process "A" will be the one that executed the concurrent request, and process "B" will be the one that didn't.

Before the "Active Users" Request Was Submitted

No other requests were running at the time I did the testing, so I clearly observed how both Managers A and B queried the FND_CONCURRENT_REQUESTS table BOTH of the trace files displayed the same method of how requests are picked up from the queue. Note, I'm showing only the lines relevant to display the main query only, and I have formatted the query text to make it more readable: [sourcecode language="sql" highlight="2-35"] PARSING IN CURSOR #139643743645920 len=1149 dep=0 uid=100 oct=3 lid=100 tim=1460211399835915 hv=3722997734 ad='d275f750' sqlid='cd23u4zfyhvz6' SELECT R.Rowid FROM Fnd_Concurrent_Requests R WHERE R.Hold_Flag = 'N' AND R.Status_Code = 'I' AND R.Requested_Start_Date &amp;amp;amp;amp;lt;= Sysdate AND (R.Node_Name1 IS NULL OR (R.Node_Name1 IS NOT NULL AND FND_DCP.target_node_mgr_chk(R.request_id) = 1)) AND (R.Edition_Name IS NULL OR R.Edition_Name &amp;amp;amp;amp;lt;= sys_context('userenv', 'current_edition_name')) AND EXISTS (SELECT NULL FROM Fnd_Concurrent_Programs P WHERE P.Enabled_Flag = 'Y' AND R.Program_Application_Id = P.Application_Id AND R.Concurrent_Program_Id = P.Concurrent_Program_Id AND EXISTS (SELECT NULL FROM Fnd_Oracle_Userid O WHERE R.Oracle_Id = O.Oracle_Id AND EXISTS (SELECT NULL FROM Fnd_Conflicts_Domain C WHERE P.Run_Alone_Flag = C.RunAlone_Flag AND R.CD_Id = C.CD_Id ) ) AND (P.Execution_Method_Code != 'S' OR (R.PROGRAM_APPLICATION_ID,R.CONCURRENT_PROGRAM_ID) IN ((0,98),(0,100),(0,31721),(0,31722),(0,31757))) ) AND ((R.PROGRAM_APPLICATION_ID,R.CONCURRENT_PROGRAM_ID) NOT IN ((510,40032),(510,40033),(510,42156),(510,42157),(530,43793),(530,43794),(535,42626),(535,42627),(535,42628))) ORDER BY NVL(R.priority, 999999999), R.Priority_Request_ID, R.Request_ID END OF STMT EXEC #139643743645920:c=0,e=33,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=3984653669,tim=1460211399835910 FETCH #139643743645920:c=0,e=546,p=0,cr=106,cu=0,mis=0,r=0,dep=0,og=1,plh=3984653669,tim=1460211399836507 WAIT #139643743645920: nam='SQL*Net message to client' ela= 3 driver id=1952673792 #bytes=1 p3=0 obj#=-1 tim=1460211399836572 *** 2016-04-09 10:17:09.837 WAIT #139643743645920: nam='SQL*Net message from client' ela= 30000367 driver id=1952673792 #bytes=1 p3=0 obj#=-1 tim=1460211429836965 ... EXEC #139643743645920:c=0,e=59,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=3984653669,tim=1460211429838767 FETCH #139643743645920:c=0,e=689,p=0,cr=106,cu=0,mis=0,r=0,dep=0,og=1,plh=3984653669,tim=1460211429839587 WAIT #139643743645920: nam='SQL*Net message to client' ela= 4 driver id=1952673792 #bytes=1 p3=0 obj#=-1 tim=1460211429839652 *** 2016-04-09 10:17:39.840 WAIT #139643743645920: nam='SQL*Net message from client' ela= 30000325 driver id=1952673792 #bytes=1 p3=0 obj#=-1 tim=1460211459840003 ... [/sourcecode] It's important to observe that:

All manager's processes still compete for the same requests. If the query is executed at the same time, the same list of concurrent requests will be retrieved by all processes.
The constants literals used in lines 30-32 mean that the query for checking the queue is still built when the concurrent manager process starts up. These constants are mainly used to implement the specializations rules in the query.
Only rowid for the pending requests' rows in FND_CONCURRENT_REQUESTS are fetched.
The sleep time is clearly visible on lines 41,42 and 48,49

After the "Active Users" Request Was Submitted - Starting the Concurrent Request

The manager process A was the first to pick up the submitted requests and it could be observed by the "r=1" (1 row fetched) in the FETCH call for the query we just reviewed: [sourcecode language="sql"] FETCH #139643743645920:c=0,e=437,p=0,cr=113,cu=0,mis=0,r=1,dep=0,og=1,plh=3984653669,tim=1460211519844640 [/sourcecode] Immediately after this, the manager process A locked the row in FND_CONCURRENT_REQUESTS table, this way, the request got assigned to this process. Notice the similar where predicates used in this query, these are actually required to make sure that the request is still not picked up by another manager process. However the main thing here is the fact that the request row is accessed by the "rowid" retrieved earlier (row 45, the value of the bind variable ":reqname" is "AAAjnSAA/AAAyn1AAH" in this case). Locking of the row is done by the "FOR UPDATE OF R.status_code NoWait" clause on line 49: [sourcecode language="sql" highlight="45,49"] PARSING IN CURSOR #139643743640368 len=4530 dep=0 uid=100 oct=3 lid=100 tim=1460211519864113 hv=4239777398 ad='cde86338' sqlid='6ya6bzgybbrmq' SELECT R.Conc_Login_Id, R.Request_Id, ... excluded other 156 columns for brevity... FROM fnd_concurrent_requests R, fnd_concurrent_programs P, fnd_application A, fnd_user U, fnd_oracle_userid O, fnd_conflicts_domain C, fnd_concurrent_queues Q, fnd_application A2, fnd_executables E, fnd_conc_request_arguments X WHERE R.Status_code = 'I' AND (R.Edition_Name IS NULL OR R.Edition_Name &amp;amp;amp;amp;lt;= sys_context('userenv', 'current_edition_name')) AND R.Request_ID = X.Request_ID(+) AND R.Program_Application_Id = P.Application_Id(+) AND R.Concurrent_Program_Id = P.Concurrent_Program_Id(+) AND R.Program_Application_Id = A.Application_Id(+) AND P.Executable_Application_Id = E.Application_Id(+) AND P.Executable_Id = E.Executable_Id(+) AND P.Executable_Application_Id = A2.Application_Id(+) AND R.Requested_By = U.User_Id(+) AND R.Cd_Id = C.Cd_Id(+) AND R.Oracle_Id = O.Oracle_Id(+) AND Q.Application_Id = :q_applid AND Q.Concurrent_Queue_Id = :queue_id AND (P.Enabled_Flag IS NULL OR P.Enabled_Flag = 'Y') AND R.Hold_Flag = 'N' AND R.Requested_Start_Date &amp;amp;amp;amp;lt;= Sysdate AND ( R.Enforce_Seriality_Flag = 'N' OR ( C.RunAlone_Flag = P.Run_Alone_Flag AND (P.Run_Alone_Flag = 'N' OR NOT EXISTS (SELECT NULL FROM Fnd_Concurrent_Requests Sr WHERE Sr.Status_Code IN ('R', 'T') AND Sr.Enforce_Seriality_Flag = 'Y' AND Sr.CD_id = C.CD_Id )))) AND Q.Running_Processes &amp;amp;amp;amp;lt;= Q.Max_Processes AND R.Rowid = :reqname AND ((P.Execution_Method_Code != 'S' OR (R.PROGRAM_APPLICATION_ID,R.CONCURRENT_PROGRAM_ID) IN ((0,98),(0,100),(0,31721),(0,31722),(0,31757)))) AND ((R.PROGRAM_APPLICATION_ID,R.CONCURRENT_PROGRAM_ID) NOT IN ((510,40032),(510,40033),(510,42156),(510,42157),(530,43793),(530,43794),(535,42626),(535,42627),(535,42628))) FOR UPDATE OF R.status_code NoWait [/sourcecode] The behavior of the manager process B was a little bit more interesting. It too managed to fetch the same rowid from FND_CONCURRENT_PROCESSES table belonging to the submitted "Active Users" processes. However, when it tried to lock the row in FND_CONCURRENT_REQUESTS (By using exactly the same query), this happened: [sourcecode language="sql" highlight="11"] PARSING IN CURSOR #139690311998256 len=4530 dep=0 uid=100 oct=3 lid=100 tim=1460211519900924 hv=4239777398 ad='cde86338' sqlid='6ya6bzgybbrmq' ... BINDS #139690311998256: ... Bind#2 oacdty=01 mxl=32(18) mxlc=00 mal=00 scl=00 pre=00 oacflg=20 fl2=1000001 frm=01 csi=873 siz=0 off=64 kxsbbbfp=7f0c2f713f20 bln=32 avl=18 flg=01 value="AAAjnSAA/AAAyn1AAH" EXEC #139690311998256:c=1000,e=1525,p=0,cr=25,cu=1,mis=0,r=0,dep=0,og=1,plh=4044729389,tim=1460211519902727 ERROR #139690311998256:err=54 tim=1460211519902750 [/sourcecode] The query failed with "ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired". This is how the access to pending concurrent requests is serialized to make sure only one of the manager processes can run it. And, I think, relying on the well-tuned and highly efficient locking mechanism of Oracle Database is a very very smart idea.

Conclusions

The coordination between manager processes is still not happening to distribute the requests, but the managers all query the queue the same way and then compete between themselves to lock the requests' entries on the table 1st. The process that gets the lock also gets to execute the concurrent request.
The cache size variable couldn't be observed in the trace files, but as far as I remember from my previous research the process would only fetch "cache size"-number of rowids using the 1st query in this post. This could be tested by submitting larger volume of requests simultaneously.
The "sleep seconds" kicks in only when the manager process didn't fetch any rowids from the queue. After all the cached requests are attempted/executed by the manager process, the queue is checked again immediately without waiting for the "sleep seconds" (Not explained in detail in this post, but it's revealed in the trace files)
The DMLs used to query the FND_CONCURRENT_REQUESTS and to lock the row are very very similar to Pre-R12.2 releases of e-Business Suite (Another sign that the process hasn't changed, though one change that I see is the addition of where clause predicates for Checking the Editions).

Insight and analysis of technology and business strategy

Internals of querying the concurrent requests' queue - revisited for R12.2

Before the "Active Users" Request Was Submitted

After the "Active Users" Request Was Submitted - Starting the Concurrent Request

Conclusions

Top Categories

Tell us how we can help!