Oracle GoldenGate Extract Internals, Part II

Posted in: Technical Track

Today we continue looking at various aspects of how the Oracle GoldenGate extract process works.

One of the follow up questions to part I was about the way the Extract process reads from ASM storage. I’ve provided the answer, however, today we’re going get a detailed look at how the Extract process interacts with an ASM instance and what kind of implications may result.

Let’s take a look at what changes in the Extract process loop when it reads from ASM storage:

nanosleep({1, 0}, NULL)                 = 0
write(18, "\37\333\0\0\6\0\0\0\0\0\3^\v)\4\4\0\1\0\0\0\376\377\377\377\377\377\377\377C\0\0"..., 8155) = 8155
write(18, "\37\333\0\0\6\0\0\0\0\0\1\1\0Z\7\n\0\0\0\0\0\1\2\0\0W\7\n\0\237\0\0"..., 8155) = 8155
write(18, "\37\333\0\0\6\0\0\0\0\0\1\0\4\0\0\0\237\0\0\1u\7\n\0\0\0\0\0\1\2\235\336"..., 8155) = 8155
write(18, "\22\367\0\0\6\0\0\0\0\0\0.\1\300\0\204\0\n\0\0\200\0\0\217\7\n\0\237\0\0\1\232"..., 4855) = 4855
read(18, "\37\333\0\0\6\0\0\0\0\0\v\1\5\276\4\0\0\0\0\0\1\0\0\0\0\0\0\0\260[\211\312"..., 8208) = 8208
read(18, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8155) = 8155
read(18, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8155) = 8155
read(18, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8155) = 4540
nanosleep({1, 0}, NULL)                 = 0

As before, to make the loop a bit more understandable, I’ve removed a bunch of syscalls. Let’s take a look at the file descriptor 18:

[[email protected] fd]# ls -l 18
lrwx------ 1 oracle oinstall 64 Jan 24 20:56 18 -> socket:[24439]

Let’s find out more about this socket…

[[email protected] fd]# lsof -p 7725 | grep 18u
extract 7725 oracle   18u  IPv6              24439                TCP [::1]:16428->[::1]:ncube-lm (ESTABLISHED)
[[email protected] fd]# lsof -i :16428
extract 7725 oracle   18u  IPv6  24439       TCP [::1]:16428->[::1]:ncube-lm (ESTABLISHED)
oracle  7753 oracle   16u  IPv6  24440       TCP [::1]:ncube-lm->[::1]:16428 (ESTABLISHED)
[[email protected] fd]# ps -f -p 7753
oracle    7753     1  0 20:56 ?        00:00:00 oracle+ASM (LOCAL=NO)

In place of the read syscalls there is now some sort of communication going on with ASM instance.

Alas, when it comes to tracing, an ASM instance is not the most friendly thing you can work with. But there are other ways. Enabling SQL*Net trace is enough to see what replaced the read syscall:

BEGIN, :offset, :length, :buffer); END;

This is nothing other than one of the ASM internal packages which can be used to read directly from ASM disk group. How is the :handle being obtained? If we go a little bit up the trace, we can find the answer:

BEGIN dbms_diskgroup.getfileattr('+REDO/gg1/onlinelog/group_2.2547.708989191', :filetype, :filesize, :lblksize); END

The above code is used to get file attributes which are necessary to call the procedure:

BEGIN'+REDO/gg1/onlinelog/group_2.257.708989191', 'r', :filetype, :lblksize, :handle, :pblksize, :filesize); END;

This call will return a :handle which can later be used in The name of the online redo log file to read is being fetched by querying the RDBMS instance:

SELECT member,        DECODE(status, 'CURRENT', 1, 0),        DECODE(archived, 'YES', 1, 0)   FROM (select lf.member,        l.status,        1.archived        from v$logfile lf, v$log l        WHERE = AND        l.thread# = :ora_thread AND        l.sequence# = :ora_seq_no AND        (lf.status NOT IN        ('INVALID','INCOMPLETE','STALE') OR        lf.status is null)        order by lf.member DESC )        where rownum = 1

Note that this query is a bit different form the one used when fetching the log file name on a file system. Both dbms_diskgroup.getfileattr and calls happen when a new log file needs to be accessed (during startup or after a log file switch, for example), i.e. we do not execute these as part of a main loop listed above.

The next thing to find out is what happens in the ASM dedicated process as a result of call:

read(16, "\37\333\0\0\6\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8208) = 8155
read(16, "\37\333\0\0\6\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8208) = 8208
read(16, "\1\0\0\0\0\0\0\0\0\0\0\v\1\33\5\0\0\31\2\2\r\0\0\0\0\0\0\372\0\300\0M"..., 8155) = 4803
pread(15, "\1\"\0\0\22 \0\0\21\0\0\0H\200\305\365\5\24\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 28672, 136324096) = 28672
write(16, "\37\333\0\0\6\0\0\0\0\0\v\1\5\326\4\0\0\0\0\0\1\0\0\0\0\0\0\0\240/V\232"..., 8155) = 8155
write(16, "\37\333\0\0\6\0\0\0\0\0\v\5\1\0\4\0\0\0\237\0\0\1g\251\v\0\0\0b\251\1\2"..., 8155) = 8155
write(16, "\37\333\0\0\6\0\0\0\0\0\0\2\0\3\0\4\0\5\0\6\0\7\0\10\0\t\0\n\0\v\0\f"..., 8155) = 8155

If you take a loot at the lsof output above, the file descriptor 16 is a socket back to the Extract process so these syscalls are related to SQL*Net messages. What we’re really interested in is a call on line# 4 with a pread syscall from the file descriptor 15:

[[email protected] fd]# ls -l 15
lrwx------ 1 oracle oinstall 64 Jan 24 20:58 15 -> /dev/sdb1

Of course, /dev/sdb1 is a device under out REDO diskgroup:

SQL> select d.path,
        from v$asm_disk d, v$asm_diskgroup g
        where d.group_number = g.group_number
                and d.path='/dev/sdb1';  2    3    4
PATH            NAME
--------------- ----------------
/dev/sdb1       REDO

What else can we say? First of all, the read size is significantly smaller, just 28672 bytes, compared to 1000K read size when the log has been located on a cooked file system. If you get such a small read size and the fact that the data needs to go through the network and SQL*Net stacks… I would say that I expect this to be much less efficient compared to how online redo logs are being read from a file system. One of the immediate things to realize is that, in case you’re running the Extract process on the same machine as your ASM instance, it probably makes total sense to configure the connection string to ASM instance (the one which is being specified in the Extract process parameters) using bequeath protocol so the traffic can go through a pipe instead of a socket which should provide better performance.



Want to talk with an expert? Schedule a call with our team to get the conversation started.

7 Comments. Leave new

I guess the reason for much smaller read in ASM comes from the fact that the :buffer parameter passed to is a RAW, and PL/SQL does not support more than 32k (not even for LONG RAW).
And trying using BLOB ends in “ORA-01220: file based sort illegal before database is open”.
Perhaps they could get around this limitations using x$kffxp, but were not brave enough to handle ongoing metadata changes due to rebalances, failover to secondary extents etc.

On a bit different note – GoldenGate does not support ASM for archivelogs, at least per documentation. Any idea, why? dbms_diskgroup seems to work on them fine…

Alex Fatkulin
January 29, 2010 9:22 am


32K limit sounds like a reasonable assumption as an upper limit for a single call. The stacks when reading from a file system vs ASM are quite different and a small read call is just part of the story as now the Extract have to rely on a dedicated server process to do part of the lifting. There is still an option to sent more than 32K of data by using multiple buffers but this may not yield any tangible improvements unless we’re using network and the latency is high. Alas, the highest value of ASMBUFSIZE is 28672.

Archivelogs are not supported in ALO mode only. Technology-wise I don’t see why there would be any show stoppers in doing this. They also do not support DDL replication in ALO mode so I wonder whether these limitations resulted from a different codepaths used and will be lifted eventually.

Oracle GoldenGate Extract Internals, Part III | The Pythian Blog
February 3, 2010 5:29 pm

[…] This is the third post in Oracle GoldenGate Extract Internals series (links to part I and part II). […]

Blogroll Report 22/01/2009 – 29/01/2010 « Coskan’s Approach to Oracle
February 20, 2010 10:25 pm

[…] 4-Internals of goldengate extract process when ASM is used Alex Fatkulin-Oracle GoldenGate Extract Internals, Part II […]


I want to read offline archive log contents through Golden Gate. Will it display same contents like oracle logminor. i want to read individual archive log file contents. what are the steps. if you could reply to my email.


we are having 8 node rac of 60tb database and having 120 log switches per hr overall on all threads. i mean to say very busy database with etl loads.
we are having goldengate replication to various out bound applications. we are using rac service(two nodes) to connect to database in extract param file. And we have 10+ extracts running in env.
whenever we have huge load(i can say 200 log switches/hr) extract are falling behind…taking days to catch up lag even though load become stable…
any suggestion in architecture to improve performance.


Leave a Reply

Your email address will not be published. Required fields are marked *