GoldenGate 12.2 big data adapters: part 4 – HBASE

Posted in: Big Data, Hadoop, Oracle, Technical Track

This is the next post in my series about Oracle GoldenGate Big Data adapters. Here is a list of all posts in the series:

  1. GoldenGate 12.2 Big Data Adapters: part 1 – HDFS
  2. GoldenGate 12.2 Big Data Adapters: part 2 – Flume
  3. GoldenGate 12.2 Big Data Adapters: part 3 – Kafka
  4. GoldenGate 12.2 Big Data Adapters: part 4 – HBASE

In this post I am going to explore the HBASE adapter for GoldenGate. Let’s start by recalling what we know about HBASE. Apache HBASE is a non-relational, distributed database modelled after Google’s Bigtable. It provides random read/write access to data and sits on top of Hadoop and HDFS.

So, what does that tell us? First, we can write and change the data. Second, we need to remember that it is a non-relational database, which takes a somewhat different approach to data than traditional relational databases. You can think of HBase as a key-value store. We are not going deep into HBASE architecture and internals here, since our main task is to test the Oracle GoldenGate adapter and see how it works. Our configuration has an Oracle database as a source with a GoldenGate extract, and a target system running Oracle GoldenGate for Big Data.
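
To make the key-value picture concrete, here is a minimal HBase shell session; the table name, column family and values are invented purely for illustration:

create 'demo', 'cf'                    # a table with a single column family 'cf'
put 'demo', 'row1', 'cf:name', 'abc'   # store one cell under row key 'row1'
get 'demo', 'row1'                     # read every cell stored under key 'row1'
scan 'demo'                            # walk all row keys in the table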

You can find more information about setting up the source and target in the first post in the series, about the HDFS adapter. The source-side replication has already been configured and started: we have an initial trail file for data initialization and trails for the ongoing replication, and we capture changes for all tables in the ggtest schema on the Oracle database.
Now we need to prepare our target site. Let’s start with HBase. For my tests I used pseudo-distributed mode, which runs a fully-distributed configuration on a single host. That is not acceptable for any production configuration, but it will suffice for our tests. On the same box I have HDFS serving as the main storage. The Oracle documentation for the adapter states that HBase is supported from version 1.0.x. In my first attempt I tried HBase version 1.0.0 (Cloudera 5.6), but it didn’t work: I got errors in GoldenGate and my replicat abended.
Here is the error:

2016-03-29 11:51:31  ERROR   OGG-15051  Oracle GoldenGate Delivery, irhbase.prm:  Java or JNI exception:
java.lang.NoSuchMethodError: org.apache.hadoop.hbase.HTableDescriptor.addFamily(Lorg/apache/hadoop/hbase/HColumnDescriptor;)Lorg/apache/hadoop/hbase/HTableDescriptor;.
2016-03-29 11:51:31  ERROR   OGG-01668  Oracle GoldenGate Delivery, irhbase.prm:  PROCESS ABENDING.

So, I installed another HBase version, and version 1.1.4 worked just fine. I used a simple, standard HBase configuration for pseudo-distributed mode, where the region server runs on the same host as the master and hbase.rootdir points to the local HDFS.
Here is an example of the configuration:

<configuration>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:8020/user/oracle/hbase</value>
  </property>
</configuration>
[root@sandbox conf]# cat regionservers
localhost
[root@sandbox conf]#
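
With the configuration in place, HBase can be started and checked roughly like this; a sketch that assumes HDFS is already up and the HBase scripts are on the PATH:

start-hbase.sh       # starts the HMaster and the local region server
jps                  # expect HMaster and HRegionServer among the Java processes
hbase shell          # opens the shell used for the checks below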

As soon as we have HBase set up and running, we can switch our attention to GoldenGate. We already have a trail file with the initial load; now we need to prepare our configuration files for the initial and ongoing replication. Let’s go to our GoldenGate for Big Data home directory and prepare everything. First, we need the hbase.props file copied from the $OGG_HOME/AdapterExamples/big-data/hbase directory to $OGG_HOME/dirprm. I left everything as it was in the original file, changing only the gg.classpath parameter to point to my HBase configuration files and libraries.
Here is an example of the configuration file:

[oracle@sandbox oggbd]$ cat dirprm/hbase.props
gg.handlerlist=hbase
gg.handler.hbase.type=hbase
gg.handler.hbase.hBaseColumnFamilyName=cf
gg.handler.hbase.keyValueDelimiter=CDATA[=]
gg.handler.hbase.keyValuePairDelimiter=CDATA[,]
gg.handler.hbase.encoding=UTF-8
gg.handler.hbase.pkUpdateHandling=abend
gg.handler.hbase.nullValueRepresentation=CDATA[NULL]
gg.handler.hbase.authType=none
gg.handler.hbase.includeTokens=false
gg.handler.hbase.mode=tx
goldengate.userexit.timestamp=utc
goldengate.userexit.writers=javawriter
javawriter.stats.display=TRUE
javawriter.stats.full=TRUE
gg.log=log4j
gg.log.level=INFO
gg.report.time=30sec
gg.classpath=/u01/hbase/lib/*:/u01/hbase/conf:/usr/lib/hadoop/client/*
javawriter.bootoptions=-Xmx512m -Xms32m -Djava.class.path=ggjava/ggjava.jar
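
Before going further, it is worth sanity-checking that the paths listed in gg.classpath actually resolve. A quick look, assuming the layout above:

ls /u01/hbase/lib/hbase-client-*.jar    # HBase client jars picked up by the handler
ls /u01/hbase/conf/hbase-site.xml       # the HBase config must be on the classpath too
ls /usr/lib/hadoop/client/ | head       # Hadoop client libraries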

Second, we have to prepare a parameter file for our initial load. I used a simple file with a minimal set of parameters.

[oracle@sandbox oggbd]$ cat dirprm/irhbase.prm
-- passive REPLICAT irhbase
-- Trail file for this example is located in "./dirdat/initld" file
-- Command to add REPLICAT
-- run replicat irhbase:
-- ./replicat paramfile dirprm/irhbase.prm reportfile dirrpt/irhbase.rpt
SPECIALRUN
END RUNTIME
EXTFILE /u01/oggbd/dirdat/initld
TARGETDB LIBFILE libggjava.so SET property=dirprm/hbase.props
REPORTCOUNT EVERY 1 MINUTES, RATE
GROUPTRANSOPS 10000
MAP GGTEST.*, TARGET BDTEST.*;

With that configuration file we can run the replicat in passive mode from the command line and see the result.
Here is the initial status of HBASE:

hbase(main):001:0> version
1.1.4, r14c0e77956f9bb4c6edf0378474264843e4a82c3, Wed Mar 16 21:18:26 PDT 2016
hbase(main):001:0> list
TABLE
0 row(s) in 0.3340 seconds
=> []
hbase(main):002:0>

Running the replicat:

[oracle@sandbox oggbd]$ ./replicat paramfile dirprm/irhbase.prm reportfile dirrpt/irhbase.rpt
[oracle@sandbox oggbd]$

Now we have 2 tables in HBASE:

hbase(main):002:0> list
TABLE
BDTEST:TEST_TAB_1
BDTEST:TEST_TAB_2
2 row(s) in 0.3680 seconds
=> ["BDTEST:TEST_TAB_1", "BDTEST:TEST_TAB_2"]
hbase(main):003:0>

Let’s have a look at the tables’ structure and contents:

hbase(main):004:0> describe 'BDTEST:TEST_TAB_1'
Table BDTEST:TEST_TAB_1 is ENABLED
BDTEST:TEST_TAB_1
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
1 row(s) in 0.2090 seconds
hbase(main):005:0> scan 'BDTEST:TEST_TAB_1'
ROW                                            COLUMN+CELL
 1                                             column=cf:ACC_DATE, timestamp=1459269153102, value=2014-01-22:12:14:30
 1                                             column=cf:PK_ID, timestamp=1459269153102, value=1
 1                                             column=cf:RND_STR, timestamp=1459269153102, value=371O62FX
 1                                             column=cf:RND_STR_1, timestamp=1459269153102, value=RJ68QYM5
 1                                             column=cf:USE_DATE, timestamp=1459269153102, value=2014-01-24:19:09:20
 2                                             column=cf:ACC_DATE, timestamp=1459269153102, value=2014-05-11:05:23:23
 2                                             column=cf:PK_ID, timestamp=1459269153102, value=2
 2                                             column=cf:RND_STR, timestamp=1459269153102, value=371O62FX
 2                                             column=cf:RND_STR_1, timestamp=1459269153102, value=HW82LI73
 2                                             column=cf:USE_DATE, timestamp=1459269153102, value=2014-01-24:19:09:20
 3                                             column=cf:ACC_DATE, timestamp=1459269153102, value=2014-01-22:12:14:30
 3                                             column=cf:PK_ID, timestamp=1459269153102, value=3
 3                                             column=cf:RND_STR, timestamp=1459269153102, value=RXZT5VUN
 3                                             column=cf:RND_STR_1, timestamp=1459269153102, value=RJ68QYM5
 3                                             column=cf:USE_DATE, timestamp=1459269153102, value=2013-09-04:23:32:56
 4                                             column=cf:ACC_DATE, timestamp=1459269153102, value=2014-05-11:05:23:23
 4                                             column=cf:PK_ID, timestamp=1459269153102, value=4
 4                                             column=cf:RND_STR, timestamp=1459269153102, value=RXZT5VUN
 4                                             column=cf:RND_STR_1, timestamp=1459269153102, value=HW82LI73
 4                                             column=cf:USE_DATE, timestamp=1459269153102, value=2013-09-04:23:32:56
4 row(s) in 0.1630 seconds
hbase(main):006:0> scan 'BDTEST:TEST_TAB_2'
ROW                                            COLUMN+CELL
 7|IJWQRO7T|2013-07-07:08:13:52                column=cf:ACC_DATE, timestamp=1459269153132, value=2013-07-07:08:13:52
 7|IJWQRO7T|2013-07-07:08:13:52                column=cf:PK_ID, timestamp=1459269153132, value=7
 7|IJWQRO7T|2013-07-07:08:13:52                column=cf:RND_STR_1, timestamp=1459269153132, value=IJWQRO7T
1 row(s) in 0.0390 seconds
hbase(main):007:0>
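
As a side note, instead of scanning the whole table we can fetch a single row by its key; for example, for the record with pk_id=1:

get 'BDTEST:TEST_TAB_1', '1'     # returns all cf:* cells stored under row key '1'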

Everything looks good. We have the structure and records as expected. Let’s move forward and set up the ongoing replication.
I created a parameter file for my replicat using the initial load parameters as a basis:

[oracle@sandbox oggbd]$ cat dirprm/rhbase.prm
REPLICAT rhbase
-- Trail file for this example is located in "dirdat/or" directory
-- Command to add REPLICAT
-- add replicat rhbase, exttrail dirdat/or
TARGETDB LIBFILE libggjava.so SET property=dirprm/hbase.props
REPORTCOUNT EVERY 1 MINUTES, RATE
GROUPTRANSOPS 10000
MAP ggtest.*, TARGET bdtest.*;

We check our trail files and start the replicat from the latest one. By default a replicat would look for a trail with sequence number 0, but since I have a purging policy on my GoldenGate installation that deletes old files, I need to tell the replicat exactly where to start (hence the EXTSEQNO 45 below).

[oracle@sandbox oggbd]$ ll dirdat/
total 4940
-rw-r-----. 1 oracle oinstall    3028 Feb 16 14:17 initld
-rw-r-----. 1 oracle oinstall 2015199 Mar 24 13:07 or000043
-rw-r-----. 1 oracle oinstall 2015229 Mar 24 13:08 or000044
-rw-r-----. 1 oracle oinstall 1018490 Mar 24 13:09 or000045
[oracle@sandbox oggbd]$ ggsci
Oracle GoldenGate Command Interpreter
Version 12.2.0.1.0 OGGCORE_12.2.0.1.0_PLATFORMS_151101.1925.2
Linux, x64, 64bit (optimized), Generic on Nov 10 2015 16:18:12
Operating system character set identified as UTF-8.
Copyright (C) 1995, 2015, Oracle and/or its affiliates. All rights reserved.
GGSCI (sandbox.localdomain) 1> info all
Program     Status      Group       Lag at Chkpt  Time Since Chkpt
MANAGER     RUNNING
GGSCI (sandbox.localdomain) 2> add replicat rhbase, exttrail dirdat/or,EXTSEQNO 45
REPLICAT added.
GGSCI (sandbox.localdomain) 3> start replicat rhbase
Sending START request to MANAGER ...
REPLICAT RHBASE starting
GGSCI (sandbox.localdomain) 4> info all
Program     Status      Group       Lag at Chkpt  Time Since Chkpt
MANAGER     RUNNING
REPLICAT    RUNNING     RHBASE      00:00:00      00:00:06
GGSCI (sandbox.localdomain) 5> info rhbase
REPLICAT   RHBASE    Last Started 2016-03-29 12:56   Status RUNNING
Checkpoint Lag       00:00:00 (updated 00:00:08 ago)
Process ID           27277
Log Read Checkpoint  File dirdat/or000045
                     2016-03-24 13:09:02.000274  RBA 1018490
GGSCI (sandbox.localdomain) 6>

I inserted a number of rows into test_tab_1 on the Oracle side, and all of them were successfully replicated to HBASE.
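
For reference, rows like these can be generated with a block similar to the following sketch; it is not the exact script from my test, the column names are taken from the scans above, and the offset is just there to avoid key collisions:

insert into ggtest.test_tab_1 (pk_id, rnd_str, acc_date, rnd_str_1, use_date)
  select 1000000 + rownum,                 -- offset to keep clear of existing keys
         dbms_random.string('X', 8),       -- random 8-character uppercase string
         sysdate,
         dbms_random.string('X', 8),
         sysdate
    from dual
 connect by level <= 1000;
commit;

And the resulting row count in HBASE: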

hbase(main):015:0> count 'BDTEST:TEST_TAB_1'
Current count: 1000, row: 1005694
Current count: 2000, row: 442
Current count: 3000, row: 6333
3473 row(s) in 1.0810 seconds
=> 3473
hbase(main):016:0>

Let’s have a closer look at test_tab_1 and test_tab_2:

hbase(main):005:0> scan 'BDTEST:TEST_TAB_1'
ROW                                            COLUMN+CELL
 1                                             column=cf:ACC_DATE, timestamp=1459269153102, value=2014-01-22:12:14:30
 1                                             column=cf:PK_ID, timestamp=1459269153102, value=1
 1                                             column=cf:RND_STR, timestamp=1459269153102, value=371O62FX
 1                                             column=cf:RND_STR_1, timestamp=1459269153102, value=RJ68QYM5
 1                                             column=cf:USE_DATE, timestamp=1459269153102, value=2014-01-24:19:09:20
 2                                             column=cf:ACC_DATE, timestamp=1459269153102, value=2014-05-11:05:23:23
 2                                             column=cf:PK_ID, timestamp=1459269153102, value=2
 2                                             column=cf:RND_STR, timestamp=1459269153102, value=371O62FX
 2                                             column=cf:RND_STR_1, timestamp=1459269153102, value=HW82LI73
 2                                             column=cf:USE_DATE, timestamp=1459269153102, value=2014-01-24:19:09:20
..............................................
hbase(main):006:0> scan 'BDTEST:TEST_TAB_2'
ROW                                            COLUMN+CELL
 7|IJWQRO7T|2013-07-07:08:13:52                column=cf:ACC_DATE, timestamp=1459269153132, value=2013-07-07:08:13:52
 7|IJWQRO7T|2013-07-07:08:13:52                column=cf:PK_ID, timestamp=1459269153132, value=7
 7|IJWQRO7T|2013-07-07:08:13:52                column=cf:RND_STR_1, timestamp=1459269153132, value=IJWQRO7T
1 row(s) in 0.0390 seconds
hbase(main):007:0>

You can see that the row identifier for test_tab_1 is the value of pk_id, while for test_tab_2 it is a concatenation of the values of all columns. Why is that? The difference is in the constraints on the tables. Since we don’t have a primary key or unique index for test_tab_2, the adapter uses all columns as the key value. We can try to add a constraint and see the result.

select * from dba_constraints where owner='GGTEST' and table_name='TEST_TAB_2';
no rows selected
alter table ggtest.test_tab_2 add constraint pk_test_tab_2 primary key (pk_id);
Table altered.
insert into ggtest.test_tab_2 values(9,'PK_TEST',sysdate,null);
1 row created.
commit;
Commit complete.
orcl>

And let us compare with the result in HBASE:

hbase(main):012:0> scan 'BDTEST:TEST_TAB_2'
ROW                                           COLUMN+CELL
 7|IJWQRO7T|2013-07-07:08:13:52               column=cf:ACC_DATE, timestamp=1459275116849, value=2013-07-07:08:13:52
 7|IJWQRO7T|2013-07-07:08:13:52               column=cf:PK_ID, timestamp=1459275116849, value=7
 7|IJWQRO7T|2013-07-07:08:13:52               column=cf:RND_STR_1, timestamp=1459275116849, value=IJWQRO7T
 8|TEST_INS1|2016-03-29:15:14:37|TEST_ALTER   column=cf:ACC_DATE, timestamp=1459278884047, value=2016-03-29:15:14:37
 8|TEST_INS1|2016-03-29:15:14:37|TEST_ALTER   column=cf:PK_ID, timestamp=1459278884047, value=8
 8|TEST_INS1|2016-03-29:15:14:37|TEST_ALTER   column=cf:RND_STR_1, timestamp=1459278884047, value=TEST_INS1
 8|TEST_INS1|2016-03-29:15:14:37|TEST_ALTER   column=cf:TEST_COL, timestamp=1459278884047, value=TEST_ALTER
 9                                            column=cf:ACC_DATE, timestamp=1462473865704, value=2016-05-05:14:44:19
 9                                            column=cf:PK_ID, timestamp=1462473865704, value=9
 9                                            column=cf:RND_STR_1, timestamp=1462473865704, value=PK_TEST
 9                                            column=cf:TEST_COL, timestamp=1462473865704, value=NULL
3 row(s) in 0.0550 seconds
hbase(main):013:0>

It is fully dynamic and changed the row key on the fly. Will it work with a unique index? Yes, it will:

delete from ggtest.test_tab_2 where pk_id=9;
1 row deleted.
alter table ggtest.test_tab_2 drop constraint pk_test_tab_2;
Table altered.
create unique index ggtest.ux_test_tab_2 on ggtest.test_tab_2 (pk_id);
Index created.
insert into ggtest.test_tab_2 values(10,'UX_TEST',sysdate,null);
1 row created.
commit;

Here is the newly inserted row.

hbase(main):017:0> scan 'BDTEST:TEST_TAB_2'
ROW                                           COLUMN+CELL
 10                                           column=cf:ACC_DATE, timestamp=1462474389145, value=2016-05-05:14:53:03
 10                                           column=cf:PK_ID, timestamp=1462474389145, value=10
 10                                           column=cf:RND_STR_1, timestamp=1462474389145, value=UX_TEST
 10                                           column=cf:TEST_COL, timestamp=1462474389145, value=NULL
 7|IJWQRO7T|2013-07-07:08:13:52               column=cf:ACC_DATE, timestamp=1459275116849, value=2013-07-07:08:13:52
 7|IJWQRO7T|2013-07-07:08:13:52               column=cf:PK_ID, timestamp=1459275116849, value=7

But it will not make any difference if we just create an ordinary, non-unique index on the source; that changes nothing. So, if we need a proper key for a table, we have to have at least a unique index or constraint. Of course, this is just the default behavior for schema-level replication, and we can use KEYCOLS in the MAP statement to define keys for particular tables, as sketched below.
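
Here is a hedged sketch of how that could look in the replicat parameter file; the choice of PK_ID is mine for illustration, and the table-specific MAP has to come before the wildcard one:

-- force PK_ID to be used as the key (and therefore the HBase row key) for TEST_TAB_2
MAP GGTEST.TEST_TAB_2, TARGET BDTEST.TEST_TAB_2, KEYCOLS (PK_ID);
MAP GGTEST.*, TARGET BDTEST.*;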

Interestingly, if we change a table’s structure, the change affects all newly inserted rows but does not touch existing ones, even if we update some of their values. It works this way as long as you have a unique identifier and it was not changed by your DDL operation.
Here is an example. We have a column TEST_COL in the table test_tab_2. Let’s drop the column and update the row. Keep in mind that our key column is PK_ID and we are not modifying the key.

alter table ggtest.test_tab_2 drop column TEST_COL;
Table altered.
update ggtest.test_tab_2 set rnd_str_1='TEST_COL' where pk_id=9;
1 row updated.
commit;

In HBASE we can see the same set of columns:

hbase(main):030:0> scan 'BDTEST:TEST_TAB_2'
ROW                                           COLUMN+CELL
 9                                            column=cf:ACC_DATE, timestamp=1462477581440, value=2016-05-05:15:46:13
 9                                            column=cf:PK_ID, timestamp=1462477794597, value=9
 9                                            column=cf:RND_STR_1, timestamp=1462477794597, value=TEST_COL
 9                                            column=cf:TEST_COL, timestamp=1462477581440, value=NULL
1 row(s) in 0.0200 seconds

We still have the dropped column TEST_COL even though we’ve updated the row.
But if we insert a new row, it will have the new set of columns:

insert into ggtest.test_tab_2 values(10,'TEST_COL',sysdate);
1 row created.
commit;
Commit complete.

And in HBASE:

hbase(main):031:0> scan 'BDTEST:TEST_TAB_2'
ROW                                           COLUMN+CELL
 10                                           column=cf:ACC_DATE, timestamp=1462477860649, value=2016-05-05:15:50:55
 10                                           column=cf:PK_ID, timestamp=1462477860649, value=10
 10                                           column=cf:RND_STR_1, timestamp=1462477860649, value=TEST_COL
 9                                            column=cf:ACC_DATE, timestamp=1462477581440, value=2016-05-05:15:46:13
 9                                            column=cf:PK_ID, timestamp=1462477794597, value=9
 9                                            column=cf:RND_STR_1, timestamp=1462477794597, value=TEST_COL
 9                                            column=cf:TEST_COL, timestamp=1462477581440, value=NULL
2 row(s) in 0.0340 seconds

And, as in all the other cases, a truncate on the source table is not replicated to the target; the operation is simply ignored. You have to truncate the table in HBASE yourself to keep the data in sync. If you insert data again, the matching data in HBASE will be “updated”, but other rows will not be deleted; it behaves more like a “merge” operation.
Here is an example:

truncate table ggtest.test_tab_2;
Table truncated.
insert into ggtest.test_tab_2 values(10,'TEST_COL2',sysdate);
1 row created.
commit;
Commit complete.
select * from ggtest.test_tab_2;
	   PK_ID RND_STR_1  ACC_DATE
---------------- ---------- -----------------
	      10 TEST_COL2  05/05/16 16:01:20
orcl>
HBASE:
hbase(main):033:0> scan 'BDTEST:TEST_TAB_2'
ROW                                           COLUMN+CELL
 10                                           column=cf:ACC_DATE, timestamp=1462478485067, value=2016-05-05:16:01:20
 10                                           column=cf:PK_ID, timestamp=1462478485067, value=10
 10                                           column=cf:RND_STR_1, timestamp=1462478485067, value=TEST_COL2
 9                                            column=cf:ACC_DATE, timestamp=1462477581440, value=2016-05-05:15:46:13
 9                                            column=cf:PK_ID, timestamp=1462477794597, value=9
 9                                            column=cf:RND_STR_1, timestamp=1462477794597, value=TEST_COL
 9                                            column=cf:TEST_COL, timestamp=1462477581440, value=NULL
2 row(s) in 0.0300 seconds
hbase(main):034:0>
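
To actually bring the target back in sync after a source truncate, the table has to be truncated on the HBase side as well. In the HBase shell (truncate disables, drops and recreates the table, so all its rows are gone afterwards):

truncate 'BDTEST:TEST_TAB_2'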

I spent some time testing performance and found that the main bottleneck was my Oracle source rather than GoldenGate or HBASE. I was able to sustain a rate of up to 60 DML operations per second, at which point my Oracle DB started to struggle to keep pace because of waits on commit; HBASE and the replicat were absolutely fine. I also checked how it handles big transactions and inserted about 2 billion rows in one transaction. It worked fine. Of course, this doesn’t prove that your production configuration will be free of performance issues; to conduct real performance tests I would need a much bigger environment.
In addition, I noticed one more minor error in the Oracle documentation for the adapter, related to the “keyValuePairDelimiter” parameter. In the documentation it is replaced by “keyValueDelimiter”: a small typo where “keyValueDelimiter” is listed twice, correctly the first time and, the second time, in the place where “keyValuePairDelimiter” is supposed to be. Here is the link.

In summary, I can say that despite some minor issues, the adapters and GoldenGate for Big Data appear quite mature and ready for real work. I think it is a good, robust technology, and hopefully new releases will continue to improve it. I am looking forward to using it in a real production environment with a significant workload. In the following posts I will try to test different DDL operations and maybe some other datatypes. Stay tuned.


About the Author

Regarded by his peers as an Oracle guru, Gleb is known for being able to resolve any problem related to Oracle. He loves the satisfaction of troubleshooting, and his colleagues even say that seeking Gleb’s advice regarding an issue is more efficient than looking it up. Gleb enjoys the variety of challenges he faces while working at Pythian, rather than working on the same thing every day. His areas of speciality include Oracle RAC, Exadata, RMAN, SQL tuning, high availability, storage, performance tuning, and many more. When he’s not working, running, or cycling, Gleb can be found reading.

14 Comments

Fabio Oliveira
August 29, 2016 7:21 pm

Hello Gleb,

First of all, nice post!

I am trying to replicate this implementation using mongoDB, but so far without success.
I received the following error:

2016-08-30 00:23:38 ERROR OGG-15051 Java or JNI exception:
oracle.goldengate.util.GGException: java.lang.NoClassDefFoundError: com/mongodb/client/model/WriteModel
java.lang.NoClassDefFoundError: com/mongodb/client/model/WriteModel
at oracle.goldengate.datasource.DataSourceLauncher.<init>(DataSourceLauncher.java:161)
at oracle.goldengate.datasource.UserExitMain.main(UserExitMain.java:108)
Caused by: org.springframework.beans.factory.BeanDefinitionStoreException: Factory method [public final oracle.goldengate.datasource.GGDataSource oracle.goldengate.datasource.factory.DataSourceFactory.getDataSource()] threw exception; nested exception is java.lang.NoClassDefFoundError: com/mongodb/client/model/WriteModel
at org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:169)
at org.springframework.beans.factory.support.ConstructorResolver.instantiateUsingFactoryMethod(ConstructorResolver.java:570)
… 11 more
Caused by: java.lang.NoClassDefFoundError: com/mongodb/client/model/WriteModel
at oracle.goldengate.delivery.handler.mongodb.util.DBOperationFactory.init(DBOperationFactory.java:21)
at oracle.goldengate.delivery.handler.mongodb.MongoDBHandler.init(MongoDBHandler.java:41)
at oracle.goldengate.datasource.AbstractDataSource.addDataSourceListener(AbstractDataSource.java:664)
at oracle.goldengate.datasource.factory.DataSourceFactory.getDataSource(DataSourceFactory.java:159)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:149)
… 12 more
Caused by: java.lang.ClassNotFoundException: com.mongodb.client.model.WriteModel
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
… 21 more.

2016-08-30 00:23:38 ERROR OGG-01668 PROCESS ABENDING.

Can you help me on this?
Fábio Oliveira

Gleb Otochkin
August 30, 2016 11:29 am

Hi Fabio,
I haven’t tried the mongodb adapter during my tests.
I may try it in the coming days or over the weekend and write one more post if time permits.
Meanwhile, what version of the mongodb java driver do you use?
Is it in your classpath?

Thanks,
Gleb

Fabio Oliveira
August 30, 2016 3:45 pm

Hi Gleb,

My configuration is:

GoldenGate: Version 12.2.0.1.160419 OGGCORE_12.2.0.1.0OGGBP_PLATFORMS_160430.1401
Driver: mongodb-driver-3.2.2.jar
Adapter: ogg-mongodb-adapter-1.0.jar
MongoDB: 3.2.0-1
Java: 1.8.0_91

[root@localhost ~]# rpm -qa | grep mongo
mongodb-org-shell-3.2.0-1.el6.x86_64
mongodb-org-server-3.2.0-1.el6.x86_64
mongodb-org-tools-3.2.0-1.el6.x86_64
mongodb-org-mongos-3.2.0-1.el6.x86_64
mongodb-org-3.2.0-1.el6.x86_64
[root@localhost ~]# uname -a
Linux localhost.localdomain 2.6.32-642.el6.x86_64 #1 SMP Tue May 10 17:27:01 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
[root@localhost ~]# cat /etc/redhat-release
CentOS release 6.8 (Final)
[root@localhost ~]#

[oracle@localhost oggmongo]$ java -version
java version "1.8.0_91"
Java(TM) SE Runtime Environment (build 1.8.0_91-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.91-b14, mixed mode)
[oracle@localhost oggmongo]$

gg.classpath=/u01/oggmongo/mongo/lib/*:/u01/oggmongo/mongo/bin/*:

Thanks again,
Fábio Oliveira

Fabio Oliveira
August 30, 2016 3:46 pm

[oracle@localhost oggmongo]$ echo $LD_LIBRARY_PATH
/usr/java/jdk1.8.0_91/jre/lib/amd64:/u01/oggmongo:/usr/java/jdk1.8.0_91/jre/lib/amd64/server
[oracle@localhost oggmongo]$

Gleb Otochkin
August 31, 2016 8:35 am

Hi Fabio,
I’ve set up MongoDB and tried the adapter myself. It worked pretty well for me with mongo-java-driver-3.2.2.jar, but gave the same error as yours with mongodb-driver-3.2.2.jar. So it looks like you have to use the uber Java driver. You can get it using the link in the README for the adapter or from https://oss.sonatype.org/content/repositories/releases/org/mongodb/mongo-java-driver/3.2.2/

Try it and let me know if it works for you.

Cheers,
Gleb

Fábio Oliveira
August 31, 2016 9:22 am

Hi Gleb,

My bad was the “mongo-java-driver-3.2.2.jar”… I put it in the right location and now it works fine.

Cheers,
Fábio

Gleb Otochkin
August 31, 2016 9:27 am

Great. I am going to create a short post about mongodb adapter.

Cheers,
Gleb

David Kjerrumgaard
September 20, 2016 8:22 pm

How does a database update or delete affect the contents of the HBase table? Is the HBase handler smart enough to update the individual cell value or delete the records with the associated row key?


All paths are set, but I still get the error.

2017-03-24 08:29:15 ERROR OGG-15051 Oracle GoldenGate Delivery, rhbase.prm: Java or JNI exception:
java.lang.NoClassDefFoundError: oracle/goldengate/datasource/UserExitMain

Gleb Otochkin
March 24, 2017 10:13 am

Hi Ray,
It looks like it is not able to find the user exit java class.
Can you please check if you have “TARGETDB LIBFILE libggjava.so SET property=dirprm/hbase.props” in your replicat parameter file and “-Djava.class.path=ggjava/ggjava.jar” in your dirprm/hbase.props file?

Thanks,
Gleb


Thanks so much for writing this article. This is probably the best one by far. Easy to understand and educate myself on blog commenting and how the best way to go about it. Thanks a lot really appreciate you sharing with us. The information you provided is very helpful for Hbase Learners.

Gleb Otochkin
July 25, 2017 8:15 am

Hi Leviya,
Thank you for your kind words. I’m glad the blog has turned out to be useful for somebody.

Cheers,
Gleb


Thanks for sharing this information


thanks for sharing very informative and useful article.

