
Resolving Oracle GoldenGate error OGG-01033

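Environment overview:

A production environment uses OGG (Oracle GoldenGate) for data replication, and two new tables had to be added to the replication.

After the two tables were added, the target kept ending up with more data than the source. I set up a dedicated test to investigate and ran into the OGG-01033 error.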


Error description: the pump1 process failed to start and its status was ABENDED.

The log on the source side:

2016-11-07 16:25:40  ERROR   OGG-01033  There is a problem in network communication, a remote file problem, encryption keys for target and source do not match (if using ENCRYPT) or an unknown error. (Remote file used is ./dirdat/ro002250, reply received is Unable to open file "./dirdat/ro002250" (error 2, No such file or directory)).


2016-11-07 16:25:40  ERROR   OGG-01668  PROCESS ABENDING.
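The useful part of an OGG-01033 message is the parenthesized tail, which names the remote file and the reply from the target. The following is a minimal sketch for pulling those two fields out of a log line. The function name `parse_ogg_01033` and the regular expression are assumptions written against the message shown above, not part of GoldenGate.

```python
import re

# Matches the tail of the OGG-01033 message as it appears in the log above.
# This pattern is an assumption based on that one sample line.
OGG_01033_DETAIL = re.compile(
    r"Remote file used is (?P<remote_file>\S+), "
    r"reply received is (?P<reply>.*)\)\.\s*$"
)


def parse_ogg_01033(line):
    """Return (remote_file, reply) for an OGG-01033 line, or None if it does not match."""
    if "OGG-01033" not in line:
        return None
    match = OGG_01033_DETAIL.search(line)
    if match is None:
        return None
    return match.group("remote_file"), match.group("reply")


sample = (
    "2016-11-07 16:25:40  ERROR   OGG-01033  There is a problem in network "
    "communication, a remote file problem, encryption keys for target and source "
    "do not match (if using ENCRYPT) or an unknown error. (Remote file used is "
    './dirdat/ro002250, reply received is Unable to open file "./dirdat/ro002250" '
    "(error 2, No such file or directory))."
)
remote_file, reply = parse_ogg_01033(sample)
print(remote_file)
print(reply)
```

Here the reply is "No such file or directory" rather than a lock message, which is what distinguishes this case from the locked-file reports.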

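Various online sources suggest that the remote trail file may be locked. In those cases, however, the log explicitly says the file is locked, and my log contained no such message.

So I went to look at the remote trail files on the target: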

oracle@a-db61:/data/ogg/dirdat$ ls -lrt

total 170388

-rw-r----- 1 oracle dba 89575424 Nov  3 06:24 po000000083

-rw-r----- 1 oracle dba 84889622 Nov  3 11:42 ro000002250

The modification time of the remote trail file was stuck at Nov 3, although my test ran on Nov 7.
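The same check can be scripted when there are many trail files. The sketch below lists the files for one trail prefix whose modification time is older than a threshold. The function name `stale_trail_files` and its parameters are hypothetical, and it assumes trail files are named as the two-character prefix followed only by digits, as in the listing above.

```python
import os
import time


def stale_trail_files(directory, prefix, max_age_seconds, now=None):
    """Return (name, age_in_seconds) for trail files older than max_age_seconds.

    Only names of the form <prefix><digits> are considered, so copies such as
    'ro000002250bak' are ignored.
    """
    now = time.time() if now is None else now
    stale = []
    for name in sorted(os.listdir(directory)):
        suffix = name[len(prefix):]
        if not (name.startswith(prefix) and suffix.isdigit()):
            continue
        age = now - os.path.getmtime(os.path.join(directory, name))
        if age > max_age_seconds:
            stale.append((name, age))
    return stale


if __name__ == "__main__":
    # Example call; the path and the one-day threshold are illustrative only.
    trail_dir = "/data/ogg/dirdat"
    if os.path.isdir(trail_dir):
        for name, age in stale_trail_files(trail_dir, "ro", 24 * 3600):
            print(f"{name}: last written {age / 3600:.1f} hours ago")
```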


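Besides this test environment, two production databases also act as sources for this target, and their processes turned out to be ABENDED as well (ugh...).

After I started the source-side processes on the production databases, pump1 in the test environment ran for a short while and then abended again, but this time the error named a different file: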

2016-11-07 17:52:21  ERROR   OGG-01033  There is a problem in network communication, a remote file problem, encryption keys for target and source do not match (if using ENCRYPT) or an unknown error. (Remote file used is ./dirdat/ro002452, reply received is Unable to open file "./dirdat/ro002452" (error 2, No such file or directory)).

2016-11-07 17:52:21  ERROR   OGG-01668  PROCESS ABENDING.


Some references say that in a single-instance environment the problem can be fixed like this:

oracle@a-db61:/data/ogg/dirdat$ mv /data/ogg/dirdat/ro000002250 /data/ogg/dirdat/ro000002250bak

oracle@a-db61:/data/ogg/dirdat$ cp /data/ogg/dirdat/ro000002250bak /data/ogg/dirdat/ro000002250

oracle@a-db61:/data/ogg/dirdat$ pwd

/data/ogg/dirdat

oracle@a-db61:/data/ogg/dirdat$ ll

total 22520276

-rw-r----- 1 oracle dba 89575424 Nov  3 06:24 po000000083

-rw-r----- 1 oracle dba 84889622 Nov  7 17:46 ro000002250

-rw-r----- 1 oracle dba 84889622 Nov  3 11:42 ro000002250bak

………………

-rw-r----- 1 oracle dba 99999848 Nov  7 17:52 ro000002452

-rw-r----- 1 oracle dba 99999930 Nov  7 17:52 ro000002453

………………

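Restarting still did not help.

The problem had now become: none of the trail files delivered to the target could be opened.


Another possibility is that the remote trail directory is wrong, so I checked the pump1 parameters on the production sources: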

GGSCI (a-db2 as goldengate@BILLDB) 5> view params pump1

Extract pump1

PassThru

RmtHost 192.168.10.61, MgrPort 7809

RmtTrail ./dirdat/ro


GGSCI (a-db31 as goldengate@AAADB) 12> view params pump1

Extract pump1

PassThru

RmtHost 192.168.10.61, MgrPort 7809

RmtTrail ./dirdat/po

 

Both RmtTrail parameters point to ./dirdat.

Next, the pump parameters in my test environment:

GGSCI (a-test30 as goldengate@qatest30) 11> view params pump1

Extract pump1

PassThru

RmtHost 192.168.10.61, MgrPort 7809

RmtTrail ./dirdat/ro


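At last the problem was clear:

The RmtTrail prefix in the test environment duplicated the one used by the production BILLDB source. Both were ./dirdat/ro and both pumps sent to the same target host, 192.168.10.61.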
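A collision like this can be caught before a pump is started by comparing the RmtHost and RmtTrail values of every pump that writes to the same target. The following is a rough sketch; the helper names are hypothetical, and it assumes each parameter file contains a single RmtHost and a single RmtTrail entry, as the three files shown above do.

```python
import re
from collections import defaultdict


def parse_pump_params(text):
    """Extract (rmthost, rmttrail) from the text of a pump parameter file."""
    host = trail = None
    for raw in text.splitlines():
        line = raw.strip()
        m = re.match(r"(?i)rmthost\s+([^,\s]+)", line)
        if m:
            host = m.group(1)
        m = re.match(r"(?i)rmttrail\s+([^,\s]+)", line)
        if m:
            trail = m.group(1)
    return host, trail


def find_trail_collisions(params_by_source):
    """Map each (host, trail) used by more than one source to those sources."""
    users = defaultdict(list)
    for source, text in params_by_source.items():
        host, trail = parse_pump_params(text)
        if host and trail:
            users[(host, trail)].append(source)
    return {key: sorted(srcs) for key, srcs in users.items() if len(srcs) > 1}


# Parameter files as shown by "view params pump1" on the three sources.
TEMPLATE = "Extract pump1\nPassThru\nRmtHost 192.168.10.61, MgrPort 7809\nRmtTrail {}\n"
params = {
    "a-db2": TEMPLATE.format("./dirdat/ro"),
    "a-db31": TEMPLATE.format("./dirdat/po"),
    "a-test30": TEMPLATE.format("./dirdat/ro"),
}
print(find_trail_collisions(params))
# {('192.168.10.61', './dirdat/ro'): ['a-db2', 'a-test30']}
```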


Solution:

Change the RmtTrail parameter of pump1 to ./dirdat/go


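Simply running start again at this point does not work, because the existing processes were registered against the old trail ./dirdat/ro. Starting pump1 fails with: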

2016-11-08 10:47:56  ERROR   OGG-01044  The trail './dirdat/go' is not assigned to extract 'PUMP1'. Assign the trail to the extract with the command "ADD EXTTRAIL/RMTTRAIL ./dirdat/go, EXTRACT PUMP1".

2016-11-08 10:47:56  ERROR   OGG-01668  PROCESS ABENDING.


The ext, pump and rep processes for this stream are deleted and then added again.

Delete:

Target:

GGSCI (a-db61 as goldengate@jfogg) 3> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                                           

REPLICAT    RUNNING     REP1        09:09:58      115:49:19   

REPLICAT    RUNNING     REP2        115:11:08     00:00:00    

REPLICAT    ABENDED     REP3        111:49:46     12:39:24    

GGSCI (a-db61 as goldengate@jfogg) 4> delete replicat rep3

Deleted REPLICAT REP3.

GGSCI (a-db61 as goldengate@jfogg) 5> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                                           

REPLICAT    RUNNING     REP1        09:09:58      116:04:29   

REPLICAT    RUNNING     REP2        00:00:00      00:00:06    

Source (test environment):

GGSCI (a-test30 as goldengate@qatest30) 17> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                                           

EXTRACT     RUNNING     EXT1        00:00:00      00:00:00    

EXTRACT     ABENDED     PUMP1       00:00:00      17:20:57    

GGSCI (a-test30 as goldengate@qatest30) 18> delete RmtTrail ./dirdat/ro, Extract pump1

Deleting extract trail ./dirdat/ro for extract PUMP1

GGSCI (a-test30 as goldengate@qatest30) 19> delete Extract pump1

Deleted EXTRACT PUMP1.

GGSCI (a-test30 as goldengate@qatest30) 20> delete ExtTrail ./dirdat/eo, Extract ext1

Cannot delete extract trail ./dirdat/eo, extract EXT1 is running.

GGSCI (a-test30 as goldengate@qatest30) 21> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                                           

EXTRACT     RUNNING     EXT1        00:00:00      00:00:06    

GGSCI (a-test30 as goldengate@qatest30) 22> stop ext1

Sending STOP request to EXTRACT EXT1 ...

Request processed.

GGSCI (a-test30 as goldengate@qatest30) 23> stop mgr

Manager process is required by other GGS processes.

Are you sure you want to stop it (y/n)?y

Sending STOP request to MANAGER ...

Request processed.

Manager stopped.

GGSCI (a-test30 as goldengate@qatest30) 24> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     STOPPED                                           

EXTRACT     STOPPED     EXT1        00:00:00      00:00:10    

GGSCI (a-test30 as goldengate@qatest30) 25> delete ExtTrail ./dirdat/eo, Extract ext1

Deleting extract trail ./dirdat/eo for extract EXT1

GGSCI (a-test30 as goldengate@qatest30) 26> delete Extract ext1

Deleted EXTRACT EXT1.

GGSCI (a-test30 as goldengate@qatest30) 27> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     STOPPED                                           

Add:

---------Source

GGSCI (a-test30 as goldengate@qatest30) 28> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     STOPPED                                           

GGSCI (a-test30 as goldengate@qatest30) 29> Add Extract ext1, TranLog, Begin Now

EXTRACT added.

GGSCI (a-test30 as goldengate@qatest30) 30> Add ExtTrail ./dirdat/eo, Extract ext1, MegaBytes 100

EXTTRAIL added.

GGSCI (a-test30 as goldengate@qatest30) 31> Add Extract pump1, ExtTrailSource ./dirdat/eo, Begin Now

EXTRACT added.

GGSCI (a-test30 as goldengate@qatest30) 32> Add RmtTrail ./dirdat/go, Extract pump1, MegaBytes 100

RMTTRAIL added.

GGSCI (a-test30 as goldengate@qatest30) 33> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     STOPPED                                           

EXTRACT     STOPPED     EXT1        00:00:00      00:01:21    

EXTRACT     STOPPED     PUMP1       00:00:00      00:00:47    

-------Target

GGSCI (a-db61 as goldengate@jfogg) 6> Add Replicat rep3, ExtTrail ./dirdat/go, Begin Now

REPLICAT added.

GGSCI (a-db61 as goldengate@jfogg) 7> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                                           

REPLICAT    RUNNING     REP1        09:09:58      116:09:21   

REPLICAT    RUNNING     REP2        00:00:00      00:00:00    

REPLICAT    STOPPED     REP3        00:00:00      00:00:34 

Start all processes

GGSCI (a-test30 as goldengate@qatest30) 34> start mgr

Manager started.

GGSCI (a-test30 as goldengate@qatest30) 35> start ext1

Sending START request to MANAGER ...

EXTRACT EXT1 starting

GGSCI (a-test30 as goldengate@qatest30) 36> start pump1

Sending START request to MANAGER ...

EXTRACT PUMP1 starting

GGSCI (a-test30 as goldengate@qatest30) 38> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                                           

EXTRACT     RUNNING     EXT1        00:02:33      00:00:00    

EXTRACT     RUNNING     PUMP1       00:00:00      00:00:01    

GGSCI (a-db61 as goldengate@jfogg) 8> start rep3

Sending START request to MANAGER ...

REPLICAT REP3 starting

GGSCI (a-db61 as goldengate@jfogg) 9> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                                           

REPLICAT    RUNNING     REP1        09:09:58      116:10:40   

REPLICAT    RUNNING     REP2        00:00:00      00:00:02    

REPLICAT    RUNNING     REP3        00:00:00      00:00:07    

Everything is back to normal.

GGSCI (a-db61 as goldengate@jfogg) 2> info rep3

REPLICAT   REP3      Last Started 2016-11-08 10:56   Status RUNNING

Checkpoint Lag       00:00:00 (updated 00:00:00 ago)

Process ID           44117

Log Read Checkpoint  File ./dirdat/go000000

                     2016-11-08 10:56:30.678779  RBA 1497
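For reference, the delete and re-add sequence used in this post can be collected into one place. The sketch below only builds the GGSCI command strings in the order they were run above; it does not talk to GoldenGate. The function name `rebuild_commands` and its parameters are hypothetical, the pump is assumed to be already stopped or abended as it was here, and the manager stop and start from the transcript is left out.

```python
def rebuild_commands(extract, pump, replicat, local_trail,
                     old_remote_trail, new_remote_trail, megabytes=100):
    """Return GGSCI commands, grouped by where they are run, in the order used in this post."""
    target_before = [f"delete replicat {replicat}"]
    source = [
        # Remove the pump and its old remote trail registration.
        f"delete RmtTrail {old_remote_trail}, Extract {pump}",
        f"delete Extract {pump}",
        # The local trail cannot be deleted while the extract is running.
        f"stop {extract}",
        f"delete ExtTrail {local_trail}, Extract {extract}",
        f"delete Extract {extract}",
        # Re-add everything, pointing the pump at the new remote trail.
        f"Add Extract {extract}, TranLog, Begin Now",
        f"Add ExtTrail {local_trail}, Extract {extract}, MegaBytes {megabytes}",
        f"Add Extract {pump}, ExtTrailSource {local_trail}, Begin Now",
        f"Add RmtTrail {new_remote_trail}, Extract {pump}, MegaBytes {megabytes}",
        f"start {extract}",
        f"start {pump}",
    ]
    target_after = [
        f"Add Replicat {replicat}, ExtTrail {new_remote_trail}, Begin Now",
        f"start {replicat}",
    ]
    return {"target_before": target_before, "source": source, "target_after": target_after}


commands = rebuild_commands("ext1", "pump1", "rep3",
                            "./dirdat/eo", "./dirdat/ro", "./dirdat/go")
for side, lines in commands.items():
    print(f"-- {side}")
    for line in lines:
        print(line)
```

Because every process is re-added with Begin Now, capture starts from the moment of the add, which was acceptable in this test environment.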


This article comes from the "lichdiamand" blog. Please keep this attribution: http://lichdiamond.blog.51cto.com/12111063/1870669
