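Oracle GoldenGate Troubleshooting: OGG-01033
Environment overview:
The production environment uses OGG for data synchronization, and two new tables had to be added.
After the two tables were added, the target consistently held more data than the source. I ran a dedicated test to investigate and hit OGG-01033.
Error description: the pump1 process fails to start, with status ABENDED.
Source-side log:
2016-11-07 16:25:40 ERROR OGG-01033 There is a problem in network communication, a remote file problem, encryption keys for target and source do not match (if using ENCRYPT) or an unknown error. (Remote file used is ./dirdat/ro002250, reply received is Unable to open file "./dirdat/ro002250" (error 2, No such file or directory)).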
2016-11-07 16:25:40 ERROR OGG-01668 PROCESS ABENDING.
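The useful part of an OGG-01033 message is the parenthesised detail at the end: which remote file the pump asked for, and what the target replied. As a rough sketch (the function name `parse_ogg_01033` and the regular expression are my own illustration, not part of GoldenGate), the two values can be pulled out like this, tolerating a message that the log viewer has hard-wrapped in the middle of a word:

```python
import re

# Matches the trailing "(Remote file used is X, reply received is Y)." detail.
# The pattern is an assumption based on the message text shown in this post.
DETAIL = re.compile(
    r"Remote file used is (?P<remote>\S+), "
    r"reply received is (?P<reply>.*)\)\.\s*$"
)


def parse_ogg_01033(message):
    """Return (remote_file, reply) from an OGG-01033 message, or None."""
    # Re-join hard-wrapped pieces (e.g. "unk" + "nown") before matching.
    joined = "".join(message.splitlines())
    match = DETAIL.search(joined)
    if match is None:
        return None
    return match.group("remote"), match.group("reply")
```

The returned file name can then be compared with what is actually present in the target's dirdat directory.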
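Various posts online suggest the remote trail file may be locked (but in those cases the log says explicitly that the file is locked, and mine does not).
So I went to look at the rmttrail files on the target: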
oracle@a-db61:/data/ogg/dirdat$ ls -lrt
total 170388
-rw-r----- 1 oracle dba 89575424 Nov 3 06:24 po000000083
-rw-r----- 1 oracle dba 84889622 Nov 3 11:42 ro000002250
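The files were last modified on Nov 3, although my test ran on Nov 7.
Besides this test environment, two production databases also act as sources, and their processes turned out to be ABENDED as well (ugh).
After I started the source-side processes on the production databases, the test environment's pump1 ran for a while and then abended again, this time reporting a different file:
2016-11-07 17:52:21 ERROR OGG-01033 There is a problem in network communication, a remote file problem, encryption keys for target and source do not match (if using ENCRYPT) or an unknown error. (Remote file used is ./dirdat/ro002452, reply received is Unable to open file "./dirdat/ro002452" (error 2, No such file or directory)).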
2016-11-07 17:52:21 ERROR OGG-01668 PROCESS ABENDING.
According to what I read, in a single-instance environment this can be fixed as follows:
oracle@a-db61:/data/ogg/dirdat$ mv /data/ogg/dirdat/ro000002250 /data/ogg/dirdat/ro000002250bak
oracle@a-db61:/data/ogg/dirdat$ cp /data/ogg/dirdat/ro000002250bak /data/ogg/dirdat/ro000002250
oracle@a-db61:/data/ogg/dirdat$ pwd
/data/ogg/dirdat
oracle@a-db61:/data/ogg/dirdat$ ll
total 22520276
-rw-r----- 1 oracle dba 89575424 Nov 3 06:24 po000000083
-rw-r----- 1 oracle dba 84889622 Nov 7 17:46 ro000002250
-rw-r----- 1 oracle dba 84889622 Nov 3 11:42 ro000002250bak
………………
-rw-r----- 1 oracle dba 99999848 Nov 7 17:52 ro000002452
-rw-r----- 1 oracle dba 99999930 Nov 7 17:52 ro000002453
………………
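Restarting still did not help.
The problem had now become: none of the trail files delivered to the target could be opened.
Another possibility is that the remote trail directory is wrong, so I checked the pump1 parameters on the production sources: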
GGSCI (a-db2 as goldengate@BILLDB) 5> view params pump1
Extract pump1
PassThru
RmtHost 192.168.10.61, MgrPort 7809
RmtTrail ./dirdat/ro
GGSCI (a-db31 as goldengate@AAADB) 12> view params pump1
Extract pump1
PassThru
RmtHost 192.168.10.61, MgrPort 7809
RmtTrail ./dirdat/po
Both RmtTrail parameters point under ./dirdat.
The pump parameters in my test environment:
GGSCI (a-test30 as goldengate@qatest30) 11> view params pump1
Extract pump1
PassThru
RmtHost 192.168.10.61, MgrPort 7809
RmtTrail ./dirdat/ro
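At last, the problem:
The test environment's RmtTrail name is identical to the one used by the production BILLDB source. Both are ./dirdat/ro, and both pumps send to the same target host!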
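A collision like this is easy to miss when the parameter files live on different hosts. A minimal sketch of an automated check, assuming the text of each pump's parameter file has already been collected (the function names and the dictionary keyed by source host are hypothetical, and the parser only understands the simple `RmtHost` / `RmtTrail` lines shown above, keeping the last one it sees):

```python
from collections import defaultdict


def remote_trail_target(params_text):
    """Extract (remote host, remote trail) from a pump parameter file."""
    host = trail = None
    for line in params_text.splitlines():
        # "RmtHost 192.168.10.61, MgrPort 7809" -> ["RmtHost", "192.168.10.61", ...]
        words = line.replace(",", " ").split()
        if len(words) < 2:
            continue
        keyword = words[0].lower()
        if keyword == "rmthost":
            host = words[1]
        elif keyword == "rmttrail":
            trail = words[1]
    return host, trail


def find_collisions(params_by_source):
    """Map each (host, trail) used by more than one source to those sources."""
    users = defaultdict(list)
    for source, text in params_by_source.items():
        host, trail = remote_trail_target(text)
        if host is not None and trail is not None:
            users[(host, trail)].append(source)
    return {key: sorted(names) for key, names in users.items() if len(names) > 1}
```

Fed the three parameter files above, keyed by source host, this reports ("192.168.10.61", "./dirdat/ro") as shared by a-db2 and a-test30.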
Solution:
Change pump1's RmtTrail parameter to ./dirdat/go.
Simply running start again at this point does not work, because the processes were originally added with ./dirdat/ro. The start fails with:
2016-11-08 10:47:56 ERROR OGG-01044 The trail './dirdat/go' is not assigned to extract 'PUMP1'. Assign the trail to the extract with the command "ADD EXTTRAIL/RMTTRAIL ./dirdat/go, EXTRACT PUMP1".
2016-11-08 10:47:56 ERROR OGG-01668 PROCESS ABENDING.
I deleted all of the ext, pump, and rep processes and added them again.
Delete:
Target side
GGSCI (a-db61 as goldengate@jfogg) 3> info all
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER RUNNING
REPLICAT RUNNING REP1 09:09:58 115:49:19
REPLICAT RUNNING REP2 115:11:08 00:00:00
REPLICAT ABENDED REP3 111:49:46 12:39:24
GGSCI (a-db61 as goldengate@jfogg) 4> delete replicat rep3
Deleted REPLICAT REP3.
GGSCI (a-db61 as goldengate@jfogg) 5> info all
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER RUNNING
REPLICAT RUNNING REP1 09:09:58 116:04:29
REPLICAT RUNNING REP2 00:00:00 00:00:06
Source side (test environment):
GGSCI (a-test30 as goldengate@qatest30) 17> info all
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER RUNNING
EXTRACT RUNNING EXT1 00:00:00 00:00:00
EXTRACT ABENDED PUMP1 00:00:00 17:20:57
GGSCI (a-test30 as goldengate@qatest30) 18> delete RmtTrail ./dirdat/ro, Extract pump1
Deleting extract trail ./dirdat/ro for extract PUMP1
GGSCI (a-test30 as goldengate@qatest30) 19> delete Extract pump1
Deleted EXTRACT PUMP1.
GGSCI (a-test30 as goldengate@qatest30) 20> delete ExtTrail ./dirdat/eo, Extract ext1
Cannot delete extract trail ./dirdat/eo, extract EXT1 is running.
GGSCI (a-test30 as goldengate@qatest30) 21> info all
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER RUNNING
EXTRACT RUNNING EXT1 00:00:00 00:00:06
GGSCI (a-test30 as goldengate@qatest30) 22> stop ext1
Sending STOP request to EXTRACT EXT1 ...
Request processed.
GGSCI (a-test30 as goldengate@qatest30) 23> stop mgr
Manager process is required by other GGS processes.
Are you sure you want to stop it (y/n)?y
Sending STOP request to MANAGER ...
Request processed.
Manager stopped.
GGSCI (a-test30 as goldengate@qatest30) 24> info all
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER STOPPED
EXTRACT STOPPED EXT1 00:00:00 00:00:10
GGSCI (a-test30 as goldengate@qatest30) 25> delete ExtTrail ./dirdat/eo, Extract ext1
Deleting extract trail ./dirdat/eo for extract EXT1
GGSCI (a-test30 as goldengate@qatest30) 26> delete Extract ext1
Deleted EXTRACT EXT1.
GGSCI (a-test30 as goldengate@qatest30) 27> info all
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER STOPPED
Add:
---------Source side
GGSCI (a-test30 as goldengate@qatest30) 28> info all
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER STOPPED
GGSCI (a-test30 as goldengate@qatest30) 29> Add Extract ext1, TranLog, Begin Now
EXTRACT added.
GGSCI (a-test30 as goldengate@qatest30) 30> Add ExtTrail ./dirdat/eo, Extract ext1, MegaBytes 100
EXTTRAIL added.
GGSCI (a-test30 as goldengate@qatest30) 31> Add Extract pump1, ExtTrailSource ./dirdat/eo, Begin Now
EXTRACT added.
GGSCI (a-test30 as goldengate@qatest30) 32> Add RmtTrail ./dirdat/go, Extract pump1, MegaBytes 100
RMTTRAIL added.
GGSCI (a-test30 as goldengate@qatest30) 33> info all
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER STOPPED
EXTRACT STOPPED EXT1 00:00:00 00:01:21
EXTRACT STOPPED PUMP1 00:00:00 00:00:47
-------Target side
GGSCI (a-db61 as goldengate@jfogg) 6> Add Replicat rep3, ExtTrail ./dirdat/go, Begin Now
REPLICAT added.
GGSCI (a-db61 as goldengate@jfogg) 7> info all
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER RUNNING
REPLICAT RUNNING REP1 09:09:58 116:09:21
REPLICAT RUNNING REP2 00:00:00 00:00:00
REPLICAT STOPPED REP3 00:00:00 00:00:34
Start all processes:
GGSCI (a-test30 as goldengate@qatest30) 34> start mgr
Manager started.
GGSCI (a-test30 as goldengate@qatest30) 35> start ext1
Sending START request to MANAGER ...
EXTRACT EXT1 starting
GGSCI (a-test30 as goldengate@qatest30) 36> start pump1
Sending START request to MANAGER ...
EXTRACT PUMP1 starting
GGSCI (a-test30 as goldengate@qatest30) 38> info all
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER RUNNING
EXTRACT RUNNING EXT1 00:02:33 00:00:00
EXTRACT RUNNING PUMP1 00:00:00 00:00:01
GGSCI (a-db61 as goldengate@jfogg) 8> start rep3
Sending START request to MANAGER ...
REPLICAT REP3 starting
GGSCI (a-db61 as goldengate@jfogg) 9> info all
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER RUNNING
REPLICAT RUNNING REP1 09:09:58 116:10:40
REPLICAT RUNNING REP2 00:00:00 00:00:02
REPLICAT RUNNING REP3 00:00:00 00:00:07
Everything is back to normal:
GGSCI (a-db61 as goldengate@jfogg) 2> info rep3
REPLICAT REP3 Last Started 2016-11-08 10:56 Status RUNNING
Checkpoint Lag 00:00:00 (updated 00:00:00 ago)
Process ID 44117
Log Read Checkpoint File ./dirdat/go000000
2016-11-08 10:56:30.678779 RBA 1497
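For reference, the GGSCI commands from the sessions above can be condensed into one ordered list per side. The helper below is only an illustration (`rebuild_plan` and its parameter names are my own); it emits the same commands I typed, with the group and trail names substituted. It leaves out stopping and restarting the manager, which I also did on the source, and it assumes the pump and replicat are already stopped or abended, as they were here.

```python
def rebuild_plan(extract, pump, replicat, local_trail, old_remote, new_remote):
    """Return the GGSCI commands, in order, for the source and target sides."""
    source = [
        # A running extract refuses to have its trail deleted, so stop it first.
        f"stop {extract}",
        f"delete RmtTrail {old_remote}, Extract {pump}",
        f"delete Extract {pump}",
        f"delete ExtTrail {local_trail}, Extract {extract}",
        f"delete Extract {extract}",
        # Re-register everything, pointing the pump at the new remote trail.
        f"Add Extract {extract}, TranLog, Begin Now",
        f"Add ExtTrail {local_trail}, Extract {extract}, MegaBytes 100",
        f"Add Extract {pump}, ExtTrailSource {local_trail}, Begin Now",
        f"Add RmtTrail {new_remote}, Extract {pump}, MegaBytes 100",
        f"start {extract}",
        f"start {pump}",
    ]
    target = [
        f"delete replicat {replicat}",
        f"Add Replicat {replicat}, ExtTrail {new_remote}, Begin Now",
        f"start {replicat}",
    ]
    return {"source": source, "target": target}
```

Calling `rebuild_plan("ext1", "pump1", "rep3", "./dirdat/eo", "./dirdat/ro", "./dirdat/go")` reproduces the sequence used in this post. Remember that the RmtTrail line in the pump parameter file has to be edited to the new name as well.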
This article comes from the “lichdiamand” blog. Please keep this source link: http://lichdiamond.blog.51cto.com/12111063/1870669