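The ARCH wait on SENDREQ Wait Event

On a customer database, an AWR report covering an 8-hour snapshot interval showed a long average wait for the ARCH wait on SENDREQ event: about 3 seconds.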

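Coincidentally, the alert log reported that a new log could not be allocated. The archive_lag_target parameter is set fairly low (900, which forces a log switch at least every 900 seconds), but the database has 5 redo log groups, so in principle it should not run out of log groups. Because of this wait event, my guess was that the network between the primary and the standby is poor, so the primary cannot ship its archived logs to the standby in time, which in turn triggers "cannot allocate new log".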

Tue Jul 19 22:19:33 2011
Thread 1 cannot allocate new log, sequence 26887
Private strand flush not complete
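
To check whether the log groups are really tied up when this message appears, one could look at the state of each group. This is a minimal sketch, assuming query access to V$LOG on the primary; it was not part of the original investigation.

```sql
-- Show each online redo log group, its size, and whether it has been archived.
-- In ARCHIVELOG mode a group cannot be reused until ARCHIVED = 'YES'
-- and its STATUS has dropped back to INACTIVE.
SELECT group#,
       thread#,
       sequence#,
       bytes / 1024 / 1024 AS size_mb,
       archived,
       status
FROM   v$log
ORDER  BY thread#, sequence#;
```

If several groups show ARCHIVED = 'NO' at the same time, that would be consistent with the archiver falling behind.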

The following information about the ARCH wait on SENDREQ wait event can be found on Metalink (MOS):

"Data Guard Wait Events"

“ARCH wait on SENDREQ” This wait event monitors the amount of time spent by all archiver processes to write the received redo to disk as well as open and close the remote archived redo logs.

"Troubleshooting 9i Data Guard Network Issues"

The ‘ARCH wait on SENDREQ’ wait event increases during a log switch period. This event’s average wait time also increases as the network round trip time (RTT) increases. If this wait event is in the top 5, then it may be indicative of a saturated network or a poorly configured network. Also, make sure that enough redo log groups are configured so that any delay in remote archiving does not result in a hung database due to no available online redo logs.

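In other words, ARCH wait on SENDREQ shows up during log switches. If it appears among the Top 5 wait events, the network is either saturated or misconfigured (in short, the network is poor).

There is one more note, said to be from Metalink, although I could not find the original: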
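
Outside of an AWR report, the same event can be inspected directly. The query below is a sketch, assuming access to V$SYSTEM_EVENT on the primary; the figures it returns are cumulative since instance startup, so they will not match an AWR interval exactly.

```sql
-- Cumulative waits for the archiver network events since instance startup.
-- TIME_WAITED_MICRO is in microseconds, so divide by 1,000,000 for seconds.
SELECT event,
       total_waits,
       ROUND(time_waited_micro / 1000000, 1)                        AS total_wait_sec,
       ROUND(time_waited_micro / NULLIF(total_waits, 0) / 1000000, 3) AS avg_wait_sec
FROM   v$system_event
WHERE  event LIKE 'ARCH wait%'
ORDER  BY time_waited_micro DESC;
```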

1) It means that there is a slow network between primary and standby database.

2) It also means that there is a chance of slow performance on disk where remote archiving is happening.

Solution:

1. Please get in touch with your network admin and check the network response.

2. If the remote destination is slow and archiver is taking longer to archive to that destination, then the user needs to allocate more redo log groups so that there is a logfile available for a logswitch to switch into, and not wait for the archiver to finish archiving to the destination.

3. One more workaround you can use is to set below parameter in primary site:

_LOG_ARCHIVE_CALLOUT='LOCAL_FIRST=TRUE'
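
Suggestion 2 in the note above (more redo log groups) could be sketched as follows. The group number, file path and size are hypothetical placeholders, not values from the customer system; the first query only helps judge how often log switches actually occur before anything is added.

```sql
-- Log switches per hour over the last 24 hours.
SELECT TRUNC(first_time, 'HH24') AS hour_start,
       COUNT(*)                  AS log_switches
FROM   v$log_history
WHERE  first_time > SYSDATE - 1
GROUP  BY TRUNC(first_time, 'HH24')
ORDER  BY hour_start;

-- Add one more redo log group.
-- Group 6, the path and the 512M size are assumptions for illustration only;
-- in practice the size should match the existing groups.
ALTER DATABASE ADD LOGFILE GROUP 6
  ('/u01/oradata/ORCL/redo06.log') SIZE 512M;
```

Extra groups only buy time for the archiver; they do not remove the underlying network or disk delay.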

The third suggestion is to change the hidden parameter _LOG_ARCHIVE_CALLOUT:

_LOG_ARCHIVE_CALLOUT='LOCAL_FIRST=TRUE'

If the above parameter is set then the ARCH process will begin archiving to the local destination first. Once the redo log has been completely and successfully archived to at least one local destination, it will then be transmitted to the remote destination. This is the default behavior beginning with Oracle Database 10g Release 1.

Starting in 9.2.0.7 patchsets, one ARCH process will begin acting as a 'dedicated' archiver, handling only local archival duties. It will not perform remote log shipping or service FAL requests. This is a backport of behavior from 10gR1 to 9iR2.

I have doubts about the statement above, "This is the default behavior beginning with Oracle Database 10g Release 1", because on my local 11.2.0.1 database this hidden parameter is still empty by default.

I tried changing this hidden parameter on a non-production database as a test:

SQL> set linesize 132
SQL> column name format a30
SQL> column value format a25
SQL> select
2    x.ksppinm  name,
3    y.ksppstvl  value,
4    y.ksppstdf  isdefault,
5    decode(bitand(y.ksppstvf,7),1,'MODIFIED',4,'SYSTEM_MOD','FALSE')  ismod,
6    decode(bitand(y.ksppstvf,2),2,'TRUE','FALSE')  isadj
7  from
8    sys.x$ksppi x,
9    sys.x$ksppcv y
10  where
11    x.inst_id = userenv('Instance') and
12    y.inst_id = userenv('Instance') and
13    x.indx = y.indx and
14    x.ksppinm like '%_&par%'
15  order by
16    translate(x.ksppinm, ' _', ' ')
17  /
Enter value for par: callout
old  14:   x.ksppinm like '%_&par%'
new  14:   x.ksppinm like '%_callout%'

NAME                           VALUE                     ISDEFAULT ISMOD      ISADJ
------------------------------ ------------------------- --------- ---------- -----
_log_archive_callout                                     TRUE      FALSE      FALSE

SQL> alter system set "_log_archive_callout"='LOCAL_FIRST=TRUE';

System altered.

SQL> set linesize 132
SQL> column name format a30
SQL> column value format a25
SQL> select
2    x.ksppinm  name,
3    y.ksppstvl  value,
4    y.ksppstdf  isdefault,
5    decode(bitand(y.ksppstvf,7),1,'MODIFIED',4,'SYSTEM_MOD','FALSE')  ismod,
6    decode(bitand(y.ksppstvf,2),2,'TRUE','FALSE')  isadj
7  from
8    sys.x$ksppi x,
9    sys.x$ksppcv y
10  where
11    x.inst_id = userenv('Instance') and
12    y.inst_id = userenv('Instance') and
13    x.indx = y.indx and
14    x.ksppinm like '%_&par%'
15  order by
16    translate(x.ksppinm, ' _', ' ')
17  /
Enter value for par: callout
old  14:   x.ksppinm like '%_&par%'
new  14:   x.ksppinm like '%_callout%'

NAME                           VALUE                     ISDEFAULT ISMOD      ISADJ
------------------------------ ------------------------- --------- ---------- -----
_log_archive_callout           LOCAL_FIRST=TRUE          TRUE      SYSTEM_MOD FALSE

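In summary: I think the network between the customer's primary and standby databases should be checked carefully, along with disk I/O on the customer's standby.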
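
Network and disk I/O themselves have to be measured with operating system tools, but the database can at least show whether the remote destination is reporting errors or lagging. A minimal sketch, assuming it is run on the primary with access to V$ARCHIVE_DEST_STATUS:

```sql
-- State of every configured archive destination, as seen from the primary.
-- A non-empty ERROR column, or an ARCHIVED_SEQ# far behind the current
-- log sequence, points at trouble shipping redo to that destination.
SELECT dest_id,
       status,
       destination,
       archived_seq#,
       applied_seq#,
       error
FROM   v$archive_dest_status
WHERE  status <> 'INACTIVE'
ORDER  BY dest_id;
```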

References:

MOS: Data Guard Wait Events

MOS: Troubleshooting 9i Data Guard Network Issues

ITPUB: "Is ARCH wait on SENDREQ mainly a network problem?" http://www.itpub.net/thread-991316-1-1.html

This person ran into a problem similar to mine: http://www.dbasupport.com/forums/showthread.php?t=60772

"After setting _LOG_ARCHIVE_CALLOUT='LOCAL_FIRST=TRUE', are archived logs still shipped?" http://space.itpub.net/?uid-35489-action-viewspace-itemid-561465
