April 05, 2012

Grid Infrastructure Redundant Interconnect

After installing RAC 11.2.0.3 for one of my clients, I was confused about why there was another IP in the 169.254.*.* range, because I only use one private connection for my cluster. After some googling I finally found this explanation. Enjoy :)



11gR2 Grid Infrastructure Redundant Interconnect and ora.cluster_interconnect.haip

Redundant Interconnect without any 3rd-party IP failover technology (bond, IPMP or similar) is supported natively by Grid Infrastructure starting from 11.2.0.2. Multiple private network adapters can be defined either during the installation phase or afterward using oifcfg. The Oracle Database, CSS, OCR, CRS, CTSS, and EVM components in 11.2.0.2 employ it automatically.

Grid Infrastructure can activate a maximum of four private network adapters at a time, even if more are defined. The ora.cluster_interconnect.haip resource will start one to four link-local HAIP addresses on the private network adapters for interconnect communication for Oracle RAC, Oracle ASM, Oracle ACFS, etc.

Grid automatically picks free link-local addresses from the reserved 169.254.*.* subnet for HAIP. According to RFC 3927, the link-local subnet 169.254.*.* should not be used for any other purpose. With HAIP, interconnect traffic is by default load balanced across all active interconnect interfaces, and the corresponding HAIP address fails over transparently to another adapter if one fails or becomes non-communicative.
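To tell a HAIP address apart from a real private IP in a script, it is enough to test whether the address falls in the link-local range. The following is a minimal sketch using Python's standard ipaddress module; the function name is_haip_candidate is hypothetical, and the two sample addresses are taken from the listings in this post.

```python
import ipaddress

# IPv4 link-local range reserved by RFC 3927
LINK_LOCAL = ipaddress.ip_network("169.254.0.0/16")


def is_haip_candidate(addr: str) -> bool:
    """Return True if addr is an IPv4 address inside 169.254.0.0/16."""
    ip = ipaddress.ip_address(addr)
    return ip.version == 4 and ip in LINK_LOCAL


if __name__ == "__main__":
    for addr in ("169.254.167.163", "10.1.0.168"):
        print(addr, is_haip_candidate(addr))
```

An address in this range is only a candidate: the check says nothing about whether the haip resource actually owns it.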

The number of HAIP addresses is decided by how many private network adapters are active when Grid comes up on the first node in the cluster. If there is only one active private network, Grid creates one HAIP; if two, Grid creates two; and if more than two, Grid creates four. The number of HAIPs does not change even if more private network adapters are activated later; a restart of clusterware on all nodes is required for the number to change. The newly activated adapters can, however, be used for failover.
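The counting rule above can be written down as a small function. This is only a restatement of the rule as described in this post, not Oracle code; the name expected_haip_count is hypothetical.

```python
def expected_haip_count(active_private_adapters: int) -> int:
    """Number of HAIPs Grid creates, per the rule described above.

    active_private_adapters is the number of private adapters that are
    active when Grid comes up on the first node of the cluster.
    """
    if active_private_adapters < 1:
        raise ValueError("at least one active private adapter is required")
    if active_private_adapters <= 2:
        # one adapter -> one HAIP, two adapters -> two HAIPs
        return active_private_adapters
    # more than two adapters -> four HAIPs
    return 4


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        print(n, "adapter(s) ->", expected_haip_count(n), "HAIP(s)")
```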

When Oracle Clusterware is fully up, the haip resource should show a status of ONLINE:

$ $GRID_HOME/bin/crsctl stat res -t -init
..
ora.cluster_interconnect.haip
1 ONLINE ONLINE racnode1
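If you want to check this status from a script, the text output can be parsed. The sketch below assumes the layout shown above (the resource name on one line, followed by a line of the form "instance TARGET STATE server"); the function name haip_state is hypothetical and the sample text is copied from the listing, so adjust the parsing if your crsctl output is laid out differently.

```python
SAMPLE = """\
ora.cluster_interconnect.haip
1 ONLINE ONLINE racnode1
"""


def haip_state(crsctl_output: str):
    """Return (target, state) for the haip resource, or None if not found."""
    lines = [ln.strip() for ln in crsctl_output.splitlines()]
    for i, ln in enumerate(lines):
        if ln != "ora.cluster_interconnect.haip":
            continue
        # The first non-empty line after the resource name holds the status.
        for nxt in lines[i + 1:]:
            if not nxt:
                continue
            fields = nxt.split()
            if len(fields) >= 3:
                return fields[1], fields[2]
            return None
    return None


if __name__ == "__main__":
    print(haip_state(SAMPLE))
```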


Case 1: Single Private Network Adapter

If multiple physical network adapters are bonded together at the OS level and presented as a single device name, for example bond0, it is still considered a single network adapter environment. A single private network adapter does not offer true HAIP because there is only one adapter; at least two are recommended. If only one private network adapter is defined, such as eth1 in the example below, HAIP creates one virtual IP. Here is what is expected when Grid is up and running:

$ $GRID_HOME/bin/oifcfg getif
eth1 10.1.0.128 global cluster_interconnect
eth3 10.1.0.0 global public


$ $GRID_HOME/bin/oifcfg iflist -p -n
eth1 10.1.0.128 PRIVATE 255.255.255.128
eth1 169.254.0.0 UNKNOWN 255.255.0.0
eth3 10.1.0.0 PRIVATE 255.255.255.128

Note: subnet 169.254.0.0 on eth1 is started by resource haip.
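The HAIP subnets can be picked out of the oifcfg iflist -p -n output mechanically. The following is a minimal sketch that assumes the four-column layout shown above (interface, subnet, type, netmask); the function name haip_subnets is hypothetical and the sample text is the listing from this section.

```python
import ipaddress
from collections import defaultdict

LINK_LOCAL = ipaddress.ip_network("169.254.0.0/16")

SAMPLE = """\
eth1 10.1.0.128 PRIVATE 255.255.255.128
eth1 169.254.0.0 UNKNOWN 255.255.0.0
eth3 10.1.0.0 PRIVATE 255.255.255.128
"""


def haip_subnets(iflist_output: str) -> dict:
    """Map interface name -> list of link-local subnets found on it."""
    result = defaultdict(list)
    for line in iflist_output.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank lines and ".." placeholders
        name, subnet, _if_type, mask = fields
        try:
            net = ipaddress.ip_network(f"{subnet}/{mask}")
        except ValueError:
            continue  # not a subnet/netmask pair
        if net.subnet_of(LINK_LOCAL):
            result[name].append(str(net))
    return dict(result)


if __name__ == "__main__":
    print(haip_subnets(SAMPLE))
```

For the single-adapter listing this reports one /16 subnet on eth1, which is the one started by the haip resource.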


ifconfig
..
eth1 Link encap:Ethernet HWaddr 00:16:3E:11:11:22
inet addr:10.1.0.168 Bcast:10.1.0.255 Mask:255.255.255.128
inet6 addr: fe80::216:3eff:fe11:1122/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:6369306 errors:0 dropped:0 overruns:0 frame:0
TX packets:4270790 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:3037449975 (2.8 GiB) TX bytes:2705797005 (2.5 GiB)

eth1:1 Link encap:Ethernet HWaddr 00:16:3E:11:22:22
inet addr:169.254.167.163 Bcast:169.254.255.255 Mask:255.255.0.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1


Instance alert.log (ASM and database):

Private Interface 'eth1:1' configured from GPnP for use as a private interconnect.
[name='eth1:1', type=1, ip=169.254.167.163, mac=00-16-3e-11-11-22, net=169.254.0.0/16, mask=255.255.0.0, use=haip:cluster_interconnect/62]
Public Interface 'eth3' configured from GPnP for use as a public interface.
[name='eth3', type=1, ip=10.1.0.68, mac=00-16-3e-11-11-44, net=10.1.0.0/25, mask=255.255.255.128, use=public/1]
..
Shared memory segment for instance monitoring created
Picked latch-free SCN scheme 3
..
Cluster communication is configured to use the following interface(s) for this instance
169.254.167.163

Note: the interconnect will use virtual private IP 169.254.167.163 instead of the real private IP. A pre-11.2.0.2 instance will by default still use the real private IP; to take advantage of the new feature, the init.ora parameter cluster_interconnects can be updated each time Grid is restarted.


For 11.2.0.2 and above, v$cluster_interconnects will show haip info:

SQL> select name,ip_address from v$cluster_interconnects;

NAME IP_ADDRESS
--------------- ----------------
eth1:1 169.254.167.163


Case 2: Multiple Private Network Adapters

2.1. Default Status

Here is an example of 3 private networks eth1, eth6 and eth7 when Grid is up and running:

$ $GRID_HOME/bin/oifcfg getif
eth1 10.1.0.128 global cluster_interconnect
eth3 10.1.0.0 global public
eth6 10.11.0.128 global cluster_interconnect
eth7 10.12.0.128 global cluster_interconnect


$ $GRID_HOME/bin/oifcfg iflist -p -n
eth1 10.1.0.128 PRIVATE 255.255.255.128
eth1 169.254.0.0 UNKNOWN 255.255.192.0
eth1 169.254.192.0 UNKNOWN 255.255.192.0
eth3 10.1.0.0 PRIVATE 255.255.255.128
eth6 10.11.0.128 PRIVATE 255.255.255.128
eth6 169.254.64.0 UNKNOWN 255.255.192.0
eth7 10.12.0.128 PRIVATE 255.255.255.128
eth7 169.254.128.0 UNKNOWN 255.255.192.0

Note: resource haip started four virtual private IPs: two on eth1, and one each on eth6 and eth7.
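The four subnets in the listing above, each with mask 255.255.192.0, are the four /18 quarters of 169.254.0.0/16. The sketch below reproduces that arithmetic with Python's ipaddress module and maps each HAIP address from this example to its quarter; the helper name subnet_for is hypothetical, and the code only illustrates the subnet math, not how Grid chooses addresses.

```python
import ipaddress

LINK_LOCAL = ipaddress.ip_network("169.254.0.0/16")

# Splitting a /16 by two more prefix bits yields four /18 subnets.
QUARTERS = list(LINK_LOCAL.subnets(prefixlen_diff=2))


def subnet_for(addr: str) -> ipaddress.IPv4Network:
    """Return the /18 quarter of 169.254.0.0/16 that contains addr."""
    ip = ipaddress.ip_address(addr)
    for net in QUARTERS:
        if ip in net:
            return net
    raise ValueError(f"{addr} is not in {LINK_LOCAL}")


if __name__ == "__main__":
    for net in QUARTERS:
        print(net, net.netmask)
    for addr in ("169.254.30.98", "169.254.112.250",
                 "169.254.178.237", "169.254.244.103"):
        print(addr, "->", subnet_for(addr))
```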


ifconfig
..
eth1 Link encap:Ethernet HWaddr 00:16:3E:11:11:22
inet addr:10.1.0.168 Bcast:10.1.0.255 Mask:255.255.255.128
inet6 addr: fe80::216:3eff:fe11:1122/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:15176906 errors:0 dropped:0 overruns:0 frame:0
TX packets:10239298 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:7929246238 (7.3 GiB) TX bytes:5768511630 (5.3 GiB)

eth1:1 Link encap:Ethernet HWaddr 00:16:3E:11:11:22
inet addr:169.254.30.98 Bcast:169.254.63.255 Mask:255.255.192.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

eth1:2 Link encap:Ethernet HWaddr 00:16:3E:11:11:22
inet addr:169.254.244.103 Bcast:169.254.255.255 Mask:255.255.192.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

eth6 Link encap:Ethernet HWaddr 00:16:3E:11:11:77
inet addr:10.11.0.188 Bcast:10.11.0.255 Mask:255.255.255.128
inet6 addr: fe80::216:3eff:fe11:1177/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:7068185 errors:0 dropped:0 overruns:0 frame:0
TX packets:595746 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:2692567483 (2.5 GiB) TX bytes:382357191 (364.6 MiB)

eth6:1 Link encap:Ethernet HWaddr 00:16:3E:11:11:77
inet addr:169.254.112.250 Bcast:169.254.127.255 Mask:255.255.192.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

eth7 Link encap:Ethernet HWaddr 00:16:3E:11:11:88
inet addr:10.12.0.208 Bcast:10.12.0.255 Mask:255.255.255.128
inet6 addr: fe80::216:3eff:fe11:1188/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:6435829 errors:0 dropped:0 overruns:0 frame:0
TX packets:314780 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:2024577502 (1.8 GiB) TX bytes:172461585 (164.4 MiB)

eth7:1 Link encap:Ethernet HWaddr 00:16:3E:11:11:88
inet addr:169.254.178.237 Bcast:169.254.191.255 Mask:255.255.192.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1


Instance alert.log (ASM and database):

Private Interface 'eth1:1' configured from GPnP for use as a private interconnect.
[name='eth1:1', type=1, ip=169.254.30.98, mac=00-16-3e-11-11-22, net=169.254.0.0/18, mask=255.255.192.0, use=haip:cluster_interconnect/62]
Private Interface 'eth6:1' configured from GPnP for use as a private interconnect.
[name='eth6:1', type=1, ip=169.254.112.250, mac=00-16-3e-11-11-77, net=169.254.64.0/18, mask=255.255.192.0, use=haip:cluster_interconnect/62]
Private Interface 'eth7:1' configured from GPnP for use as a private interconnect.
[name='eth7:1', type=1, ip=169.254.178.237, mac=00-16-3e-11-11-88, net=169.254.128.0/18, mask=255.255.192.0, use=haip:cluster_interconnect/62]
Private Interface 'eth1:2' configured from GPnP for use as a private interconnect.
[name='eth1:2', type=1, ip=169.254.244.103, mac=00-16-3e-11-11-22, net=169.254.192.0/18, mask=255.255.192.0, use=haip:cluster_interconnect/62]
Public Interface 'eth3' configured from GPnP for use as a public interface.
[name='eth3', type=1, ip=10.1.0.68, mac=00-16-3e-11-11-44, net=10.1.0.0/25, mask=255.255.255.128, use=public/1]
Picked latch-free SCN scheme 3

..
Cluster communication is configured to use the following interface(s) for this instance
169.254.30.98
169.254.112.250
169.254.178.237
169.254.244.103

Note: interconnect communication will use all four virtual private IPs; in case of network failure, as long as there is one private network adapter functioning, all four IPs will remain active.

2.2. When Private Network Adapter Fails

If one private network adapter fails, in this example eth6, the virtual private IP on eth6 is relocated automatically to a healthy adapter, and this is transparent to the instances (ASM or database).

$ $GRID_HOME/bin/oifcfg iflist -p -n
eth1 10.1.0.128 PRIVATE 255.255.255.128
eth1 169.254.0.0 UNKNOWN 255.255.192.0
eth1 169.254.128.0 UNKNOWN 255.255.192.0
eth7 10.12.0.128 PRIVATE 255.255.255.128
eth7 169.254.64.0 UNKNOWN 255.255.192.0
eth7 169.254.192.0 UNKNOWN 255.255.192.0

Note: virtual private IP on eth6 subnet 169.254.64.0 relocated to eth7


ifconfig
..
eth1 Link encap:Ethernet HWaddr 00:16:3E:11:11:22
inet addr:10.1.0.168 Bcast:10.1.0.255 Mask:255.255.255.128
inet6 addr: fe80::216:3eff:fe11:1122/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:15183840 errors:0 dropped:0 overruns:0 frame:0
TX packets:10245071 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:7934311823 (7.3 GiB) TX bytes:5771878414 (5.3 GiB)

eth1:1 Link encap:Ethernet HWaddr 00:16:3E:11:11:22
inet addr:169.254.30.98 Bcast:169.254.63.255 Mask:255.255.192.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

eth1:3 Link encap:Ethernet HWaddr 00:16:3E:11:11:22
inet addr:169.254.178.237 Bcast:169.254.191.255 Mask:255.255.192.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

eth7 Link encap:Ethernet HWaddr 00:16:3E:11:11:88
inet addr:10.12.0.208 Bcast:10.12.0.255 Mask:255.255.255.128
inet6 addr: fe80::216:3eff:fe11:1188/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:6438985 errors:0 dropped:0 overruns:0 frame:0
TX packets:315877 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:2026266447 (1.8 GiB) TX bytes:173101641 (165.0 MiB)

eth7:2 Link encap:Ethernet HWaddr 00:16:3E:11:11:88
inet addr:169.254.112.250 Bcast:169.254.127.255 Mask:255.255.192.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

eth7:3 Link encap:Ethernet HWaddr 00:16:3E:11:11:88
inet addr:169.254.244.103 Bcast:169.254.255.255 Mask:255.255.192.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

2.3. When Another Private Network Adapter Fails

If another private network adapter goes down, in this example eth1, the virtual private IPs on it are relocated automatically to the other healthy adapter with no impact on the instances (ASM or database).

$ $GRID_HOME/bin/oifcfg iflist -p -n
eth7 10.12.0.128 PRIVATE 255.255.255.128
eth7 169.254.64.0 UNKNOWN 255.255.192.0
eth7 169.254.192.0 UNKNOWN 255.255.192.0
eth7 169.254.0.0 UNKNOWN 255.255.192.0
eth7 169.254.128.0 UNKNOWN 255.255.192.0


ifconfig
..
eth7 Link encap:Ethernet HWaddr 00:16:3E:11:11:88
inet addr:10.12.0.208 Bcast:10.12.0.255 Mask:255.255.255.128
inet6 addr: fe80::216:3eff:fe11:1188/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:6441559 errors:0 dropped:0 overruns:0 frame:0
TX packets:317271 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:2027824788 (1.8 GiB) TX bytes:173810658 (165.7 MiB)

eth7:1 Link encap:Ethernet HWaddr 00:16:3E:11:11:88
inet addr:169.254.30.98 Bcast:169.254.63.255 Mask:255.255.192.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

eth7:2 Link encap:Ethernet HWaddr 00:16:3E:11:11:88
inet addr:169.254.112.250 Bcast:169.254.127.255 Mask:255.255.192.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

eth7:3 Link encap:Ethernet HWaddr 00:16:3E:11:11:88
inet addr:169.254.244.103 Bcast:169.254.255.255 Mask:255.255.192.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

eth7:4 Link encap:Ethernet HWaddr 00:16:3E:11:11:88
inet addr:169.254.178.237 Bcast:169.254.191.255 Mask:255.255.192.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

2.4. When Private Network Adapter Restores

If private network adapter eth6 is restored, it is activated automatically and virtual private IPs are assigned to it:

$ $GRID_HOME/bin/oifcfg iflist -p -n
..
eth6 10.11.0.128 PRIVATE 255.255.255.128
eth6 169.254.128.0 UNKNOWN 255.255.192.0
eth6 169.254.0.0 UNKNOWN 255.255.192.0
eth7 10.12.0.128 PRIVATE 255.255.255.128
eth7 169.254.64.0 UNKNOWN 255.255.192.0
eth7 169.254.192.0 UNKNOWN 255.255.192.0


ifconfig
..
eth6 Link encap:Ethernet HWaddr 00:16:3E:11:11:77
inet addr:10.11.0.188 Bcast:10.11.0.255 Mask:255.255.255.128
inet6 addr: fe80::216:3eff:fe11:1177/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:398 errors:0 dropped:0 overruns:0 frame:0
TX packets:121 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:185138 (180.7 KiB) TX bytes:56439 (55.1 KiB)

eth6:1 Link encap:Ethernet HWaddr 00:16:3E:11:11:77
inet addr:169.254.178.237 Bcast:169.254.191.255 Mask:255.255.192.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

eth6:2 Link encap:Ethernet HWaddr 00:16:3E:11:11:77
inet addr:169.254.30.98 Bcast:169.254.63.255 Mask:255.255.192.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

eth7 Link encap:Ethernet HWaddr 00:16:3E:11:11:88
inet addr:10.12.0.208 Bcast:10.12.0.255 Mask:255.255.255.128
inet6 addr: fe80::216:3eff:fe11:1188/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:6442552 errors:0 dropped:0 overruns:0 frame:0
TX packets:317983 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:2028404133 (1.8 GiB) TX bytes:174103017 (166.0 MiB)

eth7:2 Link encap:Ethernet HWaddr 00:16:3E:11:11:88
inet addr:169.254.112.250 Bcast:169.254.127.255 Mask:255.255.192.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

eth7:3 Link encap:Ethernet HWaddr 00:16:3E:11:11:88
inet addr:169.254.244.103 Bcast:169.254.255.255 Mask:255.255.192.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
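The placement seen in the listings of Case 2 can be modelled with a simple round-robin over the healthy adapters. This is purely an illustrative model: the source does not describe Oracle's actual placement algorithm, and the function name redistribute is hypothetical. It happens to reproduce the four states shown in this example (all adapters up, eth6 down, eth6 and eth1 down, eth6 restored with eth1 still down).

```python
# HAIP addresses from this example, in numeric order
HAIPS = ["169.254.30.98", "169.254.112.250",
         "169.254.178.237", "169.254.244.103"]


def redistribute(haips, healthy_adapters):
    """Assign each HAIP to a healthy adapter in round-robin order.

    Illustrative model only, not Oracle's implementation.
    """
    adapters = sorted(healthy_adapters)
    if not adapters:
        raise ValueError("no healthy private adapter left")
    return {ip: adapters[i % len(adapters)] for i, ip in enumerate(haips)}


if __name__ == "__main__":
    for healthy in (["eth1", "eth6", "eth7"],   # 2.1 default status
                    ["eth1", "eth7"],           # 2.2 eth6 failed
                    ["eth7"],                   # 2.3 eth1 failed as well
                    ["eth6", "eth7"]):          # 2.4 eth6 restored
        print(healthy, redistribute(HAIPS, healthy))
```

The point of the model is that all four addresses stay assigned as long as at least one private adapter is healthy, and that a single failure can move addresses between surviving adapters as well.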


HAIP Log File

Resource haip is managed by ohasd.bin. Its logs are located in $GRID_HOME/log//ohasd/ohasd.log and $GRID_HOME/log//agent/ohasd/orarootagent_root/orarootagent_root.log

L1. Log Sample When Private Network Adapter Fails

In a multiple private network adapter environment, if one of the adapters fails:

  • ohasd.log

2010-09-24 09:10:00.891: [GIPCHGEN][1083025728]gipchaInterfaceFail: marking interface failing 0x2aaab0269a10 { host '', haName 'CLSFRAME_a2b2', local (nil), ip '10.11.0.188', subnet '10.11.0.128', mask '255.255.255.128', numRef 0, numFail 0, flags 0x4d }
2010-09-24 09:10:00.902: [GIPCHGEN][1138145600]gipchaInterfaceDisable: disabling interface 0x2aaab0269a10 { host '', haName 'CLSFRAME_a2b2', local (nil), ip '10.11.0.188', subnet '10.11.0.128', mask '255.255.255.128', numRef 0, numFail 0, flags 0x1cd }
2010-09-24 09:10:00.902: [GIPCHDEM][1138145600]gipchaWorkerCleanInterface: performing cleanup of disabled interface 0x2aaab0269a10 { host '', haName 'CLSFRAME_a2b2', local (nil), ip '10.11.0.188', subnet '10.11.0.128', mask '255.255.255.128', numRef 0, numFail 0, flags 0x1ed }

  • orarootagent_root.log

2010-09-24 09:09:57.708: [ USRTHRD][1129138496] {0:0:2} failed to receive ARP request
2010-09-24 09:09:57.708: [ USRTHRD][1129138496] {0:0:2} Assigned IP 169.254.112.250 no longer valid on inf eth6
2010-09-24 09:09:57.708: [ USRTHRD][1129138496] {0:0:2} VipActions::startIp {
2010-09-24 09:09:57.708: [ USRTHRD][1129138496] {0:0:2} Adding 169.254.112.250 on eth6:1
2010-09-24 09:09:57.719: [ USRTHRD][1129138496] {0:0:2} VipActions::startIp }
2010-09-24 09:09:57.719: [ USRTHRD][1129138496] {0:0:2} Reassigned IP: 169.254.112.250 on interface eth6
2010-09-24 09:09:58.013: [ USRTHRD][1082325312] {0:0:2} HAIP: Updating member info HAIP1;10.11.0.128#0;10.11.0.128#1
2010-09-24 09:09:58.015: [ USRTHRD][1082325312] {0:0:2} HAIP: Moving ip '169.254.112.250' from inf 'eth6' to inf 'eth7'
2010-09-24 09:09:58.015: [ USRTHRD][1082325312] {0:0:2} pausing thread
2010-09-24 09:09:58.015: [ USRTHRD][1082325312] {0:0:2} posting thread
2010-09-24 09:09:58.016: [ USRTHRD][1082325312] {0:0:2} Thread:[NetHAWork]start {
2010-09-24 09:09:58.016: [ USRTHRD][1082325312] {0:0:2} Thread:[NetHAWork]start }
2010-09-24 09:09:58.016: [ USRTHRD][1082325312] {0:0:2} HAIP: Moving ip '169.254.244.103' from inf 'eth1' to inf 'eth7'
2010-09-24 09:09:58.016: [ USRTHRD][1082325312] {0:0:2} pausing thread
2010-09-24 09:09:58.016: [ USRTHRD][1082325312] {0:0:2} posting thread
2010-09-24 09:09:58.016: [ USRTHRD][1082325312] {0:0:2} Thread:[NetHAWork]start {
2010-09-24 09:09:58.016: [ USRTHRD][1082325312] {0:0:2} Thread:[NetHAWork]start }
2010-09-24 09:09:58.016: [ USRTHRD][1082325312] {0:0:2} HAIP: Moving ip '169.254.178.237' from inf 'eth7' to inf 'eth1'
2010-09-24 09:09:58.016: [ USRTHRD][1082325312] {0:0:2} pausing thread
2010-09-24 09:09:58.016: [ USRTHRD][1082325312] {0:0:2} posting thread
2010-09-24 09:09:58.017: [ USRTHRD][1082325312] {0:0:2} Thread:[NetHAWork]start {
2010-09-24 09:09:58.017: [ USRTHRD][1116531008] {0:0:2} [NetHAWork] thread started
2010-09-24 09:09:58.017: [ USRTHRD][1116531008] {0:0:2} Arp::sCreateSocket {
2010-09-24 09:09:58.017: [ USRTHRD][1093232960] {0:0:2} [NetHAWork] thread started
2010-09-24 09:09:58.017: [ USRTHRD][1093232960] {0:0:2} Arp::sCreateSocket {
2010-09-24 09:09:58.017: [ USRTHRD][1082325312] {0:0:2} Thread:[NetHAWork]start }
2010-09-24 09:09:58.018: [ USRTHRD][1143847232] {0:0:2} [NetHAWork] thread started
2010-09-24 09:09:58.018: [ USRTHRD][1143847232] {0:0:2} Arp::sCreateSocket {
2010-09-24 09:09:58.034: [ USRTHRD][1116531008] {0:0:2} Arp::sCreateSocket }
2010-09-24 09:09:58.034: [ USRTHRD][1116531008] {0:0:2} Starting Probe for ip 169.254.112.250
2010-09-24 09:09:58.034: [ USRTHRD][1116531008] {0:0:2} Transitioning to Probe State
2010-09-24 09:09:58.034: [ USRTHRD][1093232960] {0:0:2} Arp::sCreateSocket }
2010-09-24 09:09:58.035: [ USRTHRD][1093232960] {0:0:2} Starting Probe for ip 169.254.244.103
2010-09-24 09:09:58.035: [ USRTHRD][1093232960] {0:0:2} Transitioning to Probe State
2010-09-24 09:09:58.050: [ USRTHRD][1143847232] {0:0:2} Arp::sCreateSocket }
2010-09-24 09:09:58.050: [ USRTHRD][1143847232] {0:0:2} Starting Probe for ip 169.254.178.237
2010-09-24 09:09:58.050: [ USRTHRD][1143847232] {0:0:2} Transitioning to Probe State
2010-09-24 09:09:58.231: [ USRTHRD][1093232960] {0:0:2} Arp::sProbe {
2010-09-24 09:09:58.231: [ USRTHRD][1093232960] {0:0:2} Arp::sSend: sending type 1
2010-09-24 09:09:58.231: [ USRTHRD][1093232960] {0:0:2} Arp::sProbe }

2010-09-24 09:10:04.879: [ USRTHRD][1116531008] {0:0:2} Arp::sAnnounce {
2010-09-24 09:10:04.879: [ USRTHRD][1116531008] {0:0:2} Arp::sSend: sending type 1
2010-09-24 09:10:04.879: [ USRTHRD][1116531008] {0:0:2} Arp::sAnnounce }
2010-09-24 09:10:04.879: [ USRTHRD][1116531008] {0:0:2} Transitioning to Defend State
2010-09-24 09:10:04.879: [ USRTHRD][1116531008] {0:0:2} VipActions::startIp {
2010-09-24 09:10:04.879: [ USRTHRD][1116531008] {0:0:2} Adding 169.254.112.250 on eth7:2
2010-09-24 09:10:04.880: [ USRTHRD][1116531008] {0:0:2} VipActions::startIp }
2010-09-24 09:10:04.880: [ USRTHRD][1116531008] {0:0:2} Assigned IP: 169.254.112.250 on interface eth7

2010-09-24 09:10:05.150: [ USRTHRD][1143847232] {0:0:2} Arp::sAnnounce {
2010-09-24 09:10:05.150: [ USRTHRD][1143847232] {0:0:2} Arp::sSend: sending type 1
2010-09-24 09:10:05.150: [ USRTHRD][1143847232] {0:0:2} Arp::sAnnounce }
2010-09-24 09:10:05.150: [ USRTHRD][1143847232] {0:0:2} Transitioning to Defend State
2010-09-24 09:10:05.150: [ USRTHRD][1143847232] {0:0:2} VipActions::startIp {
2010-09-24 09:10:05.151: [ USRTHRD][1143847232] {0:0:2} Adding 169.254.178.237 on eth1:3
2010-09-24 09:10:05.151: [ USRTHRD][1143847232] {0:0:2} VipActions::startIp }
2010-09-24 09:10:05.151: [ USRTHRD][1143847232] {0:0:2} Assigned IP: 169.254.178.237 on interface eth1
2010-09-24 09:10:05.470: [ USRTHRD][1093232960] {0:0:2} Arp::sAnnounce {
2010-09-24 09:10:05.470: [ USRTHRD][1093232960] {0:0:2} Arp::sSend: sending type 1
2010-09-24 09:10:05.470: [ USRTHRD][1093232960] {0:0:2} Arp::sAnnounce }
2010-09-24 09:10:05.470: [ USRTHRD][1093232960] {0:0:2} Transitioning to Defend State
2010-09-24 09:10:05.470: [ USRTHRD][1093232960] {0:0:2} VipActions::startIp {
2010-09-24 09:10:05.471: [ USRTHRD][1093232960] {0:0:2} Adding 169.254.244.103 on eth7:3
2010-09-24 09:10:05.471: [ USRTHRD][1093232960] {0:0:2} VipActions::startIp }
2010-09-24 09:10:05.471: [ USRTHRD][1093232960] {0:0:2} Assigned IP: 169.254.244.103 on interface eth7
2010-09-24 09:10:06.047: [ USRTHRD][1082325312] {0:0:2} Thread:[NetHAWork]stop {
2010-09-24 09:10:06.282: [ USRTHRD][1129138496] {0:0:2} [NetHAWork] thread stopping
2010-09-24 09:10:06.282: [ USRTHRD][1129138496] {0:0:2} Thread:[NetHAWork]isRunning is reset to false here
2010-09-24 09:10:06.282: [ USRTHRD][1082325312] {0:0:2} Thread:[NetHAWork]stop }
2010-09-24 09:10:06.282: [ USRTHRD][1082325312] {0:0:2} VipActions::stopIp {
2010-09-24 09:10:06.282: [ USRTHRD][1082325312] {0:0:2} NetInterface::sStopIp {
2010-09-24 09:10:06.282: [ USRTHRD][1082325312] {0:0:2} Stopping ip '169.254.112.250', inf 'eth6', mask '10.11.0.128'
2010-09-24 09:10:06.288: [ USRTHRD][1082325312] {0:0:2} NetInterface::sStopIp }
2010-09-24 09:10:06.288: [ USRTHRD][1082325312] {0:0:2} VipActions::stopIp }
2010-09-24 09:10:06.288: [ USRTHRD][1082325312] {0:0:2} Thread:[NetHAWork]stop {
2010-09-24 09:10:06.298: [ USRTHRD][1131239744] {0:0:2} [NetHAWork] thread stopping
2010-09-24 09:10:06.298: [ USRTHRD][1131239744] {0:0:2} Thread:[NetHAWork]isRunning is reset to false here
2010-09-24 09:10:06.298: [ USRTHRD][1082325312] {0:0:2} Thread:[NetHAWork]stop }
2010-09-24 09:10:06.298: [ USRTHRD][1082325312] {0:0:2} VipActions::stopIp {

2010-09-24 09:10:06.298: [ USRTHRD][1082325312] {0:0:2} NetInterface::sStopIp {
2010-09-24 09:10:06.298: [ USRTHRD][1082325312] {0:0:2} Stopping ip '169.254.178.237', inf 'eth7', mask '10.12.0.128'
2010-09-24 09:10:06.299: [ USRTHRD][1082325312] {0:0:2} NetInterface::sStopIp }
2010-09-24 09:10:06.299: [ USRTHRD][1082325312] {0:0:2} VipActions::stopIp }
2010-09-24 09:10:06.299: [ USRTHRD][1082325312] {0:0:2} Thread:[NetHAWork]stop {
2010-09-24 09:10:06.802: [ USRTHRD][1133340992] {0:0:2} [NetHAWork] thread stopping
2010-09-24 09:10:06.802: [ USRTHRD][1133340992] {0:0:2} Thread:[NetHAWork]isRunning is reset to false here
2010-09-24 09:10:06.802: [ USRTHRD][1082325312] {0:0:2} Thread:[NetHAWork]stop }
2010-09-24 09:10:06.802: [ USRTHRD][1082325312] {0:0:2} VipActions::stopIp {
2010-09-24 09:10:06.802: [ USRTHRD][1082325312] {0:0:2} NetInterface::sStopIp {
2010-09-24 09:10:06.802: [ USRTHRD][1082325312] {0:0:2} Stopping ip '169.254.244.103', inf 'eth1', mask '10.1.0.128'
2010-09-24 09:10:06.802: [ USRTHRD][1082325312] {0:0:2} NetInterface::sStopIp }
2010-09-24 09:10:06.802: [ USRTHRD][1082325312] {0:0:2} VipActions::stopIp }
2010-09-24 09:10:06.803: [ USRTHRD][1082325312] {0:0:2} USING HAIP[ 0 ]: eth7 - 169.254.112.250
2010-09-24 09:10:06.803: [ USRTHRD][1082325312] {0:0:2} USING HAIP[ 1 ]: eth1 - 169.254.178.237
2010-09-24 09:10:06.803: [ USRTHRD][1082325312] {0:0:2} USING HAIP[ 2 ]: eth7 - 169.254.244.103
2010-09-24 09:10:06.803: [ USRTHRD][1082325312] {0:0:2} USING HAIP[ 3 ]: eth1 - 169.254.30.98

Note: as shown above, even though only NIC eth6 failed, multiple virtual private IPs may move among the surviving NICs.
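To see at a glance which addresses moved, the "HAIP: Moving ip" lines can be pulled out of orarootagent_root.log with a regular expression. The sketch below is a minimal example; the function name haip_moves is hypothetical, the pattern is written against the log lines shown above, and other Grid versions may format these messages differently.

```python
import re

# Matches lines such as:
#   HAIP: Moving ip '169.254.112.250' from inf 'eth6' to inf 'eth7'
MOVE_RE = re.compile(
    r"HAIP: Moving ip '([\d.]+)' from inf '([\w:]+)' to inf '([\w:]+)'"
)

SAMPLE = """\
2010-09-24 09:09:58.015: [ USRTHRD][1082325312] {0:0:2} HAIP: Moving ip '169.254.112.250' from inf 'eth6' to inf 'eth7'
2010-09-24 09:09:58.015: [ USRTHRD][1082325312] {0:0:2} pausing thread
2010-09-24 09:09:58.016: [ USRTHRD][1082325312] {0:0:2} HAIP: Moving ip '169.254.244.103' from inf 'eth1' to inf 'eth7'
2010-09-24 09:09:58.016: [ USRTHRD][1082325312] {0:0:2} HAIP: Moving ip '169.254.178.237' from inf 'eth7' to inf 'eth1'
"""


def haip_moves(log_text: str):
    """Return a list of (ip, from_interface, to_interface) tuples."""
    return MOVE_RE.findall(log_text)


if __name__ == "__main__":
    for ip, src, dst in haip_moves(SAMPLE):
        print(f"{ip}: {src} -> {dst}")
```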

  • ocssd.log

2010-09-24 09:09:58.314: [ GIPCNET][1089964352] gipcmodNetworkProcessSend: [network] failed send attempt endp 0xe1b9150 [0000000000000399] { gipcEndpoint : localAddr 'udp://10.11.0.188:60169', remoteAddr '', numPend 5, numReady 1, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, flags 0x2, usrFlags 0x4000 }, req 0x2aaab00117f0 [00000000004b0cae] { gipcSendRequest : addr 'udp://10.11.0.189:41486', data 0x2aaab0050be8, len 80, olen 0, parentEndp 0xe1b9150, ret gipcretEndpointNotAvailable (40), objFlags 0x0, reqFlags 0x2 }
2010-09-24 09:09:58.314: [ GIPCNET][1089964352] gipcmodNetworkProcessSend: slos op : sgipcnValidateSocket
2010-09-24 09:09:58.314: [ GIPCNET][1089964352] gipcmodNetworkProcessSend: slos dep : Invalid argument (22)
2010-09-24 09:09:58.314: [ GIPCNET][1089964352] gipcmodNetworkProcessSend: slos loc : address not
2010-09-24 09:09:58.314: [ GIPCNET][1089964352] gipcmodNetworkProcessSend: slos info: addr '10.11.0.188:60169', len 80, buf 0x2aaab0050be8, cookie 0x2aaab00117f0
2010-09-24 09:09:58.314: [GIPCXCPT][1089964352] gipcInternalSendSync: failed sync request, ret gipcretEndpointNotAvailable (40)
2010-09-24 09:09:58.314: [GIPCXCPT][1089964352] gipcSendSyncF [gipchaLowerInternalSend : gipchaLower.c : 755]: EXCEPTION[ ret gipcretEndpointNotAvailable (40) ] failed to send on endp 0xe1b9150 [0000000000000399] { gipcEndpoint : localAddr 'udp://10.11.0.188:60169', remoteAddr '', numPend 5, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, flags 0x2, usrFlags 0x4000 }, addr 0xe4e6d10 [00000000000007ed] { gipcAddress : name 'udp://10.11.0.189:41486', objFlags 0x0, addrFlags 0x1 }, buf 0x2aaab0050be8, len 80, flags 0x0
2010-09-24 09:09:58.314: [GIPCHGEN][1089964352] gipchaInterfaceFail: marking interface failing 0xe2bd5f0 { host 'racnode2', haName 'CSS_a2b2', local 0x2aaaac2098e0, ip '10.11.0.189:41486', subnet '10.11.0.128', mask '255.255.255.128', numRef 0, numFail 0, flags 0x6 }
2010-09-24 09:09:58.314: [GIPCHALO][1089964352] gipchaLowerInternalSend: failed to initiate send on interface 0xe2bd5f0 { host 'racnode2', haName 'CSS_a2b2', local 0x2aaaac2098e0, ip '10.11.0.189:41486', subnet '10.11.0.128', mask '255.255.255.128', numRef 0, numFail 0, flags 0x86 }, hctx 0xde81d10 [0000000000000010] { gipchaContext : host 'racnode1', name 'CSS_a2b2', luid '4f06f2aa-00000000', numNode 1, numInf 3, usrFlags 0x0, flags 0x7 }
2010-09-24 09:09:58.326: [GIPCHGEN][1089964352] gipchaInterfaceDisable: disabling interface 0x2aaaac2098e0 { host '', haName 'CSS_a2b2', local (nil), ip '10.11.0.188', subnet '10.11.0.128', mask '255.255.255.128', numRef 0, numFail 1, flags 0x14d }
2010-09-24 09:09:58.326: [GIPCHGEN][1089964352] gipchaInterfaceDisable: disabling interface 0xe2bd5f0 { host 'racnode2', haName 'CSS_a2b2', local 0x2aaaac2098e0, ip '10.11.0.189:41486', subnet '10.11.0.128', mask '255.255.255.128', numRef 0, numFail 0, flags 0x86 }
2010-09-24 09:09:58.327: [GIPCHALO][1089964352] gipchaLowerCleanInterfaces: performing cleanup of disabled interface 0xe2bd5f0 { host 'racnode2', haName 'CSS_a2b2', local 0x2aaaac2098e0, ip '10.11.0.189:41486', subnet '10.11.0.128', mask '255.255.255.128', numRef 0, numFail 0, flags 0xa6 }
2010-09-24 09:09:58.327: [GIPCHGEN][1089964352] gipchaInterfaceReset: resetting interface 0xe2bd5f0 { host 'racnode2', haName 'CSS_a2b2', local 0x2aaaac2098e0, ip '10.11.0.189:41486', subnet '10.11.0.128', mask '255.255.255.128', numRef 0, numFail 0, flags 0xa6 }
2010-09-24 09:09:58.338: [GIPCHDEM][1089964352] gipchaWorkerCleanInterface: performing cleanup of disabled interface 0x2aaaac2098e0 { host '', haName 'CSS_a2b2', local (nil), ip '10.11.0.188', subnet '10.11.0.128', mask '255.255.255.128', numRef 0, numFail 0, flags 0x16d }
2010-09-24 09:09:58.338: [GIPCHTHR][1089964352] gipchaWorkerUpdateInterface: created remote interface for node 'racnode2', haName 'CSS_a2b2', inf 'udp://10.11.0.189:41486'
2010-09-24 09:09:58.338: [GIPCHGEN][1089964352] gipchaWorkerAttachInterface: Interface attached inf 0xe2bd5f0 { host 'racnode2', haName 'CSS_a2b2', local 0x2aaaac2014f0, ip '10.11.0.189:41486', subnet '10.11.0.128', mask '255.255.255.128', numRef 0, numFail 0, flags 0x6 }
2010-09-24 09:10:00.454: [ CSSD][1108904256]clssnmSendingThread: sending status msg to all nodes

Note: as shown above, ocssd.bin won't fail as long as at least one private network adapter is working.


L2. Log Sample When Private Network Adapter Restores

In a multiple private network adapter environment, if one of the failed adapters is restored:

  • ohasd.log

2010-09-24 09:14:30.962: [GIPCHGEN][1083025728]gipchaNodeAddInterface: adding interface information for inf 0x2aaaac1a53d0 { host '', haName 'CLSFRAME_a2b2', local (nil), ip '10.11.0.188', subnet '10.11.0.128', mask '255.255.255.128', numRef 0, numFail 0, flags 0x41 }
2010-09-24 09:14:30.972: [GIPCHTHR][1138145600]gipchaWorkerUpdateInterface: created local bootstrap interface for node 'eyrac1f', haName 'CLSFRAME_a2b2', inf 'mcast://230.0.1.0:42424/10.11.0.188'
2010-09-24 09:14:30.972: [GIPCHTHR][1138145600]gipchaWorkerUpdateInterface: created local interface for node 'eyrac1f', haName 'CLSFRAME_a2b2', inf '10.11.0.188:13235'

  • ocssd.log

2010-09-24 09:14:30.961: [GIPCHGEN][1091541312] gipchaNodeAddInterface: adding interface information for inf 0x2aaab005af00 { host '', haName 'CSS_a2b2', local (nil), ip '10.11.0.188', subnet '10.11.0.128', mask '255.255.255.128', numRef 0, numFail 0, flags 0x41 }
2010-09-24 09:14:30.972: [GIPCHTHR][1089964352] gipchaWorkerUpdateInterface: created local bootstrap interface for node 'racnode1', haName 'CSS_a2b2', inf 'mcast://230.0.1.0:42424/10.11.0.188'
2010-09-24 09:14:30.972: [GIPCHTHR][1089964352] gipchaWorkerUpdateInterface: created local interface for node 'racnode1', haName 'CSS_a2b2', inf '10.11.0.188:10884'
2010-09-24 09:14:30.972: [GIPCHGEN][1089964352] gipchaNodeAddInterface: adding interface information for inf 0x2aaab0035490 { host 'racnode2', haName 'CSS_a2b2', local (nil), ip '10.21.0.208', subnet '10.12.0.128', mask '255.255.255.128', numRef 0, numFail 0, flags 0x42 }
2010-09-24 09:14:30.972: [GIPCHGEN][1089964352] gipchaNodeAddInterface: adding interface information for inf 0x2aaab00355c0 { host 'racnode2', haName 'CSS_a2b2', local (nil), ip '10.11.0.188', subnet '10.11.0.128', mask '255.255.255.128', numRef 0, numFail 0, flags 0x42 }
2010-09-24 09:14:30.972: [GIPCHTHR][1089964352] gipchaWorkerUpdateInterface: created remote interface for node 'racnode2', haName 'CSS_a2b2', inf 'mcast://230.0.1.0:42424/10.12.0.208'
2010-09-24 09:14:30.972: [GIPCHGEN][1089964352] gipchaWorkerAttachInterface: Interface attached inf 0x2aaab0035490 { host 'racnode2', haName 'CSS_a2b2', local 0x2aaab005af00, ip '10.12.0.208', subnet '10.12.0.128', mask '255.255.255.128', numRef 0, numFail 0, flags 0x46 }
2010-09-24 09:14:30.972: [GIPCHTHR][1089964352] gipchaWorkerUpdateInterface: created remote interface for node 'racnode2', haName 'CSS_a2b2', inf 'mcast://230.0.1.0:42424/10.11.0.188'
2010-09-24 09:14:30.972: [GIPCHGEN][1089964352] gipchaWorkerAttachInterface: Interface attached inf 0x2aaab00355c0 { host 'racnode2', haName 'CSS_a2b2', local 0x2aaab005af00, ip '10.11.0.188', subnet '10.11.0.128', mask '255.255.255.128', numRef 0, numFail 0, flags 0x46 }
2010-09-24 09:14:31.437: [GIPCHGEN][1089964352] gipchaInterfaceDisable: disabling interface 0x2aaab00355c0 { host 'racnode2', haName 'CSS_a2b2', local 0x2aaab005af00, ip '10.11.0.188', subnet '10.11.0.128', mask '255.255.255.128', numRef 0, numFail 0, flags 0x46 }
2010-09-24 09:14:31.437: [GIPCHALO][1089964352] gipchaLowerCleanInterfaces: performing cleanup of disabled interface 0x2aaab00355c0 { host 'racnode2', haName 'CSS_a2b2', local 0x2aaab005af00, ip '10.11.0.188', subnet '10.11.0.128', mask '255.255.255.128', numRef 0, numFail 0, flags 0x66 }
2010-09-24 09:14:31.446: [GIPCHGEN][1089964352] gipchaInterfaceDisable: disabling interface 0x2aaab0035490 { host 'racnode2', haName 'CSS_a2b2', local 0x2aaab005af00, ip '10.12.0.208', subnet '10.12.0.128', mask '255.255.255.128', numRef 0, numFail 0, flags 0x46 }
2010-09-24 09:14:31.446: [GIPCHALO][1089964352] gipchaLowerCleanInterfaces: performing cleanup of disabled interface 0x2aaab0035490 { host 'racnode2', haName 'CSS_a2b2', local 0x2aaab005af00, ip '10.12.0.208', subnet '10.12.0.128', mask '255.255.255.128', numRef 0, numFail 0, flags 0x66 }
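
Entries like the gipcd.log lines above are long, so when scanning a large log it can help to extract only the interface events. The following is a minimal Python sketch, not an Oracle tool. The regular expression is derived solely from the sample lines shown here, and the function name "interface_events" is hypothetical.

```python
import re

# Pattern derived only from the sample gipcd.log lines above (an assumption,
# not a documented log format).
EVENT_RE = re.compile(
    r"\] (?P<func>gipcha\w+): .*?"
    r"host '(?P<host>[^']+)'.*?"
    r"ip '(?P<ip>[^']+)', subnet '(?P<subnet>[^']+)', mask '(?P<mask>[^']+)'"
    r".*?flags (?P<flags>0x[0-9a-fA-F]+)"
)


def interface_events(log_text, func=None):
    """Return one dict per interface event, optionally filtered by function name."""
    events = []
    for line in log_text.splitlines():
        match = EVENT_RE.search(line)
        if match and (func is None or match.group("func") == func):
            events.append(match.groupdict())
    return events


# One of the sample lines from above, split across string literals for readability.
SAMPLE = (
    "2010-09-24 09:14:31.437: [GIPCHGEN][1089964352] gipchaInterfaceDisable: "
    "disabling interface 0x2aaab00355c0 { host 'racnode2', haName 'CSS_a2b2', "
    "local 0x2aaab005af00, ip '10.11.0.188', subnet '10.11.0.128', "
    "mask '255.255.255.128', numRef 0, numFail 0, flags 0x46 }\n"
)

for event in interface_events(SAMPLE, func="gipchaInterfaceDisable"):
    print(event["host"], event["ip"], event["subnet"], event["flags"])
```

Filtering on "gipchaInterfaceDisable" shows which remote interfaces were disabled. Leaving "func" unset returns every line that carries an ip/subnet/mask triple.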


Miscellaneous

Disabling or stopping HAIP while the cluster is up and running is NOT supported. However:

1. The feature is disabled in 11.2.0.2/11.2.0.3 if Sun Cluster exists

2. The feature does not exist in Windows 11.2.0.2/11.2.0.3

3. The feature is disabled in 11.2.0.2/11.2.0.3 if Fujitsu PRIMECLUSTER exists


4. With the fix for bug 11077756 (included in 11.2.0.2 GI PSU6 and 11.2.0.3), HAIP is disabled if it fails to start while the root script (root.sh or rootupgrade.sh) is running. For more details, refer to the section on bug 11077756.

Known Issues

Bug 12425730

Issue: HAIP fails to start, gipcd.log shows rank 0 or "-1" for private network

Fixed in: 11.2.0.3 and onward; refer to note 1374360.1 for details.

Bug 12674817

Bug 10370797, bug 13050540, bug 13347646

Issue: HAIP fails to start if the root script (root.sh or rootupgrade.sh) is executed via sudo

Symptom:

  • Output of root script:

CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'racnode1'
CRS-5017: The resource action "ora.cluster_interconnect.haip start" encountered the following error:
Start action for HAIP aborted
CRS-2674: Start of 'ora.cluster_interconnect.haip' on 'racnode1' failed

  • $GRID_HOME/log//agent/ohasd/orarootagent_root/orarootagent_root.log

2010-12-04 17:19:54.893: [ USRTHRD][2084] {0:3:37} failed to create arp
2010-12-04 17:19:54.893: [ USRTHRD][2084] {0:3:37} (null) category: -2, operation: ioctl, loc: bpfopen:2,os, OS error: 14, other:

OR

2011-09-29 16:44:46.770: [ USRTHRD][3600] {0:3:14} failed to create arp
2011-09-29 16:44:46.771: [ USRTHRD][3600] {0:3:14} (null) category: -2, operation: open, loc: bpfopen:1,os, OS error: 2, other:

OR

2011-09-29 16:44:46.770: [ USRTHRD][3600] {0:3:14} failed to create arp
2011-09-29 16:44:46.771: [ USRTHRD][3600] {0:3:14} (null) category: -2, operation: open, loc: bpfopen:1,os, OS error: 22, other:

OR

2011-11-03 11:03:01.217: [ USRTHRD][25] {0:0:166} (null) category: -2, operation: open, loc: devopen:1,os, OS error: 2, other:
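
The variants above differ only in the operation, location and OS error number, so it can be convenient to pull those three fields out of orarootagent_root.log. This is a rough Python sketch. The pattern is inferred from the sample lines in this note, and the function name "arp_failure_details" is hypothetical.

```python
import re

# Pattern inferred from the sample log lines above; treat it as an assumption
# rather than a documented format.
DETAIL_RE = re.compile(
    r"category: (?P<category>-?\d+), "
    r"operation: (?P<operation>[^,]+), "
    r"loc: (?P<loc>.+?), "
    r"OS error: (?P<os_error>\d+)"
)


def arp_failure_details(log_text):
    """Return (operation, loc, os_error) tuples for every matching detail line."""
    details = []
    for line in log_text.splitlines():
        match = DETAIL_RE.search(line)
        if match:
            details.append(
                (match.group("operation"), match.group("loc"), int(match.group("os_error")))
            )
    return details


SAMPLE = """\
2010-12-04 17:19:54.893: [ USRTHRD][2084] {0:3:37} failed to create arp
2010-12-04 17:19:54.893: [ USRTHRD][2084] {0:3:37} (null) category: -2, operation: ioctl, loc: bpfopen:2,os, OS error: 14, other:
2011-11-03 11:03:01.217: [ USRTHRD][25] {0:0:166} (null) category: -2, operation: open, loc: devopen:1,os, OS error: 2, other:
"""

for operation, loc, os_error in arp_failure_details(SAMPLE):
    print(operation, loc, os_error)
```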


Solution/Workaround:

On AIX and Solaris, a command executed via sudo (or similar) may not have the full root environment, which can cause HAIP startup to fail.

The solution is to execute the root script (root.sh or rootupgrade.sh) directly as the real root user. If the root script has already failed, re-running it may fail with the same error; in that case the workaround is to reboot the node and then run the root script directly as root.

On AIX, an alternative workaround is to execute "/usr/sbin/tcpdump -D" as root and verify that the following device exists before re-running the root script:

ls -ltr /dev/bpf*
cr-------- 1 root system 42, 0 Oct 03 10:32 /dev/bpf0
..


Bug 10332426

Issue: HAIP fails to start while running rootupgrade.sh

Symptom:

  • Output of root script:

CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'racnode1'
CRS-5017: The resource action "ora.cluster_interconnect.haip start" encountered the following error:
Start action for HAIP aborted
CRS-2674: Start of 'ora.cluster_interconnect.haip' on 'racnode1' failed

  • $GRID_HOME/log//gipcd/gipcd.log

2010-12-12 09:41:35.201: [ CLSINET][1088543040] Returning NETDATA: 0 interfaces
2010-12-12 09:41:40.201: [ CLSINET][1088543040] Returning NETDATA: 0 interfaces

Solution:

The cause is a mismatch between the private network information in OCR and on the OS. The output of the following commands should be consistent with regard to network adapter name, subnet and netmask; see note 1296579.1 for what to check.

oifcfg iflist -p -n
oifcfg getif
ifconfig
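
As an illustration of the comparison, here is a small Python sketch that checks whether every cluster_interconnect entry from "oifcfg getif" also appears, with the same adapter name and subnet, in "oifcfg iflist -p -n". The column layout is assumed from the sample output quoted elsewhere in this note, the function names are hypothetical, and the netmask still has to be checked against ifconfig separately.

```python
def parse_getif(text):
    """Return {(adapter, subnet)} for cluster_interconnect rows of 'oifcfg getif'.

    Assumed columns: adapter, subnet, scope, type.
    """
    entries = set()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and "cluster_interconnect" in fields[3]:
            entries.add((fields[0], fields[1]))
    return entries


def parse_iflist(text):
    """Return {(adapter, subnet): netmask} from 'oifcfg iflist -p -n'.

    Assumed columns: adapter, subnet, type, netmask.
    """
    entries = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 4:
            entries[(fields[0], fields[1])] = fields[3]
    return entries


def missing_on_os(getif_text, iflist_text):
    """List private networks registered in OCR that the OS view does not show."""
    os_view = parse_iflist(iflist_text)
    return sorted(entry for entry in parse_getif(getif_text) if entry not in os_view)


GETIF = """\
en12 10.0.1.0 global public
en13 10.1.1.0 global cluster_interconnect
"""

IFLIST = """\
en12 10.0.1.0 PUBLIC 255.255.255.0
en13 10.1.1.0 PUBLIC 255.255.255.0
"""

print(missing_on_os(GETIF, IFLIST))
```

An empty list means the adapter name and subnet agree. Any tuple returned is a private network whose OCR definition has no matching adapter and subnet on the OS.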


Bug 10363902

Issue: GIPC HA is disabled or HAIP fails to start if the cluster interconnect is InfiniBand or any other network hardware whose hardware address (MAC) is longer than 6 bytes

Fixed in: 11.2.0.3 for Linux and Solaris

Symptom:

  • Output of root script:

CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'racnode1'
CRS-5017: The resource action "ora.cluster_interconnect.haip start" encountered the following error:
Start action for HAIP aborted
CRS-2674: Start of 'ora.cluster_interconnect.haip' on 'racnode1' failed

  • $GRID_HOME/log//gipcd/gipcd.log

2010-12-07 13:23:08.560: [ USRTHRD][3858] {0:0:62} Arp::sCreateSocket {
2010-12-07 13:23:08.560: [ USRTHRD][3858] {0:0:62} failed to create arp
2010-12-07 13:23:08.561: [ USRTHRD][3858] {0:0:62} (null) category: -2, operation: ssclsi_aix_get_phys_addr, loc: aixgetpa:4,n, OS error: 2, other:


2010-12-30 10:52:37.373: [ USRTHRD][15] {0:0:124} (null) category: -2, operation: ssclsi_dlpi_request, loc: dlpireq:8,na, OS error: 7, other:
2010-12-30 10:52:37.462: [ USRTHRD][15] {0:0:124} Arp::sCreateSocket {
2010-12-30 10:52:37.463: [ USRTHRD][15] {0:0:124} failed to create arp

# lanscan
Hardware Station        Crd  Hdw   Net-Interface    NM MAC  HP-DLPI DLPI
Path     Address        In#  State NamePPA          ID Type Support Mjr#
..
LinkAgg1 0x0000004CFE8* 901  UP    lan901 snap901   9  IB   Yes     119
..
IPOIB0   0x0000004CFE8* 9000 UP    lan9000 snap9000 5  IB   Yes     119


Bug 10357258

Issue: many HAIPs are created after the active NIC fails in IPMP

Fixed in: 11.2.0.3 and 11.2.0.2 GI PSU3. Interim patch 10357258 exists for 11.2.0.2, and patch 11865154 for 11.2.0.2.1. Affects Solaris only.

Symptom:

  • ifconfig output:

nxge3:2: flags=21000843 mtu 1500 index 5
inet 169.254.20.88 netmask ffff0000 broadcast 169.254.255.255
nxge3:3: flags=21000842 mtu 1500 index 5
inet 169.254.20.88 netmask ffff0000 broadcast 169.254.255.255
..

Note the same HAIP shows up multiple times
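
A quick way to spot this symptom in a long ifconfig listing is to group link-local addresses by the logical interfaces that carry them. The Python sketch below does that under the assumption that the output looks like the excerpt above ("name: flags=..." header lines followed by "inet ..." lines). The function names are hypothetical.

```python
import ipaddress
import re
from collections import defaultdict

# HAIP addresses are taken from the link-local range named in this note.
LINK_LOCAL = ipaddress.ip_network("169.254.0.0/16")


def haip_by_address(ifconfig_text):
    """Map each 169.254.x.x address to the interfaces it is plumbed on."""
    owners = defaultdict(list)
    current = None
    for line in ifconfig_text.splitlines():
        header = re.match(r"^(\S+): flags=", line)
        if header:
            current = header.group(1)
            continue
        inet = re.search(r"\binet (\d+\.\d+\.\d+\.\d+)", line)
        if inet and current and ipaddress.ip_address(inet.group(1)) in LINK_LOCAL:
            owners[inet.group(1)].append(current)
    return dict(owners)


def duplicated_haips(ifconfig_text):
    """Return only the HAIP addresses that appear on more than one interface."""
    return {ip: ifs for ip, ifs in haip_by_address(ifconfig_text).items() if len(ifs) > 1}


SAMPLE = """\
nxge3:2: flags=21000843 mtu 1500 index 5
inet 169.254.20.88 netmask ffff0000 broadcast 169.254.255.255
nxge3:3: flags=21000842 mtu 1500 index 5
inet 169.254.20.88 netmask ffff0000 broadcast 169.254.255.255
"""

print(duplicated_haips(SAMPLE))
```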


Bug 10397652

Issue: HAIP does not fail over even when the private network experiences a problem (e.g. a switch port is disabled) because the OS does not provide reliable link information

Fixed in: 11.2.0.3

The workaround on AIX is to set the "MONITOR" flag on all private network adapters:

# ifconfig en1 monitor
# ifconfig en1
en1: flags=5e080863,2c0
inet 192.168.10.83 netmask 0xfffffc00 broadcast 192.168.11.255
inet 169.254.74.136 netmask 0xffff8000 broadcast 169.254.127.255
tcp_sendspace 131072 tcp_recvspace 65536 rfc1323 0
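
The AIX ifconfig output above prints netmasks in hexadecimal, which makes it hard to see at a glance which subnet the HAIP address belongs to. As a small worked example, the masks can be decoded with Python's standard ipaddress module. This is only a reading aid, and the function name "aix_network" is hypothetical.

```python
import ipaddress


def aix_network(ip, hex_mask):
    """Combine an address and an AIX-style hex netmask (e.g. '0xffff8000')."""
    dotted_mask = str(ipaddress.ip_address(int(hex_mask, 16)))
    # strict=False lets us pass a host address instead of the network address.
    return ipaddress.ip_network(f"{ip}/{dotted_mask}", strict=False)


# Values copied from the en1 output above.
private = aix_network("192.168.10.83", "0xfffffc00")
haip = aix_network("169.254.74.136", "0xffff8000")

print(private, private.broadcast_address)
print(haip, haip.broadcast_address)
```

The computed broadcast addresses should agree with the ones ifconfig reports for the same lines.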


Bug 10253028

Issue: "oifcfg iflist -p -n" does not show HAIP on AIX

Fixed in: not a defect; this is expected behaviour on AIX

Symptom:

  • "oifcfg getif" output

en12 10.0.1.0 global public
en13 10.1.1.0 global cluster_interconnect

  • "ifconfig -a" output

en13: flags=5e080863,c0
inet 10.1.1.143 netmask 0xffffff00 broadcast 10.1.1.255
inet 169.254.228.154 netmask 0xffff0000 broadcast 169.254.255.255
tcp_sendspace 131072 tcp_recvspace 65536 rfc1323 0
..
Note HAIP exists

  • v$cluster_interconnects

SQL> select * from gv$cluster_interconnects;

   INST_ID NAME            IP_ADDRESS       IS_ SOURCE
---------- --------------- ---------------- ---
         1 en13            169.254.228.154  NO
         2 en13            169.254.55.162   NO

  • "oifcfg iflist -p -n" output

en12 10.0.1.0 PUBLIC 255.255.255.0
en13 10.1.1.0 PUBLIC 255.255.255.0

Note: HAIP would usually be expected in this listing as well; however, it is not listed on AIX.


Bug 9795321

Issue: Wrong MTU size for HAIP on Solaris; refer to note 1290585.1 for more details.

Bug 11077756

Issue: HAIP startup failure causes the root script to fail. The fix for this bug allows the root script to continue so that the HAIP issue can be worked on later.

Fixed in: 11.2.0.2 GI PSU6, 11.2.0.3 and above

Note: the consequence is that HAIP will be disabled. Once the cause has been identified and the solution implemented, HAIP needs to be enabled during an outage window. To enable it, run the following as root on ALL nodes:

# $GRID_HOME/bin/crsctl modify res ora.cluster_interconnect.haip -attr "ENABLED=1" -init
# $GRID_HOME/bin/crsctl stop crs
# $GRID_HOME/bin/crsctl start crs

Bug 12546712

Issue: ASM crashes because HAIP does not fail over when two or more private networks fail; refer to note 1323995.1 for more details.

Note 1366211.1

Issue: HAIP fails to start if a default gateway is configured on the network switch for the private network VLAN

orarootagent_root.log shows: PROBE: conflict detected src { 169.254.12.247, }, target { 0.0.0.0, }

The solution is to remove the default gateway setting for the private network (VLAN) on the network switch; refer to note 1366211.1 for more details.
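
To check whether a given orarootagent_root.log contains this symptom, the PROBE line can be matched with a short regular expression. The Python sketch below is based only on the single sample line quoted in this section, so the exact pattern is an assumption, and the function name "probe_conflicts" is hypothetical.

```python
import re

# Pattern based on the one sample line in this note (an assumption).
PROBE_RE = re.compile(
    r"PROBE: conflict detected src \{ (?P<src>[\d.]+),.*?\}, "
    r"target \{ (?P<target>[\d.]+),"
)


def probe_conflicts(log_text):
    """Return (src, target) address pairs for every PROBE conflict line."""
    return [(m.group("src"), m.group("target")) for m in PROBE_RE.finditer(log_text)]


SAMPLE = "PROBE: conflict detected src { 169.254.12.247, }, target { 0.0.0.0, }\n"

print(probe_conflicts(SAMPLE))
```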

Bug 10114953

Issue: Only one HAIP is created on HP-UX

The bug is fixed in 11.2.0.4; until 11.2.0.4 is released, patch 10114953 is required.

The OS kernel parameter dlpi_max_ub_promisc must be set to a value greater than 1 for the patch to be effective.


