While configuring heartbeat I ran into a problem: the second node of the failover cluster considers the first one dead and takes over the IP address.
My ha.cf looks like this:
logfacility local0
keepalive 5
deadtime 60
warntime 10
udpport 694
bcast eth0 eth1
auto_failback on
node cl1.cl
node cl2.cl
ping 172.16.0.1
authkeys:
auth 2
#1 crc
2 sha1 HI!
#3 md5 Hello!
haresources:
cl1.cl IPaddr::172.16.0.240/24/eth0
[root@cl1 /]# uname -n
cl1.cl
[root@cl2 ~]# uname -n
cl2.cl
Here is the log from the first node (the logs on both nodes are essentially the same):
Apr 24 13:28:13 cl1 heartbeat: [3044]: info: Pacemaker support: false
Apr 24 13:28:13 cl1 heartbeat: [3044]: WARN: Logging daemon is disabled --enabling logging daemon is recommended
Apr 24 13:28:13 cl1 heartbeat: [3044]: info: **************************
Apr 24 13:28:13 cl1 heartbeat: [3044]: info: Configuration validated. Starting heartbeat 3.0.4
Apr 24 13:28:13 cl1 heartbeat: [3045]: info: heartbeat: version 3.0.4
Apr 24 13:28:13 cl1 heartbeat: [3045]: info: Heartbeat generation: 1366789115
Apr 24 13:28:13 cl1 heartbeat: [3045]: info: glib: UDP Broadcast heartbeat started on port 694 (694) interface eth0
Apr 24 13:28:13 cl1 heartbeat: [3045]: info: glib: UDP Broadcast heartbeat closed on port 694 interface eth0 - Status: 1
Apr 24 13:28:13 cl1 heartbeat: [3045]: info: glib: UDP Broadcast heartbeat started on port 694 (694) interface eth1
Apr 24 13:28:13 cl1 heartbeat: [3045]: info: glib: UDP Broadcast heartbeat closed on port 694 interface eth1 - Status: 1
Apr 24 13:28:13 cl1 heartbeat: [3045]: info: glib: ping heartbeat started.
Apr 24 13:28:13 cl1 heartbeat: [3045]: info: G_main_add_TriggerHandler: Added signal manual handler
Apr 24 13:28:13 cl1 heartbeat: [3045]: info: G_main_add_TriggerHandler: Added signal manual handler
Apr 24 13:28:13 cl1 heartbeat: [3045]: info: G_main_add_SignalHandler: Added signal handler for signal 17
Apr 24 13:28:13 cl1 heartbeat: [3045]: info: Local status now set to: 'up'
Apr 24 13:28:13 cl1 heartbeat: [3045]: info: Link 172.16.0.1:172.16.0.1 up.
Apr 24 13:28:13 cl1 heartbeat: [3045]: info: Status update for node 172.16.0.1: status ping
Apr 24 13:28:13 cl1 heartbeat: [3045]: info: Link cl1.cl:eth0 up.
Apr 24 13:28:13 cl1 heartbeat: [3045]: info: Link cl1.cl:eth1 up.
Apr 24 13:29:14 cl1 heartbeat: [3045]: WARN: node cl2.cl: is dead
Apr 24 13:29:14 cl1 heartbeat: [3045]: info: Comm_now_up(): updating status to active
Apr 24 13:29:14 cl1 heartbeat: [3045]: info: Local status now set to: 'active'
Apr 24 13:29:14 cl1 heartbeat: [3045]: WARN: No STONITH device configured.
Apr 24 13:29:14 cl1 heartbeat: [3045]: WARN: Shared disks are not protected.
Apr 24 13:29:14 cl1 heartbeat: [3045]: info: Resources being acquired from cl2.cl.
Apr 24 13:29:14 cl1 harc(default)[3057]: info: Running /etc/ha.d//rc.d/status status
Apr 24 13:29:14 cl1 mach_down(default)[3091]: info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired
Apr 24 13:29:14 cl1 mach_down(default)[3091]: info: mach_down takeover complete for node cl2.cl.
Apr 24 13:29:14 cl1 heartbeat: [3045]: info: mach_down takeover complete.
Apr 24 13:29:14 cl1 heartbeat: [3045]: info: Initial resource acquisition complete (mach_down)
Apr 24 13:29:14 cl1 /usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_172.16.0.240)[3131]: INFO: Resource is stopped
Apr 24 13:29:14 cl1 heartbeat: [3058]: info: Local Resource acquisition completed.
Apr 24 13:29:14 cl1 harc(default)[3189]: info: Running /etc/ha.d//rc.d/ip-request-resp ip-request-resp
Apr 24 13:29:14 cl1 ip-request-resp(default)[3189]: received ip-request-resp IPaddr::172.16.0.240/24/eth0 OK yes
Apr 24 13:29:14 cl1 ResourceManager(default)[3212]: info: Acquiring resource group: cl1.cl IPaddr::172.16.0.240/24/eth0
Apr 24 13:29:14 cl1 /usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_172.16.0.240)[3240]: INFO: Resource is stopped
Apr 24 13:29:14 cl1 ResourceManager(default)[3212]: info: Running /etc/ha.d/resource.d/IPaddr 172.16.0.240/24/eth0 start
Apr 24 13:29:14 cl1 IPaddr(IPaddr_172.16.0.240)[3325]: INFO: Using calculated netmask for 172.16.0.240: 255.255.255.0
Apr 24 13:29:14 cl1 IPaddr(IPaddr_172.16.0.240)[3325]: INFO: eval ifconfig eth0:0 172.16.0.240 netmask 255.255.255.0 broadcast 172.16.0.255
Apr 24 13:29:14 cl1 /usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_172.16.0.240)[3299]: INFO: Success
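One thing that can be read straight off the log is the gap between startup and the "is dead" warning. A minimal Python sketch of that arithmetic follows; the helper name `seconds_between` is made up for illustration, and it assumes both log lines fall on the same day and start with the usual 15-character syslog timestamp.

```python
from datetime import datetime

# Only the HH:MM:SS part of the syslog timestamp is parsed,
# so both lines are assumed to be from the same day.
TIME_FMT = "%H:%M:%S"


def seconds_between(start_line: str, end_line: str) -> float:
    """Return the seconds elapsed between two syslog lines (hypothetical helper)."""
    # In "Apr 24 13:28:13 ..." the time occupies characters 7..14.
    start = datetime.strptime(start_line[7:15], TIME_FMT)
    end = datetime.strptime(end_line[7:15], TIME_FMT)
    return (end - start).total_seconds()


# Two lines copied from the log above.
started = "Apr 24 13:28:13 cl1 heartbeat: [3045]: info: Local status now set to: 'up'"
declared_dead = "Apr 24 13:29:14 cl1 heartbeat: [3045]: WARN: node cl2.cl: is dead"

DEADTIME = 60  # seconds, the deadtime value from ha.cf

gap = seconds_between(started, declared_dead)
print(gap, gap >= DEADTIME)
```

The gap is 61 seconds, just over the configured deadtime of 60. The log also shows no link-up message for cl2.cl in that window, which would be consistent with this node not having received any heartbeat from cl2.cl during that time, though the log alone does not say why.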
Why might this be happening?