IRC log for #fuel, 2017-02-16

All times shown according to UTC.

Time Nick Message
00:01 masterjcool joined #fuel
00:31 byrdog55 joined #fuel
00:32 byrdog55 Is this a valid forum to post questions about current fuel openstack issues or should I go somewhere else?
00:34 byrdog55 I mean is this an official “fuel team only” channel?
00:41 Julien-zte joined #fuel
01:15 fandi joined #fuel
01:40 DavidRama joined #fuel
02:20 Julien-z_ joined #fuel
02:39 Openstuck joined #fuel
02:39 Openstuck hello, can anyone point me to documentation on the attributes of the yaml file when decoupling a role?
02:40 Openstuck the documentation on how to do it is fine, though I found nothing on what the dependencies between the different roles are
03:06 zimboboyd joined #fuel
03:13 Openstuck joined #fuel
03:15 Openstuck joined #fuel
03:20 raunak joined #fuel
04:19 fatdragon joined #fuel
04:26 ipsecguy joined #fuel
04:35 mdnadeem joined #fuel
04:46 Julien-zte joined #fuel
04:49 Julien-zte joined #fuel
05:07 masterjcool joined #fuel
05:12 johnavp19891 joined #fuel
06:20 fatdragon joined #fuel
06:27 Guest60889 joined #fuel
06:45 Guest60889 left #fuel
07:01 dcs joined #fuel
07:24 korzen joined #fuel
07:56 Julien-zte joined #fuel
08:18 innis joined #fuel
08:22 astupnikov joined #fuel
08:23 astupnikov Hi there, fuel team. I would like to ask core members to review a fuel-library's backport that unlocks the swarm test: https://review.openstack.org/#/c/429662/
09:59 Julien-zte joined #fuel
10:12 Julien-zte joined #fuel
10:17 Julien-z_ joined #fuel
12:02 Egyptian joined #fuel
12:04 fandi joined #fuel
12:09 fandi joined #fuel
12:14 Julien-zte joined #fuel
12:31 GOro joined #fuel
13:09 fandi joined #fuel
13:32 innis exit
13:32 innis quit
14:09 goldenfri joined #fuel
14:21 Seafire joined #fuel
14:26 Seafire Hello everyone, I'm experiencing some problems with RabbitMQ on one of my three controllers. From time to time RabbitMQ just dies; this is what I see in the logs: http://paste.openstack.org/show/599241/ When I do a pcs status I see that all nodes are online, but most of the services on the controller where RabbitMQ has failed are down. When I manually restart RabbitMQ I get this message: http://paste.openstack.org/show/599242/
14:26 Seafire I also tried to restart pacemaker but it doesn't restart RabbitMQ. Does anyone have any insights on this problem?
14:58 aglarendil Seafire: do you have any other symptoms of node misbehaviour? Pacemaker may stop the service due to high CPU load, for example, or for other reasons, and then restart it when the node gets back into a normal state. Some situations may lead to those services flapping
14:59 jose-phillips joined #fuel
15:04 Seafire aglarendil: No, not as far as I can tell... I'm using pretty powerful hardware: 64GB of RAM and 24-core CPUs
15:04 Seafire aglarendil: plus it's the same hardware for all three controllers
15:05 aglarendil you will need to examine logs of pacemaker on the master node
15:05 aglarendil in /var/log/remote
15:05 aglarendil there are lrmd.log/crmd.log/ocf-rabbitmq*log
15:05 aglarendil you should find info on whether it was pacemaker who shot the rabbitmq
15:05 aglarendil or whether it died on its own
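The log search suggested above can be sketched in Python. This is a minimal sketch, not a Fuel tool: the function name `find_mentions`, the default search string, and the idea that the files may sit in per-node subdirectories under /var/log/remote are all assumptions; only the directory and the log file names come from the conversation.

```python
from pathlib import Path


def find_mentions(log_dir,
                  patterns=("lrmd.log", "crmd.log", "ocf-rabbitmq*log"),
                  needle="rabbit"):
    """Return (path, line_number, line) for each log line containing needle.

    log_dir is searched recursively, because the exact layout below it
    (for example one subdirectory per node) is an assumption.
    """
    hits = []
    root = Path(log_dir)
    for pattern in patterns:
        # rglob walks every subdirectory and matches on the file name.
        for path in sorted(root.rglob(pattern)):
            if not path.is_file():
                continue
            with path.open(errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    # Case-insensitive match, since resource names vary.
                    if needle.lower() in line.lower():
                        hits.append((str(path), number, line.rstrip("\n")))
    return hits
```

Calling `find_mentions("/var/log/remote")` on the master node would list the matching lines with their file and line number, which makes it easier to read the lines just before a stop event.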
15:13 Seafire aglarendil: I found this in the logs: http://paste.openstack.org/show/599254/
15:14 aglarendil so it was killed by pacemaker
15:14 aglarendil you need to check previous lines from this log
15:14 aglarendil and see crmd and pengine log as well
15:14 aglarendil to determine why pacemaker decided to stop it
15:15 Seafire aglarendil: ok thx
15:25 Seafire aglarendil: can't find any ofc-rabbitmq under /var/log/remote/
15:25 aglarendil just list ocf*
15:27 Seafire there are lots of ocf but no rabbit: http://paste.openstack.org/show/599258/
15:28 aglarendil then check lrmd.log
15:28 johnavp1989 joined #fuel
15:34 Seafire aglarendil: I don't see anything strange apart from what I linked you here http://paste.openstack.org/show/599254/, before this everything looks the same
15:35 aglarendil there clearly should be a reason why pacemaker stopped it. check pengine log and crmd.log for the rabbitmq string
15:37 Seafire this is what I see in crmd.log at the time of the crash: http://paste.openstack.org/show/599260/
15:38 aglarendil so it decided to stop the node services completely. there could be 2 reasons: 1. some of the system monitors configured for pacemaker, e.g. free disk size triggered an alarm 2. there was a network partitioning
15:39 aglarendil check corosync.log whether there are events of the node leaving the cluster
15:39 Seafire pengine log just refers to pacemakers
15:40 Seafire corosync.log is filled with : notice:    [TOTEM ] orf_token_rtr Retransmit List: 4910
15:40 Seafire but no errors
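A rough way to tell steady retransmit noise from actual membership changes in corosync.log is to count both. The sketch below assumes the log has been read into a list of lines; the function name `summarize_corosync` is hypothetical, and the two membership marker strings are assumptions about typical corosync TOTEM messages that should be checked against the real log. Only the "Retransmit List:" text is taken from the conversation.

```python
import re
from collections import Counter

# Matches the retransmit notice quoted in the conversation.
RETRANSMIT = re.compile(r"Retransmit List:")


def summarize_corosync(lines,
                       membership_markers=("A processor failed",
                                           "A new membership")):
    """Count retransmit notices and lines hinting at membership changes.

    membership_markers holds assumed message fragments; adjust them to
    whatever the installed corosync version actually logs.
    """
    counts = Counter()
    for line in lines:
        if RETRANSMIT.search(line):
            counts["retransmit"] += 1
        for marker in membership_markers:
            if marker in line:
                counts[marker] += 1
    return counts
```

Many retransmit notices with zero membership markers would point toward packet loss or load rather than a node actually leaving the cluster, though that reading is only a heuristic.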
15:44 Seafire aglarendil: I'll try to look more into the logs later, now I'm off to a meeting. In the meantime thank you for your help
15:44 aglarendil ur welcome
15:59 xarses_ joined #fuel
16:09 fatdragon joined #fuel
16:19 benone joined #fuel
17:22 derrickb left #fuel
19:07 ipsecguy_ joined #fuel
19:46 DeMiNe0 joined #fuel
20:35 francois1 joined #fuel
22:07 Julien-zte joined #fuel
