Perl 6 - the future is here, just unevenly distributed

IRC log for #fuel, 2013-12-17

| Channels | #fuel index | Today | | Search | Google Search | Plain-Text | summary

All times shown according to UTC.

Time Nick Message
23:15 teran_ joined #fuel
23:18 IlyaE joined #fuel
23:22 albionandrew joined #fuel
23:28 teran joined #fuel
00:24 Shmeeny joined #fuel
00:24 teran joined #fuel
00:50 teran_ joined #fuel
02:14 teran joined #fuel
02:15 e0ne joined #fuel
02:21 xarses joined #fuel
02:48 vkozhukalov joined #fuel
02:50 rmoe joined #fuel
03:28 ArminderS joined #fuel
03:40 rongze joined #fuel
04:49 SergeyLukjanov joined #fuel
04:55 rongze joined #fuel
04:59 rongze_ joined #fuel
05:06 anotchenko joined #fuel
05:55 mihgen joined #fuel
06:14 ArminderS joined #fuel
06:14 teran joined #fuel
06:15 rongze joined #fuel
06:48 anotchenko joined #fuel
06:54 ArminderS weird thing, from the iso i created yesterday, in the fuel menu, its displaying the password i'm entering in "Root Password" instead of showing ***
07:16 ArminderS ceph rbd for ephemeral...wow
07:16 ArminderS love you guys
07:21 anotchenko joined #fuel
07:48 xarses and live migrations
07:49 xarses ArminderS: ^
07:57 SteAle joined #fuel
08:02 teran joined #fuel
08:06 mihgen joined #fuel
08:11 mrasskazov joined #fuel
08:14 e0ne joined #fuel
08:26 vkozhukalov joined #fuel
08:38 wputra joined #fuel
08:39 anotchenko joined #fuel
08:39 wputra hi all
08:39 wputra i installing fuel 3.2.1 using quantum GRE segmentation
08:40 wputra but there are many warning like this
08:40 wputra p_quantum-dhcp-agent_monitor_30000 (node=node-97, call=2395, rc=1, status=complete): unknown error
08:41 wputra how i can fix this, so quantum service always stable?
08:58 evgeniyl joined #fuel
09:02 mihgen joined #fuel
09:08 SergeyLukjanov joined #fuel
09:09 SergeyLukjanov joined #fuel
09:23 AndreyDanin joined #fuel
09:26 ArminderS right xarses
09:26 ArminderS the controllers are done
09:26 ArminderS lets see how it goes
09:26 ArminderS i wonder it starts building ceph nodes & compute nodes at same time
09:27 ArminderS until ceph osds are available, won't it fail compute if we are using ceph rbd for ephemeral?
09:28 ArminderS or just the presence of ceph mons are fine for this?
09:33 mihgen_ joined #fuel
09:39 bas joined #fuel
09:39 rvyalov joined #fuel
09:41 bas hi! I had a question about fuel, is it also possible to setup fuel itself in a HA manner? e.g. is it possible to have two fuel masters for instance (active/passive) or some other failover setup?
09:47 teran joined #fuel
09:51 vkozhukalov joined #fuel
09:57 SergeyLukjanov joined #fuel
09:57 teran joined #fuel
10:03 xdeller joined #fuel
10:06 wputra bas, i dont think its possible
10:08 wputra because when fuel deploy the node via pxe, it will distract each other
10:14 bas ok, that's too bad
10:16 bas are there any other best practices however for when the fuel master node disappears?
10:17 bas i mean, isn't having a single fuel master node a source for a single point of failure?
10:29 teran joined #fuel
10:52 vkozhukalov joined #fuel
11:08 ruhe joined #fuel
11:09 anotchenko joined #fuel
11:22 mihgen joined #fuel
11:22 anotchenko joined #fuel
11:26 SergeyLukjanov joined #fuel
11:37 SergeyLukjanov joined #fuel
11:43 ruhe joined #fuel
11:44 rongze joined #fuel
11:45 rongze_ joined #fuel
11:46 AndreyDanin bas, You're right. we want to change it in future.
12:05 teran_ joined #fuel
12:12 anotchenko joined #fuel
12:20 miguitas joined #fuel
12:27 anotchenko joined #fuel
12:32 sanek joined #fuel
12:34 rongze joined #fuel
12:52 Vidalinux joined #fuel
12:58 SergeyLukjanov joined #fuel
13:09 e0ne joined #fuel
13:14 MiroslavAnashkin wputra: If this message just a warning - it is normal. Quantum spends 1-2 minutes to reconfigure network settings and sends such warnings as keepalive messages
13:14 MiroslavAnashkin wputra: I mean quantum/neutron agent configuration script
13:15 bas AndreyDanin: thanks for your answer!
13:15 MiroslavAnashkin wputra: But please be prepared - GRE segmentation may work slow on some hardware/drivers or in some curcumstances
13:15 ruhe joined #fuel
13:18 rongze joined #fuel
13:25 e0ne joined #fuel
13:26 anotchenko joined #fuel
13:28 xdeller joined #fuel
13:29 sanek joined #fuel
13:35 sanek joined #fuel
13:39 rongze_ joined #fuel
13:48 rongze joined #fuel
13:49 rongze_ joined #fuel
13:56 tsduncan_ joined #fuel
14:04 aglarendil_ joined #fuel
14:07 e0ne_ joined #fuel
14:10 anotchenko joined #fuel
14:16 wputra MiroslavAnashkin: i dont think it is just warning, because the p_quantum-dhcp-agent, p_quantum-l3-agent and p_quantum-openvswitch-agent really going down
14:17 wputra MiroslavAnashkin: sometimes the quantum service cant up automatically, so i must cleanup the resources
14:17 wputra MiroslavAnashkin: yes, i hope my hardware good enough to run GRE
14:22 wputra anyway, is it possible to split quantum controller in others server?
14:27 MiroslavAnashkin wputra: How many controllers you have in your installation?
14:27 wputra MiroslavAnashkin: i have 3 controllers
14:28 MiroslavAnashkin wputra: How did you determined the p_wuantum_dhcp_agent and p_quantum_l3_agent are not started - via services or with crm command?
14:29 wputra via "crm status"
14:29 wputra MiroslavAnashkin: and also i compared with " quantum agent-list"
14:30 MiroslavAnashkin wputra: and the last question ;-) Are you trying to start up  p_quantum-dhcp-agent and p_quantum-l3-agent on all 3 controllers at one time?
14:31 MiroslavAnashkin There should be single instance of  p_quantum-dhcp-agent and p_quantum-l3-agent for the whole OpenStack cluster at single time. If one of these agents goes down - Pacemaker migrates it to other node.
14:32 wputra MiroslavAnashkin: no, the dhcp agent n l3 just running in one node
14:32 wputra MiroslavAnashkin: yes, i think so
14:33 wputra MiroslavAnashkin: but sometimes pacemaker dont bring it up again
14:33 MiroslavAnashkin wputra: But there should be p_quantum-openvswitch-agent clone running on each controller.
14:33 wputra MiroslavAnashkin: yes, but only one node running in one time
14:34 MiroslavAnashkin wputra: So, in you case you should have 3 p_quantum-openvswitch-agent clones running - one per controller, but only single instance of p_quantum-dhcp-agent of p_quantum-l3-agent, running on one of controllers
14:36 wputra yes, the crm status show me like that
14:36 wputra it is normal?
14:36 MiroslavAnashkin wputra: yes
14:36 wputra but sometimes, after several minutes, the p_quantum-openvswitch-agent in controller node failed
14:37 wputra and the warning like before appear
14:37 MiroslavAnashkin wputra: Only single DHCP and L3 agent instances allowed per whole network, it is the way Quantum/Neutron designed.
14:38 wputra in grizzly: yes, i agree
14:38 wputra i think the failure of p_quantum-openvswitch-agent have affect to p_quantum-dhcp-agent and l3-agent
14:39 wputra lets say, i have node-1, node-2, node-3
14:39 wputra the p_quantum-dhcp-agent running in all node, right?
14:40 wputra sorry, i mean p_quantum-openvswitch-agent. running in all node
14:41 wputra but the dhcp & l3 agent just run in one node at same time
14:41 wputra let say, node-2 and node-3
14:41 wputra nah, sometimes ovs agent in node-3 down
14:42 wputra it is also make l3 agent in node-3 down
14:43 wputra when pacemaker restart the ovs agent in node-3, l3 agent up again
14:43 wputra but it happen several times
14:43 wputra not only in node-3, it happened in node-1, node-2 randomly
14:44 wputra when node-2 fail, the dhcp agent fail too
14:44 ruhe joined #fuel
14:45 wputra this event happen repeatly, until at one time, pacemaker fail to bring the agent up too
14:45 MiroslavAnashkin wputra: Please run `crm status` at the time p_quantum-openvswitch-agent is down and share its output in http://paste.openstack.org/
14:47 wputra http://paste.openstack.org/show/55148/
14:49 wputra actually i use "watch crm status" to monitoring
14:53 wputra 1 node failed several minutes later: http://paste.openstack.org/show/55149/
14:56 MiroslavAnashkin wputra: These are not warnings...
14:58 wputra and as a result, we have failed dhcp n l3 agent: http://paste.openstack.org/show/55150/
15:00 wputra maybe the p_quantum-openvswitch-agent cluster in controller node (managed by corosync&pacemaker) not stable?
15:02 mihgen joined #fuel
15:27 SteAle joined #fuel
15:29 IlyaE joined #fuel
15:36 kpimenova_ joined #fuel
15:39 Shmeeny joined #fuel
15:41 Shmeeny joined #fuel
16:17 MiroslavAnashkin joined #fuel
16:28 anotchenko joined #fuel
16:53 rongze joined #fuel
16:54 SergeyLukjanov joined #fuel
16:59 angdraug joined #fuel
17:22 ruhe joined #fuel
17:27 rmoe joined #fuel
17:40 rongze joined #fuel
17:44 vkozhukalov joined #fuel
17:54 rongze joined #fuel
18:00 xarses joined #fuel
18:15 dan_a joined #fuel
18:27 AndreyDanin bogdando, xarses are you sure we really need it? https://review.openstack.org/#/c/61966/
18:27 AndreyDanin Sorry, wrong chat
18:37 vkozhukalov joined #fuel
19:13 rongze joined #fuel
19:23 ruhe joined #fuel
19:30 rmoe joined #fuel
19:35 e0ne joined #fuel
19:42 Fecn joined #fuel
19:45 Fecn Hi Folks - I think we're hitting an issue with broadcom nics combined with vlan tagging - Is this a known issue, and is there a fix? (Fuel 3.2.1, Neutron+ GRE and Neutron+Vlan segmentation both hit it)
19:46 Fecn The issue presents with messages such as "Dec 14 00:02:48 node-24 kernel: kvm: 4478: cpu0 unhandled wrmsr: 0x684 data 0" being logged to /var/log/syslog - my googling seems to indicate that this is a broadcom related issue
19:46 Fecn I would really love it if you guys already have a fix for this... as it is looking like we're going to end up going with vmware because of it :-(
19:55 MiroslavAnashkin Cannot even find such bug in launchpad
19:58 Fecn I found details on this (non-opentsack, but still KVM) post.. http://forum.proxmox.com/threads/5046-Error-kvm-cpu0-unhandled-wrmsr-amp-unhandled-rdmsr
19:58 Fecn Description of the problem perfectly matches what we've been seeing
19:58 Fecn everything cool for 5 mins... then networking drops for 30 secs.. then comes back again
19:59 Fecn I think this was most likely the cause of our kilobits-per-second throughput on GRE tunnels... and the cause of our dropouts with vlan segmentation
19:59 Fecn Apparently there's a fixed version of the broadcom driver in the 3.2 series kernels
20:00 Fecn MiroslavAnashkin: BTW - thanks for your suggestion on Friday about injecting resource params into fuel (not that I understand how to do that)
20:12 MiroslavAnashkin Fecn: What kernel version do you use?
20:14 Fecn The centos image in 3.2.1 seems to have 2.6.32 as a kernel
20:15 MiroslavAnashkin Fecn: BTW, there is patch for 2.6.32-220 kernel. We may ask OSCI to apply this patch and rebuild kernel
20:15 MiroslavAnashkin https://bugzilla.redhat.com/show_bug.cgi?format=multiple&id=816308
20:16 MiroslavAnashkin Fecn: Patch is trivial
20:17 Fecn Is there an easy way that I can upgrade the kernel on the deployed nodes? (add to yum repo and then yum install it).. etc
20:18 MiroslavAnashkin Fecn: If there is new kernel version in our repos - then yes. Otherwise - rebuild kernel from patched source:(
20:21 MiroslavAnashkin There is 2.6.32-358 kernel, it may have this patch included, http://download.mirantis.com/fuelweb-repo/3.2/centos/os/x86_64/Packages/
20:24 IlyaE joined #fuel
20:27 Fecn MiroslavAnashkin: Excellent - I'll give that a whirl right now
20:59 Fecn Aha.. that kernel is actually slightly older than the one which 3.2.1 installed already... that one is 2.6.2-358-123 (where as the one from 3.2 ends in 118)
20:59 Fecn so presumably if there was a patch in there it would be in 3.2.1 too
20:59 Fecn Guess I could build one though
21:31 pcatalog joined #fuel
21:58 mutex joined #fuel
21:58 mutex hi
21:58 mutex I seem to have gotten into a strange state with my cluster manager
21:58 e0ne joined #fuel
21:58 mutex vip__public_old resource is not working
21:58 mutex i'm not super familiar with crm so i'm poking around in the dark
21:58 mutex but I cannot get the public IP up with commands like crm resource start vip__public_old
21:59 mutex what else should I be doing ?
22:16 rascez joined #fuel
22:19 teran joined #fuel
22:35 miroslav_ joined #fuel

| Channels | #fuel index | Today | | Search | Google Search | Plain-Text | summary