
IRC log for #fuel, 2014-10-17


All times shown according to UTC.

Time Nick Message
00:29 alex_didenko joined #fuel
00:36 rmoe joined #fuel
00:38 alex_didenko joined #fuel
00:52 emagana joined #fuel
00:53 Rajbir joined #fuel
00:54 adanin joined #fuel
00:57 alex_didenko joined #fuel
01:07 klonhj joined #fuel
01:11 xarses joined #fuel
01:37 Longgeek joined #fuel
01:51 alex_didenko joined #fuel
01:55 emagana joined #fuel
01:55 adanin joined #fuel
02:08 emagana joined #fuel
02:14 teran joined #fuel
02:20 alex_didenko joined #fuel
02:53 adanin joined #fuel
03:00 emagana joined #fuel
03:22 Longgeek joined #fuel
03:54 adanin joined #fuel
04:03 teran joined #fuel
04:18 ArminderS joined #fuel
04:55 adanin joined #fuel
05:04 teran joined #fuel
05:26 emagana joined #fuel
05:30 anand_ts joined #fuel
05:31 kaliya_ joined #fuel
05:38 emagana joined #fuel
05:46 emagana joined #fuel
05:55 anand_ts hello all, any solution for this https://ask.openstack.org/en/question/51090/the-screen-shows-cannot-get-disk-parameters-while-deploying-mirantis-openstack-using-fuel/ ? I face a similar problem. First node (controller) deployment was successful, but when installing the compute node it shows an error.
06:05 teran joined #fuel
06:10 syt joined #fuel
06:27 baboune joined #fuel
06:27 baboune hello
06:28 baboune any updates on the missing nova server-group-add capabilities?
06:28 baboune see https://ask.openstack.org/en/question/30433/why-are-nova-server-group-apis-missing-in-rdo-icehouse-installation/
06:31 pal_bth joined #fuel
06:34 sc-rm left #fuel
06:34 jpf joined #fuel
06:35 hyperbaba_ joined #fuel
06:38 sc-rm joined #fuel
06:38 e0ne joined #fuel
06:47 e0ne joined #fuel
06:52 e0ne joined #fuel
06:59 baboune what is a reasonable size for the fuel /dev/mapper/os-var   partition?  at 50 G it is already 80% full
07:05 pasquier-s joined #fuel
07:06 teran joined #fuel
07:11 pasquier-s joined #fuel
07:13 ArminderS- joined #fuel
07:16 teran joined #fuel
07:52 e0ne joined #fuel
07:53 saibarspeis joined #fuel
08:17 kaliya baboune: it really depends on your deployment purposes. If you discover and provision many nodes, many remote logs will be stored there by rsyslog
08:17 kaliya baboune: but you can clean: dockerctl stop rsyslog, remove the logs that are no longer relevant, and dockerctl start
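    A minimal sketch of the cleanup sequence kaliya describes, assuming the remote logs live under /var/log/docker-logs/remote (the "/docker-logs/remote" path discussed below at 09:20):
      dockerctl stop rsyslog
      rm -rf /var/log/docker-logs/remote/*    # remove the remote node logs that are no longer relevant
      dockerctl start rsyslog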
08:18 sc-rm kaliya: Now that I have zabbix installed on a node, it reports the memcache to be down on both controller nodes, but memcache is running fine
08:19 kaliya sc-rm: zabbix reports? with which log message
08:19 skonno joined #fuel
08:20 sc-rm kaliya: http://snag.gy/CUoOD.jpg in http://172.16.4.5/zabbix/overview.php
08:21 ddmitriev left #fuel
08:25 ddmitriev joined #fuel
08:29 kaliya sc-rm: did you deploy on nova-network or neutron?
08:29 sc-rm neutron
08:30 skonno joined #fuel
08:32 kaliya sc-rm: because we have a similar issue already https://bugs.launchpad.net/fuel/+bug/1365171
08:33 azemlyanov joined #fuel
08:39 sc-rm kaliya: did the fix and it works
08:41 holser joined #fuel
08:41 Longgeek joined #fuel
08:42 dmitryme joined #fuel
08:48 sc-rm kaliya: Now that I tried to update python-oslo.messaging on all nodes I still have the problem with “connection_closed_abruptly” for rabbitmq and health check failing for “Create volume and attach it to instance”
08:48 sc-rm kaliya: If I run the health check for this entry alone, sometimes it passes, but when run with the rest it fails for like 99 out of 100 times
08:53 syt left #fuel
08:56 akupko joined #fuel
09:02 HeOS joined #fuel
09:02 adanin joined #fuel
09:20 baboune kaliya: what is safe to delete as part of the logs? /docker-logs/remote ?
09:20 kaliya sc-rm: did you update the oslo package? Sorry I don't remember
09:20 kaliya baboune: yes
09:23 baboune kaliya: is there a way to set an upper limit to the log size files? or should I monitor the disk space and clean up when needed?
09:23 sc-rm kaliya: Yep, I did so to this version http://fuel-repository.mirantis.com/fwm/5.1.1/ubuntu/pool/main/python-oslo.messaging_1.3.0-fuel5.1~mira5_all.deb
09:28 kaliya baboune: unfortunately, not... we're fixing these kinds of issues on logs
09:28 kaliya sc-rm: did you restart the relevant openstack services then?
09:29 sc-rm kaliya:  yep, restarted the entire openstack env. but I can try again
09:30 kaliya sc-rm: is this bug reflecting your case? https://bugs.launchpad.net/fuel/+bug/1371906
09:35 geekinutah joined #fuel
09:37 baboune it is not a big pb...
09:37 baboune thx
09:39 sc-rm kaliya: It might be, but I don’t have “AMQP server on 192.168.0.2:5672 is unreachable: timed out. Trying again in 5 seconds.” in the rabbitmq logs, only the connection_closed_abruptly
09:39 * merdoc dives into chat logs.
09:39 kaliya hi merdoc
09:39 merdoc hi!
09:39 kaliya sc-rm: seems related to oslo as always :)
09:40 sc-rm kaliya: has oslo been a common problem for some time?
09:40 merdoc so, CoW for raw will be available in 6.0? is it possible to get patch for 5.1?
09:42 kaliya sc-rm: it's used as a kind of shared library
09:45 kaliya sc-rm: are you in HA?
09:53 sc-rm kaliya: Yep HA
09:54 kaliya sc-rm: your cluster is up right? rabbitmqctl cluster_status
09:55 sc-rm kaliya: rebooting everything right now, so I’ll come back, when it’s online again
09:56 f13o joined #fuel
10:48 sc-rm kaliya: http://paste.openstack.org/show/121608/
10:49 sc-rm kaliya: After this reboot, the health check seems stable at passing right now, but I still have the warnings in the rabbit log
10:51 alexz joined #fuel
11:28 baboune backup failed: tar: Removing leading `/' from member names Compressing archives... tar: /var/backup/fuel/backup_2014-10-17_1025/fuel_backup_2014-10-17_1025.tar: Wrote only 6144 of 10240 bytes tar: Error is not recoverable: exiting now ^X^X^C^C^C^CChunk: 51%
11:29 baboune Any way to recover files used before the error?
11:52 evg baboune: What's happened to your files?
11:53 harybahh joined #fuel
11:54 baboune kaliya: I am not sure... during the compression something failed
11:55 baboune the thing is that the only thing available in the directory is the lrz
11:55 baboune and that is corrupted
12:08 sc-rm kaliya: After stress testing with a continuously running health check, I don’t see the problem anymore. So the oslo upgrade seems to work from my perspective :-)
12:08 skonno joined #fuel
12:12 baboune another pb... there seems to be one inode that is failing within one of the docker containers
12:12 baboune fuel/postgres_5.1:latest
12:13 baboune we see lots of write errors
12:14 baboune it mentions dm-5 in the errors (/dev/mapper/docker-253:2-290-ff720279a8fcc16643946a29c22e6ea27be5c615835b0c3ed336c549b6da8233 is linked to /dev/dm-5). Is this recoverable?
12:15 evg baboune: what did you try to do with tar? It doesn't break files.
12:17 baboune evg: I did nothing
12:17 evg baboune: what sort of errors?
12:17 baboune evg: I ran the "dockerctl backup" command
12:19 evg baboune: I've got it now, let me see
12:20 baboune messages:Oct 17 09:12:26 kds-cmc-fuel-02 kernel: EXT4-fs error (device dm-5): __ext4_ext_check_block: bad header/extent in inode #150545: invalid magic - magic 0, entries 0, max 0(0), depth 0(0)
12:20 baboune so that is what I see from the syslog
12:22 baboune basically the sequence of events: 1) the disk filled itself with logs 2) I got a bus error when doing a yum command to add nfs-utils 3) dockerctl stop rsyslog, cleaned up the logs 4) tried to do a mount and write to an NFS drive 5) had some access problems 6) dockerctl backup towards the default destination to try
12:23 baboune then got the error: backup failed: tar: Removing leading `/' from member names Compressing archives... tar: /var/backup/fuel/backup_2014-10-17_1025/fuel_backup_2014-10-17_1025.tar: Wrote only 6144 of 10240 bytes tar: Error is not recoverable: exiting now ^X^X^C^C^C^CChunk: 51%
12:23 baboune plus now I see the inode error mentioned..
12:23 baboune kernel: EXT4-fs error (device dm-5): __ext4_ext_check_block: bad header/extent in inode #150545: invalid magic - magic 0, entries 0, max 0(0), depth 0(0)
12:23 saibarspeis joined #fuel
12:25 baboune joined #fuel
12:25 baboune can I fsck within the container?
12:32 evg baboune: I'm not sure it's possible
12:34 adanin joined #fuel
12:35 evg baboune: are your errors in the master's log?
12:35 evg baboune: I mean outside the containers?
12:36 odyssey4me joined #fuel
12:38 odyssey4me joined #fuel
12:39 pasquier-s_ joined #fuel
12:39 evg baboune: sorry, last question. I see now
12:39 baboune yep.. weird pb...
12:40 odyssey4me Hi all - I see that the MTU for interfaces can be configured as per: http://docs.mirantis.com/openstack/fuel/fuel-5.1/reference-architecture.html#adjust-the-network-configuration-via-cli
12:40 baboune I could try another backup.
12:40 odyssey4me However, what I'm seeing is that none of the bridges or bonded interfaces inherit the MTU. How do I go about getting those configured with a higher MTU too?
12:41 baboune what is the size of a backup for a cluster of about 20 nodes?
12:41 pasquier-s__ joined #fuel
12:44 baboune evg:  can this be related: https://github.com/docker/docker/issues/7229?
12:45 evg baboune: yes
12:45 evg baboune: and thanks for the link
12:46 evg baboune: it seems I've seen such an issue before.
12:46 baboune evg: sigh...  Can I somehow recover? I have another host with the same install of Fuel 5.1, can I somehow just backup and restore the node manually?  I guess I mostly need to backup PG
12:48 evg baboune: to backup postgres - sure.
12:48 baboune evg: is it sufficient? Don't I also need the cobbler part?
12:49 kaliya odyssey4me: I don't know if you can configure MTU, but I will query our devs
12:49 kaliya odyssey4me: you mean to set the MTU for each of the bonded interfaces?
12:50 evg baboune: does it happen to only one container?
12:50 baboune evg: seems it is always the same inode/msg
12:50 baboune evg: so I would say yes
12:50 odyssey4me kaliya - I've done eth0 and eth1... but the bond of them does not inherit the higher MTU, so I want to set it... and then there are bridges on the bond (storage, etc) which I want to set the higher MTU for.
12:51 pal_bth joined #fuel
12:56 evg baboune: haven't you tried to restart the container? rebuild it?
12:57 baboune evg: well I rebooted the machine several times, and since the data in the DB is my existing production cluster, I dont know what happens if I "rebuild" it
12:58 baboune when the final compression step kicks in during a backup and a failure occurs, it might be a good idea not to delete the tar files
12:58 baboune I am not even sure that compressing is a good idea at all
13:02 evg baboune: sorry, will be away for a time..
13:40 MiroslavAnashkin baboune: Yes, looks exactly like this bug. We'll try to reproduce.
13:42 baboune MiroslavAnashkin: ok... I am trying to export the backup to an NFS mount point.  Anything I can do to recover the docker container?
13:47 evg baboune: I don't understand why deleting useless broken tar files is a bad idea.
13:48 evg baboune: which container exactly is broken?
13:49 odyssey4me kaliya - any luck with getting an answer?
13:49 baboune evg: because in the case where the compression fails, you lose all the tars and the generated backup info
13:49 baboune evg: plus compressing does not add any value
13:50 baboune evg: the pg container is broken
13:51 evg baboune: the worst one of all
13:51 evg baboune: btw, how much free space do you have on master?
13:52 baboune evg: now I have 26G
13:52 baboune evg: but when I logged in this morning the /var partition was full since the logs were accumulating and filling it
13:53 baboune evg: I suspect that might have triggered the container inode issue... but it could also be a docker bug
13:53 harybahh joined #fuel
13:54 evg baboune: could it happen during the disk space shortage? What do you think?
13:55 baboune evg: there is monitoring of the disk usage on the node, so it might have happened during it, yes
13:55 baboune evg: it seems logical I think
14:03 evg baboune: why do you think it's logical? bugs are not logical.
14:05 evg baboune: I'm going to reproduce it (if I'm lucky)
14:09 baboune evg: anything I can do to fix the pb within the container?
14:11 evg baboune: are you still able to dump the db?
14:12 baboune evg: I think the last "dockerctl backup" that is still running might complete
14:12 baboune evg: but I am unsure about dumping the db
14:13 baboune evg: the created db_backup folder ("dockerctl backup") is empty
14:14 baboune evg: how to dump it again?
14:15 brad[] hi all, if I've deployed a node as a ceph-osd can I then add compute functionality to it or do I have to tear the thing down? Risk of data loss implied here.
14:16 brad[] I haven't looked past the web UI mind you. The CLI may afford me this capability
14:16 brad[] but I thought I'd be lazy and ask first. :-)
14:23 HeOS joined #fuel
14:23 baboune evg: should the  "dockerctl backup" command create a valid /db_backup folder with content?
14:25 jobewan joined #fuel
14:26 evg baboune: I don't think your / has enough space for the backup. You've said your last backup is broken.
14:27 evg baboune: I meant backup postgres db
14:28 MiroslavAnashkin baboune: Please check this bug as well - https://bugs.launchpad.net/fuel/+bug/1378327  There is fix for daily logrotate settings, proposed in the bug comments
14:29 MiroslavAnashkin brad[]: No, currently it is impossible to add or delete a role to/from an already deployed node. You can only remove the whole node and re-deploy it with the new role set.
14:31 mattgriffin joined #fuel
14:31 mpetason joined #fuel
14:32 baboune evg: I am using an NFS mount to an external file storage
14:32 mattgriffin joined #fuel
14:32 baboune evg: but the db_backup is empty
14:32 mattgriffin joined #fuel
14:32 baboune evg: drwxrwxrwx 2 root       kds           2 Oct 17 14:55 db_backup -rw-r--r-- 1 4294967294 kds 16998656000 Oct 17 15:47 fuel_backup_2014-10-17_1320.tar -rw-r--r-- 1 4294967294 kds  3012015144 Oct 17 16:20 fuel_backup_2014-10-17_1320.tar.lrz
14:33 baboune evg: and one of the effects of compression is that it takes more space during the backup and you have no idea if the content is correct
14:34 baboune evg: in any case the generated db_backup on the NFS mount is empty, is that expected?
14:34 baboune evg: http://docs.mirantis.com/openstack/fuel/fuel-5.1/operations.html#howto-backup-and-restore-fuel-master does not mention it
14:35 MiroslavAnashkin baboune: BTW, you may use pg_dump to backup the postgres DB. And a full DB dump is included in the diagnostic snapshot. With this dump you may re-deploy the whole master node with the same network settings and then re-populate the DB from the dump and continue working
14:37 baboune MiroslavAnashkin: what will happen to the cobbler info related to the nodes? Won't that mess up the MACs, IPs and images?
14:38 MiroslavAnashkin No, Cobbler takes this info from Nailgun. You only have to re-sync cobbler
14:39 baboune MiroslavAnashkin: I would need a walkthrough... btw root@ff720279a8fc ~]# pg_dump pg_dump: [archiver (db)] connection to database "root" failed: FATAL:  role "root" does not exist, I need the login/pwd and connection
14:42 dhblaz joined #fuel
14:50 baboune is this the command for the dump: pg_dump -c -h  127.0.0.1 -U nailgun nailgun?
14:53 MiroslavAnashkin Backup DB
14:53 MiroslavAnashkin # `dockerctl shell postgres`
14:53 MiroslavAnashkin # `sudo -u postgres pg_dump > /var/www/nailgun/dump_file.sql`
14:53 MiroslavAnashkin This path /var/www/nailgun/ is cross-mounted between all the containers, so your dump appears in the root filesystem at the same path
14:53 MiroslavAnashkin # `exit`
14:53 MiroslavAnashkin Restore postgres DB from dump
14:53 MiroslavAnashkin Place your dump file to /var/www/nailgun/
14:53 MiroslavAnashkin # `dockerctl shell postgres`
14:53 MiroslavAnashkin # `sudo -u postgres psql nailgun -S -c 'drop schema public cascade; create schema public;'`
14:53 MiroslavAnashkin # `sudo -u postgres psql nailgun < /var/www/nailgun/dump_file.sql`
14:53 MiroslavAnashkin # `exit`
14:53 MiroslavAnashkin # `dockerctl restart nailgun`
14:53 MiroslavAnashkin # `dockerctl restart nginx`
14:53 MiroslavAnashkin # `dockerctl shell cobbler`
14:54 MiroslavAnashkin # `cobbler sync`
14:54 MiroslavAnashkin # `exit`
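    A quick post-restore sanity check one could run from inside the postgres container, on the assumption (not stated in the walkthrough above) that the nailgun schema keeps its node inventory in a "nodes" table:
      # `sudo -u postgres psql nailgun -c 'SELECT count(*) FROM nodes;'`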
15:02 baboune MiroslavAnashkin: great! thx, will try this on Monday
15:03 baboune MiroslavAnashkin: on 'dockerctl backup', should the "db_backup" contain the database? Or is that only a leftover from a previous fuel version?
15:03 brad[] MiroslavAnashkin: hrm ok.
15:04 MiroslavAnashkin Dockerctl backup stores the whole postgres container, with the database
15:04 baboune MiroslavAnashkin: ok so this folder created during the backup is not necessary?
15:04 MiroslavAnashkin It is the recommended backup way
15:05 brad[] The other lazy question I have is, can fuel facilitate any kind of non-invasive upgrade of Openstack components from one version to the next? I could always remove nodes from one pool and add them to a new one, given sufficient spare hardware (min. 4 machines) but that doesn't seem like it would scale
15:06 MiroslavAnashkin brad[]: Planned in the upcoming 6.0. The current 5.1 does only minor upgrades within the same OpenStack release, say Icehouse 2014.1.1 to Icehouse 2014.1.2
15:07 MiroslavAnashkin baboune: which folder?
15:07 brad[] MiroslavAnashkin: I assume  that will have a lot to do with how openstack components themselves make themselves amenable to upgrades ... e.g. massive DB schema changes or dependencies on new subsystems etc...
15:07 brad[] But from my reading it seems they've been paying more attention to that of late
15:08 baboune MiroslavAnashkin: When the "dockerctl backup" command starts, it creates a folder
15:08 baboune MiroslavAnashkin: a folder in the target path named db_backup
15:08 baboune MiroslavAnashkin: this folder is empty and remains empty
15:08 MiroslavAnashkin Aah - if you don't need this folder - delete it
15:08 baboune MiroslavAnashkin: and it is confusing because you think it should contain the dump
15:09 MiroslavAnashkin Next time dockerctl should create a new one
15:09 baboune MiroslavAnashkin: I did not create it... It gets created for me. And since it is empty it makes it look like the dump of the DB failed
15:09 MiroslavAnashkin Hmm. let me check...
15:11 jobewan joined #fuel
15:16 teran joined #fuel
15:19 youellet joined #fuel
15:36 blahRus joined #fuel
15:40 alex_didenko joined #fuel
15:41 skonno joined #fuel
15:46 emagana joined #fuel
15:51 alex_didenko left #fuel
15:57 MiroslavAnashkin Default dump path is in /var/backup/fuel folder
15:58 emagana joined #fuel
15:59 e0ne joined #fuel
16:00 emagana joined #fuel
16:02 e0ne joined #fuel
16:06 emagana joined #fuel
16:07 pasquier-s__ joined #fuel
16:09 rmoe joined #fuel
16:12 e0ne joined #fuel
16:34 xarses joined #fuel
16:37 syt joined #fuel
16:43 ArminderS joined #fuel
16:46 pasquier-s joined #fuel
16:56 skonno joined #fuel
17:01 pasquier-s joined #fuel
17:10 e0ne joined #fuel
17:19 syt joined #fuel
17:20 mattgriffin joined #fuel
17:30 jpf joined #fuel
17:37 odyssey4me kaliya - I came right, by the way. :)
17:44 syt joined #fuel
17:48 emagana joined #fuel
17:53 jpf joined #fuel
17:54 harybahh joined #fuel
17:56 skonno joined #fuel
18:09 emagana joined #fuel
18:14 kupo24z joined #fuel
18:15 kupo24z xarses: have you seen this before? http://pastebin.mozilla.org/6803925
18:15 kupo24z keystone gets to about 80% cpu and hits that python error even after restarting
18:16 jobewan joined #fuel
18:16 kupo24z pretty intermittent though
18:17 kupo24z Could it be too many requests? I have 40~ nodes making requests every minute
18:19 blahRus weird
18:25 syt joined #fuel
18:51 jetole_ joined #fuel
18:54 dhblaz joined #fuel
18:59 xarses kupo24z: you have 40 api calls / min?
19:00 xarses keystone will only fire up for endpoint / auth / validate on api calls
19:01 angdraug joined #fuel
19:01 emagana joined #fuel
19:04 syt joined #fuel
19:10 kupo24z xarses: see pm, not sure if that would cause what im seeing
19:16 emagana joined #fuel
19:17 jetole_ Hey guys. I have a relatively fresh cluster install. Bare metal, neutron/vlan. It was working, but right now, from firing up cirros, I can see that the default gateway on net04 isn't responding to ping and I can't seem to ping 8.8.8.8 (internet IP)
19:19 kupo24z jetole_ did you run network verification?
19:19 jetole_ yes
19:20 jetole day before yesterday all networking was running fine and network verification all passed
19:22 xarses kupo24z: still thinking about it
19:23 jetole I'm running net verification again and health checks again. all health are thumbs up so far though it's waiting on Check network connectivity from instance via floating IP
19:23 jetole ...which will fail
19:23 jetole oh wow. net verification did not pass
19:24 jetole I don't know why but I am going to try restarting the switches
19:24 xarses kupo24z: maybe check on haproxy
19:25 xarses jetole: you can check that the neutron router's namespace is working properly, and that you can ping through it
19:25 jetole xarses, how do I do that?
19:26 kupo24z xarses: looks like its still occuring after i nuked all instances
19:26 xarses then work backwards, try to ping out from the node's router namespace, if that fails trace between the two
19:26 jetole xarses, how do I do this?
19:26 kupo24z crm status shows everything fine, is there a specific thing im looking for?
19:27 xarses one of the controllers will be running neutron-l3-agent, you can see this from the output of 'pcs status'
19:27 xarses from that controller, you can use 'ip netns' to list the namespaces
19:27 xarses you will see 'qrouter-<UUID>'
19:28 xarses uuid will match the output of 'neutron router-list'
19:28 jetole xarses, p_neutron-l3-agent(ocf::mirantis:neutron-agent-l3):Started node-4.example.com
19:28 jetole Is that the line of the controller in question
19:29 xarses you can then run commands in the netns for the router with 'ip netns exec <namespace name (qrouter-<UUID>)> <command>'
19:29 xarses i like to exec bash so i can run multiple commands
19:29 xarses jetole: correct, node-4 is running the l3-agent
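    A condensed form of the sequence xarses outlines (the router UUID is a placeholder to fill in from 'neutron router-list'):
      pcs status | grep l3-agent          # find which controller runs the l3 agent
      ip netns                            # on that controller, list the namespaces
      ip netns exec qrouter-<UUID> bash   # enter the router's namespace
      ping 8.8.8.8                        # then ping / traceroute from inside it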
19:30 jetole I have two qrouter-<UUID>'s
19:30 xarses correct, one will be net04, and the other net04_ext
19:30 jetole xarses, what is the file I source on a node for keystone logins
19:30 jetole . <something>
19:30 xarses . openrc
19:31 jetole OK. I see the one for router04 on neutron router-list
19:32 jetole xarses, ip netns exec 9bdc9808-0f8c-473d-97af-14e88ed15c6e /bin/bash
19:32 jetole ??
19:33 xarses qrouter-9bdc9808-0f8c-473d-97af-14e88ed15c6e
19:33 xarses but ya
19:33 jetole I got a blank line when I did that and a return code of 0
19:33 jetole ooohhhh
19:33 xarses you are now in, the netns
19:33 jetole it was from/to the same machine
19:33 jetole so it worked
19:34 xarses so you can check the ip settings
19:34 xarses do a ping
19:34 xarses traceroute
19:34 xarses all kinds of fun debug
19:34 jetole thumbs up on ping 8.8.8.8
19:34 jetole ...need to launch another test instance. Nothing running now
19:34 xarses ok, and thats the qrouter for net04, not net04_ext?
19:35 emagana joined #fuel
19:35 xarses can you paste the output of 'neutron agent-list'
19:35 jetole I think it's the same router. neutron router-list doesn't show a name for net04_ext
19:36 xarses ok
19:36 jetole neutron agent-list seems to be taking a while
19:37 jetole Connection to neutron failed:
19:37 xarses try 2 more times
19:38 jetole k
19:38 kupo24z xarses: i put debug on and restarted, will see what happens in the logs
19:38 jetole I started a cirros instance too in another term and went to the controller and did the netns /bin/bash command, etc and I can't ping the cirros instance
19:39 xarses kupo24z: it seems like the connection reset is likely the problem here, it could explain the high usage, and the message in the log, check datapath from client -> vip__management_old -> haproxy -> one of your keystones. You can try moving the vip around, also try keystone client calls through the vip and directly to each node, I'd do around 100 each
19:39 jetole I got "Connection to neutron failed:" error again. Running one more time
19:39 xarses like say, keystone endpoint-list
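    A rough sketch of the comparison xarses suggests, assuming the openrc auth URL normally points at the management VIP and that each controller's keystone also answers on its own management address (the address is a placeholder):
      for i in $(seq 1 100); do keystone endpoint-list >/dev/null || echo "fail via VIP ($i)"; done
      ( export OS_AUTH_URL=http://<controller-mgmt-ip>:5000/v2.0/
        for i in $(seq 1 100); do keystone endpoint-list >/dev/null || echo "fail direct ($i)"; done )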
19:40 jetole xarses, all 3 times I got "Connection to neutron failed:" from `neutron agent-list`
19:41 xarses jetole: same if you do neutron router-list 3 times?
19:41 kupo24z xarses: how should i be restarting keystone? its not in crm so is init.d fine?
19:42 xarses kupo24z: correct
19:42 kupo24z of course now after i restarted in debug no tracebacks
19:42 kupo24z sigh
19:43 jetole xarses, `neutron router-list` seems to work fine under a normal shell but not under the netns shell
19:44 xarses jetole: oh, it wont work in the netns shell
19:44 xarses you'd need to exit that
19:44 jetole oh
19:44 syt joined #fuel
19:44 jetole neutron agent-list works fine outside of the netns shell
19:44 e0ne joined #fuel
19:44 jetole as does router-list
19:44 xarses ok, yay
19:44 jetole lol
19:44 jetole do you want me to paste the output
19:44 jetole ?
19:45 jetole oh! Also, router-list shows two routers. router04 which I believe is the same router for net04 and net04-ext and a test router I created
19:46 xarses paste here http://paste.openstack.org/ (or somewhere similar) and then put the url in IRC
19:47 xarses back in a bit
19:48 jetole xarses, http://paste.openstack.org/show/121708/
19:49 jetole xarses, http://paste.openstack.org/show/121709/
19:49 jetole bb in 15
19:54 harybahh joined #fuel
19:59 skonno joined #fuel
20:02 emagana joined #fuel
20:08 nullme joined #fuel
20:11 emagana joined #fuel
20:11 nullme hey got a question on Fuel, I have 1.1.1.2 to 1.1.1.5 assigned to Public and 1.1.1.6 to 1.1.1.14 assigned to Floating... why does my floating pool only have 5 of the 9 ips available, I know 1 is attached to router... but where are the rest?
20:18 xarses joined #fuel
20:18 jetole nullme, good question. I have 240 floating IP's and only see ~15 of them
20:20 jetole nullme, also I hope you're kidding about 1.1.1.x since that's an actual public net that I am pretty sure you don't own
20:22 nullme HA HA HA, only example... and we do own a lot of IP's but, this was a lab test before deployment and i can't find documentation or any indication on what happens to the other IP's
20:23 jetole yeah I don't know either. I'm curious but have been dealing with other deployment issues so far and haven't looked into it
20:24 jetole I'm considering moving to provider networking. I'd like the idea that staff can ssh directly to the internal network IP's without a floating IP and only assign floating IP's for public facing services
20:24 pasquier-s joined #fuel
20:25 Dr_Drache nullme; jetole as long as the ips are in the "pool" they are there, just not displayed in horizon
20:25 nullme jetole, thanks for the laugh, and trust me, im in the same boat... lOL
20:25 jetole Dr_Drache, I thought that was probably the case but wasn't sure
20:26 nullme Dr_Drache, how do you get them assigned to VM from Horizon, is this OpenStack issue or Fuel? I need to find some answers on this for our Clients...
20:27 Dr_Drache nullme, you mean to "see" the external network (floating) IP in the instance?
20:28 nullme Dr_Drache, yes, to be able to assign the floating IP to an instance or Router... both report no more IP's available
20:29 emagana joined #fuel
20:29 Dr_Drache how big is your pool?
20:29 Dr_Drache that you assigned to the external network
20:30 Dr_Drache this would be an openstack issue technically.
20:31 kupo24z you can always add floating ips in bulk to nova through the CLI
20:31 Dr_Drache you should be able to choose "assign floating IP" when viewing the instance in horizon.
20:31 kupo24z assuming routable through your existing public gateway
20:32 Dr_Drache and assuming you don't cross masks :P
20:33 nullme Dr_Drache, In Fuel, the public range is 3 ip's example 1.1.1.2 to 1.1.1.5 the (floating) IP range is 1.1.1.6 to 1.1.1.14
20:33 Dr_Drache so, 10 total ips
20:33 Dr_Drache in your example.
20:34 nullme kupo24z, i know you can add this through the cli, but i'm talking about Clients with no IT experience who only work from Horizon... thanks for the help
20:34 Dr_Drache nullme, how many nodes?
20:34 nullme Dr_Drache, 9 for floating...
20:35 Dr_Drache and do you have nodes being assigned external ips?
20:35 Dr_Drache nullme, sorry, I wasn't doing inclusive.
20:35 jetole I have an instance in error state and reset-state doesn't seem to be affecting it. anyone have any suggestions? Dr_Drache ?
20:35 Dr_Drache my bad
20:35 nullme from that 9, 1 gets assigned to the router... so that is 8 left...
20:35 nullme jetole, reset-state --active
20:35 Dr_Drache and the instance is only attached to the internal network?
20:36 nullme lets say all my instance are attached to external only
20:36 Dr_Drache nullme, doesn't work that way.
20:36 jetole nullme, thanks. That worked
20:36 Dr_Drache at least, I've been told, and never had it work.
20:37 Dr_Drache nullme, instances must be attached to the internal network, then they can communicate, and be assigned a floating.
20:37 Dr_Drache (a) internal network at least, more are possible of course.
20:37 nullme Ok, lets say no floating ip's are attached to instance... my question is that i should have 8 ip's left in Horizon to be able to allocate.. and i only see 5
20:38 Dr_Drache that's a gui thing, they are all still there.
20:38 nullme Dr_Drache, I know all instances get a internal L3 network then you attach that to floating... that is basic... im saying in horizon i should see 8 ip's available and only see 5
20:39 Dr_Drache nullme, I'm saying, you don't see them all.
20:39 Dr_Drache at least I never have.
20:39 Dr_Drache I see a portion of them.
20:39 nullme Dr_Drache, i know they are there... just not in Horizon... so my question is: is this a Fuel or an OpenStack issue... and has anyone found why they don't show up when i allocate them in Fuel...
20:40 Dr_Drache either I pick from the list, and the list is refreshed, or I manually add one that is not in the list, but in the pool.
20:40 Dr_Drache that is a openstack design choice IIRC.
20:41 nullme yeah i was thinking i start with Fuel, since in fuel i allocate 9 IP's and in Horizon i only have 5 available, so i figured maybe fuel does not communicate everything to horizon.
20:41 Dr_Drache I've never seen my full allocation in horizon.
20:42 nullme Dr_Drache, just checked nova and still can't assign 3 of the 9 ip's
20:42 syt joined #fuel
20:42 Dr_Drache but I've never tested with anything less than a /24
20:43 Dr_Drache and I assume they are not assigned to nodes?
20:44 nullme Dr_Drache, this is bad, and this is on a big /24 network in the data center, but we chop it down to /28 for clients... or /27, and they lose ip's in openstack... compared to a standard traditional network... lets see what others do... thanks for all your help!!!
21:01 skonno joined #fuel
21:06 jetole well I'm back to drawing a blank. All VLAN's are down for verify network test. cirros can't connect to heat (though heat engine is running). This all worked day before yesterday and I can't find out why my networks have failed since
21:06 jetole xarses? Dr_Drache? anyone?
21:12 emagana joined #fuel
21:13 xarses jetole: did you check to see if the instances are being plumbed in ovs from the compute to the neutron-router?
21:06 jetole xarses, I'm not sure what you just said but right now, after having rebooted all nodes and hard power cycled the switches (stacked Cisco 3750-E's), I just ran `ip netns` from the l3 neutron controller and haproxy is the only namespace it returned
21:20 xarses output from neutron agent-list?
21:21 kupo24z xarses: It looks like it's definitely the cron, i moved the script and i haven't seen one traceback
21:22 kupo24z Does it get a new token every nova/neutron command?
21:22 xarses kupo24z: odd, I'll look at it again
21:22 xarses kupo24z: probably
21:23 jetole xarses, http://paste.openstack.org/show/121724/ - node6 is rebooting at the moment
21:24 jetole node6 was l3 neutron and node5 has now become l3 neutron
21:24 xarses jetole: that's fine, they were moved to another node by corosync/pacemaker
21:25 xarses metadata will restore, the l3, and DHCP will stay XXX
21:25 xarses even once the node restores
21:25 xarses which is fine
21:25 xarses they only need to be running on one of the controllers
21:25 jetole ok
21:25 xarses so node-5 has no qrouter-<UUID> namespaces?
21:26 jetole no it also only list haproxy
21:26 kupo24z xarses: deploying 6.0 right now and ran into an existing bug on setting disk size: IndexError: list index out of range
21:27 jetole xarses, if you're interested: http://paste.openstack.org/show/121725/
21:29 xarses try 'crm resource restart neutron-l3-agent'
21:30 kupo24z Should i feel free to submit any bugs i see or check with you first?
21:31 jetole xarses, http://paste.openstack.org/show/121728/
21:32 xarses kupo24z: feel free to log bugs
21:32 xarses sorry, its p_neutron-l3-agent
21:34 jetole done. It still only shows haproxy
21:34 jetole horizon is showing router04
21:37 xarses jetole: do you have any instance on router04?
21:37 xarses net04 even?
21:38 jetole I have only one instance running overall and that one, I am not sure of. It has a net04 IP but it's unusable atm
21:39 jpf_ joined #fuel
21:39 rmoe unusable in what way?
21:40 jetole the network isn't working on the instances
21:40 jetole rmoe, that's my problem
21:40 rmoe do the instances get an IP from DHCP?
21:40 jpf_ joined #fuel
21:41 rmoe as in if you connect to the vm console and run ip a you see an IP
21:41 jetole two days ago everything worked fine. Now the network test is failing. the instances aren't contacting heat, etc
21:41 jetole not on cirros, I think ubuntu did but still wasn't working. Let me verify
21:43 emagana joined #fuel
21:43 jetole ubuntu is stalling at cloud-init due to heat being inaccessible due to networks being down and I am getting very frustrated
21:44 rmoe cloud-init will fail if the instance doesn't get an IP from DHCP
21:44 rmoe so that's probably the root of these problems
21:44 jetole I know
21:44 jetole the root of the problem is the networks aren't functioning at all
21:44 rmoe is the dhcp-agent running?
21:45 jetole rmoe, read up
21:45 rmoe ok, I see it
21:47 adanin joined #fuel
21:48 rmoe is there a dnsmasq process running on the node with the dhcp-agent?
21:48 rmoe there should be one process for each network you create with a dhcp range
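    A quick way to check that, as a sketch, on the node running the dhcp-agent:
      ps aux | grep [d]nsmasq    # expect one dnsmasq process per network that has a DHCP range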
21:50 jpf joined #fuel
21:50 jpf joined #fuel
21:51 jetole node-1 is running dnsmasq but the controllers are not showing the router in the namespace. xarses was helping me with this
21:52 rmoe ok, we'll figure that out too, but it's a separate issue from your dhcp problems
21:53 rmoe so if your agent is running and you see dnsmasq processes then the next thing to try is using tcpdump inside the correct dhcp-<uuid> namespace
21:53 jetole well if the neutron router isn't working then how would dhcp get from dnsmasq to the instance? I thought the router was the stepping stone between the two?
21:54 rmoe the router is for l3 connectivity, floating ips and connecting different neutron subnets
21:54 xarses jetole: you should see a qdhcp namespace on the node running neutron-dhcp-agent
21:54 harybahh joined #fuel
21:55 jetole how do I know which node is running the neutron-dhcp-agent? pcs status?
21:55 xarses yes
21:56 jetole I see two qdhcp namespaces
21:56 rmoe run neutron net-list and find the UUID for the network your vm is connected to
21:57 jetole got it. One is for net04, one is for net04-ext. I know I want the one for net04
21:57 rmoe yep
21:57 jetole ok
21:58 rmoe so if you run ip a inside the namespace you should see a tap interface with the IP of your DHCP server for that network
21:58 rmoe if the interface is up and configured that's what you'll want to run tcpdump on
21:58 jetole I only got lo from inside the namespace
21:59 rmoe is there anything useful in the dhcp agent logs?
21:59 jetole what does the "a" in ip a do? I usually do ip -4 -o addr ls scope global
21:59 rmoe a is just short for addr
22:00 tatyana joined #fuel
22:00 jetole where are the dhcp-agent logs?
22:00 rmoe should be /var/log/neutron/dhcp-agent.log
22:01 jetole /var/log/neutron/dhcp-agent.log
22:01 jetole rmoe, I'm running bare metal, neutron/vlan
22:02 jetole 2014-10-17 21:42:38.651 6007 ERROR neutron.agent.dhcp_agent [req-ff6a3b05-9a91-4103-b1e8-53af40bbe9da None] Unable to reload_allocations dhcp for cd097fbe-4bd3-4c2e-9a5f-4c610ea19bae.
22:02 jetole 2014-10-17 21:42:38.651 6007 TRACE neutron.agent.dhcp_agent Command: ['sudo', 'neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip', 'netns', 'exec', 'qdhcp-cd097fbe-4bd3-4c2e-9a5f-4c610ea19bae', 'ip', 'route', 'list', 'dev', 'tap0d97d609-03']
22:02 jetole that uuid matches the qdhcp for net04
22:05 jetole rmoe, xarses, this seems to re-occur a few times throughout the dhcp log: http://paste.openstack.org/show/121730/
22:07 emagana joined #fuel
22:11 e0ne joined #fuel
22:12 xarses kupo24z: before you loop, do keystone token-get and export id as OS_AUTH_TOKEN, then it will attempt to use the token before getting a new token
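    One way to do that export, assuming the keystone CLI's table output where the token sits in the "id" row (whether every client honours OS_AUTH_TOKEN on its own varies by client version):
      export OS_AUTH_TOKEN=$(keystone token-get | awk '$2 == "id" {print $4}')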
22:12 kupo24z xarses: cool will try it out
22:12 kupo24z thanks
22:12 kupo24z btw 6.0 deployment failed, already submitted bug
22:13 HeOS joined #fuel
22:13 xarses if the token becomes invalid (~1hr), the rest of the parts are still there to get another one
22:13 jetole xarses, rmoe, I ran quantum-ovs-cleanup and then service neutron-dhcp-agent restart. A new instance seems to have received data from heat via cloud-init
22:13 xarses but it won't re-export it
22:13 rmoe excellent
22:14 rmoe I found a launchpad bug that looks like the same issue you're running into, it seems to be intermittent as you've noticed
22:14 jetole but now I am "url_helper.py[WARNING]" messages on the instance
22:14 jetole looks like it can't connect to heat engine
22:14 jetole I saw another launchpad bug that mentioned quantum-ovs-cleanup is not run on reboot
22:16 jetole I see heat engine isn't running on one controller node
22:19 jetole f***!!! I started openstack-heat-engine on that controller and now a new instance is presenting a cloud-init-nonet notification
22:21 jetole 2014-10-17 20:15:08.402 46650 INFO neutron.openstack.common.service [-] Caught SIGTERM, exiting
22:21 jetole rmoe, does that mean anything ?
22:21 rmoe did you restart a service at that time?
22:21 rmoe or send a term signal to kill the process?
22:22 jetole that was actually before the system had been rebooted
22:24 rmoe usually those messages just mean the service was restarted, hopefully there were log messages shortly afterwards of the service starting again
22:24 syt joined #fuel
22:25 jetole here is the entire boot log of a ubuntu 14.04 instance. It looks like it is having issues connecting to heat / cloud-init but then it looks like there is cloud-init data presented: http://paste.openstack.org/show/121736/
22:25 jetole oh scratch that. ssh host key. not client key
22:26 e0ne joined #fuel
22:27 rmoe still not getting an IP
22:27 rmoe you attached the VM to a network with DHCP?
22:27 jetole so dhcp and some aspect of cloud-init was working from changes I made on node-4 until I started heat-engine on node-6
22:27 jetole and then it failed
22:28 rmoe are you launching these instances with heat or through horizon or the cli?
22:28 jetole through horizon
22:28 rmoe ok
22:28 rmoe and you're choosing the net04 network?
22:28 jetole yep
22:29 rmoe ok, so we're back to verifying the network config of that namespace and working from there
22:29 jetole rmoe, afaik this is a virtually fresh and untained install that is only a few days old
22:29 rmoe I can't imagine how starting heat-engine could affect the dhcp-agent
22:29 rmoe this seems really weird
22:30 jetole I'm tempted to redeploy except I don't know what has caused this problem
22:30 jetole and I can't have these issues re-occuring in production
22:30 rmoe definitely not
22:34 jetole so... I don't know what to do now
22:35 rmoe let's go back to looking into the dhcp namespace
22:35 rmoe make sure that tap interface is up
22:35 rmoe if it's not check the logs again for that error you saw earlier
22:36 rmoe there was a launchpad bug for the same issue you saw and I verified that we have the patch in our packages that (supposedly) fixed the issue
22:36 jetole [root@node-4 ~]# ip -o l | grep tap
22:36 jetole [root@node-4 ~]#
22:36 rmoe that's from inside the qdhcp namespace?
22:36 jetole oh no
22:36 jetole my bad
22:37 jetole only lo from inside the netns
22:37 jetole [root@node-4 ~]# ip netns exec qdhcp-cd097fbe-4bd3-4c2e-9a5f-4c610ea19bae ip -o l
22:37 jetole 25: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN \    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
22:37 rmoe are there more errors in the logs?
22:38 jetole It looks like the same error
22:38 jetole I just did the same resolution
22:38 jetole [root@node-4 ~]# ip netns exec qdhcp-cd097fbe-4bd3-4c2e-9a5f-4c610ea19bae ip -o l
22:38 jetole 25: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN \    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
22:38 jetole 35: tap0d97d609-03: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN \    link/ether fa:16:3e:85:cd:16 brd ff:ff:ff:ff:ff:ff
22:40 jetole launching an instance now
22:41 jetole cloud-init-nonet warning
22:45 jetole rmoe, so the tap is up, I am running tail with follow on the log to see events as they happen but the instance started without cloud-init or network
22:46 jetole ...and I don't know how else to troubleshoot this
22:46 rmoe did you have to restart the agent again to get the tap interface to come up?
22:46 jetole yeah
22:46 rmoe have you booted a VM since you brought the tap interface back up?
22:46 jetole yes the one I booted just after
22:46 jetole that's what I was referring to
22:47 rmoe ok
22:47 rmoe is the tap interface still up?
22:47 jetole dammit
22:47 jetole no
22:47 rmoe and do you have the same error messages as before?
22:47 jetole not yet
22:50 rmoe can you enable debug logging (if you haven't already) and restart the agent again?
22:50 jetole how do I enable debug
22:50 jetole ?
22:50 rmoe then run tcpdump on the tap interface and boot a vm
22:50 rmoe vim /etc/neutron/neutron.conf
22:50 rmoe there is a line that says debug=false, change it to debug=true
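    A non-interactive equivalent of that edit plus the agent restart used earlier in this log (a sketch; in this HA setup the dhcp agent may also be managed by pacemaker as p_neutron-dhcp-agent):
      sed -i 's/^debug *= *[Ff]alse/debug = True/' /etc/neutron/neutron.conf
      service neutron-dhcp-agent restart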
22:50 jetole got it
22:51 rmoe hopefully between debug logging and tcpdump something will stand out
22:52 jetole do you know if tcpdump has an option to display the content when you write it to a file? I know tshark does but centos doesn't seem to have tshark in the repos
22:52 rmoe -A will print the packet contents
22:53 rmoe I don't know about when writing to a file though
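    A concrete form of what rmoe suggests, reusing the namespace and tap interface names seen earlier in this log (the port 67/68 filter keeps it to DHCP traffic):
      ip netns exec qdhcp-cd097fbe-4bd3-4c2e-9a5f-4c610ea19bae \
        tcpdump -n -A -i tap0d97d609-03 'port 67 or port 68'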
22:53 jetole OK. I started it without writing
22:53 jetole I saw some ip6
22:53 jetole some bootpc which I believe is dhcp
22:53 jetole arp who-has 10.2.6.1 (router)
22:53 jetole that keeps repeating
22:54 jetole the console is showing cloud-init data with the proper ip
22:54 jetole and I just saw a bootpc and bootps packet
22:54 rmoe so the vm got an ip this time
22:54 jetole more of those arp
22:54 rmoe this makes no sense to me then :(
22:55 jetole ah, with the router down (which was an earlier problem) and hence no default gateway, that explains why the url_helper.py error happened
22:55 jetole I see dhcp request / replies
22:56 jetole so it looks like it's getting dhcp on the local net but no router/gateway which was the problem I was having before we started looking into the dhcp
22:56 rmoe ok, so dhcp is fixed even though we don't know why...
22:56 jetole I may know why
22:57 rmoe oh yeah?
22:57 jetole Im going to have to bury my head in the mud after admitting this
22:57 jetole I wrote a bash script that did service XXX start for all applicable openstack services including ceph and mcollective and crontab'd it every minute on every node.
22:58 rmoe ah, so when the tap interface was down it was because the agent was being restarted?
22:58 jetole in theory it shouldn't have been
22:58 jetole service is supposed to do nothing if you start a service that is already running
22:58 jetole in theory
22:59 jetole <- embarrassed!
23:00 jetole I did it because I would occasionally have heat-engine die on various nodes, and after a reboot it seemed random whether a ceph node would come back up unless I manually logged in and launched it
23:01 rmoe ok, so let's figure out the l3-agent issues then
23:01 jetole thank you
23:01 jetole what do I do?
23:01 rmoe let's see what's in your qrouter namespace first
23:01 rmoe interface-wise i mean
23:02 skonno joined #fuel
23:02 jetole qrouter-9bdc9808-0f8c-473d-97af-14e88ed15c6e
23:02 jetole that wasn't there earlier but may have been for the same reason
23:02 jetole that and haproxy
23:02 rmoe what interfaces are configured inside that?
23:03 jetole how do I see that?
23:06 jetole rmoe, I just ran `neutron agent-list` and it shows DHCP agent down on node-6 and up on node-4. node-4 was returned by `pcs status` but that's the only instance I have in agent-list where it shows a service as not alive
23:06 rmoe there is an l3 and dhcp agent on each node
23:06 rmoe but only one runs at any given time
23:06 rmoe controlled by pacemaker
23:06 jetole ok
23:06 rmoe that's why only one of the 3 will ever show as being up
23:07 jetole [root@node-6 ~]# ip netns exec qrouter-9bdc9808-0f8c-473d-97af-14e88ed15c6e ip -o l
23:07 jetole 22: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN \    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
23:08 skonno joined #fuel
23:09 jetole I just restarted openvswitch and now I am showing 11 interfaces from that command plus lo, so 12 total
23:09 HeOS joined #fuel
23:10 emagana joined #fuel
23:10 jetole and managed to crash horizon in the process
23:10 rmoe impressive, don't think I've seen that one before
23:10 jetole I'm going to reboot the lot except the osd only machines
23:10 jetole Error: Unauthorized: Unable to retrieve instances
23:10 jetole that modal popup appeared 10+ times after I restarted the service
23:12 rmoe you might just have to log out and log back into horizon
23:12 rmoe I've seen weird issues like that before
23:12 jetole I already started the reboot on 7/9 nodes. I have two osd only machines I skipped and two osd+controller
23:13 rmoe ok
23:15 jetole dell poweredge R series servers take a good 7-10 minutes to boot so knock on wood
23:15 jetole well node 1,4 and 5 are back up
23:16 jetole and 9
23:16 jetole only 6 is still down
23:34 jetole rmoe, my systems are all back up, ceph is healthy. neutron l3 and dhcp agent look ok, I think (http://paste.openstack.org/show/121747/). heat engine is not running on any of the controllers. It should be running on all 3. Right?
23:35 jetole oh and the horizon issue seems to be resolved
23:35 rmoe anything in the logs for heat?
23:35 jetole on the controllers? let me check
23:35 rmoe now that the l3-agent is working your VMs should be able to access the internet and floating IPs should work
23:35 rmoe yep, on the controllers
23:36 jetole isn't heat-engine required for cloud-init? should I boot a test vm before I look into heat engine or look into heat engine before I boot a test vm?
23:36 rmoe no, heat isn't required for cloud-init
23:36 jetole ok. let me see...
23:37 jetole well my instance went straight to error
23:38 jetole two different instances. One on ubuntu and one on cirros
23:38 jetole which log should I start with?
23:38 rmoe horizon will show you the error message if you click on the vm name
23:38 jetole File "/usr/lib/python2.6/site-packages/nova/scheduler/filter_scheduler.py", line 108, in schedule_run_instance raise exception.NoValidHost(reason="")
23:39 rmoe what does nova service-list show?
23:40 jetole 4 instances of nova-compute down. everything else is up
23:40 jetole that's odd
23:41 rmoe weird, check the nova.log on one of the computers
23:43 jetole looks like no amqp available. I think maybe compute nodes were up before controller nodes
23:44 rmoe try restarting one of the compute services and see if it comes up
23:44 rmoe also check rabbitmqctl cluster_status from one of the controllers
23:45 jetole I started nova compute on one of the compute nodes and checked nova service list and it seems to be up
23:45 jetole let me do the rabbit thing
23:45 rmoe ok, you might just need to restart the compute services, normally they will reconnect to rabbit themselves though
23:46 jetole rabbitmq looks good. yeah I think it was just the controllers were still down when the compute nodes came up
23:47 jetole they are all up via service-list. going to launch another instance
23:51 jetole rmoe, does the name ost1_test-server-smoke mean... oh... health tests
23:52 rmoe yep, it creates its own flavors and whatnot when it runs the tests
23:55 harybahh joined #fuel
23:55 jetole well I added a floating IP
23:55 jetole and connected to it
23:55 jetole I'm at home and it's a real floating ip
23:55 jetole so yay
23:56 jetole autoscaling failed the test. I don't really care tonight
23:57 jetole everything seems to be working at the moment
23:59 rmoe excellent
23:59 jetole rmoe, thank you for all the help
