
IRC log for #fuel, 2014-02-26


All times shown according to UTC.

Time Nick Message
01:33 rmoe joined #fuel
01:52 rmoe joined #fuel
02:00 rongze joined #fuel
04:38 IlyaE joined #fuel
04:44 rongze joined #fuel
05:15 rongze joined #fuel
05:16 rongze_ joined #fuel
05:43 mihgen joined #fuel
06:06 xarses joined #fuel
06:11 vkozhukalov_ joined #fuel
06:15 rongze joined #fuel
06:18 Ch00k joined #fuel
06:21 rongze joined #fuel
06:50 saju_m joined #fuel
07:04 Ch00k joined #fuel
07:34 mrasskazov1 joined #fuel
07:42 rvyalov joined #fuel
07:48 vkozhukalov_ joined #fuel
07:58 e0ne joined #fuel
08:19 Ch00k joined #fuel
08:27 mihgen joined #fuel
09:07 miguitas joined #fuel
09:12 vk joined #fuel
09:18 evgeniyl joined #fuel
09:29 rvyalov joined #fuel
09:52 bogdando joined #fuel
09:54 tatyana joined #fuel
10:19 Ch00k joined #fuel
10:25 Ch00k joined #fuel
10:53 Ch00k joined #fuel
11:13 vk joined #fuel
11:17 pbrooko joined #fuel
11:56 dmit2k Hello! Can anyone please explain one issue with the Fuel 4.0 GUI? I'm configuring an HA cluster with a CEPH backend for Cinder and Glance. Tried both with the Ephemeral and RADOS options on/off. When configuring disks for nodes I am always forced to assign space for Virtual and Image storage
11:56 dmit2k As I thought, if I use CEPH then I do not need any other storage types, do I?..
12:16 Ch00k joined #fuel
12:17 Ch00k joined #fuel
12:30 TVR___ joined #fuel
12:47 MiroslavAnashkin dmit2k: Hello! Ceph is the backend for Cinder and Glance, but Glance, Libvirt and other services require some (potentially large) disk space for the image cache
12:49 MiroslavAnashkin dmit2k: For some operations with images OpenStack uses curl - and it also requires some space for temporary files
12:50 mattymo for file injection, I believe glance needs plenty of cache space
13:03 Ch00k joined #fuel
13:17 Dr_Drache joined #fuel
13:19 justif joined #fuel
13:21 e0ne joined #fuel
13:46 mihgen joined #fuel
13:52 Ch00k joined #fuel
14:07 Ch00k joined #fuel
14:14 dmit2k MiroslavAnashkin: thanks for the reply! Is it somehow covered in the Fuel documentation? Could not find anything about this issue :(
14:16 dmit2k MiroslavAnashkin: would be very grateful for some more info regarding this - at least where to dig
14:18 MiroslavAnashkin Actually, it is a common OpenStack issue and it is not covered well in the OpenStack docs
14:19 MiroslavAnashkin The info about required disk space is scattered among the different OpenStack documentation chapters
14:21 MiroslavAnashkin And "out of free space" issues may not even appear in the OpenStack logs, at least in the case of curl failures.
14:21 dmit2k MiroslavAnashkin: OK... Any hint how these space requirements should be calculated, at least approximately?
14:21 Dr_Drache it seems openstack is having the "opensource" issue.
14:22 dmit2k MiroslavAnashkin: and how does it affect HA in general?
14:23 MiroslavAnashkin Twice your largest image filesystem size.
14:23 dmit2k MiroslavAnashkin: I have experience with CloudStack and there is a very similar requirement for temp space for image conversions etc, but it uses NFS and those files are safe to lose if something goes wrong
14:24 MiroslavAnashkin Glance expands RAW images to full size - including free space on virtual disks
14:25 dmit2k MiroslavAnashkin: OK, does it mean that if I use only RAW format no conversions will take place?
14:25 MiroslavAnashkin RAW is preferable for Ceph backends. Qcow - for LVM backends.
14:26 MiroslavAnashkin And Glance does not expand qcow images to full size.
14:28 MiroslavAnashkin So - it is recommended to use raw with Ceph backends, but it requires a lot of temporary space on both controller and compute
14:28 TVR___ Verification failed.
14:28 TVR___ Network verification on Neutron is not implemented yet
14:28 MiroslavAnashkin If you use a 1 TB filesystem size in the image - please reserve at least 2 TB
14:29 TVR___ do we have a timeline for this being implemented?
14:29 TVR___ neutron with GRE
14:30 Dr_Drache MiroslavAnashkin, it makes me sad every time I hear that explained.
14:30 TVR___ is this a 4.1 release, or longer?
14:30 Dr_Drache MiroslavAnashkin, BUT, ceph supports sparse RAW
14:32 dmit2k MiroslavAnashkin: actually, I'm a bit surprised with this, I thought if CEPH is used both for Glance and Cinder it uses copy-on-write (at least as was stated in https://ceph.com/cloud/ceph-and-mirantis-openstack/)
14:32 e0ne_ joined #fuel
14:32 MiroslavAnashkin TVR___: Verification works only before the environment is deployed. And yes, there is no Neutron+GRE verification in 4.0. It is implemented in 4.1
14:33 MiroslavAnashkin The OpenStack workflow is as follows:
14:33 TVR___ ok.. cool.. I can wait... it would have helped me with an issue I am having.. but I have work arounds
14:33 TVR___ thanks
14:34 Dr_Drache dmit2k, it still uses RAW, but sparse raw (it supports it, I don't know if it's the default) and some of those features, dmit2k, are from emperor, which isn't being used.
14:34 MiroslavAnashkin Glance on the controller downloads the image from its backend to the image cache. And, if necessary, makes an automatic image conversion. At this time it expands raw to full size.
14:35 Dr_Drache MiroslavAnashkin, so we are not using sparse raw?
14:35 MiroslavAnashkin After that, it gives the link to the image to compute. Compute downloads the image (using curl) to its image cache
14:36 MiroslavAnashkin During the download, curl uses temporary space and then copies the image to the cache
14:37 dmit2k MiroslavAnashkin: sounds weird :( the link above states: "Patch Nova: clone RBD backed Glance images into RBD backed ephemeral volumes, pass RBD user to qemu-img"
14:37 MiroslavAnashkin And only then Cinder gets the image and stores it in its backend.
14:39 MiroslavAnashkin And yes, if you use the Ceph RBD backend - this long image conversion chain is necessary only when you upload a new image to Ceph or export it back out as an image
14:41 MiroslavAnashkin So, even with Ceph, the cache is necessary at least for the initial image upload to Ceph. After the image is uploaded - the RBD backend is used.
14:42 MiroslavAnashkin But in this case Ceph requires qcow images to be expanded to full size inside the Ceph OSD. It is why raw is preferable for Ceph - faster cloning without image expansion
14:42 dmit2k MiroslavAnashkin: OK, that sounds more logical to me :) How long does this cache usually remain there? Or is it deleted immediately?
14:44 MiroslavAnashkin The cache is an ordinary mount point. With Mirantis OpenStack it points to a dedicated partition
14:44 Dr_Drache MiroslavAnashkin, that unused virtual disk space bug, that a 4.1 fix?
14:47 dmit2k MiroslavAnashkin: I mean after conversion of an image on the controller node by Glance and after downloading and storing it to backend on compute -- those cache files are purged immediately?
14:47 MiroslavAnashkin As soon as the image upload/download is finished and the image is placed at the target location - both Glance and curl clear their temporary files from the cache
14:48 MiroslavAnashkin And they clear them even in case of error
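
For reference, the raw-vs-qcow advice above can be applied before an image ever reaches Glance. A minimal sketch, assuming qemu-img and the glance client are available and using hypothetical file names; note the resulting raw file occupies the image's full virtual size, which is the temporary space cost discussed above:

    qemu-img info cirros.qcow2                                # virtual size vs. actual file size
    qemu-img convert -f qcow2 -O raw cirros.qcow2 cirros.raw  # expand to raw locally
    glance image-create --name cirros-raw --disk-format raw --container-format bare --file cirros.raw
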
14:50 MiroslavAnashkin Dr_Drache: Which bug?
14:52 dmit2k MiroslavAnashkin: thanks for the explanation, but now I'm more frustrated: I deployed 3 instances from the sample image I got after the initial cluster deployment by Fuel (each of 20GB). Now Horizon tells me in Hypervisor Summary that disk usage is 60GB on the compute node
14:52 dmit2k MiroslavAnashkin: still I see that those volumes and images were deployed to the CEPH backend
14:54 MiroslavAnashkin Yes, Ceph reserves the full image size when you clone an image. Do you use Ceph ephemeral disks?
14:55 dmit2k MiroslavAnashkin: yes, I do. But from the compute console I see that /var/lib/nova uses only 747MB
14:55 MiroslavAnashkin Yes, Ceph has built-in deduplication
14:57 dmit2k MiroslavAnashkin: OK, maybe I just misunderstand some terms -- what exactly does Horizon show me in Hypervisor Summary about Disk Usage? It says 60GB of 123GB
14:58 dmit2k MiroslavAnashkin: my ceph storage is 27TB, 123GB is the Virtual Storage space created on the Compute node
14:59 TVR___ it would be cool if the horizon dashboard could display the storage total from ceph when using it as a backend
14:59 MiroslavAnashkin Horizon reports virtual sizes. It takes into account neither data compression nor deduplication.
15:01 e0ne joined #fuel
15:02 TVR___ so... forgive my ignorance here... but what is the virtual disk size used for if ceph is the back end for both volumes and images?
15:02 jobewan joined #fuel
15:04 dmit2k MiroslavAnashkin: I'm slightly missing your point -- if the instance volumes are stored in ceph, why does Horizon say that my compute node utilises 60GB of its local storage? 60GB is exactly the size of my three instance volumes
15:06 Dr_Drache MiroslavAnashkin, the bug where if I have ceph for my full back end, there are still partitions created that remain unused
15:07 TVR___ on a completely unrelated note.. the network verification server pictures look like APC UPS's
15:07 TVR___ heh
15:15 MiroslavAnashkin Horizon shows the worst case. Yes, initially your image clones require less space, but as differences accumulate between the clones, the actual occupied size will approach the virtual size.
15:17 MiroslavAnashkin Ceph uses deduplication to reduce traffic between the nodes, which makes the overall filesystem faster.
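
For reference, a rough way to compare the virtual (provisioned) sizes Horizon reports with what Ceph actually stores; a sketch only, where the pool name "volumes" and the image ID are assumptions about the deployed layout:

    rbd info volumes/volume-<uuid>   # "size" here is the provisioned (virtual) size
    rados df                         # actual space used per pool across the cluster
    ceph -s                          # overall cluster capacity and health
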
15:19 MiroslavAnashkin Dr_Drache: No, both these bugs are still in progress
15:20 Dr_Drache MiroslavAnashkin, ok. good to know.
15:21 MiroslavAnashkin Both are targeted to Fuel 5.0
15:25 dmit2k MiroslavAnashkin: sorry, seems that you missed my point.... I know how ceph works as I've been using it for quite a long time now. My question is what Horizon shows on the Hypervisors / Disk Usage page? I see the list of compute nodes there and each compute node has its local Virtual Storage on the HDD partition (as configured and deployed by Fuel)
15:26 dmit2k MiroslavAnashkin: say I have a 123GB partition on the compute node and the Horizon / Hypervisors page shows it correctly
15:27 IlyaE joined #fuel
15:27 dmit2k MiroslavAnashkin: but after creation of 3 instances (with volumes of 20GB each) this page shows Storage (used) on this compute node as 60GB
15:28 dmit2k MiroslavAnashkin: so my doubt is WHY? Those volumes were created in the RBD backend
15:30 TVR___ yes, that too would be a good point...
15:31 TVR___ even if it cannot list the storage size of the ceph backend, it at least should not show the usage from the wrong pool
15:31 MiroslavAnashkin Since with time the actual size of these volumes may grow to the virtual size - it is better not to report the actual size but the maximum possible.
15:32 MiroslavAnashkin Horizon cannot predict how you are going to use these volumes
15:32 dmit2k MiroslavAnashkin: TVR___ got my point right
15:32 MiroslavAnashkin And it is also necessary for quota calculations
15:33 dmit2k MiroslavAnashkin: why does it show that I'm using 60GB of 123GB while my CEPH cluster is 27TB
15:33 dmit2k MiroslavAnashkin: 123GB is my local cache partition on the compute node
15:34 MiroslavAnashkin Looks like a bug.
15:34 dmit2k MiroslavAnashkin: this confuses me much as I'm new to OpenStack and not sure if things go the right way
15:34 TVR___ as another example... I can create a volume of 800G far exceeding my virtual volume size.... so, what does the virtual storage do exactly when using ceph as a backend for both volumes and images? I can create a file in my instance using that 800G volume that is also larger than my virtual disk
15:35 dmit2k MiroslavAnashkin: this is why I started asking about the purpose of those Virtual Storage and Image Storage partitions created by Fuel :)
15:36 dmit2k TVR___: yes, exactly!
15:36 dmit2k MiroslavAnashkin: any comment?...
15:37 Dr_Drache dmit2k, this is the bug i am talking about
15:37 MiroslavAnashkin https://bugs.launchpad.net/fuel/+bug/1262313 and https://bugs.launchpad.net/fuel/+bug/1262312
15:37 Dr_Drache those are not used with ceph-backed ephemeral.
15:38 MiroslavAnashkin These are the same bugs Dr_Drache asked about some time ago.
15:38 dmit2k MiroslavAnashkin, Dr_Drache: OK, so I can ignore those partitions and set them to the minimal possible size when deploying the cluster?
15:39 MiroslavAnashkin dmit2k: Yes
15:40 dmit2k MiroslavAnashkin, Dr_Drache: OK, thank you! So am I correct that those partitions will be used only if I have to deploy a volume from a non-RAW image or ISO?
15:40 Dr_Drache no
15:40 TVR___ I generally resize mine to 30G (as I wasn't sure what exactly it did, but knew I wasn't using it) and the rest on / so logs or whatnot wouldn't fill up too quickly...
15:40 Dr_Drache they will only be used if you use cinder LVM
15:41 Dr_Drache for ephemeral.
15:41 dmit2k Dr_Drache: even better :)
15:41 Dr_Drache if you use a full CEPH back end, they stay empty
15:41 dmit2k Dr_Drache: but they are not
15:42 Dr_Drache are you using ceph for "everything"
15:42 dmit2k Dr_Drache: now it uses 747M, and has the following structure:
15:42 dmit2k root@node-17:~# du -m /var/lib/nova/instances/
              0       /var/lib/nova/instances/locks
              1       /var/lib/nova/instances/4728b893-dfe0-4c5b-b308-f0b33702594b
              1       /var/lib/nova/instances/9314bdb6-969f-4c55-93a6-b39ff11dc11d
              1       /var/lib/nova/instances/16a7a1d8-d91a-41c9-9c78-1d162f0a78e2
              679     /var/lib/nova/instances/_base
              680     /var/lib/nova/instances/
15:42 dmit2k Dr_Drache: yes, I configured Fuel to use RBD for everything
15:42 Dr_Drache seems to be minimal cache. as far as I can tell, it should be empty
15:43 mihgen joined #fuel
15:43 dmit2k So I can ignore the stats on Horizon / Hypervisor page and consider it as a bug?
15:43 MiroslavAnashkin From the bug:  we should set image_cache_max_size to 0 when rbd backend for Glance is enabled.
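
As a side note, the workaround quoted from the bug would presumably end up in glance-api.conf on the controllers; a minimal sketch, not the exact Fuel-generated file:

    # /etc/glance/glance-api.conf (excerpt)
    [DEFAULT]
    # per the bug report quoted above: with an RBD backend the local
    # image cache is not needed, so cap it at zero
    image_cache_max_size = 0
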
15:43 Dr_Drache dmit2k, yes, for now.
15:44 dmit2k Thank you very much!
15:44 TVR___ 5.0 will be awesome!
15:44 Dr_Drache dmit2k, the bug, as MiroslavAnashkin is saying, uses it for cache that's not used. basically
15:44 TVR___ so many features, and so many bugs rooted out
15:44 Dr_Drache well, IMO, just the upgrade to firefly ceph is nice.
15:45 amartellone joined #fuel
15:45 dmit2k will read the bugs, didn't think my issue was actually a bug, I just thought that I misunderstood the Fuel logic
15:45 Dr_Drache really wish we could get emperor.
15:45 dmit2k about Emperor - is it going to appear in 4.1?
15:45 Dr_Drache nope
15:45 dmit2k sad
15:45 Dr_Drache emperor is not an "LTS" so mirantis won't include it.
15:46 Dr_Drache at least, that's what I understand.
15:46 TVR___ 0.67 through to... icehouse? 0.72 then?
15:46 dmit2k by the way, will I break everything if I upgrade to Emperor myself?
15:46 Dr_Drache TVR___, icehouse and firefly.
15:46 Dr_Drache for 5.0
15:46 Dr_Drache and ubuntu 14.04
15:46 dmit2k I'm using async replication so Emperor is minimal for me
15:47 Dr_Drache dmit2k, overall, emperor is "faster" even with async.
15:47 dmit2k so can I upgrade it manually on my cluster or will things break?
15:47 Dr_Drache from what I have figured out, there is a large reduction in overhead.
15:48 Dr_Drache I can't say, I have no idea how to even approach that.
15:48 MiroslavAnashkin http://www.mirantis.com/blog/ceph-mirantis-openstack/
15:48 Dr_Drache i'm a n00b
15:48 MiroslavAnashkin We want to use Emperor
15:50 e0ne joined #fuel
15:50 dmit2k MiroslavAnashkin: in your opinion, will I break everything if I manually upgrade to Emperor on my Fuel cluster?.. Just substitute the CEPH repo and proceed with apt-get upgrade?
15:51 Dr_Drache MiroslavAnashkin, I recall being told, emperor won't be used because it is not LTS, and is being skipped for firefly
15:52 MiroslavAnashkin But for 4.1 we still use Ceph 0.67.5
15:54 MiroslavAnashkin dmit2k: Hmm - it is a question for angdraug
15:54 TVR___ joined #fuel
15:56 MiroslavAnashkin dmit2k: It is still about 8:00 AM in California, angdraug should appear in 2 hours
15:56 MiroslavAnashkin dmit2k: Appear in this chat
15:57 rvyalov joined #fuel
16:02 Dr_Drache MiroslavAnashkin, silly question, if you want to use something, why don't you?
16:04 MiroslavAnashkin We must at least test it before usage
16:05 dmit2k MiroslavAnashkin: OK, thanx!
16:05 Dr_Drache I'll refrain from some comments
16:06 MiroslavAnashkin For instance, yesterday we downgraded the Open vSwitch version in 4.1 from 1.10 to 1.9 due to stability issues
16:09 dmit2k ppl, please suggest - should I go for pre-production with 4.1 once released and be able to upgrade to 5.0 in April?
16:09 dmit2k I'm in doubt whether Fuel will be able to upgrade the system painlessly
16:09 Dr_Drache dmit2k, I want to do the same.
16:10 TVR___ 4.1 ~IS~ slated to have fixes for some annoying bugs... so I think it is a better choice than 4.0...
16:10 Dr_Drache MiroslavAnashkin, I'd like to work on some more ubuntu fixes, not pushing, just let me know when there is something to try.
16:10 MiroslavAnashkin dmit2k: No, we do not expect we'll be able to implement upgrade in 5.0
16:10 TVR___ as for 5.0, remember.. 4.x does not have an upgrade path but more of a migration path.. so prepare to dole out some hardware to do the move...
16:12 Dr_Drache makes it really hard to choose a production option with no upgrade path.
16:12 e0ne_ joined #fuel
16:13 dmit2k sad again
17:04 mihgen joined #fuel
17:06 saju_m joined #fuel
17:11 rmoe joined #fuel
17:14 kobier joined #fuel
17:18 aleksandr_null joined #fuel
17:19 xarses joined #fuel
17:27 Ch00k joined #fuel
17:36 angdraug joined #fuel
17:43 dmit2k angdraug: Hello! I was advised by MiroslavAnashkin to ask you my question :)
17:44 dmit2k MiroslavAnashkin: your opinion - will I break everything if I manually upgrade to Emperor on my Fuel cluster?.. Just by adding the official CEPH repo and going through apt-get upgrade?
17:51 angdraug dmit2k: why do you want Emperor?
17:51 angdraug it's not an LTS release of Ceph
17:51 angdraug and there were reports of breakage on upgrade to Emperor on ceph-users@ ML
17:52 dmit2k angdraug: I started using CEPH in production from 0.6x and later seamlessly upgraded to 0.72
17:52 angdraug in other words, in theory you should be fine and Fuel doesn't do anything specific to Dumpling that would break with Emperor, but I still don't think it's a good idea
17:52 dmit2k angdraug: it is faster and has less overhead
17:52 angdraug well if you feel adventurous, sure, give it a shot
17:53 dmit2k I also want to utilise async replication feature
17:53 angdraug not something we can officially support or endorse, but I don't see any specific reason for it to fail
17:53 angdraug by the way as far as I know you should be able to use radosgw-agent from Emperor with Dumpling
17:53 dmit2k I was asking in case Mirantis provides a kind of patched CEPH
17:54 angdraug the only patch we have is a fix that should already be there in Emperor
17:54 angdraug http://tracker.ceph.com/issues/5426
17:54 dmit2k OK, thank you!
17:55 angdraug no problem :)
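
For context, the manual route dmit2k describes would look roughly like this on each Ubuntu storage node; a rough sketch only, not a supported procedure, and the repository line and "precise" codename are assumptions:

    # swap the Ceph repo for the Emperor one and upgrade the packages
    echo "deb http://ceph.com/debian-emperor/ precise main" > /etc/apt/sources.list.d/ceph.list
    apt-get update && apt-get upgrade
    # then restart the Ceph daemons (monitors first, then OSDs), one node at a time,
    # waiting for "ceph -s" to report HEALTH_OK before moving on
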
17:56 dmit2k I have one more question which I'm afraid was asked a thousand times before :) Infiniband :)
17:57 dmit2k is it possible to utilise Infiniband for CEPH traffic in a Fuel-deployed cluster?
17:58 dmit2k I don't need any Mellanox plugins I think, as we use older cards without ethernet switch support
17:59 dmit2k is it very difficult to patch things on the already deployed cluster to route all CEPH traffic through Infiniband adapters?...
18:00 dmit2k maybe just some hints or a few points where and how it should be done
18:02 dmit2k at the moment I have 2 NIC adapters in each server, eth0 for PXE and eth1 for everything else, storage network untagged (no VLAN)
18:02 Dr_Drache dmit2k, you said apt-get, you deployed as ubuntu?
18:02 dmit2k Dr_Drache: yes
18:02 Dr_Drache what hardware?
18:02 dmit2k Supermicro twin servers
18:03 Dr_Drache blah, I can't get it to deploy the controller on my new dells.
18:03 Dr_Drache which is funny, cause supermicro is dell's OEM.
18:03 dmit2k if you mean that bug with swapping eth names - I didn't hit this
18:03 Dr_Drache no, i mean, it installs, but is unbootable
18:03 dmit2k quite strange, symptoms?
18:05 Dr_Drache grub loads super slow, like 3 sec to load the grub, doesn't auto boot (no countdown timer) then about 5min blank screen then unable to locate drive.
18:05 Dr_Drache same hardware takes a vanilla ubuntu install perfectly, and CentOS.
18:05 dmit2k do you boot from hdd directly or still on PXE?
18:06 Dr_Drache after install, even with PXE, you boot from the HDD.
18:06 Dr_Drache the PXE redirects.
18:06 Dr_Drache same status either way though.
18:06 dmit2k not really I suppose, I don't see any grub menu when PXE-booting
18:06 dmit2k can check once again
18:07 Dr_Drache well, it does, but there is zero wait, and "quiet" boot, so you won't see it.
18:10 dmit2k OK, maybe, but no problem with booting
18:10 dmit2k BIOS update is the only idea
18:10 dmit2k I would also try to disable remote serial access in BIOS
18:11 dmit2k I had problems with slow booting earlier because of serial console activated in BIOS
18:11 Dr_Drache slow booting I could handle, this is no booting. :P
18:11 dmit2k that may be the same reason, would you try?
18:11 Dr_Drache I will in a few, yeah
18:12 Dr_Drache need to save some data from this cluster before I destroy it
18:12 dmit2k another thing may be ACPI-related
18:13 dmit2k I saw it somewhere as well
18:13 Dr_Drache damn, wish dell didn't require me to install windows software to get bios updates.
18:13 dmit2k :-D
18:15 TVR___ ummm... you DO remember dell's ad campaign not too long ago was a stoner saying "dude! You're gonna get a Dell!!"
18:16 TVR___ they may not have realized yet the numbers for open source in the market...
18:17 e0ne joined #fuel
18:18 rvyalov joined #fuel
18:19 dmit2k does anyone know if 4.1 will be released in time as proposed? 2014-02-28
18:22 Dr_Drache i think it's 3-3-13
18:22 Dr_Drache err
18:22 Dr_Drache 14
18:22 e0ne joined #fuel
18:22 Dr_Drache is the last date I was aware of
18:26 dmit2k sounds great... so I still have some time to play with infiniband and probably destroy my test cluster..... :)
18:29 dmit2k BTW, is it possible to somehow use software raid for the system partition in Fuel?
18:32 dmit2k Fuel was so clever to automagically create journal partitions for all my OSD drives, so I thought maybe if I create two identical System partitions on the first two drives it would make them soft RAID :) Unfortunately it didn't
18:33 dmit2k but that could be a nice feature
18:43 rmoe_ joined #fuel
18:45 MiroslavAnashkin dmit2k: It was possible with Fuel 2.x after Cobbler snippets modification. With 4.x you have to modify the Cobbler templates as well and use the CLI to run a separate provisioning stage
18:48 dmit2k am I right that I can use the CLI on Fuel only before the whole cluster is provisioned and can't make any changes later, except for add / remove of compute nodes?
18:49 MiroslavAnashkin Dr_Drache: It is interesting, but we gathered 4 people and found everything is correct in your logs. That was why I disappeared. And yes, we know - anyway, GRUB fails to find the boot partition in your Ubuntu deployment.
18:49 MiroslavAnashkin dmit2k: Yes, you are right.
18:49 Dr_Drache MiroslavAnashkin, I was just explaining to dmit2k
18:50 MiroslavAnashkin dmit2k: The only way - remove the node, add it back to the cluster as new, make the necessary configurations and re-deploy.
18:50 Dr_Drache maybe he had done something I missed
18:58 dmit2k MiroslavAnashkin: OK, then what should I do in case of a controller node's physical failure? Do I have no way to redeploy it?
18:59 dmit2k Dr_Drache: I suppose we have different systems with different BIOSes, as Supermicro does produce custom servers for DELL, it is not purely rebranded hardware
18:59 dmit2k they just look similar by concept
19:00 Dr_Drache dmit2k, i beg to differ.
19:00 Dr_Drache since my dells, on the boards, are branded supermirco
19:00 dmit2k yes, they are produced by them, but those are not just clones of existing SM models
19:00 Dr_Drache hell, I have a home lab, with a dozen or so dells, that you open them up, and they are labled "supermirco"
19:01 Dr_Drache micro
19:01 Dr_Drache wow, i cannot type.
19:01 Dr_Drache some of them are.
19:01 Dr_Drache the dell C-series are.
19:01 Dr_Drache (they are direct rebrands of some of the Twins.)
19:01 dmit2k I updated BIOS for my servers with a bootable USB flash with FreeDOS
19:02 dmit2k had no need for windows :)
19:02 Dr_Drache yea
19:02 Dr_Drache well, if I take a server down for an update, I want all systems updated
19:03 Dr_Drache so, PITA
19:03 dmit2k I suggest you update everything, including IPMI modules, not just BIOS
19:03 Dr_Drache that's what i am saying.
19:04 dmit2k sounds very strange to me as I have very good experience with SM and some made-for-DELL models
19:05 dmit2k quite easy to manage and support
19:05 Dr_Drache sure, if you only use windows, or cent.
19:05 dmit2k never used windows in non-virtualised mode ;) and using mostly Ubuntu
19:06 IlyaE joined #fuel
19:06 dmit2k but sh*t happens
19:06 Dr_Drache dell's management software has to be "hacked" to run on anything but cent or RHEL
19:18 MiroslavAnashkin dmit2k: Simply boot the new node to bootstrap and add it to the already deployed environment as a controller, make the necessary configurations, then click Deploy
19:18 MiroslavAnashkin Should work at least for HA environments
19:19 MiroslavAnashkin And do not forget to remove the failed node from the environment in the Fuel UI
19:21 dmit2k MiroslavAnashkin: really? great! I found in the FAQ that it is not possible to add a controller afterwards and this option will even be removed in 4.1+
19:25 dmit2k "Compute and Cinder nodes can be redeployed in both multinode and multinode HA configurations. However, controllers cannot be redeployed without completely redeploying the environment."
19:25 MiroslavAnashkin For non-HA - yes. For HA - I'll check, but it is an HA feature - you can change the number of controllers.
19:26 vkozhukalov_ joined #fuel
19:28 dmit2k this is what the manual states
19:31 MiroslavAnashkin Hsss. It is a hidden feature, because there are possible issues with the new Neutron from Havana. If you don't know how to fix these issues after they happen - do not try to add controllers.
19:32 TVR___ I am adding 2 controllers + OSD to my HA 3 controller + OSD cluster now
19:33 TVR___ I will let you know if it fails
19:33 TVR___ I am running rados bench to see what performance changes happen after adding disks to the ceph backend
19:34 dmit2k fine! :) at least there must be a way to redeploy a broken one, not a big problem if I can't increase the number of controllers
19:34 TVR___ I need to get swift bench working.. but the install has had its issues for me
19:35 TVR___ I do know you can edit the mysql to drop nodes out of the cluster manually
19:35 rvyalov joined #fuel
19:35 TVR___ not sure how well fuel would like that
19:38 dmit2k would be nice to get a more in-depth explanation for this statement: "Compute and Cinder nodes can be redeployed in both multinode and multinode HA configurations. However, controllers cannot be redeployed without completely redeploying the environment."
19:39 crandquist joined #fuel
19:39 dmit2k I saw some fixed bugs for disallowing creation of an additional controller once the cluster is deployed
19:39 TVR___ they probably mean the SAME node cannot be redeployed....
19:40 dmit2k since 4.1 it will not be allowed
19:40 TVR___ so.. I guess if a node (controller) fails... either bring it back, or change the nic?
19:40 TVR___ it does seem odd to me
19:41 dmit2k even more - it appears that you have to somehow restore its content
19:41 dmit2k yourself
19:41 dmit2k as you can't redeploy it once again
19:42 dmit2k weird if so
19:42 TVR___ I will need to bang on this a bit to see what's possible
19:43 TVR___ if these additional controllers install completely... maybe I will dd the drive as if I had a catastrophic failure and see if 1.) the cluster is actually HA, and 2.) if I can redeploy it anew....
19:44 dmit2k would be nice... or it is barely suitable for production without additional recovery policies
19:44 TVR___ dd the drive of say.. node-1
19:44 TVR___ I believe this will be my next step, yes
19:46 dmit2k please report results! :)
19:46 TVR___ FYI, and this should be documented somewhere.... if you install onto a server with multiple NICs in it... even if disconnected... it will SUBSTANTIALLY increase build time, as the puppet modules look at every NIC and wait for every NIC to get connectivity... not necessarily a bad thing, just be aware...
19:47 Dr_Drache yea
19:47 Dr_Drache noticed that
19:47 Dr_Drache if they are disconnected and not "in use"
19:47 Dr_Drache they should be passed over
19:47 TVR___ I watched the build log on a server with (2) x 4NIC cards plus the 4 built-in NICs and it was painful, but at least I now know the reason.
19:48 TVR___ maybe do an ethtool on them, and if not active, pass on them?
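
A rough sketch of the kind of check TVR___ suggests - reporting link state per NIC with ethtool so interfaces without a cable could be skipped; purely illustrative, not what the Fuel puppet modules actually do:

    for dev in /sys/class/net/eth*; do
        name=$(basename "$dev")
        echo "$name: $(ethtool "$name" | grep 'Link detected')"
    done
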
19:48 meow-nofer joined #fuel
19:48 vkozhukalov joined #fuel
19:49 Dr_Drache damn
19:49 Dr_Drache I didn't respond to vkozhukalov
19:50 TVR___ bad person !!
19:50 dmit2k https://blueprints.launchpad.net/fuel/+spec/disable-ability-to-add-controllers-in-ha
19:50 Dr_Drache just looked at the bug report.
19:51 Dr_Drache of course, they all seem to think my ubuntu problem has something to do with HP.
19:52 TVR___ OK.. so, not being able to deploy additional controllers... I agree and disagree with it...
19:53 TVR___ If you have 1 controller, you should not be allowed to add any more ==> YES
19:53 Dr_Drache aka, remove a large reason openstack exists.
19:53 Dr_Drache scalability.
19:53 Dr_Drache (sorry on my spelling)
19:54 dmit2k agree
19:54 TVR___ why? Because ceph cannot handle adding another server, as it will restart and not know who to trust.. so adding to one ceph mon will break it...
19:55 dmit2k with both of you -- it is quite difficult to transform a non-HA cluster (mostly used for testing) to full HA
19:55 Dr_Drache dmit2k, that's impossible.
19:55 dmit2k but extending HA cluster should be possible though
19:55 Dr_Drache so i'm told
19:55 Dr_Drache yea
19:56 dmit2k and there is no issue with CEPH
19:56 dmit2k I used to have 2 and 4 MONs with no problems
19:56 dmit2k played a lot with it
19:57 dmit2k but even if we can survive without increasing the number of HA controllers, it is still a great failure not to have an option to redeploy an existing one
19:57 dmit2k this is what Fuel actually stands for
19:57 dmit2k management of a cluster
19:57 Dr_Drache without the management.
19:58 Dr_Drache right now, it works well as a deployment tool for a static cluster.
19:58 dmit2k so I hope it was something "lost in translation" in the FAQ and the authors meant something different
19:58 bogdando joined #fuel
19:59 rmoe joined #fuel
20:00 dmit2k but it is quite logical to have a way of redeploying a broken controller if we still have an option to use Fuel to extend it with OSD or compute nodes
20:00 ruhe- joined #fuel
20:01 Ch00k_ joined #fuel
20:01 TVR___ so.. adding controllers should be limited to the final count being an ODD number...
20:01 TVR___ but
20:01 TVR___ you need to be able to add controllers.... as the best architectural design is.... controller + OSD on the ceph servers, and just compute on the others....
20:01 TVR___ I ~suppose~ having the initial 3 as controllers and all the rest of the ceph nodes be OSD only works, but there is no reason to be limited that way
20:01 TVR___ those would be my thoughts on the subject
20:03 angdraug_ joined #fuel
20:05 bogdando joined #fuel
20:05 angdraug joined #fuel
20:05 TVR___ 2 mons in ceph work, as do 4, sure. with 1 mon, however, you cannot add another. I have tried it many times, and it has never succeeded.
20:06 TVR___ I agree and am very big on the idea that any provisioning system needs to have the ability to expand the cluster.
20:13 alex_didenko1 joined #fuel
20:13 dmit2k TVR___: agree, but I'm more worried about not having a way to redeploy a broken controller
20:13 MiroslavAnashkin joined #fuel
20:14 TVR__ joined #fuel
20:15 dmit2k TVR___: agree, but I'm more worried about not having a way to redeploy a broken controller while there is also no option to configure software raid
20:21 TVR__ joined #fuel
20:44 mattymo joined #fuel
21:18 Arminder joined #fuel
21:18 bookwar1 joined #fuel
21:51 miguitas joined #fuel
21:55 e0ne joined #fuel
21:57 e0ne joined #fuel
21:58 e0ne joined #fuel
22:16 xarses dmit2k: odd, they only disabled adding controllers in the ui
22:16 xarses you can still do it in the cli by the looks of it
22:18 xarses TVR__ is gone but "and this should be documented somewhere.... if you install onto a server with multiple NICs in it..." is no longer true, I fixed that for 4.1
22:58 e0ne joined #fuel
23:30 dmit2k xarses: thanks for the comment, but what exactly is the reason for keeping this statement in the global FAQ == "controllers cannot be redeployed without completely redeploying the environment"?
23:35 xarses dmit2k: I'm trying to figure out why it was done in the first place still, we had a discussion about it that wasn't resolved before the patch was put in
23:36 xarses dmit2k: adding an additional controller requires that all of the controllers re-run their puppet (even if you are "redeploying" a failed controller)
23:36 dmit2k xarses: is there any proper way of replacing a faulty controller node? there are two possible scenarios - a) node fried and replaced by another one and b) HDD failure
23:36 xarses dmit2k: that is a fact ^^
23:37 xarses now there is confusion as to 1) whether or not it's idempotent 2) whether it does in fact re-run puppet on all of the controllers
23:37 dmit2k so there is no way of replacing a faulty one?
23:38 xarses 2) was discussed, and the code does send all controllers to puppet, but https://review.openstack.org/#/c/73546/ was merged anyway
23:38 dmit2k "idempotent"?
23:39 xarses Idempotence is the property of certain operations in mathematics and computer science, that can be applied multiple times without changing the result beyond the initial application. --Wikipedia
23:39 xarses i.e., re-running the controllers won't result in a broken or reset cluster
23:41 xarses dmit2k: so i think the change is wrong, i need to validate if the failed controller option works, if not i will raise a larger stink
23:41 dmit2k so all the configuration changes made initially by Fuel when deploying the cluster will stay intact (all the keys etc) after redeployment, but puppet will rebuild all three controllers?
23:41 xarses we will, in 5.0 or maybe as late as 5.1, rework our orchestrator so that we can perform these lifecycle cases more effectively
23:43 xarses dmit2k: anything that puppet was told to configure could get reset if it was changed later, like say the admin password
23:43 xarses things that it wasn't told to configure, or was told there are conditions to not replace, should stay intact
23:47 dmit2k so correct me if I'm wrong: 1) in case I have three controller nodes all combined with CEPH OSDs (all with real data already), then one fries and I ask Fuel to redeploy the faulty one,
23:48 dmit2k then Fuel will rebuild all three controller-OSD nodes keeping all previous settings, keys, new ceph accounts and data?
23:50 dmit2k and all configuration changes I did myself to the nodes will be purged?
23:52 dmit2k i.e. Fuel stores all the initial configuration settings including auth keys in its own DB, so it will do the clean install of all three controllers with initial configuration?
23:54 dmit2k is it correct? then what will happen to CEPH data, CEPH accounts (if I added some at some point) and ceph.conf?
23:54 xarses dmit2k: it's supposed to. With ceph, I wrote most of the manifests so I can speak to them: all of the config should stay intact, and the data drives on the system being rebuilt will be formatted
23:54 xarses once the OSD's are back online the data will backfill from the other replicas on the other controllers
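
For reference, a few standard Ceph commands that could be used to watch that backfill after the rebuilt node's OSDs rejoin; a sketch only:

    ceph -s            # overall health; degraded/backfilling placement groups show up here
    ceph -w            # follow recovery progress live
    ceph osd tree      # confirm the rebuilt node's OSDs are back up and in
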
23:55 dmit2k mmmm, so Fuel will not rebuild ALL three controllers, only deploy that replacement one with initial config?
23:55 xarses and any configs that are managed by puppet will be reset back to what puppet had
23:55 xarses it's more complicated than that, it re-runs puppet
23:56 xarses things that changed in the config sent to the controllers will be replaced
23:57 xarses of particular note is the configuration for corosync and haproxy; they need to be re-configured with the current information (IPs, node names) of the members of the cluster
23:57 e0ne joined #fuel
23:57 xarses but tasks like creating the database won't re-run because they test to see that there is a DB present
23:58 xarses fuel only "rebuilds" the dead controller
23:58 xarses as if it were a new controller
23:59 xarses but the other controllers have puppet run on them to ensure that settings that changed at the cluster are re-written to the surviving members
23:59 dmit2k now getting more clear, thanx!
