
IRC log for #gluster, 2014-02-10


All times shown according to UTC.

Time Nick Message
00:16 jporterfield joined #gluster
00:21 pdrakeweb joined #gluster
00:30 jporterfield joined #gluster
00:41 sputnik13net joined #gluster
00:45 overclk joined #gluster
00:50 dbruhn joined #gluster
01:02 kshlm joined #gluster
01:02 yinyin joined #gluster
01:03 vpshastry joined #gluster
01:11 tokik joined #gluster
01:25 ^rcaskey joined #gluster
01:31 hflai joined #gluster
01:38 lyang0 joined #gluster
01:45 atrius joined #gluster
01:47 jporterfield joined #gluster
01:53 bala joined #gluster
02:01 bala1 joined #gluster
02:11 recidive joined #gluster
02:12 jporterfield joined #gluster
02:19 jporterfield joined #gluster
02:37 jporterfield joined #gluster
02:42 jporterfield joined #gluster
02:48 jporterfield joined #gluster
02:52 bharata-rao joined #gluster
02:55 harish joined #gluster
02:59 harish joined #gluster
03:01 harish joined #gluster
03:06 pixelgremlins_ba joined #gluster
03:07 shubhendu joined #gluster
03:09 pdrakeweb joined #gluster
03:25 PacketCollision joined #gluster
03:27 kshlm joined #gluster
03:59 surabhi joined #gluster
03:59 dbruhn joined #gluster
04:01 kshlm joined #gluster
04:08 marcoceppi joined #gluster
04:08 marcoceppi joined #gluster
04:08 eastz0r joined #gluster
04:10 dbruhn joined #gluster
04:12 shyam joined #gluster
04:15 saurabh joined #gluster
04:16 _dist joined #gluster
04:18 _dist evening, I'm curious. Is anyone else here running gluster on zfs? (ZoL) I'm noticing that when a volume is started zfs starts filling the slab like crazy, 100s of GB of ram get filled
04:21 yinyin joined #gluster
04:22 _dist so the setup I have is zpool --> zvol --> xfs --> gluster volume. When the volume is started proc/zfs/kmem/slab starts taking up all kinds of ram. If anyone has run into this let me know :)
04:24 jporterfield joined #gluster
04:25 mohankumar joined #gluster
04:27 kanagaraj joined #gluster
04:29 rfortier1 joined #gluster
04:31 kdhananjay joined #gluster
04:32 psharma joined #gluster
04:33 vpshastry joined #gluster
04:34 semiosis _dist: xfs on zfs?
04:35 _dist semiosis: right, I think I found the issue just now actually
04:35 ndarshan joined #gluster
04:35 _dist semiosis: looks like it's the primarycache setting for the zvol
04:35 semiosis how do you run xfs on zfs?
04:35 * semiosis puzzled
04:36 _dist semiosis: you can create what's called a zvol, which is essentially a virtual drive backed by zfs (all zfs features still apply). Then you can create a partition on that in any format you want
04:36 _dist I did that because I (as we talked about it couple weeks ago?) wanted cache=none for live migration
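
(For reference, a minimal sketch of the zvol-backed layout _dist describes, using a hypothetical pool name "tank" and brick path; the primarycache setting is the one mentioned above and trades ARC read caching for RAM.)

    # create a 1 TB zvol (a block device carved out of the pool, keeping zfs features)
    zfs create -V 1T tank/gluster-brick
    # cache only metadata for this zvol so the ARC/slab doesn't balloon when the volume starts
    zfs set primarycache=metadata tank/gluster-brick
    # put xfs on the zvol and mount it where the gluster brick will live
    mkfs.xfs /dev/zvol/tank/gluster-brick
    mkdir -p /bricks/brick1
    mount /dev/zvol/tank/gluster-brick /bricks/brick1
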
04:37 semiosis interesting
04:37 RameshN joined #gluster
04:37 RameshN_ joined #gluster
04:37 semiosis what features specifically did you want from zfs that lvm couldnt do?
04:38 _dist semiosis: scrubbing, non-firmware-based raid, raidz3 (essentially raid7) and sending of snapshots. Also being able to use swappable ssd caches for writing and reading is nice
04:38 semiosis cool
04:39 _dist semiosis: if btrfs gets there I'd prefer to use it (cause it's in the kernel), I should probably give it a try every 6 months or so
04:41 _dist so now that it looks like I've finally nipped this issue (though it will have read cache consequences) I'm really curious about the recommended volume settings in gluster for VM storage. I get the eager locking, but I keep reading the performance.x-x descs and I don't know what they really do. Anyone here think they can explain it to me so I'd understand?
04:43 _dist the ones listed in here https://access.redhat.com/site/documentation/en-US/Red_Hat_Storage/2.0/html/Quick_Start_Guide/chap-Quick_Start_Guide-Virtual_Preparation.html
04:43 glusterbot Title: Chapter 3. Managing Virtual Machine Images on Red Hat Storage Servers (at access.redhat.com)
04:43 kshlm gfm
04:46 ppai joined #gluster
04:56 bala joined #gluster
04:58 itisravi joined #gluster
05:00 tokik joined #gluster
05:01 RameshN_ joined #gluster
05:01 daMaestro joined #gluster
05:07 bala joined #gluster
05:07 jporterfield joined #gluster
05:12 aravindavk joined #gluster
05:19 _dist left #gluster
05:22 prasanth joined #gluster
05:24 hagarth joined #gluster
05:27 spandit joined #gluster
05:28 yinyin joined #gluster
05:32 CheRi joined #gluster
05:34 nshaikh joined #gluster
05:35 gdubreui joined #gluster
05:39 rastar joined #gluster
05:41 tjikkun_work joined #gluster
05:43 lalatenduM joined #gluster
05:47 rfortier joined #gluster
05:48 psharma joined #gluster
05:51 raghu` joined #gluster
05:57 jporterfield joined #gluster
05:58 kshlm joined #gluster
05:59 benjamin_ joined #gluster
06:18 davinder joined #gluster
06:19 Philambdo joined #gluster
06:21 rjoseph joined #gluster
06:22 vimal joined #gluster
06:38 dusmant joined #gluster
06:58 jporterfield joined #gluster
07:12 shylesh joined #gluster
07:13 yinyin joined #gluster
07:16 jtux joined #gluster
07:19 jporterfield joined #gluster
07:37 jporterfield joined #gluster
07:39 keytab joined #gluster
07:46 ngoswami joined #gluster
07:49 ctria joined #gluster
07:51 ktosiek joined #gluster
07:56 jporterfield joined #gluster
07:58 eseyman joined #gluster
08:03 ktosiek_ joined #gluster
08:04 jporterfield joined #gluster
08:09 glusterbot New news from newglusterbugs: [Bug 1021686] refactor AFR module <https://bugzilla.redhat.com/show_bug.cgi?id=1021686>
08:13 harish joined #gluster
08:19 saurabh joined #gluster
08:19 qdk joined #gluster
08:30 harish joined #gluster
08:37 blook joined #gluster
08:39 glusterbot New news from resolvedglusterbugs: [Bug 1062522] glusterfs: failed to get the 'volume file' from server <https://bugzilla.redhat.com/show_bug.cgi?id=1062522>
08:49 blook2nd joined #gluster
08:50 spiekey joined #gluster
08:50 spiekey Hello
08:50 glusterbot spiekey: Despite the fact that friendly greetings are nice, please ask your question. Carefully identify your problem in such a way that when a volunteer has a few minutes, they can offer you a potential solution. These are volunteers, so be patient. Answers may come in a few minutes, or may take hours. If you're still in the channel, someone will eventually offer an answer.
08:56 tokik joined #gluster
09:04 dneary joined #gluster
09:09 glusterbot New news from newglusterbugs: [Bug 1063190] [RHEV-RHS] Volume was not accessible after server side quorum was met <https://bugzilla.redhat.com/show_bug.cgi?id=1063190>
09:09 liquidat joined #gluster
09:11 liquidat joined #gluster
09:15 andreask joined #gluster
09:18 liquidat joined #gluster
09:19 mgebbe_ joined #gluster
09:23 shylesh joined #gluster
09:32 mohankumar joined #gluster
09:33 DV joined #gluster
09:35 Slash joined #gluster
09:36 Elico joined #gluster
09:37 harish joined #gluster
09:39 yinyin joined #gluster
09:46 psharma joined #gluster
10:02 pk1 joined #gluster
10:07 psharma joined #gluster
10:09 jporterfield joined #gluster
10:12 franc joined #gluster
10:13 davinder joined #gluster
10:13 ndarshan joined #gluster
10:16 ninkotech_ joined #gluster
10:19 chirino joined #gluster
10:24 chirino joined #gluster
10:37 shylesh joined #gluster
10:39 glusterbot New news from newglusterbugs: [Bug 1063230] DHT - rebalance - when any brick/sub-vol is down and rebalance is not performing any action(fixing lay-out or migrating data) it should not say 'Starting rebalance on volume has been successful' . <https://bugzilla.redhat.com/show_bug.cgi?id=1063230>
10:40 badone joined #gluster
10:46 ndarshan joined #gluster
10:48 psharma joined #gluster
10:50 kanagaraj joined #gluster
10:52 dusmant joined #gluster
10:53 shubhendu joined #gluster
10:54 RameshN_ joined #gluster
10:55 RameshN joined #gluster
10:58 pk1 left #gluster
11:06 psharma joined #gluster
11:09 glusterbot New news from newglusterbugs: [Bug 1040355] NT ACL : User is able to change the ownership of folder <https://bugzilla.redhat.com/show_bug.cgi?id=1040355>
11:17 yinyin joined #gluster
11:20 RameshN joined #gluster
11:20 RameshN_ joined #gluster
11:25 hagarth joined #gluster
11:29 20WAA5U1U joined #gluster
11:29 77CABDVVH joined #gluster
11:31 pkoro joined #gluster
11:33 kgu87 joined #gluster
11:34 kgu87 left #gluster
11:39 glusterbot New news from resolvedglusterbugs: [Bug 1062674] Write is failing on a cifs mount with samba-4.1.3-2.fc20 + glusterfs samba vfs plugin <https://bugzilla.redhat.com/show_bug.cgi?id=1062674>
11:49 arcimboldo joined #gluster
11:50 jporterfield joined #gluster
11:51 RameshN_ joined #gluster
11:51 RameshN joined #gluster
11:56 arcimboldo hi all, is there anyone with experience with gluster-swift?
11:57 diegows joined #gluster
11:59 edward1 joined #gluster
12:03 itisravi joined #gluster
12:15 CheRi joined #gluster
12:19 dusmant joined #gluster
12:19 shubhendu joined #gluster
12:20 jclift_ ndevos: Do you remember who puts time into Gluster-swift?
12:20 ppai joined #gluster
12:21 ndevos jclift_: portante, lpabon and thiago - to name a few
12:22 ndevos jclift_: you can shech the github project for the most current names :)
12:22 ndevos *check even
12:22 jclift_ ndevos: Thx. :)
12:22 jclift_ arcimboldo: ^^^
12:23 arcimboldo thnx, jclift_
12:23 arcimboldo and ndevos :)
12:23 kkeithley joined #gluster
12:24 arcimboldo portante: can you answer a few questions on gluster-swift? I have trouble setting it up
12:26 bennyturns joined #gluster
12:30 calum_ joined #gluster
12:33 hagarth joined #gluster
12:35 kdhananjay joined #gluster
12:38 arcimboldo I guess I'll have more luck sending an email to the mailing list? :)
12:43 portante arcimboldo: sure
12:43 burn420 joined #gluster
12:46 RameshN joined #gluster
12:47 RameshN_ joined #gluster
12:48 recidive joined #gluster
12:50 vpshastry joined #gluster
12:56 pdrakeweb joined #gluster
13:00 arcimboldo mail sent, hoping someone is willing to help :)
13:00 ira joined #gluster
13:01 vpshastry left #gluster
13:02 LessSeen joined #gluster
13:04 arcimboldo can I ask how big your gluster installation is and for what are you using it?
13:10 pdrakeweb joined #gluster
13:11 blook joined #gluster
13:20 andreask joined #gluster
13:32 davinder joined #gluster
13:36 kshlm joined #gluster
13:39 arcimboldo a few questions: if I want to remove 2 bricks from a replicated volume made of 8 bricks, but I don't want to loose any data, how can I do it?
13:39 arcimboldo is there any way to move all the data from those bricks to other servers?
13:40 dbruhn joined #gluster
13:40 glusterbot New news from newglusterbugs: [Bug 1060703] client_t calls __sync_sub_and_fetch and causes link failures on EPEL-5-i386 <https://bugzilla.redhat.com/show_bug.cgi?id=1060703>
13:42 bala joined #gluster
13:46 tryggvil joined #gluster
13:47 kkeithley https://access.redhat.com/site/documentation/en-US/Red_Hat_Storage/2.0/html/Administration_Guide/sect-User_Guide-Managing_Volumes-Shrinking.html
13:47 glusterbot Title: 10.3. Shrinking Volumes (at access.redhat.com)
13:48 kkeithley long answer ^^^
13:49 kkeithley arcimboldo: ^^^
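
(A hedged sketch of the shrink procedure the linked chapter covers, assuming a distributed-replicated volume "vol1" and hypothetical brick paths; remove-brick ... start migrates the data off the bricks before you commit, so nothing is lost.)

    # start removing one replica pair; data is migrated to the remaining bricks
    gluster volume remove-brick vol1 server7:/export/brick server8:/export/brick start
    # watch the migration until it shows completed
    gluster volume remove-brick vol1 server7:/export/brick server8:/export/brick status
    # then finalize the removal
    gluster volume remove-brick vol1 server7:/export/brick server8:/export/brick commit
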
13:51 arcimboldo thnx
13:51 arcimboldo so rebalancing is done automatically, I don't need to do anything
13:51 arcimboldo one more question: I tried to remove and then re-add the bricks
13:51 recidive joined #gluster
13:51 arcimboldo but gluster volume add-brick ... replica 2 ... only gives me:
13:51 kkeithley and you got a brick already used errror
13:51 arcimboldo volume add-brick: failed:
13:52 arcimboldo I don't know, I don't see anything :)
13:53 kkeithley what's in the logs? /var/log/glusterfs/cli.log
13:53 arcimboldo should I delete the old brick directory from the nodes?
13:53 arcimboldo [2014-02-10 13:53:25.952947] I [cli-rpc-ops.c:1695:gf_cli_add_brick_cbk] 0-cli: Received resp to add brick
13:53 arcimboldo [2014-02-10 13:53:25.953043] I [input.c:36:cli_batch] 0-: Exiting with: -1
13:53 arcimboldo on the machine I am running the command
13:53 arcimboldo which is not the brick I'm adding
13:54 kkeithley if you used a subdir then you can just `rm -rf $subdir; mkdir $subdir`.  If you didn't use a subdir then there are xattrs you need to remove in addition to the .glusterfs directory. Might be easier to redo the mkfs.
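
(A sketch of the "reuse a whole brick" case kkeithley mentions, assuming the brick root is /export/brick1; these are the usual xattrs gluster sets on a brick root, so treat the exact names as an assumption and check with getfattr first.)

    # on the brick server, show what gluster left behind
    getfattr -m . -d -e hex /export/brick1
    # clear the volume metadata so add-brick will accept the path again
    setfattr -x trusted.glusterfs.volume-id /export/brick1
    setfattr -x trusted.gfid /export/brick1
    rm -rf /export/brick1/.glusterfs
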
13:54 kkeithley log on the system where the brick resides?
13:55 arcimboldo I'm using a test machine so I'm actually using a directory under / filesystem
13:55 CheRi joined #gluster
13:55 kkeithley okay, so just `rm -rf $path_to_brick/$subdir; mkdir $path_to_brick/$subdir` should be fine
13:56 arcimboldo ok thanx it worked.
13:56 arcimboldo Then, do I need to run a volume rebalance to redistribute the files?
13:56 sroy_ joined #gluster
13:57 arcimboldo apparently, rebalance did the trick
13:57 kkeithley yes.
13:58 kkeithley and yes
13:59 arcimboldo Dunno if it's related to tests I did previously, but on glusterfs.log I see a lot of the following error:
14:00 arcimboldo [2014-02-10 14:00:06.353610] W [socket.c:514:__socket_rwv] 3-default-client-2: readv failed (No data available)
14:00 arcimboldo [2014-02-10 14:00:06.353682] I [client.c:2097:client_rpc_notify] 3-default-client-2: disconnected
14:01 B21956 joined #gluster
14:03 ctria joined #gluster
14:05 arcimboldo ah, one of the gluster is not visible in the peer status!
14:06 prasanth joined #gluster
14:06 arcimboldo i've readded it with peer probe, but I still see the errors in the file
14:07 awheeler_ joined #gluster
14:08 blook joined #gluster
14:14 bennyturns joined #gluster
14:17 jobewan joined #gluster
14:17 haomaiwa_ joined #gluster
14:19 dusmant joined #gluster
14:21 theron joined #gluster
14:21 japuzzo joined #gluster
14:22 ira joined #gluster
14:25 jmarley joined #gluster
14:27 theron joined #gluster
14:27 kanagaraj joined #gluster
14:29 dbruhn joined #gluster
14:32 calum_ joined #gluster
14:40 rfortier1 joined #gluster
14:50 pdrakeweb joined #gluster
14:52 davinder joined #gluster
14:59 bugs_ joined #gluster
15:04 pixelgremlins joined #gluster
15:04 calum_ joined #gluster
15:08 ndk joined #gluster
15:12 kaptk2 joined #gluster
15:21 bala joined #gluster
15:25 plarsen joined #gluster
15:26 primechuck joined #gluster
15:26 dusmant joined #gluster
15:27 wushudoin joined #gluster
15:29 harold[mtv] joined #gluster
15:29 jbrooks joined #gluster
15:32 calum_ joined #gluster
15:37 daMaestro joined #gluster
15:37 blook joined #gluster
15:43 [o__o] joined #gluster
15:43 glusterbot New news from resolvedglusterbugs: [Bug 1055037] Add-brick causing exclusive lock missing on a file on nfs mount <https://bugzilla.redhat.com/show_bug.cgi?id=1055037>
15:49 theron joined #gluster
15:53 primechuck Is there anything simliar to the NUFA plugin where reads can be preferenced to a local brick and writes are still synchronous between all the bricks in a volume?
15:55 vpshastry joined #gluster
15:56 vpshastry left #gluster
15:57 haomaiw__ joined #gluster
15:57 theron joined #gluster
16:00 calum_ joined #gluster
16:02 kanagaraj joined #gluster
16:08 recidive joined #gluster
16:11 vpshastry joined #gluster
16:11 pdrakeweb joined #gluster
16:19 blook joined #gluster
16:22 sprachgenerator joined #gluster
16:26 vpshastry joined #gluster
16:30 rpowell joined #gluster
16:31 vpshastry left #gluster
16:31 ^^rcaskey joined #gluster
16:41 pdrakeweb joined #gluster
16:42 TheDruidsKeeper joined #gluster
16:52 asku joined #gluster
16:55 blook joined #gluster
16:57 TheDruidsKeeper I'm working on a fun home project.. I have a server running VMWare Esxi with several guests, including an ubuntu install with OpenVPN. I just added two 4tb drives to the server (directly on the blade, not a SAN) that I want to have connected to the vpn guest so I can use it as a cloud storage for my VPN clients (although mostly it will just be media accessed through a DLNA server running on that guest). so now I'm looking into the "best" way to
16:57 ctria joined #gluster
16:59 jclift_ TheDruidsKeeper: Doesn't immediately sound like a thing which "distributed storage" would suit, since there's physically only one box?
16:59 TheDruidsKeeper yeah, currently there is only 1 physical box
17:00 TheDruidsKeeper i'm thinking that that will change in the future
17:00 jclift_ TheDruidsKeeper: Technically, you could probably make it work by dedicating a drive each to two vms... but yeah, you'd want more than one server if you care about the data. :)
17:00 jclift_ TheDruidsKeeper: For just mucking around with though, it should all work fine.
17:01 TheDruidsKeeper sounds reasonable enough. i do have another machine that i could set up to work as the parity
17:03 TheDruidsKeeper a big question i have though since i'm doing this through a vm host.. should i be putting the vmfs on the drives, and then gluster goes on top of the vmdk's? or should I do raw vmdk's to let gluster have more direct accesses to the disks?
17:04 jclift_ TheDruidsKeeper: It's a good question.  I'm not sure of best practices around this.  Personally, my first guess would be to _try_ and give the vm's direct access to the disks, so there's less layers of IO software in the way.
17:05 jclift_ TheDruidsKeeper: But, like most things, if you've got time you should probably try both ways and doing some basic testing to see if one stands out as better than the other in practise.
17:05 TheDruidsKeeper jclift_: that was my thought too, so i did create raw disks and mounted them to the guest.. but thought I should ask before i got too far into it.
17:06 jclift_ :)
17:06 TheDruidsKeeper jclift_: try both, i like it :)
17:18 ira joined #gluster
17:20 kanagaraj joined #gluster
17:20 ira joined #gluster
17:25 cfeller joined #gluster
17:27 rotbeard joined #gluster
17:34 Mo_ joined #gluster
17:36 dbruhn jclift_, that performance measurement panel you are working on looks awesome.
17:36 sjoeboo joined #gluster
17:47 diegows joined #gluster
17:48 jclift_ dbruhn: Thanks. :)
17:49 jclift_ dbruhn: Found a bug in Glupy that I need to fix before it's practical for people to use, but I think GlusterFlow will come along pretty quickly when people starting using it and having ideas. :)
17:50 purpleidea jclift_: is glupy being actively maintained/hacked on these days?
17:50 purpleidea (at some point in the past, i thought it was abandoned)
17:50 purpleidea https://github.com/jdarcy/glupy/
17:51 glusterbot Title: jdarcy/glupy · GitHub (at github.com)
17:51 jclift_ purpleidea: It's been merged into the main Gluster codebase, so will be part of 3.5.
17:51 purpleidea jclift_: ah, okay. cool.
17:51 jclift_ purpleidea: But there doesn't seem to be any test framework for it (yet), so the code there doesn't seeem to work due to a very simple namespace conflict.
17:51 purpleidea ^^ maybe someone could patch the README there to point to the new code.
17:52 jclift_ purpleidea: Jeff Darcy's repo isn't actually workable, I've fixed some bugs in it in my fork, and even my repo has the same namespace conflict problem.  (will fix publicly soon)
17:52 jclift_ Won't be today though.
17:53 purpleidea jclift_: no worries. glad to see this is getting traction though! i'd love to write a translator without having to c hack.
17:53 jclift_ purpleidea: And yeah, I suspect I'll need to take over maintainership of the Glupy code... which is kind of scary since I don't yet know c-types.
17:53 jclift_ c-types being the way that we get to use Gluster C data structures from Python.
17:53 purpleidea jclift_: it will be okay... it's all like, something, something, | #python
17:53 jclift_ :D
17:54 jclift_ I guess I'll pick it up on the way or something. ;)
17:54 purpleidea what's the path in the glusterfs source to look for glupy?
17:54 jre1234 joined #gluster
17:55 delhage joined #gluster
17:57 jre12345 joined #gluster
17:58 jre12345 left #gluster
17:58 jre12345 joined #gluster
17:59 jre12345 Him I am trying to debug an issue I am having with geo-replication on gluster 3.3.2
18:00 jre12345 I keep getting "error: (9, 'Bad file descriptor')" logged on the master server and the status alternates between faulty and ok during the inital sync
18:01 cfeller joined #gluster
18:02 sprachgenerator joined #gluster
18:05 spiekey hello
18:05 glusterbot spiekey: Despite the fact that friendly greetings are nice, please ask your question. Carefully identify your problem in such a way that when a volunteer has a few minutes, they can offer you a potential solution. These are volunteers, so be patient. Answers may come in a few minutes, or may take hours. If you're still in the channel, someone will eventually offer an answer.
18:05 spiekey i am from the drbd world and i am testing glusterfs :)
18:06 aurigus joined #gluster
18:06 spiekey my first question is: if i am using Type: Replicate with two nodes, and i shut down node2 for a while. will it resync onces it comes back up?
18:08 purpleidea spiekey: yes
18:08 purpleidea it's called self-heal
18:10 Matthaeus joined #gluster
18:11 semiosis spiekey: interesting differences... glusterfs replication is multi-master, and the replication is handled by the clients (when using native fuse clients)
18:13 pdrakeweb joined #gluster
18:14 jre12345 Is there a way of debugging which file is causing the "bad file descriptor" errors ?
18:15 spiekey purpleidea: can i somehow view the resync status?
18:16 purpleidea gluster volume heal <volname> info
18:17 purpleidea spiekey: as semiosis mentioned, it's quite different from drbd. you should try it out to get familiar, and then your questions will be better answered too!
18:17 purpleidea spiekey: obligatory you can try it out easily with ,,(vagrant)
18:17 glusterbot spiekey: (#1) Part 1 @ https://ttboj.wordpress.com/2013/12/09/vagrant-on-fedora-with-libvirt/, or (#2) Part 2 @ https://ttboj.wordpress.com/2013/12/21/vagrant-vsftp-and-other-tricks/, or (#3) Part 3 @ https://ttboj.wordpress.com/2014/01/02/vagrant-clustered-ssh-and-screen/, or (#4) Part 4 @
18:17 glusterbot https://ttboj.wordpress.com/2014/01/08/automatically-deploying-glusterfs-with-puppet-gluster-vagrant/, or (#5) https://ttboj.wordpress.com/2014/01/16/testing-glusterfs-during-glusterfest/
18:18 semiosis wow such links
18:18 purpleidea is it too much? feel free to prune :P
18:18 semiosis no just been thinking in doge lately
18:20 spiekey thanks a lot!
18:21 purpleidea yw!
18:21 purpleidea ,,(next)
18:21 glusterbot Another satisfied customer... NEXT!
18:22 plarsen joined #gluster
18:23 plarsen joined #gluster
18:26 semiosis purpleidea: there's trouble afoot in #gluster-dev :O
18:27 purpleidea semiosis: i know this guy :P
18:28 spiekey if i use Replicate, will it then only read locally or also read from the 2nd brick?
18:28 purpleidea spiekey: there are different ways...
18:28 spiekey i mean, it could, in theory.
18:28 purpleidea @undocumented
18:28 glusterbot purpleidea: I do not know about 'undocumented', but I do know about these similar topics: 'undocumented options'
18:28 purpleidea ~ undocumented options | spiekey
18:28 glusterbot spiekey: Undocumented options for 3.4: http://www.gluster.org/community/documentation/index.php/Documenting_the_undocumented
18:28 purpleidea and there is a new style replication thingy coming soon (ish) ?
18:28 spiekey ok, i might NOT start with that :)
18:30 purpleidea spiekey: yeah, like i said, get it working, and play with it a bit first. that will probably answer most of your questions. then come here with the wtf's :P
18:30 semiosis +1
18:31 purpleidea (not that there are ever any wtf's, right semiosis ?)
18:31 semiosis although keep in mind glusterfs was designed with large deployments in mind.  the trivial case of syncing a disk/directory between two servers is not really what it's optimized for
18:31 semiosis that does work, and lots of people (including me) do that
18:32 semiosis but worrying about low level optimizations like that for such a small deploy isn't really useful
18:32 Matthaeus joined #gluster
18:32 semiosis until you're much further along anyway
18:33 spiekey i am planing to use it with ovirt/kvm
18:34 semiosis also note, if you plan on having your servers also be clients, then you really should use quorum with replica 3
18:40 purpleidea semiosis: ^^ why 3 in particular?
18:40 spiekey so its odd
18:44 primechuck In a Distribute+Replicate environment, if a volume have cluster.choose-local set to true and each server has both bricks and a client, does it automatically figure out which bricks are local or is there an additional definition for local subvols.
18:44 primechuck Or does it not do what I think it does :)
18:47 dkphenom joined #gluster
18:48 spechal left #gluster
18:54 arcimboldo joined #gluster
18:59 cp0k joined #gluster
19:05 tdasilva joined #gluster
19:06 zerick joined #gluster
19:08 semiosis purpleidea: so when one server goes down you still have two running.  if you have half or less of replicas online then clients will turn read-only
19:08 semiosis purpleidea: ...when using quroum
19:08 semiosis purpleidea: with replica 2 without quorum you might get into the unfortunate situation where each server thinks the other is down & so both end up with the same vm(s) running -- major split brain
19:09 purpleidea semiosis: gotcha... did the vm's start needing quorum data?
19:09 semiosis ???
19:09 semiosis idk what you are asking
19:10 primechuck Is quorum "server" based or brick server based?  I.E.  Odd number of bricks between even number of servers?
19:10 purpleidea sorry, what i mean is, when you're hosting vm's locally, what do you need to change? i didn't realize by default that 1/2 nodes down by default with replica =2 causes read only...
19:10 semiosis primechuck: idk what you are asking either?!!
19:11 purpleidea semiosis: you've got your babel fish in backwards :P
19:11 semiosis purpleidea: half nodes down with replica 2 causes read-only -- *when quorum is enabled*
19:11 semiosis by default quorum is not enabled
19:11 purpleidea semiosis: aha! that's what i was missing, thanks
19:11 semiosis :)
19:12 cp0k Is it the same case with the volume turning read-only if quorum is not enabled?
19:12 semiosis purpleidea: but replica 2, between two machines that are servers & also clients, is a recipie for split brain -- if they lost contact with each other and both think "my peer went away, now I'm master"
19:12 purpleidea cp0k: apparently not
19:12 semiosis cp0k: half nodes down with replica 2 causes read-only -- *when quorum is enabled*
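
(A minimal sketch of enabling both quorum types semiosis is describing, for a hypothetical volume "vol1"; option names are the ones listed by 'gluster volume set help' on 3.4.)

    # client-side quorum: the client refuses writes when it can't see a quorum of the replica set
    gluster volume set vol1 cluster.quorum-type auto
    # server-side quorum: glusterd kills its bricks when the trusted pool loses quorum
    gluster volume set vol1 cluster.server-quorum-type server
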
19:13 cp0k cool
19:13 cp0k good to know, Im about to do a nice upgrade on my gluster to 3.4.2
19:13 ninkotech joined #gluster
19:13 arcimboldo_ joined #gluster
19:13 purpleidea semiosis: indeed... so has someone built something specific for vm's on gluster hosts? i'd imagine they might want to use rgmanager or similar to maintain the vm's... not sure how well fencing works in conjunction with glusterfs. would be cool to test.
19:13 ninkotech_ joined #gluster
19:13 primechuck semiosis:  It is kind of a silly question, but 4 Machines running with Replica 2.  2 Machines have 3 bricks, 2 machines have 2 bricks.  Is the quorum per Machine or is it per brick server process?
19:13 cp0k 127.0.0.1:/storage  189T  160T   19T  90% /storage
19:13 cp0k its going to be fun
19:14 semiosis primechuck: no one should ever have that config, it's insane :)
19:14 purpleidea +1
19:14 semiosis oh actually, i misunderstood
19:15 purpleidea symmetrical setups are recommended
19:15 dbruhn +1
19:15 semiosis purpleidea: it is symmetrical
19:15 semiosis oops
19:15 semiosis first pair has 2x3, second pair has 2x2, whole volume is 2x5 distributed-replicated
19:15 cp0k how about this
19:16 cp0k Type: Distributed-Replicate
19:16 cp0k Volume ID: b730850b-e19a-4ee9-94c4-62a3c63c240f
19:16 cp0k Status: Started
19:16 cp0k Number of Bricks: 15 x 2 = 30
19:16 semiosis primechuck: there's two kinds of quorum
19:16 mik3 joined #gluster
19:16 semiosis primechuck: see 'gluster volume set help' for info.  i think there's server quorum & cluster quorum IIRC
19:16 cp0k seems gluster has no problem with this config so far
19:16 semiosis cp0k: fine
19:16 purpleidea semiosis: https://github.com/purpleidea/puppet-gluster/blob/master/manifests/volume/property/data.pp (search for quorum)
19:16 glusterbot Title: puppet-gluster/manifests/volume/property/data.pp at master · purpleidea/puppet-gluster · GitHub (at github.com)
19:17 semiosis neat
19:18 primechuck Hence the silly question part :)  Mainly just a thought exercise in how into how it works with a CRAZY configuration.
19:20 neofob left #gluster
19:25 mik3 when dealing with 2TB of small files being accessed concurrently by web apps, it would probably be more prudent to throw in an active/passive nfs configuration as opposed to using glusterfs/xfs, correct?
19:25 rotbeard joined #gluster
19:26 semiosis not enough info to make a judgement imho
19:27 mik3 k
19:27 wgao joined #gluster
19:27 semiosis could use front end caching
19:27 semiosis or work out a different code deployment system
19:31 mik3 these are basically thumbnails and stuff, static files. they're being served up to jboss app servers. i'm dealing with devops asking for clustered/load balanced file availability that is expected to be exported via nfs, or gluster's native client, as well as available/mounted on the bricks (currently configured with drbd/clvmd/gfs2) for maintenance/dev purposes
19:31 mik3 basically asking magic performed on 2TB of files to be accessed from everywhere
19:33 mik3 so the same share of 2TB worth of thumbnails and other small files, exported via NFS for production purposes, and also being stressed out for our own version of geo-replication, as well as system file backups being written
19:34 spiekey if i have 2 nodes with replication and the Link is 1GBit i should get a max write rate of about 100MB/sec, right?
19:34 spiekey now if i set the network speed to 10MBit, then i should get a write speed of 1MB/sec?
19:37 semiosis client sends writes to all replicas, so available bandwidth / 2 for replica 2
19:37 semiosis afk
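
(Rough arithmetic behind that, treating protocol overhead loosely: 1 Gbit/s is about 110-120 MB/s on the wire, and with replica 2 the fuse client pushes every write to both bricks over the same link, so the practical ceiling is roughly 55-60 MB/s; on a 10 Mbit link the same halving gives roughly 0.6 MB/s rather than 1 MB/s.)
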
19:41 zaitcev joined #gluster
19:57 rpowell left #gluster
19:57 pixelgremlins_ba joined #gluster
20:02 VeggieMeat joined #gluster
20:02 RedShift joined #gluster
20:03 fyxim joined #gluster
20:05 pdrakeweb joined #gluster
20:06 recidive joined #gluster
20:11 _dist joined #gluster
20:12 _dist afternoon, I was wondering if eager-lock makes shd slower? my default volume healed at a very quick rate, but since I applied the recommended changes for libgfapi it seems to take its sweet time! :)
20:14 _dist also I was wondering if anyone has written a script yet to parse out the heal volume info, since I believe presently the only way to know if a file is "synced" on both ends is to watch the output and look for a split second where it doesn't show up in the heal info?
20:16 jmalm joined #gluster
20:16 B21956 joined #gluster
20:17 _dist I've got around 12gbps between two replication nodes, but it seems to take quite a long time to heal even small differences
20:17 lyang0 joined #gluster
20:18 purpleidea _dist: use the --xml flag. i have such a script for other options here: https://github.com/purpleidea/puppet-gluster/blob/master/files/xml.py#L19
20:18 glusterbot Title: puppet-gluster/files/xml.py at master · purpleidea/puppet-gluster · GitHub (at github.com)
20:18 _dist cool
20:31 _dist purpleidea: do you have an example call where I could view healing activity?
20:31 purpleidea _dist: run the command ?
20:33 _dist purpleidea: right, but the idea is redirect specific gluster volume command into xml format, and then your script takes over. What does it want for path (gluster volume, or brick?)
20:34 purpleidea "for path" ?
20:34 _dist "gluster volume status --xml [<VOLNAME>] | ./xml.py port --volume <VOLUME> --host <HOST> --path <PATH> <PORT>"
20:35 purpleidea _dist: okay, in my xml.py file i've implemented specific parsers for different parts that i'm interested in. you'll want to write your own for heal status
20:35 purpleidea the path one above is brick path
20:36 _dist purpleidea: yeah I can do that, but there's no good way then in gluster to see what's healed and what isn't correct? (in a running VM sitaution). Also, do you know if eager-lock makes the heal much slower? (just a hunch I have)
20:36 purpleidea _dist: i'm sorry, i'm really not sure what you're trying to do, or what your question is... maybe you can rephrase
20:37 _dist purpleidea: I have a (presently two node) hypervisor. Each is also part of a replicate volume, for live migration. When I take one down for maintenence and bring it back up, I have no idea how to tell (for certain) when the VM images are "healed" thus safe for the other to go down.
20:39 purpleidea _dist: so as i mentioned, pipe the heal info command --xml into xml.py (which you'll need to patch to parse out whatever specific info you're interested in) and then read that result.
20:39 _dist purpleidea: The only thing I can think to do, is get xml.py to receive a "gluster volume heal x info" but like 100 times in a row, and then look for a single instance where file Y wasn't there (and then we know it's healed). Do you follow that thinking?
20:40 purpleidea _dist: you should probably run a few tests to make sure this does what you need.
20:42 _dist purpleidea: But you understand my thinking right? VMs are _always_ in the heal info, even when healthy more often than not. When I've asked before I the only solution anyone could come up with was to watch for when the particular file isn't in the heal info display. If that's true, I might end up writing a web interface that does all this work for me otherwise I'll never feel safe powering off a node. I guess first I'l
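
(A hedged shell sketch of the polling idea _dist describes: consider an image healed once it stays out of heal info for a stretch of consecutive passes. The volume and file names here are hypothetical.)

    file="vm-101-disk-1.qcow2"      # hypothetical image name
    clean=0
    # require 30 consecutive passes where the file is absent from heal info
    while [ "$clean" -lt 30 ]; do
        if gluster volume heal datastore1 info | grep -q "$file"; then
            clean=0                 # still listed, start counting again
        else
            clean=$((clean + 1))
        fi
        sleep 1
    done
    echo "$file stayed out of heal info for 30 consecutive passes"
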
20:42 semiosis _dist: ,,(pasteinfo)
20:42 glusterbot _dist: Please paste the output of "gluster volume info" to http://fpaste.org or http://dpaste.org then paste the link that's generated here.
20:43 purpleidea _dist: fpaste the output if your heal command
20:43 purpleidea semiosis: :P good timing
20:44 _dist this is how it'll look for the first while, until the GFIDs turn into file names https://dpaste.de/f1mY
20:44 glusterbot Title: dpaste.de: Snippet #256818 (at dpaste.de)
20:44 _dist eventually I'll know it's healthy when all of the VMs are equally on the top and bottom but flashing there and back with a watch -n1
20:45 purpleidea _dist: run the command with --xml and paste the command you're using too
20:46 _dist ok
20:46 arcimboldo_ joined #gluster
20:47 _dist purpleidea: where in a "gluster volume heal vol info" does the --xml go? I can't seem to get it to spit out xml
20:47 purpleidea at the end i would expect
20:48 _dist ah, if I just enter gluster with --xml it's all good, at the end doesn't work with full cli for me
20:48 purpleidea or after 'volume'
20:48 mrfsl joined #gluster
20:49 _dist purpleidea: doesn't look like gluster is willing to spit out xml for this, even in gluster --xml it still does it as parsed in my dpaste
20:49 semiosis _dist: ,,(pasteinfo)
20:49 glusterbot _dist: Please paste the output of "gluster volume info" to http://fpaste.org or http://dpaste.org then paste the link that's generated here.
20:49 purpleidea _dist: hm, maybe it's a bug. i didn't test this yet.
20:49 mrfsl Looking for help troubleshooting a volume heal - when i type volume heal vol1 info, it hangs and eventually goes to the next line - without outputting any information
20:50 _dist purpleidea: https://dpaste.de/N6C4 this is with command
20:50 glusterbot Title: dpaste.de: Snippet #256821 (at dpaste.de)
20:50 _dist doesn't seem to matter where I put the --xml
20:51 purpleidea ~fileabug | _dist
20:51 glusterbot _dist: Please file a bug at http://goo.gl/UUuCq
20:51 semiosis it's because there's no libxml2, dont file a bug
20:51 glusterbot https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS
20:51 purpleidea feel free to cc me on it
20:51 semiosis _dist: what distro are you on?
20:51 semiosis can you double check you have libxml2 installed?
20:51 semiosis at least, don't file a bug *yet*, until we can confirm if it's a simple dep issue
20:51 glusterbot https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS
20:52 semiosis haha, trolled myself
20:52 purpleidea hehe
20:52 semiosis also, that ,,(pasteinfo) any time you feel like it :)
20:52 glusterbot Please paste the output of "gluster volume info" to http://fpaste.org or http://dpaste.org then paste the link that's generated here.
20:52 _dist semiosis: I'm on debian wheezy, and xml works for other commands
20:52 purpleidea semiosis: i don't have my gluster cluster running atm. does --xml work for you with heal info ?
20:52 _dist semiosis: I mean like gluster volume status --xml for example does give me xml
20:53 semiosis oh interesting, that might be buggable :)
20:53 semiosis purpleidea: my laptop is in the bag now, cant check
20:53 purpleidea same
20:53 purpleidea _dist: so for now, ,,(fileabug)
20:53 glusterbot _dist: Please file a bug at http://goo.gl/UUuCq
20:54 _dist purpleidea/semiosis: transparency, I'm on 3.4.1 wouldn't be correct to file a bug until I check on latest
20:54 glusterbot https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS
20:54 semiosis right
20:55 _dist semiosis: either way my main question, is about if there is a better way to know when shd is done with something. Seems like sometimes it can be seconds or hours. Also for some odd reason it's not going from GFID to filenames this time
20:56 _dist I'll put glusterfs on my test ubuntu vm and check if heal does xml on 3.4.2
20:57 semiosis gbtw.  ping me when you have that ,,(pasteinfo)
20:57 glusterbot Please paste the output of "gluster volume info" to http://fpaste.org or http://dpaste.org then paste the link that's generated here.
21:06 _dist cloning vm, have to create a replicate cluster to test this
21:06 mik3 is the nfs server built in to gluster sharing states with one another so i can use a vip ?
21:07 mik3 s/one another/each brick
21:09 kkeithley the gluster NFS server is itself a client of the gluster servers for a volume. I'm not sure what state would be shared. Can you be more specific.
21:10 ira joined #gluster
21:10 mik3 i'm basically trying to achieve NFS failover
21:14 mik3 the data i'm working with is a lot of small static files being served up to jboss servers, since NFS is going to be the best route to handle that load i'm trying to determine if i should even bother with gluster to benefit from its other features/its turn-key solution
21:14 mik3 so i guess i'm asking, would a floating ip among two bricks work?
21:15 mik3 work for the nfs clients in the event of a failure/floating ip migrating to the last remaining node
21:19 _dist doesn't appear to work in 3.4.2 either https://dpaste.de/qqYV
21:19 glusterbot Title: dpaste.de: Snippet #256825 (at dpaste.de)
21:20 kkeithley I believe that many gluster users do use virtual ips for gluster nfs.
21:20 mik3 i would imagine so, seems like the google results i found suggest so as well
21:20 mik3 weird that it's not documented
21:21 kkeithley yes, our docs can be rather hit-or-miss
21:21 tryggvil joined #gluster
21:28 purpleidea mik3: kkeithley: you can use a vip for NFS (it's a good way to mount) but on failover, your connection state (tcp) is _NOT_ replicated (AFAICT), so you'll have to reconnect those clients. maybe they can auto resume, but i don't know. anything in progress will fail or hang.
21:28 kkeithley correct
21:29 purpleidea that type of functionality is _why_ there's the native fuse client. also, i think pNFS will support this type of thing. afaict gluster doesn't do pnfs yet. i think they call it ganeshi or something
21:30 andreask1 joined #gluster
21:31 _dist purpleidea/semiosis: https://bugzilla.redhat.com/show_bug.cgi?id=1063506 submitted. Still though, even if it had xml output it wouldn't answer the question of, how do I know when two replicate volumes are in sync. If there is no way, I can think of a few things I could do
21:31 glusterbot Bug 1063506: low, unspecified, ---, kaushal, NEW , No xml output on gluster volume heal info command with --xml
21:32 kkeithley nfs-ganesha or just ganesha. Doesn't do pnfs quite yet. And BTW you can try it out now — it's in Fedora 20 and you can get RPMs in a YUM repo for Fedora 19, RHEL 6, and CentOS 6 at http://download.gluster.org/pub/gluster/glusterfs/nfs-ganesha/
21:32 glusterbot Title: Index of /pub/gluster/glusterfs/nfs-ganesha (at download.gluster.org)
21:32 pdrakeweb joined #gluster
21:32 purpleidea kkeithley: (if pnfs planned?)
21:32 purpleidea is*
21:33 semiosis isn't the joke that pnfs has been planned for a decade?
21:33 purpleidea _dist: the output should hopefully show something useful in --xml
21:33 kkeithley yes, it's in the works. Might be in 2.1. We'll probably get pnfs in nfs-ganesha before btrfs is fully baked.
21:33 _dist purpleidea: ah, that'd be great. Then I could just grep for the positive/negative.
21:33 purpleidea _dist: ,,(pasteinfo)
21:33 glusterbot _dist: Please paste the output of "gluster volume info" to http://fpaste.org or http://dpaste.org then paste the link that's generated here.
21:34 purpleidea _dist: read this ^^^
21:35 purpleidea semiosis: i think it _literally_ has been hacked on for a decade
21:35 mik3 purpleidea: so nfs failover doesn't work
21:35 mik3 is basically what that means
21:35 purpleidea mik3: not the way you want it to, correct.
21:36 _dist purpleidea: I did, it was  https://dpaste.de/qqYV , wait you mean without the --xml? it's just the same output
21:36 glusterbot Title: dpaste.de: Snippet #256825 (at dpaste.de)
21:36 purpleidea mik3: you can investigate conntrackd to help with some of the tcp connection state, but i don't know how that will confuse the nfs server. probably enough to not work. i'd be interested to hear about it though!
21:37 mik3 that sort of functionality exists, just not in glusterfs
21:37 mik3 fwiw
21:37 purpleidea mik3: indeed
21:37 mik3 and it's a deal breaker unfortunately
21:37 purpleidea mik3: in what?
21:38 mik3 i'm serving up 2TB of small static files to jboss apps so the fuse client isn't really a good idea in my situation
21:38 purpleidea mik3: where does that functionality exist?
21:38 mik3 also being tapped by other non-essential systems for dev purposes (my nightmare)
21:38 purpleidea mik3: in many situations, it's probably actually not really HA. you just might think it exists...
21:38 semiosis _dist: please try this 'gluster volume set VOLNAME group virt' as described here: https://access.redhat.com/site/documentation/en-US/Red_Hat_Storage/2.0/html/Quick_Start_Guide/chap-Quick_Start_Guide-Virtual_Preparation.html
21:38 glusterbot Title: Chapter 3. Managing Virtual Machine Images on Red Hat Storage Servers (at access.redhat.com)
21:39 semiosis _dist: that will set several options for hosting vms afaict
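
(For the record, a hedged sketch of roughly what the virt group applies on 3.4 / RHS; the exact contents live in /var/lib/glusterd/groups/virt, so treat this list as an approximation and check that file on your build.)

    gluster volume set vol1 group virt   # applies the whole group in one shot
    # roughly equivalent to setting these individually:
    gluster volume set vol1 performance.quick-read off
    gluster volume set vol1 performance.read-ahead off
    gluster volume set vol1 performance.io-cache off
    gluster volume set vol1 performance.stat-prefetch off
    gluster volume set vol1 cluster.eager-lock enable
    gluster volume set vol1 network.remote-dio enable
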
21:39 mik3 purpleidea: uhm, what? you think nfs can't be thrown into a HA configuration?
21:40 _dist semiosis: what will that change? I applied all the other options previously other than the group virt
21:40 _dist semiosis: and can I do it while my machines are on, the other options tanked the guests (all the performance ones do)
21:40 mik3 purpleidea: in dual-active configurations it doesn't work but nfs/ext4 with a floating ip in active/passive configuration works fine, but the stateful data needs to be stored on the replicated share
21:41 semiosis _dist: maybe my xml glasses are failing me, but i dont see any volume options reconfigured in your xml volume info output
21:41 mik3 i'm pretty sure active/active configs will work too
21:41 _dist semiosis: that was a test spun up with two vms, just for the purpose of the bug entry (needed 3.4.2 remember) my prod is on 3.4.1
21:42 semiosis then why the heck did you give that output when we asked for volume info output?  I'm trying to help you and you're trolling me
21:42 purpleidea mik3: works fine with what storage?
21:42 mik3 purpleidea: right now i'm using a drbd/clvmd/gfs2 configuration with nfs being served up and it's kind of a nightmare, so i've been weighing my options with gluster
21:42 * semiosis frustrated
21:43 mik3 purpleidea: drbd active/passive
21:43 mik3 ext4
21:43 mik3 any fs
21:43 purpleidea mik3: you're mistaken that your setup is HA for NFS clients.
21:43 mik3 what?
21:44 glusterbot New news from newglusterbugs: [Bug 1063506] No xml output on gluster volume heal info command with --xml <https://bugzilla.redhat.com/show_bug.cgi?id=1063506>
21:44 _dist semiosis: I only gave it to show side by side comparisons of --xml working on one command and not another. The bug is entirely about the bug glusterbot just told us about :) nothing to do with VMs
21:44 mik3 purpleidea: i don't know what you just said.
21:44 kmai007 joined #gluster
21:45 purpleidea mik3: okay let me try again:
21:45 cfeller joined #gluster
21:45 purpleidea mik3: my understanding is that you want HA (high availability) for nfs clients using some sort of storage, correct?
21:45 kmai007 question, gluster FUSE clients, can they essential mount up any brick regardless if its replicate-0 or replicate-1 ?
21:45 mik3 correct
21:46 mik3 (originally i was looking at gluster's native fuse client for this, we are currently using NFS)
21:46 _dist semiosis: this is the volume output from my prod, tuned as the redhat article asks https://dpaste.de/guWZ - I didn't want to mix the two issue xml/vs heal performance
21:46 glusterbot Title: dpaste.de: Snippet #256828 (at dpaste.de)
21:47 purpleidea mik3: okay, so at the moment, your setup involves an NFS server mounting storage which is ontop of gfs2,DRBD right?
21:47 mik3 correct
21:47 mik3 with floating IPs
21:48 purpleidea mik3: now where does your NFS server RUN? (i'm assuming you have an active/active 2 node cluster)
21:48 semiosis _dist: thank you for that.
21:49 purpleidea mik3: i'm guessing you probably use rgmanager or similar, right?
21:49 purpleidea mik3: let's use the term "vip" instead of floating anything.
21:49 semiosis _dist: you know, one possibility is that the heal list sees modifications and checks if those files need to be healed, even when they dont need to be.  not sure, but it's a thought
21:49 semiosis _dist: you could try turning up logging level (tho i'm not sure which one) maybe the shd will tell you if it's doing any *actual* healing, or just double checking consistency
21:50 purpleidea semiosis: what about ls <file> (to trigger a heal) and then checking heal info ?
21:50 semiosis _dist: idk
21:50 mik3 correct, right now there's an NFS server resident on each node because we have 2 shares being exported, prod and junk. one share resides on each node for resource balancing, so failover isn't going to work
21:50 mik3 so i'm trying to figure out a solution at all if possible to do this in active/active with nfs failover
21:50 _dist semiosis: when I first started using gluster for vm storage joejullian told me it was normal for all VMs to _always_ be healing. I might be wrongly paraphrasing him though cause that wouldn't only make sense in an async style replication to me
21:51 mik3 (i want to bring it back to an active/passive config, this nightmare of mine is a result of insane file availability requirement from devops)
21:52 purpleidea mik3: well the point i made above is that _NONE_ of the nfs solutions do proper failover atm.
21:52 purpleidea mik3: you said: 16:43 < mik3> purpleidea: in dual-active configurations it doesn't work but  nfs/ext4 with a floating ip in active/passive configuration works  fine, but the stateful data needs to be stored on the replicated  share
21:52 purpleidea mik3: I said, you're mistaken. it doesn't work fine.
21:52 purpleidea which brings us back to the reason we use the fuse client, or wait for pNFS support...
21:53 mik3 so every technical writeup suggesting HA NFS is wrong, including redhat?
21:53 purpleidea similar storage platforms all have the same problem, or are actually not HA.
21:53 purpleidea mik3: show me one
21:53 mik3 http://www.redhat.com/resourcelibrary/reference-architectures/Deploying-Highly-Available-NFS-on-Red-Hat-Enterprise-Linux-6
21:53 purpleidea but i think you're misunderstanding what is actually HA.
21:53 glusterbot Title: Red Hat | Deploying Highly Available NFS on Red Hat Enterprise Linux 6 (at www.redhat.com)
21:53 mik3 probably
21:53 mik3 i do that often
21:54 purpleidea (let me read)
21:54 failshell joined #gluster
21:55 mik3 i mean i've built drbd/ext3 based active/passive configs where failover worked perfectly fine
21:55 mik3 so i'm not sure what you mean
21:56 cp0k Hey guys, I just completed a Gluster upgrade from 3.3 to 3.4.2 and am now seeing the CPUs at 100% on the storage nodes...any idea what may be causing this?
21:56 purpleidea mik3: this is too big for me to browse atm, but i'm pretty sure that it's not really HA in terms of ensuring your client always has a working mount. if it's in the middle of something, then nfs will die or hang, and over udp, when the vip flips over, it will confuse nfs, but then eventually start over... understand?
21:56 fidevo joined #gluster
21:56 purpleidea mik3: you're mistaken that your failover worked fine!!
21:56 mik3 no, i'm not
21:56 mik3 you're wrong
21:56 semiosis lol
21:57 purpleidea mik3: okay, do you understand how the tcp connection tracking table works? you need this to be replicated if the failover is a real failover!
21:57 purpleidea mik3: i can be wrong. but until you prove otherwise...
21:57 mik3 uh
21:57 mik3 no
21:57 mik3 the burden of proof is on you
21:57 purpleidea mik3: okay, fair enough!
21:57 mik3 honestly i've lost interest in you
21:57 mik3 sorry
21:57 cp0k the gluster logs are scrolling alot of messages like this:
21:57 cp0k [2014-02-10 21:57:37.959285] E [afr-self-heal-data.c:1270:afr_sh_data_open_cbk] 0-th-tube-storage-replicate-0: open of <gfid:eb4a5526-348f-43fe-a42b-928cd3617dfa> failed on child th-tube-storage-client-1 (No such file or directory)
21:58 purpleidea mik3: it's okay, do this test: start a copy from your mount... a long copy. and then kill server with the vip. what happens?
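
(One way to run that test, as a sketch with hypothetical names: mount through the VIP, start a long write, then take down the node holding the VIP and watch the client.)

    # on the client
    mount -t nfs -o vers=3 vip.example.com:/vol1 /mnt/test
    dd if=/dev/zero of=/mnt/test/bigfile bs=1M count=20000 &
    # on the server currently holding the VIP
    poweroff    # or otherwise force the VIP to fail over
    # back on the client: note whether dd keeps going, hangs, or dies with an I/O error
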
21:58 Matthaeus1 joined #gluster
21:58 cp0k I am hoping this is because I do not have all my clients back online yet
21:59 pdrakeweb joined #gluster
22:00 mik3 purpleidea: i don't really care about uninterrupted read/writes, more so the filesystem access resumes after a failover
22:01 purpleidea mik3: okay, as long as you realize how that solution isn't HA. i'm glad we agree!
22:01 mik3 i'm sure redhat and the entire internet will be pounding on your door for consultation after they realize their mistake
22:02 purpleidea ,,(next)
22:02 glusterbot Another satisfied customer... NEXT!
22:03 cp0k ooo pick me pick me, still waiting for help :)
22:03 mik3 cp0k: hah good luck, all you'll find is pedantic geek nostril flaring
22:03 semiosis ooh look, coffee time
22:03 purpleidea +1
22:03 kmai007 purpleidea: i just want validation that I can mount up any brick regardless if its replicate-0 or 1
22:03 _dist purpleidea: Until there's a better hint from the --xml on heal info, is there a way I can know when its' safe to power off a node? Is there a way to check even for a specific file what the status is on all nodes?, but yeah take your time :)
22:04 semiosis cp0k: sorry if someone else cant help you i'll be back in a little bit... need coffee & to get some more actual work done, but i'll be on til late
22:04 cp0k thanks, much appreciate it
22:05 LessSeen joined #gluster
22:05 purpleidea cp0k: did you make sure to restart the glusterd on _each_ node?
22:06 cp0k yes, I did
22:06 cp0k of course
22:07 purpleidea cp0k: vijay has a few articles like: http://vbellur.wordpress.com/2013/07/15/upgrading-to-glusterfs-3-4/
22:07 cp0k yes, these are the instructions I used :)
22:08 purpleidea cp0k: i'm not familiar with that particular error, sorry.
22:08 cp0k Im hoping the errors about 'No such file or directory' are due to some of my clients still being Disconnected pending Gluster upgrade
22:09 cp0k but my biggest curiosity at this point is what the reason for the CPU spiking to 100%
22:09 purpleidea cp0k: can you ,,(paste) to see what the volume heal says...
22:09 glusterbot cp0k: For RPM based distros you can yum install fpaste, for debian and ubuntu it's pastebinit. Then you can easily pipe command output to [f] paste [binit] and it'll give you a URL.
22:09 purpleidea cp0k: gluster might be trying to heal things and verify things... not sure
22:09 cp0k purpleida: that is what I am thinking as well
22:09 cp0k maybe Gluster is just doing a massive self check
22:10 purpleidea gluster volume heal <volume> info
22:10 purpleidea | fpaste
22:10 cp0k it just hangs atm
22:10 purpleidea give it a chance... and pipe it to a textfile (it might be big)
22:11 cp0k purpleidea: I will try that and let you know how it goes, thanks
22:11 ProT-0-TypE joined #gluster
22:11 cp0k 127.0.0.1:/storage            189T  160T   19T  90% /storage
22:11 cp0k dealing with ALOT of data here :)
22:11 purpleidea yw, good luck. for fun, check that you don't have any selinux problems
22:11 cp0k # gluster volume heal th-tube-storage info
22:11 cp0k ~ #
22:12 cp0k returned nothing
22:12 purpleidea cp0k: oh thanks! you found my movie collection!
22:12 cp0k # gluster volume heal th-tube-storage info > storage_info
22:12 cp0k Another transaction is in progress. Please try again after sometime.
22:12 cp0k heh
22:12 purpleidea cp0k: what was the exit code?
22:12 cp0k none
22:12 purpleidea 0|1 ?
22:13 cp0k it didnt report an exit code at all
22:13 purpleidea cp0k: commands return an exit code. eg: false; echo $?
22:13 purpleidea true; echo $?
22:14 cp0k no exit code was printed to the terminal, maybe it is in the log files
22:14 cp0k which right now is scrolling like crazy
22:14 purpleidea cp0k: no
22:15 purpleidea cp0k: try running your heal info command again, and _right_ after, echo $?
22:15 purpleidea if it's zero, the command worked.
22:15 purpleidea if it's non zero, it means problem.
22:15 purpleidea standard shell type thing...
22:16 purpleidea cp0k: man bash (EXIT STATUS)
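
(A small sketch of what purpleidea is asking for, using the volume name from this log.)

    gluster volume heal th-tube-storage info > /tmp/heal-info.txt
    rc=$?
    echo "exit status: $rc"     # 0 means the command completed; non-zero means it failed
    wc -l /tmp/heal-info.txt    # rough size of the heal backlog it reported
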
22:16 cp0k when I try the command again, it tells me
22:16 cp0k Another transaction is in progress. Please try again after sometime.
22:17 cp0k will give it some time, in the meantime work on upgrading gluster to 3.4.2 on the remaining clients
22:20 failshel_ joined #gluster
22:27 purpleidea cp0k: if i had to guess: it was running (or is running) the heal command, which was taking a while because there's a lot of healing at the moment... you killed it, and it's still working on it... perhaps restart glusterd on each node, and try again, but don't kill it. and pipe it to a file like command > filename
22:27 recidive joined #gluster
22:27 purpleidea i've gotta go, but i hope that helps for now!
22:32 ninkotech_ joined #gluster
22:33 ninkotech joined #gluster
22:34 mik3 purpleidea: http://www.oracle.com/ocom/groups/public/@otn/documents/webcontent/2011281.pdf
22:35 mik3 During node transition events, the client may see a momentary pause in
22:35 mik3 the data stream for reads and writes, but will shortly resume operation as if there
22:35 mik3 was no interruption; no client side interaction is required
22:35 mik3 purpleidea: you should really go inform oracle/redhat/the internet that they're all wrong
22:36 purpleidea mik3: that's great that oracle can do this! do you get the same resume behaviour with your cluster?
22:36 mik3 i did before, yes
22:36 mik3 as i stated
22:36 sputnik13 joined #gluster
22:36 purpleidea well i guess i'm wrong, and i'm glad it's working well for you!
22:36 purpleidea (are you using tcp?)
22:36 mik3 the data is resident on the replicate filesystem, so when the secondary is promoted it resumes
22:37 mik3 you should really go read instead of spouting off anecdotal pedantic crap, you might actually provide someone with help
22:37 mik3 left #gluster
22:42 gdubreui joined #gluster
22:45 jmalm left #gluster
22:50 Matthaeus joined #gluster
22:50 ninkotech__ joined #gluster
22:54 badone_ joined #gluster
22:55 Matthaeus1 joined #gluster
23:01 DV joined #gluster
23:04 ProT-O-TypE joined #gluster
23:07 khushildep joined #gluster
23:23 recidive joined #gluster
23:25 JoeJulian purpleidea: Odd that someone would be upset that you were offering information that is "concerned with minor details and rules or with displaying academic learning." You'd think that would be the best kind of information when you're trying to make something work.
23:25 tdasilva joined #gluster
23:25 JoeJulian ... according to his own categorization.
23:26 purpleidea JoeJulian: i tried my best. I guess I did it wrong.
23:26 purpleidea upon reading the redhat HA nfs doc it looks like it's showing how to have the server failover, not the clients, and of course the clients would still die. also, it's using an EMC san as storage :P ... and then he left.
23:27 JoeJulian I don't take kindly to his attitude.
23:28 purpleidea JoeJulian: no worries. i didn't like it either, but he's gone now
23:31 semiosis he had another argument to get to
23:32 purpleidea right!
23:33 JoeJulian @ban rev@technothug.net
23:35 recidive joined #gluster
23:36 primechuck Client would probably die, and if they aren't using some special client, would have a stale mount upon resume of the new NFS session if anything was open when the connection was lost
23:36 DV joined #gluster
23:36 purpleidea primechuck: exactly
23:37 primechuck Unless you're using NFS 4.1 with Gluster...that might be the ticket :)
23:37 primechuck But I'm not sure how locking is handled in the event of a path or server failure
23:37 purpleidea primechuck: so we were all wondering what the story with pNFS is... and when this would all be available
23:38 purpleidea something something ganesha-nfs ... kkeithley said it's available as rpm's. haven't tested it though
23:39 primechuck I asked that in IRC a few weeks ago :)  when I got around to installing and hammering it with testing, the requirement for NFS was changed and I didn't look at it.
23:39 primechuck FUSE client FTW, if you don't mind some latency :)
23:40 purpleidea primechuck: i was just about to test this actually: if you start a long copy and then take down server1, then bring it up and take down server2, does it still keep chugging away?
23:42 primechuck In my testing with it using ganesha with libgfapi it didn't, but it looked to be something I configured wrong on the client side.  I still cannot discern if it is supposed to handle a path failure or if it just knows enough about the cluster to re-establish to another head end, without digging into the code.
23:43 purpleidea primechuck: will test this now :)
23:43 primechuck Cannot wait for the blog post saying, 'Don't buy NetApp for Vmware use this' :)
23:44 purpleidea so speaking of netapp+vmware ... don't use those :P
23:45 purpleidea primechuck: (testing the fuse client mount...) would love to hear about if the results are the same/different with libgfapi directly... test it!
23:45 purpleidea and ping me
23:46 primechuck Guess I omitted the part where we dropped the requirement for NFS completely and are just using fuse.
23:47 dbruhn joined #gluster
23:48 primechuck joined #gluster
23:48 primechuck and the part where the wireless drops
23:49 pixelgremlins_ba joined #gluster
