IRC log for #gluster, 2015-08-14


All times shown according to UTC.

Time Nick Message
00:26 victori joined #gluster
00:27 Telsin_ joined #gluster
00:35 vmallika joined #gluster
00:35 calisto joined #gluster
00:39 jobewan joined #gluster
00:47 nangthang joined #gluster
00:57 jcastillo joined #gluster
00:59 Pupeno joined #gluster
01:00 topshare joined #gluster
01:01 Pupeno_ joined #gluster
01:05 Pupeno joined #gluster
01:07 Pupeno joined #gluster
01:08 harish joined #gluster
01:11 gildub joined #gluster
01:19 Pupeno joined #gluster
01:25 Pupeno joined #gluster
01:27 Pupeno_ joined #gluster
01:27 theron joined #gluster
01:28 finknottle joined #gluster
01:29 finknottle Does the first brick in a replica group have special importance ? When client side quorum is set to auto, if the first brick is down, the files on that replica group become read only
01:30 Pupeno joined #gluster
01:31 finknottle I'm trying to convert all the bricks, which are thick lvm volumes, to thin ones. And I'm trying to take bricks down one at a time for converting the bricks to thin lvm
01:32 finknottle but client side quorum makes it harder. I can always disable it, but i wanted to know if there is a risk of data corruption/loss with that
01:32 Pupeno joined #gluster
01:34 davidself joined #gluster
01:37 Lee1092 joined #gluster
01:42 calisto joined #gluster
01:45 Pupeno joined #gluster
01:45 JoeJulian Without quorum, just ensure that the self-heals have finished before taking down the next server.
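A minimal check of the advice above before taking down the next server (a sketch; "myvol" is a placeholder volume name, not one from this log):

    # list entries still pending heal on each brick; empty output means it is safe to proceed
    gluster volume heal myvol info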
01:47 ilbot3 joined #gluster
01:47 Topic for #gluster is now Gluster Community - http://gluster.org | Patches - http://review.gluster.org/ | Developers go to #gluster-dev | Channel Logs - https://botbot.me/freenode/gluster/ & http://irclog.perlgeek.de/gluster/
01:50 Pupeno joined #gluster
01:50 finknottle Ok. Also, for moving from thick to thin, I'm using rsync to backup the data. So far i haven't noticed any problems with that. is that safe ?
01:51 finknottle I'm using "rsync -v -av -A -X -H"
01:51 finknottle which i think preserves everything that needs to be preserved, unless i've missed something
01:51 JoeJulian use --inplace
01:52 JoeJulian Otherwise it makes a temp filename which doesn't hash out to the correct dht subvolume when it's renamed back.
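A hedged sketch of the rsync invocation being discussed, with --inplace added so temp filenames don't land on the wrong dht subvolume (the brick paths are placeholders, not paths from this log):

    # -a -A -X -H preserve permissions/times, ACLs, xattrs and hardlinks; --inplace keeps the final filename during transfer
    rsync -avAXH --inplace /bricks/old-thick-lv/ /bricks/new-thin-lv/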
01:52 Pupeno joined #gluster
01:53 finknottle Oh. I've already converted a couple of bricks this way. Do I need to delete them, and let self heal sort it out ? The main reason for doing rsync was to avoid self heal
01:53 finknottle which seriously bogs the system down
01:54 _Bryan_ joined #gluster
01:55 Pupeno_ joined #gluster
01:57 finknottle is there a way to check if the previous rsyncs have done any damage ?
02:02 harish joined #gluster
02:04 finknottle @JoeJulian, any ideas ?
02:06 Pupeno joined #gluster
02:07 topshare joined #gluster
02:08 haomaiwa_ joined #gluster
02:15 finknottle So I tried modifying a file, and the modification went through fine to the brick which was restored after rsync.
02:16 haomaiwa_ joined #gluster
02:17 Pupeno joined #gluster
02:21 Pupeno_ joined #gluster
02:32 Pupeno joined #gluster
02:37 Pupeno joined #gluster
02:41 Pupeno joined #gluster
02:45 Pupeno_ joined #gluster
02:46 Pupeno joined #gluster
02:50 topshare joined #gluster
02:55 topshare joined #gluster
02:55 suliba joined #gluster
02:57 Pupeno joined #gluster
03:01 haomaiwa_ joined #gluster
03:06 Pupeno joined #gluster
03:08 bharata-rao joined #gluster
03:08 Pupeno_ joined #gluster
03:13 calisto joined #gluster
03:20 dgandhi joined #gluster
03:22 Pupeno joined #gluster
03:28 shubhendu joined #gluster
03:28 Pupeno joined #gluster
03:35 johnmark joined #gluster
03:36 atinm joined #gluster
03:41 TheSeven joined #gluster
03:54 sakshi joined #gluster
03:59 itisravi joined #gluster
04:02 7GHAAYF6W joined #gluster
04:03 overclk joined #gluster
04:04 Manikandan_ joined #gluster
04:05 Manikandan joined #gluster
04:05 corretico_ joined #gluster
04:08 kshlm joined #gluster
04:11 corretico_ joined #gluster
04:14 corretico_ joined #gluster
04:16 kanagaraj joined #gluster
04:16 nbalacha joined #gluster
04:22 neha joined #gluster
04:28 RameshN joined #gluster
04:29 jwd joined #gluster
04:32 jwaibel joined #gluster
04:32 corretico joined #gluster
04:38 corretico_ joined #gluster
04:41 yazhini joined #gluster
04:41 anil joined #gluster
04:45 deepakcs joined #gluster
04:46 ashiq joined #gluster
04:47 gem joined #gluster
04:47 hgowtham joined #gluster
04:50 anil left #gluster
04:52 kdhananjay joined #gluster
04:55 yosafbridge joined #gluster
04:56 rafi joined #gluster
04:58 mikemol joined #gluster
05:02 haomaiwang joined #gluster
05:05 neha_ joined #gluster
05:06 mikemol joined #gluster
05:29 Manikandan_ joined #gluster
05:32 vimal joined #gluster
05:35 martinliu joined #gluster
05:40 R0ok_ joined #gluster
05:43 nishanth joined #gluster
05:47 ramky joined #gluster
05:47 Manikandan joined #gluster
05:48 aravindavk joined #gluster
05:52 elico joined #gluster
05:58 vmallika joined #gluster
05:59 martinliu joined #gluster
06:01 Saravana_ joined #gluster
06:02 haomaiwa_ joined #gluster
06:02 jwd joined #gluster
06:03 kdhananjay joined #gluster
06:10 marnom joined #gluster
06:11 Manikandan_ joined #gluster
06:13 anil joined #gluster
06:20 raghu joined #gluster
06:30 davidself joined #gluster
06:40 maveric_amitc_ joined #gluster
06:42 yoavz joined #gluster
06:44 neha joined #gluster
06:52 ramky joined #gluster
07:01 kshlm joined #gluster
07:02 haomaiwa_ joined #gluster
07:04 Slashman joined #gluster
07:16 PaulCuzner joined #gluster
07:18 karnan joined #gluster
07:19 rjoseph joined #gluster
07:22 hgowtham joined #gluster
07:28 nangthang joined #gluster
07:37 ninkotech__ joined #gluster
07:46 ashiq joined #gluster
07:52 PaulCuzner left #gluster
07:57 LebedevRI joined #gluster
08:01 haomaiwa_ joined #gluster
08:16 ws2k3 joined #gluster
08:19 Pupeno joined #gluster
08:30 tanuck joined #gluster
08:52 ghenry joined #gluster
08:52 ghenry joined #gluster
08:54 deniszh joined #gluster
09:01 haomaiwa_ joined #gluster
09:04 ctria joined #gluster
09:06 Guest54599 joined #gluster
09:08 Guest54599 Hi. I did a standard setup. Here are my thoughts: I have one directory and want to share it over 2 nodes. I did a "peer probe" - everything works fine. All files in this folder (mounted via mount.glusterfs) are synced over the 2 nodes. But if I shut down one of them, the files are no longer available. What did I do wrong?
09:16 Trefex joined #gluster
09:16 Trefex dear all. i upgraded gluster to 3.7.3 thinking it would solve my issue, but it didn't, fix-layout doesn't work
09:16 shubhendu joined #gluster
09:17 Trefex i get a failed status and then that's that
09:17 nishanth joined #gluster
09:17 Trefex any ideas on how to fix it?
09:18 Trefex here is the relevant log from rebalance.log http://paste.fedoraproject.org/255057/14395439/
09:19 glusterbot Title: #255057 Fedora Project Pastebin (at paste.fedoraproject.org)
09:19 ashiq joined #gluster
09:20 gem joined #gluster
09:20 neha joined #gluster
09:22 Romeor Guest54599, are you running HA or Distributed volume?
09:24 Guest54599 HA ... I did a 'gluster vol create volname replica 2 node1:/dir/brick node2:/dir/brick'
09:26 Romeor and gluster version is?
09:26 Guest54599 and both of the machines have iptables and selinux deactivated. They can see each other, but if node1 goes down, node2 has problems. It is weird, because sometimes I can access the folder and sometimes I can't.
09:26 ashiq- joined #gluster
09:26 Romeor did you check logs?
09:26 Guest54599 glusterfs-3.7.3 from el6
09:27 Guest54599 it's CentOS
09:27 Guest54599 yes, there is nothing to see
09:29 Guest54599 Does gluster depend on more than one node? Is there a problem if one node goes down and the other has to restart the service, with an unreachable brick in the gluster environment?
09:49 kshlm joined #gluster
09:53 Trefex anyone?
10:02 haomaiwa_ joined #gluster
10:02 nishanth joined #gluster
10:06 shubhendu joined #gluster
10:07 baojg joined #gluster
10:08 owlbot joined #gluster
10:19 badone_ joined #gluster
10:19 gem joined #gluster
10:20 owlbot joined #gluster
10:27 baojg joined #gluster
10:28 Debloper joined #gluster
10:31 DV joined #gluster
10:34 kshlm joined #gluster
10:36 harish joined #gluster
10:41 yazhini left #gluster
10:48 overclk joined #gluster
10:50 vmallika joined #gluster
10:59 shyam joined #gluster
11:09 jrm16020 joined #gluster
11:19 Manikandan_ joined #gluster
11:26 overclk joined #gluster
11:42 vmallika joined #gluster
11:42 arcolife joined #gluster
11:51 hgowtham joined #gluster
11:53 B21956 joined #gluster
12:02 Trefex hello. Can anybody help me with my fix-layout issue?
12:11 _Bryan_ joined #gluster
12:16 autoditac_ joined #gluster
12:21 theron joined #gluster
12:21 calisto joined #gluster
12:21 unclemarc joined #gluster
12:24 kanarip joined #gluster
12:31 rafi Trefex: what is your issue with fix-layout ?
12:31 Trefex rafi: oh thanks and hai :) My issue is found here: http://paste.fedoraproject.org/255057/14395439/
12:32 glusterbot Title: #255057 Fedora Project Pastebin (at paste.fedoraproject.org)
12:32 Trefex rafi: the problem is that the fix-layout fails after 20,000 seconds every time
12:32 * rafi is reading the logs
12:32 Trefex rafi: i did upgrade my gluster now to 3.7.3 and still have the same issue. Basically, I added a new node to my existing 2-node gluster setup, and read that i should fix-layout and then rebalance the data
12:33 rafi Trefex: that's true
12:33 rafi Trefex: let me go through the logs
12:33 gildub joined #gluster
12:34 MarceloLeandro joined #gluster
12:34 rafi Trefex: did you complete the upgrade operation on all the nodes ?
12:34 Trefex rafi what's the upgrade operation? o0
12:35 rafi Trefex: you said you upgraded to 3.7.3, right ?
12:35 Trefex rafi yeah as you can see http://paste.fedoraproject.org/255110/55571914
12:35 glusterbot Title: #255110 Fedora Project Pastebin (at paste.fedoraproject.org)
12:36 rafi Trefex: OK, just clarifying , so your trusted storage pool contains three nodes running 3.7.3
12:37 Trefex rafi yes correct
12:37 Trefex and 1 controller node
12:37 Trefex doing the nfs, samba and so forth
12:37 autoditac_ joined #gluster
12:38 rafi Trefex: Earlier your glusterfs was 3.3, right ?
12:38 Trefex 3.6
12:38 Trefex @ rafi
12:38 glusterbot Trefex: I do not know about 'rafi', but I do know about these similar topics: 'rtfm'
12:39 rafi Trefex: [2015-08-13 21:36:05.535730] C [rpc-clnt-ping.c:161:rpc_clnt_ping_timer_expired] 0-live-client-4: server 192.168.123.106:49156 has not responded in the last 42 seconds, disconnecting.
12:39 rafi Trefex: line number 11
12:39 Trefex rafi yeah this is strange right
12:40 MarceloLeandro_ joined #gluster
12:40 rafi Trefex: :)
12:41 Trefex rafi but how could this happen?
12:41 Trefex it's more or less reproducible
12:41 rafi Trefex: wait
12:43 rafi Trefex: http://www.spinics.net/lists/gluster-devel/msg16468.html
12:43 glusterbot Title: Re: glusterfs-3.7.3 released Gluster Development (at www.spinics.net)
12:44 Trefex rafi: mhhhh but all my gluster daemons are 3.7.3
12:45 rafi Trefex: ya,
12:45 Trefex so then i don't get this workaround :)
12:45 rafi Trefex: All bricks are running, right ?
12:45 Trefex ye
12:46 Trefex rafi https://paste.fedoraproject.org/255116/56385143/
12:46 glusterbot Title: #255116 Fedora Project Pastebin (at paste.fedoraproject.org)
12:47 mpietersen joined #gluster
12:50 theron joined #gluster
12:50 rafi Trefex: give me some time,
12:50 rafi Trefex: I will take a look
12:51 Trefex rafi: ok no problem, please let me know, if there's any more information I could provide
12:51 rafi Trefex: sure
12:51 Trefex rafi: also, could you just enlighten me, how come there is already 150 GB of data on the "new" node?
12:51 Trefex i thought no data could go to new node until it was completely layout-fix'ed ?
12:51 rafi Trefex: there is a rebalance going on, right ?
12:52 Trefex rafi yeah, but only a fix-layout one
12:53 Trefex rafi used gluster volume rebalance live fix-layout start to start the process, first layout, then data, or so i think ?
12:54 rafi Trefex: yes
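For reference, a hedged sketch of the usual sequence after adding a brick ("live" is the volume name used in this log):

    # spread the directory layout onto the new brick first
    gluster volume rebalance live fix-layout start
    # check per-node progress and failures
    gluster volume rebalance live status
    # once fix-layout has completed, migrate existing data as well
    gluster volume rebalance live start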
12:54 rafi Trefex: was there any data already on bricks ?
12:54 Trefex but the mount was available and data was written to the volume already; according to my understanding no data should end up on the new node until the layout fix is finished?
12:54 rafi Trefex: before attaching ?
12:54 Trefex no, it was a clean wipe
12:54 Trefex then we opened the mount point and started fix-layout
12:56 rafi Trefex: did you check the files created in the new bricks from the backend ?
12:56 Trefex rafi check how ?
12:57 autoditac__ joined #gluster
12:58 autoditac_ joined #gluster
12:59 autoditac_ joined #gluster
12:59 Trefex can data be written to the new node for the "folders" that were already "layout-fix"'ed ?
13:00 rafi Trefex: I guess so, are there any i/o going on ?
13:01 jdossey joined #gluster
13:01 chirino joined #gluster
13:01 Trefex not now, i unmounted everything to have a "clean" thing, until we fix this issue
13:01 shyam joined #gluster
13:02 rafi Trefex: can you on profiling
13:02 Trefex rafi not sure how ?
13:02 rafi can you turn on profiling
13:03 rafi Trefex: gluster v profile volname start
13:03 Trefex rafi during the rebalance now ?
13:03 rafi Trefex: yes
13:04 rafi Trefex: just turn it on
13:04 social joined #gluster
13:04 Trefex rafi ok done
13:04 rafi Trefex: after some time, gluster v profile volname info
13:04 owlbot joined #gluster
13:07 Trefex rafi http://paste.fedoraproject.org/255122/14395576 can also wait longer
13:07 glusterbot Title: #255122 Fedora Project Pastebin (at paste.fedoraproject.org)
13:10 nsoffer joined #gluster
13:16 julim joined #gluster
13:19 jobewan joined #gluster
13:21 rafi Trefex: are you there ?
13:21 Trefex rafi and an updated one http://paste.fedoraproject.org/255136/58485143
13:21 glusterbot Title: #255136 Fedora Project Pastebin (at paste.fedoraproject.org)
13:21 Trefex rafi yes i'm here
13:21 rafi ok
13:22 rafi Trefex: can you give me a volume status from the node where you see this error ?
13:23 Norky joined #gluster
13:25 Trefex rafi this is from the new node  http://paste.fedoraproject.org/255138/39558700
13:25 glusterbot Title: #255138 Fedora Project Pastebin (at paste.fedoraproject.org)
13:31 rafi Trefex: now rebalance is running on every node ?
13:32 Trefex rafi well i think so
13:33 rafi Trefex: is it still happening, or happened one time ?
13:33 Trefex rafi happened 2-3 times with 3.6 and 1 time with 3.7, and it never finished successfully
13:34 Trefex rafi but it takes around 20k-30k seconds to happen
13:35 aaronott joined #gluster
13:35 rafi Trefex: did you restart after the failure ?
13:36 Trefex rafi i restarted all machines (warm reboot) and trying now again
13:37 plarsen joined #gluster
13:37 rafi Trefex: we are suspecting some corner-case issues; I can write a patch and send you a custom build
13:37 Trefex rafi: sure, i'm willing to test anything you think could improve
13:38 rafi Trefex: this ping_timer_expired issues are very hard to reproduce
13:39 rafi Trefex: it might happen if there is a deadlock in the server process, or a lot of fsyncs are happening, or if there are a lot of packets queued to be sent over the network; due to network congestion the timer might expire
13:40 rafi Trefex: but nothing seems to be a proven explanation for your problem
13:41 rafi Trefex: Since I don't have a correct RCA for this, what we could think of is treating ping packets as the highest-priority packets and adding them to the top of the queue,
13:41 rafi Trefex: I'm not sure whether it will solve your problem or not,
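The 42 seconds in the log line quoted earlier is the default network.ping-timeout. Raising it is not a fix for whatever is blocking the brick, but as a hedged stopgap it gives a briefly unresponsive server more headroom (sketch; "live" is the volume from this log):

    # default is 42 seconds; clients disconnect a brick only after this long with no ping reply
    gluster volume set live network.ping-timeout 60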
13:41 Trefex rafi what's RCA?
13:42 rafi Trefex: root cause analysis
13:42 Trefex rafi what i saw was that I had a soft lockup on ksoftirqd/0 on that "new" node
13:43 Trefex rafi eg 1 CPU was "stuck" at 100% CPU on that thread
13:43 Trefex after reboot i think i also went to new kernel, now running on 3.10.0-229.11.1.el7.x86_64
13:45 theron joined #gluster
13:46 rafi Trefex: :(
13:46 rafi Trefex: are you seeing this timer issue on new node only, or from every node ?
13:47 Trefex rafi well i think the log files are the same on each node for the rebalance task no?
13:47 rafi Trefex: each rebalance task/process are independent
13:48 Trefex rafi I don't understand, i started the rebalance from the controller node
13:48 rafi Trefex: what do you mean by controller node ?
13:49 Trefex i have 3 nodes that make my storage, and one node which exports the mount point
13:49 Trefex on that export node i also have a gluster daemon running, but no storage pools attached
13:49 Trefex eg from the logs that node is called "lowlander/highlander"
13:50 rafi Trefex: how many node are there in peer status ?
13:50 rafi Trefex: only three , right ?
13:50 Trefex yeah three
13:50 rafi Trefex: and you are only able to trigger rebalance from any of these three nodes
13:51 Trefex no i can trigger it from the control node as well
13:51 Trefex rafi but to answer your question, i checked the logs on the 3 peers, and each had rebalance failed at different timepoints but with the same "timeout" error of the same server (.106)
13:52 rafi Trefex: a rebalance process will start running on every node that is hosting a brick for that volume, and they are independent
13:53 Trefex yep so in that case, all 3 failed, with timeout on .106 server at different timepoints
13:53 rafi Trefex: so it looks like the problem is with the .106 server :)
13:54 rafi Trefex: is there any fsync operation happening on that node ?
13:54 rafi Trefex: sorry, there shouldn't be
13:54 Trefex rafi here is logs from all 3 machines http://paste.fedoraproject.org/255155/95604491 please check the entries from 2015-08-13
13:54 glusterbot Title: #255155 Fedora Project Pastebin (at paste.fedoraproject.org)
13:55 anil joined #gluster
13:56 Trefex rafi i should have provided it earlier, i didn't realize the rebalance was independent
13:57 bennyturns joined #gluster
13:57 rafi Trefex: np :), each process will do fix-layout for their own bricks
13:57 Manikandan_ joined #gluster
13:59 Trefex rafi mhhh and they communicate that info to other nodes? Because there's only 150 GB on the "new" node which means it should be quite fast for that one ?
13:59 Twistedgrim joined #gluster
14:01 Peppard joined #gluster
14:02 rafi Trefex: i don't understand, what is this 150GB of data in the new brick ?
14:02 Trefex rafi no clue, some new data it seems
14:02 rafi Trefex: Was there any I/O during the fix-layout, before ?
14:02 rafi Trefex: in parallel ?
14:03 Trefex rafi it's possible there was I/O during the fix and after (when it failed)
14:03 Trefex exit
14:03 Trefex ls
14:03 Trefex oups
14:03 rafi Trefex: then, new i/o will redirect to the new layout
14:03 Trefex rafi which was "failed"
14:03 rafi Trefex: for those we already did a fix-layout
14:04 Trefex ah ok
14:04 rafi Trefex: but not for all directory, right ?
14:04 Trefex correct
14:04 Trefex ok so that explains the new "data"
14:04 rafi Trefex: could be :)
14:04 Trefex but it should not affect the current fix-layout or?
14:05 rafi Trefex: nop
14:05 rafi Trefex: looks like I have enough data to explore; the ping_timer_expired problem is something weird, because there are a lot of things which can lead to this,
14:05 Trefex rafi ok, should we stay in contact on IRC or somewhere else where we can build trace ?
14:07 rafi Trefex: anything is fine for me, i will be here on IRC, or else you can send a mail to the gluster-devel mailing list and maybe put me on cc
14:07 rafi Trefex: rkavunga@redhat.com
14:08 rafi Trefex: gluster-devel@gluster.org
14:08 Trefex rafi let's stay in contact by IRC for now then. Perhaps I'll mail the list if it fails again (which I'm pretty sure it will)
14:10 rafi Trefex: is it so important for you ?
14:11 Trefex rafi well the way I understand is that I can't really use the new node unless the fix-layout is finished?
14:11 rafi Trefex: true
14:11 Trefex then yes, it's important :)
14:12 rafi Trefex: I have one fix in-mind, I will write it as soon as possible
14:12 Trefex rafi the node is 130 TB usable storage, so not so small :)
14:12 rafi Trefex: I understand :)
14:12 rafi Trefex: it is getting a little late in India today,
14:13 Trefex i understand :)
14:14 rafi Trefex: I will give one fix as soon as possible
14:14 Trefex sounds good :)
14:14 vimal joined #gluster
14:15 harish joined #gluster
14:16 overclk joined #gluster
14:20 rafi Trefex: see you
14:21 Trefex rafi bye raf, thanks for the help so far!
14:21 rafi Trefex: :)
14:22 mator joined #gluster
14:22 mator joined #gluster
14:25 haomaiwa_ joined #gluster
14:29 anil joined #gluster
14:32 shaunm joined #gluster
14:43 ekuric joined #gluster
14:47 jdossey joined #gluster
14:47 timotheus1 joined #gluster
14:52 MarceloLeandro__ joined #gluster
14:55 overclk joined #gluster
15:09 dgandhi joined #gluster
15:10 atinm joined #gluster
15:12 hagarth joined #gluster
15:13 anant1991 joined #gluster
15:13 anant1991 Hi there
15:13 anant1991 I just want to ask a question
15:14 anant1991 if I have mounted my gluster share using  " gluster1:/vol0 "
15:14 anant1991 then somehow if gluster1 goes down
15:15 anant1991 then my server is unable to access the gluster share
15:15 anant1991 and I want it to be automatically served by another server, "gluster2"
15:15 baojg joined #gluster
15:15 anant1991 is it possible ?
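With a replica volume this is usually handled at mount time: the server named in the mount is only used to fetch the volume configuration, and a fallback can be given for that step; once mounted, the client talks to all replicas directly. A hedged sketch using the names from the question (the mount point is a placeholder, and the option spelling varies by release, e.g. backup-volfile-servers in newer versions):

    # gluster2 is tried for the volfile if gluster1 is unreachable
    mount -t glusterfs -o backupvolfile-server=gluster2 gluster1:/vol0 /mnt/gluster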
15:18 _maserati joined #gluster
15:19 anant1991 any one active ?
15:19 anant1991 I have a question
15:21 jobewan joined #gluster
15:34 cholcombe joined #gluster
15:41 chirino joined #gluster
15:47 wushudoin| joined #gluster
15:48 neofob left #gluster
15:48 marlinc joined #gluster
15:50 marlinc I'd like to set up GlusterFS between offices to sync home directories on the file servers. Would it cause issues when one of the offices goes offline for a while? How would GlusterFS resolve the issue? Would the offline office be resynced once it comes online?
15:50 kdhananjay joined #gluster
15:53 wushudoin| joined #gluster
15:55 overclk joined #gluster
15:55 ramky joined #gluster
16:00 chirino joined #gluster
16:04 jdossey joined #gluster
16:08 mckaymatt joined #gluster
16:13 overclk joined #gluster
16:16 rafi joined #gluster
16:24 baojg joined #gluster
16:29 baojg joined #gluster
16:29 overclk joined #gluster
16:30 jockek joined #gluster
16:39 neofob joined #gluster
16:42 Rapture joined #gluster
16:51 overclk joined #gluster
16:51 trav408 joined #gluster
16:53 chirino joined #gluster
16:54 corretico joined #gluster
16:54 neofob joined #gluster
16:56 gem joined #gluster
16:58 gem joined #gluster
17:03 vimal joined #gluster
17:07 Twistedgrim joined #gluster
17:15 mckaymatt joined #gluster
17:15 overclk joined #gluster
17:15 R0ok_ joined #gluster
17:18 JoeJulian marlinc: That's not an optimal application of clustered filesystems.
17:18 JoeJulian The latency is an issue and you have a good potential for split-brain.
17:19 marlinc How about the geo-replication part of GlusterFS?
17:20 JoeJulian unidirectional
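For completeness, geo-replication is configured one way, from a master volume to a slave volume; a hedged sketch of the usual commands (all names are placeholders):

    gluster volume geo-replication mastervol slavehost::slavevol create push-pem
    gluster volume geo-replication mastervol slavehost::slavevol start
    gluster volume geo-replication mastervol slavehost::slavevol status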
17:20 marlinc Ah, I couldn't get that out of the documentation
17:21 marlinc What would be a good solution? I can't seem to find much
17:23 JoeJulian Yeah, there's the whole CAP theorem problem with that. I've never found a solution I'm truly happy with. The closest I got (with windows machines) was having everyone use rdp with hosted VMs.
17:30 techsenshi joined #gluster
17:31 techsenshi im currently running a rebalance of a volume but I've discovered some folders which are being reported empty to clients
17:31 marlinc I've been looking at using DRBD with some file system that supports write from multiple places at the same time
17:32 techsenshi looking through, I can see the folders are not empty on our original nodes, but the folders are empty on the new nodes (in the various bricks)
17:32 JoeJulian marlinc: I hope you're more successful than I was with drbd. After the 3rd time it destroyed all our data, I never looked back.
17:32 marlinc Ouch, how so? And what FS did you run on top of it?
17:33 marlinc How long was that ago?
17:33 JoeJulian 6 years
17:34 JoeJulian techsenshi: folders are created before files are moved... not sure on the client issue though. Any non-standard volume settings?
17:34 techsenshi just server.allow-insecure
17:35 JoeJulian marlinc: I was running ext4 at the time. I was more interested in redundancy than multi-site for that one though.
17:35 techsenshi i'm just going to copy the files from the bricks back to the volume for now
17:35 techsenshi i'm curious if this is a larger issue though...
17:35 JoeJulian Make sure they're not 0 size mode 1000.
17:36 techsenshi how can I check that?
17:36 JoeJulian ls
17:36 JoeJulian Well.. ls -l
17:39 techsenshi on the bricks the data looks good copying from the bricks to a new folder on the volume right now.
17:39 jbautista- joined #gluster
17:40 marlinc I think I'm going to try out DRBD with OCFS2
17:40 techsenshi searching the rebalance log I dont get any hits on the path
17:40 JoeJulian odd
17:41 JoeJulian I'm not a huge cheerleader for the quality of the rebalance process. I've rarely had success rebalancing - but never had data disappear from the clients. That's a new feature.
17:42 JoeJulian Check the client log, see if there's a clue there.
17:42 JoeJulian Heck, maybe even the brick logs
17:42 JoeJulian (another good reason for log aggregation, eg. logstash)
17:43 techsenshi is there any way to figure out if the directory had data on any other bricks? other than manual search
17:43 JoeJulian no
17:45 jbautista- joined #gluster
17:45 techsenshi okay.. and what about cleaning up the bricks?  can I just delete the data from the bricks after I copied it back
17:46 JoeJulian In theory there will be entries in the .glusterfs tree for the gfid of every file. The gfid is in the ,,(extended attributes).
17:46 glusterbot (#1) To read the extended attributes on the server: getfattr -m .  -d -e hex {filename}, or (#2) For more information on how GlusterFS uses extended attributes, see this article: http://pl.atyp.us/hekafs.org/index.php/2011/04/glusterfs-extended-attributes/
17:48 marlinc After some more researching I found out that DRBD only supports two nodes, without doing some crazy stuff
17:49 techsenshi can i run a self-heal for a particular directory not the whole volume?
17:49 mckaymatt joined #gluster
17:50 JoeJulian It's mapped as .glusterfs/XX/YY/XXYYNNNNNNNNNNNNNNNNNNNNNNNNNNNN for instance a gfid of 0a6cdff5ec3e4b7aa0f0aab1e43ed0a6 will have an entry in .glusterfs/0a/6c/0a6cdff5ec3e4b7aa0f0aab1e43ed0a6
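A hedged sketch of checking that mapping by hand on a brick (paths are placeholders; depending on the version the entry under .glusterfs may use the dashed UUID form rather than the packed form shown above):

    # read the gfid xattr of a file directly on the brick
    getfattr -n trusted.gfid -e hex /bricks/b1/some/dir/file
    # e.g. trusted.gfid=0x0a6cdff5ec3e4b7aa0f0aab1e43ed0a6
    # the first two byte pairs pick the subdirectories under the brick's .glusterfs tree
    ls -l /bricks/b1/.glusterfs/0a/6c/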
17:50 JoeJulian Sure, but the problem is when the filenames aren't returned to the client.
17:50 jobewan joined #gluster
17:51 JoeJulian If they are, it's a simple find | xargs stat
17:52 JoeJulian Without the filenames, you might be able to make a list of filenames from the bricks and try stat'ing them from the client to see if they show up.
17:52 techsenshi ah so I should stat them from the client, not the brick
17:53 JoeJulian Sorry, right.
17:54 techsenshi thanks for your help never done this so not sure I understand
17:54 techsenshi so from the client i go to one of the empty dirs and run find | xargs stat, which prints info about the block device etc
17:54 JoeJulian When a file is accessed from the client, a health check is done. If the file needs healing, the client will start healing the file in the background*.
17:56 techsenshi hmm so in my case the files are nested in deeper folders which the client does not see
17:56 JoeJulian stat is a simple utility that calls fstat64 on a file which triggers two file ops, one of which is a lookup() call. That lookup is what triggers the self-heal check. Since stat doesn't trigger a whole lot of other fops, it's fairly efficient.
17:57 JoeJulian That's what I was suggesting. Try making the client do a lookup() on one of the missing files or folders. See if it recognizes it's issue and fixes it.
17:57 techsenshi oooh interesting so from the client if I manually cd deeper into the folder structure and do an ls then things start to appear
17:58 techsenshi yup now I get it
17:58 JoeJulian mounting over again might also cure it.
17:58 techsenshi we primarily just share share it out via samba gluster vfs
17:58 JoeJulian I wonder if that's a vfs bug.
17:59 techsenshi i think the vfs "bug" would be that its not triggering the self-heal for whatever my issue is
18:17 Philambdo joined #gluster
18:19 techsenshi really annoying having to go into various folders to trigger the self-heal; over vfs it's not happening, i have to manually enter a path on a client and from there i'm okay
18:21 JoeJulian techsenshi: What if you did a fuse mount, like on the server, and checked it there?
18:23 techsenshi thats what I'm currently doing, no way of knowing what other directories I'll need to do this process on though
18:26 edong23 joined #gluster
18:29 shaunm joined #gluster
18:30 JoeJulian What I would do is prepare a find on each brick (find $brick_root >/tmp/foo) then copy them all to the machine with the client mount (foo1,foo2,etc), combine them and use that (sort -u < /tmp/foo* | xargs stat 2>&1 > /dev/null ) and use the
18:33 techsenshi from the find results how would you query the client to dir into each folder?
18:34 unclemarc joined #gluster
18:35 cliluw Is there any API to create and manage Gluster volumes instead of through the command line interface?
18:41 mckaymatt joined #gluster
18:48 theron joined #gluster
18:54 tg2 curious - what happens if you add a brick to a volume and that brick has files that are not from gluster?
18:54 tg2 i know it is preferable to add the files through the mount so they get tagged and stored in the right place for their respective hash
18:59 theron joined #gluster
19:06 mckaymatt joined #gluster
19:23 techsenshi kk had to do a little reformatting from the find statements due to our files/folders with spaces
19:23 techsenshi but that really does help
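For reference, a hedged expansion of the find/stat recipe above that records brick-relative paths and copes with spaces in names (the brick root and fuse mount point are placeholders):

    # on each brick host: list paths relative to the brick root, skipping gluster's internal tree
    (cd /bricks/b1 && find . -not -path './.glusterfs/*' > /tmp/foo.$(hostname))
    # on the machine with the fuse mount (after copying the /tmp/foo.* lists over):
    # stat every name to trigger a lookup and the self-heal check
    cd /mnt/gluster
    sort -u /tmp/foo.* | while IFS= read -r p; do stat "$p" > /dev/null 2>&1; done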
19:33 JoeJulian techsenshi: awesome. :)
19:33 JoeJulian cliluw: not yet
19:34 kovshenin joined #gluster
19:34 JoeJulian tg2: If they're the left-hand brick (as added from the cli in replica sets) they will likely be integrated into the volume and self-healed to the other replica(s).
19:34 JoeJulian It's considered "undefined behavior" but it's been done many times.
19:35 tg2 i have a brick I can't remove gracefully for some reason
19:35 tg2 its a legacy brick form like 3.3.something days
19:36 tg2 and after upgrading it has some issues
19:36 tg2 i can't remove-brick on it
19:36 tg2 i want to try a replace-brick but I don't have another disposable brick
19:36 tg2 I'm guessing it will fail tho
19:40 ctria joined #gluster
19:46 wushudoin| joined #gluster
19:54 cliluw JoeJulian: I don't need anything fancy like a RESTful API. Can I create volumes with libgfapi?
19:56 JoeJulian cliluw: nope. I've brought up the idea, but I doubt it's even feasible to start working on it until glusterd is rewritten.
19:57 cliluw JoeJulian: Darn, that's unfortunate.
19:58 JoeJulian Just wrap the CLI.
19:58 JoeJulian That's what other projects have done (see ovirt).
19:58 cliluw JoeJulian: That's my plan B. I was hoping there was an API.
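A minimal sketch of wrapping the CLI, using flags the gluster command provides for scripting (volume and brick names are placeholders):

    # --mode=script suppresses interactive confirmation prompts; --xml emits machine-parseable output
    gluster --mode=script volume create myvol replica 2 node1:/bricks/b1 node2:/bricks/b1
    gluster --mode=script volume start myvol
    gluster --xml volume info myvol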
20:01 ghenry joined #gluster
20:03 wushudoin| joined #gluster
20:23 ctria joined #gluster
20:30 volga629 joined #gluster
20:47 calavera joined #gluster
21:06 mpietersen joined #gluster
21:30 tg2 or just rewrite glusterd and submit a patch
21:30 tg2 ;)
21:40 trav408 joined #gluster
21:41 chirino joined #gluster
21:42 volga629 joined #gluster
21:42 chirino_m joined #gluster
21:55 TheCthulhu1 If I have /home/ mounted as a gluster volume, and I have a user called "james" for example that I want to set a quota on, is the command "gluster volume quota test-volume limit-usage /home/james/ 10GB" or "gluster volume quota test-volume limit-usage /james 10GB", the wording on the page confuses me
21:57 dlambrig joined #gluster
21:59 JoeJulian TheCthulhu1: paths are from the volume root.
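So for the example in the question, the path is given relative to the volume root rather than the client's /home mount point (commands as in the question itself):

    gluster volume quota test-volume enable
    gluster volume quota test-volume limit-usage /james 10GB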
22:00 theron_ joined #gluster
22:02 mckaymatt joined #gluster
22:06 jvandewege_ joined #gluster
22:06 techsenshi to avoid any further issues with my volume rebalance, would I have better results if I upgraded to 3.6.x?
22:06 techsenshi currently running 3.5.5
22:07 JoeJulian likely, yes. I know there's been a lot of work done there.
22:07 cyberbootje1 joined #gluster
22:08 techsenshi heh but then I risk running into some other problems err bugs, is 3.5 considered more stable than 3.6?
22:08 JoeJulian No
22:08 tg2 techsenshi,  i had a similar problem
22:08 tg2 upgrading further didn't help
22:08 tg2 went to 3.6.2 and the rebalance still failed
22:08 tg2 I think it's a bad brick tbh
22:09 JoeJulian Yeah, 3.6.3+ is where I started recommending 3.6. 3.6.4 is current.
22:09 techsenshi kk i'm just not certain as to what would cause my rebalance to leave files not available to the samba vfs, really weird
22:10 techsenshi scripted a client to check some smaller directories and now samba clients see all the files, so i guess the rebalance is working
22:10 JoeJulian I agree, it's weird. Especially if they show back up when accessed.
22:12 techsenshi the rebalance on 3.6, can it do bricks in parallel? seems like in 3.5 it's one brick at a time
22:13 tg2 3.6.4 is not in the ubuntu repo
22:13 Rapture joined #gluster
22:13 cyberbootje2 joined #gluster
22:13 tg2 or maybe its 3.7 i'm thinking of
22:14 badone_ joined #gluster
22:17 jrm16020 joined #gluster
22:20 neofob left #gluster
22:32 JoeJulian @ppa
22:32 glusterbot JoeJulian: The official glusterfs packages for Ubuntu are available here: 3.5: http://goo.gl/6HBwKh 3.6: http://goo.gl/XyYImN 3.7: https://goo.gl/aAJEN5 -- See more PPAs for QEMU with GlusterFS support, and GlusterFS QA releases at https://launchpad.net/~gluster -- contact semiosis with feedback
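A hedged sketch of installing from one of those PPAs on Ubuntu (the exact PPA name is an assumption based on the launchpad.net/~gluster link above; check the linked page for the series you want):

    sudo add-apt-repository ppa:gluster/glusterfs-3.6
    sudo apt-get update && sudo apt-get install glusterfs-server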
22:36 daMaestro joined #gluster
22:48 ToMiles joined #gluster
22:57 calavera joined #gluster
23:15 dlambrig joined #gluster
23:22 plarsen joined #gluster
23:39 tg2 how stable is 3.7 now
23:44 techsenshi anybody notice performance increase from 3.5 to 3.6?
23:45 srepetsk just upgraded to 3.7.2 in my environment yesterday
23:47 patnarciso_ good evening all. (or morning depending on how worldly you are)
23:59 tg2 techsenshi - there should be one
23:59 tg2 srepetsk, from?
23:59 tg2 i went from 3.3 to 3.4 to 3.5 to 3.6 in series
23:59 tg2 went surprisingly well
