IRC log for #gluster, 2016-09-07

All times shown according to UTC.

Time Nick Message
00:02 deniszh joined #gluster
00:26 ankitraj joined #gluster
00:49 kukulogy joined #gluster
00:58 shdeng joined #gluster
01:02 deniszh joined #gluster
01:05 hagarth joined #gluster
01:40 Lee1092 joined #gluster
01:48 Javezim joined #gluster
01:49 Javezim Is anyone running Hyper-V VMs on Glusterfs? What kind of performance do you see? Does anyone have any recommendations for doing this? I figure utilizing iscsitarget on Ubuntu to map the Drive to the Windows Server host would be the best way
02:03 deniszh joined #gluster
02:08 skoduri joined #gluster
03:03 Gambit15 joined #gluster
03:03 deniszh joined #gluster
03:09 kramdoss_ joined #gluster
03:09 nishanth joined #gluster
03:11 el_isma joined #gluster
03:12 ZachLanich joined #gluster
03:13 magrawal joined #gluster
03:20 rideh joined #gluster
03:28 RameshN joined #gluster
03:43 plarsen joined #gluster
03:49 rafi joined #gluster
04:01 atinm joined #gluster
04:02 nbalacha joined #gluster
04:04 deniszh joined #gluster
04:05 riyas joined #gluster
04:05 msvbhat joined #gluster
04:06 itisravi joined #gluster
04:07 sanoj joined #gluster
04:07 shubhendu joined #gluster
04:14 shubhendu joined #gluster
04:21 itisravi joined #gluster
04:53 atinm joined #gluster
04:55 prth joined #gluster
04:55 aspandey joined #gluster
04:59 msvbhat joined #gluster
05:00 karthik_ joined #gluster
05:04 ashiq joined #gluster
05:06 aravindavk joined #gluster
05:08 ankitraj joined #gluster
05:11 arcolife joined #gluster
05:12 rafi joined #gluster
05:25 kdhananjay joined #gluster
05:27 jkroon joined #gluster
05:28 ppai joined #gluster
05:28 raghug joined #gluster
05:32 kotreshhr joined #gluster
05:38 atinm joined #gluster
05:39 Bhaskarakiran joined #gluster
05:41 hgowtham joined #gluster
05:41 coreping joined #gluster
05:45 mhulsman joined #gluster
05:49 mhulsman1 joined #gluster
05:52 hchiramm joined #gluster
05:56 Muthu joined #gluster
05:56 jiffin joined #gluster
05:57 Manikandan joined #gluster
05:58 harish joined #gluster
06:01 nishanth joined #gluster
06:01 arcolife joined #gluster
06:05 kukulogy joined #gluster
06:06 satya4ever joined #gluster
06:06 kukulogy joined #gluster
06:07 deniszh joined #gluster
06:10 kukulogy joined #gluster
06:13 derjohn_mob joined #gluster
06:17 Muthu_ joined #gluster
06:18 itisravi joined #gluster
06:29 jtux joined #gluster
06:43 k4n0 joined #gluster
06:48 rastar joined #gluster
06:48 karnan joined #gluster
06:49 devyani7 joined #gluster
06:52 kdhananjay joined #gluster
06:54 karthik_ joined #gluster
06:58 shubhendu joined #gluster
07:00 malevolent joined #gluster
07:00 xavih joined #gluster
07:01 nishanth joined #gluster
07:05 kshlm joined #gluster
07:10 jri joined #gluster
07:10 prth joined #gluster
07:16 jkroon joined #gluster
07:17 Saravanakmr joined #gluster
07:18 skoduri joined #gluster
07:20 fsimonce joined #gluster
07:25 deniszh joined #gluster
07:34 nbalacha joined #gluster
07:40 robb_nl joined #gluster
07:45 karnan joined #gluster
07:52 hackman joined #gluster
07:53 Slashman joined #gluster
07:56 derjohn_mob joined #gluster
08:07 jtux joined #gluster
08:09 deniszh joined #gluster
08:23 devyani7 joined #gluster
08:24 burn joined #gluster
08:31 raghug joined #gluster
08:32 itisravi joined #gluster
08:35 mhulsman joined #gluster
08:35 hchiramm joined #gluster
08:39 mhulsman1 joined #gluster
08:43 poornima joined #gluster
08:52 swebb joined #gluster
08:54 riyas joined #gluster
08:54 Bhaskarakiran joined #gluster
08:54 pkalever joined #gluster
08:54 poornima joined #gluster
08:54 rjoseph|afk joined #gluster
08:54 kdhananjay joined #gluster
08:54 itisravi joined #gluster
08:54 shruti joined #gluster
08:54 jiffin joined #gluster
08:54 ashiq joined #gluster
08:54 sac joined #gluster
08:54 magrawal joined #gluster
08:54 satya4ever joined #gluster
08:55 lalatenduM joined #gluster
08:55 skoduri|training joined #gluster
08:55 Saravanakmr joined #gluster
08:55 aspandey joined #gluster
08:55 sanoj joined #gluster
08:55 atinm joined #gluster
08:55 karnan joined #gluster
08:55 hchiramm joined #gluster
08:55 rastar joined #gluster
08:55 nbalacha joined #gluster
08:56 hgowtham joined #gluster
08:56 raghug joined #gluster
08:56 kshlm joined #gluster
08:56 Manikandan joined #gluster
08:56 RameshN joined #gluster
08:56 shubhendu joined #gluster
08:57 kotreshhr joined #gluster
08:59 nishanth joined #gluster
09:02 ppai joined #gluster
09:03 Muthu_ joined #gluster
09:05 jri joined #gluster
09:05 devyani7 joined #gluster
09:08 kdhananjay joined #gluster
09:09 shubhendu joined #gluster
09:12 arcolife joined #gluster
09:16 poornima joined #gluster
09:16 k4n0 joined #gluster
09:25 Gnomethrower joined #gluster
09:32 devyani7 joined #gluster
09:36 jkroon joined #gluster
09:40 msvbhat joined #gluster
09:43 rafi joined #gluster
09:49 itisravi joined #gluster
09:53 prth joined #gluster
09:56 mhulsman joined #gluster
10:00 shubhendu joined #gluster
10:08 sanoj joined #gluster
10:08 arcolife joined #gluster
10:08 kshlm joined #gluster
10:09 Muthu_ joined #gluster
10:09 ashiq joined #gluster
10:19 karnan joined #gluster
10:24 DV joined #gluster
10:26 nishanth joined #gluster
10:40 nbalacha joined #gluster
10:55 poornima joined #gluster
10:59 ppai joined #gluster
11:00 robb_nl joined #gluster
11:03 ira joined #gluster
11:14 coreping joined #gluster
11:18 Goonstah joined #gluster
11:21 raghug joined #gluster
11:25 nishanth joined #gluster
11:25 B21956 joined #gluster
11:29 karnan joined #gluster
11:33 Guest_84757 joined #gluster
11:33 Guest_84757 allah is doing
11:33 Guest_84757 sun is not doing allah is doing
11:33 Guest_84757 moon is not doing allah is doing
11:34 Guest_84757 stars are not doing allah is doing
11:34 post-factum amye: ^^
11:34 kdhananjay joined #gluster
11:34 Guest_84757 planets are not doing allah is doing
11:34 Guest_84757 galaxies are not doing allah is doing
11:34 Guest_84757 oceans are not doing allah is doing
11:34 post-factum hagarth: ^^
11:34 Guest_84757 mountains are not doing allah is doing
11:34 Guest_84757 trees are not doing allah is doing
11:35 post-factum JoeJulian: ^^
11:35 Guest_84757 mom is not doing allah is doing
11:35 Guest_84757 dad is not doing allah is doing
11:35 post-factum purpleidea: ^^
11:35 post-factum semiosis: ^^
11:35 Guest_84757 boss is not doing allah is doing
11:35 post-factum Guest_84757: Windows is not doing, Linux is doing
11:36 post-factum Guest_84757: there is no god except St. Patrick and Ganesha!
11:36 Guest_84757 job is not doing allah is doing
11:36 Guest_84757 dollar is not doing allah is doing
11:38 Guest_84757 degree is not doing allah is doing
11:38 Guest_84757 medicine is not doing allah is doing
11:38 beemobile joined #gluster
11:38 Guest_84757 customers are not doing allah is doing
11:39 Guest_84757 you can not get a job without the permission of allah
11:40 Guest_84757 you can not get married without the permission of allah
11:47 shubhendu joined #gluster
11:48 d0nn1e joined #gluster
11:50 ramky joined #gluster
11:54 kkeithley JoeJulian, hagarth: can you give nigelb channelop privs again. Somehow it didn't stick. Maybe in #gluster-dev too?
11:54 kshlm Weekly community meeting starts in 5 minutes in #gluster-meeting
11:55 TZaman joined #gluster
12:01 nobody481 Could anyone help me with compiling gluster on FreeBSD?  I've run ./autogen.sh successfully.  But then ./configure fails on every line starting with "PKG_CHECK_MODULES".  I can't figure out why.
12:02 kshlm nobody481, Do you have pkg-config installed?
12:02 kshlm Install it and run autogen again before running configure
12:10 nobody481 kshlm: I'll give that a try, thank you
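
A minimal sketch of kshlm's suggestion on FreeBSD (assuming the stock pkg package manager; the pkgconf package is what provides pkg-config there):

    pkg install -y pkgconf        # supplies the pkg-config that PKG_CHECK_MODULES needs
    ./autogen.sh && ./configure   # regenerate configure so the pkg-config macros are picked up
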
12:16 nobody481 Ok, that got me past that problem.  Now ./configure is stating error: cannot find input file: `libglusterfs/Makefile.in'
12:17 ppai joined #gluster
12:25 ndevos nobody481: the Makefile.in should get generated by ./autogen.sh... can you confirm it is missing?
12:26 nishanth joined #gluster
12:27 kkeithley did you rerun autogen.sh? Or just rerun ./configure?
12:27 unclemarc joined #gluster
12:28 nobody481 ndevos: Yes, it is missing.  The libglusterfs directory only has the file Makefile.am, and a directory named "src"
12:28 nobody481 kkeithley: I did rerun autogen.sh
12:28 jkroon @ kshlm - i'm prepping an upgrade to 3.7.15 so long.
12:29 ndevos nobody481: hmm, weird... is that in a git repository, or from a tarball?
12:29 nobody481 It was from the git repository.
12:29 kkeithley then I suspect that one of the auto* caches might be borked.  The easiest thing might be to untar a fresh source tree and run `./autogen.sh && ./configure` in that
12:30 nobody481 Ok I'll give that a try and see what happens...
12:30 kkeithley then clone a new tree and  run `./autogen.sh && ./configure` in that
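
A sketch of the fresh-tree route kkeithley describes, assuming git, the GNU autotools and GNU make (gmake on FreeBSD) are already installed:

    git clone https://github.com/gluster/glusterfs.git
    cd glusterfs
    ./autogen.sh && ./configure
    gmake
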
12:39 ashiq joined #gluster
12:42 k4n0 joined #gluster
12:45 johnmilton joined #gluster
12:59 kpease joined #gluster
13:01 shyam joined #gluster
13:04 nbalacha joined #gluster
13:06 bluenemo joined #gluster
13:17 jkroon kshlm, you still around?
13:18 kshlm I am, but I'm currently fixing the mess I did in the meeting.
13:19 kshlm Give me a little more time.
13:19 the-me joined #gluster
13:20 kshlm jkroon, Just ask what you need to ask. There are other people around.
13:21 jkroon kshlm, sure.
13:21 jkroon ok, so i've now upgraded back to kernel 4.6.4 (we had limited performance issues with 3.7.14 on 4.6.4), after a downgrade to 4.0.9 (last mdadm known good version) and subsequently to 3.19.5.  After that we now upgraded to 3.7.15.
13:22 jkroon I hope all that makes sense.
13:22 kukulogy joined #gluster
13:23 jkroon since the initial kernel downgrade from 4.6.4 to 4.0.9 we suddenly started sustaining a load average of around 120-140 (norm used to be in the <10 range).
13:23 jkroon now no matter what we do (same config) we cannot seem to get the load sub-100.
13:23 jkroon other than stopping courier-pop3d and imapd which takes away the demand in the first place and is effectively a DoS.
13:25 Philambdo joined #gluster
13:27 kshlm jkroon, I'm back.
13:27 kshlm I'm not sure how much help I can be, but I'll try.
13:28 kshlm For my clarity, your troubled setup right now is as follows
13:28 kshlm You have your mail-servers, serving mail out of a glusterfs volume.
13:29 kshlm The servers hosting the volume are running kernel 4.6.4.
13:29 kshlm And you're using glusterfs-3.7.15.
13:29 kshlm With a previous glusterfs version, downgrading the kernel reduced the load.
13:30 kshlm Now downgrading doesn't help.
13:30 jkroon /var/spool/mail is mounted on glusterfs yes
13:30 jkroon we had 3.7.14 on 4.6.4
13:31 jkroon a combination which gave us trouble (but nowhere near the current state of badness); in another cluster that was resolved by downgrading the kernel to avoid (suspected) mdadm problems.  The 4.1 kernel is known bad for mdadm and we've managed to deadlock every major version (latest) up to 4.6.4 since then, so we currently aim for the latest 4.0
13:31 jkroon previously the servers were on 3.19.5 and worked well.
13:32 jkroon so after a downgrade to 4.0.9 gave problems we reverted to 3.19.5 again.
13:32 jkroon this didn't solve the problem so we upgraded back to 4.6.4 in the hopes that since it's only the kernel that changed this should fix the problem.
13:32 jkroon it didn't.
13:32 jkroon I then actioned an upgrade to 3.7.15 after reading the release notes and seeing things like memory leak and other cpu bound fixes.
13:32 jkroon this still didn't resolve it.
13:33 kshlm Which was the first glusterfs version you saw the load with?
13:33 jkroon 3.7.14
13:33 kshlm Also, what type of volume are you using.
13:33 kshlm 3.7.13 + kernel 4.6 was good?
13:33 jkroon which was also perfectly acceptable on the 4.6.4 kernel (we actioned the downgrade two nights back from 4.6.4)
13:34 jkroon we never ran 3.7.13 - we jumped from 3.7.4
13:34 kshlm Oh man. That's a big range to figure out what changed.
13:35 jkroon 3.7.4 was ok on 3.19.5
13:35 jkroon well, i can tell you what changed when.
13:35 jkroon monday evening:  kernel from 4.6.4 to 4.0.9
13:35 jkroon yesterday afternoon we noticed the problem and I was only investigating since then.
13:36 jkroon this morning I downgraded to 3.19.5 again since that ran on the servers for years without problems.
13:36 jkroon when that didn't help either I upgraded back to 4.6.4 on kernel
13:36 jkroon and when that didn't help, glusterfs 3.7.14 => 15.
13:36 jkroon so yes - it's a wide range of crap I agree.
13:37 jkroon https://paste.fedoraproject.org/423369/25542914/
13:37 glusterbot Title: #423369 • Fedora Project Pastebin (at paste.fedoraproject.org)
13:37 jkroon gluster volume info and a few others.
13:38 jkroon what i'm suddenly worried about is that we might still have been running the 3.7.4 glusterfsd processes until monday evening ... since until recently we didn't realize those needed to be restarted separately.
13:38 skylar joined #gluster
13:38 kshlm Again, to make things clearer for me.
13:39 kukulogy joined #gluster
13:39 kshlm You had load issues with 3.7.4 + latest kernel, which got resolved by downgrading to an older kernel.
13:39 kshlm You upgraded to .14, and everything was working well.
13:39 jkroon glusterfs on 3.7.4 + 3.19.5 served us well for very long.
13:40 jkroon the kernel upgrade to 4.6.4 came first, when we first saw load spikes, but on the mail server it was acceptable so we didn't care too much.
13:40 jkroon in fact, we didn't even notice until we started seeing problems on the LAMP stack (which is currently running fine on kernel 3.19.5 and glusterfs 3.7.14)
13:40 jkroon so we decided to just upgrade to 3.7.14 everywhere.
13:41 jkroon which may have been a bad choice from a trouble-shooting perspective.
13:43 kshlm Lot to take in.
13:44 kshlm As far as I can make it out, something changed (probably in AFR) between 3.7.4 and 3.7.14 which is generating a lot of load on the servers.
13:44 jkroon what I find interesting, as soon as I move the imap+pop3 load to *one* of the two servers the load seems to gravitate towards that server, which to me indicates a fuse client issue ??
13:45 kshlm You're mounting on the servers?
13:45 jkroon jip, just moved the clustering on incoming connections to send all traffic to one of the two hosts and suddenly the load on the other server is dropping like a brick.
13:45 jkroon yes, so our bricks and clients are on the same physicals.
13:45 jkroon glusterfs was our choice to create a shared filesystem between the two hosts.
13:46 kshlm Oh.
13:46 kshlm This new observation is interesting.
13:46 jkroon sorry if i'm throwing too much information at you too quickly.
13:47 jkroon yea, load on "inactive" host dropped from around 130 to 43, seems to have stabilized now.
13:47 kshlm So the client process is creating the load.
13:47 jkroon load on "active" side has increased from around 120 to 150.
13:47 shyam joined #gluster
13:47 jkroon it would seem that way yes
13:49 kshlm Would be nice to actually see what the client is doing.
13:50 jkroon any information you need i'm authorized to provide you
13:51 kshlm I would like to know what fops are being generated by the client.
13:52 jkroon how do i gather that for you?
13:52 kshlm I'm guessing that there have been changes to AFR to increase reliability, which has increased the number of operations.
13:52 kshlm jkroon, No easy way from the client.
13:52 semiosis could it be a lot of healing going on in the background is causing the load?
13:52 jkroon from the server?
13:53 jkroon at the moment yes that's actually viable.
13:53 jkroon how safe is it to disable (kill) the shd given the above config?
13:54 jwd joined #gluster
13:54 kshlm https://github.com/gluster/glusterfsiostat was a GSOC project to get profile info on clients.
13:54 glusterbot Title: GitHub - gluster/glusterfsiostat: A tool to provide performance statistics similar to those given by nfsiostat about glusterfs mounts on a system through a standard CLI and visualization of data with a graphics processing utility. (at github.com)
13:54 kshlm Unfortunately doesn't have any docs.
13:54 jiffin1 joined #gluster
13:54 jkroon kshlm, if you can guide me.
13:54 kshlm jkroon, I'm unfamiliar with it as well.
13:55 jkroon and it's two years old ...
13:55 jkroon let's work on the shd theory for the moment.
13:55 kshlm jkroon, Can you find out which exact process is creating the load? Is it the fuse mount process or shd?
13:56 jkroon top shows that the brick processes and the mount point processes are using cpu.
13:56 kshlm And SHD?
13:57 semiosis a system load avg that high indicates to me that there are processes in D state (uninterruptible sleep) that are blocked, waiting for IO to complete.  do you see a lot of processes in D state?
13:58 jkroon 0% CPU and in state S so i'm guessing no.
13:58 jkroon semiosis, yes I do.
13:58 jkroon mostly imapd + pop3d processes.
13:58 Philambdo joined #gluster
13:58 jkroon so we're back at fuse mount.
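
One way to confirm semiosis's point about processes stuck in uninterruptible sleep driving up the load average, as a hedged sketch:

    # list and count processes currently in D state (uninterruptible sleep, usually blocked on I/O)
    ps -eo state,pid,comm | awk '$1 ~ /^D/ { print; n++ } END { print n, "processes in D state" }'
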
13:59 msvbhat joined #gluster
13:59 kshlm jkroon, I think I have a meeting now,
13:59 jkroon kk
13:59 kshlm I'll get back once I'm done.
14:00 jkroon thanks.  you wouldn't happen to have an approximate ETA?
14:00 kshlm ~45 minutes.
14:00 semiosis i've been out of the loop for a few gluster versions, but there should be a command to see how many files are pending healing.  my guess is that you have a lot of files pending healing, and the imapd/pop3d processes trying to access files that need healing are blocked until the heal is completed.  if you can get gluster to show you how many files are pending healing that might confirm the theory
14:02 semiosis jkroon: gluster volume heal VOLNAME info
14:03 semiosis https://access.redhat.com/documentation/en-US/Red_Hat_Storage/2.0/html/Administration_Guide/sect-User_Guide-Managing_Volumes-Self_heal.html
14:03 glusterbot Title: 10.8. Triggering Self-Heal on Replicate (at access.redhat.com)
14:03 jkroon semiosis, yes, there are entries there.
14:03 semiosis a lot?
14:04 jkroon more than what there was earlier today
14:04 jkroon about 65 on each brick, was at 0/0 pre glusterfs upgrade
14:04 jkroon and peaked on the one server at 145
14:05 jkroon 66/2 currently
14:06 semiosis jkroon: ok, new theory, if your fuse mount was not connected to a brick, then any writes to files on that brick would need to be healed, causing the number of pending heals to go up.  check your client for missing brick connections.  the way i would do that is looking at the fuse client log file, and using netstat to check if all the tcp connections are alive.
14:07 semiosis @ports
14:07 glusterbot semiosis: glusterd's management port is 24007/tcp (also 24008/tcp if you use rdma). Bricks (glusterfsd) use 49152 & up. All ports must be reachable by both servers and clients. Additionally it will listen on 38465-38468/tcp for NFS. NFS also depends on rpcbind/portmap ports 111 and 2049.
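
A sketch of the check semiosis describes; the client log path is an assumption based on the usual mount-point-derived log name for a /var/spool/mail mount:

    ps ax | grep '[g]lusterfs'                    # PID of the fuse mount process
    netstat -tnp | grep glusterfs                 # expect ESTABLISHED connections to every brick port (49152 and up)
    grep -i disconnect /var/log/glusterfs/var-spool-mail.log | tail   # recent brick disconnects, if any
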
14:09 semiosis i have a meeting coming up.  gotta go.  will check back later.  good luck!
14:09 jiffin1 joined #gluster
14:15 plarsen joined #gluster
14:15 plarsen joined #gluster
14:18 JoeJulian semiosis!!! :D
14:19 ankitraj joined #gluster
14:20 kpease joined #gluster
14:28 jkroon semiosis, sorry, i understand that, fuse client needs to connect to both bricks.
14:28 jkroon only writing to one brick results in heal being required.
14:36 Philambdo joined #gluster
14:37 arcolife joined #gluster
14:38 jkroon i cannot find that i'm only connected to one brick.
14:38 jkroon on either server.
14:39 satya4ever joined #gluster
14:40 robb_nl joined #gluster
14:41 semiosis jkroon: did you confirm with netstat that the fuse client process has a tcp connection established to both brick ,,(ports)
14:41 glusterbot jkroon: glusterd's management port is 24007/tcp (also 24008/tcp if you use rdma). Bricks (glusterfsd) use 49152 & up. All ports must be reachable by both servers and clients. Additionally it will listen on 38465-38468/tcp for NFS. NFS also depends on rpcbind/portmap ports 111 and 2049.
14:42 johnmilton joined #gluster
14:42 nobody481 joined #gluster
14:42 ic0n joined #gluster
14:42 delhage joined #gluster
14:42 crashmag joined #gluster
14:42 ebbex_ joined #gluster
14:42 frakt joined #gluster
14:42 The_Ball joined #gluster
14:42 gvandeweyer joined #gluster
14:42 jesk joined #gluster
14:42 javi404 joined #gluster
14:42 ItsMe`` joined #gluster
14:42 wistof joined #gluster
14:42 jkroon semiosis, yes, i did.
14:42 jkroon also, netstat doesn't show any build-up of rx or tx queue.
14:42 jkroon also checked the nic statistics, no rx or tx errors anywhere.
14:43 jkroon the volume heal info does vary quite dramatically in terms of number of files listed.
14:43 jkroon it goes up and down quite a bit.
14:43 lkoranda joined #gluster
14:48 jkroon [2016-09-07 14:48:25.238302] W [MSGID: 108008] [afr-read-txn.c:244:afr_read_txn] 0-mail-replicate-0: Unreadable subvolume -1 found with event generation 2 for gfid e31aafca-4873-4582-a08c-247203b540e1. (Possible split-brain)
14:49 jkroon so that might explain load.
14:57 kukulogy joined #gluster
14:59 prth joined #gluster
15:00 harish joined #gluster
15:00 Bhaskarakiran joined #gluster
15:02 semiosis jkroon: are your system clocks in sync between the two servers?
15:03 semiosis gotta run.  bbl.
15:04 rwheeler joined #gluster
15:05 jkroon within few ms at least yes.
15:05 jkroon had that problem before ...
15:05 jkroon both sync to the same ntp upstream servers.
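
A quick way to double-check the clock claim on both servers, assuming classic ntpd with its query tool installed:

    ntpq -pn    # offsets should be a few ms and both hosts should list the same upstream servers
    date -u     # eyeball comparison across the two hosts
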
15:06 arcolife joined #gluster
15:08 eMBee joined #gluster
15:10 riyas joined #gluster
15:12 kshlm jkroon, You're getting those message without any obvious network issues?
15:14 kshlm jkroon, I'll suggest that you put all this down in a mail to gluster-users/gluster-devel. And cc pkarampu@redhat.com the AFR maintainer.
15:14 kshlm He will be able to help you with what's happening.
15:14 kshlm I still believe it's changes to AFR that's increased the load.
15:16 wushudoin joined #gluster
15:17 kshlm jkroon, I need to leave now. I hope I've been of some help.
15:19 aravindavk joined #gluster
15:20 nathwill joined #gluster
15:24 ZachLanich joined #gluster
15:24 shubhendu joined #gluster
15:34 skoduri joined #gluster
15:35 Bhaskarakiran joined #gluster
15:46 devyani7 joined #gluster
15:47 Lee1092 joined #gluster
15:49 karnan joined #gluster
15:49 ppai joined #gluster
15:49 anmol joined #gluster
15:54 robb_nl joined #gluster
15:55 nobody481 Is there any way to force gluster version 3.7.15 to peer with gluster version 3.7.6?  Right now I can't do it, "peer probe: failed: Peer does not support required op-version"
16:00 hagarth joined #gluster
16:01 ashiq_ joined #gluster
16:03 jiffin joined #gluster
16:04 madmatuk Hi all, i am seeing serious io on one of our disks in our 2 node replicate 2 cluster. I disconnected all clients, but this is still happening and on both servers. ~99 %Util and ~88 r/s using iostat -x 5, any idea what might be causing this?
16:08 ndevos nobody481: if you start with a gluster pool running 3.7.6 you should be able to peer probe a server running 3.7.15
16:09 rafi joined #gluster
16:09 jri_ joined #gluster
16:09 ndevos nobody481: possibly you can change the op-version too, but I'm not sure it allows downgrading... something like 'gluster volume set all cluster.op-version 30706' might work
16:21 nobody481 ndevos: I hadn't thought of that, good idea.  I could try initiating the peer probe from the 3.7.6 machine......
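
For reference, a sketch of inspecting and raising the cluster op-version ndevos mentions; the glusterd.info path assumes the default /var/lib/glusterd working directory:

    grep operating-version /var/lib/glusterd/glusterd.info   # current cluster op-version
    gluster volume set all cluster.op-version 30706          # 30706 corresponds to 3.7.6
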
16:21 ndevos nobody481: obviously the best way would be to have the same versions everywhere...
16:22 ndevos madmatuk: maybe you can identify the process doing the i/o? glustershd (self-heal-daemon) might be running and doing its thing
16:23 nobody481 ndevos: I've been trying all morning to get 3.7.15 to compile on the FreeBSD box, no luck at all
16:23 madmatuk ndevos i thought something like that might be happening, i did a "gluster volume status" and there was nothing that suggested a heal was in process?
16:23 ndevos nobody481: oh, right that was the freebsd issue...
16:24 ndevos madmatuk: it's better to check the logs for healing activity, /var/log/gluster/glustershd.log iirc
16:25 ndevos nobody481: we actually have simple compile/run tests on FreeBSD, so its a little unexpected that it doesnt even build for you
16:25 madmatuk ndevos /var/log/glusterfs/glustershd.log is empty :(
16:26 nobody481 ndevos: It's probably something I'm doing wrong.  But I just got all 4 installations to peer when I initiated it from the 3.7.6 machine, so I'm going to move forward for now
16:26 ndevos madmatuk: uh, ok, not sure what could be the cause of that, normally it would at least meantion the starting of the daemon
16:26 ndevos nobody481: ok, good luck!
16:27 Bhaskarakiran joined #gluster
16:27 msvbhat joined #gluster
16:27 madmatuk ndevos - typical everything goes wrong for me lol - the volume status says N/A for the self heal ... Self-heal Daemon on localhost               N/A       N/A        Y       7100
16:27 madmatuk same for the other
16:28 Gambit15 joined #gluster
16:29 ndevos madmatuk: I think the N/A is for the port that it uses, there is none for self-heal :)
16:29 madmatuk ahh ok :) good to know
16:30 madmatuk any idea how i can identify what is doing all these reads on this specific brick pair?
16:30 ndevos madmatuk: maybe you can use dstat or iotop or some tool like that to identify the process doing the i/o
16:30 madmatuk i will investigate! :) thanks ndevos
16:31 ndevos oh, if it is a brick pair, you may see something with 'gluster volume status client' or the 'gluster volume profile ...' or such
16:34 madmatuk ndevos ok i will have a read about that volume profile, gluster volume status client doesn't work "Volume client does not exist"
16:34 TZaman joined #gluster
16:36 ndevos madmatuk: check the 'gluster volume help' output, I'm not sure what the exact command is
16:36 * ndevos leaves for the day, will be back tomorrow
16:37 madmatuk ndevos sure, i will do that :) thanks for the help, have a good one! :)
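
The commands ndevos is pointing at look roughly like this (a sketch; substitute the real volume name):

    gluster volume status <VOLNAME> clients   # which clients are connected to each brick
    gluster volume profile <VOLNAME> start    # begin collecting per-brick FOP statistics
    gluster volume profile <VOLNAME> info     # per-brick read/write counts and latencies
    gluster volume profile <VOLNAME> stop
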
16:38 baojg joined #gluster
16:38 robb_nl joined #gluster
16:40 k4n0 joined #gluster
16:52 nbalacha joined #gluster
16:59 armyriad joined #gluster
17:03 jiffin joined #gluster
17:08 hchiramm joined #gluster
17:11 hagarth joined #gluster
17:18 StarBeast joined #gluster
17:21 shyam joined #gluster
17:22 baojg joined #gluster
17:24 robb_nl joined #gluster
17:27 TZaman joined #gluster
17:37 jkroon -?????????  ? ?    ?       ?            ? courierpop3dsizelist
17:38 jkroon i'm seeing entries like that but no heal info ...
17:43 arcolife joined #gluster
18:01 shaunm joined #gluster
18:13 nobody481 I just configured 3 peers, created the storage volume, and started the volume.  I saw that the new directory was created on all 3 peers.  An hour ago I created 20 dummy files on one of the peers but they haven't shown up on the other 2.  What gives?
18:14 jkroon how did you copy the files in?
18:14 jkroon perhaps you're distributing instead of replicating?
18:14 semiosis nobody481: you are writing the files through a client mount point?
18:16 nobody481 I went into the directory on one of the peers and created them with "touch".  I created the volumes with the "replica" option
18:16 nobody481 semiosis: No I didn't write the files through a client mount point.  I created them on the peer itself, I just moved to the directory and used the "touch" utility
18:17 semiosis nobody481: you shouldn't access the bricks directly once they're used by glusterfs.
18:17 semiosis nobody481: make a client mount point then read/write in it
18:17 nobody481 Ahh ok, newbie mistake
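
A minimal sketch of what semiosis means; hostname, volume name and mount point are placeholders:

    mkdir -p /mnt/gv0
    mount -t glusterfs server1:/gv0 /mnt/gv0
    cd /mnt/gv0 && touch dummy{1..20}   # written through the client mount, so they replicate to all bricks
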
18:22 ChrisHolcombe joined #gluster
18:39 sandersr joined #gluster
18:47 rastar joined #gluster
18:48 nobody481 Ok, hopefully my last question.  I have gluster installed on 3 Ubuntu machines and 1 FreeBSD machine.  I initiated the peers from the FreeBSD machine to the 3 Ubuntu machines, they all show up fine.  DNS entries are in /etc/hosts on all 4 machines.  When I try to create the volume on the FreeBSD machine, it gives me an error that the FreeBSD machine "is not in 'peer in Cluster' state", even though it
18:48 nobody481 shows up as "State: Peer in Cluster (Connected)" on the 3 Ubuntu machines.  Any ideas there?
19:03 JoeJulian nobody481: 1st, make sure they're all running the same version. Someone was having a problem with that between freebsd and something yesterday. Second, check the peer states from the freebsd box.
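
The checks JoeJulian suggests, run on the FreeBSD node (a sketch):

    glusterd --version    # should match the version on the Ubuntu peers
    gluster peer status   # peer state as seen from this node
    gluster pool list     # compact view of UUIDs, hostnames and connected state
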
19:16 raghu joined #gluster
19:16 jiffin joined #gluster
19:27 k4n0 joined #gluster
19:34 jkroon JoeJulian, i'm developing that sinking gluster feeling again.
19:35 jkroon seeing the weirdest split brains, normally a split-brain results in some form of io error on the client, so now i get directory split brains, turns out there are files on one brick, but not the other, so I run stat on those, and the files get created (as empty files) on the brick where they're missing, and the directory remains in split brain...
19:35 jkroon bizarre
19:36 jkroon my love/hate with gluster continues.  when it works well i love it, but on the days when it has moods i really seriously dislike it.
19:41 JoeJulian jkroon: Yeah, I understand. Obviously the solution is to find the root cause of these. Clearly something is losing connection along the way.
19:43 deniszh joined #gluster
19:43 jkroon JoeJulian.  Yea.  I agree.  But I'm at a loss.  The machines are gbit connected and all traffic between them is allowed.
19:43 jkroon no firewall nothing.
19:44 jkroon the only thing that I know caused problems was the disk IO mdadm stuff.
19:44 jkroon but a kernel downgrade really should have sorted all that out.
19:46 JoeJulian Anything in the client logs?
19:47 JoeJulian .. also, what version have you landed on?
19:51 jkroon 3.7.15
19:51 jkroon what are we looking for in the client logs?  mine are quite busy.
19:53 jkroon heck, I've got gfid's in split brains where even if I manually look at all of the 5 files in those folders, I still can't find a difference between them.
19:53 jkroon folder is split brain according to gluster volume heal ... info split-brain, but the folder is working perfectly.
19:58 JoeJulian jkroon: in the client logs, I'd be looking for "connection" being lost or regained. Maybe there's a clue in the timing.
19:58 deniszh1 joined #gluster
19:59 JoeJulian Different gfids on a directory sounds like a race condition.
20:01 jkroon no, same gfid both sides ... files missing on one side, but even after a stat of those files the split brain remains ...
20:02 jkroon 317 splits to go ...
20:08 jkroon ok, it seems just running a stat on those folders /* resolves it at least a portion of the time.
20:10 jkroon at least the load averages are starting to come down as I fix the split brains, which is at least encouraging.
20:10 JoeJulian From that description, I would guess that it works once its directory entries match.
20:15 jkroon yea, auto merge
20:16 shyam joined #gluster
20:19 jkroon what would happen in the case of a split brain where you simply rm the gfid file and not the linked file?
20:20 jkroon down to 240 split brains.  by simply iterating over gluster volume heal .. info split-brain, doing a readlink on the gfid files if they're symlinks (folders) and running a stat on them and their content.
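
A hedged bash sketch of the loop jkroon describes; the brick path is an assumption (it is not shown in the log), the volume name "mail" and the /var/spool/mail mount come from the discussion above, and the .glusterfs/<aa>/<bb>/<gfid> layout with directories stored as symlinks is the standard brick layout:

    BRICK=/path/to/brick          # assumption: real brick path not shown in the log
    MOUNT=/var/spool/mail
    gluster volume heal mail info split-brain |
      sed -n 's/.*<gfid:\([0-9a-f-]*\)>.*/\1/p' | sort -u |
      while read gfid; do
        g="$BRICK/.glusterfs/${gfid:0:2}/${gfid:2:2}/$gfid"
        if [ -L "$g" ]; then                      # directory gfids are symlinks
          dir=$(readlink -f "$g")                 # resolves to the directory on the brick
          rel=${dir#$BRICK/}
          stat "$MOUNT/$rel" "$MOUNT/$rel"/* >/dev/null 2>&1   # lookup via the fuse mount triggers a heal attempt
        fi
      done
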
20:22 jkroon [client-handshake.c:1224:client_setvolume_cbk] 0-mail-client-0: Server and Client lk-version numbers are not same, reopening the fds
20:22 glusterbot jkroon: This is normal behavior and can safely be ignored.
20:23 jkroon go figure ...
20:23 sandersr joined #gluster
20:24 jkroon [afr-common.c:4299:afr_notify] 0-mail-replicate-0: Subvolume 'mail-client-1' came back up; going online.
20:24 jkroon so why did it go down to begin with?
20:27 jkroon JoeJulian, it would seem the self-heal daemon is in some kind of loop.
20:27 jkroon it keeps trying to clear a time split brain on the same three gfid values over and over and over again
20:28 jkroon https://paste.fedoraproject.org/423616/14732801/ that just loops over and over and over again
20:28 glusterbot Title: #423616 • Fedora Project Pastebin (at paste.fedoraproject.org)
20:38 JPaul joined #gluster
20:43 ten10 joined #gluster
20:43 ten10 Anyone here use glusterfs for redundancy on iscsi stores?
20:44 jkroon cluster.locking-scheme <-- has anyone set this to granular compared to full?
20:44 glusterbot jkroon: <'s karma is now -25
20:45 jkroon I'm reasonably sure I don't need afr-v1 compatibility.
20:50 ten10 dang was hoping atleast 1 person was using it that I could ask some questions :(
20:54 semiosis ten10: keep asking questions, someone might answer
20:57 ten10 well, right now (for better or worse) I'm using large files for iscsi data stores rather than a partition or volume.. was hoping to be able to have a redundant iscsi server that I could fail over to on the fly if I need to upgrade the main iscsi server...
20:57 JoeJulian Won't be me. I avoid iscsi as much as possible.
20:58 JoeJulian Should be possible though. I know you can use tgt with gluster volumes and the bd translator. Not sure how that works with multipathing though.
21:00 ten10 i was using NFS for my vmware datastores but had some issues with slowness.. I guess it was due to the sync write for nfs or something.. I'm not a huge fan of vmware but have a home lab for learning, etc..
21:01 ten10 ultimately i guess I could just manually move them before hand
21:02 JoeJulian If you're just learning, then try it out. If you get it working, blog about it. Next thing you know you're not just learning but you're the expert.
21:05 post-factum if you dont get it working, file a bugreport
21:05 glusterbot https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS
21:05 post-factum thanks glusterbot u r so cute
21:14 ten10 yeah that's the plan.. was looking for words of encouragement from others that might have tried ;)
21:17 shortdudey123 joined #gluster
21:29 om joined #gluster
21:34 johnmilton joined #gluster
21:55 johnmilton joined #gluster
22:06 hagarth joined #gluster
22:27 bluenemo joined #gluster
22:59 arcolife joined #gluster
23:01 bkolden joined #gluster
23:04 clyons joined #gluster
23:10 plarsen joined #gluster
23:19 om joined #gluster
23:57 kukulogy joined #gluster
