IRC log for #gluster, 2015-10-28

All times shown according to UTC.

Time Nick Message
00:00 bennyturns joined #gluster
00:00 jvandewege joined #gluster
00:01 nangthang joined #gluster
00:05 night joined #gluster
00:08 bennyturns joined #gluster
00:14 hgichon joined #gluster
00:22 Rapture joined #gluster
00:31 calavera joined #gluster
00:37 k-ma joined #gluster
00:49 deniszh joined #gluster
00:51 gildub joined #gluster
00:55 k-ma joined #gluster
00:55 deniszh joined #gluster
00:56 mlhamburg joined #gluster
01:05 vimal joined #gluster
01:12 deniszh1 joined #gluster
01:12 deniszh1 left #gluster
01:27 zhangjn joined #gluster
01:28 GB21 joined #gluster
01:30 Lee1092 joined #gluster
01:36 gem joined #gluster
01:50 Pupeno joined #gluster
02:10 harish joined #gluster
02:10 nangthang joined #gluster
02:15 zhangjn joined #gluster
02:17 mjrosenb I've asked this before, but I don't remember enough of the answer to find it again
02:17 mjrosenb I done messed up the .glusterfs directory, and I've heard there is a way to get gluster to rebuild it
02:17 mjrosenb but I don't know how to do that.
02:21 julim joined #gluster
02:21 zhangjn joined #gluster
02:23 rideh joined #gluster
02:26 ghenry joined #gluster
02:26 ghenry joined #gluster
02:26 mjrosenb it looks like stating the file is enough to trigger it?
02:33 julim joined #gluster
02:35 julim joined #gluster
02:36 mjrosenb it is definitely not working for me :-(
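
(On the .glusterfs question above: a lookup through the client mount is what normally recreates the missing gfid link, so the sketch below stats the file on the mount rather than on the brick. The mount point and brick path are assumptions.)

    # force a lookup through the client mount, not directly on the brick
    stat /mnt/gluster/path/to/damaged/file

    # on the brick, a healthy regular file has at least 2 links (its name plus the .glusterfs gfid hardlink)
    stat -c '%h %n' /bricks/b1/path/to/damaged/file
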
02:52 gem joined #gluster
02:54 bharata-rao joined #gluster
03:08 vmallika joined #gluster
03:11 julim joined #gluster
03:25 k-ma joined #gluster
03:29 overclk joined #gluster
03:29 shortdudey123 joined #gluster
03:36 kdhananjay joined #gluster
03:38 stickyboy joined #gluster
03:41 pgreg joined #gluster
03:43 calavera joined #gluster
03:49 nbalacha joined #gluster
03:50 Pupeno joined #gluster
03:57 calavera joined #gluster
04:01 calavera joined #gluster
04:04 atinm joined #gluster
04:07 gem joined #gluster
04:07 itisravi joined #gluster
04:09 kotreshhr joined #gluster
04:25 kanagaraj joined #gluster
04:36 calavera joined #gluster
04:38 ramteid joined #gluster
04:42 TheSeven joined #gluster
04:48 ildefonso joined #gluster
05:00 pppp joined #gluster
05:05 GB21 joined #gluster
05:08 GB21_ joined #gluster
05:10 ndarshan joined #gluster
05:14 ppai joined #gluster
05:19 neha_ joined #gluster
05:20 atalur joined #gluster
05:23 GB21 joined #gluster
05:23 sakshi joined #gluster
05:29 GB21_ joined #gluster
05:29 RameshN joined #gluster
05:31 kshlm joined #gluster
05:32 ashiq joined #gluster
05:33 Manikandan joined #gluster
05:33 vmallika joined #gluster
05:37 beeradb joined #gluster
05:43 poornimag joined #gluster
05:44 Bhaskarakiran joined #gluster
05:44 skoduri joined #gluster
05:45 rafi joined #gluster
05:48 hgowtham joined #gluster
05:54 jiffin joined #gluster
06:06 jtux joined #gluster
06:07 deepakcs joined #gluster
06:10 ramky joined #gluster
06:14 kdhananjay joined #gluster
06:19 Saravana_ joined #gluster
06:24 julim joined #gluster
06:27 shubhendu joined #gluster
06:34 zhangjn joined #gluster
06:39 GB21 joined #gluster
06:41 DV joined #gluster
06:43 spalai joined #gluster
06:45 Saravana_ joined #gluster
06:53 mhulsman joined #gluster
06:57 zhangjn joined #gluster
07:03 Saravana_ joined #gluster
07:03 karnan joined #gluster
07:05 LebedevRI joined #gluster
07:05 anil_ joined #gluster
07:15 zhangjn joined #gluster
07:16 zhangjn joined #gluster
07:16 zhangjn joined #gluster
07:18 zhangjn joined #gluster
07:19 zhangjn joined #gluster
07:21 aravindavk joined #gluster
07:27 bhuddah joined #gluster
07:51 _NiC joined #gluster
07:57 deniszh joined #gluster
08:00 ramky joined #gluster
08:18 aravindavk joined #gluster
08:20 Philambdo joined #gluster
08:21 [Enrico] joined #gluster
08:24 zhangjn joined #gluster
08:25 nishanth joined #gluster
08:26 fsimonce joined #gluster
08:30 zhangjn joined #gluster
08:32 kovshenin joined #gluster
08:33 Pupeno joined #gluster
08:33 Pupeno joined #gluster
08:33 mlhamburg1 joined #gluster
08:42 Philambdo joined #gluster
08:54 jwd joined #gluster
08:55 [Enrico] joined #gluster
08:59 molch joined #gluster
08:59 molch Hi Guys
08:59 kshlm joined #gluster
09:00 molch what's up with the download mirrors?
09:00 molch http://download.gluster.org/pub/gluster/glusterfs/3.6/
09:00 [Enrico] joined #gluster
09:01 arcolife joined #gluster
09:01 bhuddah molch: what's with that?
09:01 molch its been offline. Has only just come back now
09:02 molch from multiple sources on AWS EC2 in sydney and from a couple of different ISP connections
09:02 molch I was just now receiving "This webpage is not available  ERR_CONNECTION_REFUSED"
09:02 molch but it has come good :S
09:02 julim joined #gluster
09:03 itisravi left #gluster
09:07 bhuddah well. just in time it works again.
09:08 molch any ideas what the problem was.
09:08 bhuddah general brokenness?
09:18 ekuric joined #gluster
09:25 ivan_rossi joined #gluster
09:29 av3ng3r joined #gluster
09:30 hchiramm joined #gluster
09:31 janegil joined #gluster
09:33 nisroc joined #gluster
09:35 jockek joined #gluster
09:36 mhulsman joined #gluster
09:36 aravindavk joined #gluster
09:37 kaushal_ joined #gluster
09:38 hchiramm joined #gluster
09:40 jiku joined #gluster
09:40 stickyboy joined #gluster
09:41 jiku joined #gluster
09:50 anil_ joined #gluster
09:50 pgreg_ joined #gluster
09:52 [Enrico] joined #gluster
09:52 harish_ joined #gluster
10:03 julim joined #gluster
10:05 Slashman joined #gluster
10:06 sakshi joined #gluster
10:10 deepakcs joined #gluster
10:19 Leildin joined #gluster
10:25 jbrooks joined #gluster
10:34 firemanxbr joined #gluster
10:49 rafi joined #gluster
11:04 julim joined #gluster
11:04 chirino_m joined #gluster
11:11 rafi joined #gluster
11:14 hagarth joined #gluster
11:15 spalai joined #gluster
11:19 kotreshhr joined #gluster
11:24 shruti` joined #gluster
11:29 atinm joined #gluster
11:31 Slashman joined #gluster
11:41 kovshenin joined #gluster
11:44 vk|lavi joined #gluster
11:44 vk|lavi Hi all!
11:44 bhuddah hello
11:44 glusterbot bhuddah: Despite the fact that friendly greetings are nice, please ask your question. Carefully identify your problem in such a way that when a volunteer has a few minutes, they can offer you a potential solution. These are volunteers, so be patient. Answers may come in a few minutes, or may take hours. If you're still in the channel, someone will eventually offer an answer.
11:44 vk|lavi I have a kind of trouble with GlusterFS
11:44 bhuddah oh, please....
11:45 vk|lavi yep
11:45 ira joined #gluster
11:45 bhuddah whose bot is it?
11:45 vk|lavi I can see that it uses random TCP ports in 900:1000
11:45 bhuddah random ports should always be 1025:
11:46 bhuddah or rather even more "up"
11:46 vk|lavi bhuddah, http://pastebin.ca/3223184
11:46 glusterbot Title: pastebin - Stuff - post number 3223184 (at pastebin.ca)
11:47 bhuddah sorry, it's blocked for me. i'm behind a shitty company proxy.
11:47 vk|lavi to com, so you have to believe me
11:48 vk|lavi so is there any way on how to prevent Gluster from using some specific port?
11:49 vikki joined #gluster
11:51 bhuddah sorry, my crystal ball is at home. i'm of no use atm for you :(
11:51 vk|lavi anyone beside bots here?
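
(For the port question above, a quick way to confirm which source ports the gluster processes are actually binding; clients use privileged ports below 1024 by default, so source ports in the 900-1000 range are expected. The usual way to move clients off privileged ports is the allow-insecure settings that come up later in this log.)

    # established gluster TCP connections with their local (source) ports
    ss -tnp | grep -E 'glusterd|glusterfsd|glusterfs'
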
11:59 overclk joined #gluster
12:00 harish_ joined #gluster
12:03 unclemarc joined #gluster
12:13 vikki left #gluster
12:14 nishanth joined #gluster
12:14 shubhendu joined #gluster
12:15 kotreshhr left #gluster
12:17 neha_ joined #gluster
12:17 ndarshan joined #gluster
12:21 hchiramm joined #gluster
12:21 [Enrico] joined #gluster
12:24 gem joined #gluster
12:31 monotek1 joined #gluster
12:34 harish joined #gluster
12:34 pgreg joined #gluster
12:35 julim joined #gluster
12:38 shaunm joined #gluster
12:42 ashiq joined #gluster
12:44 gem joined #gluster
12:46 shyam joined #gluster
12:46 anil_ joined #gluster
12:47 jiffin joined #gluster
12:47 plarsen joined #gluster
12:53 rafi joined #gluster
12:56 Manikandan joined #gluster
13:02 Slashman joined #gluster
13:03 shubhendu joined #gluster
13:03 hchiramm joined #gluster
13:05 kotreshhr joined #gluster
13:05 kotreshhr left #gluster
13:06 kovshenin joined #gluster
13:06 shruti joined #gluster
13:15 rafi joined #gluster
13:17 mpietersen joined #gluster
13:23 skylar joined #gluster
13:30 dgandhi joined #gluster
13:34 julim joined #gluster
13:40 B21956 joined #gluster
13:46 zhangjn joined #gluster
13:49 zhangjn joined #gluster
13:55 Saravana_ joined #gluster
13:57 poornimag joined #gluster
14:00 kovshenin joined #gluster
14:04 kovshenin joined #gluster
14:04 nbalacha joined #gluster
14:07 hchiramm joined #gluster
14:10 jwaibel joined #gluster
14:12 mhulsman joined #gluster
14:13 atalur joined #gluster
14:19 spalai left #gluster
14:19 hgowtham joined #gluster
14:20 skylar1 joined #gluster
14:20 Slashman joined #gluster
14:27 jobewan joined #gluster
14:30 maserati joined #gluster
14:45 ayma joined #gluster
14:46 RameshN joined #gluster
14:48 jwd joined #gluster
14:48 ctria joined #gluster
15:22 Slashman joined #gluster
15:25 hagarth joined #gluster
15:27 rafi joined #gluster
15:28 atalur joined #gluster
15:35 theron joined #gluster
15:36 RameshN joined #gluster
15:37 stickyboy joined #gluster
15:42 calavera joined #gluster
15:48 kanagaraj joined #gluster
15:59 Philambdo joined #gluster
16:12 jonfatino Guys I am going to replace my second brick (server) with another one. What's the best way to do this? https://paste.ee/r/iGM3b
16:12 jonfatino replace brick method? change replica to 1 and remove that brick then add replica 2 and add brick?
16:13 joseki joined #gluster
16:14 joseki hi all. have you ever seen the gfid-resolver.sh helper script hang and not return?
16:14 joseki i'm trying to track down some gfids that are listed in "heal info".
16:15 joseki these gfids aren't available on any of my other machines (4 x 2 distributed-replicated)
16:15 kotreshhr joined #gluster
16:15 joseki i can ls -la the gfid in the .glusterfs and I see it there
16:18 kotreshhr left #gluster
16:19 joseki i'm guessing it's because i have a very, very large number of files, and the find -inum is slow as heck
16:19 k-ma joseki: i had problems with it on bricks with millions of files. the find it runs took hours
16:20 janegil joined #gluster
16:20 joseki i see.
16:21 joseki i have about 50K files that show up in "heal info", so perhaps it would be better to find a way to dump all the inums from that brick and then look
16:21 k-ma that's exactly how i ended up doing it :)
16:22 Rapture joined #gluster
16:23 joseki k-ma: perhaps just run this: find . -exec ls -i {} \; on the brick? or is there a faster way?
16:24 joseki ignoring the .glusterfs directory, obviously
16:25 k-ma iirc i just ran find and used -printf to print what i needed, no ls required
16:26 Slashman joined #gluster
16:28 joseki ok, thanks! yeah, still slow
16:30 DV joined #gluster
16:34 k-ma yep, i left it to run overnight to gather the fname - inode mapping from all affected bricks
16:34 k-ma then mapped from the dumps to the heal info gfids
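
(A sketch of the inode-dump approach described above, using GNU find's -printf so no per-file ls is needed; the brick path and gfid are placeholders.)

    # one pass over the brick: dump "inode path" for every regular file, skipping .glusterfs
    find /bricks/b1 -path '*/.glusterfs' -prune -o -type f -printf '%i %p\n' > /tmp/brick1-inodes.txt

    # to resolve one gfid from "heal info": read the inode of its .glusterfs hardlink, then grep the dump
    ino=$(stat -c %i /bricks/b1/.glusterfs/ab/cd/abcd1234-0000-0000-0000-000000000000)
    grep "^$ino " /tmp/brick1-inodes.txt
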
16:40 Rapture joined #gluster
16:40 janegil joined #gluster
16:40 jmarley joined #gluster
16:41 tommaso joined #gluster
16:49 firemanxbr joined #gluster
16:55 calavera joined #gluster
17:08 skoduri joined #gluster
17:18 theron joined #gluster
17:29 rafi joined #gluster
17:38 jonfatino Guys I am going to replace my second brick (server) with another one. What's the best way to do this? https://paste.ee/r/iGM3b  replace brick method? change replica to 1 and remove that brick then add replica 2 and add brick?
17:43 jiffin joined #gluster
17:43 mhulsman joined #gluster
17:51 JoeJulian jonfatino: replace-brick.
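
(Roughly what that looks like, with placeholder names for the volume and the new server; as JoeJulian notes later in the log, the "commit force" form is used and self-heal then copies the data onto the new brick.)

    gluster volume replace-brick vol1 content2:/gluster content3:/gluster commit force
    gluster volume heal vol1 info     # watch the heal progress afterwards
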
17:53 vk|lavi joined #gluster
17:57 akik if using selinux on a glusterfs client, am i restricted to having fuse_fs_t as the selinux label for the home dir?
17:57 akik the problem i've run into is that the x2go server tries to modify the label in the home dir of the logging in user and fails
17:57 akik for $HOME/.Xauthority
17:58 ivan_rossi left #gluster
17:58 JoeJulian I think you can set it however you want. It's just an extended attribute.
17:58 akik but changing the selinux label for the files is denied
17:59 akik (after mounting)
17:59 JoeJulian Hmm..
18:03 tommaso joined #gluster
18:03 vimal joined #gluster
18:05 akik there was some mount option i think it was context= which defines the label for all the files
18:05 JoeJulian Right, just found that myself.
18:05 akik but i'm not sure what's the effect of that for other programs
18:05 akik and there can be only one label
18:05 JoeJulian Well that kind-of sucks.
18:07 akik i have these services in the picture, glusterfs, 389 directory server and x2go server
18:08 akik i would really want to keep selinux enabled
18:09 akik maybe i'll try that context=xauth_home_t and see what breaks
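
(A sketch of the context= mount option being discussed: it stamps every file in the mount with one SELinux label, and this assumes the glusterfs mount helper forwards the option to FUSE. Server, volume, and label are placeholders.)

    mount -t glusterfs -o context="system_u:object_r:fuse_fs_t:s0" server1:/homevol /home
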
18:12 JoeJulian looks like that's a shortcoming of fuse. I see they've tried fixing it in the past, but they encountered a deadlock and haven't found the problem.
18:13 akik ok thanks
18:13 akik is there a bug tracker for that?
18:13 JoeJulian https://github.com/SELinuxProject/selinux/wiki/Kernel-Todo
18:13 glusterbot Title: Kernel Todo · SELinuxProject/selinux Wiki · GitHub (at github.com)
18:14 JoeJulian It's the kernel. I don't think they do trackers or any of the really useful stuff we use IRL.
18:14 rafi joined #gluster
18:17 akik found this https://bugzilla.redhat.com/show_bug.cgi?id=1252627
18:17 glusterbot Bug 1252627: high, unspecified, ---, bugs, NEW , Cannot set selinux context on files in on a glusterfs mount
18:20 akik another one with no progress https://bugzilla.redhat.com/show_bug.cgi?id=1230671
18:20 glusterbot Bug 1230671: medium, unspecified, ---, bugs, NEW , SELinux not supported with FUSE client
18:25 Manikandan joined #gluster
18:26 Manikandan joined #gluster
18:28 Manikandan joined #gluster
18:29 atalur joined #gluster
18:30 Manikandan joined #gluster
18:30 tommaso joined #gluster
18:33 vk|lavi joined #gluster
18:34 DV joined #gluster
18:36 tommaso joined #gluster
18:39 theron joined #gluster
18:40 rafi joined #gluster
18:43 akik red hat. the open source company.. "You are not authorized to access bug #1256635."
18:43 glusterbot Bug https://bugzilla.redhat.com:443/show_bug.cgi?id=1256635 is not accessible.
18:44 JoeJulian So trying to set the security.selinux attribute with log-level TRACE, the attribute call never makes it to gluster. Talk to the kernel devs.
18:46 akik if i want to use high available nfs, do i need to setup the nfs ganesha?
18:46 JoeJulian I would
18:47 akik i meant to ask if the nfs server installed by default could be considered ha ?
18:49 JoeJulian Well, it listens on all the servers. You could set up a VIP managed with some other tool to make it HA.
18:49 JoeJulian I've had sufficient success with ucarp.
18:50 JoeJulian The robust tool would be corosync/pacemaker.
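
(A rough sketch of the ucarp approach: each NFS-serving node runs ucarp with the same vhid/password and its own srcip, and whichever node is master holds the floating address that clients mount. Addresses and scripts are placeholders.)

    ucarp --interface=eth0 --srcip=192.168.1.11 --vhid=10 --pass=changeme \
          --addr=192.168.1.100 \
          --upscript=/etc/ucarp/vip-up.sh --downscript=/etc/ucarp/vip-down.sh
    # clients then mount NFS from 192.168.1.100 no matter which server currently holds it
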
18:50 mhulsman joined #gluster
18:51 akik oh
18:52 akik so if i run gluster vol status on hostA, it will show NFS Server on hostB
18:52 tommaso joined #gluster
18:53 akik also shows NFS Server on localhost
18:55 akik i've used keepalived and haproxy before
18:56 plarsen joined #gluster
18:56 jiffin akik: glusterd is aware of other nfs servers running on the other nodes , just like brick processes
18:59 jiffin akik: HA solution for glusterNFS can be achieved using CTDB, but it is not tested properly
18:59 jiffin akik: https://download.gluster.org/pub/gluster/glusterfs/doc/HA%20and%20Load%20Balancing%20for%20NFS%20and%20SMB.pdf
19:00 akik ok thanks i need to start reading
19:01 kovshenin joined #gluster
19:01 jiffin akik: but it has its own limitation when compared with HA solution provided along NFS-Ganesha + corosync/pacemaker
19:02 theron joined #gluster
19:02 akik nice comment: NOTE: If you try this with GlusterFS 3.3.x, you must disable SELinux manually, editing /etc/sysconfig/selinux.
19:03 akik from the pdf doc
19:03 akik but using nfs might be a way forward, still keeping selinux enabled
19:04 jiffin akik: As far as I know , selinux support introduced in 3.7 onwards
19:06 JoeJulian The selinux flag's been around for a really long time, and it used to work.
19:08 akik this ctdb sounds interesting
19:09 akik i only need nfs, not cifs
19:10 akik http://www.gluster.org/community/documentation/index.php/CTDB
19:16 akik looks like i can create a dns round-robin with dnsmasq quite easily
19:17 akik thanks for the tips
19:29 rafi joined #gluster
19:36 theron joined #gluster
19:39 rafi joined #gluster
19:43 bennyturns joined #gluster
19:48 gildub joined #gluster
19:49 infrabill joined #gluster
19:52 bturner_ joined #gluster
19:59 jonfatino JoeJulian: do you know of anyway to force stop / delete the replace brick operation? It says its completed however it lies. https://paste.ee/r/GKubX
20:02 tommaso joined #gluster
20:07 Pupeno joined #gluster
20:17 kovshenin joined #gluster
20:18 marcoceppi left #gluster
20:18 mhulsman joined #gluster
20:24 hchiramm joined #gluster
20:32 deniszh joined #gluster
20:35 harish joined #gluster
20:35 JoeJulian jonfatino: Just do the replace-brick ... commit force
20:35 JoeJulian self-heal will handle the rest.
20:37 deniszh joined #gluster
20:38 theron joined #gluster
20:41 vk|lavi joined #gluster
20:43 deniszh joined #gluster
20:51 hchiramm joined #gluster
20:56 theron joined #gluster
21:05 harish joined #gluster
21:09 Pupeno joined #gluster
21:11 DV joined #gluster
21:14 Pupeno joined #gluster
21:20 B21956 joined #gluster
21:32 Pupeno joined #gluster
21:32 Pupeno joined #gluster
21:39 stickyboy joined #gluster
21:41 hchiramm joined #gluster
21:48 infrabill Anyone up for a noob glusterfs question around the native client?
21:53 infrabill Trying to connect with native client and getting Received request from non-privileged port. Failing request. Any ideas?
22:02 frozengeek joined #gluster
22:04 hchiramm joined #gluster
22:06 frozengeek hey, does gluster somehow support manually changing a vol file (e.g. adding an extra brick to each replicaset or something like that) and then correctly rebalance to the new layout?
22:09 edwardm61 joined #gluster
22:15 Guss joined #gluster
22:18 Guss left #gluster
22:20 ctria joined #gluster
22:20 pff joined #gluster
22:23 pff how do I go about debugging a gluster issue, maybe part of the volume is bad or something?
22:23 hchiramm joined #gluster
22:23 pff it produces loads of logs but nothing pinpoint
22:26 julim joined #gluster
22:27 JoeJulian @unprivileged
22:28 dblack joined #gluster
22:28 infrabill ?
22:28 harish joined #gluster
22:29 bennyturns joined #gluster
22:29 JoeJulian @learn unprivileged as Set your volumes and glusterd to allow connections from unprivileged ports per https://gluster.readthedocs.org/en/latest/release-notes/3.7.0/#gluster-volume-set-serverallow-insecure-on
22:29 glusterbot JoeJulian: The operation succeeded.
22:30 JoeJulian ~unprivileged | infrabill
22:30 glusterbot infrabill: Set your volumes and glusterd to allow connections from unprivileged ports per https://gluster.readthedocs.org/en/latest/release-notes/3.7.0/#gluster-volume-set-serverallow-insecure-on
22:30 JoeJulian It's frequently asked. Figured I'd better set up a factoid
22:31 JoeJulian frozengeek: more or less, yes. See "gluster volume add-brick" and "gluster volume rebalance"
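
(The supported route for what frozengeek is asking, rather than hand-editing vol files; volume and server names are placeholders, and for a replica-2 volume bricks are added in pairs.)

    gluster volume add-brick vol1 server5:/bricks/b1 server6:/bricks/b1
    gluster volume rebalance vol1 start
    gluster volume rebalance vol1 status
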
22:31 infrabill @joejulian Ill try it right now. Thanks very much
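
(What the unprivileged-ports factoid amounts to in practice, per the linked release note; the volume name is a placeholder and both changes are needed before clients connecting from unprivileged ports are accepted.)

    gluster volume set vol1 server.allow-insecure on

    # and in /etc/glusterfs/glusterd.vol on every server, followed by a glusterd restart:
    #   option rpc-auth-allow-insecure on
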
22:31 JoeJulian pff: What are the symptoms?
22:33 pff @JoeJulian on the box itself, 4 cores and periodically for a sustained period, 100% CPU across all cores
22:34 pff @JoeJulian from web server perspective just really slow and load high, when gluster picks up, the web server gets quicker
22:34 JoeJulian pff: Could your clients be losing network connectivity to one of the servers periodically?
22:34 pff @JoeJulian, had issue with one large log file that makes me suspicious, got I/O error when trying to read it, the only thing I could do was remove from both bricks
22:35 JoeJulian Sounds like split-brain.
22:35 pff explain?
22:35 JoeJulian Which might confirm my connectivity question.
22:35 JoeJulian @lucky split-brain
22:35 glusterbot JoeJulian: https://en.wikipedia.org/wiki/Split-brain
22:35 pff it's all hosted on AWS
22:35 pff fwiw
23:36 jonfatino JoeJulian: sorry to bug you, but the replace brick failed and I did a force commit like you said. It was not doing anything and gluster was locking up so I rebooted the box. Now when I attempt to start gluster this is what I get https://paste.ee/r/bVJlM
22:36 JoeJulian pff: vpc?
22:37 pff JoeJulian: yes
22:37 JoeJulian jonfatino: Can all your servers resolve all your other servers by name?
22:38 JoeJulian pff: I've seen others complain about this since they started pushing people to vpc. Let's ping semiosis and see if he wants to poke at this.
22:38 jonfatino yes and the peer was connected before I rebooted the box
22:38 pff JoeJulian: setup has been fine up until last 36 hours, then it's just gone into CPU hell, web servers in web server hell, seeing loads of 250+
22:39 JoeJulian jonfatino: "resolve brick failed in restore" suggests otherwise. Look for typos?
22:41 jonfatino I have the same /etc/hosts file with content2 content11 content12 in there with ip and the ACL is set correctly. I have tested ping from each box to each other and everyone pings the correct ip
22:42 JoeJulian Mmkay...
22:42 JoeJulian try glusterd --debug and see if there's any more useful details.
22:44 jonfatino https://paste.ee/r/sos1X    I see some errors for unknown key (bricks)
22:44 jonfatino Should I just apt-get remove gluster and reinstall it and change uuid in /var/lib/glusterd/glusterd.info ?
22:44 jonfatino then re-add it?
22:45 JoeJulian unknown key is normal
22:46 JoeJulian Seems to be having a problem with the host, "runner21"
22:46 jonfatino when I force commit the replace brick (which was not done) I checked volume status and got "Brick content2:/gluster                                 N/A     N       N/A"
22:46 jonfatino how do I force remove that peer from this node?
22:47 mjrosenb I think I'm trying to do a partial heal-- I have a file with a gfid, but it doesn't have an entry in .glusterfs; I thought just stating the file should be enought, but it isn't creating that link.
22:47 glusterbot mjrosenb: heal's karma is now -1
22:47 mjrosenb heal++; heal is good.
22:47 glusterbot mjrosenb: heal's karma is now 0
22:47 amye overeager bot today
22:47 JoeJulian jonfatino: add it to /etc/hosts so you can get glusterd to start maybe.
22:48 jonfatino JoeJulian: yea this is a messed setup I had this server (content2) peered with runner21 / runner22 to test some stuff but I removed those before I deleted runner21 / runner22
22:48 JoeJulian mjrosenb: odd. You're right. The lookup() should do that.
22:48 mjrosenb I think ('s karma is like -400;  since the string "(--" appears frequently in logs.
22:49 JoeJulian jonfatino: Once you've got all your glusterd running, you can peer-detach them.
22:49 jonfatino Yea the issue is getting it running lol... Perhaps I should just reinstall this node as it has to rebuild the bricks anyways?
22:49 jonfatino apt-get remove gluster and reinstall?
22:49 JoeJulian nope
22:49 JoeJulian Add them to /etc/hosts
22:50 F2Knight joined #gluster
22:50 JoeJulian Just use dummy addresses.
22:50 JoeJulian It doesn't have to successfully connect to them, it just has to be able to resolve them.
22:51 mjrosenb it seems to be doing a lookup on the actual filename, rather than the (sym)hardlink.
22:51 jonfatino runner21 and runner22 are in there however those nodes are deleted. Should I fake it and add ip content11 (which is live and peered with content2) or use localhost?
22:51 JoeJulian mjrosenb: right, but the lookup on the file should create the hardlink.
22:51 infrabill @joejulian different error now. all subvolumes are down. going offline until one of them comes back, and no subvolumes up
22:52 pff JoeJulian: gluster volume heal vol0 info split-brain just hangs
22:52 JoeJulian Make another hardlink (you can delete it after). If that doesn't do it I don't know what will.
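
(For reference, the layout behind the hardlink suggestion: for a regular file the .glusterfs entry is a hardlink named after the file's gfid, nested under its first two and next two hex characters. The gfid and paths below are hypothetical; the real gfid can be read off the brick with getfattr.)

    getfattr -n trusted.gfid -e hex /bricks/b1/path/to/file    # e.g. 0x0f1e2d3c4b5a69788796a5b4c3d2e1f0
    mkdir -p /bricks/b1/.glusterfs/0f/1e
    ln /bricks/b1/path/to/file /bricks/b1/.glusterfs/0f/1e/0f1e2d3c-4b5a-6978-8796-a5b4c3d2e1f0
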
22:52 JoeJulian infrabill: excellent
22:53 JoeJulian pff it's probably just really busy healing other stuff.
22:53 pff JoeJulian: could you take a look for beer tokens?
22:54 infrabill @joejulian that is followed by shutting down and unmounting
22:54 JoeJulian I cannot. It's against my employment contract. I can give you all the free advice I want though to try to help you solve it yourself.
22:55 JoeJulian infrabill: Start by getting peer-status under control. Make sure they're all good and remove those invalid servers.
22:55 pff JoeJulian: understood; what are my first steps; do I take the site down that might be loading it?
22:56 pff or do I look first before jumping?
22:56 infrabill @joejulian only 2 nodes. running gluster peer status from either shows connected and no errors
22:58 * mjrosenb peruses posix_lookup
22:58 hchiramm joined #gluster
23:02 infrabill @joejulian replication is working when i test on node1 or node2, cannot get native client to connect
23:04 JoeJulian pff: I don't suppose your bricks are zfs?
23:05 JoeJulian infrabill: that last statement sounds like you're connecting directly to the bricks. That's to be avoided.
23:06 pff joejulian: I don't think so
23:06 JoeJulian Just curious. That bit me recently.
23:06 pff joejulian: as usual this has been lobbed over the fence and I have practically no knowledge of gluster
23:06 pff all I know is that it looks sick and behaves like it's sick
23:07 pff joejulian: what information can I provide to get closer to understanding the issue?
23:08 JoeJulian pff: What I suspect is happening is that aws is starving your vm, it's going high load - so high that it's not responding to network io for up to 42 seconds. When that happens your clients time out. When you get cpu back, it starts responding then it starts healing, bumping up your load again.
23:08 JoeJulian (42 seconds assumes you haven't changed ping-timeout)
23:09 infrabill @joejulian i installed client on the nodes to test replication if that is what you mean.
23:09 pff joejulian: starving vm of what?
23:09 JoeJulian Honestly, since I have seen so many people come in here with vpc problems at aws, I would contact their support.
23:09 jonfatino JoeJulian: got gluster to start using 127.0.0.1 in hosts file and removed runner21 + runner22. restarted gluster and it came back online and working. Now how do I get the brick process to start?  [2015-10-28 23:05:10.684825] E [client-handshake.c:1760:client_query_portmap_cbk] 0-volume1-client-0: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
23:09 JoeJulian starving of cpu I think, or maybe iops.
23:10 Pupeno joined #gluster
23:10 jonfatino pff: amazon is famous for giving you 4x vcpu and giving you 10% cpu priority per core
23:10 jonfatino so even though you get 4 virtual cores you prob only get 1ghz of processing power
23:11 JoeJulian jonfatino: gluster volume start $vol force
23:11 JoeJulian I'm guessing something prevented glusterfsd from starting with the errant servers.
23:12 JoeJulian jonfatino: If that doesn't do it, since you've removed those other servers, remove them from hosts again and make sure glusterd will start.
23:12 pff joejulian: I did contact them, they suggested patching kernel to enable enhanced networking, changing volume size from 20Gb to 250Gb, said nothing was wrong with the underlying h/w
23:12 jonfatino JoeJulian: thanks, now we're moving 15MB and it seems to be healing :-) Quick question: in the heal on another node of mine, the .glusterfs folder is like 500gb and the raw "brick" folder is like 50mb
23:13 jonfatino Any ideas off the top of your head why it would do that?
23:13 JoeJulian The .glusterfs folder should just be hardlinks to all your real files.
23:13 jonfatino Interesting will have to investigate that later. JoeJulian got a website with a donate link or something?
23:14 pff JoeJulian: patching kernel hard (for me) and would prob need to be done across about 5 servers and it didn't feel like it would make a huge difference
23:14 JoeJulian To make sure you don't have any flotsam left in .glusterfs, "find .glusterfs -type f -links 1" If anything show up, it can probably be deleted.
23:14 JoeJulian https://joejulian.name
23:14 glusterbot Title: JoeJulian.name (at joejulian.name)
23:15 JoeJulian pff: I'd just keep poking them. Tell them your VMs aren't getting enough CPU time.
23:15 theron joined #gluster
23:15 JoeJulian come back early in the morning and ping semiosis. He has a lot of experience with amazon.
23:15 JoeJulian He's in GMT-4.
23:16 pff JoeJulian: I will do.  How do I know I've done everything I can within gluster?
23:17 pff JoeJulian: I mean, servers have been up and down quite a bit as a result, sporadically they work fine and then they don't
23:17 JoeJulian gluster volume info $volname . If you've changed anything from defaults, consider resetting it. Under default settings, you should be able to handle most normal loads.
23:17 pff JoeJulian: am worried about data corruption / split brain as you describe
23:18 JoeJulian The best part of split-brain is that gluster won't corrupt anything. If you have two different versions of a file, it'll block access to it until it's healed.
23:19 JoeJulian @split-brain
23:19 glusterbot JoeJulian: To heal split-brains, see https://gluster.readthedocs.org/en/release-3.7.0/Features/heal-info-and-split-brain-resolution/ For additional information, see this older article https://joejulian.name/blog/fixing-split-brain-with-glusterfs-33/ Also see splitmount https://joejulian.name/blog/glusterfs-split-brain-recovery-made-easy/
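
(The CLI commands behind the first link, for reference, available from 3.7 onwards; the file path is relative to the volume root and, like the volume and brick names here, is a placeholder.)

    gluster volume heal vol0 info split-brain
    gluster volume heal vol0 split-brain bigger-file /some/dir/file.log
    gluster volume heal vol0 split-brain source-brick fileserver01:/data/gluster/brick /some/dir/file.log
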
23:19 pff @JoeJulian Volume Name: vol0 | Type: Replicate | Volume ID: 81bb4fb6-d904-41c8-afff-a13a6de349fd | Status: Started
23:19 pff @JoeJulian Number of Bricks: 1 x 2 = 2 | Transport-type: tcp | Bricks: | Brick1: fileserver01:/data/gluster/brick
23:19 pff @JoeJulian Brick2: fileserver02:/data/gluster/brick | Options Reconfigured: performance.readdir-ahead: on
23:19 pff JoeJulian; it's that block I'm worried about, what if it's a vital file?
23:20 pff JoeJulian: that's on one of the boxes fileserver01
23:20 pff JoeJulian: that output is from one of the boxes fileserver01
23:21 JoeJulian iirc, there's a setting, "gluster volume set help" which can choose a heal strategy automatically.
23:23 pff JoeJulian: this? Option: cluster.data-self-heal-algorithm
23:23 pff Default Value: (null)
23:23 pff Description: Select between "full", "diff". The "full" algorithm copies the entire file from source to sink. The "diff" algorithm copies to sink only those blocks whose checksums don't match with those of source. If no option is configured the option is chosen dynamically as follows: If the file does not exist on one of the sinks or empty file exists or if the source file size is about the same as page size the entire file will
23:23 pff be read and written i.e "full" algo, otherwise "diff" algo is chosen.
23:23 JoeJulian No, that's not it.
23:23 JoeJulian btw, I've read all of those so you could just ask the one line question.
23:23 pff sorry
23:25 frozengeek JoeJulian: yes, i know that i can use the commandline tools for most tasks, but modifying the vol files would probably be much easier in some cases (esp if you know the target configuration, and not really the steps towards it) so i was wondering if i could do it just with the vol files.
23:26 jonfatino What's the fastest way to get healing to speed up?
23:28 suliba joined #gluster
23:28 pff JoeJulian: I don't know what setting I'm looking for, that cluster.data-self-heal-algorithm one is the only one that looks promising
23:28 JoeJulian frozengeek: It's still possible to use vol files, but it's not supported.
23:30 JoeJulian jonfatino: Faster CPU/memory. Bigger network with lower latency (rdma over infiniband). Faster disks.
23:32 JoeJulian pff: I was wrong. I know they talked about adding an option to make that happen automatically, it must not have made it in.
23:33 JoeJulian The best thing is to just avoid split-brain with quorum, but then it sounds like you're more concerned with availability over consistency so that may not be what you want.
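
(The quorum options being referred to, roughly; the volume name is a placeholder. Client quorum blocks writes when a replica loses its majority and server quorum stops bricks when too few glusterd peers are up, so on a plain two-node replica both trade availability for consistency, as noted above.)

    gluster volume set vol0 cluster.quorum-type auto
    gluster volume set vol0 cluster.server-quorum-type server
    gluster volume set all cluster.server-quorum-ratio 51%
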
23:34 jonfatino JoeJulian: ssds PHX -> ATL full 1Gbps seems to only be doing 1.5 mbps   Also once again the .glusterfs folder is filling up and now the brick data
23:34 jonfatino https://paste.ee/r/d9g9p
23:38 pff JoeJulian: at the moment I'm just trying to get a site back online for my customers; it worries me that gluster doesn't give me any information about what the problem is
23:39 pff JoeJulian: it gives me basic info but that's it, there might be 0.6M files in play here
23:40 Humble joined #gluster
23:40 pff I shall read about quorums though ...
23:42 pff JoeJulian: as gluster volume heal vol0 info split-brain doesn't return I'm flying blind
23:43 JoeJulian Ever? Have you let it sit for 30 minutes?
23:44 JoeJulian the other option is to look at the self-heal daemon logs, glustershd.log
23:44 jonfatino pff: Are you using php and the gluster client? php not so great with gluster and the 100's of stat calls
23:45 pff @jonfatino: yes, mainly php
23:45 JoeJulian @php
23:45 glusterbot JoeJulian: (#1) php calls the stat() system call for every include. This triggers a self-heal check which makes most php software slow as they include hundreds of small files. See http://joejulian.name/blog/optimizing-web-performance-with-glusterfs/ for details, or (#2) It could also be worth mounting fuse with glusterfs --attribute-timeout=HIGH --entry-timeout=HIGH --negative-timeout=HIGH --fopen-keep-cache
23:45 jonfatino Yep I ran into that issue day one when messing with php :-)
23:46 jonfatino pff: as a test you could try and load your site directly from the bricks. If you have to use php I would recommend read from bricks and write via gluster client.
23:47 JoeJulian +1
23:47 JoeJulian Or follow the directions on my blog. :D
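
(Spelled out, the mount options from the php factoid above look roughly like the following glusterfs client invocation; the timeout values and mount point are examples, and vol0/fileserver01 are taken from pff's setup.)

    glusterfs --volfile-server=fileserver01 --volfile-id=vol0 \
              --attribute-timeout=600 --entry-timeout=600 --negative-timeout=600 \
              --fopen-keep-cache /var/www/shared
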
23:48 pff it is writing to glustershd.log but not spooling into it as I tail it
23:48 JoeJulian Any large files to heal?
23:48 pff JoeJulian, jonfatino: thanks both
23:48 jonfatino pff: yep what JoeJulian said. A friend of mine runs a cluster of webservers and has nginx sit in front of stuff like wp-admin and what not so everything related to clients uploading directly on the website gets redirected to the first server like (web1) and he has lsyncd running on web1 that uses rsync to sync from web1 to web2 web3 etc
23:49 pff @jonfatino it will take me a while to process that!
23:49 JoeJulian If your performance relies on hitting the disk to load your application, you're doing it wrong - imho.
23:50 pff @JoeJulian: I've inherited this setup but I take your point
23:55 pff @JoeJulian, @jonfatino: we use cloudflare to deal with static content, use memcached pretty heavily which works fine if the user isn't logged in but if they are then that's the worst case - lots of logged in users
23:56 pff normal load is < 1 across the board, it's just gone nuts right now and it's not traffic related
23:57 pff @JoeJulian, @jonfatino: I suspect you're 100% right about split brain and self healing etc but I can't prove it
23:57 pff it's either too busy to give me that information or it can't
