
IRC log for #gluster, 2016-02-22


All times shown according to UTC.

Time Nick Message
00:01 owlbot joined #gluster
00:05 EinstCra_ joined #gluster
00:05 owlbot joined #gluster
00:07 haomaiwa_ joined #gluster
00:09 owlbot joined #gluster
00:13 owlbot joined #gluster
00:17 owlbot joined #gluster
00:22 owlbot joined #gluster
00:26 owlbot joined #gluster
00:30 owlbot joined #gluster
00:31 longwuyu1n joined #gluster
00:31 longwuyu1n hi
00:31 glusterbot longwuyu1n: Despite the fact that friendly greetings are nice, please ask your question. Carefully identify your problem in such a way that when a volunteer has a few minutes, they can offer you a potential solution. These are volunteers, so be patient. Answers may come in a few minutes, or may take hours. If you're still in the channel, someone will eventually offer an answer.
00:34 owlbot joined #gluster
00:38 owlbot joined #gluster
00:42 owlbot joined #gluster
00:46 owlbot joined #gluster
00:48 Alghost_ joined #gluster
00:50 owlbot joined #gluster
00:54 owlbot joined #gluster
00:58 owlbot joined #gluster
01:02 owlbot joined #gluster
01:07 plarsen joined #gluster
01:18 harish joined #gluster
01:23 haomaiwa_ joined #gluster
01:35 plarsen joined #gluster
01:40 nbalacha joined #gluster
01:47 nishanth joined #gluster
01:48 EinstCrazy joined #gluster
01:49 baojg joined #gluster
01:57 chromatin joined #gluster
02:04 nangthang joined #gluster
02:05 caitnop joined #gluster
02:14 pppp joined #gluster
02:14 haomaiwa_ joined #gluster
02:21 rafi joined #gluster
02:25 mtanner joined #gluster
02:29 ovaistar_ joined #gluster
02:31 muneerse2 joined #gluster
02:36 Wizek_ joined #gluster
02:48 ilbot3 joined #gluster
02:48 Topic for #gluster is now Gluster Community - http://gluster.org | Patches - http://review.gluster.org/ | Developers go to #gluster-dev | Channel Logs - https://botbot.me/freenode/gluster/ & http://irclog.perlgeek.de/gluster/
02:49 BuffaloCN left #gluster
02:52 ashiq joined #gluster
02:54 haomaiwa_ joined #gluster
02:55 Lee1092 joined #gluster
03:01 haomaiwa_ joined #gluster
03:02 arcolife joined #gluster
03:06 nangthang joined #gluster
03:24 sakshi joined #gluster
03:29 ovaistariq joined #gluster
03:31 baojg joined #gluster
03:36 overclk joined #gluster
03:39 nbalacha joined #gluster
03:53 ramteid joined #gluster
03:53 atinm joined #gluster
03:58 shubhendu joined #gluster
04:01 haomaiwa_ joined #gluster
04:05 kanagaraj joined #gluster
04:07 itisravi joined #gluster
04:08 itisravi joined #gluster
04:08 ppai joined #gluster
04:23 nehar joined #gluster
04:29 karthikfff joined #gluster
04:33 Alghost_ joined #gluster
04:34 Manikandan joined #gluster
04:52 hgowtham joined #gluster
04:53 RameshN joined #gluster
04:54 ndarshan joined #gluster
04:56 jiffin joined #gluster
05:00 PotatoGim joined #gluster
05:00 ovaistariq joined #gluster
05:01 haomaiwa_ joined #gluster
05:09 nehar joined #gluster
05:10 Bhaskarakiran joined #gluster
05:16 gem joined #gluster
05:20 pppp joined #gluster
05:23 EinstCra_ joined #gluster
05:25 ramky joined #gluster
05:30 poornimag joined #gluster
05:32 Saravanakmr joined #gluster
05:38 python_lover joined #gluster
05:55 harish_ joined #gluster
05:57 nishanth joined #gluster
05:59 atalur joined #gluster
06:01 karnan joined #gluster
06:01 haomaiwa_ joined #gluster
06:05 nangthang joined #gluster
06:06 skoduri joined #gluster
06:06 gowtham joined #gluster
06:06 kdhananjay joined #gluster
06:07 merp_ joined #gluster
06:13 Alghost_ joined #gluster
06:15 ppai joined #gluster
06:17 harish_ joined #gluster
06:17 anil joined #gluster
06:18 atinm joined #gluster
06:21 Bhaskarakiran joined #gluster
06:21 Bhaskarakiran joined #gluster
06:29 kdhananjay joined #gluster
06:29 python_lover joined #gluster
06:31 DV joined #gluster
06:36 aravindavk joined #gluster
06:38 EinstCrazy joined #gluster
06:48 RameshN joined #gluster
06:58 atinm joined #gluster
07:01 ovaistariq joined #gluster
07:01 haomaiwa_ joined #gluster
07:04 baojg joined #gluster
07:07 robb_nl joined #gluster
07:09 mhulsman joined #gluster
07:09 mhulsman1 joined #gluster
07:16 [Enrico] joined #gluster
07:18 nangthang joined #gluster
07:20 mobaer joined #gluster
07:22 jtux joined #gluster
07:24 DV joined #gluster
07:28 mbukatov joined #gluster
07:34 harish_ joined #gluster
07:38 overclk joined #gluster
07:42 Humble joined #gluster
07:48 kdhananjay joined #gluster
07:56 aravindavk joined #gluster
07:58 DV joined #gluster
08:00 wolsen joined #gluster
08:01 [diablo] joined #gluster
08:01 haomaiwa_ joined #gluster
08:01 deniszh joined #gluster
08:05 RameshN joined #gluster
08:11 jri joined #gluster
08:16 owlbot joined #gluster
08:20 owlbot joined #gluster
08:24 owlbot joined #gluster
08:27 Slashman joined #gluster
08:28 merp_ joined #gluster
08:32 itisravi joined #gluster
08:39 hackman joined #gluster
08:41 ira joined #gluster
08:43 ctria joined #gluster
08:51 DV joined #gluster
08:55 fsimonce joined #gluster
08:57 nehar joined #gluster
08:59 Bhaskarakiran joined #gluster
08:59 ivan_rossi joined #gluster
09:01 haomaiwang joined #gluster
09:02 ovaistariq joined #gluster
09:05 nbalacha joined #gluster
09:06 python_lover joined #gluster
09:19 Ulrar left #gluster
09:23 harish_ joined #gluster
09:36 owlbot joined #gluster
09:40 owlbot joined #gluster
09:44 owlbot joined #gluster
09:56 DV joined #gluster
10:01 owlbot joined #gluster
10:01 haomaiwa_ joined #gluster
10:03 ira joined #gluster
10:06 poornimag joined #gluster
10:06 nbalacha joined #gluster
10:27 arcolife joined #gluster
10:28 Bhaskarakiran joined #gluster
10:47 mhulsman joined #gluster
10:54 itisravi joined #gluster
10:55 DV joined #gluster
10:56 mhulsman1 joined #gluster
11:00 aravindavk joined #gluster
11:02 Wizek_ joined #gluster
11:02 ovaistariq joined #gluster
11:07 abyss__ joined #gluster
11:11 ppai joined #gluster
11:13 poornimag joined #gluster
11:25 Wizek joined #gluster
11:33 haomaiwa_ joined #gluster
11:41 jockek joined #gluster
11:43 sghatty_ joined #gluster
11:43 fyxim joined #gluster
11:45 sankarshan_away joined #gluster
11:48 haomaiwa_ joined #gluster
11:49 virusuy joined #gluster
11:52 Nakiri__ joined #gluster
11:53 dron23 joined #gluster
12:00 owlbot joined #gluster
12:01 haomaiwang joined #gluster
12:02 owlbot joined #gluster
12:03 hagarth joined #gluster
12:10 yoavz joined #gluster
12:16 kdhananjay joined #gluster
12:18 ppai joined #gluster
12:26 nehar joined #gluster
12:42 nbalacha joined #gluster
12:56 mobaer joined #gluster
13:01 haomaiwa_ joined #gluster
13:12 poornimag joined #gluster
13:17 ira joined #gluster
13:23 Nakiri__ joined #gluster
13:25 theron joined #gluster
13:29 chirino_m joined #gluster
13:34 ParsectiX joined #gluster
13:35 aravindavk joined #gluster
13:36 sathees joined #gluster
13:40 chromatin joined #gluster
13:45 voobscout joined #gluster
13:49 johnmilton joined #gluster
13:53 unclemarc joined #gluster
14:01 voobscout joined #gluster
14:01 haomaiwang joined #gluster
14:08 DV joined #gluster
14:10 natarej_ joined #gluster
14:13 hamiller joined #gluster
14:17 kdhananjay joined #gluster
14:24 skoduri joined #gluster
14:25 julim joined #gluster
14:26 theron joined #gluster
14:26 theron joined #gluster
14:27 natarej joined #gluster
14:29 Ulrar joined #gluster
14:31 Ulrar Hi, I have trouble understanding something, if someone can help. I have 3 nodes configured for one volume, with a replica count set to 3 (Number of Bricks: 1 x 3 = 3). There is about 600 GB worth of space on each, and with a replica of 3 I'd expect to see a volume of 600 Gb, but I see twice that in df -h
14:32 Ulrar I was thinking of lowering the replica to 2, but it seems weird
14:35 wnlx joined #gluster
14:39 Wizek joined #gluster
14:40 Ulrar Oh wait, it's my fault. Forgot I was using RAID 5, glusterfs is displaying the correct number here, my bad
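[Editor's note: for a pure replica volume like Ulrar's (Number of Bricks: 1 x 3 = 3), df on the mount should report roughly the capacity of a single brick, since every byte is stored once per replica. A quick sanity check of the arithmetic, using the ~600 GB brick size from the conversation above:]

```shell
# Usable space on a replicated volume = total raw brick space / replica count.
brick_size_gb=600
bricks=3
replica_count=3
echo "$(( brick_size_gb * bricks / replica_count )) GB usable"
# prints "600 GB usable"
```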
14:44 voobscout joined #gluster
14:44 skylar joined #gluster
14:53 chirino joined #gluster
14:56 plarsen joined #gluster
14:58 ekuric1 joined #gluster
14:59 DV joined #gluster
15:01 haomaiwa_ joined #gluster
15:04 ovaistariq joined #gluster
15:05 tswartz joined #gluster
15:11 amye joined #gluster
15:19 rwheeler joined #gluster
15:22 hchiramm joined #gluster
15:22 Slashman joined #gluster
15:22 DV__ joined #gluster
15:28 DV joined #gluster
15:31 atalur joined #gluster
15:34 NuxRo joined #gluster
15:35 wushudoin joined #gluster
15:37 rGil joined #gluster
15:49 theron joined #gluster
15:53 farhorizon joined #gluster
15:53 robb_nl joined #gluster
15:59 Slashman joined #gluster
15:59 Slashman joined #gluster
16:01 7GHAABPAU joined #gluster
16:02 theron joined #gluster
16:03 dpaz joined #gluster
16:05 dpaz hi guys , I have a 3 node gluster setup and I was wondering if I need to configure any fencing agent and if there's any guide for that
16:09 dpaz actually I'm sure I need it , but is there anything integrated with gluster
16:19 ovaistariq joined #gluster
16:21 kkeithley dpaz: There are pacemaker resource agents for gluster in the glusterfs-resource-agents RPMs for Fedora/RHEL/CentOS.  They're there too in the SuSE RPMs and Debian/Ubuntu .debs.
16:27 dpaz kkeithley: thanks!
16:29 theron joined #gluster
16:31 merp_ joined #gluster
16:33 jiffin joined #gluster
16:34 voobscout joined #gluster
16:35 kanagaraj joined #gluster
16:35 dnoland1 joined #gluster
16:35 rafi joined #gluster
16:38 dnoland1 our gluster nfs share is allowing us to run commands like $(touch some_new_file), but if we run $(echo words > some_new_file) we get this: some_new_file: Read-only file system
16:39 dnoland1 We also *can* do this: $(touch new_file && echo words > new_file)
16:40 dnoland1 It is just writing data on a new file that is causing problems.  Any thoughts?
16:40 JoeJulian My first assumption is that you have quorum enabled and have lost quorum.
16:41 dnoland1 I do have log errors to that effect.  That said, I have the following quorum settings:
16:41 dnoland1 cluster.quorum-type (none)
16:41 shubhendu joined #gluster
16:41 dnoland1 cluster.quorum-count (null)
16:42 wolsen joined #gluster
16:42 dnoland1 that is just for our nfs share
16:42 dnoland1 The remainder of our gluster system is working fine
16:42 neofob joined #gluster
16:43 bfm joined #gluster
16:43 dnoland1 I am seeing this in the nfs log
16:43 dnoland1 [2016-02-22 16:37:55.253163] W [MSGID: 114031] [client-rpc-fops.c:2402:client3_3_create_cbk] 0-home-client-13: remote operation failed. Path: /d/dano2364/some_random_file [Transport endpoint is not connected]
16:43 dnoland1 [2016-02-22 16:37:55.299597] W [MSGID: 108001] [afr-transaction.c:686:afr_handle_quorum] 0-home-replicate-4: /d/dano2364/some_random_file: Failing CREATE as quorum is not met
16:43 voobscout joined #gluster
16:44 dnoland1 But all three of our gluster servers can see each other (based on this command): sudo gluster peer status
16:44 dnoland1 Number of Peers: 2
16:44 dnoland1 Hostname: ss-62
16:44 dnoland1 Uuid: 752c9501-dea7-467c-91ec-2e942df2d86c
16:44 dnoland1 State: Peer in Cluster (Connected)
16:44 dnoland1 Hostname: ss-61
16:44 dnoland1 Uuid: bbab75b5-77a0-4752-b410-054844184137
16:44 dnoland1 State: Peer in Cluster (Connected)
16:44 JoeJulian @paste
16:44 glusterbot JoeJulian: For a simple way to paste output, install netcat (if it's not already) and pipe your output like: | nc termbin.com 9999
16:44 dnoland1 sorry, will use pbin
16:45 dnoland1 http://sprunge.us/BKhg
16:46 JoeJulian I guess we should look at 'gluster volume status home'
16:46 bfm Hi Guys! I have geo-replication working between two clusters, but recently I noticed hidden files appearing on the geo-rep slaves. Say, I have dir/file1.txt at the source and at some stage on the geo-rep slave I see dir/.file1.txt.VA5U1q of zero size. It looks to me that some sort of geo-rep hiccup is happening here, but I can't track it down through the logs :-(
16:47 JoeJulian That looks like an rsync tempfile.
16:47 dnoland1 http://sprunge.us/NFBj
16:49 bfm @JoeJulian, trouble is that those files do not get cleaned up
16:55 JoeJulian bfm: Are you sure the geo-rep of those files is complete? My assumption would be that they would stick around so they could be continued if it's not.
16:56 JoeJulian If they are, I would file a bug report about that.
16:56 glusterbot https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS
16:57 JoeJulian dnoland1: You do have some services that are not running. That shouldn't cause a loss of quorum when none is defined though.
16:59 theron joined #gluster
17:00 JoeJulian dnoland1: Try setting quorum.count to 0 instead of null.
17:00 dnoland1 JoeJulian: Thank you.  I am not sure why Self-heal Daemon is not running on ss-61
17:00 dnoland1 And I will try that
17:01 haomaiwang joined #gluster
17:02 bfm JoeJulian: it looks like this for example on geo-rep slave:
17:02 bfm -rw-r--r-- 1 2001     2001  98174969 Feb 22 16:30 file.20160218.0.379.noarch.rpm
17:02 bfm -rw------- 0 root repluser         0 Feb 22 14:48 .file.20160218.0.379.noarch.rpm.3VVTvo
17:02 bfm -rw------- 0 root repluser         0 Feb 22 15:18 .file.20160218.0.379.noarch.rpm.6g6KYl
17:02 bfm -rw------- 0 root repluser         0 Feb 22 15:36 .file.20160218.0.379.noarch.rpm.BQlZEt
17:02 bfm -rw------- 0 root repluser         0 Feb 22 14:07 .file.20160218.0.379.noarch.rpm.DtVws4
17:02 glusterbot bfm: -rw-r--r's karma is now -19
17:02 bfm -rw------- 0 root repluser         0 Feb 22 14:36 .file.20160218.0.379.noarch.rpm.HKGIVU
17:02 bfm -rw------- 0 root repluser         0 Feb 22 16:48 .file.20160218.0.379.noarch.rpm.KWHKOB
17:02 bfm -rw------- 0 root repluser         0 Feb 22 16:18 .file.20160218.0.379.noarch.rpm.LCvi6L
17:02 glusterbot bfm: -rw-----'s karma is now -6
17:02 bfm -rw------- 0 root repluser         0 Feb 22 16:06 .file.20160218.0.379.noarch.rpm.TFZt8Z
17:02 bfm -rw------- 0 root repluser         0 Feb 22 14:18 .file.20160218.0.379.noarch.rpm.TZ3JJZ
17:02 glusterbot bfm: -rw-----'s karma is now -7
17:02 bfm -rw------- 0 root repluser         0 Feb 22 15:06 .file.20160218.0.379.noarch.rpm.UOQTZt
17:02 glusterbot bfm: -rw-----'s karma is now -8
17:02 glusterbot bfm: -rw-----'s karma is now -9
17:02 glusterbot bfm: -rw-----'s karma is now -10
17:02 glusterbot bfm: -rw-----'s karma is now -11
17:02 glusterbot bfm: -rw-----'s karma is now -12
17:02 glusterbot bfm: -rw-----'s karma is now -13
17:02 glusterbot bfm: -rw-----'s karma is now -14
17:02 glusterbot bfm: -rw-----'s karma is now -15
17:02 JoeJulian @paste
17:02 glusterbot JoeJulian: For a simple way to paste output, install netcat (if it's not already) and pipe your output like: | nc termbin.com 9999
17:03 bfm sorry!
17:04 ivan_rossi left #gluster
17:05 dnoland1 JoeJulian: I tried to set quorum.count to zero, http://sprunge.us/fCOc
17:06 JoeJulian So I assume you pasted that, bfm, because you would like me to confirm my previous advice. I still recommend you file a bug report.
17:06 glusterbot https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS
17:09 JoeJulian dnoland1: Aha... "/* If user doesn't configure anything enable auto-quorum if the replica has odd number of subvolumes */"
17:10 JoeJulian dnoland1: so you would have to set quorum-type to "none" to disable that.
17:10 rcampbel3 joined #gluster
17:11 JoeJulian Which still doesn't explain why you're hitting quorum issues with all your bricks online.
17:11 dnoland1 JoeJulian: ok, so I should run $(sudo gluster volume set home cluster.quorum-type none) on my gluster daemons
17:11 dnoland1 ?
17:11 JoeJulian ... unless there are firewall issues.
17:11 dnoland1 Ok, will try that.  I appreciate your help :)
17:11 JoeJulian You would only have to set it on one.
17:11 dnoland1 k
17:12 JoeJulian The changes made through the cli are cluster-wide.
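[Editor's note: a minimal sketch of the command sequence discussed above, using the volume name "home" from this thread. As JoeJulian notes, it only needs to be run on one peer; glusterd propagates the setting cluster-wide.]

```shell
# Disable client-side (AFR) quorum enforcement on volume "home".
gluster volume set home cluster.quorum-type none

# Confirm the effective value ("volume get" exists in 3.7-era releases).
gluster volume get home cluster.quorum-type
```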
17:13 bluenemo joined #gluster
17:19 dnoland1 Ok, change made.
17:19 dnoland1 Is it possible that I am hitting quorum issues just because the self-heal daemon on ss-61 is offline (for reasons I don't understand)
17:20 calavera joined #gluster
17:20 JoeJulian No
17:21 JoeJulian It's because the nfs service isn't connected to all the bricks in the replica subvolume.
17:21 JoeJulian 0-home-replicate-4
17:26 bennyturns joined #gluster
17:29 jri joined #gluster
17:30 bfm JoeJulian: http://termbin.com/8s6n
17:30 bfm that's on the geo-rep slave
17:31 plarsen joined #gluster
17:31 dnoland1 JoeJulian: I see.  The nfs server on ss-61 is off for some reason.  Is there a more proper way to restart it than using systemd to restart glusterd?
17:32 edong23 joined #gluster
17:35 ahino joined #gluster
17:38 theron joined #gluster
17:41 JoeJulian dnoland1: Another way is "gluster volume start $volname force"
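[Editor's note: the "force" start is what resolves dnoland1's issue below — it (re)spawns any missing per-volume helper daemons, such as the gluster NFS server and the self-heal daemon, without disturbing bricks that are already running. A sketch with the volume name from this thread:]

```shell
# Respawn missing per-volume daemons (NFS server, self-heal daemon) for "home"
# without restarting bricks that are already up.
gluster volume start home force

# Check that "NFS Server" and "Self-heal Daemon" now show Y under Online.
gluster volume status home
```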
17:45 merp_ joined #gluster
17:48 dnoland1 That resolved the issue
17:48 dnoland1 Thank you JoeJulian.  I really appreciate your help
17:49 dnoland1 (just inherited this position from my boss, and knew nothing about gluster until just a little while ago, so your help really saved me)
17:49 ahino joined #gluster
17:49 JoeJulian You're welcome. Keep an eye on that. I suspect there's a reason those were not running.
17:49 JoeJulian With that many bricks, it might have been oom purging.
17:52 dnoland1 Will do.  Setting up icinga to monitor nfs daemon now
18:01 haomaiwa_ joined #gluster
18:02 hchiramm joined #gluster
18:09 theron joined #gluster
18:16 Manikandan joined #gluster
18:17 voobscout joined #gluster
18:21 jbrooks joined #gluster
18:29 NuxRo joined #gluster
18:42 nishanth joined #gluster
18:43 voobscout joined #gluster
18:44 amye joined #gluster
18:45 calavera joined #gluster
18:54 Melamo joined #gluster
18:55 Melamo left #gluster
19:00 ahino joined #gluster
19:01 haomaiwa_ joined #gluster
19:03 calavera joined #gluster
19:07 s-hell Hello everyone!
19:08 s-hell Can anyone help me with this error: https://paste.pcspinnt.de/view/raw/f19c97d2
19:11 jobewan joined #gluster
19:12 mtanner joined #gluster
19:36 JoeJulian s-hell: Did you actually define the geosync master as localhost?
19:44 s-hell hm, no. don't think so.
19:44 s-hell wait a second...
19:44 s-hell no, master node is set to hostname
19:45 s-hell i've used the georepsetup tool
19:46 s-hell JoeJulian: georepsetup pimages geouser@www1.ambiendo.ovh pimages
19:46 theron joined #gluster
19:47 theron joined #gluster
19:48 ovaistariq joined #gluster
19:49 calavera joined #gluster
19:49 s-hell there is no other error message only this error.
19:52 cliluw joined #gluster
20:01 haomaiwang joined #gluster
20:05 Trefex hi all. I have a 3-node distributed setup, and have found it impossible to load my data into the cluster for many months
20:05 Trefex i have tried mounting directly, rsync to an rsyncd, nfs, smbd
20:06 Trefex after a while i get timeouts and the rsync process fails
20:06 Trefex i have many small files, and i am using rsync so that i can resume, but I can't get my dataset in
20:06 Trefex any ideas?
20:07 deniszh joined #gluster
20:12 s-hell got it. it was a wrong mountbroker configuration
20:15 JoeJulian s-hell: Nice, glad you got it figured out.
20:16 JoeJulian Trefex: Check your client logs for clues. Try disabling performance translators one-at-a-time and see if that helps. See if there's something loading up on your servers that's causing them not to respond.
20:17 Trefex JoeJulian: performance translators?
20:17 JoeJulian gluster volume set help | grep performance
20:17 Trefex JoeJulian: i guess it's due to a long rsync process
20:18 voobscout joined #gluster
20:18 JoeJulian gluster shouldn't care.
20:18 JoeJulian It's just a bunch of writes to an open file descriptor.
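[Editor's note: JoeJulian's earlier suggestion, spelled out — enumerate the performance translators, then disable them one at a time and retest to isolate a culprit. Option availability varies by release; "live" is the volume name visible in Trefex's client-log excerpts later in this thread.]

```shell
# List the tunable performance options.
gluster volume set help | grep performance

# Illustrative: disable one translator, rerun the rsync workload, restore it.
gluster volume set live performance.write-behind off
# ...retest, then:
gluster volume set live performance.write-behind on
```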
20:19 Trefex JoeJulian: this is my current config http://ur1.ca/ok8qd
20:19 glusterbot Title: #327447 Fedora Project Pastebin (at ur1.ca)
20:19 Trefex JoeJulian: do you see anything fishy ?
20:20 JoeJulian "volume info" is more useful since that only shows the things that have been changed from default.
20:20 Kins joined #gluster
20:20 telmich joined #gluster
20:20 klaas joined #gluster
20:20 partner joined #gluster
20:20 the-me joined #gluster
20:20 javi404 joined #gluster
20:20 _nixpanic joined #gluster
20:20 frakt joined #gluster
20:20 renout_away joined #gluster
20:20 zerick joined #gluster
20:20 cuqa_ joined #gluster
20:20 xavih joined #gluster
20:20 ron-slc joined #gluster
20:20 csaba joined #gluster
20:20 JoeJulian But even then, there's no setting that should *make* it timeout.
20:20 ws2k3_ joined #gluster
20:20 tru_tru joined #gluster
20:20 _nixpanic joined #gluster
20:20 dastar joined #gluster
20:21 Iouns joined #gluster
20:21 s-hell joined #gluster
20:21 inodb joined #gluster
20:21 malevolent joined #gluster
20:21 Trefex JoeJulian: didn't know that http://ur1.ca/ok8qq
20:21 kenansulayman joined #gluster
20:21 bhuddah joined #gluster
20:21 glusterbot Title: #327451 Fedora Project Pastebin (at ur1.ca)
20:21 lkoranda joined #gluster
20:21 kenansulayman joined #gluster
20:21 Trefex also just for funsies, on another setup, i have 1 rsync to NFS over ZFS and 1 rsync of same data to GlusterFS
20:21 Trefex one takes 40 mins, the other 7 hours
20:22 dblack joined #gluster
20:22 Trefex which I read is normal for Gluster, so that's not cool :(
20:22 JoeJulian Well changing the log-levels to prevent seeing what might be a problem may make it more difficult for you to diagnose.
20:22 Trefex ow
20:22 Trefex client or brick ?
20:22 scuttle` joined #gluster
20:23 JoeJulian Maybe both. Depends. You're trying to diagnose a problem. Information helps with that.
20:23 wistof joined #gluster
20:23 Trefex gotcha, i think it's simply speed, gluster is crap with small files, but i didn't find a better alternative :)
20:24 JoeJulian cluster.data-self-heal-algorithm: full shouldn't come in to play here but do you have a reason you want it set that way?
20:24 Trefex JoeJulian: what could be a reason? I took over this setup and got no handover
20:24 Trefex so not sure why the options were set that way
20:24 _fortis joined #gluster
20:25 samikshan joined #gluster
20:25 JoeJulian Heh, in fact, there's no reason at all. I just noticed this isn't replicated so that's never used.
20:26 virusuy Hi all! . Im setting quotas to a folder with some data on it and the "used available" column in 'quota list' doesn't seems to be reflecting the reallity ( i'm on 3.6.x )
20:26 Trefex JoeJulian: which is something i might change, because right now, when a disk breaks, i have to take down whole cluster
20:26 virusuy i mean, in that folder are like 2TB of data, and quota say it's only using 500G
20:26 JoeJulian virusuy: search the mailing list. I saw mention of that there recently.
20:26 virusuy JoeJulian: gotcha!
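[Editor's note: for reference, the commands for inspecting quota state on a 3.6-era volume; "myvol" and the path are placeholders, not names from this conversation.]

```shell
# Show configured limits, used space, and available space per directory.
gluster volume quota myvol list
# Or restrict the output to a single directory:
gluster volume quota myvol list /folder
```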
20:27 marlinc joined #gluster
20:28 JoeJulian Trefex: With your rsync, are you using --inplace? If not, you're going to be adding some extra dereferences to a lookup.
20:28 john51 joined #gluster
20:28 Trefex JoeJulian: ya
20:28 JoeJulian dedup with zfs?
20:28 Trefex rsync -RraWHvzP --timeout=3000 --inplace --delete hcs/ /mnt/tmpMounts/hcs is what i use right now
20:30 Trefex JoeJulian: now client debug log files is filled with "on live-client-1 returned error [Stale file handle]
20:30 Trefex JoeJulian: http://ur1.ca/ok8rq is the ZFS setup of one of the nodes
20:30 glusterbot Title: #327459 Fedora Project Pastebin (at ur1.ca)
20:32 Trefex JoeJulian: basically it's off
20:41 virusuy JoeJulian: seems like this quota issue is on 3.6.x and the only workaround is update to 3.7
20:41 virusuy JoeJulian: thanks for the heads up
20:42 JoeJulian You're welcome.
20:42 Trefex JoeJulian: why does it say this 0-rpc-clnt: submitted request (XID: 0x29e3f8b Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) to rpc-transport (live-client-2)
20:42 Trefex even though i'm using Gluster 3.7.8 ?
20:42 JoeJulian The rpc version is 330.
20:42 Trefex oh i see
20:44 steveeJ joined #gluster
20:48 amye joined #gluster
20:51 ovaistariq joined #gluster
20:55 calavera joined #gluster
21:01 haomaiwang joined #gluster
21:04 theron joined #gluster
21:13 farhorizon joined #gluster
21:20 anoopcs joined #gluster
21:24 cuqa_ joined #gluster
21:30 jri joined #gluster
21:34 deniszh joined #gluster
21:41 cpetersen_ JoeJulian: If I were to simulate another failure and did get a momentary split-brain again, in that moment, what logs would you like me to pull?  And, if I do that would you be able to take a peek?
21:44 JoeJulian cpetersen_: The client log from ganesha (both the pre-ip-move, and post-ip-move clients), the brick logs, self-heal logs and, most importantly, 'getfattr -m . -d -e hex' for the file on the servers reporting split-brain.
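[Editor's note: the getfattr incantation JoeJulian asks for dumps the AFR changelog extended attributes gluster uses to decide split-brain. It must be run on each server against the file's path on the brick (the backend path, not the client mount). The brick path below is illustrative; the filename is from the logs later in this thread.]

```shell
# On each server: dump the file's xattrs as stored on the brick.
# The trusted.afr.* entries encode pending-operation counts per replica.
getfattr -m . -d -e hex /data/brick1/vmvol/DIR01/DIR01.nvram
```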
21:45 tessier joined #gluster
21:45 tessier Hello all! I just broke my cluster. :( I accidentally mounted something else over the gluster mountpoint on the brick nodes.
21:45 tessier Feb 22 14:09:50 disk10 gluster-j-brick[4388]: [2016-02-22 22:09:50.444248] M [posix-helpers.c:1718:posix_health_check_thread_proc] 0-9j-posix: health-check failed, going down
21:46 tessier How do I fix this? I've restarted gluster....
21:47 tessier Ah....phew.
21:47 tessier I restarted everything again and it came back up.
21:47 tessier That was stupid. Let's not do that again.
21:47 JoeJulian Assuming you fixed the mount, "gluster volume start $volname force" would have done it.
21:53 deniszh1 joined #gluster
21:54 cpetersen_ Hmmm... client log, trying to remember where that is.
21:55 cpetersen_ I think it's ganesha-gfapi.log but there are really only two so that must be it.
21:59 Philambdo1 joined #gluster
21:59 JoeJulian seems like a good guess.
22:01 64MAADRUZ joined #gluster
22:02 Philambdo1 joined #gluster
22:02 cpetersen_ JoeJulian:  http://paste.fedoraproject.org/327499/14561784/
22:02 glusterbot Title: #327499 Fedora Project Pastebin (at paste.fedoraproject.org)
22:03 cpetersen_ Don't have to run "getfattr -m . -d -e hex" as the files are known, right?
22:03 NuxRo joined #gluster
22:03 JoeJulian Have to run those since we want to know *why* gluster thinks they're split-brain.
22:03 cpetersen_ Ah ok, will do.
22:04 cpetersen_ Well.
22:04 cpetersen_ It doesn't think that anymore
22:04 JoeJulian Oh, wait... It's only complaining about the volume root?
22:04 cpetersen_ It's not in split-brain anymore and the VM failed over appropriately.
22:04 cpetersen_ :P
22:05 ParsectiX joined #gluster
22:05 cpetersen_ Yes - apparently JoeJulian.
22:05 JoeJulian Is it only lines 10 and 43 that you're concerned with?
22:05 cpetersen_ Correct.
22:05 cpetersen_ In this instance, at least.
22:06 JoeJulian I've never seen a split-brain volume root, but even if I had, I don't think it would be a problem.
22:07 JoeJulian Maybe I'll force one and see if it breaks anything.
22:07 ovaistariq joined #gluster
22:08 amye joined #gluster
22:12 cpetersen_ Also, should be upgrading gluster from 3.7.6?
22:14 deniszh joined #gluster
22:14 JoeJulian I would (did).
22:16 cpetersen_ Any issues?
22:16 cpetersen_ Benefits?
22:19 JoeJulian The only issues (which didn't effect my use case) is some performance issue with performance.write-behind ( bug 1309462 ).
22:19 glusterbot Bug https://bugzilla.redhat.com:443/show_bug.cgi?id=1309462 low, unspecified, ---, bugs, NEW , Upgrade from 3.7.6 to 3.7.8 causes massive drop in write performance.  Fresh install of 3.7.8 also has low write performance
22:20 JoeJulian Benefits: a very large number of memory leaks fixed thanks to post-factum.
22:20 cpetersen_ What causes the problem?  Having that feature enabled?  I don't think I have it enabled.
22:20 JoeJulian It's enabled by default.
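[Editor's note: if the 3.7.8 write-behind regression (bug 1309462) were a concern, the translator can be disabled per volume. This is purely a speculative mitigation — whether it restores performance is not established in this log. Volume name taken from cpetersen_'s log excerpts below.]

```shell
# Speculative: disable the write-behind performance translator for the volume.
gluster volume set SHARED_vmvol01 performance.write-behind off
```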
22:20 cpetersen_ OIC.
22:21 cyberbootje joined #gluster
22:26 ovaistariq joined #gluster
22:26 cpetersen_ Holy crap, ganesha-gfapi.log is 200 MB...
22:30 cpetersen_ It's very interesting.  The files aren't locked, but my VM just will not start up
22:30 cpetersen_ It's failed over to another host, so HA works fine, but the VM will just not start up
22:30 cpetersen_ If I bring the original host back up, it does start up fine
22:32 JoeJulian define "will not start up"
22:33 cpetersen_ The VMware machine posts, but following that it's just black.
22:33 cpetersen_ As if the VM nvram is not accessible.
22:34 JoeJulian Anything in that 200MB log when that happens?
22:38 cpetersen_ eh oh
22:38 kovshenin joined #gluster
22:39 cpetersen_ http://ur1.ca/ok98l
22:39 glusterbot Title: #327507 Fedora Project Pastebin (at ur1.ca)
22:39 post-factum JoeJulian: that's why I've cherry-picked memleak-related fixes on top of 3.7.6: https://github.com/pfactum/glusterfs/commits/fixes-3.7.6
22:39 glusterbot Title: Commits · pfactum/glusterfs · GitHub (at github.com)
22:40 post-factum but that is sad, I want my fresh shiny bug-free 3.7.9!
22:40 cpetersen_ DIR01 is the affected VM, FYI.
22:41 JoeJulian Show me a piece of software with no known bugs, and I'll show you a piece of software that isn't used.
22:41 cpetersen_ I don't like those errors.  lol
22:42 post-factum JoeJulian: saw some related joke on that, like if the app's size approaches 0, the amount of debug efforts also approaches 0, so zero-sized app tends to be bug-free
22:42 JoeJulian Someone will still complain about it though.
22:43 cpetersen_ :P
22:43 post-factum yep, size == 0 is edge case, so definitely it will trigger bugs in other apps ;)
22:43 dnoland1 left #gluster
22:44 kenhui joined #gluster
22:44 amye joined #gluster
22:44 JoeJulian hmm, "event generation 6" is new... must be an afr2 thing.
22:46 JoeJulian Ah, looks like it might be an invalid interpretation. A comment from afr_read_subvol_select_by_policy when it returns -1 reads, "no readable subvolumes, either split brain or all subvols down". So it may not be split-brain, but rather all subvolumes may be down.
22:48 JoeJulian So why is it losing connection to all subvolumes? Network?
22:49 JoeJulian And I really wish they would enforce either spaces or tabs in the source. I hate trying to read code that's indented every which way.
22:50 cpetersen_ In specific, where would the connection be dropping?
22:50 cpetersen_ I did kill one of the bricks, just not the one serving the ganesha nfs share.
22:51 * JoeJulian slaps cpetersen_ with a wet trout for throwing in extra variables during diagnostics.
22:53 cpetersen_ ?!?!?!
22:53 cpetersen_ I didn't throw in extra variables.  I killed an appliance.  I went and hard shutdown the appliance to test failure.
22:54 cpetersen_ The VMs are both on the that host, brick2 and the Windows VM that needs to failover to host 1.
22:54 cpetersen_ =)
22:54 cpetersen_ I'm hyper-converged, remember?
22:56 cpetersen_ The other VM is affected as well.
22:57 cpetersen_ Strange part is that the other VM, ACS01, is located on host 1 which has brick 1 on it.  Neither that host nor the nfs share was affected..
22:57 cpetersen_ But that VM will not boot either.
22:57 JoeJulian Ok, I misunderstood "I did kill one of the bricks" to mean one *other* brick.
22:57 cpetersen_ Correct.
22:58 cpetersen_ Bricks 1, 2 and 3.  1 has the NFS share primary VIP that I am consuming.  I killed host 2 which took down brick 2 and initiated a VMware HA failover of DIR01 to host 3.
22:59 cpetersen_ The NFS share was not affected, nor the files in an adverse way because if I boot host 2 up again, the VM will start up just fine.
22:59 cpetersen_ I am perplexed.
22:59 mobaer joined #gluster
22:59 JoeJulian Right, so you should have retained quorum and ganesha should have had active connections to 1 and 3 still.
22:59 cpetersen_ Correct, and it does.
22:59 JoeJulian Not according to that log.
23:00 cpetersen_ I have server quorum set to server and volume quorum set to auto.
23:00 cpetersen_ Should I change volume quorum to 2?
23:00 JoeJulian It doesn't go back to the beginning so I can't see if it ever had a connection.
23:00 cpetersen_ Actually, to 1.
23:00 JoeJulian No, auto is fine.
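[Editor's note: the two quorum knobs cpetersen_ describes, written out as commands; the volume name is from the gfapi log excerpts in this thread.]

```shell
# Server-side quorum: glusterd stops bricks when the peer quorum is lost.
gluster volume set SHARED_vmvol01 cluster.server-quorum-type server

# Client-side (AFR) quorum: "auto" requires a majority of the replicas,
# with the first brick acting as tie-breaker for even replica counts.
gluster volume set SHARED_vmvol01 cluster.quorum-type auto
```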
23:00 cpetersen_ Here are the complete logs.
23:00 cpetersen_ http://filebin.ca/2XtFzEB952Dt/file03logs.7z
23:01 cpetersen_ Thank God for compression...
23:01 haomaiwang joined #gluster
23:06 theron joined #gluster
23:06 JoeJulian cpetersen_: let me see volume info
23:08 cpetersen_ "gluster v info": http://ur1.ca/ok9cc
23:08 glusterbot Title: #327511 Fedora Project Pastebin (at ur1.ca)
23:11 cpetersen_ :D
23:11 cpetersen_ I felt like a real moron when I had my bricks mounted previously under the /run/gluster/shared_storage folder ... gah
23:13 cpetersen_ But let's not talk about that shall we >.<
23:17 JoeJulian :D
23:18 HugHern_ joined #gluster
23:18 cpetersen_ root, which is the owner, has RW on the files in the ESXi datastore
23:18 cpetersen_ so there are no locks present that I can see
23:20 cpetersen_ Nothing that VMware doesn't do natively as per the norm that is.  ie, *.vmx.lck file.
23:21 JoeJulian No, this totally looks like either a network condition, or a race.
23:22 JoeJulian I'm just not completely sure what's supposed to be happening in some of these bits.
23:22 JoeJulian Or why there's any logs from dht.
23:24 JoeJulian The one thing I'm 99% sure of is that there's no split-brain happening.
23:24 JoeJulian It's simply failing to pick a read-subvolume.
23:25 cpetersen_ So gluster is struggling to pick a brick to pull from?
23:25 cpetersen_ So then presumably, ganesha is working fine
23:26 cpetersen_ NFS 3 is doing the job, but the gluster client is struggling
23:27 JoeJulian That's the way I'm interpreting this.
23:27 JoeJulian Do you have an open bug report?
23:27 cpetersen_ Why would the VM fail to start though?  ESXi can see and list all of the files on the share...
23:27 cpetersen_ Not for this no, I don't feel I've identified a culprit yet
23:28 JoeJulian At the moment it tries, it cannot read the nvram file
23:28 JoeJulian [2016-02-19 20:58:35.632138] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-SHARED_vmvol01-client-2: remote operation failed. Path: /DIR01/DIR01.nvram (2a03a3c2-c444-46a5-b754-41b4f70d27ed) [No such file or directory]
23:29 cpetersen_ Right...
23:29 cpetersen_ Makes sense.
23:29 tessier JoeJulian: Thanks for the tip on "gluster volume start $volname force". Duly noted.
23:30 chromatin joined #gluster
23:30 JoeJulian cpetersen_: file a bug. Include those logs and volume info. Describe the steps to create the failure.
23:30 glusterbot https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS
23:30 JoeJulian If I see anything missing, I'll add my 2c.
23:35 cpetersen_ JoeJulian: The error you posted there didn't occur today when I simulated the failure.
23:35 cpetersen_ There were no remote operation failed messages today actually.
23:35 cpetersen_ Well, in relation to storage volumes, hold up, I may be lying
23:36 cpetersen_ "0-SHARED_vmvol01-replicate-0: Unreadable subvolume -1 found with event generation 6 for gfid d72a3396-f392-404d-91b7-1f1608cd61be. (Possible split-brain)"
23:36 cpetersen_ "0-SHARED_vmvol01-dht: <gfid:d72a3396-f392-404d-91b7-1f1608cd61be>: failed to lookup the file on SHARED_vmvol01-dht [Stale file handle]"
23:36 cpetersen_ These are the ones we are concerned about now, no?
23:39 ovaistariq joined #gluster
23:40 kenhui joined #gluster
23:41 cpetersen_ What is <brick>-dht?
23:41 arcolife joined #gluster
23:42 cpetersen_ Well nevermind, found your article.
23:42 cpetersen_ =)
23:45 kovshenin joined #gluster
