IRC log for #gluster-dev, 2016-01-29

All times shown according to UTC.

Time Nick Message
00:42 luizcpg joined #gluster-dev
00:57 luizcpg joined #gluster-dev
01:53 baojg joined #gluster-dev
01:56 JoeJulian There's no way to trigger a specific file to heal any more, is there?
01:56 JoeJulian Just have to wait for shd to get to it?
01:59 dlambrig1 left #gluster-dev
02:12 gem joined #gluster-dev
02:12 hagarth JoeJulian: find filename | xargs stat doesn't heal?
02:13 JoeJulian I thought it didn't anymore, that there was no more client-side heals. Or did I misunderstand someone?
02:16 hagarth JoeJulian: no, client side heals are not disabled. there was a plan but it has not yet been implemented.
02:18 nishanth joined #gluster-dev
02:19 JoeJulian huh.. well then I have some more digging to do so I can see why that doesn't happen.
02:30 JoeJulian Interesting... I wonder if it's read-hash-mode. It's attached to both replica, "pending_matrix: [ 35 0 ]", so it's clearly reading the pending flags from both replica, but "Only 1 child up - do not attempt to detect self heal".
02:40 JoeJulian Huh, yep. I'll have to check and see if I can repro with a current version.
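
For reference, a minimal sketch of the two ways to kick a heal discussed above, assuming a hypothetical volume name "myvol" mounted at /mnt/myvol. The stat-through-the-mount trick relies on client-side heals being enabled; the CLI commands drive the self-heal daemon instead.

    # Access the file through the mount so the client notices pending heals
    # (hypothetical path):
    stat /mnt/myvol/vm-images/disk1.img

    # Or drive the self-heal daemon from the CLI:
    gluster volume heal myvol          # heal entries already marked as needing heal
    gluster volume heal myvol full     # crawl the whole volume
    gluster volume heal myvol info     # list entries still pending heal
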
03:15 overclk joined #gluster-dev
03:21 kanagaraj joined #gluster-dev
03:27 Manikandan joined #gluster-dev
03:43 spalai joined #gluster-dev
03:44 itisravi joined #gluster-dev
03:47 sakshi joined #gluster-dev
03:52 nbalacha joined #gluster-dev
03:53 atinm joined #gluster-dev
04:01 shubhendu joined #gluster-dev
04:17 ashiq joined #gluster-dev
04:19 atinm joined #gluster-dev
04:21 aspandey joined #gluster-dev
04:28 rastar ppai: excellent, could you please reply with this dashboard link on the tips and tricks mail thread? That mail thread is not restricted to the dev environment; it is for the whole gluster ecosystem.
04:31 gem joined #gluster-dev
04:39 rafi joined #gluster-dev
04:41 poornimag joined #gluster-dev
05:02 poornimag joined #gluster-dev
05:06 skoduri joined #gluster-dev
05:07 spalai left #gluster-dev
05:08 spalai joined #gluster-dev
05:09 spalai left #gluster-dev
05:11 pranithk joined #gluster-dev
05:13 aravindavk joined #gluster-dev
05:14 Bhaskarakiran joined #gluster-dev
05:16 mchangir_ joined #gluster-dev
05:18 hgowtham joined #gluster-dev
05:18 pppp joined #gluster-dev
05:20 ggarg joined #gluster-dev
05:29 jiffin joined #gluster-dev
05:41 Saravanakmr joined #gluster-dev
05:41 kdhananjay joined #gluster-dev
05:44 ppai joined #gluster-dev
05:48 ppai pranithk, can you take this in for 3.7.7? http://review.gluster.org/#/c/13179/ It's not critical at all but has passed all regressions
05:49 pranithk ppai: no :-(
05:50 pranithk ppai: More people will ask for patches to be merged which can lead to spurious failures... The release is delayed by 30 days already. So....
05:51 ppai pranithk, np. Wanted it to end up in an RPM as gogfapi bindings depend on it and newer versions of swift rely on that. We've got a workaround though
05:51 pranithk ppai: cool, thanks
05:56 ppai pranithk|afk, fwiw, here are two more that have passed all regressions: https://goo.gl/t9wQg4
05:57 nishanth joined #gluster-dev
05:57 kanagaraj joined #gluster-dev
06:00 vimal joined #gluster-dev
06:08 pranithk ppai: soumya was right, master patch for memleak is not merged yet.
06:10 ppai pranithk, oh! then it makes sense
06:12 vmallika joined #gluster-dev
06:18 spalai joined #gluster-dev
06:22 kanagaraj joined #gluster-dev
06:23 JoeJulian I love all the memory management work that's been happening. :)
06:29 Saravanakmr /msg NickServ SETPASS Saravana_ vwuxxvfrukyv  ethufreenode456#
06:36 spalai joined #gluster-dev
06:39 pranithk JoeJulian: hey! how are you?
06:40 pranithk JoeJulian: Did you get a chance to see/attend facebook scale conf presentation?
06:40 karthikfff joined #gluster-dev
06:50 JoeJulian Not yet. I plan on watching it this weekend.
06:51 pranithk JoeJulian: cool
06:54 asengupt joined #gluster-dev
07:23 ppai atinm, this should render it:
07:23 ppai ![Glusterd2 Architecture](/design/GlusterD2/images/gd2-arch.png)
07:24 JoeJulian pranithk: Do you know of a way that I can figure out which shd is healing a specific file? I have some ideas I can try to see if I can figure out how far through the heal it is, but every idea I come up with for figuring out which shd is doing the job is failing me.
07:25 JoeJulian (I'm healing a half dozen 20TB images. It's taking forever and I'm afraid of doing anything that might make it restart from the beginning of the file)
07:25 pranithk JoeJulian: If you take a statedump of the brick on which this file is present, there will be locks in <volname>-replicate-0:self-heal domain.
07:26 JoeJulian excellent, thanks.
07:26 pranithk JoeJulian: Oh that is easy. Take a statedump and see the lock ranges. It will give the offset and length it is healing
07:26 JoeJulian even better
07:27 JoeJulian My next feature request will be to get that information someplace usable. :)
07:30 pranithk JoeJulian: Hmm... will think about this.
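
A sketch of the statedump approach pranithk describes above, assuming a hypothetical volume "myvol". Statedumps land under /var/run/gluster by default; the exact section and field names vary a little between versions, but the self-heal inodelk ranges (start/len) show which region of the file is currently being healed.

    # Ask every brick process of the volume to write a statedump
    gluster volume statedump myvol

    # On the brick host, find the self-heal lock domain and its byte ranges
    grep -A8 'self-heal' /var/run/gluster/*.dump.*
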
07:32 atinm ppai, cool
07:32 atinm I will send it
07:34 atinm ppai, is the path correct?
07:34 atinm ppai, shouldn't it be only images/gd2-arch.png?
07:44 poornimag joined #gluster-dev
08:05 post-factum pranithk|afk: will all memleak-related patches merged into 3.7.7? i mean fuse as well as api
08:06 post-factum s/merged/be merged
08:09 post-factum joined #gluster-dev
08:25 kanagaraj joined #gluster-dev
08:29 atinm ppai, http://review.gluster.org/#/c/13314/
08:45 hchiramm_ joined #gluster-dev
09:01 atalur joined #gluster-dev
09:05 raghu joined #gluster-dev
09:06 pranithk baojg: I found that the frames are stuck in write-behind.
09:06 pranithk raghu: http://pastebin.centos.org/38996/ is one statedump...
09:06 pranithk baojg: raghu will take over debugging this issue
09:06 raghu pranithk: yes. Will take a look. Seems like a race in write-behind
09:06 raghu is there a bug filed on this?
09:07 baojg got it.
09:08 baojg no, just the mail thread as far as I know.
09:09 raghu baojg: what version is this?
09:09 baojg 3.5.7, but I saw someone using 3.6.x
09:10 raghu ok
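
A hedged sketch of how a statedump like the one pasted above can be scanned for frames parked in write-behind; the field names (translator=, complete=) are from memory and may differ slightly between releases.

    # SIGUSR1 makes a glusterfs process write a statedump (default path
    # /var/run/gluster); assumes a single client process on the box
    kill -USR1 $(pgrep -x glusterfs)

    # Frames that have not unwound yet, filtered down to write-behind
    grep -B6 'complete=0' /var/run/gluster/glusterdump.*.dump.* | grep 'write-behind'
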
09:16 xavih pranithk: are you there ?
09:17 rog anyone have experience with 3.6.x on FreeBSD 10.2?
09:17 pranithk xavih: hey! tell me
09:18 rog just wanted to ascertain the general feeling of stability
09:18 xavih pranithk: I've been trying to determine the source of the memory leak in 3.7.6
09:18 rog looking to deploy it to escape the "NFS lock-in" with our clustered boxes
09:18 xavih pranithk: it seems related to inodes not being freed, however inode ref counting is correctly done
09:18 xavih pranithk: it seems to me that the problem is the 'nlookup' field
09:19 xavih pranithk: are you familiar with the exact meaning of this field?
09:21 pranithk xavih: yes
09:21 pranithk xavih: let me see if there is documentation available. give me a minute
09:22 xavih pranithk: well, the problem seems to be that when a readdirp is done, nlookup is incremented. When ref counting reaches 0, since nlookup is not 0, the inode is "passivated" instead of "retired", causing it to not be destroyed
09:22 xavih pranithk: nlookup of inodes corresponding to each entry of the read dir, I mean
09:23 aspandey joined #gluster-dev
09:23 pranithk xavih: https://gluster.readthedocs.org/en/release-3.7.0-1/Developer-guide/datastructure-inode/
09:23 xavih pranithk: nlookup should be the number of *currently* running lookups, I think, but this rule is broken in the readdirp case
09:24 pranithk xavih: oh
09:24 pranithk xavih: but kernel sends forgets
09:25 xavih pranithk: yes, but it only sends forgets for lookups it knows about (i.e. lookups sent by the kernel)
09:25 pranithk xavih: fuse mounts?
09:25 pranithk xavih: fuse_forget(). Even readdirp is sent by kernel
09:25 xavih pranithk: so it will always send a forget with an nlookup count one less than the current nlookup count of the inode
09:26 pranithk xavih: not true as per my understanding...
09:26 pranithk xavih: let me check once
09:26 xavih pranithk: the kernel takes into account the inode of the readdirp ?
09:27 pranithk xavih: check xlators/mount/fuse/src/fuse-bridge.c fuse_readdirp_cbk, it will do inode_lookup only for the inodes it sends to the kernel in readdirp
09:27 pranithk xavih: as far as I understand yes
09:27 xavih pranithk: yes
09:27 xavih pranithk: but I'm not sure that this is counted by the kernel as an additional nlookup
09:27 xavih pranithk: anyway I haven't thought about that. Let me check
09:28 xavih pranithk: maybe I'm wrong
09:28 xavih pranithk: I'll tell you something when I have more information
09:28 xavih pranithk: thanks :)
09:28 pranithk xavih: np. Do you have simple test case to re-create it?
09:29 xavih pranithk: a simple "find /<mount point> -type f"
09:29 pranithk xavih: oh in that case the inode won't be forgotten.
09:29 xavih pranithk: why ?
09:29 pranithk xavih: the rules for forgetting: 1) Kernel sees memory pressure, so it needs to forget. 2) On unlink because the memory is useless as the inode doesn't exist
09:30 pranithk xavih: if we just do find, kernel chooses to store the inodes as long as it sees fit
09:30 xavih pranithk: ok, let me check this :)
09:31 pranithk xavih: the way you can find if it is leaking is by doing the following: 1) Create a directory hierarchy (untar maybe?) 2) do a find . -type f 3) rm -rf
09:31 pranithk xavih: if you see inodes even after this. Then we have something to debug :-)
09:31 xavih pranithk: ok. Thanks :)
09:37 xavih pranithk: a drop_caches should have the same effect as an rm for inode forgets, right?
09:39 pranithk xavih: It didn't work for me when I did
09:39 pranithk xavih: only rm -rf used to work
09:39 xavih pranithk: I'll try that then
09:39 xavih pranithk++: thanks again :)
09:39 glusterbot xavih: pranithk's karma is now 42
09:40 pranithk xavih: np xavi
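
A minimal sketch of the leak check pranithk outlines above, assuming a hypothetical fuse mount at /mnt/myvol and a hypothetical tarball. After step 3 (or after dropping dentries and inodes with drop_caches, which also makes the kernel send forgets), the inode-table counters in a client statedump should fall back close to zero; if they do not, something is holding inodes.

    cd /mnt/myvol
    tar xf ~/linux-4.4.tar.xz            # 1) create a directory hierarchy
    find . -type f > /dev/null           # 2) walk it so lookup/readdirp populate inodes
    rm -rf linux-4.4                     # 3) remove it so the kernel sends forgets

    # Alternative to the rm: drop dentries and inodes system-wide
    sync; echo 2 > /proc/sys/vm/drop_caches

    # Check the client's inode table in a statedump (SIGUSR1, as above)
    kill -USR1 $(pgrep -x glusterfs)
    grep -E 'active_size|lru_size' /var/run/gluster/glusterdump.*.dump.*
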
09:40 anmol joined #gluster-dev
09:42 pranithk xavih: readdirp in the fuse kernel module was introduced by Avati, so the testing happened with gluster first. I never checked the fuse code though, so my guess is that all this lookup accounting is right in readdirp ;-)
09:43 aravindavk joined #gluster-dev
09:43 xavih pranithk: yes, it's probably right :) I'll continue testing
09:44 pranithk raghu: I will be leaving for home in 45 minutes. Will you be able to help xavih with inode-leaks? He may have more questions about inode ref/nlookup count accounting.
09:50 pranithk xavih: do you know about TCP_USER_TIMEOUT? it also helps in preventing hangs it seems.
09:50 pranithk xavih: The hangs where afr waits for a write to complete on a brick when the brick is suddenly rebooted. At the moment we wait for the ping-timeout to happen.
09:52 xavih pranithk: but we currently use SO_KEEPALIVE in socket, don't we ?
09:53 xavih pranithk: this should also detect the node failure
09:54 pranithk xavih: but keepalive doesn't work when the node shuts off
09:54 pranithk xavih: I will give you the RFC, wait
09:54 xavih pranithk: oh
09:54 xavih pranithk: I thought it could detect that with a reasonable timeout
09:54 pranithk xavih: https://tools.ietf.org/html/rfc5482
09:59 xavih pranithk: it's interesting :)
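
For context, a hedged sketch of the knobs that exist for this failure mode as of this discussion; the volume name is hypothetical. TCP_USER_TIMEOUT itself (RFC 5482) is a per-socket setsockopt option, so adopting it would mean setting it from the socket transport code rather than via a sysctl.

    # Keepalive probes are only sent on idle connections; with unacknowledged
    # data in flight (the rebooted-brick case), retransmission timeouts apply
    # instead, which is why keepalive alone does not catch it
    sysctl net.ipv4.tcp_keepalive_time net.ipv4.tcp_keepalive_intvl net.ipv4.tcp_keepalive_probes

    # The Gluster-level timeout that currently bounds how long afr waits
    gluster volume set myvol network.ping-timeout 10
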
10:08 pranithk xavih: Are you seeing the leaks still?
10:09 xavih pranithk: I haven't tried with rm yet
10:10 pranithk xavih: okay. Will be leaving soon. Send me a mail if you need something. I will reply in about 5-6 hours from now...
10:10 Manikandan joined #gluster-dev
10:10 xavih pranithk: thank you very much for your help :)
10:15 ggarg joined #gluster-dev
10:23 pranithk xavih: you helped me so much for ec. Time to help you back :-). Okay logging off now. Cya.
10:25 ppai joined #gluster-dev
10:32 luizcpg joined #gluster-dev
10:59 itisravi joined #gluster-dev
11:06 nbalacha joined #gluster-dev
11:20 baojg joined #gluster-dev
11:39 ira joined #gluster-dev
11:47 jiffin1 joined #gluster-dev
11:49 luizcpg joined #gluster-dev
11:51 luizcpg joined #gluster-dev
11:58 luizcpg joined #gluster-dev
12:28 nbalacha joined #gluster-dev
12:35 asengupt joined #gluster-dev
12:36 dlambrig joined #gluster-dev
12:37 jiffin1 joined #gluster-dev
12:57 luizcpg joined #gluster-dev
13:03 raghu joined #gluster-dev
13:08 kdhananjay joined #gluster-dev
13:21 overclk kkeithley: regarding change #13274 - not compiled, untested, use with caution ;)
13:22 sakshi joined #gluster-dev
13:37 asengupt joined #gluster-dev
13:51 overclk joined #gluster-dev
13:53 gem joined #gluster-dev
14:01 EinstCrazy joined #gluster-dev
14:11 asengupt joined #gluster-dev
14:20 baojg joined #gluster-dev
14:34 ppai joined #gluster-dev
14:36 mchangir_ joined #gluster-dev
14:38 luizcpg joined #gluster-dev
14:39 gem joined #gluster-dev
14:54 mchangir_ joined #gluster-dev
15:08 spalai left #gluster-dev
15:16 baojg joined #gluster-dev
15:17 dlambrig1 joined #gluster-dev
15:55 baojg joined #gluster-dev
16:00 wushudoin joined #gluster-dev
16:00 wushudoin joined #gluster-dev
16:01 spalai joined #gluster-dev
16:06 ggarg joined #gluster-dev
16:32 ggarg joined #gluster-dev
16:32 shaunm joined #gluster-dev
17:05 gem joined #gluster-dev
17:09 jiffin joined #gluster-dev
17:16 jiffin1 joined #gluster-dev
17:19 baojg joined #gluster-dev
17:32 jiffin joined #gluster-dev
17:34 shubhendu joined #gluster-dev
17:38 baojg joined #gluster-dev
17:42 jiffin joined #gluster-dev
17:43 baojg joined #gluster-dev
17:49 post-factum talking about memory leaks
17:49 post-factum root     16647 85.5  6.7 3613844 3316952 pts/1 Sl+  Jan26 4068:24 valgrind --leak-check=full --show-leak-kinds=all --log-file=valgrind_fuse.log /usr/sbin/glusterfs -N --volfile-server=glusterfs.la.net.ua --volfile-id=asterisk_records /mnt/net/glusterfs/asterisk_records
17:49 post-factum i guess, 3.3G is enough for valgrind, so I stop rsync
17:50 post-factum do drop_caches and umount
17:51 post-factum while dropping caches, glusterfs process CPU usage is high
17:51 vimal joined #gluster-dev
17:51 post-factum but RAM consumption doesn't change
17:52 post-factum nice DHT leaks, i see
17:53 post-factum https://gist.github.com/f8e0151a6878cacc9b1a
17:53 post-factum will post it to mailing list
17:56 post-factum and i guess i need to do some summary, because the current thread is big enough to get lost in
18:09 post-factum ok, posted summary to mailing lists
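
A sketch of the workflow post-factum describes above (volume path and log file name taken from the pasted command line; the grep is just one way to read the report). Note that valgrind only writes the full leak report when the glusterfs process exits, which for a fuse client happens at unmount.

    # Stop the workload, force the kernel to send forgets, then unmount
    sync
    echo 2 > /proc/sys/vm/drop_caches
    umount /mnt/net/glusterfs/asterisk_records

    # Read the leak report written via --log-file
    grep -A8 'definitely lost' valgrind_fuse.log | less
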
18:11 rafi joined #gluster-dev
18:26 primusinterpares joined #gluster-dev
18:43 baojg joined #gluster-dev
19:11 jiffin joined #gluster-dev
19:31 jiffin joined #gluster-dev
19:44 baojg joined #gluster-dev
20:32 jiffin1 joined #gluster-dev
20:44 jiffin joined #gluster-dev
20:46 baojg joined #gluster-dev
21:09 jiffin joined #gluster-dev
21:13 hagarth post-factum: just responded to your email. hopefully that patch should fix the dht leaks seen.
21:16 post-factum hagarth: yup, got that. will test asap
21:28 post-factum built ok, mounted ok. running tests
21:29 hagarth post-factum: nice!
21:32 jiffin joined #gluster-dev
21:47 baojg joined #gluster-dev
21:49 wushudoin joined #gluster-dev
22:18 jiffin joined #gluster-dev
22:25 jiffin joined #gluster-dev
22:47 baojg joined #gluster-dev
23:11 shyam left #gluster-dev
23:30 JoeJulian hagarth++
23:30 glusterbot JoeJulian: hagarth's karma is now 79
23:49 baojg joined #gluster-dev
