IRC log for #gluster-dev, 2013-03-22


All times shown according to UTC.

Time Nick Message
00:37 yinyin joined #gluster-dev
01:07 jules_ joined #gluster-dev
01:24 yinyin joined #gluster-dev
01:51 sahina joined #gluster-dev
01:58 bala joined #gluster-dev
02:11 sahina joined #gluster-dev
02:37 jdarcy joined #gluster-dev
03:08 nixpanic joined #gluster-dev
03:08 nixpanic joined #gluster-dev
03:20 rastar joined #gluster-dev
03:28 yinyin joined #gluster-dev
03:30 bharata joined #gluster-dev
03:43 bharata joined #gluster-dev
04:00 bulde joined #gluster-dev
04:05 anmol joined #gluster-dev
04:05 sac joined #gluster-dev
04:20 sgowda joined #gluster-dev
04:32 pai joined #gluster-dev
04:41 deepakcs joined #gluster-dev
04:51 sripathi joined #gluster-dev
04:54 yinyin joined #gluster-dev
05:01 raghu joined #gluster-dev
05:02 aravindavk joined #gluster-dev
05:10 harshpb joined #gluster-dev
05:18 yinyin joined #gluster-dev
05:20 mohankumar joined #gluster-dev
05:27 hagarth joined #gluster-dev
05:34 rastar joined #gluster-dev
05:49 lala_ joined #gluster-dev
05:54 joaquim__ joined #gluster-dev
05:57 vshankar joined #gluster-dev
05:59 pai joined #gluster-dev
06:19 aravindavk joined #gluster-dev
06:27 mohankumar joined #gluster-dev
06:56 deepakcs joined #gluster-dev
07:08 vshankar joined #gluster-dev
07:58 harshpb joined #gluster-dev
08:13 harshpb joined #gluster-dev
08:38 harshpb joined #gluster-dev
09:34 deepakcs joined #gluster-dev
10:10 harshpb joined #gluster-dev
10:18 harshpb joined #gluster-dev
10:28 harshpb joined #gluster-dev
10:37 deepakcs joined #gluster-dev
10:54 lalatenduM joined #gluster-dev
11:07 jdarcy joined #gluster-dev
11:20 xavih what is the purpose of the features/index xlator ?
11:21 xavih is it for geo-rep ?
12:15 hagarth xavih: features/index is for creating indices. Self-heal-daemon is a consumer of one such index created by features/index.
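A minimal sketch of where that index lives on a brick and how its contents surface to an admin; <brick-path> and <volname> are placeholders, and the layout assumed is the stock one under the brick root:

    # pending-heal entries are kept as gfid-named files under the brick root
    ls /<brick-path>/.glusterfs/indices/xattrop
    # the self-heal daemon consumes this index; the CLI exposes the same data
    gluster volume heal <volname> info
    gluster volume heal <volname> info heal-failed
    gluster volume heal <volname> info split-brain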
12:19 H__ A 3.3.1 glusterfsd died (was serving replace-brick data). I see no clues in the logs. What are recommended methods to monitor for and restart glusterfsd's ?
12:34 edward1 joined #gluster-dev
12:34 sgowda joined #gluster-dev
12:36 jdarcy joined #gluster-dev
12:37 xavih hagarth: I've a problem with the index created in .glusterfs/indices/xattrop
12:38 xavih hagarth: there are thousands of files and they do not disappear
12:38 hagarth xavih: that being?
12:38 hagarth xavih: do you have self-heal-daemon running?
12:38 xavih hagarth: gluster volume heal <volname> info heal-failed reports gfid that are inside that directory
12:39 xavih hagarth: yes
12:39 xavih it seems that something is stuck
12:39 xavih hagarth: with gluster volume stopped is it safe to delete the directory ?
12:40 hagarth xavih: since there are failures, it might be pointing to split-brains.
12:40 hagarth xavih: anything in the log files around split-brain activity?
12:40 xavih hagarth: gluster volume heal <volname> info split-brain reports no entries
12:41 hagarth xavih: you can probably check the log files for the reasons behind self-heal failures.
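A hedged example of the kind of log check hagarth means here, assuming the stock log locations under /var/log/glusterfs (they can differ per install):

    # self-heal daemon log on each server
    grep -iE "split-brain|self-heal" /var/log/glusterfs/glustershd.log
    # per-brick logs, one per brick path
    grep -i "self-heal" /var/log/glusterfs/bricks/*.log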
12:41 pranithk joined #gluster-dev
12:42 jdarcy joined #gluster-dev
12:43 xavih hagarth: there aren't errors saying anything about split brains (apparently)
12:43 xavih hagarth: I've seen something very strange in .glusterfs
12:44 xavih hagarth: there is a gfid representing a directory. It should be a symlink AFAIK but it is a real directory
12:45 xavih hagarth: and there is this in the log file:  0-vol01-posix: open on /pool/a/.glusterfs/be/55/be55d1a5-733d-4023-9b1b-025b3b6f1849/VOL_02: No such file or directory
12:46 xavih hagarth: be55d1a5-733d-4023-9b1b-025b3b6f1849 IS a directory, not a symlink
12:46 xavih hagarth: is this normal ?
12:47 pranithk xavih: /pool/a/.glusterfs/be/55/be55d1a5-733d-4023-9b1b-025b3b6f1849 is a symlink to a directory I think...
12:49 xavih pranithk: it should be, but it isn't: drwx------ 2 root root      149 Mar 21 16:06 be55d1a5-733d-4023-9b1b-025b3b6f1849
12:50 xavih pranithk: there are other directories that are in fact symlinks: lrwxrwxrwx 1 root root       51 Jan  4 09:54 be55df92-acb9-438a-b4c7-b37664d89621 -> ../../88/31/8831ae9c-2482-454b-878e-2e0534d36755/C7
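A quick way to check whether a gfid handle under .glusterfs is the expected symlink rather than a real directory; the path and gfid below are simply the ones from this conversation:

    cd /pool/a/.glusterfs/be/55
    stat -c '%F' be55d1a5-733d-4023-9b1b-025b3b6f1849   # should print "symbolic link", not "directory"
    readlink be55d1a5-733d-4023-9b1b-025b3b6f1849       # normally ../../<pp>/<qq>/<parent-gfid>/<dirname>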
12:50 pranithk xavih: https://bugzilla.redhat.com/show_bug.cgi?id=859581 seems like this issue...
12:50 glusterbot Bug 859581: high, unspecified, ---, vsomyaju, ASSIGNED , self-heal process can sometimes create directories instead of symlinks for the root gfid file in .glusterfs
12:51 pranithk xavih: we don't know how this happens.. Can you give me more info about what you did to get into this state...
12:51 xavih pranithk: interesting... can this be the cause of features/index not processing the queue ?
12:52 xavih pranithk: well, this volume has suffered a lot of problems. Not sure what may have caused this
12:52 pranithk According to the bug, self-heal did not progress until the directory was removed and the symlink created..
12:53 xavih pranithk: ok, I'll read the bug and try to do as you say...
12:53 pranithk xavih: Could you please give me info about what problems happened on the volume. I have been extremely curious about the bug.. was never able to figure out the reason why this would happen..
12:54 pranithk xavih: It would be extremely useful for figuring out what may have caused it...
12:55 pranithk xavih: before you change the state of the system... could you give me the stat output on that directory....
12:55 xavih pranithk: well, it started the last year... the volume was expanded, then a couple of bricks were replaced (all this with 3.2)
12:56 xavih pranithk: for performance reasons the rebalance needed to be stopped several times during work hours
12:56 xavih pranithk: then a massive self-heal started (full replace of a brick, this time with 3.3.1)
12:57 xavih pranithk: then they required my attention, and after looking at the logs and extended attributes I saw that many of these operations were made concurrently
12:58 xavih pranithk: they thought the previous operation was completed when in fact it wasn't
12:58 yinyin joined #gluster-dev
12:58 xavih pranithk: so I don't know what can be the cause of this...
12:58 xavih pranithk: however I have some testing environment where I'll be able to do some tests when this problem gets stabilized
12:59 xavih pranithk: what do you need exactly from the directory ?
12:59 pranithk stat output
13:00 xavih can I paste the output here ?
13:00 pranithk sure
13:00 xavih [root@srv01 55]# stat be55d1a5-733d-4023-9b1b-025b3b6f1849 File: `be55d1a5-733d-4023-9b1b-025b3b6f1849' Size: 149             Blocks: 0          IO Block: 4096   directory
13:00 xavih Device: 800h/2048d      Inode: 8078161     Links: 2
13:00 xavih Access: (0700/drwx------)  Uid: (    0/    root)   Gid: (    0/    root)
13:00 xavih Access: 2013-03-21 19:24:50.528236424 +0100
13:00 xavih Modify: 2013-03-21 16:06:48.024652576 +0100
13:01 xavih Change: 2013-03-21 16:06:48.024652576 +0100
13:01 pranithk xavih: wow all of them today....
13:01 pranithk what is the brick process version? 3.3.x?
13:02 xavih well, maybe it's related to the self-heal problems. Every 10 minutes new errors appear in gluster volume heal XXXX info heal-failed
13:02 xavih and users are working
13:02 pranithk could you grep for ".glusterfs" in the brick logs and see if there are any matches?
13:02 xavih 3.3.1
13:02 xavih sure
13:03 pranithk brb in 1 minute.
13:05 pranithk xavih: back
13:06 xavih pranithk: there are basically two errors:  0-vol01-posix: open on /pool/b/.glusterfs/be/55/be55d1a5-733d-4023-9b1b-025b3b6f1849/VOL_04: No such file or directory
13:06 xavih pranithk: and this: ] 0-vol01-posix: symlink ../../b7/77/b777dd24-6037-450f-8881-abfa2ec1f01a/6 -> /pool/b/.glusterfs/b7/77/b777dd24-6037-450f-8881-abfa2ec1f01a/6 failed (No such file or directory)
13:07 pranithk are you sure you are searching in the correct log? because the directory you mentioned is in /pool/a
13:08 xavih pranithk: yes, I searched all logs. The message appears both in /pool/a and /pool/b
13:09 xavih pranithk: the last error seems to be caused by a nonexistent 'real' directory
13:09 pranithk xavih: Are these logs confidential, or can I get them to debug?...
13:10 xavih pranithk: well, I think this won't be possible. They are from a customer with high security concerns... (too high sometimes)
13:11 pranithk xavih: :-(
13:11 pranithk xavih: was that error from the posix_handle_soft function, I mean in the log...
13:12 xavih pranithk: however I promise I'll look into it to determine the cause. Anything I find I'll post to the bug page
13:12 pranithk cool
13:12 xavih pranithk: the second error, yes
13:12 xavih pranithk: the first one is from posix_open
13:12 pranithk xavih: Are there any errors with strings "error mkdir hash-1" or "error mkdir hash-2" ?
13:13 xavih pranithk: no, none
13:13 pranithk xavih: hmm....
13:15 xavih pranithk: Don't worry, I'll try to solve the problem with your indications and will tell you how it went
13:15 xavih pranithk: Can I replace the directory with a symlink with the gluster running ?
13:15 pranithk xavih: the directory be/55/be55d1a5-733d-4023-9b1b-025b3b6f1849 was created just today... Are there any logs with the be55d1a5-733d-4023-9b1b-025b3b6f1849 string?
13:17 xavih pranithk: no, it has been modified today, but that doesn't tell us when it was created, does it ?
13:17 pranithk xavih: That is not a good idea generally... Let me read what Joe did when it happened for him.. You can ping him on #gluster channel if you need more info ..
13:17 xavih pranithk: thank you
13:18 xavih pranithk: the only message that appears with that uuid in brick logs is the error I posted previously
13:18 xavih pranithk: there is nothing in volume log
13:18 pranithk xavih: ok...
13:19 pranithk xavih: Hey could you paste me the second error log the one with pool/a
13:19 xavih pranithk, hagarth_: thank you very much for your help, really appreciated
13:20 pranithk xavih: I want to see the errno for failure in symlink...
13:20 pranithk xavih: probably it is EEXIST, but still..
13:20 xavih pranithk: Will tell you something as soon as I can solve the customer problem and I can concentrate exclusively on this
13:21 pranithk oh cool
13:21 hagarth joined #gluster-dev
13:21 pranithk xavih: ping me if I am around...
13:21 xavih pranithk: sure :)
13:23 xavih pranithk: sorry, one more thing...
13:23 xavih pranithk: is it safe to delete all files from .glusterfs/indices/xattrop with the volume stopped ?
13:24 pranithk xavih: You may miss some self-heals if you do that..
13:25 xavih pranithk: all indices files are on one brick of a replica. The other one has the directory empty
13:25 hagarth xavih: you might have to trigger full self-heal on that volume.
13:25 xavih pranithk: is it safe to assume that the other brick is up to date ?
13:26 xavih pranithk: the indices files are pending updates to local brick, right ?
13:26 pranithk xavih: no
13:26 pranithk xavih: It tells which files need self-heal...
13:27 pranithk xavih: it does not tell on which brick...
13:27 pranithk xavih: afr does lookup on both the bricks to figure out the direction...
13:27 xavih pranithk: this means that it could be that a file referenced in the index of /pool/a is in fact pending to be healed in /pool/b ?
13:27 xavih pranithk: ok
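To restate pranithk's point: the xattrop index on a brick only names gfids that need healing somewhere in the replica, not which copy is stale, so an empty index on one brick does not prove that brick is up to date. A hedged way to eyeball the pending queues (the /pool/a and /pool/b paths are just the ones from this log; the bricks of one replica pair may live on different servers):

    # rough count of pending-heal entries on each brick
    ls /pool/a/.glusterfs/indices/xattrop | wc -l
    ls /pool/b/.glusterfs/indices/xattrop | wc -l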
13:28 xavih pranithk: I'll have to take a deeper look before removing anything
13:29 xavih pranithk: I'll begin by solving the problem with that directory and hope that self-heal can continue normally after that
13:29 xavih pranithk, hagarth: thank you again :)
13:31 lpabon joined #gluster-dev
13:38 lpabon joined #gluster-dev
13:52 johnmark hagarth: ping
13:52 johnmark xavih: btw, want to see an early preview of the gluster forge?
13:53 johnmark redhat-staging.gitorious.com
13:53 johnmark hagarth: I'm trying to see if any of the docs writers in BLR have time to write up the GlusterFS-Cinder integration
13:53 johnmark for end users
13:54 johnmark I'm going to guess that Anjana, et al, are busy, but is she the person to contact?
13:54 pranithk xavih: One request... I went through the code. Could you please give getfattr output on that problematic directory when you get to it. If I am not around please add output of "getfattr -d -m . -e hex <dir>" to the bug I mentioned above..
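That getfattr invocation dumps every extended attribute in hex; on an AFR brick the interesting ones are usually the gfid and the per-subvolume changelog xattrs. The field names below follow the usual trusted.afr.<volname>-client-N pattern and the values are purely illustrative, not taken from this system:

    getfattr -d -m . -e hex /pool/a/.glusterfs/be/55/be55d1a5-733d-4023-9b1b-025b3b6f1849
    # illustrative output:
    # trusted.gfid=0xbe55d1a5733d40239b1b025b3b6f1849
    # trusted.afr.vol01-client-0=0x000000000000000000000000
    # trusted.afr.vol01-client-1=0x000000000000000000000000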
13:56 awheeler How do I change the status of a bug that is ON_QA?  I've added a comment to it (on the 20th), but saw no option to change it, and I've had no responses to my comment. Bug 909053
13:56 glusterbot Bug http://goo.gl/nOJQ1 unspecified, medium, ---, junaid, ON_QA , Gluster-swift does not allow operations on multiple volumes concurrently.
14:00 ndevos awheeler: the status of the bug can be changed at the bottom of the page
14:00 hagarth johnmark: I doubt there are free cycles available.
14:00 johnmark hagarth: yeah, I assume that
14:01 johnmark but I need to ask before hiring a freelance docs writer
14:01 awheeler ndevos: Apparently not that bug.  I do see that option on a different bug though.
14:02 hagarth johnmark: probably shoot an email to them and check?
14:02 johnmark hagarth: because when I ask for money to do that, the first question they'll ask is "did you make a request to our docs team?"
14:02 johnmark yeah
14:02 johnmark Is Anjana the right person?
14:02 ndevos awheeler: oh, right, thats a bug against Red Hat Storage, and not the community version
14:03 johnmark ndevos: then we need to make a new bug entry, it seems
14:03 awheeler Ah, well, that explains it then.
14:03 ndevos johnmark: maybe, its for rhs-2.1, I'm not sure if that will get based on glusterfs-3.4
14:04 hagarth johnmark: yes
14:04 johnmark ndevos: I had assumed it would
14:04 awheeler And now that I look I can see the difference.  Is there a process for doing that?  Will they acknowledge the feedback?
14:04 johnmark hagarth: ok
14:04 johnmark thanks
14:04 johnmark awheeler: yes, if you create a bug against GlusterFS, devs will acknowledge it
14:05 ndevos awheeler: so, the correct approach is to clone the bug to the community/glusterfs product, which should be the blocker for the RHS bug
14:05 johnmark ndevos: do you know the best way to do that? Is there a way to clone a bug without cutting and pasting?
14:05 awheeler Ok, how do I clone it? Cut-and-paste?
14:05 ndevos johnmark: yeah, upper-right-corner is a 'clone' link
14:06 johnmark ndevos: aha! that's good to know. not sure how I missed that
14:06 ndevos lol, copy/paste is horrible!
14:06 johnmark ndevos: yes, it is :)
14:07 awheeler Excellent, shall I add an external bug reference to the other bug, or is the clone sufficient?
14:10 ndevos awheeler: when you clone, the new bug is marked as 'dependent on' the old bug; that's wrong, the new bug should block the old one
14:11 ndevos awheeler: maybe these fields are only available if you select 'advanced' or something like that...
14:11 awheeler Excellent, and done:  Bug 924792
14:11 glusterbot Bug http://goo.gl/Smv7Z medium, unspecified, ---, junaid, NEW , Gluster-swift does not allow operations on multiple volumes concurrently.
14:12 awheeler So the patch referenced is in the glusterfs git repo, but the original patch was submitted against a RHS repo?  And thus might work correctly for them?
14:12 ndevos awheeler: nice, I've added the block/depends relation now
14:13 ndevos awheeler: no, the bug was against RHS, but the patch was filed against the community repo
14:13 awheeler ndevos: Cool, thank you.  I see the option now.
14:13 ndevos awheeler: and, on top of that, you have identified that the patch is incomplete
14:16 awheeler ndevos: Do the swift tests get run by the hudson/jenkins builds?  They didn't appear to run when I ran run-tests.sh
14:17 ndevos awheeler: I'm not sure, but there is a UFO test run on each patch submission, but I dont know what it actually does
14:20 awheeler ndevos: Just thinking this patch should be unit testable, but it's a bit involved.
14:21 ndevos awheeler: I have little experience with the UFO bits, so I cant help you there :-/
14:22 awheeler ndevos: No worries, you have been very helpful.  :-)
14:23 ndevos awheeler: ah, good to hear that :)
14:34 wushudoin joined #gluster-dev
15:05 lpabon joined #gluster-dev
16:17 jclift joined #gluster-dev
16:21 jclift kkeithley1: ping
16:23 hagarth joined #gluster-dev
16:24 jdarcy joined #gluster-dev
16:28 jdarcy joined #gluster-dev
16:30 lalatenduM joined #gluster-dev
16:42 rastar joined #gluster-dev
16:54 kkeithley1 jclift: what's up?
16:54 jclift kkeithley1: Having trouble building with the .spec file.
16:54 jclift kkeithley1: Chucked the info in the gerrit review. :)
16:55 jclift kkeithley1: Suspecting I'm just doing something wrong though.
16:55 kkeithley1 On what platform? Fedora?
16:55 jclift RHEL 6.4 and CentOS
16:56 jclift kkeithley1: http://review.gluster.org/4674
16:56 jclift kkeithley1: Heh, don't tell me... that patch only affects the Fedora part of the code in the .spec?
16:56 jclift kkeithley1: I have an F17 VM around somewhere that I could try it on instead. (?)
16:56 kkeithley1 build.g.o is a CentOS 6.3 box
16:57 kkeithley1 That patch doesn't really have anything to do with the spec, or with Fedora, RHEL, or CentOS
16:57 jclift "Change I4f121305: glusterfs.spec.in: sync with fedora glusterfs.spec" ?
16:58 kkeithley1 yup
16:58 jclift Misleading patch title?
16:59 kkeithley1 It's just piggy-backed on the original BZ where I added .../extras/LinuxRPM.
16:59 jclift Ahhhh, k.
16:59 jclift kkeithley1: Ok, what should I be doing to test this thing works then?
16:59 kkeithley1 cd extras/LinuxRPM && make glusterrpms
17:00 jclift Heh, cool.
17:00 kkeithley1 er, make dist && cd extras/LinuxRPM && make glusterrpms
17:01 jclift Heh, I was going to ask what the difference between make dist and this is then.
17:01 jclift k, trying it out now.
17:01 jclift Looks like it needs --enable-fusermount on the ./configure line, else the tarball is missing some stuff.
17:01 kkeithley1 yes
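Pulling the pieces of this exchange together, the from-source RPM build being described looks roughly like this; prerequisites such as rpmbuild and mock are assumed to be installed, and --enable-fusermount is the flag jclift found necessary for a complete tarball:

    # from a glusterfs source tree
    ./autogen.sh
    ./configure --enable-fusermount
    make dist                  # produces the release tarball the spec file consumes
    cd extras/LinuxRPM
    make glusterrpms           # builds the src.rpm and binary rpms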
17:01 jclift kkeithley: It doesn't seem this stuff is written up anywhere?
17:02 * jclift will write up a BZ about "missing instructions for building tarballs"
17:02 kkeithley Yeah, it's one of the dusty corners of the community release
17:03 jclift It's ok.  I try to force myself to fix (and hopefully write BZ's) for stuff as I hit it, if I think other people would hit it too.
17:04 jclift So, 3.5 should "feel" a bit more polished for newbies wanting to compile. :)
17:04 kkeithley yup, that's great stuff. I'm glad someone's chasing these down and cleaning them up
17:07 jclift kkeithley: Excellent, that "just worked" and now I have a bunch of rpms ready to go.
17:07 jclift Cool, will update gerrit.
17:07 kkeithley excellent
17:10 jclift Might as well test on F17 too, just to be super sure.  VM is already spun up now. :)
17:19 kkeithley (FWIW, I'm Kaleb, not Keith ;-))
17:20 jclift Heh.
17:20 * jclift needs coffee
17:20 jclift Sorry.  I actually know that, but don't seem super focused today. :/
17:20 kkeithley me too, jet lagged
17:20 jclift :(
17:53 mohankumar joined #gluster-dev
17:53 lpabon kkeithley: ping
17:54 kkeithley lpabon: what's up?
17:54 sgowda joined #gluster-dev
17:54 lpabon Hi "Keith", I mean Kaleb ;-).. I have a question on change 4674
17:55 kkeithley yes
17:55 kkeithley be nice
17:55 lpabon My question is what is the purpose of "/d/cache..."?  Should that be a Makefile variable?  It should probably be commented
17:56 kkeithley No, it's a directory on build.gluster.org
17:56 lpabon Ah
17:58 kkeithley Since doing the original fedora spec sync I have noticed that sometimes the basic/rpm.t fails because of transient networking problems that cause the git clone or curl fetch to fail. Not sure if it's network connectivity or dns failure, or what.
17:59 kkeithley I have a cron job on build.gluster.org that fetches them and stashes them in /d/cache.
17:59 lpabon cool, but do you think that should be a variable like... BUILD_GLUSTER_ORG_CACHE_DIR ?
18:00 jclift lpabon: If there's a way to set up env variables in the build system, that could work.
18:01 jclift lpabon: Otherwise, maybe just do a hostname check and if == "build.gluster.org" or whatever then attempt using the cache?
18:01 lpabon or just have it at the top of the file, just in case we ever want to change it either by changing the makefile, or by sending a new value from the command line
18:02 jclift lpabon: Well, if it's not possible to have env variables in the build system (no idea), then the default value in the file has to work for both the build system and end users' build hosts.
18:02 kkeithley I don't see a big diff between 'check the hostname' versus 'see if /d/cache exists'
18:03 jdarcy joined #gluster-dev
18:03 jclift lpabon: Env variable might be more useful, in case users want to use a cache dir too.
18:03 lpabon i think the main difference is that as of now only a few of us would understand the purpose of the '/d/cache'
18:03 jclift Just having ideas, not really caring about either way. :)
18:03 jclift lpabon: Good point.
18:04 * jclift did notice a "downloading rpms" type of comment just before a git clone (which doesn't actually fetch an rpm).  But, didn't want to be too pedantic. :)
18:06 kkeithley No, it's fetching the rpm files used to build the fedora/epel rpms from the FedoraSCM git repo
18:07 kkeithley I'll argue diminishing returns. Most people building from source do 'configure; make; make install'.
18:07 lpabon ah, i thought that also
18:08 lpabon But from talking to others, i have found that their development routine is to deploy rpms
18:08 kkeithley Anyone who figures out how to build rpms in extras/LinuxRPM, doing one-sie, two-sie builds, can sit through the git/curl fetches or, if they're smart enough to read the makefile, can add their own /d/cache.
18:09 kkeithley But on build.gluster.org, where every patch gets verified in a batch environment, that's what extras/LinuxRPM is really for; hence caching things rather than always fetching them from upstream.
18:09 lpabon that's fine, but i do not agree with your previous comment
18:10 kkeithley You can change it if you like. I'm ready to be done with it.
18:11 kkeithley I've got bigger fish to fry
18:11 lpabon Hmm, I did not mean any insult, and I apologize if that is how it has come across.  I'm only trying to make the product maintainable by others who are new to the code
18:16 jclift The patch as-is is prob ok.  But lpabon could follow it up with tweaks to make it more dev friendly?
18:17 lpabon i'll give it a shot
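A rough sketch of the parametrised fallback lpabon is proposing, written as the shell a Makefile recipe in extras/LinuxRPM could invoke; CACHE_DIR and the fetch command here are hypothetical placeholders, not the actual Makefile contents:

    # prefer the local cache kept on build.gluster.org, fall back to fetching upstream
    CACHE_DIR="${CACHE_DIR:-/d/cache}"
    if [ -d "$CACHE_DIR" ]; then
        cp "$CACHE_DIR"/* .
    else
        curl -O <upstream-url>    # placeholder; the transient git/curl failures are what the cache avoids
    fi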
18:22 jclift Heh, strange discovery.  ./autogen.sh (libtoolize really) on F17 requires tar. Known bug. BZ # 794675
18:22 * jclift had better update autogen.sh to check for that.
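A minimal sketch of the sort of guard jclift has in mind for autogen.sh; the exact wording in the real script may differ:

    # bail out early if tar is missing (libtoolize on F17 needs it, per BZ 794675)
    if ! type tar >/dev/null 2>&1; then
        echo "ERROR: tar is required to run autogen.sh" >&2
        exit 1
    fi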
18:26 kkeithley I didn't think you were being insulting. I'm trying to eliminate the occasional regression test failures that result from network or dns outages. Hint, tests/basic/rpm.t builds a src.rpm using extras/LinuxRPM, which it then uses to do a couple of builds in mock.
18:30 kkeithley As it is, it builds rpms just fine — I don't know how much more developer friendly it needs to be. But, like I said, you're welcome to make further changes. I just don't personally feel the need to do any more to it.
19:00 lpabon Cool, I agree. thanks Kaleb
19:43 copec joined #gluster-dev
20:24 jdarcy joined #gluster-dev
23:04 hagarth joined #gluster-dev
23:16 jdarcy joined #gluster-dev
