
IRC log for #gluster-dev, 2016-07-22


All times shown according to UTC.

Time Nick Message
00:55 ira joined #gluster-dev
01:19 hagarth joined #gluster-dev
02:33 poornimag joined #gluster-dev
03:03 magrawal joined #gluster-dev
03:49 nbalacha joined #gluster-dev
03:53 atinm joined #gluster-dev
04:06 atinm joined #gluster-dev
04:20 mchangir joined #gluster-dev
04:25 ramky joined #gluster-dev
04:36 shubhendu joined #gluster-dev
04:36 jiffin joined #gluster-dev
04:39 sanoj joined #gluster-dev
04:49 poornimag joined #gluster-dev
04:58 kotreshhr joined #gluster-dev
04:58 gem joined #gluster-dev
05:01 ppai joined #gluster-dev
05:10 ndarshan joined #gluster-dev
05:11 ankitraj joined #gluster-dev
05:22 Bhaskarakiran joined #gluster-dev
05:24 sakshi joined #gluster-dev
05:40 hchiramm joined #gluster-dev
05:41 aspandey joined #gluster-dev
05:42 karthik_ joined #gluster-dev
05:46 prasanth joined #gluster-dev
05:47 nishanth joined #gluster-dev
05:52 mchangir joined #gluster-dev
05:52 devyani7_ joined #gluster-dev
05:54 prasanth joined #gluster-dev
05:56 nishanth joined #gluster-dev
06:01 asengupt joined #gluster-dev
06:02 hgowtham joined #gluster-dev
06:03 ppai joined #gluster-dev
06:09 poornimag nigelb, ping, regarding the automation of regression failure report generator
06:12 nigelb hi!
06:12 nigelb I saw that the script got merged.
06:15 spalai joined #gluster-dev
06:16 msvbhat joined #gluster-dev
06:18 poornimag ok, so i was trying to create a job for it, so it should be in jenkins right? not centos-ci?
06:21 nigelb Get me a script to run and I can probably create the job.
06:23 poornimag nigelb, ok, will do that
06:25 nigelb poornimag: put the script in jenkins/scripts in https://github.com/gluster/glusterfs-patch-acceptance-tests
06:25 nigelb and send a pull request
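For anyone following the workflow nigelb outlines here: the regression-failure-report script goes into the jenkins/scripts directory of the glusterfs-patch-acceptance-tests repository and is submitted as an ordinary GitHub pull request. A minimal sketch of that flow (the branch and script names below are made up for illustration):

    git clone https://github.com/gluster/glusterfs-patch-acceptance-tests.git
    cd glusterfs-patch-acceptance-tests
    git checkout -b regression-failure-report          # hypothetical branch name
    cp ~/failure-report.sh jenkins/scripts/            # hypothetical script name
    chmod +x jenkins/scripts/failure-report.sh
    git add jenkins/scripts/failure-report.sh
    git commit -m 'jenkins: add regression failure report generator'
    git push origin regression-failure-report          # then open the pull request on GitHub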
06:30 prasanth joined #gluster-dev
06:31 rastar joined #gluster-dev
06:32 devyani7 joined #gluster-dev
06:33 kdhananjay joined #gluster-dev
06:36 rafi joined #gluster-dev
06:37 rastar joined #gluster-dev
06:38 ashiq joined #gluster-dev
06:41 poornimag nigelb, ok sure
06:49 Saravanakmr joined #gluster-dev
07:00 Saravanakmr ndevos, kkeithley  google (for glusterfs) still points to the gluster.org documentation - and it is giving 404..  Do we have redirection set up to http://gluster.readthedocs.io ?
07:08 kdhananjay1 joined #gluster-dev
07:14 rraja joined #gluster-dev
07:15 prasanth joined #gluster-dev
07:18 penguinRaider joined #gluster-dev
07:19 kdhananjay joined #gluster-dev
07:23 rraja joined #gluster-dev
07:28 pur joined #gluster-dev
07:48 ndevos misc: redirecting anything from community/documentation/* to the new docs would be good, people do not like 404's
08:11 glusterbot` joined #gluster-dev
08:12 obnox joined #gluster-dev
08:13 aspandey joined #gluster-dev
08:13 rastar joined #gluster-dev
08:14 rastar joined #gluster-dev
08:15 rastar joined #gluster-dev
08:22 misc ndevos: yeah, I will take a look for a quick solution
08:22 ndevos misc++ much appreciated
08:22 glusterbot ndevos: misc's karma is now 30
08:23 misc ndevos: can you open a ticket for tracking ?
08:24 misc I still need to do a ML and an alias and write doc on it, so I risk forgetting
08:24 misc (and the current heat wave makes me unable to sleep well at night so I am really in a bad shape in the morning :/)
08:26 hagarth joined #gluster-dev
08:26 ndevos same here, I'm just taking the rest of the day off and go do something else :)
08:28 ndevos here you go: https://bugzilla.redhat.com/show_bug.cgi?id=1359062
08:28 glusterbot Bug 1359062: unspecified, unspecified, ---, mscherer, ASSIGNED , Setup redirection from gluster.org/community/documentation to the new RTD site
08:30 Saravanakmr ndevos++
08:30 glusterbot Saravanakmr: ndevos's karma is now 289
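Once the redirect misc agrees to set up (community/documentation/* to the new docs) is in place, it can be checked from a shell. A quick verification sketch; the example path is illustrative, the target host is the one named above:

    # Expect a 301/302 towards gluster.readthedocs.io rather than a 404
    curl -sI http://www.gluster.org/community/documentation/index.php | head -n 3
    # Follow the redirect and print where we end up plus the final status code
    curl -sIL -o /dev/null -w '%{url_effective} %{http_code}\n' \
        http://www.gluster.org/community/documentation/index.php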
08:31 nbalacha nigelb, ping. Question on netbsd
08:31 nigelb nbalacha: sure, go ahead
08:32 misc "Nigel Enhanced Technology BSD"
08:32 nbalacha nigelb, was the /build/install path changed to /data/build/install recently?
08:32 nigelb nbalacha: I just changed it, yes.
08:32 nbalacha nigelb: ok. That causes quota.t failures on the release-3.7 branch
08:32 nigelb /data/build/ is symlinked to /build
08:32 nigelb OHH.
08:33 nigelb That doesn't make sense. Why does it fail?
08:33 nbalacha nigelb: quota.t was marked as a bad test on master but doesn't seem to be on release-3.7
08:33 aravindavk joined #gluster-dev
08:33 nbalacha the get_aux function uses a hardcoded /build/install in its search string
08:33 nigelb ugh ugh ugh
08:34 nigelb That is not good.
08:34 nbalacha yep
08:34 poornimag joined #gluster-dev
08:34 nbalacha df -h 2>&1 | sed 's#/build/install##' | grep -e "[[:space:]]/run/gluster/${V0}$" -e "[[:space:]]/var/run/gluster/${V0}$"
08:34 nbalacha so it doesnt find the string it wants and test 18 fails consistently
08:35 nigelb /build/install still exists
08:35 nigelb There is a symlink.
08:35 nbalacha but the df output shows /data/build/install
08:35 nigelb hang on.
08:35 nigelb df output never had /build/install either
08:35 nbalacha so on stripping out the build/install, we still do not get the correct path
08:35 nigelb There's only a /d/
08:35 nbalacha not for the aux mount
08:35 nbalacha one sec
08:36 nigelb ahhh
08:36 nbalacha this is the df output that is being parsed
08:36 nbalacha localhost:client_per_brick/patchy.client.nbslave77.cloud.gluster.org.d-backends-patchy1.vol       3.9G       6.1M       3.7G   0% /data/build/install/var/run/gluster/tmp/mntFr0tdb
08:36 nbalacha localhost:client_per_brick/patchy.client.nbslave77.cloud.gluster.org.d-backends-patchy2.vol       3.9G       6.1M       3.7G   0% /data/build/install/var/run/gluster/tmp/mntbw9G4V
08:36 nbalacha localhost:client_per_brick/patchy.client.nbslave77.cloud.gluster.org.d-backends-patchy3.vol       3.9G       6.1M       3.7G   0% /data/build/install/var/run/gluster/tmp/mnt1hedHq
08:36 nbalacha localhost:client_per_brick/patchy.client.nbslave77.cloud.gluster.org.d-backends-patchy4.vol       3.9G       6.1M       3.7G   0% /data/build/install/var/run/gluster/tmp/mntgJygo6
08:36 nbalacha localhost:patchy                                                                                  7.9G        12M       7.5G   0% /data/build/install/var/run/gluster/patchy
08:36 nbalacha so the string parsing fails
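For context, the check being discussed is the get_aux helper used by quota.t. The fragile piece is the line nbalacha quoted, reproduced here with the surrounding logic sketched in (the function body beyond the quoted line is illustrative):

    function get_aux ()
    {
        # Fragile: assumes the workspace lives under exactly /build/install, so
        # stripping that prefix from df output leaves /var/run/gluster/<volume>.
        # With the install tree under /data/build/install the prefix never
        # matches, the grep finds nothing, and test 18 fails.
        df -h 2>&1 | sed 's#/build/install##' \
            | grep -e "[[:space:]]/run/gluster/${V0}$" \
                   -e "[[:space:]]/var/run/gluster/${V0}$"
        [ $? -eq 0 ] && echo "0" || echo "1"
    }

A less layout-dependent variant would match the mount-point suffix regardless of where the install tree lives, e.g. grep -e "/run/gluster/${V0}$" on the raw df output (a sketch, not the committed fix).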
08:37 nigelb okay, our tests making assumptions about where they're run is problematic, but I don't think you have the time or patience to fix that.
08:37 nigelb how about I mount the new volume at /build rather than /data
08:37 nbalacha dont think that will work
08:37 nbalacha this will auto mount it
08:37 purpleidea joined #gluster-dev
08:37 purpleidea joined #gluster-dev
08:37 nbalacha is there any variable I could use instead of the hardcoded string
08:37 nbalacha or I could just mark this bad test for now
08:38 nbalacha or none of the release-3.7 netbsd runs will pass
08:38 nigelb Hang on.
08:38 nigelb What will fail if I make the new volume mount at /build
08:38 nigelb and /build is on a different volume from /
08:38 nbalacha this is a gluster auto mount
08:38 nbalacha will the path get passed to it correctly?
08:39 nbalacha I dont know the scripts so cannot comment
08:39 nigelb right now gluster gets the path because I've pointed /build to /data/build
08:39 nigelb You have 7g, right?
08:39 nbalacha 70 I think
08:39 nigelb 77
08:39 nbalacha nope 77
08:40 nigelb I'll setup 77 in a way I *think* might work.
08:40 nbalacha yep :)
08:40 nigelb And see if it passes?
08:40 nbalacha sure
08:40 nigelb Give me 2 mins to do that.
08:40 nbalacha ok - I have my code in /home/jenkins/glusterfs
08:40 nbalacha not going to delete that right?
08:40 nigelb No
08:40 nigelb Do you need /data/build saved?
08:44 nbalacha no
08:44 nigelb How can I stop the gluster process?
08:44 nbalacha I just kill them :)
08:44 nbalacha pkill gluster
08:46 nbalacha nigelb, heading out to a meeting and I dont seem to be able to connect to wifi
08:47 nbalacha so will check back with you in an hour
08:47 nigelb nbalacha: I'll email you once it's ready.
08:47 nbalacha nigelb: thanks
08:50 nigelb nbalacha: you're all set
09:09 spalai joined #gluster-dev
09:18 penguinRaider joined #gluster-dev
09:29 skoduri joined #gluster-dev
09:33 anoopcs ndevos, Did you mean something like https://github.com/gluster/glusterfs-patch-acceptance-tests/blob/master/centos-ci/libgfapi-python/run-test.sh for glusterfs-coreutils?
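The linked libgfapi-python script is a self-contained shell script that a CentOS CI job runs on a fresh node: install dependencies, build, run the tests. A glusterfs-coreutils run-test.sh would follow the same shape; a rough sketch only, assuming the usual autotools layout of that repository (package names and the test target are assumptions, not the actual script):

    #!/bin/bash
    set -e
    # Assumed build dependencies for glusterfs-coreutils on a CentOS CI node
    yum -y install git autoconf automake libtool make gcc glusterfs-api-devel help2man
    git clone https://github.com/gluster/glusterfs-coreutils.git
    cd glusterfs-coreutils
    ./autogen.sh && ./configure
    make
    make check    # assumed test target; adjust to whatever the repo actually provides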
09:35 penguinRaider joined #gluster-dev
09:35 nbalacha joined #gluster-dev
09:49 nbalacha nigelb, it cant find gluster
09:52 msvbhat joined #gluster-dev
09:53 nigelb nbalacha: you may have to rebuild.
09:53 nbalacha nigelb, I did
09:54 nigelb that's strange.
09:54 nbalacha I see it in /build/install/sbin
09:54 nbalacha but which glusterd returns nothing
09:54 nigelb nbalacha: blow away the install folder
09:54 nigelb And try starting again.
09:55 nbalacha will do
10:00 spalai joined #gluster-dev
10:02 nbalacha nigelb: nope
10:02 nbalacha nigelb, I ran /opt/qa/build.sh to build it
10:02 nbalacha nigelb, is that correct
10:03 nigelb That looks right
10:03 nigelb Here's what we normally run -> https://github.com/gluster/glusterfs-patch-acceptance-tests/blob/master/jenkins/scripts/rackspace-netbsd7-regression-triggered.sh
10:04 nbalacha build goes through
10:05 nbalacha but tests cannot find gluster
10:06 nigelb how are you running tests?
10:06 nbalacha su -l root -c 'cd /home/jenkins/glusterfs && /opt/qa/regression.sh'
10:07 nbalacha I've been using this command over the last few days
10:07 nbalacha worked fine
10:08 nigelb I see a lot of compilation and test processes.
10:08 nbalacha ?
10:08 nbalacha I am rebuilding
10:08 nigelb ah
10:08 nigelb what is the output for when it can't find gluter?
10:08 nigelb *gluster?
10:08 nbalacha plain old glusterd will fail
10:08 nigelb the path on bsd machines is a bit strange occasionally.
10:09 nbalacha test just hangs
10:09 nbalacha $ glusterd
10:09 nbalacha glusterd: not found
10:09 nbalacha $
10:09 nigelb ah
10:09 nbalacha it has not added the build/install etc to PATH
10:09 nigelb that's most likely a path issue
10:10 nbalacha is that something that was scripted?
10:10 nbalacha because it worked earlier
10:11 nbalacha I just added it manually and it finds it now
10:11 nbalacha trying the test now
10:11 nigelb Ha, I was going to get the path from a test run.
10:11 nigelb I don't know why it used to work and doesn't work now.
10:11 nbalacha nope- tests cant find it
10:11 nigelb My best guess is I broke something when I redid the mount points
10:12 nbalacha I can from the mount though
10:12 nbalacha hmm
10:12 msvbhat joined #gluster-dev
10:12 nbalacha it should have exported it
10:12 nbalacha export PATH="${BASE}/sbin:${PATH}"
10:13 nigelb $BASE is empty btw.
10:14 nigelb Unless you just defined it.
10:14 nbalacha it is defined earlier
10:14 nbalacha chflags: /netbsd: No such file or directory
10:14 nbalacha any idea about this?
10:14 nigelb /sbin/chflags
10:14 nigelb that'll work
10:15 nbalacha no I mean, /netbsd
10:15 nbalacha is that supposed to be there
10:15 nbalacha I dont remember if it was different in the previous runs
10:16 nigelb PATH=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/bin:/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/bin:/usr/bin:/bin:/usr/pkg/bin:/usr/local/bin
10:16 nigelb that's the path when the jekins run starts
10:16 nbalacha $ echo $PATH
10:16 nbalacha /usr/bin:/bin:/usr/pkg/bin:/usr/local/bin:/build/install/sbin
10:16 ashiq_ joined #gluster-dev
10:16 nbalacha thats from mine
10:17 nigelb < nbalacha> chflags: /netbsd: No such file or directory
10:17 nigelb ^ what's giving you this output
10:17 nbalacha nigelb : not an issue I think - see the same in the old regression runs too
10:17 nbalacha that is when I run /opt/qa/regression.sh
10:17 nbalacha but it should have exported the path
10:19 nbalacha any older regressions running?
10:20 nigelb I can see two tests running in ps ax
10:20 nigelb is that something you just kicked off?
10:20 nbalacha nope
10:20 nbalacha I tried to kill them
10:20 nbalacha but I cannot
10:20 nigelb I'd say reboot
10:20 nbalacha yeah
10:21 nbalacha reboot:not found
10:21 nigelb `/sbin/shutdown -r now`
10:21 nbalacha k
10:21 nbalacha this was a fun day
10:22 nigelb I've been messing with netbsd for 2 weeks
10:22 nigelb my days have been fun for quite a while now :P
10:22 nbalacha you have my sympathies
10:22 nbalacha I have been using it for only 2 days
10:22 nbalacha and I am already bugged
10:22 nigelb It takes a while to get used to the difference
10:23 nbalacha yep. Is there an autocomplete by any chance
10:23 nbalacha i have to type out whole commands
10:24 nigelb you're in sh
10:24 nigelb type bash
10:24 nigelb and you'll have bash rather than sh
10:24 nigelb which (thankfully) does have autocomplete
10:25 nbalacha rebooted
10:25 nbalacha and no difference
10:25 nigelb damn.
10:25 nigelb how can I reproduce the issue?
10:25 penguinRaider joined #gluster-dev
10:28 nigelb nbalacha: Oh, I know what the problem is I think.
10:28 nbalacha good :)
10:28 nbalacha what is it
10:28 nigelb oh crap.
10:29 nigelb now it's running tests in my shell
10:29 nigelb I'll just let it run.
10:29 nbalacha :D
10:29 nigelb export PATH=/sbin:$PATH
10:29 nigelb It needs /sbin in the path
10:29 nbalacha where did it go?
10:29 nigelb it couldn't find chflags or mount
10:29 nigelb /sbin isn't in the path by default.
10:29 nigelb I think we inject it at some point.
10:30 nigelb I've run into this enough times in the last two weeks :)
10:30 nbalacha ah ok
10:30 nbalacha so do you see the output of the 0quota.t test
10:30 nbalacha that should be one the first
10:30 nbalacha if it passes, we are good
10:30 nigelb It's running 0quota-rename.t
10:31 nbalacha it might have finished 0quota.t
10:31 nbalacha I renamed them so they would run first
10:31 nigelb still running.
10:31 nigelb should it take this long?
10:32 nbalacha how many tests has it completed
10:32 nigelb None, I think. I don't see any output other than this
10:32 nbalacha so you see something like ok 1, LINENUM
10:33 nigelb No
10:33 nbalacha just test name and ..?
10:33 nigelb http://dpaste.com/0SK2EKE
10:33 nbalacha it is hung
10:33 nbalacha the same prob I saw
10:33 nbalacha usually seen when it cannot find gluster
10:33 nbalacha but it could be something else
10:34 nbalacha kill it - Ctrl C
10:34 nigelb dang.
10:36 nigelb nbalacha: hang on.
10:36 nigelb try this.
10:37 nigelb export PATH=$PATH:/build/install/sbin:/sbin
10:37 nigelb and then try running the tests.
10:39 nbalacha no good
10:39 nigelb still stuck?
10:39 nigelb got glusterd running after that.
10:40 nigelb okay, I recommend putting a set -x in the test file
10:40 nigelb and seeing how far it goes before it gets stuck.
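The two suggestions in this exchange boil down to fixing the shell environment before invoking the QA scripts and turning on command tracing in the hanging test. Roughly (the export and the su invocation are the ones quoted earlier in the log; note that the flag that makes bash echo each command is set -x, while set +x turns tracing off):

    # NetBSD slaves don't put /sbin (chflags, mount) or the gluster install
    # tree on PATH by default, so export both before running the suite.
    export PATH="$PATH:/build/install/sbin:/sbin"
    # For a hanging .t file, add "set -x" near the top of the test (after the
    # .rc includes) so every command is echoed as it runs, then rerun:
    su -l root -c 'cd /home/jenkins/glusterfs && /opt/qa/regression.sh'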
10:41 hgowtham joined #gluster-dev
10:42 nbalacha I cannot edit regression.sh
10:42 nbalacha I added an echo in the beginning of the file
10:42 nbalacha it does not hit it
10:42 bfoster joined #gluster-dev
10:43 nigelb No, I mean in tests/basic/0quota-rename.t
10:44 nbalacha tried that
10:44 nbalacha and I now officially give up
10:44 nigelb Ouch, sorry ^_^
10:46 nbalacha ok - if I comment out the first 2 lines where it is reading the .rc files, it goes ahead
10:46 nbalacha and fails obviously but it goes ahead
10:46 nigelb oh fun.
10:46 nbalacha the python path?
10:46 nigelb no, I mean set +x right before the include lines
10:46 nigelb It makes bash more verbose
10:47 nigelb (that's my hacky sysadmin way of debugging bash scripts, I'm not sure it works for our tests)
10:48 nbalacha no good
10:48 nbalacha hung again
10:49 nbalacha and I am leaving for the day
10:49 nbalacha :)
10:49 nbalacha so what do we do about the release-3.7 regression runs?
10:50 nbalacha do you want the machine back? I don't think I am going to do anything more on this over the weekend
10:51 nigelb nbalacha: Start a thread on gluster-devel
10:51 nigelb Let's see where that goes.
10:51 pkalever joined #gluster-dev
10:51 nbalacha ok
10:51 nbalacha nigelb, thanks for your help on this
10:53 anoopcs nigelb, FYI: netbsd regression run for https://review.gluster.org/#/c/14966/ passed today (and I don't know how... quota.t ran successfully on release-3.7 on netbsd)
10:53 anoopcs https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/18265/consoleFull
10:54 nigelb Not all machines have the new partition
10:54 anoopcs Ah..
10:54 nigelb You probably got lucky and ran on a machine without the new partition.
10:54 anoopcs nigelb, I thought the changes were made everywhere...
10:54 nigelb Nah. I missed a few machines because it took a while per machine.
10:54 nigelb And then it was time to get away from the computer before I started having netbsd nightmares.
10:55 * anoopcs remembers the failures for the same patch with other slaves...
10:55 nigelb https://bugzilla.redhat.com/show_bug.cgi?id=1351626#c6
10:55 glusterbot Bug 1351626: unspecified, unspecified, ---, bugs, ASSIGNED , Clear up space on inactive netbsd machines
10:55 nigelb machines where I made the changes.
10:56 nigelb The fun part is if I revert these changes, the machines won't have enough space.
10:57 anoopcs And those slave names exactly matches the failures run for that patch.
10:57 anoopcs which explains it.
10:59 nigelb so now, if I revert the changes, those machines won't have enough space.
11:00 anoopcs If so better to mark it as bad for release-3.7 too until this is fixed.
11:00 nigelb I think we should actually fix it rather than mark it as bad, tbh.
11:02 nigelb anoopcs: how do I figure out if a test has been marked bad?
11:04 ramky joined #gluster-dev
11:04 purpleidea joined #gluster-dev
11:04 purpleidea joined #gluster-dev
11:06 penguinRaider joined #gluster-dev
11:26 penguinRaider joined #gluster-dev
11:36 misc mhhh
11:37 kkeithley misc: you should delete the kp alias, he doesn't work for Red Hat any more.
11:38 misc kkeithley: yeah, but that's the problem
11:38 kkeithley ?
11:39 misc usually, Fedora admins do refuse to change email if the person cannot prove that both email belong to the same person
11:39 misc so I am not sure if they follow a specific rule from infosec or anything
11:40 kkeithley why does anybody need an mail alias on gluster.org?
11:40 misc no idea, but the fact is people got them in the past
11:40 misc but that's also why I said we would need some process/governance
11:40 kkeithley AFAIK we only need some pseudo accounts with email for updating bugzilla, etc.
11:41 misc some people like to have that as a proof they are a member, this kind of things
11:41 kkeithley things were more relaxed in the past.  I'd say get rid of them, and don't allow them.
11:42 misc some people may lose access to some others accounts if we remove them
11:42 kkeithley yup, some people had gluster.com accounts too, because they were Gluster, Inc. employees. Those days are gone
11:42 kkeithley who?
11:42 kkeithley just kp?
11:42 misc so if we go this road, i rather try to at least warn a bit before, but I record your vote in favor of the glustexit :)
11:43 kkeithley If you want to be nice, give them a week, then delete
11:43 misc so we have justin, dave, vijay, amye, davemc, misc, johnmark, kp, avati, ndevos and semiosis in the list
11:43 kkeithley personally, I'd just delete right now
11:43 misc kkeithley: you would make a terrifying sysadmin :)
11:43 misc (we also have "root")
11:43 kkeithley justin, davemc, johnmark, kp, avati are all gone
11:44 kkeithley ;-)
11:44 kkeithley who is dave?
11:44 misc someone with a gnsa.us email
11:45 misc I do not have much info, this was before I moved stuff to salt, git log say nothing
11:45 * kkeithley knows that absolute power corrupts absolutely
11:47 * kkeithley could play dirty...
11:47 nigelb I'd say warn and remove them.
11:47 kkeithley I suspect I know what Ric would say if I mentioned it to him
11:48 kkeithley who is asking for a new alias?
11:49 nigelb https://bugzilla.redhat.com/show_bug.cgi?id=1358456
11:49 glusterbot Bug 1358456: unspecified, unspecified, ---, bugs, NEW , please create a new email alias
11:49 misc vijay asked to create an alias for kp, but kp already had an alias going to his redhat email
11:49 misc I am checking with infosec if that's ok, but I am sure they will say this is ok
11:50 kkeithley meh. Okay
11:50 kkeithley whatever
11:50 anoopcs nigelb, You need to look inside those test files. For example: https://github.com/gluster/glusterfs/blob/master/tests/basic/quota.t#L240
11:50 nigelb anoopcs: I finally figured that out :)
11:50 anoopcs nigelb, Oh..I was away for sometime.
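For reference, the bad-test marking anoopcs points to lives as an annotation inside the .t file itself (see the quota.t link above). A quick way to see which tests carry it on a given branch, assuming the annotation text contains the keyword BAD_TEST as in that file (a throwaway check, not an official tool):

    cd /home/jenkins/glusterfs            # checkout of the branch in question
    grep -rln "BAD_TEST" tests/           # every test annotated as bad
    grep -n "BAD_TEST" tests/basic/quota.t || echo "not marked bad on this branch"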
11:51 misc I will also check with vijay if he did check that the real person krishan contacted him, cause I do not want some random folk to get a gluster.org email
11:51 * misc will also verify the gluster org on github
11:52 nigelb misc: Unless we have a good reason to keep them, I'm in favor of removing the gluster.org alias entirely.
11:52 nigelb *aliases
11:53 misc nigelb: I think we can start to comment on the ticket for that
11:53 misc I wasn't hoping people to care that much :)
11:53 misc I am not against not giving new alias, but I would like to verify that they do not receive regular mail
11:53 misc at a minimum
11:59 julim joined #gluster-dev
12:00 Bhaskarakiran joined #gluster-dev
12:02 misc ok so
12:02 misc davemc used his alias for meetup.com, we might ask him to change that
12:02 misc avati seems to be on gluster-users with that email
12:02 kkeithley well, vijay opened the BZ requesting it. I know who KP is. If vijay asked, I'm 99.44% confident that it's legit.   I do think you should get rid of those old ones though, justin, davemc, johnmark, avati, etc.  Especially if they point at redhat.com email addresses.
12:03 nigelb since they're bouncing already.
12:03 nigelb and have been bouncing for some time.
12:03 kkeithley but where does the alias forward to? If redhat.com, then they're useless. Just delete them.
12:03 kkeithley yes, exactly
12:04 misc well, no, because sometimes people do request the email to be sent to the manager
12:04 misc it depends, country by country
12:05 kkeithley s/if they point at redhat.com email/if they point at defunct redhat.com email/
12:05 misc yeah, I can investigate that
12:06 misc just need to find where this is stored in IT config, or ask them one by one
12:06 mchangir joined #gluster-dev
12:06 misc but so, let's see for kp first
12:06 kkeithley in which case I'd say if there is still a redhat.com email address for that person that forwards to a manager, then you don't need to do anything.  If the redhat.com account is closed, delete the alias. IMO
12:07 misc kkeithley: well, in this case, that's a request to change from a redhat.com email to a external one
12:07 nigelb a defunct one, I'm guessing?
12:07 misc no idea yet
12:08 misc IIRC, that's in a specific ldap that I may not have access
12:08 kkeithley well, vijay is asking for it, so I'm not going to try to second guess the motivation. Or be the one to say he can't have it.
12:09 kkeithley (although Ric might say it. Or not.)
12:09 misc I did have a friend who connected on irc as 'misc_' and got a root password reset on a server by a friend of mine, cause that friend thought it was me
12:10 misc so I'd rather be cautious by default and be wrong than trusting and be wrong :)
12:11 misc so the david alias is David Nalley
12:11 misc I will ping him on irc
12:11 misc mhh, ok, if I find him
12:11 kkeithley okay, but vijay had to authenticate to bugzilla to open the BZ, so I trust that more than I'd trust anything on IRC.
12:13 misc oh, i trust vijay to have open the bug
12:14 misc it is more to make sure that the real krishan did contact him :)
12:14 misc (and again, that's more about sticking to the process, because I have no idea who kp was)
12:17 kkeithley which comes back to why I think we should just not do it. You have better things to do.  I don't know why KP can't just use his gmail account for whatever he wants to do. People who know gluster history know who KP is.
12:19 spalai left #gluster-dev
12:19 kkeithley anyway, I've flogged this horse enough.
12:20 pranithk1 joined #gluster-dev
12:20 pranithk1 xavih: Need your opinion about one of the spurious failures, there is one way to fix it, but want to seek your inputs as well...
12:20 pranithk1 xavih: let me know when you are free
12:21 xavih pranithk1: I've about half an hour now...
12:21 pranithk1 xavih: should take 5 minutes...
12:21 pranithk1 xavih: The failure is in tests/basic/ec/ec-new-entry.t
12:22 pranithk1 xavih: What is happening is by the time first gluster volume heal info is executed, heal on root directory is not completing
12:22 pranithk1 xavih: Not even starting I think
12:23 pranithk1 xavih: Because of which it gives pending heal count to be 0
12:23 pranithk1 xavih: Then the tests fail
12:23 pranithk1 xavih: I changed it to get pending-heal-count after heal on root heal completes...
12:24 pranithk1 xavih: Now there are new spurious failures where after replace-brick the new brick first disconnects and then connects
12:24 pranithk1 xavih: We trigger heal on root directory only on the first CHILD_UP event, and at the time the new brick was down
12:24 pranithk1 xavih: So the heal on root directory is not completing
12:25 pranithk1 xavih: I am thinking of changing the condition to trigger heal whenever the new brick comes up instead of only the very first time...
12:25 xavih pranithk1: let me see the code...
12:25 pranithk1 xavih: Do you foresee any problems with it? If the brick goes down and goes back up too frequently then there will be lots of heals...
12:25 pranithk1 xavih: okay cool. In 'notify'
12:26 pranithk1 xavih: the condition to ec_launch_replace_heal()
12:31 xavih pranithk1: we would need a way to stop a running heal when a new one is started. The new one will take the work the old one left not done
12:32 xavih pranithk1: no, that would be bad if the root entry is already healed...
12:32 pranithk1 xavih: If the versions are same, the heal will only do inodelk and then lookups, thats all
12:32 pranithk1 xavih: It won't do readdirp
12:33 misc nigelb: have you got enough spam, or would you be ok to be added in the root alias ?
12:34 xavih pranithk1: maybe we could check if the root directory is healthy or not
12:34 xavih pranithk1: then we have some combinations
12:34 xavih pranithk1: if it's healthy and another self-heal is running, nothing else is done
12:34 pranithk1 xavih: If it is healthy, heal is anyway not run no?
12:35 xavih pranithk1: if it's healthy but there isn't a self-heal running, it will be the main self-heal daemon
12:35 pranithk1 xavih: wait you are correct
12:35 pranithk1 xavih: Even when the versions don't match it is doing crawl on the directory :-/
12:36 xavih pranithk1: if it's not healthy, and there's another self-heal, we stop it
12:36 xavih pranithk1: what do you mean ?
12:36 pranithk1 xavih: It is not easy to stop an ongoing heal.
12:37 xavih pranithk1: I'm not saying to stop the current self-heal, but to mark it so that when the current file being healed is finished, it should stop
12:37 pranithk1 xavih: Even when there are no version mismatches we are attempting full readdir and lookups of the dirs
12:37 pranithk1 xavih: okay. got it
12:38 xavih pranithk1: but this is only on root, right ?
12:38 pranithk1 xavih: I think if I remember correctly the first version of heal you wrote, you handled it right?
12:38 pranithk1 xavih: I think I screwed it up :-/
12:39 xavih pranithk1: let's start from the beginning...
12:39 pranithk1 xavih: yes it is only root
12:40 xavih pranithk1: when CHILD_UP is received, ec_launch_replace_heal() should be called always, to prevent the problem you are talking about, right ?
12:40 pranithk1 xavih: yes
12:40 pranithk1 xavih: But CHILD_UP will become CHILD_MODIFIED if the brick that was down comes back up
12:40 xavih pranithk1: basically it executes a getxattr() to force a self-heal of the root directory
12:40 pranithk1 xavih: that is correct
12:41 mchangir joined #gluster-dev
12:41 xavih pranithk1: yes, yes, I know. But this is easy to change. We are trying to see if there is a problem by starting multiple self-heals
12:41 pranithk1 xavih: yes, that is correct
12:42 xavih pranithk1: ok. So if getxattr() doesn't detect any inconsistency, it will return without any other action
12:42 kotreshhr left #gluster-dev
12:42 xavih pranithk1: the problem is when getxattr() does detect an inconsistency...
12:43 pranithk1 xavih: there is a lock, so multiple heals won't happen
12:43 xavih pranithk1: everything is ok till here ?
12:43 pranithk1 xavih: I guess we are alright. :-)
12:43 pranithk1 xavih: no, I was wrong about: "So if getxattr() doesn't detect any inconsistency, it will return without any other action"
12:43 pranithk1 xavih: It seems to perform readdirs and lookups :-(
12:44 xavih pranithk1: oops
12:44 pranithk1 xavih: I am not sure why I wrote it that way. I need to remember. May be to make sure things are fine :-/
12:44 pranithk1 xavih: Ah, now I remember. We don't call heal unless someone explicitly detects an inconsistency. In all possible codepaths
12:45 pranithk1 xavih: But this code path after replace-brick doesn't do that check
12:45 pranithk1 xavih: ec_check_status, calls it only when heals are required.
12:46 xavih pranithk1: even doing readdir and lookups on the root directory, this is harmless, isn't it ?
12:46 xavih pranithk1: it will take some more time, but that's all
12:46 pranithk1 xavih: while it is harmless, it is unnecessary
12:46 pranithk1 xavih: exactly
12:47 xavih pranithk1: yes, but this is not causing the problem we are discussing
12:47 xavih pranithk1: this can be optimized later
12:47 pranithk1 xavih: After working with you, I am also thinking about performance :-).
12:47 pranithk1 xavih: Yes I can send that as a separate patch
12:47 xavih pranithk1: the problem here is to have too many self-heals launched and waiting one another
12:47 pranithk1 xavih: So here if the number of child_ups are more than before then we can call replace-heal
12:47 pranithk1 xavih: Hmm... why too many heals?
12:48 pranithk1 xavih: Oh you mean too many disconnects and reconnects?
12:48 pranithk1 xavih: the original problem I pinged you for... got it
12:48 xavih pranithk1: if a brick reconnects too many times and we allow replace_heal to be called for each CHILD_UP, we'll have a lot of queued pending heal, wasting memory
12:49 pranithk1 xavih: yes. That is a rare thing. If we do the optimization it won't be too costly.. right?
12:49 pranithk1 xavih: Hey, you said you have 30 minutes, I thought It would only take 5 minutes, so if you have to leave, please feel free to. We can talk even on Monday.
12:51 xavih pranithk1: not sure if it would be useful to check for an existing self-heal and replace it if root directory is not healthy
12:51 xavih pranithk1: or we could even simply mark the directory entry as bad instead of starting a self-heal if there's another self-heal running in the daemon
12:52 xavih pranithk1: I agree that the optimization would also help :)
12:52 pranithk1 xavih: thinking
12:53 xavih pranithk1: I like the second option
12:53 pranithk1 xavih: which one?
12:53 lpabon joined #gluster-dev
12:53 pranithk1 xavih: marking bad?
12:53 xavih pranithk1: to only mark it as bad
12:53 xavih pranithk1: the self-heal daemon will take care of it
12:54 pranithk1 xavih: You are saying that if the versions don't match just mark ec.dirty to be non-zero?
12:54 pranithk1 xavih: and index heal will pick it up?
12:54 xavih pranithk1: basically, yes
12:54 pranithk1 xavih: hmm...
12:56 pranithk1 xavih: But to check the versions are inconsistent we anyway have to take lock and inspect right?
12:56 pranithk1 xavih: so essentially if we do the optimization it seems to be same?
12:57 xavih pranithk1: to avoid unnecessary delays, we could modify the index healer so that if it detects that the root gfid is present, take it before the others
12:57 xavih pranithk1: yes, but we need to mark it and do nothing, even with the optimization
12:57 xavih pranithk1: in case it's bad
12:59 pranithk1 xavih: So, we need to take locks and inspect root directory. If the versions don't match then mark dirty and unlock.
12:59 xavih pranithk1: I think this would be a good solution
12:59 pranithk1 xavih: In index heal, before starting the heal check if root-gfid is present in the index. If yes, trigger that heal?
13:00 xavih pranithk1: this last thing is only to make sure that root directory gets healed as soon as possible
13:00 xavih pranithk1: but not strictly needed
13:00 xavih pranithk1: (I think)
13:00 pranithk1 xavih: yeah, got it. Yes it is not strictly needed.
13:01 pranithk1 xavih: Thanks xavih for your time :-)
13:01 pranithk1 xavih: I will get this done.
13:01 xavih pranithk1: yw :)
13:01 pranithk1 xavih: How is your time next weekend?
13:01 pranithk1 xavih: I want 3.9 to have your encoding optimization feature. I want to resume where I left off.
13:02 pranithk1 xavih: I am extremely sorry for the state of things with respect to it's review
13:02 xavih pranithk1: we can retake that, but I won't have much time this weekend
13:02 pranithk1 xavih: not weekend
13:02 pranithk1 xavih: I meant next week
13:02 xavih pranithk1: ah, ok
13:02 pranithk1 xavih: :-)
13:02 xavih pranithk1: we can find some time for it :)
13:03 pranithk1 xavih: great. I will ping you a bit more until that feature is merged. I will only rebase it to latest as a punishment for not reviewing :-)
13:03 pranithk1 xavih: Gah! sorry, it is almost 30 minutes since we started the discussion.
13:03 xavih pranithk1: don't worry. I can do that on monday
13:03 pranithk1 xavih: no no, I insist
13:03 xavih pranithk1: there's also some changes I need to do...
13:03 xavih pranithk1: don't worry ;)
13:04 xavih pranithk1: but I need to leave now :P
13:04 pranithk1 xavih: oh, in that case fine
13:04 xavih pranithk1: have a nice weekend
13:04 pranithk1 xavih: que tengas un buen fin de semana
13:04 pranithk1 xavih: did I get it right?
13:04 xavih pranithk1: gracias :)
13:04 pranithk1 xavih: google translate :-)
13:04 xavih pranithk1: perfect :)
13:04 pranithk1 xavih: cool!
13:04 pranithk1 xavih: adios
13:04 xavih pranithk1: bye
13:21 penguinRaider joined #gluster-dev
13:24 pranithk1 nigelb: hey, I just saw your mail about quota.t, what is this grep for df -h, didn't get you
13:27 julim joined #gluster-dev
13:32 hagarth pranithk1: have you seen my emails about ec.t and yet another afr test problem?
13:32 pranithk1 hagarth: I sent out the fix for ec.t
13:32 hagarth pranithk1: awesome, thanks!
13:32 pranithk1 hagarth: Only two more spurious failures are left from the list sent in the last week.
13:32 pranithk1 hagarth: ec-new-entry-mark.t and entry-self-heal.t in afr
13:33 pranithk1 hagarth: Fixed 3 yesterday and today
13:34 hagarth pranithk1: great .. I am keeping a close tab on regression-test burn in to weed out these spurious failures
13:37 pranithk1 hagarth: okay..
13:42 shaunm joined #gluster-dev
14:03 nbalacha joined #gluster-dev
14:12 pkalever left #gluster-dev
14:14 pranithk1 joined #gluster-dev
14:15 pranithk1 nbalacha: hey! is statedumpdir what we want?
14:15 nbalacha nope
14:15 nbalacha I think we can get that info from env variables in the scripts
14:15 pranithk1 nbalacha: What is it that we want?
14:15 nbalacha we need a way to figure out if the quota aux mount is up and running
14:15 nbalacha for a vol
14:16 nbalacha the way it is being done now is using df -h and looking for /var/run/glusterfs/<vol>
14:16 nbalacha but as the build is in a diff dir, it tries to strip out those components etc etc
14:16 nbalacha and that is fragile
14:17 nbalacha as the components are hardcoded
14:17 pranithk1 nbalacha: oh got it
14:17 nbalacha which means it will break the next time we change something in our regression setup
14:17 pranithk1 nbalacha: How does glusterd find if there is aux mount?
14:17 nbalacha i dont know if it does
14:18 pranithk1 nbalacha: GLUSTERFS_GET_AUX_MOUNT_PIDFILE
14:18 nbalacha how abt using the mount command instead of df -h
14:18 pranithk1 nbalacha: even I don't know the code. Just browsing :-)
14:18 nbalacha let raghug or manikandan get back then?
14:18 nbalacha there might be a nice command
14:19 nbalacha but in the meantime release-3.7 fails on netbsd
14:22 pranithk1 nbalacha: no appa, no command in cli-cmd-volume.c
14:22 pranithk1 nbalacha: we need to check if DEFAULT_VAR_RUN has the volname.pid and if it is a mount or not
14:24 nbalacha and for the particular volume we are interested in
14:24 pranithk1 nbalacha: yes yes, which is $V0.pid
14:24 nbalacha ok
14:24 pranithk1 nbalacha: Shall I code it up and check for fun?
14:25 nbalacha go right ahead :)
14:25 pranithk1 nbalacha: I mean get_aux() function
14:25 nbalacha delighted if you will :)
14:25 nbalacha you will rescue the netbsd runs :D
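The idea being agreed on here: instead of parsing df output with a hardcoded install prefix, have get_aux() look for the aux-mount pid file that quota drops under the run directory and check that the process behind it is alive. A rough sketch only; DEFAULT_VAR_RUN and the <volume>.pid naming come from the discussion above, while the exact paths and return convention would need to match the existing test framework:

    # Hypothetical, prefix-independent replacement for get_aux() in the tests
    function get_aux ()
    {
        local pidfile="${DEFAULT_VAR_RUN}/${V0}.pid"    # assumed aux-mount pid file
        if [ -f "$pidfile" ] && kill -0 "$(cat "$pidfile")" 2>/dev/null; then
            echo "0"    # aux mount is up for this volume
        else
            echo "1"
        fi
    }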
14:26 rraja joined #gluster-dev
14:31 hagarth joined #gluster-dev
14:57 penguinRaider joined #gluster-dev
15:10 wushudoin joined #gluster-dev
15:14 penguinRaider joined #gluster-dev
15:19 semiosis misc: kkeithley: you can definitely axe my gluster.org email forward/aliases.  i was offered an email alias and thought it was cool so I got it, but never used it, and won't miss it.  thanks!
15:31 ira joined #gluster-dev
15:31 spalai joined #gluster-dev
15:31 lalatend1M joined #gluster-dev
15:33 mchangir joined #gluster-dev
15:33 hchiramm joined #gluster-dev
15:40 lpabon joined #gluster-dev
15:41 mchangir joined #gluster-dev
15:44 misc grmblbl http://akat1.pl/?id=2
15:47 decay joined #gluster-dev
15:47 obnox joined #gluster-dev
15:47 foster joined #gluster-dev
15:47 JoeJulian joined #gluster-dev
15:47 spalai joined #gluster-dev
15:49 devyani7 joined #gluster-dev
15:50 [o__o] joined #gluster-dev
15:50 mchangir joined #gluster-dev
16:13 mchangir joined #gluster-dev
16:24 shubhendu joined #gluster-dev
16:27 shubhendu joined #gluster-dev
16:36 hchiramm joined #gluster-dev
16:39 spalai left #gluster-dev
16:40 spalai joined #gluster-dev
16:40 shaunm joined #gluster-dev
17:22 pkalever joined #gluster-dev
17:31 skoduri joined #gluster-dev
17:36 hchiramm joined #gluster-dev
17:37 julim joined #gluster-dev
17:59 glustin joined #gluster-dev
18:02 spalai left #gluster-dev
18:15 pkalever left #gluster-dev
18:20 penguinRaider joined #gluster-dev
18:22 hagarth joined #gluster-dev
18:27 nigelb pranithk1: Sorry, I'd stepped away for the evening. I'm glad you sorted that out.
18:49 shyam joined #gluster-dev
19:34 hagarth joined #gluster-dev
20:33 ira joined #gluster-dev
20:42 shyam left #gluster-dev
21:14 uebera|| joined #gluster-dev
21:14 uebera|| joined #gluster-dev
21:22 lpabon joined #gluster-dev
21:44 hagarth joined #gluster-dev
21:53 amye joined #gluster-dev
22:13 amye joined #gluster-dev
