IRC log for #gluster, 2013-09-13

All times shown according to UTC.

Time Nick Message
00:10 JoeJulian geewiz: You have to walk the directory tree to trigger self-heal with 3.2: find $mountpoint | xargs stat >/dev/null
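
A minimal sketch of the walk JoeJulian describes, assuming the volume is FUSE-mounted at a hypothetical /mnt/gluster; on 3.2 each stat() forces a lookup, and those lookups are what let the replicate translator notice and heal out-of-sync copies:

    MOUNT=/mnt/gluster                    # hypothetical mount point
    # -print0/-0 keep odd filenames intact; the output is discarded,
    # only the lookups matter
    find "$MOUNT" -print0 | xargs -0 stat > /dev/null
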
00:11 geewiz JoeJulian: I tried that but there was no self-heal to the replacement server.
00:11 JoeJulian check your client log
00:12 geewiz JoeJulian: I get a lot of "transport endpoint not connected" errors.
00:13 JoeJulian fpaste your client log (not glusterd.vol.log)
00:15 geewiz It's repetitions of this: http://ur1.ca/ficq3
00:15 glusterbot Title: #39255 Fedora Project Pastebin (at ur1.ca)
00:18 JoeJulian I imagine the repetitions are caused by the repeated shutting down and restarting of the brick on 81.86, or else something else is happening that is not evident in those few lines. That also suggests that glusterd is not listening on 111.68 or is firewalled.
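
A quick, hedged way to test the "not listening or firewalled" guess from both ends; 24007 is glusterd's standard management port, and the address below is a placeholder for the server in question:

    # from the client: can glusterd be reached at all?
    nc -zv 192.0.2.68 24007               # placeholder address
    # on the suspect server: what is actually listening, and is a firewall in the way?
    netstat -tlnp | grep -E 'glusterd|glusterfsd'
    iptables -L -n
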
00:19 geewiz Judging from iftop, the clients seem to talk to the server, but only in the Kbps range. It's probably failing communication instead of self-heal.
00:20 hagarth joined #gluster
00:23 geewiz I just deleted an obsolete volume and it seems to have worked on both servers.
00:25 geewiz So it's not failing completely.
00:26 geewiz But each volume seems to trigger a "transport endpoint not connected" error on both servers.
00:27 geewiz Could it be that the volume information was not synced properly and they're talking to wrong ports?
00:28 geewiz In glusterd/vols, the respective other node always has "listen_port=0". I don't know if that's intended.
00:33 geewiz I need to go to sleep. Maybe I'll find the cause with a fresh head. Thanks!
00:50 jporterfield joined #gluster
00:51 vpshastry joined #gluster
00:57 jmeeuwen joined #gluster
01:01 samppah joined #gluster
01:02 \_pol joined #gluster
01:03 brosner joined #gluster
01:26 rjoseph joined #gluster
01:32 kevein joined #gluster
01:38 mjrosenb hey, all.  I have a dht setup, and after one of the bricks went offline, all of the clients are wonky.
01:40 harish joined #gluster
01:55 Transformer joined #gluster
01:56 Transformer left #gluster
01:56 elyograg joined #gluster
02:00 elyograg we have run into what seems to be a known problem.  still on 3.3.1, wondering if 3.4 might fix it.  Mounting gluster via NFS on Solaris requires first doing an NFS mount on a linux machine.  If the Solaris machine is rebooted, it appears that a mount on a linux machine is required *again* before the Solaris mount will work.  We don't know how long it takes for the magic created by the linux mount to dissipate.
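
The workaround elyograg describes amounts to an ordinary NFSv3 mount from a Linux client against the Gluster NFS server before retrying the Solaris mount; a hedged sketch with placeholder hostnames and paths (the log never establishes why the Linux mount is needed first):

    # on any Linux client first (Gluster's built-in NFS server speaks NFSv3 only):
    mount -t nfs -o vers=3,nolock gfsserver:/myvol /mnt/tmp      # placeholder names
    umount /mnt/tmp
    # then on the Solaris box:
    mount -F nfs -o vers=3 gfsserver:/myvol /mnt/gluster
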
02:05 jag3773 joined #gluster
02:10 asias joined #gluster
02:15 premera_w joined #gluster
02:19 bharata-rao joined #gluster
02:21 DV joined #gluster
02:25 harish joined #gluster
02:28 saurabh joined #gluster
02:36 jporterfield joined #gluster
02:42 jporterfield joined #gluster
03:07 jporterfield joined #gluster
03:08 bennyturns joined #gluster
03:19 kshlm joined #gluster
03:27 kanagaraj joined #gluster
03:32 shubhendu joined #gluster
03:38 mrEriksson joined #gluster
03:39 davinder joined #gluster
03:45 jporterfield joined #gluster
03:49 itisravi joined #gluster
03:51 jporterfield joined #gluster
03:58 vpshastry joined #gluster
03:59 sgowda joined #gluster
04:02 ajha joined #gluster
04:18 rjoseph joined #gluster
04:21 bivak joined #gluster
04:21 flrichar joined #gluster
04:27 jbrooks joined #gluster
04:28 \_pol joined #gluster
04:29 badone joined #gluster
04:31 an joined #gluster
04:31 lalatenduM joined #gluster
04:40 glusterbot New news from newglusterbugs: [Bug 1004519] SMB:smbd crashes while doing volume operations <http://goo.gl/DMsNHh>
04:42 theron_ joined #gluster
04:43 ppai joined #gluster
04:45 psharma joined #gluster
04:51 jporterfield joined #gluster
04:55 nshaikh joined #gluster
04:57 shylesh joined #gluster
04:57 shruti joined #gluster
05:00 dusmant joined #gluster
05:01 shireesh joined #gluster
05:08 mjrosenb is it possible that the client isn't talking to one of the bricks?
05:08 CheRi joined #gluster
05:09 mjrosenb https://gist.github.com/6546953
05:09 glusterbot Title: xcut (at gist.github.com)
05:09 mjrosenb I wonder why that is happening.
05:10 mjrosenb afaict, the python error is unrelated to being unable to connect to the daemon.
05:11 hagarth mjrosenb: glusterd doesn't seem to be running .. is this a server or a client?
05:11 jporterfield joined #gluster
05:12 shylesh joined #gluster
05:13 rjoseph joined #gluster
05:18 timothy joined #gluster
05:28 vpshastry joined #gluster
05:30 ndarshan joined #gluster
05:30 ababu joined #gluster
05:34 bala joined #gluster
05:39 badone joined #gluster
05:45 mohankumar joined #gluster
05:46 bala joined #gluster
05:53 satheesh joined #gluster
05:57 aravindavk joined #gluster
06:01 mjrosenb hagarth: client.
06:01 hagarth mjrosenb: peer status cannot be executed on the client
06:02 mjrosenb hagarth: ahh, that would do it.
06:02 mjrosenb hagarth: so can I find out which bricks this client is talking to?
06:03 mjrosenb because I have two machines, and they are getting different results.
06:04 hagarth mjrosenb: netstat helps, you can also look at volume status <volname> client on the server
06:05 mjrosenb gluster> volume status magluster client
06:05 mjrosenb Unable to obtain volume status information.
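
The failure above may simply be the subcommand name: in 3.3/3.4 it is plural and has to be run on a node that is part of the cluster, not on a client. Something like the following (volume name taken from the log) is probably what was intended:

    # on one of the servers:
    gluster volume status magluster clients
    # or, from the client itself, list which bricks it actually has connections to:
    netstat -tn | grep ESTABLISHED
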
06:17 mjrosenb hagarth: any ideas?
06:20 johnmwilliams joined #gluster
06:20 kPb_in_ joined #gluster
06:20 vshankar joined #gluster
06:20 jporterfield joined #gluster
06:21 rjoseph joined #gluster
06:22 jtux joined #gluster
06:25 rgustafs joined #gluster
06:28 jcsp joined #gluster
06:30 anands joined #gluster
06:31 wgao joined #gluster
06:40 vshankar joined #gluster
06:48 hagarth mjrosenb: need to check your glusterd log files for that
06:51 mjrosenb hagarth: I started the brick in debug mode :-)
06:52 StarBeast joined #gluster
06:53 raghu joined #gluster
06:59 mjrosenb hagarth: I don't see any activity from the client that isn't working properly
06:59 mjrosenb hagarth: can I ask the client to re-scan or something?
07:02 hagarth mjrosenb: attempt a remount of the client?
07:05 ngoswami joined #gluster
07:07 ctria joined #gluster
07:08 eseyman joined #gluster
07:09 mooperd_ joined #gluster
07:11 mjrosenb hagarth: you mean, unmounting it and re-mounting it?
07:14 dusmant joined #gluster
07:17 hybrid512 joined #gluster
07:19 haritsu joined #gluster
07:21 puebele joined #gluster
07:33 jporterfield joined #gluster
07:38 lalatenduM joined #gluster
07:42 puebele1 joined #gluster
07:44 morse joined #gluster
07:48 vshankar joined #gluster
07:54 ProT-0-TypE joined #gluster
07:59 jcsp joined #gluster
08:05 andreask joined #gluster
08:08 rjoseph joined #gluster
08:11 mbukatov joined #gluster
08:18 ProT-0-TypE joined #gluster
08:29 jporterfield joined #gluster
08:38 ProT-0-TypE joined #gluster
08:38 asias joined #gluster
08:46 vimal joined #gluster
08:47 mjrosenb hagarth_: so there's no way of doing this without unmounting the gluster volume on the client?
08:49 ababu joined #gluster
08:53 vpshastry left #gluster
08:56 vpshastry joined #gluster
08:57 an joined #gluster
09:00 samsamm joined #gluster
09:03 ndarshan joined #gluster
09:05 ababu joined #gluster
09:06 jre1234 joined #gluster
09:10 andreask joined #gluster
09:12 sac joined #gluster
09:30 hagarth joined #gluster
09:39 vshankar joined #gluster
09:48 manik joined #gluster
09:49 jporterfield joined #gluster
09:58 dusmant joined #gluster
09:58 bulde joined #gluster
10:13 jporterfield joined #gluster
10:19 nshaikh joined #gluster
10:20 jporterfield joined #gluster
10:26 andreask joined #gluster
10:32 edward2 joined #gluster
10:32 jporterfield joined #gluster
10:36 harish joined #gluster
10:47 kbsingh joined #gluster
10:52 Elendrys joined #gluster
10:55 jtux joined #gluster
11:04 andreask joined #gluster
11:19 hagarth joined #gluster
11:25 CheRi joined #gluster
11:28 mooperd__ joined #gluster
11:32 bulde joined #gluster
11:36 aib_007 joined #gluster
11:38 vshankar joined #gluster
11:38 aravindavk joined #gluster
11:49 dusmant joined #gluster
11:50 kanagaraj joined #gluster
11:55 harish joined #gluster
11:57 RedShift joined #gluster
11:57 manik joined #gluster
12:13 an joined #gluster
12:35 rfortier joined #gluster
12:43 rfortier joined #gluster
12:51 ndarshan joined #gluster
12:55 hagarth joined #gluster
13:01 vshankar joined #gluster
13:03 bulde joined #gluster
13:04 jdarcy joined #gluster
13:05 hagarth1 joined #gluster
13:10 ababu joined #gluster
13:14 nshaikh joined #gluster
13:20 rcheleguini joined #gluster
13:22 dusmant joined #gluster
13:25 jporterfield joined #gluster
13:28 B21956 joined #gluster
13:41 lpabon joined #gluster
13:41 ababu joined #gluster
13:42 failshell joined #gluster
13:44 elyograg we have run into what seems to be a known problem.  still on 3.3.1, wondering if 3.4 might fix it.  Mounting gluster via NFS on Solaris requires first doing an NFS mount on a linux machine.  If the Solaris machine is later rebooted, it appears that doing another NFS mount on a linux machine is required *again* before the Solaris mount will work.  We don't know how long it takes for the magic created by the linux mount to dissipate.
13:44 chirino joined #gluster
13:44 elyograg catching a train to work now, will be able to read responses when I get there.
13:52 kaptk2 joined #gluster
13:53 bugs_ joined #gluster
14:00 vpshastry left #gluster
14:01 bennyturns joined #gluster
14:02 Technicool joined #gluster
14:02 mohankumar joined #gluster
14:02 bennyturns joined #gluster
14:10 chirino joined #gluster
14:12 l0uis joined #gluster
14:13 dusmant joined #gluster
14:17 theron joined #gluster
14:20 dmojoryder has anyone encountered frequent 'page allocation failure' msgs with the stack always containing tcp calls, and if so is there a tcp/kernel param to address it?
14:21 Staples84 joined #gluster
14:30 jporterfield joined #gluster
14:48 theron jclift, around?
14:52 \_pol joined #gluster
14:53 nueces joined #gluster
14:54 chirino joined #gluster
14:59 kaptk2 Can anybody explain how I would go about setting up HA with KVM VM's using Gluster 3.4?
14:59 jag3773 joined #gluster
15:01 kaptk2 It seems to me you would still have a single point of failure
15:01 JoeJulian ie. The box hosting the VM.
15:01 haritsu joined #gluster
15:02 \_pol joined #gluster
15:02 JoeJulian GlusterFS provides HA storage for the VM images, but there's no hypervisor capable of replicating the vm core. That would make for an extremely slow vm.
15:04 kaptk2 JoeJulian: right, so when I define a disk. I say something like -drive file=gluster://1.2.3.4/data/a.qcow
15:04 kaptk2 if the box 1.2.3.4 goes down... what happens?
15:04 vpshastry joined #gluster
15:04 kaptk2 how would the box running the VM look to a different host
15:05 kbsingh theron: jclift says his laptop died
15:05 kbsingh theron: it was bloody, much violent, a few bystanders picked up injuries too - but we managed to save jclift
15:07 daMaestro joined #gluster
15:11 theron kbsingh, lol ask him if his preso from earlier was recorded :) would love to watch it.
15:11 theron and who is presenting right now at the dojo? great preso.
15:11 theron (citrix guy I think)
15:12 kaptk2 would the best plan be to just run it on the localhost? That way if that machine failed you could bring the box up on your other host and the volume location would remain?
15:12 kbsingh theron: its being recorded, we should have the talks online on a youtube channel in a few days. the person to track down for that is Evolution in #centos-devel
15:12 kbsingh theron: https://twitter.com/franciozzy
15:12 glusterbot Title: Felipe Franciosi (franciozzy) on Twitter (at twitter.com)
15:12 kbsingh its a great talk
15:13 theron yea my feed hiccuped for a second and I actually let out an audible "arrrrgggghh".
15:13 * theron follows
15:19 kbsingh theron: are you using the http://www.centos.org/media.html page ?
15:19 kbsingh a couple of people complained about audio on the wiki page ( no idea why, it's using the same iframe but a diff size )
15:24 diegows_ joined #gluster
15:26 theron kbsingh, I'm using http://www.centos.org/media.html yes.  Audio is fine here, no audio issue.  odd latency to picture in picture, but it's not bad.
15:27 theron kbsingh, watching it full screen as well.
15:32 B21956 joined #gluster
15:33 zerick joined #gluster
15:41 lpabon joined #gluster
15:47 B21956 joined #gluster
16:00 kbsingh theron: passed onto the guys doing the video
16:04 kPb_in_ joined #gluster
16:06 an joined #gluster
16:18 B21956 joined #gluster
16:19 B21956 left #gluster
16:20 B21956 joined #gluster
16:20 B21956 left #gluster
16:33 chirino joined #gluster
16:35 Mo_ joined #gluster
16:37 bulde joined #gluster
16:40 JoeJulian ~mount server | kaptk2
16:40 glusterbot kaptk2: The server specified is only used to retrieve the client volume definition. Once connected, the client connects to all the servers in the volume. See also @rrnds
16:41 dusmant joined #gluster
16:46 ndevos @rrnds
16:46 glusterbot ndevos: I do not know about 'rrnds', but I do know about these similar topics: 'rrdns'
16:46 \_pol joined #gluster
16:47 ndevos ah, ,,(mount server) has the typo
16:47 glusterbot The server specified is only used to retrieve the client volume definition. Once connected, the client connects to all the servers in the volume. See also @rrnds
16:47 kaptk2 @rrdns
16:47 glusterbot kaptk2: You can use rrdns to allow failover for mounting your volume. See Joe's tutorial: http://goo.gl/ktI6p
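
Taken together, the mount-server factoid and the rrdns tutorial mean the host named at mount time (or in a qemu gluster:// URL) only has to be reachable long enough to hand out the volume definition. A rough sketch of the usual mitigations, with placeholder names; the option spelling varies slightly between 3.3 and 3.4, so treat it as an assumption to verify:

    # either point one round-robin DNS name at every server and mount via that name,
    # or name an explicit fallback volfile server on the mount itself:
    mount -t glusterfs -o backupvolfile-server=server2.example.com \
        server1.example.com:/data /mnt/data
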
16:47 \_pol_ joined #gluster
16:47 hagarth joined #gluster
16:48 * ndevos never remembers the glusterbot commands to correct that - is there a wikipage?
16:48 kaptk2 JoeJulian: thanks, that explains it quite well
16:48 mjrosenb I had a problem last night.  It looks like whenever a brick goes down, and comes back up, every client gets wonky until they get remounted.
16:48 mjrosenb is there a way to get them back into a sane state without unmounting?
16:51 elyograg anyone see my question about three hours ago?
16:53 mjrosenb elyograg: why do you say it is a known problem?
16:55 bulde joined #gluster
16:57 elyograg mjrosenb: because we figured out we had to do it with a google search.  there was a blog entry someone had, and at least one mailing list thread.
16:57 \_pol joined #gluster
16:57 elyograg what we didn't know is that the linux mount would be required if we need to reboot the solaris box.
16:58 elyograg s/required/required again/
16:58 glusterbot What elyograg meant to say was: what we didn't know is that the linux mount would be required again if we need to reboot the solaris box.
16:58 \_pol__ joined #gluster
17:04 \_pol joined #gluster
17:04 zaitcev joined #gluster
17:05 hagarth joined #gluster
17:06 JoeJulian mjrosenb: short of diagnosing the problem and fixing it, a workaround may be to trigger some client graph change, ie. setting performance.client-io-threads then resetting it again...
17:07 JoeJulian elyograg: fascinating. If I were trying to diagnose that, I would probably do wireshark comparisons to see what's different and see if that offers some sort of clue on what to try on the solaris box.
17:08 mjrosenb JoeJulian: on the brick?
17:09 JoeJulian The server you're mounting from would be logical. You're looking for the nfs handshake to see what linux does that solaris doesn't.
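
One hedged way to run the comparison JoeJulian suggests: capture the mount handshake on the server once for each client and diff the portmap/MOUNT/NFS exchanges in Wireshark. The interface, hostnames, and the Gluster mountd port below are assumptions:

    # capture the Linux client's mount attempt, then repeat for the Solaris client:
    tcpdump -i eth0 -w linux-mount.pcap \
        'host linux-client and (port 111 or port 2049 or port 38465)'
    tcpdump -i eth0 -w solaris-mount.pcap \
        'host solaris-client and (port 111 or port 2049 or port 38465)'
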
17:09 JoeJulian oops, mixing up convos.
17:10 JoeJulian Still haven't had coffee and am fixing 3 other issues at the same time. I guess 5 is my limit without caffeine
17:11 \_pol_ joined #gluster
17:12 JoeJulian mjrosenb: You're trying to trigger the client to reload the graph (graph is the stack of translators). By changing a volume setting (gluster volume set ...) that affects the client, such as client-io-threads, that would trigger that reload. It shouldn't interfere with the mount nor any applications using it.
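
A sketch of that workaround using the option JoeJulian names: setting it and then resetting it pushes a regenerated volfile to every client, which reloads the translator graph in place without disturbing the mount (volume name taken from the log):

    gluster volume set magluster performance.client-io-threads on
    # once the clients have picked up the new graph, put the option back to its default:
    gluster volume reset magluster performance.client-io-threads
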
17:12 glusterbot New news from newglusterbugs: [Bug 1002556] running add-brick then remove-brick, then restarting gluster leads to broken volume brick counts <http://goo.gl/YqOYSj>
17:14 mjrosenb JoeJulian: i'll gladly buy you a coffee.  you have helped me so much. :-)
17:14 JoeJulian :D
17:20 shylesh joined #gluster
17:22 bulde joined #gluster
17:48 andreask joined #gluster
17:48 andreask joined #gluster
18:07 kPb_in_ joined #gluster
18:09 an__ joined #gluster
18:29 ProT-0-TypE joined #gluster
18:33 vimal joined #gluster
18:37 ujjain joined #gluster
18:56 vpshastry left #gluster
19:12 dbruhn joined #gluster
19:25 edward1 joined #gluster
19:25 mjrosenb JoeJulian: I have another question: I think I asked this already, but I'll ask again since I don't remember the answer
19:26 mjrosenb what is the .glusterfs directory for on a DHT node, and if creating all of the hardlinks fails, what will the side effects be
19:26 mjrosenb also, relatedly, if I rsync'ed them, and creating the hardlinks failed due to different filesystems, but now I have tons of large files in .glusterfs, can I just nuke them?
19:27 neofob left #gluster
19:29 vagif_verdi joined #gluster
19:30 vagif_verdi hi, i'm looking for dfs and i have questions to see if glusterfs fits my use case
19:43 vagif_verdi left #gluster
19:56 XpineX joined #gluster
19:59 jporterfield joined #gluster
20:13 jporterfield joined #gluster
20:13 \_pol joined #gluster
20:29 lkoranda joined #gluster
20:30 failshel_ joined #gluster
20:36 jporterfield joined #gluster
20:42 jporterfield joined #gluster
20:46 \_pol_ joined #gluster
20:47 \_pol_ joined #gluster
21:10 samppah_ joined #gluster
21:11 theron_ joined #gluster
21:11 sac`away` joined #gluster
21:12 johnmark_ joined #gluster
21:19 sticky_afk joined #gluster
21:19 stickyboy joined #gluster
21:33 jporterfield joined #gluster
21:41 mjrosenb so I'm getting this, and un-mounting and re-mounting isn't helping: https://gist.github.com/6556436
21:41 glusterbot Title: xcut (at gist.github.com)
21:44 bennyturns joined #gluster
21:57 mjrosenb [2013-09-13 14:39:34.255303] D [dht-common.c:434:dht_lookup_dir_cbk] 0-magluster-dht: lookup of /tmp on magluster-client-0 returned error (Transport endpoint is not connected)
21:57 mjrosenb [2013-09-13 14:39:34.255855] I [dht-layout.c:593:dht_layout_normalize] 0-magluster-dht: found anomalies in /tmp. holes=1 overlaps=0
21:58 mjrosenb [2013-09-13 14:39:34.255908] D [dht-layout.c:609:dht_layout_normalize] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5) [0x7f53d5362fa5] (-->/usr/lib64/glusterfs/3.3.0/xlator/protocol/client.so(client3_1_lookup_cbk+0x455) [0x7f53d0783835] (-->/usr/lib64/glusterfs/3.3.0/xlator/cluster/distribute.so(dht_lookup_dir_cbk+0x5fc) [0x7f53d053703c]))) 0-magluster-dht: path=/tmp err=Transport endpoint is not connected on subvol=magluster-client-0
21:58 mjrosenb [2013-09-13 14:39:34.255931] D [dht-common.c:482:dht_lookup_dir_cbk] 0-magluster-dht: fixing assignment on /tmp
21:58 mjrosenb [2013-09-13 14:39:34.255944] W [dht-selfheal.c:875:dht_selfheal_directory] 0-magluster-dht: 1 subvolumes down -- not fixing
21:58 mjrosenb [2013-09-13 14:39:34.256062] W [dht-layout.c:186:dht_layout_search] 0-magluster-dht: no subvolume for hash (value) = 1124730006
21:58 mjrosenb this client is basically constantly seeing this while running.
21:59 mjrosenb so it looks like it can't contact one of the bricks
21:59 mjrosenb but both bricks are running.
21:59 mjrosenb and I can ping them both.
22:13 jporterfield joined #gluster
22:24 nueces joined #gluster
22:34 micu3 joined #gluster
23:33 sprachgenerator joined #gluster
23:42 theron joined #gluster
23:49 jporterfield joined #gluster
23:55 jporterfield joined #gluster
