
IRC log for #gluster, 2014-04-15


All times shown according to UTC.

Time Nick Message
00:20 bala joined #gluster
00:21 yinyin joined #gluster
00:34 kkeithley joined #gluster
01:11 sulky joined #gluster
01:15 gtobon joined #gluster
01:16 sulky joined #gluster
01:32 gtobon joined #gluster
01:40 yinyin joined #gluster
01:59 lpabon joined #gluster
01:59 baojg joined #gluster
02:04 harish_ joined #gluster
02:07 itisravi_ joined #gluster
02:08 Ark joined #gluster
02:08 rastar joined #gluster
02:17 haomaiwa_ joined #gluster
02:31 haomaiwa_ joined #gluster
02:51 itisravi_ joined #gluster
03:00 haomaiwang joined #gluster
03:01 nightwalk joined #gluster
03:23 Peanut__ joined #gluster
03:27 gmcwhistler joined #gluster
03:28 jag3773 joined #gluster
03:39 gdubreui joined #gluster
03:43 kanagaraj joined #gluster
03:43 shubhendu joined #gluster
03:44 sputnik13 joined #gluster
03:45 dusmant joined #gluster
03:46 itisravi_ joined #gluster
03:52 kumar joined #gluster
04:00 kb joined #gluster
04:08 bharata-rao joined #gluster
04:15 Ark joined #gluster
04:17 yinyin_ joined #gluster
04:17 ppai joined #gluster
04:20 sahina joined #gluster
04:22 ndarshan joined #gluster
04:31 aravindavk joined #gluster
04:33 atinm joined #gluster
04:36 vsa joined #gluster
04:55 benjamin_____ joined #gluster
04:59 ravindran1 joined #gluster
05:05 purpleidea jclift: criticalhammer i know that people are working on erasure coding for gluster fs. if you're interested in having this feature sooner, i would post on the mailing list about wanting to make it work and even help with patches :)
05:11 Ark joined #gluster
05:14 spandit joined #gluster
05:16 bala joined #gluster
05:17 vpshastry1 joined #gluster
05:18 hagarth joined #gluster
05:20 pvh_sa joined #gluster
05:24 mjrosenb joined #gluster
05:24 mjrosenb morning all.  I'm trying to re-start a brick
05:25 mjrosenb Gluster process                                         Port    Online  Pid
05:25 mjrosenb ------------------------------------------------------------------------------
05:25 mjrosenb Brick memoryalpha:/local                                24009   N       N/A
05:25 mjrosenb and I get that.
05:25 haomaiwa_ joined #gluster
05:25 mjrosenb it looks like glusterd is starting, but it isn't spawning any child processes.
05:25 meghanam joined #gluster
05:25 meghanam_ joined #gluster
05:30 * mjrosenb is going to start pasting random snippets of the logs, on the off chance that glusterbot recognizes it
05:30 mjrosenb [2014-04-15 01:15:39.379352] I [glusterd-handler.c:411:glusterd_friend_find] 0-glusterd: Unable to find peer by uuid
05:30 mjrosenb [2014-04-15 01:15:39.379300] I [glusterd-rpc-ops.c:880:glusterd3_1_stage_op_cbk] 0-glusterd: Received ACC from uuid: 00000000-0000-0000-0000-000000000000
05:30 mjrosenb [2014-04-15 01:15:39.379372] C [glusterd-rpc-ops.c:886:glusterd3_1_stage_op_cbk] 0-: Stage response received from unknown peer: 00000000-0000-0000-0000-000000000000
05:30 mjrosenb [2014-04-15 01:15:39.379973] W [glusterd-op-sm.c:2262:glusterd_op_modify_op_ctx] 0-management: op_ctx modification failed
05:31 mjrosenb [2014-04-14 23:59:59.298771] E [socket.c:1715:socket_connect_finish] 0-management: connection to  failed (Connection refused)
05:32 mjrosenb ahh, bingo!
05:35 baojg joined #gluster
05:41 elico joined #gluster
05:45 nthomas joined #gluster
05:45 nishanth joined #gluster
05:47 raghu joined #gluster
05:56 nshaikh joined #gluster
05:57 vimal joined #gluster
06:02 kumar joined #gluster
06:14 Philambdo joined #gluster
06:15 baojg joined #gluster
06:16 lalatenduM joined #gluster
06:35 ajha joined #gluster
06:39 portante joined #gluster
06:47 rahulcs joined #gluster
06:52 ravindran1 joined #gluster
06:56 ravindran1 joined #gluster
06:58 edward1 joined #gluster
06:58 rahulcs joined #gluster
06:59 ravindran1 joined #gluster
07:01 eseyman joined #gluster
07:01 Ark joined #gluster
07:02 portante joined #gluster
07:02 ctria joined #gluster
07:02 psharma joined #gluster
07:05 pk joined #gluster
07:22 pvh_sa joined #gluster
07:31 rahulcs joined #gluster
07:34 ravindran1 joined #gluster
07:37 mjrosenb 26139: getpeername(9,{ AF_UNIX "/var/run/rpcbi<80>SU^A^H" },0x7fffffffc8cc) = 0 (0x0)
07:37 mjrosenb that looks strange.
07:41 fsimonce joined #gluster
07:45 rahulcs joined #gluster
07:46 pk left #gluster
07:46 X3NQ joined #gluster
07:58 rgustafs joined #gluster
08:05 ceiphas joined #gluster
08:07 ceiphas hi folks, i have a new problem... i have a gluster volume mounted with fuse and then re-exported using kernel nfs. when a nfs v2 client mounts this export and tries to do a "ls <mountpoint>" it gets "file too large" errors
08:17 haomaiw__ joined #gluster
08:19 social ceiphas: /o\ why, that must be by definition an RPC shit storm
08:20 ceiphas social, ???
08:22 ndevos ceiphas: my guess would be that you need to enable 32-bit inodes on the fuse mount
08:23 ndevos ceiphas: and you should be aware of the README.nfs from the fuse documentation
08:25 ndevos ceiphas: something like this should do it: mount -t glusterfs -o enable-ino32 $server:$volume /exports/$volume
08:26 liquidat joined #gluster
08:27 ceiphas ndevos, the enable-ino32 changed nothing
08:28 ndevos ceiphas: have you re-exported the filesystem, and re-mounted it on the client?
08:28 ceiphas jep
08:29 ndevos ceiphas: you can enable 'rpcdebug -m nfs' (or something like that) and see if there are any obvious issues
08:29 ceiphas client or server?
08:29 ndevos client
08:31 ceiphas i think i need to enable nfs debugging first...
08:32 ndevos try: rpcdebug -m nfs -s client
08:33 ndevos and: rpcdebug -m nfs -s proc ; rpcdebug -m nfs -s xdr
08:33 johnmwilliams__ joined #gluster
08:33 ndevos and maybe: rpcdebug -m nfs -s nfs
08:35 ndevos debugging will be written to /var/log/messages (or dmesg)
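Pulling the suggestions above together into one sketch (server name, volume name, export path and fsid value are placeholders, not taken from the log; the fsid export option is what the fuse README.nfs mentioned earlier asks for on FUSE re-exports):
    # on the re-exporting server: FUSE-mount with 32-bit inodes
    mount -t glusterfs -o enable-ino32 server1:/myvol /exports/myvol
    # /etc/exports on that server
    /exports/myvol  *(rw,no_subtree_check,fsid=14)
    # on the NFS client: enable rpc debugging, reproduce the failing ls, read the kernel log, switch it off
    rpcdebug -m nfs -s client proc xdr
    ls /mnt/myvol
    dmesg | tail -n 50
    rpcdebug -m nfs -c all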
08:35 fyxim_ joined #gluster
08:38 saurabh joined #gluster
08:42 saravanakumar joined #gluster
09:12 haomaiwang joined #gluster
09:17 haomai___ joined #gluster
09:21 rahulcs joined #gluster
09:22 calum_ joined #gluster
09:32 RameshN joined #gluster
09:38 glusterbot New news from newglusterbugs: [Bug 1026977] [abrt] glusterfs-3git-1.fc19: CThunkObject_dealloc: Process /usr/sbin/glusterfsd was killed by signal 11 (SIGSEGV) <https://bugzilla.redhat.com/show_bug.cgi?id=1026977>
09:41 rahulcs joined #gluster
09:47 RameshN joined #gluster
09:50 ctria joined #gluster
10:14 rahulcs joined #gluster
10:19 Guest34500 joined #gluster
10:21 knfbny joined #gluster
10:22 auganov joined #gluster
10:32 ira_ joined #gluster
10:36 dusmantkp_ joined #gluster
10:42 edward1 joined #gluster
10:44 kanagaraj joined #gluster
10:46 ctria joined #gluster
10:57 Philambdo joined #gluster
11:00 iamben_tw joined #gluster
11:05 gdubreui joined #gluster
11:08 lyang0 joined #gluster
11:08 qdk joined #gluster
11:08 Philambdo1 joined #gluster
11:29 ceiphas joined #gluster
11:35 ppai joined #gluster
11:36 aravindavk joined #gluster
11:45 Slashman joined #gluster
11:45 Slash joined #gluster
12:06 benjamin_____ joined #gluster
12:08 rahulcs joined #gluster
12:10 rahulcs_ joined #gluster
12:12 itisravi joined #gluster
12:13 jmarley joined #gluster
12:13 jmarley joined #gluster
12:14 ppai joined #gluster
12:34 qdk joined #gluster
12:36 chirino joined #gluster
12:39 Ark joined #gluster
12:45 aravindavk joined #gluster
12:49 jag3773 joined #gluster
13:03 qdk joined #gluster
13:04 rahulcs joined #gluster
13:11 B21956 joined #gluster
13:16 rahulcs joined #gluster
13:21 aravindavk joined #gluster
13:22 ccha hi semiosis
13:23 ccha glusterfs-client 3.4.3 doesn't have a dependency on fuse, is that normal?
13:23 ccha for ubuntu
13:26 dbruhn joined #gluster
13:28 lalatenduM ccha, I haven't tried the ubuntu client, but I think the package would have the required fuse bits in it, so it should be ok
13:30 Philambdo joined #gluster
13:44 ccha lalatenduM: apt-cache show glusterfs-client
13:44 ccha Depends: upstart-job, python, glusterfs-common (>= 3.4.3-ubuntu1~precise2)
13:45 ccha and for 3.3
13:45 ccha Depends: upstart-job, python, fuse-utils (>= 2.7.4), glusterfs-common (>= 3.3.2-ubuntu1~precise2
13:47 lalatenduM ccha, just try it, I think it would work fine
13:48 rgustafs joined #gluster
13:49 ccha lalatenduM: no I mean I upgraded it
13:49 lalatenduM ccha, are facing any issue with it?
13:49 ccha and I read package fuse is on the apt autoremove list
13:49 lalatenduM s/are/are you/
13:49 glusterbot What lalatenduM meant to say was: ccha, are you facing any issue with it?
13:50 ccha so if I agree with apt autoremove, then fuse will be removed
13:50 JoeJulian mjrosenb: weird. Are you sure all the versions match? That's the first thing I would look at with rpc errors.
13:51 ccha fuse is an orphan package since I upgraded to 3.4.3
13:52 ccha JoeJulian: so you use glusterfs-client on ubuntu ?
13:52 JoeJulian ccha: nope
13:52 ccha ok
13:52 JoeJulian what does fuse-utils provide?
13:53 JoeJulian I can't think of any features in that which should be required.
13:54 JoeJulian And nobody's come in here unable to mount their volumes in ubuntu and I'm sure they would have if a requirement was missing.
13:54 ccha fuse-utils depends on package fuse
13:54 JoeJulian I mean it's been what, 6 months?
13:57 JoeJulian I think if I were in your shoes and considering uninstalling something from a production environment, I would either leave fuse installed, or test my concerns in a vm.
13:57 ccha I mean glusterfs-client should have a dependency on fuse, right?
13:58 JoeJulian ccha: I don't know ubuntu packaging. fuse is part of the kernel.
14:01 JoeJulian I don't immediately see any dependencies in the client that should require any fuse libraries or utilities. As long as /dev/fuse is there the client should be able to mount.
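A quick way to check that claim on a client, sketched with placeholder names (server1, myvol): the native client bundles its own FUSE bridge, so the kernel module and the device node are all it needs.
    modprobe fuse                      # usually loaded automatically
    ls -l /dev/fuse                    # must exist and be accessible
    mount -t glusterfs server1:/myvol /mnt/myvol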
14:02 ccha hmm, I think on our systems there is no fuse package installed by default
14:02 JoeJulian I can bug semiosis to log in. I'll be seeing him in a little over an hour.
14:03 ccha with the previous glusterfs-client, fuse was installed as a dependency
14:04 JoeJulian Since that changed, my expectation would be that he changed it because it wasn't necessary, but that's just a guess based on my understanding of his methods.
14:04 ccha so now I should apt-get install fuse to keep it, not as a dependency but as a manually installed package
14:04 JoeJulian If you need it for something other than glusterfs, sure.
14:05 wgao joined #gluster
14:05 ccha nope I need fuse  just for glusterfs
14:05 JoeJulian Then why complicate things.
14:06 JoeJulian Install glusterfs-client, mount your volume, have a beer.
14:06 ccha and install fuse too
14:06 JoeJulian WHY?
14:07 ccha because fuse on our systems are not installed by default
14:07 JoeJulian Can you mount a volume without that package?
14:08 ccha nope, I can't with fuse package
14:08 ccha s/with/without/
14:08 glusterbot What ccha meant to say was: nope, I can't without fuse package
14:08 JoeJulian Lol, ok. Now we're getting somewhere. :D
14:08 JoeJulian I'll tell semiosis that you've reported a bug with his packaging at the board meeting this morning.
14:13 Andyy2 joined #gluster
14:14 sroy joined #gluster
14:15 JoeJulian alrighty then... I'm out for a while. Red Hat Summit's going to be keeping me away most of the day.
14:18 aravindavk joined #gluster
14:20 japuzzo joined #gluster
14:21 lalatenduM JoeJulian, you might have seen the discussion on the gluster-devel mailing list about documentation. Thinking of adding you as a reviewer for some of them, is it ok
14:21 lalatenduM ?
14:25 gmcwhistler joined #gluster
14:25 lalatenduM JoeJulian, because it would be good to have your perspective on these, to see if it's really addressing things which users will come across while trying new things
14:25 rbw joined #gluster
14:29 LoudNoises joined #gluster
14:31 rbw Hello. what's the best way to accomplish load balancing in gluster?
14:32 rbw I've looked at rrdns and specifying multiple cluster nodes with the mount command (comma separated nodes)
14:33 dbruhn rbw, can you use the gluster native client?
14:33 rbw dbruhn: hmm.. you mean, using the native client for load balancing?
14:33 dbruhn the gluster native client already does what you want it to
14:34 rbw ok, so you basically specify cluster nodes in the gluster config?
14:34 rbw and it'd be aware of i.e. a node dying?
14:34 dbruhn When the client connects it gets a manifest of all of the servers, and it communicates to all of the cluster nodes independently
14:35 rbw hmm
14:35 mjrosenb JoeJulian: you mean the versions on the bricks, or the versions of the individual programs on the brick that isn't coming up?
14:36 rbw dbruhn: I'm using the mount command to mount a glusterfs volume. is there a better way?
14:37 dbruhn rbw, no
14:37 rbw dbruhn: because using mount you need to specify a host which the volume resides on
14:37 rbw dbruhn: if only specifying one node, and you stop glusterd on that one, the mount breaks.
14:37 mjrosenb glusterfs 3.3.0 built on Sep  4 2012 16:50:45
14:38 mjrosenb for both glusterd and glusterfsd
14:38 dbruhn rbw, what is your volume configuration?
14:38 rbw dbruhn: however, specifying multiple cluster nodes (i.e. glu01:/vol1,glu02:/vol2) works as intended
14:38 mjrosenb aaand on the brick that works; glusterfs 3.3.0 built on Sep 12 2013 05:31:04
14:38 rbw dbruhn: none. just started testing out gluster. :P
14:39 rahulcs joined #gluster
14:40 rbw dbruhn: so.. I need to specify cluster nodes in a configuration file, then mount a single target, and the load balancing works automagically?
14:40 dbruhn ok, if you have client1 and mount server1 and you have 4 gluster servers and server 1 dies for some reason, client1 when it first connected received a manifest of all of the servers, and will continue to communicate with all of the remaining servers that are still available
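The mount server is only contacted to fetch the volume description; a fallback for that initial fetch can be given at mount time. A sketch, assuming hostnames server1/server2/server3 and a volume gv0 (the exact option name varies by client version):
    mount -t glusterfs -o backupvolfile-server=server2 server1:/gv0 /mnt/gv0
    # newer clients accept a list instead:
    # mount -t glusterfs -o backup-volfile-servers=server2:server3 server1:/gv0 /mnt/gv0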
14:41 rbw dbruhn: all right
14:41 dbruhn mjrosenb, JoeJulian is being kept busy with Redhat summit stuff and went to go take care of some stuff with that.
14:41 rbw dbruhn: I tested that, but it didn't work properly
14:41 dbruhn what was your volume configuration?
14:41 mjrosenb dbruhn: fun.
14:42 rbw dbruhn: volume configuration... client side?
14:42 mjrosenb so the biggest question i have is about files that are accessed.
14:42 dbruhn rbw, no you had to create a gluster volume, what did you configure it as.
14:44 mjrosenb http://dpaste.com/1780436/
14:44 glusterbot Title: dpaste: #1780436: xcut, by mjrosenb (at dpaste.com)
14:44 rbw dbruhn: gluster volume create gv0 replicate 2 transport tcp gs1:/mnt/gv0 gs2:/mnt/gv0
14:44 rbw dbruhn: (if I remember correctly)
14:45 dbruhn were you using a single network for the servers and clients to talk to each other?
14:45 rbw dbruhn: then I mounted the volume from the client using mount .. gs1:/gv0
14:45 rbw dbruhn: and stopped the network interface on gs1, and it stopped working
14:46 mjrosenb it /looks/ like the string "/var/run/rpcbind.sock" got replaced with "/var/run/rpcbi%s", and random stuff is getting spewed into a string that is then attempted to be opened.
14:46 rbw dbruhn: then I tried specifying both nodes in the mount command, mount .. gs1:/gv0,gs2:/gv0
14:46 rbw dbruhn: and stopped gs1, and it still worked
14:46 rbw dbruhn: yes, single network
14:46 dbruhn rbw, the mount command doesn't need to connect to another node. And what was your indication that the first one wasn't working?
14:47 rbw dbruhn: it timed out when I was trying to read/write.
14:47 rbw dbruhn: and told me the endpoint was down
14:50 rbw dbruhn: any ideas?
14:52 rahulcs joined #gluster
14:53 glusterbot New news from resolvedglusterbugs: [Bug 1026977] [abrt] glusterfs-3git-1.fc19: CThunkObject_dealloc: Process /usr/sbin/glusterfsd was killed by signal 11 (SIGSEGV) <https://bugzilla.redhat.com/show_bug.cgi?id=1026977>
14:55 dbruhn rbw, I would run your test again. The client has a 42 second timeout and it could have been hung waiting on that.
14:55 dbruhn Also, make sure all of your servers are up and able to talk to the client
14:56 dbruhn I've seen a lot of people have a client connecting to one server and have the volume working, not realizing their client isn't connected to the second server
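The 42 seconds mentioned above is the default network.ping-timeout, which can be inspected and tuned per volume. A sketch, assuming a volume named gv0 (20 is only an illustrative value; lowering it trades faster failover for a higher risk of spurious disconnects):
    gluster volume info gv0                          # reconfigured options are listed here
    gluster volume set gv0 network.ping-timeout 20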
14:57 jmarley joined #gluster
14:57 jmarley joined #gluster
14:57 tdasilva joined #gluster
14:59 harish_ joined #gluster
15:01 rbw dbruhn: OK, I'll re-run my tests. Thanks. :)
15:02 rbw dbruhn: btw - another stupid question; are read/write requests automatically balanced, or just failover?
15:02 dbruhn automatically balanced... kind of
15:02 dbruhn writes happen in parallel to replication group members
15:03 dbruhn reads by default are first brick server to respond
15:03 rbw dbruhn: ok, cool. thanks.
15:03 dbruhn one thing a lot of people have a hard time wrapping their head around at first is that the client actually talks to all of the servers directly
15:08 rbw yeah .. :)
15:09 rbw what if a cluster node dies, then goes back up again?
15:09 dbruhn Then it would come back online and start working properly
15:09 dbruhn And the client will catch the one up that was offline
15:09 dbruhn if the client disconnects then the self heal will fix it
15:10 rbw nice. this sounds almost too simple ;)
15:10 dbruhn hahahah
15:10 lpabon joined #gluster
15:18 benjamin_____ joined #gluster
15:19 jag3773 joined #gluster
15:21 mjrosenb dbruhn: so, with the 'clients talk to all servers directly', how does the nfs / samba integration work?
15:21 mjrosenb since those don't involve talking with the servers directly.
15:22 dbruhn the NFS integration actually is an NFS server sitting in front of the gluster native client
15:23 dbruhn so all of the gluster specific stuff happens behind the NFS export
15:23 ndevos mjrosenb: you can see it like a proxy, or gateway
15:24 mjrosenb ahh, and that has a chance in hell of working because it just speaks gluster on one end, and nfs on the other
15:24 mjrosenb no fuse involved.
15:24 ndevos correct :)
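So an NFS client would mount the same volume through that gateway roughly like this (hostname and mount point are placeholders; gluster's built-in NFS server speaks NFSv3, and nolock is a common but optional choice):
    mount -t nfs -o vers=3,nolock server1:/gv0 /mnt/gv0-nfs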
15:27 wgao joined #gluster
15:32 [o__o] joined #gluster
15:33 mjrosenb ok, so another question about nfs... why is it even trying to spin up an nfs server to begin with?
15:33 mjrosenb I'm somewhat sure I disabled that.
15:33 daMaestro joined #gluster
15:35 ndevos you have to disable that per volume, 'gluster volume set $VOL nfs.disable true'
15:35 dusmantkp_ joined #gluster
15:37 vipulnayyar joined #gluster
15:41 mjrosenb ndevos: and I've done that
15:42 mjrosenb unless gluster does stuff with rpcbind unrelated to nfs
15:42 * mjrosenb is used to rpcbind only being used for nfs and nis.
15:42 ndevos mjrosenb: rpcbind is also used for mount+lock, both needed for nfs
15:43 ndevos mjrosenb: when you have disabled nfs for all volumes, an existing nfs-server may not get stopped, have you checked how long the process is running
15:43 ndevos ?
15:44 ndevos if it's running for a while already, you can just kill it
15:44 mjrosenb ndevos: I killed it and re-started it, since glusterd is running, but glusterfsd isn't.
15:44 mjrosenb which is a bit of a problem.
15:44 ndevos mjrosenb: right, but glusterfsd is not the nfs-server
15:44 * ndevos is maybe late to the party, and missed a little
15:45 ndevos mjrosenb: you would need to check if the glusterfsd does not get started, or if it exits by itself again
15:46 mjrosenb ndevos: in the past, i've had issues because something tries to start up an nfs-server, then doesn't, so the whole brick falls over.
15:47 ndevos mjrosenb: I cant say much about that...
15:47 mjrosenb http://dpaste.com/1780495/
15:47 glusterbot Title: dpaste: #1780495: xcut, by mjrosenb (at dpaste.com)
15:48 mjrosenb it seems to indicate that glusterfsd started, but then exited.
15:48 [o__o] left #gluster
15:48 mjrosenb also, sorry about all of the typos, my keyboard is falling apart.  a new one should be coming any day now.
15:52 ndevos mjrosenb: is there nothing in the brick logs?
15:54 mjrosenb ndevos: there's lots of stuff in the brick logs!
15:54 mjrosenb actually, the only things there are 'bricks' and 'nfs.log'
15:55 mjrosenb there is normally a usr-local-etc-glusterfs-glusterfsd.vol.log, but that wasn't created this time?!
15:55 ndevos mjrosenb: you dont happen to have a full /var or partition where you save the logs?
15:56 mjrosenb oh, there's a bricks/local
15:56 mjrosenb ndevos: I made a directory 'old', and moved all of the old logs there to find out what was being created.
15:57 mjrosenb oh, bricks/local has logs going back to 2012
15:57 mjrosenb c.c
15:57 mjrosenb [2014-04-15 03:43:43.692100] E [xlator.c:385:xlator_init] 0-magluster-posix: Initialization of volume 'magluster-posix' failed, review your volfile again
15:57 mjrosenb that could be the issue.
15:58 JoeJulian mjrosenb: try glusterd --debug
15:59 mjrosenb JoeJulian: that's how I got the truss logs.
15:59 systemonkey joined #gluster
16:00 mjrosenb JoeJulian: ok, started via glusterd --debug.
16:00 mjrosenb memoryalpha# ps aux -www | grep gluster
16:00 mjrosenb root     26538   0.0  0.0  12320  2456  0  I+    4:17AM     0:00.00 less /var/lib/glusterd/peers/14feb846-04a7-4ac4-aab9-14c96aef8fb9
16:01 mjrosenb root     27310   0.0  0.1  57352 14136  2  I+   11:59AM     0:00.37 glusterd --debug (glusterfsd)
16:01 mjrosenb root     27331   0.0  0.0  16312  2184  3  S+   12:00PM     0:00.00 grep gluster
16:01 mjrosenb so no glusterfsd.
16:01 jbd1 joined #gluster
16:01 mjrosenb shall I pastebin the logs that were spewed to the screen?
16:02 JoeJulian fpaste, yeah.
16:03 JoeJulian splitting my attention, sorry
16:03 JoeJulian In a meeting.
16:03 mjrosenb http://bpaste.net/show/205918/
16:03 glusterbot Title: Paste #205918 at spacepaste (at bpaste.net)
16:06 * mjrosenb still guesses the issue is these two lines:
16:07 mjrosenb [2014-04-15 12:02:19.026898] C [posix.c:3965:init] 0-magluster-posix: Extended attribute not supported, exiting.
16:07 mjrosenb [2014-04-15 12:02:19.026925] E [xlator.c:385:xlator_init] 0-magluster-posix: Initialization of volume 'magluster-posix' failed, review your volfile again
16:08 ndevos mjrosenb: right, extended attributes are a requirement, maybe you need some mount-options for that?
16:08 ndevos (for mounting the filesystem for the bricks)
16:08 mjrosenb sounds like it!
16:08 * mjrosenb wonders what changed, since this /was/ working
16:10 sputnik13 joined #gluster
16:11 mjrosenb oh, how do we check for xattr support?
16:11 mjrosenb do we try to set an attribute?
16:12 ndevos I guess something like 'getfattr -m. -ehex -d /path/to/brick' should give you a hint
16:12 mjrosenb the filesystem was readonly
16:12 mjrosenb ok, it should not have taken me that long to figure out that the filesystem was readonly :-(
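Two quick checks that would have caught this sooner, sketched with the brick path from the status paste near the top of the log (/local); the xattr name is only illustrative, any trusted.* attribute works for the test, and the mount check only applies if the brick directory is itself a mountpoint:
    brick=/local
    grep " $brick " /proc/mounts                   # look for "ro" in the options field
    setfattr -n trusted.glusterfs.test -v working "$brick" && echo "xattrs writable"
    getfattr -m . -e hex -d "$brick"               # dump what is already there, as suggested above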
16:13 ndevos no, it normally is quite verbose in the logs...
16:13 ndevos but at least you found the issue
16:13 mjrosenb which logs?
16:16 ndevos /var/log/messages, or dmesg
16:18 Mo__ joined #gluster
16:19 mjrosenb why would a filesystem being readonly show up there?
16:21 JoeJulian mjrosenb: Unable to find hostname: 192.168.0.4
16:21 jbd1 typically when a filesystem is unexpectedly mounted readonly, it's because the kernel detected filesystem corruption and automatically remounts-ro in order to prevent further corruption (ext4 has errors=remount-ro for this).  When this happens, there's noise in the logs.
16:21 ndevos because the ext4 or xfs or whatever filesystem driver is part of the kernel, it mostly complains loudly when a block-device has an issue, and makes the filesystem read-only
16:21 JoeJulian mjrosenb: Looks like that server has a different ip address?
16:21 * ndevos is a fan of mounting with "errors=panic"
16:22 mjrosenb jbd1: ndevos: no, i just froze the filesystem while I migrated it, and I forgot to unfreeze it when I was done.
16:22 JoeJulian oh, good.
16:22 * jbd1 uses errors=remount-ro but tests for readonly with monitoring
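A sketch of both halves of that approach (device and mount point are placeholders): the fstab option that triggers the remount, and a crude check that monitoring could run to spot a filesystem that has gone read-only.
    # /etc/fstab entry for an ext4 brick
    /dev/sdb1  /bricks/brick1  ext4  defaults,errors=remount-ro  0 2
    # flag any block-device filesystem currently mounted read-only
    awk '$1 ~ /^\/dev\// && $4 ~ /(^|,)ro(,|$)/ {print $2 " is read-only"}' /proc/mounts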
16:22 mjrosenb JoeJulian: that's one of the two servers, and is certainly up.
16:22 jbd1 mjrosenb: that's the best reason to have read-only!
16:22 ndevos mjrosenb: oh, right, thats very similar, but would not be logged :)
16:23 ndevos jbd1: what fs do you use? the new brick failure detection might help you (but it only seems to be working on xfs...)
16:24 mjrosenb only on xfs? that's a new one
16:24 mjrosenb way back when, aufs2 worked everywhere *but* xfs.
16:24 jbd1 ndevos: I use xfs for my glusterfs bricks and some other filesystems, but ext4 is still in use for root fs on my machines
16:25 ndevos jbd1: okay, so the health-checker in the posix-xlator writes to a file on the brick at an interval, and catches errors
16:25 * mjrosenb just uses zfs for everything
16:25 mjrosenb because zfs.
16:26 ndevos ah, no, thats the change I want to make, it currently calls stat(), xfs reports an error when r/o, ext4 does not do that :-/
16:26 jbd1 ndevos: that's good to know.  My kingdom for glusterfs sending snmp traps on errors
16:27 ndevos jbd1: the brick process will exit, and some errors get logged, it's not snmp, but any log-parser should be able to capture it
16:27 jbd1 I'd even settle for glusterfs allowing me to configure a hook script that is fired on error detection. Log parsing is not my favorite way to find problems on servers.
16:28 ndevos check if bricks are in the output of 'gluster volume status'?
16:28 jbd1 yup, have to write a custom script to verify that everything is Y Y
16:28 ndevos 'gluster --xml volume status' ?
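A rough version of that check, based on the plain-text status layout shown earlier in this log (Port / Online / Pid columns); the XML output is the sturdier thing to parse, but this illustrates the idea for a volume named gv0:
    gluster volume status gv0 | awk '/^Brick / && $(NF-1) == "N" {print "OFFLINE: " $0}'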
16:29 jbd1 I don't mean to whine.  It's not a big deal.
16:29 ndevos adding a hook can be done, you can file a bug and request for that feature
16:29 glusterbot https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS
16:29 jbd1 why not. I'll do that now
16:30 ndevos cool, thanks!
16:30 * ndevos needs to go, ttyl!
16:34 kkeithley joined #gluster
16:35 kkeithley joined #gluster
16:37 jbd1 Anyone have a minute to help me track down a funky distribute-replicate issue?  I have a directory which, when accessed, causes whichever program touches it to hang.  I have verified that the gfids of all files and directories match and haven't yet found anything useful in the logs.  Even strace hangs though
16:40 glusterbot New news from newglusterbugs: [Bug 1087947] Feature request: configurable error reporting hook script <https://bugzilla.redhat.com/show_bug.cgi?id=1087947>
16:42 jbd1 The directory in question did, once upon a time, have a gfid mismatch issue, but that has been corrected for a couple of weeks already
16:43 zerick joined #gluster
16:43 jbd1 aha! found it
16:44 jbd1 [2014-04-14 19:56:19.922623] I [server3_1-fops.c:1085:server_unlink_cbk] 0-UDS8-server: 388: UNLINK /7a/18/0f/rtouchtone66815/mail/entries/inbox/welcome.dump (2ebbb6e7-256e-46a5-b6ca-76cccfe37eec) ==> -1 (Permission denied)
16:44 jbd1 ---------T 2 apache apache 0 Apr  1 22:11 /export/brick1/vol1/7a/18/0f/rtouchtone66815/mail/entries/inbox/welcome.dump
16:46 jbd1 it's almost funny that glusterfs sets the permissions on files to stuff like that, then complains that it can't do what it needs to do due to the permissions it set
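For context, zero-byte files with that mode (only the sticky bit set) are normally DHT link files: placeholders left on one brick that point at the brick actually holding the data. They can be listed and inspected roughly like this, reusing the brick path from the paste above (the xattr may be absent on a damaged linkfile):
    find /export/brick1/vol1 -name .glusterfs -prune -o -type f -perm 1000 -size 0 -print
    getfattr -n trusted.glusterfs.dht.linkto -e text /export/brick1/vol1/7a/18/0f/rtouchtone66815/mail/entries/inbox/welcome.dump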
16:49 pvh_sa joined #gluster
16:53 vpshastry1 joined #gluster
17:07 jclift jbd1: Do you have time/inclination to create a bug report about it?  That could turn out to be really helpful for others. :)
17:08 lpabon joined #gluster
17:09 masterzen joined #gluster
17:10 vpshastry1 left #gluster
17:10 Matthaeus joined #gluster
17:15 JoeJulian root shouldn't be denied permission to unlink a file. selinux?
17:15 calum_ joined #gluster
17:21 jbd1 no, no selinux here.  glusterfs runs as root too.  Just strange.
17:22 jbd1 problem still exists.  strace ls -l on the file from the client hangs on lstat64("/home/om/UDS8/7a/18/0f/rtouchtone66815/mail/entries/inbox/welcome.dump",
17:24 jbd1 getdents64 on /home/om/UDS8/7a/18/0f/rtouchtone66815/mail/entries also hangs
17:24 jobewan joined #gluster
17:24 systemonkey joined #gluster
17:27 steveeJ joined #gluster
17:28 systemonkey joined #gluster
17:29 chirino joined #gluster
17:29 Slash left #gluster
17:30 jbd1 nothing in the logs about this on client or server.  I'll create a bug
17:30 Slashman joined #gluster
17:33 Matthaeus1 joined #gluster
17:38 systemonkey2 joined #gluster
17:45 jbd1 I hope something comes of it.  Issues like the one I'm having are rare, but if the file in question were accessed frequently enough, I would be looking at a site outage
17:52 jag3773 joined #gluster
17:52 jbd1 1087960 created
17:59 edward1 joined #gluster
18:05 Abrecus joined #gluster
18:08 Abrecus joined #gluster
18:10 glusterbot New news from newglusterbugs: [Bug 1087960] Client hangs when accessing a file, nothing logged <https://bugzilla.redhat.com/show_bug.cgi?id=1087960>
18:10 sputnik13 with distributed volumes, does gluster know to prefer distribution to bricks on other hosts over distribution to bricks on the same host?
18:19 jbd1 sputnik13: distributed (no replicate) doesn't prefer anything over anything-- what would be the point?
18:19 jbd1 sputnik13: if you replicate, you define the replica pairs when you create or grow the volume
18:20 sputnik13 jbd1: sorry, I meant replicate
18:20 sputnik13 I said distribute but I mean replicate :)
18:21 pk joined #gluster
18:24 rahulcs joined #gluster
18:28 lmickh joined #gluster
18:32 B21956 joined #gluster
18:47 Slashman joined #gluster
18:51 jbd1 sputnik13: then it's all a matter of how you define your replica bricks.  the command to add the bricks to the volume is order-sensitive-- add server1:brick1 server2:brick1 server3:brick1 server4:brick1 would make server1/brick1 replica of server2/brick1 and server3/brick1 replica of server4/brick1
18:52 sputnik13 ok, that's what I was thinking it would be, if it cared, so I did create the volumes in the manner you described
18:52 jbd1 sputnik13: it would be a mistake to add server1:brick1 server1:brick2 because then server1 would just be mirroring two bricks locally and you would lose device redundancy
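A sketch of a create command that follows that ordering rule (hostnames and brick paths are placeholders): consecutive bricks form a replica set, so alternating servers keeps the copies of each set on different machines.
    gluster volume create gv0 replica 2 transport tcp \
        server1:/bricks/b1 server2:/bricks/b1 \
        server3:/bricks/b1 server4:/bricks/b1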
18:52 sputnik13 good to have confirmation
18:52 sputnik13 jbd1: thanks for the info
18:53 jbd1 sputnik13: I try to give back some; this channel has certainly helped me enough
18:59 pk left #gluster
19:09 jag3773 joined #gluster
19:10 glusterbot New news from newglusterbugs: [Bug 1084432] Service fails to restart after 3.4.3 update <https://bugzilla.redhat.com/show_bug.cgi?id=1084432>
19:42 rahulcs joined #gluster
19:45 kkeithley1 joined #gluster
19:53 ndk joined #gluster
20:01 kkeithley joined #gluster
20:05 zerick joined #gluster
20:07 rahulcs joined #gluster
20:09 pk joined #gluster
20:09 swat30 joined #gluster
20:12 marcoceppi joined #gluster
20:12 marcoceppi joined #gluster
20:15 kkeithley joined #gluster
20:17 marcoceppi joined #gluster
20:17 marcoceppi joined #gluster
20:21 hagarth joined #gluster
20:22 MacWinne_ joined #gluster
20:25 jag3773 joined #gluster
20:33 marcoceppi joined #gluster
20:33 marcoceppi joined #gluster
20:33 gdavis33 how can i verify the file counts for a replace brick?
20:34 Ark joined #gluster
20:35 jbd1 gdavis33: on old brick, run find /path/to/brick -name .glusterfs -prune -or -print > /path/to/filelist.txt .  Then after the replace-brick is done healing, you can run a new find on the new brick to verify the file count matches.
20:36 gdavis33 i get Number of files migrated = 65924        Migration complete
20:36 jbd1 gdavis33: if you change -print to -ls you can get more information, like file size and permissions, but note that if this is a live volume then the files may change, or new ones added, old ones removed, etc
20:36 gdavis33 from the replace process
20:37 jbd1 gdavis33: and you're looking to confirm that all the files from the old brick were migrated to the new one, right?
20:37 gdavis33 yes
20:38 jbd1 gdavis33: then you need to count how many files were on the old brick
20:38 gdavis33 find /d1 -name .glusterfs -prune -or -print | wc -l
20:38 gdavis33 66591
20:38 gdavis33 thats the old
20:39 gdavis33 and the new is
20:39 gdavis33 find /brick1 -name .glusterfs -prune -or -print | wc -l
20:39 gdavis33 31645
20:39 pk left #gluster
20:39 gdavis33 so none of the numbers match
20:40 jbd1 gdavis33: yeah, that's odd-- I would recommend running a find from a client on the volume, which will trigger gluster's self-heal process on any files that need to be migrated to the new brick
20:40 guest1440 joined #gluster
20:40 gdavis33 do i need to commit first?
20:41 rahulcs joined #gluster
20:41 jbd1 gdavis33: not sure.  I haven't personally run a replace-brick yet
20:41 gdavis33 comforting :)
20:41 jbd1 gdavis33: sorry :) Probably wise to generate that filelist from the old brick anyway
20:42 jbd1 gdavis33: and if it's a replicated setup, it's probably fine to commit before running the find from the client-- anything missing will be self-healed
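A sketch of the client-side crawl and the comparison discussed above, reusing the brick paths from the paste (/d1 old, /brick1 new) and a placeholder client mount point /mnt/gv0:
    # from a client mount: stat everything so self-heal picks up anything missing
    find /mnt/gv0 -noleaf -print0 | xargs -0 stat > /dev/null
    # once healing settles, compare brick contents minus the .glusterfs tree
    find /d1 -name .glusterfs -prune -o -print | sed 's|^/d1||' | sort > /tmp/old.txt
    find /brick1 -name .glusterfs -prune -o -print | sed 's|^/brick1||' | sort > /tmp/new.txt
    diff /tmp/old.txt /tmp/new.txt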
20:44 gdavis33 jbd1: famous last words
20:44 jbd1 gdavis33: you have backups, right?  right?
20:44 jbd1 ;)
20:44 gdavis33 Um, no
20:45 gdavis33 it receives and removes 10s of thousands of files daily
20:45 gdavis33 backup would be invalid in 30 mins
20:46 gdavis33 thats what replicated HA is for right?
20:47 jbd1 not exactly, backups != redundancy
20:47 jbd1 but you can check all  your bricks to see whether you have N copies (replica-N)
20:59 MeatMuppet joined #gluster
21:13 badone joined #gluster
21:21 kkeithley joined #gluster
21:28 tdasilva joined #gluster
21:39 daMaestro joined #gluster
21:40 Humble joined #gluster
21:43 qdk joined #gluster
21:51 theron_ joined #gluster
22:02 fidevo joined #gluster
22:11 theron joined #gluster
22:25 jag3773 joined #gluster
22:28 pk joined #gluster
22:34 pk left #gluster
22:45 MeatMuppet left #gluster
22:52 tdasilva joined #gluster
23:07 diegows joined #gluster
23:08 B21956 joined #gluster
23:11 vpshastry1 joined #gluster
23:56 pvh_sa joined #gluster
