IRC log for #gluster, 2012-11-01

All times shown according to UTC.

Time Nick Message
00:07 daddmac2 this log entry:[2012-10-31 15:43:56.044319] E [afr-self-heal-common.c:2156:afr_self_heal_completion_cbk] 0-test-volume-replicate-4: background  meta-data data entry missing-entry gfid self-heal failed on /filepath
00:07 daddmac2 refers to "test-volume-replicate-4", but we don't have a volume named "test".
00:10 daddmac2 where would i find this reference?  i've done a "grep -r test *" in var/lib/glusterd, but there were no hits.  so, where should i look for this?
00:13 kevein joined #gluster
00:13 JoeJulian daddmac2: Which log is that in?
00:14 daddmac2 checking...
00:19 daddmac2 lots of data to grep...
00:24 seanh-ansca joined #gluster
00:24 daddmac2 LOTS...
00:41 daddmac2 i'm still looking, but i believe it was the mnt-[mountpoint].log .
00:47 daddmac2 joejulian: oops, i'm still looking, but i believe it was the mnt-[mountpoint].log .
00:53 JoeJulian Is this too much into how it works? http://joejulian.name:8080/v1/AUTH_joe/public/gluster.pdf
00:55 daddmac2 i'll check it out!
00:57 JoeJulian daddmac2: You're still on 3.3.0... I found several rpc and self-heal bugs most of which were fixed in 3.3.1. I would strongly recommend upgrading.
00:57 stefanha joined #gluster
01:06 daddmac2 joejulian: i'd really like to!  we're running centos, any guess when 3.3.1 will hit the repos, or is there a patch rpm?  i can make from source, but we don't have resources to maintain our environment if i do that.
01:06 daddmac2 if we did, i would be home now...  :P
01:07 chacken joined #gluster
01:17 JoeJulian @yum repo
01:17 glusterbot JoeJulian: kkeithley's fedorapeople.org yum repository has 32- and 64-bit glusterfs 3.3 packages for RHEL/Fedora/Centos distributions: http://goo.gl/EyoCw
01:18 JoeJulian ~yum repo | daddmac2
01:18 glusterbot daddmac2: kkeithley's fedorapeople.org yum repository has 32- and 64-bit glusterfs 3.3 packages for RHEL/Fedora/Centos distributions: http://goo.gl/EyoCw
01:19 daddmac2 cool!
01:22 daddmac2 joejulian: we'll look at this in the morning.  just one question, can this be applied as a rolling upgrade, or do i need a window?
01:24 JoeJulian It can be done on a live system. Do the servers first. I do one server at a time and then make sure that the self-heal has completed before doing the next. A remount will be required to upgrade the client though.
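A rough outline of that per-server sequence (package and service names assumed for the CentOS packages from kkeithley's repo; gv0 is a placeholder volume name):

    # on each server, one at a time
    yum update glusterfs glusterfs-server glusterfs-fuse
    service glusterd restart
    gluster volume heal gv0 info        # wait until nothing is pending before the next server
    # clients: update the package, then unmount and remount the volume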
01:30 rwheeler joined #gluster
01:35 daddmac2 we can do that thanks to ctdb.
01:37 daddmac2 reading gluster.pdf.  on "replicate" graph, Replicate and FUSE can live on server or client because the client (glusterfsd) does it?
01:41 daddmac2 correction, we distribute-replicate, so replicate and distribute live on where the client (fuse/glusterfsd) is running?
01:57 dmachi1 joined #gluster
02:03 benner_ joined #gluster
02:09 joscas joined #gluster
02:10 bala1 joined #gluster
02:12 sunus joined #gluster
02:16 daddmac2 left #gluster
02:36 plarsen joined #gluster
02:41 nodots joined #gluster
02:51 niv joined #gluster
02:59 kevein joined #gluster
03:07 sunus joined #gluster
03:10 stefanha joined #gluster
03:10 sunus joined #gluster
03:20 kkeithley joined #gluster
03:25 nightwalk joined #gluster
03:51 ika2810 joined #gluster
04:02 sunus joined #gluster
04:21 vimal joined #gluster
04:22 Humble_afk joined #gluster
04:55 ngoswami joined #gluster
06:56 stickyboy joined #gluster
06:58 vimal joined #gluster
07:09 stickyboy Anyone using link aggregation on Gigabit Ethernet?
07:11 ramkrsna joined #gluster
07:11 ramkrsna joined #gluster
07:14 sunus hi, how can i change glusterd's uuid?
07:14 sunus i create multiple vms from the same img, so i want to change their uuid
07:15 lkoranda joined #gluster
07:21 vimal joined #gluster
07:28 rgustafs joined #gluster
07:40 ctria joined #gluster
07:41 ekuric joined #gluster
08:04 Nr18 joined #gluster
08:05 stickyboy joined #gluster
08:07 tjikkun_work joined #gluster
08:10 faizan joined #gluster
08:13 pkoro joined #gluster
08:23 hagarth joined #gluster
08:25 spn joined #gluster
08:32 ctria joined #gluster
08:36 vimal joined #gluster
08:52 TheHaven joined #gluster
09:06 stefanha joined #gluster
09:11 dobber joined #gluster
09:13 mdarade1 joined #gluster
09:18 stefanha joined #gluster
09:27 faizan joined #gluster
09:39 Azrael808 joined #gluster
09:44 DaveS_ joined #gluster
09:53 duerF joined #gluster
10:01 sunus gluster uuid became 0000000000 why?
10:02 ndevos sunus: before creating your vm-template, you will want to stop glusterd and remove /var/lib/glusterd/glusterd.info, that file contains the UUID and will be created if missing
10:02 sunus and with new uuid?
10:03 ndevos yeah, a new random one
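In other words, something like this before capturing the template (a sketch; the init-script name is assumed for CentOS/Fedora):

    service glusterd stop
    rm -f /var/lib/glusterd/glusterd.info    # regenerated with a fresh UUID on next start
    # ... now snapshot the VM image; each clone gets its own UUID when glusterd starts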
10:03 tryggvil joined #gluster
10:07 mdarade1 joined #gluster
10:08 Nr18 joined #gluster
10:09 sunus thank you! i will do this right away
10:12 gbrand_ joined #gluster
10:14 sunus ndevos: i changed uuid of vms, but before that i already added some peers to the cluster, then after i changed uuid, i can not peer probe the vms.. said vm is already a part of another cluster, but i detached that vm
10:19 ndevos sunus: you'll need to stop glusterd, remove the wrong UUID file from under /var/lib/glusterd/peers and start glusterd again
10:23 sunus ndevos: that's what i did, but i may find the problem, thx! i will also need to empty the peer dir
10:23 ndevos sunus: great! just remember to start with a clean template next time
10:25 sunus ndevos: yeah, it worked, thank you
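For clones that had already been probed with the old UUID, the cleanup sunus describes amounts to roughly this on the clone (a sketch; the existing cluster member may also need its stale /var/lib/glusterd/peers entry for the old UUID removed):

    service glusterd stop
    rm -f /var/lib/glusterd/glusterd.info    # stale UUID
    rm -f /var/lib/glusterd/peers/*          # stale peer entries
    service glusterd start
    # then re-probe from an existing member:
    gluster peer probe <new-node>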
10:26 faizan joined #gluster
10:26 sunus ndevos: i do this to check a bug i found weeks ago.. http://community.gluster.org/q/can-not-create-new-volume-after-created-one-volume/  this one, the version i ran from source failed, i'm now trying the rpms.
10:26 glusterbot Title: Question: can not create new volume after created one volume. (at community.gluster.org)
10:29 ndevos sunus: make sure the directories for the bricks exist in advance, and do not have any glusterfs related xattrs when you create a volume with them
10:30 ndevos although, I would expect an error like: or a prefix of it is already part of a volume
10:30 glusterbot ndevos: To clear that error, follow the instructions at http://joejulian.name/blog/glusterfs-path-or​-a-prefix-of-it-is-already-part-of-a-volume/
10:30 ndevos glusterbot: I know! Tell sunus :)
10:32 sunus ndevos: i know :>
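For reference, the fix behind that glusterbot link boils down to clearing the leftover xattrs on the brick directory before reusing it (paraphrased from JoeJulian's post; /data/brick1 is a placeholder):

    setfattr -x trusted.glusterfs.volume-id /data/brick1
    setfattr -x trusted.gfid /data/brick1
    rm -rf /data/brick1/.glusterfs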
10:35 y4m4 joined #gluster
10:36 lh joined #gluster
10:36 lh joined #gluster
10:42 manik joined #gluster
10:47 sunus ndevos: it turned out that the version i installed from yum went smoothly, and the one i installed from source gets the bug..
10:48 sunus glusterfs, you gotta be kidding me!
10:54 Nr18 joined #gluster
10:58 tryggvil_ joined #gluster
11:01 tryggvil__ joined #gluster
11:05 mdarade1 joined #gluster
11:27 kkeithley1 joined #gluster
11:35 mdarade1 left #gluster
11:37 hagarth1 joined #gluster
11:40 bfoster joined #gluster
11:48 tryggvil joined #gluster
11:49 tryggvil joined #gluster
11:53 faizan joined #gluster
12:06 mdarade joined #gluster
12:19 quillo joined #gluster
12:20 psymax joined #gluster
12:21 psymax hello! is it possible to mount a gluster volume on two machines or more ?
12:22 sensei That's kind of the point of it :)
12:23 psymax that's cool. thanks :)
13:04 FU5T joined #gluster
13:06 lh joined #gluster
13:06 lh joined #gluster
13:07 faizan joined #gluster
13:33 mohankumar joined #gluster
13:39 robo joined #gluster
13:57 dmachi joined #gluster
14:00 tryggvil joined #gluster
14:01 stopbit joined #gluster
14:02 mdarade left #gluster
14:10 chouchins joined #gluster
14:18 atrius joined #gluster
14:21 wushudoin joined #gluster
14:35 JoeJulian Is this too much into how it works? What's missing for an "Intro to Gluster" presentation? http://joejulian.name:8080/v1/AUTH_joe/public/gluster.pdf
14:38 ndevos JoeJulian: not sure, depends on the audience I guess
14:38 semiosis two things i'd add... 1. early slide, What glusterfs is not... raid, on-disk format, ....
14:38 ndevos I've done an introduction once too, at a fedora devconf: http://people.redhat.com/ndevos/talks/Gluster-data-distribution_20120218.pdf
14:39 semiosis 2. mention that the cloud icon on slides 8 & 9 is the network connecting clients to servers, tcp/rdma
14:39 semiosis graph is both client xlator stack + server xlator stack
14:39 semiosis connected by network
14:46 JoeJulian ndevos: Your dht slide, it loops? Really? Does it read the trusted.glusterfs.dht from each brick+directory sequentially?
14:48 ndevos JoeJulian: yeah, thats just a for-loop to find out what brick contains the file - it's all in-memory, not requesting details over the network
14:49 ppradhan joined #gluster
14:49 JoeJulian Oh good, that's what I thought.
14:51 ndevos and it's a little simplified too, actually the hash-ranges are set in the xattrs of the directories on the bricks...
14:53 JoeJulian Right, then the brick is predicted, then checked, then the file is created if the trusted.glusterfs.dht matches the prediction, otherwise I suppose it does do a sequential network lookup.
14:54 TheHaven joined #gluster
14:54 ndevos I guess it does, although I dont think it needs to be done sequential
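Those per-directory hash ranges can be inspected straight off a brick, e.g. (paths are placeholders):

    getfattr -n trusted.glusterfs.dht -e hex /data/brick1/some/directory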
15:13 purpleidea joined #gluster
15:13 purpleidea joined #gluster
15:21 zr joined #gluster
15:21 zr hi
15:21 glusterbot zr: Despite the fact that friendly greetings are nice, please ask your question. Carefully identify your problem in such a way that when a volunteer has a few minutes, they can offer you a potential solution. These are volunteers, so be patient. Answers may come in a few minutes, or may take hours. If you're still in the channel, someone will eventually offer an answer.
15:22 zr probe peer fails to probe
15:22 zr 0-glusterd: Received CLI probe req 192.168.23.129 24007
15:22 zr 0-glusterd: Unable to find hostname: 192.168.23.129
15:22 zr [root@node1 ~]# gluster peer probe node2
15:22 zr Probe unsuccessful
15:22 zr Probe returned with unknown errno 107
15:41 blendedbychris joined #gluster
15:41 blendedbychris joined #gluster
15:43 zr Probe failed with op_ret -1 and op_errno 107
15:43 daddmac1 joined #gluster
15:52 aliguori joined #gluster
15:59 seanh-ansca joined #gluster
16:03 bambi2 joined #gluster
16:13 UnixDev is it better to access glusterfs on a client-only node rather than serving nfs directly from a node that also serves bricks?
16:14 JoeJulian zr: Check the other glusterd log as well. I'm not sure what the "unable to find hostname" error should mean with an ip address. :/
16:15 semiosis UnixDev: mounting nfs from localhost is dangerous, if that's what you mean
16:15 zr nothing in other node's log but i can see packets coming from the node with tcpdump
16:16 semiosis UnixDev: otherwise please clarify
16:16 UnixDev semiosis: what about mounting from a brick?
16:16 zr 2nd node is succesfully peered but not the 1st one
16:16 semiosis ~glossary | UnixDev
16:16 glusterbot UnixDev: A "server" hosts "bricks" (ie. server1:/foo) which belong to a "volume"  which is accessed from a "client"  . The "master" geosynchronizes a "volume" to a "slave" (ie. remote1:/data/foo).
16:16 semiosis "node" is ambiguous, or at least confusing to me
16:17 UnixDev mounting volume via nfs from a server that also hosts bricks
16:17 UnixDev this is being mounted by a server that has no gluster, only access through nfs
16:18 UnixDev what I'm asking is if its better for this to be mounted from a server that hosts bricks or better from a server that runs gluster but hosts no bricks
16:19 semiosis well i think less hops is usually better, all else being equal
16:19 semiosis s/all else being/when everthing else is/
16:19 glusterbot What semiosis meant to say was: well i think less hops is usually better, when everthing else is equal
16:20 UnixDev semiosis: would it be better in dealing with failure of a brick?
16:20 Eco_ joined #gluster
16:20 UnixDev or rather, failure of server that hosts bricks
16:21 purpleidea joined #gluster
16:21 purpleidea joined #gluster
16:22 Eco_ a short infra meeting this morning is happening in #gluster-meeting for anyone who would like to attend
16:26 plarsen joined #gluster
16:28 Mo___ joined #gluster
16:38 purpleidea joined #gluster
16:39 ndevos JoeJulian: btw, you were interested in glusterfs support for wireshark, are there any use-cases that you would like to see in a presentation?
16:46 jiffe98 is it common or even recommended to have 3 replicas instead of two?
16:48 semiosis jiffe98: people do but idk how common it is
16:50 ekuric left #gluster
16:51 purpleidea joined #gluster
16:51 purpleidea joined #gluster
16:55 spn joined #gluster
17:02 Bullardo joined #gluster
17:10 faizan joined #gluster
17:13 jiffe98 I see, does replication slow down much between 2 and 3 replicas?
17:14 nick5 joined #gluster
17:15 jiffe98 I see 3.3 has the concept of a quorum so I'm thinking having a 3rd copy isn't a bad idea, plus if one node is down I am still redundant
17:15 jdarcy Quorum does work better with N>2
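A hedged example of the client-side quorum knob being discussed (option name as it appears in 3.3's AFR; gv0 is a placeholder):

    gluster volume set gv0 cluster.quorum-type auto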
17:16 nick5 had a question about healing on a simple 2 replica setup.
17:17 jiffe98 yeah, that's why I'm leaning towards 3 replicas
17:19 manik joined #gluster
17:21 nick5 client connects to gv01 (replicate 2) via fuse and writes a file.  one server gets disconnected from network in middle of write.  client eventually finishes writing.
17:22 nick5 the disconnected server connects back to network.  heal info shows the file is not 'right', and overnight, tries to heal the file many times, but nothing changes.
17:22 nick5 file on server1 (temporary disconnected) is never fixed.  file on server2 is correct.
17:25 jdarcy nick5: Anything in the logs to indicate why self-heal is failing?
17:28 UnixDev nick5: how did you check the file was correct or not?
17:28 nick5 looking through the logs now, but not seeing anything.
17:28 nick5 on server1, doing an md5sum of the file and the file size vs what is on server2.
17:29 jdarcy What does "getfattr -d -e hex -m . $file" show on each brick?
17:29 UnixDev nick5: what happens when you try to heal the vol via gluster volume heal gv01 full ?
17:29 jiffe98 here's another question, I'm looking at small random reads/writes to files anywhere from 0-50MB in size, would I be better off with more machines or less machines with more disks?
17:30 Triade joined #gluster
17:30 nick5 on server2 (which is correct:)
17:31 nick5 security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000 trusted.afr.gv01-client-0=0x0000a4ad0000000000000000 trusted.afr.gv01-client-1=0x000000000000000000000000 trusted.gfid=0xe5c7ef08098e4b4192f0b1a2a5f5697c
17:32 nick5 on server1 (which is wrong)
17:32 nick5 security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
17:32 nick5 trusted.afr.gv01-client-0=0x000000090000000000000000
17:32 nick5 trusted.afr.gv01-client-1=0x000000090000000000000000
17:32 nick5 trusted.gfid=0xe5c7ef08098e4b4192f0b1a2a5f5697c
17:32 Triade1 joined #gluster
17:32 jdarcy Ahhh, split brain.
17:32 jdarcy Actually it's not, but the code thinks it is.
17:32 nick5 but split-brain doesn't show anything.
17:34 UnixDev jdarcy: this worries me… can this happen to my files too? how can I know if the split files are not in the split-brain log?
17:34 jdarcy server2 (a.k.a. client-1) thinks that there are 42,157 writes which server1 (a.k.a. client-0) didn't get.  At the same time, server1 thinks that there are 9 writes that server2 didn't get.  They weren't processed locally either, which is why it's not "classic" split brain, but it still causes self-heal to fail.
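For reference, that reading comes straight from the xattrs pasted above: each trusted.afr.<vol>-client-N value packs three 32-bit network-order counters for pending data, metadata and entry operations blamed on client N.

    # server2's copy:  trusted.afr.gv01-client-0 = 0x0000a4ad 00000000 00000000
    #                  0xa4ad = 42157 pending data ops that server1 (client-0) missed
    # server1's copy:  client-0 and client-1 both start 0x00000009 = 9 pending ops blamed on each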
17:34 nick5 when i issue a heal for the full volume, it doesn't seem to do anything either -- at least it doesn't fix the problem.
17:35 jdarcy What version is this?
17:35 nick5 3.3.1
17:35 * jdarcy should always ask that first.
17:37 nick5 in this case, the client did a mount to server1 via fuse.  i was testing to see what would happen when the server it connected to disappeared from the network.
17:38 nick5 the good part was that the write eventually finished up on server2.  the bad part is i'm in this situation where self-heal is not working.
17:40 jdarcy Just a sec, looking at commits.
17:40 TSM2 joined #gluster
17:41 jdarcy I don't see anything, let me try this case on master.
17:42 nick5 cool, thanks.
17:46 jdarcy Hm.  Can't reproduce on master.  Weird.
17:47 nick5 the glusterd process was never shutdown on server1.  i simply logged into it via a private network and 'ifconfig'd down the public network which it is using for all communication.
17:49 nick5 servers are centos 6.2 (64bit), client is fedora 17 (64bit)
17:49 nick5 all gluster rpms were pulled from the gluster repos.
17:51 TSM2 is there a way to gauge the balance level that gluster will achieve when writing files to a cluster
17:52 TSM2 what i mean by that is if i write 1000 files what proportion of those will go to each node?
17:53 jdarcy The only thing I see different between 3.3.1 and master that might be relevant is f153c835807ac31006ba690b1deb47b20b51bc83, but even that doesn't quite fit.
17:53 y4m4 joined #gluster
17:54 jdarcy TSM2: If they have 1000 different names (last path component) then you should get very close to even distribution.  If they have the same name in different directories, they'll all converge on the same brick.
17:55 jdarcy nick5: The good news is that I'm 99% sure I know how you can fix it.  The bad news is that I'm not sure how you can prevent it (because I'm not sure how it could have happened in that version).
17:55 TSM2 jdarcy: ahh, so it's possible under certain workloads to end up with a heavily loaded brick and a fairly empty one
17:56 oneiroi joined #gluster
17:56 jdarcy TSM2: Yes, in 3.3 at least.  I have a patch that makes that slightly better (lots of scientific apps tend to have same-named files in many directories).  Let me check its status.
17:57 nick5 jdarcy: cool.  what's the fix?
17:59 TSM2 well we store images in directories, unlikely to be a big problem for us as all files are unique, just in subdirectories, ie 12345678.jpg will end up in a folder path /123/234/12345678.jpg, this should take us well into the billions before we hit the directory limit in the first folder
18:00 jdarcy nick5: Removing the file on server 1 would almost certainly work (it will be rebuilt from server2).  It might be informative to see if "setfattr -x trusted.afr.gv01-client-1 $file" works first.
18:00 jdarcy TSM2: Right, shouldn't be a problem for you.
18:01 nick5 run the setfattr on server1 (client-0) or server2 (client-1)?
18:01 jdarcy nick5: On server1.
18:03 TSM2 great, i'm also wondering, there seems to be not much info re building bricks up. if i was looking for throughput i would be better to put more boxes in, so multiple 2U 12-bay, each its own brick, but if i just wanted storage i could just put in two large 4U boxes and put additional drives in blocks and add them as subbricks
18:04 TSM2 i know there is an issue around different-sized bricks so if i start with my first being 30TB should i always keep the others 30TB
18:05 nick5 ran the setfattr on server1 to remove the xattr, and then ran the full heal from server2, and nothing yet.
18:05 jdarcy TSM2: Correct on same brick size.  :(  As for the other, there's a never-ending debate about small servers vs. large servers.  I'm generally on the small-server side, but there's no substitute for testing the real workload.
18:05 Fabiom joined #gluster
18:06 TSM2 yup doing things with little budget for lots of testing servers
18:08 nick5 the output of 'gluster volume heal gv01 info healed' shows that the file was healed on both bricks.  server1 @ 11:06:29, and server2 @ 11:06:14.
18:08 jdarcy TSM2: For media files, I suspect either would work and fewer/larger servers would be easier to manage.
18:09 TSM2 large bigblock reads and writes, not very taxing
18:09 jdarcy nick5: So does that mean the file's OK now?
18:10 nick5 nope.  still in the same situation.
18:10 nick5 also, i just removed it locally from server1, and kicked off the heal from server2, and it's not doing anything.
18:11 nick5 'heal gv01 info' shows that file in each brick.
18:11 jdarcy nick5: If you *remove* it, you need to remove the link in .glusterfs too.
18:11 nick5 i thought there was something with the .glusterfs directories too.
18:11 tryggvil joined #gluster
18:12 jdarcy Something like .../.glusterfs/e5/c7/e5c7ef08098e4b4192f0b1a2a5f5697c
18:12 nick5 yup got it.
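Putting jdarcy's steps together, the manual cleanup on the bad brick looks roughly like this (brick path is a placeholder; the gfid comes from the getfattr output earlier):

    # on server1's brick
    rm /data/brick1/path/to/file
    rm /data/brick1/.glusterfs/e5/c7/e5c7ef08098e4b4192f0b1a2a5f5697c
    # then trigger healing, e.g.
    gluster volume heal gv01 full
    # ...or stat the file through a client mount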
18:13 jdarcy https://bugzilla.redhat.com/show_bug.cgi?id=836101 and/or https://bugzilla.redhat.com/show_bug.cgi?id=825559 might be relevant here.  Looking at them myself now.
18:13 glusterbot Bug 836101: urgent, unspecified, ---, pkarampu, ASSIGNED , Reoccuring unhealable split-brain
18:13 glusterbot Bug 825559: urgent, urgent, ---, divya, ASSIGNED , [glusterfs-3.3.0q43]: Cannot heal split-brain
18:14 nick5 removed the file, removed the link, and kicked off a heal, and still nothing.  the file now only exists on server2.
18:15 jdarcy Is the self-heal daemon not even managing to get started?  Check the dates/contents of /var/log/glusterfs/glustershd.log to see.  Also, does anything change if you try to stat the file through a client?
18:15 jdarcy OK, 836101 is for device files and 825559 is the .glusterfs issue, so false alarm there.
18:16 faizan joined #gluster
18:16 nick5 the daemon seems to run, but is throwing out errors.
18:16 nick5 E [afr-self-heald.c:685:_link_inode_update_loc] 0-gv01-replicate-0: inode link failed on the inode (00000000-0000-0000-0000-000000000000)
18:16 jdarcy What kind of errors?
18:17 jdarcy Null GFID?  Hm.
18:18 nick5 the files are in the root dir on the bricks.
18:19 nick5 tried a stat from the client and still nothing on server1.
18:21 nick5 what i had to do last time was umount the client.  remount it to server1 and then for some reason, server1 fixed itself.
18:21 nick5 it's in the process of doing that right now.
18:21 jdarcy Is this a native-protocol client, or NFS?
18:21 nick5 native.
18:24 TSM2 what happens in gluster if you hard link a file, does it then have to make sure all other hardlinked files reside on the same brick
18:26 nick5 it's done.  also what's weird is that the mtime/ctime on the file changes to when the heal finishes.
18:27 jdarcy nick5: It's done?  What changed?
18:28 jdarcy Also, it seems like a bug that the mtime/ctime would change.
18:28 nick5 from the client, i did a umount, then mount, and then a stat or a simple ls on the dir, and that kicked off the heal.
18:29 nick5 i'm digging myself into an even deeper ditch now.
18:29 nick5 i went back to server1 and removed the file and the link, and now my client doesn't see anything in that dir, even though it exists on server2.
18:30 jdarcy You deleted it since the remount?
18:30 nick5 yes.
18:31 nick5 and once again, umount and remount and ls and it's starting to rebuild it on server1 again, correctly.
18:31 nick5 and of course the mtime is going to change again.
18:31 jdarcy OK, we're dealing with at least three issues (not counting mtime/ctime) here.
18:32 jdarcy (1) File got into this bad unhealable state.
18:32 jdarcy (2) Self-heal doesn't occur until an unmount/remount.  (I've actually seen this, but thought it was gone.)
18:33 jdarcy (3) File does not appear at all if it's deleted on one side after remount.
18:33 jdarcy Does that capture things so far?
18:35 jdarcy (3) might be the same as http://review.gluster.org/#change,4142
18:35 glusterbot Title: Gerrit Code Review (at review.gluster.org)
18:35 nick5 correct to all 3.
18:37 nick5 regarding (1), i can reproduce this issue consistently.
18:37 nick5 i was trying to figure it out yesterday, gave up, and started anew today, and had the same problem.
18:38 Bullardo joined #gluster
18:38 jdarcy So you're using "ifconfig down" to simulate failures, not killing glusterfsd?
18:38 nick5 correct.
18:38 tc00per left #gluster
18:41 jdarcy That shouldn't make a difference for (1) but might be more relevant for (2).
18:45 nick5 i could try connecting to server2 and doing the same tests (i.e. ifconfig server1 down) and see if that resolves the 2nd issue.
18:46 nick5 my thought was that it should work in this situation as server1 is back up and the client is technically connected to both.
18:49 jiffe98 I'm guessing with small random IO it doesn't really matter where the disks are, just the number of disks total
18:50 jiffe98 in that case having more machines might be beneficial with regards to other resources like ram/cache/network
18:57 jdarcy nick5: After the mount, it really doesn't matter who you connected to initially.  That server is only used to fetch the volfile that tells you about the others.
18:58 semiosis ,,(mount-server)
18:58 glusterbot I do not know about 'mount-server', but I do know about these similar topics: 'mount server'
18:58 semiosis ,,(mount server)
18:58 glusterbot (#1) The server specified is only used to retrieve the client volume definition. Once connected, the client connects to all the servers in the volume. See also @rrnds, or (#2) Learn more about the role played by the server specified on the mount command here: http://goo.gl/0EB1u
18:58 jdarcy nick5: In fact, you can mount off a server that's not even serving any of the volume you're mounting.
19:02 Gilbs joined #gluster
19:02 nick5 ah yes.  just need to pull the volfile.
19:03 nick5 so issue 2 is weird as well.
19:07 jdarcy Not as weird as you might think, if it has to do with some kind of stale state (which obviously isn't there for a new mount).
19:07 Gilbs Howdy all, Security question here:  Can I get away with using BC wipe via samba on windows, would gluster and the replicates all re-write the file (7-passes)?
19:10 jdarcy Assuming some sort of fsync (or O_SYNC) equivalent, yes.
19:12 Jippi joined #gluster
19:15 Gilbs Great, thank you.
19:23 phobos27 joined #gluster
19:26 JoeJulian ndevos: regarding wireshark. A presentation that shows several connectivity problems and how they can be diagnosed would probably be good. When I use it I'm usually trying to isolate some bug I've found.
19:27 Gilbs left #gluster
19:28 DaveS_ joined #gluster
19:33 seanh-ansca joined #gluster
19:42 purpleidea joined #gluster
19:42 purpleidea joined #gluster
19:47 pkoro joined #gluster
20:01 TSM joined #gluster
20:11 UnixDev I'm having some problem where the nfs log shows split-brain errors but the log command does not, any ideas? http://fpaste.org/Ujcg/
20:11 glusterbot Title: Viewing Paste #248586 (at fpaste.org)
20:14 UnixDev also getting some strange errors that may be related to failover http://fpaste.org/3rKo/
20:14 glusterbot Title: Viewing Paste #248587 (at fpaste.org)
20:18 badone joined #gluster
20:26 JoeJulian UnixDev: Check those files on the bricks with "getfattr -n trusted.gfid" and see if they do, indeed, mismatch.
20:31 UnixDev JoeJulian: matches… so whats wrong with the nfs log?
20:38 JoeJulian Not sure...
20:39 JoeJulian Dammit! I thought I had turned off automatic updates of the gluster packages. Fuck!
20:40 JoeJulian kkeithley: We should think about adding some sort of self-heal checks before killing all the glusterfsd processes and causing split-brain during the rpm upgrades.
20:40 semiosis running a distributed cluster filesystem is hard
20:41 davdunc joined #gluster
20:41 davdunc joined #gluster
20:41 JoeJulian :)
21:16 circut joined #gluster
21:20 Humble_afk joined #gluster
21:34 Bullardo joined #gluster
21:55 blendedbychris joined #gluster
21:55 blendedbychris joined #gluster
22:13 ppradhan left #gluster
22:15 erik49 joined #gluster
22:16 nick5 jdarcy: tried a few other things and interestingly, it must be still in split brain.
22:16 nick5 any new files or directories created go only to 1 brick and not the other, but are eventually healed.
22:16 nick5 just the one old file that was interrupted is never healed properly.
22:17 JoeJulian nick5: Have you remounted?
22:18 nick5 nope.
22:18 JoeJulian @query never reconnects
22:18 glusterbot JoeJulian: No results for "never reconnects."
22:19 nick5 really?
22:19 erik49 Is it possible to specify that a file get written to a particular node/brick?
22:19 JoeJulian bug 846619
22:19 glusterbot Bug https://bugzilla.redhat.com:443/show_bug.cgi?id=846619 urgent, high, ---, vbellur, ASSIGNED , Client doesn't reconnect after server comes back online
22:20 JoeJulian erik49: No
22:20 erik49 will that ever be possible?
22:22 nick5 thanks. good to know.
22:23 JoeJulian erik49: Maybe, but it's certainly not on the current ,,(roadmap). It kind-of falls outside of the design that a distributed hash table provides and would be less efficient.
22:23 glusterbot erik49: See http://gluster.org/community/documentation/index.php/Planning34
22:23 JoeJulian If you don't care what filename you use, you could reverse-calculate a filename that would reside on the brick you wish it to.
22:24 JoeJulian Adding bricks and rebalancing would screw that up though.
22:25 erik49 how would I reverse-calculate?
22:26 erik49 hmm
22:27 erik49 Just to put my use case into context, I'd like the output of a write to be the input of another write and avoid all the unnecessary network i/o of writing to another node
22:28 erik49 Sounds like thats not possible to do with gluster though
22:28 JoeJulian Ah, that's not really going to happen... right.
22:28 erik49 Are there any other distributed filesystems that you know of that could do this?
22:29 JoeJulian ntfs on netapp
22:31 JoeJulian The problem comes from the fact that posix commands don't have a "copy". Even cp opens two fds reads/writes.
22:32 hattenator joined #gluster
22:33 erik49 Damn.  Well thats a bummer, would have saved a lot on network saturation.
22:33 erik49 I'll check out ntfs on netapp, thanks.
22:42 nick5 should a gluster client (fuse) have any effect on a heal process, like preventing it from working correctly?
22:43 nick5 the issue that jdarcy was helping me seems to be resolved if i disconnect the client that did the writes and then let the system self-heal.
22:48 JoeJulian nick5: It /shouldn't/ but I did encounter that problem with versions prior to 3.3.1
22:50 erik49 JoeJulian, how about finding which node/brick a file is on, so I can at least save on the read bandwidth?
22:51 erik49 is that do-able?
22:55 Bullardo joined #gluster
22:56 Nr18 joined #gluster
22:57 TSM joined #gluster
22:57 JoeJulian yes... getfattr -n trusted.glusterfs.pathinfo
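For example, queried through the client mount (path is a placeholder); the reply names the brick(s) holding the file:

    getfattr -n trusted.glusterfs.pathinfo /mnt/gluster/path/to/file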
22:58 Triade joined #gluster
23:03 semiosis JoeJulian: erik49: i believe "custom layouts" is the name of the feature you're looking for
23:05 semiosis @custom layout
23:05 semiosis hehe, didnt think so
23:08 erik49 JoeJulian, cool, thanks!
23:08 erik49 semiosis, confused? is that supposed to make the bot say something?
23:08 erik49 sorry, i meant to say, I'm confused, lol
23:08 semiosis @mount server
23:08 glusterbot semiosis: (#1) The server specified is only used to retrieve the client volume definition. Once connected, the client connects to all the servers in the volume. See also @rrnds, or (#2) Learn more about the role played by the server specified on the mount command here: http://goo.gl/0EB1u
23:09 semiosis yeah i thought i might have given the bot a link about "custom layouts" at one point, but apparently not
23:09 semiosis trying to find mention of it in the channel logs now, stand by
23:09 erik49 cool
23:10 semiosis topic
23:10 semiosis oops
23:10 JoeJulian Not sure if it was in chat. jdarcy might have mentioned it at Red Hat Summit.
23:10 erik49 should the above say "Once connected, the client connects to all the VOLUMES in the SERVER?"
23:10 JoeJulian nope
23:10 semiosis JoeJulian: he did, but i'm pretty sure i bugged him about it here too
23:11 erik49 oh crap
23:11 erik49 sorry I haven't used gluster in a while
23:11 erik49 lol
23:11 JoeJulian erik49: Once the client retrieves the volume definition, it connects to every server that's part of that definition.
23:11 erik49 I forgot that volumes is a bucket of servers ;(
23:12 semiosis should say bricks instead of servers
23:17 erik49 Ah, thanks
23:18 erik49 semiosis, I have to leave my lab but can I pm you my e-mail address in case you find the custom layout information?
23:19 semiosis sorry didn't find it, but generally i prefer chatting here so it gets logged & others can see
23:20 erik49 Okay, thanks a lot for the help!
23:20 semiosis yw
23:25 Triade1 joined #gluster
23:36 NuxRo hi guys, johnmark was talking about an "API gw" for glusterfs. Where can i find it and docs about it?
23:38 semiosis @glupy
23:38 semiosis NuxRo: can you give any more context?  johnmark talks about a lot of things :)
23:38 semiosis hehe
23:39 semiosis from "API gw" i could imagine you mean glupy or UFO
23:39 semiosis but maybe i'm not even close with those guesses
23:39 NuxRo semiosis: yeah, i meant to ask him, totally forgot
23:39 NuxRo i don't think it's ufo though
23:39 NuxRo I'll check out glupy
23:40 semiosis https://github.com/jdarcy/glupy
23:40 glusterbot Title: jdarcy/glupy · GitHub (at github.com)
23:40 semiosis idk what the state of that is currently
23:41 NuxRo looks beta-ish
23:41 semiosis what are you looking for with "API gw" ?
23:44 NuxRo well, in my head it was this cool accessible API that you could wrap a web gui around
23:44 NuxRo deploying ovirt just to have a gui is overkill, though i understand this is being worked on
23:45 semiosis @gmc
23:45 glusterbot semiosis: The Gluster Management Console (GMC) has been discontinued. If you need a pretty gui to manage storage, support for GlusterFS is in oVirt.
23:45 semiosis idk, not a fan of GUIs for glusterfs personally
23:45 NuxRo yeah, knew that already :)
23:46 NuxRo semiosis: managers love it
23:46 semiosis mehnegers ;)
23:46 semiosis s/mehnegers/mehnagers/
23:46 glusterbot What semiosis meant to say was: mehnagers ;)
23:46 * semiosis is one though
23:47 JoeJulian hehe
23:47 NuxRo lol
23:47 semiosis what i'd love to see, though don't have enough time or staff to do it, would be awesome monitoring of glusterfs with metrics data going into graphite/ganglia
23:47 semiosis or similar
23:48 NuxRo yeah, there's a lot glusterfs could use, but
23:48 NuxRo easy does it
23:48 semiosis control is the hard part, but collecting metrics could be pretty easy, just needs time, and could deliver some real pretty looking graphs
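The built-in counters such a collector could scrape (commands as in 3.3; the graphite/ganglia plumbing is left out, gv0 is a placeholder):

    gluster volume profile gv0 start
    gluster volume profile gv0 info        # per-brick latency and fop counts
    gluster volume top gv0 read-perf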
23:48 JoeJulian NuxRo: There is an api. It's done through shell commands to the gluster cli. ;)
23:48 semiosis JoeJulian: got a factoid for that?
23:49 JoeJulian Nah, try "gluster help" :P
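In practice that "API" is just scripting the CLI, roughly like this (gv0 is a placeholder; --xml output exists for many commands in 3.3 but verify for your version):

    gluster peer status
    gluster volume info gv0
    gluster volume status gv0 --xml    # machine-readable output a web GUI could parse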
23:49 semiosis JoeJulian: oh i thought you meant your json thing
23:49 JoeJulian No, that's just a pretty interface to swift.
23:49 semiosis ok, i think i've done enough damage for the day
23:49 Triade joined #gluster
23:50 JoeJulian I need to do a lot more, but I'm out of time.
23:50 semiosis hopefully i havent confused NuxRo (too much)
23:50 TSM is it better to subdivide up a server into smaller subbricks, ie if i have two servers each with 30TB RAW, if i was to create 3 sub-bricks of 10TB each and then join them in pairs for redundancy, i was thinking this would allow easier expansion and also you could expand the cluster by 1 server and just rebalance
23:50 NuxRo I'd like to see a webmin module for glusterfs
23:50 NuxRo maybe I should bug Jamie Cameron about it..
23:50 * JoeJulian beats NuxRo about the head and shoulders.
23:50 JoeJulian webmin...
23:51 semiosis i interviewed someone for jr. sysadmin role who was talking all like "I've been using linux for sooo long, yeah sure apache, been there, done that..." asked him a couple questions... "Well actually I used webmin to manage the linux servers."
23:52 JoeJulian You wouldn't really have redundancy unless you do something like http://joejulian.name/blog/how-to-expand-glusterfs-replicated-clusters-by-one-server/
23:52 glusterbot Title: How to expand GlusterFS replicated clusters by one server (at joejulian.name)
23:52 NuxRo JoeJulian: webmin can be really great, depends how you use it. ;-)
23:53 NuxRo btw, johnmark demo-ed this today https://github.com/joejulian/ufopilot
23:53 glusterbot Title: joejulian/ufopilot · GitHub (at github.com)
23:53 NuxRo very nice
23:53 semiosis NuxRo: tbh though webmin control of glusterfs is one of the more sane/reasonable suggestions i've heard on this topic
23:53 JoeJulian thanks.
23:53 gbrand_ joined #gluster
23:53 NuxRo JoeJulian: there's one thing that bothers me (to quote Columbo), why is that password in clear text? :)
23:53 JoeJulian WHAT?
23:54 JoeJulian It's a password field. If it's in clear text then someone has a broken browser.
23:54 TSM JoJulian: i was thinking what you posted, ive seen that
23:54 NuxRo JoeJulian: user_gv0_tommy=demo .admin
23:55 NuxRo i would be more comfortable with some sort of hash at least
23:55 TSM but thinking that splitting the array from 30TB to 10TB will allow for faster fsck if required etc, obv it makes config a little more complicated
23:55 JoeJulian Oh, you mean in the config file.
23:55 NuxRo JoeJulian: yeah, sorry, should've been more specific
23:55 JoeJulian That's a swift thing. If you use tempauth, it's cleartext in the conf file. If you use keystone, it's not.
23:55 semiosis later all
23:56 JoeJulian later
23:56 * JoeJulian needs to go catch his train too.
23:56 NuxRo ttyl
23:57 NuxRo semiosis: are you in a position to write a webmin module (aka do you know perl)?
23:58 NuxRo it might actually help spread glusterfs
23:59 TSM it may be useful but these systems should not be used by people that don't touch the cli
