IRC log for #gluster, 2012-11-30


All times shown according to UTC.

Time Nick Message
00:03 cyberbootje joined #gluster
00:06 hattenator joined #gluster
00:15 plarsen joined #gluster
01:07 yinyin joined #gluster
01:30 lng joined #gluster
01:41 yinyin joined #gluster
01:44 kevein joined #gluster
02:06 lng joined #gluster
02:08 lng JoeJulian: Hello! How do you manage entries like <gfid:d259e11a-a91d-43bd-ac32-aa5e9fe2ce60> returned by `info heal-failed`?
02:16 bharata joined #gluster
02:20 sunus joined #gluster
03:13 jayeffkay joined #gluster
03:42 sunus joined #gluster
03:53 Humble joined #gluster
04:02 jiffe1 joined #gluster
04:02 Psi-Jack joined #gluster
04:06 sgowda joined #gluster
04:19 yinyin joined #gluster
04:36 jayeffkay_ joined #gluster
04:36 jayeffkay_ left #gluster
04:43 mtanner joined #gluster
04:58 yinyin joined #gluster
05:07 ankit9 joined #gluster
05:11 genewitch so what could be stopping this from working?  0-mgmt: failed to fetch volume file (key:/eph-vol1)
05:12 genewitch i'm trying to mount using mount -t glusterfs hostname:/eph-vol1 /glu-eph/
05:13 genewitch i can mount the gluster stuff on the two gluster servers
05:15 genewitch here's the gluster volume info all: http://pastie.org/5456228
05:15 glusterbot Title: #5456228 - Pastie (at pastie.org)
05:15 genewitch are there ACLs or something?
05:24 genewitch Oh, i had to use glusterfs --volume-id= --volfile-server=
05:25 genewitch is it because i had two shares?
05:26 mohankumar joined #gluster
05:30 yinyin joined #gluster
05:34 mohankumar joined #gluster
05:39 genewitch actually that didn't work :-(
05:39 genewitch please, anyone?
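A rough sketch of the client-side checks for this mount failure; the hostname, volume name, and mountpoint are the ones from the log, everything else is illustrative:

    telnet hostname 24007                                    # glusterd must be reachable to fetch the volfile
    mount -t glusterfs hostname:/eph-vol1 /glu-eph/          # normal fuse mount
    glusterfs --volfile-server=hostname --volfile-id=eph-vol1 /glu-eph/   # roughly equivalent direct invocation
    tail /var/log/glusterfs/glu-eph.log                      # the client log usually says why the fetch failed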
05:40 mohankumar joined #gluster
05:46 mohankumar joined #gluster
05:49 GLHMarmot That sounds a bit like a problem I had where the address of my bricks was on a private subnet.
05:49 jayeffkay_ joined #gluster
05:50 GLHMarmot I had to switch the bricks to use a public, or at least an accessible, subnet.
05:50 GLHMarmot Just a guess.
05:50 raghu joined #gluster
05:50 jayeffkay_ How well does glusterfs cope with directories with millions of files?
05:55 mohankumar joined #gluster
05:55 shireesh joined #gluster
05:57 sripathi joined #gluster
06:00 genewitch GLHMarmot: oh.
06:00 genewitch that's no good though because i have to pay for public transfer
06:00 genewitch can't i tell glusterd to allow all inbound connections?
06:12 hagarth joined #gluster
06:15 mohankumar joined #gluster
06:18 bala joined #gluster
06:19 genewitch can i fix this with geo-replication?
06:25 mohankumar joined #gluster
06:27 Bullardo joined #gluster
06:28 genewitch [2012-11-29 22:28:04.438202] W [socket.c:1512:__socket_proto_state_machine] 0-glusterfs: reading from socket failed. Error (Transport endpoint is not connected), peer (50.112.5.16:24007)
06:28 genewitch that's the error, so i think it is because it's behind a weird NAT
06:29 genewitch why does gluster care when apache doesn't
06:30 genewitch [2012-11-30 06:30:15.469019] E [rpcsvc.c:491:rpcsvc_handle_rpc_call] 0-glusterd: Request received from non-privileged port. Failing request
06:36 genewitch i tried option server.allow-insecure on
06:36 genewitch but that didn't seem to fix it either
06:36 genewitch (from this channel on nov 3rd)
06:41 JoeJulian genewitch: you can use split-brain dns to allow connections to either address by hostname.
06:43 JoeJulian Oh, wait... you're NATted? I've never seen anybody do that successfully.
06:45 JoeJulian You'd have to make sure the right ports are always forwarded to the right place and, yeah, you'll have to set rpc-auth.ports.insecure on.
06:48 JoeJulian lng: I deleted them and then did a gluster volume heal full. They've never come back since.
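What JoeJulian describes amounts to roughly the following; the brick path and volume name are examples, the gfid is the one lng quoted:

    # on each brick listing the entry, remove the gfid file (back it up first if unsure)
    rm /data/brick1/.glusterfs/d2/59/d259e11a-a91d-43bd-ac32-aa5e9fe2ce60
    # then trigger a full self-heal so anything still valid is recreated
    gluster volume heal <volname> full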
06:50 JoeJulian genewitch: "why does gluster care when apache doesn't?" Because apache doesn't allow you to create chroot binaries and execute them. Gluster has that very lame attempt at making sure the software that's connecting is owned by root. That'll get better, but that's what there is for now.
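For reference, the two knobs being discussed are set roughly like this (volume name is an example); whether that is enough through NAT is another matter:

    # per volume: let bricks accept clients connecting from unprivileged ports
    gluster volume set eph-vol1 server.allow-insecure on
    # for glusterd itself, add to /etc/glusterfs/glusterd.vol and restart glusterd:
    #   option rpc-auth-allow-insecure on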
06:52 JoeJulian jayeffkay_: It handles that as well as you could expect when there's about 8 network round trips per file in your typical method of listing. If you JUST read the directory, it won't be too bad.
06:53 JoeJulian jayeffkay_: If, however, you're reading the stats on every dirent so you can determine the file type, or size, or times, it'll take the multiple of the RTT for all those transactions.
06:54 jayeffkay_ JoeJulian: I won't often be listing, my main concern is inode limits, i don't know if they apply, and increased file access latency
06:55 JoeJulian You're fine for inodes as they're uuids. That latency is the real killer. If you do a plan ls, with none of the normal default features like color or decorators, it can be pretty quick.
06:55 JoeJulian s/plan/plain/
06:55 glusterbot What JoeJulian meant to say was: You're fine for inodes as they're uuids. That latency is the real killer. If you do a plain ls, with none of the normal default features like color or decorators, it can be pretty quick.
06:56 JoeJulian An ls -l, though, unless you're using infiniband would take a very long time indeed. With infiniband it would just be a long time.
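The difference being described is easy to see from the mount (mountpoint is an example): a plain, unaliased listing only reads the directory, while ls -l also stats every entry over the network:

    \ls -1 --color=never /mnt/gluster/bigdir > /dev/null   # readdir only
    ls -l /mnt/gluster/bigdir > /dev/null                  # readdir plus a stat (and its round trips) per entry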
06:56 JoeJulian Can't you tree those files out somehow so they're not all in one directory?
06:57 jayeffkay_ I could, I have done on ext3 before, i was wondering if it was necessary on glusterfs, assuming 99.9% of operations will be a simple file access?
06:58 JoeJulian As long as you know what you're doing, it'll be fine.
06:59 jayeffkay_ Okay, thanks JoeJulian
07:00 JoeJulian (interesting that we have two J.F.K. mnemonics here)
07:01 jayeffkay_ heh. hadn't noticed
07:01 Bullardo joined #gluster
07:03 lng JoeJulian: oh just deleted?
07:04 JoeJulian That's what I did. Not sure if that's my actual recommendation, but I'm a rebel.
07:04 lng JoeJulian: I have listed a few of them and noticed inconsistently replicated files
07:05 JoeJulian As long as you back them up before deleting them, the only things I can really see going wrong is the loss of hardlinks.
07:05 JoeJulian Of course, do a full heal afterward.
07:05 lng JoeJulian: since they are hard links, deleting them will not fix split-brain
07:06 JoeJulian They're in heal-failed, you said, not split-brain.
07:07 lng JoeJulian: when you delete something in .glusterfs directory, should it be recreated after healing?
07:07 JoeJulian If it's valid.
07:07 lng JoeJulian: I thought if files are different on two replicas, it means split-brain
07:07 lng or I might be wrong?
07:08 JoeJulian I believe it's possible, in a situation where the volume needs healed, that there could be gfid files for files that were deleted.
07:08 JoeJulian heal-failed is a different category.
07:09 JoeJulian If they're split-brain, they'll get cataloged that way. If they're not, but the heal still fails, they fall through to heal-failed.
07:09 lng ok, I will try your approach
07:09 lng thanks!
07:09 JoeJulian The only way to figure out why the heal failed is to read the logs and either read the source to figure out what might be happening, or consult a psychic.
07:10 lng psychic?
07:10 JoeJulian humor
07:10 JoeJulian Tarot cards can always determine the cause of storage failure.
07:10 lng ok, I will contact him tonight
07:11 JoeJulian hehe
07:11 lng :-)
07:11 JoeJulian Ouija boards are great for recovering lost data too.
07:11 JoeJulian (base36 encoded)
07:13 mohankumar joined #gluster
07:13 lng JoeJulian: what if the gfid is pointing to a dir
07:13 JoeJulian ... I know what I need to bring to the next storage related conference I go to... a base64 Ouija board. ;)
07:13 lng and there're a lot of files?
07:13 JoeJulian Then the gfid would be a symlink.
07:14 JoeJulian gfid's are either real files, or symlinks. If a gfid is a directory, then that's why the heal's failing.
07:14 lng yes, it is
07:14 lng okay
07:14 lng if it is dir
07:14 lng what should I do in this case?
07:15 lng delete it too?
07:15 lng '2012-11-30 06:32:13 <gfid:63d707c7-40b1-4778-a567-40651498e67c>'
07:15 lng for example
07:15 JoeJulian If it is, see if you can find the first instance of it in your logs and add it to my bug... (just a sec while I get the bug id)
07:16 lng okay
07:16 lng but, don't I need to delete such dir?
07:16 JoeJulian I would back it up in case there's an issue with the files in it.
07:17 JoeJulian bug 859581
07:17 glusterbot Bug http://goo.gl/60bn6 high, unspecified, ---, vsomyaju, ASSIGNED , self-heal process can sometimes create directories instead of symlinks for the root gfid file in .glusterfs
07:17 lng what would happen if I delete it?
07:17 JoeJulian Just what always happens when you delete a directory.
07:18 lng oh
07:19 lng because it is a symlink, actual files in the target directory will be deleted, right?
07:19 JoeJulian No, a symlink can be safely deleted.
07:20 glusterbot New news from newglusterbugs: [Bug 882112] RFE: Write a new translator to add time delay in fops <http://goo.gl/bijpO>
07:20 dobber joined #gluster
07:21 lng JoeJulian: what would you suggest me to do with these directories returned by heal-failed?
07:22 JoeJulian I would look at the contents and see if I could determine the value of the files. If they're obviously junk, i would delete the directory....
07:22 lng JoeJulian: if I delete it, should I delete actual files?
07:23 JoeJulian If they weren't junk, I would then try to find the directory that matches the gfid and see if the files are where they belong.
07:23 JoeJulian Well, you can't delete a directory if it has files in it.
07:23 lng JoeJulian: this is extremely slow
07:23 lng JoeJulian: rm -r
07:24 lng rm -frv "$brick/.glusterfs/${gfid:0:2}/${gfid:2:2}/$gfid"
07:24 JoeJulian It's up to you, of course. That seems likely to result in data loss.
07:25 lng searching by inode is slow
07:25 JoeJulian yes
07:25 JoeJulian You could always just tar them up somewhere, then delete If it turns out they were important, you'll know where to find them.
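A sketch of the "tar them up somewhere, then delete" approach using the gfid lng pasted earlier; the brick path and backup location are examples:

    brick=/data/brick1
    gfid=63d707c7-40b1-4778-a567-40651498e67c
    tar -C "$brick/.glusterfs/${gfid:0:2}/${gfid:2:2}" -czf "/root/gfid-$gfid.tgz" "$gfid"   # keep a copy
    rm -rf "$brick/.glusterfs/${gfid:0:2}/${gfid:2:2}/$gfid"                                 # then remove it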
07:26 puebele1 joined #gluster
07:28 mjrosenb joined #gluster
07:28 mjrosenb does gluster use python these days?
07:32 JoeJulian "does" or "can"?
07:33 JoeJulian The only python that's in the git tree (I claim without actually looking) is the swift stuff.
07:33 JoeJulian jdarcy posted an example of how to write a translator in python though.
07:34 JoeJulian https://github.com/jdarcy/glupy
07:34 glusterbot Title: jdarcy/glupy · GitHub (at github.com)
07:34 JoeJulian mjrosenb: ^^^
07:35 lng JoeJulian: should I delete them on one of the replicas?
07:36 JoeJulian lng: No. No gfid file should be a directory.
07:37 lng I'm confused
07:37 Bullardo joined #gluster
07:37 lng 2012-11-30 07:22:04 <gfid:be92fc13-1c8b-4121-b9db-d556e82894a9>
07:38 guigui1 joined #gluster
07:38 lng gluster volume heal storage info heal-failed
07:38 mjrosenb JoeJulian: I updated some stuff, and I started getting this when starting glusterd: http://paste.ubuntu.com/1398602/
07:38 lng JoeJulian: ^
07:38 glusterbot Title: Ubuntu Pastebin (at paste.ubuntu.com)
07:38 JoeJulian "find .glusterfs/*/* -type d" should return no results.
07:39 lng JoeJulian: they are links
07:39 lng to dir
07:39 JoeJulian mjrosenb: You got me... I think I'm getting too tired to be useful tonight... I forgot all about geo-replicate.
07:39 JoeJulian lng: Ok then you have nothing to worry about.
07:40 JoeJulian You don't even have to worry about hardlinks, because you can't hardlink a directory.
07:40 mjrosenb I don't even know what that is, I assume I don't need it (if it is disableable)
07:41 JoeJulian That gfid file would have the same trusted.gfid as the directory it symlinks to.
07:41 lng how do I get rid of them - I don't want them to appear in heal-failed
07:41 JoeJulian mjrosenb: No, if you're not using geo-replication then you don't care.
07:42 mjrosenb although I should probably find out why python files are not running
07:42 JoeJulian lng: Delete them from all replicas. If they should be recreated, a heal...full will fix them. If they shouldn't, they'll be gone.
07:44 JoeJulian mjrosenb: What version of python is that? That shouldn't be a syntax error in any version I've ever used.
07:44 meshugga joined #gluster
07:44 JFK joined #gluster
07:44 torbjorn__ joined #gluster
07:44 hagarth left #gluster
07:44 maxiepax joined #gluster
07:45 NuxRo joined #gluster
07:45 z00dax joined #gluster
07:45 lng JoeJulian: http://paste.ubuntu.com/1398612/
07:45 glusterbot Title: Ubuntu Pastebin (at paste.ubuntu.com)
07:45 mjrosenb JoeJulian: I think I tried with both 2.6 and 2.7
07:45 lng JoeJulian: so I can delete both, right?
07:46 mjrosenb stack overflow says it is an error on 3.0 and above though
07:46 lng on all replicas
07:46 ankit9 joined #gluster
07:46 mjrosenb my environment may not be updating as quickly as I want
07:46 JoeJulian Ah, right... I haven't done any version 3 python yet.
07:47 lng JoeJulian: yes?
07:47 mjrosenb but the shebang is #!/usr/bin/env python2
07:47 mjrosenb so there shouldn't be any way it is running python3
07:48 JoeJulian lng: /bin/rm "$brick/.glusterfs/${gfid:0:2}/${gfid:2:2}/$gfid"
07:49 lng JoeJulian: ok, thank you!
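Put together, the gfid entries reported by heal-failed can be walked with a small loop like this; the volume name "storage" is from the log, the brick path is an example, and backing up first (as above) is still advisable:

    brick=/data/brick1
    gluster volume heal storage info heal-failed \
      | grep -o '<gfid:[^>]*>' | sed 's/^<gfid:\(.*\)>$/\1/' | sort -u \
      | while read -r gfid; do
          /bin/rm -v "$brick/.glusterfs/${gfid:0:2}/${gfid:2:2}/$gfid"
        done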
07:50 ngoswami joined #gluster
07:51 ekuric joined #gluster
07:56 JoeJulian mjrosenb: The python binary is started from marker's gsyncd.c. It's defined by configure as PYTHON in config.h during the build process. There's no way to override it so it'll probably run /usr/bin/python I suspect.
07:57 lng JoeJulian: when I delete them, actual files will stay intact, right?
07:57 JoeJulian right
07:57 lng thanks
07:57 JoeJulian and if we're wrong about them being symlinks, then doing it the way I said will be safe.
07:58 rudimeyer_ joined #gluster
07:58 lng JoeJulian: no, they are not symlinks
07:58 * JoeJulian is going to beat you with a trout....
07:59 Azrael808 joined #gluster
07:59 lng oh
08:01 lng lng blocks JoeJulian's punches
08:01 lng and throws him to the ground :-)
08:01 lng arm bar at the end
08:02 JoeJulian mjrosenb: I just confirmed that /usr/libexec/glusterfs/gsyncd has /usr/bin/python hardcoded into it.
08:03 Humble joined #gluster
08:03 JoeJulian I'll file a bug
08:03 glusterbot http://goo.gl/UUuCq
08:03 lng yet another bug
08:04 JoeJulian lng: Which software are you using that doesn't have any?
08:05 lng JoeJulian: any software has bugs
08:06 JoeJulian Right.
08:06 lanning probably the one with all the features
08:06 JoeJulian hehe
08:06 JoeJulian The one with nobody using it.
08:06 glusterbot New news from resolvedglusterbugs: [Bug 764890] Keep code more readable and clean <http://goo.gl/p7bDp>
08:06 ctria joined #gluster
08:15 guigui1 joined #gluster
08:20 glusterbot New news from newglusterbugs: [Bug 882127] The python binary should be able to be overridden in gsyncd <http://goo.gl/cnTha>
08:24 Alpinist joined #gluster
08:25 mjrosenb JoeJulian: I installed via gentoo, and the install scripts change it to #!/usr/bin/env python2
08:26 mjrosenb but I have no clue why it appears to be doing the wrong thing here
08:26 JoeJulian mjrosenb: gsyncd is a binary
08:26 JoeJulian Or are you saying it builds from source...
08:26 JoeJulian I've never used gentoo
08:27 twx_ gentoo builds from source by default afaik
08:29 ankit9 joined #gluster
08:29 JoeJulian Since that's valid syntax in python2, you have python3 installed, and that error is occurring, the only possible conclusion would be that it's running python3. It's elementary my dear Rosenberg... ;)
08:30 * JoeJulian may have been reading too much Doyle recently.
08:35 mjrosenb JoeJulian: ohh.
08:36 mjrosenb JoeJulian: nevermind, I assumed you meant the .py file had #!/usr/bin/python hardcoded
08:38 mjrosenb JoeJulian: yes, gsyncd being a binary that calls /usr/bin/python /path/to/foo.py makes much more sense.
08:40 JoeJulian I filed the bug report on that. It should be configurable.
08:40 mjrosenb indeed.
08:44 JoeJulian Anybody know any Sr. Systems Architects?
08:45 lanning define "Systems"
08:54 JoeJulian lanning: pm
08:55 tjikkun_work joined #gluster
08:56 ankit9 joined #gluster
08:56 JoeJulian Hmm.. maybe he fell asleep over there like I'm doing here... I'm heading to bed. Talk to you all later.
08:59 guigui5 joined #gluster
09:00 lkoranda joined #gluster
09:06 mjrosenb JoeJulian: 'night.
09:07 gbrand_ joined #gluster
09:21 bauruine joined #gluster
09:31 kevein joined #gluster
09:35 duerF joined #gluster
09:48 morse_ joined #gluster
09:49 cyberbootje joined #gluster
09:58 morse_ joined #gluster
10:03 ekuric joined #gluster
10:05 ekuric joined #gluster
10:05 AndroUser2 joined #gluster
10:07 gbrand__ joined #gluster
10:11 sgowda joined #gluster
10:12 ngoswami joined #gluster
10:22 mgebbe joined #gluster
10:54 mooperd joined #gluster
10:57 guigui4 joined #gluster
11:20 mohankumar joined #gluster
11:22 saz joined #gluster
11:28 mohankumar joined #gluster
11:35 H__ any hints on "Unable to get <uuid>.xtime attr" issues ?
11:43 rosco__ Hi, can somebody help me with: ls: reading directory .: File descriptor in bad state
11:44 rosco__ fileserver1 is working, but fileserver 2, with the same data says this
12:12 toruonu joined #gluster
12:18 Alpinist joined #gluster
12:53 morse joined #gluster
12:54 duerF joined #gluster
12:58 plarsen joined #gluster
13:07 yinyin joined #gluster
13:38 edward1 joined #gluster
13:39 balunasj joined #gluster
13:49 aliguori joined #gluster
13:54 puebele joined #gluster
13:59 plarsen joined #gluster
14:10 plarsen joined #gluster
14:10 lh joined #gluster
14:10 lh joined #gluster
14:12 kkeithley joined #gluster
14:13 puebele1 joined #gluster
14:18 mooperd kkeithley: hello
14:18 kkeithley hi
14:18 glusterbot kkeithley: Despite the fact that friendly greetings are nice, please ask your question. Carefully identify your problem in such a way that when a volunteer has a few minutes, they can offer you a potential solution. These are volunteers, so be patient. Answers may come in a few minutes, or may take hours. If you're still in the channel, someone will eventually offer an answer.
14:19 mooperd heh
14:19 kkeithley stupid glusterbot
14:19 mooperd did you see my post?
14:19 kkeithley about getting to 3500 files?
14:19 mooperd yep
14:20 kkeithley I replied, but I think you had already signed off.
14:20 mooperd ideas?
14:21 kkeithley yeah, I don't know of any limit.
14:21 kkeithley my dev box crashed and the xfs fs with all my vm images on it wouldn't mount, so I've been dealing with that.
14:21 kkeithley I'll take a look
14:21 mooperd Your welcome to login to my setup
14:24 kkeithley If I need to I'll let you know
14:24 mooperd cool
14:24 mooperd it doesn't seem to be a gluster problem
14:24 mooperd I can create files directly on the mount with no problem
14:28 kkeithley duly noted
14:31 jdarcy Great, build failed because I don't put git on my test machines.
14:32 jdarcy Seems like the automatic changelog generation could have been in a commit hook instead of the makefile.
14:37 H__ where can i read how geo-replication works internally ?
14:39 jdarcy H__: I don't know of any written docs I can share, but I can try to explain.
14:40 H__ cool
14:40 H__ i'm basically wondernig 2 things :
14:41 H__ 1) what does it do at 'startup' ? How does it determine that a sync-point is reached ?
14:41 H__ 2) after initial sync, how does it determine what needs to be synced ?
14:42 H__ 3.2.5 here still btw . I cannot get a change window to upgrade to 3.3 :(
14:46 jdarcy H__: The initial startup is basically rsync.
14:47 jdarcy H__: Incremental updates use something we call "xtime" which for any directory is the maximum xtime of any of its descendants.
14:48 jdarcy H__: The xtimes (maintained by the marker translator) are used to "prune" our search for changed files.
14:48 jdarcy H__: So if we have /foo and /bar, and the xtime on /bar hasn't changed, we don't need to recurse into /bar at all.
14:49 jdarcy H__: We do recurse into /foo, where we might find some things with changed xtimes (so recurse further) and some without (so skip).
14:49 H__ and how does this traverse ? If I update D in /A/B/C/D do A and B and C get notified of that via xtimes ?
14:49 jdarcy H__: It's basically like a Merkle tree, if you know those.
14:49 H__ not yet ;-) 1 moment
14:49 jdarcy H__: Exactly.  The marker translator propagates the marking up from D to /A/B/C and so on up to /
14:50 jdarcy IMO it should be versions rather than times, but no need to go into that.  ;)
14:51 * jdarcy was just writing slides about this, serendipitously.
14:52 dalekurt joined #gluster
14:52 H__ so, if I write 1000 new files in a deep /a/b/c/d/etc tree, does /a itself get 1000 xtime updates too then ?
14:54 jdarcy H__: We do try to ameliorate that effect, but as a worst case yeah.
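A toy sketch of the pruning jdarcy describes, not the actual gsyncd code; the xattr name format, $VOL_UUID, and the lookup_slave_xtime helper are assumptions made for illustration:

    crawl() {
        local dir=$1 master_xtime slave_xtime
        master_xtime=$(getfattr --only-values -e hex \
            -n "trusted.glusterfs.$VOL_UUID.xtime" "$dir" 2>/dev/null)
        slave_xtime=$(lookup_slave_xtime "$dir")        # hypothetical: xtime recorded at the last sync
        [ "$master_xtime" = "$slave_xtime" ] && return  # unchanged subtree: prune, don't recurse
        for entry in "$dir"/*; do
            if [ -d "$entry" ]; then
                crawl "$entry"                          # something below changed: recurse
            else
                rsync -a "$entry" "slave:$entry"        # (the real thing also checks per-file xtimes)
            fi
        done
    }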
14:55 H__ ok. And about 1) the initial rsync, does that trigger self-heal on all files or is it faster ?
14:55 nightwalk joined #gluster
14:55 jdarcy H__: It's independent of self-heal.  Generally, it should be faster.
14:58 H__ cool. Is geo-rep nowadays capable of replicating a part of a volume ? (Say just /A/B and leaving /C and /D untouched)
15:00 robo joined #gluster
15:01 stopbit joined #gluster
15:02 Alpinist joined #gluster
15:02 Alpinist_ joined #gluster
15:04 jdarcy H__: That I don't know.  I study the algorithms, know less about the actual user-visible feature set.
15:06 H__ ok :)   About algorithms: is symmetric replication (a la unison) on the table for the future ?
15:07 jdarcy H__: Yes.
15:08 jdarcy H__: Bidirectional/hierarchical geosync was supposed to be part of 3.4 but TBH might be 3.5 instead.
15:09 jdarcy H__: Further out, I'm working on *ordered* async replication - based on logging rather than scanning.
15:10 jdarcy H__: Even highly optimized scanning is still kind of crappy IMO.  We're in the I/O path, we should already know what needs to be propagated without having to scan for it.  This really matters for billion-file volumes.
15:11 noob2 joined #gluster
15:15 H__ exactly. I had hoped georep already worked like that, hence my question
15:16 jdarcy H__: After LISA I can send you my slides which cover exactly this topic.
15:17 H__ that'd be cool !
15:18 H__ plz send that to hans at shapeways dot com :)
15:20 H__ i hope the channel log reapers don't add that address to some spam lists :-P
15:22 glusterbot New news from newglusterbugs: [Bug 882278] [FEAT] NUFA <http://goo.gl/XyB0I>
15:23 tqrst any idea why the 'Port' entry for two of my bricks in 'volume status' is 'N/A' instead of an actual port number? The bricks are online and work fine as far as I can tell. All the others have the proper 2401{0,1,2,3,4} port number. See lines 37 and 39 in http://fpaste.org/yIk5
15:23 glusterbot Title: Fedora Pastebin - by Fedora Unity (at fpaste.org)
15:28 jbautista joined #gluster
15:30 nightwalk joined #gluster
15:36 tqrst probing doesn't help
15:48 tqrst what does the failures column mean in the rebalance status output?
15:52 noob2 tqrst: are your drives flaking out and causing your rebuild to fail?
15:53 tqrst noob2: not as far as I know
15:53 tqrst noob2: I'm just curious what the numbers in that column even *mean* - what is failing exactly?
15:54 tqrst ah, there is a rebalance log
15:55 tqrst I don't like what I'm seeing in there
15:55 nightwalk joined #gluster
15:56 noob2 can you fpaste what you're seeing?
15:56 tqrst http://fpaste.org/I9yg/
15:56 * johnmark hates wiki spam
15:56 glusterbot Title: Viewing whats with all the errors? (at fpaste.org)
15:56 tqrst johnmark: but everyone loves timex replicas!
15:58 tqrst that 00000000-0000-0000-0000-000000000000 looks like a bogus uuid
15:59 MalnarThe left #gluster
16:01 tqrst the errors in the log I just pasted are happening about 3-4 times per second by the looks of it
16:01 noob2 tqrst: i've never seen those before
16:01 tqrst noob2: google hasn't either
16:01 noob2 looks like your extended attributes disappeared
16:01 noob2 if i had to guess
16:02 tqrst what could cause that?
16:02 noob2 is your drive output errors in the messages?
16:02 noob2 i'm not sure
16:02 noob2 xfs, ext /
16:02 noob2 ?
16:02 tqrst ext3
16:02 noob2 what version kernel do you have?  there was a bug floating around awhile back
16:03 tqrst Linux 2.6.32-131.17.1.el6.x86_64 #1 SMP Wed Oct 5 17:19:54 CDT 2011 x86_64 x86_64 x86_64 GNU/Linux
16:03 tqrst (scientificlinux, based on centos)
16:03 noob2 is this rhel 6?
16:03 noob2 ah
16:03 dalekurt joined #gluster
16:03 noob2 lemme check.  i think that might be the one
16:04 tqrst I don't see anything suspicious in the system logs, either
16:05 tqrst this is scientific linux 6.1 if the version number makes any difference
16:05 noob2 ok
16:05 tqrst should I stop rebalancing for now?
16:06 noob2 i don't know if you're losing files or not.  this is getting a little above me.  can someone else chime in?
16:07 tqrst the files mentioned in those errors still seem to be there
16:07 noob2 yeah it looks like your hashes are hashing out to all zero's
16:07 noob2 ok
16:07 tqrst hm
16:07 noob2 maybe if it's hashing to nothing it won't move them
16:07 noob2 this is def weird
16:07 noob2 what version of gluster did you deploy?
16:07 tqrst 3.3.1
16:07 tqrst after a very, very annoying upgrade from 3.2.6
16:07 noob2 i can imagine
16:08 noob2 i'm a little worried about upgrading also
16:08 noob2 i have 3.3.0-6
16:08 noob2 ok something to try
16:08 noob2 on your bricks that are failing, try doing a getfattr .
16:08 noob2 and see if anything comes up
16:09 tqrst at the root?
16:09 noob2 getfattr -d -m . brick1/
16:09 noob2 yeah that's what i have in my notes
16:09 noob2 root of where the brick is mounted
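The same check with hex output makes the trusted.gfid and trusted.afr values easier to compare across bricks; the brick path is an example:

    getfattr -d -m . -e hex /data/brick1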
16:10 semiosis :O
16:11 tqrst http://www.fpaste.org/zQZP/
16:11 glusterbot Title: Viewing Paste #256353 (at www.fpaste.org)
16:11 noob2 your trusted id's look weird
16:11 noob2 lemme see what mine look like in dev
16:12 daMaestro joined #gluster
16:14 tqrst (I just stopped rebalancing, just in case it was doing anything nasty)
16:14 noob2 ok
16:15 noob2 maybe those are fine.  mine look similar in my dev cluster
16:15 tqrst yeah, the other bricks look similar to that
16:15 dalekurt joined #gluster
16:16 noob2 i'm not sure :-/
16:16 tqrst :\
16:17 bulde joined #gluster
16:18 noob2 if joe julian or semiosis is around they can probably help with this
16:18 tqrst thanks for trying :O)
16:18 noob2 i've only got about 6 months under my belt with gluster.  i'm still learning
16:20 tqrst same here - we switched a large part of our research data server to gluster about 6 months back
16:20 tqrst it's been ok except during upgrades, and all the gfid issues in 3.2
16:21 noob2 yeah i haven't had many problems other than split brain because i'm logging to them constantly
16:21 noob2 management is very happy with it
16:22 dalekurt joined #gluster
16:45 tqrst welp, sent it over to the mailing list
16:48 Humble joined #gluster
16:49 noob2 that sounds good
16:49 tqrst I also found a bunch of errors in the client logs, along the lines of "Unable to self-heal permissions/ownership of '/...' (possible split-brain). Please fix the file on all backend volumes"
16:52 nueces joined #gluster
16:53 Daxxial_ joined #gluster
16:57 tqrst hrm, all my trusted.afr.myvol-client-N= are 0x000000000000000000000000
17:00 thekev joined #gluster
17:02 noob2 ok i have the split brain command saved
17:02 noob2 lemme find it
17:03 tqrst I didn't even know *folders* could become split brained
17:03 noob2 gluster volume heal gluster info split-brain | less
17:03 tqrst yeah I just did that
17:03 tqrst 1023 entries
17:03 tqrst this is going to be fun
17:03 noob2 are they recent?
17:03 noob2 it seems to keep them in the logs for a long time
17:04 tqrst they all look like they're from today
17:04 noob2 ok
17:04 noob2 what's your networking setup look  like?
17:04 noob2 active/backup, 802.3ad?
17:05 bulde joined #gluster
17:05 tqrst not sure :\
17:05 noob2 cat /proc/net/bonding/bond0
17:06 noob2 or ip a;
17:06 tqrst nothing in the first one
17:06 zaitcev joined #gluster
17:06 noob2 how many network interfaces do you have?
17:06 tqrst two in each node, but only one is in use
17:07 pdurbin_ joined #gluster
17:07 tqrst http://www.fpaste.org/wy1K/
17:07 glusterbot Title: Viewing Paste #256362 (at www.fpaste.org)
17:07 noob2 ok
17:07 noob2 is it possible your network blipped for a minute while you were doing these syncs?
17:07 noob2 i mean rebalance*
17:07 samkottler joined #gluster
17:08 arusso joined #gluster
17:08 linux-rocks_ joined #gluster
17:08 samkottler joined #gluster
17:08 Daxxial_ joined #gluster
17:08 tqrst how could I tell?
17:08 noob2 good question.  bonding interfaces usually keep track of how many times they went down
17:08 tqrst the split brain timestamps span several hours btw
17:08 noob2 lemme look if regular interfaces do also
17:08 noob2 ok
17:09 arusso joined #gluster
17:09 tqrst most of the entries in there are folders, too
17:10 noob2 i think only bonded interfaces keep track of fail counts
17:10 noob2 do you have networking guys who can look a the interface?
17:11 tqrst yes, but they're currently tied up with something for the next few days
17:11 noob2 i see
17:12 noob2 if you wire up that second port you can bond it into active/backup mode without switch support
17:12 noob2 that'll help out for little blips
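On a RHEL/SL6-style box, active-backup bonding without switch support looks roughly like this; interface names and addresses are examples:

    # /etc/sysconfig/network-scripts/ifcfg-bond0
    DEVICE=bond0
    BOOTPROTO=static
    IPADDR=192.168.1.10
    NETMASK=255.255.255.0
    ONBOOT=yes
    BONDING_OPTS="mode=active-backup miimon=100"

    # /etc/sysconfig/network-scripts/ifcfg-eth0 (and likewise ifcfg-eth1)
    DEVICE=eth0
    MASTER=bond0
    SLAVE=yes
    ONBOOT=yes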
17:12 andreask joined #gluster
17:13 tqrst hm, those 1023 entries contain many dupes
17:13 tqrst just with different timestamps
17:13 lkoranda joined #gluster
17:13 noob2 right
17:13 noob2 it keeps updating it as it finds the same thing again
17:13 noob2 1023 is the max it'll display i've noticed
17:14 tqrst have you ever seen split brained folders, though?
17:15 tqrst I thought that only happened to files
17:15 noob2 yeah
17:15 noob2 i've seen folder, files or gfid's split
17:15 noob2 i have admins who are experts at trashing things haha
17:17 dalekurt joined #gluster
17:20 tqrst JoeJulian's split-brain fix is hardly practical if the split brained file is /...
17:20 tqrst rm -rf / on the problematic brick? :p
17:21 noob2 are your 1023 entries all the same thing or diff stuff?
17:21 tqrst seems to be about 10 folders
17:22 noob2 ok
17:22 noob2 might want to write a little python to handle this
17:22 _Bryan_ joined #gluster
17:22 tqrst but still - am I really supposed to remove the split brained folder from one brick?
17:22 tqrst that's ~1.5T of data
17:23 noob2 i think the idea is basically to decide which node has the correct copy and remove the copy from the bad node
17:23 noob2 and then kick off a self heal by stating through the client fuse  mount
17:23 tqrst that would involve removing a whole brick
17:23 noob2 wow so the whole brick is bad?
17:23 tqrst well, / is split brained
17:23 noob2 jeez
17:24 noob2 i hate to recommend anything because i'm worried about destroying your cluster
17:24 tqrst I have the vague impression that it's already fucked
17:24 noob2 lol
17:24 tqrst but I guess I'll wait and see what the mailing list says
17:25 noob2 if we want to get drastic i suppose we could remove the brick from the cluster, wipe it and then readd it
17:26 tqrst JoeJulian: is there a more targeted way of fixing split-brain than removing one copy?
17:33 johnmark oy
17:34 Mo___ joined #gluster
17:40 tqrst hm, any way to make 'volume heal myvol info split-brain' not include files that don't exist any more?
17:42 tqrst ah nevermind, it looks like it just needed a minute or two to update itself
17:42 andreask left #gluster
17:43 ras0ir joined #gluster
17:48 robo joined #gluster
17:58 jdarcy If you get split-brain problems on a directory, the safer bet is often to massage its xattrs instead of removing it.
17:58 JoeJulian tqrst: I didn't (and on purpose) set the recursive switch in my example.
17:59 jdarcy I'll bet that's the first time any of you have heard "massage" and "xattrs" in the same sentence.
18:00 JoeJulian I got an xattr massage just the other day....
18:01 kkeithley as long as you're not talking about massaging your xxxattrs
18:01 jdarcy LOL
18:01 JoeJulian +1
18:02 JoeJulian tqrst: See if you're looking at bug 859581
18:02 glusterbot Bug http://goo.gl/60bn6 high, unspecified, ---, vsomyaju, ASSIGNED , self-heal process can sometimes create directories instead of symlinks for the root gfid file in .glusterfs
18:02 dalekurt joined #gluster
18:03 tqrst that is one strange looking bug
18:03 tqrst checking
18:06 tc00per Is there a Gluster Community broadcast today?
18:06 tqrst JoeJulian: should I only be looking at 00/00/00000000-0000-0000-0000-000000000001?
18:06 tqrst if that's the case, then they are all symlinks to ../../../
18:06 JoeJulian tqrst: yes
18:07 johnmark tc00per: no. we couldn't pull it together in time
18:07 johnmark tc00per: but if one of y'all wants to do it, I'll be happy to help
18:08 glusterbot New news from resolvedglusterbugs: [Bug 859162] FUSE client crashes <http://goo.gl/6omD0>
18:08 JoeJulian tc00per: I could ramble on camera for a bit... ;)
18:11 johnmark JoeJulian: cool :)
18:12 JoeJulian "Don't remain in the 3.1.x releases" - Amar Tumballi
18:12 johnmark JoeJulian: being a board member, you have the power (and the technology) to speak on the community's behalf
18:12 johnmark LOL
18:12 JoeJulian Wow, that could be dangerous.
18:12 johnmark heh
18:13 JoeJulian Especially as I haven't had my coffee yet.
18:13 johnmark JoeJulian: that can be fixed
18:13 tqrst I'm trying to understand the output of 'volume heal myvol info split-brain'. What does it mean when there are a bunch of lines with <gfid:...> in them? e.g. http://www.fpaste.org/Aae8/
18:13 glusterbot Title: Viewing Paste #256388 (at www.fpaste.org)
18:13 tqrst some files don't have any <gfid> after them, so it's not what I thought it was (the list of conflicting gfids or something)
18:14 Bullardo joined #gluster
18:14 tqrst (besides, I thought only files had gfids?)
18:14 JoeJulian the <gfid:...> entries are the actual gfid files in .glusterfs/xx/yy/xxyy*
18:15 JoeJulian johnmark: did you get my google talk question?
18:18 johnmark JoeJulian: oh... yeah.. was on my way to bed and forgot to look it up this morning
18:18 johnmark JoeJulian: hang on. will respond
18:24 tc00per JoeJulian/johnmark: No need, just didn't want to miss any hijinx if they were going to be going on.
18:26 bauruine joined #gluster
18:27 rudimeyer_ joined #gluster
18:29 tqrst is there a better way to prevent people from writing to bricks other than mounting them in really obscure places?
18:30 semiosis selinux/apparmor?
18:30 JoeJulian Don't let users log into your servers. ;)
18:30 semiosis or that
18:30 JoeJulian I mount my bricks in /data/glusterfs/$volume/$brick. If /data/glusterfs is mode 700, no users can get to the brick.
18:31 tqrst whoever set gluster up here named the mount points very similarly to something else people actually use
18:31 tqrst I think I'll change the mount points, at least
18:32 JoeJulian Actually, I took ... wasn't it noob2's idea... can't remember for sure... the idea of putting my bricks in a directory under the mount so I'm actually using /data/glusterfs/$volume/$brick/brick so that if $brick isn't mounted, glusterfsd doesn't happy start replicating the data to the root filesystem.
18:32 JoeJulian s/happy/happily/
18:32 glusterbot What JoeJulian meant to say was: Actually, I took ... wasn't it noob2's idea... can't remember for sure... the idea of putting my bricks in a directory under the mount so I'm actually using /data/glusterfs/$volume/$brick/brick so that if $brick isn't mounted, glusterfsd doesn't happily start replicating the data to the root filesystem.
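JoeJulian's layout, roughly; volume, server, and device names are examples. The extra "brick" directory means that if the filesystem isn't mounted, the directory is missing and glusterfsd refuses to start instead of filling the root filesystem:

    mkdir -p /data/glusterfs/myvol/brick1
    chmod 700 /data/glusterfs                    # keep local users away from the bricks
    mount /dev/sdb1 /data/glusterfs/myvol/brick1
    mkdir /data/glusterfs/myvol/brick1/brick     # the brick directory lives inside the mount
    gluster volume create myvol replica 2 \
        server1:/data/glusterfs/myvol/brick1/brick \
        server2:/data/glusterfs/myvol/brick1/brick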
18:33 bennyturns joined #gluster
18:34 jdarcy I definitely approve of not putting bricks in a local volume's root.
18:35 jdarcy One nice thing about that as a developer is that you can remove and recreate a directory instead of rebuilding a filesystem every time you need a fresh start.
18:36 zaitcev Swift enforces this by running the process as swift, so if volume is unmounted, swift-owned daemons cannot write into the OS. But I think glusterd runs as root.
18:38 noob2 JoeJulian: points for me :D
18:39 noob2 that's correct, gluster runs as root
18:41 tqrst JoeJulian: ok so the <gfid:> entries are gfid files, but why are they getting output by 'volume heal myvol info split-brain' instead of actual file paths?
18:44 tqrst the admin guide doesn't say anything about what they actually mean
18:46 gbrand_ joined #gluster
18:47 JoeJulian Read my blog post on "What is this new .glusterfs tree" for more info.
18:50 JoeJulian It doesn't show the actual file for three possible reasons. 1, the file that's associated with that gfid hasn't been checked yet. 2, the file associated with that gfid is also listed but hasn't been healed yet. 3, the file no longer exists.
18:50 tqrst in all cases, gluster still knows what file name that gfid points to, though, doesn't it?
18:50 JoeJulian No
18:51 tqrst hm
18:52 JoeJulian The gfid file is a hardlink. The only way to find the file is to get the inode number, then scan the brick for that inode.
18:52 JoeJulian ^ true only for files, not directories.
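That scan looks something like this (brick path is an example, the gfid is one from the log); it only works for regular files, since directory gfids are symlinks rather than hardlinks:

    brick=/data/brick1
    gfid_file=$brick/.glusterfs/be/92/be92fc13-1c8b-4121-b9db-d556e82894a9
    inum=$(stat -c %i "$gfid_file")
    find "$brick" -path "$brick/.glusterfs" -prune -o -inum "$inum" -print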
19:00 dalekurt joined #gluster
19:00 jack joined #gluster
19:01 arusso joined #gluster
19:01 Daxxial_ joined #gluster
19:01 samkottler joined #gluster
19:01 linux-rocks_ joined #gluster
19:01 thekev joined #gluster
19:01 cyberbootje joined #gluster
19:01 torbjorn__ joined #gluster
19:01 abyss^_ joined #gluster
19:01 ndevos joined #gluster
19:01 m0zes joined #gluster
19:02 Daxxial_ joined #gluster
19:02 arusso joined #gluster
19:03 noob2 JoeJulian: btw I should mention that i found out why ls was slow on my gluster setup.  a previous admin aliased ls to ls with colorize on all the servers.
19:06 sb123_ joined #gluster
19:07 sb123_ Hi I needed help with an issue I am facing
19:08 sb123_ I am trying to setup a 4-node cluster with 8 bricks
19:08 sb123_ I have set the replication to 2
19:09 sb123_ however, when I run ls on the Gluster mounted drive, the command gets stuck
19:09 y4m4 joined #gluster
19:10 sb123_ other commands like reading files/creating files etc
19:10 sb123_ if I was have 2 nodes it works fine
19:11 sb123_ can some one please help me with this
19:12 noob2 did you check your log directory on the mount?
19:13 sb123_ no. sorry, I dont know how to do so
19:13 noob2 check /var/log/gluster/[mount_name].log
19:14 noob2 there should be some info in there
19:14 sb123_ I did that
19:16 noob2 did you look for any errors in there?
19:16 noob2 that's the first place i look
19:16 sb123_ thanks. I will try it once again
19:16 sb123_ I will be right back
19:16 noob2 ok
19:22 sb123_ hi
19:22 glusterbot sb123_: Despite the fact that friendly greetings are nice, please ask your question. Carefully identify your problem in such a way that when a volunteer has a few minutes, they can offer you a potential solution. These are volunteers, so be patient. Answers may come in a few minutes, or may take hours. If you're still in the channel, someone will eventually offer an answer.
19:23 itamar_ joined #gluster
19:23 sb123_ i could not find any error
19:23 noob2 does the client log indicate all bricks are connected?
19:24 sb123_ i did gluster volume info
19:24 sb123_ that showed things are fine
19:26 sb123_ also cli.log is showing that the bricks are connected
19:26 sb123_ thanks. I really appreciate this
19:26 noob2 how did you mount the volume?  was it nfs or fuse?
19:27 sb123_ nfs
19:27 sb123_ the default way basically
19:27 itamarjp joined #gluster
19:28 sb123_ would that make a difference?
19:28 sb123_ also the ls hangs even if the bricks are all empty
19:30 jbrooks joined #gluster
19:30 noob2 check to make sure you mounted with nfs version 3
19:30 noob2 version 4 causes mayhem
19:31 sb123_ mount -t glusterfs rs0:/bvgluster2 /mnt/gluster/
19:31 sb123_ I used the above command to do the mount
19:31 JoeJulian ~pasteinfo | sb123_
19:31 glusterbot sb123_: Please paste the output of "gluster volume info" to http://fpaste.org or http://dpaste.org then paste the link that's generated here.
19:31 noob2 ok so you have the fuse mount type then
19:32 sb123_ yes. now I realize it
19:32 sb123_ sorry
19:32 noob2 no problem
19:32 sb123_ this is what mount shows
19:32 sb123_ rs0:/bvgluster2 on /mnt/gluster type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)
19:32 noob2 that looks good
19:32 sb123_ I thought it was nfs because the config file showed nfs
19:33 sb123_ should I give the last few lines the log files
19:34 sb123_ will that help?
19:35 noob2 certainly can't hurt
19:35 sb123_ 1 min
19:36 sb123_ [2012-11-30 20:19:00.141192] I [client-handshake.c:1445:client_setvolume_cbk] 0-bvgluster2-client-5: Server and Client lk-version numbers are not same, reopening the fds
19:36 glusterbot sb123_: This is normal behavior and can safely be ignored.
19:36 sb123_ [2012-11-30 20:19:00.141266] I [client-handshake.c:453:client_set_lk_version_cbk] 0-bvgluster2-client-6: Server lk version = 1
19:36 sb123_ [2012-11-30 20:19:00.141484] I [client-handshake.c:453:client_set_lk_version_cbk] 0-bvgluster2-client-5: Server lk version = 1
19:37 sb123_ [2012-11-30 20:19:00.141535] I [client-handshake.c:1433:client_setvolume_cbk] 0-bvgluster2-client-7: Connected to rs3:24018, attached to remote volume '/home/akshat/brick2'.
19:37 sb123_ [2012-11-30 20:19:00.141564] I [client-handshake.c:1445:client_setvolume_cbk] 0-bvgluster2-client-7: Server and Client lk-version numbers are not same, reopening the fds
19:37 glusterbot sb123_: This is normal behavior and can safely be ignored.
19:37 sb123_ [2012-11-30 20:19:00.146012] I [fuse-bridge.c:4191:fuse_graph_setup] 0-fuse: switched to graph 0
19:37 sb123_ [2012-11-30 20:19:00.146088] I [client-handshake.c:453:client_set_lk_version_cbk] 0-bvgluster2-client-7: Server lk version = 1
19:37 sb123_ [2012-11-30 20:19:00.146150] I [fuse-bridge.c:4091:fuse_thread_proc] 0-fuse: unmounting /mnt/gluster/
19:37 sb123_ [2012-11-30 20:19:00.146483] W [glusterfsd.c:831:cleanup_and_exit] (-->/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f7fba3f9cbd] (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a) [0x7f7fba6cce9a] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xd5) [0x7f7fbb1baab5]))) 0-: received signum (15), shutting down
19:37 sb123_ [2012-11-30 20:19:00.146518] I [fuse-bridge.c:4648:fini] 0-fuse: Unmounting '/mnt/gluster/'.
19:37 sb123_ that is from mnt-gluster-.log
19:38 sb123_ is there another log file that might help?
19:44 sb123_ can you see the log, as I can see a warning sign
19:45 JoeJulian Don't paste in IRC channels. Use a paste site please.
19:45 sb123_ will keep that in mind for the future
19:46 * JoeJulian saw that recently too... hmm...
19:46 JoeJulian Oh, right... I remember. I don't think it's the same though.
19:47 JoeJulian I had some errant libraries left behind from older distro releases.
19:47 sb123_ http://fpaste.org/JLhO/
19:47 glusterbot Title: Viewing Paste #256441 (at fpaste.org)
19:48 sb123_ this is the ouput of volume info
19:48 sb123_ the strange thing is that it works fine for 2 nodes but fails for 4 nodes
19:48 JoeJulian Ok, truncate the log at /var/log/glusterfs/mnt-gluster.log and try mounting again, then paste the log please.
19:48 JoeJulian nodes being servers? or clients?
19:48 sb123_ 1 minute
19:49 JoeJulian @glossary
19:49 glusterbot JoeJulian: A "server" hosts "bricks" (ie. server1:/foo) which belong to a "volume"  which is accessed from a "client"  . The "master" geosynchronizes a "volume" to a "slave" (ie. remote1:/data/foo).
19:51 sb123_ servers
19:51 sb123_ these are the ones which host the bricks
19:54 sb123_ http://fpaste.org/8A6m/
19:54 glusterbot Title: Viewing Paste #256442 (at fpaste.org)
19:54 wN joined #gluster
19:54 sb123_ that is the contents of mnt-gluster-.log
19:57 noob2 does gluster by default export all subdirectories over nfs?
19:58 noob2 nvm i got it :)
20:03 JoeJulian sb123_: hmm, that worked just fine... but if you try to do anything on /mnt/gluster it hangs?
20:03 sb123_ thanks. only ls hangs
20:03 sb123_ i can create files
20:03 sb123_ they also sync to other nodes
20:03 sb123_ i can read files
20:03 sb123_ delete files
20:04 sb123_ but I cant do ls
20:04 dalekurt joined #gluster
20:04 JoeJulian OH!
20:04 JoeJulian @ext4
20:04 glusterbot JoeJulian: Read about the ext4 problem at http://goo.gl/PEBQU
20:04 JoeJulian Should have jumped out at me right away, but I think my coffee just finally kicked in.
20:05 sb123_ 1 min. reading the article. will be back
20:06 sb123_ i am using 3.3.1
20:06 sb123_ it seems to be still occuring in it
20:06 JoeJulian yep
20:07 sb123_ also, if it works for 2 servers why does it fail on 4 servers
20:07 noob2 sb123: might want to consider xfs.  works fine for me
20:08 noob2 rhel6 has all the xfs speedup changes rolled in
20:08 sb123_ i am on ubuntu 12.10
20:09 JoeJulian That reminds me... I should probably file a bug against xfs. It did something wonky and unmounted itself on me a couple days ago.
20:09 glusterbot http://goo.gl/UUuCq
20:10 sb123_ thanks a lot
20:10 sb123_ I will try this with XFS
20:10 sb123_ have not used it before
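A typical xfs brick format for gluster at the time, with a larger inode size so the xattrs stay inline; the device name is an example:

    mkfs.xfs -i size=512 /dev/sdb1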
20:11 tqrst if I get a "Unable to self-heal permissions/ownership of '/some/folder' (possible split-brain). Please fix the file on all backend volumes", at least one of the bricks should have different permissions/ownership on /some/folder than the other bricks, right?
20:13 noob2 is there a general recommendation of how many io-threads to have based on how many drives you have in a server?
20:14 JoeJulian tqrst: Not necessarily. Check the xattrs.
20:15 JoeJulian noob2: Actually, I was going to thank you for that suggestion I credited you with earlier. When the xfs unmounted itself, the brick cleanly died and didn't restart because the directory for the brick was missing. If I hadn't taken your suggestion I would have filled up root.
20:17 noob2 :)
20:17 tqrst JoeJulian: http://www.fpaste.org/9cqF/ is 'getfattr -d -e hex -m .' on all bricks. gfid seems consistent throughout
20:17 glusterbot Title: Viewing Paste #256455 (at www.fpaste.org)
20:18 tqrst (and all permissions/owners are the same, which is why I was asking in the first place)
20:18 noob2 JoeJulian: at some point i'll run into that problem again when my first disk dies in the cluster
20:18 noob2 i'm glad you confirmed it works
20:19 JoeJulian trusted.afr.bigdata-client-28=0x000000000000004a00000000 shows pending operations required for trusted.afr.bigdata-client-28
20:19 JoeJulian trusted.afr.bigdata-client-30=0x000000000000004a00000000 shows those same pending operations for trusted.afr.bigdata-client-30
20:20 JoeJulian That may be split-brain if they're pointing at each other.
20:21 JoeJulian I also see pending for bigdata-client-{1,29,31}
20:23 JoeJulian In fact, since I bet that 30 and 31 are in the same replica set and each says there's pending ops for the other, that's the problem. setfattr -n trusted.afr.bigdata-client-30 -v 0x000000000000000000000000 on the second server in that fpaste and it looks like it should heal.
20:24 JoeJulian Ah, 0 and 1 are conflicting too. Do the same for trusted.afr.bigdata-client-1 where the value is 0x000000000000000100000001
20:25 tqrst trying
20:25 JoeJulian ... and 28 on that first entry..
20:25 JoeJulian Ok, I think that's all of them.
20:25 tqrst does -client-N correspond to the N in 'BrickN: ...' in volume info?
20:28 JoeJulian N-1
20:30 tqrst hrm. Pretty much every folder in that path has the same issue
20:31 JoeJulian find -type d -exec setfattr -n trusted.afr... -v 0x0... {} \; is the quick and dirty way to clear those.
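Spelled out with the values from this conversation (the brick path is an example; adjust the trusted.afr.<volume>-client-N name per brick and only clear attributes you have confirmed are stale):

    find /data/brick1 -type d -exec setfattr \
        -n trusted.afr.bigdata-client-30 -v 0x000000000000000000000000 {} \;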
20:44 JoeJulian mjrosenb: which distro are you running?
20:44 JoeJulian oh, nm... I remember...
20:45 AK6L joined #gluster
20:46 AK6L hi folks.  i'm going to be setting up a Gluster... cluster... in EC2.  does anyone have any advice re: sizing, whether to use one big or several small EBS volumes, etc.?
20:46 semiosis AK6L: i like several smaller ebs vols
20:46 AK6L it's going to be a 1TB filesystem, replicated (rather than distributed) with a replica count of 3
20:47 AK6L semiosis: OK i was leaning in that direction too but mostly because that's how i'd do it with physical disks.  i actually have no idea what EBS looks like underneath.
20:47 AK6L i've been doing 'bare metal' ops for nearly 20 years but EC2 is pretty new to me.
20:47 semiosis ebs is a mystery
20:47 AK6L hehe, ok
20:47 semiosis whats your workload/use-case?
20:49 AK6L i believe it's going to be mostly random reads.  many many gzipped text files containing genetic data.
20:52 semiosis how big do you expect the largest file to be?
20:53 AK6L determining that now.
20:55 AK6L looks like there's one 30GB file, a hundred or so 100MB files, but most are <10MB
20:55 AK6L maybe one or two 1-5GB files.
20:57 johnmark AK6L: interesting use case
20:58 johnmark AK6L: somebody at UC Irvine was also using the zip/gzip to store genetic data using GlusterFS
20:58 johnmark I think in their setup, they were unzipping on the fly whenever they needed to access data
20:59 AK6L yeah i think that's probably the case here too
20:59 johnmark AK6L: may I ask which organization this is for?
20:59 AK6L i'm a subcontractor so i'd rather not reveal the company's identity without asking them first
20:59 johnmark AK6L: no worries :)
20:59 johnmark I only ask because I've been getting a *lot* of interest from life sciences folks recently
20:59 AK6L cool
21:00 AK6L i have to say i'm impressed with how easy gluster was to set up.
21:03 pdurbin looks like i need to retract my "gluster slower than NFS" for kickstarts... http://irclog.perlgeek.de/gluster/2012-11-26#i_6185671
21:03 glusterbot <http://goo.gl/XM53d> (at irclog.perlgeek.de)
21:04 pdurbin <driver name='qemu' type='qcow2' cache='writeback'/> is speeding things up for our kvm disk images. writeback vs. none, that is
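That driver line sits inside a libvirt disk definition along these lines; the source file and target device are examples:

    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='writeback'/>
      <source file='/var/lib/libvirt/images/guest1.qcow2'/>
      <target dev='vda' bus='virtio'/>
    </disk>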
21:05 johnmark pdurbin: w00t
21:05 pdurbin heh. you can even see the speedup here: http://software.rc.fas.harvard.edu/ganglia2/ganglia2_storage/graph.php?r=hour&z=xlarge&c=gvm&m=network_report&s=by+name&mc=2&g=network_report
21:05 glusterbot <http://goo.gl/G1QuI> (at software.rc.fas.harvard.edu)
21:05 pdurbin (if you hurry and click) :)
21:05 johnmark pdurbin: are you using the new qemu integration? or just the released stuff
21:05 JoeJulian Since I don't care about the data on a vm image, I actually use writethrough
21:05 johnmark pdurbin: heh :)
21:06 pdurbin johnmark: buh. i don't know
21:06 pdurbin sounds like i have homework before you show up next week
21:06 johnmark pdurbin: qemu integration stuff == GlusterFS 3.4 and QEMU 1.3
21:06 johnmark heh
21:06 pdurbin JoeJulian: ack
21:07 pdurbin i wish i could say i don't care about data on VM images :)
21:07 johnmark pdurbin: neither of which are actually released yet, but are in git
21:07 pdurbin johnmark: i doubt we're using it then
21:07 pdurbin homework done! \o/
21:08 JoeJulian pdurbin: I mount gluster volumes within my vm images. So the image isn't really anything more than the distro packages.
21:08 pdurbin JoeJulian: hmm. seems like a good way to do things
21:10 pdurbin jdarcy: you around? i think 5b176fdb is the sha1 you had sent my co-worker but we're not sure where to look...
21:10 johnmark pdurbin: ^^^ this is the way I recommend whenever anyone asks
21:11 bryan_gs joined #gluster
21:11 pdurbin ^^^
21:16 pdurbin johnmark: you mean you don't sit next to jdarcy?
21:16 JoeJulian I think he was referring to my usage of vm images
21:17 pdurbin ah. interesting. ok
21:17 pdurbin who knows with johnmark ;)
21:17 JoeJulian hehe
21:18 johnmark haha!
21:18 johnmark :P
21:18 johnmark pdurbin: and yes, I was referring to JoeJulian's usage of vm images
21:18 pdurbin johnmark: but... you're ok with the vm images themselves being on gluster, right?
21:18 redsolar_office joined #gluster
21:18 JoeJulian Mine are
21:19 johnmark pdurbin: yes
21:19 pdurbin ok. so disk images on gluster that mount gluster volumes
21:19 pdurbin gluster gluster gluster
21:20 * pdurbin thinks he's in the right channel
21:20 JoeJulian Before the direct qemu integration, images were just not at all efficient to use through the fuse mount. Even with integration, I still like the flexibility of being able to destroy a vm image or spin up an extra vm and still have them use the same safe data.
21:21 pdurbin mmm, safe data
21:22 noob2 these vm images are just going to look like a file on the admin side of things right?
21:22 noob2 a giant file i'd guess
21:22 johnmark pdurbin: right, so VMs mounting virtual disks which are also on GlusterFS
21:22 johnmark noob2: right
21:22 noob2 ok.  LIO does a very similar thing where you can export a file as block storage
21:22 JoeJulian noob2: I do 6gig vm images.
21:22 noob2 awesome
21:22 noob2 i'm excited for this to land
21:22 johnmark noob2: also, the new block device translator is going to make all of this that much easier
21:23 pdurbin johnmark: ah. so the virtual disk, the one with the data, shows up as /dev/sdb or /dev/vdb or something
21:23 noob2 nice
21:24 noob2 might as well tell hitachi/emc to start packing their bags :D
21:24 pdurbin ("a" being the operating system, that might get destroyed)
21:24 johnmark noob2: lulz... it's a really exciting development
21:27 bryan_gs joined #gluster
21:43 tqrst "man, why is this taking so long?" "ls |wc" -> 2000064 oh.
21:44 JoeJulian hehe
21:44 saz_ joined #gluster
21:46 balunasj joined #gluster
21:49 robo joined #gluster
21:59 tmirks joined #gluster
22:02 dalekurt joined #gluster
22:08 balunasj joined #gluster
22:18 jiffe98 I am nfs exporting a gluster mount and I see 'nfsd: non-standard errno: -117' throughout the logs anyone run into this?
22:22 JoeJulian no idea on that one...
22:22 JoeJulian @hack
22:22 glusterbot JoeJulian: The Development Work Flow is at http://goo.gl/ynw7f
22:42 Bullardo joined #gluster
22:49 jiffe98 that errno sounds like it is supposed to be something from xfs but I don't have xfs anywhere in the picture
23:07 jiffe98 [2012-11-29 20:29:02.511333] W [fuse-bridge.c:513:fuse_attr_cbk] 0-glusterfs-fuse: 5274386: STAT() /piwik/piwik-1.7/tmp/sessions/sess_ta46hvsabutgs5qbgfv0kdo5f6 => -1 (Structure needs cleaning)
23:07 jiffe98 so I guess it is basically the same thing as the xfs error would report
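For what it's worth, errno 117 is EUCLEAN, which strerror() renders as exactly that message; a quick way to confirm from a shell (the perl one-liner is only an illustration):

    perl -e '$! = 117; print "$!\n"'    # prints: Structure needs cleaning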
23:18 noob2 joined #gluster
