
IRC log for #gluster, 2012-10-11


All times shown according to UTC.

Time Nick Message
00:02 hagarth joined #gluster
00:32 tc00per @split-brain
00:32 glusterbot tc00per: (#1) learn how to cause split-brain here: http://goo.gl/nywzC, or (#2) To heal split-brain in 3.3, see http://joejulian.name/blog/fixing-split-brain-with-glusterfs-33/ .
00:36 tc00per Seems the chat archive at... http://www.gluster.org/interact/chat-archives/ is no longer 'live'. Is this known? If so, and there is no intent to restore it, perhaps a link to... http://irclog.perlgeek.de/gluster/ should be inserted somewhere?
00:36 glusterbot Title: Chat Archives | Gluster Community Website (at www.gluster.org)
01:11 neofob joined #gluster
01:12 neofob i see a bunch of open file descriptors from lsof while i'm doing rebalance, is it normal?
01:15 nightwalk joined #gluster
01:20 sensei_ joined #gluster
01:30 kevein joined #gluster
01:30 Qten left #gluster
01:50 aliguori joined #gluster
02:07 sunus joined #gluster
02:15 y4m4 joined #gluster
02:18 sunus joined #gluster
02:21 mohankumar joined #gluster
02:22 shireesh joined #gluster
02:32 atrius so.. i'm getting ready to set things up... here's what i have... two iscsi links from my storage server (standing in for the production SAN) which i've formatted as XFS and am about to mount at /mnt/iscsi.. which i'll then share/mount on to /var/lib/nova/instances on each machine... does this sound about right?
02:54 wushudoin joined #gluster
03:21 glusterbot joined #gluster
03:38 cattelan joined #gluster
03:49 Humble_afk joined #gluster
03:57 cattelan joined #gluster
04:01 shylesh joined #gluster
04:05 glusterbot joined #gluster
04:29 deepakcs joined #gluster
04:29 vpshastry joined #gluster
04:30 jays joined #gluster
04:51 quillo joined #gluster
04:58 faizan joined #gluster
04:58 faizan joined #gluster
05:14 hagarth joined #gluster
05:26 seanh-ansca joined #gluster
05:28 mohankumar joined #gluster
05:29 sripathi joined #gluster
05:44 sgowda joined #gluster
05:44 overclk joined #gluster
05:52 mohankumar joined #gluster
06:03 mdarade1 joined #gluster
06:03 mdarade1 left #gluster
06:05 ondergetekende joined #gluster
06:07 atrius huh... i hope i'm doing this wrong or otherwise this is really really slow on IO
06:09 ramkrsna joined #gluster
06:09 ramkrsna joined #gluster
06:11 samppah kkeithley: have there been some major changes between glusterfs-3.3.0-6 and glusterfs-3.3.0-11? 3.3.0-11 seems to be much faster in my use at least
06:11 ankit9 joined #gluster
06:13 zwu joined #gluster
06:17 mo joined #gluster
06:23 lkoranda joined #gluster
06:35 ngoswami joined #gluster
06:41 ctria joined #gluster
06:53 ekuric joined #gluster
06:54 stickyboy joined #gluster
06:57 tjikkun_work joined #gluster
07:02 vimal joined #gluster
07:03 hagarth joined #gluster
07:12 sgowda joined #gluster
07:19 ramkrsna joined #gluster
07:20 TheHaven joined #gluster
07:21 sac joined #gluster
07:26 andreask joined #gluster
07:36 Humble joined #gluster
07:36 Nr18 joined #gluster
07:41 dobber joined #gluster
07:42 ctria joined #gluster
07:44 hagarth joined #gluster
07:48 sgowda joined #gluster
07:49 JoeJulian samppah: Mostly just crash fixes or rpm dependency changes it looks like. rpm -q --changelog
07:50 samppah ah, must be something else then.. hmm
08:05 dobber joined #gluster
08:16 flowouffff hi guys
08:16 flowouffff i need help on a very weird problem
08:16 flowouffff :)
08:17 flowouffff about glusterFS and ipv6 and some inet address family
08:20 Humble joined #gluster
08:23 sripathi1 joined #gluster
08:27 flowouffff anyone ? :)
08:30 samppah i have no idea about ipv6 :(
08:33 bulde flowouffff: please ask the question, will see if anyone in here can answer that...
08:33 flowouffff my concern is the following:
08:34 flowouffff i've got ipv4 and ipv6 interfaces configured
08:34 flowouffff when i installed glusterFS
08:34 flowouffff i added the following option
08:35 flowouffff option transport.address-family inet to /etc/gluster/glusterd.vol
08:35 flowouffff otherwise i could not run gluster command
08:35 flowouffff it said "Connection failed. Please check if gluster daemon is operational."
08:35 bulde flowouffff: that may because we take 'inet/inet6' as the default
08:36 flowouffff i've tried it
08:36 flowouffff ah yes
08:36 bulde can you please post a bug, would like to track it
08:36 flowouffff but i forced to be inet
08:36 flowouffff can I MP u first ?
08:36 flowouffff before i post the bug
08:36 flowouffff just to make sure it is not due to a misconfiguration
08:37 bulde if its not a bug we close it with 'NOTABUG', but its ok to just file it... i guess our doc doesn't say much about IPv6 anyways
08:37 glusterbot New news from resolvedglusterbugs: [Bug 821192] QA builds: Requires: libcrypto.so.6()(64bit) on Fedora 16 <https://bugzilla.redhat.com/show_bug.cgi?id=821192> || [Bug 765421] rpms are non relocatable <https://bugzilla.redhat.com/show_bug.cgi?id=765421>
08:38 flowouffff ok i'll post it :)
08:39 bulde flowouffff: thanks
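The workaround flowouffff describes amounts to one extra option line in glusterd's own volfile. A minimal sketch, paraphrasing a stock 3.3 /etc/glusterfs/glusterd.vol rather than reproducing it exactly; only the address-family line is the addition:

    # /etc/glusterfs/glusterd.vol (sketch)
    volume management
        type mgmt/glusterd
        option working-directory /var/lib/glusterd
        option transport-type socket,rdma
        # force IPv4; the default inet/inet6 handling is what fails here
        option transport.address-family inet
    end-volume

    # glusterd has to be restarted to pick up the change
    service glusterd restart    # or /etc/init.d/glusterd restart, depending on distro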
08:40 Triade joined #gluster
08:42 bulde1 joined #gluster
08:42 ankit9 joined #gluster
08:44 vpshastry1 joined #gluster
08:45 ndevos flowouffff: I've seen that error when localhost was not being resolved to the right IP (i.e. ::1 instead of 127.0.0.1)
08:50 flowouffff ok let me check my /etc/hosts
08:51 flowouffff root@bo02:/WOO# host localhost
08:51 flowouffff localhost has address 127.0.0.1
08:53 flowouffff nothing's wrong with localhost resolution
08:56 flowouffff bug reported: https://bugzilla.redhat.com/show_bug.cgi?id=865327
08:56 glusterbot Bug 865327: unspecified, unspecified, ---, kparthas, NEW , glusterd keeps listening on ipv6 interfaces for volumes when using inet familly address
08:56 Humble joined #gluster
08:59 pkoro joined #gluster
09:04 shylesh joined #gluster
09:07 ngoswami joined #gluster
09:07 glusterbot New news from newglusterbugs: [Bug 859861] extras don't respect autotools ${docdir} variable <https://bugzilla.redhat.com/show_bug.cgi?id=859861> || [Bug 865327] glusterd keeps listening on ipv6 interfaces for volumes when using inet familly address <https://bugzilla.redhat.com/show_bug.cgi?id=865327>
09:09 manik joined #gluster
09:09 flowouffff I just commented the bug i reported
09:10 flowouffff it really seems that we got a bug
09:10 flowouffff check the latest post :)
09:13 badone_home joined #gluster
09:14 raghu joined #gluster
09:15 rosco left #gluster
09:15 stickyboy joined #gluster
09:21 hagarth joined #gluster
09:21 lng joined #gluster
09:22 lng Hi! How to unmount the volume? Google doesn't provide the info...
09:26 flowouffff umount ?
09:26 flowouffff or stop ?
09:27 lng umount doesn't work
09:27 flowouffff what does it output ?
09:27 lng umount: /storage: device is busy.
09:27 lng nothing should use it from this server
09:28 lng but it is used by some other servers
09:28 lng lsof /storage/ gives me: find    16433 nobody  cwd    DIR   0,19     8192 9259265867489333824 /storage/200000/200000/200700/200704/08
09:31 lng http://pastie.org/private/rttt7nqelo6zdgv4eiow
09:31 glusterbot Title: Private Paste - Pastie (at pastie.org)
09:31 lng strange output
09:33 Humble joined #gluster
09:38 Humble joined #gluster
09:38 duerF joined #gluster
09:46 sunus joined #gluster
09:46 flowouffff yes
09:46 flowouffff what about lsof output?
09:47 flowouffff no bash session opened on mountpoint?
09:47 lng http://serverfault.com/questions/437218/strange-processes-on-server-consume-cpu
09:47 glusterbot Title: linux - Strange processes on Server consume CPU - Server Fault (at serverfault.com)
09:47 lng I posted about it there ^
09:47 lng flowouffff: no bash session
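For readers hitting the same 'device is busy' wall as lng: the usual sequence is to find and stop whatever still holds the mount point, then unmount, falling back to a lazy unmount only as a last resort. A sketch, assuming the /storage mount point from the discussion and that killing the stray processes is acceptable:

    lsof /storage             # or: fuser -vm /storage, to see who holds it
    fuser -km /storage        # SIGKILL everything using the mount (destructive)
    umount /storage
    umount -l /storage        # last resort: detach now, clean up later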
09:49 puebele joined #gluster
09:59 sripathi joined #gluster
10:01 badone_home joined #gluster
10:06 TheHaven joined #gluster
10:08 glusterbot New news from resolvedglusterbugs: [Bug 765191] Crash in io-cache <https://bugzilla.redhat.com/show_bug.cgi?id=765191> || [Bug 764409] Service is down when programs are running <https://bugzilla.redhat.com/show_bug.cgi?id=764409> || [Bug 765283] Gluster getting crashed randomly <https://bugzilla.redhat.com/show_bug.cgi?id=765283>
10:19 eurower joined #gluster
10:21 Humble joined #gluster
10:22 eurower Hi all, I have created a glusterfs cluster with 2 tests servers. First, I have create a simple volume. Now, I would try to create a replica but I have the error for this command : gluster volume create fs-test1 replica 2 transport tcp 192.168.0.10:/nas10201_01/nastest1 192.168.0.20:/nas10202_01/nastest1
10:22 eurower Error : /nas10201_01/nastest1 or a prefix of it is already part of a volume
10:22 glusterbot eurower: To clear that error, follow the instructions at http://joejulian.name/blog/glusterfs-path-or-a-prefix-of-it-is-already-part-of-a-volume/
10:23 eurower Thanks glusterbot ;) I've done it, but still the error :s
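The fix behind glusterbot's link comes down to removing the glusterfs xattrs (and the .glusterfs directory) left on the brick path by the earlier volume; "or a prefix of it" means the parent directories can carry them too. A sketch using eurower's brick path, assuming nothing on it needs to be preserved:

    # on each server, against the path that was previously used as a brick
    setfattr -x trusted.glusterfs.volume-id /nas10201_01/nastest1
    setfattr -x trusted.gfid /nas10201_01/nastest1
    rm -rf /nas10201_01/nastest1/.glusterfs

    # check the parents as well; any leftover trusted.* attributes show up here
    getfattr -m . -d -e hex /nas10201_01

    service glusterd restart    # or the distro's init script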
10:31 vikumar joined #gluster
10:31 ngoswami_ joined #gluster
10:33 badone joined #gluster
10:36 rgustafs joined #gluster
10:38 glusterbot New news from resolvedglusterbugs: [Bug 797742] [glusterfs-3.3.0qa24]: glusterfs client crash due to stack overflow <https://bugzilla.redhat.com/show_bug.cgi?id=797742> || [Bug 824531] Polling errors in glusterd <https://bugzilla.redhat.com/show_bug.cgi?id=824531>
10:39 sunus hi, what's .vol file in example directory?
10:39 sunus i mean, what can we do with .vol files, it's a testcase?
10:50 hagarth joined #gluster
10:50 RNZ_ joined #gluster
10:53 RNZ joined #gluster
10:57 vikumar joined #gluster
10:58 ngoswami joined #gluster
11:03 bulde joined #gluster
11:07 kkeithley1 joined #gluster
11:08 glusterbot New news from resolvedglusterbugs: [Bug 799244] nfs-nlm:gnfs server does not allow kernel nfs mount <https://bugzilla.redhat.com/show_bug.cgi?id=799244>
11:16 mo joined #gluster
11:24 Humble joined #gluster
11:38 glusterbot New news from newglusterbugs: [Bug 862082] build cleanup <https://bugzilla.redhat.com/show_bug.cgi?id=862082>
11:38 glusterbot New news from resolvedglusterbugs: [Bug 765272] brick name duplication in /etc/glusterd store <https://bugzilla.redhat.com/show_bug.cgi?id=765272> || [Bug 765399] 1 gluster box locks up the filesystem due to a kernel errors <https://bugzilla.redhat.com/show_bug.cgi?id=765399> || [Bug 765501] File system can mount without error when half the files are missing <https://bugzilla.redhat.com/show_bug.cgi?id=765501> || [Bug 81
11:42 vimal joined #gluster
11:47 mweichert joined #gluster
11:48 nocturn joined #gluster
11:48 Psi-Jack Now, what does "GA" mean, in regards to GlusterFS's releases?
11:49 nocturn Hi all, we are running a gluster cluster for our dev environment.  we are seeing a lot of split brains in the self heal log and we cannot find a direct cause.  Running gluster 3.2 on SL6 (RHEL6 clone)
11:53 stickyboy Psi-Jack: Hmmm.  General Announcement?
11:53 Psi-Jack You're asking a question? ;)
11:54 stickyboy Psi-Jack: Taking a stab at it, that's all.
11:54 Psi-Jack hehe
11:54 Psi-Jack Basically trying to determine if GA is just a new release, or if it's a new stable release, I'm guessing new release, not yet considered fully stable.
11:55 stickyboy Ah, I dunno.
11:55 stickyboy "Gold mAster" :P
11:56 nocturn GA = General Availability AFAIK
11:56 Psi-Jack General Availability is what it generally stands for, but that by itself doesn't mean much. :;
11:57 Psi-Jack The question is, what does it mean in regards to GlusterFS's releases, specifically.
11:57 Psi-Jack Not what it stands for, :)
11:57 nocturn Psi-Jack: I think gluster uses alpha, beta, rc, ga
11:57 nocturn so GA is the final version
11:58 Psi-Jack I see. :)
12:02 mo joined #gluster
12:03 vikumar joined #gluster
12:10 Humble joined #gluster
12:25 shireesh joined #gluster
12:37 bfoster_ joined #gluster
12:37 jdarcy_ joined #gluster
12:38 kkeithley1 joined #gluster
12:39 jdarcy__ joined #gluster
12:42 bfoster joined #gluster
12:51 hagarth joined #gluster
12:52 adechiaro If I'm syncing files between two different gluster volumes (on different clusters) via rsync and use -X to preserve extended attributes, will that cause any issues with gluster?
12:53 bulde1 joined #gluster
12:58 * ndevos thinks thats an interesting question
12:59 stickyboy ndevos: Me too.  I almost answered, then decided I didn't know enough about glusterfs internals. :P
13:00 stickyboy I'd assume as long as you're not copying the bricks themselves... (and therefore not the .glusterfs dir)... you're ok.
13:00 ndevos my *guess* is that the trusted.* xattrs will get overwritten by rsync, but maybe the xlators contain code to prevent overwriting glusterfs xattrs
13:01 adechiaro that's what I was hoping as well
13:02 adechiaro I've been running the rsync for a few days now and it seems to be OK so far but figured I'd see if anyone else had experience with this
13:02 ndevos rsync a file -> glusterfsd creates new xattrs for the new file -> rsync sets xattrs from the source -> xattrs on the destination are 'wrong'
13:02 adechiaro ahh hmm..  I see your point
13:02 ndevos ^ happens, unless "rsync sets xattrs from the source" can not do its job for the glusterfs xattrs
13:05 adechiaro just to be safe I'm going to stop and restart the rsync without extended attrs
13:06 adechiaro thanks for the input guys
13:06 bulde1 joined #gluster
13:07 vpshastry1 left #gluster
13:09 Humble joined #gluster
13:10 stickyboy ndevos: Yeah, it would make sense to set new xattrs on the remote side...
13:10 stickyboy Could be hairy if it didn't...
13:12 ndevos adechiaro: if your files dont have any custom xattrs, I'd recommend to leave those out if the rsync and just sync to the remote glusterfs volume
13:13 adechiaro ndevos: yes thinking about it now I believe you're absolutely right
13:13 shireesh joined #gluster
13:13 rgustafs joined #gluster
13:14 ndevos adechiaro: well, I dont know if the glusterfsd processes allow setting the glusterfs xattrs by hand/rsync, or if that gets ignored
13:15 ndevos setting the xattrs with rsync may be fine if the glusterfsd processes just overrule them, but I dont know what actually happening
13:15 adechiaro yeah I think only digging through the source would fully answer that
13:15 ndevos yeah, or test on a glusterfs volume and try to change some of the trusted.... xattrs
13:16 adechiaro I already rsync'd about 15T's from one volume to another with xattrs, I'll let you guys know if I find any issues with it but I'm restarting it now without -X
13:17 adechiaro I'm also a bit curious if it would be possible to clone a volume by directly copying the contents of the bricks?  Obviously it's not fully supported
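A sketch of the safer rsync the thread converges on: copy client mount to client mount and either drop -X entirely or filter out the trusted.* namespace so glusterfs's internal xattrs are never written to the destination. The /mnt paths are placeholders, and the xattr filter rules assume an rsync new enough (3.x) to support them:

    # plain copy, no extended attributes at all
    rsync -a --progress /mnt/srcvol/ /mnt/dstvol/

    # if user xattrs must be preserved, exclude the glusterfs-internal ones
    rsync -aX --filter='-x trusted.glusterfs.*' --filter='-x trusted.gfid' \
          /mnt/srcvol/ /mnt/dstvol/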
13:17 eurower joined #gluster
13:20 eurower attr -r glusterfs.volume-id /nas10201_01 : attr_remove: Operation not supported. But I can add and remove some others attributes. Any idea ?
13:33 _Bryan_ I know there is a command for this.. but for the life of me my google skills are failing me....
13:34 _Bryan_ what is the command that tells a server to resync its configuration files from another known good server within the gluster setup
13:36 Nr18_ joined #gluster
13:40 adechiaro has anyone ever seen an issue where a brick goes to a disconnected state due to its glusterfsd process jumping up to 100% cpu?  I haven't seen anything on google and strace shows nothing.  I've tried restarting the glusterd service but the process remains, killing it just makes it a zombie (with a parent of init) and the port remains open so when glusterd starts up again it can't connect to the brick.
13:41 manik joined #gluster
13:41 rwheeler joined #gluster
13:42 adechiaro I've also read that init is supposed to clean up zombie processes by frequently calling wait() on its children so something isn't quite right here because these remain around until a reboot
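A quick way to confirm the state being described here, sketched with standard tools; 24009 and up is the default port range for glusterfsd bricks in 3.3:

    ps -eo pid,ppid,stat,pcpu,comm | grep glusterfsd   # 'Z' in STAT means zombie
    gluster volume status                              # which bricks are online, and on which ports
    netstat -tlnp | grep ':240'                        # what is still holding the brick port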
13:46 vikumar__ joined #gluster
13:46 ramkrsna joined #gluster
13:46 ramkrsna joined #gluster
13:46 ngoswami_ joined #gluster
13:47 kshlm shutd
13:58 Nr18 joined #gluster
13:58 johnmark hahaha
14:03 nueces joined #gluster
14:07 stopbit joined #gluster
14:09 semiosis :O
14:15 mweichert joined #gluster
14:19 semiosis _Bryan_: gluster volume sync <source> -- run on the target, which must not have any volumes configured iirc
14:19 semiosis file a bug
14:19 glusterbot https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS
14:19 semiosis that's for me
14:20 semiosis adechiaro: sounds sort of like bug 832609
14:20 glusterbot Bug https://bugzilla.redhat.com:443/show_bug.cgi?id=832609 urgent, high, ---, rabhat, ASSIGNED , Glusterfsd hangs if brick filesystem becomes unresponsive, causing all clients to lock up
14:20 semiosis though i'm not sure about the 100% cpu part
14:20 adechiaro semiosis: thanks, I'll take a look at it
14:22 nocturn left #gluster
14:22 adechiaro semiosis: this looks very similar but not quite the same symptoms.  I'll add my notes to the bug.  thanks again!
14:23 semiosis adechiaro: also re: rsync, are you doing that from brick to brick, brick to client, client to brick, or client to client?
14:23 semiosis sorry if i missed that
14:23 _Bryan_ semiosis: Thanks..that is what I was looking for
14:23 semiosis yw
14:24 adechiaro rsync is client to client.  clients are also servers in this case, not sure if that matters.
14:24 semiosis oh ok
14:24 semiosis i've heard, though not tried myself, tthat glusterfs client will not let you write the glusterfs xattrs to a file
14:24 semiosis idk what happens if you try though
14:25 adechiaro 2 instances of 12x2 distributed replicate volumes
14:25 semiosis so i'd guess adding -X to your client-client rsync would only help if you had your own xattrs to preserve
14:25 adechiaro that's what I was hoping was happening under the covers
14:34 sjoeboo I've got a 32x2 volume, 4 nodes, one of which filled its /. thats cleared up now, but glusterd will not restart now
14:34 sjoeboo failing like
14:35 sjoeboo [2012-10-11 10:32:00.117665] E [xlator.c:385:xlator_init] 0-management: Initialization of volume 'management' failed, review your volfile again
14:35 sjoeboo [2012-10-11 10:32:00.117680] E [graph.c:294:glusterfs_graph_init] 0-management: initializing translator failed
14:35 sjoeboo [2012-10-11 10:32:00.117691] E [graph.c:483:glusterfs_graph_activate] 0-graph: init failed
14:35 sjoeboo volatile looks good and identical to the other nodes in the cluster
14:36 sjoeboo volfile*
14:36 semiosis hi matt!
14:37 semiosis sjoeboo: is this a source compile/install?
14:37 sjoeboo nope
14:37 sjoeboo well, got the srpms at one point and rebuilt them, but thats all
14:38 sjoeboo should i try detaching this node from another?
14:38 semiosis well if glusterd isnt running that will be difficult
14:39 semiosis usually when glusterd can't start it's because it can't find /etc/glusterfs/glusterd.vol -- which is the never-changing volfile that loads the "management" xlator that is glusterd
14:39 sjoeboo yes, exactly, other nodes show it, rightly so, as disconnected
14:39 sjoeboo right
14:39 sjoeboo its there….
14:40 sjoeboo -rw-r--r-- 1 root root 273 Oct 11 10:31 glusterd.vol
14:40 sjoeboo [root@seqc01 glusterfs]# pwd
14:40 semiosis other things that prevent glusterd from running show up pretty obviously in the log
14:41 atrius so.. i've got gluster installed and have one VM setup on the store.... i presume i'm doing something wrong because inside the VM disk write speeds are dreadfully slow... on the order of 1M/s or so
14:42 sjoeboo lots of Unknown key: brick-0 :-(
14:43 sjoeboo where 0..64
14:43 TheHaven joined #gluster
14:44 semiosis atrius: how are you testing write speeds?
14:44 atrius semiosis: just a dd from /dev/zero with a bs=1M and count=500... the same dd has been running for nearly 10 minutes now
14:45 atrius and apt-get install is also just hanging or taking forever.. hard to tell which at the moment :D
14:45 rwheeler joined #gluster
14:46 semiosis atrius: you can see whats going on with the apt by looking at /var/log/dpkg.log iirc
14:46 semiosis which will be similar to console output
14:46 atrius it looks like it isn't hung... just dead slow
14:47 wushudoin joined #gluster
14:47 atrius it says it is reading the package lists... 0%
14:47 semiosis sjoeboo: could you pastie/fpaste the last 100 or so lines of your glusterd.log file please?
14:47 sjoeboo :-( that was the 10GB file i had to blow away back when things were happy to get / not @ 100%
14:48 atrius ah.. it finally dewedged... everything seems to be working better now
14:49 atrius let me take that back.. it is working.. but dog slow
14:49 semiosis atrius: please ,,(pasteinfo)
14:49 glusterbot atrius: Please paste the output of "gluster volume info" to http://fpaste.org or http://dpaste.org then paste the link that's generated here.
14:49 jbrooks joined #gluster
14:49 semiosis sjoeboo: ok i just want to see the whole log output of trying to start glusterd
14:50 semiosis the excerpts aren't painting a clear picture for me
14:50 sjoeboo i can start it manually with --debug and paste that over
14:50 atrius semiosis: http://www.fpaste.org/IdRz/
14:50 glusterbot Title: Viewing Paste #242525 (at www.fpaste.org)
14:52 semiosis atrius: what kind of network connects the servers hydra & tarvalon to the client?  and is that the same network used for the iscsi?
14:52 semiosis atrius: i suspect high latency is killing your performance :(
14:52 atrius semiosis: GigE (single links at the moment)
14:52 semiosis glusterfs & iscsi on same cable?
14:53 atrius semiosis: yeah.. i only have so many interfaces at the moment :(
14:53 atrius and i would expect degradation because of that.. but not this much... and also not just in the VM
14:53 atrius it seems the hosts can do their thing with no real problems
14:54 semiosis glusterfs replication is sensitive to latency, and that's more noticeable with lots of small ops
14:54 semiosis atrius: just a wild guess but maybe jumbo frames on the ethernet could help, if you're not using them already
14:55 semiosis sjoeboo: please do, it may help
14:55 sjoeboo semiosis: http://dpaste.org/dcSRF/
14:55 atrius semiosis: i'll try again to set them... the system was ignoring what i did before so presumably it wasn't correct.. :D
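Two concrete things worth trying from the exchange above, written out as a sketch; eth0 and the test file path are placeholders:

    # measure writes through to disk, not just into the page cache
    dd if=/dev/zero of=/path/on/gluster/test.bin bs=1M count=500 conv=fdatasync

    # jumbo frames on the storage interface (the switch must allow MTU 9000 too)
    ip link set dev eth0 mtu 9000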
14:55 glusterbot Title: dpaste.de: Snippet #210803 (at dpaste.org)
14:56 daMaestro joined #gluster
14:58 semiosis sjoeboo: maybe you should try moving /var/lib/glusterd/vols out of there, then try starting glusterd with no volume config, and syncing from another server
14:58 valqk joined #gluster
14:58 valqk hi everyone.
14:58 semiosis hello
14:58 glusterbot semiosis: Despite the fact that friendly greetings are nice, please ask your question. Carefully identify your problem in such a way that when a volunteer has a few minutes, they can offer you a potential solution. These are volunteers, so be patient. Answers may come in a few minutes, or may take hours. If you're still in the channel, someone will eventually offer an answer.
14:59 sjoeboo sure, how would i run that sync?
14:59 semiosis sjoeboo: oh and by the way, are there any glusterfsd (brick export) processes running on this machine?
15:00 semiosis sjoeboo: gluster volume sync <source> -- run that on the target (messed up) server, which requires no volumes present hence (re)moving /var/lib/glusterd/vols
15:00 sjoeboo no,  not gluster* prods at all
15:00 semiosis ok
15:00 sjoeboo okay
15:00 semiosis <source> being one of the other peers that's working on
15:00 semiosis ok
15:01 semiosis actually
15:01 semiosis lets double check the syntax for that, before you try
15:02 valqk I have a problem. I'm using glusterd 3.3.0-ppa1~lucid3 on debian squeeze and glusterfs 3.3.0-1 - 1 server - no replication and 20 clients. I use the glusterfs as nfs replacement because nfs has memory leaks. I'm having troubles with high iowait on client side. for example if a use launch a ff it is loading for more than a minute and client machine is showing hight iowait. can someone point me out where I'am wrong? I've created my volume like: volume create u
15:02 valqk servol server:/path/to/homes
15:02 valqk can someone tell me why I have high iowait on client side even if only one client is using the server
15:02 sjoeboo semiosis: same mgm vol errors
15:03 sjoeboo just starting up
15:03 mohankumar joined #gluster
15:03 balunasj joined #gluster
15:04 valqk *user , high iowait
15:04 valqk oh I'm using tcp for net layer
15:05 semiosis sjoeboo: just to be sure... does your /etc/glusterfs/glusterd.vol look like this http://pastie.org/5035694 ?
15:05 glusterbot Title: #5035694 - Pastie (at pastie.org)
15:06 semiosis (it should)
15:06 sjoeboo yes, exactly
15:07 semiosis ok, and yeah that volume sync syntax is correct
15:07 vikumar__ joined #gluster
15:07 ngoswami_ joined #gluster
15:07 sjoeboo yeah, i didn't get that far, glusterd still failing to start
15:07 ramkrsna joined #gluster
15:07 semiosis right
15:09 glusterbot New news from newglusterbugs: [Bug 865493] Reduce the number of times extended attributes are read/written from the file system <https://bugzilla.redhat.com/show_bug.cgi?id=865493>
15:11 valqk some ideas on my issue?
15:13 semiosis valqk: what kind of network connects client & server?
15:13 semiosis sjoeboo: same debug output even with no /var/lib/glusterd/vols ?
15:14 valqk semiosis, 100mbit
15:14 sjoeboo yeah
15:14 sjoeboo identical
15:14 valqk semiosis, the server is connected with 1gb to a switch and clients are connected to the switch with 100mbps
15:18 nueces joined #gluster
15:18 sjoeboo so, one thing, in the debug log i see it finding 2 of the 3 peers
15:18 sjoeboo and the one it isn't seeing is one which also had / fill but came away much happier
15:20 semiosis sjoeboo: ok maybe we should start this server with no config at all, an empty /var/lib/glusterd
15:20 semiosis oops
15:20 semiosis wait
15:20 sjoeboo ha, sure
15:20 semiosis keep the /var/lib/glusterd/glusterd.info
15:20 sjoeboo nothing else?
15:20 semiosis thats the uuid the other peers know it by
15:20 semiosis right
15:20 sjoeboo okay
15:20 semiosis then probe it from one of the good servers
15:21 sjoeboo sure
15:21 semiosis ...after starting glusterd successfully
15:21 semiosis :)
15:21 semiosis if it can't start with no config i'm lost
15:22 sjoeboo closer!
15:22 sjoeboo started
15:22 sjoeboo other node see:
15:22 sjoeboo Hostname: 10.242.105.75
15:22 sjoeboo Uuid: cf32813c-3af9-479d-b409-5b9c1d91b5f0
15:22 sjoeboo State: Peer Rejected (Connected)
15:23 semiosis peer rejected means that peer's volume config is out of sync
15:23 blendedbychris joined #gluster
15:23 blendedbychris joined #gluster
15:23 sjoeboo yep, so sync then?
15:23 semiosis try a peer status on the bad one first, to see if it can see all the other peers
15:24 semiosis if it can, then do the sync on it
15:24 sjoeboo negative, probe them?
15:24 semiosis try probing one then restarting glusterd
15:24 semiosis sometimes that works
15:24 sjoeboo that did it (probe 1 then restart)
15:24 semiosis awesome!
15:24 semiosis now sync
15:24 valqk semiosis, https://gist.github.com/f8f626f68573f78d54b1 that's a profile of the volume - twice opened a save ff session
15:24 TheHaven joined #gluster
15:24 glusterbot Title: valqk's gist: f8f626f68573f78d54b1 Gist (at gist.github.com)
15:25 sjoeboo cool, actually, now it shows as a peer
15:25 sjoeboo still resync
15:25 sjoeboo ?
15:25 semiosis sjoeboo: on the bad peer, does gluster volume info show what it should?
15:25 sjoeboo yeah, it does
15:25 semiosis and are your glusterd processes running?
15:26 semiosis s/glusterd/glusterfsd/
15:26 glusterbot What semiosis meant to say was: and are your glusterfsd processes running?
15:26 sjoeboo yep!
15:26 semiosis then i think you're all set
15:26 semiosis i've seen it do the sync automatically before... not sure exactly when it does and when it doesnt
15:26 semiosis but a manual sync command is the solution in the latter case
15:27 sjoeboo sure
15:29 semiosis btw, we basically just did the same-hostname server ,,(replace) procedure
15:29 glusterbot Useful links for replacing a failed server... if replacement server has different hostname: http://community.gluster.org/q/a-replica-node-has-failed-completely-and-must-be-replaced-with-new-empty-hardware-how-do-i-add-the-new-hardware-and-bricks-back-into-the-replica-pair-and-begin-the-healing-process/ ... or if replacement server has same hostname:
15:29 glusterbot http://www.gluster.org/community/documentation/index.php/Gluster_3.2:_Brick_Restoration_-_Replace_Crashed_Server
15:29 semiosis ^^^
15:29 semiosis without the data repair/self-heal
15:30 sjoeboo awesome, yeah, i was going to ask if next step was to basically pretend this failed outright and add it as new
15:30 sjoeboo wondering if i need to heal to make sure everything is all balanced/replicated..
15:30 semiosis yeah a heal would be needed if clients have been working with the other peers while this one was down
15:31 semiosis probably a good idea
15:34 semiosis sjoeboo: though you're using 3.3.0 so maybe the self-heal daemon will kick in and take care of that
15:34 semiosis i'm still on 3.1 so no experience with that
15:34 semiosis also maybe you should upgrade to the GA release since it looks like you're still on QA
15:35 valqk semiosis, can this be cause because I've set tcp layer explicitly?
15:35 semiosis also i've heard that 3.3.1 will be out real soon
15:35 semiosis valqk: tcp is your only option if you're using ethernet
15:35 valqk hmmm ok but what can cause the slowdown then
15:36 valqk the server is not iowaiting
15:36 semiosis valqk: i question your motivation for using glusterfs though... it's not meant to be a drop-in replacement for nfs, so if that's what you're expecting you may be disappointed :(
15:36 semiosis glusterfs servers do support nfs clients though
15:36 valqk semiosis, didn't find any other decent replacement for homes
15:36 semiosis maybe that would be better for you than using the native FUSE client
15:37 valqk semiosis, so I can mount with -t nfs to test
15:37 semiosis i've not run homedirs out of glusterfs, others have though, maybe they will be more helpful than I
15:37 semiosis ~nfs | valqk
15:37 glusterbot valqk: To mount via nfs, most distros require the options, tcp,vers=3 -- Also portmapper should be running on the server, and the kernel nfs server (nfsd) should be disabled
15:39 sjoeboo semiosis: yeah, as part of this i might rev to the GA, and will trigger a self-heal just to make double extra sure
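Condensing the recovery that semiosis and sjoeboo just walked through into one place. A sketch only: it assumes the same-hostname case from the ,,(replace) links above, a 3.3-style /var/lib/glusterd layout, and a healthy peer reachable as goodserver (a placeholder):

    service glusterd stop
    # keep glusterd.info (the UUID the other peers know this node by), drop the rest
    mv /var/lib/glusterd /var/lib/glusterd.bak
    mkdir /var/lib/glusterd
    cp /var/lib/glusterd.bak/glusterd.info /var/lib/glusterd/
    service glusterd start
    gluster peer probe goodserver
    service glusterd restart

    # if the volume config does not come across on its own
    gluster volume sync goodserver all

    # once the bricks are back up, kick off a full self-heal (3.3+)
    gluster volume heal <volname> full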
15:41 ndevos oh, wow: glusterfs-client >= 2.0.1 is needed by (installed) libvirt-daemon-0.9.13-3.fc18.x86_64
15:43 * ndevos frowns, thats in a vm...
15:44 sjoeboo hm: gluster> volume status all
15:44 sjoeboo operation failed
15:51 sjoeboo hm, yeah, volume operations are failing
15:52 semiosis sjoeboo: does peer status on all servers show all other servers as peer in cluster connected?
15:53 semiosis i.e. no one is disconnected or rejected
15:53 sjoeboo yeah
15:53 sjoeboo all show connected
15:54 semiosis what does the glusterd log file show?
15:54 semiosis should be some info there
15:57 sjoeboo all different, failed node, which most clients are trying to hit, showing clients disconnecting (i need to go stop them), another has 0-seqc_gluster_vol-replicate-26: Stopping crawl as < 2 children are up, another failed self-heal attempts logged, and the fourth nothing really
15:58 seanh-ansca joined #gluster
16:00 valqk did someone used this: https://github.com/jdarcy/negative-lookup
16:00 glusterbot Title: jdarcy/negative-lookup · GitHub (at github.com)
16:03 Nr18 joined #gluster
16:04 chandank|work joined #gluster
16:13 neofob joined #gluster
16:14 sjoeboo ah, okay, from that failed node earlier, bricks seem offline
16:17 sjoeboo trying to search docs on bringing those online…any ideas? all i find the the mgmt console doc ..
16:21 semiosis sjoeboo: if the volume is "started" then glusterd will try to launch the glusterfsd processes when it starts up
16:21 sjoeboo yeah, i stopped/started the volume just now, seems happier
16:21 semiosis i've heard you can also do a volume start --force
16:22 semiosis cool
16:22 valqk doh... can't get nfs mounted
16:22 semiosis brick logs would show why the glusterfsd process(es) died
16:22 semiosis if they're dying
16:22 semiosis valqk: did you follow ,,(nfs) instructions?  that should work
16:22 glusterbot valqk: To mount via nfs, most distros require the options, tcp,vers=3 -- Also portmapper should be running on the server, and the kernel nfs server (nfsd) should be disabled
16:22 semiosis also make sure iptables allows the necessary ,,(ports)
16:22 glusterbot glusterd's management port is 24007/tcp and 24008/tcp if you use rdma. Bricks (glusterfsd) use 24009 & up. (Deleted volumes do not reset this counter.) Additionally it will listen on 38465-38467/tcp for nfs, also 38468 for NLM since 3.3.0. NFS also depends on rpcbind/portmap on port 111.
16:23 valqk semiosis, sure - I have the portmap running and set the tcp and vers=3
16:23 valqk mount simply hangs
16:23 valqk and times out
16:25 valqk semiosis, I have no filtering on the internal interface
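For reference, the NFS mount and the port checks glusterbot's factoids describe, as a sketch with server and volume names as placeholders:

    # client side: glusterfs NFS only speaks NFSv3 over TCP
    mount -t nfs -o tcp,vers=3 server:/volname /mnt/volname

    # server side: rpcbind/portmap must answer on 111, the kernel nfsd must be off,
    # and 38465-38467/tcp (plus 24007/tcp and the brick ports) must be reachable
    rpcinfo -p server
    iptables -L -n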
16:26 Mo___ joined #gluster
16:36 bulde1 joined #gluster
16:41 sjoeboo watching logs for the self heal
16:41 sjoeboo -1 (No such file or directory)
16:41 valqk hmm.. server and clients are sending packets from sunrpc and nfs...
16:41 sjoeboo lots of that
16:41 sjoeboo how worried should i be?
16:42 JoeJulian "[09:21] <semiosis> i've heard you can also do a volume start --force" no --, just "volume start $volname force"
16:44 ondergetekende joined #gluster
16:44 JoeJulian scrollback... tl;dr.
16:48 ramkrsna_ joined #gluster
16:51 Nr18 joined #gluster
16:52 ramkrsna__ joined #gluster
16:54 semiosis JoeJulian: thx for that
16:54 ramkrsna joined #gluster
16:54 ramkrsna joined #gluster
16:55 semiosis syntax check
16:57 ramkrsna_ joined #gluster
16:58 y4m4 joined #gluster
17:07 neofob so i had a question yesterday but probably it was too late (evening on east coast, america)
17:07 TheHaven joined #gluster
17:08 neofob i notice that when i do rebalancing, there are over 10K open files when i do lsof
17:08 neofob is that normal?
17:11 jbrooks_ joined #gluster
17:14 sjoeboo aside from staring at logs, any easy way to see the status of a heal operation ?
17:23 johnmark sjoeboo: isn't there a 'profile' subcommand somewhere?
17:23 manik joined #gluster
17:23 sjoeboo hmm
17:23 * johnmark looks for Dustin's "glusterfs for sysadmins" preso
17:23 sjoeboo ah, neato
17:24 edward1 joined #gluster
17:24 jbrooks joined #gluster
17:25 adechiaro joined #gluster
17:28 Nr18 joined #gluster
17:31 bulde1 :O
17:31 johnmark :O
17:32 johnmark sjoeboo: after you start the rebalance, see what happens when you stick "status" at the end of the rebalance command, as opposed to "start"
17:32 johnmark see https://video.linux.com/videos/demystifying-gluster-glusterfs-for-sysadmins/
17:32 glusterbot Title: Demystifying Gluster - GlusterFS For SysAdmins | The Linux Foundation Video Site (at video.linux.com)
17:34 johnmark sjoeboo: also http://linuxfoundation.ubicast.tv/medias/videos/2012-08-31_19-07-18_487556/images/capture_1346187212_962024.jpg
17:34 sjoeboo cool, its still in the heal (sequencing data = lots of tiny files. lots.)
17:34 sjoeboo awesome
17:34 johnmark sjoeboo: that last link gets into the volume profile info
17:34 johnmark sjoeboo: hrm, someone should really write a blog post about that :/
17:35 pdurbin joined #gluster
17:35 sjoeboo oh, i wouldn't be suppressed if someone we both knew had one soon enough
17:35 pdurbin sjoeboo: i grabbed lunch and missed out on all the gluster fun :(
17:36 johnmark sjoeboo: oooooh
17:36 sjoeboo oh, it started before lunch, but it was heads down fix shit time
17:36 ev0ldave joined #gluster
17:36 sjoeboo now = lunch, whew
17:36 johnmark sjoeboo: suppressed? or surprised? :)
17:36 sjoeboo surprised, haha.
17:36 johnmark heh heh
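For anyone following along, the status commands johnmark is pointing at, sketched against a 3.3 volume; volname is a placeholder:

    gluster volume rebalance volname status
    gluster volume heal volname info        # also: info healed / info heal-failed / info split-brain
    gluster volume profile volname start
    gluster volume profile volname info     # per-brick latency and fop counts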
17:38 ev0ldave has anyone seen issues of gluster saying a brick is not connected when trying to do a heal but the status shows it as online?
17:39 ev0ldave restarted the glusterd service but heal still shows it as disconnected, logs are a bit cryptic
17:39 adechiaro ev0ldave: how are you getting the status?
17:39 ev0ldave gluster volume status glustervmstore
17:40 adechiaro yes I've seen that
17:40 adechiaro in fact I'm dealing with that problem right now
17:40 adechiaro https://bugzilla.redhat.com/show_bug.cgi?id=832609
17:40 glusterbot Bug 832609: urgent, high, ---, rabhat, ASSIGNED , Glusterfsd hangs if brick filesystem becomes unresponsive, causing all clients to lock up
17:40 adechiaro check your glusterfsd processes
17:40 ev0ldave adechiaro:  but when running gluster volume heal glustervmstore info, it shows the brick as offline
17:40 adechiaro yes
17:41 ev0ldave adechiaro:  they are all there, glusterfsd, glusterfs and glusterd
17:41 adechiaro ev0ldave: is one of them a zombie process or consuming 100% of a cpu?
17:43 ev0ldave adechiaro: as a matter of fact, yes, glusterfsd
17:43 adechiaro ev0ldave: bingo…  looks like you have the same issue I'm seeing as well
17:44 ev0ldave i restarted it yesterday but still a no go
17:44 faizan joined #gluster
17:44 dshea I think I have an issue with the glusterd, but I wanted to run by you guys the configuration and what testing we were doing before I report anything as a bug, to make sure I have not done something silly.
17:44 ev0ldave running it for openstack instance store but it seems it is unable to handle the load without losing a brick every few days
17:46 adechiaro ev0ldave: on my end what I'm seeing is that if you restart the service the glusterfsd process for that brick remains running and not shut down.  when the service comes back up, it can't connect to that port because the process is still running.  killing the process manually just turns it into a zombie and from what I've seen only a reboot clears it up. :-/
17:46 ev0ldave fuuuu
17:46 borei left #gluster
17:47 adechiaro ev0ldave: if you could comment on that bugzilla case I'm sure it would be helpful for the rest of the developers
17:47 ev0ldave sure, on it
17:47 adechiaro ty
17:49 dshea I currently have a 10 node cluster configured as a 5x2 distributed replicated volume called test-volume.  Each machine has (2) 2TB drives, with each physical drive as a brick, so the total available disk is 19TB.  I started a few test programs on a single client machine to write files to the cluster.  They essentially copy our test data (randomly generated files) 30k per directory and copy the directory in an infinite loop. once the copy operation
17:49 dshea completes, the entire directory has a .complete appended to the end of it via a mv operation.  At the same time I have a process looping through the *.complete directories and running md5sum --check on the files.  I have been getting read errors in my logs which leads me to believe maybe the i/o coming from this single client is the issue?
17:49 ev0ldave site seems to be unresponsive, i'll keep an eye on it and get it logged
17:50 adechiaro thanks again, much appreciated
17:50 dshea '9998: FAILED open or read' is the error I receive on a file named 9998
17:54 dshea This file is definitely there and accessible.  Although I get intermittently the same error while the client is under heavy i/o loading when I check the files interactively as well.
17:57 dshea We also noticed that in a directory with 30k files it takes 5-10 minutes for ls to return on a client that is not loaded and on a directory where no other i/o operation occurs.
17:57 noob2 joined #gluster
17:58 hattenator joined #gluster
18:10 noob2 does anyone else use dstat to monitor their gluster?
18:11 noob2 if you do dstat --disk-util i noticed something interesting.  gluster seems to roll through the bricks in each pair one by one.  i see 99.9% disk usage and then it'll stop and move into the next drive.  is that the auto heal daemon?
18:15 dshea left #gluster
18:19 davdunc joined #gluster
18:45 davdunc joined #gluster
18:45 davdunc joined #gluster
18:48 Technicool joined #gluster
18:48 TheHaven joined #gluster
18:52 aliguori joined #gluster
19:07 xymox joined #gluster
19:13 semiosis @latest
19:13 glusterbot semiosis: The latest version is available at http://goo.gl/TI8hM and http://goo.gl/8OTin See also @yum repo or @yum3.3 repo or @ppa repo
19:13 semiosis @qa releases
19:13 glusterbot semiosis: The QA releases are available at http://bits.gluster.com/pub/gluster/glusterfs/ -- RPMs in the version folders and source archives for all versions under src/
19:19 bennyturns joined #gluster
19:30 bennyturns joined #gluster
19:42 rwheeler joined #gluster
19:51 andreask joined #gluster
19:59 tryggvil joined #gluster
19:59 y4m4 joined #gluster
19:59 Bullardo joined #gluster
20:03 y4m4 joined #gluster
20:07 blendedbychris joined #gluster
20:07 blendedbychris joined #gluster
20:08 bennyturns joined #gluster
20:14 Daxxial_ joined #gluster
20:24 nightwalk joined #gluster
20:25 elyograg am I right in thinking that running a 'du' on a directory or entire gluster volume ends up calling stat on everything, triggering the self-heal checks?  Is there an alternate program that works more efficiently on gluster?
20:27 hattenator If you're talking about the entire volume, df should work.  If you want a subdirectory, du on the brick's directory is probably your best bet.
20:27 elyograg hattenator: and if you've got a 48x2 cluster on four servers?
20:27 elyograg er, 24x2
20:27 hattenator note that the files in brick/.glusterfs are almost all hardlinks, meaning du will see them as taking up lots of space, but in reality they take up about 512 bytes each.
20:28 hattenator the striping is 24-bricks-wide?
20:29 hattenator df should still be accurate
20:29 elyograg not striped.  distribute+replicate.  I don't have this built yet, but when it does get built, each server will have 12 drive bays, each disk will be a brick.
20:29 hattenator yeah, meant distributed when I said striped
20:30 elyograg two problems with df: 1) it doesn't do directories.  2) if you have multiple volumes using the same bricks, all volumes get the same 'used' output from df, even if one of them only has a little bit of space used.
20:31 elyograg s/bricks/brick filesystems/
20:31 glusterbot What elyograg meant to say was: two problems with df: 1) it doesn't do directories.  2) if you have multiple volumes using the same brick filesystems, all volumes get the same 'used' output from df, even if one of them only has a little bit of space used.
20:32 hattenator you're saying the gluster mount's df used column comes directly from the brick's df used column?
20:32 elyograg if there's a way to write a du clone that uses an alternate form of stat (one that doesn't trigger self-heal), that would be perfect.
20:33 elyograg hattenator: I think it must.  I built a couple of volumes on my test system, each using subdirectories of the same filesystems, and the df output for both of the client mounts was the same.
20:34 hattenator I think the self-heal stuff is built into some low level syscalls like getdirents and open(dir), so I don't think you can get around that.  But df uses the vfs data so it's instant.
20:35 hattenator I'd expect the total would be the same.  If I recall, it reports the total/free from the smallest/fullest replicated brick set.
20:38 hattenator oh, wow, you're right.  the df report is pretty bad
20:39 elyograg we'll use du even if it's nasty slow, just looking for a way to get around the inherent problem.
20:39 hattenator I have a brick on /usr/local/glusterStorage.  /usr/local is a filesystem.  If I put files in /usr/local/testfile, the df on the fuse mount goes up.
20:41 hattenator With distribute, you can't do much with the bricks.
20:41 hattenator Maybe quotas, though.  I haven't looked into that
20:41 hattenator If you set a quota on the directories you're interested, even if you set it to the maximum volume size, it should need to keep track of used for that directory.
20:43 hattenator I also wonder whether using distribute on a single machine instead of LVM or something is actually a good idea.
20:44 elyograg i wonder if you can set up a quota on a directory, then set up quotas on subdirectories in that directory, and get the individual as well as combined usage.
20:45 hattenator I haven't used gluster quotas, but that's how quotas usually work
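A sketch of the directory-quota idea being kicked around, with placeholder volume and directory names; in 3.3 the quota list output reports per-directory usage alongside the limit, which is the part elyograg is after:

    gluster volume quota volname enable
    gluster volume quota volname limit-usage /providers 100TB
    gluster volume quota volname limit-usage /providers/feed-a 10TB
    gluster volume quota volname list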
20:47 hattenator Are you expecting to write multiple large files simultaneously often?
20:48 hattenator I'm worried about your distribute plan.  LVM would massively outperform it in most work patterns, I think.
20:49 hattenator small files, small writes, etc.
20:49 badone joined #gluster
20:50 Bullardo joined #gluster
20:52 elyograg hattenuator i don't think there will be large files, but that might depend on what you mean by large files.  We are looking to replace over 100TB of production SAN storage with gluster.  Our SAN hardware (and also the OEM) has proven itself to be unreliable.  replacing it with better SAN storage is going to be prohibitively expensive.  The system consists of about 76 million objects, most of which are jpg photos.  A few hundred thousand of them are
20:55 xymox joined #gluster
20:55 elyograg we are going to start off with two servers each with a handful of 3TB drives.  I hope to switch to 4TB drives shortly after that and then add servers in pairs when the first pair fills up.
20:57 dbruhn elyograg, is there a reason you are choosing to not use hardware raid on the bricks, lvm, or md?
20:57 elyograg The hardware will be Dell R720xd, 12 3.5" bays and an internal pair of 2.5" drives for the OS.
20:58 hattenator yeah, that's what I was getting at.  I don't have any tests or anything, but to my knowledge distribute is mostly aimed at cross-server distribution.  I suspect using the PERC card to setup a RAID volume would massively outperform gluster-distribute on a single server.
21:00 dbruhn My understanding is the same, you end up offloading all of your multi disk processing to the CPU, instead of letting a better raid subsystem handle it.
21:00 hattenator A 12-disk RAID0 would scare the crap out of anyone in my organization, and a 12-disk RAID5 is a little wider than most people would suggest striping, so personally I'd go with a raid-50 volume
21:00 elyograg dbruhn: Losing one drive of disk space to a RAID5 volume (or two to raid6) is something we hate to do.  That would be 3TB to 8TB per server depending on drive size and raid level.  also, raid5/6 has terrible performance on sustained writes.  from a gluster standpoint, rebuilding a 3TB or 4TB brick will be much faster than rebuilding a brick that's at least 33TB in size.
21:01 dbruhn I am assuming infiniband and RDMA?
21:02 hattenator That makes some sense, but if you lose one disk in two servers, you now either have massive filesystem corruption or a completely unusable filesystem.  That's a big risk, depending on how much these jpegs are worth.
21:02 elyograg also, we are not going to buy these servers fully populated.  although adding disks and expanding a raid volume is possible, it's a messy and extremely slow procedure that kills I/O performance while it's happening.  With individual disks, I can add one disk to each server pair and make the volume bigger.
21:03 dbruhn I see where you are coming from, I guess what hattenator and I are both getting at is you are potentially creating a latency nightmare for the cost of quickly rebuilding a brick once in a while
21:04 elyograg infiniband ... i wish.  penny-pinching dictates that it'll be gigabit to start out, and that pretty much means that 10Gb is the only upgrade path.
21:05 dbruhn Without RDMA you are going to worsen your latency situation, just a warning.
21:05 dbruhn not trying to deter you, just making sure you are aware before you get into it.
21:06 dbruhn When I talk about latency I am talking about all of those things that make du super slow
21:06 elyograg They are already running scared from the costs.  I know pretty much zero about infiniband ... can you tell me what it would cost for the required infrastructure, or at least some mfg/model numbers so I can look it up?
21:07 dbruhn You can get a 36 port QDR infiniband switch for about 7600 from mellenox.
21:08 elyograg 36 port would certainly be enough for the first few phases on my rollout.
21:08 dbruhn yep
21:09 dbruhn You can get smaller but mellenox is the name in infiniband
21:09 elyograg would a two-port HBA be enough, plugging one into each switch for redundancy?
21:09 elyograg and what would you recommend for those?
21:09 dbruhn I am using a single port HBA and replication
21:10 dbruhn distributed and replicated
21:10 elyograg the reason I'd want a two-port would be so that if a switch failed, nothing would go down.
21:11 dbruhn I have never dealt with multipathing with infiniband, so I am the wrong person to ask on that one.
21:11 dbruhn I wonder if you could run one switch per replication set
21:11 dbruhn so if you lost a switch you would still be operational
21:12 dbruhn granted that sounds like a split brain nightmare if both sides stay operational
21:12 duerF joined #gluster
21:12 dbruhn not sure if that could happen or not
21:13 elyograg I see the MTS3600 when I search.  is that what you were thinking?  Would it be at all possible to start with gigabit and then switch to rdma?
21:15 dbruhn http://www.mellanox.com/related-docs/prod_ib_switch_systems/IS5030.pdf
21:15 dbruhn I have no idea about switching, someone else would have to answer. My assumption is that it would be a semi forklift job to change the transport on a volume. But I could be wrong
21:17 dbruhn I think if you check the archive logs, there were folks discussing changing transport earlier today even
21:17 balunasj joined #gluster
21:18 elyograg I am already planning to use a subdirectory on each filesystem for the bricks.  Perhaps I could just set up the tcp volume to begin with, then when we can buy some IB hardware, create the rdma volume using another dir, then migrate all the data over to the new volume.
21:19 dbruhn You would have to set up each volume and migrate the data, which in my mind means you would need 2x the storage to facilitate the migration
21:19 * jdarcy o_O
21:20 jdarcy cp $src $dst && rm $src
21:20 elyograg My hope would be to be migrating to rdma well before we were finished migrating off the SANs.
21:22 jdarcy When deriving costs for IB (as with 10GbE) it's important to include cables and SFPs (or equivalent) as well as switches and NICs.
21:23 elyograg ok, so I can see that IS5030 switch for prices at or less than the $7600 you mentioned.  That puts it in a potentially better light than the 10GbE hardware I have looked at.  Either way, as you just said, I would need the misc stuff.  am I right in thinking that this is four times as fast as 10GbE?
21:23 elyograg plus lower latency?
21:24 jdarcy QDR IB is ~3x the bandwidth of 10GbE, and probably compares even better for latency.
21:24 badone_home joined #gluster
21:25 elyograg the servers that are going to natively mount this will also need IB cards and switchports, right?  I could do NFS mounts over gigabit, though.
21:25 jdarcy IB uses 10B/8B encoding, so it's really 32Gb/s counting the way Ethernet does.
21:25 dbruhn http://www.mellanox.com/content/pages.php?pg=infiniband_cards_overview&menu_section=41
21:25 glusterbot Title: Products: Mellanox Technologies (at www.mellanox.com)
21:26 dbruhn here is the qdr card I am using
21:28 dbruhn Here is a good article on gluster and transport
21:28 dbruhn http://www.unlocksmith.org/2009/11/infiniband-10gige-and-glusterfs.html
21:28 glusterbot Title: Unlocksmith: Infiniband, 10GigE and GlusterFS (at www.unlocksmith.org)
21:28 elyograg When I saw that picture of the card, it looked massive, but then the brochure shows that they are half-height cards, which means they're actually quite small.
21:29 elyograg for some of the client systems, I might need PCI-X, but I would assume that something is available.
21:30 dbruhn You don't need infiniband for the clients
21:30 elyograg how does that work?
21:30 dbruhn the infiniband is only for the back end between the cluster machines
21:30 dbruhn and Gbe for the client to the cluster
21:31 dbruhn I am assuming you are using this as a file server from the jpg comments earlier
21:33 elyograg in a way, yes.  the SAN stuff that we have is accessed by a pair of Solaris x86 servers that provide redundant NFS access.  The filesystems are all ZFS.  We did this back when Sun was giving away Solaris.  Now that Oracle is in charge, we can't afford to continue growing this.
21:34 dbruhn You can't compare gluster to a san it's two totally different deals. A SAN is a block level storage device, and Gluster is built around NAS technology.
21:35 dbruhn essentially gluster allows you to create the same NFS access that your SAN and Sun boxes are today.
21:35 lkoranda joined #gluster
21:35 elyograg I know.  For our current setup, the NFS heads are the bottleneck.
21:35 tc00per|lunch Interesting discussion @elyograg and @dbruhn... I'm working on the 1GigE/10GigE/IB rationale for our systems as well. Any problem with setting up both rdma/tcp transports on volumes when creating them using TCP with 1GigE/10GigE now and adding the IB/RDMA capability later?
21:36 dbruhn what kind of bottlenecks are you experiencing on those NFS heads with san's behind them?
21:37 dbruhn tc00per, I am not sure on the capabilities there.
21:37 elyograg from what I understand, one major headache is the hundreds of NFS automounts.  We have an automount entry for each of our information providers.  Some of the providers are so big that each of the individual feeds from them are their own mounts.
21:37 dbruhn so are you going to replace NFS as your access methodology?
21:39 elyograg initially, I think we'll probably have to continue with NFS.  I hope to quickly migrate to a native mount and put symlinks in the original directory structure to point things at a subdir in the native mount.  When we entirely eliminate the SAN, I think we can probably put a single symlink pointing the root of the structure at a native mount directory.
21:41 elyograg what I *want* to do, but can't get the backing for, is put something like Isilon in.  We currently have snapshots and backups, we won't have them when we use gluster.
21:42 dbruhn I have used isilon, it's a nice system super easy to manage, similar architecture to gluster
21:42 dbruhn also way more expensive
21:42 dbruhn you can have backups with gluster if you use geo sync
21:43 elyograg dbruhn: and buy the required hardware.  it always comes down to money, and I can never get ahold of any.
21:43 dbruhn and you could probably use LVM and create snapshots if you really wanted to script it all
21:46 hattenator Does glusterbot listen to me?
21:46 hattenator @ext4
21:46 glusterbot hattenator: Read about the ext4 problem at http://joejulian.name/blog/glusterfs-bit-by-ext4-structure-change/
21:46 hattenator hurray
21:47 hattenator just needed that kernel version, 2.6.32-268.el6
21:49 elyograg I am considering going with Fedora so btrfs is new enough to use for the bricks.  we could then do brick-level snapshots.  recovering data from oopses would not be 100% straightforward, but it would be possible.
21:50 tc00per|lunch Looks like the dependencies on 'old' libs didn't go away with the 3.3.1-1 rpms on 'bits.gluster.com' (compared to the qa release). The release version still depends on compat-readline5 and openssl098e. I don't know much about readline and according to RHN openssl098e is outdated. The 3.3.0 versions did NOT depend on these. Is that because kkeithleys build/repo changed the dependencies?
21:54 iknowfoobar joined #gluster
21:56 copec_ joined #gluster
21:59 JoeJulian tc00per|lunch: Possible, or just because he knows more about writing specs.
21:59 copec_ left #gluster
22:01 JoeJulian tc00per|lunch: I've seen Kaleb's commits for the new new version so I expect his repo will have it as soon as he's satisfied with the build.
22:03 copec joined #gluster
22:08 chicharron joined #gluster
22:14 chicharron left #gluster
22:19 aliguori joined #gluster
22:20 Nr18 joined #gluster
22:22 jbrooks_ joined #gluster
22:33 jbrooks joined #gluster
22:37 rferris joined #gluster
22:41 glusterbot New news from newglusterbugs: [Bug 865619] Adding "large" amounts of metadata to a container or object, then deleting it, can strand metadata keys causing unnecessary reads of those keys <https://bugzilla.redhat.com/show_bug.cgi?id=865619>
22:41 rferris left #gluster
22:45 xymox joined #gluster
22:55 nightwalk joined #gluster
23:04 sensei joined #gluster
23:39 a2 joined #gluster
