IRC log for #gluster, 2013-08-30


All times shown according to UTC.

Time Nick Message
00:12 jporterfield joined #gluster
00:27 jporterfield joined #gluster
00:32 awheeler joined #gluster
00:43 jporterfield joined #gluster
00:53 asias joined #gluster
00:58 bala joined #gluster
01:16 kevein joined #gluster
01:20 awheeler joined #gluster
01:25 awheeler joined #gluster
01:30 awheeler joined #gluster
01:35 awheeler joined #gluster
01:40 awheeler joined #gluster
01:51 robo joined #gluster
01:51 harish_ joined #gluster
02:01 lpabon joined #gluster
02:06 jporterfield joined #gluster
02:27 hagarth joined #gluster
02:43 harish_ joined #gluster
02:52 jporterfield joined #gluster
02:54 saurabh joined #gluster
03:06 bharata-rao joined #gluster
03:06 jporterfield joined #gluster
03:19 shubhendu joined #gluster
03:24 StarBeast joined #gluster
03:27 shadow__ joined #gluster
03:28 shadow__ left #gluster
03:28 P0w3r3d joined #gluster
03:33 anands joined #gluster
03:48 jporterfield joined #gluster
03:57 awheeler joined #gluster
03:57 RameshN joined #gluster
04:07 jporterfield joined #gluster
04:12 ryant joined #gluster
04:14 ryant Does anyone know how to fix a split-brain directory:  I've got a 10 node cluster where 6 servers have one trusted.gfid for a directory and 4 have a different one
04:14 ryant I'm using replica 2 but each half of the replicated pair has the same gfid
04:18 dusmant joined #gluster
04:24 shylesh joined #gluster
04:26 glusterbot New news from resolvedglusterbugs: [Bug 957063] "remove-brick start" fails when removing a brick from replica volume <http://goo.gl/gtVG0>
04:28 ndarshan joined #gluster
04:29 shruti joined #gluster
04:30 nightwalk joined #gluster
04:30 satheesh joined #gluster
04:31 davinder joined #gluster
04:35 ppai joined #gluster
04:38 kanagaraj joined #gluster
04:39 harish_ joined #gluster
04:44 spandit joined #gluster
04:46 sgowda joined #gluster
04:47 itisravi joined #gluster
04:54 aravindavk joined #gluster
04:56 bala joined #gluster
04:59 StarBeast joined #gluster
05:08 satheesh1 joined #gluster
05:10 lalatenduM joined #gluster
05:11 lalatenduM joined #gluster
05:20 jporterfield joined #gluster
05:21 CheRi joined #gluster
05:26 vshankar joined #gluster
05:33 RedShift joined #gluster
05:40 rastar joined #gluster
05:41 raghu joined #gluster
05:48 ppai joined #gluster
05:53 jporterfield joined #gluster
05:56 sgowda joined #gluster
05:57 mohankumar__ joined #gluster
05:59 StarBeast joined #gluster
05:59 bulde joined #gluster
06:01 vpshastry joined #gluster
06:04 SunilVA joined #gluster
06:10 hagarth joined #gluster
06:13 guigui1 joined #gluster
06:16 rgustafs joined #gluster
06:20 jtux joined #gluster
06:21 JoeJulian ryant: I just had that myself. I just picked one and deleted the other. I didn't worry about the .glusterfs directory for those.
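A minimal sketch of the kind of check and cleanup JoeJulian describes: compare the directory's gfid on every brick, then drop the copies carrying the gfid you decide to discard and let self-heal rebuild them. The brick path and volume name below are placeholders.

    # run on each server against its own brick copy of the affected directory:
    getfattr -n trusted.gfid -e hex /export/brick1/path/to/dir

    # on the bricks whose gfid you decide to discard, remove that copy directly
    # on the brick, then let self-heal recreate it from the surviving replicas:
    rm -rf /export/brick1/path/to/dir
    gluster volume heal <volname> full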
06:21 asias joined #gluster
06:25 vimal joined #gluster
06:45 d-fence joined #gluster
06:49 sgowda joined #gluster
06:50 ricky-ticky joined #gluster
06:56 eseyman joined #gluster
07:00 StarBeast joined #gluster
07:07 bulde1 joined #gluster
07:08 ngoswami joined #gluster
07:09 ctria joined #gluster
07:11 hybrid5121 joined #gluster
07:15 RedShift from the quick start guide, do I need to do anything special to access the gluster volume using NFS?
07:24 RedShift I'm getting "requested nfs version or transport protocol is not supported" even though I specified to use TCP in the mount command
07:24 RedShift ifc
07:24 harish_ joined #gluster
07:29 ujjain joined #gluster
07:31 RedShift ah found out what it was, I installed the rpc services *after* gluster, needed to restart glusterd to have it register with rpcbind
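A hedged sketch of the sequence RedShift describes: gluster's built-in NFS server speaks NFSv3 over TCP and registers with rpcbind when glusterd starts, so rpcbind has to be up first. The server and volume names below are placeholders.

    # rpcbind must be running before glusterd starts (or restart glusterd afterwards):
    service rpcbind status
    service glusterd restart

    # confirm the gluster NFS service has registered with rpcbind:
    rpcinfo -p | grep -w nfs

    # mount the volume over NFS v3 / TCP:
    mount -t nfs -o vers=3,proto=tcp,nolock server1:/myvol /mnt/myvol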
07:32 psharma joined #gluster
07:54 asias joined #gluster
07:54 jtux joined #gluster
08:02 nshaikh joined #gluster
08:05 mooperd joined #gluster
08:22 StarBeast joined #gluster
08:23 meghanam joined #gluster
08:24 verdurin joined #gluster
08:25 bulde joined #gluster
08:26 jtux joined #gluster
08:28 spandit joined #gluster
08:38 bharata-rao joined #gluster
08:40 jra_____ joined #gluster
08:46 jtux joined #gluster
08:50 spider_fingers joined #gluster
08:52 meghanam_ joined #gluster
08:53 glusterbot New news from newglusterbugs: [Bug 1002907] fix changelog binary parser <http://goo.gl/UB57mL>
09:06 jmh_ joined #gluster
09:07 vpshastry1 joined #gluster
09:10 duerF joined #gluster
09:17 shubhendu joined #gluster
09:25 sgowda joined #gluster
09:25 msciciel_ joined #gluster
09:28 dusmant joined #gluster
09:28 bharata-rao joined #gluster
09:32 edward1 joined #gluster
09:33 13WAA3BEN joined #gluster
09:33 3JTAAAY8Q joined #gluster
09:35 nshaikh left #gluster
09:37 jhp joined #gluster
09:39 jhp Hi everyone. I have a usability question. I have the following problem to solve: I have X servers spread over at least 2 datacentres. Interconnectivity between the datacentres is redundant, but it is not guaranteed that all servers can see each other all the time.
09:39 manik1 joined #gluster
09:39 jhp I have a storage area that needs to be available on all servers (around 1TB), and updates need to be instantaneous or close to instantaneous.
09:40 eseyman joined #gluster
09:40 jhp Can I use gluster to sync a dataset between all the servers if I put a 1TB disk in all those servers?
09:41 aalmos joined #gluster
09:41 jhp For example in mirror mode? And what if 1 host dies or one whole datacentre dies? Does the other datacentre lose the storage, or can it live on with the locally available copies?
09:41 smellis joined #gluster
09:42 jhp And what if the failed datacentre returns?
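What jhp calls mirror mode usually maps to a distributed-replicated volume with one brick of each replica pair in each datacentre. A rough sketch with entirely hypothetical host and brick names; whether a surviving site keeps writing when the other disappears also depends on the quorum settings in use.

    # with "replica 2", consecutive bricks form a replica pair, so each pair
    # below spans the two datacentres:
    gluster volume create shared replica 2 transport tcp \
        dc1-srv1:/export/brick1 dc2-srv1:/export/brick1 \
        dc1-srv2:/export/brick1 dc2-srv2:/export/brick1
    gluster volume start shared

    # clients mount via the native client:
    mount -t glusterfs dc1-srv1:/shared /mnt/shared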
09:42 guigui joined #gluster
09:43 ahomolya joined #gluster
09:50 ahomolya hello there, I would like to ask a few general questions about striped volumes
09:50 ahomolya is there anyone who can help me?
09:54 glusterbot New news from newglusterbugs: [Bug 1002940] change in changelog-encoding <http://goo.gl/dmQAcW>
09:54 dusmant joined #gluster
10:02 manik1 joined #gluster
10:05 vimal joined #gluster
10:11 ndarshan joined #gluster
10:14 rjoseph joined #gluster
10:16 kanagaraj joined #gluster
10:21 vpshastry1 joined #gluster
10:22 RameshN joined #gluster
10:22 aravindavk joined #gluster
10:24 glusterbot New news from newglusterbugs: [Bug 1002945] Tracking an effort to convert the listed test cases to standard regression test format. <http://goo.gl/SMLUb1>
10:30 sgowda joined #gluster
10:34 ppai joined #gluster
10:37 pkoro joined #gluster
10:39 clag_ joined #gluster
10:41 RedShift is there a way to bind glusterd to a certain interface that it should use as source for every connection?
10:42 NuxRo RedShift: not afaik, but you can change the source address via "ip route" or use iptables SNAT
10:45 nshaikh joined #gluster
10:46 andreask joined #gluster
10:47 shruti joined #gluster
10:51 RedShift NuxRo how would you do that with ip route?
10:56 NuxRo RedShift: well, it's not very fine-grained; afaik you can change the default source address or the source address based on a route
10:57 RedShift fixed it using SNAT
10:57 RedShift well, fixed it... worked around it
10:57 RedShift "close enough"
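Roughly what the two workarounds NuxRo mentioned look like; the subnet, interface and addresses are made up for illustration.

    # Option 1: pin the preferred source address on the route towards the peers:
    ip route replace 10.0.0.0/24 dev eth1 src 10.0.0.11

    # Option 2: SNAT traffic headed for the peer so it leaves with the wanted
    # source address (10.0.0.12 is the hypothetical peer):
    iptables -t nat -A POSTROUTING -d 10.0.0.12 -j SNAT --to-source 10.0.0.11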
10:58 NuxRo cool
10:59 andreask joined #gluster
11:04 RedShift NuxRo I'm trying to create a redundant backend network
11:05 andreask joined #gluster
11:08 RedShift so far so good...
11:09 NuxRo good luck, gluster can be tricky in these cases
11:09 NuxRo what i do is run gluster peers over one network, and the clients mount volumes via nfs over another network
11:09 failshell joined #gluster
11:09 NuxRo works sort of okay-ish
11:10 RedShift yeah I'm trying something else, the backend network must be redundant as well, surviving soft failures (i.e. an ifconfig ethx down) too
11:10 RedShift right now I've set up a loopback interface on both hosts, and use dynamic routing over two interfaces to connect each other
11:11 RedShift the problem was gluster wasn't seeing the loopback IP's, but the regular source interface which can be used to reach the other peer
11:15 NuxRo for HA i use keepalived on all my gluster boxes and all access by the clients is done via this set of "keep alive" IPs, worked fine so far
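A minimal sketch of the keepalived setup NuxRo describes: a floating IP that clients mount against, with the second server running the same block as BACKUP with a lower priority. Every value here is hypothetical.

    # contents of /etc/keepalived/keepalived.conf on the first gluster server:
    vrrp_instance gluster_vip {
        state MASTER
        interface eth0
        virtual_router_id 51
        priority 100
        advert_int 1
        virtual_ipaddress {
            192.168.10.100
        }
    }

Clients would then mount through the floating address, e.g. mount -t nfs -o vers=3,proto=tcp,nolock 192.168.10.100:/myvol /mnt/myvol.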
11:16 RedShift hmm the peers don't seem to reconnect automatically even though they can reach each other
11:16 RedShift wait, spoke too soon
11:16 RedShift they show connected now
11:16 RedShift gotta find a way to shorten that time
11:18 NuxRo i think you want to shorten network.timeout (or smth similar), but doing so may cause other problems
11:18 NuxRo i think the default is 42 seconds or 60 seconds
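The option NuxRo is most likely thinking of is network.ping-timeout, which defaults to 42 seconds. A hedged sketch of checking and lowering it on RedShift's ds2 volume; lowering it too far can cause spurious disconnects under load.

    # show the options currently set on the volume:
    gluster volume info ds2

    # lower the ping timeout from its 42-second default (use with care):
    gluster volume set ds2 network.ping-timeout 15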
11:18 RedShift anyway, my setup appears to be working
11:18 NuxRo congrats
11:18 CheRi joined #gluster
11:18 RedShift yeah felt like something like that, I can shorten the routing downtime by modifying some timers there
11:18 RedShift failover now happens in 20 seconds which is slow
11:18 nshaikh joined #gluster
11:18 RedShift + gluster timeouts, yeah that'll take a while
11:19 NuxRo yep
11:19 RedShift the data buildup by then could be huge
11:19 NuxRo really?
11:19 NuxRo not good, also make sure to run heal on volumes that are affected by a server/brick going down
11:19 ppai joined #gluster
11:19 RedShift doesn't gluster auto-heal?
11:20 mooperd joined #gluster
11:20 RedShift volume heal ds2 info doesn't show anything weird and I can see the files being updated on both sides
11:21 RedShift I'll suspend the VM running on it and compare md5sums
11:22 meghanam joined #gluster
11:22 meghanam_ joined #gluster
11:30 NuxRo RedShift: gluster auto heals, but you may want to trigger a heal ASAP once the server is back
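The heal NuxRo suggests triggering once the server is back, sketched against RedShift's ds2 volume:

    # heal only the entries recorded in the self-heal index:
    gluster volume heal ds2

    # or crawl the whole volume (heavier, useful after a longer outage):
    gluster volume heal ds2 full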
11:32 RedShift does gluster do synchronous writes to peers? I mean, does it only return success on a write if the other peer has succesfully written the information too?
11:35 JuanBre joined #gluster
11:36 pkoro joined #gluster
11:41 NuxRo RedShift: i hope so, better ask on the mailing list so one of the devs can confirm
11:43 ppai joined #gluster
11:47 RedShift the md5sums match
11:47 RedShift so after the network failure, gluster recovered as it should
11:53 bulde joined #gluster
11:55 ndarshan joined #gluster
12:04 rastar joined #gluster
12:09 nightwalk joined #gluster
12:18 CheRi joined #gluster
12:18 ndarshan joined #gluster
12:18 manik joined #gluster
12:22 hagarth joined #gluster
12:26 RameshN joined #gluster
12:28 B21956 joined #gluster
12:32 vpshastry joined #gluster
12:44 harish_ joined #gluster
12:48 robo joined #gluster
12:54 davinder joined #gluster
12:55 mooperd joined #gluster
12:57 JuanBre joined #gluster
12:58 awheeler joined #gluster
12:59 plarsen joined #gluster
12:59 awheeler joined #gluster
13:07 mooperd joined #gluster
13:12 vpshastry left #gluster
13:19 jporterfield joined #gluster
13:19 karoshi joined #gluster
13:20 karoshi are there supposed to be (broken) symlinks (or symlinks at all) inside the .glusterfs directory of a replicated brick?
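For context on karoshi's question: inside a brick's .glusterfs tree, regular files are stored as gfid-named hard links while directories are stored as gfid-named symlinks into their parent's gfid path, so symlinks as such are expected; dangling ones usually mean something was removed directly on the brick. A hedged way to look, with a hypothetical brick path:

    # sample a few of the gfid symlinks (directories) on the brick:
    find /export/brick1/.glusterfs -type l | head

    # list only the symlinks whose target no longer exists:
    find /export/brick1/.glusterfs -type l ! -exec test -e {} \; -print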
13:23 rcheleguini joined #gluster
13:27 harish_ joined #gluster
13:29 MrRobotto joined #gluster
13:36 Guest94684 left #gluster
13:40 lpabon joined #gluster
13:42 kkeithley joined #gluster
13:43 kkeithley joined #gluster
13:45 bennyturns joined #gluster
13:47 foster1223 joined #gluster
13:48 foster1223 I have a performance question regarding a 4 node raid 5 xfs replicated-distributed cluster
13:49 foster1223 is there any documentation regarding what is the performance hit of using a replicated cluster?
13:50 ProT-0-TypE joined #gluster
13:50 foster1223 I would appreciate any help (docs or blogs) regarding performance
13:51 bugs_ joined #gluster
13:51 foster1223 I am getting close to 58MB/s on 4 10K disks in a raid 5 configuration
13:51 ProT-0-TypE joined #gluster
13:52 foster1223 but when I write to the disks locally I am receiving upwards of 800MB/s and I wanted to know where is that huge performance hit that I am seeing
13:53 foster1223 gluster 3.4.0 seems to add somewhat better reads than before (close to 20MB/s)
13:54 andreask native gluster client?
13:54 foster1223 yes the fuse client
13:54 andreask and how do you test?
13:54 foster1223 just random dd's of different sizes
13:54 kkeithley Network? How many replicas?
13:55 foster1223 1Gb network with 2 replicas
13:56 andreask hmm ... 2 replicas ofer 1Gb
13:56 foster1223 I have the following set performance.flush-behind: on
13:56 foster1223 performance.write-behind-window-size: 128MB
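For reference, the two settings foster1223 lists map to the following volume-set commands; the volume name is a placeholder.

    gluster volume set <volname> performance.flush-behind on
    gluster volume set <volname> performance.write-behind-window-size 128MB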
13:56 andreask over
13:57 foster1223 is that too many replicas for a 1Gb network?
13:57 andreask well, the client does the replication so it writes to two servers
13:58 ryant 58 MBps * 2 replicas * 8bits/byte = 928 Mbps ~ 1Gbps
13:58 ryant the math works out, you're network bound
13:58 foster1223 ohhhh ok..that makes more sense
13:58 andreask try with nfs mount and you should see a better result
13:58 ryant nfs won't change it
13:58 ryant that'll just change caching
13:59 andreask but client would not saturate its network
13:59 andreask at the half of the capacity
13:59 foster1223 so in order to get more performance to the disks I would need to add more networking?
13:59 kkeithley it'll change caching, but it will halve the number of writes made by the client. IOW replication will be done on the server
14:00 kkeithley s/it'll/using nfs will/
14:00 glusterbot kkeithley: Error: I couldn't find a message matching that criteria in my history of 1000 messages.
14:00 kkeithley glusterbot ftw
14:00 plarsen joined #gluster
14:04 kaptk2 joined #gluster
14:04 ryant testing on single files isn't necessarily a fair test.  If you were to write more files so that the writes were spread over your disks you'd also achieve higher aggregate throughput
14:05 foster1223 I created a script that created 70,000 files in that directory
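A rough sketch of the multi-stream test ryant is suggesting: several writers in parallel so throughput aggregates across disks and replica pairs. The mount point and sizes are made up, and conv=fsync forces a flush so the result isn't just page cache.

    for i in $(seq 1 8); do
        dd if=/dev/zero of=/mnt/glustervol/stream$i bs=1M count=512 conv=fsync &
    done
    wait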
14:12 robo joined #gluster
14:12 bulde joined #gluster
14:14 MrRobotto left #gluster
14:18 MrNaviPacho joined #gluster
14:21 marbu joined #gluster
14:22 foster1223 Thanks!
14:23 foster1223 This info really helps
14:32 RedShift can gluster replicate an existing folder?
14:45 lpabon joined #gluster
14:48 zaitcev joined #gluster
14:58 harish_ joined #gluster
15:17 sprachgenerator joined #gluster
15:24 ryant @RedShift:  I would have thought self-heal would replicate folders, but in my experience I end up with folders that exist on some bricks but not on others and self-heal doesn't take care of the replication.
15:25 ryant I'd like to understand why this happens.  In one instance I found that the trusted.gfid attributes on the directory differed between bricks.  Don't know why, but that seems to stymie self-heal
15:31 awheele__ joined #gluster
15:33 foster1223 what does your gluster volume info say
15:55 jcsp joined #gluster
16:04 MrNaviPacho joined #gluster
16:12 JoeJulian ryant: two ways for that to happen. 1) have a replica set unavailable to the client (down or netsplit). The client creates a directory. Later, (all) the other set(s) is(/are) unavailable. Client creates the same directory. Two different gfids from two different mkdirs.
16:14 JoeJulian 2) in a two brick replica 2 one brick is down. The client makes the directory. The second brick is brought down. The first brick is brought up. The client makes the same directory.
16:15 JoeJulian Those are the only scenarios I can think of to cause that.
16:21 LoudNoises joined #gluster
16:32 RedShift can the heal sweep be shortened?
16:33 RedShift or is there an easy way to tell if two nodes are fully synchronized?
16:33 duerF^ joined #gluster
16:42 zaitcev joined #gluster
16:42 JoeJulian unless you somehow were able to get into a situation where out-of-sync files were not making it into the index, "gluster volume heal $vol info" should tell you just that.
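Alongside the plain info listing JoeJulian quotes, 3.4-era releases also expose a few other views of the self-heal index, which help answer RedShift's "fully synchronized" question; ds2 is the volume name from earlier in the discussion.

    # empty output (0 entries per brick) means nothing is pending heal:
    gluster volume heal ds2 info

    # additional views available in 3.4-era releases:
    gluster volume heal ds2 info healed
    gluster volume heal ds2 info heal-failed
    gluster volume heal ds2 info split-brain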
16:44 jra_____ left #gluster
16:47 MrNaviPacho joined #gluster
16:56 jporterfield joined #gluster
17:00 jbrooks joined #gluster
17:03 glusterbot New news from resolvedglusterbugs: [Bug 842364] dd fails with "Invalid argument" error on the mount point while creating the file for first time on stripe with replicate volume <http://goo.gl/we3JJ>
17:04 RedShift how do I put a node that's been down back up again?
17:04 jbrooks joined #gluster
17:04 RedShift peer status says the node that was down is disconnected
17:08 RedShift am I supposed to use replace brick?
17:10 duerF joined #gluster
17:14 mbukatov joined #gluster
17:16 hagarth joined #gluster
17:18 RedShift hmm the nodes don't seem to be connecting
17:19 RedShift they can reach each other via the network, but when I do a telnet <otherhost> 24007, it times out
17:19 RedShift on <otherhost> itself, when I do a telnet localhost 24007, it accepts the connection
17:20 compbio make sure that the built-in firewall is off; e.g. service iptables status
17:20 compbio (that always bites me, at least)
17:21 RedShift yes
17:21 RedShift that's definitely not the problem :-)
17:21 ngoswami joined #gluster
17:23 JoeJulian localhost works, that's 127.0.0.1 on lo. What about the actual IP address of that server?
17:27 failshell joined #gluster
17:29 robo joined #gluster
17:29 MrNaviPacho joined #gluster
17:30 jbrooks joined #gluster
17:34 lalatenduM joined #gluster
17:38 RedShift uh never mind, my backend interface was blocking traffic on the switch
17:38 RedShift an old ACL was applied there
17:38 * RedShift should make a habit of resetting test equipment more often -_-
17:47 jbrooks joined #gluster
17:47 jporterfield joined #gluster
17:56 jbrooks joined #gluster
17:59 nueces joined #gluster
18:07 RedShift is there a syntax or option available to make the "gluster" tool connect to a different host?
18:09 NuxRo havent heard of such an option, gluster commands must be used on a peer and propagates thorough the setup ..
18:09 NuxRo *through
18:09 shylesh joined #gluster
18:14 B21956 joined #gluster
18:14 B21956 left #gluster
18:20 RedShift nuxro gluster --remote-host
18:20 NuxRo heh, nice find
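Roughly how the flag RedShift found is used, pointing the CLI at another host's glusterd; the host name is a placeholder.

    gluster --remote-host=otherhost peer status
    gluster --remote-host=otherhost volume info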
18:21 shylesh joined #gluster
18:25 JuanBre I am trying to activate quotas...but I get "Quota command failed"
18:26 JuanBre here is the cli.log http://pastebin.com/gWaDFpb4
18:26 glusterbot Please use http://fpaste.org or http://paste.ubuntu.com/ . pb has too many ads. Say @paste in channel for info about paste utils.
18:26 JuanBre http://paste.ubuntu.com/6045136/
18:26 glusterbot Title: Ubuntu Pastebin (at paste.ubuntu.com)
18:27 dewey joined #gluster
18:28 JuanBre and my glusterd.log http://paste.ubuntu.com/6045138/
18:28 glusterbot Title: Ubuntu Pastebin (at paste.ubuntu.com)
18:28 JuanBre gluster version: glusterfs 3.4.0alpha built on Feb 14 2013 17:44:47
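For reference, the commands behind activating quotas look like the sketch below; the volume name and directory path are placeholders, and on a 3.4.0alpha build the failure JuanBre pastes may simply be an alpha-era bug.

    gluster volume quota <volname> enable
    gluster volume quota <volname> limit-usage /projects 10GB
    gluster volume quota <volname> list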
18:28 dewey_ joined #gluster
18:58 jbrooks joined #gluster
19:03 bennyturns joined #gluster
19:22 jporterfield joined #gluster
19:31 jporterfield joined #gluster
19:37 plarsen joined #gluster
20:10 plarsen joined #gluster
20:32 plarsen joined #gluster
20:34 nueces joined #gluster
20:44 jporterfield joined #gluster
21:04 manik1 joined #gluster
21:05 manik1 left #gluster
21:33 andreask joined #gluster
21:49 rcheleguini joined #gluster
21:55 jporterfield joined #gluster
22:04 jporterfield joined #gluster
22:09 nueces joined #gluster
22:17 jporterfield joined #gluster
22:23 jporterfield joined #gluster
22:31 robo joined #gluster
22:59 tjstansell joined #gluster
23:02 jporterfield joined #gluster
23:03 tjstansell i have a replica pair where i've automated the re-joining of a cluster if we kickstart one of the nodes.  unfortunately, it occasionally gets stuck and both nodes are in the "Sent and Received peer request (Connected)" state, never getting to the "Peer in Cluster (Connected)" state.  as such, volume information is not being transferred to the newly built system and the automation fails.
23:04 tjstansell does anyone have any ideas on how to debug *why* it can't get any farther?
23:11 tjstansell in this case, admin01 got kickstarted.  admin02 seems to have been attempting to reconnect to it.  it appears to have reconnected between the time glusterd started and the automation script on admin01 tried to probe admin02.
23:12 tjstansell admin01 logged: 0-glusterd: Unable to find peerinfo for host: admin02.mgmt (24007) at 22:50:57.673091
23:16 awheeler joined #gluster
23:16 tjstansell hm... actually, just before that, admin02 logged: 0-glusterd: Received RJT from uuid:...
23:19 tjstansell so the node that's still up appears to be trying to reconnect to the down'd node.  if the node that's up happens to talk to the node that's being rebuilt before the rebuilt node has a chance to probe the node that's still up, the rebuilt node replies with a RJT ... causing things to get into this weird state where they eventually both are talking, but don't fully join the cluster and exchange volume data.
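A hedged sketch of how a stuck handshake like the one tjstansell describes is usually inspected, plus one common manual recovery, assuming an EL-style init and the host names from his report; glusterd keeps one state file per peer under /var/lib/glusterd/peers.

    # on both nodes, compare what glusterd thinks the peer state is:
    gluster peer status
    cat /var/lib/glusterd/peers/*

    # one manual recovery path, run on the freshly kickstarted admin01:
    service glusterd stop
    # ... clear /var/lib/glusterd except glusterd.info, so the node keeps its UUID ...
    service glusterd start
    # then re-probe from the healthy node (admin02):
    gluster peer probe admin01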
23:44 Guest53741 joined #gluster
23:44 basic` uhh.. hmm… how do i get a second brick back online?
23:44 basic` it's showing offline, but i can't find any command that will try and start it back up again
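A hedged answer sketch for basic`'s question: when a brick process shows Offline in volume status while the volume itself is Started, "gluster volume start <vol> force" respawns the missing brick daemons without disturbing the running ones. The volume name is a placeholder.

    # see which brick process is offline:
    gluster volume status <volname>

    # respawn any brick daemons that aren't running:
    gluster volume start <volname> force

    # then let self-heal catch up on whatever the brick missed while down:
    gluster volume heal <volname>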
23:45 StarBeast joined #gluster
23:45 jporterfield joined #gluster
23:51 ninkotech joined #gluster
