
IRC log for #gluster, 2013-05-09


All times are shown in UTC.

Time Nick Message
00:29 yinyin joined #gluster
00:36 yinyin joined #gluster
00:44 Supermathie a2_: No
00:44 Supermathie a2_: I'm about to reset everything to a particular state, truncate the logs, set to debug, capture traffic, and fire it up.
00:48 hflai joined #gluster
00:52 Supermathie toddstansell, JoeJulian: Part of my mandate will be testing gluster on zfs. On FreeBSD, not this FUSE shit. :)
00:53 JoeJulian hehe
00:54 jhon38 joined #gluster
00:54 Supermathie I've realllly taken a liking to FreeBSD lately.
00:54 JoeJulian Theoretically, you could take zfs-fuse and turn it into a storage translator...
00:55 JoeJulian I used freebsd about 5 or 6 years ago. hated it with a passion.
00:55 Supermathie kind of like the block backend?
00:55 JoeJulian right
00:55 * Supermathie strokes neckbeard...
00:56 jhon38 hello
00:56 glusterbot jhon38: Despite the fact that friendly greetings are nice, please ask your question. Carefully identify your problem in such a way that when a volunteer has a few minutes, they can offer you a potential solution. These are volunteers, so be patient. Answers may come in a few minutes, or may take hours. If you're still in the channel, someone will eventually offer an answer.
00:57 JoeJulian I should change "few minutes" to seconds... back when I wrote that, it was just me. :D
00:57 Supermathie Meh, keep expectations low. :)
00:57 JoeJulian Or maybe it was just me and samppah
00:58 jhon38 Am I on the right channel to ask a question about a problem installing gluster?
00:58 JoeJulian yep
00:59 JoeJulian But you only get three free questions before owing somebody a beer. That was your first one. ;)
00:59 Supermathie jhon38: Nuke it from orbit, it's the only way to be sure.
01:00 jhon38 Thanks.  I am building a test setup in my VPC.  I have 2 nodes with 3.3.1 installed. There are no firewalls and I can ping each server and telnet to port 24007.  When I attempt to do the peer probe I get a hang for about 60 seconds and the mystical error 107.
01:00 jhon38 Super - my motto for most things
01:01 JoeJulian Have you checked the logs for clues?
01:02 jhon38 I've gone over google and traced down everything I can find.  I have tried the install on 3 diff distributions (RH, Amazon's own AMI and Ubuntu), same problem
01:02 jhon38 yes - here are the logs
01:02 jhon38 [2013-05-08 20:40:42.298130] I [glusterd-handler.c:685:glusterd_handle_cli_probe] 0-glusterd: Received CLI probe req ip-10-100-1-62.ap-northeast-1.compute.internal 24007
01:02 jhon38 [2013-05-08 20:40:42.303567] I [glusterd-handler.c:428:glusterd_friend_find] 0-glusterd: Unable to find hostname: ip-10-100-1-62.ap-northeast-1.compute.internal
01:02 jhon38 [2013-05-08 20:40:42.303599] I [glusterd-handler.c:2245:glusterd_probe_begin] 0-glusterd: Unable to find peerinfo for host: ip-10-100-1-62.ap-northeast-1.compute.internal (24007)
01:02 jhon38 [2013-05-08 20:40:42.304023] I [rpc-clnt.c:968:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
01:02 jhon38 [2013-05-08 20:40:42.344650] I [glusterd-handler.c:2227:glusterd_friend_add] 0-management: connect returned 0
01:02 jhon38 [2013-05-08 20:41:45.344191] E [socket.c:1715:socket_connect_finish] 0-management: connection to  failed (Connection timed out)
01:02 Supermathie jhon38: can you put those on pastie/pastebin/fpaste/whatever? Much easier to read
01:03 jhon38 https://gist.github.com/hljadmin/628a468788bae32826ca
01:03 JoeJulian Unable to find hostname: ip-10-100-1-62.ap-northeast-1.compute.internal
01:03 glusterbot <http://goo.gl/txCMB> (at gist.github.com)
01:03 JoeJulian looks like you might have a problem with hostname resolution
01:03 jhon38 i have my /etc/hosts setup and can do nslookup of each node
01:04 Supermathie jhon38: paste your /etc/hosts
01:04 jhon38 i can also telnet and ping from each node t the other
01:05 jhon38 https://gist.github.com/hljadmin/1b7e30e5a58ea32ce42d
01:05 glusterbot <http://goo.gl/8Wnu2> (at gist.github.com)
01:06 Supermathie jhon38: You're on a /16? Ugh, can you just put 'server1' and 'server2' in /etc/hosts?
01:06 Supermathie for now?
01:07 Supermathie First rule of sysadmin: It's a DNS problem.
01:07 Supermathie Second rule of sysadmin: See rule 1.
01:07 jhon38 yep
01:08 JoeJulian "connection to %s failed" where %s is blank makes me wonder, though I have been led astray by that kind of thing before.
01:08 jhon38 i can resolve the hostname with no problem by every other means
01:09 Supermathie jhon38: Try putting server1 and server2 into /etc/hosts on both servers (as the FIRST entry in the respective line) and then try 'peer probe server2' from server1
01:11 JoeJulian I might also look at a tcpdump. The fact that you can telnet by hostname, but that the probe is failing due to a timeout, makes me wonder where it's trying to connect to.
01:11 JoeJulian And, the obvious question, glusterd is running on both servers?
01:12 Supermathie Zeroth rule: Is it plugged in? And turned on? :)
01:12 jhon38 i tried a tcpdump and they are connecting
01:13 JoeJulian "Have you tried turning it off and on again? Is it plugged in?" - Roy Trenneman
01:13 JoeJulian Check the other log
01:14 jhon38 testing the hosts change now
01:14 JoeJulian Oh, I should have looked at that hosts file earlier. Add 127.0.0.1 localhost to it.
01:15 lkthomas hmm
01:15 lkthomas on 3.3.1, gluster supports linux AIO, would it improve performance ?
01:16 jhon38 hosts file has localhost in place
01:16 jhon38 now i am getting a DNS resolution error
01:17 Supermathie lkthomas: In what situation? async writes already get acked to the client before they hit the FS layer (right?)
01:17 jhon38 https://gist.github.com/hljadmin/faa7563725b500666900
01:17 glusterbot <http://goo.gl/GnHpD> (at gist.github.com)
01:17 lkthomas Supermathie: I am not sure, I am using gluster on local disk to disk replica
01:17 Supermathie jhon38: paste both /etc/hosts as well
01:18 lkthomas speed is extremely slow, 12 hours and only 62GB copied
01:18 jhon38 server2 - https://gist.github.com/hljadmin/f191cbe63bf3d9e52140
01:18 Supermathie lkthomas: MOAR DETAILS
01:18 glusterbot <http://goo.gl/S1t8B> (at gist.github.com)
01:18 jhon38 server1 - https://gist.github.com/hljadmin/ddf272826a5bae3acb51
01:18 glusterbot <http://goo.gl/aiVHy> (at gist.github.com)
01:18 Supermathie wow... glusterfs/nfs now using 1250% CPU...
01:19 lkthomas Supermathie: I am running local disk to disk replica 2 setup and mount locally, then, I move file from one HDD into the gluster volume using rsync and let it run overnight
01:19 Supermathie mount locally with..... fuse?
01:19 lkthomas fuse.glusterfsfuse.glusterfs
01:19 lkthomas fuse.glusterfs
01:20 fidevo joined #gluster
01:20 Supermathie Hey, on that note, I've found that I can't NFS mount gluster mounts to the same machine as Linux kills gluster's lockd when it loads the nfs and lockd modules. I'm guessing that just isn't supported...?
01:24 lkthomas how could I know if replica is running async or sync mode ?
01:24 Supermathie lkthomas: sync replica is a 'replicate' volume, async is geo-replication
01:24 lkthomas I see
01:24 lkthomas so it has to be sync mode
01:29 kevein joined #gluster
01:30 lkthomas I am wondering what I could tune to make gluster run faster
01:30 JoeJulian get infiniband and go rdma
01:30 lkthomas LOL, I am running local disk to disk replica, IB deployment make no sense
01:31 jhon38 joe - gluster is running on both servers, both have been totally rebooted.  same problem
01:31 Supermathie jhon38: You did the hosts file incorrectly
01:32 bala joined #gluster
01:32 Supermathie I meant like this: "10.100.1.62 server1"
01:32 jhon38 ohh...
01:32 JoeJulian lkthomas: Actually even an rdma virtual interface would help, assuming there is such a thing and it works, in order to avoid context switching.
01:33 JoeJulian And they're turning out the lights and pushing me out the door. guess I'd better go home.
01:33 lkthomas LOL
01:33 Supermathie And I specifically said 'first entry' so that you would put in: "10.100.1.62 server1 ip-10-whatever-big-stupid-long-amazon-hostname" since the first entry for an ip in /etc/hosts controls reverse resolution.
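A minimal sketch of the /etc/hosts layout being described here, assuming the two nodes are 10.100.1.62 and 10.100.1.131 (the addresses that appear in the pasted logs); the long Amazon names are illustrative and would be whatever `hostname` reports on each box:

    127.0.0.1      localhost
    10.100.1.62    server1  ip-10-100-1-62.ap-northeast-1.compute.internal
    10.100.1.131   server2  ip-10-100-1-131.ap-northeast-1.compute.internal

With the short names first on each line, reverse lookups return server1/server2, and the probe can be run as `gluster peer probe server2` from server1.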
01:33 lkthomas JoeJulian: any docs showing me how to build rdma virtual interface ?
01:34 jhon38 sorry - change made and trying it now
01:37 Supermathie a2_: OK! I now have a 2GB nfs.log in trace mode, a core file and a 2GB .pcap that ought to illustrate the problem.
01:38 montyz joined #gluster
01:38 lkthomas something I don't understand, glusterfs always starts with replica, that means we always lose half of the disk space no matter what
01:38 duerF joined #gluster
01:40 Supermathie lkthomas: eh? No, you told it to build a replica when you configured the volume.
01:42 jbrooks joined #gluster
01:44 theron joined #gluster
01:45 portante|ltp joined #gluster
01:45 jhon38 supermathie:  edited hosts file - https://gist.github.com/hljadmin/d4783c81e811a7447ef2
01:45 glusterbot <http://goo.gl/aAg8O> (at gist.github.com)
01:54 Supermathie jhon38: better?
01:54 jhon38 same problem
01:54 jhon38 restarted both nodes
01:55 Supermathie don't bother restarting
01:55 Supermathie this ain't windows
01:55 jhon38 i know - should say i restarted the service on both nodes
01:55 Supermathie did you ever paste output of 'peer probe serverx' on both nodes?
01:56 jhon38 i did earlier - will redo now
01:56 jhon38 https://gist.github.com/hljadmin/f3501b0295ad1a0f2dff
01:56 glusterbot <http://goo.gl/4Nlt5> (at gist.github.com)
01:58 hagarth joined #gluster
01:59 Supermathie I mean the CLI command & output
02:00 Supermathie this is weird.
02:00 jhon38 thats all the log i have
02:00 Supermathie Try blowing away gluster state files
02:00 Supermathie well how did you probe?
02:00 jhon38 gluster peer probe server2
02:01 jhon38 gluster peer probe server1
02:02 jhon38 this is the log from 1 to 2: https://gist.github.com/hljadmin/a790a1cf01fc38f63ebb
02:02 glusterbot <http://goo.gl/dNnRC> (at gist.github.com)
02:02 Supermathie no output?
02:02 avati_ JoeJulian: ping?
02:02 jhon38 log from 2 to 1: https://gist.github.com/hljadmin/b3cffae8d0b9d2580f03
02:02 glusterbot <http://goo.gl/ccVlL> (at gist.github.com)
02:03 jhon38 https://gist.github.com/hljadmin/afa569142e1b640efdf3
02:03 glusterbot <http://goo.gl/85UQb> (at gist.github.com)
02:05 jhon38 supermathie - state files are located in /var/lib/glusterd/...
02:06 Supermathie jhon38: yeah
02:07 jhon38 supermathie:  simply wipe out that structure?  all the folders are empty
02:07 jhon38 the .info file has the UUID
02:09 Supermathie jhon38: Hey have you tried probing by IP?
02:09 jhon38 supermathie: yes, same result
02:10 bharata joined #gluster
02:11 jhon38 supermathie:  log from ip probe - https://gist.github.com/hljadmin/5f9cb7acdca0ec3a90f7
02:11 glusterbot <http://goo.gl/qf2Lf> (at gist.github.com)
02:14 jhon38 supermathie:  running tcpdump on server1 while probing from server2: https://gist.github.com/hljadmin/33edeaf279edcc190953
02:14 glusterbot <http://goo.gl/IP58S> (at gist.github.com)
02:14 jhon38 supermathie:  i used ip address
02:14 Supermathie jhon38: Your firewalls aren't off
02:15 jhon38 https://gist.github.com/hljadmin/cef41f91014239a406c3
02:15 glusterbot <http://goo.gl/peqtI> (at gist.github.com)
02:23 yinyin joined #gluster
02:25 jhon38 supermathie:  i have my acl setup to allow all traffic
02:25 lalatenduM joined #gluster
02:25 jhon38 what port is not coming through?
02:32 Supermathie jhon38: the handshake isn't completing on 24007
02:32 Supermathie post your iptables
02:32 jhon38 server1: https://gist.github.com/hljadmin/be35cba362dc624c3260
02:32 glusterbot <http://goo.gl/q3ArV> (at gist.github.com)
02:33 Supermathie huh. Odd.
02:33 Supermathie Well, somehow you have a networking problem.
02:33 jhon38 server2: https://gist.github.com/hljadmin/c6dc69ac178dc4e84558
02:33 glusterbot <http://goo.gl/LWYY7> (at gist.github.com)
02:34 fidevo joined #gluster
02:34 jhon38 thats weird
02:48 jhon38 supermathie:  what port range besides 24007- is gluster using?
02:51 Supermathie jhon38: volume status volname will tell you
02:51 Supermathie but with nothing configured, just 24007
02:55 jhon38 ok - i need to dig around in the acl and see what is happening.  many thanks for your help.
02:58 Supermathie What do you mean the ACL?
02:58 Supermathie you just posted your firewall, it's empty
03:11 jhon38 im running in a VPC
03:11 MattRM joined #gluster
03:27 shapemaker joined #gluster
03:47 jikz joined #gluster
03:48 jikz_ joined #gluster
03:49 jikz joined #gluster
03:50 Supermathie jhon38: Virtual PC?
03:50 jhon38 AWS Virtual private Cloud
03:50 Supermathie OK, like I said, you have a firewall problem.
03:51 Supermathie not on your servers, but on amazon's site
03:51 jhon38 yes, looks like gluster uses 1023 as outgoing and acl started at 1024 - testing now to see if that was problem
03:53 jhon38 that was it - peer probe worked
03:53 jhon38 volume is online
03:58 Supermathie jhon38: So next time when someone asks you about your firewall being off, that INCLUDES the VPC firewall :D
03:59 jhon38 thanks - feel rather stupid on that one.
03:59 Supermathie I've done worse.
04:00 jhon38 ok - volume is up and set to replicate
04:00 jhon38 just dropped some files into server1 and they arent appearing on 2
04:02 jhon38 but at least its running and now i can dig deeper
04:02 Supermathie run: volume status volname, check the ports, make sure the hosts can talk back and forth on those ports
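A hedged sketch of those checks, assuming the volume is named gv0 (the name that shows up in the later client log) and the brick port turns out to be 24009 as in that log:

    gluster volume status gv0      # lists each brick process and the port it listens on
    telnet server2 24009           # repeat from each node toward the other node's brick port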
04:04 jhon38 no prob on comm on the ports
04:04 Supermathie post your volume info and status
04:04 jhon38 telnet into them from both nodes
04:05 jhon38 https://gist.github.com/hljadmin/d46c5682c7c83c5236c6
04:05 glusterbot <http://goo.gl/QiyzZ> (at gist.github.com)
04:05 abyss^ joined #gluster
04:08 jhon38 https://gist.github.com/hljadmin/900f86730485f66cda86
04:08 glusterbot <http://goo.gl/UohjY> (at gist.github.com)
04:08 jhon38 https://gist.github.com/hljadmin/634d2090ef59e42fbb87
04:08 glusterbot <http://goo.gl/5zXYl> (at gist.github.com)
04:11 Supermathie jhon38: check all your logs for connection problems
04:12 saurabh joined #gluster
04:13 jhon38 [2013-05-09 04:08:50.871577] W [rpc-transport.c:174:rpc_transport_load] 0-rpc-transport: missing 'option transport-type'. defaulting to "socket"
04:14 Supermathie harmless
04:16 jhon38 https://gist.github.com/hljadmin/1e0a2c676d29a9f7c6e4
04:16 glusterbot <http://goo.gl/LP2Kw> (at gist.github.com)
04:18 Supermathie jhon38: Whichever client that is is unable to connect to the brick on the second host:
04:18 Supermathie [2013-05-09 03:35:26.903163] I [client-handshake.c:1433:client_setvolume_cbk] 0-gv0-client-0: Connected to 10.100.1.131:24009, attached to remote volume '/export/brick1'.
04:18 Supermathie [2013-05-09 03:36:23.052719] E [socket.c:1715:socket_connect_finish] 0-gv0-client-1: connection to  failed (Connection timed out)
04:18 JoeJulian avati_: pong
04:19 shylesh joined #gluster
04:19 Supermathie JoeJulian: Does this mean gluster has 39705952 outstanding call frames?
04:19 Supermathie (gdb) print (call_pool_t)this->ctx->pool
04:19 Supermathie $15 = {{all_frames = {next = 0x25f96c0, prev = 0x25f9620}, all_stacks = {next_call = 0x25f96c0, prev_call = 0x25f9620}}, cnt = 39705952, lock = 0, frame_mem_pool = 0x0, stack_mem_pool = 0x0}
04:19 Supermathie gluster/nfs
04:20 yinyin joined #gluster
04:20 JoeJulian jhon38: yes, all "secure" communications are expected to originate from privileged ports.
04:21 jhon38 ok - mod'd acl to allow
04:21 jhon38 they are connected
04:21 jhon38 https://gist.github.com/hljadmin/230c4adf4ed5913bf4e0
04:21 glusterbot <http://goo.gl/S5Ive> (at gist.github.com)
04:22 JoeJulian Supermathie: I have absolutely no idea. :D
04:23 bala joined #gluster
04:44 albel727 joined #gluster
04:45 sgowda joined #gluster
04:47 vpshastry joined #gluster
04:53 hchiramm__ joined #gluster
04:55 JoeJulian Supermathie: I'm not sure if that structure is used in more than one place. I think it may be used for different stacks. I would need the source reference to even start looking at how that stack is used there.
04:56 Supermathie JoeJulian: glusterfs/nfs is spending all its time processing... something. And no time actually responding to FOPs.
04:57 Supermathie And by all its time, I mean 16 threads at 100% CPU.
05:00 bala joined #gluster
05:01 mohankumar joined #gluster
05:05 theron joined #gluster
05:07 hagarth_ joined #gluster
05:15 bulde joined #gluster
05:16 jhon38 supermathie:  many thanks.  test bricks are up and running
05:18 yinyin joined #gluster
05:18 bala joined #gluster
05:19 vpshastry joined #gluster
05:30 JoeJulian Supermathie: gdb and "thread apply all bt"?
05:33 JoeJulian Actually, I'm looking at another weird one. I've got this one thing, every three seconds, I get "I [socket.c:1798:socket_event_handler] 0-transport: disconnecting now". I debugged it and it's failing a read on socket "/tmp/4efb008e4e433ff7735a5a76111461d1.socket". That file doesn't exist.
05:34 JoeJulian I suspect it's the unused nfs service.
05:35 JoeJulian yep, that was it.
05:35 JoeJulian <-- file a bug
05:35 glusterbot http://goo.gl/UUuCq
05:41 aravindavk joined #gluster
05:43 raghu joined #gluster
05:44 JoeJulian Supermathie: Probably try increasing log levels before debugging with gdb. I would expect that to show a little better what it thinks it's doing. Might make gdb a bit easier to use if you know where you want to be looking.
05:45 JoeJulian lastly, is this something I can simulate without oracle?
05:46 vshankar joined #gluster
05:47 Supermathie JoeJulian: I have a full log with log level set to trace.
05:48 lalatenduM joined #gluster
05:50 Supermathie JoeJulian: :)
05:50 JoeJulian Trace log levels weren't the cause of a 39mil rpc queue, was it?
05:51 Supermathie JoeJulian: Nope. Happens regardless.
05:51 * JoeJulian must be tired... He's mixing tenses.
05:52 andreask joined #gluster
05:53 JoeJulian rpc queue... hmm... since it's on the nfs service, I would imagine it must be an outgoing queue...
05:53 Supermathie Also:
05:53 JoeJulian btw, what source line were you getting that ctx from?
05:54 Supermathie errr socket.c:1792 in socket_event_handler
05:54 Supermathie I have a coredump to examine and found that in a frame :)
05:56 JoeJulian Can I get the coredump?
05:57 Supermathie JoeJulian: errr yeah... have to upload it from work tomorrow... it's 10GB.
05:57 rastar joined #gluster
05:57 Supermathie JoeJulian: will post to -devel shortly with other details.
05:57 JoeJulian yuck... I'll have to download it to home. My work connection sucks. :D
05:58 JoeJulian Actually, just print *this
05:58 Supermathie Yeah 40 million outstanding frames will do that.
06:03 lalatenduM joined #gluster
06:03 glusterbot New news from newglusterbugs: [Bug 961197] glusterd fails to read from the nfs socket every 3 seconds if all volumes are set nfs.disable <http://goo.gl/3NMQU>
06:06 Supermathie JoeJulian: Is there a way I can tell gdb to export symbols so that the corefile is useful to someone else?
06:07 JoeJulian Oh, right... you roll your own binaries...
06:07 JoeJulian No clue
06:07 JoeJulian I'm a bit of a gdb novice.
06:07 Supermathie Yeah, I have necessary patches on top of 3.3.1
06:08 JoeJulian Can you just fpaste the result of "print *this"? I'm trying to trace the source. Also a "thread apply all bt" would be handy.
06:08 Supermathie JoeJulian: will provide
06:09 JoeJulian No guarantees I'll find anything, but it's good exercise to read through this stuff.
06:10 Supermathie JoeJulian: Can I redirect that to a file?
06:10 Supermathie 'set logging file foo.log' yep
06:12 JoeJulian Just found that myself.
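For reference, the gdb session being discussed can also be driven in one shot; this is a sketch, with the binary and core paths as placeholders (the log only shows binaries under /usr/local/glusterfs/sbin):

    gdb -ex 'set logging file foo.log' -ex 'set logging on' \
        -ex 'thread apply all bt' -ex 'quit' \
        /usr/local/glusterfs/sbin/glusterfs /path/to/core

`print *this` still needs the right frame selected first (e.g. a `frame N` inside socket_event_handler), so that part is easier done interactively.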
06:12 Supermathie http://paste.fedoraproject.org/11222/80799531/
06:12 glusterbot Title: #11222 Fedora Project Pastebin (at paste.fedoraproject.org)
06:14 JoeJulian So you've got 111 problems with that dump... ;)
06:14 JoeJulian That's way  better than 99...
06:17 satheesh joined #gluster
06:21 Supermathie ffff
06:21 Supermathie 22
06:22 Supermathie ack sorry about that. network went wonky
06:24 Supermathie OK... ugh that's a lot of info in one email
06:25 mjrosenb joined #gluster
06:25 mjrosenb well, this is strange
06:25 mjrosenb I purchased beat hazard recently
06:25 Supermathie woot fun game
06:25 mjrosenb and it can't seem to index music that is mounted via gluster
06:26 mjrosenb I mounted the directory that is mounted on another machine via sshfs, and it seems to be fine
06:26 mjrosenb sans indexing proceeding at an incredibly slow pace
06:26 mjrosenb but it is also taking a very roundabout method of getting to this machine
06:29 mjrosenb Supermathie: can you confirm/deny these accusations?
06:29 JoeJulian Supermathie: not saying this is it, but I'm seeing a lot of references to the write-behind translator. Maybe we should try disabling that.
06:31 Supermathie JoeJulian: I think I was getting this with a pretty bare config, but sure, I'll try that.
06:31 JoeJulian Supermathie: Also, if you've still got gdb open, could you print *dest
06:31 JoeJulian I think its the client, but I'm not positive.
06:32 Supermathie need a context for dest
06:32 Supermathie i.e. from which frame
06:32 JoeJulian must be 22?
06:32 JoeJulian Oh, nevermind.
06:33 JoeJulian I see that it actually shows it as '""'
06:33 mohankumar joined #gluster
06:33 glusterbot New news from newglusterbugs: [Bug 960141] NFS no longer responds, get "Reply submission failed" errors <http://goo.gl/RpzTG>
06:40 Supermathie JoeJulian: looks different, but still: http://paste.fedoraproject.org/11224/36808161/ and disks idle.
06:40 glusterbot Title: #11224 Fedora Project Pastebin (at paste.fedoraproject.org)
06:40 JoeJulian This doesn't make sense. If I'm reading this right, you should have gotten an error, "(-->rpc)Invalid argument:"
06:41 Supermathie where?
06:41 JoeJulian probably the nfs log
06:42 Supermathie aaaaaaaand glusterfs/nfs is up to 41GB... full of unresponded frames.
06:43 Supermathie oh wow... 27185 root       20   0 11.1G 10054M  2168 S 26.0  7.8  0:54.34 /usr/local/glusterfs/sbin/glusterfsd -s localhost --volfile-id gv0.fearless1.export-bricks-500117310007a850
06:43 Supermathie the brick itself is up to 11G
06:43 Supermathie and going up
06:43 JoeJulian I wonder if it also has a frame queue
06:43 Supermathie maybe one of the bricks is not responding properly to READs and WRITEs and causing the problem
06:43 JoeJulian Is that 192.168.11.2:24037
06:44 Supermathie 24031
06:44 JoeJulian Probably luck of the draw on the core dump then.
06:45 Supermathie 24037 is actually pretty quiet.
06:45 JoeJulian double check that the brick that's queuing up is actually mounted...
06:45 Supermathie This is a replicated volume, but only one of the replicas has that brick going up to huge amounts of RAM
06:46 JoeJulian I had one where the brick wasn't mounted and I was hitting the ext4 bug on the underlying filesystem.
06:46 Supermathie Yep it's up and responsive
06:46 Supermathie xfs
06:47 JoeJulian Since then I make bricks in a brick subdirectory under the mount point. That way if the directory's not there, the brick won't start.
06:47 Supermathie Good call.
06:47 Supermathie also handy for making it easy to blow away and recreate :)
06:48 JoeJulian yep
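A sketch of that layout, with a placeholder device name; if /export/brick1 isn't mounted, the brick subdirectory is missing and the brick process won't start against it:

    mount /dev/sdb1 /export/brick1                 # hypothetical backing device
    mkdir -p /export/brick1/brick
    gluster volume create gv0 replica 2 server1:/export/brick1/brick server2:/export/brick1/brick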
06:48 JoeJulian hrm.. jdarcy had a macro for walking the frame queue... now where did he put that...
06:49 brunoleon joined #gluster
06:49 dobber_ joined #gluster
06:55 jclift_ joined #gluster
07:04 ctria joined #gluster
07:07 Supermathie OK, Criminal Minds is over, paid commercials are on, time to get to bed.
07:09 JoeJulian Yeah, tired's catching up to me too.
07:13 JoeJulian Supermathie: I went to look at your twitter profile, but after I started to think about doing that, my arms were covered with hash marks....
07:14 hchiramm__ joined #gluster
07:15 Supermathie JoeJulian: :D
07:16 Supermathie Right, bed, g'night.
07:25 shireesh joined #gluster
07:40 vimal joined #gluster
07:53 ekuric joined #gluster
07:55 yinyin joined #gluster
08:10 wgao joined #gluster
08:25 hchiramm__ joined #gluster
08:36 hchiramm__ joined #gluster
08:45 deepakcs joined #gluster
09:03 majeff joined #gluster
09:04 JosephWHK joined #gluster
09:07 JosephWHK How can the.glusterfs directory help in auto-healing? By minimizing directory traversals?
09:12 manik joined #gluster
09:15 jikz joined #gluster
09:18 JosephWHK What is the use of two-level hash in .glusterfs directory?
09:20 rotbeard joined #gluster
09:22 kmtjiku joined #gluster
09:23 jurrien joined #gluster
09:23 hchiramm__ joined #gluster
09:28 jikz joined #gluster
09:30 jiku joined #gluster
09:31 kmtjiku joined #gluster
09:31 bulde1 joined #gluster
09:31 jiku joined #gluster
09:32 glusterbot New news from resolvedglusterbugs: [Bug 901568] print the time of rebalance start when rebalance status command is issued <http://goo.gl/7w0I3>
09:39 shireesh joined #gluster
10:05 bulde joined #gluster
10:19 ninkotech joined #gluster
10:19 ninkotech__ joined #gluster
10:31 vpshastry1 joined #gluster
10:48 sgowda joined #gluster
10:55 saurabh joined #gluster
10:59 ujjain joined #gluster
10:59 bulde joined #gluster
11:10 kkeithley1 joined #gluster
11:13 edward1 joined #gluster
11:18 lpabon joined #gluster
11:22 sgowda joined #gluster
11:28 yinyin joined #gluster
11:34 glusterbot New news from newglusterbugs: [Bug 961307] gluster volume remove-brick is not giving usage error when volume name was not provided to the cli <http://goo.gl/4m3Yx>
11:41 ricky-ticky joined #gluster
11:42 nickw joined #gluster
11:42 isomorphic joined #gluster
11:51 bulde joined #gluster
12:07 MikeyB joined #gluster
12:07 mohankumar joined #gluster
12:15 vpshastry1 joined #gluster
12:18 mohankumar joined #gluster
12:28 jikz joined #gluster
12:35 saurabh joined #gluster
12:35 bulde joined #gluster
12:36 andrewjsledge joined #gluster
12:46 jiku joined #gluster
12:48 jiku joined #gluster
12:52 manik joined #gluster
13:07 rotbeard joined #gluster
13:16 dustint joined #gluster
13:17 JoeJulian avati_ a2_: Are you around?
13:19 sjoeboo joined #gluster
13:31 bennyturns joined #gluster
13:40 rwheeler joined #gluster
13:43 andrei_ joined #gluster
13:44 hagarth joined #gluster
13:44 piotrektt_ joined #gluster
13:48 plarsen joined #gluster
13:55 jag3773 joined #gluster
14:01 andrei_ hello guys
14:01 andrei_ i was wondering if you could share some tips to improve glusterfs performance?
14:01 andrei_ i've got a small replicated cluster made of 2 servers
14:02 andrei_ i am seeing poor performance comparing to the underlying fs
14:02 andrei_ i am getting around 800-900mb/s read using dd from fs
14:02 JoeJulian My usual method is to figure out what I need, what my tools provide, and how to engineer the two to meet.
14:02 sjoeboo_ joined #gluster
14:02 andrei_ but only seeing about 200mb/s if mounted via glusterfs
14:03 JoeJulian How many clients will you have running dd simultaneously as part of your normal workload?
14:04 andrei_ JoeJulian: hi
14:04 andrei_ at the moment i've got two clients which will use glusterfs. However, during my tests (the performance figures i've mentioned above) I am mounting fs on the storage server
14:04 wushudoin joined #gluster
14:05 andrei_ so, there is no network traffic per se
14:05 JoeJulian Ah, right. We were talking about the context switching before.
14:06 andrei_ eventially glusterfs will be ran over rdma to the clients
14:06 andrei_ i am using 3.4 beta1 by the way
14:06 andrei_ are there any obvious performance improvements that I can tune?
14:08 andrei_ from what I can see i've got around 30% iowait when I am reading a single dd from the glusterfs volume
14:08 JoeJulian "gluster volume set help" will give you a list of all the settings you can play with.
14:08 andrei_ whereas I am only seeing about 5% iowait when reading from the fs directly
14:08 andrei_ do you know how to find out what is causing the iowait?
14:08 JoeJulian But do realize you're talking about tuning for dd, which is not a valid workload to be testing against.
14:09 JoeJulian iotop
14:09 andrei_ I realise that, but should I not try to first get the glusterfs performance close to the fs performance on the same server
14:10 andrei_ and then look for other benchmarks and compare the speed?
14:10 JoeJulian Also, you're setting block sizes to some reasonably large value, right?
14:10 JoeJulian I don't believe in benchmarks.
14:10 andrei_ i am using 8MB for the block size
14:10 JoeJulian I only value determining your needs and designing a benchmark to evaluate them.
14:11 andrei_ iotop shows that glusterfsd (about 6 processes) is consuming a bunch of IO
14:11 andrei_ around 50% per each process
14:11 vpshastry1 joined #gluster
14:12 andrei_ i do realise that benchmarks can't show the real life usage
14:12 andrei_ but i don't see any other way at the moment to compare performance
14:13 andrei_ i am using 100GB random file for testing
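The comparison being described would look roughly like this, with placeholder paths and the 8M block size mentioned above:

    dd if=/pool/random-100G of=/dev/null bs=8M          # read straight off the brick filesystem
    dd if=/mnt/gluster/random-100G of=/dev/null bs=8M   # same file through the glusterfs mount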
14:13 JoeJulian What are you comparing against?
14:13 andrei_ at the moment it's a dd test
14:13 andrei_ but I will also use iozone
14:13 andrei_ and i've also got a 4k random read benchmark file
14:13 andrei_ i am comparing it with the underlying os performance first
14:14 andrei_ on the same server
14:14 andrei_ to check how it performs without involving a network cable/switch/etc
14:14 Supermathie andrei_: look into 'fio'
14:14 andrei_ fio? a benchmark tool?
14:15 Supermathie andrei_: yeah, you can tell it very specific things to test and benchmark.
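As a sketch of the kind of job fio can express that dd can't (the file name, size, depth and job count here are illustrative, not from the conversation):

    fio --name=randread --filename=/mnt/gluster/testfile --size=100G \
        --rw=randread --bs=4k --ioengine=psync \
        --numjobs=4 --runtime=60 --time_based --group_reporting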
14:15 andrei_ in the past i've seen glusterfs speeds of around 700-800mb/s from a single client single thread
14:15 andrei_ and that was done over the wire
14:15 Supermathie mb/s or MB/s?
14:15 andrei_ MB
14:15 andrei_ not mbit
14:16 andrei_ i am surprised i am only seeing 200mb/s reading from the same server
14:16 andrei_ without any wires
14:16 JoeJulian My point is not how does glusterfs (or any clustered filesystem at all) perform compared to local filesystems, but what are your needs? What solution fills those needs? If it's performance you're looking for, then why do you need clustering?
14:17 andrei_ JoeJulian: I am looking for a redundant storage solution for the cloud infrastructure
14:17 bugs_ joined #gluster
14:17 andrei_ i need to be able to run maintenance on storage servers without bringing down the infrastructure
14:17 andrei_ however, I do want to provide good storage speed at the same time
14:18 andrei_ i've got pretty fast servers
14:18 Supermathie Should I be expecting things to break if I turn on storage.linux-aio?
14:18 andrei_ which are capable of doing close to 10gbit/s local fs reads
14:23 Supermathie Wow, OK, so enabling storage.aio just breaks EVERYTHING. so much for that.
14:24 JoeJulian andrei_: Since you don't care about multiple client access, you can enable eager-lock
14:25 JoeJulian Not sure if that would help with dd though.
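The option in question would be set with something like the following (volume name assumed; `gluster volume set help` lists the exact key for the installed version):

    gluster volume set <volname> cluster.eager-lock on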
14:25 rastar joined #gluster
14:27 JoeJulian ~pasteinfo | andrei_
14:27 glusterbot andrei_: Please paste the output of "gluster volume info" to http://fpaste.org or http://dpaste.org then paste the link that's generated here.
14:30 andrei_ will do
14:30 andrei_ one sec please
14:31 JoeJulian I love when the bandwidth company wants to start pointing fingers and asks, " I was wondering if you had submitted a trouble ticket with your VoIP provider, by any chance?" and my answer is "I am my own VoIP provider".
14:31 andrei_ http://fpaste.org/11287/13681098/
14:31 glusterbot Title: #11287 Fedora Project Pastebin (at fpaste.org)
14:31 andrei_ as you can see i currently have 2 servers
14:32 JoeJulian and both servers are on the same hardware?
14:32 andrei_ gluster is setup using transport tcp,rdma for the time being (i would like to only enable rdma when I switch to production)
14:33 andrei_ this is my mount options: http://fpaste.org/11289/81100081/
14:33 glusterbot Title: #11289 Fedora Project Pastebin (at fpaste.org)
14:34 andrei_ i was checking iostat when reading with dd and the server is reading from itself, not from the second glusterfs server
14:34 andrei_ so the traffic is not going over the wire
14:35 andrei_ JoeJulian: nope, both servers have different hardware and a different number of disks
14:35 JoeJulian And you are mounted rdma so that should be much faster. What kernel?
14:36 JoeJulian Also, have you checked the client logs for any errors? I don't expect any but would hate to chase down the wrong problem if there's an easy one.
14:37 andrei_ ubuntu 12.04 with: Linux arh-ibstorage 3.2.0-40-lowlatency #43-Ubuntu SMP PREEMPT Wed Apr 3 18:26:22 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
14:37 andrei_ i will check the glusterfs logs
14:37 andrei_ what log file should I be looking at?
14:38 andrei_ etc-glusterfs-glusterd.vol.log ?
14:38 JoeJulian That's the management daemon
14:39 JoeJulian /var/log/glusterfs/glusterfs--secondary--ip.log is what I think it would name it.
14:41 JoeJulian what ib hardware are you using?
14:44 andrei_ i am using supermicro server, 24gb ram, 6 core xeon, lsi controller in jbod
14:44 andrei_ and zfs filesystem
14:44 jskinner_ joined #gluster
14:45 andrei_ the dd test is done on a 100gb random file which is not compressible and can't fit into zfs arc memory
14:45 andrei_ so it reads it from the disk
14:45 dobber___ joined #gluster
14:45 andrei_ regardless if i am doing local fs test or the glusterfs test
14:45 daMaestro joined #gluster
14:46 andrei_ there are no errors that I can see in the glusterfs-secondary-ip-.log
14:47 andrei_ the last entry there is from 12 at night when i've mounted fs
14:49 JoeJulian brb... work calls....
14:55 zaitcev joined #gluster
14:56 failshell joined #gluster
14:57 failshell hello. on my clients, when i run df -h, the used value reported for my gluster volumes is incorrect. i read online that's caused by the quota feature. but when i try to disable it, it says its already disabled. anyone knows how to fix that? im using a distributed-replicate on 3.2.7
14:58 JoeJulian andrei_: Which infiniband driver are you using for what model infiniband interface?
14:58 JoeJulian failshell: define incorrect
14:59 JoeJulian ie. My bricks are N and df reports M
14:59 jiku joined #gluster
15:00 luis_silva joined #gluster
15:00 failshell JoeJulian: on this machine i use to back up our gluster cluster, the gluster mount reports 63G with df, while a du of the backup directory reports 71G
15:00 failshell my concern is that our monitoring system won't be able to notify us if we reach our threshold
15:00 failshell as it's wrong
15:01 JoeJulian So you've made a copy of all the files and that copy is bigger.
15:01 failshell yup
15:01 JoeJulian How are you copying?
15:01 _pol joined #gluster
15:01 failshell rsync
15:02 JoeJulian --sparse ?
15:02 _pol joined #gluster
15:02 failshell rsync -av --delete
15:02 JoeJulian So you're expanding any sparse files.
15:02 failshell what's a sparse file?
15:03 JoeJulian @lucky sparse files
15:03 glusterbot JoeJulian: http://en.wikipedia.org/wiki/Sparse_file
15:03 deepakcs joined #gluster
15:03 failshell ah
15:04 failshell lemme run my rsync job with --sparse
15:04 andrei_ JoeJulian: sorry got a little kid around - got distracted
15:05 JoeJulian I wonder if it'll change a non-sparse file to a sparse one... I kind-of doubt it.
15:05 failshell JoeJulian: would have to delete the data then to re-rsync?
15:05 JoeJulian andrei_: I work from home with my 3yo daughter on Thursdays and Fridays too.
15:05 andrei_ to be honest, I don't think it's IB at all as when I mount fs with tcp transport I am seeing the same picture
15:05 andrei_ nice!
15:05 andrei_ i've got a 3 week baby at home
15:05 JoeJulian Contrats
15:05 JoeJulian er, gah...
15:06 andrei_ thanks!
15:06 andrei_ let me just double check when I mount with transport tcp
15:06 andrei_ if I get the same picture
15:06 JoeJulian I'm not sure how the kernel handles that, tbh, when reading from the local server.
15:11 andrei_ I will let dd finish, but I think it's just slightly slower with tcp option
15:11 andrei_ i am seeing reads from fs between 150 and 220mb/s
15:11 andrei_ with the rdma option it has been around 200mb/s most of the time reaching 300mb/s occasionally
15:12 andrei_ i do see the same iowait numbers
15:12 andrei_ around 30%
15:12 JoeJulian failshell: These are the options I use for my backup. It would still expand sparse files but it maintains hardlinks. I also do some over-the-wire backups so --numeric-ids is useful. http://ur1.ca/drob2
15:12 glusterbot Title: #11297 Fedora Project Pastebin (at ur1.ca)
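A sketch of a backup invocation along those lines (source and destination are placeholders, and the full option set behind the pasted link isn't reproduced here):

    rsync -aH --sparse --numeric-ids --delete /mnt/glustervol/ /backup/glustervol/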
15:12 Supermathie Is it normal for glusterfs/nfs to be using so much cpu (130-150%) just handling 4 dd streams (bs=8192)
15:12 Supermathie Also, I may have a test case for people to try and reproduce...
15:13 JoeJulian Supermathie: I wouldn't think so, but I can probably check today.
15:13 jclift_ andrei_: Hmmm... what's the physical connectivity between those servers?
15:14 andrei_ IPoIB + 2x1GB ethernet
15:14 failshell JoeJulian: well, at least now i know its normal. i can sleep easier knowing gluster reports the actual size it has on the volume
15:14 andrei_ however, when i've setup the glusterfs i've used the IPoIB ip addresses
15:14 jclift_ andrei_: k.  The gluster volumes are using the IPoIB interfaces then?
15:14 andrei_ not the ethernet addresses
15:14 andrei_ yes
15:14 jclift_ k
15:14 andrei_ they are
15:14 JoeJulian jclift_: he was saying that he was able to determine that the read was being served by the local server for his test.
15:14 andrei_ yeah
15:15 jclift_ andrei_: Ahhh, k.
15:15 andrei_ i am checking iostat and I can see the the same server is serving data
15:15 * jclift_ was wondering if it was doing cpu overhead from non-rdma stuff with IPoIB.  But, guess not
15:15 andrei_ so it's not going over the wire from a different server
15:15 JoeJulian andrei_: Just for giggles, killall glusterfsd on the remote server and try again.
15:15 andrei_ will do
15:16 tbrown747 joined #gluster
15:16 Supermathie http://fpaste.org/11300/36811258/
15:16 glusterbot Title: #11300 Fedora Project Pastebin (at fpaste.org)
15:16 andrei_ should I not stop it gracefully?
15:16 * jclift_ hasn't tried 3.4beta1 yet with rdma nor IPoIB yet
15:16 JoeJulian As long as you don't killall -9, it'll be graceful.
15:17 andrei_ okay
15:17 jikz joined #gluster
15:18 tbrown747 hi guys, i have a question about/issue with samba over a gluster mount; i'm going to describe it below and if anybody has any insight i would appreciate it!
15:18 Kurian__ joined #gluster
15:18 JoeJulian disable all oplocks
15:18 andrei_ dd test is almost finished
15:18 tbrown747 i did
15:19 andrei_ once it's done i will kill the other glusterfs server and test again
15:19 tbrown747 i wound up having to disable strict locking
15:19 tbrown747 here is the error: 'ERROR: 0x80070021 The process cannot access the file because another process has locked a portion of the file."
15:19 JoeJulian Hmm.. did I ever do a factoid for ,,(samba)
15:19 glusterbot I do not know about 'samba', but I do know about these similar topics: 'samba acls'
15:19 JoeJulian nope
15:19 andrei_ JoeJulian: rdma transport gives me 220mb/s compared with 195mb/s over tcp
15:19 tbrown747 it looks like there is some kind of bug with NFS, and i was wondering if the bug is also present in the gluster client
15:19 wushudoin left #gluster
15:19 andrei_ not a huge difference
15:20 tbrown747 samba seems to be getting incorrect information about whether there is a POSIX lock on files that are being read
15:20 JoeJulian andrei_: I would guess that's context switching overhead.
15:20 JoeJulian tbrown747: Make sure you check your client log.
15:21 tbrown747 joeJ: I didn't do a full level 10 debug, but it seems this is a pretty common bug with NFS mounts
15:21 JoeJulian tbrown747: If the samba server is the only client, you can also set eager-locking
15:22 tbrown747 joeJ: eager-locking on the gluster client?
15:22 JoeJulian tbrown747: I'm referring to the gluster client log.
15:22 JoeJulian yep
15:22 Supermathie When my crazy-ass NFS problems happen, looks like the backing store has trouble. (no active sinks, etc). Looks like the problem really is the reads/writes not actually making it to the bricks.
15:22 tbrown747 joeJ: ok, I didn't check that to see what was happening on the disk level
15:22 tbrown747 joeJ: do you think disabling strict locking is a bad solution?
15:23 JoeJulian I disabled a bunch of locking for samba, but then I really want my windows users to get fed up with windows... ;)
15:24 tbrown747 i am locked into a really bad windows architecture for this application : (
15:24 JoeJulian Oh, also for samba, mount your client with attribute-timeout=0
15:24 JoeJulian @glossary
15:25 glusterbot JoeJulian: A "server" hosts "bricks" (ie. server1:/foo) which belong to a "volume"  which is accessed from a "client"  . The "master" geosynchronizes a "volume" to a "slave" (ie. remote1:/data/foo).
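The mount JoeJulian is suggesting for the Samba host would look roughly like this (server, volume and mount point are placeholders):

    mount -t glusterfs -o attribute-timeout=0 server1:/gv0 /samba/glusterfs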
15:25 andrei_ JoeJulian: okay, killed the server and restarted dd test. seeing similar results as with the second server up
15:25 andrei_ dd not finished yet
15:25 andrei_ but iostat shows around 200-250mb/s
15:26 andrei_ with similar iowait figures
15:26 nueces joined #gluster
15:26 jag3773 joined #gluster
15:27 andrei_ so, I don't think the issue is with the second server
15:27 JoeJulian create a volume using localhost as the server name and test with that volume. Let's make sure that it's not a driver issue (I can't imagine it is though).
15:28 tbrown747 joeJ: thanks for the assistance- i will check into these items
15:28 andrei_ will do
15:28 andrei_ one sec
15:28 JoeJulian It can just be a 1 brick volume.
15:29 JoeJulian Supermathie: EBADF on /dev/zero?!?!
15:30 jthorne joined #gluster
15:30 Supermathie JoeJulian: That's the least of my problems :)
15:31 edong23 i thought the checklist for gluster speed was a. Dont check   b. use jumbo frames  c. have fast cpus and harddrives
15:31 newbie_x joined #gluster
15:32 JoeJulian edong23: I like that list. :D
15:32 edong23 JoeJulian: being serious, the real world workload is generally just fine on a good machine with gluster
15:32 edong23 even if dd reports moderate results
15:32 JoeJulian And if it's not, scale out.
15:32 edong23 right
15:33 Supermathie 72876032 bytes (73 MB) copied, 817.405 seconds, 89.2 kB/s
15:33 newbie_x Hi is distributive on that page http://www.gluster.org/download/ has web gui ?
15:33 glusterbot Title: Download | Gluster Community Website (at www.gluster.org)
15:33 edong23 well... thats pretty bad
15:33 JoeJulian otoh, I don't mind isolating where a bottleneck might be. If there's a bug I'd like to see it reported.
15:33 Supermathie I'm doing 16 parallel DDs to the same file at different offsets. Gluster just does not cope with certain things well.
15:33 edong23 well, yeah
15:34 JoeJulian newbie_x: gui = oVirt
15:34 edong23 Supermathie: why are you doing that?
15:34 edong23 and Supermathie   do you have a comparison of just running 16 DDs to the same file on your backing store?
15:35 JoeJulian That's not an uncommon use case. Lots of scientific processing and rendering processes do that.
15:35 Supermathie edong23: Trying to duplicate some of the problems I see in http://paste.fedoraproject.org/11223/80643136/
15:35 glusterbot Title: #11223 Fedora Project Pastebin (at paste.fedoraproject.org)
15:35 newbie_x JoeJulian: does that mean yes ?
15:36 JoeJulian newbie_x: I'm not entirely sure if your question and my answer match, but if you want a gui for glusterfs, google oVirt.
15:38 Supermathie edong23: Same thing on backing store: each of the 16 dd is reporting between 49 and 227MBps
15:40 Supermathie edong23: Same thing on local FUSE mount: each dd is giving me 16.8MBps
15:40 andrei_ JoeJulian: when I am trying to set up a volume with localhost as the ip address it is not letting me do that
15:40 andrei_ Please provide a valid hostname/ip other than localhost, 127.0.0.1 or loopback address (0.0.0.0 to 0.255.255.255).
15:41 JoeJulian hrm... they must have changed that.
15:41 andrei_ so, should I use the ethernet ip address intead of the ipoib one?
15:42 JoeJulian ...
15:42 JoeJulian maybe associate a 192 address to lo
15:43 andrei_ let me try that
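What that amounts to, as a sketch (the 192.168.222.222 address matches the lo:1 alias andrei_ reports shortly after; the brick path is a placeholder):

    ifconfig lo:1 192.168.222.222 netmask 255.255.255.255 up
    gluster volume create testvol 192.168.222.222:/export/test-brick
    gluster volume start testvol
    mount -t glusterfs 192.168.222.222:/testvol /mnt/test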
15:43 JoeJulian Supermathie: I have 2 computers I can use for testing. What do you want me to try/
15:43 JoeJulian ?
15:43 jskinner_ joined #gluster
15:44 JoeJulian Well, actually I can use 3 but I assume you'd like one to be a client only.
15:44 edong23 it seems like very good speeds to the underlying storage with 16 DDs running concurrently
15:44 Supermathie The only thing I've been able to use to make glusterfs totally sh*t a brick (hahaha) is Oracle. Oh, right, lemme get that trace and pcap up
15:45 andrei_ JoeJulian: that has worked, let me start the test
15:45 Supermathie edong23: Yeah, backing is either 8x200GB Enterprise SSD per node or FusionIO Ioscale card
15:45 edong23 Supermathie: ah
15:45 edong23 ssd
15:45 edong23 thats all you had to say
15:46 edong23 cause i have pretty good backing store speeds on my 20 drive system, but with that much dd i would be lucky to get that performance
15:47 JoeJulian andrei_: btw... my expectation is that you'll have the same results... I'd just like to prove that first.
15:47 andrei_ sure, let's isolate that one
15:50 Supermathie edong23: Yeah I've been trying to get obscene speeds (or at least reasonable speeds) out of gluster and finding out that it's not designed for fast disk.
15:51 Supermathie JoeJulian: Want a copy of Oracle to test? :)
15:52 * JoeJulian cringes
15:52 Supermathie actually I think there's a dev version that's properly free
15:52 Supermathie Hey man, you know how many bugs I've tickled out of gluster by using Oracle?
15:52 JoeJulian I could just run down to Bellevue and see if they want to figure this out with me.
15:53 Supermathie I wonder if all my problems will magically fix themselves if I turn off distribute or replicate
15:54 Supermathie I will test that after I finish my performance testing...
15:54 JoeJulian You can't really, unless you hand-write a volfile.
15:55 JoeJulian Well, replicate you can, but I'm pretty sure distribute's in the graph even if you only have one brick.
15:55 Supermathie I can do that. Or just blow everything away and start again. I'm set up to do that.
15:55 bala joined #gluster
15:57 JoeJulian Nope, no distribute translator with only 1 brick.
15:57 Keawman joined #gluster
15:57 JoeJulian Before I tried that, though, I'd try disabling each (or all) of the performance translators first.
15:58 JoeJulian If there's a race condition, that's where I'd look first.
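Turning the performance translators off one at a time would look something like this (volume name assumed; the exact keys are listed by `gluster volume set help`):

    gluster volume set <volname> performance.write-behind off
    gluster volume set <volname> performance.read-ahead off
    gluster volume set <volname> performance.io-cache off
    gluster volume set <volname> performance.quick-read off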
15:58 chirino_m joined #gluster
15:58 Keawman trying to upgrade from 3.4alpha to 3.4beta - installed the new repo and did yum update but it won't upgrade - what am i missing
15:58 Supermathie JoeJulian: That's where I started from. Turned these options on to try and help
15:59 andrei_ JoeJulian: yes, you were right. I can see the same pattern emerging with the lo:1 192.168.222.222 interface as with the infiniband or the ethernet
15:59 JoeJulian Keawman: 2 options... 1, wait a few hours for the new package, or 2, yum downgrade.
16:00 Keawman JoeJulian, a new release in a few hours?
16:01 DEac- joined #gluster
16:01 JoeJulian I saw that kkeithley has already committed the change about 2 1/2 hours ago.
16:02 JoeJulian it's bug 961117
16:02 glusterbot Bug http://goo.gl/7Q2eB unspecified, unspecified, ---, kkeithle, MODIFIED , glusterfs version went backwards in rawhide
16:03 Keawman well my problem isn't that it upgraded and won't work ...it doesn't recognize that there is an update available
16:03 JoeJulian Did you read the bug report?
16:03 Keawman sorry
16:05 JoeJulian andrei_: alrighty then... cpu load during the read?
16:05 andrei_ glusterfs and glusterfsd both consume around 40-45% each
16:05 andrei_ iowait is around 30%
16:06 andrei_ that is on a model name      : Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz
16:06 andrei_ 12 cores total
16:06 Keawman JoeJulian, Thank you...i thought maybe you had misunderstood my problem but that was exactly the problem. I'll wait a few hours...i also had no idea i was that far ahead of the curve on trying this update; hadn't checked in a week or so
16:07 JoeJulian It's a single thread, though, so we wouldn't end up using more than 2 cores unless my coffee hasn't fully kicked in yet.
16:07 andrei_ that's right
16:07 andrei_ overall the server is about 65% idle
16:08 JoeJulian Keawman: Yeah, it's how the rpmversion check works. I've never really liked the whole alpha/qa/beta naming thing.
16:08 Keawman i'm using centos 6.4 will it take a while for that to filter down
16:09 andrei_ JoeJulian: this is what I see in the iotop: http://fpaste.org/11307/81157491/
16:09 glusterbot Title: #11307 Fedora Project Pastebin (at fpaste.org)
16:09 JoeJulian Keawman: What yum repo are you using?
16:10 Keawman JoeJulian, download.gluster.org's
16:11 JoeJulian actually, it's there now.
16:11 JoeJulian "yum clean all" and it should work.
16:12 andrei_ JoeJulian: should I check to see if I get the same picture if I mount gluster volume using nfs instead of native?
16:13 Keawman JoeJulian, you're right...maybe it's because i'm currently on glusterfs-3.4.0alpha-2.el6.x86_64
16:14 JoeJulian probably...
16:15 andrei_ let me try that
16:15 andrei_ need to check how to export nfs via gluster first ))
16:16 JoeJulian /usr/bin/rpmdev-vercmp 3.4.0-0.4.beta1 3.4.0alpha-2
16:16 JoeJulian 3.4.0-0.4.beta1 < 3.4.0alpha-2
16:16 JoeJulian Keawman: Yep, that's it. Either "yum downgrade 'gluster*'" or erase and install.
16:17 Keawman JoeJulian, any side effects of downgrading that you know of?
16:17 JoeJulian andrei_: Sorry, that was probably meant for Keawman.
16:17 JoeJulian Keawman: No, it's just to tell yum to choose the lesser version. The rpm scripts run as if it's an upgrade.
16:18 Keawman JoeJulian, ok thanks much for your help
16:19 JoeJulian You're welcome
16:19 jclift_ andrei_: On CentOS the NFS server is automatically started when you start a volume, so people just need to get their client to try attaching to it and it works.
16:19 jclift_ andrei_: Similar may happen with Debian/Ubuntu too (unsure)
16:20 jclift_ andrei_: So you might not need to do anything more than try mounting
16:20 JoeJulian Yep
16:20 jclift_ andrei_: Something like mount -t nfs myserver:myvolume /my/mount/point
16:20 jclift_ Hmmm, there are options about nfsv3 too, but I don't remember what exactly. :D
16:20 JoeJulian @nfs
16:20 glusterbot JoeJulian: To mount via nfs, most distros require the options, tcp,vers=3 -- Also an rpc port mapper (like rpcbind in EL distributions) should be running on the server, and the kernel nfs server (nfsd) should be disabled
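Putting glusterbot's note together with the earlier example, the client-side mount would be roughly (names are placeholders):

    mount -t nfs -o tcp,vers=3 myserver:/myvolume /my/mount/point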
16:20 jclift_ Sweet
16:20 andrei_ thanks guys
16:20 andrei_ you were right
16:21 andrei_ it's automatically exported
16:21 andrei_ doing dd now
16:21 jclift_ :)
16:21 JoeJulian bbiab... my daughter just woke up and is ready for breakfast.
16:21 JoeJulian btw... nfs should be slower for a bulk-transfer test.
16:22 andrei_ with nfs the speeds are about the same
16:22 andrei_ but glusterfsd is eating about 70-80% cpu instead of 40-45 with native glusterfs
16:22 jclift_ Heh
16:23 jclift_ If you're in a daring mood, you can try mucking around with the translators and stuff yourself
16:23 andrei_ speed has actually increased with nfs
16:23 andrei_ to about 260mb/s mark
16:23 jclift_ andrei_: If you do a ps -ef | grep gluster, you'll see a glusterd process, and possibly a few glusterfs and glusterfsd processes
16:23 andrei_ compared with 220mb/s with native
16:23 jclift_ k
16:24 andrei_ with nfs I see more glusterfsd processes compared to the native glusterfs mount
16:24 jclift_ andrei_: Do you have a /var/lib/glusterfs/ directory?
16:24 jclift_ Or maybe /var/lib/gluster/ ?
16:24 jclift_ Something named like that anyway. :)
16:24 toddstansell /var/lib/glusterd ;)
16:24 jclift_ Cool.
16:25 jclift_ andrei_: So, inside that directory there'll be an "nfs" subdirectory, with a file in there ending in .vol
16:26 jclift_ That's a text file (you can edit) which tells Gluster how to start the NFS server, sets some options, and adds in various "translator" layers.
16:26 andrei_ where would i find it?
16:26 andrei_ in /etc somewhere?
16:26 jclift_ In /var/lib/glusterd/nfs/
16:27 andrei_ yeah, i can see nfs-server.vol
16:27 jclift_ andrei_: Be aware, that file (on CentOS) gets regenerated each time glusterd is started.  So, changes in there don't easily stick.
16:27 jclift_ Cool.  Copy that to some other place.  i.e. /root/my-nfs-server-testing.vol
16:28 andrei_ yeah
16:28 andrei_ done
16:28 jclift_ Then start a gluster NFS server using:   $ sudo glusterfs -f /root/my-nfs-server-testing.vol -N
16:29 jclift_ Well, the -N bit isn't really needed.  It makes it keep in the foreground (which I do when writing translators so I can have them print stuff to the screen)
16:29 andrei_ do I need to kill the existing nfs server?
16:29 jclift_ Um, I normally do, but I'm not sure it's needed.
16:29 jclift_ The NFS server will have a glusterfsd process and a glusterfs one.
16:29 jclift_ I just "sudo kill <pid of glusterfs process>" and leave the glusterfsd one running
16:29 jclift_ Seems to work fine that way.
16:30 jclift_ Haven't really looked further into the "right" way of doing that yet. ;D
16:30 andrei_ do I need to make any changes in the .vol file?
16:30 jclift_ Heh, now this is my point... that .vol file will work "as is".
16:30 jclift_ If you take a look through it, you can see all of the bits that glusterfs is doing for your NFS volume.
16:31 jclift_ So... try running the NFS server with a bunch less translators.  i.e. rip some of them out
16:31 jclift_ Then try the speed testing again.  See if it makes a difference.
16:31 elyograg left #gluster
16:32 jclift_ andrei_: You'll also notice when reading through the .vol file that the translators chain into each other.  They start at the bottom one, and work their way up to the top one, explicitly saying what the next one in the order is.
16:32 jclift_ So, you can muck about by just changing the bottom one to go right to the top one, etc.
16:32 jclift_ andrei_: Is this making sense?
16:33 Mo_ joined #gluster
16:34 chirino_m joined #gluster
16:35 lbalbalba joined #gluster
16:36 andrei_ yeah
16:36 andrei_ it makes sense
16:37 jclift_ Cool.  If you rip 90% of them out and notice a doubling in speed, then that's pretty clear evidence the layers of translators are doing something non-optimal for your setup.
16:37 jclift_ If you rip 90% of them out and there's no real change, that at least means the problem isn't there.
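Condensed into one sequence, the experiment jclift_ is describing looks like this (the file names are the ones used above; which translators to strip out is left to the tester):

    cp /var/lib/glusterd/nfs/nfs-server.vol /root/my-nfs-server-testing.vol
    # edit the copy: remove translators and re-point the remaining ones at each other
    kill <pid of the running glusterfs NFS process>    # leave glusterfsd running
    glusterfs -f /root/my-nfs-server-testing.vol -N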
16:38 failshell JoeJulian: using --sparse improved df's output a lot
16:38 jclift_ failshell: Yeah, sparse files are your friend for most stuff.
16:39 lbalbalba hi. just wondering: has anyone ever attempted 'gcov/lcov' code coverage of the gluster tests ? im curious how much of the code gets tested by the test cases
16:39 jclift_ failshell: They only generally suck if you're doing stuff with vm's which start out as sparse files then fill out into non-sparse files over time.  Can become super fragmented.
16:39 failshell jclift_: all my infra runs on VMs :)
16:39 failshell even gluster
16:39 jclift_ Heh :)
16:40 jclift_ failshell: Yeah, I like sparse files for everything.  It's pretty rare (for me) when a VM gets used enough for the fill out to be a problem.
16:40 moon_account joined #gluster
16:40 duerF joined #gluster
16:41 jclift_ lbalbalba: Personally not sure (newish to team).  Other guys might know more.  Feel free to run tests yourself and report anything interesting of course. :)
16:41 uli joined #gluster
16:42 JoeJulian moon_account: Once your trusted peer group is established, only a trusted peer can add a new server to the group.
16:42 andrei_ the speed with the .vol file is about the same
16:42 jclift_ andrei_: Oh well, at least we know. :)
16:42 andrei_ JoeJulian: do you have any other ideas what we could try?
16:43 JoeJulian NuxRo: worm is a mount option.
16:43 lbalbalba jclift_: thanks. i have run gcov/lcov on projects before, but in those cases i could just run the test suite cases on the compiled but not yet installed sources. im not sure how to proceed when i have to install the binaries first.
16:43 JoeJulian moon_account: correct
16:44 JoeJulian You're welcome
16:44 uli hey guys, today i built a cluster with 4 virtual machines (vbox), all of them gluster 3.3.1, the cluster works fine. when I mount glusterfs (mount -t glusterfs gluster01:/volume1 /samba/glusterfs) df shows the right size and i can access it, can create folders from windows...
16:44 jclift_ lbalbalba: Yeah, I have no experience with that type of stuff either
16:44 uli but extended attributes cannot be written
16:44 andrei_ JoeJulian: we've established that the second server doesn't slow down the read; using lo interface doesn't effect the speed and that mounting with nfs instead of glusterfs increases performance by about 30%
16:44 uli using samba4.1
16:45 uli all bricks are xfs
16:45 uli does anyone have an idea where I'm going wrong...?
16:45 uli even tried mount -t glusterfs gluster01:/volume1 /samba/glusterfs -oacl
16:46 * jclift_ goes and gets food
16:46 uli in shell i can write fattrs (setfattr -x user.DOSATR filename ... e.g.)
16:47 JoeJulian andrei_: The only thing I can think that would cause nfs to be faster is that we someone put more processes in the mix and thus use more cores.
16:47 uli trying this with windows xp or windows 8 always gives an access denied
16:47 * JoeJulian is distracted... re-reading that even I'm not sure what the word someone was intended to be...
16:47 andrei_ the trouble is glusterfs only seems to perform at about 1/4 of the speed of the underlying fs
16:48 andrei_ that's what I would like to solve
16:48 aravindavk joined #gluster
16:48 andrei_ or at least understand what's causing poor performance
16:49 JoeJulian uli: it's a vbox thing. there's some setting for that.
16:49 JoeJulian iirc
16:49 uli hmmm
16:50 uli vmware same effect.... (esx 5.1)
16:50 Supermathie andrei_: When I'm done all my Oracle testing, this beastly hardware *may* be able to be made available to the gluster community for performance testing & optimizations.
16:50 andrei_ )))
16:50 andrei_ what kind of hardware do you have?
16:51 JoeJulian uli: how are you accessing your volumes with windows? NFS or a samba reshare?
16:51 uli JoeJulian, its a samba share
16:52 Supermathie 2x(2-CPU E5-2670, 128GB RAM, 8x400GB SSD for data, 3.2TB FusionIO card for data, 10GbE, QDR IB) for the servers, 4 x (2-CPU E5-2660, 128GB RAM, 10GbE) for the clients
16:52 uli [glusterfs]
16:52 uli path = /samba/glusterfs
16:52 uli browsable = yes
16:52 uli read only = no
16:52 shylesh joined #gluster
16:52 JoeJulian Up your samba log level and see what it says. If you can do it from a shell, then the posix stuff
16:53 JoeJulian is working
16:53 uli JoeJulian, kk gonna try that, thx so far
16:56 andrei_ wow!
16:56 andrei_ yeah, nice hardware man!
16:56 andrei_ did you get much testing with glusterfs / rdma?
16:58 dewey joined #gluster
16:59 tbrown747_ joined #gluster
16:59 Supermathie JoeJulian: My coredump, binaries, pcap and gluster trace are uploaded... I don't mind passing it around the "gluster devs" but I don't want it public.
16:59 Supermathie andrei_: heh heh... awaiting an IB cable to actually *use* the IB cards... :)
17:01 andrei_ yeah, these guys are not cheap
17:01 andrei_ around $50 or so
17:01 lbalbalba crap. when i try to do a build for gcov code coverage using *FLAGS+='-fprofile-arcs -ftest-coverage', i run into this error: http://pastebin.com/Dup5VEze
17:01 lbalbalba does anyone have an idea whats going wrong here ?
17:01 glusterbot Please use http://fpaste.org or http://dpaste.org . pb has too many ads. Say @paste in channel for info about paste utils.
17:01 andrei_ are you using linux on them?
17:01 andrei_ what version?
17:02 tbrown747_ JoeJulian: i have a followup re: smb concurrent read problem over gluster mount; eager-locking has no effect. if I move the share to the underlying XFS partition i don't have the issue
17:02 hchiramm__ joined #gluster
17:02 tbrown747_ I don't know if I ever actually described the issue- concurrent read of the same file results in both clients getting locked out as if the other had a write lock
17:03 Supermathie andrei_: RHEL6 on servers, RHEL5 on clients (for Oracle). Will also test with FreeBSD/ZFS/Gluster on server.
17:03 bulde joined #gluster
17:06 thomasle_ joined #gluster
17:06 lbalbalba -fprofile-arcs implies -lgcov, so i shouldn't be getting that error :(
17:07 Supermathie Can someone (who doesn't mind possibly wedging gluster) try: "sudo getfattr -m - -d -e hex /path/to/file/on/fuse"
17:07 uli JoeJulian, i get the following error in samba: set_nt_acl: failed to convert file acl to posix permissions for fil
17:08 uli and convert_canon_ace_to_posix_perms: Too many ACE entries for file Neuer Ordner to convert to posix perms.
17:12 andrei_ guys, why am I seeing a lot of these: http://fpaste.org/11323/81195741/
17:12 glusterbot Title: #11323 Fedora Project Pastebin (at fpaste.org)
17:13 andrei_ this is after starting glusterfsd
17:15 duerF joined #gluster
17:17 NuxRo JoeJulian: looks like it's `gluster volume set test features.worm enable`
17:18 NuxRo I'll edit the wiki once I register on the wiki
17:18 andrei_ JoeJulian: i've now switched to transport rdma without tcp
17:18 andrei_ and I am seeing this:
17:18 andrei_ dd: writing `100G-urandom-1': Bad file descriptor
17:18 andrei_ dd: closing output file `100G-urandom-1': Bad file descriptor
17:18 JoeJulian lbalbalba: Looks to me like it's not linking the gcov library
17:19 andrei_ when I am trying to copy to the glusterfs volume
17:19 lbalbalba JoeJulian: i got that ;) i just don't know *why*
17:19 lbalbalba JoeJulian: -fprofile-arcs implies -lgcov,
17:20 andrei_ it copied around 2gb of data and then fell apart
17:20 lbalbalba JoeJulian: could it be that somehow that doesn't get passed on down correctly in the build system ? just a long shot
17:21 JoeJulian lbalbalba: That'd be my guess.
17:22 lbalbalba JoeJulian: does the build system use any specific things i should add to pass down stuff to the linker ? besides LDFLAGS, which obviously isn't working.
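For what it's worth, a coverage-build recipe that has worked on other autotools projects, and might be a starting point here, is to let gcc's --coverage flag pull in gcov at both compile and link time; this is a sketch, not a verified gluster recipe:

    ./autogen.sh
    ./configure CFLAGS='-g -O0 --coverage' LDFLAGS='--coverage'
    make && make install
    # run the test suite against the installed binaries, then collect results:
    lcov --capture --directory . --output-file coverage.info
    genhtml coverage.info --output-directory coverage-html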
17:22 JoeJulian tbrown747_: not sure... if I were trying to figure that out I'd probably look at straces and/or wireshark captures to see where the lock is coming from. And, of course, turn up the logging in samba.
17:23 JoeJulian lbalbalba: probably want to ask in #gluster-dev . I don't know much about that side of things.
17:23 andrei_ I can't seem to write to that file but I can read from it
17:24 lbalbalba JoeJulian: thanks. I'll go and check #gluster-dev, then. or send an email to the dev list.
17:25 JoeJulian Supermathie: pulled the xattrs as you requested. No problemo.
17:26 lbalbalba left #gluster
17:26 Supermathie JoeJulian: There was a command that made the volume wedge... can't remember exactly. Did it show you the trusted attributes?
17:26 aravindavk joined #gluster
17:26 dustint joined #gluster
17:31 andrei_ Supermathie: zfs will run like a beast on your hardware!
17:31 andrei_ you can use your fusion for additional cache
17:31 andrei_ wow
17:32 andrei_ 3tb of cache
17:32 andrei_ nice
17:32 andrei_ your zfs speed should be in multiple gigabytes/s
17:32 andrei_ )))
17:35 Supermathie andrei_: heh, will be comparing ssd vs fusion for performance.
17:35 andrei_ isn't fusion supposed to kick ass
17:35 andrei_ like with around a million 4k random iops?
17:35 JoeJulian tbrown747_: Your client is using the acl mount option, right?
17:35 Supermathie In the future, may be putting up to 8 3.2TB FusionIO cards in a single box for storage.
17:35 JoeJulian er, no... uli? ^^
17:36 Supermathie andrei_: Yeah, some cards get up that high.
17:37 tbrown747_ JoeJulian: i've got this from the gluster client log: [fd-lk.c:456:fd_lk_insert_and_merge] 0-fd-lk: new lock requrest: owner = 33bcb01262b2a615, fl_type = F_WRLCK; so gluster is interpreting it as a write lock
17:37 JoeJulian Supermathie: No, trusted attrs do not show from that command through the client. You can request specific ones, but they don't show up in a regexp list.
17:37 tbrown747_ samba claims that it has issued a read lock for that access
17:38 Supermathie JoeJulian: I recall that when I requested a specific trusted xattr it wedged gluster. Just remembered I never did file a bug report.
17:38 glusterbot http://goo.gl/UUuCq
17:41 JoeJulian tbrown747_: I would strace that and see who's confused. Which version of gluster are you using?
17:41 JoeJulian btw... I wouldn't be at all surprised if windows was issuing write locks for reads.
17:42 rastar joined #gluster
17:43 Supermathie JoeJulian: Well it does shared locks which blocks writes, right?
17:43 tbrown747_ JoeJulian: 3.3.1
17:44 JoeJulian Supermathie: but it's blocking two clients from reading the same file... <shrug>
17:44 tbrown747_ JoeJ: I will get a wireshark of the actual samba request, but if windows was doing that; A) samba should reflect that in smbstatus, and B) wouldn't i have the same problem when i move the share to the underlying XFS filesystem?
17:44 tbrown747_ unless the XFS mount is just stomping on lock requests for some reason
17:44 JoeJulian hmm, good point.
17:45 JoeJulian Or it's a race condition that doesn't show from a local mount.
17:46 tbrown747_ the additional latency from the gluster network locking brings it out
17:46 tbrown747_ i would still think smb would reflect that though
17:46 JoeJulian off topic: Everyone's seen the new Audi commercial with Zachary Quinto and Leonard Nimoy, right? If not you must go find it now. I'll wait.
17:46 tbrown747_ i will check it out
17:47 dewey I have a question about GlusterFS performance in order to set my expectations.
17:47 JoeJulian I have an answer about it, but it may not match your question.
17:48 JoeJulian Did you want my answer first?
17:48 JoeJulian Might make it easier to tailor your question to match it.
17:48 dewey JoeJulian: an unconventional approach, but let's try it :-)
17:48 JoeJulian hehe
17:49 * dewey is writing up his question
17:49 JoeJulian List your needs. Find the tools that provide those needs. Test. If the tools don't satisfy the requirements during testing, re-engineer.
17:51 dewey I've found a lot of "how to make it faster" in Google and a few comparisons, but I'm going to be asked to compare it to "native" speeds or other storage technologies.  My quick-and-dirty benchmarking using IOZone, 5 clients, 100MB files at 4k block size shows GlusterFS to be up to 2 orders of magnitude slower than either direct operations or operations to a peer using kernel NFS (i.e. no...
17:51 dewey ...Gluster).  What is the relative level of performance expected?  I can research and troubleshoot and tune but I'm looking for "how do I know when I've arrived"
17:52 dewey (Environment:  VMs running on 3 separate ESX hosts with 8x10k drives in RAID5 exposed as virtual disks, CentOS 6.4 with latest gluster RPMs from gluster.org)
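A rough reconstruction of the benchmark dewey describes (5 threads, 100MB files, 4k records), for anyone wanting to repeat the comparison on their own mounts; the mountpoint is a placeholder and -i 3 should be the backwards/reverse-read test:

    cd /mnt/glustervol          # or the NFS mount / local disk being compared
    iozone -t 5 -s 100m -r 4k -i 0 -i 1 -i 3 -R -c -e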
17:52 JoeJulian All clustered storage systems will fail when tested against native speeds. If you don't need clustered storage, stick with native.
17:52 dewey Surely.  It's not a matter of "native is faster" -- of course it's faster.  I'm asking "how much faster should I expect"?
17:53 semiosis dewey: consider single thread perf. vs. aggregate perf
17:53 semiosis look at higher order metrics like time to complete a workload
17:53 JoeJulian ooh, I like that line. I'm stealing it.
17:53 semiosis my single thread performance went down, but i was able to run 100s of times as many threads vs my old dedicated raid array/nfs
17:55 dewey I'm looking at average performance per client across 5 threads.  glusterfs shows 7.3Mb/s on "Reverse Read", NFS and native show 1500 Mb/s
17:55 semiosis well, 100s in theory at least... 10x in practice, with plenty of room still to grow
17:58 dewey Ultimately I'd like to use this to store virtual machines, which I know is a central use-case for glusterfs.   What I'm looking for here is for someone amongst the cognoscenti to say "yes, that's around correct" or "Dude, you've completely screwed something up!"
17:58 semiosis dewey: what kind of network do you have that delivers NFS at 1.5Gb/s?
17:58 semiosis 10GbE?
17:58 dewey Yes
17:58 dewey and I haven't normalized for caching, so that might not all be hitting the net.
17:58 semiosis imho vm storage isn't a "central" use case
17:59 JoeJulian If I were going to propose a new system to store virtual machines, it would be qemu-kvm using 3.4's direct library access.
17:59 vpshastry1 joined #gluster
17:59 semiosis it's one there's been a lot of demand for, but it hasn't been central to glusterfs
17:59 dewey There's quite a bit I haven't done -- I'd hardly call my current results a rigorous analysis -- it's a quick first look.
17:59 JoeJulian The central use case has been thousands of clients serving petabytes of data.
18:00 dewey Ahhh.  Based on a talk by Jeff Darcy about 6 months ago I got the distinct impression that GlusterFS was RedHat's horse in the storage race to underly qemu-kvm.
18:00 dewey Perhaps I took the wrong info away...
18:00 JoeJulian It is, but that's not how it got started.
18:00 semiosis dewey: vm storage has really only been specifically considered since 3.3, but the big improvements for vms which JoeJulian mentioned are coming in 3.4
18:00 dewey I have merely many 10s of clients and TB of data.
18:02 dewey Also, we are currently committed to VMware, so qemu/kvm is out (other than in my basement, where I don't have a 10G network.  Pity)  Anyway, my use case is:  cheap, no SPOF storage within a datacenter that can replicate across the continent to another datacenter.
18:02 dewey I'd *also* like to use it to expose some local storage in a 0SPOF way.
18:02 dewey semiosis:  good info, thanks.
18:04 JoeJulian Are you looking for real-time replication across high latency connections?
18:04 dewey Not real time.  It would be nice to violate the laws of physics, but "cheap" is one of the constraints :-)
18:05 JoeJulian VMware != cheap ;)
18:05 MattRM What do you consider as 'high latency'?
18:05 dewey Hence why the rest of the system needs to be cheap ;-)
18:06 semiosis MattRM: >1ms
18:06 dewey More seriously, we're already working on WAN acceleration and the primary VM storage is Equallogic and Isilon and doing coast-to-coast replication via VMware's SRM.
18:07 tbrown747_ JoeJulian: wireshark checks out, windows is behaving (for once); I haven't used strace before, I gather I should run both the smb server process and the gluster client wrapped in strace and see what system calls they are making/receiving?
18:07 dewey My current coast to coast latency is ~100ms (ugh!  It was only about 80ms last time I measured!)  That is way outside anything where I'd expect synchronous replication to work.
18:08 MattRM We use Gluster for storage replication between 2 datacenters, it's definitely >1ms latency ;-)
18:08 JoeJulian Heh, I (think) I coined the term 0SPOF in 2009 - or at least I was harassed about "What are you trying to do, coin a new term?"
18:08 dewey MattRM:  I presume you use geo-replication.
18:09 dewey JoeJulian:  good term :-)
18:09 semiosis i use (normal, ARF) replication between EC2 availability zones -- about 1ms latency -- and have no complaints
18:09 semiosis s/ARF/AFR/
18:09 glusterbot semiosis: Error: I couldn't find a message matching that criteria in my history of 1000 messages.
18:10 semiosis glusterbot: meh
18:10 glusterbot semiosis: I'm not happy about it either
18:10 MattRM dewey: No.
18:10 dewey Nice backbone.  We're in a commercial hosting environment on both coasts.
18:11 dewey Oh goody.  My Office -> west coast latency is 100ms.  My DC ->DC latency is only around 80ms.
18:11 dewey MattRM:  Really?  Excellent!  I'm sure it depends on change rates, etc, but that's something that I might play with now.
18:12 JoeJulian best possible RT time in the US corner-to-corner is 30ms. That's if you had fiber running in a straight line.
18:13 JoeJulian ... with no repeaters
18:13 MattRM dewey: I've just checked, the latency is ~3 ms between the gluster boxes.
18:15 dewey Nice!  I'm clearly not going to approach that.
18:15 Supermathie 0.030ms Mmmmm 10GbE
18:15 dewey My DCs are Boston and San Mateo
18:16 MattRM There are some issues with this setup, we're trying to debug them now.
18:16 dewey performance or functionality issues?  I actually found this version of gluster very easy to get working (though of course performance is why I'm on here)
18:19 JoeJulian Boston -> San Mateo -> Boston = 8681 km. 8681km / speed of light = 29ms
18:19 MattRM I'd classify them as 'performance' - if there's a blip for a moment, gluster on one box consumes all CPU, which causes sync delay, which makes the other box eat the CPU, and so on. A major headache during last few days.
18:19 dewey Make sense.
18:20 MattRM blip = connectivity issue for a moment.
18:20 dewey JoeJulian:  Yup.  I'm not disappointed in my 80ms.  Reality is that I'm not going to run SAN RAID5 over the connection :-)
18:20 dewey MattRM:  Interesting.  How much of a "moment", and does it show up as packet loss or packet latency?
18:20 premera can I mix NFS and gluster fuse mounts? i.e. mount the same volume as NFS on one machine and as a gluster FUSE mount on another?
18:20 JoeJulian A blip with ping-timeout set to something longer than the blip should = everything pauses 'till the blip ends.
18:21 JoeJulian premera: yes
18:21 premera thx
18:21 dewey premera:  yes, I'm doing that.
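For reference, mounting the same volume both ways from different machines looks roughly like this; server and volume names are placeholders, and Gluster's built-in NFS server speaks NFSv3 over TCP, hence the options:

    # machine A: native FUSE client
    mount -t glusterfs server1:/myvol /mnt/myvol
    # machine B: Gluster's built-in NFS server
    mount -t nfs -o vers=3,proto=tcp server1:/myvol /mnt/myvol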
18:21 tbrown747_ JoeJulian: looks like samba is behaving as well, from the strace: open("file_OMN0508182207.mov", O_RDONLY) = 36
18:21 JoeJulian And I need better insurance rates...
18:21 jag3773 joined #gluster
18:21 JoeJulian tbrown747_: but where's the lock?
18:22 premera dewey: what's your experience? my first impression is that NFS is much faster for multithreaded random file access?
18:22 MattRM dewey: I'm not sure yet, it's very hard to catch such an event. It wasn't an issue for a long time, but last week was a hell because of that.
18:22 dewey I haven't run perf tests between Gluster NFS and GlusterFS.  I've run quick-and-dirty others and that's an interesting test.  I have a meeting in 8 minutes but I'll run one after
18:23 JoeJulian The kernel nfs client uses FSCache - caching the directory and file stat information locally.
18:23 JoeJulian So using an nfs mount, you /can/ have stale directory information.
18:23 tbrown747_ JoeJulian: how about this- flock(36, 0x60 /* LOCK_??? */)          = 0
18:23 * dewey is running the Gluster NFS performance test
18:24 tbrown747_ i don't see any F_SETLKW
18:24 y4m4 joined #gluster
18:25 jbrooks joined #gluster
18:25 tbrown747_ it gets write leases, presumably to inform oplocks
18:26 Supermathie JoeJulian: sudo getfattr -d -e hex filename_on_fuse -n trusted.afr
18:27 dewey premera:  alas, I have to go to a meeting.  If you want my results, email dewey@sasser.com and I'll send them to you.
18:27 JoeJulian so an flock 0x60 is an exclusive, non-blocking lock.
18:28 JoeJulian If your other application is attempting the same lock, it will fail because the lock is exclusive.
18:28 premera thanks dewey, will do
18:29 JoeJulian Supermathie: getfattr -n trusted.afr bar = bar: trusted.afr: No such attribute
18:29 tbrown747_ so it's samba
18:29 Supermathie JoeJulian: Huh, my getfattr hangs and can't be killed. 3.3.1
18:29 bulde1 joined #gluster
18:29 JoeJulian 3.3.1 from the rpm
18:29 dewey joined #gluster
18:29 JoeJulian Maybe it's one of the patches you've applied?
18:29 Supermathie JoeJulian: on a fuse mount right?
18:29 JoeJulian Right
18:29 Supermathie JoeJulian: Naw, it did this on stock.
18:30 JoeJulian Where bar is a normal file.
18:31 Supermathie oh well
18:31 JoeJulian If you can get that hang, and kill -USR1 {pid of fuse client} do you get the dump in /tmp, or a 0 length file?
18:33 * JoeJulian needs to get door hangers made up for his next hotel stay. "Rm 404 - Not Found"
18:33 Supermathie JoeJulian: I get a dump
18:33 bulde joined #gluster
18:34 Supermathie wind_from=dht_getxattr
18:34 Supermathie wind_to=subvol->fops->getxattr
18:34 Supermathie unwind_to=dht_getxattr_cbk
18:34 JoeJulian hmm, let me try this test a little differently
18:37 JoeJulian 3 brick distribute volume - same result as before
18:37 Supermathie JoeJulian: Tried on a replicated?
18:38 bennyturns joined #gluster
18:38 JoeJulian yep, that hangs.
18:38 JoeJulian file a bug
18:38 Supermathie har har
18:38 glusterbot http://goo.gl/UUuCq
18:38 Supermathie i winz
18:38 dustint joined #gluster
18:38 premera I have a two-brick replica, would like to add a 3rd one, I am doing this: "gluster volume add-brick gv0 replica 3 192.168.50.12:/export/brick1". Is that all ? Is the 3rd brick going to get all data synced eventually ? Do I need somehow explicitly initiate sync process to the 3rd brick ? Can I monitor sync progress ?
18:39 Supermathie yes, yes, no, yes
18:39 premera thank you, so how do I monitor progress, is it healing ?
18:40 Supermathie premera: 'volume heal gv0 info' shows... what needs to be done so far. I think.
18:41 premera ok, thank you, will check it out
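Putting premera's steps together, the sequence would look something like the following; the `heal ... full` line is optional on 3.3+ since files also heal in the background as they are accessed, but it kicks off a proactive walk of the volume:

    gluster volume add-brick gv0 replica 3 192.168.50.12:/export/brick1
    gluster volume heal gv0 full                  # proactively sync the new brick
    gluster volume heal gv0 info                  # entries still pending heal
    watch -n 60 'gluster volume heal gv0 info'    # rough progress monitor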
18:41 JoeJulian I just keep wanting to give snarky answers because of your nick, premera. Sorry if one escapes me. I've just had to talk to premera customer service before...
18:41 premera :-)
18:44 tbrown747_ JoeJulian: thanks again for your help, I will follow the samba thread. I wonder why XFS behaves differently though? It seems like flock() is not a traditional file lock, perhaps the gluster client is just being more considerate of it
18:49 bchilds say i have a gluster volume mounted through FUSE, and i'm writing N kb to it.  for what N does FUSE write the data out to the bricks/become consistent?
18:49 bchilds i'm trying to find the FUSE buffer size for writes before it spills into a new transaction to optimize write performance
18:52 semiosis bchilds: see write behind option
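The write-behind knobs semiosis is pointing at are per-volume tunables; something along these lines would enlarge the buffer (volume name and value are illustrative, not recommendations):

    gluster volume set myvol performance.write-behind-window-size 1MB
    gluster volume set myvol performance.flush-behind on
    gluster volume info myvol      # confirm the options took effect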
18:55 uli JoeJulian, i did come one step further... logfile for brick1 says E [posix.c:2583:posix_getxattr] 0-volume1-posix: getxattr failed on /export/brick1/: user.DOSATTRIB (No data available)
18:56 uli but i can setfattr -n user.DOSATTR -v test /export/brick1/file.txt
18:56 uli brick1 is mounted: ext4 (rw,relatime,user_xattr,barrier=1,data=ordered)
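One way to narrow down where uli's failure happens is to run the same xattr test through the FUSE mount (what samba sees) and directly on the brick; the file name is hypothetical, the paths are the ones quoted above:

    # through the gluster mount
    setfattr -n user.DOSATTRIB -v test /samba/glusterfs/file.txt
    getfattr -n user.DOSATTRIB /samba/glusterfs/file.txt
    # directly on the brick, bypassing gluster
    setfattr -n user.DOSATTRIB -v test /export/brick1/file.txt
    getfattr -n user.DOSATTRIB /export/brick1/file.txt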
18:59 bennyturns joined #gluster
18:59 chouchins joined #gluster
19:03 aravindavk joined #gluster
19:04 sjoeboo_ joined #gluster
19:06 glusterbot New news from newglusterbugs: [Bug 961506] getfattr can hang when trying to get an attribute that doesn't exist <http://goo.gl/K0EVA>
19:14 chlunde joined #gluster
19:17 JoeJulian @ext4
19:17 glusterbot JoeJulian: (#1) Read about the ext4 problem at http://goo.gl/xPEYQ or (#2) Track the ext4 bugzilla report at http://goo.gl/CO1VZ
19:21 chouchins JoeJulian: do you have a good doc on getting glusterfs 3.4a2 working with ovirt 3.2.1 ?
19:21 chouchins having a terrible time getting this to work
19:21 JoeJulian I've never had a chance to use oVirt
19:21 chouchins we're building our new datacenter around glusterfs and ovirt
19:22 chouchins thought I'd try out the new glusterfs vdsm stuff in 3.2.1
19:22 Supermathie HEY! That reminds me... has anybody made an iSCSI translator for GlusterFS? So you can export LUNs?
19:22 JoeJulian Cool. I would suspect they'd probably have more knowledge about that in #ovirt
19:22 chouchins no problem.  Will you be at the Red Hat Summit in a few weeks?
19:23 JoeJulian Supermathie: A block device translator has been developed. I haven't checked to see if it made it into 3.4
19:23 JoeJulian I will be there.
19:23 chouchins great :)  If I remember right I owe you a beer still.
19:23 JoeJulian :)
19:24 dustint joined #gluster
19:30 chouchin_ joined #gluster
19:39 a2_ bchilds, 128KB
19:39 JoeJulian a2_: You pinged last night?
19:41 a2_ JoeJulian, hey yes.. wanted to check if we can disable glusterbot's feature of enforcing +i on the channel?
19:42 JoeJulian I thought I had it set to enforce -i... hmmm
19:43 mjrosenb Supermathie: JoeJulian you guys remember the gluster / beat hazard problem I was having yesterday?
19:43 turtles_ joined #gluster
19:43 turtles_ Hi, is it possible to modify a volume's replication level?
19:43 bennyturns joined #gluster
19:44 JoeJulian a2_: Any thoughts on this self-heal loop? http://ur1.ca/drmif
19:45 glusterbot Title: #11273 Fedora Project Pastebin (at ur1.ca)
19:45 JoeJulian turtles_: Yes, use add-brick or remove-brick. Set the new replica level and add/remove the associated bricks.
19:45 turtles_ Also, how do I tell which bricks data is replicated on? Like, if I had six bricks in one machine, a replication level of three, and five servers, how can I be sure the data isn't all on the first server?
19:46 turtles_ JoeJulian: thanks
19:46 JoeJulian ~brick order | turtles_
19:46 glusterbot turtles_: Replicas are defined in the order bricks are listed in the volume create command. So gluster volume create myvol replica 2 server1:/data/brick1 server2:/data/brick1 server3:/data/brick1 server4:/data/brick1 will replicate between server1 and server2 and replicate between server3 and server4.
19:47 turtles_ Cool, ok. Thanks!
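For turtles_'s other question (which bricks actually hold a given file), the FUSE client exposes a virtual pathinfo xattr that lists the backing bricks; mountpoint and filename here are placeholders:

    getfattr -n trusted.glusterfs.pathinfo /mnt/myvol/somefile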
19:51 Supermathie mjrosenb: Mmmmmhmmmm
19:53 dewey joined #gluster
19:53 mjrosenb do you guys think this is a problem with gluster, or normal gluster behaviour that beat hazard is just reacting to in a stupid way?
19:55 a2_ JoeJulian, i don't think that's a loop.. lookup() on / happens pretty much for every operation
19:55 JoeJulian There were no operations.
19:55 a2_ that's a client logfile right?
19:55 JoeJulian right
19:56 a2_ there must have been *SOME* operations.. the client never initiates anything itself
19:56 a2_ some background process
19:56 JoeJulian I did a stat on the mountpoint.
19:56 JoeJulian Then that went on forever.
19:56 JoeJulian I finally wiped the third replica and let it heal completely.
19:57 JoeJulian I wondered if it was that 3-way replication bug that jdarcy found.
19:57 samppah
19:57 turtles_ JoeJulian: I am attempting to execute this command but getting a "wrong brick type" error  gluster volume add-brick glusterv1 replica 1 gluster2:/mnt/sda3
19:58 turtles_ which seems to be the format outlined in the documentation I've been able to find
19:58 JoeJulian turtles_: If you want to reduce the replica count, you'll need to use remove-brick
19:58 a2_ JoeJulian, maybe.. i'm not sure
19:59 turtles_ Ok, I'm still getting that error with gluster volume add-brick glusterv1 replica 3 gluster2:/mnt/sda3
20:00 JoeJulian ~pasteinfo | turtles_
20:00 glusterbot turtles_: Please paste the output of "gluster volume info" to http://fpaste.org or http://dpaste.org then paste the link that's generated here.
20:00 turtles_ http://fpaste.org/11351/68129649/
20:00 glusterbot Title: #11351 Fedora Project Pastebin (at fpaste.org)
20:00 andreask joined #gluster
20:01 chouchins joined #gluster
20:02 JoeJulian is gluster2 a peer?
20:04 turtles_ It is
20:04 bulde1 joined #gluster
20:05 turtles_ Could it be a matter of using 3.2?
20:05 turtles_ In the process of moving to 3.3 now
20:09 JoeJulian That might be... I don't remember if 3.2 could change replica count.
20:20 StarBeast joined #gluster
20:22 nueces joined #gluster
20:23 StarBeast joined #gluster
20:24 StarBeast joined #gluster
20:25 albel727 a total newb here with stupid questions. is glusterd daemon required to be run on all nodes with bricks? or is it just for dynamic configuration, and if one doesn't need that, he can avoid it and only use glusterfs and vol files?
20:25 JoeJulian ~processes | albel727
20:25 glusterbot albel727: the GlusterFS core uses three process names: glusterd (management daemon, one per server); glusterfsd (brick export daemon, one per brick); glusterfs (FUSE client, one per client mount point; also NFS daemon, one per server). There are also two auxiliary processes: gsyncd (for geo-replication) and glustershd (for automatic self-heal). See http://goo.gl/hJBvL for more information.
20:26 JoeJulian ~glossary | albel727
20:26 glusterbot albel727: A "server" hosts "bricks" (ie. server1:/foo) which belong to a "volume"  which is accessed from a "client"  . The "master" geosynchronizes a "volume" to a "slave" (ie. remote1:/data/foo).
20:26 JoeJulian I think those two factoids might answer your question.
20:26 albel727 thank you. although the first link seems to be dead.
20:26 JoeJulian but yes, you /can/ write your own volfiles
20:27 JoeJulian It's not recommended.
20:27 albel727 why? if I want static configuration, that seems to be perfect. what else is there to it?
20:28 JoeJulian Mostly because it requires that you understand translators and how they stack to form a graph and it makes it a total pain to diagnose if you need help because we don't know where you're starting from.
20:29 JoeJulian But if you already know what you're doing, you don't need to ask me if you can do it. :P
20:29 albel727 heh, true enough.
20:29 semiosis JoeJulian: c.g.o is finally down
20:29 semiosis !
20:30 semiosis Supermathie: ping
20:30 JoeJulian I always use the caveat that if you know what you're doing, feel free to ignore any or all recommendations.
20:31 Supermathie semiosis: PONG
20:31 * JoeJulian felt a great disturbance in the Force, as if millions of voices suddenly cried out in terror and were suddenly much happier that that Q&A site was gone!
20:32 albel727 well, I'll surely try to shoot myself in the foot and try using manually written volfiles first.  pity I don't see any documentation on them.
20:33 JoeJulian I think if you use search on the wiki, you should be able to find some of the old translator documentation.
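If albel727 does go the static-volfile route, the daemons can be pointed at hand-written files directly, bypassing glusterd entirely (and with it peer management and online `volume set`); the paths below are hypothetical:

    glusterfsd -f /etc/glusterfs/brick.vol              # brick export daemon from a static volfile
    glusterfs -f /etc/glusterfs/client.vol /mnt/test    # client graph mounted at /mnt/test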
20:34 semiosis Supermathie: johnmark sent an email in early march asking for input on alternative discussion sites, since community.gluster.org was going to be shut down.  do you have any thoughts on that?  perhaps regarding discourse?
20:35 semiosis http://gluster.org/pipermail/gluster-infra/2013-March.txt
20:35 glusterbot <http://goo.gl/OmuBr> (at gluster.org)
20:35 Supermathie JoeJulian: No entry for 'translator' in glossary?
20:35 Supermathie semiosis: BRB, will answer shortly
20:35 chirino joined #gluster
20:35 semiosis Supermathie: thanks, any time
20:36 JoeJulian My feelings are: q&a sites are inherently broken. People that go to them don't understand what a question is (or an answer for that matter). Let the people that want to do support in that fashion use stackexchange.
20:37 turtles_ JoeJulian: following up, moving to 3.3 made the add/remove brick stuff possible
20:37 albel727 yeah, found some info on vol-files. and most importantly, I've got an idea where to look for translators and their options. will be digging in that direction, thanks again.
20:37 JoeJulian excellent
20:47 nueces joined #gluster
20:48 Supermathie semiosis: HTG's Discourse site has some help topics http://discuss.howtogeek.com/category/computer-help, feel free to browse through there and see how that's working. I think Discourse would be great as a *discussion* board, perhaps for suggestions and broader questions that would get nailed at SF for being too vague. It's a handy place to post code snippets - github-style markdown and oneboxing are all supported.
20:48 glusterbot <http://goo.gl/tzGtY> (at discuss.howtogeek.com)
20:48 piotrektt_ semiosis: thx for your beta repos for ubuntu :)
20:48 Supermathie Having a forum like Discourse (instead of? along with?) the mailing list also allows you to curate a presence, for what it's worth.
20:50 piotrektt_ oh, and by the way: glusterd supports multithreading; is that a default option or is there something to turn on?
20:50 piotrektt_ i mean it supports it from 3.4
20:52 Supermathie semiosis: If we wanted to try it out, I'm sure I could set up another site on the servers.
20:53 Supermathie semiosis: A better site to check would probably be http://discuss.emberjs.com/
20:53 glusterbot Title: Ember.JS (at discuss.emberjs.com)
20:57 Supermathie piotrektt_: Is that not performance.client-io-threads or performance.nfs.io-threads?
20:58 piotrektt_ Supermathie: that's why I am asking, I don't know. I've read about multithreading on the gluster community page, and I wonder
20:59 andreask joined #gluster
21:00 piotrektt_ Supermathie: http://www.gluster.org/2013/05/get-yer-3-4-beta-open-community-ftw/
21:00 glusterbot <http://goo.gl/mwYAZ> (at www.gluster.org)
21:01 Supermathie piotrektt_: Not sure either :)
21:01 piotrektt_ it's in glusterd section so i bet that's for communication between servers
21:02 piotrektt_ oh, btw. do you know any test on performance with gluster with cifs and nfs?
21:02 Supermathie piotrektt_: My understanding is that the clients (fuse client, nfs server, etc) are responsible for all the work. There's no 'communication between servers'...
21:03 piotrektt_ Supermathie: oh, i think opposite :)
21:03 piotrektt_ but, computers are a lot like philosophy, so I can be wrong :)
21:05 JoeJulian There's very little communication between servers.
21:05 piotrektt_ Supermathie: so you want to tell that when client is sending to a replicated volume it's sending to 2 or more volumes at once?
21:06 JoeJulian replicated volumes send to 2 or more bricks at once.
21:07 Supermathie http://www.websequencediagrams.com/files/render?link=Q5L4oIoO10o6od6RRMHT <- this is my understanding
21:07 glusterbot <http://goo.gl/hXmvL> (at www.websequencediagrams.com)
21:07 piotrektt_ yeah, so that's the reason cifs mounted share is so slow... hmm...
21:08 piotrektt_ when a cifs client is slow... write will be slow...
21:08 piotrektt_ oh
21:09 Supermathie disclaimer: that hasn't been vetted and is my understanding only :p
21:10 JoeJulian Hmm, interesting. You should get ndevos to coordinate with you on that.
21:10 Supermathie ♥ diagrams
21:11 piotrektt_ Supermathie: didn't I say computers are like philosophy? :P
21:11 JoeJulian ndevos is the guy who added gluster decoding to wireshark.
21:13 piotrektt_ hmm... the coolest option I could think of that gluster could have is letting me decide which interface is used to read and which to write. I suppose it's hard to implement :/
21:14 Supermathie lol ew
21:15 Supermathie JoeJulian: decoding of gluster comm frames? nice.
21:24 bulde joined #gluster
21:25 sjoeboo_ joined #gluster
21:26 piotrektt__ joined #gluster
21:27 chirino_m joined #gluster
21:28 baskin joined #gluster
21:30 premera is there a way to reset stats for volume top ? seems my top files no longer exist
22:10 andrei_ has anyone here managed to get reliable results with rdma transport?
22:10 andrei_ i would like to discuss how you've managed that please
22:11 andrei_ JoeJulian: might be interestesting for you. I've done some more testing
22:11 andrei_ and it seems that when you are creating a gluster volume and specify tcp,rdma transport the performance suffers
22:12 andrei_ even if you are mounting volume with tcp
22:15 vex joined #gluster
22:15 vex joined #gluster
22:23 andrei_ Naw, I was wrong. performance is the same regardless of whether you use protocol tcp or tcp,rdma
22:26 semiosis Supermathie: been afk for a while but thanks for the links & info
22:28 bulde joined #gluster
22:32 GLHMarmot joined #gluster
22:44 sjoeboo_ joined #gluster
22:48 nueces joined #gluster
23:36 GLHMarmot joined #gluster
23:37 plarsen joined #gluster
23:41 yinyin joined #gluster
23:42 flrichar joined #gluster
23:48 flrichar joined #gluster
