IRC log for #gluster, 2012-12-18

All times shown according to UTC.

Time Nick Message
00:00 semiosis and we have compilation
00:00 semiosis \o/
00:00 JoeJulian o/\o
00:00 Ryan_Lane \o/
00:00 semiosis https://launchpad.net/~semiosis/+archive/ubuntu-glusterfs-3.3/+build/4070190
00:00 glusterbot <http://goo.gl/JaHhx> (at launchpad.net)
00:03 Ryan_Lane semiosis: thanks :)
00:03 Ryan_Lane I hope this fixes my issue
00:04 johnmark Ryan_Lane: fingers crossed
00:04 semiosis i hope so too
00:04 Ryan_Lane we were in the middle of transitioning our homedirs to gluster when this happened. still in the middle of it. hopefully will get a chance to test this soon.
00:05 semiosis Ryan_Lane: are you by any chance heading to monitorama?
00:05 Ryan_Lane nope. I'm trying not to travel :)
00:05 semiosis oh ok
00:05 Ryan_Lane I think my next trip may be fosdem
00:06 semiosis oh thats a big one
00:06 Ryan_Lane yeah. always lots and lots of good talks to see
00:06 Ryan_Lane though my main room talk didn't get accepted. I'm going to have to try for the cloud devroom
00:10 semiosis lucid packages built successfully
00:10 semiosis they'll be in the repos shortly
00:12 Ryan_Lane I think they are already there
00:13 semiosis probably
00:14 semiosis https://launchpad.net/~semiosis/+archive/ubuntu-glusterfs-3.3/+packages -- still says pending
00:14 glusterbot <http://goo.gl/3YP68> (at launchpad.net)
00:14 semiosis there we go
00:14 semiosis this deserves a tweet
00:20 badone joined #gluster
00:24 Staples84 joined #gluster
00:28 Kins Can anyone give me a description of what the "replica COUNT" option is for volume create? How exactly does this work if I have a client and a server, use a replica count of two, and later add an extra client node?
00:29 JoeJulian You can have as many clients as you want. Replicas are for keeping ... well... replicas of files on multiple bricks. They're grouped in the order listed so replica 2 server1:/data/brick server2:/data/brick server3:/data/brick server4:/data/brick" will have replicated files on server1&2 and server3&4.
00:30 Kins I see
00:30 Kins Is there a way to modify that on a created volume?
00:36 mooperd joined #gluster
00:36 Kins JoeJulian, thanks for your answer ;)
00:36 JoeJulian Sorry, was under the desk...
00:37 Kins Haha, no problem.
00:37 Kins I wasn't sure if you would respond to the second question, and didn't want to forget to thank you.
00:37 JoeJulian Yes, you can just do that using add-brick but you'll have to add enough bricks to complete the change.
00:37 kwevers joined #gluster
00:38 JoeJulian So if you have a 2 brick distribute volume, you can add-brick replica 2 <brick> <brick> and it'll become a 2x2 distribute-replicate volume.
00:39 Kins What if I had a 2 brick replica, and did add-brick replica 1 <brick>
00:45 JoeJulian Then it should become a 3 brick distributed volume.
00:45 Kins Ok cool, thanks
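A minimal sketch of the commands behind that exchange, with hypothetical server names (server1..server4), volume names (myvol, othervol) and brick paths:

    # Bricks are grouped into replica sets in the order listed, so
    # server1/server2 form one replicated pair and server3/server4 the other.
    gluster volume create myvol replica 2 \
        server1:/data/brick server2:/data/brick \
        server3:/data/brick server4:/data/brick

    # Turning an existing 2-brick distribute volume (server1, server2) into a
    # 2x2 distribute-replicate volume: raise the replica count on add-brick
    # and supply enough new bricks to complete the change (GlusterFS 3.3+).
    gluster volume add-brick othervol replica 2 \
        server3:/data/brick server4:/data/brick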
00:53 glusterbot New news from resolvedglusterbugs: [Bug 764966] gerrit integration fixes <http://goo.gl/AZDsh>
01:03 Kins I think my pool is messed up, is there a way to forcefully delete all peers?
01:05 JoeJulian quick and easy, assuming you have no data to consider, is to delete /var/lib/glusterd/* from all your servers after stopping all the ,,(processes)
01:05 glusterbot the GlusterFS core uses three process names: glusterd (management daemon, one per server); glusterfsd (brick export daemon, one per brick); glusterfs (FUSE client, one per client mount point; also NFS daemon, one per server). There are also two auxiliary processes: gsyncd (for geo-replication) and glustershd (for automatic self-heal). See http://goo.gl/hJBvL for more information.
01:05 Kins I'll try that, thanks
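A sketch of the reset JoeJulian describes, assuming there is no data worth keeping; run on every server, with paths and service names as on a typical GlusterFS 3.3 install:

    # Stop the management daemon, then the brick daemons and any remaining
    # client/NFS/self-heal processes.
    service glusterd stop
    pkill glusterfsd
    pkill glusterfs

    # Wipe the peer and volume state, then start fresh.
    rm -rf /var/lib/glusterd/*
    service glusterd start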
01:10 dalekurt joined #gluster
01:30 hagarth joined #gluster
01:30 wushudoin joined #gluster
01:45 Kins So, I guess fstab entries for gluster don't work in ubuntu? Breaks the boot process
01:46 semiosis what version of glusterfs?  what version of ubuntu?  what's the fstab line?
01:46 semiosis they can be made to work
01:47 Kins 3.3.1, 12.04,  web02:/gv0 /gluster glusterfs defaults 0 0
01:47 m0zes defaults,_netdev ?
01:47 semiosis is that on machine web02?
01:47 Kins I had to take out _netdev, as it was putting out an error message
01:47 Kins Yeah
01:47 semiosis m0zes: _netdev does nothing on ubuntu :(
01:48 semiosis and glusterfs gives a silly warning which can be ignored
01:48 semiosis so you're effectively trying to mount from localhost?
01:48 m0zes semiosis: leave it to ubuntu to break something that everyone uses
01:48 Kins I guess
01:49 semiosis you can add 'nobootwait' to your fstab options, that will prevent it from holding up the boot process
01:49 semiosis as for why it's not working, gotta check the client log file
01:50 semiosis m0zes: things change
01:50 semiosis ... well, on some distros... others are still using kernel 2.6!
01:50 semiosis ba-zing :P
01:50 Kins Well, nobootwait, will it still mount?
01:50 * m0zes is running 3.7. whee gentoo
01:50 Kins OR will it just skip it
01:50 semiosis there you go
01:51 semiosis Kins: no idea... if it's failing to mount there's probably something wrong, but at least with nobootwait your system will boot and you can investigate
01:51 Kins Ok, I will try it!
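The fstab line under discussion with the suggested option added; nobootwait is Ubuntu-specific and only keeps a failed mount from blocking boot, it does not fix the underlying mount problem. The client log for this mount point should land in /var/log/glusterfs/gluster.log (the mount path with slashes turned into dashes).

    # /etc/fstab
    web02:/gv0  /gluster  glusterfs  defaults,nobootwait  0  0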
01:52 semiosis Kins: do you have some exotic networking setup? bridged interface? bonding?
01:52 Kins Not as far as I know
01:52 semiosis hmm then thats odd
01:52 semiosis i have to go
01:52 Kins I didn't do anything to the default, its a rackspace cloud server
01:52 Kins Running their ubuntu image.
01:53 semiosis will be back later or tomorrow
01:53 Kins Thanks for your helop
01:53 semiosis yw
01:53 Kins help* :S
01:53 semiosis i think we can get it working :)
01:53 semiosis later
01:54 kevein joined #gluster
02:27 bharata joined #gluster
02:33 purpleidea joined #gluster
02:34 grade__ joined #gluster
02:35 hchiramm_ joined #gluster
02:40 grade__ hi guys! I was successful with mounting gluster volume in autofs.
02:42 grade__ I have another question. I can access and read the volume as an ordinary user at my client, but I can't write to the volume as an ordinary user.
02:43 grade__ is there a config option that would let an ordinary user account write to it? thank you.
03:03 m0zes grade__: does the directory/file you are writing to have permissions that should allow the user to write? in the case of my homedir, /homes is the volume, /homes/m0zes is owned by m0zes:mozes_users with a chmod of 755.
03:05 m0zes s/volume/mount point/
03:05 glusterbot What m0zes meant to say was: grade__: does the directory/file you are writing to have permissions that should allow the user to write? in the case of my homedir, /homes is the mount point, /homes/m0zes is owned by m0zes:mozes_users with a chmod of 755.
03:11 grade__ hi mozes. thank you for your reply. where should I add the permission at the gluster server mount dir or at my client mount dir?
03:15 bauruine joined #gluster
03:16 m0zes grade__: client mount dir, I believe.
03:17 m0zes I have never needed to let normal users write to the root of a glusterfs mount, only to subdirectories of that mount.
03:18 grade__ i see thank you mozes. I'll try your suggestion above.
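A minimal sketch of m0zes's suggestion, using a hypothetical user grade and group users, with the volume mounted at /homes on the client: create a per-user subdirectory through the client mount (not on the bricks) and hand ownership to that user.

    mkdir /homes/grade
    chown grade:users /homes/grade
    chmod 755 /homes/grade   # owner can write, everyone else can read/traverse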
03:43 sgowda joined #gluster
03:44 shylesh joined #gluster
04:03 scott_ joined #gluster
04:03 scott_ Is anyone on?
04:05 Guest53902 Anyone using ZFS on Linux with GlusterFS?
04:05 Guest53902 or get ZFS snapshots working with Gluster?
04:16 sgowda joined #gluster
04:19 Guest53902 the ZFS mountpoint is /sp1, and my glusterfs mountpoint is /export.  An ls -l /sp1 shows: drwxr-xr-x   3 root root      4 Dec 17 20:10 . dr-xr-xr-x. 24 root root   4096 Dec 17 19:53 .. drw-------   5 root root      5 Dec 17 20:10 .glusterfs -rw-r--r--   2 root root 453171 Dec 17 19:54 ps.txt dr-xr-xr-x   1 root root      0 Dec 17 19:49 .zfs
04:20 Guest53902 ls -lh /sp1/.zfs shows:
04:20 Guest53902 total 0 dr-xr-xr-x 2 root root 2 Dec 17 20:20 shares dr-xr-xr-x 2 root root 2 Dec 17 19:49 snapshot
04:20 Guest53902 Under /export:
04:21 Guest53902 [root@gfs01 export]# ls -la /export
04:21 Guest53902 ls: cannot access /export/.zfs: No data available
04:21 Guest53902 ???????????  ? ?    ?         ?            ? .zfs
04:21 Guest53902 while i can take snapshots, they aren't accessible to the gluster client.  Any ideas?
04:28 m0zes Guest53902: it isn't recommended to change the bricks directly, and that is what a snapshot like that is. replicated volume?
04:29 m0zes Guest53902: I am guessing the servers see the snapshot dirs, try to set the  correct xattrs on them, fail, and the servers split-brain...
04:29 Guest53902 I'm using a single virtual Centos 6.3 server for testing.  So no, Gluster is set to the default of distributed.
04:30 m0zes is the snapshot dir writable?
04:30 Guest53902 Nope, the .zfs directory is read-only.
04:30 Guest53902 All ZFS snapshots are read-only as well.
04:31 Guest53902 The snapshot is available under /sp1/.zfs/snapshot/<snapshot name>
04:32 Guest53902 But, under the Gluster mount, I can't even cd to /export/.zfs.
04:32 m0zes that *may* be what gluster is getting hung up on. the clients/servers seem to function when xattrs can be set on the files/folders they're accessing/serving. if it is read-only that may be the only way it can handle it...
04:32 Guest53902 Ah.
04:33 Guest53902 So if gluster can't explicitly set attributes on a directory, it won't be able to access it
04:34 m0zes global snapshots would be a cool idea, I don't know how glusterfs could handle them without having some sort of metadata server and in-depth knowledge of the snapshotting mechanism. :/
04:34 Guest53902 I was really hoping to be able to expose the underlying snapshots to the user.
04:35 Guest53902 Well, by default, snapshots are provided BY zfs... so it should just look like part of the filesystem.
04:36 m0zes yeah, but the knowledge of what is in those snapshots would have to be handled by glusterfs, if it can't set its xattrs.
04:36 m0zes all this is just me theorizing, btw. I don't have extensive knowledge of glusterfs under the hood.
04:37 Guest53902 but maybe you're right.  The snapshot directory would look different depending on which storage node takes the snapshot.  If I have 10 servers using a stripe or distributed (or any combination including replica) they'd look different from server to server.  So just exposing the .zfs directory might cause confusion.
04:37 m0zes definitely.
04:38 Guest53902 I'd have to have some way to give my users snapshots.  I'm looking to from a Solaris/ZFS based storage array to gluster.  So snapshots are a must.
04:38 Guest53902 *to move
04:39 Guest53902 I'd still want to use ZFS for the underlying brick storage because of the performance, compression, and, if I can get it to work with gluster, snapshot capabilities.
04:42 isomorphic joined #gluster
04:44 m0zes those are all great features of zfs. it is unlikely that the necessary tie-ins to glusterfs would happen, though. partially because of the licensing issues with zfs on linux, besides the fact that it would be a massive undertaking that would probably require a re-design of how glusterfs handles metadata.
04:45 m0zes compression is something I have really missed with my move from a zfs based storage server to xfs underneath glusterfs.
04:46 Guest53902 Well, I can do it all except with snapshots.
04:46 Guest53902 How do you handle making snapshots or backups with your installation?
04:47 sgowda joined #gluster
04:48 rastar joined #gluster
04:48 m0zes bacula. not going to work for me much longer, I'm afraid. I have users that put every bad practice into use for *any* filesystem.
04:49 * m0zes has 1 user with 48,000,000 <4K files in his homedir. another with 24,000,000. I've also got users that like to create TB size files, and change a few bytes on a daily basis...
04:49 Guest53902 same for me
04:50 m0zes oh the joys of a campus-wide hpc cluster.
04:51 Guest53902 lol
04:54 m0zes The backup issue may eventually be solved by setting up an automigrating "archive" filesystem (replacing files with symlinks to the archived file if it is old enough). still trying to figure out how to handle that gracefully, and from a software standpoint.
04:57 m0zes it is still a pain to have point-in-time backups.
04:59 vpshastry joined #gluster
05:00 sgowda joined #gluster
05:07 Kins semiosis, are you still around?
05:07 Guest53902 thanks for the info.
05:08 semiosis am I!
05:08 deepakcs joined #gluster
05:08 Kins Heh
05:10 Kins Weird problem even before the other one. Doing ls on the mountpoint for the brick gives an input/output error after gluster starts. If I disable gluster, it mounts fine. The only way I can get everything running is to kill all gluster processes, unmount /mnt/gluster/brick1 and remount it on all clients.
05:10 Kins Then it works :S
05:28 raghu joined #gluster
05:32 semiosis Kins: that is weird... pastie.org your client logs
05:32 semiosis but idk how long i'll be up to review them tonight
05:32 semiosis getting late here
05:32 Kins Yeah, I can't sleep until I get this fixed, haha
05:32 semiosis could throw in a brick log too while you're at it
05:32 Kins Which log file is client logs? gluster.log?
05:32 semiosis /var/log/glusterfs/client-mount-point.log
05:34 Kins I noticed if I disable the volume and restart, it doesn't happen, and if I start the volume after starting up, it doesn't happen.
05:34 Kins Wait, it still does :(
05:34 bala joined #gluster
05:35 Kins semiosis, can you recommend a good command line paste uploader for ubuntu?
05:36 semiosis i just copy & paste into a web browser, but one that i've heard of is pastebinit
05:36 semiosis although pastebin.com is frowned upon by glusterbot
05:36 glusterbot Please use http://fpaste.org or http://dpaste.org . pb has too many ads. Say @paste in channel for info about paste utils.
05:37 semiosis but i think that utility can use a different site
05:37 Kins on ubuntu it defaults to paste.ubuntu.com
05:37 Kins Not a big pastebin.com fan myself
05:37 semiosis well there you go
05:37 glusterbot Please use http://fpaste.org or http://dpaste.org . pb has too many ads. Say @paste in channel for info about paste utils.
05:37 semiosis glusterbot: meh
05:37 glusterbot semiosis: I'm not happy about it either
05:38 Kins @paste
05:38 glusterbot Kins: For RPM based distros you can yum install fpaste, for debian and ubuntu it's dpaste. Then you can easily pipe command output to [fd] paste and it'll give you an url.
05:38 Kins http://paste.ubuntu.com/1446858/
05:38 Kins http://paste.ubuntu.com/1446859/
05:38 glusterbot Title: Ubuntu Pastebin (at paste.ubuntu.com)
05:38 glusterbot Title: Ubuntu Pastebin (at paste.ubuntu.com)
05:39 Kins First is client, second is brick
05:39 semiosis please ,,(pasteinfo) also
05:39 glusterbot Please paste the output of "gluster volume info" to http://fpaste.org or http://dpaste.org then paste the link that's generated here.
05:39 semiosis neither of those is a client log
05:39 semiosis one is your glusterd server log, the other is a brick log
05:40 semiosis ,,(processes)
05:40 glusterbot the GlusterFS core uses three process names: glusterd (management daemon, one per server); glusterfsd (brick export daemon, one per brick); glusterfs (FUSE client, one per client mount point; also NFS daemon, one per server). There are also two auxiliary processes: gsyncd (for geo-replication) and glustershd (for automatic self-heal). See http://goo.gl/hJBvL for more information.
05:41 bulde joined #gluster
05:42 Kins hrmm
05:42 Kins nfs.log, could that be it?
05:43 Kins Doesn't reflect well on me that I can't even find the correct log file :S
05:43 semiosis no, it's named for your mount point, /var/log/glusterfs/client-mount-path.log
05:43 Kins bricks  etc-glusterfs-glusterd.vol.log  geo-replication  geo-replication-slaves  glustershd.log  nfs.log
05:43 Kins All I have
05:44 semiosis then you have never mounted a glusterfs volume on this host
05:44 semiosis ?!
05:44 Kins Can't even get to that point
05:44 Kins Just starting the volume causes this.
05:45 hagarth joined #gluster
05:45 semiosis pastie the output of 'gluster volume info' and also 'mount'
05:46 Kins http://paste.ubuntu.com/1446867/ http://paste.ubuntu.com/1446869/
05:46 glusterbot Title: Ubuntu Pastebin (at paste.ubuntu.com)
05:49 semiosis you may have a client log file on host web03
05:49 semiosis the brick log suggests a client connected from there
05:50 Kins I checked already, nothing
05:50 Kins I can get gluster running and mounted
05:51 semiosis if that were true, you'd have a log file
05:51 semiosis unless you're redirecting your log file somehwere else with a mount option
05:51 semiosis but then i wouldnt have had to tell you the log file location twice ;)
05:51 Kins But I have to kill the server, kill all client processes, umount brick1 then remount it
05:51 Kins Then restart gluster server
05:51 Kins And it mounts
05:51 semiosis that doesnt add up
05:51 Kins I cleaned up the log files a while ago
05:52 Kins Which is why the log wasn't there
05:52 Kins Let me upload the newly created log
05:52 semiosis would you?!
05:52 Kins I really hope this is the right log file!
05:52 Kins http://paste.ubuntu.com/1446874/
05:52 glusterbot Title: Ubuntu Pastebin (at paste.ubuntu.com)
05:54 Kins Could this be some bizarre issue with the hosts I am using? should I try removing my pool, and using IPs instead?
05:55 semiosis the client was unable to reach 10.183.0.104
05:55 semiosis that's one problem
05:55 semiosis bad /etc/hosts entry maybe?
05:56 Kins That is web03
05:56 Kins Maybe.. my iptable entries are messed up?
05:56 Kins 'I can ping it fine
05:56 semiosis could be... ,,(ports)
05:56 glusterbot glusterd's management port is 24007/tcp and 24008/tcp if you use rdma. Bricks (glusterfsd) use 24009 & up. (Deleted volumes do not reset this counter.) Additionally it will listen on 38465-38467/tcp for nfs, also 38468 for NLM since 3.3.0. NFS also depends on rpcbind/portmap on port 111.
05:57 Kins ACCEPT     tcp  --  10.160.0.0/11        anywhere             multiport dports 24007:24047
05:57 Kins Looks right
05:58 Kins I'm not that proficient with iptables though
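A sketch of iptables rules matching the port list glusterbot gives, assuming clients live in 10.160.0.0/11 as in the rule Kins pasted; brick ports are allocated upward from 24009 (one per brick, never reused), so the upper bound here is just a generous guess.

    # Management daemon (24007) and RDMA management (24008)
    iptables -A INPUT -s 10.160.0.0/11 -p tcp --dport 24007:24008 -j ACCEPT
    # Brick daemons, counting up from 24009
    iptables -A INPUT -s 10.160.0.0/11 -p tcp --dport 24009:24047 -j ACCEPT
    # Gluster NFS (38465-38467) and NLM (38468)
    iptables -A INPUT -s 10.160.0.0/11 -p tcp --dport 38465:38468 -j ACCEPT
    # portmap/rpcbind, needed for NFS
    iptables -A INPUT -s 10.160.0.0/11 -p tcp --dport 111 -j ACCEPT
    iptables -A INPUT -s 10.160.0.0/11 -p udp --dport 111 -j ACCEPT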
05:59 semiosis sorry but im too tired, gotta sign off for the night
05:59 Kins Thanks for your help :(
05:59 semiosis i'll catch up with you tmrw
05:59 semiosis yw
05:59 Kins Anything you would suggest trying in the meantime?
06:00 Kins I want to figure this out tonight
06:00 semiosis start over? :)
06:00 semiosis follow the ,,(quick start) guide
06:00 glusterbot http://goo.gl/CDqQY
06:00 Kins I did, guess its my only option
06:00 semiosis good luck
06:00 semiosis and good night
06:00 Kins Night
06:02 Kins Oh wow, I missed this. Note: When using hostnames, the first server needs to be probed from one other server to set it's hostname.
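The probing step Kins quotes, sketched with the hosts from this setup (web02 and web03): probe in one direction first, then probe back so the first server is recorded by hostname rather than by IP.

    # On web02:
    gluster peer probe web03
    # On web03, so web02's entry picks up its hostname instead of an IP:
    gluster peer probe web02
    # Check the result on either host:
    gluster peer status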
06:03 Ryan_Lane joined #gluster
06:29 vimal joined #gluster
06:32 overclk joined #gluster
06:40 bulde joined #gluster
06:50 rgustafs joined #gluster
06:54 rudimeyer joined #gluster
07:08 ramkrsna joined #gluster
07:15 ngoswami joined #gluster
07:17 Nevan joined #gluster
07:23 jtux joined #gluster
07:31 vpshastry joined #gluster
07:46 vpshastry joined #gluster
07:49 ctria joined #gluster
07:59 shireesh joined #gluster
08:07 mooperd joined #gluster
08:11 sripathi joined #gluster
08:11 andreask joined #gluster
08:11 jtux joined #gluster
08:22 ekuric joined #gluster
08:27 nissim joined #gluster
08:37 mdarade joined #gluster
08:38 mdarade left #gluster
08:39 passie joined #gluster
08:52 dobber joined #gluster
08:53 sripathi1 joined #gluster
08:56 glusterbot New news from newglusterbugs: [Bug 888174] low read performance on stripped replicated volume in 3.4.0qa5 <http://goo.gl/OUhHe>
08:58 duerF joined #gluster
09:02 gbrand_ joined #gluster
09:04 mgebbe joined #gluster
09:05 mgebbe___ joined #gluster
09:22 tjikkun_work joined #gluster
09:30 ramkrsna joined #gluster
09:37 DaveS joined #gluster
09:41 sgowda joined #gluster
09:59 tryggvil joined #gluster
10:10 Norky joined #gluster
10:13 wica JoeJulian: There is an option called iagnostics.brick-sys-log-level
10:13 wica diagnostics.brick-sys-log-level
10:14 mohankumar joined #gluster
10:18 shireesh joined #gluster
10:19 wica JoeJulian: I have done now, "gluster volume set glusterfsvol02 diagnostics.client-sys-log-level WARNING" and "gluster volume set glusterfsvol02 diagnostics.brick-sys-log-level WARNING"
10:19 wica and now I get some logs in my syslog
10:47 vimal joined #gluster
10:51 guest2012 joined #gluster
11:00 manik joined #gluster
11:08 bulde joined #gluster
11:12 x4rlos Can i set gluster parameters to allow delete, but not creation of files? :-l
11:21 guest2012 quick one: in 'gluster volume remove-brick ... status' output, which unit is the "size" field in?
11:23 VSpike When creating new vms to host gluster nodes, will they benefit from > 2 cores or not?
11:25 x4rlos VSpike: Im interested in your use of KVMs on gluster. how you finding it?
11:26 VSpike x4rlos: my gluster nodes will be vmware guests, in fact... and I'll let you know soon once it's up and running :)
11:26 x4rlos cool :-)
11:29 x4rlos Gonna try myself with debian/kvm.
11:31 mooperd joined #gluster
11:31 nullck joined #gluster
11:35 bulde joined #gluster
11:36 overclk joined #gluster
11:52 spn joined #gluster
11:57 tryggvil_ joined #gluster
12:02 edward1 joined #gluster
12:12 overclk joined #gluster
12:13 GLHMarmot joined #gluster
12:13 shireesh joined #gluster
12:15 rgustafs joined #gluster
12:20 overclk joined #gluster
12:29 ctria joined #gluster
12:33 kkeithley1 joined #gluster
12:56 VSpike If I'm setting up two gluster nodes for two web servers, and I want to use ucarp, would I be better creating two virtual ips, one with node A as master, one with node B as master, and connecting one web server client to each?
12:56 VSpike In other words, spread the requests across both nodes when both are up
12:57 VSpike Or would it perform just as well with a single virtual ip on one of the nodes with both web servers connected to it?
12:57 VSpike So that by default all client requests go to node A, unless it fails at which point they all go to node B?
12:59 darth joined #gluster
13:02 dustint joined #gluster
13:05 darth has anyone used glusterfs in production? I would like to know the possible problems with performance, and if there is any way to make access faster in case the client is a node of gluster too.
13:06 balunasj joined #gluster
13:12 shireesh joined #gluster
13:12 H__ darth: i'm using gluster in production, but only for some months. Others here can help you much better than I can, but ask away
13:13 bida joined #gluster
13:13 darth thanks H_
13:13 toruonu joined #gluster
13:13 toruonu ok I wanted to revisit the issue that I have folders that cannot be removed:
13:14 toruonu [mario@ied AnalysisCode]$ rm -Rf crab_0_121129_151837/
13:14 toruonu rm: kataloogi `crab_0_121129_151837//share' ei õnnestu kustutada: Directory not empty
13:14 toruonu [mario@ied AnalysisCode]$
13:14 toruonu rm: cannot remove directory `crab_0_121129_151837//share': Directory not empty
13:14 toruonu changed LANG:)
13:15 toruonu the directory listing is this:
13:15 toruonu http://fpaste.org/7mPE/
13:15 glusterbot Title: Viewing crab_0_121129_151837/: total 16 drwx ... 2 mario HEPUsers 16384 Nov 29 15:19 ... crab_0_121129_151837/share: total 0 (at fpaste.org)
13:15 toruonu this directory for sure hasn't been used for days and the server itself has been restarted in the meantime
13:21 overclk joined #gluster
13:25 glusterbot New news from resolvedglusterbugs: [Bug 824472] writes fail on nfs mount <http://goo.gl/gH7Je>
13:40 Norky toruonu, what's the recursive directory listing?
13:41 Norky do "ls -lra crab_0_121129_151837" and paste the result
13:42 toruonu I think the fpaste was recursive :) here's what you asked:
13:42 toruonu http://fpaste.org/3vps/
13:42 glusterbot Title: Viewing total 48 drwxr-xr-x 2 mario HEPUsers ... drwxr-xr-x 20 mario HEPUsers 16384 N ... mario HEPUsers 16384 Nov 29 15:19 . (at fpaste.org)
13:42 toruonu there really isn't anything there
13:42 toruonu it's the share dir that can't be removed for what ever reason
13:42 toruonu and I'm not the only one, those directories popped up before I think we moved to NFS mount and they're still there and non-removable
13:43 guest2012 joined #gluster
13:43 Norky err, sorry, I meant ls -lRa
13:43 toruonu http://fpaste.org/Z2c5/
13:43 glusterbot Title: Viewing crab_0_121129_151837/: total 48 drwx ... hare crab_0_121129_151837/share: tot ... 9 15:18 .nfs83fff0ff36d4e3490000000f (at fpaste.org)
13:43 hagarth joined #gluster
13:43 Norky there you go
13:43 toruonu ah so it was after NFS move
13:44 toruonu still… no help
13:44 toruonu http://fpaste.org/pJV2/
13:44 glusterbot Title: Viewing [mario@ied AnalysisCode]$ rm -f crab ... 29_151837/ [mario@ied AnalysisCode]$ ... not empty [mario@ied AnalysisCode]$ (at fpaste.org)
13:44 toruonu the file didn't actually get removed
13:45 toruonu and the NFS has been remounted many times
13:46 Norky ls -a crab_0_121129_151837/share
13:46 Norky see if the file is still there, or another has been recreated
13:47 toruonu http://fpaste.org/Cq1e/
13:47 glusterbot Title: Viewing -rw-r--r-- 1 mario HEPUsers 11264 No ... Nov 29 15:18 crab_0_121129_151837/s ... (at fpaste.org)
13:47 Norky might be an idea to check glsuter volue heal status
13:47 toruonu I did ls -la to the file I removed and it's still there
13:48 Norky I'm not a gluster expert, just suggesting things
13:49 Norky .nfs83 foo might be something particular to gluster when you mount it via nfs, I'm not sure
13:49 Norky apparently you asked about this before: http://irclog.perlgeek.de/gluster/2012-11-29#i_6196737
13:49 glusterbot <http://goo.gl/oZrhz> (at irclog.perlgeek.de)
13:50 Norky did JoeJulian's advice not help?
13:52 rwheeler joined #gluster
13:53 toruonu he assumed a process may have it open and then had to run to the office :)
13:54 toruonu but … the server has been restarted at least 2x in between and NFS has been mounted from even another gluster server
13:54 toruonu the file is still there and it's not deleting
13:54 toruonu I kind of forgot about it and today stumbled on it again
13:54 toruonu so it's been around for 20 days now :)
13:56 Norky what other machines access this volume?
13:57 Norky including the gluster servers themselves
13:58 Norky "lsof .nfs83fff0ff36d4e3490000000f" on all machines which might have ever accessed the volume
14:00 aliguori joined #gluster
14:10 toruonu I can guarantee that other nodes do NOT use this file
14:10 toruonu we have only one node that has working CRAB and I was the one who created the directory
14:11 toruonu this node has been restarted
14:11 toruonu so it cannot be that other nodes use this directory really … at least I see no reason why they should
14:16 Norky and you've proved that with "lsof" or "fuser"?
14:24 larsks joined #gluster
14:25 mohankumar joined #gluster
14:26 jdarcy IIRC the .nfs* files are created by the NFS *client*.  Actually they're not created, they're the result of renaming instead of unlinking while a file's open.
14:27 nullck joined #gluster
14:27 jdarcy Those are often left lying around, in every NFS implementation as far back as I can remember.
14:27 toruonu but how to get rid of them
14:29 jdarcy toruonu: I haven't read all of the scrollback yet.  Did you try removing that specific file (not the containing directory)?
14:29 toruonu yes
14:29 toruonu rm -f
14:29 toruonu no help
14:29 toruonu no error either
14:30 jdarcy OK.  Did anything show up in the log when you tried that?
14:31 larsks toruonu: Those files are typically created when a file is deleted but a process still has the file open.
14:31 larsks https://uisapp2.iu.edu/confluence-prd/​pages/viewpage.action?pageId=123962105
14:31 glusterbot <http://goo.gl/OZbJ8> (at uisapp2.iu.edu)
14:31 larsks Usually killing the associated process will make them go away (or at least make them deletable).
14:32 larsks That article includes recommendations for using the lsof command to identify the owning process.
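A sketch of that check, using the stale file from the earlier pastes; it has to be run on every machine that has the volume mounted, since only the machine holding the file open will show anything.

    # Inside the mounted volume, on each client:
    lsof crab_0_121129_151837/share/.nfs83fff0ff36d4e3490000000f
    fuser -v crab_0_121129_151837/share/.nfs83fff0ff36d4e3490000000f
    # If a PID turns up, closing or killing that process lets the
    # silly-renamed file be removed.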
14:32 jdarcy If the file does in fact exist on the back end, you could also try deleting it there.
14:33 Norky JoeJulian already gave him that link, back in November
14:33 jdarcy Otherwise it's a caching artifact on the client, which is a bit surprising but not totally unheard of.
14:36 Norky jdarcy, thank you for your advice (a week or two ago) btw
14:36 Norky I've been talking to RH in the UK and they're in agreement that something odd is going on
14:37 toruonu yes the trouble with that hypothesis is that there's no way a process can hold onto it really
14:38 Norky (my problem being Gluster performs several times slower than expected)
14:38 toruonu the directory got stuck at creation time on the node I'm on right now
14:38 kevein joined #gluster
14:38 Norky toruonu, and lsof confirms this?
14:38 toruonu said node has been rebooted 3x and the hardnode that it's on has been rebooted once
14:38 jdarcy Norky: You're welcome, even though I can't remember much of anything nowadays.  ;)
14:38 larsks toruonu: Out of curiousity, you've verified that nothing actually has the file open?
14:39 toruonu [mario@ied AnalysisCode]$ /sbin/fuser crab_0_121129_151837/share/.nfs83fff0ff36d4e3490000000f
14:39 toruonu Cannot stat crab_0_121129_151837/share/.nfs83fff0ff36d4e3490000000f: No such file or directory
14:39 toruonu Cannot stat crab_0_121129_151837/share/.nfs83fff0ff36d4e3490000000f: No such file or directory
14:39 toruonu [mario@ied AnalysisCode]$
14:39 toruonu ehh
14:39 toruonu it seems the file has disappeared in the meantimg
14:39 toruonu s/meantimg/meantime
14:39 toruonu erm
14:39 toruonu or not
14:39 Norky do a directory listing on share again
14:40 toruonu http://fpaste.org/Ns1T/
14:40 glusterbot Title: Viewing [mario@ied AnalysisCode]$ rm -f crab ... 0_121129_151837/share/.nfs83fff0ff36 ... directory [mario@ied AnalysisCode]$ (at fpaste.org)
14:40 toruonu the directory listing shows it
14:40 toruonu ls -la says it's not there
14:40 toruonu rm doesn't complain
14:41 Norky listing on the share directory
14:41 Norky NOT the file
14:41 toruonu [root@ied ~]# lsof |grep crab
14:41 toruonu [root@ied ~]#
14:41 toruonu yes the listing on the share has the same output as before
14:41 toruonu the file is listed
14:41 jdarcy Hm.  "ls -la" says it's not there?  But regular "ls" does show it?
14:42 toruonu http://fpaste.org/iwDn/
14:42 glusterbot Title: Viewing crab_0_121129_151837/share/: total 4 ... rwxr-xr-x 3 mario HEPUsers 16384 Nov ... 9 15:18 .nfs83fff0ff36d4e3490000000f (at fpaste.org)
14:42 plarsen joined #gluster
14:43 jdarcy You'd get something like this if there was a directory entry for a file that actually didn't exist (corrupt directory).  Have you verified whether the file is or is not present *on the bricks* (not from the client)?
14:44 * jdarcy is wondering whether this might be a local-FS error on the bricks.
14:45 Norky stop the volume, umount the brick and fsck?
14:45 mooperd joined #gluster
14:45 toruonu well then there's the nice task of locating where the heck this directory is
14:46 toruonu Norky: not an option unless it's the last thing left and become crucial
14:46 noob2 joined #gluster
14:46 toruonu it's a production system with users on
14:46 toruonu so … how do I find which brick has the file?
14:47 Norky if it's a production system I would suggest you check for corruption sooner, rather than later ;)
14:47 jdarcy toruonu: On the client side, "getfattr -n trusted.glusterfs.pathinfo $path" should show you which brick it's on (even if the directory on that brick is screwed up).
14:48 Norky jdarcy, (or whoever) can one remove a single server from a (replicated) gluster cluster to do maintenance without stopping the volume?
14:49 shireesh joined #gluster
14:49 Norky I should qualify that I mean without breaking things :)
14:50 toruonu well that's kind of the point of replication, no?
14:50 toruonu if a node goes belly up the rest keeps working
14:50 toruonu and once it's back it heals
14:50 toruonu it should work fine because I've done it many times
14:50 toruonu just now had a SATA controller replaced on one of the nodes
14:51 toruonu but we use 3x replication so 2 nodes should keep quorum
14:51 Norky I too think the answer is "yes", but not having thoroughly tested it myself, I wanted confirmation from peopel who know this stuff
14:51 toruonu well I just did that (or well had it done by maintenance) on one of my 6 gluster nodes
14:51 toruonu the volumes kept working nicely
14:51 toruonu and it's not the first time
14:51 toruonu those .nfs issues were there before I ever restarted anything :)
14:53 Norky I dont' think any one is blaming your .nfs83 file on your having restarted a machine
14:53 toruonu oh the fun continues
14:53 toruonu http://fpaste.org/PN8o/
14:53 glusterbot Title: Viewing [root@ied AnalysisCode]# getfattr -n ... e]# getfattr -n trusted.glusterfs.pa ... r directory [root@ied AnalysisCode]# (at fpaste.org)
14:53 Norky I'm just suggesting you do as jdarcy said, work out which server contains that file, then take that server offline and fsck the brick
14:53 toruonu on the crab directory it says not permitted, on the .nfs it says the file doesn't exist :P
14:53 toruonu while the best part is that I used tab completion
14:54 Norky ied is one of the gluster servers?
14:55 Norky and also, I think you want to use the full path
14:55 toruonu ied is the client
14:55 toruonu and no difference with full path
14:55 larsks left #gluster
14:55 Norky actually, I'm confusing myself, I should stop trying to help - I fear I know just enough to be dangerous
14:56 toruonu jdarcy: any other ideas how to determine it without resorting to blindly combing through gluster nodes?
14:56 toruonu btw this getfattr doesn't seem to work on any file… not even the good ones
14:57 jdarcy Permission denied on the getfattr?  As root?  Weird.
14:57 toruonu I even went out of the VZ container to the hardnode, same result
14:57 toruonu [root@wn-d-117 ~]# getfattr -n trusted.glusterfs.pathinfo /home/mario/Summer12/CMSSW_5_3_4/src/AnalysisCode/Configuration/SingleTopSkimmer.py
14:57 toruonu /home/mario/Summer12/CMSSW_5_3_4/src/AnalysisCode/Configuration/SingleTopSkimmer.py: trusted.glusterfs.pathinfo: Operation not supported
14:58 Norky rootsquash?
14:58 toruonu no nfs mount options so possible
14:58 toruonu but does gluster server default to rootsquash?
14:58 jdarcy Oh, are you mounting through NFS or native?
14:58 toruonu NFS
14:58 jdarcy Blah.  Then yeah, pathinfo won't work.
14:59 toruonu native doesn't work well enough (way too slow due to lack of negative caching)
14:59 toruonu at least for /home
14:59 johnmark hrm... negative caching... I think we're working on that
14:59 jdarcy You could do a native mount on one of the servers and get pathinfo through that.
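A sketch of that workaround, where VOLNAME, the scratch mount point and the <path> prefix are placeholders: make a throwaway native (FUSE) mount on one of the servers, run the pathinfo query there, then unmount.

    mount -t glusterfs localhost:/VOLNAME /mnt/fuse-tmp
    getfattr -n trusted.glusterfs.pathinfo \
        /mnt/fuse-tmp/<path>/crab_0_121129_151837/share
    umount /mnt/fuse-tmp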
15:00 jdarcy In late 3.3 (not sure if it made it into 3.3.1) we turned on negative caching in FUSE.  I also have two translators that do it, but they're not committed.
15:00 stopbit joined #gluster
15:00 toruonu well it's not in 3.3.1
15:00 toruonu :)
15:01 toruonu the whole reason I moved from native to nfs was that any stuff users tried to do was extremely slow
15:01 toruonu like 10x slower if not worse
15:01 toruonu negative lookup caching + generic cache are what are needed to make it work for home
15:02 toruonu now first off, the .nfs file doesn't exist itself:
15:02 toruonu [root@se3 /]# getfattr -n trusted.glusterfs.pathinfo /mnt/mario/Summer12/CMSSW_5_3_4/src/AnalysisCode/crab_0_121129_151837/share/.nfs83fff0ff36d4e3490000000f
15:02 toruonu getfattr: /mnt/mario/Summer12/CMSSW_5_3_4/src/AnalysisCode/crab_0_121129_151837/share/.nfs83fff0ff36d4e3490000000f: No such file or directory
15:02 toruonu the share dir I found
15:02 toruonu it's located in 3 locations as it should be for 3 replicas including the server where I did the native mount
15:03 toruonu direct brick ls -la shows the .nfs file
15:03 jdarcy On all three of those bricks?
15:03 toruonu http://fpaste.org/wA7X/
15:03 glusterbot Title: Viewing total 24 drwxr-xr-x 2 1000005 100000 ... 0005 1000002 4096 Nov 29 15:19 .. -r ... 9 15:18 .nfs83fff0ff36d4e3490000000f (at fpaste.org)
15:03 toruonu checking on the other two
15:04 toruonu yup
15:04 jdarcy If you could do a "getfattr -d -e hex -m . $path" on all three, that might give us some clues about what's happening.
15:04 toruonu alrighty
15:05 toruonu you mean on the .nfs itself?
15:05 toruonu or the share directory
15:05 jdarcy Either might be useful.
15:05 jdarcy Probably a bit more so for the file itself.
15:06 toruonu http://fpaste.org/jNcP/
15:06 glusterbot Title: Viewing [root@ganymede ~]# for i in se1 se2 ... rom absolute path names # file: d36/ ... solute path names [root@ganymede ~]# (at fpaste.org)
15:06 rwheeler joined #gluster
15:06 mooperd joined #gluster
15:06 toruonu http://fpaste.org/6Ufj/
15:06 glusterbot Title: Viewing [root@ganymede ~]# for i in se1 se2 ... d4e3490000000f trusted.afr.home0-cli ... solute path names [root@ganymede ~]# (at fpaste.org)
15:06 bennyturns joined #gluster
15:06 Staples84 joined #gluster
15:07 jdarcy OK, same GFID and all changelogs are clear.  That seems reasonably healthy.
15:08 jdarcy The easy thing to do would be to nuke the file on each of the bricks.
15:09 wN joined #gluster
15:14 toruonu well depends what you define easy ;D it's not the only occurrence
15:23 scotty_ joined #gluster
15:24 _Scotty joined #gluster
15:29 semiosis :O
15:31 toruonu what surprised you semiosis?
15:31 semiosis s/:O/good morning/
15:31 glusterbot What semiosis meant to say was: good morning
15:31 Technicool joined #gluster
15:31 toruonu and yes I just checked with a user, he's got tons of such folders
15:32 larsks joined #gluster
15:32 larsks Is libgfapi part of the 3.3 release?  Or is this not yet formally released?
15:32 wnl_work joined #gluster
15:33 kkeithley1 libgfapi has not been formally released. If you want to play with it you need to check out the git tree and build it
15:34 semiosis larsks: future, maybe 3.4, idk
15:34 semiosis but it's in git master
15:34 mooperd joined #gluster
15:34 wnl_work to the guys who ran the BoF at LISA last week: thanks. it was very helpful
15:34 larsks Okay, I'll grab the source.  I'm trying to figure out how best to monitor the health of a gluster environment, and I'm wondering if libgfapi would make it easier to write monitoring tools.
15:34 kkeithley1 That was jdarcy
15:35 wnl_work jdarcy: thanks
15:35 kkeithley1 and/or johnmark
15:35 semiosis @puppet
15:35 glusterbot semiosis: (#1) https://github.com/semiosis/puppet-gluster, or (#2) https://github.com/purpleidea/puppet-gluster
15:36 guest2012 Hi all! After a remove-brick start, should I wait for a <complete> status before committing?
15:36 semiosis larsks: as i mentioned in #crimsonfu -- my puppet module has all the nagios checks i use
15:37 jdarcy wnl_work: You're more than welcome.
15:37 nueces joined #gluster
15:37 guest2012 What if I issue the remove-brick commit just after the start?
15:37 * jdarcy was taking ethics training in another window.  :-P
15:38 neofob hi all, when i replace a brick, is it okay if the new brick has bigger size?
15:39 toruonu soo…. any ideas how to elminate all of the .nfs stale files in one go? or do I need to track them down one by one and kill them?
15:39 larsks semiosis: I see you're mostly trolling the logs for status information.  That's what I was hoping to avoid.  I was hoping for tools that could query gluster directly for the current state of things.
15:39 jdarcy Bigger is easy.  Smaller could lead to some odd behavior.
15:39 neofob jdarcy: thanks, just want to confirm
15:39 mooperd_ joined #gluster
15:40 semiosis larsks: why avoid it?  it works
15:40 semiosis larsks: i'm using 3.1.7, in 3.3.0+ there's a (iirc) gluster volume status command you could use to get info as well
15:41 semiosis but idk exactly what that provides
15:41 shireesh joined #gluster
15:41 larsks semiosis: I really prefer my tools to be able to answer questions about their current status rather than having to extract log data that may not be designed for easy machine extraction.
15:41 jdarcy toruonu: The tricky issue is that they might not all be stale.  Some might still be open but unlinked.  One of many ways that NFS makes our lives painful.
15:41 twx_ @yum
15:41 glusterbot twx_: I do not know about 'yum', but I do know about these similar topics: 'yum repo', 'yum repository', 'yum33 repo', 'yum3.3 repo'
15:42 twx_ @yum repo
15:42 glusterbot twx_: kkeithley's fedorapeople.org yum repository has 32- and 64-bit glusterfs 3.3 packages for RHEL/Fedora/Centos distributions: http://goo.gl/EyoCw
15:42 semiosis larsks: i'm on 3.1.7 still, but since 3.3.0+ you have a new 'gluster volume status' command
15:42 semiosis which may be helpful
15:42 jdarcy larsks: There are several status commands, some generic and some specific e.g. to self-heal or rebalance, but they're just status - not really tools for problem determination.
15:42 semiosis as for machine readable, istr something about a --script option
15:42 semiosis but not sure that's what you want
15:43 larsks Yeah, I know, and most of them produce output that's really more suitable for human consumption (fixed width rather than delimited fields, headers and separators, etc).
15:43 larsks Don't mind me.  I'm just getting started on all of this.
15:43 larsks (PS: ...and
15:44 larsks ...and "gluster volume status" appears to wrap *columns* in weird ways)
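For reference, the commands being discussed, as they exist in 3.3.x; VOLNAME is a placeholder. The output is formatted for humans, so scripting against it takes some parsing.

    # Brick, NFS-server and self-heal-daemon status for every volume
    gluster volume status all
    # Static configuration of one volume
    gluster volume info VOLNAME
    # Self-heal backlog on a replicated volume (3.3+)
    gluster volume heal VOLNAME info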
15:44 wushudoin joined #gluster
15:45 Humble joined #gluster
15:46 VSpike when you do "gluster peer status" on two servers in a cluster, is it expected for it to show the same uuid for both?
15:46 semiosis VSpike: definitely not
15:47 VSpike http://pastie.org/5547863
15:47 glusterbot Title: #5547863 - Pastie (at pastie.org)
15:47 Norky VSpike, it if you're stupid like me and clone the gluster server VMs ;)
15:47 VSpike oh wait, I've got old gluster here.. just remembered, I haven't added the PPA to these
15:47 VSpike Norky: i might have done :)
15:48 wnl_work i did that once by accident, and gluster wouldnt let me add the peer, rightly claiming it was a duplicate
15:48 Norky yeah, I soon learnt, if one is using VM templates, install gluster *AFTER* the clone operation
15:48 manik joined #gluster
15:49 wnl_work thats not necessary. just be sure to remove the file containing the UUID before you clone it
15:49 wnl_work it will get recreated
15:49 wnl_work (obviously you dont want to do that on a system that already has gluster running)
15:50 Norky VSpike, I dunno about debian's packaging, but on RHEL it is the package install routine that generates the UUID, so in my case I fixed it by removing and (re-)installing the packages
15:50 Norky ahh, wnl_work's suggestion sounds even simpler :)
15:50 VSpike I just purged the package anyway so i could install the PPA ones
15:51 VSpike I imagine that will do the job .. not sure where the UUID is installed
15:51 wnl_work thats what ive done in AWS to create an image with gluster pre-installed. seems to work fine
15:51 wnl_work in 3.3 it is in /var/lib/glusterd/glusterd.info
15:51 wnl_work older versions probably put it in /etc somewhere
15:52 VSpike Cool thanks, yes it was in /etc/ here
15:52 VSpike Purge didn't remove it either :/
15:52 semiosis VSpike: /var/lib/glusterd/glusterd.info -- if that's missing (such as after first install, or any other time) then glusterd generates a new one when it starts
15:52 semiosis VSpike: it moved from /etc/glusterd to /var/lib/glusterd with the 3.2 -> 3.3 vers. change
15:53 wnl_work semiosis: is the UUID the only thing that will ever appear in that file?  (just curious)
15:53 semiosis wnl_work: afaik yes
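A sketch of the pre-clone cleanup wnl_work describes, for the 3.3 layout (older releases keep the same file under /etc/glusterd): remove the UUID file on the template so every clone generates its own on first start.

    service glusterd stop
    rm -f /var/lib/glusterd/glusterd.info   # holds only the peer UUID
    # A fresh UUID is generated the next time glusterd starts on the clone.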
15:57 mooperd joined #gluster
15:57 _Scotty Anybody using ZFS as the block store for gluster
15:58 semiosis people have
15:58 VSpike in a fail-over arrangement with gluster, is it reasonable to point all clients at one server in a pair with an arrangement to fail over to the other if the first fails?
15:58 semiosis ~mount server | VSpike
15:58 glusterbot VSpike: (#1) The server specified is only used to retrieve the client volume definition. Once connected, the client connects to all the servers in the volume. See also @rrnds, or (#2) Learn more about the role played by the server specified on the mount command here: http://goo.gl/0EB1u
15:59 _Scotty semiosis: any problems i should be aware of?
15:59 VSpike Oh, that's clever. But in that case, it's probably important to add that I'm going to use NFS
15:59 semiosis _Scotty: iirc, there was an issue on some ZFS implementations with xattrs not being supported on all file types
16:00 semiosis glusterfs needs ,,(extended attributes) on files
16:00 jdarcy VSpike: Then yeah, it's reasonable though (obviously) not optimal in terms of load-balancing and performance.
16:00 glusterbot (#1) To read the extended attributes on the server: getfattr -m .  -d -e hex {filename}, or (#2) For more information on how GlusterFS uses extended attributes, see this article: http://goo.gl/Bf9Er
16:00 VSpike semiosis: I guess with NFS I'd need ideally to have two failovers ... Client A -> Server A (with server B as failover), Client B -> Server B (with server A as failover)
16:01 jdarcy VSpike: Note that there is one case where you actively *do* want to concentrate clients on one server, which is if they're going to rely on NLM for locking.
16:01 daMaestro joined #gluster
16:01 _Scotty semiosis: thanks.
16:03 VSpike jdarcy: yikes. I probably do need locking, since I think Wordpress will require it. I'm really tempted to start with gluster client for simplicity (and given that my load will be reasonably low at first)..
16:03 VSpike and hope that the various caching layers make up for the added slowness. I can keep an eye on it and change it later if it looks sensible
16:04 VSpike Otherwise I can see I'm going to tie myself up in a lot of extra new stuff to learn and definitely take longer to get it right - and possibly make it less reliable through not understanding everything
16:04 _Scotty global namespace question.  i want to move data and home directories to gluster. ideally i would like home directories to be replicated, but not data.  reason being, the storage bricks will be running raid10 but will not be HA. the idea is if the data directory isn't available due to a hw problem, i can swap out a chassis on my own time.  home directories should be always available.  do i need to set up two separate gluster volumes?
16:06 jdarcy If you want different replication then yes, you need two volumes.  Variable replication within a volume is a distant-road-map item.
16:06 Norky yes, _Scotty, replication (or the lack thereof) is a per-volume setting
16:07 _Scotty hmm
16:07 * Norky is too late to be helpful
16:07 _Scotty can you run multiple volumes across the same hardware
16:07 _Scotty id hate to have to use dedicated hardware for each volume
16:07 Norky yes, you can
16:08 _Scotty how do volume quotas work in that regard? if i don't set them, theoretically users could consume all available space with their home directory usage, leaving nothing for the data volume?
16:09 Norky in my case, I have a shelf of about 30TB (after RAID6) per machine. Each 30TB is a LVM PV and I have 4 separate LVs on each machine that constitute four separate gluster volumes
16:09 bdperkin_gone joined #gluster
16:10 Norky there are other ways you might partition a large 'blob' of physical storage - I'm using LVM because it's the RH-supported method
16:10 _Scotty Norky: thanks for the info.  i'm going to be using 12 drive 1U storage nodes. so, with 2tb drives, gives me 12tb usable in raid10.  there'd be one lvm per storage node.
16:10 rwheeler joined #gluster
16:11 Norky "partition" in the general sense - I'm not necessarily suggesting MSDOS (or GPT as it's >2TB) partition tables
16:12 _Scotty Norky: well, it would ideally be zfs so it shouldnt be an issue.  i just need to make sure the xattrs aren't an issue (per semiosis), so ill include that with my testing.
16:13 Norky ahh, ZFS will certainly handle the division of physical storage into discrete size-limited chunks, but I've no idea bout using it with gluster :)
16:13 _Scotty the zfs mountpoint would be /tank.  so if i understand i could create two gluster volumes pointing to /tank/data and /tank/home for example
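A sketch of that layout with hypothetical servers node1..node3, using the /tank/home and /tank/data directories as bricks: one replicated volume for home directories and one plain distributed volume for data, on the same hardware.

    # Home directories, replicated across three servers
    gluster volume create home replica 3 \
        node1:/tank/home node2:/tank/home node3:/tank/home
    gluster volume start home

    # Bulk data, distributed only (no replication)
    gluster volume create data \
        node1:/tank/data node2:/tank/data node3:/tank/data
    gluster volume start data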
16:13 Norky is ZFS doing the RAID, or is this hardware RAID?
16:14 _Scotty Norky: zfs is doing the raid.  i just present it as a local directory to gluster.
16:14 Norky not that it matters terribly, I'm just curious
16:14 _Scotty Norky: i like that zfs provides transparent on-the-fly compression.
16:15 _Scotty Norky: i can take snapshots under zfs as well but id have to see how to present the snapshot dir to Gluster if possible
16:16 Norky so you would do "zfs create /tank/data ; zfs create /tank/home" (and then presumably "zfs set quota")?
16:16 Norky I *think* that would work
16:17 Norky thinking about it, the combination of Gluster and ZFS is interesting
16:17 _Scotty Norky: yeah, i'm running zfsonlinux.  i already set up /etc/zfs/zdev.conf to give meaningful names to the drives.  so it looks like zfs create tank mirror a0 a1 mirror a2 a3, &c. then zfs set compression=on tank.
16:18 Norky test it, or better still wait to see if someone here is more knowledgeable about the two together
16:18 _Scotty Norky: i'd use gluster to enforce directory and subdirectory quotas.  much easier to use gluster for that.
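A sketch of those directory quotas as the 3.3 CLI exposes them, against the hypothetical home volume above; /alice is an illustrative per-user directory, relative to the volume root.

    gluster volume quota home enable
    gluster volume quota home limit-usage /alice 100GB
    # Show configured limits and current usage
    gluster volume quota home list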
16:18 bdperkin joined #gluster
16:19 Norky oo, ZFS on Linux, that's FUSE again isn't it?
16:19 Norky I presumed you were on OpenSolaris or BSD
16:19 _Scotty Norky: on my 12 drive 1U box running zfsonlinux in raid10, i get 900MB/s & 900 IOPS write, 1.8GB/s & 1800 IOPS read
16:20 _Scotty Norky: i use the DKMS kernel module.
16:20 Norky not too terrible
16:21 Norky you've read http://community.gluster.org/a/glusterfs-zfs-on-linux/ ?
16:21 glusterbot <http://goo.gl/uqjE8> (at community.gluster.org)
16:21 Norky well, the article it links too
16:21 _Scotty Norky: plus all the data is checksummed. no write holes. default compression is lzjb, so its lightweight and fast for mediocre compression, but it still averages 1.5:1
16:22 _Scotty Norky: i thumbed through it but it's dated.  also i'd never run dedupe on zfs unless it was specifically for a vdi environment
16:22 Norky aye, ZFS has some lovely features, just last time I looked at it I got the impression it really wasn't ready for serious use on Linux
16:23 Norky that was as while ago though :)
16:23 _Scotty Norky: i've been running it for about a year with no issues in my env
16:24 Norky I shoudl stress that I am making no particular recommendations to use one thing or not use another, I have nowhere near enough experience with gluster to do that :)
16:24 _Scotty Norky: ~300TB
16:24 Norky 300TB is "lots", it's not necessarily "serious" ;)
16:25 _Scotty Norky: lol
16:25 Norky I mean, look at my collection of... errm... art... on my home machine, twice that size
16:25 _Scotty Norky: right now that is split between two 81 drive storage servers. i need to move to a distributed filesystem, and im benchmarking orangefs and gluster
16:26 Norky you're benchmarking your actual application, yes?
16:26 Norky i.e. not setting too much faith in 'artificial' benchmarks
16:27 chirino joined #gluster
16:27 Norky I've come to appreciate that the distinction is quite important for distributed filesystems
16:27 _Scotty Norky: yes in a manner of speaking.  it's a subset of what we have.  there are over 1b files ranging in size from 500 bytes to 1TB.  used for everything from an HPC grid to users editing code in their home directories
16:28 _Scotty Norky: i cant really optimize for a specific use case because its literally "everything"
16:28 Norky ahh, fair enough
16:28 Norky what's the HPC you're doing?
16:29 _Scotty Norky: open grid engine and hadoop
16:29 Norky (I'm setting up gluster for a small HPC cluster for CFD here)
16:30 _Scotty Norky: if you are starting from scratch and need grid engine or pbs, try the rocks clusters distribution
16:30 _Scotty Norky: if it's hadoop try cloudera
16:31 Norky oh, no, the cluster we put in about 5 years ago, with PBS Pro (not my favourite, but Altair are good to work with)
16:32 toruonu what ever you do … never ever touch torque or pbs … or maui
16:32 Norky it's just in this case the customer wanted some storage to replace their aging EMC system
16:33 toruonu if you're starting fresh go with slurm
16:33 Norky I like torque and moab
16:33 toruonu I have a HPC center with 5000 cores and 2PB of storage and the main problem maker is torque + maui
16:33 _Scotty Norky: ah ok.  do you work with a lot of small files? one thing i liked about gluster over orangefs is the fact separate metadata isnt required and the data is stored as-is on the underlying brick
16:33 stat1x joined #gluster
16:33 _Scotty toruonu: agreed
16:33 toruonu have you even seen maui source code?
16:34 toruonu or torque for that matter
16:34 toruonu it's hideous and prone to bugs
16:34 toruonu and it has so many scaling issues
16:34 _Scotty toruonu: grid engine has its fair share of bugs too
16:34 toruonu we're in the process of moving from torque to slurm
16:34 toruonu and so far what I have in slurm works wonders
16:34 Norky Maui had some problems that cuased us to go for Moab - I'd say Moab is *muc* better
16:34 toruonu well Moab afaik is based on shared code with maui
16:34 toruonu they add stuff to explain the expensive license
16:34 toruonu btu the core crap is the same
16:35 Norky it costs money, but given the work they've put into turning maui into moab (mostly throwing the original away from what I understand) it's worth it
16:35 toruonu well … slurm gives you most of it for free :) and scales to 100k jobs / hour and 65k nodes :)
16:36 Norky *shrug* I've never had a problem with Moab
16:36 toruonu has dual-controller setup so if one goes wacko stuff still works etc
16:36 _Scotty Im pretty much stuck with grid engine forever. its what they coded for, so switching engines is a nonstarter
16:36 toruonu well I know a number of Tier2 centers that have Moab and are pissed about it and pushing for WLCG to start supporting slurm officially
16:37 toruonu our problem right now is that the Grid component (CREAM CE) officially only supports torque and partially SGE schedulers
16:37 Norky its scheduling features are too configurable for the  majority of our customers and their baby clusters, but you dont' have to use them
16:37 toruonu also some LSF
16:37 toruonu it's not the scheduling
16:37 toruonu if it just plain gets stuck or can't create reservations
16:37 toruonu my usual problem is that maui can't fill the cluster
16:37 toruonu 5000+ jobslots
16:37 toruonu and maui runs stuck at ca 3700-4400 range
16:38 toruonu where it just can't create reservations
16:38 toruonu and it's not the only issue :)
16:38 stat1x joined #gluster
16:38 toruonu with loads of jobs the system just doesn't always respond with no real big load etc
16:38 Norky *shrug* I've never tried it on so large a system
16:38 toruonu torque+maui/moab is nice up to ca 800-1000 cores
16:38 Norky speaking of LSF, there I do have an opinion
16:38 Norky LSF itself, fine
16:38 toruonu above that I sometimes wish I could meet the developers to vent the anger
16:39 Norky EnginFrame (web frontend) - horrid, work of the devil
16:39 toruonu never used LSF myself, I know some Tier 1 centers use it...
16:39 toruonu but slurm is quite elegant and simple
16:39 toruonu and it's fast
16:39 Norky slumr is the only one I've never touched
16:39 Norky slurm
16:39 Norky I shall, one day
16:40 toruonu slurm forced me however to shared homes which forced me to gluster :P
16:40 toruonu that's the only downside that I saw in slurm… it doesn't support stagein/out by the scheduler itself
16:40 toruonu you have to have shared home
16:40 Norky ...soon as I've finished this list of jobs...
16:41 aliguori joined #gluster
16:47 Norky toruonu, when was the last time you tried Moab?
16:49 Norky from what I recall from a year ago, talking to the folks at clusterresources/adaptivecomputing, Moab has gotten much better at scale...
16:50 toruonu well considering that Moab comes from Maui and I hate maui with my whole being I'm unwilling to commit a single cent on this software
16:50 toruonu therefore never :)
16:50 Norky ahh
16:51 isomorphic joined #gluster
16:51 Norky software prejudice
16:51 Norky I can understand :)
16:51 toruonu I've had too much clusterf*** from torque/maui that I'd ever want to use them again unless I really really have to
16:52 Norky there are any number of pieces of software that might actually be quite good, but I will never go near because previous versions, or other software from the same organisation has caused me serious pain
16:52 toruonu and unless they rewrote everything it can't be good… the source code of maui is awful
16:58 khushildep joined #gluster
17:08 zaitcev joined #gluster
17:17 Norky I have Red Hat Storage, which includes Samba, and hooks for Gluster to start it and share out volumes automatically on startup. Given that there's no NIS or Winbindd on RHS, how should I do centralised user accounts so that Samba can allow authenticated Active Directory users to access the volumes?
17:22 johnmark Norky: there are a few RHS users here, but not that many
17:22 johnmark in any case, if anyone here has SAMBA experience, it's probably still applicable
17:23 Norky well I have Samba itself connected to AD
17:24 Technicool Norky, that may be in the RHS admin guide
17:24 mooperd_ joined #gluster
17:24 Technicool can't remember if i read that or just thought i did though...
17:24 Norky when a user comes to connect, they can authenticate, but the authed user is not associated with any "local" Unix user account
17:25 x4rlos I just added a new gluster volume (replica 2) and the other node didn't have the area mounted (lvm2) - so I mounted it, and now when I try to recreate the volume I get the following error:
17:25 x4rlos volume create: gv0-database: failed: /srv/kvm/test-database or a prefix of it is already part of a volume
17:25 glusterbot x4rlos: To clear that error, follow the instructions at http://goo.gl/YUzrh or see this bug http://goo.gl/YZi8Y
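For reference, the fix glusterbot links to boils down to clearing the leftover volume metadata from the brick directory. A minimal sketch, using the brick path from x4rlos's error above and assuming nothing else is meant to be using it:

    # remove the xattrs left behind by the earlier create attempt
    setfattr -x trusted.glusterfs.volume-id /srv/kvm/test-database
    setfattr -x trusted.gfid /srv/kvm/test-database
    # remove gluster's internal directory if it was created
    rm -rf /srv/kvm/test-database/.glusterfs
    # then restart glusterd and retry the volume create
    service glusterd restart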
17:25 Technicool Norky, it's more of a Samba question than a Gluster one, correct?
17:25 Norky Technicool, yeah, I'm looking at the manual, possibly blindly, my mind is starting to fail
17:25 Technicool we <3 you glusterbot
17:25 Norky not a gluster problem, no
17:25 Norky one specific to RHS
17:26 Technicool Norky, lemme check and see if I am crazier than i assumed
17:27 Technicool Norky, i also thought it had some sort of script in RHS to do the user mapping?   if you have support for the instances you can open up a ticket as well
17:27 x4rlos This is a bit of a bug imo. It started to create the volume, then failed. Now it's in a bit of limbo. Already bug-reported?
17:28 Technicool x4rlos, it is actually a failsafe and intended to be there
17:28 Norky x4rlos, did you click on both links that glusterbot gave you?
17:29 khushildep joined #gluster
17:29 Norky <Technicool> Norky, i also thought it had some sort of script in RHS to do the user mapping? -- quite possibly, I'll keep looking
17:30 Norky I really should stop doing that, no need to quite people in IRC
17:30 Norky quote*
17:30 Technicool im grepping through the doc now, not seeing anything about user mapping but it might be a KB as well
17:30 mohankumar joined #gluster
17:31 Norky googling got me all sorts of links... to RHEL docs
17:31 Norky and the problem is that RHS is different enough from RHEL in this case that I'm stuck :)
17:33 raghu joined #gluster
17:34 Mo___ joined #gluster
17:34 jack_ joined #gluster
17:35 sjoeboo_ Norky: not an RHS guy, but isn't that really just stripped down RHEL to start? can't you just yum install samba-winbind on it?
17:36 x4rlos Norky: Im sorry, i have appropriately kicked myself in the balls.
17:36 sjoeboo_ we have 1 RHS volume, and I'm pretty sure we've added stuff like this..
17:36 Norky ahah! I was being dim
17:36 johnmark sjoeboo_: that should be feasible
17:37 sjoeboo_ johnmark: i think thats how we rev'd to 3.3.1 :-)
17:37 Norky the answer is samba-winbind, which is available for RHS but not installed by default
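The follow-on steps usually look roughly like this; the package and commands are the stock Samba ones, but the realm and user names here are made up, and the exact procedure should be checked against the RHS admin guide linked below:

    yum install samba-winbind
    # with smb.conf already set up for the domain (security = ads, realm = EXAMPLE.COM, ...)
    net ads join -U Administrator
    service winbind start
    # check that AD users now resolve to local accounts
    wbinfo -u
    getent passwd 'EXAMPLE\aduser'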
17:37 johnmark Norky: ok
17:37 johnmark sjoeboo_: heh :)
17:37 sjoeboo_ (on the one RHS volume, the big chunk of our gluster is all centos + gluster rpms)
17:37 Norky x4rlos, kick me in the balls while you're at it
17:37 johnmark lulz
17:37 Norky johnmark, thank you anyway
17:37 Norky https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Storage/2.0/html/Administration_Guide/sect-Administration_Guide-GlusterFS_Client-CIFS.html#idp3752336
17:38 glusterbot <http://goo.gl/kEBkg> (at access.redhat.com)
17:38 Norky it is actually in the admin guide
17:38 x4rlos I like gerrit :-./
17:40 Technicool Norky, sent you a link in PM as well
17:42 khushildep joined #gluster
17:42 Norky thanks Technicool, I think a lot of that is in the RHEL HA guide anyway - clustered Samba is the specific example they give for a clustered service
17:42 _Scotty Norky: lol
17:43 Technicool Norky, was thinking more for the AD integration part
17:43 Norky ahh yes, section 2.7
17:43 Norky that's for tomorrow
17:43 Norky it grows late
17:43 Norky I'm still at work
17:44 Norky and dangerously sober, to boot
17:44 Norky I'm off, good evening folks
17:44 Technicool thats a bad combo, work and sober
17:44 Norky indeed
17:44 _Scotty Thanks Norky!
17:44 Technicool its why i keep a flask at my de....i mean....nm
17:44 Norky for what, _Scotty ?
17:44 _Scotty The information & commentary :)
17:44 Norky ahh, no worries
17:45 _Scotty Technicool: speak for yourself.  i have a whole cabinet in my bottom drawer.  the sun is always above the yardarm.
17:45 _Scotty Technicool: generally only tapped on fridays after 5, though.
17:51 ramkrsna joined #gluster
17:51 ramkrsna joined #gluster
17:58 chirino joined #gluster
18:00 rwheeler_ joined #gluster
18:03 chirino joined #gluster
18:07 _Scotty left #gluster
18:07 Kins Hey semiosis, I started from scratch again last night. Still running into problems.
18:18 mooperd joined #gluster
18:19 chirino joined #gluster
18:25 Ryan_Lane joined #gluster
18:26 noob2 joined #gluster
18:28 theron joined #gluster
18:33 morse__ joined #gluster
18:33 johnmark sjoeboo_: ping
18:33 sjoeboo_ johnmark: pong!
18:33 johnmark sjoeboo_: :D
18:33 johnmark sjoeboo_: let's try to set up a Cambridge meetup for January, if possible
18:34 sjoeboo_ yeah, sounds good to me! I'm on vacation towards the end for a few days but beyond that...
18:34 johnmark sjoeboo_: coolio
18:35 johnmark will flesh out details over email
18:38 sjoeboo_ nice, look forward to it!
18:38 sjoeboo_ we pushed yet another volume into production (WAY too fast) last week
18:40 johnmark haha
18:40 johnmark "way too fast"
18:42 semiosis Kins: the only way to get help here is to describe your problem
18:43 the-me johnmark: could you tell me anything positive about the CVE bug and 3.2.x? :o
18:44 the-me or semiosis :)
18:44 Kins :P Still having the same issue as before. Set up glusterfs from scratch, exactly as described in the QuickStart. Everything works fine. I restart one server, still works. I restart both servers, and I get input/output errors on the brick mounts on both servers/clients
18:45 semiosis the-me: have not had a chance to look at it sorry
18:46 the-me semiosis: it would be a good idea to prioritize it
18:46 semiosis do you have a client log from that?  it would be helpful if you had a log showing everything from the successful mount all the way to the i/o errors, including server restarts in between
18:47 Kins How about if I zip all logs from both servers?
18:47 semiosis Kins: i'd prefer pastebins
18:47 Kins Ok
18:47 Kins From one server, or both?
18:48 semiosis pastebin all the logs, everything you have
18:48 Kins ok
18:48 semiosis sorry but i dont have time today to play the "do you have this log?" game like we did last night
18:49 Kins Sorry, I really appreciate the help.
18:49 Kins Do you prefer all in one pastebin, or separate for each log?
18:49 nissim joined #gluster
18:50 semiosis seperated
18:50 semiosis s/per/par/
18:50 glusterbot What semiosis meant to say was: separated
18:59 johnmark the-me: the one you're trying to get us to backport?
19:00 johnmark the-me: I need to forward your note to devs to get some clarification
19:11 Kizano_werk left #gluster
19:14 y4m4 joined #gluster
19:17 Alpinist joined #gluster
19:24 jbrooks joined #gluster
19:25 khushildep joined #gluster
19:26 tryggvil joined #gluster
19:32 Kins semiosis, http://paste.ubuntu.com/1448306/
19:32 glusterbot Title: Ubuntu Pastebin (at paste.ubuntu.com)
19:34 Kins paste of pastes
19:38 noob2 joined #gluster
19:41 hattenator joined #gluster
19:48 nissim hi
19:48 glusterbot nissim: Despite the fact that friendly greetings are nice, please ask your question. Carefully identify your problem in such a way that when a volunteer has a few minutes, they can offer you a potential solution. These are volunteers, so be patient. Answers may come in a few minutes, or may take hours. If you're still in the channel, someone will eventually offer an answer.
19:49 nissim does anyone know how I can list each brick's files?
19:50 neofob nissim: ls -R ?
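Since a brick is just a directory on the server, something like this per brick does the job; the brick path is hypothetical, and the -prune keeps gluster's internal .glusterfs tree out of the listing:

    find /export/brick1 -path '*/.glusterfs' -prune -o -type f -print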
19:50 nueces joined #gluster
19:57 nissim joined #gluster
19:58 nissim oops
20:06 plarsen joined #gluster
20:10 nissim joined #gluster
20:16 bronaugh joined #gluster
20:19 andreask joined #gluster
20:19 bronaugh ok, so what's involved in creating a glusterfs volume with one brick on top of an existing filesystem containing data?
20:20 dstywho joined #gluster
20:35 JoeJulian Nothing exciting. Just create the volume specifying that filesystem as the brick.
20:36 JoeJulian Assuming I'm understanding your question correctly, of course.
20:36 JoeJulian @glossary
20:36 glusterbot JoeJulian: A "server" hosts "bricks" (ie. server1:/foo) which belong to a "volume"  which is accessed from a "client"  . The "master" geosynchronizes a "volume" to a "slave" (ie. remote1:/data/foo).
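In command form, that answer is roughly the following, with hypothetical host and path names:

    # single-brick volume on top of the existing filesystem
    gluster volume create myvol server1:/mnt/existing-data
    gluster volume start myvol
    # then mount it from a client
    mount -t glusterfs server1:/myvol /mnt/gluster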
20:38 m0zes any thoughts on using these as building blocks for glusterfs? http://www.quantaqct.com/en/01_product/02_detail.php?mid=27&sid=158&id=159&qs=100
20:38 glusterbot <http://goo.gl/A4i9Z> (at www.quantaqct.com)
20:38 m0zes I was thinking a rack full of those, each with 12 3TB disks, 1 256GB ssd and 1 os disk.
20:38 cicero i'll take two
20:39 cicero what about your network
20:39 cicero oh i see
20:39 cicero Network  (4) Intel® 82574L GbE RJ45 ports
20:39 m0zes 10GbE. +$50/node
20:39 cicero (1) Intel® 82599ES 10Gb SFP+ port (optional)
20:39 cicero mm
20:40 m0zes there is also a pcie slot for IB potentially.
20:41 m0zes almost 1.5PB raw in a rack with just 3TB drives.
20:42 m0zes 2PB/rack with 4TB drives is mighty tempting.
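(That arithmetic works out if you assume roughly 40 of those 1U nodes in a rack: 12 x 3TB = 36TB per node, and ~40 x 36TB is about 1.4PB raw; with 4TB drives, ~40 x 48TB is about 1.9PB.)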
20:42 m0zes I am looking to build a "faster than tape" long-term archival system.
20:44 khushildep joined #gluster
20:46 larsks semiosis (et al): re: monitoring, it looks like the git version of glusterfs includes a '--xml' option for, e.g., "gluster volume status" that produces parse-able output.
20:47 semiosis larsks: awesome
20:47 semiosis although xml is so 1995
20:47 johnmark :)
20:47 johnmark larsks: 'tis true
20:48 johnmark m0zes: ah, faster than tape? we can do that!
20:48 larsks semiosis: Yeah, I'll forgive them for using XML as an output format.  At least they don't expect me to write configuration files using XML!
20:50 semiosis hehe
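For example, with a build that has the option (per larsks it was only in git at the time, so the exact XML layout may differ between versions and is worth inspecting before scripting against it):

    gluster volume status all --xml
    # pretty-print the output to see the structure, assuming xmllint is installed
    gluster volume status myvol --xml | xmllint --format -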
20:51 chirino joined #gluster
20:51 y4m4 joined #gluster
20:52 m0zes johnmark: yeah, no problems there. we want a solution in the 2-10PB range, with a cost of roughly half of what a competing tape system wants. the last time we got a quote for something like this, the software for tape was licensed per GB
20:52 m0zes it was >$1,000,000 for ~1PB of storage.
20:53 elyograg i really detest licensing (and especially subscriptions) based on scale.
20:53 H__ m0zes :)
20:53 johnmark heh
20:54 johnmark um, yeah.... that's... not something I'd buy
20:54 m0zes I can put together a 4PB raw glusterfs system, for just the cost of that licensing.
20:55 elyograg generally I'm a fan of free.  I understand that you can't give support away, so we pay for support contracts where we can't do it ourselves or it makes sense to have experts available.
20:56 * m0zes is in an academic hpc environment. it is easy to get money for hardware. software, not so much.
20:58 Kins Speaking of support, anyone available to do some (hopefully quick) gluster consulting for a fairly simple setup? I'm having issues that I just can't figure out, and I don't have the time to work through it.
21:00 JoeJulian semiosis: I'm thinking of spending some vacation time trying to see if I can write a json version of the xml encoder.
21:00 JoeJulian larsks: What's wrong with freeswitch? ;)
21:00 semiosis sounds relaxing ;)
21:00 larsks JoeJulian: what?
21:01 JoeJulian writing configuration files in xml
21:01 larsks Ah.  Yuck.
21:01 semiosis mmmmmaven
21:01 * semiosis says as he goes into hour 36 of debugging java
21:01 larsks Yeah, developers who think it's a good idea for people to be typing XML by hand need to be smacked.
21:07 johnmark Kins: that reminds me, we shoudl really start a Gluster jobs board
21:07 johnmark semiosis: aaaaaahhhhhh
21:07 Kins johnmark, you should! :P
21:08 Kins Anyone care to comment? http://community.gluster.org/q/ubunt​u-12-04-and-3-3-issue-after-reboot/
21:08 glusterbot <http://goo.gl/BelEV> (at community.gluster.org)
21:13 dbruhn joined #gluster
21:18 bauruine joined #gluster
21:24 the-me johnmark: yes, but you also agreed about this
21:37 semiosis http://paste.ubuntu.com/1448295/ <-- Kins' brick log
21:37 glusterbot Title: Ubuntu Pastebin (at paste.ubuntu.com)
21:37 semiosis lots of lstat failed & i/o error
21:38 semiosis JoeJulian: does that ring any bells for you?  idk what to make of it
21:38 semiosis xfs brick filesystem
21:39 m0zes joined #gluster
21:40 johnmark the-me: yes. noted - will harangue our devs
21:42 the-me johnmark: for your users it would be nice to have got a 3.2.8, for me/Debian it would be enough to have got a patch :)
21:42 cicero Kins: are there any firewall ACLs that might get in the way?
21:42 cicero [2012-12-18 00:38:47.911669] W [socket.c:1512:__socket_proto_state_machine] 0-gv0-client-1: reading from socket failed. Error (Transport endpoint is not connected), peer (50.56.110.234:24009)
21:43 cicero Kins: you should probably have the peer addresses go over servicenet not the public interface
21:43 cicero Kins: i.e. 10.x not 50.x
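Concretely, that means something like the following on every server and client, with made-up names and addresses; the hostnames the bricks were defined with have to resolve to the private 10.x interface everywhere:

    # /etc/hosts
    10.0.0.11   gluster1
    10.0.0.12   gluster2
    # and the volume created against those names
    gluster volume create gv0 replica 2 gluster1:/export/brick1 gluster2:/export/brick1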
21:43 m0zes joined #gluster
21:46 johnmark the-me: /me nods
21:46 m0zes_ joined #gluster
21:52 johnmark sjoeboo_: you know the Boston area better than I do - do meetups here generally start in the evening? or afternoon?
21:53 Kins cicero, I have opened all the ports required for gluster, and I am using the servicenet.
21:54 Kins I'm not really sure why it's trying to use 50.x there
21:55 Kins Hosts file for all 3 only contains servicenet IPs
21:55 Kins 2*
21:59 elyograg are your clients on the service net? I've only been half paying attention, so maybe you answered that already.  if they aren't, that may be a problem.  the clients talk to all the gluster servers directly.
22:00 Kins They are, I have two servers, exactly the same, both running as client and servers.
22:00 Kins I just realized I had port 1111 open instead of 111, could that be the problem? I'm not even sure what that port is used for
22:01 Kins That is a pretty awful example of user error if so :S
22:04 elyograg that actually brings up a question.  If I have two NICs in all my gluster peers, can I put a local LAN ip address in /etc/hosts on all those peers, but have the other LAN address be in DNS for all the clients?  the idea behind that is that gluster would use a dedicated network for heals/rebalance/etc and the other network for accessing the data - gigabit would be less of a bottleneck.
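Sketching that idea out with hypothetical names and networks (whether all of gluster's internal traffic really follows the server-side resolution is worth testing before relying on it):

    # /etc/hosts on the gluster peers (dedicated heal/rebalance network)
    192.168.10.11  gluster1.example.com
    192.168.10.12  gluster2.example.com
    # DNS zone the clients use (data-access network)
    # gluster1.example.com.  IN A  192.168.20.11
    # gluster2.example.com.  IN A  192.168.20.12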
22:04 daMaestro joined #gluster
22:04 Kins Ahh, nevermind. 111 is open, it was just listed in iptables as "sunrpc"
22:07 elyograg all my servers will have four network ports, which means i could have two bonded interfaces - for network redundancy in different switches, not link aggregation.
22:08 guest2012 joined #gluster
22:19 cicero Kins: if you go by hostname resolution it might refer to itself by its eth0 (public) ip by default
22:19 cicero Kins: that might be part of the issue
22:20 Kins cicero, thanks, I'll try deleting the volume/peers and try using IPs
22:26 aliguori joined #gluster
22:34 Kins cicero, doesn't seem to have been the problem!
22:37 johnmark elyograg: yikes, you have ideas??? seize him!
22:37 elyograg i'm a heathen that way. ;)
22:37 johnmark heh heh
22:43 Kins cicero, I did something wrong creating the volume, so ignore the last thing I said
22:43 Gilbs joined #gluster
22:44 Kins Ok yeah, that definitely didn't help.
22:47 dbruhn joined #gluster
22:48 Gilbs Howdy gang, are there any issues with running gluster with haproxy/keepalived?  When trying to probe my 2nd server I get "Probe on localhost not needed" with both the fqdn and the ip.  Ubuntu 12.04/3.3.1.
22:54 JoeJulian @clone
22:54 glusterbot JoeJulian: I do not know about 'clone', but I do know about these similar topics: 'cloned servers'
22:55 JoeJulian @cloned servers
22:55 glusterbot JoeJulian: Check that your peers have different UUIDs ('gluster peer status' on both). The uuid is saved in /var/lib/glusterd/glusterd.info - that file should not exist before starting glusterd the first time. It's a common issue when servers are cloned. You can delete the /var/lib/glusterd/peers/<uuid> file and /var/lib/glusterd/glusterd.info, restart glusterd and peer-probe again.
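Spelled out as commands, per that factoid (state directory as on a stock 3.3 install):

    # compare on both servers; matching UUIDs mean they were cloned
    gluster peer status
    cat /var/lib/glusterd/glusterd.info
    # on the clone:
    service glusterd stop
    rm -f /var/lib/glusterd/glusterd.info /var/lib/glusterd/peers/<uuid>
    service glusterd start
    gluster peer probe <the-other-server>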
22:55 JoeJulian The other question is, what's the point of haproxy/keepalived? Is that just for NFS?
22:59 Gilbs haproxy/keepalived are for another app, non gluster related.  I've never seen that issue so I wondered if it was a problem.
23:00 elyograg i plan to use corosync/pacemaker for nfs access.  it'll run on gluster peers with no bricks.
23:01 JoeJulian Not sure if you're using cloned servers, but there's that tip. Then the hostnames used for each server will have to resolve to each server from both servers.
23:01 Gilbs I don't remember if they are cloned, but I will check out the tip, thanks.
23:02 bronaugh ok, what's performance actually like with glusterfs in the real world?
23:02 JoeJulian It meets ,,(Joe's performance metric)
23:02 glusterbot nobody complains.
23:03 JoeJulian The question is, what is your use case and does its performance meet your use case requirements.
23:04 JoeJulian Or an even better question would be, what are your use case requirements and can you build a system that will meet those requirements using this tool.
23:05 bronaugh ok, if I want to shove data over infiniband, is it going to pull >1GB/sec or not?
23:05 JoeJulian Yes, it can.
23:05 bronaugh second use case: I have a bunch of tiny files. what kind of additional overhead am I looking at vs a kernel-space filesystem?
23:06 bronaugh I know there's a crapload of context switches and indirection.
23:06 bronaugh but what impact does that really end up having?
23:06 JoeJulian Well, since you're using IB, rdma can avoid a lot of context switches.
23:07 bronaugh it'll help some. but you still need to go through the kernel once to get to glusterd, and then back through again to return data, yes?
23:08 JoeJulian With replicated volumes, you'll have an extra tcp round trip per lookup for self-heal checking. 2 to N round trips if the file doesn't exist where N is the number of distribute subvolumes.
23:08 bronaugh hmm
23:09 bronaugh how well tested is the RDMA code path?
23:09 bronaugh we were using NFS/RDMA until it caused kernel panics; I then read the code and was completely horrified.
23:09 JoeJulian The self-heal check will actually send R packets, one for each replicate subvolume, but they'll be sent in rapid sequence and it only has to wait for the last one to return.
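By that description, in a hypothetical 4x2 distribute-replicate volume (4 replica pairs): a lookup of an existing file costs one extra round trip (the 2 self-heal-check packets go out in parallel, so you wait on the slower of the two), while a lookup of a file that doesn't exist costs anywhere from 2 to 4 round trips as the other distribute subvolumes are queried.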
23:09 bronaugh (it's a complete bag of ass)
23:10 JoeJulian That sounds terrible.
23:10 bronaugh uh yeah. they process an unbounded buffer in an interrupt handler..
23:10 bronaugh and NFS over TCP likes to kick your dog...
23:10 bronaugh (likes to randomly disappear for several minutes if it sees the right kind of load)
23:11 JoeJulian IP/RDMA is considered "tech preview" as of 3.3.1. 3.4.0 has some (apparently) major improvements and the team is hoping to consider 3.4.1 to be release quality.
23:11 bronaugh ok, so running it over IP would be safer.
23:11 JoeJulian 3.4.0qa5 is out for testing if you want to try it on your development systems.
23:12 bronaugh yeah. guess it can't take the kernel down, at least.
23:12 JoeJulian s/IP/IB/
23:12 glusterbot What JoeJulian meant to say was: IB/RDMA is considered "tech preview" as of 3.3.1. 3.4.0 has some (apparently) major improvements and the team is hoping to consider 3.4.1 to be release quality.
23:13 bronaugh well; that sounds good...
23:13 bronaugh I just don't trust anything in the infiniband space at this point :/
23:13 JoeJulian I've had some people complain, and some swear their installation is the best thing since sliced bread.
23:14 bronaugh yeah uh; you can put me under the complaining column. nothing but headaches.
23:14 JoeJulian The difference seems to be as much about the hardware driver implementation as anything else, though.
23:14 bronaugh Mellanox ConnectX cards, Mellanox switch, QDR IB
23:15 tryggvil joined #gluster
23:15 JoeJulian http://supercolony.gluster.org/pipermail/gluster-users/2012-December/035006.html
23:15 glusterbot <http://goo.gl/z3kBi> (at supercolony.gluster.org)
23:16 JoeJulian That thread might interest you.
23:16 bronaugh it's interesting. we've hit 2.9GB/sec using ib_test_whatever.
23:16 bronaugh which is 90% of theoretical.
23:16 bronaugh and we've hit 2.4GB/sec using iperf over TCP/IP with multiple threads
23:17 Gilbs Probe on localhost not needed...   Checked UUIDs, they are both different.  ip/names are correct in hosts file.
23:17 bronaugh our strategy: use Debian OFED packages for userspace, use latest kernel (3.6.7)
23:18 bronaugh the most important thing for TCP/IP performance is putting the link into connected mode and upping the MTU to 64k.
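For reference, that tuning is usually just the following, assuming the IPoIB interface is ib0:

    echo connected > /sys/class/net/ib0/mode
    ip link set dev ib0 mtu 65520
    # verify
    cat /sys/class/net/ib0/mode
    ip link show ib0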
23:21 JoeJulian I'll be back in a few. I've got some stuff that needs shipped within 39 minutes.
23:22 Gilbs Or it's free?
23:22 Gilbs :)
23:39 plarsen joined #gluster
23:42 chirino joined #gluster
23:47 Humble joined #gluster
