
IRC log for #gluster, 2013-02-20


All times shown according to UTC.

Time Nick Message
00:02 Humble joined #gluster
00:11 hagarth joined #gluster
00:31 yinyin joined #gluster
00:44 hagarth joined #gluster
00:45 yinyin joined #gluster
00:45 Humble joined #gluster
01:00 bala joined #gluster
01:10 hagarth joined #gluster
01:31 sjoeboo joined #gluster
01:34 Humble joined #gluster
01:53 hagarth joined #gluster
02:24 hagarth joined #gluster
02:48 Humble joined #gluster
02:50 sjoeboo joined #gluster
03:06 glusterbot New news from resolvedglusterbugs: [Bug 764890] Keep code more readable and clean <http://goo.gl/p7bDp>
03:08 pipopopo joined #gluster
03:11 duffrecords joined #gluster
03:13 Humble joined #gluster
03:14 hagarth joined #gluster
03:17 duffrecords I just tried to mount a Gluster volume via NFS but got the "wrong fs type, bad option, bad superblock" error.  it's currently mounted on several other servers without any apparent problem but any server that tries to mount it now gets that error.  how can I tell if I need to fsck the brick?
03:18 m0zes duffrecords: what was the mount command?
03:22 sahina joined #gluster
03:23 duffrecords mount /Users
03:23 duffrecords it's configured in fstab
03:23 duffrecords 10.80.80.100:/homes     /Users  nfs     defaults,_netdev,nfsvers=3        0       0
03:24 m0zes duffrecords: glusterfs nfs requires tcp.
03:24 m0zes 'mount -t nfs fileserver:volume /mnt/point -o vers=3,tcp,nolock,noatime' usually works for me.
03:26 duffrecords I get the same error
03:27 duffrecords is there a way to check the health of a Gluster volume?
03:29 m0zes rpcinfo and gluster volume status
03:30 m0zes rpcinfo checks to see that the nfs daemon registered with rpc. mine fails to every now and then
03:33 m0zes rpcinfo -t 10.80.80.100 100003 3
03:34 m0zes that should check if the nfs server itself is listening correctly
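
A minimal check sequence along those lines, reusing the address and volume name from the fstab line above (the showmount check is an extra step not mentioned here; exact output differs by version):

    rpcinfo -t 10.80.80.100 100003 3     # is NFS v3 registered over TCP on the server?
    gluster volume status homes          # are the bricks and the NFS server process online?
    showmount -e 10.80.80.100            # does the Gluster NFS server export the volume?
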
03:35 satheesh joined #gluster
03:53 mohankumar joined #gluster
03:57 vpshastry joined #gluster
03:58 pai joined #gluster
04:01 duffrecords those commands seem to indicate everything's ok.  I'll have to come back to this tomorrow.  thanks for your help
04:10 rastar joined #gluster
04:14 rastar left #gluster
04:16 sripathi joined #gluster
04:19 yinyin joined #gluster
04:31 deepakcs joined #gluster
04:40 overclk joined #gluster
04:42 bala1 joined #gluster
04:42 bulde joined #gluster
04:43 lala joined #gluster
04:47 raghu joined #gluster
04:49 shylesh joined #gluster
05:06 vpshastry joined #gluster
05:12 shireesh joined #gluster
05:20 vpshastry1 joined #gluster
05:24 yinyin joined #gluster
05:32 yinyin joined #gluster
06:09 mohankumar joined #gluster
06:13 anmol joined #gluster
06:22 rastar joined #gluster
06:24 overclk joined #gluster
06:36 glusterbot New news from resolvedglusterbugs: [Bug 887711] Cannot delete directory when special characters are used. <http://goo.gl/MOc1N> || [Bug 865914] glusterfs client mount does not provide root_squash/no_root_squash export options <http://goo.gl/hJiLH> || [Bug 886041] mount fails silently when talking to wrong server version (XDR decoding error) <http://goo.gl/8CIQD>
06:52 glusterbot New news from newglusterbugs: [Bug 859250] glusterfs-hadoop: handle stripe coalesce in quickSlaveIO <http://goo.gl/FtWSm> || [Bug 861947] Large writes in KVM host slow on fuse, but full speed on nfs <http://goo.gl/UWw7a> || [Bug 892808] [FEAT] Bring subdirectory mount option with native client <http://goo.gl/wpcU0>
06:56 Humble joined #gluster
06:58 Nevan joined #gluster
07:02 bala1 joined #gluster
07:05 vshankar joined #gluster
07:15 Humble joined #gluster
07:22 rgustafs joined #gluster
07:22 vikumar joined #gluster
07:23 guigui1 joined #gluster
07:25 bulde joined #gluster
07:26 sripathi1 joined #gluster
07:26 jtux joined #gluster
07:33 Humble joined #gluster
07:47 sripathi joined #gluster
07:48 guigui3 joined #gluster
07:51 ngoswami joined #gluster
07:53 glusterbot New news from newglusterbugs: [Bug 912997] gluster volume create shows 'device vg' option even if BD backend support not compiled <http://goo.gl/3v0rt>
07:53 ekuric joined #gluster
07:55 ctria joined #gluster
08:00 Nevan joined #gluster
08:02 jtux joined #gluster
08:07 puebele1 joined #gluster
08:18 timothy joined #gluster
08:18 sripathi joined #gluster
08:18 timothy Hi All!
08:18 timothy I am looking for installing glusterfs-swift packages in debian.
08:18 timothy I could install other glusterfs packages from http://download.gluster.org/pub/gluster/glusterfs/LATEST/Debian.
08:18 timothy Will someone please help me installing?
08:18 glusterbot <http://goo.gl/uDVZL> (at download.gluster.org)
08:19 bulde joined #gluster
08:21 timothy both the squeeze and wheezy repos on download.gluster.org do not contain glusterfs-swift packages for Debian
08:23 ndevos timothy: I think that is because those packages conflict with the openswift packages, they contain some patched openswift bits
08:24 timothy ok ...
08:25 puebele1 joined #gluster
08:26 ndevos timothy: but maybe semiosis or kkeithle can add them somehow, you could file a bug about that
08:26 glusterbot http://goo.gl/UUuCq
08:28 timothy ndevos: thank you Niels, I will check with kkeithle
08:29 cw joined #gluster
08:30 ndevos timothy: sure, and if you lurk in here, maybe someone else could provide more/better details than me
08:30 andreask joined #gluster
08:32 timothy ok, :)
08:33 bulde joined #gluster
08:36 ramkrsna joined #gluster
08:44 Staples84 joined #gluster
08:45 jtux joined #gluster
08:47 andreask joined #gluster
08:49 WildPikachu joined #gluster
08:55 hagarth joined #gluster
08:56 duerF joined #gluster
08:58 gbrand_ joined #gluster
09:10 sripathi joined #gluster
09:20 rotbeard joined #gluster
09:23 glusterbot New news from newglusterbugs: [Bug 826512] [FEAT] geo-replication checkpoint support <http://goo.gl/O6N3f> || [Bug 830497] [FEAT] geo-replication failover/failback <http://goo.gl/XkT0F> || [Bug 847839] [FEAT] Distributed geo-replication <http://goo.gl/l4Gw2> || [Bug 847842] [FEAT] Active-Active geo-replication <http://goo.gl/Z41og> || [Bug 847843] [FEAT] Improving visibility of geo-replication session <http://goo.gl/fr
09:33 timothy joined #gluster
09:35 guigui3 joined #gluster
09:38 hybrid512 joined #gluster
09:43 sripathi joined #gluster
09:46 dobber_ joined #gluster
09:49 tryggvil joined #gluster
09:52 Humble joined #gluster
09:59 timothy joined #gluster
10:02 Staples84 joined #gluster
10:10 aravindavk joined #gluster
10:13 shireesh joined #gluster
10:17 shireesh joined #gluster
10:19 shireesh_ joined #gluster
10:19 sripathi joined #gluster
10:20 rwheeler joined #gluster
10:24 sahina joined #gluster
10:29 aravindavk joined #gluster
10:33 Humble joined #gluster
10:34 edward1 joined #gluster
10:36 vshankar joined #gluster
10:40 andreask joined #gluster
10:44 Staples84 joined #gluster
11:04 aravindavk joined #gluster
11:15 Humble joined #gluster
11:21 duerF joined #gluster
11:23 bala1 joined #gluster
11:39 Humble joined #gluster
11:40 sjoeboo joined #gluster
11:40 VSpike I created some gluster servers and a volume. I can mount the volume from one of the servers itself, but I cannot mount it from a client. There is (unfortunately) a firewall between server and client. As far as I can tell, I'm using the same rules I use for another client/server arrangement yet it fails to work.
11:40 VSpike The error I get just says "Mount failed. Please check the log file for more details."
11:41 VSpike I have checked the logs, but I've not yet developed the skill of picking out the useful stuff from the noise in gluster logs, or knowing which file to look in
11:42 VSpike I have ports 24007, 24009 and 111 open for TCP and UDP
11:42 VSpike Could someone please suggest how I can debug this or what I can try next?
11:43 andreask joined #gluster
11:44 hagarth joined #gluster
11:44 VSpike I can use netcat from the client to connect to 24007 and 24009 and type junk, which causes complaints in the server log
11:45 VSpike Should I see something listening on UDP 111?
11:47 VSpike Aha.. on my servers that work, I do udp        0      0 *:sunrpc                *:*                                 593/rpcbind
11:48 VSpike I lack an rpcbind process on the problematic servers
11:49 VSpike How many times has someone said that http://community.gluster.org/ is down?
11:53 Norky RPC is only if you want to use NFS access
11:53 Norky it's not necessary for 'native' Gluster mounts
11:54 VSpike Ah - OK
11:54 VSpike I was puzzled as to why I didn't even have rpcbind installed on those servers :)
11:58 Humble joined #gluster
11:58 lh joined #gluster
11:58 lh joined #gluster
11:58 VSpike Wait, what?
11:59 VSpike The non-working servers have gluster 3.2.5 on them... wtf? :)
12:01 VSpike I have semiosis ppa but aptitude offers me no updates .. my apt/deb fu is clearly weak
12:09 rsevero_ joined #gluster
12:11 rsevero_ Hi. I have a few glusterfs servers which I would like to upgrade to some kernel past 3.2.21. As I can't maintain my ext4 filesystems (because of the 32/64bit hashs issue) and don't want to use XFS because of the massive file loss in case of power loss, which filesystem should I use? Ideas?
12:14 atrius joined #gluster
12:23 theron joined #gluster
12:24 Humble joined #gluster
12:40 jclift_ joined #gluster
12:41 vpshastry1 left #gluster
12:44 hchiramm_ joined #gluster
12:46 aravindavk joined #gluster
12:50 raven-np joined #gluster
12:51 kkeithley joined #gluster
12:52 gbrand__ joined #gluster
12:53 gbrand___ joined #gluster
12:55 gbrand__ joined #gluster
12:57 sjoeboo joined #gluster
12:58 lh joined #gluster
12:58 bulde joined #gluster
13:02 Humble joined #gluster
13:07 rastar left #gluster
13:08 dustint joined #gluster
13:11 aravindavk joined #gluster
13:11 mooperd joined #gluster
13:17 hagarth :O
13:22 Humble joined #gluster
13:23 aravindavk joined #gluster
13:27 ThatGraemeGuy joined #gluster
13:28 ThatGraemeGuy hi semiosis, are you around?
13:28 duerF joined #gluster
13:29 sjoeboo joined #gluster
13:33 ThatGraemeGuy or anyone else familiar with semiosis' PPA that adds upstart support for gluster?
13:33 rwheeler joined #gluster
13:34 abyss^_ I'm looking for glusterfs client/common 3.2.7 (or higher?) to debian lenny. Someone can help? Packages for wheezy or smth are not good because of depend libc6 > 2.8
13:35 Humble joined #gluster
13:36 satheesh joined #gluster
13:38 kkeithley @ppa
13:38 glusterbot kkeithley: The official glusterfs 3.3 packages for Ubuntu are available here: http://goo.gl/7ZTNY
13:38 kkeithley ThatGraemeGuy: ^^^
13:38 ThatGraemeGuy joined #gluster
13:39 jclift__ joined #gluster
13:39 abyss^_ kkeithley: 3.3 but 3.2?
13:39 abyss^_ 3.2.7 is only for Ubuntu precise and that needs too new a libc6 version
13:41 kkeithley I don't know, semiosis does the ppa/.debs for ubuntu
13:43 kkeithley @seen
13:43 glusterbot kkeithley: (seen [<channel>] <nick>) -- Returns the last time <nick> was seen and what <nick> was last seen saying. <channel> is only necessary if the message isn't sent on the channel itself. <nick> may contain * as a wildcard.
13:43 kkeithley @seen semiosis
13:43 glusterbot kkeithley: semiosis was last seen in #gluster 18 hours, 47 minutes, and 10 seconds ago: <semiosis> gotta run, good luck
13:43 ThatGraemeGuy thanks :)
13:43 ThatGraemeGuy I'll drop him an email then
13:43 kkeithley @later
13:43 glusterbot kkeithley: I do not know about 'later', but I do know about these similar topics: 'latest'
13:44 lala joined #gluster
13:44 bennyturns joined #gluster
13:45 abyss^_ btw: client version 3.3.1 should work with 3.2.7 server version? Maybe I just compile client ;)
13:47 balunasj joined #gluster
13:47 x4rlos abyss^_: i don't think that will work.
13:55 Humble joined #gluster
13:56 dobber_ joined #gluster
14:02 ndevos abyss^_: no, that wont work, 3.2 is not compatible with 3.3
14:11 abyss^_ ok, thank you and you :D
14:15 tomsve joined #gluster
14:15 hagarth @channelstats
14:15 glusterbot hagarth: On #gluster there have been 89411 messages, containing 3953362 characters, 664160 words, 2728 smileys, and 328 frowns; 649 of those messages were ACTIONs. There have been 30802 joins, 1063 parts, 29786 quits, 12 kicks, 104 mode changes, and 5 topic changes. There are currently 182 users and the channel has peaked at 203 users.
14:20 GabrieleV joined #gluster
14:20 jclift joined #gluster
14:21 GabrieleV Hello ! What about writing files directly in the exported directory of a 3.0.5 server ? I tested with 1 file and I found it on the replicated server and on the clients. Is it safe ? Thank you.
14:26 JusHal joined #gluster
14:27 JusHal having issues mounting a glusterfs volume on boot, network is not up yet when it is tried. But it seems mount.glusterfs 3.3.1 does not accept _netdev anymore: unknown option _netdev (ignored)
14:28 ndevos JusHal: _netdev is used by rc.sysinit and the netfs service (on rhel base distributions)
14:28 ThatGraemeGuy JusHal: I have the same issue :(
14:30 ndevos ... and it is correct that mount.glusterfs ignores the _netdev option
14:30 JusHal ndevos: ah ok, let me check then
14:31 JusHal chkconfig netfs on , solved my issue
14:31 JusHal ndevos: thanks
14:31 ndevos JusHal: you're welcome :)
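
A sketch of the RHEL-style setup JusHal ended up with (server and volume names are hypothetical): rc.sysinit skips _netdev entries and the netfs service mounts them once the network is up.

    # /etc/fstab
    server1:/myvol   /mnt/myvol   glusterfs   defaults,_netdev   0 0

    chkconfig netfs on      # mount network filesystems at boot, after networking is up
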
14:32 aliguori joined #gluster
14:33 ThatGraemeGuy ndevos: I'm using Ubuntu 12.04 with the "official" PPA, and my volume listed in /etc/fstab isn't mounting at boot. any ideas?
14:34 ThatGraemeGuy https://launchpad.net/~semiosis/+archive/ubuntu-glusterfs-3.3 <-- that PPA i mean
14:34 glusterbot <http://goo.gl/7ZTNY> (at launchpad.net)
14:34 ThatGraemeGuy 'sudo mount -a' after boot completes works fine
14:35 ndevos ThatGraemeGuy: not really, but I think ubuntu/debian use a similar option like _netdev, but its called different :-/
14:35 ThatGraemeGuy its my understanding that this behaviour is expected with the ubuntu default glusterfs packages, but the package in the PPA has an upstart job that's supposed to resolve the issue
14:35 ThatGraemeGuy ah, ok. i just need to find out what that option is then
14:36 ndevos yeah, try searching the irc logs, it's been mentioned a couple of times
14:37 ndevos now, if one only could search through http://www.gluster.org/interact/chat-archives/
14:37 glusterbot Title: Chat Archives | Gluster Community Website (at www.gluster.org)
14:37 tqrst can someone explain to me how 'gluster volume rebalance myvol status' could show two new nodes which are in the same replica pair as having a different number of rebalanced files?
14:37 rgustafs joined #gluster
14:38 ThatGraemeGuy ndevos: searching, will see if I can find it, thanks
14:50 Humble joined #gluster
14:54 Humble joined #gluster
14:54 sjoeboo joined #gluster
14:56 hagarth joined #gluster
14:58 hchiramm_ joined #gluster
14:58 bennyturns joined #gluster
14:59 ThatGraemeGuy ndevos: made some changes to my fstab, using the server hostname instead of localhost, and using 'nobootwait' instead of '_netdev', which apparently does nothing on ubuntu. now i get this in my client mountpoint log: 0-glusterfs: DNS resolution failed on host ...
14:59 stopbit joined #gluster
15:00 ThatGraemeGuy DNS is fine, and 'sudo mount -a' works after booting completes, so i guess i'm still searching for the elusive fstab option that will tell the mount process to wait for the network, as i assume that what's happening is that its mounting just before dns is operational
15:00 ndevos ThatGraemeGuy: right, 'nobootwait' sounds familiar, but normally you want to mount localhost...
15:00 ThatGraemeGuy these are VMs that boot insanely quickly, which may not be helping me in this instance
15:01 ThatGraemeGuy ok i'll try localhost again. 'nobootwait' was required to stop it from halting the boot process and requiring keyboard input to resume when the mount didn't succeed
15:02 gbrand_ joined #gluster
15:03 semiosis ThatGraemeGuy: fixed a couple bugs related to mounting from localhost at boot last week, have you updated recently?
15:03 semiosis also, are you mounting from localhost or remote server in fstab?
15:03 ThatGraemeGuy i installed a few hours ago, so i guess that depends on your definition of "recently" :)
15:04 ThatGraemeGuy localhost:/emags /home/emags glusterfs defaults,nobootwait 0 0
15:04 ThatGraemeGuy that's how it is now, not working. using the server's hostname gives me the "dns resolution failed" message in the client mountpoint log
15:06 ThatGraemeGuy apt-get update/dist-upgrade is showing no updates so i assume i'm up-to-date
15:11 semiosis ThatGraemeGuy: interesting, i'll see if i can reproduce it.  is this a physical or virtual machine?
15:11 ThatGraemeGuy 2 VMs
15:11 semiosis ok
15:11 jdarcy joined #gluster
15:12 semiosis fyi, ubuntu doesn't have a _netdev option, it just tries mounts before & after networking has started as far as i can tell
15:13 ThatGraemeGuy yes, i realised that after a bit of initial googling
15:14 ThatGraemeGuy interestingly though, i have 'defaults,nobootwait' in fstab right now, yet 'mount -a' still says 'unknown option _netdev (ignored)'
15:14 ThatGraemeGuy so it may not understand it, but i assume 'defaults' is still putting it there
15:15 semiosis so, defaults is a placeholder when you're not setting any other opts, if you have anything else to put there, you can drop the "defaults"
15:15 semiosis you'll still get defaults unless you override them
15:16 ThatGraemeGuy sweet, i learned something today, thanks :)
15:16 semiosis the _netdev ignored message is from the glusterfs client itself, since that is an fstab option not an actual mount option
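
Putting those points together, the Ubuntu 12.04 fstab line ThatGraemeGuy settles on looks like this ('defaults' can be dropped once another option is present, and the _netdev warning is harmless); note this does not by itself fix the boot-time DNS race discussed later:

    localhost:/emags   /home/emags   glusterfs   nobootwait   0 0
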
15:17 balunasj joined #gluster
15:18 plarsen joined #gluster
15:20 rsevero_ Hi. As there is much more active users now I'm reposting this. I have a few glusterfs servers which I would like to upgrade to some kernel past 3.2.21. As I can't maintain my ext4 filesystems (because of the 32/64bit hashs issue) and don't want to use XFS because of the massive file loss in case of power loss, which filesystem should I use? Ideas?
15:22 jdarcy rsevero_, what was the last XFS version where you saw such loss of files?
15:23 ThatGraemeGuy semiosis: not sure if this means anything, but with only 'nobootwait' as an fstab option, 'mount -a' still reports 'unknown option _netdev (ignored)'
15:23 Humble joined #gluster
15:23 semiosis ?!
15:24 ThatGraemeGuy yeah, that's what i thought too :-/
15:24 semiosis ThatGraemeGuy: please ,,(pasteinfo)
15:24 glusterbot ThatGraemeGuy: Please paste the output of "gluster volume info" to http://fpaste.org or http://dpaste.org then paste the link that's generated here.
15:25 rsevero_ jdarcy: Can't even say. Tried a few years ago and since then never come near XFS again. Has this changed? I always understood that's expected on XFS.
15:25 ThatGraemeGuy semiosis: http://dpaste.org/VmriP/
15:25 glusterbot Title: dpaste.de: Snippet #219658 (at dpaste.org)
15:27 bugs_ joined #gluster
15:28 bala joined #gluster
15:30 sripathi1 joined #gluster
15:33 semiosis ThatGraemeGuy: unable to reproduce the problem on my precise test vm.  could you please gather & send logs?  first clear any existing client log, which would be /var/log/glusterfs/home-emags.log, then reboot and pastie.org that log file please
15:35 ThatGraemeGuy ok, cleared logs, rebooting
15:35 jskinner_ joined #gluster
15:36 ThatGraemeGuy semiosis: http://dpaste.org/kXstd/
15:36 glusterbot Title: dpaste.de: Snippet #219660 (at dpaste.org)
15:37 semiosis hmmm
15:38 bdperkin_gone joined #gluster
15:40 ThatGraemeGuy semiosis: i've just noticed that the mount succeeded on the other box
15:40 semiosis :D
15:41 ThatGraemeGuy timestamps are 40 sec later
15:41 semiosis there's something funny about your networking, idk why dns wouldn't work at that point in boot on a VM
15:41 semiosis and too busy to dive into it atm, sorry
15:41 JusHal left #gluster
15:42 ThatGraemeGuy ok no problem, i'm heading home soon anyway
15:42 atrius joined #gluster
15:42 ThatGraemeGuy thanks for trying
15:43 semiosis i would try a remote nfs mount in fstab, trying to mount the volume via nfs from the other server, to see if the kernel nfs client can resolve DNS during boot
15:44 ThatGraemeGuy ok, i will have a go at that in the morning
15:45 bdperkin joined #gluster
15:47 bala joined #gluster
15:55 cjohnston_work question: playing around with geo-replication and I have confirmed root passwordless SSH is working, yet I am seeing a "faulty" error and my logs indicate to me that master is having issues initiating a pickle connection to the slave (or in reverse).  Any ideas?
16:00 ThatGraemeGuy left #gluster
16:04 daMaestro joined #gluster
16:06 Humble joined #gluster
16:19 mohankumar joined #gluster
16:20 bala joined #gluster
16:22 y4m4 joined #gluster
16:24 ramkrsna joined #gluster
16:24 neofob joined #gluster
16:31 lala joined #gluster
16:34 xian1 joined #gluster
16:45 elyograg VSpike: when you look at gluster volume info, you get shown a list of ports for bricks and the port number that each brick is using.  You'll need to have port 24007 and every single one of the brick ports open.  If you add bricks, more ports will need to be opened.
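
A hedged iptables sketch matching elyograg's description for a 3.3-era install; brick ports start at 24009 and each extra brick on a server uses the next port, so widen the range to match what 'gluster volume status' reports:

    iptables -A INPUT -p tcp --dport 24007:24008 -j ACCEPT    # glusterd management
    iptables -A INPUT -p tcp --dport 24009:24015 -j ACCEPT    # brick ports, adjust to brick count
    iptables -A INPUT -p tcp --dport 111 -j ACCEPT            # portmapper, only needed for NFS mounts
    iptables -A INPUT -p udp --dport 111 -j ACCEPT
    iptables -A INPUT -p tcp --dport 38465:38467 -j ACCEPT    # Gluster NFS, only if NFS mounts are used
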
16:46 plarsen joined #gluster
16:47 rsevero_ joined #gluster
16:48 elyograg rsevero_: improved barrier support should make data loss with XFS a thing of the past.  if you have a disk controller with battery backed cache you can turn off barriers for increased performance without data loss worries.
16:50 jdarcy The main point IMO is that I'd rather trust a filesystem that had a bug years ago than one that I know for sure has introduced just-as-serious bugs in the last few months (and which has a clearly broken development process).
16:51 sahina joined #gluster
16:51 rsevero_ elyograg: First of all, thanks for your attention. I don't have a disk controller with battery backed cache so I believe I would have to turn on barriers.
16:52 * jdarcy doesn't trust code from maintainers who typically test only on their laptops and often push patches which weren't even tested that much (or reviewed at all).
16:52 elyograg rsevero_: any recent kernel version (probably including later 2.6 too) should have them on by default.
16:52 jdarcy elyograg: ...along with many changes to ameliorate the performance impact
16:52 rsevero_ jdarcy: As I mentioned before, AFAIU the problem with extensive data loss on XFS on power outages where expected, not a bug. But I can definitely be wrong about this.
16:53 rsevero_ I will take a look at the barriers issue and maybe do some tests to see how it goes.
16:54 elyograg without the battery backed cache, any filesystem can have minor data loss on power interruption, but the days of extreme file corruption are hopefully gone.  I have seen that kind of corruption before, but I specifically remember that in that situation, there was something about the HDD controller that resulted in the kernel saying "no barrier support" as it mounted the filesystems.
16:55 rsevero_ elyograg: I not talking about minor data loss as it's obviously expected after a abrupt power loss, I'm talking about several old files disappearing with XFS. Something I've only seem before in ReiserFS.
16:56 elyograg rsevero_: yeah, I get it.  When I saw problems before, it was on a server with a MySQL database.  The databases were so corrupted that they couldn't be salvaged.
16:57 semiosis if you're concerned about data loss, use glusterfs to replicate between datacenters :)
16:58 rsevero_ semiosis: Unfortunately I don't have the necessary bandwidth ;)
16:58 semiosis aw
16:58 elyograg cheap, fast, reliable.  pick two.
16:59 rsevero_ elyograg: I'm trying to get _some_ of the three.
17:00 elyograg where I work, we tend to go with cheap and reliable. ;)
17:02 rsevero_ elyograg: That works for me. I'm trying to get a post 3.2 kernel also. This extra seems to be rather difficult right now.
17:02 elyograg sometimes I catch developers focusing on performance at the expense of reliability -- usually after a problem or a failure has caused complete collapse. :)
17:03 rsevero_ elyograg: People can act really crazy, I know...
17:03 cjohnston_work is xattr needed for geo-replication?
17:04 semiosis recently had a developer trying to do better than O(n) when the input would never be more than ~20
17:04 semiosis come on!
17:04 elyograg wow.
17:07 elyograg optimization at that level makes sense for tight kernel code in a place that gets executed a lot, but generally speaking you've got tons of CPU to spare in most development situations.
17:07 semiosis yeah the latter :)
17:09 semiosis cjohnston_work: probably not, but what are you really asking?
17:10 cjohnston_work I managed to geo-replication started
17:10 semiosis great
17:10 cjohnston_work however
17:11 cjohnston_work getting an error from the master which looks like the query_xattr() python method is running
17:11 cjohnston_work and getting a bad return code
17:12 cjohnston_work then transport endpoint is not connected error, likely from the bad return and raise_oserr() call
17:14 zaitcev joined #gluster
17:18 xian1 folks, barrier support in xfs will not stop files from disappearing.  it will stop your journal from becoming corrupt, which could destroy your file system, unless you are an xfs_db guru.  Also, even with battery-backed cache on your RAID controllers, if you have the hard drive write cache on, none of the other protections you hoped for will guarantee your safety.  cheers.
17:20 rsevero_ xian1: But if I disable write cache I will be fine?
17:22 mynameisdeleted joined #gluster
17:22 mynameisdeleted hi all
17:23 mynameisdeleted was in glusterfs and was surprised that I got an immediate response to me joining.. then I got disappointed to find out it was a bot
17:23 mynameisdeleted telling me to go here
17:23 mynameisdeleted so... for shared network-based massive shared filesystem for large house or small office thats linux-based
17:24 mynameisdeleted glusterfs can guarantee all files get saved twice while letting me scale up on storage beyond what can fit in one box with 6 sata ports
17:24 mynameisdeleted raid5 or raid0 is best for single file read/write speed but gluster will help isolate one users desktop read/write performance from another
17:25 mynameisdeleted so one person running updatedb or loading 100GB of picture-data wont slow everyone else down much
17:26 mynameisdeleted lustre is only good for single-file read performance and terrible for apps such as webserver and desktop from what I know
17:26 mynameisdeleted openafs is very non-high-performance but scales well to lots of users
17:26 elyograg mynameisdeleted: a gluster FUSE client writes to all replicas at once.  A gluster NFS client writes to the NFS server which then writes to all replicas at once.
17:27 mynameisdeleted nfs client is faster?
17:27 mynameisdeleted also if all boxes are linux based they can just run gluster client right?
17:27 elyograg nfs client works differently.  it speaks NFS to one of the gluster servers, then from there it is using a FUSE client.
17:27 mynameisdeleted glusterfs-client on debian
17:27 semiosis @later tell ThatGraemeGuy i tried reproducing the problem on EC2 and got the same result you did! something is up with resolving hostname 'localhost'  -- i added another alias for 127.0.0.1 in /etc/hosts, '127.0.0.1 localhost gluster' and changed fstab to use gluster instead of localhost, and it worked!
17:27 glusterbot semiosis: The operation succeeded.
17:28 mynameisdeleted so is glusterfs faster than nfs?
17:28 mynameisdeleted I've used fuse clients before like sshfs and they are slow
17:29 mynameisdeleted I wanted all infiniband based linux work stations to use fully offloaded data-reads that can saturate 10gbps or better
17:29 mynameisdeleted everything is debian-linux based
17:29 elyograg i can't really answer that question.  ultimately it's still using fuse, but some of the client-side caching benefits inherent in NFS do apply.
17:30 mynameisdeleted maybe I'll also look at ibm gpfs
17:30 mynameisdeleted thats used in a lot of hpc environments
17:30 mohankumar joined #gluster
17:32 mynameisdeleted ntfs3g is ok and uses fuse
17:32 tqrst is it just me or does community.gluster.org return "500- server error"?
17:32 cjohnston_work so I am alone on this issue where geo-replication is failing on this _query_xattr command?
17:33 mynameisdeleted its me  too
17:33 semiosis tqrst: heard something about that site going down in march, didnt expect it to be down this soon though.  /cc johnmark
17:34 tqrst semiosis: down as in being decommissioned?
17:34 semiosis idk the details but that's my understanding yes :(
17:34 tqrst (I wouldn't be surprised, given that most questions on it went unanswered)
17:34 semiosis and most of my answers went unacknowledged
17:35 xavih left #gluster
17:35 mynameisdeleted I got a web server
17:35 mynameisdeleted anyone got a site backup?
17:36 cjohnston_work I'm imagining this is a common bug somewhere - could just be the release I am on (3.2.7)
17:36 hagarth joined #gluster
17:37 mynameisdeleted is glusterfs 3.4 a big improvement with block device translator and qemu provisioning?
17:37 mynameisdeleted guess for virtualization that is an improvement
17:37 xian1 rsevero_: HDD write cache should be disabled on any file system where you care about your data.  RAID controller write cache should be disabled if you don't have battery-backed cache, as well.
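
For SATA drives, the on-drive write cache xian1 mentions can be toggled with hdparm (device name is hypothetical; SAS and RAID controllers have their own tools such as sdparm or the vendor CLI):

    hdparm -W 0 /dev/sdb    # disable the drive's write cache
    hdparm -W /dev/sdb      # verify the current setting
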
17:38 semiosis mynameisdeleted: havent heard much yet, 3.4 is not yet even GA
17:38 semiosis @qa releases
17:38 glusterbot semiosis: The QA releases are available at http://bits.gluster.com/pub/gluster/glusterfs/ -- RPMs in the version folders and source archives for all versions under src/
17:38 tqrst I've had to deal with full servers dying, but never an actual hard drive failure... until now. The hard drive backing the brick foo:/mnt/bar on my 20x2 distributed-replicate died. I stopped glusterd on foo, replaced the dead hard drive, created a new /mnt/bar. What now? "volume replace-brick myvol foo:/mnt/bar foo:/mnt/bar" sounds a bit silly.
17:39 tqrst (surprisingly enough, "Replacing a dead brick" is not a section in the admin guide)
17:39 mynameisdeleted so is glusterfs production grade?
17:40 semiosis mynameisdeleted: yes
17:40 mynameisdeleted and yet tqrst has a filesystem dying... I guess to make a production grade deployment one must practice all recovery features
17:40 mynameisdeleted on a test filesystem
17:40 mynameisdeleted so one knows what works and what doesn't on production
17:41 tqrst mynameisdeleted: what? Hard drives fail all the time.
17:41 semiosis not gluster's fault
17:41 tqrst I don't see what gluster has to do with that
17:41 mynameisdeleted yeah.. but knowing how to recover from that is essential to production deployment
17:41 mynameisdeleted and making sure sys-admins are well versed in that
17:41 tqrst I'm hoping that bit is me failing at reading the docs
17:41 semiosis so the question becomes, is the admin production grade? :)
17:41 tqrst the admin is not an actual admin ;p
17:41 Kins joined #gluster
17:42 mynameisdeleted I've never failed to recover raid
17:42 mynameisdeleted if a disk fails
17:42 semiosis tqrst: did you kill the glusterfsd brick export daemon for that brick?
17:42 mynameisdeleted I'm iffy about going "raid0" over 2 glusterfs nodes because then I've turned a single point of failure into double
17:42 mynameisdeleted but I can go copy-twice on 3 nodes
17:43 mynameisdeleted I had a motherboard die last year on a box I was about to make a node out of for gluster
17:43 semiosis tqrst: you should kill it and restart glusterd to respawn it.  also what version of glusterfs?  healing should begin automatically with 3.3.0+
17:43 mynameisdeleted fortunately I didn't move any real data
17:43 tqrst semiosis: 3.3.1
17:43 semiosis tqrst: i suspect once you respawn a healthy glusterfsd process for that brick healing should commence
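
A rough sketch of the sequence semiosis suggests, with hypothetical volume and brick names, assuming the replacement disk is already formatted and mounted at the old brick path:

    pkill -f 'glusterfsd.*mnt-bar'      # kill any stale brick daemon for that brick (pattern depends on the brick path)
    service glusterd restart            # respawns a glusterfsd for the now-empty brick
    gluster volume heal myvol full      # optionally kick off a full self-heal sweep (3.3+)
    gluster volume heal myvol info      # check what is still pending
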
17:44 mynameisdeleted does glusterfs load-balance read and write requests?
17:44 tqrst semiosis: (and glusterd is down right now, so yes glusterfsd is also down :))
17:44 tqrst semiosis: yeah, I was just worried that it might dislike waking up to an empty /mnt/bar
17:44 mynameisdeleted and I better use source compile or debian backports for glusterfs as debian squeeze uses 3.0.3
17:44 rsevero_ xian1: Thanks for your tip. I will try it and see if the result isn't too drastic in terms of performance loss.
17:44 semiosis tqrst: are you *sure* about that?  on most distros stopping glusterd does not affect glusterfsd
17:44 semiosis tqrst: check with ps!
17:45 mynameisdeleted auto-healing is a must for me
17:45 tqrst semiosis: double checked
17:45 timothy joined #gluster
17:45 semiosis mynameisdeleted: the community maintains a debian repo also, see ,,(latest)
17:45 glusterbot mynameisdeleted: The latest version is available at http://goo.gl/zO0Fa . There is a .repo file for yum or see @ppa for ubuntu.
17:46 tqrst spinning glusterd back up, here's hoping
17:46 mynameisdeleted anyone awesome
17:46 mynameisdeleted let me add that
17:47 semiosis mynameisdeleted: re: load balancing, it only applies to replicas... reads are balanced automatically between replicas, writes go to all replicas.
17:47 Mo___ joined #gluster
17:50 tqrst semiosis: brought it back up. Looks like it is indeed filling /mnt/bar back up.
17:51 semiosis :)
17:51 tqrst ...hopefully this is faster than rebalancing
17:57 mynameisdeleted I know gpg is secure.. but I hate having to go through apt-keying and figuring out Release.gpg to have apt not squawk at me every time I install software
17:57 tqrst semiosis: is there a way to get the progress of this? "volume heal myvol info" isn't very informative.
17:57 mynameisdeleted but upgrading my glusterfs software
17:58 tqrst s/get/monitor
17:58 semiosis tqrst: i use df to compare the used size of the replicas
17:58 tqrst (something along the lines of "healing 5% done" as opposed to monitoring /var/log/glusterfs/bricks/mnt-bar.log for activity)
17:59 daMaestro joined #gluster
17:59 tqrst I guess that'll do :)
18:00 tqrst thanks
18:00 semiosis yw
18:02 mynameisdeleted silly question... lets say I want distributed filesystem over wan
18:02 mynameisdeleted 100mbps to 1gbps but maybe 100 or 200ms ping from servers being in different continents
18:02 mynameisdeleted is gluster an alternative to scripted unisync for the folders I want to stay identical?
18:02 mynameisdeleted whats the best wan-distributed filesystem?
18:03 mynameisdeleted I'd like a web server that appears identical despite runing on 2 continents
18:03 mynameisdeleted or 3 or 4
18:03 mynameisdeleted the dns server is custom so before it responds to any request it looks up the ip and location of the ip that sends the request
18:04 mynameisdeleted and gives different results depending on country of origin
18:04 mynameisdeleted to support geographic distribution
18:12 andreask joined #gluster
18:13 tqrst semiosis: hm, so if I ever forget to mount /mnt/bar before launching glusterfsd, this means that my root partition's /mnt/bar folder will be "healed" too eh
18:14 xian1 mynameisdeleted: a delayed comment to your comment about never failing to recover from RAID disk failure:  statistically, the greatest chance of a second disk failure in a RAID occurs when reading the parity from other disks to rebuild a single disk failure.  Been there, several times.  That's why SAS drives are better, because you can have an idea from looking at SCSI counters that a disk may be borderline.  Of course, has nothing to do with gluster.
18:14 semiosis mynameisdeleted: glusterfs replication (AFR) doesnt perform well over high latency links.  a future release of glusterfs will have "multi-master georeplication" to support that use case.
18:15 mynameisdeleted so maybe glusterfs in a main datacenter is good for large parallel filesystem
18:15 mynameisdeleted but to make a remote filesystem mirror, cron with unisync is a great option
18:15 jiffe98 when initiating a full self-heal through the shell does it matter which server you start it from?
18:15 semiosis tqrst: yep, the solution to that is to make the brick directory a subdir of the brick disk's mount point... for example, mount /dev/sdd at /mnt/foo then make the brick path /mnt/foo/bar
18:15 mynameisdeleted that works over any filesystem
18:16 mynameisdeleted self-heal is similar performance to raid resyncing?
18:16 semiosis tqrst: so if /mnt/foo isn't mounted, /mnt/foo/bar wont exist and glusterfsd will die trying to start
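
A sketch of that layout with hypothetical device and paths; the mkfs options follow the usual XFS-for-bricks advice:

    mkfs.xfs -i size=512 /dev/sdd
    mount /dev/sdd /mnt/foo
    mkdir /mnt/foo/bar        # the brick directory lives below the mount point
    # if /dev/sdd is not mounted, /mnt/foo/bar does not exist and glusterfsd refuses to start
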
18:16 mynameisdeleted takes hours and hours and hours to check everything and read every file?
18:16 dbruhn__ joined #gluster
18:16 tqrst semiosis: right
18:16 mynameisdeleted I guess 2nd disk failure is why raid6 or raid10 is way better than raid5
18:17 semiosis mynameisdeleted: well not really
18:17 tqrst semiosis: is it as simple as bringing glusterfsd down, mkdir /mnt/foo/bar, mv /mnt/foo/.* /mnt/foo/bar, replace-brick blah:/mnt/foo blah:/mnt/foo/bar commit force?
18:17 mynameisdeleted lighting rebuilding of only async data is probably much better for disks
18:17 semiosis mynameisdeleted: glusterfs knows what files are out of sync, then heals them using either a diff or full algorithm.  because the healing is file based, rather than entire block device based, it is less painful than rebuilding raid
18:18 semiosis besides the general similarity that you are repairing consistency of two replicated things, there's really not much deeper in common between glusterfs & raid replication
18:18 mynameisdeleted and it maintains a list of out-of-sync files
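
The diff/full choice semiosis mentions is tunable per volume if the default behaviour isn't wanted (volume name is hypothetical):

    gluster volume set myvol cluster.data-self-heal-algorithm diff    # or 'full'
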
18:18 daMaestro joined #gluster
18:19 semiosis tqrst: that sounds reasonable but i've never tried.  let me know how it goes :)
18:19 mynameisdeleted gluster supports if I have 10 nodes that all files are saved on exactly 2 of them?
18:19 tqrst semiosis: I'll wait until this brick is done healing first
18:19 semiosis tqrst: maybe try it on a test volume before production :)
18:19 mynameisdeleted does it also support somethign like raid5 or raid6?
18:19 mynameisdeleted where I can support single or double node failure but get to keep more than half my space
18:19 mynameisdeleted I guess that's bad for production
18:19 semiosis mynameisdeleted: you set a replica count for the whole volume, then you add bricks in multiples of that count.  all files get the same replication factor.
18:20 mynameisdeleted reason its acceptable with raid5 is I just change my min read and write size to be a full stripe(block on each drive)
18:20 mynameisdeleted so if I have replica-2.. I can have 2, 4 or 6 nodes but not 5?
18:20 semiosis mynameisdeleted: well "nodes" is not clear, we use ,,(glossary) to keep things clear
18:20 glusterbot mynameisdeleted: A "server" hosts "bricks" (ie. server1:/foo) which belong to a "volume"  which is accessed from a "client"  . The "master" geosynchronizes a "volume" to a "slave" (ie. remote1:/data/foo).
18:21 semiosis so you can have 2, 4, or 6 "bricks"
18:21 mynameisdeleted and if I have 4.. can it split all data on serverN:/foo equally?
18:22 semiosis i wish i could show you ,,(brick naming) but the site is down :(
18:22 glusterbot http://goo.gl/l3iIj
18:22 semiosis johnmark: can you get us a backup/dump of c.g.o?  at least the docathon articles?
18:22 semiosis johnmark: those are valuable
18:24 semiosis mynameisdeleted: you could make a 2x2 distributed-replicated volume, which distributes files evenly (half&half) between two replica pairs.
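
A 2x2 distributed-replicated volume in CLI terms, with hypothetical server and brick names; bricks are paired into replica sets in the order they are listed:

    gluster volume create myvol replica 2 \
        server1:/mnt/foo/bar server2:/mnt/foo/bar \
        server3:/mnt/foo/bar server4:/mnt/foo/bar
    gluster volume start myvol
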
18:24 semiosis mynameisdeleted: some reading material i think you'll appreciate...
18:24 semiosis ,,(rtfm)
18:24 glusterbot Read the fairly-adequate manual at http://goo.gl/E3Jis
18:25 semiosis ,,(extended attributes)
18:25 glusterbot (#1) To read the extended attributes on the server: getfattr -m .  -d -e hex {filename}, or (#2) For more information on how GlusterFS uses extended attributes, see this article: http://goo.gl/Bf9Er
18:25 semiosis ,,(joe's blog)
18:25 glusterbot http://goo.gl/EH4x
18:25 semiosis ,,(semiosis tutorial)
18:25 glusterbot http://goo.gl/6lcEX
18:25 semiosis that article on joe's blog is a bit outdated but the picture may be helpful even if nothing else is
18:26 mynameisdeleted guess before I get proper nodes I can distribute with virtual block devices on qemu
18:27 semiosis you can have multiple bricks per server
18:27 mynameisdeleted or file-pretend blocks which are loop-devices from filesystem
18:27 jskinner_ joined #gluster
18:27 semiosis bricks are just directories, although you're strongly encouraged to use XFS for the brick filesystem
18:27 semiosis gotta run, lunch.  bbiab
18:28 mynameisdeleted so xfs is better than ext4
18:28 mynameisdeleted does xfs support mount -o acl over nfs?
18:29 ekuric1 joined #gluster
18:34 mynameisdeleted infiniband acceleration on glusterfs... is that through nfs rdma support?
18:34 mynameisdeleted is replication accelerated like that too?
18:39 glusterbot New news from resolvedglusterbugs: [Bug 764204] Bring in variable sized iobuf <http://goo.gl/97FvN>
18:44 tryggvil joined #gluster
18:50 nueces joined #gluster
18:52 disarone joined #gluster
18:55 mooperd joined #gluster
18:55 cjohnston_work fixed my geo-rep issues, I backported 3.3.1 from 3.2.6
18:57 SunCrushr joined #gluster
18:57 SunCrushr Hey everyone.  Hope everybody is having a good day.
18:57 jskinner_ joined #gluster
18:58 SunCrushr I have a quick question for you.  I'm setting up an infiniband connected cluster, and I'm thinking of using glusterfs to replicate and serve storage from two storage servers in the cluster.
18:58 SunCrushr Is it true that rdma is no longer supported by glusterfs?
19:00 elyograg SunCrushr: I am not an authoritative source, but I believe it was broken in 3.3.0 and fixed in 3.3.1, the current release.
19:00 andreask joined #gluster
19:01 SunCrushr Thanks for the info.  Does anyone else have anything to add?  I'm just asking because I've seen a lot of conflicting info out there on the web about this subject.
19:02 SunCrushr I do see that the 3.3.1 RPMs do indeed have a glusterfs-rdma package, which gives me hope.
19:03 penglish soooo
19:03 penglish How well does gluster deal with re-IP-ing every node?
19:03 penglish DNS names can stay the same obviously
19:05 SunCrushr Also, I'm looking for the best way to utilize multiple infiniband links.  IBBond obviously won't work, because that is active-passive.  Can gluster be given two separate IPs and do round robin or something like that?
19:06 SunCrushr I could probably get a lot of these answers on http://community.gluster.org/ if it wasn't down.  Anyone know when it's going to be back up?
19:13 hybrid512 joined #gluster
19:14 daMaestro joined #gluster
19:25 Ee__ joined #gluster
19:25 semiosis penglish: if you used ,,(hostnames) and are just changing the IP they point to you should be ok
19:25 glusterbot penglish: Hostnames can be used instead of IPs for server (peer) addresses. To update an existing peer's address from IP to hostname, just probe it by name from any other peer. When creating a new pool, probe all other servers by name from the first, then probe the first by name from just one of the others.
19:26 penglish We did consistently use hostnames
19:26 penglish Which approach do you think would be safer: move one node to new IP and see if it rejoins the cluster properly, then another, then another and so on?
19:27 penglish Or just power them all down and bring them all back up with new IPs?
19:30 semiosis penglish: either *should* work, the latter would guarantee beyond any possible doubt that the hostname gets resolved again
19:31 penglish hmm
19:31 penglish my cow-orker has pointed out that despite using hostnames when we set it up, "peer status" always seems to show at least one IP address for at least one of the other hosts
19:31 penglish Which implies there may be trouble.. I see why he was concerned
19:31 semiosis read ,,(hostnames) again
19:31 glusterbot Hostnames can be used instead of IPs for server (peer) addresses. To update an existing peer's address from IP to hostname, just probe it by name from any other peer. When creating a new pool, probe all other servers by name from the first, then probe the first by name from just one of the others.
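
In command form, the factoid boils down to something like this (hostnames are hypothetical; run the probe from a peer other than the one being renamed):

    gluster peer probe server1      # re-probing an IP-only peer by name updates its address
    gluster peer status             # the peer should now be listed by hostname
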
19:32 semiosis idk how you'd fix that, could be difficult
19:33 semiosis does the volume show one or more brick addreses with ip, or are all the bricks using hostnames?
19:40 rgustafs joined #gluster
19:59 Ee__ left #gluster
20:09 dbruhn joined #gluster
20:12 georgeh|workstat having a problem with locks on a brand new volume (two servers distributed-replicate), doing a gluster volume status hangs for about 10 minutes (frame-timeout is set to 600) then returns nothing, all other commands except gluster peer status and gluster volume info return operation failed
20:13 georgeh|workstat looking at the logs, I see entries like '0-glusterd: Unable to get lock for uuid: 18f811a0-996a-41e4-be16-b56a2c35c4be, lock held by: 18f811a0-996a-41e4-be16-b56a2c35c4be'
20:13 georgeh|workstat every time I execute a command
20:16 hagarth joined #gluster
20:20 pol_ joined #gluster
20:21 pol_ joined #gluster
20:22 semiosis @later tell ThatGraemeGuy wait actually it seems to be intermittent.  i've tried rebooting a bunch of times and sometimes the mount works at boot, other times i get the dns failure message.  i'll get to the bottom of this, but it may take a few days/the weekend.
20:22 glusterbot semiosis: The operation succeeded.
20:22 sjoeboo joined #gluster
20:26 semiosis weird, my local kvm test vm mounts at boot every time, but my ec2 test instance only does sometimes
20:28 duffrecords joined #gluster
20:29 georgeh|workstat now I'm getting "0-glusterd: Received RJT from uuid: 00000000-0000-0000-0000-000000000000" and "Lock response received from unknown peer: 00000000-0000-0000-0000-000000000000" but gluster peer status still appears okay
20:29 georgeh|workstat commands still not working
20:30 semiosis georgeh|workstat: did you clone your servers?
20:32 georgeh|workstat nope, fresh install
20:32 georgeh|workstat not vms
20:34 georgeh|workstat can't even run a statedump, that just hangs for awhile then returns nothing and the files are not created
20:34 georgeh|workstat restarted glusterd on each server, no luck
20:36 georgeh|workstat if I stop glusterd on either of them and then run gluster volume status, it works, but the moment I restart the peer it fails
20:38 georgeh|workstat semiosis, when you asked about cloning, what were you thinking?
20:38 georgeh|workstat seems like the two peers are having difficulty talking to each other, not sure why
20:39 dbruhn joined #gluster
20:49 elyograg georgeh|workstat: if they were cloned, they would have the same peer UUID.
20:50 badone joined #gluster
20:50 _benoit_ joined #gluster
20:51 georgeh|workstat ah, no, they have different peer UUIDs
20:51 georgeh|workstat which makes the lock message even more confusing on the one server
20:52 tomsve joined #gluster
20:54 duffrecords I just discovered my servers are now unable to mount Gluster volumes over NFS (the error is wrong fs type, bad option, bad superblock, etc.).  it was working fine as of Feb. 11th.  the only recent error I see in nfs.log is "Unable to resolve FH" yesterday but that was for a different volume
20:55 duffrecords in fact, it looks like the logs have stopped as of 8 PM last night
21:00 duffrecords "gluster volume status" looks fine and if I run "rpcinfo -t <gluster-server> 100003 3" on another server it looks ok too
21:00 duffrecords what else can I do to check the integrity of the volume?
21:14 duffrecords seems to mount fine using the Gluster client.  but it fails with NFS
21:22 semiosis duffrecords: do you have nfs-common installed?
21:22 jclift_ joined #gluster
21:23 semiosis or maybe missing a : or a / in your remote path?
21:23 semiosis remember, nfs mounts require server:/volume whereas glusterfs fuse mounts only need server:volume (/ optional)
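
Side by side, with hypothetical names, the two fstab forms semiosis is contrasting:

    server1:/myvol   /mnt/myvol   nfs         vers=3,tcp,nolock   0 0    # NFS: slash before the volume name required
    server1:myvol    /mnt/myvol   glusterfs   defaults,_netdev    0 0    # native client: slash optional
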
21:27 penglish semiosis: sorry, too much multitasking. To answer your question: with "gluster peer status" - on one host it shows only hostnames for the other hosts! On on other hosts it shows one IP and hostname for the others..
21:27 penglish ie: it is not consistent
21:28 semiosis that wasnt my question, i was asking if -- inside any volume -- there were ip addresses used in brick paths
21:28 semiosis you could ,,(pasteinfo) and i can see for myself
21:28 glusterbot Please paste the output of "gluster volume info" to http://fpaste.org or http://dpaste.org then paste the link that's generated here.
21:29 penglish semiosis: gluster peer probe <hostname> appears to have worked
21:29 semiosis the reason peer status appears "inconsistent" is (probably) because each host shows only the other hosts, not itself, and you have one host who is known by IP because you probed from it to the others when setting up
21:29 semiosis penglish: thats great!
21:29 penglish :-)
21:30 erik49 joined #gluster
21:34 duffrecords forget I mentioned "gluster volume status."  I was just pointing out that its output looks the way it should—everything online.  I'm just trying to find errors or warnings that would give me clues where to start looking but I'm not seeing anything obvious, except the NFS mount failing
21:35 sjoeboo joined #gluster
21:35 duffrecords regarding the NFS mount, /etc/fstab hasn't changed and it was working last week.  I'm the only one who logs into these servers and makes system-level changes, and I was out sick all week, so I'm 99.9999% sure nothing was modified
21:36 elyograg does gluster volume info show any options, specifically nfs.disable ?
21:37 duffrecords nope
21:37 semiosis elyograg: it shows all options that have been modified (even those which were set back to default manually)
21:39 duffrecords semiosis: to answer your earlier question, nfs-common is installed on the client
21:39 georgeh|workstat I have attempted to re-create this volume multiple times, detached and re-attached the peers, and I am still getting the hangs and operation failed for gluster commands, any ideas?
21:40 semiosis georgeh|workstat: log files are in /var/log/glusterfs
21:41 georgeh|workstat it seems like the cluster lock is held by one peer and then nothing works afterwards
21:42 georgeh|workstat keeps saying it is unable to acquire lock
21:42 semiosis georgeh|workstat: check *all* your servers, do gluster peer status... do you see peer rejected anywhere?
21:42 duffrecords I'm able to mount the volume in question successfully using the glusterfs-client package, so that solves (for now) the problem that has everybody on my case today but I won't be able to use the glusterfs-client software as a replacement for the NFS datastore that our VMware hosts are using.  I wish ESXi wasn't such a crippled Linux distro or else I would just install the Gluster client and call it done
21:42 semiosis georgeh|workstat: i mean you should execute 'gluster peer status' on every server, not just one
21:42 georgeh|workstat only two servers, running gluster peer status on both returns fine
21:42 georgeh|workstat both peers are in the cluster
21:42 georgeh|workstat no rejected
21:43 georgeh|workstat UUIDs seem right
21:43 semiosis have you tried restarting glusterd on them?
21:43 georgeh|workstat yep
21:43 georgeh|workstat does nothing
21:43 semiosis sometimes that helps.  also what version of glusterfs is this?
21:43 georgeh|workstat v 3.3.1-1
21:44 semiosis distro?
21:44 georgeh|workstat the gluster.org RPMs
21:45 georgeh|workstat glusterfs-epel
21:54 georgeh|workstat this doesn't make any sense, if I shutdown glusterd on one of the peers, the other works fine, and vice versa, but not when both are up
21:55 semiosis georgeh|workstat: i'm surprised you dont have peer rejected status, because sounds like it kinda
21:56 mooperd joined #gluster
21:56 semiosis georgeh|workstat: i think the same solution could help though... pick one server, stop glusterd, kill all gluster processes, and move everything except glusterd.info out of /var/lib/glusterd (to a safe backup location)
21:57 semiosis then when you start glusterd up again it will have its same uuid but no peer/volume config.  now probe both ways between the servers and restart glusterd again.  it should sync the peer & volume info from the one that was left online
21:57 semiosis and hopefully all will be well
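
The same recovery, roughly, as shell commands (server names are hypothetical; adjust the service commands to the distro):

    service glusterd stop
    pkill glusterfs; pkill glusterfsd                # stop client/NFS and brick daemons too
    mkdir /root/glusterd-backup
    find /var/lib/glusterd -mindepth 1 -maxdepth 1 \
        ! -name glusterd.info -exec mv {} /root/glusterd-backup/ \;   # keep glusterd.info so the UUID survives
    service glusterd start
    gluster peer probe otherserver                   # run on this server
    gluster peer probe thisserver                    # run on the other server
    service glusterd restart                         # on both, to sync peer and volume info
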
21:59 tryggvil joined #gluster
22:00 jiffe98 is there a way to get a list of locks held?
22:01 nueces joined #gluster
22:01 semiosis good question, idk
22:19 nueces joined #gluster
22:19 semiosis @later tell ThatGraemeGuy ok this is kinda drastic but should get you going until i come up with a better solution: add a post-start sleep to /etc/init/glusterfs-server.conf like this... http://pastie.org/6259835
22:19 glusterbot semiosis: The operation succeeded.
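
The pastie itself isn't preserved here, but an upstart post-start delay of the kind semiosis describes would look roughly like this in /etc/init/glusterfs-server.conf:

    post-start script
        sleep 10    # crude: give networking/DNS a moment before boot-time mounts are attempted
    end script
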
22:28 georgeh|workstat semiosis, that didn't work, now the one I stopped and cleaned out only says that it has State: Accepted peer request (Connected)
22:28 georgeh|workstat not syncing
22:29 semiosis did you probe in both directions?  restart glusterd on both servers afterward?
22:29 semiosis ,,(replace)
22:29 glusterbot Useful links for replacing a failed server... if replacement server has different hostname: http://goo.gl/4hWXJ ... or if replacement server has same hostname:
22:29 glusterbot http://goo.gl/rem8L
22:29 georgeh|workstat yes, the one left up said the peer was already in the cluster, the other went into the state accepted peer request
22:30 semiosis and you restarted glusterd on both afterward?
22:31 georgeh|workstat yep
22:32 semiosis weird
22:32 semiosis on the one with no config, try 'gluster volume sync <otherhost> all'
22:35 georgeh|workstat it returns unsuccessful
22:35 georgeh|workstat tried restarting both again too
22:36 sjoeboo joined #gluster
22:37 raven-np joined #gluster
22:39 jskinner_ joined #gluster
22:40 georgeh|workstat hmmm...it is telling me 'volname is not present in operation ctx' when I try to sync
22:43 semiosis georgeh|workstat: hate to say this but sounds like things are so messed up you might be better off starting over.  as long as you recreate volumes with same bricks (in same order) you should be ok
22:44 georgeh|workstat I've tried recreating the volumes from scratch, only thing I haven't done is uninstall the RPMs and reinstall
22:44 semiosis you'll need to clear the path or a prefix of it is already part of a volume message
22:44 glusterbot semiosis: To clear that error, follow the instructions at http://goo.gl/YUzrh or see this bug http://goo.gl/YZi8Y
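
The linked instructions commonly boil down to removing the gluster xattrs and the .glusterfs directory from the brick being reused (path hypothetical); only do this on a brick you really intend to detach from its old volume:

    setfattr -x trusted.glusterfs.volume-id /mnt/foo/bar
    setfattr -x trusted.gfid /mnt/foo/bar
    rm -rf /mnt/foo/bar/.glusterfs
    service glusterd restart
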
22:44 tqrst are there any plans to make distributed-replicate volumes automatically figure out how to properly pair bricks such that there is maximum reliability? Ideally, I would want to be able to say "this brick is on disk A in server B which is in rack C" for all my bricks, and then have gluster determine how to pair replicas so as to maximize resiliency against failures. Right now, this has to be done manually.
22:44 georgeh|workstat did that
22:45 semiosis georgeh|workstat: not just the volumes, but everything in /var/lib/glusterd on both servers
22:45 georgeh|workstat yep, did that
22:45 semiosis on both?
22:45 semiosis and this problem came back?
22:45 georgeh|workstat yep
22:45 semiosis wow
22:45 georgeh|workstat so, that's why I'm confused, what is causing this problem?
22:46 semiosis georgeh|workstat: your systems are working ok besides this?  no network or system failures happening?
22:46 georgeh|workstat nothing I can see
22:46 semiosis tqrst: havent heard anything about that
22:53 mooperd joined #gluster
23:20 Ryan_Lane joined #gluster
23:21 Ryan_Lane I have a stopped volume that I'm trying to delete and I'm being told the volume deletion was unsuccessful...
23:21 Ryan_Lane [2013-02-20 23:20:35.708435] I [cli-rpc-ops.c:885:gf_cli3_1_delete_volume_cbk] 0-: Returning with -1
23:21 Ryan_Lane that's the log entry I'm getting in cli.log
23:22 Ryan_Lane nothing else is getting logged
23:23 JoeJulian "jiffe98> is there a way to get a list of locks held?" - kill -USR1 should trigger a state dump to /tmp and should show locks.
23:25 JoeJulian georgeh|workstat: So the problem is that /sometimes/ cli commands don't work?
23:26 JoeJulian Ryan_Lane: Since it's the callback that's receiving a failure, I think the actual failure is on one of the other glusterd instances. Check all the glusterd logs.
23:27 Ryan_Lane ah. indeed. just found an issue
23:27 Ryan_Lane lock held
23:29 Ryan_Lane hm
23:29 Ryan_Lane also:  0-glusterd: Request received from non-privileged port. Failing request
23:35 JoeJulian @learn privileged port as By default, glusterd only accepts connections from privileged ports (1-1024). This ensures that the sending host at least has root privileges as a rudimentary security measure. This can be overridden by setting allow-insecure on.
23:35 glusterbot JoeJulian: The operation succeeded.
23:36 Ryan_Lane oh. wonderful….
23:36 JoeJulian @forget privileged port
23:36 glusterbot JoeJulian: The operation succeeded.
23:36 Ryan_Lane it seems the deletes partially succeeded
23:36 JoeJulian @learn privileged port as By default, glusterd only accepts connections from privileged ports (1-1024). This ensures that the sending host at least has root privileges as a rudimentary security measure. This can be overridden by setting allow-insecure on for that volume.
23:36 glusterbot JoeJulian: The operation succeeded.
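
The setting glusterbot is describing, for a hypothetical volume named myvol (glusterd itself has a matching 'option rpc-auth-allow-insecure on' knob in /etc/glusterfs/glusterd.vol):

    gluster volume set myvol server.allow-insecure on
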
23:36 Ryan_Lane but only just enough to make it impossible to restart glusterd
23:37 JoeJulian That bug's been reported and, iirc, there's a fix already in 3.4.
23:37 Ryan_Lane /var/lib/glusterd/vols/juju-home exists, but /var/lib/glusterd/vols/juju-home/info does not
23:37 ultrabizweb joined #gluster
23:38 JoeJulian georgeh|workstat: I think the privileged port thing may have been what you were encountering as well.
23:38 Ryan_Lane it only seems to have done this on a single brick
23:38 Ryan_Lane rsync to the rescue
23:38 JoeJulian That tends to be a problem if outgoing ports 1-1024 are all busy.
23:39 JoeJulian I'm not even sure that it's that useful of a security measure.
23:40 georgeh|workstat does gluster have a problem with bonded interfaces?
23:40 JoeJulian What I would like to see is signed keys to authorize peers into the trusted pool.
23:40 tqrst can an "add-brick" operation end up being pending for a while? I added a pair of bricks. They still won't show up in volume info, have no .glusterfs, no trusted.gfid xattr but have a trusted.glusterfs.volume-id.
23:41 tqrst by "for a while", I mean more than 10 minutes
23:41 JoeJulian georgeh|workstat: None that I've heard of. It should be transparent to the application.
23:41 tqrst (also, add-brick returned Operation Failed, so I'm not quite sure why those xattrs were left there.)
23:42 JoeJulian tqrst: No, it should be nearly instant.
23:42 tqrst JoeJulian: well that's interesting
23:42 tqrst JoeJulian: is it safe to wipe the xattrs and try again?
23:42 JoeJulian probably
23:43 JoeJulian Well, safe, yes.
23:43 JoeJulian Will it work? Unknown. Definition of insanity.
23:44 tqrst Volume name bigdata rebalance is in progress. Please retry after completion
23:45 tqrst and ' gluster volume rebalance bigdata status' just sits there, apparently failing to get a lock which is held by the same machine.
23:46 tqrst the unlock request is from a node that doesn't exist any more, either
23:47 JoeJulian restart glusterd maybe. If that doesn't work, try restarting all glusterd.
23:47 JoeJulian There's commands to release locks, but I think that's only locks on volumes, not management.
23:51 tqrst zombie glusterfsd, great
23:54 Ryan_Lane this partially-deleted volume thing is causing issues :(
23:54 Ryan_Lane same thing with the partially stopped volume issue
23:55 Ryan_Lane I wonder if all of the issues I've been having have been related to gluster halfway doing actions
23:58 Ryan_Lane that and the fact that glusterd is single-threaded
23:59 Ryan_Lane Failed to rmdir: /var/lib/glusterd/vols/juju-home, err: Directory not empty
