
IRC log for #gluster, 2014-04-03


All times shown according to UTC.

Time Nick Message
00:09 mattappe_ joined #gluster
00:30 vpshastry joined #gluster
00:33 DV joined #gluster
00:41 shyam joined #gluster
00:47 bala joined #gluster
00:49 gdubreui joined #gluster
01:09 tokik joined #gluster
01:16 nightwalk joined #gluster
01:21 RicardoSSP joined #gluster
01:24 jmarley joined #gluster
01:57 semiosis jclift: ping?
01:57 semiosis JoeJulian: ping?
01:58 semiosis @later tell jclift i found this PPA which has samba for ubuntu trusty with the gluster-vfs enabled: https://launchpad.net/~monotek/+archive/samba-vfs-glusterfs
01:58 glusterbot semiosis: The operation succeeded.
01:58 semiosis @later tell jclift it's a bit outdated though
01:58 glusterbot semiosis: The operation succeeded.
01:58 semiosis @later tell monotek ping me about https://launchpad.net/~monotek/+archive/samba-vfs-glusterfs when you have a chance.  thanks!
01:58 glusterbot semiosis: The operation succeeded.
02:02 haomaiwa_ joined #gluster
02:04 hagarth joined #gluster
02:20 haomai___ joined #gluster
02:20 vpshastry joined #gluster
02:28 sks joined #gluster
02:30 mattappe_ joined #gluster
02:30 semiosis @3.3 upgrade notes
02:30 glusterbot semiosis: http://vbellur.wordpress.com/2012/05/31/upgrading-to-glusterfs-3-3/
02:30 semiosis hagarth: ping?
02:35 jrcresawn_ joined #gluster
02:40 raghug joined #gluster
02:50 nightwalk joined #gluster
02:55 joshin joined #gluster
02:55 joshin joined #gluster
02:59 bharata-rao joined #gluster
02:59 hagarth1 joined #gluster
03:03 wgao_ joined #gluster
03:21 davinder joined #gluster
03:24 chirino joined #gluster
03:26 RameshN joined #gluster
03:31 wgao joined #gluster
03:39 shubhendu joined #gluster
03:41 harish__ joined #gluster
03:48 kdhananjay joined #gluster
04:00 itisravi joined #gluster
04:01 joshin joined #gluster
04:01 glusterbot New news from newglusterbugs: [Bug 1083816] DHT: Inconsistent permission on root directory of mount point after adding a new brick <https://bugzilla.redhat.com/show_bug.cgi?id=1083816>
04:13 ravindran1 joined #gluster
04:18 sahina joined #gluster
04:18 psharma joined #gluster
04:24 nightwalk joined #gluster
04:25 Jakey joined #gluster
04:32 dusmant joined #gluster
04:35 atinm joined #gluster
04:35 kanagaraj joined #gluster
04:37 ndarshan joined #gluster
04:37 deepakcs joined #gluster
04:41 hagarth joined #gluster
04:46 shylesh joined #gluster
04:50 prasanth_ joined #gluster
04:51 bharata-rao joined #gluster
04:51 wgao joined #gluster
04:52 ppai joined #gluster
04:54 lalatenduM joined #gluster
04:59 nishanth joined #gluster
05:00 benjamin_ joined #gluster
05:10 sks joined #gluster
05:13 RameshN_ joined #gluster
05:17 spandit joined #gluster
05:24 hchiramm__ joined #gluster
05:27 rjoseph joined #gluster
05:38 jrcresawn_ joined #gluster
05:39 vpshastry joined #gluster
05:41 haomaiwa_ joined #gluster
05:48 glusterbot joined #gluster
05:50 vkoppad joined #gluster
05:53 nightwalk joined #gluster
06:01 haomaiw__ joined #gluster
06:02 mattapperson joined #gluster
06:12 rahulcs joined #gluster
06:20 glusterbot New news from newglusterbugs: [Bug 1054199] gfid-acces not functional during creates under sub-directories. <https://bugzilla.redhat.com/show_bug.cgi?id=1054199> || [Bug 1041109] structure needs cleaning <https://bugzilla.redhat.com/show_bug.cgi?id=1041109> || [Bug 1083816] DHT: Inconsistent permission on root directory of mount point after adding a new brick <https://bugzilla.redhat.com/show_bug.cgi?id=1083816> || [Bug
06:22 vimal joined #gluster
06:23 ndarshan joined #gluster
06:28 rgustafs joined #gluster
06:31 psharma joined #gluster
06:34 rjoseph joined #gluster
06:34 ricky-ti1 joined #gluster
06:36 ekuric joined #gluster
06:36 nshaikh joined #gluster
06:46 hchiramm__ joined #gluster
06:47 jtux joined #gluster
06:50 xavih joined #gluster
06:50 glusterbot New news from newglusterbugs: [Bug 921215] Cannot create volumes with a . in the name <https://bugzilla.redhat.com/show_bug.cgi?id=921215>
06:54 bharata-rao joined #gluster
06:54 raghug joined #gluster
06:59 ctria joined #gluster
07:00 rahulcs joined #gluster
07:00 jtux joined #gluster
07:05 shubhendu joined #gluster
07:06 RameshN_ joined #gluster
07:06 dusmant joined #gluster
07:06 RameshN joined #gluster
07:06 eseyman joined #gluster
07:06 sks joined #gluster
07:09 hagarth joined #gluster
07:10 swat30 joined #gluster
07:10 keytab joined #gluster
07:10 Dasberger joined #gluster
07:24 vpshastry1 joined #gluster
07:28 fsimonce joined #gluster
07:29 ravindran1 left #gluster
07:30 psharma joined #gluster
07:31 nightwalk joined #gluster
07:45 andreask joined #gluster
07:55 neoice joined #gluster
07:56 hchiramm__ joined #gluster
07:58 klaas joined #gluster
08:03 vpshastry joined #gluster
08:04 mattapperson joined #gluster
08:04 Gugge joined #gluster
08:26 Andyy2 joined #gluster
08:36 Andyy2 My KVM virtual machines (qemu+libgfapi+kvm) crash with disk errors on gluster node reboot (distributed+replicated setup). Any ideas on what to look for?
08:40 samppah Andyy2: can you send output of gluster vol info to pastie.org?
08:41 samppah i'd guess that you want to tune network.ping-timeout value
08:42 Andyy2 samppah: here it is: http://pastie.org/8990692
08:42 glusterbot Title: #8990692 - Pastie (at pastie.org)
08:44 Andyy2 samppah: seems you're right. 42 seconds default might be too much. What would be a preferred value?
08:44 samppah Andyy2: I have set it to 10 seconds
08:45 samppah haven't seen any issue so far
08:45 Andyy2 I will try that. smack!
08:45 samppah hope it helps :)
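A minimal sketch of the tuning samppah describes, assuming a volume named "myvol" (the name and the 10-second value are examples only, not from the log):
    # shorten the client ping-timeout from the 42-second default to 10 seconds
    gluster volume set myvol network.ping-timeout 10
    # confirm the reconfigured option took effect
    gluster volume info myvol | grep ping-timeout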
08:47 fsimonce` joined #gluster
08:51 dusmant joined #gluster
08:51 raghug joined #gluster
08:52 vpshastry joined #gluster
08:58 vpshastry joined #gluster
09:00 joshin joined #gluster
09:02 fsimonce joined #gluster
09:02 abyss I added bricks, but on the glusterfs servers the bricks are the mount points themselves, which means every brick has a lost+found directory... Of course that's some kind of problem, because heal status shows that some files (lost+found) need to be healed etc. Is there any solution for that?
09:03 rahulcs joined #gluster
09:04 abyss or maybe I should bother about that?
09:04 hagarth joined #gluster
09:05 Andyy2 abyss: your bricks must not be the mount point directly, but a directory under the mount point.
09:05 kdhananjay joined #gluster
09:06 nightwalk joined #gluster
09:11 abyss Andyy2: ok, but another solution?;) Now is too late to change mount point;)
09:12 abyss maybe remove extended attr from each lost+found directory would help?
09:12 abyss or something
09:16 Andyy2 I don't think removing the attrs will help. they would be recreated. you really need to find a way for recreating a correct setup, imho, or live with the heal problems on lost+found.
09:18 Andyy2 short of that, you could stop the vol and then modify the vol files on bricks and clients by hand to point to a new directory under the mount. then move the .glusterfs directory + data files under that. cross fingers and try it on another system before doing it on live data.
09:22 abyss Andyy2: ok. Thank you.
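For reference, a hedged sketch of the layout Andyy2 recommends, with the brick one directory below the filesystem mount point (device, paths, hostnames and volume name are all illustrative):
    mkfs.xfs -i size=512 /dev/sdb1                 # format the backing device
    mkdir -p /export/sdb1 && mount /dev/sdb1 /export/sdb1
    mkdir /export/sdb1/brick                       # the brick is a subdirectory, not the mount point itself
    gluster volume create myvol replica 2 server1:/export/sdb1/brick server2:/export/sdb1/brick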
09:31 joshin joined #gluster
09:36 rahulcs joined #gluster
09:46 psharma joined #gluster
09:50 ndarshan joined #gluster
09:54 joshin joined #gluster
09:54 joshin joined #gluster
10:00 YazzY left #gluster
10:04 jmarley joined #gluster
10:07 dusmant joined #gluster
10:23 glusterbot New news from newglusterbugs: [Bug 1083963] Dist-geo-rep : after renames on master, there are more number of files on slave than master. <https://bugzilla.redhat.com/show_bug.cgi?id=1083963>
10:25 joshin joined #gluster
10:25 joshin joined #gluster
10:27 ravindran1 joined #gluster
10:29 Slash joined #gluster
10:39 rahulcs joined #gluster
10:40 kkeithley1 joined #gluster
10:40 andreask joined #gluster
10:42 nightwalk joined #gluster
10:48 vkoppad joined #gluster
10:51 qdk joined #gluster
11:06 tdasilva joined #gluster
11:22 hagarth joined #gluster
11:22 Slash joined #gluster
11:28 shyam joined #gluster
11:30 kanagaraj joined #gluster
11:32 ppai joined #gluster
11:40 rahulcs joined #gluster
11:41 diegows joined #gluster
11:44 rahulcs joined #gluster
11:45 mfs3 joined #gluster
11:46 atrius joined #gluster
11:47 RameshN_ joined #gluster
11:48 dusmant joined #gluster
11:50 mfs3 having issues with GlusterFS 3.4.2 on a 100+ node cluster: when adding/removing bricks while the fs is mounted, xinetd reports 'all ports in use'. If bricks are added/removed without the fs being mounted, no such issues.
11:50 mfs3 running latest CentOS 6.5 over 10G on E5-2650v2
11:51 mfs3 also I noticed the cli/glusterd does not scale well; 'peer probe' and 'add-brick' take a long time
11:52 shubhendu joined #gluster
11:53 saurabh joined #gluster
11:53 ndarshan joined #gluster
11:53 glusterbot New news from newglusterbugs: [Bug 1077452] Unable to setup/use non-root Geo-replication <https://bugzilla.redhat.com/show_bug.cgi?id=1077452>
11:53 RameshN joined #gluster
11:54 Slash joined #gluster
11:55 mfs3 double checked everything; indeed many extra ports are in use when hitting this issue
11:58 nishanth joined #gluster
12:04 tdasilva joined #gluster
12:04 itisravi joined #gluster
12:04 gluster-admin joined #gluster
12:05 Andy-gluster-dud Hi all
12:05 mattapperson joined #gluster
12:05 Andy-gluster-dud I am running glusterfs 3.4.2, and I am trying to setup quota, but I get this: gluster> v set wp-storage quota.deem-statfs on
12:05 Andy-gluster-dud volume set: failed: option : quota.deem-statfs does not exist
12:06 Andy-gluster-dud I searched through the source and there is absolutely no reference to deem-statfs
12:06 Andy-gluster-dud was this removed from 3.4.2?
12:08 tdasilva left #gluster
12:09 atinm Andy-gluster-dud, can u please try this command :  v set wp-storage features.quota-deem-statfs on
12:10 Andy-gluster-dud atinm, yup I tried that first, doesn't work
12:10 Andy-gluster-dud gluster> v set wp-storage features.quota-deem-statfs on
12:10 Andy-gluster-dud volume set: failed: option : features.quota-deem-statfs does not exist
12:10 Andy-gluster-dud Did you mean features.quota-timeout?
12:10 Andy-gluster-dud gluster>
12:10 ultrabizweb joined #gluster
12:11 Andy-gluster-dud I searched the 3.4.2 source code and found no reference to deem-statfs
12:15 atinm Andy-gluster-dud, have u downloaded the code from git repo?
12:16 Andy-gluster-dud no, I downloaded it from gluster.org
12:17 Andy-gluster-dud the funny thing is that in 3.4.2 the behaviour is as if features.quota-deem-statfs is on by default
12:19 atinm u can probably post this question to gluster-dev ?
12:19 Andy-gluster-dud I guess, sad that the documentation of new releases is almost non-existent
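A hedged way to check which quota-related options a given glusterfs build actually exposes, relevant to the failed set command above (output varies by version):
    # list the settable volume options this build knows about
    gluster volume set help | grep -i quota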
12:21 nightwalk joined #gluster
12:28 jclift semiosis: k.  That's useful, but I'm not personally into PPA's. :)
12:29 kkeithley_ You'll be happy to know then that 3.5.0 GA is blocked until we get the documentation done
12:34 kkeithley_ glusterfs-3.4.3 RPMs for EPEL (RHEL, CentOS, etc.) are available in YUM repos at http://download.gluster.org/pub/gluster/glusterfs/3.4/3.4.3/.  Fedora RPMs are in Fedora updates-testing repos. Other distros coming soon
12:34 glusterbot Title: Index of /pub/gluster/glusterfs/3.4/3.4.3 (at download.gluster.org)
12:43 jclift kkeithley_: "Happy"... Meh.  But I'm glad it'll mean our reputation for docs will start improving with this release. :)
12:44 benjamin_ joined #gluster
12:45 sroy joined #gluster
12:45 deepakcs joined #gluster
12:47 dusmant joined #gluster
12:47 ndarshan joined #gluster
12:52 hagarth joined #gluster
12:53 RameshN_ joined #gluster
12:59 japuzzo joined #gluster
13:01 kanagaraj joined #gluster
13:09 primechuck joined #gluster
13:10 rahulcs joined #gluster
13:11 bennyturns joined #gluster
13:22 rwheeler joined #gluster
13:22 haomaiwang joined #gluster
13:22 rahulcs joined #gluster
13:23 hagarth joined #gluster
13:23 ctria joined #gluster
13:25 dusmant joined #gluster
13:34 mattapperson joined #gluster
13:36 shyam joined #gluster
13:40 rahulcs joined #gluster
13:44 JoeJulian son of a... I thought I had all the places that could automatically upgrade my gluster servers fixed and somehow I still missed one. Gah!
13:47 JoeJulian now I'm going to have to spend all effing day fixing split-brain
13:48 gmcwhistler joined #gluster
13:50 JoeJulian kkeithley: glusterfs-server's dependency on glusterfs-libs is broken. Should require the same version.
13:51 theron joined #gluster
13:55 chirino joined #gluster
13:57 harish__ joined #gluster
13:59 nightwalk joined #gluster
14:02 chirino_m joined #gluster
14:21 ctria joined #gluster
14:21 wushudoin joined #gluster
14:23 ndevos hi NuxRo, I was wondering if you could share some details of your CloudStack + Gluster environment and use-case?
14:24 ndevos NuxRo: if you can, could you put some info in an email to me?
14:24 rpowell joined #gluster
14:24 NuxRo ndevos: dont have much to be honest, it was only used briefly, for testing reasons only, we're going with local storage for the time being (we have shitty network, no 10G)
14:24 seapasulli joined #gluster
14:25 NuxRo i'd happily write up something when we'll start using it
14:29 JoeJulian ndevos: probably should have tagged you on this too.  glusterfs-server's dependency on glusterfs-libs is broken. Should require the same version.
14:35 ndevos NuxRo: okay, thanks, it would be great if you can keep me informed about it :)
14:35 ndevos JoeJulian: I thought that was fixed a while ago already?
14:36 ndevos JoeJulian: is that for the Fedora packages, or the ones on download.gluster.org?
14:36 JoeJulian Well glusterfs-server just updated to 3.4.3 and (somewhat fortunately) didn't start because glusterfs-libs was still 3.4.2. download.gluster.org
14:39 ndevos JoeJulian: ah, its fixed in 3.5 with http://review.gluster.org/5455 ....
14:39 glusterbot Title: Gerrit Code Review (at review.gluster.org)
14:39 JoeJulian see how you are...
14:40 shubhendu joined #gluster
14:40 ndevos hmm, and http://review.gluster.org/5437 in 3.4
14:40 glusterbot Title: Gerrit Code Review (at review.gluster.org)
14:41 davinder joined #gluster
14:42 diegows joined #gluster
14:45 kkeithley_ JoeJulian: sigh. and sigh.
14:47 chirino joined #gluster
14:47 dusmant joined #gluster
14:48 rahulcs joined #gluster
14:48 sahina joined #gluster
14:49 chirino_ joined #gluster
14:56 vpshastry joined #gluster
14:58 kkeithley_ JoeJulian: from #gluster-dev
14:58 kkeithley_ (10:56:17 AM) kkeithley_: the fedora dist-big spec has    Requires:         %{name}-libs = %{version}-%{release}
14:58 kkeithley_ (10:56:41 AM) kkeithley_: for %package server
14:58 kkeithley_ (10:57:47 AM) kkeithley_: rpm -q --requires -p glusterfs-server-3.4.3-1.el6.x86_64.rpm
14:58 kkeithley_ (10:57:49 AM) kkeithley_: ...
14:58 kkeithley_ (10:57:55 AM) kkeithley_: glusterfs-libs = 3.4.3-1.el6
15:00 lmickh joined #gluster
15:09 JoeJulian kkeithley_, ndevos: yep... I'm completely at a loss how that happened. puppet did a yum upgrade (bad puppet) of glusterfs-server (and glusterfs-fuse) and for the life of me I can't see any reason why they succeeded without upgrading libs.
15:09 sprachgenerator joined #gluster
15:10 benjamin_____ joined #gluster
15:10 ekuric joined #gluster
15:10 JoeJulian unless...
15:10 ekuric joined #gluster
15:13 ctria joined #gluster
15:15 rpowell left #gluster
15:17 mattappe_ joined #gluster
15:20 kkeithley_ unless.....
15:20 kkeithley_ ?
15:21 JoeJulian Was installing the src
15:21 JoeJulian Unless the glusterfs package "Provides:         %{name}-libs = %{version}-%{release}"
15:23 purpleidea JoeJulian: badpuppet++
15:23 JoeJulian I could have sworn I changed that to the specific version.
15:24 JoeJulian I totally freaked out the folks at the puppet meetup when I told them I "ensure => latest" for most things.
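A hedged sketch of one way to keep yum from upgrading gluster packages behind your back, as in the accidental upgrade above (requires the yum versionlock plugin; the package glob is an assumption based on the EL packaging split):
    yum install -y yum-plugin-versionlock
    # lock the currently installed glusterfs packages at their present version
    yum versionlock 'glusterfs*'
    yum versionlock list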
15:29 kmai007 joined #gluster
15:31 kmai007 what feature would i set, if i wanted a fuse client to react immediately when it detects an issue with 1 of the gluster storage servers?  I tried setting network.ping-timeout to 7 sec, but i'm observing it "hangs or waits" longer than that
15:32 kmai007 are there additional feature sets that I have to examine?
15:32 kmai007 @glusterbot network.ping-timeout
15:36 JoeJulian @ping-timeout
15:36 glusterbot JoeJulian: The reason for the long (42 second) ping-timeout is because re-establishing fd's and locks can be a very expensive operation. Allowing a longer time to reestablish connections is logical, unless you have servers that frequently die.
15:36 ndk joined #gluster
15:37 jag3773 joined #gluster
15:37 JoeJulian kmai007: You should check your client log to see what else may be causing any delay you may be noticing.
15:38 kmai007 i believe gluster is working as designed,
15:38 nightwalk joined #gluster
15:38 JoeJulian IMHO, any network or server situation where the ping-timeout is a more-than-once-every-few-years problem is a subsystem that is inadequate for clustered services.
15:39 kmai007 my scenario is
15:40 kmai007 a webserver has both FUSE and glusterNFS mounts
15:40 kmai007 if a gluster storage server died, the glusterNFS will not be affected if its not the storage its NFS'd to
15:41 kmai007 but the FUSE mount will wait, 42 sec. to ensure its really a troubled storage server.
15:42 JoeJulian No, the nfs server is a client, too, so it will also have a timeout.
15:42 kmai007 during that time, the web server will halt threads to the FUSE mount, which will back up our apache proxy, and it will not serve any static content until the 42 seconds have completed
15:43 gmcwhistler joined #gluster
15:43 kmai007 interestingly, in my test, gluster 2x2, my client has a glusterNFS mount, and if i drop a server that it is not "mounted" to, it doesn't hiccup/wait
15:43 kmai007 i can traverse and ls the mount
15:44 kmai007 which I thought was awesome
15:45 chirino joined #gluster
15:45 kaptk2 joined #gluster
15:46 Andyy2 kmai007: it probably depends on which server you stop. I have the same issue with libgfapi-mounted disk images. if you reboot the right server, your vm will die (linux will unmount root and windows will bsod).
15:47 Andyy2 42 seconds is an eternity in such a situation.
15:47 gmcwhist_ joined #gluster
15:47 kmai007 correct.  so i have a 2x2 distr.-repl., and i ensure  i don't drop the server its claimed to be mounted to as "glusterNFS"
15:47 sijis joined #gluster
15:48 kmai007 and i don't drop 2 of the same replicate pair, and glusterNFS keeps working as usual
15:48 kmai007 Andyy2: have you adjusted the 42 second parameter?
15:49 JoeJulian I was about to do the math and show you how 42 seconds every 3 years is six nines of uptime, but then I realized that the bigger your cluster the higher the likelihood of a ping-timeout. I need to reevaluate my position on this...
15:50 shyam joined #gluster
15:52 Andyy2 I have, but have to wait till tonight for testing (live cluster. can't play with it too much).
15:52 johnbot11 joined #gluster
15:56 zerick joined #gluster
15:57 vpshastry joined #gluster
15:58 sijis i'm seeing an error 'prefix of it is already part of a volume'. it's a new server with a new disk. not sure what it means. http://fpaste.org/91355/40658139/
15:58 glusterbot Title: #91355 Fedora Project Pastebin (at fpaste.org)
16:00 shyam joined #gluster
16:00 coredumb joined #gluster
16:01 coredumb Hello
16:01 glusterbot coredumb: Despite the fact that friendly greetings are nice, please ask your question. Carefully identify your problem in such a way that when a volunteer has a few minutes, they can offer you a potential solution. These are volunteers, so be patient. Answers may come in a few minutes, or may take hours. If you're still in the channel, someone will eventually offer an answer.
16:02 coredumb a couple months ago it was advised to not use ZFS as the underlying FS for gluster, has the situation changed ?
16:03 JoeJulian sijis: It's telling you that the path /opt/gluster-data/brick-gbp1/gbp1 on one of your servers, or a prefix thereof, has already been tagged as having been part of a volume. ,,(path or prefix)
16:03 glusterbot sijis: http://joejulian.name/blog/glusterfs-path-or-a-prefix-of-it-is-already-part-of-a-volume/
16:06 Andyy2 coredumb: I'm using zfs as the underlying system for gluster. it's a rough experience.
16:06 sijis JoeJulian: i was reading that but i'm not even seeing that .glusterfs file at all. and its a new disk which hasn't been reused
16:06 sijis JoeJulian: is there a way to see what part is part of a volume?
16:07 sijis part of the path
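A hedged way to check which directory in that path already carries volume markers (getfattr comes from the attr package; the paths are the ones from the paste above):
    # a directory that has ever been used as a brick carries trusted.glusterfs.volume-id
    getfattr -m . -d -e hex /opt/gluster-data/brick-gbp1/gbp1
    getfattr -m . -d -e hex /opt/gluster-data/brick-gbp1
    getfattr -m . -d -e hex /opt/gluster-data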
16:10 Mo__ joined #gluster
16:11 coredumb Andyy2: how rough :D
16:12 coredumb but as you still use it, feels like it's worth it ^^
16:14 coredumb i mean Andyy2 you trust your production on it ?
16:14 tomased left #gluster
16:14 Andyy2 zfs on linux has issues with deadlocks under memory pressure, and arc cache size problems too (can go way beyond the max_arc_cache parameter). i'm dealing with this now, but some of my bricks cannot survive 24hr operation without locking.
16:14 Andyy2 right now, the troubles outweigh the benefits. I hope this changes soon, or I'm back to raid+lvm for good.
16:17 ctria joined #gluster
16:17 sijis i'm on gluster 3.4.2 if that makes a difference
16:18 sijis gluster volume info says 'no volumes present'
16:19 coredumb Andyy2: which version are you using ?
16:19 coredumb of ZFS ?
16:21 coredumb latest git master tree is solving a lot of deadlocks and performance issues
16:21 coredumb still testing this version as previous patchset were still deadlocking on NFS servers
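For the ARC-size problem Andyy2 mentions, the usual knob on ZFS on Linux is the zfs_arc_max module parameter; a hedged sketch (the 4 GiB figure is only an example):
    # /etc/modprobe.d/zfs.conf -- cap the ARC at 4 GiB
    options zfs zfs_arc_max=4294967296
    # current size vs. limit can be checked at runtime with:
    #   grep -E '^(size|c_max) ' /proc/spl/kstat/zfs/arcstats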
16:23 hagarth joined #gluster
16:26 semiosis well, looks like i'll be upgrading prod from 3.1.7 to 3.4.2 tonight at 4 AM :/
16:26 jobewan joined #gluster
16:26 semiosis JoeJulian: ^^
16:28 raghug joined #gluster
16:29 zaitcev joined #gluster
16:30 Andyy2 coredumb: I'm using 0.6.2 official.
16:31 jruggier1 joined #gluster
16:31 coredumb yeah too old and too full of deadlocks :/
16:33 Andyy2 should I try the latest master? Any data corruption problems?
16:34 jruggiero joined #gluster
16:34 coredumb not that i know of
16:35 coredumb i myself have deadlocks only with NFS :(
16:36 coredumb but was thinking of doing something with glusterfs ...
16:37 Andyy2 Do you exercise the fuse mount enough ? I only get deadlocks during periods of intense writes (i.e. at night, during backups).
16:37 coredumb have some incompatibilities with some old clients mounting NFS on ZFS
16:37 coredumb was wondering if presenting glusterfs instead of zfs directly would fix it
16:38 vpshastry joined #gluster
16:38 Andyy2 I've been using solaris-exported nfs for years. not a single problem there. if you can live with it not being a cluster, it's a pretty solid combination.
16:39 coredumb yeah no solaris here
16:39 Andyy2 you could use omnios.
16:40 coredumb actually old AIX can't list nfs content if on zfs
16:40 coredumb zfs on linux *
16:42 coredumb even EL4 has issues ...
16:42 coredumb as glusterfs would be a different posix fs
16:43 coredumb was wondering if....
16:43 coredumb but i still need inline compression and snapshots so my choice is pretty limited
16:45 jmarley joined #gluster
16:45 Andyy2 yes, limited options. and zfs+gluster really is a neat combo.
16:45 coredumb yep
16:46 coredumb right now i'm doing a zfs raid1 on 2 luns in 2 separated SANs
16:49 coredumb but i have 2 separate servers and i need to manually fail over on secondary to import my pool
16:51 Andyy2 coredumb: any pointer on compiling zfsonlinux from source ? The usual automake/autoconf scripts are failing in my testing server.
16:51 jbrooks left #gluster
16:52 coredumb Andyy2: gimme a sec
16:53 coredumb got some SRPM for you
16:53 Andyy2 wait... I'm a debian guy.
16:53 kmai007 has anybody tread the waters of glusterNFS on UDP?
16:53 kmai007 i know its not supported
16:54 Matthaeus joined #gluster
16:54 kmai007 but i've seen glusterfs builds where it was an option
16:54 coredumb Andyy2: then don't gimme a sec :D
16:54 jruggiero joined #gluster
16:54 Andyy2 :)
16:55 coredumb i can at least list you deps
16:55 coredumb wait a sec
17:00 coredumb Andyy2: kernel-devel only for spl
17:01 coredumb zlib-devel e2fsprogs-devel libblkid-devel for zfs
17:01 nishanth joined #gluster
17:01 Andyy2 thank you.
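A rough sketch of the from-source build being discussed, assuming the upstream zfsonlinux git repositories and the devel packages coredumb lists (kernel-devel for spl; zlib-devel, e2fsprogs-devel, libblkid-devel for zfs). Run as root; a Debian box would need the equivalent -dev packages:
    SRC=$PWD
    git clone https://github.com/zfsonlinux/spl.git
    git clone https://github.com/zfsonlinux/zfs.git
    (cd "$SRC/spl" && ./autogen.sh && ./configure && make && make install)
    (cd "$SRC/zfs" && ./autogen.sh && ./configure --with-spl="$SRC/spl" && make && make install)
    modprobe zfs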
17:02 mattappe_ joined #gluster
17:03 hagarth joined #gluster
17:04 vpshastry joined #gluster
17:05 ricky-ticky joined #gluster
17:06 Andyy2 ok, i compiled spl. doing zfs now.
17:06 Andyy2 I hope this does not blow in my hands :)
17:07 mattappe_ joined #gluster
17:07 primechuck joined #gluster
17:07 raghug joined #gluster
17:07 primechuck joined #gluster
17:08 * coredumb is not responsible if it blows in Andyy2 hands :D
17:08 Andyy2 :D
17:09 coredumb but as the devs advise to use git master branch to the ones reporting deadlock issues ...
17:09 coredumb guess it's a safe bet ;)
17:10 sputnik1_ joined #gluster
17:10 jruggiero joined #gluster
17:11 jbrooks joined #gluster
17:11 Andyy2 Well... I'll play on a test two-node cluster before moving this to production. meanwhile, I have a monit process set up that will drop caches when it detects hung tasks. seems to work, but is ugly as hell.
17:11 jruggier1 joined #gluster
17:11 Andyy2 zfs compiled... ouch. now I have to try it. No sleep tonight I guess.
17:12 coredumb let me know :)
17:13 Andyy2 yep.
17:13 ctria joined #gluster
17:14 jruggiero joined #gluster
17:14 primechuck joined #gluster
17:14 jruggier1 joined #gluster
17:16 jruggier2 joined #gluster
17:18 nightwalk joined #gluster
17:25 jruggiero joined #gluster
17:28 Matthaeus1 joined #gluster
17:34 dusmant joined #gluster
17:35 ricky-ticky1 joined #gluster
17:36 semiosis @3.3 upgrade notes
17:36 glusterbot semiosis: http://vbellur.wordpress.com/2012/05/31/upgrading-to-glusterfs-3-3/
17:38 tdasilva joined #gluster
17:42 jeremia}{ joined #gluster
17:43 jobewan joined #gluster
17:46 jeremia}{ left #gluster
17:47 warci joined #gluster
17:49 sks joined #gluster
17:50 warci anybody awake? :) i'm in deep trouble....
17:51 semiosis i'm awake, also in deep trouble
17:51 warci i've been struggling with nfs issues for a few weeks now and need to get an answer or we will have to abandon gluster
17:51 warci ow? What's your issue?
17:52 warci i really need to understand how the nfs part of gluster handles file permissions
17:53 coredumb Andyy2: from what i remember it was not advised as per zfs acl support
17:54 Andyy2 the problem is xattr handling (dual seek on zfs to get the xattr). you can mitigate with zfs set xattr=sa
17:54 coredumb warci: what's your issue with permissions ?
17:54 semiosis warci: my trusty old 3.1.7 prod cluster is having "issues".  i'm preparing to upgrade to 3.4.2 (latest).
17:55 glusterbot New news from newglusterbugs: [Bug 1084147] tests/bugs/bug-767095.t needs to be more robust. It's failing on long hostnames. <https://bugzilla.redhat.com/show_bug.cgi?id=1084147>
17:55 coredumb Andyy2: ok
17:55 coredumb warci: not that i can help though :)
17:56 warci well, when i access my volumes i get all kinds of freaky errors when i want to see the permissions of a file, for example with ls -l.
17:57 warci when i just use find, or ls, everything is fine
17:59 jruggiero joined #gluster
18:03 rotbeard joined #gluster
18:06 semiosis warci: feel free to put those "freaky errors" on a pastie...
18:10 jobewan joined #gluster
18:21 andreask joined #gluster
18:28 jobewan joined #gluster
18:29 warci semiosis: http://pastie.org/8991963
18:29 glusterbot Title: #8991963 - Pastie (at pastie.org)
18:29 warci it's always like this
18:29 warci random errors on file permissions
18:30 semiosis did you perhaps run different versions of glusterfs together?
18:30 sijis so i'm trying to create a new volume. getting an odd 'its being created in root partition'. these are the complete steps i ran http://fpaste.org/91421/65497411/. can anyone tell me what i obvisouly did wrong
18:30 glusterbot Title: #91421 Fedora Project Pastebin (at fpaste.org)
18:30 warci mmm normally not, i'll verify
18:30 warci anyway, this is when mounted as nfs
18:30 semiosis warci: yeah but still
18:31 warci nah, all are 3.4.2-1
18:31 coredumb what happens on newly created files ?
18:32 warci when just touching a test file: everything is fine
18:32 coredumb how did these files got here ?
18:32 coredumb so you create files it's OK
18:32 semiosis warci: try stopping & starting the volume
18:32 semiosis (which will interrupt access)
18:32 semiosis warci: also check the brick log files
18:32 warci it was data that was already there, i made a volume containing the existing stuff
18:32 kkeithley_ sijis: your fpaste doesn't show that you mounted your lv anywhere, and in particular it's not mounted at /opt/gluster-data/brick-test1
18:33 semiosis ooh
18:33 warci stopped & started many times
18:33 semiosis warci: please ,,(pasteinfo)
18:33 glusterbot warci: Please paste the output of "gluster volume info" to http://fpaste.org or http://dpaste.org then paste the link that's generated here.
18:33 lpabon joined #gluster
18:35 warci http://fpaste.org/91426/96550098/
18:35 glusterbot Title: #91426 Fedora Project Pastebin (at fpaste.org)
18:35 sijis kkeithley_: i havne't :( .. blah, let me do that and retry
18:35 warci semiosis: i tried 2 types of brick: one formatted ext4 and another XFS
18:35 warci both have the same problem
18:36 sijis kkeithley_: i have to mount it to the directory before test1, correct. aka sudo mount -t xfs /dev/vg99/lv_test1 /opt/gluster-data/brick-test1/
18:37 kkeithley_ /bin/mkdir -p /opt/gluster-data/brick-test1/,  mount /dev/vg99/lv_test1 /opt/gluster-data/brick-test1, mkdir /opt/gluster-data/brick-test1/test1,   gluster volume create ....
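The same sequence, one step per line, using the device from sijis' paste (the volume name and hostname are placeholders):
    mkdir -p /opt/gluster-data/brick-test1
    mount -t xfs /dev/vg99/lv_test1 /opt/gluster-data/brick-test1
    mkdir /opt/gluster-data/brick-test1/test1
    gluster volume create test1 server1:/opt/gluster-data/brick-test1/test1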
18:37 warci the logs for the brick and the nfs.log: http://fpaste.org/91430/13965502/
18:37 glusterbot Title: #91430 Fedora Project Pastebin (at fpaste.org)
18:40 warci the thing is, it kinda looks like it's working when i just look at one directory at the time
18:41 sijis kkeithley_: i just forgot the mount. the volume create worked
18:41 warci it's only when i do a recursive ls or i move too quickly it starts giving errors
18:45 sijis kkeithley_: if i get an error creating a volume that says "failed: {path} or a prefix of it is already part of a volume." i did read the julian blog but that didn't work (it's new disks). is there a way to see what it thinks is part of the volume?
18:46 warci trying to create 1000 files of 10MB now... looks like it's working fine, sometimes the process stalls a bit
18:48 kkeithley_ That's telling you that there's already a .glusterfs dir at the brick location or in a parent dir.
18:48 warci could the issue be caused because i'm using data that was already on disk before i created the gluster volume?
18:49 sijis kkeithley_: i checked there's none (will paste)
18:50 semiosis if anyone has an idea, please help... i can telnet from A to B:24007 just fine, but when glusterd on A tries to connect to B:24007 the socket hangs in SYN_SENT
18:50 semiosis kkeithley_: JoeJulian: ^^ HALP
18:50 nightwalk joined #gluster
18:50 semiosis i am using 3.4.2 now
18:50 semiosis but saw the same thing between these hosts with 3.1.7
18:51 semiosis am suspecting some kinda system lib version mismatch
18:51 semiosis or kernel
18:52 sijis kkeithley_: here's the paste http://fpaste.org/91434/55116113/
18:53 glusterbot Title: #91434 Fedora Project Pastebin (at fpaste.org)
18:53 sroy joined #gluster
18:55 JoeJulian semiosis: just got back from morning appts.
18:55 semiosis welcome back
18:55 glusterbot New news from newglusterbugs: [Bug 1084175] tests/bugs/bug-861542.t needs to be more robust. It's failing on long hostnames. <https://bugzilla.redhat.com/show_bug.cgi?id=1084175>
18:55 kkeithley_ semiosis: that's weird.  I did have a situation with build.gluster.org where I couldn't ssh to it from certain RHEL boxes. It was running the CentOS 6.3 kernel. It wouldn't ACK the SYNs. ssh from Fedora boxes were fine.
18:55 kkeithley_ After I updated to a newer kernel then it worked.
18:56 semiosis thanks thats helpful
18:56 semiosis and that seems to have worked here too!!!
18:56 semiosis wow!
18:57 JoeJulian interesting
18:58 semiosis actually not quite
18:58 JoeJulian I bet the telnet connection is from port >= 1024
18:58 semiosis never mind
18:58 semiosis i thought i had removed puppet but it was still there & reverted me to 3.1.7
18:58 semiosis bah
18:59 sijis kkeithley_: any suggestion?
19:00 kkeithley_ ping the two hosts? They have different IP addrs?
19:00 JoeJulian sijis: When you tried and failed to create your volume, the brick that didn't fail was marked with those extended attributes. Now it will raise that error.
19:01 JoeJulian That's why, earlier, I said "all servers".
19:01 toastedpenguin joined #gluster
19:03 toastedpenguin anyone deployed gluster in AWS EC2?
19:03 sijis JoeJulian: let me retry again
19:05 JoeJulian Our resident AWS expert is battling his own problem at the moment and may not be super responsive until it's solved. But you should really just ask your question and see if someone has an answer.
19:06 toastedpenguin its been a long time since I have used gluster and my current company is deployed 100% in AWS, would it be a good candidate to be used as storage for backups e.g. DB etc. for Windows based instances?
19:07 vpshastry left #gluster
19:07 semiosis why is a host saying 'unable to find friend host.name' where host.name is the local machine name
19:09 toastedpenguin I know gluster supports CIFS/SMB so I shouldn't have an issue getting the windows servers access to the file share, but with it running in AWS is it a viable storage solution for backups?  My other options seems to be running an instance with mirrored EBS but I would like to use something scalable and Linux based if possible
19:09 sijis JoeJulian: weird. that worked this time.  --- ahh. i was just using the xfs mount path, not the brick path. doh!
19:09 JoeJulian excellent
19:10 sijis JoeJulian: so the idea that it's a new disk isn't necessarily always the case (i wasn't trying to reuse any brick)
19:11 sijis JoeJulian: is it possible that during the volume create process something gets borked and causes this?
19:11 coredumb warci: yes, that was my question, were the datas on disk before you created the gluster volume
19:12 coredumb and as the answer is yes, it definitely may come from that
19:12 JoeJulian toastedpenguin: https://wiki.ubuntu.com/UbuntuCloudDays/23032011/ScalingSharedStorageWebApps
19:12 glusterbot Title: UbuntuCloudDays/23032011/ScalingSharedStorageWebApps - Ubuntu Wiki (at wiki.ubuntu.com)
19:12 coredumb you better put back your datas on the volume
19:13 warci coredumb: yes, the data was already there.... so i'll try to copy some stuff over
19:13 JoeJulian semiosis: Wasn't that the localhost name resolution problem?
19:14 semiosis JoeJulian: i have the hostname mapped to 127.0.0.1 in /etc/hosts... is that bad?
19:14 coredumb warci: iirc it's not advised to touch datas directly on the partition without getting through the gluster volume
19:15 JoeJulian semiosis: shouldn't be.
19:15 toastedpenguin JoeJulian:  thx, for the link
19:17 semiosis that's really old
19:17 semiosis toastedpenguin: ^
19:17 semiosis 3 years
19:17 warci coredumb: i know, but i created the volume on top of the data, and afterwards i only went through te gluster mount
19:17 warci and from a gluster point of view everything is fine
19:17 warci it's just nfs that complains
19:18 coredumb yep but i think thats why
19:18 coredumb easy to confirm :)
19:19 semiosis afk
19:20 warci ok... testing right away
19:20 lalatenduM joined #gluster
19:23 JoeJulian semiosis: yes.
19:24 warci coredumb: starting to think you're right
19:25 JoeJulian semiosis: It does a getaddrinfo then a getnameinfo on that address. If the hostname resolved from that address doesn't match, it fails. Since your hostname resolves to 127.0.0.1 and 127.0.0.1 probably resolves to localhost, it looks like that will fail.
19:25 JoeJulian If I'm reading this routine correctly.
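A hedged /etc/hosts sketch for the resolution issue JoeJulian describes: keep the node's own hostname off 127.0.0.1 and map each peer name to its LAN address (the hostnames are invented; the addresses reuse the ones seen elsewhere in this log):
    # /etc/hosts on each gluster server
    127.0.0.1       localhost
    10.187.33.254   gluster-a
    10.237.153.183  gluster-b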
19:26 warci coredumb: looking good!! I think this is causing the problem.... damnit.... i even read somewhere on the internet that this shouldn't pose any issues
19:26 coredumb warci: cool then that's easy to fix :)
19:26 warci THE INTERNET LIED
19:27 warci yeah, i'm copying a whole bunch now to be sure it works
19:30 coredumb :)
19:31 warci looking good sofar... i think you saved my ass
19:31 coredumb Andyy2: how's tests going?
19:31 warci and that after 2 weeks of testing an trying stuff..... my god
19:32 coredumb ^^
19:32 JoeJulian If you start with a pre-loaded brick it officially results in "undefined behavior". That undefined behavior usually results in having a pre-loaded volume IF the pre-loaded brick is the left-hand brick in a replica set. It does still require either doing a lookup on each file through a fuse client ("find $fuse_mount_path"), or running "volume heal ... full" and waiting until that's done.
19:33 JoeJulian ... but only 1 brick can be pre-loaded, otherwise you may end up with gfid clashes and other "bad stuff".
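Spelled out, the two follow-up steps JoeJulian names (mount path and volume name are placeholders):
    # either walk the volume through a fuse mount to trigger lookups/self-heal...
    find /mnt/myvol >/dev/null
    # ...or ask for a full heal and watch its progress
    gluster volume heal myvol full
    gluster volume heal myvol info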
19:34 coredumb warci: how do you handle your NFS ?
19:34 coredumb do you use a VIP to have fully redundant setup?
19:35 warci nope, just straight access to the server
19:35 coredumb ok
19:35 warci the plan is to put it in a dfs
19:35 coredumb windows DFS?
19:36 warci yeah, apparently it works :)
19:36 warci already did some tests
19:36 coredumb ok
19:36 coredumb as NFS or CIFS ?
19:36 warci the nfs part of the server is only for windows and aix clients
19:36 warci we don't want to add samba
19:37 coredumb ok cool
19:37 warci it's just to give easy read-only access to those machines
19:37 warci so nfs is perfect, no extra config
19:37 coredumb are you happy with network perfs ?
19:37 semiosis JoeJulian: thanks!
19:37 warci perfs are not important :)
19:37 coredumb don't know which kind of file size you have
19:37 coredumb hehe ok
19:37 coredumb lucky you :D
19:37 warci random load, mostly small files
19:37 warci but not a lot
19:38 warci already using the gluster part in production and we're happy
19:38 warci but again, performance is not critical
19:38 coredumb what kind of total size you're looking at ?
19:38 warci 30TB
19:38 coredumb ok
19:39 warci it's now on an IBM xiv san
19:39 JoeJulian dfs works with nfs3?
19:39 warci yup!
19:39 warci just need to install windows nfs client
19:39 coredumb windows 2012 ?
19:39 semiosis still having this tcp connection problem, even with matched kernels
19:39 warci and then it's fully transparent
19:39 JoeJulian wow. I thought it was smb only.
19:39 warci yeah, we're really lucky it works
19:39 JoeJulian semiosis: tcpdump the remote end and check that it's actually sending it's SYN_ACK
19:40 warci we have some mixed windows & linux volumes, previously on a netapp
19:40 diegows joined #gluster
19:40 semiosis JoeJulian: [S] is responded to with [S.] -- whats [S.] mean?
19:41 coredumb warci: gives me some ideas
19:41 coredumb we're also migrating a netapp
19:41 coredumb :D
19:41 warci gotta say, the only issue we have
19:41 warci and for us it's not an issue
19:41 warci is that windows users by default get a uid of -2
19:42 coredumb uh
19:42 warci you need to add some AD schema things if you want to get correct uid's
19:42 coredumb oh yeah get that
19:42 warci but we just have the "others" group as read only, so it's fine for us
19:42 warci but a bit crude
19:42 coredumb yeah clearly
19:42 coredumb i'd be fired proposing that :D
19:42 warci haha :D
19:43 JoeJulian semiosis: S. is the ACK reply. The alternative would be R.
19:43 warci anyway, we had some really bad permission issues with netapp cifs/nfs, that's why we try to avoid mixing where possible
19:43 JoeJulian More accurately it would be "R". (not "R." as it may have been misinterpreted)
19:43 warci and that crappy solution is just because we have some homebrew app that needs windows
19:44 warci the rest is all clean
19:44 coredumb indeed
19:44 JoeJulian semiosis: So your server is receiving the SYN, replying with the ACK, but the initiator is not receiving the ACK.
19:44 JoeJulian semiosis: Could something on the new server be blocking incoming packets destined for privileged ports?
19:45 JoeJulian nc -p 1000 {oldserver} 24007
19:45 theron joined #gluster
19:45 * JoeJulian bet that hangs.
19:45 toastedpenguin semiosis: the specifics of the config or the general support of being in the AWS?
19:47 semiosis toastedpenguin: the aws info is still relevant
19:47 semiosis toastedpenguin: but the gluster info is for 3.1, which is old
19:47 semiosis some is still relevant
19:49 theron_ joined #gluster
19:50 JoeJulian lol... for semiosis "old" is yesterday.
19:51 toastedpenguin semiosis:ok cool, I'm a Linux guy by nature supporting a Windows server env, .NET developers....and I hate the idea of using a Windows server as a file server for storing backups or FTP storage location so having a better solution is what I am after
19:52 rotbeard joined #gluster
19:58 mfs joined #gluster
19:58 coredumb how can a linux guy by nature can be supporting a windows env ?
19:59 coredumb are you able to live happily ?
19:59 semiosis JoeJulian: get this... i wiped the whole /var/lib/glusterd on both servers then rebooted them.  now here's the tcpdump of trying to probe from one to the other: http://pastie.org/private/l4h7ehhaih8x562x7ym2a
19:59 glusterbot Title: Private Paste - Pastie (at pastie.org)
19:59 semiosis syn, synack, over & over
19:59 semiosis getting ready to scrap this whole upgrade idea & just build a new cluster & move the disks over
20:00 semiosis i have until 12 AM tonight to come up with a plan, thats when my prod maint window opens
20:01 semiosis the probe finally failed: peer probe: failed: Probe returned with unknown errno 107
20:03 JoeJulian transport endpoint not connected.
20:03 semiosis after so many failed tcp connections
20:04 JoeJulian So it shows the syn coming in, the ack going out... out... OUT dammit!... A syn going out, and the ack coming back.
20:04 semiosis yep
20:05 JoeJulian semiosis: but if you telnet it works. That same tcpdump on telnet will result in it "hearing" the ack.
20:05 semiosis right
20:06 andreask joined #gluster
20:06 semiosis http://pastie.org/private/txyjje3f89ts7zntqj2ng
20:06 glusterbot Title: Private Paste - Pastie (at pastie.org)
20:06 semiosis telnet
20:06 semiosis notice the high source port
20:07 JoeJulian semiosis: nc -p 1000 "10.237.153.183 24007" from 10.187.33.254 (I expect it to hang until it times out).
20:07 JoeJulian heh, bad quote placment
20:07 JoeJulian semiosis: "nc -p 1000 10.237.153.183 24007" from 10.187.33.254 (I expect it to hang until it times out).
20:07 semiosis first give me a nc command that is supposed to work
20:07 JoeJulian :P
20:07 semiosis i tried with -p 1000 & -p 10000 and both hung
20:07 semiosis i'm not too familiar with nc
20:08 JoeJulian -p specifies the source port.
20:08 JoeJulian Check the tcpdump. It probably just "looks" like it's hung becuase it's not very verbose.
20:08 semiosis ok, going to try 10000 first
20:11 semiosis confirmed
20:11 semiosis the 1000 source port cant establish a connection
20:11 semiosis why could that be?
20:12 toastedpenguin is there documentation on a minimum or recommended configuration for gluster deployments?
20:12 JoeJulian semiosis: Can only be a networking issue.
20:13 mattap___ joined #gluster
20:13 JoeJulian semiosis: if it's iptables I'm going to hit you over the head with something....
20:13 gdubreui joined #gluster
20:14 semiosis JoeJulian: and i'd deserve it
20:14 semiosis but it's not iptables
20:14 semiosis no rules
20:16 semiosis JoeJulian: also, why does tcpdump say invalid checksum on all the packets sent by glusterd???
20:16 JoeJulian semiosis: Not sure, but mine says that too.
20:16 semiosis kkeithley_: ??
20:17 JoeJulian ndevos might know more
20:18 JoeJulian semiosis: So if it's not Linux, it must be AWS.
20:18 Matthaeus1 semiosis, would you be willing to run a quick test for me?
20:18 semiosis Matthaeus1: maybe, whats up?
20:18 Matthaeus1 yes | nc -l -p 10000
20:18 Matthaeus1 then telnet host 10000 and see what you get.
20:18 Matthaeus1 Just to rule out firewall/linux networking issues.
20:18 semiosis Matthaeus1: sure
20:18 JoeJulian Matthaeus1: 10k works. It's the <1024 that fail.
20:19 Matthaeus1 Then yes | sudo nc -l -p 666
20:20 primechu_ joined #gluster
20:20 Matthaeus1 And preferably replace 666 with whatever port is actually giving you trouble.
20:20 JoeJulian We've confirmed more than one privileged port.
20:21 JoeJulian ssh works, so it's not all of them.
20:21 Matthaeus1 No, but if you can isolate the problem privileged port outside of gluster, that chops your problem space neatly in half.
20:21 JoeJulian yep
20:21 semiosis Matthaeus1: works
20:22 semiosis trying 666 now
20:22 Matthaeus1 semiosis: Using the same problem port?
20:22 JoeJulian semiosis: Is the nc -l on 254?
20:22 Matthaeus1 666 was arbitrary.
20:22 semiosis 666 works
20:23 Matthaeus1 Problem is almost certainly within gluster, then?
20:23 Matthaeus1 I'm going to go back to the sidelines for a bit and keep watching.  Thanks for humoring me.
20:23 JoeJulian No, wait... if you tcpdump from .33.254 you don't see the ACK though.
20:23 semiosis JoeJulian: all my tests have been from 10.187.33.254 to 10.237.153.183, but pretty sure i get the same problem other direction
20:23 semiosis Matthaeus1: sure
20:24 mfs anyone else tried running on 100+ servers with glusterfs 3.4.2 on centos 6.5 already?
20:24 JoeJulian semiosis: reverse the direction of that nc/telnet test.
20:24 semiosis ok
20:24 semiosis with port 666?
20:24 JoeJulian yes
20:24 nightwalk joined #gluster
20:25 semiosis works
20:26 Matthaeus1 Just to verify:  gluster uses tcp exclusively, yes?
20:26 JoeJulian wtf...
20:26 JoeJulian yes
20:26 semiosis wtf is what i'm sayin
20:26 semiosis this defies all reason
20:26 Matthaeus1 Ignore the syn/ack thing, then.  TCP is handled by the kernel and so is the 3-way handshake.  It's a red herring.
20:27 semiosis Matthaeus1: i know, thats why I upgraded the kernels to match on both boxes
20:27 Matthaeus1 sanity check:  uname -a on both?
20:27 semiosis although I think processes can do their own tcp
20:27 semiosis and gluster might be doing that
20:27 semiosis Matthaeus1: same uname
20:27 Matthaeus1 semiosis: just checking
20:27 JoeJulian how can a packet not appearing at the destination be a red herring?
20:28 semiosis at this point i've eliminated all other possibilities besides the red herrings
20:28 semiosis :)
20:29 JoeJulian If the ack gets sent by 10.237.153.183 and doesn't arrive at 10.187.33.254 how can that be a software or kernel issue?
20:29 Matthaeus1 Using tcpdump with the nc/telnet test, do you see the expected ACK packet?
20:30 semiosis all tests work with nc/telnet, so expected packets must be there, right?
20:30 semiosis let me try the gluster probe again, tcpdumping both sides
20:35 calston semiosis: o hai, I use your PPA. thanks
20:35 semiosis calston: \o/
20:35 semiosis yw
20:39 semiosis ok get this... A sends SYN to B:24007, outgoing checksum is 0xD18B which is incorrect.  B:24007 receives the SYN with corrected checksum 0x0A8A
20:39 semiosis B:24007 responds with SYNACK with incorrect checksum, 0xD18B (seen this before!) which never arrives back at A
20:40 semiosis ALL of the packets sent by glusterd, ON BOTH SIDES, have checksum 0xD18B
20:40 semiosis that ain't the kernel!
20:40 semiosis does this make sense to anyone?
20:40 JoeJulian ... actually, that triggers something I remember from way back.
20:40 warci another question, now that my nfs issue is solved: is there any risk using nfs and gluster on the same volume? Apparently red hat doesn't support this?
20:41 JoeJulian I think the checksum is calculated by the kernel after tcpdump sees it.
20:41 Joe630 don't do that.
20:41 Joe630 warci: you can use an nfs client on a gluster volume
20:41 Joe630 but do not fun nfsd and gluster on the same system
20:41 Joe630 do not *run
20:42 JoeJulian warci: glusterfs provides an ,,(nfs) server. There is no issues using nfs and fuse mounts on the same volume.
20:42 glusterbot warci: To mount via nfs, most distros require the options, tcp,vers=3 -- Also an rpc port mapper (like rpcbind in EL distributions) should be running on the server, and the kernel nfs server (nfsd) should be disabled
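A hedged example of the mount the bot describes, with server, volume and mount point as placeholders (Gluster's built-in NFS server speaks NFSv3 over TCP):
    # on the server: rpcbind running, kernel nfsd disabled (see the bot's note above)
    mount -t nfs -o vers=3,tcp server1:/myvol /mnt/myvol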
20:42 warci aha, thanks guys, that was what i needed to know
20:42 JoeJulian warci: Which I /think/ is probably what you were asking.
20:42 warci true
20:43 Joe630 ,,(rustytrombone)
20:43 glusterbot Joe630: Error: No factoid matches that key.
20:43 warci but i read in the red hat storage doc it was not supported
20:43 warci but maybe they mean running the kernel nfs
20:43 Joe630 ooh, good bot.
20:43 JoeJulian warci: there's a lot of stuff that works that isn't "supported" by red hat.
20:44 elico joined #gluster
20:44 warci hehehe :D
20:44 JoeJulian Which means that there may be some difficulty either explaining to customers in a way that they can understand, training support, training engineers, or just a lack of qualified internal testing by them.
20:44 warci yeah, have to say, reading their document doesn't fill me with confidence :)
20:45 warci but anyway, i'll do some extensive testing to be sure
20:45 JoeJulian What their documents say is that "We will take your money and we will put our reputation behind the fact that X will work."
20:45 Joe630 warci, can you point me to the doc?  i may be able to help understand
20:45 Joe630 and it might help me understand
20:49 warci https://access.redhat.com/site/documentation/en-US/Red_Hat_Storage/2.1/pdf/Administration_Guide/Red_Hat_Storage-2.1-Administration_Guide-en-US.pdf -----> page 54
20:49 warci Red Hat does not support simultaneous data access through fuse-mount and nfs-mount
20:53 warci they also won't support a filesystem that's not on LVM.... huh
20:53 warci some weird stuff going on
20:56 twx joined #gluster
20:59 badone joined #gluster
21:00 LoudNoises joined #gluster
21:11 warci anyway guys, thanks for all the info, i'm off to bed!
21:11 coredumb gnite warci :)
21:18 shyam joined #gluster
21:36 semiosis i am remembering now that rshade98 had a similar problem to what i'm going through now
21:36 semiosis https://botbot.me/freenode/gluster/msg/12274846/ - "it's saying connection timed out. But the netcat to the same port returns"
21:36 glusterbot Title: Logs for #gluster | BotBot.me [o__o] (at botbot.me)
21:39 semiosis wow
21:40 semiosis i switched to a different instance type and now it's working
21:40 semiosis w o w
21:40 Matthaeus1 semiosis: what instance type were you on?
21:41 semiosis when both hosts are m1.large, all good.  when one or both are c3.large, glusterd can't establish tcp connections (though telnet/nc work)
21:42 semiosis something to do with the low number ports
21:42 semiosis maybe
21:42 semiosis idk
21:42 Matthaeus1 Are you in EC2 classic or in a VPC?
21:42 semiosis classic
21:44 ckannan joined #gluster
21:44 semiosis but i've been using a c3.large for months, this problem only showed up yesterday when one of them died & I replaced it
21:45 semiosis none of this makes any sense
21:46 semiosis @seen rshade98
21:46 glusterbot semiosis: rshade98 was last seen in #gluster 1 week, 6 days, 3 hours, 5 minutes, and 6 seconds ago: <rshade98> semiosis, which repo you want to me use for gluster for debian?
21:46 semiosis @later tell rshade98 i think i ran into the same problem with current generation ec2 instance types as you reported, https://botbot.me/freenode/gluster/msg/12275130/
21:46 glusterbot semiosis: The operation succeeded.
21:47 semiosis he was using m3 i'm using c3, so not strictly the type, but the generation
21:54 rshade98 joined #gluster
21:55 rshade98 Hey Semiosis
21:55 rshade98 what's up man
21:55 semiosis i have been struggling all day with gluster and remembered you had a similar issue which turned out to be instance type
21:56 semiosis i was using c3.large for a server and glusterd couldn't establish a tcp connection to another glusterd
21:56 semiosis i switched to an m1.large and then it worked
21:56 semiosis did you ever get any feedback from amazon re: your ticket?
21:57 rshade98 yes, they said 6 weeks to fix
21:57 semiosis wow
21:57 rshade98 anything *3
21:58 rshade98 there are a couple of tcp settings you can do
21:58 semiosis was this through their premium support? https://aws.amazon.com/premiumsupport/
21:58 glusterbot Title: AWS Support (at aws.amazon.com)
21:58 rshade98 but in the end, it was not worth it to us
21:58 rshade98 Yes, it was premium
21:59 semiosis cool
21:59 rshade98 do you want those settings?
21:59 semiosis can you provide me with the tcp settings?
21:59 nightwalk joined #gluster
21:59 semiosis in a pastie or email or something?
22:01 rshade98 http://pastie.org/8992564
22:01 glusterbot Title: #8992564 - Pastie (at pastie.org)
22:02 semiosis interesting
22:02 semiosis heh, you already told me this!  https://botbot.me/freenode/gluster/msg/12283024/
22:02 glusterbot Title: Logs for #gluster | BotBot.me [o__o] (at botbot.me)
22:02 semiosis wish i remembered this earlier
22:03 rshade98 haha it happens
22:05 rshade98 I have started a very basic cookbook. https://github.com/RightScale-Services-Cookbooks/gluster
22:05 glusterbot Title: RightScale-Services-Cookbooks/gluster · GitHub (at github.com)
22:06 semiosis rshade98: did that ethtool/ifconfig command work for you?  doesnt look promising to me
22:08 rshade98 it worked for our ops guys for their stuff, we did some basic testing and it seemed alright. you have to hack /etc/network/interfaces if you want it to survive a reboot
22:08 semiosis i'll poke at it a bit more, thanks
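As far as I understand rshade98's pastie, the workaround amounts to turning off NIC offload features on the instances; a hedged sketch assuming the interface is eth0:
    # disable tx checksumming, scatter-gather and TSO on the EC2 interface
    ethtool -K eth0 tx off sg off tso off
    # to survive a reboot on Ubuntu, add to the eth0 stanza in /etc/network/interfaces:
    #   post-up /sbin/ethtool -K eth0 tx off sg off tso off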
22:08 rshade98 since we were going to m3 for cost savings I went back to m1.
22:09 rshade98 Since gluster can scale horizontally, I really like that idea
22:10 semiosis prev. gen. is so expensive compared to current gen
22:13 qdk joined #gluster
22:17 rshade98 yeah, when comparing horsepower very true
22:18 japuzzo joined #gluster
22:19 DV joined #gluster
22:20 semiosis rshade98: odd thing for me is that i only run into this problem on the gluster-gluster connection between servers.  client-server connections work fine and I have a lot of c3.large client machines
22:20 semiosis problem occurs when one or both of the servers is a c3.large
22:21 rshade98 Yep, I think it's because both disobey
22:21 rshade98 when one obeys the window, its fine
22:21 Durzo hey guys, chiming in here. i have ~7 m3.large instances that dont have those settings and are just fine
22:22 tdasilva joined #gluster
22:24 semiosis Durzo: like i said i've been using c3.large for months without issue
22:24 semiosis yesterday one died so i replaced it and now every c3.large i launch has this problem
22:24 Durzo oh, so its a new issue on new instances
22:24 Durzo hmm
22:24 semiosis ???
22:25 Durzo only effects newly launched instances?
22:25 Durzo which region are you launching them in?
22:26 semiosis first time i heard of a problem like this (still not sure it's the same thing) was when rshade98 reported it on March 18
22:26 semiosis https://botbot.me/freenode/gluster/msg/12278052/
22:28 Durzo interesting
22:28 rshade98 Amazon said it is intermittent. Which makes it more annoying
22:29 rshade98 we found it kills all sorts of things also, outbound db connections, haproxy connections
22:29 semiosis this ethtool/ifconfig does nothing for me :(
22:29 Durzo its strange that turning off tcp checksum offloading fixes it in a virtual environment
22:29 semiosis Durzo: it's not helping me
22:29 rshade98 did you set it on both
22:30 semiosis rshade98: yes, i have two test systems, an m1.large & a c3.large
22:30 semiosis set it on both
22:30 semiosis no help
22:30 rshade98 hmmm very weird
22:30 Durzo have you tried turning off "Source/Dest. Check" ?
22:31 rshade98 I remember it still being wishy washy though(technical term)
22:31 rshade98 I gave up to quickly probably
22:31 semiosis heh
22:31 semiosis Durzo: where would i do that?
22:31 Durzo right click the instance in ec2
22:32 Durzo you can also do it on the Elastic Network Interface
22:32 semiosis Durzo: that menu choice is disabled.  maybe it's a vpc thing?  i'm in classic
22:32 semiosis yep, ENI is VPC
22:32 semiosis i dont have that
22:32 Durzo should be available in classic in the ec2 instance list itself
22:32 semiosis nope
22:32 Durzo i have a classic console i can log into gimme a sec
22:33 semiosis take all the time you need :)
22:33 Matthaeus1 It's disabled in my ec2 classic console, just like a bunch of other stuff that has to be selected at provision-time.
22:33 Durzo yep its still there but greyed out, so looks like the instance needs to be stopped for it to become available
22:34 Matthaeus In VPC, you can change a bunch of stuff with the instance running, including src/dest check.
22:34 Durzo try stopping the instance and it should become available
22:35 semiosis trying
22:41 semiosis stopped, still not available in the menu
22:41 semiosis pretty sure that's vpc only
22:41 Durzo fair enough
22:42 semiosis in any case, time to spin up a full scale copy of prod and practice the 3.1.7-3.4.2 upgrade
22:42 Durzo maybe thats the key then.. most of mine are VPC
22:42 semiosis thx for the suggestion though
22:42 Durzo maybe its only classic hypervisors that are affected?
22:42 Durzo if its a networking issue, the differences in network on a VPC cloud are pretty major
22:44 semiosis interesting
22:45 Durzo all my classic clouds only run m1 instances
22:52 Durzo in other news i just received an email from amazon promoting the FireTV and this was the image they used: http://g-ecx.images-amazon.com/images/G/01/kindle/merch/2014/campaign/KB/KB-Mail-RealPeople_17v2.jpg
22:56 mattappe_ joined #gluster
23:06 fidevo joined #gluster
23:09 mattappe_ joined #gluster
23:13 seapasulli joined #gluster
23:32 nightwalk joined #gluster
23:54 crazifyngers joined #gluster
