
IRC log for #gluster, 2014-11-21


All times shown according to UTC.

Time Nick Message
00:01 jackdpeterson weird issue -- both of my peers can see each other in the peer status. but when creating the volume I'm getting the error host (peer1) is not connected.
00:04 jackdpeterson [2014-11-21 00:04:37.894969] E [glusterd-handshake.c:1644:__glusterd_mgmt_hndsk_version_cbk] 0-management: failed to get the 'versions' from peer (10.0.5.11:24007)
00:17 nishanth joined #gluster
00:44 calisto joined #gluster
00:50 sputnik13 joined #gluster
00:53 sputnik13 joined #gluster
01:02 ildefonso joined #gluster
01:16 B21956 joined #gluster
01:19 sputnik13 joined #gluster
01:22 sputnik13 joined #gluster
01:29 calisto joined #gluster
01:30 newdave joined #gluster
01:38 sputnik13 joined #gluster
01:41 harish_ joined #gluster
01:46 bala joined #gluster
01:57 newdave hi all, i'm working on implementing a 2 node replicated cluster setup to replace our existing NFS server for HA reasons
01:57 newdave the existing NFS storage is used for shared storage of temp files generated by a cluster of tomcat app servers
01:57 newdave lots of small files
01:58 newdave i understand gluster will be somewhat slow when dealing with this sort of disk usage, but can anyone provide any recommended gluster/system config/tuning options?
02:00 newdave this is using gluster 3.4 running on ubuntu 14.04
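
A hedged starting point for small-file workloads like this, assuming gluster 3.4 option names and a hypothetical volume name "tcvol"; the values are only rough suggestions to benchmark against the defaults:

    # hypothetical volume "tcvol"; check each option with "gluster volume set help" on your version
    gluster volume set tcvol performance.cache-size 256MB
    gluster volume set tcvol performance.io-thread-count 32
    gluster volume set tcvol performance.write-behind-window-size 4MB
    # small-file cost is mostly lookups/metadata, so leaving quick-read and stat-prefetch
    # at their enabled defaults usually helps; confirm the resulting option list:
    gluster volume info tcvol
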
02:06 haomaiwa_ joined #gluster
02:07 plarsen joined #gluster
02:12 plarsen joined #gluster
02:17 plarsen joined #gluster
02:18 topshare joined #gluster
02:45 kdhananjay joined #gluster
02:50 meghanam joined #gluster
02:50 meghanam_ joined #gluster
02:51 hamcube joined #gluster
03:16 hagarth joined #gluster
03:19 meghanam_ joined #gluster
03:19 meghanam joined #gluster
03:34 kumar joined #gluster
03:44 bharata-rao joined #gluster
03:45 kanagaraj joined #gluster
03:51 RameshN joined #gluster
03:53 jackdpeterson What's the correct operating version for the latest EPEL7 centos 3.6.1-1 release?
03:53 jackdpeterson 30601 appears to be incorrect
03:54 maveric_amitc_ joined #gluster
04:10 soumya joined #gluster
04:17 atinmu joined #gluster
04:20 bala joined #gluster
04:21 nbalachandran joined #gluster
04:24 shubhendu joined #gluster
04:24 nishanth joined #gluster
04:31 ndarshan joined #gluster
04:32 aravindavk joined #gluster
04:35 anoopcs joined #gluster
04:40 kshlm joined #gluster
04:42 RameshN joined #gluster
04:47 pp joined #gluster
04:50 anil joined #gluster
04:52 rafi1 joined #gluster
04:53 meghanam_ joined #gluster
04:54 meghanam joined #gluster
04:54 lalatenduM joined #gluster
04:55 lalatenduM_ joined #gluster
05:01 atalur joined #gluster
05:08 atinmu joined #gluster
05:09 kdhananjay joined #gluster
05:18 saurabh joined #gluster
05:19 ppai joined #gluster
05:24 sahina joined #gluster
05:26 jiffin joined #gluster
05:27 spandit joined #gluster
05:50 anil joined #gluster
05:50 deepakcs joined #gluster
06:07 getup joined #gluster
06:11 nshaikh joined #gluster
06:15 sahina joined #gluster
06:16 shubhendu joined #gluster
06:22 nishanth joined #gluster
06:26 glusterbot News from newglusterbugs: [Bug 1166505] mount fails for nfs protocol in rdma volumes <https://bugzilla.redhat.com/show_bug.cgi?id=1166505>
06:28 glusterbot News from resolvedglusterbugs: [Bug 1166503] mount fails for nfs protocol in rdma volumes <https://bugzilla.redhat.com/show_bug.cgi?id=1166503>
06:30 hagarth joined #gluster
06:31 zerick joined #gluster
06:31 newdave joined #gluster
06:32 bharata-rao joined #gluster
06:41 getup joined #gluster
06:43 maveric_amitc_ joined #gluster
06:44 kdhananjay joined #gluster
06:49 msciciel_ joined #gluster
06:53 ppai joined #gluster
06:56 glusterbot News from newglusterbugs: [Bug 1166515] [Tracker] RDMA support in glusterfs <https://bugzilla.redhat.com/show_bug.cgi?id=1166515>
06:56 soumya joined #gluster
06:58 ctria joined #gluster
07:04 corretico joined #gluster
07:10 hagarth joined #gluster
07:12 kdhananjay joined #gluster
07:13 LebedevRI joined #gluster
07:22 basso joined #gluster
07:24 basso Hello. Anyone done backups of a gluster datastore with backuppc?
07:26 basso I am thinking of using that with rsync, could it be okay to rsync the /datastore folder directly, or should I mount the gluster and rsync that instead?
07:28 basso Hmm, mounting the gluster datastore on the backuppc server seems more reasonable, since I only need backup of the samba share files
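
If backing up through the filesystem, a minimal sketch would be to mount the volume on the backuppc host and point rsync at the mount, not at a brick's /datastore directory (hostname "gluster1", volume name "datastore" and paths are assumptions):

    # mount the volume with the native client on the backup host
    mount -t glusterfs gluster1:/datastore /mnt/datastore
    # rsync the volume's view rather than one brick's files plus its .glusterfs metadata
    rsync -aAX /mnt/datastore/ /backups/datastore/
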
07:29 shubhendu joined #gluster
07:30 nishanth joined #gluster
07:36 Fen1 joined #gluster
07:41 T0aD joined #gluster
07:42 sahina joined #gluster
07:43 rafi1 joined #gluster
07:43 topshare joined #gluster
07:53 ndarshan joined #gluster
07:56 ppai joined #gluster
08:13 hagarth joined #gluster
08:19 ndarshan joined #gluster
08:19 mbukatov joined #gluster
08:21 deniszh joined #gluster
08:21 nbalachandran joined #gluster
08:25 fsimonce joined #gluster
08:32 SOLDIERz__ joined #gluster
08:33 SOLDIERz joined #gluster
08:33 SOLDIERz joined #gluster
08:36 sharknardo joined #gluster
08:37 liquidat joined #gluster
08:41 warcisan joined #gluster
08:43 vimal joined #gluster
08:44 sahina joined #gluster
08:44 ppai joined #gluster
08:45 kshlm joined #gluster
08:48 newdave joined #gluster
08:49 atinmu joined #gluster
08:52 nishanth joined #gluster
08:53 uebera|| joined #gluster
08:55 ricky-ti1 joined #gluster
08:57 spandit joined #gluster
09:04 sahina joined #gluster
09:08 atalur joined #gluster
09:09 kshlm joined #gluster
09:09 kshlm joined #gluster
09:18 fsimonce joined #gluster
09:30 MrAbaddon joined #gluster
09:34 jiffin joined #gluster
09:37 hagarth joined #gluster
09:37 ppai joined #gluster
09:37 lalatenduM_ joined #gluster
09:38 kumar joined #gluster
09:39 pp joined #gluster
09:41 nishanth joined #gluster
09:41 spandit joined #gluster
09:41 atalur joined #gluster
09:42 sahina joined #gluster
09:46 atinmu joined #gluster
09:48 RameshN joined #gluster
09:48 kanagaraj joined #gluster
09:53 kdhananjay1 joined #gluster
09:58 tryggvil joined #gluster
10:04 ingard joined #gluster
10:04 aravindavk joined #gluster
10:05 ProT-0-TypE joined #gluster
10:06 ingard left #gluster
10:13 ppai joined #gluster
10:22 soumya joined #gluster
10:26 sahina joined #gluster
10:27 kshlm joined #gluster
10:36 tryggvil joined #gluster
10:40 tryggvil joined #gluster
10:49 kdhananjay joined #gluster
10:50 atinmu joined #gluster
10:53 ppai joined #gluster
10:54 ricky-ticky1 joined #gluster
11:04 diegows joined #gluster
11:27 glusterbot News from newglusterbugs: [Bug 1166616] Gluster epel6 32-bit repo broken <https://bugzilla.redhat.com/show_bug.cgi?id=1166616>
11:27 Philambdo joined #gluster
11:38 anil joined #gluster
11:39 rjoseph joined #gluster
11:40 lalatenduM joined #gluster
11:41 soumya_ joined #gluster
11:47 ndarshan joined #gluster
11:48 masterzen joined #gluster
11:54 calum_ joined #gluster
11:58 SOLDIERz joined #gluster
12:05 feeshon joined #gluster
12:16 Slashman joined #gluster
12:18 Debloper joined #gluster
12:22 soumya_ joined #gluster
12:24 calisto joined #gluster
12:27 edward1 joined #gluster
12:35 RameshN joined #gluster
12:39 coreping joined #gluster
12:40 calisto joined #gluster
12:51 julim joined #gluster
12:52 calum_ joined #gluster
13:00 Fen1 joined #gluster
13:05 anoopcs joined #gluster
13:13 diegows joined #gluster
13:14 rjoseph joined #gluster
13:18 sharknardo joined #gluster
13:22 plarsen joined #gluster
13:24 newdave joined #gluster
13:26 plarsen joined #gluster
13:38 plarsen joined #gluster
13:38 tdasilva joined #gluster
13:48 newdave joined #gluster
13:56 theron joined #gluster
13:58 newdave joined #gluster
13:59 julim joined #gluster
13:59 harish_ joined #gluster
14:01 sage_ joined #gluster
14:05 smohan joined #gluster
14:09 newdave joined #gluster
14:11 haomaiwa_ joined #gluster
14:15 schrodinger_ joined #gluster
14:15 VeggieMeat_ joined #gluster
14:15 johnnytran joined #gluster
14:16 Slashman_ joined #gluster
14:16 bene joined #gluster
14:18 masterzen joined #gluster
14:19 capri joined #gluster
14:19 ws2k3 joined #gluster
14:20 ricky-ticky joined #gluster
14:22 B21956 joined #gluster
14:22 newdave joined #gluster
14:23 feeshon joined #gluster
14:27 skippy update on the continuing saga of random client network timeouts:  we have a different cluster, 3 servers in replica 3.  Each is both a brick server and client for this volume.  All are VMs on same subnet.
14:27 topshare joined #gluster
14:27 skippy each server reports intermittent timeouts from its peers, while talking just fine to itself.
14:33 Slash__ joined #gluster
14:44 newdave joined #gluster
14:45 sharknardo just curious... but what hardware / os ? any dropped packets or overruns ?
14:46 skippy VMware guests, RHEL6.5.  Gluster 3.5.2.
14:48 skippy not seeing any dropped packets, so far as I can tell.
14:49 lalatenduM joined #gluster
14:51 skippy although, to be honest, I'm not entirely sure how to understand the output of `netstat -s`.
14:52 skippy https://gist.github.com/skpy/cca6f81b446dd71feb7d
14:52 meghanam_ joined #gluster
14:53 meghanam joined #gluster
14:54 sharknardo you may check with "ethtool -S" if you see some xon/xoff
14:55 sharknardo in my case, i had to enable flow control on the switches... probably because of some bad network card/drivers
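
A hedged way to check for the pause frames and drops sharknardo mentions (the interface name eth0 is an assumption):

    # look for xon/xoff (pause) and drop/error counters on the NIC
    ethtool -S eth0 | grep -iE 'xon|xoff|drop|err'
    # show current flow-control settings; "ethtool -A eth0 rx on tx on" would enable it if the switch cooperates
    ethtool -a eth0
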
15:00 ricky-ticky joined #gluster
15:04 liquidat joined #gluster
15:05 newdave joined #gluster
15:14 wushudoin joined #gluster
15:17 jobewan joined #gluster
15:20 NuxRo I'm seeing a lot of "remote operation failed: Cannot allocate memory" in my nfs.log, i still have some free memory so not sure what to believe
15:20 NuxRo pointers?
15:21 _Bryan_ joined #gluster
15:23 bala joined #gluster
15:29 premera joined #gluster
15:29 ctria joined #gluster
15:30 sprung joined #gluster
15:31 sprung Good morning. I need the best resource for me to take a crash course in gluster administration. I've inherited a system and I'm pressed on time for several issues, the minimum I need to know is if the system is healthy and functioning properly
15:38 virusuy joined #gluster
15:41 hagarth joined #gluster
15:43 mojibake joined #gluster
15:47 Telsin sprung: grab some vms and check this out: http://www.gluster.org/community/documentation/index.php/Getting_started_configure
15:47 newdave joined #gluster
15:48 Telsin shorter answer is "gluster peer status" on all hosts and "gluster volume status" for the very quick overview
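
As a rough sketch, that quick health check might look like the following on any one of the inherited servers (volume name "myvol" is hypothetical; heal info applies to replicated volumes on reasonably recent versions):

    gluster peer status             # every peer should show State: Peer in Cluster (Connected)
    gluster volume info myvol       # volume layout and options
    gluster volume status myvol     # every brick and the NFS/self-heal daemons should show Online: Y
    gluster volume heal myvol info  # pending self-heals on replicated volumes
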
15:55 hamcube joined #gluster
16:01 shubhendu joined #gluster
16:02 neofob joined #gluster
16:12 maveric_amitc_ joined #gluster
16:21 neofob joined #gluster
16:23 sprung joined #gluster
16:26 soumya_ joined #gluster
16:26 CyrilPeponnet Hi guys, we upgraded our gluster setup to 3.5.4 and we are experiencing NFS hangs every minute
16:27 CyrilPeponnet the nfs process is taking lot of CPU from time to time
16:27 CyrilPeponnet looks that we have some RPC issues
16:27 CyrilPeponnet [2014-11-21 16:27:28.257259] W [rpcsvc.c:261:rpcsvc_program_actor] 0-rpc-service: RPC program version not available (req 100003 4)
16:27 CyrilPeponnet [2014-11-21 16:27:28.257282] E [rpcsvc.c:547:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed to complete successfully
16:27 CyrilPeponnet [2014-11-21 16:27:28.521311] E [rpcsvc.c:1258:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x3c8c195, Program: MOUNT3, ProgVers: 3, Proc: 3) to rpc-transport (socket.nfs-server)
16:27 CyrilPeponnet [2014-11-21 16:27:28.521335] E [mount3.c:129:mnt3svc_submit_reply] 0-nfs-mount: Reply submission failed
16:29 ndevos CyrilPeponnet: uh, 3.5.4 has not been released yet, 3.5.3 was released just last week?
16:29 CyrilPeponnet hmm let met check
16:29 CyrilPeponnet 3.5.2
16:29 CyrilPeponnet sorry
16:30 CyrilPeponnet any hint on this ? I have around 170 active nfs connections
16:30 glusterbot News from resolvedglusterbugs: [Bug 1145000] Spec %post server does not wait for the old glusterd to exit <https://bugzilla.redhat.com/show_bug.cgi?id=1145000>
16:30 CyrilPeponnet they are all hanging for 2-5s every minute
16:30 glusterbot News from resolvedglusterbugs: [Bug 1153900] Enabling Quota on existing data won't create pgfid xattrs <https://bugzilla.redhat.com/show_bug.cgi?id=1153900>
16:30 glusterbot News from resolvedglusterbugs: [Bug 1142052] Very high memory usage during rebalance <https://bugzilla.redhat.com/show_bug.cgi?id=1142052>
16:30 glusterbot News from resolvedglusterbugs: [Bug 1081016] glusterd needs xfsprogs and e2fsprogs packages <https://bugzilla.redhat.com/show_bug.cgi?id=1081016>
16:30 glusterbot News from resolvedglusterbugs: [Bug 1125231] GlusterFS 3.5.3 Tracker <https://bugzilla.redhat.com/show_bug.cgi?id=1125231>
16:30 glusterbot News from resolvedglusterbugs: [Bug 1136221] The memories are exhausted quickly when handle the message which has multi fragments in a single record <https://bugzilla.redhat.com/show_bug.cgi?id=1136221>
16:30 glusterbot News from resolvedglusterbugs: [Bug 1147243] nfs: volume set help says the rmtab file is in "/var/lib/glusterd/rmtab" <https://bugzilla.redhat.com/show_bug.cgi?id=1147243>
16:30 glusterbot News from resolvedglusterbugs: [Bug 1149857] Option transport.socket.bind-address ignored <https://bugzilla.redhat.com/show_bug.cgi?id=1149857>
16:30 glusterbot News from resolvedglusterbugs: [Bug 1157661] GlusterFS allows insecure SSL modes <https://bugzilla.redhat.com/show_bug.cgi?id=1157661>
16:30 glusterbot News from resolvedglusterbugs: [Bug 1129527] DHT :- data loss - file is missing on renaming same file from multiple client at same time <https://bugzilla.redhat.com/show_bug.cgi?id=1129527>
16:30 glusterbot News from resolvedglusterbugs: [Bug 1129541] [DHT:REBALANCE]: Rebalance failures are seen with error message " remote operation failed: File exists" <https://bugzilla.redhat.com/show_bug.cgi?id=1129541>
16:30 glusterbot News from resolvedglusterbugs: [Bug 1139103] DHT + Snapshot :- If snapshot is taken when Directory is created only on hashed sub-vol; On restoring that snapshot Directory is not listed on mount point and lookup on parent is not healing <https://bugzilla.redhat.com/show_bug.cgi?id=1139103>
16:30 glusterbot News from resolvedglusterbugs: [Bug 1139170] DHT :- rm -rf is not removing stale link file and because of that unable to create file having same name as stale link file <https://bugzilla.redhat.com/show_bug.cgi?id=1139170>
16:30 glusterbot News from resolvedglusterbugs: [Bug 1140549] DHT: Rebalance process crash after add-brick and `rebalance start' operation <https://bugzilla.redhat.com/show_bug.cgi?id=1140549>
16:30 glusterbot News from resolvedglusterbugs: [Bug 1140556] Core: client crash while doing rename operations on the mount <https://bugzilla.redhat.com/show_bug.cgi?id=1140556>
16:30 glusterbot News from resolvedglusterbugs: [Bug 1136835] crash on fsync <https://bugzilla.redhat.com/show_bug.cgi?id=1136835>
16:30 glusterbot News from resolvedglusterbugs: [Bug 1142614] files with open fd's getting into split-brain when bricks goes offline and comes back online <https://bugzilla.redhat.com/show_bug.cgi?id=1142614>
16:30 glusterbot News from resolvedglusterbugs: [Bug 1153626] Sizeof bug for allocation of memory in afr_lookup <https://bugzilla.redhat.com/show_bug.cgi?id=1153626>
16:30 glusterbot News from resolvedglusterbugs: [Bug 1141558] AFR : "gluster volume heal <volume_name> info" prints some random characters <https://bugzilla.redhat.com/show_bug.cgi?id=1141558>
16:30 glusterbot News from resolvedglusterbugs: [Bug 1153904] self heal info logs are filled with messages reporting ENOENT while self-heal is going on <https://bugzilla.redhat.com/show_bug.cgi?id=1153904>
16:30 glusterbot News from resolvedglusterbugs: [Bug 1147156] AFR client segmentation fault in afr_priv_destroy <https://bugzilla.redhat.com/show_bug.cgi?id=1147156>
16:30 glusterbot News from resolvedglusterbugs: [Bug 1133949] Minor typo in afr logging <https://bugzilla.redhat.com/show_bug.cgi?id=1133949>
16:30 glusterbot News from resolvedglusterbugs: [Bug 1144315] core: all brick processes crash when quota is enabled <https://bugzilla.redhat.com/show_bug.cgi?id=1144315>
16:31 glusterbot News from resolvedglusterbugs: [Bug 1140338] rebalance is not resulting in the hash layout changes being available to nfs client <https://bugzilla.redhat.com/show_bug.cgi?id=1140338>
16:31 glusterbot News from resolvedglusterbugs: [Bug 1141733] data loss when rebalance + renames are in progress and bricks from replica pairs goes down and comes back <https://bugzilla.redhat.com/show_bug.cgi?id=1141733>
16:31 glusterbot News from resolvedglusterbugs: [Bug 1153629] AFR : excessive logging of "Non blocking entrylks failed" in glfsheal log file. <https://bugzilla.redhat.com/show_bug.cgi?id=1153629>
16:31 glusterbot News from resolvedglusterbugs: [Bug 1100204] brick failure detection does not work for ext4 filesystems <https://bugzilla.redhat.com/show_bug.cgi?id=1100204>
16:31 glusterbot News from resolvedglusterbugs: [Bug 1132391] NFS interoperability problem: stripe-xlator removes EOF at end of READDIR <https://bugzilla.redhat.com/show_bug.cgi?id=1132391>
16:31 glusterbot News from resolvedglusterbugs: [Bug 1139245] vdsm invoked oom-killer during rebalance and Killed process 4305, UID 0, (glusterfs nfs process) <https://bugzilla.redhat.com/show_bug.cgi?id=1139245>
16:31 glusterbot News from resolvedglusterbugs: [Bug 1138922] DHT + rebalance : rebalance process crashed + data loss + few Directories are present on sub-volumes but not visible on mount point + lookup is not healing directories <https://bugzilla.redhat.com/show_bug.cgi?id=1138922>
16:31 glusterbot News from resolvedglusterbugs: [Bug 1140348] Renaming file while rebalance is in progress causes data loss <https://bugzilla.redhat.com/show_bug.cgi?id=1140348>
16:31 glusterbot News from resolvedglusterbugs: [Bug 1126801] glusterfs logrotate config file pollutes global config <https://bugzilla.redhat.com/show_bug.cgi?id=1126801>
16:31 glusterbot News from resolvedglusterbugs: [Bug 1155073] Excessive logging in the self-heal daemon after a replace-brick <https://bugzilla.redhat.com/show_bug.cgi?id=1155073>
16:31 * ndevos *cough*
16:31 CyrilPeponnet woo spammy bot :)
16:31 ndevos this spam is new, I just closed the 3.5.3 bugs :)
16:32 CyrilPeponnet our old setup in 3.4.2 works fine under the same load
16:32 ndevos well, you have two different errors
16:32 CyrilPeponnet but the new one (centos7 and gluster 3.5.2) is not reliable at all...
16:32 diegows joined #gluster
16:33 ndevos something tries to mount with NFSv4, and Gluster/NFS only supports NFSv3: RPC program version not available (req 100003 4)
16:33 ndevos 100003=NFS, 4=version
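
Since Gluster/NFS only speaks NFSv3, clients that try v4 first can be pinned to v3 at mount time; a hedged example with hypothetical server and volume names:

    # force NFSv3 over TCP so the client never tries to negotiate v4 against the gluster NFS server
    mount -t nfs -o vers=3,tcp server:/myvol /mnt/myvol
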
16:33 CyrilPeponnet yes I see that... hard to prevent, but it looks like these requests make nfs hang for a while...
16:33 CyrilPeponnet (at least it appears on the log the same time the nfs share hang)
16:34 ndevos maybe the 'hang' has to do with the "0-nfs-mount: Reply submission failed" message, but I do not know why that would fail...
16:36 CyrilPeponnet and another thing: if I run rpcinfo -t server nfs 3 in a loop
16:36 CyrilPeponnet it hangs too when nfs is hanging
16:37 CyrilPeponnet sounds like something get wrong with RPC thing
16:37 ndevos that does not sound like a gluster issue then, rpcinfo is handled by rpcbind, not glusterfs
16:37 CyrilPeponnet dam it
16:38 ndevos but, it does not explain the last error message you get (or at least I can not connect the dots)
16:38 CyrilPeponnet I made a trace log..
16:38 CyrilPeponnet but... chatty
16:38 ndevos a trace of glusterfs?
16:39 CyrilPeponnet yes
16:39 ndevos you could also enable some debugging in rpcbind maybe?
16:39 haomaiwa_ joined #gluster
16:39 CyrilPeponnet I didn't find anything; I launched rpcbind in foreground debug but nothing relevant
16:40 ndevos also not  when doing the rpcinfo loop?
16:40 CyrilPeponnet nop rpcbind process is also "hanging"
16:41 ndevos can you observe CPU usage when that happens, with "top" or similar?
16:41 CyrilPeponnet oh yes...
16:41 CyrilPeponnet glusters nfs taking like 800% CPU for 1/2 min
16:41 CyrilPeponnet while hanging
16:41 ndevos wow
16:41 CyrilPeponnet even the log stop
16:42 CyrilPeponnet after that everything is flushed
16:42 CyrilPeponnet into the log
16:43 CyrilPeponnet load average is load average: 1.04, 1.40, 1.48
16:43 ndevos do you have nfs.drc in the volume enabled? it had some memory issues and we disabled it by default now, not sure if it can trigger such a high CPU usage
16:43 CyrilPeponnet nop I didn't set this property
16:43 ndevos any other options?
16:43 CyrilPeponnet and I have 65GB of RAM so this is not the bottleneck
16:43 CyrilPeponnet yep wait a sec
16:44 CyrilPeponnet nfs.enable-ino32: on
16:44 CyrilPeponnet nfs.addr-namelookup: off
16:44 CyrilPeponnet performance.io-thread-count: 32
16:44 CyrilPeponnet diagnostics.client-log-level: WARNING
16:44 CyrilPeponnet the io-thread was for testing, but didn't change anything
16:44 CyrilPeponnet I also try the addr-namelookup off but no lock
16:44 CyrilPeponnet luck
16:44 ndevos no, the MOUNT procedure that fails does not really do any IO
16:45 ndevos do you have any other nfs or rpc options for other volumes?
16:45 CyrilPeponnet yep
16:45 CyrilPeponnet a bunch...
16:46 CyrilPeponnet server.root-squash: off
16:46 ndevos some nfs/rpc options are not per volume, but per nfs-server...
16:46 CyrilPeponnet cluster.eager-lock: on
16:46 CyrilPeponnet performance.stat-prefetch: off
16:46 CyrilPeponnet network.remote-dio: enable
16:46 CyrilPeponnet performance.quick-read: off
16:46 CyrilPeponnet performance.read-ahead: off
16:46 CyrilPeponnet performance.io-cache: off
16:46 CyrilPeponnet arf
16:46 sage_ joined #gluster
16:47 ndevos I don't think any of these would cause that
16:47 ndevos could you ,,(paste) part of the trace log?
16:47 glusterbot For RPM based distros you can yum install fpaste, for debian and ubuntu it's pastebinit. Then you can easily pipe command output to [f] paste [binit] and it'll give you a URL.
16:48 CyrilPeponnet well its 270MB
16:48 CyrilPeponnet :p
16:48 CyrilPeponnet for 4s
16:49 ndevos well, I don't need all of it :) only the ~200 lines before you get to [mount3.c:129:mnt3svc_submit_reply] 0-nfs-mount: Reply submission failed
16:49 ndevos or gzip it and put it somewhere?
16:49 CyrilPeponnet sure, give me few minutes
16:52 JoeJulian CyrilPeponnet: Have you checked dmesg? Maybe you have a failing drive...
16:52 sage__ joined #gluster
16:53 CyrilPeponnet Hey JoeJulian, nothing relevant except: TCP: TCP: Possible SYN flooding on port 2049. Sending cookies.  Check SNMP counters.
16:55 JoeJulian Ah, so a network problem.
16:55 CyrilPeponnet nothing huge
16:55 CyrilPeponnet only 170 TCP connections
16:56 JoeJulian Are you getting the SYN flooding every time you have a hang?
16:56 DV joined #gluster
16:56 CyrilPeponnet nop
16:57 CyrilPeponnet dmesg |grep SYN
16:57 CyrilPeponnet [  322.573955] TCP: TCP: Possible SYN flooding on port 2049. Sending cookies.  Check SNMP counters.
16:57 CyrilPeponnet [ 4078.346874] TCP: TCP: Possible SYN flooding on port 2049. Sending cookies.  Check SNMP counters.
16:57 CyrilPeponnet the server has been up for 24h and hangs every minute
16:57 CyrilPeponnet I guess this is when I restart the volume
16:57 CyrilPeponnet all client try to connect at the same time
16:58 CyrilPeponnet ndevos Im on the log it will come :)
16:58 JoeJulian Darn. I like blaming the network guys... ;)
16:58 CyrilPeponnet me too :)
16:58 sschultz joined #gluster
16:59 sschultz Hello everyone
16:59 CyrilPeponnet but as we are in a network company.... no excuses
16:59 sschultz I've got a question, is it possible to tell gluster which IP address to bind to for the NFS server?
17:00 ndevos CyrilPeponnet: well, many connections to port 2049 (NFS!) could be an issue, glusterfs/nfs could try to handle that somehow
17:00 hagarth joined #gluster
17:01 sschultz right now it's binding to the same IP address as the peer IP.  I would like to expose the NFS server on another IP address or add additional listening IP addresses
17:02 CyrilPeponnet ndevos JoeJulian http://pastebin.com/tYgZzMmd
17:02 glusterbot Please use http://fpaste.org or http://paste.ubuntu.com/ . pb has too many ads. Say @paste in channel for info about paste utils.
17:02 CyrilPeponnet oakay http://ur1.ca/iuebb
17:03 ndevos sschultz: yes, should be possible in 3.5.3 and 3.6.1, they include this change: http://review.gluster.org/8908
17:03 theron joined #gluster
17:04 sschultz ndevos: perfect, I am running 3.6.1.  I'll look into that.  Thanks!
17:04 msmith_ joined #gluster
17:04 ndevos CyrilPeponnet: hmm, that does not really show much :-/
17:05 CyrilPeponnet ndevos Well the 3.4.1 setup can still handle these connections
17:05 CyrilPeponnet arf...
17:05 CyrilPeponnet I can paste more
17:06 ndevos sschultz: note that it will cause all the gluster processes on that system to listen on that IP... depending on your environment, that may (not) be useful
17:06 jackdpeterson Hey all, Attempting to configure a distribute-replica volume. After creating it storage doesn't appear to be allocating across the two hosts correctly -- especially once I add an additional pair of bricks: gluster volume create pod1_1T replica 2 transport tcp ${gl_ip_1}:/export/xvdf/brick ${gl_ip_2}:/export/xvdf/brick
17:07 CyrilPeponnet ndevos and something like that ?  D [mount3.c:341:__mount_rewrite_rmtab] 0-nfs-mount: Updated rmtab with 788 entries
17:07 jackdpeterson if I perform a df -h on gl_ip_1 I get a ton of usage across two of the bricks (/export/xvdf and /export/xvdg [the brick that's added later]) and on gl_ip_2 I get almost no usage.
17:07 sschultz ndevos: yeah, I don't think this will work.  I've got Gluster currently listening on a 10GbE interface and I want to expose the NFS server on a 1GbE interface, but leave gluster communicating with the peers on the other 10GbE interface
17:07 jackdpeterson I'm attempting to architect for high availability and I want the pairs added in such a manner as to basically be RAID-1 across the two nodes
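
One thing worth double-checking for a replica 2 distribute-replica layout: replica sets are formed from consecutive bricks in the order they are listed, so each pair should span both hosts. A hedged sketch with shortened, hypothetical hostnames and the same brick paths:

    # first replica pair: one brick per host
    gluster volume create pod1_1T replica 2 transport tcp \
        gl1:/export/xvdf/brick gl2:/export/xvdf/brick
    # second pair added later, again in the same host order so it also spans both hosts
    gluster volume add-brick pod1_1T \
        gl1:/export/xvdg/brick gl2:/export/xvdg/brick
    # each file hashes to one pair and is mirrored inside it; after adding bricks to a
    # populated volume, a rebalance spreads existing data onto the new pair
    gluster volume rebalance pod1_1T start
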
17:07 ndevos CyrilPeponnet: maybe, so you have 788 clients mounting nfs volumes?
17:08 ndevos sschultz: oh, yes, in that case it would not work
17:08 CyrilPeponnet ndevos yep
17:08 sschultz ndevos: so is this not possible to do?
17:09 lmickh joined #gluster
17:10 ndevos sschultz: not with a config option, maybe by using one IP for nfs and blocking nfs on the other IPs
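
A minimal sketch of that idea, assuming the 10GbE storage address is 10.0.0.10 (address and policy are assumptions; Gluster/NFS listens on 2049 plus the mount ports it registers with the portmapper):

    # reject NFS on the storage-network address so clients can only reach NFS via the 1GbE address
    iptables -A INPUT -d 10.0.0.10 -p tcp --dport 2049 -j REJECT
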
17:10 Lilian` joined #gluster
17:11 ndevos CyrilPeponnet: 3.5.x introduces caching of the connected clients, saving the list in /var/lib/glusterfs/nfs/rmtab
17:11 sschultz ndevos: Can I have it bind to all interfaces so it accepts all incoming connections?
17:11 ndevos CyrilPeponnet: maybe the disk that holds that file is slow? in which case you could put the file somewhere else
17:12 CyrilPeponnet ndevos right the file is growing in a tmp and moved to rmtab every... 2s
17:12 ndevos sschultz: that is the default :)
17:12 sschultz ndevos: now I feel stupid!!!  Thanks, i guess I need to figure out why I can't connect on the 1GbE interface
17:12 sschultz ndevos: thanks for your help
17:12 ndevos CyrilPeponnet: yeah, that is how the updating works, its copied from the rpc.mountd daemon for kernel NFS
17:13 ndevos sschultz: good luck!
17:13 Lilian` left #gluster
17:13 Fen1 joined #gluster
17:14 sprung So, I'm a total noob with Gluster and inherited an existing cluster. One of the peers has a brick offline for a volume. I don't know what to do. It just says Online: N
17:14 CyrilPeponnet ndevos OH
17:14 CyrilPeponnet ndevos it appears while the tmp is growing the nfs hang
17:14 ndevos CyrilPeponnet: you can set nfs.mount-rmtab to /dev/shm/glusterfs.rmtab and see if that makes it faster...
17:15 CyrilPeponnet I need to restart the vol after that
17:15 CyrilPeponnet ?
17:15 ndevos CyrilPeponnet: no, it should work online, and it affects all volumes
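
Spelled out, the change ndevos suggests would look like this (volume name "myvol" is hypothetical; the nfs.* options belong to the single Gluster/NFS server process, so setting it once affects all volumes):

    gluster volume set myvol nfs.mount-rmtab /dev/shm/glusterfs.rmtab
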
17:16 ndevos CyrilPeponnet: if you do not want that file, you can file a bug and we could disable it - so that the list of clients is only kep in memory
17:16 glusterbot https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS
17:17 CyrilPeponnet let me see if it fixioes
17:17 CyrilPeponnet fixes
17:17 JoeJulian Oh wow... That's going to be a faq.
17:18 ndevos yeah, I wonder why nobody mentioned issues with that before, we have some users that needed the feature and have several hundreds of clients too...
17:18 msmith_ Hey all, is anyone aware of a bug in Gluster 3.4 (3.4.1 specifically) that would prevent a user from reading a directory which they have the appropriate permissions to read? For what its worth the same directory can be read directly from the brick.
17:18 CyrilPeponnet ndevos god... looks like it fixes
17:19 CyrilPeponnet ndevos give me more time to be sure
17:19 ndevos CyrilPeponnet: sure, you'll have the whole weekend :)
17:19 deniszh left #gluster
17:19 ndevos I'll be going afk in a bit, its dinner time here
17:20 JoeJulian msmith_: Lots of bugs in 3.4.1, but none that I can think of that would cause that.
17:21 ndevos msmith_: no bug in particular, but there is a limit on the number of groups that get passed on through fuse/nfs and over the GlusterFS network protocol
17:21 ghenry joined #gluster
17:21 ghenry joined #gluster
17:22 CyrilPeponnet ndevos sure take your time, have a good dinner
17:22 ndevos msmith_: if you are not using fuse+acls, http://review.gluster.org/7501 in 3.5 and 3.6 would be an option, I am not sure if it was backported to 3.4
17:23 ndevos I think there is still an issue with fuse+acls :-/
17:24 msmith_ ndevos: Ah I'll take a look, thanks!
17:25 ndevos msmith_: short version: nfs has a limit of 16 groups, fuse+acls 32(?) groups, and the GlusterFS protocol ~93 groups
17:26 msmith_ ndevos: interesting, that is very helpful!
17:26 ndevos now you can mix and match, see where the limit is...
17:27 ndevos msmith_: there are some options to not pass the groups and have the groups resolved on the bricks, that removes the limits in the protocols
17:28 * ndevos leaves for the day, cya!
17:29 ndevos CyrilPeponnet: oh, could you send an email to the list with your issue and solution? otherwise I'll forget about it and wont think of a nicer solution
17:31 sage_ joined #gluster
17:35 daMaestro joined #gluster
17:36 lalatenduM joined #gluster
17:40 CyrilPeponnet ndevos sure, can you tell me for what purpose is this rmtab file ?
17:40 Telsin sprung: first figure out why the brick is offline, if there's a disk problem or something, then you can figure out what to do. what Type of volume is it?
17:45 CyrilPeponnet and btw  load average: 0.29, 0.27, 0.56 now... seems to fix the issue...
17:49 dgandhi joined #gluster
17:53 CyrilPeponnet JoeJulian any idea what the purpose of flushing rmtab to a file is ? could it be disabled ?
17:54 JoeJulian I think ndevos said to file a bug for that possibility.
17:54 glusterbot https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS
17:54 ricky-ticky1 joined #gluster
17:55 hagarth joined #gluster
17:57 PeterA joined #gluster
18:01 JoeJulian @learn nfs hangs as "gluster volume set nfs.mount-rmtab /dev/shm/glusterfs.rmtab" should cure that.
18:01 glusterbot JoeJulian: The operation succeeded.
18:04 lpabon joined #gluster
18:09 newdave joined #gluster
18:10 hazmat joined #gluster
18:15 CyrilPeponnet JoeJulian cool the bot can learn, how do you trigger it for this learned lesson ?
18:15 jmarley joined #gluster
18:15 MrAbaddon joined #gluster
18:21 neofob joined #gluster
18:25 calisto joined #gluster
18:33 newdave joined #gluster
18:37 _Bryan_ joined #gluster
18:44 newdave joined #gluster
18:48 jmarley joined #gluster
18:48 MrAbaddon joined #gluster
18:59 glusterbot News from newglusterbugs: [Bug 1166862] rmtab file is a bottleneck when lot of clients are accessing a volume through NFS <https://bugzilla.redhat.com/show_bug.cgi?id=1166862>
19:09 newdave joined #gluster
19:27 newdave joined #gluster
19:31 DV joined #gluster
19:41 PeterA which rpm we need for centos as gluster client?
19:41 vipulnayyar joined #gluster
19:42 CyrilPeponnet PeterA glusterfs-fuse
19:43 PeterA just one?
19:45 PeterA cool thanks!
19:45 PeterA seems it depends on gluster and gluster-lib
19:47 semiosis well then you would need those too
19:48 PeterA i just tried to mount a gfs but failed
19:48 PeterA http://pastie.org/9735167
19:48 newdave joined #gluster
19:52 edwardm61 joined #gluster
19:53 rbennacer joined #gluster
19:53 rbennacer hey guys
19:54 rbennacer let's say i don't have any replication and i lose one node that has 2 bricks. would the volume still be accessible? would i be able to access files that are in the other volumes?
19:55 stomith joined #gluster
20:12 rbennacer anybody here?
20:19 coredump joined #gluster
20:22 PeterA yes
20:22 gotmustard joined #gluster
20:22 PeterA and i did a modprobe fuse and the mount came up
20:22 PeterA on centos 5
20:24 rbennacer sorry?
20:24 gotmustard Hi. I'm having difficulties mounting a gluster share remotely, I'm getting a peer not allowed error, I've got a lot of relevant output here http://pastebin.ca/2876138  please have a look
20:25 gotmustard i have already set the options that i thought were supposed to work, and reset rpcbind, glusterfsd and glusterd
20:25 gotmustard and i can see the nfs mount when i do a showmount -e theserver
20:26 chirino joined #gluster
20:32 theron joined #gluster
20:33 gotmustard also thanks everybody for your help earlier
20:40 gotmustard hmm, channel's a lot slower now than earlier
20:44 jvandewege joined #gluster
20:48 gotmustard anyone awake?
20:50 ildefonso joined #gluster
20:53 stomith gotmustard: I have the exact same problem this afternoon :)
20:53 gotmustard i just fixed it
20:53 gotmustard all those weird options i set, i unset them
20:53 gotmustard i did a gluster volume reset
20:54 gotmustard stomith, this, after i ensured nfs was turned off (you can't have both running), restarted glusterfsd and glusterd, then restarted rpcbind, and waited for about 2 minutes because it takes a while to set up the nfs
20:55 stomith ah, okay, that's good to know.
20:55 newdave joined #gluster
20:57 CyrilPeponnet PeterA you need to load the fuse module, a reboot can do that for you.
21:01 skippy what does this mean?  W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/9bdd01b8b5f546ce04b25ce7d68e3ace.socket failed (Invalid argument)
21:01 JoeJulian @nfs hangs
21:01 glusterbot JoeJulian: gluster volume set nfs.mount-rmtab /dev/shm/glusterfs.rmtab should cure that.
21:01 JoeJulian CyrilPeponnet: Like that... ^^^
21:01 CyrilPeponnet nice :)
21:02 msmith_ So I seem to be hitting the fuse 32 gid limit with gluster 3.4 on ubuntu precise, but not with gluster 3.0 on lucid. Did something change between these versions that would cause this or is it more likely a configuration difference?
21:02 JoeJulian CyrilPeponnet: You can also trigger factoid content contextually by enclosing it thusly, CyrilPeponnet thanks for finding out about ,,(nfs hangs) for us.
21:03 glusterbot CyrilPeponnet: gluster volume set nfs.mount-rmtab /dev/shm/glusterfs.rmtab should cure that.
21:04 CyrilPeponnet You're welcome :p
21:05 CyrilPeponnet Thanks to ndevos JoeJulian for helping with that today :)
21:10 tryggvil joined #gluster
21:13 skippy I'm seeing rather a lot of these: W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/e10791a217b689e256898a4886fa2acb.socket failed (Invalid argument)
21:13 skippy seeing on test servers and prod.
21:13 skippy `gluster volume status <vol>` fails with "Another transaction is in progress. Please try again after sometime."
21:14 skippy is there a known cause for this kind of thing?
21:17 semiosis msmith_: gluster 3.0 on lucid???  archaic!
21:18 semiosis how about 3.6.1 on trusty?
21:18 semiosis or 3.5.3, or even 3.4.6?
21:18 msmith_ semiosis haha trust me I know...
21:20 msmith_ 3.6 on trusty is coming soon, but I would still like to figure out what is up with the 3.4 setup
21:27 newdave joined #gluster
21:34 stomith is there a disadvantage of using a mount point as a brick point?
21:36 elyograg stomith: if the actual mount point is the brick path, then if the mount goes away for some reason, gluster will happily write data to the containing filesystem (which may be the root fs).  Creating another directory inside the mount point and using that as your brick path is a good idea, to prevent that.
21:36 elyograg if the mount goes away, so does the directory.
21:38 elyograg "go away" doesn't necessarily mean that the mount just disappears.  Could be that a filesystem doesn't mount properly at boot.
21:39 stomith sure, got it.
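
A hedged sketch of elyograg's advice, with hypothetical device, paths and volume name ("-i size=512" follows the commonly documented XFS recommendation for bricks):

    # format and mount the brick filesystem, then use a subdirectory of the mount as the brick path
    mkfs.xfs -i size=512 /dev/sdb1
    mkdir -p /storage
    echo '/dev/sdb1 /storage xfs defaults 0 0' >> /etc/fstab
    mount /storage
    mkdir /storage/brick
    # if /dev/sdb1 ever fails to mount, /storage/brick is missing and the brick refuses to start,
    # instead of gluster silently writing into the root filesystem
    gluster volume create webvol replica 2 server1:/storage/brick server2:/storage/brick
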
21:44 stomith is it just me, or does the documentation leave a lot to be desired?
21:45 skippy it's not just you.  submit patches!
21:45 elyograg it's better than many projects I've tried using.  but there are holes.
21:45 stomith I'm new to gluster, so I'm just trying to get it to work initially. once I figure it out, maybe I can help.
21:47 semiosis elyograg: i think that was fixed (with an xattr) in a recent version
21:47 semiosis not sure which though.  JoeJulian knows
21:48 elyograg I filed a bug or two related to the idea of using a directory. ;)
21:48 semiosis elyograg++
21:48 glusterbot semiosis: elyograg's karma is now 1
21:49 JoeJulian It was fixed by setting the trusted.glusterfs.volume-id xattr on the brick root (while leaving no way other than xattr manipulation to overcome that) with version 3.4.0.
21:51 stomith so I'm just trying to make a redundant web root. so should I be making my brick at /storage/brick, or say, /www/brick ?
21:51 JoeJulian s/redundant/highly available/
21:51 glusterbot What JoeJulian meant to say was: An error has occurred and has been logged. Check the logs for more informations.
21:51 JoeJulian bite me glusterbot.
21:52 stomith JoeJulian: yes.
21:52 JoeJulian separate the concepts of storage from server.
21:53 JoeJulian Make highly available storage, mount that on your servers. Sure, they can be the same hardware, but the concepts should be considered separately.
21:53 stomith That helps.
21:55 JoeJulian You know... I actually get almost giddy when someone gets that so quickly. :D Some people it's like beating it in to their skull.
21:55 JoeJulian Right semiosis? :D
21:56 semiosis Wha?
21:57 * semiosis skull needs moar beatings
21:57 CyrilPeponnet stomith give puppet-gluster and vagrant a try for quick setup and testing: http://ttboj.wordpress.com/2014/01/08/automatically-deploying-glusterfs-with-puppet-gluster-vagrant/
21:58 stomith I could try that, sure.
22:03 neofob joined #gluster
22:03 bit4man joined #gluster
22:04 newdave joined #gluster
22:07 stomith okay, since I'm using storage instead of server, I deleted my volume, but now I can't create any volume at all.
22:07 msmith_ joined #gluster
22:07 stomith so of course, I find http://joejulian.name/blog/glusterfs-path-or-a-prefix-of-it-is-already-part-of-a-volume/
22:08 msmith_ joined #gluster
22:09 stomith what's throwing me off is the 'For the directory (or any parent directories) that was formerly part of a volume'
22:10 calisto joined #gluster
22:10 stomith .. these are totally different volumes.
22:10 msmith_ joined #gluster
22:11 msmith_ joined #gluster
22:16 stomith wtf.
22:16 stomith volume create: datavol: failed: /storage/brick is already part of a volume
22:16 stomith [root@test ~]# gluster volume status
22:16 stomith No volumes present
22:16 semiosis path or a prefix
22:16 semiosis path or a prefix of it
22:16 semiosis @path or prefix
22:16 semiosis i can never remember this trigger :(
22:16 semiosis path or prefix of it
22:16 semiosis bah
22:17 glusterbot semiosis: To clear that error, follow the instructions at http://joejulian.name/blog/glusterfs-path-or-a-prefix-of-it-is-already-part-of-a-volume/ or see this bug https://bugzilla.redhat.com/show_bug.cgi?id=877522
22:17 glusterbot semiosis: http://joejulian.name/blog/glusterfs-path-or-a-prefix-of-it-is-already-part-of-a-volume/
22:17 semiosis glusterbot: you lagging!
22:18 stomith which is the exact link I pasted and followed.
22:18 semiosis oh
22:18 semiosis try again
22:18 stomith heh.
22:18 stomith on both client and server?
22:20 stomith and this says it's fixed in 3.4. :)
22:25 msmith_ joined #gluster
22:27 JoeJulian "what's throwing me off is the 'For the directory (or any parent directories)..." meaning if you're trying to use /mnt/foo/bar/baz as a brick and formerly used /mnt, /mnt/foo, or /mnt/foo/bar as a brick, that's where the xattr is set that's blocking you.
22:32 stomith in all places?
22:34 calisto joined #gluster
22:36 gotmustard stomith, he's generalizing to get you to think along those lines so you start critical thinking. anything that mentions "foobar" is an example.
22:37 stomith right, I got that. thank you.
22:38 stomith I used /mnt/foo/bar/baz only, and it still complained. So I'll figure it out.
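
For reference, the linked blog post boils down to roughly these steps, run on each brick directory (or parent) that still carries the old markers; the brick path here is hypothetical and this assumes you really do intend to reuse the directory:

    # remove the volume markers left behind by the previous volume
    setfattr -x trusted.glusterfs.volume-id /storage/brick
    setfattr -x trusted.gfid /storage/brick
    rm -rf /storage/brick/.glusterfs
    # restart glusterd so it drops any cached state before recreating the volume
    service glusterd restart
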
22:43 stomith Oh, and thank you for the kind help :)
22:55 CyrilPeponnet One quick question... in replicate mode, is it better to achieve HA with a virtual IP between nodes or use a LB (round robin or better) to spread the traffic between nodes ?
22:57 badone joined #gluster
22:59 elyograg CyrilPeponnet: if you're using the native gluster mount, the gluster client will connect to all the bricks in the volume.
22:59 elyograg If it's NFS or SMB, I use a virtual IP.
23:00 elyograg for native gluster, the name/address used on the mount command is only used to retrieve the volume information, then the client connects directly to everything.
23:01 elyograg so you MIGHT want a virtual IP on two of your peers, but you wouldn't need to go farther than that.
23:01 elyograg round-robin DNS on a name specifically for mounting might be good enough.
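
For the native-client case, a hedged example of covering the one moment the mount server matters (fetching the volfile); hostnames and volume name are hypothetical, and the option name is the one accepted by the mount.glusterfs helper in the 3.4/3.5 era:

    # if gluster1 is down at mount time, fall back to gluster2 for the volfile;
    # after mounting, the client talks to all bricks directly
    mount -t glusterfs -o backupvolfile-server=gluster2 gluster1:/myvol /mnt/myvol
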
23:05 tryggvil joined #gluster
23:05 CyrilPeponnet elyograg thanks for these clarifications
23:37 msmith_ joined #gluster
23:38 gnudna joined #gluster
23:39 gnudna left #gluster
23:47 ploo joined #gluster
23:48 ploo can gluster be used as a shared filesystem and handle locking etc.?
23:56 julim joined #gluster
