
IRC log for #gluster, 2016-06-30


All times shown according to UTC.

Time Nick Message
00:09 F2Knight joined #gluster
00:17 luizcpg_ joined #gluster
00:22 plarsen joined #gluster
00:30 F2Knight joined #gluster
00:36 shdeng joined #gluster
00:42 gem joined #gluster
00:58 julim joined #gluster
01:15 Alghost_ Hi guys, I need your help. I set up 2 nodes (A: 10.10.59.65, B: 10.10.59.66). When I create a distributed volume from node B using "gluster v create vol_test_dist 10.10.59.65:/volume/__vol_test_dist force", I can't mount the volume on either node.
01:17 Alghost_ so I try to get information to fix the issue.
01:17 Alghost_ gluster vol status vol_test_dist:
01:18 Alghost_ [root@build_08-1 ~]# gluster vol status vol_test_dist
01:18 Alghost_ Status of volume: vol_test_dist
01:18 Alghost_ Gluster process                             TCP Port  RDMA Port  Online  Pid
01:18 Alghost_ ------------------------------------------------------------------------------
01:18 glusterbot Alghost_: ----------------------------------------------------------------------------'s karma is now -15
01:18 Alghost_ Brick 10.10.59.65:/volume/__vol_test_dist   49154     0          Y       9722
01:18 Alghost_ NFS Server on localhost                     2049      0          Y       9742
01:18 Alghost_ NFS Server on 10.10.59.66                   2049      0          Y       4981
01:18 Alghost_
01:18 Alghost_ Task Status of Volume vol_test_dist
01:18 Alghost_ ------------------------------------------------------------------------------
01:18 glusterbot Alghost_: ----------------------------------------------------------------------------'s karma is now -16
01:18 Alghost_ There are no active volume tasks
01:18 Alghost_
01:19 kramdoss_ joined #gluster
01:19 Alghost_ and then I was watching /var/log/glusterfs/export-__vol_test_dist-.log over mount
01:20 Alghost_ mount point is /export/__vol_test_dist
01:21 Alghost_ E [socket.c:2279:socket_connect_finish] 0-vol_test_dist-client-0: connection to 10.10.59.65:49153 failed
01:22 Alghost_ I think gluster should try to mount through port 49154, but it tries through port 49153 instead
01:23 Alghost_ It is fixed by restarting glusterd on node A (the one with the brick)
01:24 Alghost_ I'd like to fix the issue without restarting glusterd.. or prevent the issue in advance
01:24 Alghost_ any help?
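
A minimal sketch of how the port mismatch above might be checked and cleared without bouncing glusterd, reusing the volume name from the conversation. "gluster volume start ... force" starts any brick processes that are not running and re-registers their ports, which is often enough for clients to pick up the correct one; treat it as an illustration rather than a verified fix.

    # Compare the port the brick actually listens on with the one the
    # client tries (49153 vs 49154 in the log above):
    gluster volume status vol_test_dist
    ss -tlnp | grep glusterfsd

    # Restart only the brick process(es) of this volume, not glusterd itself:
    gluster volume start vol_test_dist force
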
01:38 Vaelatern joined #gluster
01:39 Lee1092 joined #gluster
01:41 julim joined #gluster
01:46 magrawal joined #gluster
02:03 RameshN joined #gluster
02:31 baojg joined #gluster
03:21 baojg joined #gluster
03:36 kramdoss_ joined #gluster
03:36 wushudoin joined #gluster
03:51 itisravi joined #gluster
03:53 Saravanakmr joined #gluster
04:05 Manikandan joined #gluster
04:11 atinm joined #gluster
04:21 sakshi joined #gluster
04:23 hgowtham joined #gluster
04:29 plarsen joined #gluster
04:30 poornimag joined #gluster
04:35 shubhendu joined #gluster
04:36 gem joined #gluster
04:40 aravindavk joined #gluster
04:41 RameshN joined #gluster
04:44 nehar joined #gluster
04:45 nehar_ joined #gluster
04:48 ashiq joined #gluster
04:59 DV_ joined #gluster
05:03 jiffin joined #gluster
05:07 prasanth joined #gluster
05:15 ndarshan joined #gluster
05:17 anil joined #gluster
05:20 hagarth joined #gluster
05:20 surabhi joined #gluster
05:20 nbalacha joined #gluster
05:25 [diablo] joined #gluster
05:32 satya4ever joined #gluster
05:35 jiffin1 joined #gluster
05:38 skoduri joined #gluster
05:39 Bhaskarakiran joined #gluster
05:40 ppai joined #gluster
05:48 hchiramm joined #gluster
05:48 nehar_ joined #gluster
05:50 kshlm joined #gluster
05:52 prasanth joined #gluster
05:53 poornimag joined #gluster
05:59 MikeLupe joined #gluster
05:59 atinm JoeJulian, are you there?
06:00 ppai joined #gluster
06:02 sabansal_ joined #gluster
06:04 baojg joined #gluster
06:06 armyriad joined #gluster
06:16 karthik___ joined #gluster
06:17 kramdoss_ joined #gluster
06:19 Saravanakmr joined #gluster
06:20 gowtham joined #gluster
06:23 rouven joined #gluster
06:26 nishanth joined #gluster
06:27 atalur joined #gluster
06:28 jtux joined #gluster
06:29 rafi joined #gluster
06:30 Seth_Karlo joined #gluster
06:30 rafi joined #gluster
06:34 jiffin1 joined #gluster
06:34 baojg_ joined #gluster
06:34 Apeksha joined #gluster
06:36 pur joined #gluster
06:38 baojg joined #gluster
06:44 poornimag joined #gluster
06:45 Sue joined #gluster
06:46 manous joined #gluster
06:50 kdhananjay joined #gluster
06:54 karnan joined #gluster
06:56 deniszh joined #gluster
06:57 RameshN_ joined #gluster
06:57 sanoj joined #gluster
06:58 skoduri joined #gluster
07:00 surabhi joined #gluster
07:01 msvbhat joined #gluster
07:23 karnan joined #gluster
07:29 baojg joined #gluster
07:31 jiffin joined #gluster
07:36 baojg joined #gluster
07:46 karthik___ joined #gluster
07:47 Seth_Karlo joined #gluster
07:47 kshlm joined #gluster
07:48 ppai joined #gluster
07:48 ivan_rossi joined #gluster
07:53 Slashman joined #gluster
07:53 surabhi joined #gluster
07:55 baojg joined #gluster
07:58 nehar_ joined #gluster
07:58 jri_ joined #gluster
07:59 prasanth joined #gluster
07:59 Saravanakmr joined #gluster
08:10 msvbhat joined #gluster
08:16 karnan joined #gluster
08:24 robb_nl joined #gluster
08:28 ashiq joined #gluster
08:30 Gnomethrower joined #gluster
08:31 jri joined #gluster
08:33 jwd joined #gluster
08:34 [o__o] joined #gluster
08:37 baojg joined #gluster
08:38 kramdoss_ joined #gluster
08:39 kshlm joined #gluster
08:43 baojg joined #gluster
08:49 atinm joined #gluster
08:50 jri joined #gluster
08:56 nehar_ joined #gluster
08:59 skoduri joined #gluster
09:00 mrErikss1n joined #gluster
09:01 mattmcc_ joined #gluster
09:02 R0ok__ joined #gluster
09:03 darshan joined #gluster
09:04 valkyr1e_ joined #gluster
09:04 Dave_ joined #gluster
09:05 prasanth joined #gluster
09:05 zoldar_ joined #gluster
09:06 skoduri joined #gluster
09:08 malevolent_ joined #gluster
09:08 rafaels joined #gluster
09:09 ndk_ joined #gluster
09:12 yoavz joined #gluster
09:12 Seth_Karlo joined #gluster
09:16 Alghost joined #gluster
09:16 baojg joined #gluster
09:18 shubhendu joined #gluster
09:18 red-lichtie joined #gluster
09:19 red-lichtie How do I mount a glusterfs volume on linux so that it works correctly with PHP ?
09:19 Ulrar red-lichtie: Use NFS if you can
09:19 Ulrar And in any case, use APCu and OPCache
09:19 Ulrar Or performance will be horrible
09:20 Ulrar If your gluster servers aren't the same as your clients, use the gluster fuse client but configure APCu to have stat = 0
09:20 red-lichtie Ulrar: Is that an option for fstab ?
09:21 Ulrar red-lichtie: APCu and OPCache ? No, that's php extensions for caching
09:21 Ulrar That way the PHP files are read from the gluster only once then cached in ram
09:21 red-lichtie It isn't a caching problem that I am having
09:21 Ulrar But when you update the code you'll need to restart php to have the change applied
09:21 partner joined #gluster
09:22 Ulrar What is your problem ?
09:23 red-lichtie I want to install the data directory from owncloud on a gluster mount. This is currently impossible because php::stat fails when owncloud tests for the directory
09:23 Ulrar That's strange, it should work
09:23 Ulrar I believe we are doing that for our ownclouds too
09:24 Ulrar Did you use the fuse client ?
09:24 red-lichtie picluster1:/gv0 on /shared type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072,_netdev)
09:25 red-lichtie I think so
09:25 red-lichtie I'm using glusterfs 3.7.11
09:25 Ulrar yeah
09:26 Ulrar You might want to try the NFS export instead
09:26 Ulrar Just replace fuse.glusterfs by nfs I believe
09:26 red-lichtie In fstab ?
09:26 Ulrar (You'll probably have to tweaks options after that, but you should be able to check if it works at least)
09:26 Ulrar yeah
09:26 red-lichtie Will it still stay in sync if I use nfs ?
09:27 Ulrar It will
09:27 Ulrar Works the same, but you lose the High Availability
09:27 Ulrar If the server you mounted goes down, your mount will too
09:27 red-lichtie It is the high availability that I want/need
09:27 Ulrar https://www.gluster.org/pipermail/gluster-users/2013-December/015196.html
09:28 glusterbot Title: [Gluster-users] Errors from PHP stat() on files and directories in a glusterfs mount (at www.gluster.org)
09:28 Ulrar Looks like someone already had the same problem
09:28 red-lichtie There are 3 servers, all replicated
09:28 Ulrar You might want to re-open that question on the mailing list
09:28 Ulrar Doesn't seem like it was solved back then
09:28 red-lichtie Yes, that was me, 2 1/2 years ago
09:28 Ulrar Ha :)
09:29 red-lichtie I'm setting up a new server and thought that I would revisit glusterfs
09:29 Ulrar I don't know about back then, but lately I've been using the ML and they were very helpful
09:29 ppai joined #gluster
09:30 Ulrar I checked, we are using glusterFS for our ownclouds too but through NFS
09:30 Ulrar If your servers and clients are on the same nodes, you can mount localhost:/ so not losing HA
09:30 Ulrar But if they are separate the fuse client is the only solution
09:31 red-lichtie but using 3 replicated bricks, each mounted locally for its own owncloud instance, should work, right?
09:32 Ulrar Exactly yes
09:32 Ulrar If localhost goes down you have other problems than the gluster anyway
09:32 Ulrar So it seems safe to do that this way
09:32 red-lichtie This is what I have "picluster1:/gv0 /shared  glusterfs defaults,_netdev 0 0"
09:33 Ulrar If those nodes are also glusterfs servers, you can just replace picluster1 by localhost
09:34 Ulrar And glusterfs by nfs
09:34 red-lichtie So all I change is glusterfs for nfs ?
09:34 red-lichtie Cool
09:34 red-lichtie I'll try that immediately
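
Roughly what the suggested fstab change amounts to. The vers=3 option is an assumption based on gluster's built-in NFS server speaking NFSv3, and localhost only keeps high availability because the clients here are also gluster servers:

    localhost:/gv0  /shared  nfs  defaults,_netdev,vers=3  0 0
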
09:38 red-lichtie Nope
09:38 baojg joined #gluster
09:38 red-lichtie PHP Warning:  stat(): stat failed for /shared/owncloud/.ocdata in /srv/http/phpinfo.php on line 4
09:41 red-lichtie I guess the problem lies a little bit deeper
09:41 red-lichtie _llseek(0, 0, 0x7ec433e0, SEEK_CUR)     = -1 ESPIPE (Illegal seek)
09:41 red-lichtie and
09:42 red-lichtie ioctl(3, TCGETS, 0x7ec40dec)            = -1 ENOTTY (Inappropriate ioctl for device)
09:42 red-lichtie Maybe there is something I can tune in the gluster device?
09:47 Ulrar That's strange
09:47 Ulrar I don't really know sorry, you should try sending those errors to the mailing list
09:47 Ulrar So the gluster team can take a look
09:47 red-lichtie Maybe someone on the dev channel has an idea
09:49 ndevos red-lichtie: have you read the ,,(php) suggestions already?
09:49 glusterbot red-lichtie: (#1) php calls the stat() system call for every include. This triggers a self-heal check which makes most php software slow as they include hundreds of small files. See http://joejulian.name/blog/optimizing-web-performance-with-glusterfs/ for details., or (#2) It could also be worth mounting fuse with glusterfs --attribute-timeout=HIGH --entry-timeout=HIGH --negative-timeout=HIGH --fopen-keep-cache
09:50 red-lichtie ndevos: I searched all over the place for php stat and error, never saw anything
09:51 ndevos red-lichtie: I guess stat() can give an error if the file is in split-brain, no idea what else it could be
09:52 ndevos well, except for permission errors...
09:52 ndevos red-lichtie: there should not be an issue with lseek, thats weird
09:53 red-lichtie ndevos: "gluster volume info" returns "Status: Started"
09:55 red-lichtie Here is the relevant part of strace: https://gist.github.com/anonymous/bce42f1017af0d8e9fdff68074e713d4
09:55 glusterbot Title: php stat fails on glusterfs · GitHub (at gist.github.com)
09:55 rafi joined #gluster
09:57 ndevos red-lichtie: stat64() returned 0, which means success, whatever the PHP warning writes out is wrong
09:58 red-lichtie ndevos: I'm no expert :-)
09:58 red-lichtie I know that when I umount the shared device, it works fine
09:58 ndevos red-lichtie: you can run 'stat /shared/owncloud/.ocdata' and check for yourself :)
09:59 ndevos red-lichtie: well, if you unmount, the files would not be there, how could that work?
09:59 red-lichtie ndevos: I know, I have also tried exec('stat /shared/owncloud/.ocdata') in php and that works too
10:00 red-lichtie ndevos: I tarred the contents and restored them on the unmounted mount point :-)
10:00 ndevos red-lichtie: oh, yuck, but thats a fair test
10:00 red-lichtie I wanted to make sure that my test was sane
10:01 ndevos red-lichtie: it is possible that the application doing the stat64() inspects the details of the stat structure and if it finds something weird, it prints out the confusing message
10:01 red-lichtie Ohh, I cut off too early I think
10:02 red-lichtie No, I didn't, excuse that
10:03 red-lichtie ndevos: app is a 3 lines: <?php
10:03 red-lichtie stat('/shared/owncloud/.ocdata');
10:03 red-lichtie ?>
10:03 red-lichtie I wanted to make it as simple as possible
10:03 skoduri joined #gluster
10:04 ndevos red-lichtie: 3 lines with a whole PHP environment in a webserver wrapped around it?
10:04 red-lichtie ndevos: No, it is failing to stat it in owncloud. I wanted to isolate the call that was failing
10:05 red-lichtie They might be calling is_dir or something like that
10:05 ndevos red-lichtie: okay, maybe you can run the php script through php-cli or whatever the commandline is?
10:05 red-lichtie strace php phpstat.php
10:06 red-lichtie That was the strace output that I put on gist
10:06 red-lichtie It doesn't get much simpler
10:06 ndevos red-lichtie: can you run that again, with: strace -v ...
10:06 red-lichtie sure
10:06 baojg joined #gluster
10:07 ndevos red-lichtie: and compare that strace with a run against a file on the local filesystem
10:07 ndevos red-lichtie: also, you cans use ,,(paste) for convenience
10:07 glusterbot red-lichtie: For a simple way to paste output, install netcat (if it's not already) and pipe your output like: | nc termbin.com 9999
10:08 ndevos well, with strace you probably need to do: strace -v .... 2>&1 | nc termbin.com 9999
10:08 red-lichtie ndevos: Local FS - http://termbin.com/d12b
10:09 red-lichtie ndevos: glusterfs mounted - http://termbin.com/nwkb
10:10 red-lichtie stat64("/shared/owncloud/.ocdata", {st_dev=makedev(179, 2), st_ino=286544, st_mode=S_IFREG|0644, st_nlink=1, st_uid=33, st_gid=33, st_blksize=4096, st_blocks=0, st_size=0, st_atime=2016/06/30-01:48:16.869403795, st_mtime=2016/06/30-01:48:16.869403795, st_ctime=2016/06/30-01:48:24.079314218}) = 0
10:11 ndevos and
10:11 ndevos stat64("/shared/owncloud/.ocdata", {st_dev=makedev(0, 33), st_ino=9307805301626576804, st_mode=S_IFREG|0644, st_nlink=1, st_uid=33, st_gid=33, st_blksize=1048576, st_blocks=0, st_size=0, st_atime=2016/06/30-01:49:49.828250514, st_mtime=2016/06/30-01:12:53.545853000, st_ctime=2016/06/30-01:12:58.755791492}) = 0
10:11 ndevos so, the main difference is the st_ino value
10:11 foster joined #gluster
10:11 ndevos red-lichtie: I guess that your version does not like the 64-bit inode number
10:12 ndevos red-lichtie: is that an older version of PHP maybe?
10:12 red-lichtie it is a brand new install with php7
10:12 red-lichtie but it is on a 32 bit machine
10:13 ndevos red-lichtie: ok, some 32-bit applications do not like the (several years old default of) 64-bit inode numbers
10:13 ndevos red-lichtie: can you unmount, and mount like "mount -t glusterfs -o enable-ino32 ..."?
10:14 red-lichtie php 7.0.8-1
10:14 ndevos it also depends on how it is compiled
10:14 red-lichtie ndevos: Sure
10:16 red-lichtie ndevos: Brilliant, that works
10:16 ndevos red-lichtie: nice, but you should report the problem to whatever distributions provides those php binaries
10:17 red-lichtie picluster1:/gv0 /shared  glusterfs defaults,enable-ino32,_netdev 0 0
10:17 red-lichtie now works a treat in /etc/fstab
10:17 ndevos red-lichtie: not only GlusterFS uses 64-bit inodes, any modern filesystem can do that, some only start to use them when there are enough files...
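
A quick way to see the problem and the fix described above, reusing the paths from the conversation:

    # Inode numbers above 4294967295 (2^32 - 1) will not fit a 32-bit ino_t:
    stat -c '%i' /shared/owncloud/.ocdata

    # Remount with 32-bit inode numbers:
    umount /shared
    mount -t glusterfs -o enable-ino32 picluster1:/gv0 /shared
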
10:18 red-lichtie ndevos: The arch linux arm people will be informed :-)
10:18 red-lichtie This is great, now I can continue with my high availability oc pi cluster ;-)
10:19 ndevos red-lichtie: I'd appreciate it if you can send an email to gluster-users@gluster.org with the error you got, how it was debugged and resolved, bonus points for including a link to the bug against the Arch/php package :)
10:19 red-lichtie Thanks a million ndevos, I would never have found that
10:19 red-lichtie :-D
10:20 ndevos you're welcome, glad we figured it out :)
10:45 Alghost_ joined #gluster
10:46 ItsMe` joined #gluster
10:47 ira joined #gluster
10:47 hchiramm joined #gluster
10:48 Bhaskarakiran joined #gluster
10:51 _fortis joined #gluster
10:54 partner joined #gluster
11:00 jiffin1 joined #gluster
11:03 Seth_Karlo joined #gluster
11:05 nehar_ joined #gluster
11:10 atinm joined #gluster
11:24 msvbhat joined #gluster
11:27 prasanth joined #gluster
11:31 kdhananjay joined #gluster
11:32 sanoj joined #gluster
11:33 luizcpg joined #gluster
11:38 robb_nl joined #gluster
11:48 jiffin1 joined #gluster
11:49 ppai joined #gluster
11:56 rafaels joined #gluster
11:57 johnmilton joined #gluster
12:12 karthik___ joined #gluster
12:18 surabhi joined #gluster
12:19 johnmilton joined #gluster
12:24 hi11111 joined #gluster
12:27 ramky joined #gluster
12:28 post-factum after this error on one of replica 2 node:
12:28 post-factum [rpcsvc.c:270:rpcsvc_program_actor] 0-rpc-service: RPC program not available (req 1298437 330) for 1.2.3.60:65534
12:29 post-factum i got several files that are pending to be healed, but heal full does not heal them
12:29 post-factum they are present on one node and absent on another
12:29 post-factum what's the issue?
12:39 Wizek joined #gluster
12:39 surabhi joined #gluster
12:44 Wizek joined #gluster
12:47 d0nn1e joined #gluster
12:49 mdavidson joined #gluster
12:50 kshlm post-factum, in which log file did you see it?
12:50 post-factum kshlm: /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
12:51 kshlm That error says that the process doesn't understand the GlusterFS FOP RPCs.
12:51 kshlm Ah, okay.
12:51 kshlm Sometimes a client tries to send requests to glusterd instead of bricks.
12:51 post-factum that is strange O_o
12:52 kshlm I've seen it sporadically, but never bothered to root cause it.
12:52 kshlm It could possibly occur because of a client portmap request failing.
12:52 post-factum btw, 1.2.3.60:65534 — this is client
12:53 kshlm Yup.
12:53 post-factum but also IP of another node appeared there as well
12:53 luizcpg left #gluster
12:53 post-factum there are several records like this
12:53 kshlm A client translator needs to connect to glusterd to get the port of a brick by doing a portmap query.
12:53 kshlm It should reconnect to the brick port once it gets the port.
12:54 kshlm But maybe sometimes this reconnection doesn't happen.
12:54 kshlm A client xlator continues to use the glusterd connection itself to send IO requests.
12:55 kshlm The portmap query and reconnection code is not very clean. This is something I'm hoping to get fixed soon.
12:56 kshlm You may have to restart those client processes to get them to reconnect correctly to the bricks.
12:56 post-factum that is what i did
12:58 post-factum kshlm: thanks for the explanation. should i file a bug report for that?
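
A rough way to spot clients stuck in this state before remounting them; the volume name, server and mount point below are placeholders:

    # A healthy fuse client holds connections to brick ports (49152 and up);
    # ongoing I/O over port 24007 suggests it is still talking to glusterd:
    ss -tnp | grep glusterfs

    # Remount to force a fresh portmap query and brick reconnect:
    umount /mnt/myvol && mount -t glusterfs server1:/myvol /mnt/myvol
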
13:08 jwd joined #gluster
13:10 jwd joined #gluster
13:18 skoduri joined #gluster
13:24 johnmilton joined #gluster
13:26 jwaibel joined #gluster
13:29 nehar joined #gluster
13:29 jwd joined #gluster
13:31 kovshenin joined #gluster
13:32 morgbin Does anyone have experience integrating kerberos authtication into Gluster shares?
13:49 dnunez joined #gluster
13:51 rafi joined #gluster
13:53 jwd joined #gluster
13:53 DV__ joined #gluster
13:57 anoopcs ndevos, ^^
14:00 post-factum kshlm: ^^ (on bugreport)
14:03 kshlm post-factum, Sure file a bug report.
14:03 glusterbot https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS
14:04 post-factum ok
14:04 kshlm This patch https://review.gluster.org/14254 is cleaning up portmap reconnection.
14:04 glusterbot Title: Gerrit Code Review (at review.gluster.org)
14:05 post-factum kshlm: should that be "protocol" component?
14:05 kshlm It mentions encryption in the title, but the changes to reconnection apply to unencrypted connections as well.
14:05 kshlm Use client as the component.
14:06 post-factum hmm, no such component there
14:07 plarsen joined #gluster
14:08 kshlm Okay, use protocol itself then.
14:08 kshlm Client translator should fall under protocol.
14:09 kramdoss_ joined #gluster
14:13 post-factum ok
14:13 post-factum also, I'll copy-paste your explanation
14:17 rafi joined #gluster
14:19 Klas morgbin: it's not currently possible, some people are working on it
14:19 Klas unless I'm entirely mistakemn
14:20 morgbin Okay, thanks
14:20 shubhendu joined #gluster
14:22 Wizek joined #gluster
14:27 gowtham joined #gluster
14:32 deniszh joined #gluster
14:34 deniszh joined #gluster
14:36 jiffin joined #gluster
14:37 MikeLupe joined #gluster
14:40 deniszh1 joined #gluster
14:45 jiffin1 joined #gluster
14:53 bluenemo joined #gluster
14:54 jiffin joined #gluster
15:01 hchiramm joined #gluster
15:04 ivan_rossi left #gluster
15:08 wushudoin joined #gluster
15:21 nishanth joined #gluster
15:27 bowhunter joined #gluster
15:28 squizzi joined #gluster
15:33 robb_nl joined #gluster
15:39 siel joined #gluster
15:44 Gnomethrower joined #gluster
15:51 nathwill joined #gluster
15:52 jbrooks joined #gluster
15:53 deniszh1 joined #gluster
15:54 nathwill hello. i'm having some trouble setting up geo-replication, hoping someone can give me some assistance. it looks like it's gverify.sh failing due to some weirdness, but i'm not sure why that would be... https://gist.github.com/nathwill/286ab2d08d164b606241a5a5f04b4a4e
15:55 glusterbot Title: gluster geo-replication err · GitHub (at gist.github.com)
15:56 kpease joined #gluster
15:58 nathwill i can ssh from the master to the slave no problem, so i'm not sure why the "transport endpoint disconnected" error would come up
16:00 skoduri joined #gluster
16:05 msvbhat joined #gluster
16:09 Seth_Karlo joined #gluster
16:12 kpease joined #gluster
16:26 deniszh joined #gluster
16:31 manous joined #gluster
16:33 nathwill joined #gluster
16:52 julim joined #gluster
17:01 nathwill joined #gluster
17:07 nishanth joined #gluster
17:11 jbrooks joined #gluster
17:13 skylar joined #gluster
17:19 JoeJulian nathwill: That "transport" error is coming from the socket connection to the named pipe. Look in /var/log/glusterfs/geo-replication-slaves/slave.log on the slave.
17:20 jri joined #gluster
17:21 nathwill JoeJulian: hmm, that file doesn't exist on the slave. i think i *might* have a clue though... the wdc.trhou.se is a hosts file entry on master, but the slave cluster doesn't have that name as the brick (they just have the LAN IPs)... could that do it?
17:22 JoeJulian No, I don't think so.
17:22 JoeJulian Does that log file exist on master?
17:22 nathwill yes. i added a comment to that gist with the command i'm running and the contents of that log file on the master
17:24 JoeJulian According to that error, there's no "workspaces" volume on "wdc.trhou.se".
17:25 nathwill hmm, there definitely is... https://gist.github.com/nathwill/463a6201ebdeb5c9b4a421b3dc7e2643
17:25 glusterbot Title: slave volume info · GitHub (at gist.github.com)
17:26 nathwill 10.120.0.54 is the node i assigned a floating ip to, and added the hosts entry for "wdc.trhou.se" on the master to point to
17:26 JoeJulian Does that hostname resolve to the slave from the master?
17:26 nathwill it resolves to a floating IP address that's nat'd to the slave (this is in openstack)
17:26 JoeJulian should work...
17:27 nathwill and i confirmed i can telnet to 24007 and 49152 on the slave from the master
17:28 JoeJulian Well, since this is openstack using floating-ips, all your bricks will be natted so you're partially correct in that they'll need to be hostnames for this to work.
17:29 nathwill so would the master be trying to connect to all of the nodes in the slave cluster?
17:29 jri joined #gluster
17:29 nathwill i can tear down the volume on the slave set and re-add with hostnames that are pointed to the private addresses on the slave side and public addresses on the master side
17:29 JoeJulian The client (glusterfs) connects to the management daemon on the slave (24007) to retrieve the volume definition (hasn't even gotten this far yet) then it connects to all the bricks in that volume.
17:30 nathwill ok. i'll try to make sure that all the bricks are reachable from the master then, they definitely wouldn't be currently
17:31 nathwill thanks
17:31 nathwill fingers crossed! :)
17:32 JoeJulian You're welcome. I'm packing boxes today so if you still need help, drop a message and I'll be back from time to time.
17:32 nathwill much obliged
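
A sketch of the setup being described, with the floating IP and the master volume name as placeholders; the slave hostname and slave volume come from the gists above, and the geo-rep syntax is the usual 3.7-era form:

    # On each master node: the slave brick hostnames resolve to floating IPs
    # (on the slave nodes the same names resolve to the private addresses).
    echo "203.0.113.10  ws-gluster0.prod.wdc04" >> /etc/hosts

    # Then create and start the session:
    gluster volume geo-replication mastervol ws-gluster0.prod.wdc04::workspaces create push-pem
    gluster volume geo-replication mastervol ws-gluster0.prod.wdc04::workspaces start
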
17:46 dgandhi joined #gluster
17:49 Lee1092 joined #gluster
17:57 jri joined #gluster
18:08 bowhunter joined #gluster
18:08 deniszh joined #gluster
18:12 gnulnx Could use a little help with extended attributes on freebsd.  I stopped my cluster and then moved the brick directory around a bit, basically moved it to a backup location, created a zfs dataset at the existing location, and then moved the brick data inside that zfs dataset
18:13 gnulnx Now the vol won't start, saying 'Failed to get extended attribute trusted.glusterfs.volume-id'
18:13 gnulnx Guess I need to set that attr on the zfs dataset directory to the value that exists on the other gluster server?
18:16 gnulnx However this doesn't look right to me: https://gist.github.com/kylejohnson/7db2d6a67c0b10f5ce59588f78517a5f
18:16 glusterbot Title: gist:7db2d6a67c0b10f5ce59588f78517a5f · GitHub (at gist.github.com)
18:27 gnulnx OK, managed to get it
18:27 gnulnx Last two digits of each of the longer hex blocks, and the '0x' at the beginning
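
On Linux the missing xattr can simply be copied from a brick that still carries it; FreeBSD's extattr tooling differs, and the same UUID (with dashes) is also recorded in /var/lib/glusterd/vols/<volname>/info. A sketch with a placeholder UUID and brick paths:

    # Read the volume-id from a surviving brick (Linux syntax):
    getfattr -n trusted.glusterfs.volume-id -e hex /path/to/old/brick

    # Apply it to the recreated brick root before starting the volume:
    setfattr -n trusted.glusterfs.volume-id -v 0x1234567890abcdef1234567890abcdef /path/to/new/brick
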
18:33 bluenemo joined #gluster
18:34 nathwill JoeJulian: huh, ok, so i re-set the slave cluster to use the hostnames, added the root pub key to all slave nodes, added floating ips to all the slave nodes, did the hosts file entries for private addresses on the slave side, and public addresses on the master side, now getting a new error (whoo hoo!).
18:34 nathwill gverify.sh is passing now (exit 0), so that at least is working
18:34 nathwill https://gist.github.com/nathwill/6d92cff56c64403acf253f16c7400bdb
18:34 glusterbot Title: new errors · GitHub (at gist.github.com)
18:35 nbalacha joined #gluster
18:36 nathwill confirmed i can telnet to 24007 and 49153, and can do public-key auth SSH as root on all the slave nodes from the master
18:36 nathwill progress :)
18:36 JoeJulian :)
18:36 hagarth joined #gluster
18:37 JoeJulian "gluster volume status" on the slave. Are all the bricks listening on 49153?
18:37 JoeJulian @ports
18:37 glusterbot JoeJulian: glusterd's management port is 24007/tcp (also 24008/tcp if you use rdma). Bricks (glusterfsd) use 49152 & up. All ports must be reachable by both servers and clients. Additionally it will listen on 38465-38468/tcp for NFS. NFS also depends on rpcbind/portmap ports 111 and 2049.
18:38 JoeJulian that error message is really helpful... :/
18:39 nathwill https://gist.github.com/nathwill/3369b8d30f29bc5e3722fbf69ac5db50
18:40 glusterbot Title: gist:3369b8d30f29bc5e3722fbf69ac5db50 · GitHub (at gist.github.com)
18:40 nathwill yeah, right?
18:41 nathwill we disable NFS, and yeah, they all seem to be listening on 49153; we secgroup and iptables the 49152-49155 range since it does seem to jump around sometimes.
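
For reference, one way to open the port list glusterbot gave above with firewalld; adjust the brick range to however many bricks the node actually runs, and skip the NFS lines if gluster NFS is disabled:

    firewall-cmd --permanent --add-port=24007-24008/tcp   # glusterd management
    firewall-cmd --permanent --add-port=49152-49155/tcp   # brick processes
    firewall-cmd --permanent --add-port=38465-38468/tcp   # gluster NFS
    firewall-cmd --permanent --add-port=111/tcp --add-port=111/udp --add-port=2049/tcp
    firewall-cmd --reload
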
18:42 JoeJulian Need to get a trace of where that timeout is coming from
18:43 nathwill any advice on a good way to do that? some debug flag on the push-pem command or something?
18:46 ben453 joined #gluster
18:51 bluenemo joined #gluster
18:56 JoeJulian nathwill: Doesn't push-pem just run a bash script at the far end? iirc that's all bash scripts.
18:57 nathwill maybe? i didn't see a bash script in the ps output; when i ran ps when the georepsetup hung it was running create no-verify force
18:57 nathwill and stracing that gluster command, it seems to be looping around connecting to a quotad socket?
18:58 nathwill sec, gisting some stuff
18:58 nathwill https://gist.github.com/nathwill/baeca82990603254d134901cd3b56957
18:58 glusterbot Title: gist:baeca82990603254d134901cd3b56957 · GitHub (at gist.github.com)
18:58 nathwill which is weird, we don't use quotas
18:59 nathwill not sure if that's a red herring
19:01 nathwill i can try to strace the push-pem command directly
19:01 JoeJulian which version is this?
19:01 nathwill 3.7.12
19:01 nathwill on both ends
19:02 wadeholler joined #gluster
19:05 JoeJulian does /var/run/gluster exist?
19:06 nathwill yes
19:06 nathwill we just package-installed from the gluster repos
19:06 nathwill for epel
19:08 nathwill quotad.socket doesn't exist in that dir though
19:08 nathwill a couple uuid.socket and changelog-uuid.socket on the master side
19:09 nathwill ditto on the slaves
19:12 JoeJulian I see a number of log messages that would get logged if your log-levels weren't abnormal.
19:12 JoeJulian Might tell us something.
19:12 JoeJulian I think I would set the client-log-level to debug until we figure this out.
19:13 JoeJulian back in a bit.
19:21 nathwill kk
19:23 bluenemo joined #gluster
19:30 kovshenin joined #gluster
19:33 bluenemo joined #gluster
19:34 nathwill ok, turned on client debug logs: https://gist.githubusercontent.com/nathwill/e6877a274429e244821b5f269b9a470c/raw/17bec833fe9e0ac8358f7f5e1c78885e29e4d29d/gistfile1.txt
19:38 JoeJulian nathwill: is that slave running an oracle kernel?
19:38 nathwill no, just the usual centos kernel
19:39 JoeJulian ENODEV when reading /dev/fuse
19:39 nathwill yeah, i dunno wth that is... it definitely exists
19:39 JoeJulian but maybe that's not what it looks like
19:41 nathwill JoeJulian: does the slave need to be able to initiate connections to the master or anything?
19:41 JoeJulian Nope
19:41 nathwill ok
19:44 kovsheni_ joined #gluster
19:45 JoeJulian Well there's nothing in gluster's source that can create ENODEV.
19:45 jri joined #gluster
19:45 nathwill yeah, i'm looking in https://bugzilla.redhat.com/show_bug.cgi?id=764033, seems weird
19:45 glusterbot Bug 764033: medium, low, ---, csaba, CLOSED NOTABUG, glusterfs-fuse: terminating upon getting ENODEV when reading /dev/fuse
19:45 JoeJulian but... maybe gverify.sh unmounts the volume...
19:47 nathwill would create no-verify skip gverify.sh?
19:47 nathwill i'm thinking somehow not since the georepsetup errors similarly
19:47 nathwill and it uses the no-verify on the geo-rep create
19:48 JoeJulian I'm guessing no because "[2016-06-30 19:27:54.926551] I [fuse-bridge.c:5013:fuse_thread_proc] 0-fuse: unmounting /tmp/gverify.sh.8whivW"
19:49 JoeJulian Interesting. They specifically make the mount busy so it cannot unmount, lazy umount and delete the path so the script can continue operating on the mount but the rest of the os can't see it.
19:50 JoeJulian Oh, ok, they do leave the directory eventually
19:51 JoeJulian That seems dumb.
19:53 nathwill in slave_stats() {...} ?
19:54 JoeJulian do_verify
19:55 JoeJulian nm, that's cmd_master
19:56 JoeJulian https://github.com/gluster/glusterfs/blob/release-3.7/geo-replication/src/gverify.sh#L35-L43
19:56 glusterbot Title: glusterfs/gverify.sh at release-3.7 · gluster/glusterfs · GitHub (at github.com)
19:56 JoeJulian Seems like it would do the same thing without line 35 and would eliminate the need for "-l" from umount.
19:59 nathwill well, i can try commenting if we think it's worth a shot
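
Roughly the pattern being discussed in those lines of gverify.sh, paraphrased rather than quoted; SLAVEHOST and SLAVEVOL are placeholders:

    d=$(mktemp -d)
    glusterfs --volfile-server=$SLAVEHOST --volfile-id=$SLAVEVOL "$d"
    cd "$d"          # keep the mount busy via this shell's cwd
    umount -l "$d"   # lazy unmount: detached from the namespace immediately
    rmdir "$d"       # the path disappears for everyone else...
    df -h .          # ...but this shell can still query the volume through its cwd
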
20:02 robb_nl joined #gluster
20:11 ben453 Does anyone know when 3.8 will be packaged for CentOS?
20:12 JoeJulian ben453: it's in the storage SIG.
20:15 arcolife joined #gluster
20:16 ben453 I tried the steps outlined here: https://wiki.centos.org/SpecialInterestGroup/Storage
20:16 glusterbot Title: SpecialInterestGroup/Storage - CentOS Wiki (at wiki.centos.org)
20:16 ben453 yum install centos-release-gluster
20:16 ben453 and yum install glusterfs-server
20:17 ben453 but they both got me a flavor of 3.7
20:17 JoeJulian hrm
20:19 JoeJulian which centos release?
20:20 JoeJulian Looks like it's only being built for el7
20:22 ben453 I'm on centos 7.2, so I should've found it then
20:23 ben453 Which is running el7
20:40 darylllee you have to setup the rep yourself for 3.8.  It's there, but only 3.7 is installed when using the centos-release-gluster install
20:40 robb_nl joined #gluster
20:44 ben453 darylllee: do I have to make it from source myself?
20:45 ben453 I can do that but I'd rather just yum install it =P
20:47 dnunez joined #gluster
20:48 pwa joined #gluster
20:49 darylllee if you look int  /etc/yum.repos.d you will see CentOS-Gluster-3.7.repo
20:49 darylllee just copy it and change the contents from 3.7 to 3.8; that's probably the easiest way
20:49 darylllee then just do yum update or yum install glusterfs-<whatever>
20:50 darylllee it's all there, it's just that the main storage repo doesn't yet recognize 3.8 as the latest release for whatever reason
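
What that suggestion amounts to, roughly; the exact repo file names and mirror layout may differ between releases:

    cp /etc/yum.repos.d/CentOS-Gluster-3.7.repo /etc/yum.repos.d/CentOS-Gluster-3.8.repo
    sed -i 's/3\.7/3.8/g' /etc/yum.repos.d/CentOS-Gluster-3.8.repo
    yum clean metadata
    yum install glusterfs-server
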
21:00 dnunez joined #gluster
21:28 dnunez joined #gluster
21:38 morsik_ joined #gluster
21:42 klaas- joined #gluster
21:44 deniszh joined #gluster
21:52 caitnop joined #gluster
22:02 nbalacha joined #gluster
22:10 nathwill joined #gluster
22:16 julim joined #gluster
22:31 hagarth joined #gluster
22:31 nathwill JoeJulian: yeah, there must be something weird with that script; if i try to mount the slave volume directly (`mount -t glusterfs ws-gluster0.prod.wdc04:workspaces /mnt`), it works no problem
22:36 DV__ joined #gluster
22:36 Ulrar joined #gluster
22:39 nage joined #gluster
22:44 bowhunter joined #gluster
22:56 F2Knight joined #gluster
22:59 cliluw joined #gluster
23:16 dnunez joined #gluster
23:30 nbalacha joined #gluster
23:46 nathwill i did try commenting the cd and changing the umount -l to a straight umount, but no luck :/
23:46 nathwill i may blow away all the geo-replication stuff and reinstall that package so i can start clean and make sure there's no fallout from the initial issue
