
IRC log for #gluster, 2017-01-20


All times shown according to UTC.

Time Nick Message
00:00 farhorizon joined #gluster
00:07 aleksk [root@lon1ulq8400xen006 ~]# gluster volume set q8400_gfs_vol06 nfs.disable off
00:07 aleksk Error : Request timed out
00:07 aleksk :-/
00:12 aleksk looks like it was just hanging.. responsive now
00:13 aleksk is 3.9.x considered a devel branch or a stable branch? :)
00:20 aleksk now i'm having issues getting nfsd started on the newly reinstalled system
00:21 aleksk keep seeing it spawn repeatedly in the nfs log file:
00:21 aleksk [2017-01-20 00:20:24.158063] I [MSGID: 100030] [glusterfsd.c:2455:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.9.0 (args: /usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S /var/run/gluster/20b588c3f3d980fb52ac79756873966a.socket)
00:22 plarsen joined #gluster
00:23 aleksk and the server rebooted again :-/
00:25 mb_ joined #gluster
00:26 aleksk and the server is not booting back up.. wonderful
01:02 farhorizon joined #gluster
01:07 shdeng joined #gluster
01:08 renout_away joined #gluster
01:17 ashiq joined #gluster
01:31 nbalacha joined #gluster
01:35 mb_ joined #gluster
02:00 ashiq joined #gluster
02:13 mb_ joined #gluster
02:23 derjohn_mobi joined #gluster
02:26 bbooth_ joined #gluster
02:29 Wizek joined #gluster
02:32 nh2_ joined #gluster
02:32 gyadav joined #gluster
02:33 om2 joined #gluster
02:36 daMaestro joined #gluster
02:46 farhorizon joined #gluster
03:02 newdave joined #gluster
03:21 sanoj joined #gluster
03:27 blu__ joined #gluster
03:33 om2 joined #gluster
03:34 nbalacha joined #gluster
03:41 sanoj joined #gluster
03:44 ahino joined #gluster
03:44 magrawal joined #gluster
03:44 kdhananjay joined #gluster
03:45 newdave joined #gluster
03:46 farhorizon joined #gluster
03:46 karthik_us joined #gluster
03:51 kramdoss_ joined #gluster
03:55 atinm_ joined #gluster
03:57 gyadav joined #gluster
03:58 atinm joined #gluster
03:59 buvanesh_kumar joined #gluster
04:00 Lee1092 joined #gluster
04:03 Shu6h3ndu joined #gluster
04:04 nishanth joined #gluster
04:09 RameshN joined #gluster
04:13 gyadav_ joined #gluster
04:15 jbrooks joined #gluster
04:43 apandey joined #gluster
04:46 ppai joined #gluster
04:47 newdave joined #gluster
05:02 rjoseph joined #gluster
05:05 itisravi joined #gluster
05:08 rafi joined #gluster
05:15 midacts joined #gluster
05:21 msvbhat joined #gluster
05:23 prasanth joined #gluster
05:25 sanoj joined #gluster
05:26 skoduri joined #gluster
05:28 mb_ joined #gluster
05:30 armyriad joined #gluster
05:36 armyriad joined #gluster
05:41 riyas joined #gluster
05:41 SlickNik joined #gluster
05:43 sanoj joined #gluster
05:46 sbulage joined #gluster
05:51 mahendratech joined #gluster
05:51 ndarshan joined #gluster
05:57 msvbhat joined #gluster
05:58 ashiq joined #gluster
06:09 susant joined #gluster
06:17 ppai joined #gluster
06:23 jiffin joined #gluster
06:48 ankit_ joined #gluster
06:49 k4n0 joined #gluster
06:53 mb_ joined #gluster
07:04 aravindavk joined #gluster
07:09 squizzi joined #gluster
07:21 mhulsman joined #gluster
07:23 jtux joined #gluster
07:23 mhulsman joined #gluster
07:35 newdave joined #gluster
07:37 ashiq joined #gluster
07:37 msvbhat joined #gluster
07:38 [diablo] joined #gluster
07:40 TFJensen joined #gluster
07:42 ankit__ joined #gluster
07:44 TFJensen Hi everybody, which setup would be best for VM storage? Gluster is on 10G. I have 3 nodes with 4*1TB unused disks for storage. Should I create software RAID on each host for the 4 disks, or replicate each disk/brick over 10G? And how would I set that up?
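(For reference, one common answer to a layout like TFJensen's is to skip local RAID and let gluster pair one disk per node into replica-3 sets, giving a 4 x 3 distributed-replicate volume. The hostnames, volume name and brick paths below are placeholders; this is only a sketch of the CLI, not a recommendation from the channel.)

    # one brick per disk; bricks are grouped in the order listed, so each
    # replica-3 set spans all three nodes
    gluster volume create vmstore replica 3 \
        node1:/bricks/disk1/brick node2:/bricks/disk1/brick node3:/bricks/disk1/brick \
        node1:/bricks/disk2/brick node2:/bricks/disk2/brick node3:/bricks/disk2/brick \
        node1:/bricks/disk3/brick node2:/bricks/disk3/brick node3:/bricks/disk3/brick \
        node1:/bricks/disk4/brick node2:/bricks/disk4/brick node3:/bricks/disk4/brick
    gluster volume start vmstore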
07:45 k4n0 joined #gluster
07:45 yosafbridge` joined #gluster
07:46 rastar joined #gluster
07:46 Bardack_ joined #gluster
07:47 Chinorro_ joined #gluster
07:49 unlaudable joined #gluster
07:50 rideh- joined #gluster
07:50 MadPsy joined #gluster
07:50 MadPsy joined #gluster
07:51 tdasilva joined #gluster
07:55 steveeJ joined #gluster
08:00 zoyvind joined #gluster
08:03 zoyvind Hi. Sorry to bother you guys, but can you give me pointers to info about Gluster realtime HA support. We are hit by the 42s network.ping-timeout in the fuse client during a brick-node downtime/maintenance.
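(The 42s value zoyvind mentions is the network.ping-timeout volume option; it can be lowered for planned maintenance windows, though very low values make clients drop bricks on transient network blips. A minimal sketch, volume name is a placeholder:)

    gluster volume set myvol network.ping-timeout 10
    gluster volume get myvol network.ping-timeout    # verify the new value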
08:06 om2 joined #gluster
08:06 musa22 joined #gluster
08:07 cacasmacas joined #gluster
08:09 Debloper joined #gluster
08:10 pulli joined #gluster
08:15 MikeLupe joined #gluster
08:29 fsimonce joined #gluster
08:30 musa22 joined #gluster
08:35 shortdudey123 joined #gluster
08:36 alezzandro joined #gluster
08:37 ashiq joined #gluster
08:41 jri joined #gluster
08:51 auzty joined #gluster
08:57 Humble joined #gluster
09:02 flying joined #gluster
09:03 ivan_rossi joined #gluster
09:05 Shu6h3ndu joined #gluster
09:05 nishanth joined #gluster
09:08 flying joined #gluster
09:09 social joined #gluster
09:11 Saravanakmr joined #gluster
09:19 pulli joined #gluster
09:20 derjohn_mobi joined #gluster
09:20 apandey joined #gluster
09:21 sona joined #gluster
09:21 Seth_Karlo joined #gluster
09:22 Seth_Karlo joined #gluster
09:25 apandey joined #gluster
09:35 percevalbot joined #gluster
09:44 kotreshhr joined #gluster
09:45 Slashman joined #gluster
09:46 kdhananjay joined #gluster
09:48 skumar joined #gluster
10:19 sbulage joined #gluster
10:19 Vide joined #gluster
10:22 TBlaar joined #gluster
10:23 TBlaar hi people,
10:23 TBlaar not sure if this is the correct place to ask.  I am using the puppetlabs gluster module, we have built 2 servers with a single brick shared between the two successfully
10:24 TBlaar I am trying to mount the same brick on a 3rd server, and need to stick this in the hiera yaml file for the 3rd server
10:24 TBlaar the puppet code for this is:
10:25 TBlaar # gluster::mount { 'data1':
10:25 TBlaar #   ensure    => present,
10:25 TBlaar #   volume    => 'srv1.local:/data1',
10:25 TBlaar #   transport => 'tcp',
10:25 TBlaar #   atboot    => true,
10:25 TBlaar #   dump      => 0,
10:25 TBlaar #   pass      => 0,
10:25 TBlaar # }
10:25 TBlaar I have tried so many times but putting this into yaml just does not work.  Can anyone point me at a correct Yaml format for the above?
10:25 TBlaar ie:
10:26 ivan_rossi left #gluster
10:26 TBlaar this
10:32 mahendratech joined #gluster
10:36 gem joined #gluster
10:36 TBlaar gluster::mount:
10:36 TBlaar storage:
10:36 TBlaar volume:  '12.12.12.12:/export/share/glusterfs'
10:36 TBlaar ensure: mounted
10:36 TBlaar transport: 'tcp'
10:39 bfoster joined #gluster
10:39 riyas_ joined #gluster
10:44 buvanesh_kumar joined #gluster
10:53 ashiq joined #gluster
10:55 musa22 joined #gluster
11:16 ndevos TBlaar: I've never used puppet, but a volume consists of multiple bricks; you mount the bricks on the storage servers, and the volume on the client
11:17 ndevos TBlaar: mounting the volume could be like 'srv1.local:/data1', the volume is then called "data1"
11:17 ndevos TBlaar: but this one looks weird: volume:  '12.12.12.12:/export/share/glusterfs'
11:17 ndevos that could be the path of a brick instead
11:18 TBlaar yes, I have these details, it's just my yaml not working.  I can actually mount it manually but i'm just being dumb about the yaml format to do this with puppet.  thanks
11:18 ndevos if you run "gluster volume info", you should see the volume name clearly, and the bricks it is using
11:18 ndevos ah, ok, I really dont know anything about any of the formats that puppet uses
11:19 TBlaar you see, for me I already have two servers running well.  I just need to mount the glusterfs on a 3rd server
11:20 TBlaar ie, right now, gluster volume info does not work on the 3rd server as gluster is not and should not be on here
11:20 TBlaar I have the gluster-client packages installed
11:21 rastar joined #gluster
11:22 TBlaar gluster volume info
11:22 TBlaar Volume Name: storage
11:22 TBlaar Type: Replicate
11:22 TBlaar Volume ID: 56bdd4f1-2c1e-4def-ad9a-blablabla
11:22 TBlaar Status: Started
11:22 TBlaar Snapshot Count: 0
11:22 TBlaar Number of Bricks: 1 x 2 = 2
11:22 TBlaar Transport-type: tcp
11:22 TBlaar Bricks:
11:22 TBlaar Brick1: server1.local:/export/shared/glusterfs
11:22 TBlaar Brick2: server2.local:/export/shared/glusterfs
11:22 TBlaar Options Reconfigured:
11:22 TBlaar nfs.disable: on
11:22 TBlaar performance.readdir-ahead: on
11:23 ndevos ok, so on the 3rd server you would manually mount it with <hostname>:/<volume> -> mount -t glusterfs server1.local:/storage /mnt
11:24 nishanth joined #gluster
11:24 ndevos now, "storage" is the name of the volume, and I expect that should be the value for puppet too.. or maybe "server1.local:/storage"
11:27 hgowtham joined #gluster
11:28 jkroon joined #gluster
11:28 pulli joined #gluster
11:41 cloph yeah, using the brick name shows that you did not have that detail down... as for the yaml: try quoting the key ('gluster::mount': …)
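(Putting ndevos' and cloph's points together, hiera data for a defined type like gluster::mount needs a resource-title level under the key, and the value should be the volume, not a brick path. Whether the module picks this hash up automatically or needs a create_resources() wrapper in a profile class depends on the module version, so treat this as a sketch; keys mirror TBlaar's puppet snippet above:)

    'gluster::mount':
      'storage':
        ensure: mounted
        volume: 'server1.local:/storage'
        transport: 'tcp'
        atboot: true
        dump: 0
        pass: 0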
11:42 percevalbot joined #gluster
11:43 victori joined #gluster
11:44 kotreshhr left #gluster
11:53 ankit_ joined #gluster
11:54 Chewi joined #gluster
11:56 Chewi hello. I'm a bit concerned about security regarding opening ports for gluster to use. I had assumed that geo-replication operated purely over ssh and was very surprised to find that's not the case. the manual just says "open the firewalls" without acknowledging that you really wouldn't want to over the internet. there are VPNs of course, but there really should be more info on what these ports are for and how safe they are to open.
11:58 percevalbot joined #gluster
11:59 nishanth joined #gluster
12:03 apandey joined #gluster
12:04 percevalbot joined #gluster
12:05 backupguru joined #gluster
12:08 rjoseph joined #gluster
12:10 alvinstarr joined #gluster
12:10 nishanth joined #gluster
12:19 jkroon Chewi, don't open them for the world.
12:19 jkroon open them such that only your two machines can communicate on them.
12:19 jkroon if you're concerned about sniffers listening in, ipsec.
12:19 Chewi jkroon: well yeah, I know how firewalls work but it's still not great. we're operating in a PCI-DSS environment.
12:21 jkroon ipsec at least tunnels it so you should be fine, especially in combination with proper ACLs at the firewall level.  not sure "what's not great" that you're worried about.
12:21 ndevos Chewi: I'm pretty sure geo-replication was possible over SSH only, I'm not aware that it changed...
12:22 ndevos aravindavk: is it not possible anymore to do geo-replication only over SSH?
12:23 Chewi ndevos: I've used geo-replication in the past but in a lower security environment so I wouldn't have noticed. now trying to set up 2 test nodes that can only see each other via SSH and saw it was trying port 24007. I also saw a couple of mailing list posts asking about this but not much detail was given in the replies.
12:24 aravindavk ndevos: Georep only uses SSH. I think it uses port 24007 to get Slave Volume info. That also can be changed to use SSH
12:24 Chewi aravindavk: yeah? that sounds good.
12:25 ndevos aravindavk: great, you don't happen to have a link to the docs for that?
12:25 Shu6h3ndu joined #gluster
12:26 aravindavk ndevos: that behaviour was introduced with distributed georep (I think Gluster 3.6); the docs don't have this info.
12:27 aravindavk ndevos: Chewi I will open a RFE to implement the volinfo fetch over SSH
12:28 ndevos aravindavk: hmm, ok, so that means at the moment geo-replication really wants to use port 24007?
12:29 Chewi aravindavk: I'm confused, is that implemented now or not?
12:31 Chewi aravindavk: if it's not but it's something it only does during the "create" step then I can live with that for now. or is it needed all the time?
12:31 flying joined #gluster
12:31 aravindavk ndevos,Chewi: during worker start, each worker(one worker per master brick) wants to know a Slave node to connect to. Monitor process gets Slave nodes information by running "gluster volume info --remote-host=slave_primary_host". This CLI contacts remote/slave glusterd using 24007 port
12:31 Chewi I see
12:31 aravindavk Chewi: it is during Create and during worker start
12:33 Chewi aravindavk: so you meant that it didn't use this port before 3.6 but it could be changed to just use SSH later?
12:33 xavih joined #gluster
12:36 aravindavk Chewi: It can be changed to use only SSH. But these remote queries are limited to read-only commands
12:37 Chewi aravindavk: good to know, that was my next question :)
12:37 Chewi I have to go for a bit but thanks, that was very helpful
12:46 backupguru left #gluster
12:47 kettlewell joined #gluster
12:48 flomko hi all! i'm using nfs-ganesha as nfs server (several nodes). how can i make the nfs connection redundant? /etc/fstab has: nfs-ganesha1:/gluster  /gluster     nfs       defaults,_netdev      0       0 — can i add a backup volume (nfs-ganesha2) that the client tries to connect to in case node nfs-ganesha1 is out of service?
12:48 flomko ha-nfs, it would be pretty good
12:57 ankit_ joined #gluster
12:58 cloph isn't the idea with ha-ganesha that you'd use a private IP that is switched automatically depending on what actual servers are up?
13:00 nh2_ joined #gluster
13:00 ira joined #gluster
13:02 jri joined #gluster
13:11 ankit_ joined #gluster
13:22 victori joined #gluster
13:27 musa22 joined #gluster
13:29 unclemarc joined #gluster
13:45 shyam joined #gluster
13:45 alezzandro joined #gluster
13:46 nh2_ joined #gluster
13:48 skylar joined #gluster
14:09 squizzi joined #gluster
14:11 nh2_ joined #gluster
14:12 kpease joined #gluster
14:13 rastar joined #gluster
14:13 vbellur joined #gluster
14:16 Norky joined #gluster
14:22 pdrakeweb joined #gluster
14:33 MikeLupe Hi - I've de-sync'd my oVirt r3a1 (clusterssh yum update on all nodes...don't ask) and I launched "gluster volume heal engine full". Volumes are up, peers ok; "gluster volume heal engine info" shows "Number of entries: 1" for each of the 3 nodes' bricks. This is what I get for the last 4 hours. Is that normal on a 50GB volume?
14:33 MikeLupe Is there another way to check the healing's status? Thanks a lot, I'd like to get my oVirt cluster back after 2 days ;)
14:38 kotreshhr joined #gluster
14:40 k4n0 joined #gluster
14:40 rastar joined #gluster
14:42 saali joined #gluster
14:44 nh2_ joined #gluster
14:51 bbooth joined #gluster
14:56 kotreshhr left #gluster
14:58 XpineX joined #gluster
15:01 plarsen joined #gluster
15:09 nishanth joined #gluster
15:12 ppai joined #gluster
15:14 bluenemo joined #gluster
15:20 Gambit15 joined #gluster
15:23 nh2_ joined #gluster
15:28 farhorizon joined #gluster
15:35 bbooth joined #gluster
15:36 MikeLupe Is there a way to check the healing's status other than "gluster volume heal engine info" ?
15:45 bbooth joined #gluster
15:45 aleksk having issue getting this nfs service working
15:45 skoduri joined #gluster
15:45 aleksk actually having an issue with my host getting processes OOM-killed and hanging the box
15:45 aleksk what's the minimum memory requirements for gluster 3.9?
15:51 sona joined #gluster
15:51 guhcampos joined #gluster
16:00 nh2_ joined #gluster
16:02 ira joined #gluster
16:03 vbellur joined #gluster
16:06 nh2_ joined #gluster
16:07 wushudoin joined #gluster
16:08 wushudoin joined #gluster
16:11 arpu joined #gluster
16:17 RameshN joined #gluster
16:21 shyam joined #gluster
16:23 bowhunter joined #gluster
16:24 bbooth joined #gluster
16:26 MidlandTroy joined #gluster
16:38 musa22 joined #gluster
16:42 RameshN joined #gluster
16:42 jri joined #gluster
16:44 susant joined #gluster
16:49 shyam joined #gluster
16:49 nishanth joined #gluster
16:51 jdossey joined #gluster
16:58 vbellur joined #gluster
17:24 aleksk any ideas why nfs keeps trying to spawn repeatedly a few times before giving up? i am using gluster 3.9 with the built in nfs services
17:25 kkeithley what's in the log? /var/log/glusterfs/nfs.log
17:26 aleksk kept seeing this repeatedly: Started running /usr/sbin/glusterfs version 3.9.0 (args: /usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S /var/run/gluster/20b588c3f3d980fb52ac79756873966a.socket)
17:26 aleksk like 9 times 2 seconds between
17:29 aleksk seems to have started eventually
17:30 aleksk connection to 10.169.16.101:49154 failed (No route to host)
17:30 aleksk weird.. i can ping that IP just fine
17:30 Chewi left #gluster
17:31 aleksk might be a firewall thing
17:32 aleksk weird.. the brick process on that host is using port 49154 instead of 49152 like every other system..
17:34 kkeithley works for me with glusterfs-3.9.1 on f25.  and using port 49152
17:34 aleksk doesn't look like 49152 is in use by another process.. not sure why on this one host it is using a different port.. there a way to control the port number/force it to be like the other systems?
17:35 kkeithley it should use 49152 by default.
17:36 aleksk that's what i'm seeing on the other 9 systems.. this one tho is using 49154/49155 for some reason
17:36 kkeithley (,,paste) /var/log/glusterfs/nfs.log
17:37 kkeithley bah
17:37 kkeithley @paste
17:37 glusterbot kkeithley: For a simple way to paste output, install netcat (if it's not already) and pipe your output like: | nc termbin.com 9999
17:37 rastar joined #gluster
17:37 aleksk the nfs services are now working..
17:37 aleksk just that i notice this other set of messages in that log file
17:39 aleksk which led me to realize the brick process on an unrelated host is using a different port than the other systems
17:39 aleksk the host is working otherwise fine but i don't like having a special snowflake in the cluster :-/
17:40 aleksk i'm in the middle of transferring a large file over nfs so i don't want to restart the process on that particular host
17:40 aleksk speaking of which, is there a recommended way to make the cluster HA (i'm currently pointing my nfs client at one particular node in the cluster).. something that involves VIPs that can float around?
17:41 wkf joined #gluster
17:42 kkeithley people use CTDB for gnfs HA.  3.7 and later uses nfs-ganesha and pacemaker+corosync.
17:43 kkeithley docs are at http://gluster.readthedocs.io/en/latest/
17:43 glusterbot Title: Gluster Docs (at gluster.readthedocs.io)
17:43 aleksk is it not possible to do with the built-in nfs services? i'd rather avoid configuring a different nfs implementation (ganesha)
17:47 kkeithley to do what?
17:48 kkeithley use pacemaker+corosync with gnfs? Sure. You'll have to write it yourself.
17:49 vbellur joined #gluster
17:49 aleksk to do HA without ganesha but with the built-in NFS services
17:49 shyam joined #gluster
17:50 kkeithley yes, people use CTDB. Documentation is at http://gluster.readthedocs.io/en/latest/
17:50 glusterbot Title: Gluster Docs (at gluster.readthedocs.io)
17:53 sona joined #gluster
17:53 aleksk CTDB looks like it's designed for samba and happens to support NFS.. if i have no interest in running CIFS/samba, can CTDB do ONLY nfs?
17:54 aleksk Results for CTDB
17:54 aleksk No results found. Bummer.
17:56 JoeJulian aleksk: because you've added (and probably removed) two bricks from that server, making this one use the third port.
17:56 JoeJulian aleksk: That's why it's using 49154
17:56 JoeJulian @ports
17:56 glusterbot JoeJulian: glusterd's management port is 24007/tcp (also 24008/tcp if you use rdma). Bricks (glusterfsd) use 49152 & up. All ports must be reachable by both servers and clients. Additionally it will listen on 38465-38468/tcp for NFS. NFS also depends on rpcbind/portmap ports 111 and 2049.
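(As a rough illustration of glusterbot's port factoid on a firewalld-based host — the brick-port range here is an assumption sized for a handful of bricks; widen it to match your brick count:)

    firewall-cmd --permanent --add-port=24007-24008/tcp    # glusterd management (+rdma)
    firewall-cmd --permanent --add-port=49152-49160/tcp    # brick (glusterfsd) ports
    firewall-cmd --permanent --add-port=38465-38468/tcp    # gluster NFS
    firewall-cmd --permanent --add-port=111/tcp --add-port=111/udp --add-port=2049/tcp   # rpcbind/portmap + NFS
    firewall-cmd --reload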
17:57 JoeJulian "is there a recommended way to make the cluster HA?" Yes, fuse mounts are HA by design.
17:57 kkeithley no, you don't have to run Samba to run CTDB.  I searched for "linux gluster nfs ctdb" and found several relevant documents
17:58 kkeithley he wants HA for gnfs
17:59 JoeJulian It's good to want things. Builds character.
17:59 aleksk :) lol
17:59 kkeithley I just want egg in my bear
17:59 kkeithley beer
18:00 aleksk weird.. the search seems broken for me.. i tried searching for 'linux gluster nfs ctdb' as well and it's again saying no results
18:00 JoeJulian Yeah, no eggs in bears. They'll attack students in schools requiring kids to be armed... <smh>
18:00 aleksk i'm using the search box at the top left of the latest docs page
18:00 kkeithley google search
18:01 JoeJulian (Never use docs search for anything. The service that's hosting them can't seem to figure out how to do search.)
18:01 aleksk re 'fuse mounts are HA by design'.. so even tho you're specifying a single node in the cluster on the command line, if that node goes away/crashes.. the mount will know to use other nodes in the cluster (if the volume is created w/ resiliency, e.g. replication, dispersed, etc)
18:01 JoeJulian ~mount command | aleksk
18:01 glusterbot JoeJulian: Error: No factoid matches that key.
18:02 JoeJulian <sigh>
18:02 JoeJulian ~mount server | aleksk
18:02 glusterbot aleksk: (#1) The server specified is only used to retrieve the client volume definition. Once connected, the client connects to all the servers in the volume. See also @rrdns, or (#2) One caveat is that the clients never learn of any other management peers. If the client cannot communicate with the mount server, that client will not learn of any volume changes.
18:02 JoeJulian I think I need more coffee
18:02 aleksk mount -t glusterfs node1:/share_name <-- points explicitly to node1 no ?
18:02 glusterbot aleksk: <'s karma is now -26
18:02 aleksk got it
18:02 aleksk i figure i might need to use rrdns as a last resort..
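(Besides rrdns, the fuse client can also be handed fallback volfile servers at mount time; the option spelling has varied across releases (backupvolfile-server in older versions, backup-volfile-servers later), so check your version — a sketch with placeholder hostnames:)

    mount -t glusterfs -o backup-volfile-servers=node2:node3 node1:/share_name /mnt/share
    # fstab equivalent:
    # node1:/share_name  /mnt/share  glusterfs  defaults,_netdev,backup-volfile-servers=node2:node3  0 0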
18:03 aleksk unfortunately the app i'm using only supports nfs/iscsi (xenserver).. wondering if i can multipath a iscsi target at least
18:03 aleksk only problem with that is it requires a block device to export over iscsi which means i have to start making sparsefile 'block' files
18:04 aleksk how's the karma system on this channel work ?
18:05 kkeithley aleksk++
18:05 glusterbot kkeithley: aleksk's karma is now 1
18:05 JoeJulian If you don't need migratable locks, you can just use any of a myriad of vip services to float ips that are used only for nfs.
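(One example of such a vip service is keepalived; a minimal VRRP instance floating an address used only for gnfs might look like this — interface, router id and address are placeholders, and this is a sketch rather than a tested config:)

    vrrp_instance gnfs_vip {
        state BACKUP
        interface eth0
        virtual_router_id 51
        priority 100
        advert_int 1
        virtual_ipaddress {
            192.0.2.100/24
        }
    }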
18:07 nh2_ joined #gluster
18:11 kkeithley and if you do need migratable locks, you're out of luck.
18:11 kkeithley with gnfs anyway
18:16 ashiq joined #gluster
18:20 jbrooks joined #gluster
18:26 MikeLupe hello - is there another way to check r3a1 healing than "gluster volume heal <volum> info" ?
18:27 MikeLupe I've got on the 3 nodes "Number of entries: 1" for a long time
18:29 sona joined #gluster
18:31 vbellur joined #gluster
18:31 shyam joined #gluster
18:32 gnulnx joined #gluster
18:38 JoeJulian MikeLupe: You can check the state dump. Look for self-heal locks.
18:39 JoeJulian The only self-heal entries I've seen get stuck are directories.
18:39 MikeLupe Hi JoeJulian , still here :) How do I do that, sry
18:39 MikeLupe It's like 7 month I used my gluster "skills" ..
18:40 JoeJulian "pkill -USR1 glusterfsd" on a server, or use the state-dump gluster command (see gluster help).
18:41 JoeJulian If it's a directory, clearing the trusted.afr.* ,,(extended attributes) seems to work.
18:41 glusterbot (#1) To read the extended attributes on the server: getfattr -m .  -d -e hex {filename}, or (#2) For more information on how GlusterFS uses extended attributes, see this article: http://pl.atyp.us/hekafs.org/index.php/2011/04/glusterfs-extended-attributes/
18:42 musa22 joined #gluster
18:43 MikeLupe urgh. Is this normal, after a "gluster volume heal engine info healed": "Gathering list of healed entries on volume engine has been unsuccessful on bricks that are down. Please check if all brick processes are running." ?
18:44 farhorizon joined #gluster
18:45 MikeLupe JoeJulian: "pkill .." gave no result at all
18:47 gluytium joined #gluster
18:49 JoeJulian it dumps it in /var/run/gluster
18:49 farhorizon joined #gluster
18:50 farhorizon joined #gluster
18:58 MikeLupe Oh man - there's a lot in those dumps
18:58 MikeLupe yes thanks, I found gluster --print-statedumpdir
18:58 MikeLupe :)
18:58 JoeJulian Ah yeah, forgot about that one.
18:59 aleksk i got that same error '..check if all brick processes are running' (they are but this error message seems erroneous)
18:59 farhoriz_ joined #gluster
19:00 JoeJulian It is. I've been seeing that more with 3.8. Not sure why yet.
19:01 aleksk i haven't tried it when i was running 3.7 but i'm running 3.9 now
19:01 MikeLupe 3.7.19 here
19:01 MikeLupe one of the dumps got this:
19:01 MikeLupe [xlator.features.locks.data-locks.inode]
19:01 MikeLupe path=<gfid:ed64ae1c-0536-4634-bd13-0be94f2ff84a>
19:01 MikeLupe mandatory=0
19:01 MikeLupe conn.1.id=node01.domain.tld-15864-2017/01/20-17:22:16:633739-data-client-0-0-0
19:01 MikeLupe conn.1.ref=1
19:01 MikeLupe conn.1.bound_xl=/gluster/data/brick1
19:01 MikeLupe conn.2.id=node03.domain.tld-22745-2017/01/20-17:24:23:860237-data-client-0-0-0
19:01 MikeLupe conn.2.ref=1
19:01 MikeLupe conn.2.bound_xl=/gluster/data/brick1
19:01 MikeLupe conn.3.id=node01.domain.tld-15852-2017/01/20-17:22:15:626528-data-client-0-0-0
19:01 MikeLupe conn.3.ref=1
19:01 MikeLupe conn.3.bound_xl=/gluster/data/brick1
19:01 MikeLupe conn.4.id=node03.domain.tld-22736-2017/01/20-17:24:22:855060-data-client-0-0-0
19:01 MikeLupe conn.4.ref=1
19:01 MikeLupe yep sorry for flooding
19:03 JoeJulian MikeLupe: You're only looking for "self-heal"
19:03 MikeLupe aleksk: yes, mine are all running aswell
19:04 MikeLupe JoeJulian: well...I don't even know what I'm looking for. I "crashed" my oVirt cluster when doing a great "clusterssh yum update" on all nodes (don't ask me about my state of mind when I did that)
19:05 JoeJulian ouch, split-brain is the most-likely result.
19:05 MikeLupe First rule anyone even in kindergarten learns for clusters - one node at a time.
19:06 MikeLupe I've got no split-brain according to "gluster volume heal engine info split-brain"
19:06 JoeJulian I would inspect whatever that one entry is on all bricks.
19:07 k4n0 joined #gluster
19:07 cliluw joined #gluster
19:07 MikeLupe I've unfortunately got no idea how to identify that entry
19:08 MikeLupe I've got a GFID, but wth and how should I use it?
19:09 JoeJulian gfid's are hardlinks (or symlinks if directories) under $brick_root/.glusterfs/XX/YY/XXYYZZZZ-ZZZZ-....
19:09 JoeJulian https://joejulian.name/blog/what-is-this-new-glusterfs-directory-in-33/
19:09 glusterbot Title: What is this new .glusterfs directory in 3.3? (at joejulian.name)
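(Putting JoeJulian's description together with the gfid from the statedump above, resolving a gfid to the real file on a brick looks roughly like this; $brick_root stands for the brick's root directory and the gfid is just the one from the dump, used as an example:)

    gfid=ed64ae1c-0536-4634-bd13-0be94f2ff84a
    # the first two and next two characters of the gfid give the XX/YY subdirectories
    ls -li $brick_root/.glusterfs/${gfid:0:2}/${gfid:2:2}/$gfid      # shows link count and inode
    # list every path hardlinked to it, i.e. the "real" filename(s) on the brick
    find $brick_root -samefile $brick_root/.glusterfs/${gfid:0:2}/${gfid:2:2}/$gfid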
19:10 Acinonyx joined #gluster
19:17 MikeLupe JoeJulian: how do I identify which "XX" folder is used? or should I simply search for the GFID through the .glusterfs folder?
19:18 MikeLupe It's in your link. :)
19:18 JoeJulian :)
19:18 MikeLupe sorry about being dilletantic
19:19 MikeLupe ok - it's a file
19:19 * JoeJulian googles dilletantic.
19:20 JoeJulian It's not often I get to do that, thanks. :)
19:20 MikeLupe hahaaa
19:20 MikeLupe I'm an easy target ;)
19:28 mhulsman joined #gluster
19:31 snehring I have a distributed-disperse volume. Is the replace-brick command still the recommended way to replace a completely failed brick? If so, is there an easy way to do the replace so the brick has the same name as the old brick?
19:32 MikeLupe JoeJulian: 512MB is standard I guess. I'm honestly too dilletantish (googled it as well) to find out what to do with that one. Can you give me a hint? Nice article bt
19:32 MikeLupe +w
19:33 jkroon joined #gluster
19:33 JoeJulian snehring: I just kill the glusterfsd for that brick, replace it, then start...force the volume which re-adds the volume-id and starts glusterfsd for that brick.
19:34 snehring JoeJulian: Ah, didn't think to try to force start the volume
19:34 snehring JoeJulian: thanks
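(Spelled out, JoeJulian's kill/replace/force-start approach for a dead brick is roughly the following; volume name, brick path and pid are placeholders, and this only restates what he describes above, not an official procedure:)

    gluster volume status myvol            # note the PID of the failed brick's glusterfsd
    kill <brick-pid>                       # if it is still running at all
    # replace the disk and recreate/mount the brick directory at the same path, then:
    gluster volume start myvol force       # re-adds the volume-id xattr and restarts that brick
    gluster volume heal myvol full         # optionally trigger a full heal onto the fresh brick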
19:39 jbrooks joined #gluster
19:43 ZiemowitP joined #gluster
19:47 ZiemowitP hey, i have a 4 node gluster with distributed/replicated volumes for virtual machine storage.  When one of the gluster nodes gets shut down for a few minutes of maintenance, i noticed that the VMs fail and need to be restarted.  Isn't distributed/replicated supposed to be highly available?
19:53 MikeLupe JoeJulian: I'm intellectually too limited to know what to do with that specific 512MB file. Can you help?
19:54 JoeJulian MikeLupe: So you have found the gfid file. ls -li to get the inode. find $brick_root -inum $inode_number
19:54 JoeJulian actually, first...
19:55 JoeJulian If the link count is only 1, just delete it.
19:57 MikeLupe I don't have a link count.. the file is not linked, what do I not get?
19:59 MikeLupe I've got .19 in that inode folder
19:59 MikeLupe so count 19
19:59 mhulsman joined #gluster
20:00 cacasmacas joined #gluster
20:02 MikeLupe "/gluster/engine/brick1/.shard/e4acafe2​-b283-44a1-97b7-bb614e88217b.[1]-[19]"
20:03 JoeJulian You get a link count with a simple ls -l
20:03 JoeJulian The number right after the permissions is the link count.
20:04 MikeLupe between rights and user/group is the count?
20:05 JoeJulian yep
20:06 MikeLupe oh man
20:06 MikeLupe you're killing my weekend ;)
20:06 JoeJulian lol
20:06 jdossey joined #gluster
20:06 MikeLupe 2 is 1 too much
20:06 MikeLupe count 2
20:07 JoeJulian 2 or more is correct.
20:07 JoeJulian Could be more if you hardlink the file to another filename.
20:08 MikeLupe so on the .glusterfs file count 2, and in the folder found by its inode there are 19 files
20:10 MikeLupe that gives like ~10GB used in that ovirt-engine volume...it's ok
20:12 MikeLupe And the find pointed me to the .3 file, I can finally see it
20:13 MikeLupe I didn't see it previously... what can I do with that .3 ?
20:20 gluytium joined #gluster
20:21 MikeLupe JoeJulian: I almost found the holy grail, please stay with me :)
20:21 zakharovvi[m] joined #gluster
20:23 JoeJulian well, the first step is to just stat the file through a client mount and see if it heals itself.
20:23 JoeJulian If not, then I start looking at ,,(extended attributes)
20:23 glusterbot (#1) To read the extended attributes on the server: getfattr -m .  -d -e hex {filename}, or (#2) For more information on how GlusterFS uses extended attributes, see this article: http://pl.atyp.us/hekafs.org/index.php/2011/04/glusterfs-extended-attributes/
20:30 MikeLupe Like last year, I'm challenged again :)
20:30 JoeJulian :)
20:31 JoeJulian My goal is to give you the tools to help yourself. It's hard when it only breaks once a year.
20:31 MikeLupe I know, I know, don't worry.
20:32 MikeLupe But it's *argh* challenging
20:32 MikeLupe The most challenging is documenting now what I'm doing ;)
20:32 JoeJulian +1
20:33 JoeJulian If you can document that publicly, you get extra kudos and somebody, somewhere, will buy you a beer.
20:33 MikeLupe trusted.afr.engine-client-1=0x000000010000000000000000
20:33 MikeLupe trusted.afr.engine-client-2=0x000000000000000000000000
20:33 MikeLupe that sounds empty - wth
20:34 farhorizon joined #gluster
20:35 gluytium joined #gluster
20:35 MikeLupe Haha, well - you know how it is, my "documentation" is right now totally chaotic. Cleaning that up is the pain...
20:36 JoeJulian So the first one shows a metadata change that needs healed. Why it hasn't done so I have no idea.
20:36 MikeLupe ?? where the heck do you read that?
20:37 JoeJulian "see this article" ^^
20:37 MikeLupe ahh ok
20:37 MikeLupe I just saw the bit
20:44 victori joined #gluster
20:48 MikeLupe But it's too, errm "comlicated" for me to see what to do next.
20:48 MikeLupe +p
20:50 MikeLupe Even the word "complicated" is too "comlicated" for me right now
20:52 rbartl joined #gluster
20:55 rbartl joined #gluster
21:08 MikeLupe JoeJulian: Btw, I got distracted by your blog and saw you already used OpenStack and Puppet 5 years ago. Where could oVirt have advantages over OpenStack?
21:09 misc it is a lot easier to use
21:09 musa22_ joined #gluster
21:09 misc openstack is for much bigger infra than ovirt
21:10 MikeLupe that's what I thought yeah
21:10 misc and thus come with some complexity
21:10 misc now, openstack is the sexy stuff, while ovirt is just plain old virt :)
21:11 MikeLupe well, when you come from Hyper-V everything is sexy ;)
21:11 JoeJulian And I couldn't get oVirt to work over the span of a 3 day weekend.
21:12 MikeLupe 3 days?? It took me at least 3 weeks
21:12 MikeLupe because of gluster ;)
21:12 JoeJulian Some service that it uses to run things as root kept leaking all my memory.
21:12 MikeLupe or better: because of my "skills"
21:12 MikeLupe yes I know
21:12 MikeLupe was surprised as well
21:13 JoeJulian ... and it runs a long-running service to execute things as root. wtf?
21:13 MikeLupe Still better than Hyper-V, trust me
21:14 JoeJulian Having just said that, I am now thinking about glusterd and its attack vector as a long-running root process...
21:15 ashiq joined #gluster
21:18 MikeLupe Are you messing with our trust in glusterfs?
21:18 JoeJulian I'm a user just like you. My concern is for my systems.
21:18 JoeJulian I will, however, file a bug report.
21:18 glusterbot https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS
21:18 MikeLupe :)
21:19 misc JoeJulian: our sec team did a few audit of gluster, but yeah
21:20 JoeJulian How many times have I thought I tested every possibility only to find out later...
21:20 misc I am not sure we can work around that without a rearchitecture of gluster
21:20 misc like, having a separate process listening to network without privileges and one doing the stuff as root
21:21 misc (but then, you add complexity of IPC)
21:21 JoeJulian Don't even need root is uid/gid was in xattrs and managed by a translator.
21:21 JoeJulian s/is/if/
21:21 glusterbot What JoeJulian meant to say was: Don't even need root if uid/gid was in xattrs and managed by a translator.
21:22 JoeJulian For now, run it in a rkt container (not docker or nspawn as neither uses cgroups v2).
21:25 misc I think there is plan to run gluster in container
21:26 misc but rkt can also run docker image, iirc
21:27 JoeJulian yes
21:27 JoeJulian And you can do the whole thing container-agnostically through kubernetes.
21:29 JoeJulian (thanks to CRI - which I'm looking at right now to see about implementing one for Joyent).
21:29 B21956 joined #gluster
21:32 farhorizon joined #gluster
21:34 Wizek joined #gluster
21:37 squizzi joined #gluster
21:39 Marbug_ joined #gluster
21:41 MikeLupe JoeJulian: May I get back and ask how I could proceed with that trusted.afr.engine-client-1=0x000000010000000000000000?
21:42 JoeJulian Check its replica and see what it has for those trusted.afr attributes.
21:46 MikeLupe trusted.afr.engine-client-1=0x000000010000000000000000
21:46 MikeLupe trusted.afr.engine-client-0=0x000000030000000000000000
21:47 MikeLupe 1st being on node01, second on node02
21:47 MikeLupe engine-client-2 is on both 0, so ok
21:48 MikeLupe a mess
21:48 MikeLupe I ignored here the 3rd node arbiter
21:49 squizzi joined #gluster
21:50 JoeJulian Pick one to leave alone (trusted.afr.engine-client-0, for instance) and change all the rest to 0x0 on all three replicas.
21:55 MikeLupe Following command should be ok? :
21:55 MikeLupe On node-02: setfattr -n trusted.afr.engine-client-0 -v 0x000000000000000000000000 /gluster/engine/brick1/.shard/e4acafe2-b283-44a1-97b7-bb614e88217b.3
21:56 JoeJulian Yes and no.
21:56 MikeLupe Well, on both nodes
21:56 JoeJulian which trusted.afr are you keeping?
21:57 JoeJulian You want one of them to stay non-zero so self-heal will do its thing.
21:57 MikeLupe node-01, so client-0
21:57 MikeLupe ah ok
21:57 MikeLupe So I set all to 0 but client-0 ?
21:57 JoeJulian So don't reset client-0 but reset client-1 and client-2 everywhere they exist.
21:57 MikeLupe ok
22:05 MikeLupe Before I do it, I'll show you the corrected getfattr of that file in order node01 node02 node03. I wonder which I should choose, given there's one with "3"..
22:06 MikeLupe trusted.afr.engine-client-1=0x000000010000000000000000
22:06 MikeLupe trusted.afr.engine-client-2=0x000000000000000000000000
22:06 MikeLupe trusted.afr.engine-client-0=0x000000030000000000000000
22:06 MikeLupe trusted.afr.engine-client-2=0x000000000000000000000000
22:06 MikeLupe trusted.afr.engine-client-0=0x000000030000000000000000
22:06 MikeLupe trusted.afr.engine-client-1=0x000000000000000000000000
22:06 JoeJulian Who knows. You could check the sha1sum of both to see if they are the same. If they're not, though, it's a guess which one has the data you don't want to lose.
22:07 MikeLupe lol - oh c'mon
22:07 MikeLupe :)
22:07 MikeLupe I'll go for the first as you said
22:07 JoeJulian My usual expectation with vm images is that the most commonly written thing is logs, so it's no great loss.
22:07 DV__ joined #gluster
22:07 MikeLupe ok
22:13 MikeLupe JoeJulian: So basically, if I want to keep client-0 with the *3*, I should only reset on node-01 (client-9) the client-1 value to 0
22:13 JoeJulian correct
22:13 MikeLupe I meant (client-0) - oh man sry
22:13 MikeLupe and the trigger the healing again?
22:13 JoeJulian keep client-0 with a number. Reset client-1 and client-2 (where they exist)
22:14 JoeJulian wait
22:14 JoeJulian now you've got me discombobulated.
22:14 farhorizon joined #gluster
22:14 MikeLupe I know, sorry, me too. I try again , wait for me
22:14 DV__ joined #gluster
22:14 JoeJulian number means write pending for X. So to keep client-0, reset client-0 everywhere.
22:15 MikeLupe To set on Node-01 (client-0): value for client-1 ("1") to 0
22:15 JoeJulian Then client-0 will be healed to any with a pending status.
22:15 cholcombe anyone have advice for a recommended ssl key size for io path encryption?
22:16 JoeJulian With the current administration? Nothing less than 4k.
22:16 cholcombe JoeJulian, :D
22:16 MikeLupe lol
22:16 cholcombe yeah i was thinking 4K but i wasn't sure if i should go higher
22:17 JoeJulian Compare latency and balance that with liability cost.
22:17 cholcombe indeed
22:17 JoeJulian I suspect 4k is the sweet spot, but I don't know your liability issues.
22:18 cholcombe JoeJulian, i don't either.  i'm building some code to automate the deployment of gluster
22:18 cholcombe just giving people options is all
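(For context on cholcombe's question: gluster's TLS support reads its key material from fixed paths, and generating a 4096-bit key pair for it might look like the following. The paths are the usual defaults and the volume option names are given as commonly documented; treat the whole thing as a hedged sketch and concatenate all peers'/clients' certs into the .ca file in a real deployment:)

    openssl genrsa -out /etc/ssl/glusterfs.key 4096
    openssl req -new -x509 -days 365 -key /etc/ssl/glusterfs.key \
        -subj "/CN=$(hostname -f)" -out /etc/ssl/glusterfs.pem
    cat /etc/ssl/glusterfs.pem > /etc/ssl/glusterfs.ca     # single-node test only
    gluster volume set myvol client.ssl on                 # encrypt the I/O path
    gluster volume set myvol server.ssl on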
22:20 MikeLupe JoeJulian: I set the values for client-0 on node02 & node03. Should I retrigger healing?
22:21 JoeJulian yes
22:21 MikeLupe full or normal?
22:22 DV__ joined #gluster
22:22 JoeJulian normal should be fine
22:23 MikeLupe omg - "Number of entries: 0"
22:23 MikeLupe It's too scary
22:24 JoeJulian excellent
22:25 MikeLupe I'm watching my "gluster volume heal engine info" loop and saw the Number go back to 1 once. Is that normal?
22:25 MikeLupe and back to 0 immediatly
22:25 JoeJulian Yeah, it's probably files being actively written to.
22:25 MikeLupe oky
22:25 JoeJulian They're "dirty" for a moment.
22:26 MikeLupe Can't be possible you got my cluster back to work
22:26 MikeLupe That would mean I owe you the next sixpack
22:26 JoeJulian I just pointed you in the right direction. you did all the hard part.
22:26 MikeLupe Please, don't say that again. You (again) did the work
22:27 MikeLupe Shall I wait a bit, or give the hosted-engine start a go?
22:27 JoeJulian If you want to pay your gratitude forward, I offer a link on my blog in which you can donate to open-licensed cancer research.
22:27 JoeJulian Crank it up.
22:27 MikeLupe I saw that, good idea
22:35 MikeLupe oVirt running
22:35 JoeJulian NO WAIT!!!
22:35 MikeLupe ...
22:35 JoeJulian Just kidding...
22:35 MikeLupe ;)
22:36 MikeLupe I have to change to the ovirt channel, it only runs on the first node, my work's not over
22:36 MikeLupe that was your "NO WAIT!!" ;)
22:36 JoeJulian Me pulling your leg.
22:36 MikeLupe hehe
22:36 MikeLupe "This person doesn't have Personal Page "
22:37 JoeJulian ?
22:37 MikeLupe But I'd rather choose the beer than go through PayPal or whatever
22:37 MikeLupe Give to open cancer research
22:37 JoeJulian damn, no!
22:37 MikeLupe :-/
22:37 JoeJulian It worked just a couple weeks ago. :(
22:38 JoeJulian Thanks Trump.
22:38 MikeLupe yep
22:45 MikeLupe JoeJulian: Thanks a lot for your help!
22:46 JoeJulian I've emailed Dr. Bradner. He's changed jobs but hopefully open cancer research is still a thing.
22:50 DV__ joined #gluster
22:50 MikeLupe Hopefully
22:50 MikeLupe It's a good thing
22:55 Jacob843 joined #gluster
23:00 masber joined #gluster
23:02 cacasmacas joined #gluster
23:08 kpease joined #gluster
23:12 colm joined #gluster
23:27 farhorizon joined #gluster
23:34 DV__ joined #gluster
23:41 wushudoin joined #gluster
23:42 DV__ joined #gluster
23:43 wushudoin joined #gluster
23:47 jbrooks joined #gluster
23:51 vbellur joined #gluster
23:59 DV__ joined #gluster
