
IRC log for #gluster, 2016-07-26


All times shown according to UTC.

Time Nick Message
01:26 bkolden joined #gluster
01:32 Lee1092 joined #gluster
02:04 julim joined #gluster
02:12 kshlm joined #gluster
02:41 Bhaskarakiran joined #gluster
02:57 kramdoss_ joined #gluster
03:02 poornimag joined #gluster
03:07 nishanth joined #gluster
03:17 magrawal joined #gluster
03:19 shdeng joined #gluster
03:44 RameshN joined #gluster
03:51 itisravi joined #gluster
04:04 JesperA joined #gluster
04:09 rafi joined #gluster
04:13 shubhendu__ joined #gluster
04:17 shubhendu__ joined #gluster
04:24 aspandey joined #gluster
04:24 nbalacha joined #gluster
04:26 sanoj_ joined #gluster
04:31 atinm joined #gluster
04:31 rafi joined #gluster
04:34 rafi joined #gluster
04:37 nishanth joined #gluster
04:42 Saravanakmr joined #gluster
04:45 sanoj_ joined #gluster
04:51 guhcampos joined #gluster
04:51 ramky joined #gluster
04:53 kramdoss_ joined #gluster
04:57 ppai joined #gluster
05:00 nehar joined #gluster
05:12 poornimag joined #gluster
05:19 prasanth joined #gluster
05:21 satya4ever joined #gluster
05:22 Manikandan joined #gluster
05:23 skoduri joined #gluster
05:25 kdhananjay joined #gluster
05:26 karthik_ joined #gluster
05:27 poornimag joined #gluster
05:30 ndarshan joined #gluster
05:30 shubhendu_ joined #gluster
05:30 Apeksha joined #gluster
05:35 hgowtham joined #gluster
05:37 shubhendu__ joined #gluster
05:37 aravindavk joined #gluster
05:44 harish_ joined #gluster
05:44 karnan joined #gluster
05:59 Bhaskarakiran joined #gluster
05:59 [diablo] joined #gluster
06:02 sakshi joined #gluster
06:03 shubhendu_ joined #gluster
06:12 karnan joined #gluster
06:14 hackman joined #gluster
06:15 Muthu joined #gluster
06:18 Muthu_ joined #gluster
06:20 devyani7_ joined #gluster
06:20 aspandey joined #gluster
06:21 pur joined #gluster
06:22 msvbhat joined #gluster
06:22 harish_ joined #gluster
06:25 anil joined #gluster
06:33 skoduri joined #gluster
06:34 Gnomethrower joined #gluster
06:45 fsimonce joined #gluster
06:49 karnan joined #gluster
06:50 rastar joined #gluster
06:54 poornimag joined #gluster
06:55 Bhaskarakiran joined #gluster
06:56 derjohn_mob joined #gluster
06:57 nishanth joined #gluster
07:05 shortdudey123 joined #gluster
07:06 kovshenin joined #gluster
07:10 shubhendu__ joined #gluster
07:13 Wizek joined #gluster
07:21 Rick_ joined #gluster
07:23 Raide joined #gluster
07:29 jwd joined #gluster
07:34 satya4ever joined #gluster
07:36 auzty joined #gluster
07:48 petan joined #gluster
07:53 robb_nl joined #gluster
07:53 ahino joined #gluster
07:53 bkolden joined #gluster
07:54 Wizek joined #gluster
07:54 gem joined #gluster
07:55 derjohn_mob joined #gluster
07:58 somlin22 joined #gluster
07:59 hackman joined #gluster
07:59 bkolden joined #gluster
08:09 shubhendu__ joined #gluster
08:10 wadeholler joined #gluster
08:13 RameshN joined #gluster
08:16 paul98 joined #gluster
08:17 paul98 hey kshlm you about?
08:21 ppai joined #gluster
08:21 shubhendu__ joined #gluster
08:23 kshlm paul98, Yeah.
08:23 paul98 still got problems! :(
08:23 atalur joined #gluster
08:24 kshlm Okay.
08:24 kshlm Let's try to fix it.
08:24 paul98 i'm getting an error when i try to start glusterd back up after creating /var/lib/glusterd/secure-access
08:24 kshlm Okay.
08:24 kshlm What error?
08:24 paul98 "failed to open /etc/ssl/dhparam.pem, DH ciphers are disabled"
08:25 derjohn_mob joined #gluster
08:25 kshlm That error is fine. As mentioned, the DH ciphers will be disabled and glusterd should continue to run.
08:25 kshlm Can you share the glusterd log file?
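
If you did want the DH ciphers, the missing file can be generated with openssl; kshlm's point stands that it is optional, so this is just a sketch:
    openssl dhparam -out /etc/ssl/dhparam.pem 2048
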
08:27 misc grmblblb, so the bug in ansible I thought to be fixed wasn't :/
08:27 kshlm misc, Yay! You're here.
08:28 kshlm FYI, www.gluster.org isn't pointing to the right place.
08:28 paul98 kshlm: http://pastebin.centos.org/49616/
08:28 misc kshlm: yeah, I am on it, that's why I grmbled on the bug
08:28 kshlm misc, Cool!
08:28 misc I was quite sure the fix was backported and I did even check before pushing :/
08:29 kshlm paul98, I'm checking the log.
08:29 misc (some issue with an include of an include forgetting a variable)
08:29 Slashman joined #gluster
08:29 kshlm paul98, line 8 "[2016-07-26 08:19:58.560710] E [socket.c:4075:socket_init] 0-socket.management: could not load our cert"
08:29 kshlm This is what we should be looking at.
08:30 kshlm paul98, Do you have the /etc/ssl/glusterfs.pem file?
08:31 kshlm This file is the 'our cert' being referred to in the log.
08:31 paul98 [root@gfs-gb-th-1-01 ~]# ls /etc/ssl/
08:31 paul98 certs  glusterfs.ca  glusterfs.key  glusterfs.pem
08:33 kshlm Hmm... what OS/distro are you running on?
08:33 kshlm Also which GlusterFs version.
08:33 paul98 centos 6.7
08:34 paul98 if i remove the secure-access file, it starts up
08:34 paul98 and i guess the secure-access file should be blank
08:35 shubhendu_ joined #gluster
08:35 kshlm Preferably blank, but it doesn't matter what it contains.
08:36 paul98 [root@gfs-gb-th-1-01 ~]# rpm -qa glusterfs
08:36 paul98 glusterfs-3.7.11-1.el6.x86_64
08:36 kshlm secure-access enables TLS for GlusterD connections, which is why it tries to load the cert, ca and private key.
08:36 kshlm Are the permissions right on /etc/ssl/glusterfs.*
08:37 paul98 -rw-r--r--. 1 root root 5.4K Jul 25 12:35 /etc/ssl/glusterfs.ca
08:37 paul98 -rw-r--r--. 1 root root 1.8K Jul 25 11:56 /etc/ssl/glusterfs.key
08:37 paul98 -rw-r--r--. 1 root root 1.8K Jul 25 11:58 /etc/ssl/glusterfs.pem
08:37 glusterbot paul98: -rw-r--r's karma is now -20
08:37 glusterbot paul98: -rw-r--r's karma is now -21
08:37 glusterbot paul98: -rw-r--r's karma is now -22
08:37 shdeng joined #gluster
08:38 kshlm That is okay.
08:38 paul98 could it be the way the ssl was generated?
08:38 kshlm The cert? Could be.
08:39 paul98 yer
08:39 paul98 whats best way to generate the cert?
08:41 kshlm Give me a minute. I'll dig the ansible role I use to generate the certs.
08:43 kshlm The two commands the ansible role uses is, `openssl genrsa -out /etc/ssl/glusterfs.key 2048`
08:44 kshlm and `openssl req -new -x509 -key /etc/ssl/glusterfs.key -subj "/CN=<server-hostname>" -out /etc/ssl/glusterfs.pem`
08:44 kshlm For reference the role is available here https://github.com/kshlm/my-ansible-roles/blob/master/kshlm.gluster-ssl/tasks/main.yml
08:44 glusterbot Title: my-ansible-roles/main.yml at master · kshlm/my-ansible-roles · GitHub (at github.com)
08:45 kshlm It could be that openssl is tripping on the rw-r--r-- permissions.
08:45 glusterbot kshlm: rw-r--r's karma is now -1
08:46 paul98 let me regenerate all the ssls the way you just done it and see
08:46 paul98 but it looks similar to what i used, to be fair
08:46 kshlm Can you change the permission to "rw-------" ie 0600
08:48 paul98 right so i've run the above
08:48 paul98 now i need to put the .pem into the .ca so should have 3 crts
08:48 paul98 then the .ca goes onto each server / client
08:49 kshlm Yup.
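
Pulling the steps above together as a minimal sketch (the CN and the .pem filenames are illustrative; per the discussion, every node's glusterfs.pem gets concatenated into a shared glusterfs.ca):
    # on every server and client; the CN is what later goes into auth.ssl-allow
    openssl genrsa -out /etc/ssl/glusterfs.key 2048
    openssl req -new -x509 -key /etc/ssl/glusterfs.key -subj "/CN=gfs-gb-th-1-01" -out /etc/ssl/glusterfs.pem
    # collect every node's .pem into one CA bundle and copy it to all servers and clients
    cat server1.pem server2.pem client.pem > /etc/ssl/glusterfs.ca
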
08:51 cholcombe joined #gluster
08:53 paul98 right ok let me try that, generated the .key and .pem then put .pem into .ca on each server
08:53 paul98 let me try that, if not i'll try permissions
08:54 kshlm After that you can check selinux. Maybe it's playing tricks.
08:54 paul98 ok thats started up
08:54 paul98 but i'm now getting this in the logs
08:55 paul98 http://pastebin.centos.org/49621/
08:55 shubhendu__ joined #gluster
08:56 kshlm That is because someone is trying to connect to glusterd without using TLS.
08:56 paul98 hmm ok
08:56 kshlm Do you have a client or server running?
08:57 paul98 ok both servers are up and running
08:57 kshlm When you enable/disable encryption for glusterd (or management encryption) all gluster processes need to be restarted.
08:57 paul98 i have one client which has the ssl's etc which is prob trying to connect
08:57 paul98 unless it's thw two servers?
08:58 kshlm You also need to touch /var/lib/glusterd/secure-access on the client, to make it use management encryption as well.
08:58 paul98 yup i did that
08:58 paul98 although the client had no /var/lib/glusterd/ dir
08:59 paul98 i restarted glusterd and glusterfsd on both the servers after i created the /var/lib/glusterd/secure-access file
08:59 kshlm Yeah. that's an ugly way to turn it on.
08:59 kshlm And then did you restart the client as well?
09:00 paul98 to be fair, to restart the client, all we have is glusterfs installed on the client and then i do mount -t glusterfs 192.168.101.47:/data /glusterfs
09:01 kshlm I meant restart the client process.
09:01 kshlm umount followed by mount.
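
In other words, a rough sketch of enabling management encryption (the service commands are an assumption for the CentOS 6 init scripts paul98 used; the mount line is the one he quotes):
    # on every server and every client
    touch /var/lib/glusterd/secure-access
    # on the servers: restart glusterd and the brick processes
    service glusterd restart
    service glusterfsd restart
    # on the client: "restart" means remount
    umount /glusterfs
    mount -t glusterfs 192.168.101.47:/data /glusterfs
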
09:01 paul98 ah yes i unmounted
09:01 paul98 but now i try to mount i get mount failed
09:02 kshlm Does the client log file say failed to fetch volfile?
09:02 derjohn_mob joined #gluster
09:02 paul98 http://pastebin.centos.org/49626/
09:02 paul98 is the client
09:03 kshlm Are you using at-rest encryption? The data-crypt xlator is used for at-rest encryption.
09:03 kshlm And I don't know how well it works.
09:04 paul98 to be fair i don't know, i just followed your blog post
09:04 kshlm My blog post shouldn't cause data-crypt to be enabled.
09:05 kshlm Can you post the volume info?
09:05 kshlm `gluster volume info <volname>`
09:05 Gnomethrower joined #gluster
09:05 paul98 http://pastebin.centos.org/49631/
09:08 kshlm You've got data-crypt turned on. The options features.encryption, encryption.master-key, encryption.data-key-size, encryption.block-size turn on at-rest encryption.
09:09 paul98 ok, so i should be turning this off?
09:09 kshlm You can turn it off by doing 'gluster volume set <volname> features.encryption off'
09:09 kshlm Yup. Unless you require it.
09:09 paul98 what does it do?
09:09 paul98 ah
09:09 kshlm Where did you find those options by the way.
09:09 paul98 we wanted to encrypt the data
09:10 paul98 as it's going to be used for email so we wanted the data encrypted
09:10 kshlm Ah.
09:11 kshlm I'm not really familiar with the data-crypt translator.
09:11 paul98 or does glusterfs not work too well with ssl / data encryption?
09:12 ctria joined #gluster
09:12 kshlm Those should be two separate things.
09:12 paul98 ah ok so one or the other
09:12 paul98 well if i can get it working with just ssl, and encryption off, i would be more than happy with that
09:12 kshlm From the error message for data-crypt, I'm guessing that the master-key file is not present on the client.
09:12 paul98 as you wouldn't be able to read the data without the ssl
09:13 paul98 i've turned features.encryption off for now see what happens
09:13 kshlm '[2016-07-26 09:01:00.269017] E [crypt.c:4307:master_set_master_vol_key] 0-data-crypt: FATAL: can not open file with master key'
09:14 kshlm As per the vol-info, the client is searching for the file /root/volume-key.txt on the client.
09:17 paul98 http://pastebin.centos.org/49636/
09:17 paul98 get that error now
09:18 kshlm auth.ssl-allow should have the common-names of the certificates. You've used IP addresses.
09:18 paul98 ah
09:20 paul98 auth.ssl-allow: gfs-gb-th-1-01,gfs-gb-pwr-1-01,zimbra
09:21 paul98 like that?
09:21 kshlm Yup.
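
As a sketch, that option is set with the usual volume-set syntax (the volume name is a placeholder):
    gluster volume set <volname> auth.ssl-allow 'gfs-gb-th-1-01,gfs-gb-pwr-1-01,zimbra'
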
09:21 paul98 yayay :D it's mounted
09:23 paul98 would you say you would need to encrypt data or would the ssl be enough?
09:24 kshlm SSL encrypts data on the way to the glusterfs-servers.
09:24 kshlm The data is stored on disk unencrypted.
09:25 kshlm data-crypt encrypts data itself.
09:25 kshlm So if using SSL, anyone snooping on your network wouldn't be able to figure out what's happening.
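
To make the distinction concrete, the two layers map to different volume options; a sketch, assuming the standard client.ssl/server.ssl options for in-flight encryption and using the at-rest options visible in paul98's volume info:
    # in-flight (transport) encryption of the I/O path
    gluster volume set <volname> client.ssl on
    gluster volume set <volname> server.ssl on
    # at-rest encryption via the data-crypt xlator
    gluster volume set <volname> features.encryption on
    gluster volume set <volname> encryption.master-key /root/volume-key.txt
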
09:25 paul98 i'm surprised the call for encrypted data isn't higher
09:26 kshlm data-crypt has been around for quite a while, but it's not been used much.
09:26 Alghost joined #gluster
09:26 karthik_ joined #gluster
09:28 paul98 any one else used data-crypt as well as ssl?
09:29 paul98 to be fair i'll update the email with the above convo, and it's up to them if they want me to carry on lol
09:33 raghug joined #gluster
09:38 paul98 so i turned encryption back on
09:38 paul98 how do i unmount the partition
09:39 kotreshhr joined #gluster
09:40 jiffin joined #gluster
09:41 somlin22 joined #gluster
09:47 atinm joined #gluster
09:47 shubhendu_ joined #gluster
09:48 Siavash joined #gluster
09:49 paul98 misc: ?
09:49 hwcomcn joined #gluster
09:49 shaunm joined #gluster
09:50 somlin22 joined #gluster
09:50 jiffin joined #gluster
09:53 hwcomcn joined #gluster
09:55 muneerse joined #gluster
09:58 cholcombe joined #gluster
09:58 derjohn_mob joined #gluster
10:03 misc paul98: ?
10:04 paul98 any ideas on disk encryption? i can't find anything that gives you a how-to, mine's moaning about 'can not open file with master key'
10:05 misc nope, i am just a sysadmin, not a gluster dev or user for now :)
10:05 paul98 fair enough! was worth a ask
10:06 somlin22 joined #gluster
10:08 shubhendu_ joined #gluster
10:17 atinm joined #gluster
10:22 ashiq joined #gluster
10:25 somlin22 joined #gluster
10:29 nehar joined #gluster
10:32 raghug joined #gluster
10:40 DV joined #gluster
10:41 msvbhat joined #gluster
10:48 somlin22 joined #gluster
10:55 ashiq joined #gluster
10:59 Alghost joined #gluster
11:01 om joined #gluster
11:03 raghug joined #gluster
11:05 kotreshhr left #gluster
11:05 Saravanakmr joined #gluster
11:21 kkeithley Gluster Community Bug Triage in ~40min in #gluster-meeting
11:24 DV joined #gluster
11:25 ira joined #gluster
11:31 johnmilton joined #gluster
11:44 somlin22 joined #gluster
11:44 B21956 joined #gluster
11:44 karnan joined #gluster
11:51 skoduri joined #gluster
11:55 Muthu_ REMINDER: Gluster Community Bug Triage Meeting in #gluster-meeting in ~ 5minutes from now.
11:56 somlin22 joined #gluster
11:58 Telsin joined #gluster
12:00 Saravanakmr joined #gluster
12:02 rastar joined #gluster
12:03 shubhendu joined #gluster
12:04 shubhendu joined #gluster
12:10 kdhananjay joined #gluster
12:11 DV joined #gluster
12:12 jkroon joined #gluster
12:12 nohitall joined #gluster
12:13 nohitall I got a question: got a replicated distributed volume, and all is working fine except file deletion, which deletes at a few kB/s. Anyone got an idea what to check there?
12:15 newdave joined #gluster
12:16 atalur joined #gluster
12:18 ppai joined #gluster
12:28 Alghost joined #gluster
12:30 atalur joined #gluster
12:33 hackman joined #gluster
12:34 cholcombe joined #gluster
12:36 Apeksha joined #gluster
12:37 jkroon hi all, hoping to get some advice here, i'm not even sure this is glusterfs related.
12:37 jkroon sitting with two hosts with load averages of 180+, and neither top nor iotop is providing any explanation.
12:37 jkroon the processes using CPU and IO primarily are all glusterfs related.
12:40 ndevos jkroon: maybe you have many processes in D-state? That is a (normally temporary) state were processes do I/O and can not be interrupted, they add +1 to the loadavg
12:42 jkroon ndevos, jip, bunch of processes blocking on glusterfs mounted filesystems.
12:43 jkroon but normally if I see that I also see either large IO on the underlying bricks, or INSANE cpu.
12:43 ndevos jkroon: sounds like either the fuse-mountpoint or the bricks can not keep up, or are non-responsive otherwise
12:43 jkroon how much can the fuse mount handle?
12:43 jkroon it's been fine and then just exploded this morning, already rebooted both servers as well.
12:44 jkroon have some entries in split brain - could that affect things?
12:44 jkroon busy running a find over the mountpoint and it seems to be slowly lowering the count that's in split brain.
12:44 ndevos hmm, maybe, that would be logged by the fuse client in that case
12:45 ndevos fuse can handle a lot, but there surely is a limit; some people that need more performance mount the same volume on different directories, having more fuse processes
12:48 jkroon fuse client == glusterfs client?
12:48 jkroon we're mounting using native gluster client, with the following entry in /etc/fstab:  localhost:mail  /var/spool/mail glusterfs _netdev,defaults 0 0
12:49 jkroon interesting idea.
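
A sketch of that idea in fstab terms, reusing jkroon's entry (the second mountpoint is purely illustrative; applications would have to be spread across the mountpoints):
    localhost:mail  /var/spool/mail   glusterfs _netdev,defaults 0 0
    localhost:mail  /var/spool/mail2  glusterfs _netdev,defaults 0 0
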
12:49 v12aml left #gluster
12:49 jkroon we're not even talking 10MB/s currently going through.
12:49 ndevos jkroon: the fuse-client is a glusterfs-client :)
12:50 jkroon does glusterfs log IO operations taking longer than a certain time?
12:50 ndevos jkroon: I guess that a volume called "mail" has many small files, that tends to be a heavy meta-data (small file) workload
12:50 jkroon fair enough.
12:51 ndevos you can use "gluster volume top" to get some stats about latency
12:51 jkroon 400G over 3.2M files.
12:52 ben453 joined #gluster
12:52 ndevos or, maybe with "gluster volume profile"
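
A sketch of those commands (the volume name is taken from jkroon's fstab entry, so treat it as illustrative):
    gluster volume profile mail start
    gluster volume profile mail info     # per-FOP latency, like the lines quoted below
    gluster volume top mail read-perf    # other sub-commands: write-perf, open, read, write, readdir, ...
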
12:54 jkroon 0.73 1488028.00 us 1488028.00 us 1488028.00 us              1       FSYNC ... longest call, but since there is only 1 ...
12:55 jkroon %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop <-- what would %-latency mean?
12:55 glusterbot jkroon: <'s karma is now -22
12:55 ben453 I just had one of the servers in my gluster 3 node replica cluster go down. I tried to restart my cluster (with a new 3rd server) and now there are a bunch of gfids in my gluster volume heal info list. Anyone know how to resolve this issue? I've tried explicitly calling gluster heal full but that call failed
12:56 jkroon ben453, you need to wait for the data to clone afaik.
12:56 ben453 I can't read from the mount until it finishes healing?
12:56 ben453 2 of my 3 servers should have all of the correct data
12:56 ben453 I'm using the glusterfs.fuse mount if that makes any difference
12:57 chirino joined #gluster
12:57 jkroon no, you should be able to continue normally.
12:57 cloph better to create gluster on a raid10 (software, created with mdadm) or using striping with two bricks per host?
12:58 jkroon 25.77    4874.46 us       7.00 us 6500488.00 us          96375      LOOKUP - 4ms average?
12:59 jkroon that doesn't sound completely horribly bad; however, 6.5s worst case is pretty bad.
13:01 ben453 Yeah previously when I've had this issue when I read from a file in the heal info list gluster would figure it out and remove that file from the list. However now when trying to interact with one of my files I'm getting an I/O error, even when accessing other files works fine
13:01 jkroon ben453, that means you're in complete split brain as I understand it.
13:01 jkroon how did you replace the bricks?
13:03 bwerthmann joined #gluster
13:03 arcolife joined #gluster
13:03 shubhendu joined #gluster
13:04 unclemarc joined #gluster
13:06 Pupeno joined #gluster
13:11 ben453 Instead of running a brick replace, I shut down all of my servers and brought them back up, and created another volume on top of my 3 data bricks.
13:12 ben453 I'm on AWS, so the IP addresses of the servers had changed, but I did not change the underlying data. So I thought that with a force volume creation gluster would be okay with this
13:16 jkroon i'm not that into glusterfs internals, but I do recall something about not just bringing a new brick in, there is some stuff that has to happen first ...
13:17 ashiq joined #gluster
13:21 shubhendu joined #gluster
13:24 julim joined #gluster
13:29 Alghost joined #gluster
13:30 kkeithley you have to add the new node to the trusted pool, i.e. with `gluster peer probe $newnode`
13:31 Pupeno joined #gluster
13:32 chirino_m joined #gluster
13:35 hybrid512 joined #gluster
13:38 shubhendu joined #gluster
13:38 dlambrig_ joined #gluster
13:39 cholcombe joined #gluster
13:40 shubhendu joined #gluster
13:40 Philambdo joined #gluster
13:42 jkroon would it help to migrate from two bricks to four?
13:43 jkroon ie, still have a single volume, but distribute-replicate 2x2 instead of replicate x 2
13:43 jkroon we would still utilize the same underlying storage though ...
13:44 jkroon and LOOKUP and ENTRYLK are the hardest hitting file operations at this point.
13:44 jkroon (95% between them)
13:52 dnunez joined #gluster
13:59 ahino1 joined #gluster
13:59 jkroon what's the disadvantage of turning off self-heal as per https://www.gluster.org/pipermail/gluster-users/2015-January/020197.html ??
13:59 glusterbot Title: [Gluster-users] How To Turn Off Self Heal (at www.gluster.org)
14:00 wushudoin joined #gluster
14:00 wushudoin joined #gluster
14:02 dlambrig_ left #gluster
14:03 Pupeno joined #gluster
14:04 jkroon switching off self-heal dropped load averages from 220+ to <50.
14:04 jkroon i'm just worried about sync problems now ...
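
For reference, the linked thread is about options along these lines (a sketch, an assumption based on that thread rather than a quote from it; disabling them trades lower load for healing only happening when you trigger it):
    gluster volume set mail cluster.self-heal-daemon off
    gluster volume set mail cluster.data-self-heal off
    gluster volume set mail cluster.metadata-self-heal off
    gluster volume set mail cluster.entry-self-heal off
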
14:10 karnan joined #gluster
14:10 somlin22 joined #gluster
14:12 side_control so i'm using gluster as my storage backend for ovirt, and i'm seeing that VMs on a volume that contains a single brick on a particular node will not start when healing processes are running on other volumes... is this the expected behavior?
14:12 Saravanakmr joined #gluster
14:12 side_control VMs fail with "unable to get volume size for $id"
14:15 ndevos jkroon: maybe you can use ,,(targeted self heal) to do the healing in steps/periodically
14:15 glusterbot jkroon: https://web.archive.org/web/20130314122636/http://community.gluster.org/a/howto-targeted-self-heal-repairing-less-than-the-whole-volume/
14:17 ndevos side_control: yes, if a heal is in progress, the file is locked and can not be accessed by others - you can reduce the amount of time to heal when you use sharding
14:18 side_control ndevos: just for clarification, a heal is running on another volume, and a volume that is not replicated/distributed is having issues?
14:18 side_control ndevos: process locks then?
14:19 ndevos side_control: oh, no, healing locks the file that is undergoing the heal, no other locks
14:19 side_control ndevos: so then this behavior is strange then?
14:19 ndevos side_control: yes, definitely strange!
14:19 jkroon ndevos, i'll look into that thanks.
14:20 ndevos side_control: could it be that the particular node is not responding very well? in that case, the client may notice it and reject any request if quorum is not met
14:21 side_control ndevos: well that vol doesnt have a quorum, i created a volume on a node with one brick because i didnt want the volume to be replicated
14:22 side_control ndevos: its strange, that gluster node after a while seems to get fussy, and all vms fail to respond under high io
14:23 Pupeno joined #gluster
14:23 Pupeno joined #gluster
14:23 ndevos side_control: ok, so it's a one-brick volume, that should make it pretty straightforward... no idea what the issue could be then :-/
14:23 nbalacha joined #gluster
14:25 jkroon ndevos, that's basically what I've been doing, just not as "formally".
14:25 bowhunter joined #gluster
14:25 jkroon for some reason running find /var/spool/mail &>/dev/null doesn't do the insane load thing.
14:25 jkroon it'll probably take MUCH longer though, but should trigger a stat on everything (I Hope).
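
The 'targeted self heal' howto glusterbot linked earlier boils down to roughly this, run against the fuse mountpoint (not the bricks); a sketch:
    find /var/spool/mail -noleaf -print0 | xargs --null stat > /dev/null
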
14:25 side_control ndevos: it seems like ovirt makes an api request for vol size and that fails, so the only thing i can think of, during the heal, gluster is slow to respond which causes the failure... mounting via glusterfs/nfs the data is there, well its only a theory
14:26 ndevos jkroon: what version are you on? recent updates have the multi-threaded self-heal changes that can bring down servers...
14:27 ndevos side_control: I thought oVirt uses fuse-mounts?
14:27 jkroon 3.7.4
14:27 jkroon can one limit that concurrency?
14:27 side_control ndevos: it does, but you can mount using other mechs as well
14:28 jkroon although the problem started this morning on another version (just upgraded to that because it's known good on another cluster, and have ebuilds in place)
14:29 Alghost joined #gluster
14:33 ndevos jkroon: yes, there is a volume option to set the number of threads, but the version you run does not have the multi-threaded bit at all
14:33 jkroon kicked off multiple instances?
14:33 ndevos jkroon: there should be only one self-heal daemon per volume
14:34 side_control hrmm... sometimes i have to issue 'gluster volume heal $VOL info' a few times to get the results back
14:34 jkroon figured.  just wondering why switching that off immediately caused the load to be cut by ~80% (220 => ~45) and now by a further ~60% - down to sub-10.
14:34 ndevos side_control: sure, but if you run the hypervisor on the same servers as the gluster services, nfs-locking will not work
14:35 side_control ndevos: i'm only running the arbiter on a hypervisor, but these errors are glusterfs errors
14:35 jkroon the latter drop was exim having flushed its queue out ...
14:35 jkroon so basically delivered the email that queued during the "outage"
14:37 ben453 BTW I figured out the issue with not being able to read one file from the gluster mount, even when all of the other files were able to be read. The file that I couldn't read had been deleted and re-created while one of my servers was down, giving it a new gfid. So when I brought the servers back up, the file had a different gfid on the server that was coming back into the cluster.
14:38 ben453 When trying to read a file from the mount when one of the bricks has a different gfid than the other two bricks, gluster returns a generic I/O error
14:39 side_control meh, so for a distributed vol, is there a way to specify x percent of the data should reside on one node? its a rather large volume and one node has significantly more space
14:39 ndevos side_control: maybe the logs of the fuse-mount have some hints in them?
14:39 farhorizon joined #gluster
14:40 ndevos ben453: thats called a gfid split-brain :-/
14:40 side_control ndevos: nothing useful... though i am seeing files in the single brick/node vol get renamed
14:41 ben453 ndevos: is the only way to recover from gfid split-brain to manually go in and delete the
14:41 ben453 file on the brick that has the different gfid?
14:41 ndevos ben453: yes, I think so
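
A sketch of what that manual fix typically looks like, done on the brick that has the stale copy (paths and the gfid are placeholders; the matching hard link under .glusterfs has to be removed as well):
    # on the bad brick, never through the fuse mount
    getfattr -n trusted.gfid -e hex /path/to/brick/some/file    # note the gfid
    rm /path/to/brick/some/file
    rm /path/to/brick/.glusterfs/<aa>/<bb>/<full-gfid>          # <aa>/<bb> are the first two byte pairs of the gfid
    gluster volume heal <volname>                               # let the good copies heal back
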
14:41 side_control ndevos: thank you btw, this is has been very informative
14:42 ndevos ben453: there are some policies for automatic split-brain resolution, but gfid split-brains can not be fixed with that, I think
14:42 ndevos ben453: prevent them altogether, and use a 3-way replica (or 2-way + arbiter)
14:42 ben453 I am using a 3 way replica
14:42 ben453 the gfid was the same on two of my bricks, but different on the third
14:43 ben453 because the file was deleted and recreated when one of the servers was down
14:43 ndevos ben453: hmm, then quorum should have prevented creating the file on the 3rd brick...
14:43 ben453 the file already existed before the split
14:44 ben453 so the node coming back into the cluster had the old version of the file (with the exact same name, but the old gfid) whereas the two nodes that were always in the cluster had a new version of the file with a new gfid
14:44 ndevos oh, so the file existed, 3rd brick goes down, file gets deleted, file get re-created, 3rd brick comes up again?
14:44 ben453 yup that's the scenario!
14:45 ndevos hmm, no idea how that is expected to be handled, could you send an email to gluster-users@gluster.org and mention that scenario, and the glusterfs version you use?
14:45 ben453 So ideally gluster would realize that the two nodes with the same file had the correct one, and heal it to the node coming back in
14:45 alvinstarr joined #gluster
14:45 hagarth joined #gluster
14:45 alvinstarr left #gluster
14:45 ben453 Yup will do, thanks for the insight ndevos
14:45 ndevos the replication developers probably know how to prevent/fix that, and maybe that has been done in a newer version already
14:46 ben453 I was running on 3.7.11, so a pretty recent version
14:47 ndevos hmm, yeah, we're at 3.7.13 and 3.8.1 now, so at least 100+ patches went in
14:47 ben453 I'll definitely ask the mailing list about it though
14:48 ndevos side_control: do you know the filename on the fuse-mount? maybe you can use "qemu-img info <file>" to see if that gives a warning too
14:49 side_control ndevos: i know the file name but a dht renaming is being issued
14:49 side_control ndevos: i dont think its related
14:50 ndevos side_control: I would think a rename should not affect it, surely not when it is done on the same (single) brick
14:51 side_control ndevos: i think, gluster is hanging on the api requests from ovirt.. so im trying to figure out how to throttle the repair or tune gluster/these nodes
14:52 side_control ndevos: cause about once every two days this volume hangs and i have to restart one gluster node (full restart), let it heal for about 2-4 hours and then i can start these vms on this single brick
14:52 side_control yea few minutes ago, i wasnt able to start the VMs but now the heal is down to 1 entry, they started fine
14:53 side_control so bizarre
14:59 ndevos side_control: thats really weird, there should be little to heal on a single-brick-volume?!
15:00 side_control ndevos: yea, cpu/mem usage isn't all that high either, but i have seen gluster commands fail, i need to dive deeper
15:01 dnunez joined #gluster
15:01 side_control ndevos: hopefully this will all go away when i upgrade the storage fabric to 40G infiniband, the current fabric is very old
15:02 side_control ndevos: when i get more information, i'll file a bug, but in the meanwhile i have to go, doctor's appointment
15:02 glusterbot https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS
15:02 ndevos side_control: ok, thanks and ttyl!
15:06 farhorizon joined #gluster
15:09 armyriad joined #gluster
15:10 skylar joined #gluster
15:12 kramdoss_ joined #gluster
15:14 somlin22 joined #gluster
15:16 squizzi joined #gluster
15:17 Pupeno joined #gluster
15:24 rwheeler joined #gluster
15:28 farhorizon joined #gluster
15:29 Pupeno_ joined #gluster
15:30 Alghost joined #gluster
15:35 Jacob843 joined #gluster
15:36 derjohn_mob joined #gluster
15:36 Jacob8432 joined #gluster
15:36 Manikandan joined #gluster
15:37 Jacob8433 joined #gluster
15:43 Pupeno joined #gluster
15:57 bwerthmann joined #gluster
15:57 somlin22 joined #gluster
16:01 atinm joined #gluster
16:02 bwerthmann joined #gluster
16:07 RameshN joined #gluster
16:08 glustin joined #gluster
16:10 Saravanakmr joined #gluster
16:16 Pupeno joined #gluster
16:22 Pupeno joined #gluster
16:23 msvbhat joined #gluster
16:26 Pupeno joined #gluster
16:31 farhoriz_ joined #gluster
16:32 Alghost joined #gluster
16:33 somlin22 joined #gluster
16:33 dnunez joined #gluster
16:37 CyrilPeponnet joined #gluster
16:37 harish_ joined #gluster
16:38 CyrilPeponnet Hi guys, I have one question regarding distributed volumes. I have one volume which is just distributed across two bricks hosted on two different servers. We had an outage, and one of the servers went down.
16:39 CyrilPeponnet looks like the volume didn't go down, and later on we noticed some issues where files were present on both bricks
16:39 CyrilPeponnet Doing some weird stuff client side.
16:41 CyrilPeponnet So two questions: 1/ How to prevent this (mark the volume offline as soon as one of the bricks goes down)? 2/ How to recover from this? (for replicated we still have the heal cmd but I don't see anything for distributed volumes)
17:00 ic0n joined #gluster
17:01 johnmilton joined #gluster
17:04 coval3nce joined #gluster
17:05 coval3nce Anyone ever see “volume replace-brick: failed: Fuse unavailable” ?
17:19 nehar joined #gluster
17:19 johnmilton joined #gluster
17:19 bowhunter joined #gluster
17:20 plarsen joined #gluster
17:22 karnan joined #gluster
17:27 cryonv joined #gluster
17:27 farhorizon joined #gluster
17:31 cryonv Can someone give me ideas on troubleshooting my gluster install? It says it's functioning, but it's not replicating
17:32 atinm joined #gluster
17:33 Alghost joined #gluster
17:35 cryonv When I look at the nfs.log it shows that "0-rpc-service: Could not register with portmap 100005 3 38465"
17:36 cryonv Thoughts?
17:36 anoopcs ndevos, ^^
17:36 anoopcs jiffin, ^^
17:36 Siavash joined #gluster
17:37 ahino joined #gluster
17:38 crashmag joined #gluster
17:41 plarsen joined #gluster
17:41 shinta_ joined #gluster
17:43 cryonv anoopcs thanks for flagging those guys
17:44 ndevos cryonv: rpc-program 100005 is for mountd, you might be running some nfs-server processes that are not part of gluster?
17:48 cryonv Hmmm.... Ok. Not that I have setup
17:48 cryonv But let me check...
17:49 nohitall anybody got an idea what to do when file removal on a volume is extremely slow? like kB/s, I have to delete 2TB from a volume and at this rate it will take a few days
17:49 nohitall read/write is normal speed :/
17:49 om joined #gluster
17:51 Philambdo1 joined #gluster
17:54 ndarshan joined #gluster
17:58 shubhendu_ joined #gluster
18:02 farhorizon joined #gluster
18:07 ndevos cryonv: "rpcinfo -p" should list the ports that are in use, check if 100005 is already setup with port 38465
18:11 kkeithley is the rpcbind or portmap package installed? How is this installed? If those packages aren't installed (or kernel nfs is running) then you won't be able to register with portmap.
18:11 cryonv ndevos: neither of them are set up
18:11 atinm joined #gluster
18:12 ndevos cryonv: did you run the command on the system that runs the nfs-server?
18:12 XpineX joined #gluster
18:12 ndevos cryonv: and yeah, what kkeithley mentions is true too, if rpcbind (or the much older portmap) is not running, it'll fail as well
18:14 kkeithley If you install rpms or .debs then the dependencies should get installed automatically.  If you build+install from source then you have to remember to install them yourself.
18:15 cryonv Um... (Running CentOS)
18:16 ndevos cryonv: try "systemctl status rpcbind" ?
18:16 cryonv Active (running)
18:17 ndevos looks good
18:18 cryonv When I do a rpcinfo -p the only thing I get are a bunch of Portmapper with port 111
18:18 ndevos cryonv: I'd try "systemctl restart glusterd" and see if the nfs.log gets more of those errors
18:19 ndevos port 111 is the portmapper port, the nfs-server connects to it and will try to register some rpc-services with it, all services get their own port then
18:21 cryonv ndevos: Nothing new added to the logs for nfs.log
18:23 kkeithley selinux? Is selinux blocking? Anything in audit.log or /var/log/messages?  Try setting selinux to not enforcing if it isn't already.
18:24 ndevos cryonv: sounds good, what does "gluster volume status" report? does it list the NFS-server processes?
18:25 ndevos cryonv: you can also check with "showmount -e" to see if the volumes are exported with NFS
18:26 cryonv NFS Server on localhost                     N/A       N/A        N       N/A
18:26 cryonv NFS Server on 192.168.100.101               N/A       N/A        N       N/A
18:26 bluenemo joined #gluster
18:27 cryonv for the showmount I get...
18:27 cryonv clnt_create: RPC: Program not registered
18:27 ndevos cryonv: what version of gluster do you use? if it is 3.8.x you will need to "gluster volume set $VOLUME nfs.disable false" to enable the nfs-server
18:28 cryonv OH! Ok. Hang on.
18:28 arcolife joined #gluster
18:30 cryonv ndevos... That was the key that I didn't have in my documentation.
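
Putting ndevos' fix and checks together as a sketch (in 3.8 the built-in NFS server is disabled by default):
    gluster volume set <volname> nfs.disable false
    gluster volume status         # the "NFS Server" rows should now show a port and Online = Y
    showmount -e localhost        # the volume should be listed as an export
    rpcinfo -p                    # mountd (100005) should now be registered with rpcbind
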
18:30 Wizek joined #gluster
18:30 shubhendu_ joined #gluster
18:31 cryonv Do I then force it to heal?
18:31 cryonv Just to make sure everything is in sync?
18:31 ndevos cryonv: ,,(self heal) and ,,(targeted self heal) maybe
18:31 glusterbot cryonv: I do not know about 'self heal', but I do know about these similar topics: 'targeted self heal'
18:31 ndevos @targeted self heal
18:31 glusterbot ndevos: https://web.archive.org/web/20130314122636/http://community.gluster.org/a/howto-targeted-self-heal-repairing-less-than-the-whole-volume/
18:32 ndevos cryonv: but, it should also sync automatically, just remember to always use the filesystem through a gluster-client (either fuse or nfs mounts)
18:33 ndevos cryonv: there is also "gluster volume heal ..."
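
i.e., roughly (the volume name is a placeholder):
    gluster volume heal <volname>          # heal entries that need healing
    gluster volume heal <volname> full     # crawl the whole volume
    gluster volume heal <volname> info     # list entries still pending heal
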
18:34 Alghost joined #gluster
18:34 cryonv ndevos: (BTW Thanks for taking the time to help me.) Ok... I have it currently setup as an fstab mount
18:35 cryonv .  /dev/sdb on /data/gluster type ext4 (rw,relatime,seclabel,data=ordered)
18:36 ndevos that should be fine, but do not write there directly, only gluster processes may do so
18:36 karnan joined #gluster
18:36 ndevos if you write there directly, gluster does not know about the new files/data and can not replicate it for you
18:37 cryonv Interesting... Ok. So I have a /data/gluster/brick/ARCHIVES where I write to... Should I create a different mount?
18:38 Pupeno joined #gluster
18:38 ndevos yes, add a mountpoint like: localhost:/$VOLUME  /mnt/whatever glusterfs _netdev 0 0
18:38 ndevos and then have your applications use the /mnt/whatever path
18:40 cryonv . localhost:/data/gluster/brick/ARCHIVES /mnt/ARCHIVES glusterfs _netdev 0 0
18:41 ndevos ok, and try "mount -t -a glusterfs" to see if it works - after creating the /mnt/ARCHIVES directory, of course
18:41 ndevos uh, swap the -t -a in the command :)
18:43 alvinstarr joined #gluster
18:43 cryonv ok. Trying it now. (Wow... lots of changes since the 3.6 version it seems)
18:45 cryonv Oooh... unhappy server...:(
18:46 ndevos ah, the localhost:/data/gluster/brick/ARCHIVES is wrong, it needs to be localhost:/$VOLUME , not the path to the brick
18:47 ndevos the name of the volume is what is listed in "gluster volume list"
18:47 cryonv Ah... ok
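
So the corrected fstab line would look something like the sketch below; 'ARCHIVES' as the volume name is an assumption based on the brick path, use whatever "gluster volume list" prints:
    localhost:/ARCHIVES  /mnt/ARCHIVES  glusterfs  _netdev 0 0
    mount -a -t glusterfs
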
18:49 farhorizon joined #gluster
18:54 * ndevos leaves for the day and will be back tomorrow
18:57 barajasfab joined #gluster
18:58 Siavash joined #gluster
18:58 Siavash joined #gluster
19:00 farhorizon joined #gluster
19:01 farhoriz_ joined #gluster
19:02 plarsen joined #gluster
19:07 hchiramm joined #gluster
19:15 shubhendu__ joined #gluster
19:18 joshin joined #gluster
19:28 harish_ joined #gluster
19:34 Alghost joined #gluster
19:55 cryonv Going back to the docs... something isn't quite right
19:57 Pupeno joined #gluster
19:59 JoeJulian something in particular, or just generally?
20:06 farhorizon joined #gluster
20:06 cryonv Just got SELinux accepting glusterd on my CentOS box
20:07 cryonv Now working on my 2nd box
20:07 ashiq joined #gluster
20:08 JoeJulian ? glusterd should work without altering any selinux rules.
20:08 cryonv I had gluster working pretty good with 3.6... and I did updates including going to CentOS version of gluster 3.8
20:09 cryonv And it broke it...
20:09 cryonv So I've been working on trying to fix it.
20:09 cryonv And learning a lot
20:10 JoeJulian What did you have to change?
20:10 cryonv The original instructions that I had, setup the mount points differently
20:10 cryonv (Still have to set those up right.)
20:11 JoeJulian Ah, so nothing to do with starting glusterd, just setting the correct context for your bricks?
20:12 cryonv True... the method for getting into the bricks changed... (at least according to the directions that I followed initially setting them up.)
20:12 cryonv It HAD an uptime of slightly over a year, before doing updates.
20:14 cryonv ndevos recommended using a
20:14 cryonv localhost:/brick /mnt/brick  glusterfs _netdev 0 0
20:15 cryonv For my access into the brick
20:15 cryonv Rather than the ext4 mount for /data/gluster
20:17 cryonv JoeJulian: Is that the best way?
20:20 JoeJulian You don't access bricks, you access volumes.
20:21 JoeJulian In the same way that you don't write your file to /dev/sda
20:22 post-factum JoeJulian: hmm, but you definitely *can* write your file to /dev/sda
20:22 JoeJulian Hush, troublemaker.
20:22 post-factum :D
20:22 cryonv LOL...
20:24 cryonv post-factum: I know it works that way because my instructions actually have it that way, but best practices evolve. (Although I'm pretty sure the docs could be updated so they're not referring to 3.5 for installing... ;) )
20:25 JoeJulian I see no such string in the docs: https://github.com/gluster/glusterdocs/search?utf8=%E2%9C%93&q=3.5
20:25 glusterbot Title: Search Results · GitHub (at github.com)
20:25 cryonv JoeJulian: Ok... Multiple bricks can form a volume, right? (My environment has a 1:1 ratio on that)
20:26 post-factum correct
20:26 cryonv http://gluster.readthedocs.io/en/latest/Install-Guide/Install/#for-red-hatcentos
20:26 glusterbot Title: Install - Gluster Docs (at gluster.readthedocs.io)
20:26 post-factum cryonv: https://gluster.readthedocs.io/en/latest/Quick-Start-Guide/Architecture/
20:26 glusterbot Title: Architecture - Gluster Docs (at gluster.readthedocs.io)
20:26 JoeJulian stupid search engine.
20:27 cryonv The Debian version shows 3.5...
20:27 bowhunter joined #gluster
20:27 cryonv LOL
20:27 JoeJulian I should have 'git grep'ped.
20:27 post-factum JoeJulian: and this man tells me about /dev/sda
20:30 somlin22 joined #gluster
20:35 Alghost joined #gluster
20:37 cryonv Yip. That's the architecture that I thought it was; although I'm not sure how my envisioned goals for gluster line up with it, since I was wanting to create a replicated distributed volume.
20:38 Pupeno joined #gluster
20:42 newdave joined #gluster
20:45 JoeJulian Now you're losing me again. What's interfering with you creating a distribute-replicate volume?
20:50 cryonv JoeJulian: No worries... Right now... I just need to get this basic system working. *again*
20:52 cryonv I think I need to step away for the day, and try again tomorrow. Something here isn't making any sense, and it feels like I'm banging my head against a brick wall.
20:55 cryonv left #gluster
20:57 arif-ali joined #gluster
21:04 farhoriz_ joined #gluster
21:05 farhorizon joined #gluster
21:08 Pupeno joined #gluster
21:18 JesperA joined #gluster
21:26 newdave joined #gluster
21:36 jlrgraham joined #gluster
21:36 Alghost joined #gluster
21:38 hchiramm joined #gluster
21:38 Pupeno_ joined #gluster
21:39 robb_nl joined #gluster
21:40 derjohn_mob joined #gluster
22:20 JesperA- joined #gluster
22:27 delhage joined #gluster
22:34 Pupeno joined #gluster
22:34 arcolife joined #gluster
22:37 Alghost joined #gluster
22:38 hchiramm joined #gluster
22:39 farhoriz_ joined #gluster
22:51 newdave joined #gluster
22:56 Jacob843 joined #gluster
23:37 Alghost joined #gluster
23:38 hchiramm joined #gluster
23:41 Alghost_ joined #gluster
23:41 Alghost__ joined #gluster
23:43 RameshN joined #gluster
23:58 plarsen joined #gluster
23:59 newdave joined #gluster
