
IRC log for #gluster, 2014-08-11


All times shown according to UTC.

Time Nick Message
00:10 gildub joined #gluster
00:24 T0aD joined #gluster
00:50 bene2 joined #gluster
01:07 lyang0 joined #gluster
01:11 bala joined #gluster
01:34 sputnik13 joined #gluster
02:12 cristov_mac joined #gluster
02:15 overclk joined #gluster
02:25 Ramereth joined #gluster
02:36 sputnik13 joined #gluster
02:40 coredump joined #gluster
02:52 sputnik13 joined #gluster
03:02 ACiDGRiM joined #gluster
03:02 sahina joined #gluster
03:04 bharata-rao joined #gluster
03:08 glusterbot New news from newglusterbugs: [Bug 1128525] gluster volume status provides N/A ouput NFS server and Self-heal Daemon <https://bugzilla.redhat.com/show_bug.cgi?id=1128525>
03:26 sputnik13 joined #gluster
03:33 sputnik13 joined #gluster
03:43 kshlm joined #gluster
03:53 shubhendu__ joined #gluster
03:56 dockbram joined #gluster
03:56 itisravi joined #gluster
04:12 kanagaraj joined #gluster
04:13 kdhananjay joined #gluster
04:18 nbalachandran joined #gluster
04:23 kshlm joined #gluster
04:28 jiku joined #gluster
04:28 kanagaraj_ joined #gluster
04:32 lalatenduM joined #gluster
04:33 sputnik13 joined #gluster
04:33 nishanth joined #gluster
04:37 Rafi_kc joined #gluster
04:39 anoopcs joined #gluster
04:58 ramteid joined #gluster
05:01 ndarshan joined #gluster
05:18 ppai joined #gluster
05:20 prasanth_ joined #gluster
05:20 gildub joined #gluster
05:21 spandit joined #gluster
05:22 JoeJulian BossR_: Every question you've asked, short of "where is the documentation" has been answered.
05:25 msvbhat joined #gluster
05:27 hagarth joined #gluster
05:27 overclk joined #gluster
05:37 dusmant joined #gluster
05:52 rastar joined #gluster
06:09 nshaikh joined #gluster
06:10 anands joined #gluster
06:13 rastar joined #gluster
06:16 raghu joined #gluster
06:19 karnan joined #gluster
06:21 jiffin joined #gluster
06:25 ricky-ticky joined #gluster
06:32 kshlm joined #gluster
06:38 fsimonce joined #gluster
06:44 chriso73 joined #gluster
06:49 shylesh__ joined #gluster
06:50 bala joined #gluster
06:53 mbukatov joined #gluster
06:55 ctria joined #gluster
07:03 ekuric joined #gluster
07:03 BossR_ JoeJulian - Here is the problem - I am not on here 24/7 - there is usually a 6-8 hour delay.  Forgive me for sounding demanding, it was not my intention - the biggest thing I am dealing with is a substantial increase in latency.  So I wanted to know if I need to use the glusterfs-client to write to the volume - or can I write directly into each glusterfs node's copy... so basically to give you the scenario I am
07:03 BossR_ thinking... having 3 web servers all running nginx/php/apc/glusterfs and all sharing the same copy ... basically doing a multi-master lsyncd configuration with GlusterFS
07:05 andreask joined #gluster
07:08 JoeJulian BossR_: You have to write to the client mount. If latency is increasing, my guess is that your directories are getting bigger.
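(For reference: a minimal sketch of what "write to the client mount" looks like, assuming a hypothetical server "server1", volume "gv0" and mount point "/mnt/gv0"; the brick directories themselves should never be written to directly.)

    # mount the volume through the GlusterFS FUSE client
    mount -t glusterfs server1:/gv0 /mnt/gv0
    # all application reads/writes go through the mount, not the brick path
    cp index.php /mnt/gv0/site/
    # fstab entry so the mount comes back after a reboot
    server1:/gv0  /mnt/gv0  glusterfs  defaults,_netdev  0 0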
07:09 hybrid512 joined #gluster
07:09 ekuric joined #gluster
07:10 saurabh joined #gluster
07:10 BossR_ LOL, the directory is already large... 25,000+ files and directories
07:10 ramteid joined #gluster
07:10 BossR_ I am moving from a single web server to a 3-5 node cluster and need to share files
07:11 JoeJulian Can't you tree it out somehow?
07:11 keytab joined #gluster
07:12 BossR_ no I am somewhat limited to the changes I can make... I am thinking about a hybrid solution - php on the local web servers lsyncd from a master to slaves... and put the static content on the gluster volume
07:13 BossR_ but I am still not sure how much performance I will gain and if it will be enough
07:14 bala joined #gluster
07:18 hagarth joined #gluster
07:18 nbalachandran joined #gluster
07:38 JoeJulian BossR_: That's what engineering and testing are for.
07:39 JoeJulian out of curiosity, it's a php app. What's restricting you from changing it? Just office politics?
07:44 BossR_ money
07:44 BossR_ we are using an prebuilt CMS called www.socialengine.com - and we have already invested a lot into it...
07:45 JoeJulian I keep falling asleep at my desk. I'm going to bed. later.
07:55 nbalachandran joined #gluster
07:55 ghenry joined #gluster
07:55 ghenry joined #gluster
08:04 rastar joined #gluster
08:05 hagarth joined #gluster
08:07 hybrid512 joined #gluster
08:18 Norky joined #gluster
08:32 BossR joined #gluster
08:38 Guest71832 joined #gluster
08:40 Slashman joined #gluster
08:43 R0ok_ I'm testing gluster 3.5.2 on centos 7 &  keep getting selinux error "SELinux is preventing /usr/sbin/glusterfsd from write access on the sock_file ."  I updated selinux-policy & selinux-policy-targeted, then reloaded the policy by running load_policy. I still got the same selinux error. Has anyone experienced this before?? ideas?
08:43 BossR joined #gluster
08:44 shubhendu__ joined #gluster
08:44 dusmant joined #gluster
08:45 nishanth joined #gluster
08:47 stickyboy JoeJulian: Does splitmount still work on GlusterFS 3.5?
08:50 stickyboy JoeJulian: It creates the mount root but doesn't actually mount anything.  CentOS 6.5 here.
08:53 R0ok_ After running load_policy to reload selinux policy, I did a reboot, & there was no selinux error on systemctl status glusterd
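(A sketch of one way to chase an SELinux denial like that, assuming auditd is running and the policycoreutils tools are installed; the module name is made up.)

    # list recent AVC denials involving glusterfsd
    ausearch -m avc -c glusterfsd --start recent
    # explain the denials, or turn them into a local policy module if they are false positives
    grep glusterfsd /var/log/audit/audit.log | audit2why
    grep glusterfsd /var/log/audit/audit.log | audit2allow -M glusterfsd_local
    semodule -i glusterfsd_local.pp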
08:54 atalur joined #gluster
08:56 vimal joined #gluster
09:03 stickyboy joined #gluster
09:03 stickyboy joined #gluster
09:04 ctria joined #gluster
09:15 mbukatov joined #gluster
09:16 msciciel joined #gluster
09:16 shubhendu__ joined #gluster
09:18 nishanth joined #gluster
09:19 wgao joined #gluster
09:23 keytab joined #gluster
09:28 stickyboy Having fun resolving some weird split brain
09:36 Guest71832 joined #gluster
09:48 nishanth joined #gluster
09:48 deepakcs joined #gluster
09:48 shubhendu__ joined #gluster
09:49 ndarshan joined #gluster
09:51 karnan joined #gluster
09:52 bala joined #gluster
09:55 mhoungbo joined #gluster
10:08 edward1 joined #gluster
10:30 ctria joined #gluster
10:39 ricky-ti1 joined #gluster
10:46 doekia joined #gluster
10:46 doekia_ joined #gluster
10:50 doekia joined #gluster
10:54 chriso73 joined #gluster
10:55 LebedevRI joined #gluster
11:08 bene2 joined #gluster
11:08 tdasilva joined #gluster
11:17 rsavage_ joined #gluster
11:17 rsavage_ morning
11:18 ppai joined #gluster
11:26 ndarshan joined #gluster
11:27 shubhendu__ joined #gluster
11:29 dusmant joined #gluster
11:42 andreask joined #gluster
11:45 nshaikh joined #gluster
11:47 nishanth joined #gluster
11:48 chriso73 joined #gluster
12:01 ira joined #gluster
12:01 Slashman_ joined #gluster
12:02 dusmant joined #gluster
12:11 lalatenduM joined #gluster
12:13 mojibake joined #gluster
12:19 Philambdo joined #gluster
12:21 bene2 joined #gluster
12:22 bennyturns joined #gluster
12:22 itisravi_ joined #gluster
12:24 diegows joined #gluster
12:29 caiozanolla hello ppl, despite having restarted the glusterd process on both nodes, I still have a non-functional self-healing daemon. It used to work fine before, but stopped working when I changed hardware (and OS; it was running on Amazon Linux 2014, now on CentOS 6). Anyway, it's a 2 node replicated setup, running the same version as before, 3.5.2
12:30 caiozanolla it shows as offline on both nodes. also, I noted something strange. "gluster v status" shows only its own bricks, not the peer's.
12:30 rsavage_ what does gluster peer status show?
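(The usual first checks when the self-heal daemon shows as offline, assuming a hypothetical volume name "myvol".)

    gluster peer status              # every peer should show "Peer in Cluster (Connected)"
    gluster volume status myvol      # per-node view of bricks, NFS server and Self-heal Daemon
    gluster volume heal myvol info   # entries each brick still needs to heal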
12:34 coredump joined #gluster
12:46 itisravi_ joined #gluster
12:46 rsavage_ I have a question.  I have a 3 node Gluster server setup which has multiple volumes.  Most of the volumes are Replicants only, and one of them is a Distributed-Replicate.   I took one of them down by doing a service glusterd stop.   I did this because I wanted to rsync the volume data over to new XFS/SSD backed drives.  When I did this, and started gluster back up, all looked well enought, but then I ran into multiple problems.   On
12:47 rsavage_ started getting I/O errors, which caused me to stop the volume on all three gluster nodes
12:47 rsavage_ and then start them back up
12:47 rsavage_ this seemed to work while the volumes resynced up but now I am running into issues where the volumes aren't fully in sync...
12:48 rsavage_ According to the documents I read, a simple gluster volume stop was all I needed to do.   Is there a better way I can do this in the future?  I am running Gluster 3.5.1
12:48 diegows joined #gluster
12:49 kkeithley `service glusterd stop` only stops glusterd, it doesn't stop any of the volumes. You have to explicitly stop the volumes before you stop glusterd if that's what you want.
12:50 rsavage_ kkeithley I don't want the volumes to stop on the other nodes.
12:50 rsavage_ I am only doing maintenance on server3
12:50 ctria joined #gluster
12:50 kkeithley then you need to explicitly kill the glusterfsd and glusterfs processes on _this_ node
12:51 rsavage_ kkeithley When I did service glusterd stop, eventually everything was down.
12:51 rsavage_ on server3
12:51 rsavage_ so then I rsynced everything from /oldvolume to /newvolume
12:52 B21956 joined #gluster
12:52 rsavage_ mount /newvolume as the proper mount point, and then started glusterd
12:52 rsavage_ everything started up like it should
12:53 rsavage_ and gluster volume status looked right
12:53 kkeithley normally stopping glusterd does not stop any of the volumes
12:53 rsavage_ kkeithley I did not want to _stop_ any volumes.
12:54 rsavage_ kkeithley my volumes needed to be up on server1 and server2
12:54 rsavage_ I took server3 down for maintenance
12:54 rsavage_ by simply doing service glusterd stop
12:54 rsavage_ and then checking to see when the gluster processes where gone on server3 only
12:54 rsavage_ once that was clear on server3
12:54 kkeithley normally stopping glusterd does not stop any of the gluster{fs,fsd} processes
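(A minimal sketch of what "explicitly kill the gluster processes on this node" can look like, along the lines Peter2 uses later in this log; the init system and service name vary by distro.)

    # on the node being serviced only - the other replicas keep serving
    service glusterd stop         # stops only the management daemon
    pkill glusterfsd              # stops the brick export daemons on this node
    pkill -f glusterfs            # stops any remaining client/NFS/self-heal processes
    ps -ef | grep -i [g]luster    # verify nothing is left before touching the bricks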
12:54 rsavage_ kkeithley hmm
12:55 rsavage_ I am pretty sure the processes were all down on server3
12:55 rsavage_ before I did anything...
12:55 rsavage_ ps -ef|grep -i gluster, showed no returns
12:56 rsavage_ so then I did the rsync
12:56 rsavage_ and once that was complete, renamed the mounts, so the new mount matched the existing one, etc...
12:56 rsavage_ started glusterd and all was well for hours
12:56 rsavage_ and then I started getting File I/O errors
12:57 rsavage_ I had to stop the volume, and then start it
12:57 rsavage_ and that seemed to fix the problem
12:57 rsavage_ but...
12:57 rsavage_ now I am running into an issue where entries are not getting synced
12:57 rsavage_ after 2 days
13:00 diegows joined #gluster
13:07 julim joined #gluster
13:08 ninthBit joined #gluster
13:11 julim joined #gluster
13:13 theron joined #gluster
13:14 dblack joined #gluster
13:15 ninthBit should there be any extra concern if in a heal status on a replica-set the same file shows up on the two bricks?  i'll try rewording that: in the gluster heal status for a volume i have files that show up on both bricks as heal status entries.  the files have shown up for longer than 10 minutes and i am starting to look into whether there is a problem.
13:16 ninthBit good news that i have been continuously testing gluster version 3.4.5 from semiosis in my test environment and so far it has held up.
13:17 ninthBit semiosis: thank you for pushing the 3.4.5 again.  i still have to follow up with you if we had the same issues with the bugs fixed in 3.4.5
13:19 ninthBit one of the files showing up on both bricks in the replica-set heal status is only 1.1K in size.
13:21 msmith joined #gluster
13:23 diegows joined #gluster
13:28 B219561 joined #gluster
13:29 theron joined #gluster
13:32 R0ok_ joined #gluster
13:35 ninthBit How does gluster check if it is in sync with other nodes?
13:36 ninthBit how much network usage might the sync check between nodes use up?
13:46 tdasilva joined #gluster
13:46 JustinClift joined #gluster
13:56 B21956 joined #gluster
13:56 ramteid joined #gluster
13:57 B21956 joined #gluster
13:57 julim_ joined #gluster
13:58 georgem2 joined #gluster
14:04 hagarth joined #gluster
14:04 tdasilva joined #gluster
14:06 andreask joined #gluster
14:08 chirino joined #gluster
14:11 Rafi_kc left #gluster
14:14 bala joined #gluster
14:18 XpineX_ joined #gluster
14:19 sputnik13 joined #gluster
14:19 chirino joined #gluster
14:25 wushudoin joined #gluster
14:25 ninthBit ok, more details on the duplicate files showing up in the heal status.  One file shows up on all bricks. distributed replica-set(2) is the setup.  one replica-set has the file with the t-bit set.  the other replica-set has the correct file, and that is what the volume is returning when queried.  Now, would this indicate any kind of problem with my setup?
14:25 ninthBit the file showing up on both replica-sets appears to be a problem to me.  is that a correct assumption?
14:25 gmcwhistler joined #gluster
14:26 ninthBit the other question: when gluster is writing the file, is gluster itself setting the t-bit?
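(One way to check whether the sticky-bit copy is just a DHT link file, assuming a hypothetical brick path; link files are created by gluster itself during renames/rebalance and point at the subvolume holding the real data.)

    # on the brick holding the suspicious copy
    ls -l /export/brick1/path/to/file     # a DHT link file shows mode ---------T and size 0
    getfattr -m . -d -e hex /export/brick1/path/to/file
    # a link file carries a trusted.glusterfs.dht.linkto attribute; the real copy does not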
14:31 plarsen joined #gluster
14:34 bala joined #gluster
14:40 sputnik13 joined #gluster
14:40 glusterbot New news from newglusterbugs: [Bug 1128771] 32/64 bit GlusterFS portability <https://bugzilla.redhat.com/show_bug.cgi?id=1128771>
14:42 _Bryan_ joined #gluster
14:44 firemanxbr joined #gluster
14:44 llabatut joined #gluster
14:44 llabatut hi all
14:49 stickyboy joined #gluster
14:49 stickyboy joined #gluster
14:49 nage joined #gluster
14:50 dastar joined #gluster
14:51 dastar i need help to re-peer a brick in a replicated volume with 18M files
14:51 dastar hi
14:51 glusterbot dastar: Despite the fact that friendly greetings are nice, please ask your question. Carefully identify your problem in such a way that when a volunteer has a few minutes, they can offer you a potential solution. These are volunteers, so be patient. Answers may come in a few minutes, or may take hours. If you're still in the channel, someone will eventually offer an answer.
14:54 dastar one of the servers was destroyed. i re-installed a new server; when i add it back into the replicated volume, the 'primary' server consumes 395% of cpu
14:54 dastar all gluster-client timeout to read the mount point
14:55 dastar my topology is very simple : 2 servers,  one volume
14:56 julim joined #gluster
14:56 dastar i am using 3.3.1-1 under debian
14:59 diegows joined #gluster
15:02 Xanacas joined #gluster
15:02 bala joined #gluster
15:03 bennyturns joined #gluster
15:04 Slashman_ hello, I have a question regarding a big rsync of glusterfs: currently I use rsync with NFS or fuse from a mirrored glusterfs with 2 members to another server, and it is very slow. since this is a replicated setup with only 2 members, is there a problem if I simply rsync from the underlying fs on one of the 2 peers? (I guess rsync will copy the metadata used by gluster)
15:04 Xanacas left #gluster
15:05 jbrooks joined #gluster
15:10 theron joined #gluster
15:10 theron joined #gluster
15:11 theron joined #gluster
15:12 theron joined #gluster
15:14 ndk joined #gluster
15:15 muhh joined #gluster
15:16 theron_ joined #gluster
15:17 ekuric joined #gluster
15:18 ndk joined #gluster
15:19 JoeJulian rsavage_: When you rsync'd, did you make sure to copy the extended attributes?
15:20 JoeJulian rsavage_: (assuming you were just moving the brick and rsync'ing from the local machine to itself)
15:21 JoeJulian ninthBit: "How does gluster check if it is in sync with other nodes?" see ,,(extended attributes)
15:21 glusterbot ninthBit: (#1) To read the extended attributes on the server: getfattr -m .  -d -e hex {filename}, or (#2) For more information on how GlusterFS uses extended attributes, see this article: http://hekafs.org/index.php/2011/04/glusterfs-extended-attributes/
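(Relevant to the brick move discussed above: a sketch of an rsync that keeps gluster's metadata intact, since the trusted.* attributes and the hard links under .glusterfs are what the bricks rely on; paths are hypothetical.)

    # copy the old brick to the new filesystem, preserving xattrs, ACLs and hard links
    rsync -aAXH /oldvolume/ /newvolume/
    # spot-check that the trusted.* attributes survived the copy
    getfattr -m . -d -e hex /newvolume/some/file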
15:22 JoeJulian dastar: 3.3.1's pretty old. Did you replace the destroyed server with the same hostname or did you use a new hostname?
15:23 JoeJulian Slashman_: Have you considered using geo-replication?
15:24 Slashman_ JoeJulian: this may be a good idea, I'll have to read the doc for that, I have only done replicated/distributed setup until now
15:24 shubhendu__ joined #gluster
15:24 sputnik13 joined #gluster
15:24 JoeJulian I think it will do what you're trying to do better than how you're trying to do it. :D
15:25 dastar the same hostname
15:26 Slashman_ it should do this then : peer1 <= gluster replication => peer2 ==>> geo-replication ==>> backup server
15:27 JoeJulian dastar: This process should still work even though you're way back on 3.3.1: http://gluster.org/community/documentation/index.php/Gluster_3.4:_Brick_Restoration_-_Replace_Crashed_Server
15:27 glusterbot Title: Gluster 3.4: Brick Restoration - Replace Crashed Server - GlusterDocumentation (at gluster.org)
15:28 JoeJulian Slashman_: yes
15:30 Slashman_ JoeJulian: interesting, I'm reading http://blog.gluster.org/category/geo-replication/, I'm not familiar with it obviously, but can I make consistent snapshots on the backup server? at the moment I'm using zfs to snapshot before rsync
15:31 JoeJulian No
15:31 JoeJulian but you might be able to figure something out. The gsyncd client is in python. Probably can hook something in there pretty easily.
15:32 Slashman_ I see, I'll look at it, thank you for those useful information :)
15:32 skippy we're looking at using a Gluster volume for "live" data, and another Gluster volume for long-term archiving.  The live data will get purged every X days, but the archive will live forever, hopefully using the WORM feature.
15:33 skippy could we use geo-replication to get new files from the live volume to the archive volume?
15:33 skippy and just let the WORM feature deny deletes / changes?
15:34 JoeJulian I think that should work, yeah. If it doesn't, then the geo-rep will go into error state. :/
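(For anyone following along, a rough sketch of the 3.5-era geo-replication setup, assuming a slave volume "backupvol" already exists on "backuphost"; check the admin guide for the exact prerequisites such as passwordless ssh.)

    # run on the master cluster
    gluster system:: execute gsec_create
    gluster volume geo-replication mastervol backuphost::backupvol create push-pem
    gluster volume geo-replication mastervol backuphost::backupvol start
    gluster volume geo-replication mastervol backuphost::backupvol status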
15:34 B21956 joined #gluster
15:35 sahina joined #gluster
15:36 dastar JoeJulian: ok, i will try this post
15:39 georgem2 left #gluster
15:39 cultav1x joined #gluster
15:40 dastar JoeJulian: thanks for the help
15:40 glusterbot New news from newglusterbugs: [Bug 1128820] Unable to ls -l NFS mount from OSX 10.9 client on pool created with stripe <https://bugzilla.redhat.com/show_bug.cgi?id=1128820>
15:43 bala joined #gluster
15:46 shubhendu__ joined #gluster
15:50 daMaestro joined #gluster
15:53 B21956 joined #gluster
15:58 semiosis ninthBit: glad to hear it. thanks for the feedback.
16:02 Peter2 joined #gluster
16:03 Peter2 any clue why glustershd.log not rolling?
16:04 rwheeler joined #gluster
16:05 dusmant joined #gluster
16:11 JoeJulian I don't know what you're asking.
16:16 Peter2 there are no activity on glustershd.log on some of the nodes
16:16 Peter2 maybe just no activity?
16:18 JoeJulian Maybe... maybe your log rotation leaves the old one open ( semiosis and I recommend copytruncate ).
16:18 Peter2 it's already on copytruncate...
16:23 JoeJulian Tools you can use to determine your answer would be things like lsof, strace, maybe even gdb if you really get in to it.
16:23 JoeJulian kill -HUP will re-open log files.
16:24 sjm joined #gluster
16:24 JoeJulian And, of course, you can always just kill it and restart it by restarting glusterd.
16:24 Peter2 ic, that's what i did
16:24 Peter2 just wondering if the underlying processes are actually hung
16:24 Peter2 http://pastie.org/9462392
16:24 glusterbot Title: #9462392 - Pastie (at pastie.org)
16:24 Peter2 the brick ports were in N/A
16:24 Peter2 and wonder how it happened
16:25 Peter2 i had to stop glusterfs-server then pkill -f glusterfs then start
16:25 Peter2 i'm on 3.5.2 and it's been a lot more stable and better than 3.5.1
16:25 Peter2 but these are the few issues that i've experienced over the last couple of days
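(Regarding the copytruncate suggestion above: a minimal logrotate stanza along these lines keeps the daemons' open file descriptors valid; the packaged config and paths may differ per distro.)

    # /etc/logrotate.d/glusterfs (illustrative)
    /var/log/glusterfs/*.log {
        weekly
        rotate 52
        compress
        missingok
        copytruncate    # truncate in place so glusterd/glusterfsd keep logging to the same fd
    }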
16:28 sjm joined #gluster
16:29 shubhendu__ joined #gluster
16:34 dlozier joined #gluster
16:40 dlozier Is anyone using glusterfs with qemu on ubuntu 14.04 - just wondering what the steps were for the standard glusterfs package or perhaps 3.5 is needed too?
16:42 semiosis dlozier: last time i checked it was necessary to use a modified version of qemu that was compiled with glusterfs support, in addition to having glusterfs installed.  the version qemu is compiled with should match the version of glusterfs on the servers
16:43 semiosis i build qemu for glusterfs 3.4.x (i dont remember exactly) in this ppa, https://launchpad.net/~semiosis/+archive/ubuntu/ubuntu-qemu-glusterfs, but it's outdated now
16:44 semiosis s/build/built/
16:44 glusterbot What semiosis meant to say was: i built qemu for glusterfs 3.4.x (i dont remember exactly) in this ppa, https://launchpad.net/~semiosis/+archive/ubuntu/ubuntu-qemu-glusterfs, but it's outdated now
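(Once a glusterfs-enabled qemu is installed, the gluster block driver is addressed with gluster:// URLs; a sketch with hypothetical host/volume/image names.)

    # create and boot an image stored directly on a gluster volume
    qemu-img create -f qcow2 gluster://server1/gv0/images/vm1.qcow2 20G
    qemu-system-x86_64 -m 1024 -drive file=gluster://server1/gv0/images/vm1.qcow2,if=virtio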
16:54 cultav1x joined #gluster
16:57 bene2 joined #gluster
17:05 julim joined #gluster
17:21 MacWinner joined #gluster
17:24 zerick joined #gluster
17:37 dlozier is there an updated qemu package for 3.4.x then? or should https://launchpad.net/~semiosis/+archive/ubuntu/ubuntu-qemu-glusterfs be used?
17:37 glusterbot Title: ubuntu-qemu-glusterfs : semiosis (at launchpad.net)
17:54 caiozanolla hello, me again, despite having restarted the glusterd process on both nodes I still have a non-functional self-healing daemon. It stopped working when I changed hardware and distro but not hostname or gluster version. It now runs on centos6, it's a 2 node replicated setup, and it's running the same version as before, 3.5.2. "gluster v status" shows only its own bricks and self-healing is offline on both nodes. Replication is working fine and I've restarted
17:54 caiozanolla glusterd on both nodes several times. dont know what to do anymore.
18:00 mjrosenb morning, all.
18:01 mjrosenb I've noticed that when my init scripts start glusterd, it doesn't spawn glusterfsd
18:01 mjrosenb but if I kill glusterd, and run it manually, it works.
18:02 edwardm61 joined #gluster
18:03 skippy should adding bricks be a non-invasive operation? My testing suggests that adding bricks causes major client interruptions: https://gist.github.com/skpy/1fb1297815d0b02df326
18:03 glusterbot Title: gluster add bricks (at gist.github.com)
18:03 JoeJulian caiozanolla: iptables or selinux?
18:04 JoeJulian skippy: Nope, that's not normal. paste up your client log.
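(For context, the usual add-brick sequence being discussed, with hypothetical names; on a replicated volume bricks are added in multiples of the replica count, and a rebalance spreads existing data onto them.)

    gluster volume add-brick myvol server3:/export/brick1 server4:/export/brick1
    gluster volume rebalance myvol start
    gluster volume rebalance myvol status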
18:07 getup- joined #gluster
18:12 skippy JoeJulian: https://gist.github.com/skpy/1fb1​297815d0b02df326#comment-1279576
18:12 glusterbot Title: gluster add bricks.md (at gist.github.com)
18:14 caiozanolla JoeJulian, iptables is disabled. Is there any specific issue with selinux? it is in its default state (enforcing, targeted)
18:15 JoeJulian check auth.log
18:15 JoeJulian you too, skippy
18:15 JoeJulian skippy: on your new brick servers.
18:16 skippy no such log, JoeJulian
18:16 JoeJulian ok
18:17 caiozanolla JoeJulian, u mean audit.log?
18:18 JoeJulian d'oh! That's what I meant.
18:18 skippy i dont have one of those, either. :)
18:19 skippy well, I do have /var/log/audit/audit.log; but no such log in /var/log/gluster/
18:19 caiozanolla no mentions about glusterd in audit.log
18:20 caiozanolla although it is targeted
18:20 rsavage_ What does it mean when you see a file on Gluster client with this "??????????? ? ?      ?            ?            ? f1b068026eb843c7ace232d60c0dbb70.jpeg"
18:20 caiozanolla let me change that to permissive, targeted
18:21 rsavage_ I see it fine on the Gluster server.  Is there a way to force a heal just on that file?
18:23 JoeJulian rsavage_: check the client log to see what the error is.
18:23 rsavage_ ls: cannot access f1b068026eb843c7ace232d60c0dbb70.jpeg: Input/output error
18:24 rsavage_ 722749: LOOKUP() /prod/content/f1b06/f1b068026eb843c7ace232d60c0dbb70.jpeg => -1 (Input/output error)
18:24 rsavage_ Received NULL gfid for /prod/content/f1b06/f1b068026eb843c7ace232d60c0dbb70.jpeg. Forcing EIO
18:24 rsavage_ also
18:24 rsavage_ gfid or missing entry self heal  failed,   on /prod/content/f1b06/f1b068026eb843c7ace232d60c0dbb70.jpeg
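(A way to compare the copies behind that EIO, assuming a hypothetical brick path: a healthy file has the same trusted.gfid on every brick, and a missing or mismatched gfid is what typically produces the ?????????? listing.)

    # run on each brick that should hold a copy of the file
    getfattr -n trusted.gfid -e hex /export/brick1/prod/content/f1b06/f1b068026eb843c7ace232d60c0dbb70.jpeg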
18:34 diegows I have a split-brain issue and I've tried to follow the steps here https://github.com/gluster/glusterfs/blob/master/doc/split-brain.md to discard data from a node
18:34 glusterbot Title: glusterfs/split-brain.md at master · gluster/glusterfs · GitHub (at github.com)
18:34 caiozanolla joeJulian, disabled SElinux, no dice.
18:35 diegows but I'm still getting errors :(
18:35 diegows info split-brain reports the same number of files, and if I start the services on one of the nodes (the one where I want to discard the data) it keeps logging self-healing failures all the time
18:35 diegows :P
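(The manual cleanup that split-brain.md walks through boils down to roughly this, with hypothetical paths and an illustrative gfid; the gfid hard link under .glusterfs must go along with the bad copy, and nothing should be deleted without double-checking against the doc.)

    # on the brick whose copy is being discarded
    getfattr -n trusted.gfid -e hex /export/brick1/path/to/file   # note the gfid, e.g. d2a33f00-...
    rm /export/brick1/path/to/file
    rm /export/brick1/.glusterfs/d2/a3/d2a33f00-0000-0000-0000-000000000000   # first two byte pairs form the subdirs
    gluster volume heal myvol full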
18:40 JoeJulian sorry guys, I've lost my free time for IRC this morning. Please talk amongst yourselves.
18:44 B21956 joined #gluster
18:46 rotbeard joined #gluster
18:46 caiozanolla JoeJulian, disabled iptables and selinux, self-healing is still not running, gluster v status still shows only node's own entries, and log shows this: [glusterd-op-sm.c:3404:glusterd_op_modify_op_ctx] 0-management: op_ctx modification failed
18:55 diegows joined #gluster
19:00 B21956 joined #gluster
19:28 tom[] joined #gluster
19:28 caiozanolla dont know if it's relevant or not, but although the cluster is functional and replicating, one peer shows "State: Sent and Received peer request (Connected)" and the other shows "State: Probe Sent to Peer (Connected)"
19:34 _dist joined #gluster
19:35 Elico joined #gluster
19:35 Elico left #gluster
19:36 getup- joined #gluster
19:39 semiosis caiozanolla: try restarting glusterd on both of those peers.  that may get them into the correct state of "Peer In Cluster (connected)"
19:52 diegows joined #gluster
19:56 caiozanolla semiosis, already tried that, several times. actually, what is really bothering me is that although the cluster is functional and replicating, the self healing daemon is not running on any nodes.
20:00 bene joined #gluster
20:11 ron-slc joined #gluster
20:13 andreask joined #gluster
20:16 caiozanolla another strange thing is that if i empty /var/lib/glusterd (preserving glusterd.info), start glusterd and peer probe the other peer, the information is synced (written to /var/lib/glusterd). gluster v status will show all bricks offline and those bricks will only go online if I "gluster v sync nodeA all". without sync it won't show bricks online.
20:16 siel joined #gluster
20:36 semiosis caiozanolla: what version of glusterfs?
20:36 caiozanolla 3.5.2
20:38 caiozanolla semiosis, 3.5.2
20:38 semiosis yes i see thanks
20:38 semiosis caiozanolla: you set selinux to permissive & restarted?
20:39 semiosis did you check for errors in the shd.log file?  maybe it tried to start but died
20:39 caiozanolla semiosis, disabled selinux completely.
20:40 caiozanolla semiosis, also disabled ipv6 and iptables.
20:41 caiozanolla added the fqdn from each node to its own /etc/hosts
20:42 caiozanolla there is no firewall between those machines (running on aws, same security group, allowing all tcp and udp ports amongst them)
20:42 caiozanolla I can telnet all listenning ports on/from each machine
20:43 caiozanolla only strange thing in logs is this: [glusterd-op-sm.c:3404:glusterd_op_modify_op_ctx] 0-management: op_ctx modification failed
20:44 caiozanolla when running "gluster v status"
20:47 PsionTheory joined #gluster
20:53 caiozanolla other nodes names are being resolved correctly.
20:57 nullck joined #gluster
21:06 semiosis caiozanolla: can you truncate logs on a server, restart, and pastie.org the logs?
21:07 caiozanolla semiosis, sure can, just a sec!
21:07 andreask joined #gluster
21:12 caiozanolla semiosis, sorry, but pastie is acting up… here it is: http://paste.ubuntu.com/8020638/
21:12 glusterbot Title: Ubuntu Pastebin (at paste.ubuntu.com)
21:13 JustinClift caiozanolla: Use fpaste.org then?
21:13 JustinClift Ahhh, you found an alternative already.  Ignore that. :)
21:13 * JustinClift wanders off
21:15 caiozanolla semiosis, and this is "gluster v status"  w debug… http://paste.ubuntu.com/8020666/
21:15 glusterbot Title: Ubuntu Pastebin (at paste.ubuntu.com)
21:21 semiosis caiozanolla: what about the shd log?  do you have one of those?
21:21 caiozanolla semiosis, i dont have this file… who generates it?
21:22 semiosis the file is /var/log/glusterfs/glustershd.log (on my 3.4.2 server) and it's generated by the shd process
21:22 semiosis see ,,(processes)
21:22 glusterbot The GlusterFS core uses three process names: glusterd (management daemon, one per server); glusterfsd (brick export daemon, one per brick); glusterfs (FUSE client, one per client mount point; also NFS daemon, one per server). There are also two auxiliary processes: gsyncd (for geo-replication) and glustershd (for automatic self-heal).
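(A quick way to map that list onto what is actually running on a node; the self-heal daemon typically shows up as a glusterfs process whose volfile-id is gluster/glustershd.)

    ps -ef | grep -i [g]luster
    # glusterd = management daemon, glusterfsd = one per brick,
    # glusterfs = fuse mounts / NFS server / self-heal daemon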
21:23 getup- joined #gluster
21:29 caiozanolla semiosis, there is no glustershd file. the self healing daemon is not running and it shows as offline on gluster v status. actually, what I really need is self healing to work again.
21:29 semiosis ,,(pasteinfo)
21:30 glusterbot Please paste the output of "gluster volume info" to http://fpaste.org or http://dpaste.org then paste the link that's generated here.
21:30 semiosis whatever paste site you want
21:30 semiosis i'm fine with paste.ubuntu.com obviously
21:31 caiozanolla semiosis, this is from node1… http://paste.ubuntu.com/8020787/
21:31 glusterbot Title: Ubuntu Pastebin (at paste.ubuntu.com)
21:32 caiozanolla semiosis, and this is from node2… http://paste.ubuntu.com/8020790/
21:32 glusterbot Title: Ubuntu Pastebin (at paste.ubuntu.com)
21:32 semiosis sorry i'm running out of ideas.  afaik the shd should always be running.
21:33 caiozanolla semiosis, ok, let me just get you up to speed on what has been done...
21:39 caiozanolla 2 nodes working fine, running full heal as 1st data load continued with one node crashed, had more crashing issues, finger pointed to kernel 3.10, took one machine down (amazon linux), had another one up w/ centos6 minimal kernel 2.6.32-431, followed std procedure to replace crashed machine, kept glusterd.info, peer probe, volume sync all, node is up. rinse, repeat, both nodes replaced. obviously, kept same uuid, volume id, hostnames. now, cluster works
21:39 caiozanolla fine, it is replicating, clients will heal files, but no node will have its bricks online without a "sync vol all", and info from one node won't show on the second, and vice versa.
21:48 semiosis if you run 'gluster peer status' on all the servers, do they all show all the other servers as "Peer in Cluster (connected)"?
21:48 azalime joined #gluster
21:50 drajen joined #gluster
21:52 azalime Hi guys, is there a way to sync glusterfs replica without mounting the volumes? for instance if I touch a file test1 on node1:/vol1 it'll be synced to node2:/vol1 without actually using mount.glusterfs node1:/vol1 /somevolume then touch /somevolume/test1.
22:00 theron joined #gluster
22:00 Peter2 why am i getting these errors?
22:00 Peter2 http://pastie.org/9463158
22:00 glusterbot Title: #9463158 - Pastie (at pastie.org)
22:01 Peter2 seems like when we try to delete certain files we get an UNLINK (Permission denied) error
22:01 semiosis azalime: all access should go through a client mount.
22:01 semiosis azalime: once a directory is used as a brick in glusterfs you should not write to it by any other means
22:02 semiosis except a glusterfs client mount
22:16 marmalodak joined #gluster
22:34 caiozanolla semiosis, yes, see: http://paste.ubuntu.com/8021217/
22:34 glusterbot Title: Ubuntu Pastebin (at paste.ubuntu.com)
22:35 semiosis caiozanolla: hmm not sure about that state
22:35 semiosis never seen that one before
22:35 semiosis was expecting "Peer in cluster (connected)"
22:36 caiozanolla semiosis, each shows at the other's /var/lib/glusterd/peers/
22:37 caiozanolla semiosis, see: http://paste.ubuntu.com/8021245/
22:37 glusterbot Title: Ubuntu Pastebin (at paste.ubuntu.com)
22:43 caiozanolla semiosis, got to leave now, but i'll be back in a few hours, if you have any insight on what is wrong, then please! anyways, thanks man!
22:44 semiosis ok good luck
22:44 semiosis yw
22:44 semiosis my advice would be 1. get peers into good state, 2. stop volumes, 3. start volumes
22:44 semiosis but i suspect you can't do 2 or 3 because peers are not in good state right now
22:45 semiosis could test by doing a simple config change, like setting client log level on a volume
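(One harmless config change that exercises the peer handshake, using a real volume option; the volume name is hypothetical.)

    gluster volume set myvol diagnostics.client-log-level WARNING
    gluster volume info myvol    # the option appears under "Options Reconfigured" if the commit succeeded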
22:48 zerick joined #gluster
22:49 Slashman joined #gluster
23:26 azalime q
23:29 gildub joined #gluster
23:42 glusterbot New news from newglusterbugs: [Bug 1121014] Spurious failures observed in few of the test cases <https://bugzilla.redhat.com/show_bug.cgi?id=1121014> || [Bug 1122732] remove volume hang glustefs <https://bugzilla.redhat.com/show_bug.cgi?id=1122732> || [Bug 1091898] [barrier] "cp -a" operation hangs on NFS mount, while barrier is enabled <https://bugzilla.redhat.com/show_bug.cgi?id=1091898> || [Bug 917901] Mismatch in calculation for quota dire
23:43 Andreas-IPO joined #gluster
23:48 glusterbot New news from resolvedglusterbugs: [Bug 852318] license: server-side dual license GPLV2 and LGPLv3+ <https://bugzilla.redhat.com/show_bug.cgi?id=852318> || [Bug 896410] gnfs-root-squash: write success with "nfsnobody", though file created by "root" user <https://bugzilla.redhat.com/show_bug.cgi?id=896410> || [Bug 896411] gnfs-root-squash: read successful from nfsnobody for files created by root <https://bugzilla.redhat.com/show_bug.cgi?id=896411
